British Library Cataloguing in Publication Data A CIP record for this book is available from the British Library. ISBN 1 9039 9627 9
Printed and bound in Great Britain by Biddles Ltd, Guildford and King's Lynn www.biddles.co.uk
Contents
Foreword, vii
Fernando Brito e Abreu, Geert Poels, Houari A. Sahraoui and Horst Zuse

1. A formal approach to building a polymorphism metric, 1
Claudia Pons and Luis Olsina

2. A merit factor driven approach to the modularization of object-oriented systems, 27
Fernando Brito e Abreu and Miguel Goulão

3. Object-relational database metrics, 49
Mario Piattini, Coral Calero, Houari A. Sahraoui and Hakim Lounis

4. Measuring event-based object-oriented conceptual models, 69

5. Class cohesion as predictor of changeability: an empirical study, 87
Hind Kabaili, Rudolf Keller and François Lustman

6. Building quality estimation models with fuzzy threshold values, 107
Houari A. Sahraoui, Mounir Boukadoum and Hakim Lounis

Index, 127
Foreword
Software internal attributes have been extensively used to help software managers, customers and users to characterize, assess, and improve the quality of software products. Many large software companies have adopted software measures intensively to increase their understanding of how (and how much) software internal attributes affect the overall software quality. Estimation models based on software measures have successfully been used to perform risk analysis and to assess software maintainability, reusability and reliability. However, most measurement efforts have focused on what we call today "legacy technology".

The OO paradigm provides more powerful design mechanisms. Much work has yet to be done to investigate analytically and/or empirically the relationships between OO design mechanisms (e.g., inheritance, polymorphism, encapsulation, usage) and different aspects of software quality (e.g., modularity, modifiability, understandability, extensibility, reliability, reusability). Furthermore, new technologies that take advantage of OO design mechanisms, e.g., OO frameworks, OO analysis/design patterns, OO architectures and OO components, have been proposed in order to improve software engineering productivity and software quality. However, to better understand the pros and cons of these technologies for the products developed with them, we must be able to assess the quality of such products via adequate software product measures.

A quick look at the literature shows that the work done in the field of quantitative approaches in object-oriented software engineering covers a wide range of topics. For this publication, four of them were selected: metrics collection, quality assessment, metrics validation and process management. These four items were identified as key topics during the series of QAOOSE (Quantitative Approaches in Object-Oriented Software Engineering) workshops from which this publication is derived.

The first contribution, "A Formal Approach to Building a Polymorphism Metric", proposes a metric that provides an objective and precise mechanism to detect and quantify dynamic polymorphism. This metric is defined using a rigorous formalization of polymorphism and is validated theoretically. The second, "A Merit Factor Driven Approach to the Modularization of Object-Oriented Systems", presents a quantitative approach for the modularization of object-oriented systems. This approach aims at finding the optimal number of modules using a modularization merit factor and clustering the classes according to this number. The third, "Object-relational Database Metrics", is devoted to the definition and the validation of a suite of metrics for object-relational databases. The definition and the validation (theoretical and empirical) follow a rigorous methodology.
viii
Foreword
The fourth, "Measuring Event-based Object-oriented Conceptual Models", introduces a suite of metrics that covers two important fields of object-oriented technology, namely the early stages of development and the dynamic aspects of design. A first empirical validation of the metrics is presented to show their usefulness. The fifth, "Class Cohesion as Predictor of Changeability: An Empirical Study", investigates whether cohesion metrics can be used as indicators of one of the important quality characteristics, namely changeability. Although the results did not demonstrate evidence of a relationship, the authors showed that the problem is related to the definition of the cohesion metrics. The sixth paper, "Building Quality Estimation Models with Fuzzy Threshold Values", proposes an approach for building and using software quality estimation models. This approach is based on a fuzzy logic-based learning algorithm. Its main objective is to circumvent one of the major problems with existing approaches, namely the use of precise metric threshold values.

There are many people to thank for their efforts in the production of this publication, including the external reviewers and especially Bernard Coulange, Guido Dedene, Teade Punter, and Franck Xia for their good work.
Fernando Brito e Abreu, FCT/UNL & INESC, Lisboa, Portugal
Geert Poels, Vlekho Business School, Brussels, Belgium
Houari A. Sahraoui, IRO, Montreal, Canada
Horst Zuse, Technische Universität Berlin, Germany
Chapter 1
A formal approach to building a polymorphism metric

Claudia Pons
LIFIA, Universidad Nacional de La Plata, Argentina
Luis Olsina
GIDIS, Facultad de Ingeniería, UNLPam, Argentina
1. Introduction

Object-oriented (O-O) software engineers need a better understanding of the desirable and non-desirable characteristics of O-O systems design, and of their effect on quality. These properties must represent those characteristics that lead to more understandable, analyzable, extensible and ultimately maintainable software products. The key issues related to the quality assessment of O-O systems are:
- it is necessary to determine the desirable and non-desirable characteristics of systems;
- there must be a formal definition of these characteristics;
- it is necessary to provide mechanisms to detect and quantify the presence of these characteristics;
- these mechanisms must be formal and objective.
Although quality is not easy to evaluate, since it is a complex concept comprising different characteristics (such as efficiency and maintainability, among others), several properties that make for good O-O design have been recognized and widely accepted by the community. Traditional metrics are often quite applicable to OO, but traditional metric suites are not sufficient to measure all OO-specific properties. So, at the least, the traditional metric suites must be extended with new measures (e.g. for polymorphism) to make them useful for measuring OO software. This problem is due to the presence of additional properties that are inherent to the O-O paradigm, such as abstraction, inheritance and polymorphism. These new
concepts are vital for the construction of reusable, flexible and adaptable software products and they must be taken into consideration by O-O software quality metrics. The applicability problem of traditional techniques has been analyzed in the works of Chidamber and Kemerer [CHI 94], Tegarden et al [TEG 92] and Wilde and Huitt [WIL 92], among others. Special metrics for O-O systems have been investigated; see for example the works of Chen and Lu [CHE 93], Kim et al [KIM 94] and Li and Henry [LI 93]. There are numerous proposals addressing the assessment of traditional properties in O-O systems; for example the work of Poulin [POU 97], Briand et al [BRI 97], Price and Demurjian [PRI 97] and Benlarbi [BEN 97]. But less work has been done in the field of specific O-O properties; see for example the works of Bansiya [BAN 97] [BAN 99], Benlarbi and Melo [BEN 99], Abreu and Carapuça [ABR 94] and Zuse [ZUS 98]. An additional problem is that many of the currently available metrics can be applied only when the product is finished or almost finished, since their data is taken from the implementation, so weaknesses in quality are detected too late. It is desirable to have a tool that uses information coming from the first stages of the development process (i.e. requirement analysis phases); this would give developers the opportunity to evaluate early and improve the quality of the product in the development process. In this paper, new metrics to measure the quality of an O-O design are defined. These metrics are applied to the conceptual model of a system expressed in the Unified Modeling Language [UML 99], thus permitting an early analysis of the system quality. Although we agree that both the traditional and the specific O-O properties or attributes should be analyzed in assessing the quality of O-O design, our purpose is not to define a complete quality evaluation mechanism (in the sense that it considers every system characteristic), but only to characterize some aspects of the polymorphism attribute. The polymorphism concept can be considered one of the key concepts in determining the quality of an O-O design. In the literature, e.g., Benlarbi and Melo [BEN 99], different kinds of polymorphism have been classified, namely pure, static, and dynamic ones. For instance, considering the latter, for two methods to be polymorphic, they need to have the same name and signature (parameter types and return type) and also the same effects (changing the state of the receiver in the same way and raising the same messages to other objects in the system). Dynamic binding allows one to substitute objects that are polymorphic for each other at runtime. This substitutability is a key concept in O-O systems. Polymorphic systems have several advantages. They simplify the definition of clients: as long as a client only uses the polymorphic interface, it can substitute an instance of one class for an instance of another class that has the same interface at run-time, because all instances behave in the same way. We formally define the dynamic polymorphism concept, giving foundations for its detection and quantification. Thus, the polymorphism measure should be
combined with the measures of the rest of the properties (such as coupling, cohesion, entropy, etc.) with the aim of determining the total quality of the system. However, this metrics combination task is beyond the scope of this work. The structure of this paper is as follows: in the next section, we introduce the M&D-theory, an approach for giving formal semantics to UML models. In Section 3, we give a formal definition of polymorphism. We define a polymorphism metric and examples in Sections 4 and 5. In the following sections, a conceptual framework for validation is introduced, as well as the theoretical validation of the metric suite.
2. The formal domain

We first introduce the M&D-theory, a proposal for giving formal semantics to the Unified Modeling Language. The basic idea behind this formalization is the definition of a semantic domain integrating both the model level and the data level. In this way, both static and dynamic aspects of either the model or the modeled system can be described within a first order formal framework. The entities defined by the M&D-theory are classified in two disjoint sets: modeling entities and modeled entities. Modeling entities correspond to the concrete syntax of the UML, such as Class or StateMachine. In contrast, modeled entities, such as Object or Link, represent run-time information, i.e. instances of classes and processes running on a concrete system.
2.1. Structure of the theory

The M&D-theory is a first-order order-sorted dynamic logic theory consisting of three sub-theories:

M&D-theory = UML-theory + SYS-theory + JOINT-theory

NOTE. A first-order order-sorted dynamic logic theory Th consists of a signature Σ that defines the language of the theory, and a set of Σ-axioms φ: Th = (Σ, φ). A signature Σ consists of a set of sort symbols S, a partial order relation between sorts <, a set F of function symbols, a set P of predicate symbols, and a set A of action symbols: Σ = ((S, <), F, P, A). The language of the theory intentionally follows the notation of the UML metamodel [UML 99] and the Object Constraint Language OCL [UML 99].

2.1.1. The sub-theory UML-theory

This theory describes modeling entities (i.e. models). In the UML, Class Diagrams model the structural aspects of the system. Classes and relationships between them, such as Generalizations, Aggregations and Associations, constitute Class Diagrams. On the other hand, the dynamic part of the system is modeled by
Sequence and Collaboration diagrams, which describe the behavior of a group of instances in terms of message sendings, and by State Machines, which show the intra-object dynamics in terms of state transitions. Modeling entities are related to other modeling entities. Consider for example the association between Class and StateMachine by the relation labeled 'behavior'. This association indicates that StateMachines can be used for the definition of the behavior of the instances of a Class. Another example is given by the relation existing between StateMachine and State, which specifies that a StateMachine is composed of a set of States. It is important to formally define how the different UML diagrams are related to one another, to be able to maintain the consistency of the model. Moreover, it is important to specify the effect of modifications of these diagrams, showing what the impact on the other diagrams is if one diagram is modified.

The UML-theory consists of a signature ΣUML and a formula φUML over ΣUML. The set SUML contains sort symbols representing modeling elements, such as Class and StateMachine. The order relation between sorts allows for the hierarchical specification of the elements. The sets of symbols FUML and PUML define functions and predicates on modeling entities. The set AUML consists of action symbols representing the evolution of specifications over their life cycle. One of the most common forms of evolution involves structural changes such as the extension of an existing specification by the addition of new classes of objects or the addition of attributes to the original classes of objects. On the other hand, evolution at this level might reflect not only structural changes but also behavioral changes of the specified objects. Behavioral changes are reflected, for example, in the modification of sequence diagrams and state machines. The formula φUML is the conjunction of two disjoint sets of formulas, φS and φD, of static and dynamic formulas respectively. The former consists of first-order formulas which have to be valid in every state the system goes through (they are invariants or static properties or well-formedness rules of models). These rules are used to perform schema analysis and to report possible schema design errors. The latter consists of modal formulas defining the semantics of actions, that is to say, the evolution of models.

2.1.2. The sub-theory SYS-theory

This theory describes the modeled entities (i.e. data and process). The elements in the data level are basically instances (data values and objects) and messages. At the data level a system is viewed as a set of related objects collaborating concurrently. Objects communicate with each other through messages that are stored
in semi-public places called mailboxes. Each object has a mailbox where other objects can leave messages. Modeled entities are related to other modeled entities. For example the relationship named 'slot', between Object and AttributeLink, denotes the connection between an Object and the values of its attributes.

The SYS-theory consists of a signature ΣSYS and a formula φSYS over ΣSYS. The set SSYS contains sort symbols representing the data in the system and its relationships, such as objects, links, messages, etc. The sets of symbols FSYS and PSYS define functions and predicates on data. The set ASYS consists of action symbols representing the evolution of data at run time, such as object state changes. The formula φSYS is the conjunction of two disjoint sets of formulas, φS and φD, of static and dynamic formulas respectively. The former consists of first-order formulas which have to be valid in every state the system goes through (they are invariants or static properties or well-formedness rules of data), whereas the latter consists of modal formulas defining the semantics of actions, that is to say, the possible evolution of the data.

2.1.3. The sub-theory JOINT-theory

This part of the theory describes the connection between the model and data levels. Modeling entities are related to modeled entities. There is a special relationship between some modeled entities and their corresponding modeling entity. This relationship denotes "instantiation": for example, an Object is an instance of a Class, whereas Links are instances of Associations.
Specification of Classifier
Sorts: Classifier
Taxonomy: Classifier
Actions: addFeature: Classifier, Feature → ModelEvolution
Axioms: ∀c:Classifier ∀f:Feature ∀e:AssociationEnd
Static axioms (static aspects of models):
[1] No two Attributes may have the same name: ∀f,g ∈ attributes(c) (name(f) = name(g) → f = g)
[2] Symmetry rules: f ∈ features(c) ↔ owner(f) = c, and e ∈ associationEnds(c) ↔ type(e) = c
Dynamic axioms (dynamic aspects of models):
⟨addFeature(c,f)⟩true → f ∉ allFeatures(c)
[addFeature(c,f)] (Exists(f) ∧ f ∈ features(c) ∧ owner(f) = c)
End specification of Classifier

Figure 1. Sample of the M&D-theory

Finally, φJOINT is a formula constructed in the extended language ΣM&D, and thus it can express at the same time data properties (e.g. behavioral properties of objects), model properties (e.g. properties about the specification of the system) and properties relating both aspects. Figures 1 and 2 show a sample of the M&D-theory. More details of the theory can be found in [PON 99a] and [PON 99b].
2.2. Advantages of integration

The integration of modeling entities and modeled entities into a single formalism allows us to express both static and dynamic aspects of either the model or the modeled system within a first order framework. The validity problem (i.e. given a sentence φ of the logic, to decide whether φ is valid) is less complex for first-order formalisms than for higher order formalisms. The four different dimensions (static aspects of models, static aspects of data, dynamic aspects of models, dynamic aspects of data) are highlighted in Figures 1 and 2. The integrated formalism is
suitable for the definition of a variety of properties of O-O systems, structural properties as well as behavioral properties.

Specification of Instance
Sorts: Instance
Taxonomy: Instance
Functions:
  slots: Instance → Set of AttributeLink
  linkEnds: Instance → Set of LinkEnd
  classifier: Instance → Classifier
  mailBox: Instance → Seq of Message
  currentStates: Instance → Set of State
Actions: _._ : Instance, Message → Modification
Axioms: ∀i:Instance ∀l:LinkEnd
Static axioms (static aspects of data):
[1] The AttributeLinks match the declarations in the Classifier: ∀l ∈ slots(i) (attribute(l) ∈ allAttributes(classifier(i)))
[2] Symmetry: l ∈ linkEnds(i) ↔ instance(l) = i
Dynamic axioms (dynamic aspects of data):
[1] Effect of call actions: previousStates = currentStates(o) ∧ firing = firingTransitions(behavior(classifier(o)), m) → [o.m] (currentStates(o) = (previousStates \ {source(t) | t ∈ firing}) ∪ {target(t) | t ∈ firing} ∧ ∀t ∈ firing ∀a ∈ effect(t) sent(a) ∧ m ∉ mailBox(o))
End specification of Instance

Figure 2. Sample of the M&D-theory

The logic allows us to define structural properties such as depth of class hierarchy, size of class interface, number and type of associations between classes, etc. These properties can be expressed because classes, associations, generalizations, etc. are first-class citizens in the logic. On the other hand, instances and their behavior are also first-class citizens of the logic; as a consequence, it is possible to define behavioral properties such as pre/post conditions of operations and equivalence of behavior, among others.
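As an informal illustration of the kind of structural queries that become possible when model elements are first-class citizens, the following Python sketch represents a class together with its features and its generalization parent, and evaluates two of the properties mentioned above. The representation and all names are illustrative stand-ins, not part of the M&D-theory.

from dataclasses import dataclass, field
from typing import List, Optional, Set

@dataclass
class ClassElement:
    name: str
    features: List[str] = field(default_factory=list)   # public operation names
    parent: Optional["ClassElement"] = None              # generalization

def depth_of_hierarchy(c: ClassElement) -> int:
    # number of generalization steps from c up to the root of its hierarchy
    return 0 if c.parent is None else 1 + depth_of_hierarchy(c.parent)

def all_features(c: ClassElement) -> Set[str]:
    # the class interface: its own operations plus the inherited ones
    inherited = all_features(c.parent) if c.parent else set()
    return set(c.features) | inherited

collection = ClassElement("Collection", ["size", "includes:"])
bag = ClassElement("Bag", ["add:"], parent=collection)
assert depth_of_hierarchy(bag) == 1
assert all_features(bag) == {"size", "includes:", "add:"}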
2.3. Using the M&D-theory to formalize an O-O model

We formally define the semantics of the UML using a two-step approach:
- interpretation (or translation) of the UML into the M&D-theory;
- semantic interpretation of the M&D-theory, as follows:

UML-constructions → (translation) M&D-theory → (semantics) Semantics-domain

The semantics mapping Sem is the composition of both steps. The first step converts a UML model instance to the modal logic theory; the conversion provides a set of formulas that serves as an intermediate description of the meaning of the UML model instance. The key components of this step are rules for mapping the graphic notation onto the formal kernel model. The second step is the formal interpretation of this set of formulas. The semantic domain where dynamic logic formulas are interpreted is the set of transition systems. A transition system, U = (SU, w0, mU), is a set of possible worlds or states with a set of transition relations on worlds; for details about the semantics of dynamic logic, see [HAR 00] and [WIE 98].

Formally, let Σ = ((S, <), F, P, A) be a first-order dynamic logic signature and let ΣN = ((S, <), FN, PN) be the non-updatable part of Σ. Let U = (A, mU) be a ΣN-algebra, providing a domain for the interpretation of static terms. Formulas of the language are interpreted on Kripke frames U = (SU, w0, mU), where:
- SU is the set of states. Each state w ∈ SU is a function that maps terms to the algebra;
- w0 ∈ SU is the initial state;
- mU associates each action a to a binary relation called the input/output relation of a: mU(a) ⊆ SU × SU.
The domain for states is a heterogeneous algebra (a ΣN-algebra) whose elements are both model elements (such as classes) and data elements (such as objects). The interpretation of a term t in a state w, given a valuation v (written intw(t)), is defined in the usual way. The satisfaction of a closed formula in a structure U and a state w is defined as follows:
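Following the standard presentation of first-order dynamic logic (see [HAR 00]), the main clauses have the following form, where φ and ψ are formulas and a is an action:
- U,w ⊨ p(t1, ..., tn) iff (intw(t1), ..., intw(tn)) belongs to the interpretation of the predicate symbol p;
- U,w ⊨ ¬φ iff U,w ⊨ φ does not hold; U,w ⊨ φ ∧ ψ iff U,w ⊨ φ and U,w ⊨ ψ;
- U,w ⊨ ∀x:s • φ iff U,w ⊨ φ[x := d] for every element d of sort s in the algebra;
- U,w ⊨ [a]φ iff U,w' ⊨ φ for every w' such that (w, w') ∈ mU(a);
- U,w ⊨ ⟨a⟩φ iff U,w' ⊨ φ for some w' such that (w, w') ∈ mU(a).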
A model for a specification sp = (Σ, φ) is a structure U such that U,w ⊨ φ for every state w ∈ SU.
3. Formal definition of polymorphism

In this section, we give a rigorous definition of polymorphism in the framework of the M&D-theory. Definitions of the polymorphism concept can be read in [WOO 97]. Let M be a UML model of an O-O system. Let U be the formal semantics of that model, i.e. U = Sem(M).

Definition 1: polymorphic methods. For two methods to be polymorphic, they need to have the same name and signature (parameter types and return type) and also the same effects (changing the state of the receiver in the same way and raising the same messages to other objects in the system). Let m be a method name. Let C1 and C2 be two classes existing in the model M. The methods named m are polymorphic in C1 and C2 in the model U if the following formula holds:

U ⊨ Polymorphic(m, C1, C2)

where the predicate Polymorphic is defined as follows:
Def 1.1: ∀m:Name ∀C1,C2:Class • Polymorphic(m, C1, C2) ↔ ∃m1,m2 • (m1 ∈ C1.operations ∧ m2 ∈ C2.operations ∧ m1.name = m ∧ m2.name = m ∧ hasSameSignature(m1, m2) ∧ hasSameBehavior(m, C1, C2))

The formula above states that operations named m, belonging to classes C1 and C2 respectively, are polymorphic if they have both the same signature and the same behavior. The predicate hasSameSignature applied to two methods is true if both methods have the same signature (i.e. name and parameters). It is defined in the M&D-theory as follows:
Def 1.2: ∀b,b':BehavioralFeature • hasSameSignature(b, b') ↔ (b.name = b'.name ∧ areEquivalent(b.parameters, b'.parameters))

where areEquivalent is defined as follows:
Def 1.3:
areEquivalent(λ, λ) = true
areEquivalent(p:ps, λ) = false
areEquivalent(λ, p:ps) = false
areEquivalent(p1:ps, p2:ps') = equivalent(p1, p2) ∧ areEquivalent(ps, ps')
where λ denotes the empty sequence and p:ps denotes a non-empty sequence made up from a head (denoted by p) and a tail (denoted by ps). Finally, the predicate equivalent is applied to two single parameters, determining their equivalence:
Def 1.4: ∀p1,p2:Parameter • equivalent(p1, p2) ↔ (p1.defaultValue = p2.defaultValue ∧ p1.kind = p2.kind ∧ p1.type = p2.type)

Two methods named m have the same behavior in C1 and C2 if they are indistinguishable (see Figure 3), i.e., for every two objects o1 and o2 (o1 being an instance of C1 and o2 an instance of C2), the effect of executing o1.m is the same as the effect of executing o2.m, where o.m denotes that object o receives and executes the method named m. The predicate is defined as follows:
Figure 3. Commutativity of polymorphic methods
Def 1.5: U,w ⊨ hasSameBehavior(m, C1, C2) iff ∀o:Object • if o.classifier = C1 then ∀w1 • if (w, w1) ∈ mU(o.m) then ∃w',w'1 • ((w, w') ∈ mU(o.migrates(C2)) ∧ (w', w'1) ∈ mU(o.m) ∧ (w'1, w1) ∈ mU(o.migrates(C1)))

That is to say, the diagram in Figure 3 commutes, where the action o.migrates(C) represents that object o switches its class to C (see Definition 1.6).

Def 1.6: [o.migrates(C)] o.classifier = C

Corollary: for every formula φ, the following schema is valid in the class of models satisfying that the method named m is polymorphic in classes C1 and C2:
∀o ∈ instances(C1) • [o.m]φ ↔ [o.migrates(C2)] [o.m] [o.migrates(C1)] φ

Definition 2: polymorphic classes. Two classes are polymorphic if they define the same methods, and these methods are polymorphic. Two objects belonging to polymorphic classes are polymorphic objects. Dynamic binding lets you substitute objects that are polymorphic for each other at run-time. This substitutability is a key concept in O-O systems. Formally, two classes C1 and C2 are polymorphic in model U if the following condition holds:

U ⊨ Polymorphic(C1, C2)

where the predicate Polymorphic is defined in the following way:
Def 2.1: ∀C1,C2:Class • Polymorphic(C1, C2) ↔ interface(C1) = interface(C2) ∧ ∀n ∈ interface(C1) • Polymorphic(n, C1, C2)

The function interface returns the set of names of the public methods of a class. It is defined in the following way:

Def 2.2: ∀C:Class • ∀n:Name • (n ∈ interface(C) ↔ ∃f ∈ C.allFeatures • (f.name = n ∧ f.visibility = #public))

Definition 3: polymorphic hierarchy. The concept of polymorphic class that was previously defined is too strong, because in general only some of the methods of a class are polymorphic, not all of them. Therefore, a more flexible concept of polymorphism has been defined (see [WOO 97]), named core interface. A core interface is a set of polymorphic methods that several classes share. For a hierarchy to be polymorphic all its classes must share a core interface. Polymorphic hierarchies have several advantages. They simplify the definition of clients, since as long as a client only uses the core interface, it can substitute an instance of one class for an instance of another class that has the same core interface at run-time. Because all instances behave the same with regard to the core interface, one works just as well as another. Formally, let H be a set of classes in an O-O hierarchy. Hierarchy H is polymorphic in the model U if there exists a core class (this core class might not belong to the set H). The formula below defines the concept of polymorphic hierarchy:

U ⊨ Polymorphic(H)
where the predicate Polymorphic is defined in the following way:

Def 3.1: ∀H:Set of Class • (Polymorphic(H) ↔ ∃C:Class • isCore(C, H))

where:

Def 3.2: ∀H:Set of Class • ∀C:Class • (isCore(C, H) ↔ (interface(C) ≠ ∅ ∧ ∀S ∈ H • (interface(C) ⊆ interface(S) ∧ ∀n ∈ interface(C) • Polymorphic(n, C, S))))

The degree of polymorphism depends on the size of the core interface. The larger the core is, the higher the polymorphism degree.
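To make the definitions above more tangible, the Python sketch below checks whether a candidate class is a core for a set of classes and computes the largest set of shared method names. It deliberately approximates Polymorphic(n, C, S) by a purely structural check on names and signatures; the behavioral half of Definition 1 is not captured by such a simple representation, so the data structures and names used here are illustrative only.

from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

@dataclass(frozen=True)
class Method:
    name: str
    param_types: Tuple[str, ...]   # parameter types, in order
    return_type: str = "void"

@dataclass
class Cls:
    name: str
    methods: List[Method] = field(default_factory=list)   # full public interface

def interface(c: Cls) -> Dict[str, Method]:
    # names of public methods, mapped to their declarations (cf. Def 2.2)
    return {m.name: m for m in c.methods}

def same_signature(m1: Method, m2: Method) -> bool:
    # structural stand-in for hasSameSignature (Def 1.2)
    return m1.param_types == m2.param_types and m1.return_type == m2.return_type

def is_core(candidate: Cls, hierarchy: List[Cls]) -> bool:
    # structural stand-in for isCore (Def 3.2)
    cand = interface(candidate)
    if not cand:
        return False
    for s in hierarchy:
        other = interface(s)
        for name, m in cand.items():
            if name not in other or not same_signature(m, other[name]):
                return False
    return True

def shared_interface(hierarchy: List[Cls]) -> Set[str]:
    # the largest set of names every class offers with a matching signature
    first, *rest = hierarchy
    names = set(interface(first))
    for s in rest:
        other = interface(s)
        names = {n for n in names
                 if n in other and same_signature(interface(first)[n], other[n])}
    return names

set_cls = Cls("Set", [Method("size", ()), Method("remove:", ("Object",))])
bag_cls = Cls("Bag", [Method("size", ()), Method("remove:", ("Object",)),
                      Method("occurrencesOf:", ("Object",))])
assert is_core(set_cls, [set_cls, bag_cls])
assert shared_interface([set_cls, bag_cls]) == {"size", "remove:"}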
4. The polymorphism metric

In the previous sections, we have given a rigorous definition of polymorphism in the framework of the M&D-theory [PON 99]. In addition to this formalization, we propose a metric for measuring polymorphism, which provides an objective and precise mechanism to detect and quantify polymorphism. We define the following functions on O-O UML models. Let S be a UML Model, that is to say, a Package containing a hierarchy of modelElements that together describe an abstraction of a physical system [UML 99]:
- #hierarchies(S) returns a collection containing every pairwise disjoint hierarchy defined in S.
We then define the following functions on a class hierarchy h, where single inheritance is considered:
- #classes(h) returns a set containing all of the classes belonging to hierarchy h;
- #methods(h) returns a bag containing all of the methods defined in the hierarchy, as follows: methods(h) = ⊎c ∈ classes(h) interface(c), where ⊎ represents bag union (with repetitions);
- core(h) returns the largest core in the hierarchy;
- width(h) returns the width of the hierarchy; it is defined as follows: width(h) = #(interface(core(h))), where the symbol # denotes set cardinality;
- children(h) returns the set of direct sub-hierarchies of hierarchy h.
Let S be a UML model; the polymorphism metric of S is the average polymorphism measure of the pairwise disjoint hierarchies in S. The polymorphism metric function is defined as follows:

polymorphism_metric: System → [0..1]
polymorphism_metric(S) = ( Σh ∈ hierarchies(S) polymorphism_measure(h) ) / #hierarchies(S)

polymorphism_measure: Hierarchy → [0..1]
Trivial case: if #classes(h) = 1 then polymorphism_measure(h) = 0
General case: if #classes(h) > 1 then polymorphism_measure(h) = polymorphic_methods(h) / #methods(h)

where:

polymorphic_methods(h) = width(h) * #classes(h) + Σhi ∈ children(h) polymorphic_methods(hi - core(h))

and where (hi - core(h)) stands for the hierarchy resulting after removing from hi all the methods belonging to core(h). Let us note that the result of the function polymorphism_measure lies in the interval [0..1], where zero represents the absence of polymorphism, while 1 represents the highest degree of polymorphism (i.e., all of the methods in the hierarchy are polymorphic). In the case of trivial hierarchies (i.e., hierarchies made up of a single class), the metric returns zero.
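One compact way to see how this recursion unfolds is the Python sketch below. It is not the formal definition: core(h) is approximated by the set of method names common to every class interface in the hierarchy (the behavioral check of Definition 1 is ignored), and every node is assumed to record the full public interface of its class, inherited methods included. All names are illustrative.

from collections import Counter
from dataclasses import dataclass, field
from typing import FrozenSet, List, Set

@dataclass
class Hierarchy:
    interface: FrozenSet[str]              # full public interface of the class at this node
    children: List["Hierarchy"] = field(default_factory=list)

def classes(h: Hierarchy) -> int:
    return 1 + sum(classes(c) for c in h.children)

def methods(h: Hierarchy) -> Counter:
    bag = Counter(h.interface)             # a Counter stands in for a bag of method names
    for c in h.children:
        bag += methods(c)
    return bag

def all_interfaces(h: Hierarchy):
    yield h.interface
    for c in h.children:
        yield from all_interfaces(c)

def core(h: Hierarchy) -> Set[str]:
    # structural approximation: the method names shared by every class in h
    its = list(all_interfaces(h))
    return set(its[0]).intersection(*its[1:]) if len(its) > 1 else set()

def remove(h: Hierarchy, names: Set[str]) -> Hierarchy:
    # the hierarchy (h - names), i.e. h with the given methods removed everywhere
    return Hierarchy(frozenset(h.interface - names),
                     [remove(c, names) for c in h.children])

def polymorphic_methods(h: Hierarchy) -> int:
    return len(core(h)) * classes(h) + sum(
        polymorphic_methods(remove(c, core(h))) for c in h.children)

def polymorphism_measure(h: Hierarchy) -> float:
    total = sum(methods(h).values())
    if classes(h) == 1 or total == 0:
        return 0.0
    return polymorphic_methods(h) / total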
5. Examples

In this section, the polymorphism metric is applied to a hierarchy of Collection classes.
5.1. Identifying polymorphism
The hierarchy of collections has been defined and implemented in numerous O-O languages. Figure 4 shows a part of the Collection hierarchy of the Smalltalk language [LAL 94]. Using the formal definitions of the M&D-theory we can identify the presence of polymorphic methods in this hierarchy, for example: size, includes:, select:, collect:, reject:, remove:, add:, etc. In Figure 4, every polymorphic method appears only once in the hierarchy of classes. For example, every class of
the hierarchy has a polymorphic method named select:. Instead of including the method in each class, we only include it once in the root of the hierarchy. As an example, we show the formal proof that the method named remove: is polymorphic for classes Set and Bag. The statement aCollection.remove: anElement denotes that the element equal to anElement is removed from the receiver collection. In the case the receiver is a Set, at most one occurrence can belong to the set, but in the case the receiver is a Bag, several occurrences of the same object may belong to it. In the same way, it is possible to prove that the method add: is not polymorphic for classes Set and Bag (i.e. U ⊨ ¬Polymorphic(add:, Set, Bag)) due to the different treatment of duplicated elements.

Figure 4. Part of the Collection hierarchy in Smalltalk. The classes are Collection, NotIndexedCollection, Set, Bag, IndexedCollection, FixedSizeCollection, Array, String, VariableSizeCollection, OrderedCollection and SortedCollection; their public instance methods include size, includes:, select:, collect:, reject:, detect:, remove:, add:, last, first, at: and at:put:, with each polymorphic method placed once at the root of the (sub-)hierarchy that shares it.
Theorem 1: remove: is polymorphic for classes Set and Bag:

U ⊨ Polymorphic(remove, Set, Bag)

Hypothesis: Let Set and Bag be classes in the Smalltalk hierarchy. There exists an operation removeSet belonging to Set.interface, and there exists an operation removeBag belonging to Bag.interface, such that:
[h0] removeSet ∈ Set.interface ∧ removeBag ∈ Bag.interface
[h1] removeSet.name = remove ∧ removeSet.visibility = public
[h2] removeSet.parameters = <p1>
[h3] p1.defaultValue = nullElement
[h4] p1.kind = in ∧ p1.type = Object
The specification of the operation is given by the following dynamic logic formula:
[h5] ∀s,e:Object • (s.classifier = Set → [s.remove(e)] e ∉ s)
[h6] removeBag.name = remove ∧ removeBag.visibility = public
[h7] removeBag.parameters = <p2>
[h8] p2.defaultValue = nullElement
[h9] p2.kind = in ∧ p2.type = Object
The specification of the operation is given by the following dynamic logic formula:
[h10] ∀b,e:Object • (b.classifier = Bag ∧ occurrences(b,e) = n → [b.remove(e)] occurrences(b,e) = n-1)

We need first to prove the following lemmas:

Lemma 1: both operations have the same signature:
⊨ hasSameSignature(removeSet, removeBag)

Lemma 2: the method remove: has the same behavior in both classes:
⊨ hasSameBehavior(remove, Set, Bag) ∧ hasSameBehavior(remove, Bag, Set)

Proof of Lemma 2: We have to prove that the corollary holds. Since this dynamic logic has a minimal change semantics, it is only necessary to analyze the postconditions of the method remove, because no other change is allowed to happen. That is to say, the instance of the corollary we have to prove is:

∀s ∈ Set.instances ∀e:Object • [s.remove(e)] e ∉ s ↔ [s.migrates(Bag)] [s.remove(e)] [s.migrates(Set)] e ∉ s
∧ ∀b ∈ Bag.instances ∀e:Object • [b.remove(e)] occurrences(b,e) = n-1 ↔ [b.migrates(Set)] [b.remove(e)] [b.migrates(Bag)] occurrences(b,e) = n-1

where n is equal to occurrences(b,e) before the removal. The complete proof can be read in [PON 99].

Proof of the theorem: Now we can prove the theorem, as follows:
[t1] ∃m1,m2 • (m1 ∈ Set.operations ∧ m2 ∈ Bag.operations ∧ m1.name = remove ∧ m2.name = remove ∧ m1.visibility = m2.visibility ∧ hasSameSignature(m1, m2) ∧ hasSameBehavior(remove, Set, Bag))
(from Lemma 1, Lemma 2, [h0], [h1] and [h7]). Finally, the theorem is proved by applying modus ponens to Def 1.1 and [t1].
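The intuition behind this proof, and behind the failure of add:, can be mimicked outside the formal framework. In the Python fragment below, the built-in set and collections.Counter are used merely as informal stand-ins for Smalltalk's Set and Bag: after a removal both containers hold one occurrence fewer of the element, whereas adding an element that is already present changes a Bag but leaves a Set unchanged, which is why add: fails the common-behavior test.

from collections import Counter

s, b = {"x"}, Counter({"x": 2})

# remove: afterwards the receiver holds one occurrence fewer of the element
s.discard("x")
b["x"] -= 1
assert "x" not in s and b["x"] == 1

# add: a Set ignores the duplicate, a Bag records another occurrence
s2, b2 = {"x"}, Counter({"x": 1})
s2.add("x")
b2["x"] += 1
assert len(s2) == 1 and b2["x"] == 2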
5.2. Applying the metric

The polymorphism metric is applied to the Smalltalk Collection hierarchy in Figure 4. In that figure polymorphic methods have been previously detected (using the definitions in Section 3) and moved up in the hierarchy (i.e. the root class of the hierarchy is also the largest core class in the hierarchy). Methods appearing twice (or more times) in the hierarchy are actually non-polymorphic methods; for example, the method add: is non-polymorphic for Set and Bag.

Measuring polymorphism in the Collection hierarchy (hC):
polymorphism_measure(hC) = polymorphic_methods(hC) / #methods(hC)
= polymorphic_methods(hC) / 106 (because #methods(hC) = 106)
= (width(hC) * #classes(hC) + Σhi ∈ children(hC) polymorphic_methods(hi - core(hC))) / 106 (from the definition of polymorphic_methods)
= (6 * 11 + Σhi ∈ children(hC) polymorphic_methods(hi - core(hC))) / 106 (because width(hC) = 6 and #classes(hC) = 11)
= (66 + polymorphic_methods(NotIndexedCollection sub-hierarchy) + polymorphic_methods(IndexedCollection sub-hierarchy)) / 106
= (66 + 3 + 31) / 106 = 0.94 (or 94%)
Value of the NotIndexedCollection sub-hierarchy (hN):
polymorphic_methods(hN) = (width(hN) * #classes(hN) + Σhi ∈ children(hN) polymorphic_methods(hi - core(hN)))
= (1 * 3 + Σhi ∈ children(hN) polymorphic_methods(hi - core(hN))) = (1 * 3 + 0) = 3 (because the children of hN are trivial hierarchies)

Value of the IndexedCollection sub-hierarchy (hX):
polymorphic_methods(hX) = (width(hX) * #classes(hX) + Σhi ∈ children(hX) polymorphic_methods(hi - core(hX)))
= (4 * 7 + 0 + 3) = 31
Value of the VariableSizeCollection sub-hierarchy (hV):
polymorphic_methods(hV) = (width(hV) * #classes(hV) + Σhi ∈ children(hV) polymorphic_methods(hi - core(hV)))
= (1 * 3 + Σhi ∈ children(hV) polymorphic_methods(hi - core(hV))) = (3 + 0) = 3 (because the children of hV are trivial hierarchies)

The outcome shows the high degree of polymorphism of the Collection hierarchy (reaching the value of 94%), which potentially contributes to its readability, extensibility, and ultimately to its maintainability. However, some studies, which should be further confirmed, indicate that polymorphism may increase the probability of faults in O-O software; see for example [BEN 99].
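Plugging the counts quoted above into the recursive definition of Section 4 reproduces the reported figure; this is only an arithmetic check of the worked example, not a measurement of real Smalltalk code:

width_hC, classes_hC, total_methods = 6, 11, 106
contrib_not_indexed = 3        # NotIndexedCollection sub-hierarchy, computed above
contrib_indexed = 31           # IndexedCollection sub-hierarchy (includes the 3 from VariableSizeCollection)
polymorphic = width_hC * classes_hC + contrib_not_indexed + contrib_indexed
print(round(polymorphic / total_methods, 2))   # 100 / 106, printed as 0.94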
6. Towards a conceptual framework for validation

In this section, we introduce the main classes of a conceptual framework, together with some relationships and axioms useful for validating metrics, in accordance with previous investigations [FEN 97], [KIT 95], [ZUS 98], among others. The contribution of these elements to the validation process will be highlighted.
6.1. Empirical and formal domain and relational systems

There exists a theory, known as the theory of measurement, that states how to combine empirical conditions with numerical conditions under the assumption of logical equivalence, or homomorphism. Extensions of the foundations of this theory, with implications for software engineering, are well analyzed and documented in the cited book by Zuse. For instance, let S be a set of entities (e.g., UML models) where X is an observable attribute, and let x1, x2 be elements of S. Then the binary empirical relation x1 •> x2 holds if and only if it is judged, e.g., that x1 is more polymorphic than x2. Therefore, we want to assign a real number m(x1) and m(x2) to each x1, x2 such that, for all pairs belonging to S,

x1 •> x2 <=> m(x1) > m(x2)     [1]

is fulfilled. This statement is the base of measurement, and it can be read in two ways: firstly, if x1 is more polymorphic than x2, this implies (=>) in the formal domain that m(x1) is greater than m(x2); and secondly, if m(x1) is greater than m(x2), then this implies (<=) that x1 is more polymorphic than x2 (in the empirical domain). This double implication is called a homomorphism. Besides, in (1) the empirical statement is x1 •> x2 and the numerical statement is m(x1) > m(x2), •> and > being the respective relational operators. It is important to notice that empirical statements are
not true per se; they can be falsified by means of observations and experiments. On the other hand, for the empirical relation •> (or •>=) there can be several interpretations, according to the case. Examples are: equal to or more difficult to understand than; equal to or more functional than; equal to or with a higher level of defects than, etc. Moreover, the concepts of empirical and formal relational systems, and the concept of metric (or measure), can be introduced. For instance, for the ranking order, the empirical relational system is defined as S = (S, •>=), and the numerical relational system is defined as N = (R, >=), where R is the set of real numbers. Then, we can write for the metric m the following expression [ZUS 98]:

m: (S, •>=) -> (R, >=)     [2]

where a metric is a correspondence (or mapping) m: S -> R for which expression (1) is fulfilled for all x1, x2 belonging to S.
6.2. Some classes of the empirical domain: entity and attribute

From the evaluation standpoint, in the empirical domain we basically have the Entity and Attribute classes. The Entity class can be decomposed primarily into three main sub-classes of interest to evaluators, that is to say: a) Process: an entity, possibly composed of other sub-processes and activities (or tasks), used primarily to produce artifacts; b) Artifact: a temporary or persistent entity representing the product of performing a process; and c) Resource: an entity required by (or assigned to) a process as input to produce some specified output (project resources are human, monetary, material, technological and temporal). The Attribute class represents what is observed and attributed regarding what is known of an entity of the real world that is an object of interest for evaluation. Attributes can be measured by direct or indirect metrics. For a given attribute, there is always at least one empirical relationship of interest that can be captured and represented in the numerical (formal) domain, enabling us to explore the relationship mathematically. There can be a many-to-many relationship between the Entity and Attribute classes. That is, an entity can possess several attributes, while an attribute can belong to several entities [KIT 95]. In the present work, for example, an O-O design specification of a software system is an artifact or product, and polymorphism_measure is an indirect attribute. It is important to note that a metric of a direct attribute of an entity involves no other attribute measure (such as the number of classes or number of methods of an inheritance hierarchy). However, we can obtain values from equations involving, for example, two or more direct attribute measures. Hence, we have an Attribute Association in the empirical domain that is formalized by an Equation in the formal domain (such as the polymorphism_measure formula of Section 4).
6.3. Some classes of the formal domain: value, unit, scale type, and measurement tool

A Scale Type and a Unit should be considered in order to obtain magnitudes of type Value when a specific Attribute of a given Entity is measured. A Unit of measure determines how the attribute of such an entity should be quantified. Therefore, the measured value cannot be interpreted unless we know to what entity it applies, what attribute is measured and in what unit it is expressed (i.e., the empirical and formal relational systems should be clearly specified). On the one hand, Value (or scale) and Scale Type are two different classes that are frequently confused. The concept of scale is defined by the triple (S, N, m); see (2) above. So the scale is defined by a homomorphism. Clearly, it can be noted that the empirical relational system S, the numerical relational system N, and the metric m are needed in order to obtain a value or scale. On the other hand, a scale type is defined by admissible transformations. An admissible transformation is a conversion rule f such that, given two measures m and m', it holds that m' = f(m). For example, admissible transformations are m' = a m + b, with a > 0 and b ∈ R, and m' = a m, among others. The scale type does not change when an admissible transformation is performed. Besides, the scale type of a measure affects the sort of arithmetical and statistical operations that can be applied to its values. Scale types are hierarchically ordered as nominal, ordinal, interval, ratio and absolute scales, and can be seen as keywords describing certain empirical knowledge behind values, as stated by Zuse. For example, the nominal scale type implies a very simple empirical condition: the equivalence relationship. Let the empirical relational system be (S, ≈), and given an observable attribute with x1, x2 belonging to S, a function m: S -> R then exists so that

x1 ≈ x2 <=> m(x1) = m(x2)

and m(x1) is a nominal value. In addition, for the ordinal scale type, the empirical relational system of the nominal one is extended to reflect the ordinal scale as expressed in (2). Zuse indicates that the weak order •>= is a prerequisite for ranking order measurement, being transitive and complete. So we can express these properties as:

x1 •>= x2 ∧ x2 •>= x3 => x1 •>= x3 (transitivity)
x1 •>= x2 ∨ x2 •>= x1 (completeness)
Moreover, the ratio scale type is a very well known one in physics and the traditional sciences, e.g., for length and monetary measures, among others. Zuse says: "we want to have something above poor ranking or comparing of objects. We want to be additive in the sense that the combination of two objects is the sum of their measurement values". The idea of a ratio scale is linked to additive and non-additive properties. An additive ratio scale is represented by:

m(x1 o x2) = m(x1) + m(x2)
Therefore, (S, •>=, o) is a closed extensive structure if there exists a function m on S such that, for all x1, x2 belonging to S,

x1 •>= x2 <=> m(x1) >= m(x2) and m(x1 o x2) = m(x1) + m(x2).
Also, a function m' exists that is an admissible transformation, i.e. m'(x) = a m(x) with a > 0. Furthermore, the author defines the modified extensive structure and the empirical conditions where specific axioms must be satisfied (see Chapter 5 of [ZUS 98]). Finally, for an absolute scale type the only admissible transformation is the identity, as we will exemplify for some polymorphism metrics in Section 7. On the other hand, regarding the conceptual classes of the framework, the measured value can be obtained either manually or automatically, by partially or totally using a Measurement Instrument (a software tool). This instrument is optional, however. Automated data gathering is more objective and reliable.
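For reference, the admissible transformations that characterize the scale types mentioned above are, in the usual measurement-theoretic formulation [ZUS 98]:
- nominal: any one-to-one mapping m' = f(m);
- ordinal: any strictly monotonically increasing function f, m' = f(m);
- interval: m' = a m + b, with a > 0;
- ratio: m' = a m, with a > 0;
- absolute: only the identity, m' = m.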
7. Validation of the polymorphism metric

There are two strategies to corroborate or falsify the validity of metrics: theoretical validation and empirical validation. Theoretical validation is mainly based on mathematical proofs that allow us to formally confirm that the measure does not violate the properties of the relational systems, the conceptual models and the criteria. On the other hand, empirical validation frequently consists of the realization of experiments and observations on the real world in order to corroborate or falsify the metric or model. In addition, validation approaches can be classified according to the type of attribute that is taken into account. From this point of view a metric is valid internally, or "valid in the narrow sense" [FEN 97], if it analyses attributes that are inherent to the entity (product, process or resource), while a metric is valid externally, or "valid in the wide sense", if it considers higher level characteristics (such as cost, quality, maintainability, etc.), both for assessment and for prediction purposes. In this section, we analyze some aspects of the theoretical validation of the polymorphism metrics discussed and exemplified in Sections 4 and 5. These direct and indirect metrics embrace internal attributes of a product. In a general sense, the assumption of [KIT 95] is that in order for a measure to be valid two conditions must hold: 1) the measure must not violate any necessary property of its elements (i.e., the classes and relationships); 2) each model used in the process must be valid. The structural framework of [KIT 95] can be combined with the axiomatic framework of [ZUS 98] to yield a wider validation framework [OLS 00]. Regarding the proposed conceptual framework, in order to decide whether a metric is valid it is necessary at least to check:
- Attribute validity, i.e., whether the attribute is actually exhibited by the entity being measured. For a given attribute, there is always at least one empirical relationship of interest that can be captured and represented in the formal domain, enabling us to explore the relationship analytically. This can imply both a theoretical and an empirical validation;
- Unit and Scale Type validity, i.e., whether the measurement unit and scale type being used are an appropriate means of quantifying the internal or external attribute. As stated before, when we measure a specific attribute of a particular entity, we consider a scale type and unit in order to obtain magnitudes of type value. Thus, the measured value cannot be interpreted unless we know to what entity it applies, what attribute is measured and in what unit it is expressed (i.e., the empirical and formal relational systems should be clearly specified). On the other hand, a scale type is defined by admissible transformations of measures;
- Instrument validity, i.e., whether any model underlying a measuring tool is valid and the instrument is properly calibrated;
- Protocol validity, i.e., whether an acceptable measurement protocol has been used in order to guarantee repeatability and reproducibility in the measurement process.

Regarding the polymorphism_measure, some empirical considerations should be made. As aforementioned, the #hierarchies(S) function returns the collection containing all the disjoint hierarchies defined in S. This guarantees, for example, that the intersection between two hierarchies gives the empty set. In addition, we are only considering tree hierarchies, which allow us to model single inheritance (the Java and Smalltalk languages, among others, only support single inheritance). In order to try to guarantee the ratio scale for the polymorphism_measure metric, we started to investigate the modified extensive structure and the additive properties discussed in [ZUS 98]. However, the initial results indicate that the metric does not satisfy the independence condition C1 or the axiom of weak monotonicity. Hence, for that metric the absolute scale has in principle been validated, as follows:

Theorem 7.1: The scale type of the polymorphism_measure is absolute.

Proof: Let

m = A/B     [3]

be the metric, with A and B absolute values, where A represents the polymorphic_methods attribute and B represents the total number of methods of a hierarchy (#methods(h)). A <= B is always satisfied, since the polymorphic methods of a hierarchy form a subset of all its methods. The relationship between A and B can be described by

A = c B, with c > 0     [4]

Replacing (4) in (3), the following equation is obtained: m = c B/B = c.
The resulting m is an absolute scale. Besides, percentage measures can be used as an absolute scale, but they do not assume an extensive structure, as indicated by [ZUS 98], p. 237-238.

Attribute: #classes(h), #methods(h), width(h), #hierarchies(S)
Scale type: absolute
Unit: number of classes in h; total number of methods in h (regarding bags); number of polymorphic methods (in the core); number of hierarchies in the S specification
Criteria and properties that apply: these internal attributes are exhibited in O-O design and implementation specifications; they are direct metrics. Different hierarchy specifications may have a different number of classes, methods, etc. for the respective attribute; conversely, different hierarchy specifications may have the same number of classes, methods, etc. The unit and scale type are defined and confirmed. The admissible transformation is the identity, i.e. m(x) = x. Accordingly, they are obtained by counting elements, where an absolute scale is generally implied (but not always).

Attribute: polymorphic_methods
Scale type: absolute
Unit: number of polymorphic methods * number of classes
Criteria and properties that apply: it is an indirect metric; the equation is shown in Section 4. The unit and scale type are defined. The admissible transformation is the identity, i.e. m(x) = x. So, it yields an absolute scale regarding the following combination rule: m(x1 o x2) = m(x1) * m(x2).

Attribute: polymorphism_measure
Scale type: absolute
Unit: (number of polymorphic methods * number of classes) / total number of methods of h; it represents the degree of polymorphic methods of an inheritance hierarchy
Criteria and properties that apply: it is an indirect metric; the equation is shown in Section 4. It fulfills the condition that a greater number of polymorphic methods with regard to the total number of methods of a hierarchy leads to a higher degree of polymorphism (hence, the specification can be more understandable and reusable). The absence of polymorphic methods in a hierarchy yields a zero value; conversely, the value 1 (or 100%) means that all methods are polymorphic. The unit and scale type are defined. It yields an absolute scale, as demonstrated by Theorem 7.1.

Figure 5. Descriptions of theoretical validity for the polymorphism_measure and its elements
Figure 5 shows descriptions of the theoretical validity for the set of functions used by the metric. The target entity is an O-O design specification or the source code of an O-O program. Instrument validity is applicable because data collection and calculation can be carried out automatically (the underlying algorithm is supported by the recursive model). Ultimately, the measure of polymorphism of a set of disjoint hierarchies defined in the S specification is computed by taking an average, as shown in Section 4. This statistical operation is permitted for magnitudes of an absolute scale type.
8. Concluding remarks

Although quality is not easy to evaluate, since it is a complex concept composed of different characteristics (see, for example, the ISO 9126 quality standard [ISO 91]), several attributes that make for a good O-O design have been recognized and widely accepted by the software engineering community. We agree that both the traditional and the new O-O properties or attributes should be analyzed in assessing the quality of O-O design. On the other hand, further work should be carried out in order to map and correlate O-O metrics to quality characteristics such as those prescribed in the ISO standard. However, as an initial step in that direction, we believe that it is also necessary to pay attention to the concepts and metrics for polymorphism, since it should be considered one of the attributes that influence the quality of an O-O software system. In this paper, we have given a rigorous definition of polymorphism in the framework of the M&D-theory [PON 99]. Accordingly, on top of this formalization, we have proposed a metric for measuring polymorphism that provides an objective and precise mechanism to detect and quantify dynamic polymorphism. In addition, initial efforts have been made to validate the discussed metrics with regard to the theoretical validation framework. Furthermore, it is important to note that the analyzed metric takes its information from the first stages of the development lifecycle, which gives developers the opportunity to evaluate early and improve some attributes of the quality of the software product.
9. References

[ABR 94] ABREU, F.B., CARAPUÇA, R., "Object Oriented Software Engineering: Measuring and controlling the development process", 4th International Conference on Software Quality, Virginia, USA, 1994.
[BAN 97] BANSIYA, J., "Assessing quality of object-oriented designs using a hierarchical approach", OOPSLA'97 Workshop #12 on Object-oriented design quality, Atlanta, USA, October 1997.
[BAN 99] BANSIYA, J., DAVIS, C., ETZKORN, L., LI, W., "An entropy-based complexity measure for object oriented designs", Theory and Practice of Object Oriented Systems, 5(2), 1999.
[BEN 97] BENLARBI, S., "Object-oriented design metrics for early quality prediction", OOPSLA'97 Workshop #12 on Object-oriented design quality, Atlanta, USA, October 1997.
[BEN 99] BENLARBI, S., MELO, W., "Polymorphism measures for early risk prediction", in International Conference on Software Engineering (ICSE'99), Los Angeles, CA, USA, 1999.
[BRI 97] BRIAND, L., DEVANBU, P., MELO, W., "An investigation into coupling measures for C++", in International Conference on Software Engineering (ICSE'97), Boston, USA, May 1997.
[CHI 94] CHIDAMBER, S., KEMERER, C., "A metrics suite for object oriented design", IEEE Transactions on Software Engineering, 20, 1994.
[CHE 93] CHEN, J., LU, J., "A new metric for object oriented design", Information and Software Technology, 35, 1993.
[FEN 97] FENTON, N.E., PFLEEGER, S.L., Software Metrics: a Rigorous and Practical Approach, 2nd Ed., PWS Publishing Company, 1997.
[HAR 00] HAREL, D., KOZEN, D., TIURYN, J., Dynamic Logic, to appear, 2000.
[ISO 91] ISO/IEC 9126, International Standard Information technology - Software product evaluation - Quality characteristics and guidelines for their use, 1991.
[KIM 94] KIM, E., CHANG, O., KUSUMOTO, S., KIKUNO, T., "Analysis of metrics for object oriented program complexity", Procs. 18th Annual International Computer Software and Applications Conference, COMPSAC'94.
[KIT 95] KITCHENHAM, B., PFLEEGER, S.L., FENTON, N., "Towards a Framework for Software Measurement Validation", IEEE Transactions on Software Engineering, 21(12), p. 929-944, 1995.
[LAL 94] LALONDE, W., Discovering Smalltalk, Addison Wesley, 1994.
[LI 93] LI, W., HENRY, S., "Object oriented metrics that predict maintainability", The Journal of Systems and Software, 23, 1993.
[OLS 00] OLSINA, L., PONS, C., ROSSI, G., "Towards Metric and Model Validation in Web-site QEM", in Proceedings of VI CACIC (Argentinean Congress of Computer Science), Ushuaia, Argentina, 2000.
[PON 99a] PONS, C., BAUM, G., FELDER, M., "Foundations of Object-oriented modeling notations in a dynamic logic framework", in Fundamentals of Information Systems, Chapter 1, T. Polle, T. Ripke, K. Schewe (Eds.), Kluwer Academic Publishers, 1999.
[PON 99b] PONS, C., Ph.D. Thesis, Faculty of Science, University of La Plata, Buenos Aires, Argentina, http://www-lifia.info.unlp.edu.ar/~cpons/, 1999.
[POU 97] POULIN, J., Measuring Software Reuse: Principles, Practices and Economic Models, Addison Wesley, 1997.
[PRI 97] PRICE, M., DEMURJIAN, S., "Analyzing and measuring reusability in object-oriented design", in Proceedings OOPSLA '97, Atlanta, USA, 1997. [TEG 92] TEGARDEN, D., SHEETZ, S., MONARCHI, D., "Effectiveness of traditional software metrics for object-oriented systems", 25th Annual Conference of System Science, Maui, HI, 1992. [UML 99] UML 1.3, Object Management Group, The Unified Modeling Language (UML) Specification - Version 1.3, in http://www.omg.org (1999). [WIE 98] WIERINGA, R., BROERSEN, J., "Minimal Transition System Semantics for Lightweight Class and Behavior Diagrams", In PSMT Workshop on Precise Semantics for Software Modeling Techniques, Ed: M. Broy, D. Coleman, T. Maibaum, B. Rumpe, Technische Universität München, Report TUM-I9803, April 1998. [WIL 92] WILDE, N., HUITT, R., "Maintenance support of object-oriented programs", IEEE Transactions on Software Engineering, 18, 1992. [WOO 97] WOOLF, B., "Polymorphic hierarchy", The Smalltalk Report, January 1997. [ZUS 98] ZUSE, H., A Framework of Software Measurement, Walter de Gruyter, Berlin-NY, 1998.
Chapter 2
A merit factor driven approach to the modularization of object-oriented systems Fernando Brito e Abreu and Miguel Goulao Software Engineering Group, FCT/UNL and INESC, Lisbon, Portugal
1. Introduction Modularity is an essential aspect of all engineering domains. It allows, among other things: (i) design and development of different parts of the same system by different people, often belonging to distinct organizations, (ii) the handling of the complexity of large systems by splitting them into loosely coupled parts that can be better understood individually, (iii) the testing of systems in a parallel fashion (different people simultaneously), (iv) substitution or repair of defective parts of a system without interfering with other parts, (v) the reuse of existing parts in different contexts, (vi) the division of the system into configuration units to be put under configuration control and (vii) restriction of defect propagation. The architecture of a software system is determined, at the most abstract level, by a set of modules and by the way they are glued together [SCH 96, p. 9]. Generically, a module can be an aggregate of algorithm implementations and data structures that interact somehow to deliver a given kind of functionality. Each module can have its own state, shared or not, and needs the collaboration of other modules to deliver its functionality. A module should have a clear interface. A protocol should be offered to other modules, by means of some exporting mechanism that makes its interface available. It is desirable, for reasons thoroughly discussed in the software engineering literature, that modules be highly cohesive and loosely coupled [GHE 91, JAC 92, PRE 00, SOM 00]. If we consider a module to be, as we shall henceforth, a set of classes, then intra-modular coupling, that is, the coupling among the classes belonging to the module, can represent module cohesion. Complementarily, when we talk about module coupling we mean inter-modular couplings, that is, those that cross module borders. These may correspond to dependencies of internal classes (those belonging to the module) on external ones, or the other way round. Modularization can be flat
or hierarchical (i.e. modules containing other modules). An objective criterion for modularization should be made explicit in the software documentation. With hierarchical modularization that criterion can be different at each modularization level. Modularity is an internal quality characteristic that influences external software quality characteristics, as suggested in [ISO 9126], and it can be observed at different levels of abstraction [CON 90]. During requirements specification and detailed analysis, modules are usually black boxes that facilitate the dialog and understanding between domain experts and analysts. At design level, modularity is traditionally associated with the identification of subsystems and abstract data types [EMB 88]. Software components, which are usually built as an encapsulated set of interrelated classes, can also be seen as reusable modules [SZY 98]. At source code level, modules usually correspond to operating system files, allowing separate compilation and favoring incremental development. That is why these modules are often called compilation units. At executable code level, modularity also plays an important role, as with overlays or dynamically linked libraries. In this paper we will be mainly concerned with the design level for object-oriented systems. The object-oriented paradigm, along with the spreading availability of processing power, has allowed the conception of increasingly large and complex software systems. These must be developed and integrated modularly. Although the need to aggregate classes seems to be consensual, there is, however, a lack of terminological uniformity in the designation of those aggregates or clusters, as we will see later. In the C++ programming language, namespaces support modularization by providing a mechanism for expressing logical grouping [STR 97]. Packages are an important modularization mechanism in Java. They may contain any combination of interfaces (defining types) and implementations (classes) [GOS 96]. In Smalltalk development environments, such as Envy, there is also modularization support through the use of packages. During runtime, those packages are loaded in a specific order, starting with the kernel one, which defines all the primitives. In the Delphi language, an extension of Pascal for object-oriented programming, modules are called units [CAN 96]. Bertrand Meyer, the creator of the Eiffel language, emphasizes a modularization abstraction, the cluster, which is the basis of his Cluster Model [MEY 95]. In OMT (Object Modeling Technique) the modularization unit is called subsystem [RUM 91]. In Objectory the same denomination is used [JAC 92]. Grady Booch proposes the word category in his method [BOO 94]. Meilir Page-Jones talks about domains and sub-domains [PAG 95]. The word package is again used in UML (Unified Modeling Language) [BOO 97] and in the Catalysis approach, where it designates any container of developed artifacts [SOU 98, p. 18]. These kinds of modules can contain classes, component specifications, component
implementations, reusable frameworks, nested packages (hierarchical modularization) and other deliverable types.
In the above references we could only find qualitative indications of the need for modularization or, at most, some vague guidelines for grouping classes. We will show that the determination of an optimal modularization solution is feasible. This kind of information is obviously useful during the initial design phase. On the other hand, large software systems usually evolve incrementally from smaller ones. Although the initial architecture may have been acceptable, the evolution over time often causes modularity degradation, especially if the underlying criterion was not clearly enforced. At a certain point in time a modularity reengineering action will be required. Being able to assess the need for such action and to point out the optimal solution were some of the driving forces for the work presented herein. This paper is organized as follows. Section 2 introduces the problems faced in a quantitative approach to software modularization and proposes a theoretical framework to support it. Section 3 describes the methodological approach adopted in a large-scale experiment using the proposed framework. In Section 4 the data collected in the experiment is analyzed and discussed. Related and future work are identified in the last two sections.
2. The quantitative way 2.1. How much modularity? Ivar Jacobson mentions that an object module usually has 1 to 5 classes, although he reports on a system developed with Objectory where he had as many as 17 classes [JAC 92, p. 145]. On the other hand, Bertrand Meyer advocates that software systems should be divided into modules (clusters), typically with 5 to 40 classes each and developed by 1 to 4 people [MEY 95]. Also, according to him, that dimension should be such that the cluster plays an important role in the system, but is not so big that it hinders its understanding by just one person after some effort. Here, as in most references on the subject, the brief quantitative citations have no supporting evidence. In a well-known reference book, Meyer recognizes that the criteria, rules and principles of software modularity are usually introduced only through qualitative definitions, although some of them may be amenable to quantitative analysis [MEY 97, p. 65]. We hope to prove that in this contribution. The question then is: given a system, to what degree should you decompose it, that is, how many modules should you consider? This question leads to the following one: how disparate in size should the modules in a given system be? There is no answer to these questions unless we define a quantitative criterion for system sectioning. It is acceptable that the number of modules should somehow be proportional to the system size (e.g. expressed in number of classes).
However, such an assertion is not of great help since the proportionality constant is not known, nor is it likely that the module density (classes per module) should be uniform. At the coding level we often find the classes completely separate (e.g. one .h and one .cpp file per class in C++). If we consider the class to be the atomic part with which modularity is concerned, this situation can be seen as the radical expression of the following criterion to define the optimal modularization size: Strawman Criterion A - The number of modules should be the maximum allowed. Since the idea of modularization is that of grouping related items (and obviously splitting the unrelated ones), criterion A is not acceptable. On the other hand, if we consider the software engineering literature, we could easily be tempted to define another criterion: Strawman Criterion B - The number of modules should be the one that maximizes the coupling among classes within each module and minimizes the coupling for those belonging to different modules. Although appealing, this assertion is nothing but a fallacy: the above number is always 1 (one module). Indeed, for a given system, starting with the extreme situation of one class per module and whatever aggregation sequence is adopted, when we reduce the number of clusters by grouping classes, the coupling between the classes within each module increases monotonically. Meanwhile, the coupling among classes belonging to different modules decreases monotonically as well (in the limit, with only one module, it equals zero)! Criterion B therefore contradicts the idea of modularization itself.
2.2. How to aggregate classes in modules? When the number of modules is fixed we get to the problem of finding the optimal grouping of classes. In the Catalysis approach the authors recognize that it is important to restrict the module (package) dependencies [SOU 98]. Again we are in the presence of another qualitative statement. The authors do not explain how to achieve that reduction! The grouping problem is the aim of Cluster Analysis, a subject concerned with the classification of similar items into groups [KAU 90, ROM 90]. Cluster analysis techniques have been used in many areas such as economics, geology, botany, pattern matching and so on. The objective of clustering techniques is the grouping of items in such a way that the relations between items in the same group are stronger than the relations with items in other groups. In order to cluster a group of items, two things are required: some measure of the way each item relates to the others and a method to group them. The expression of
"how far" two items are, is known as the dissimilarity or distance between them. Dissimilarities can be obtained in several ways. They are often based on the values of variables that represent certain item properties. The dissimilarities between each pair of classes are usually summarized in a dissimilarities matrix, a square symmetric matrix with zeros on the main diagonal. Besides a distance measure, one also needs an algorithm to drive the clustering process, a clustering method. We have used seven well-known hierarchical agglomerative clustering methods: Single linkage, Complete linkage, Between groups linkage, Within groups linkage, Centroid, Median and Ward's methods [KAU 90]. Agglomerative methods start with all items in separate clusters and proceed in iterations joining them until the defined number of clusters is reached. In Table 1 we provide a brief description of the clustering methods we use. For all the previous methods, the clusters to be merged in each stage are the ones whose distance (as defined in the second column of Table 1) is the shortest. We also used the Ward clustering method, which is somewhat different from the others. It departs from the consideration that at each stage the loss of information that results from the grouping of individuals into clusters can be measured by the total sum of squared deviations of every point from the mean of the cluster to which it belongs. So, this method calculates, for each case, the squared Euclidean distance to the cluster means; these distances are summed for all the cases; at each step, the two clusters that result in the smallest increase in the overall sum of the squared withincluster distance are combined. Table 1. Clustering methods Method name
. . . the distance between their most remote pair of items (opposite of single linkage). ... the average of the distances between all pairs of individuals in the two groups.
Within-groups linkage
... the average of the distances between all pairs of cases in the cluster that would result if they were combined.
Centroid
... the distance between the group centroids, that is the distance between their means, for all of the items.
Median
. . . similar to the centroid but it weights equally the two groups to combine, in the calculation of the centroid.
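As an illustration of the clustering step, the following sketch (ours, not the tooling used in the experiment, which relied on SPSS) builds a modularization with a fixed number of modules from a dissimilarity matrix using SciPy's hierarchical clustering; the matrix values are made up:

    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster
    from scipy.spatial.distance import squareform

    # Hypothetical symmetric dissimilarity matrix for 5 classes (zeros on the diagonal).
    D = np.array([[0.0, 0.2, 0.9, 0.8, 0.7],
                  [0.2, 0.0, 0.8, 0.9, 0.6],
                  [0.9, 0.8, 0.0, 0.1, 0.5],
                  [0.8, 0.9, 0.1, 0.0, 0.4],
                  [0.7, 0.6, 0.5, 0.4, 0.0]])

    # linkage() expects the condensed (upper-triangular) form of the matrix.
    Z = linkage(squareform(D), method="single")  # also: complete, average, centroid, median, ward

    # Cut the dendrogram so that exactly NM clusters (modules) remain.
    NM = 2
    modules = fcluster(Z, t=NM, criterion="maxclust")
    print(modules)  # e.g. [1 1 2 2 2]: class i is assigned to module modules[i]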
Due to the nature of our problem - clustering classes in modules - we have to identify how to express the dissimilarities among the classes. Although the solution to this problem was pointed out long ago, based upon the cohesion (intra-modular coupling) and inter-modular coupling [STE 74], we have not yet seen it applied elsewhere in a quantitative way to object-oriented systems by using cluster analysis. Therefore we state the following modularization criterion: Woodenman Criterion C - Given a constant number of classes and modules, the aggregation of classes should be the one that maximizes the coupling among classes within each module and minimizes it for classes belonging to different modules. The coupling between two classes can be characterized by the number of coupling instances discriminated by type. Several authors have proposed coupling taxonomies in the realm of the OO paradigm [BRI 99, HIT 96, LOU 97, POE 98]. In [ABR 00] we have introduced another taxonomy with GOODLY in mind. GOODLY (a Generic Object Oriented Design Language? Yes!) is an intermediate formalism used to specify the design of systems built according to the object oriented paradigm [ABR 99]. The taxonomy used includes the following coupling categories: Direct Inheritance (DI), Class Parameter (CP), Attribute Type (AT), Employed Attribute (EA), Parameter in Operation (PO), Parameter in Message (PM), Parameter in Call (PC), Return in Operation (RO), Return in Message (RM), Return in Call (RC), Local Attribute in Operation (LA) and Message Recipient (MR). The more coupling instances there are between two classes, the stronger their interconnection strength, which we call affinity. We hypothesize that different coupling types may contribute differently to the affinity. We have used several schemes of combination of the available coupling information to derive different affinity values. These schemes were named Unweighted Binary (UB), Weighted Binary (WB), Unweighted Additive (UA), Weighted Additive (WA) and Unweighted Multiplicative (UM). Their exact mathematical definition can be found in Table 2.
Table 2. Affinity rating schemes

Scheme                          Definition
Unweighted Binary (UB)          A_UB(i,j) = Σ (cc = DI..MR) C_cc(i,j)
Weighted Binary (WB)            A_WB(i,j) = Σ (cc = DI..MR) a_cc × C_cc(i,j)
Unweighted Additive (UA)        A_UA(i,j) = Σ (cc = DI..MR) N_cc(i,j)
Weighted Additive (WA)          A_WA(i,j) = Σ (cc = DI..MR) a_cc × N_cc(i,j)
Unweighted Multiplicative (UM)  A_UM(i,j) = Π (cc = DI..MR) N_cc(i,j)
where: a_cc is a positive non-null weight associated with a given coupling category CC; C_cc(i,j) is a predicate with value 1 or 0 stating whether classes i and j are coupled by at least one coupling instance of type CC or not; and N_cc(i,j) is the number of instances of the CC coupling type between classes i and j. In the multiplicative scheme, only terms with a non-zero N_cc(i,j) value are considered. The weights used in some of these rating schemes (the "weighted" ones, obviously) vary in the range [1, 10] and are calculated with information extracted from the original solution. Each weight for a given coupling category is given by:
a_cc = 10                                                  if CI_ccIN ≠ 0 and CI_ccOUT = 0
a_cc = 1                                                   if CI_ccIN = 0 and CI_ccOUT = 0
a_cc = Round(0.5 + 10 × CI_ccIN / (CI_ccIN + CI_ccOUT))    otherwise

where CI_ccIN and CI_ccOUT are, respectively, the number of intra-modular and inter-modular coupling instances for that category in the original modularization
proposal. This technique partially preserves the original modularization criteria, at least as their developers perceived it. From the affinity values A(i, j), we calculate the dissimilarities D(i,j) among classes using the following transformation:
This standardization transformation guarantees that D(i,j) ∈ [0, 1]. Classes with low affinity have high dissimilarity and vice-versa. With couplings rated according to the proposed schemes, and using this transformation, we can produce 5 dissimilarity matrices, one for each distinct scheme. Since we use 7 clustering methods, we obtain 35 distinct modularization solutions for a given number of modules. If we vary the number of modules in the interval ]1, NC[, where NC is the total number of classes, then the number of configurations NS to evaluate, in order to select the best one, is given by: NS = 5 × 7 × (NC - 2) = 35 × (NC - 2).
For instance, a system with 100 classes would have 3430 alternatives. This takes us back to the problem of finding an adequate criterion to select the best among these alternatives.
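To make the rating and standardization steps concrete, here is a small sketch of one scheme (Unweighted Additive) and of the affinity-to-dissimilarity step; the exact normalization used for D(i,j) is an assumption of this sketch, and the coupling counts are invented:

    from itertools import combinations

    def affinity_ua(ncc_ij):
        # Unweighted Additive: sum the instance counts over all coupling categories.
        return sum(ncc_ij.values())

    def dissimilarities(classes, n_cc):
        # D(i, j) in [0, 1]; low affinity -> high dissimilarity (assumed D = 1 - A/Amax).
        A = {p: affinity_ua(n_cc.get(p, {})) for p in combinations(classes, 2)}
        a_max = max(A.values()) or 1
        return {p: 1.0 - a / a_max for p, a in A.items()}

    # Hypothetical coupling counts between three classes, by category (DI, AT, MR, ...).
    n_cc = {("C1", "C2"): {"DI": 1, "MR": 4}, ("C2", "C3"): {"AT": 1}}
    print(dissimilarities(["C1", "C2", "C3"], n_cc))
    # ("C1", "C3") has no couplings -> affinity 0 -> dissimilarity 1.0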
2.3. A dual decision modularization criterion Criterion C is incomplete since it implies that we already know how many clusters we should have. To solve the problem of finding the most adequate number of clusters, we herein propose the following quantitative modularization criterion: Ironman criterion D - The number of modules should maximize the Modularization Merit Factor. The Modularization Merit Factor, henceforth designated by the MMF acronym, is defined as:

MMF = ICD / AMM

where:
ICD = CCIN / CCTOT = Intra-modular Coupling Density
AMM = NC / NM = Average Module Membership
CCIN = Class Couplings within modules
CCOUT = Class Couplings between modules
CCTOT = CCIN + CCOUT = Total class couplings
NM = Number of Modules
NC = Number of Classes
The rationale for this metric is the following: 1. when NM and NC are held constant, the modularization will be better if the Intra-modular Coupling Density (ICD) increases; 2. when NC and ICD are held constant, the modularization will be better if the number of modules (NM) increases; in other words, we should split the system as much as possible, as long as ICD is not sacrificed (reduced); 3. when NM and ICD are held constant, the modularization will be worse when the number of classes increases, because we would then be increasing the average module membership without increasing ICD.
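A minimal sketch of the MMF computation, assuming MMF = ICD / AMM as given above and a module assignment expressed as a dictionary from class to module:

    def mmf(assignment, couplings):
        nc = len(assignment)                    # NC: number of classes
        nm = len(set(assignment.values()))      # NM: number of modules
        cc_in = sum(1 for a, b in couplings if assignment[a] == assignment[b])
        icd = cc_in / len(couplings)            # ICD = CCIN / CCTOT
        amm = nc / nm                           # AMM = NC / NM
        return icd / amm                        # in [0, 1], since ICD and 1/AMM are in [0, 1]

    assignment = {"A": 1, "B": 1, "C": 2, "D": 2}
    couplings = [("A", "B"), ("C", "D"), ("B", "C")]
    print(mmf(assignment, couplings))           # (2/3) / 2 = 0.333...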
Figure 1. MMF versus number of modules
Figure 1 shows the dependency of MMF and its components (intra-modular coupling density and the inverse of the average module membership) on the number of modules, for a fixed number of classes. The values were taken from one system of our sample described in Appendix A. The MMF curves for the other systems have similar shapes, all with an identifiable maximum. Notice that ICD ∈ [0, 1] and 1/AMM ∈ [0, 1]. Therefore, MMF is necessarily restricted to the interval [0, 1]. Criterion D does not account for the dispersion of the module sizes. If we simply used it to select the best solution, the distribution of the module sizes would be highly skewed, since the maximization of ICD would lead to a situation where we would have one module with a large membership and all the rest with very few classes, often only one. This configuration still guarantees that the average module
membership (AMM) is kept low, but does not consider its dispersion, as mentioned previously. Therefore, we add a second modularization criterion: Ironman criterion E - The dispersion of the module size in the system should be constrained. We have chosen a common dispersion measure, the standard deviation, and applied its formula considering that we are dealing with the complete population of modules. We then obtain the following formula, where CM_m is the number of classes in module m:

σ_CM = sqrt( (1/NM) × Σ (m = 1..NM) (CM_m - AMM)² )
To find the most adequate solution we are faced with a multiple decision criteria problem. Our approach to this problem was to apply first the MMF-based criterion to select a subset of the best modularization solutions (e.g. 10% of the generated alternatives) and then apply the second criterion to this subset to derive the proposed "best" solution. This order reflects our view that criterion D is more important than criterion E.
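The dual-criteria selection can then be sketched as follows (the 10% cut-off is the example fraction mentioned above; the data structures are assumptions of this sketch):

    import math
    from collections import Counter

    def size_dispersion(assignment):
        sizes = list(Counter(assignment.values()).values())     # CM_m for each module
        amm = sum(sizes) / len(sizes)                            # AMM
        return math.sqrt(sum((s - amm) ** 2 for s in sizes) / len(sizes))

    def pick_best(solutions, keep=0.10):
        # solutions: list of (mmf_value, assignment), one per method/scheme/NM combination.
        ranked = sorted(solutions, key=lambda s: s[0], reverse=True)
        shortlist = ranked[:max(1, int(len(ranked) * keep))]        # criterion D first
        return min(shortlist, key=lambda s: size_dispersion(s[1]))  # then criterion E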
3. The methodological approach 3.1. Hypothesis We hypothesized that current software systems are far from taking full advantage of the modularization mechanisms provided by OO design (a). The defined theoretical framework to guide the process of partitioning OO systems into clusters showed that this task is feasible. Our experiment aimed at demonstrating that, with appropriate tool support, this task is also easy to accomplish (b). All of the mentioned clustering methods and affinity rating schemes were tried. It is likely that the selection of a particular clustering method has an impact on the computed cluster organization (c). A similar reasoning can be applied to affinity rating schemes (d).
3.2. The experiment To evaluate these hypotheses we designed and conducted a controlled experiment. This experiment consisted basically of computing optimal
modularization solutions for a relatively large set of OO software systems and then evaluating how far the original modularization solutions were from them. While doing so, the feasibility and easiness hypothesis (b) was evaluated. After completing the computation of the MMFs for all the systems in our sample, we used data analysis techniques to discuss hypotheses (a), (c) and (d).
Figure 2. The MOTTO tool
The MOTTO tool was used to compute a range of the best values for MMF while applying criterion D and then, from that range, selecting the optimal value for MMF based on the application of criterion E. MOTTO (Modularization Trial Tool for Object Oriented systems) was developed at INESC and is used in conjunction with MOODKIT G2, described elsewhere [ABR 00], and SPSS, a commercial statistical package. MOODKIT G2 generates GOODLY code from either design models or code. The same tool produces the coupling relations information that serves as input for MOTTO. The latter then computes the dissimilarity matrices for the selected rating schemes and generates SPSS scripts for the requested cluster methods and for the given cluster number interval. An SPSS batch facility is then invoked by MOTTO to compute alternative clustering solutions. These are then used by MOTTO to generate matrices of MMF values (one for each cluster number) that are then used to apply the mentioned criteria to derive the optimal solution.
3.3. Experimental design 3.3.1. Data The input to this experiment consists of a sample of specifications of OO systems briefly described in Appendix A. The columns represent the system name, type (APPlication, LIBrary and ENVironment), the formalism from which the GOODLY code was generated, the counts of classes and couplings, and two other columns that will be explained later in this paper. However, the bulk of the input data consists of typified class coupling instances that are used to calculate the affinity between classes and, from those, the dissimilarities matrix required as input by the clustering methods. 3.3.2. Threats to internal validity A possible threat to this sort of experiment is the existence of unknown factors that may influence the MMF values. By conducting an experiment that is fully reproducible (in the sense that given the same inputs, i.e., the analyzed systems, the results of the experiment described are always the same) this risk is somewhat minimized. Nevertheless, there is a potential problem in the experiment design: all the original systems are converted from their source formalism to GOODLY. For instance, Smalltalk is a weakly typed language while GOODLY is a strongly typed one. To be able to fully capture the exact class couplings in the translation from Smalltalk to GOODLY would require a much more powerful type inference mechanism than the one available in MOODKIT G2. Any loss of information caused by the conversion may have an effect on the conclusions derived from the analysis of the GOODLY version of the specification. 3.3.3. Threats to external validity Since one goal is to generalize the results of this experiment, the representativeness of our sample is a main concern here. To mitigate this problem, we chose a relatively large sample comprising 21 OO systems, totaling 1224 classes. The following criteria were used while selecting the cases: - systems should have some diversity; this is achieved with specifications of different programming languages, different types (libraries, environments, or applications), different application domains and different sizes; - systems should be in use for several years or be produced in a recognized academic setting, so that their design structure would either have been carefully engineered or have had time to be refined by human experts. 3.3.4. Analysis strategy Our first step is to analyze some descriptive statistics on the computed merit factors. This is important as it allows us to choose adequate statistical tests to explore our hypotheses. Possible relationships between MMF and the configuration
options available for their computation (clustering methods and affinity rating schemes) are also investigated, using standard analysis of variance methods (hypotheses (c) and (d)). The distance between the proposed optimal solution and the implemented one is then computed. This allows us to test hypothesis (a).
3.4. Conducting the experiment The experiment consists of computing the MMFs for our systems sample. The experimental process is relatively straightforward as most tasks are fully automated by the MOTTO tool. To provide the reader with an idea of the time involved in computing the best MMF for a given system, here is an example: suppose an experienced MOTTO user wishes to compute the best MMF for a given combination of clustering method and weighting scheme. On a Pentium III @500MHz, with 256Mb of RAM and 1Gb of virtual memory, the times presented in Table 3 were registered for 3 different systems. These elapsed times include the minimum required user interaction with the tool. It should be noted that for these three examples the amount of human interaction with MOTTO is the same. It consists in selecting a specification and then launching in sequence all the production commands needed to compute the best MMF.

Table 3. MOTTO benchmark

System         Classes   Couplings   Elapsed time
Structure      14        713         00m:28s
Stix           110       4119        02m:50s
GNUSmallTalk   246       6934        11m:29s
4. Data analysis 4.1. Choosing a clustering method Seven clustering methods are used in this experiment. Different clustering methods may point to different clustering solutions and therefore to different MMFs. We can speculate that some methods lead to better solutions than others. As a null hypothesis (H0), we assume that there are no significant differences between the MMFs computed by all the methods. The alternative hypothesis (H1) is that there are significant differences for at least one of the methods.
We use a one-way analysis of variance (ANOVA) of the best MMFs obtained with each clustering method to test the null hypothesis. Table 4 presents the results of this test.

Table 4. ANOVA test for clustering methods

                  dof   Sum of Squares   Mean Square   F        Sig.
Between Groups    6     2.633            .439          47.936   .000
Within Groups     876   8.009            .009
Total             882   10.642

Table 4 gives us diagnostics concerning the between-groups variance and the within-groups variance. The first data column gives us the degrees of freedom (dof) connected with both the between-groups variance (number of groups - 1) and the within-groups variance (number of observations - number of groups). The sum of squares is presented next to the mean square. The statistic used to check the null hypothesis is the ratio between the mean square between groups and the mean square within groups. The high value of the F statistic allows us to reject the null hypothesis. The significance of the test tells us that the probability that these results occurred by chance is approximately zero. Therefore, we can reject H0 and accept H1: the clustering method chosen has an influence on the value of the MMF. A Tukey-HSD (Tukey's Honestly Significant Difference) test with a .050 significance level shows where the difference detected by the ANOVA really occurs (Table 5). The rows in Table 5 display the mean of the best MMF obtained with the corresponding clustering method within our sample of systems. Methods are grouped in homogeneous subsets (columns 1 through 4). There, we can observe that the SL (Single Linkage) method is the one that provides the best clustering solutions, according to MMF.
At the other extreme, CL (Complete Linkage) and WM (Ward Method) provide the least optimized solutions. The conclusion is that, if we were to select only one of the clustering methods based on the MMFs obtained with this sample of OO systems, then SL would be the preferred one. The SL method defines the distance between groups as the distance between their closest members. 4.2. Choosing an affinity rating scheme The MOTTO tool provides 6 different affinity rating schemes. We hypothesize that different schemes may lead to different clustering solutions. A similar approach to the one used with the clustering methods can shed some light on this issue. As a null hypothesis, we will assume the rating scheme has no influence on the MMF. The alternative hypothesis states that that influence exists.
Table 6 presents the results of the ANOVA test.

Table 6. ANOVA test for affinity rating schemes

                  df    Sum of Squares   Mean Square   F        Sig.
Between Groups    5     1.077            .215          19.734   .000
Within Groups     877   9.565            .011
Total             882   10.642
The F statistic value and its significance level confirm that we can reject our null hypothesis. As expected, the chosen rating scheme does have an influence on the selected cluster solutions and, consequently, on the MMF. A Tukey-HSD test (Table 7) shows that the binary schemes seem to lead to lower MMFs than their multiplicative and additive counterparts. Another interesting feature is that all the unweighted schemes had a slight advantage when compared to their weighted peers. Although the gap is relatively small, the whole purpose of including weights is to improve the performance of the rating schemes. In short, these results suggest that the weights need some calibration to become more effective.

Table 7. Tukey-HSD test for affinity rating schemes

Affinity Rating Scheme   Subset for alpha = .05
                         1        2
WB                       .235
UB                       .263
WM                                .299
WA                                .321
UM                                .324
UA                                .329
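For readers who wish to reproduce this kind of test outside SPSS, a one-way ANOVA over best-MMF values grouped by clustering method can be run, for instance, with SciPy; the sample values below are invented and are not the experiment's data:

    from scipy.stats import f_oneway

    best_mmf_by_method = {
        "SL": [0.42, 0.51, 0.47],   # hypothetical best-MMF samples, one list per method
        "CL": [0.28, 0.31, 0.25],
        "WM": [0.27, 0.30, 0.29],
    }
    f_stat, p_value = f_oneway(*best_mmf_by_method.values())
    print(f_stat, p_value)  # a small p-value rejects H0 (no difference between methods)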
4.3. Modularization usage level Earlier in this paper, we hypothesized that current software systems are far from taking full advantage of the modularization mechanisms provided by OO design. We computed the optimal MMF for our sample of software systems, OPT_MMF, and then defined IMP_MMF as the MMF computed for the implemented system. These values are presented in Appendix A. Figure 3 plots the IMP_MMF score against the optimal OPT_MMF, in each pair of vertical bars. The vertical scale represents the MMF score, which can theoretically vary between 0 and 1. We can observe that the large majority of the systems obtain a significantly lower score with IMP_MMF. About 20% of the implementations use a larger number of modules than the one required by the optimal modularization. This means that classes are scattered throughout an excessive number of modules, thus creating unnecessary inter-module coupling. The majority of systems, however, should be split into more modules. Indeed, about 80% of the implementations had some classes with low affinity stored in the same module. This reduces the cohesion within the modules.
Figure 3. Implemented versus optimal MMF
5. Related work 5.1. Cluster analysis in procedural systems Clustering techniques have been applied to system structure recovery and to the identification of abstract data types in the context of reengineering procedural systems. They derive similarity or dissimilarity measures based on the sharing of characteristics such as data structures or the existence of procedural calls. A study to evaluate module-generating techniques, to help reduce the scope of errors, is presented in [HUT 85]. Four types of data bindings are used there to determine how two procedures are related. Two dissimilarity measures are presented and the authors use single linkage and a variation of this method to do the clustering. In [SCH 91] a tool is presented for maintaining large software systems. The tool supports the grouping of related procedures into modules. The author proposes a similarity measure for procedures that is a function of the design information they share. A clustering algorithm is used to group related procedures into modules. The tool also allows finding procedures that seem to have been assigned to the wrong module. The approach presented in [KUN 96] proposes a measure to evaluate the design of distributed applications, with several communicating processes. The similarity measure between pairs of processes uses data types related to communication. The
author also applied hierarchical clustering algorithms for the automatic generation of process hierarchies. Cluster analysis was also used in [PER 98] to derive hierarchies of modules. Procedures are related using a dissimilarity measure that uses characteristics shared by the procedures. Several clustering algorithms were implemented in a tool that allows the specification of groups of characteristics to be considered in each analysis. None of the works mentioned refers to object-oriented software systems. 5.2. Modularity analysis in OO systems The evaluation of modularity in OO systems is a current research topic. In [POE 98] a modularity assessment technique based on the coupling strength of MERODE domain object classes is used. MERODE is an OO approach to model-driven development [SNO 98]. Two metrics for evaluating the modularity of OO frameworks, Variation points as Interfaces (VI) and Interface Coupling (IC), are proposed in [PRE 98]. None of those studies employs clustering techniques. 6. Future work This paper described the steps followed while using cluster analysis techniques and class-coupling information to obtain modularization solutions in the realm of object-oriented systems. The approach presented can be used both in the initial design phase and in the reengineering of object-oriented legacy systems, allowing the identification of ill-defined modularization situations and the proposal of alternative ones. Although cluster analysis techniques have been applied to procedural systems, we have not yet seen them applied in the context of OO systems. We hope this paper will contribute to foster this research area. We plan to apply other multiple decision criteria techniques, currently used in the operations research world, to the problem we described in this paper. The approach advocated here considers structural coupling only, and that may not match semantic coupling. We argue, but cannot prove here, that the latter causes the appearance of the former in coupling relations such as inheritance, aggregation or message passing. On the other hand, counter-example situations may arise. In a graphical widgets library, for instance, the Arc and Rectangle classes could be completely apart, but it could still make sense to include them in the same module since they are both two-dimensional geometric figures. Nevertheless, our approach is
completely generic and applicable as long as it is possible to express quantitatively the semantic association between each pair of classes. The work presented does not yet consider hierarchical modularization, which is a very important aspect when considering very large systems. Here we could argue that we could apply our approach in a stepwise manner, by first splitting the system into a given number of subsystems (e.g. fixing their number), and then applying the same approach again within each subsystem to identify nested packages. In the near future we intend to apply this nested approach, possibly using different criteria at each abstraction level.
7. References [ABR 99] F. B. ABREU, L. M. OCHOA, M. A. GOULAO, "The GOODLY Design Language for MOOD2 Metrics Collection", presented at the ECOOP Workshop on Quantitative Approaches in Object-Oriented Software Engineering, Lisboa, Portugal, 1999. [ABR 00] F. B. ABREU, G. PEREIRA, P. SOUSA, "A Coupling-Guided Cluster Analysis Approach to Reengineer the Modularity of Object-Oriented Systems", presented at the 4th European Conference on Software Maintenance and Reengineering (CSMR'2000), Zurich, Switzerland, 2000. [BOO 94] G. BOOCH, Object Oriented Analysis and Design with Applications, 2nd ed. Redwood City, LA, USA, The Benjamin Cummings Publishing Company Inc, 1994. [BOO 97] G. BOOCH, I. JACOBSON, J. RUMBAUGH, "UML Semantics", Rational Software Corporation Version 1.0, January 1997. [BRI 99] L. BRIAND, J. W. DALY, J. K. WUST, "A Unified Framework for Coupling Measurement in Object-Oriented Systems", IEEE Transactions on Software Engineering, Vol. 25, No. 1, 1999. [CAN 96] M. CANTU, Mastering Delphi 2 for Windows 95/NT, Sybex, 1996. [CON 90] L. L. CONSTANTINE, "Object-oriented and function-oriented software structure", Computer Language, Vol. 7, p. 34-56, January, 1990. [EMB 88] D. W. EMBLEY, S. N. WOODFIELD, "Assessing the Quality of Abstract Data Types Written in Ada", presented at the International Conference on Software Engineering (10th ICSE), 1988. [GHE 91] C. GHEZZI, M. JAZAYERI, D. MANDRIOLI, Fundamentals of Software Engineering. Englewood Cliffs, NJ, USA, Prentice Hall, 1991. [GOS 96] J. GOSLING, F. YELLIN, The Java Application Programming Interface, Vol. #1 (Core Packages)/#2 (Window Toolkit and Applets), Reading, Massachussets, USA, Addison-Wesley, 1996. [HIT 96] M. HITZ, B. MONTAZERI, "Measuring Coupling in Object-Oriented Systems", Object Currents, April, 1996.
[HUT 85] D. H. HUTCHENS, V. R. BASILI, "System Structure Analysis: Clustering with Data Bindings", IEEE Transactions on Software Engineering, Vol. 11, No. 8, p. 749-757, August, 1985. [ISO 9126] ISO9126, Information Technology - Software Product Evaluation - Software Quality Characteristics and Metrics, Geneva, Switzerland, ISO. [JAC 92] I. JACOBSON, M. CHRISTERSON, P. JONSSON, G. OVERGAARD, Object-Oriented Software Engineering - A Use Case Driven Approach, Reading, MA, USA/Wokingham, England, Addison-Wesley/ACM Press, 1992. [KAU 90] L. KAUFMAN, P. J. ROUSSEEUW, Finding Groups In Data: An Introduction To Cluster Analysis, John Wiley and Sons, 1990. [KUN 96] T. KUNZ, "Evaluating Process Clusters to Support Automatic Program Understanding", presented at the 4th Workshop on Program Comprehension, 1996. [LOU 97] H. LOUNIS, H. SAHRAOUI, W. MELO, "Defining, Measuring and Using Coupling Metrics in OO Environment", presented at the OOPSLA Workshop on Object Oriented Product Metrics, Atlanta, USA, 1997. [MEY 95] B. MEYER, Object Success - A Manager's Guide to Object Orientation, its Impact on the Corporation, and its Use for Reengineering the Software Process, Prentice Hall International, 1995. [MEY 97] B. MEYER, Object-Oriented Software Construction, 2nd ed., Upper Saddle River, NJ, USA, Prentice Hall PTR, 1997. [PAG 95] M. PAGE-JONES, What Every Programmer Should Know about Object-Oriented Design, New York, USA, Dorset House, 1995. [PER 98] G. M. PEREIRA, Reengenharia da Modularidade de Sistemas de Informacao, Departamento de Engenharia Electrotecnica e Computadores, Lisboa, Portugal, Master's thesis, IST/UTL, 1998. [POE 98] G. POELS, "Evaluating the Modularity of Model-Driven Object-Oriented Software Architectures", presented at the ECOOP Workshop on Techniques, Tools and Formalisms for Capturing and Assessing Architectural Quality in Object-Oriented Software, Brussels, Belgium, 1998. [PRE 98] P. PREDONZANI, G. SUCCI, A. VALERIO, "Object-oriented frameworks: architecture adaptability", presented at the ECOOP Workshop on Techniques, Tools and Formalisms for Capturing and Assessing Architectural Quality in Object-Oriented Software, Brussels, Belgium, 1998. [PRE 00] R. S. PRESSMAN, Software Engineering: A Practitioner's Approach (European Adaptation), 5th ed., McGraw-Hill Book Company, 2000. [ROM 90] H. C. ROMESBURG, Cluster Analysis for Researchers, Malabar, Florida, USA, Robert E. Krieger Publishing Company, 1990. [RUM 91] J. RUMBAUGH, M. BLAHA, W. PREMERLANI, F. EDDY, W. LORENSEN, Object-Oriented Modelling and Design, Englewood Cliffs, NJ, USA, Prentice Hall, 1991.
[SCH 96] S. R. SCHACH, Classical and Object-Oriented Software Engineering, 3rd ed., Burr Ridge, Illinois, USA, Richard D. Irwin, 1996. [SCH 91] R. W. SCHWANKE, "An Intelligent Tool For Re-engineering Software Modularity", presented at the 13th International Conference on Software Engineering (ICSE'91), 1991. [SNO 98] M. SNOECK, G. DEDENE, "Existence Dependency: The Key to Semantic Integrity Between Structural and Behavioral Aspects of Object Types", IEEE Transactions on Software Engineering, Vol. 24, No. 4, April, 1998. [SOM 00] I. SOMMERVILLE, Software Engineering, 6th ed., Addison-Wesley Longman, 2000. [SOU 98] D. F. D'SOUZA, A. C. WILLS, Objects, Components and Frameworks with UML: The Catalysis Approach, Reading, Massachusetts, Addison Wesley Longman, 1998. [STE 74] W. P. STEVENS, G. J. MYERS, L. L. CONSTANTINE, "Structured Design", IBM Systems Journal, Vol. 13, No. 2, p. 115-139, 1974. [STR 97] B. STROUSTRUP, The C++ Programming Language, 3rd ed., Reading, Massachusetts, USA, Addison-Wesley Publishing Company, 1997. [SZY 98] C. SZYPERSKI, Component Software: Beyond Object-Oriented Programming, New York, ACM Press/Addison-Wesley, 1998.
Appendix A

Specification    Type  Formalism  Classes  Total couplings  MMF cur  MMF opt
Allegro          APP   Eiffel     33       3534             0.061    0.535
  Port of the Allegro game programming library to the SmallEiffel Eiffel Compiler.
Bast             APP   SmallTalk  38       587              0.102    0.380
  Object-oriented framework for building fault-tolerant distributed applications.
Blox             APP   SmallTalk  42       1264             0.048    0.365
  GUI building block tool kit. It is an abstraction on top of the platform's native GUI toolkit that is common across all platforms.
BoochApp         APP   Eiffel     18       115              0.000    0.634
  Small application to test the BoochTower library.
BoochTower       LIB   Eiffel     130      2894             0.014    0.449
  Library of structure components such as bags, graphs, lists, stacks, strings, trees...
Canfield         APP   Eiffel     9        47               0.000    0.439
  A solitaire game.
Cxtnsn           APP   SmallTalk  20       26               0.154    0.364
  C based extensions to GNU Smalltalk.
dinamico         LIB   Delphi     12       52               0.018    0.373
  Abstract Data Types library.
Ems              APP   Eiffel     111      3370             0.022    0.436
  Lexical analyzer that takes a list of classes in input and returns the analysis to the standard output.
funcao3d         APP   Delphi     18       146              0.164    0.377
  3D functions viewer.
Gma              LIB   Delphi     46       139              0.231    0.407
  Graphics package developed in Delphi.
GNU_SmallTalk    ENV   SmallTalk  246      6934             0.013    0.408
  Implementation of Smalltalk-80 under GNU's public license.
Gobo             LIB   Eiffel     119      4563             0.034    0.467
  Eiffel libraries portable across various Eiffel compilers including: a Kernel Library, a Structure Library and an Utility Library.
GPCardGames      APP   SmallTalk  20       362              0.063    0.422
  Implementation of card games.
SIG_Container    LIB   Eiffel     47       1215             0.042    0.396
  Containers library for SIG Eiffel.
SIG_DateTime     LIB   Eiffel     20       929              0.153    0.549
  Date and Time library for SIG Eiffel.
SIG_Eiffel       ENV   Eiffel     79       7531             0.030    0.478
  SIG Eiffel programming environment.
SIG_Libraries    LIB   Eiffel     34       2509             0.035    0.452
  Libraries for SIG Eiffel.
Stix             APP   SmallTalk  110      4119             0.018    0.414
  SmallTalk Interface to the X protocol layer that underlies all of X Windows.
Structure        LIB   Eiffel     14       713              0.098    0.525
  A data structure library based on circular-linked-lists.
Yoocc            LIB   Eiffel     58       457              0.054    0.461
  Compiler-compiler that uses an extended parse library which derives from the ISE Eiffel parse library.

Legend: MMF cur - Current (observed) MMF; MMF opt - Optimal MMF
Chapter 3
Object-relational database metrics Mario Piattini and Coral Calero Dept of Computer Science, University of Castilla-La Mancha, Spain
Houari Sahraoui Dept d'Informatique et Recherche Operationelle, Universite de Montreal, Canada
Hakim Lounis Dept d'Informatique, Universite de Quebec, Montreal, Canada
1. Introduction Metrics for databases have been neglected by the metrics community ([SNE 98]). Almost all of the metrics proposed, from McCabe's famous cyclomatic number ([MCA 76]) until today, have been centered on measuring program complexity. However, in modern information systems (IS) databases have become a crucial component, so there is a need to propose and study measures to assess their quality. It is important that databases are evaluated for every relevant quality characteristic using validated or widely accepted metrics. These metrics could help designers to choose the most maintainable schema among semantically equivalent alternative schemata. Moreover, object-relational databases will replace relational systems to become the next great wave of databases ([STO 99]), so it is fundamental to propose metrics for controlling the quality of this kind of database. Database quality depends on several factors, one of which is maintainability ([ISO 94]). Maintenance is considered the most important concern for modern IS departments and requires greater attention by the software community ([FRA 92], [MCL 92], [PIG 97]). Maintainability is affected by understandability, modifiability and testability, which depend on complexity ([LI 87]). Three types of complexity can be distinguished ([HEN 96]): human factor complexity, problem complexity and product complexity. We focus our work on this last kind of complexity. We have put forward different measures (for internal attributes) in order to measure the complexity that affects the maintainability (an external attribute) of object-relational databases, which is useful for controlling their quality.
In this contribution we present, in Section 2, the framework used for metrics definition; the metrics proposed for object-relational databases appear in Section 3. In Section 4 we present the formal verification of some of the metrics. We discuss two experiments carried out to validate our metrics in Section 5; both experiments and the results obtained for each are described there. Finally, conclusions and future work are discussed in the last section.
2. A framework for developing and validating database metrics As stated previously, our goal is to define metrics for controlling object-relational database maintainability, through metrics that capture complexity. But metrics definition must be made in a methodological way; it is necessary to follow a number of steps to ensure the reliability of the proposed metrics. Figure 1 presents the method we apply for the metrics proposal. The first step is the metrics proposal. This step must be made taking into account the specific characteristics of object-relational databases and the experience of database designers and administrators of these databases. One methodological way to create the metrics proposal is by following the Goal-Question-Metric (GQM) approach. This approach is based on the fact that any metric can be defined following a top-down design with three levels: the conceptual level (Goal), where the objectives are defined; the operational level (Question), where the questions are formulated; and the quantitative level (Metric), where the metrics are defined. In this way, the goal is refined into a set of questions and every question is refined into a set of metrics.
Figure 1. Steps followed in the definition and validation of the database metrics
It is also important to validate the metrics from a formal point of view in order to ensure their usefulness. Several frameworks for measure characterization have been proposed. Some of them ([BRI 96], [WEY 88], [BRI 97]) are based on axiomatic approaches. The goal of these approaches is merely definitional, proposing formally desirable properties for measures of a given software attribute, so the axioms must be used as guidelines for the definition of a measure. Others ([ZUS 98]) are based on measurement theory, which specifies the general framework in which measures should be defined. However, research is needed into aspects of software measurement ([NEI 94]), both from a theoretical and from a practical point of view ([GLA 96]). So it is necessary to perform experiments to validate the metrics. Empirical validation can be used to investigate the association between proposed software metrics and other indicators of software quality such as maintainability ([HAR 98]). So, the goal is to prove the practical utility of the proposed metrics. There are many ways to do so, but basically we can divide empirical validation in two: experimentation and case studies. Experimentation is usually carried out using controlled experiments, while case studies usually work with real data. Both of them are necessary: the controlled experiments for a first approach and the case studies for reinforcing the results. In both cases, the results are analyzed using either statistical tests or advanced techniques. Also, it is necessary to replicate the experiments, because with isolated experimental results it is difficult to understand how widely applicable the results are and, thus, to assess the true contribution to the field ([BAS 99]). As we can see in Figure 1, the process of defining and validating database metrics is evolutionary and iterative. As a result of the feedback, the metrics could be redefined or discarded, depending on the theoretical, empirical or psychological validations. In the rest of this paper we will demonstrate the different steps of the framework applied to obtain metrics for object-relational databases.
3. Object-relational metrics definition One of the problems of relational databases is related to their representational limitations (complex elements which are present in several domains, like graphics or geography, are hard to represent). On the other hand, object-oriented (OO) databases are not mature enough to be widely accepted, and it is really difficult to convert relational specialists and to convince managers to adopt this new paradigm with all the possible risks involved. From this point of view, the object-relational paradigm proposes a good compromise between both worlds. Object-relational databases combine traditional database characteristics (data model, recovery, security, concurrency, high-level language, etc.) with object-oriented principles (e.g. encapsulation, generalization, aggregation, polymorphism, etc.). These products offer the possibility of defining
classes or abstract data types, in addition to tables, primary and foreign keys and constraints (1), as do relational databases. Furthermore, generalization hierarchies can be defined between classes (super and subclasses) and between tables, subtables and supertables. Table attributes can be defined over a simple domain, e.g. CHAR(25), or over a user-defined class, such as a complex number or an image. In Figure 2 we present an example of two object-relational table definitions. In this example we can notice that part of the data is expressed using relational concepts (tables, primary and foreign keys and references) and the other part using OO concepts (types and methods). The richness of the resulting model somewhat increases its complexity ([STO 99]). For this reason it is very important to have metrics that allow the complexity of this kind of database to be controlled.

CREATE TABLE subs(
    idsubs INTEGER,
    name VARCHAR(20),
    subs_add address,
    PRIMARY KEY (idsubs));

CREATE TYPE address AS(
    street CHAR(30),
    city CHAR(20),
    state CHAR(2),
    zip INTEGER);

Figure 2. Example of table definition in SQL:1999

(1) In this first approximation, constraints are not considered for measurement purposes.
For this kind of database we can propose table-level metrics (when we apply the metrics to a single table) and schema-level metrics (when the metrics are applied to the whole schema).
3.1. Table level metrics At the table level we propose, T being a table, the metrics DRT(T), RD(T), PCC(T), NIC(T), NSC(T) and TS(T), defined as follows:
- DRT(T) metric. Depth of Relational Tree of a table T (DRT(T)) is defined as the longest referential path between tables, from the table T to any other table in the database schema.
- RD(T) metric. Referential Degree of a table T (RD(T)) is defined as the number of foreign keys in the table T.
- PCC(T) metric. Percentage of complex columns of a table T.
- NIC(T) metric. Number of Involved Classes. This measures the number of all classes that compose the types of the complex columns of T, using the generalization and the aggregation relationships.
- NSC(T) metric. Number of Shared Classes. This measures the number of involved classes for T that are also used by other tables.
- TS metric. The Table Size metric is defined as the sum of the total size of the simple columns (TSSC) and the total size of the complex columns (TSCC), each of which can be a class or a user-defined type (UDT), in the table:

TS = TSSC + TSCC

We consider that all simple columns have a size equal to one; the TSSC metric is then equal to the number of simple attributes in the table (NSA): TSSC = NSA. The TSCC is defined as the sum of the sizes of the complex columns (CCS), NCC being the number of complex columns in the table:

TSCC = Σ (i = 1..NCC) CCS(i)

The value of CCS is obtained by:

CCS = SHC / NCU

SHC being the size of the hierarchy above which the column is defined and NCU the number of columns defined above this hierarchy. This expression arises from the fact that understandability is lower if more than one column is defined above
the same class. If the number of columns defined over a class is greater than one, the complexity attributable to this class decreases (with respect to each column, though not for the columns as a whole), and this fact must be acknowledged when we calculate the complexity of a class. SHC may be defined as the sum of the sizes of the classes in the hierarchy (SC), NCH being the number of classes in the hierarchy. The size of a class is defined from SAC, SMC and NHC, SAC being the sum of the attribute sizes of the class, SMC the sum of the method sizes of the class, and NHC the number of hierarchies to which the class pertains. The attributes of a class may themselves be simple or complex (a complex attribute being a class or a UDT); SAC is then defined as the sum of the simple attribute sizes (SAS; simple attributes have size equal to one, so this term corresponds to the number of simple attributes) and the complex attribute sizes (CAS) in the class, and SMC is calculated with the version of McCabe's cyclomatic complexity given in ([LI 93]), NMC being the number of methods in the class.
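In symbols (again a reconstruction; in particular, the treatment of NHC in SC is an assumption based on the symbols named in the text):

SHC = \sum_{i=1}^{NCH} SC_i
SC = (SAC + SMC) / NHC
SAC = SAS + CAS
SMC = \sum_{i=1}^{NMC} CC(m_i)

where CC(m_i) denotes the ([LI 93]) cyclomatic complexity of method m_i.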
3.2. Schema level metrics

At the schema level, we can apply the following metrics:
- DRT metric. Depth of Referential Tree, defined as the length of the longest referential path between tables in the database schema.
- RD metric. Referential Degree, defined as the number of foreign keys in the database schema.
- PCC metric. Percentage of complex columns in the database schema.
- NIC metric. Number of Involved Classes: the number of all classes that compose the types of the complex columns, using the generalization and aggregation relationships, over all tables in the schema.
- NSC metric. Number of Shared Classes: the number of classes shared by tables of the schema.
- SS metric. Size of a Schema, defined as the sum of the table sizes (TS) in the schema, NT being the number of tables in the schema.
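In symbols (reconstructed from the definition above):

SS = \sum_{i=1}^{NT} TS_i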
3.3. Example

We present the values of the different metrics for the example presented in Figure 2. Let us assume that the date type has a size equal to one. We can calculate the size values for the address and location classes as follows.
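A reconstruction of this calculation (the location type is not shown in the fragment of Figure 2 reproduced above, so its attribute count is inferred from Table 1): the address class has four simple attributes and no methods, so SC(address) = SHC(address) = 4; since two columns (subs_add and emp_add) are defined over it, CCS = 4 / 2 = 2 for those columns. The location class is shared by the columns dep_loc and emp_loc; their value 1.5 in Table 1 is consistent with SHC(location) = 3 and NCU = 2, i.e. CCS = 3 / 2 = 1.5.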
With these values we can obtain the column sizes shown in Table 1 for each table.

Table 1. Size for each column

Table       Column name   Column type   Column size
SUBS        idsubs        Simple        1
            name          Simple        1
            subs_add      Complex       2
DEP         iddep         Simple        1
            name          Simple        1
            dep_loc       Complex       1.5
            budget        Simple        1
SUBS_DEP    idsubs        Simple        1
            iddep         Simple        1
EMPLOYEE    idemp         Simple        1
            name          Simple        1
            emp_date      Simple        1
            emp_loc       Complex       1.5
            emp_add       Complex       2
            manager       Simple        1
            dep           Simple        1
With these data, we obtain the following values for the table size metric:
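These values follow directly from the column sizes in Table 1:

TS(SUBS) = 1 + 1 + 2 = 4
TS(DEP) = 1 + 1 + 1.5 + 1 = 4.5
TS(SUBS_DEP) = 1 + 1 = 2
TS(EMPLOYEE) = 1 + 1 + 1 + 1.5 + 2 + 1 + 1 = 8.5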
The other metrics for the tables are summarized in Table 2.

Table 2. Metric values for the example of Figure 2

        SUBS    DEP     SUBS_DEP    EMPLOYEE
TS      4       4.5     2           8.5
RD      0       0       2           2
DRT     0       0       1           2
PCC     33%     25%     0%          28.57%
NIC     1       1       0           2
NSC     1       1       0           2
4. Object-relational metrics formal verification

As we have said previously, it is important to validate metrics from a formal point of view in order to ensure their usefulness. There are two main approaches to doing so: axiomatic approaches (whose goal is essentially definitional, proposing formally desirable properties for measures) and formal frameworks based on measurement theory, which specify the general framework in which measures should be defined. The strength of measurement theory is the formulation of empirical conditions from which we can derive hypotheses about reality. Measurement theory gives clear definitions of terminology, a sound basis for software measures, criteria for experimentation, conditions for the validation of software measures, foundations of prediction models, empirical properties of software measures, and criteria for measurement scales. In this section we present the formal verification of the TS, RD and DRT metrics within the formal framework proposed by Zuse ([ZUS 98]) and based on measurement theory. All the information related to this framework can be found in ([ZUS 98]).
For our purposes, the Empirical Relational System could be defined as:
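A reconstruction of this definition, following the standard form of an empirical relational system in Zuse's framework (the notation matches that used below):

(R, • >=, o)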
Where R is a non-empty set of relations (tables), • >= is the empirical relation "more or equal complex than" on R and o is a closed binary (concatenation) operation on R. In our case we will choose natural join as the concatenation operation. Natural join is defined generally as ([ELM 99]):
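A reconstruction of this definition (the exact symbols, with * denoting natural join, are an assumption based on the notation of [ELM 99]):

Q <- R *<list1>, <list2> S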
Where <list1> specifies a list of i attributes of R and <list2> is a list of i attributes of S. These lists are used to build the equality comparison conditions between pairs of attributes; these conditions are then connected with the AND operator. Only the list corresponding to the relation R is preserved in Q. Depending on the characteristics of the combined tables, natural join can degenerate into a Cartesian product. Furthermore, it is possible to make the natural join through a foreign key-primary key pair or between any columns of two tables defined over the same domain. All these characteristics of the natural join will be useful in order to design the combination rule of the metrics.
TS metric formal verification

The TS (Table Size) measure is a mapping TS: R -> ℜ such that the following holds for all relations Ri and Rj ∈ R: Ri • >= Rj <=> TS(Ri) >= TS(Rj). In order to obtain the combination rule for TS when we combine tables by natural join, we may reason as follows: if the combined tables have no common columns, the attributes of the resulting table are the union of the attributes of the two combined tables and its size is the sum of the attribute sizes; but if the tables have common columns, the size of the resulting table is the sum of the attribute sizes minus the size of the duplicated simple columns (by definition we subtract only the simple column sizes, because the size of a complex column already reflects whether the hierarchy over which the column is defined is shared by more than one column). So, we can define the combination rule for TS as:
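A reconstruction from the reasoning above (o denotes the natural join concatenation):

TS(Ri o Rj) = TS(Ri) + TS(Rj) - SASC(Ri ∪ Rj)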
Where SASC(Ri ∪ Rj) is the size of the common simple attributes of Ri and Rj. We can rename this last term v (v being a variable) and define the combination rule for TS as:
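That is (the same reconstruction, with v substituted):

TS(Ri o Rj) = TS(Ri) + TS(Rj) - v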
TS fulfils the first axiom of weak order, because if we have two relations R1 and R2, it is obvious that TS(R1) >= TS(R2) or TS(R2) >= TS(R1) (completeness); and given
three relations R1, R2 and R3, transitivity is always fulfilled: if TS(R1) >= TS(R2) and TS(R2) >= TS(R3), then TS(R1) >= TS(R3). TS does not fulfil positivity, because if we combine a relation R1 with itself without cycles, TS(R1 o R1) is not greater than TS(R1). But it fulfils weak positivity, because it is always true that TS(R1 o R2) >= TS(R1) for all R1, R2 ∈ R. TS fulfils associativity and commutativity (axioms 3 and 4), because the natural join operation is associative and commutative. TS does not fulfil weak monotonicity: if we have two tables (R1 and R2) with the same number of attributes and the same size, and we combine each of them with a third table (R3) that has one attribute in common with the first table (R1) and none in common with the second table (R2), the table resulting from R1 o R3 will have a smaller size than the table resulting from R2 o R3. Because the number of attributes varies when we combine one table with itself, we can conclude that the metric is not idempotent, and it is necessary to check the Archimedean axiom. In order to prove that the Archimedean axiom is not fulfilled, it is important to observe that when two tables are combined by natural join successively, the number of attributes and the size vary; moreover, the tables obtained in successive concatenations are the same as the one obtained in the first concatenation. Then, suppose we have four tables R1, R2, R3 and R4, where R3 has three attributes and a size equal to three, R4 has two attributes and a size equal to two, R1 has three attributes (one of them in common with R3) and a size equal to three, and R2 has four attributes and a size equal to four. If we make the concatenation R3 o R1 (which is equal to the concatenation R3 o R1 o R1 o ...), we obtain a table with five attributes and a size equal to five; and if we make the concatenation R4 o R2 (which is equal to the concatenation R4 o R2 o R2 o ...), we obtain a table with six attributes and a size equal to six. The Archimedean axiom is therefore not fulfilled, so the measure TS does not assume an extensive structure. Does TS verify the independence conditions? As we have seen, the metric does not fulfil the axiom of weak monotonicity, and therefore it cannot fulfil the independence conditions either. In fact, this type of combination rule does not assume the independence conditions: the part -v violates condition C1, which implies the rejection of the axiom of weak monotonicity, of monotonicity and of the extensive structure. We must then consider whether TS fulfils some of the modified relations of belief. MRB1 is fulfilled because, given two relations R1 and R2 ∈ ℜ (ℜ being the set of all possible relations that can be built with the attributes of the relational schema), TS(R1) >= TS(R2) or TS(R2) >= TS(R1).
MRB2 is also fulfilled (transitivity of natural join). For MRB3 we consider that a relation R1 ⊇ R2 if all the attributes of R2 are present in R1. In this case it is evident that TS(R1) >= TS(R2), and MRB3 is fulfilled. MRB4 is fulfilled because if a relation R1 ⊃ R2 then TS(R1) > TS(R2) and TS(R1 ∪ R3) > TS(R2 ∪ R3), with R1 ∩ R3 = ∅. If the relations R1 and R3 do not have any attribute in common, then adding the attributes of R3 to both R1 and R2 (R1 subsuming R2) means that the number of attributes of R1 and R3 together is greater than the number of attributes of R2 and R3 together, and so is their size. MRB5 is fulfilled because a relation must always have zero or more attributes; its size is therefore equal to or greater than zero. In summary, we can characterize TS as a measure above the level of the ordinal scale, assuming the modified relation of belief. The validation of the other metrics can be carried out following the same steps: defining the combination rule for the metric and proving the different properties in order to obtain the appropriate scale for the metric.
5. Object-relational metrics empirical validation

In this section, we present the experiment developed in order to evaluate whether the proposed measures can be used as indicators for estimating the maintainability of an OR database.
5.1. Data Collection

Five object-relational databases were used in this experiment, with an average of 10 relations per database (ranging from 6 to 13). These databases were originally relational; for the purpose of the experiment, they were redesigned as OR databases. A brief description of these databases is given in Table 3.
Table 3. Databases used in the experiment

Database    Number of tables    Average attributes/table    Average complex attributes/table
Airlines    6                   4.16                        1.83
Animals     10                  2.7                         0.6
Library     12                  2.91                        0.75
Movies      9                   4.33                        0.88
Treebase    13                  3.46                        0.86
Five people participated in the experiment the first time we conducted it (the Canadian experiment): one researcher, two research assistants and two graduate students. All of them are experienced in both relational databases and object-oriented programming. In this first experiment, one person did not complete the experiment and we had to discard his partial results. So, in the replication (the Spanish experiment), only four people took part, all of them also experienced in both relational databases and object-oriented programming. The people were given a form which included, for each table, a triplet of values to compute using the corresponding schema. These values are those of the three measures TS, DRT and RD. Our idea is that, in order to compute these measures, one needs to understand the subschema (objects and relations) defined by the table concerned. A table (and hence the corresponding subschema) is easy to understand if (almost) all the people find the right values of the metrics in a limited time (2 minutes per table). Since we wanted to measure understandability, we decided to give our subjects a limited time to finish the tests and then to use all the tests that had been answered within the given time and correctly (following all the indications given for the development of the experiment). Our study therefore focuses on the number of metrics correctly calculated. Formally, a value 1 is assigned to the maintainability of a table if at least 10 of the 12 measures are computed correctly in the specified time (4 people and 3 measures); a value 0 is assigned otherwise. The tables were given to the people in random order, not grouped by database.
5.2. Validation Technique

To analyze the usefulness of the proposed metrics, we used two techniques: C4.5 ([QUI 93]), a machine learning algorithm, and RoC ([RAM 99]), a robust Bayesian classifier. C4.5 belongs to the family of divide-and-conquer algorithms. In this family, the induced knowledge is generally represented by a decision tree. The principle of this approach can be summarized by the following algorithm:
If the examples are all of the same class
Then
    - create a leaf labelled by the class name;
Else
    - select a test based on one attribute;
    - divide the training set into subsets, each associated with one of the possible values of the tested attribute;
    - apply the same procedure to each subset;
Endif.

The key step of the algorithm above is the selection of the "best" attribute so as to obtain compact trees with high predictive accuracy. Information theory-based heuristics have provided effective guidance for this division process. C4.5 induces classification models, also called decision trees, from data. It works with a set of examples where each example has the same structure, consisting of a number of attribute/value pairs. One of these attributes represents the class of the example. The problem is to determine a decision tree that correctly predicts the value of the class attribute (i.e., the dependent variable) based on answers to questions about the non-class attributes (i.e., the independent variables). In our study, the C4.5 algorithm partitions continuous attributes (the database metrics), finding the best threshold among the set of training cases to classify them on the dependent variable (i.e. the understandability of the database schemas). RoC is a Bayesian classifier. It is trained by estimating the conditional probability distribution of each attribute, given the class label. The classification of a case, represented by a set of values for each attribute, is accomplished by computing the posterior probability of each class label, given the attribute values, using Bayes' theorem. The case is then assigned to the class with the highest posterior probability. The simplifying assumptions underpinning the Bayesian classifier are that the classes are mutually exclusive and exhaustive and that the attributes are conditionally independent once the class is known. RoC extends the capabilities of the Bayesian classifier to situations in which the database reports some entries as unknown: it can then train a Bayesian classifier from an incomplete database. One of the great advantages of C4.5 compared with RoC is that it produces a set of rules that are directly understandable by software managers and engineers.
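To make the divide-and-conquer scheme concrete, the following minimal Python sketch (ours, not the tool's implementation) induces a tree over already-discretised attributes using plain information gain; C4.5 itself additionally thresholds continuous attributes and uses the gain-ratio criterion, and all attribute names and example values below are hypothetical.

import math
from collections import Counter

def entropy(examples, class_attr):
    # Shannon entropy of the class distribution in a set of examples
    counts = Counter(e[class_attr] for e in examples)
    total = len(examples)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def information_gain(examples, attr, class_attr):
    # Entropy reduction obtained by splitting on the given attribute
    total = len(examples)
    remainder = 0.0
    for value in {e[attr] for e in examples}:
        subset = [e for e in examples if e[attr] == value]
        remainder += len(subset) / total * entropy(subset, class_attr)
    return entropy(examples, class_attr) - remainder

def build_tree(examples, attrs, class_attr):
    classes = {e[class_attr] for e in examples}
    if len(classes) == 1:            # all examples of the same class -> leaf
        return classes.pop()
    if not attrs:                    # no attribute left to test -> majority-class leaf
        return Counter(e[class_attr] for e in examples).most_common(1)[0][0]
    best = max(attrs, key=lambda a: information_gain(examples, a, class_attr))
    tree = {}
    for value in {e[best] for e in examples}:   # one subtree per attribute value
        subset = [e for e in examples if e[best] == value]
        tree[(best, value)] = build_tree(
            subset, [a for a in attrs if a != best], class_attr)
    return tree

# Hypothetical, already discretised training examples (class attribute: "maint")
training = [
    {"TS": "low",  "DRT": "low",  "maint": 1},
    {"TS": "low",  "DRT": "high", "maint": 0},
    {"TS": "high", "DRT": "low",  "maint": 0},
]
print(build_tree(training, ["TS", "DRT"], "maint"))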
5.3. Results

As specified in the validation technique section, we applied RoC and C4.5 to evaluate the usefulness of the OR metrics in estimating the maintainability of the tables in an OR schema.
5.3.1. RoC technique

Using the cross-validation technique, the RoC algorithm was applied 10 times to the 50 examples obtained from the 50 tables of the five schemas (500 cases). 369 cases were correctly estimated in the Canadian experiment (accuracy 73.8%) and 407 cases in the Spanish one (accuracy 81.4%); all the other cases in both experiments were misclassified. Contrary to C4.5, RoC does not propose a default classification rule that would guarantee coverage of all the proposed cases. However, in this experiment it succeeded in covering all 500 cases (coverage of 100%). These results are summarized in Table 4.

Table 4. RoC quantitative results with data from Spain and from Canada

                   Spain      Canada
Correct:           407        369
Incorrect:         93         131
Not classified:    0          0
Accuracy:          81.4%      73.8%
Coverage:          100.0%     100.0%
RoC produces the model presented in Figure 3 with the Canadian data. From this model, it is hard to say in an absolute manner which metric is more relevant than another. However, we can notice that when TS is smaller, the probability that the table is understandable is higher (for example 55% for TS <= 3). This probability decreases when the table size increases (9.5% for TS > 10). Conversely, the same probability increases when estimating the tables that are not understandable (varying from 13.6% for TS <= 3 to 33.6% for TS > 10).
Figure 3. The model generated by RoC with data from Canada
For DRT and RD, it is hard to draw a conclusion, since no uniform variation is shown. This can be explained by the fact that, for the sample used in this experiment, the values of DRT and RD lie in narrow ranges ([0, 3] and [0, 5]). RoC produces the model presented in Figure 4 with the Spanish data. The conclusions from this second model are the same as for the first one, because the models are very similar.
Figure 4. The model generated by RoC with data from Spain
5.3.2. C4.5 technique

The results obtained for the Canadian experiment are shown in Table 5. The C4.5 model was very accurate in estimating the maintainability of a table (accuracy of 94%) and shows a high level of completeness (up to 100% for non-understandable tables) and correctness (up to 100% for understandable tables).

Table 5. C4.5 quantitative results from the Canadian experiment

                           Predicted 0    Predicted 1    Completeness
Real maintainability 0     28             0              100%
Real maintainability 1     3              19             86.36%
Correctness                90.32%         100%
Accuracy = 94%
And the rules obtained with C4.5 are:

Rule 1: TS <= 9 ∧ DRT = 0 ∧ NSC = 0 -> class 1                [84.1%]
Rule 2: TS <= 3 ∧ RD > 1 -> class 1                           [82.0%]
Rule 7: TS <= 9 ∧ DRT <= 2 ∧ NIC > 0 ∧ NSC = 0 -> class 1     [82.0%]
Rule 5: TS > 9 -> class 0                                     [82.2%]
Rule 6: DRT > 2 -> class 0                                    [82.0%]
Default class: 0
Figure 5. C4.5 estimation model from the Canadian data
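For illustration only (not part of the original study), the rules listed above can be encoded as a small Python function; the ordering of the checks, with the class-1 rules tried first and the default class last, is one plausible reading of the rule list.

def classify_canadian(ts, drt, rd, nic, nsc):
    # Returns 1 (understandable/maintainable) or 0, following the rules above;
    # argument names mirror the metrics TS, DRT, RD, NIC, NSC.
    if ts <= 9 and drt == 0 and nsc == 0:                  # Rule 1 [84.1%]
        return 1
    if ts <= 3 and rd > 1:                                 # Rule 2 [82.0%]
        return 1
    if ts <= 9 and drt <= 2 and nic > 0 and nsc == 0:      # Rule 7 [82.0%]
        return 1
    if ts > 9:                                             # Rule 5 [82.2%]
        return 0
    if drt > 2:                                            # Rule 6 [82.0%]
        return 0
    return 0                                               # Default class: 0

# The EMPLOYEE table of Table 2 (TS=8.5, RD=2, DRT=2, NIC=2, NSC=2) matches none
# of the class-1 rules and neither class-0 rule, so it falls to the default class 0.
print(classify_canadian(ts=8.5, drt=2, rd=2, nic=2, nsc=2))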
TS seems to be an important indicator of the maintainability of the tables. Rules 1, 2 and 7, which determine whether a table is maintainable, all state as part of their conditions that TS must be small. Conversely, rule 5 states that a large size is sufficient to declare the table not understandable. A small DRT is also required by rules 1 and 7 as a partial condition for classifying a table as understandable; at the same time, a high value of DRT means that the table is hard to understand (rule 6). RD does not appear to be an interesting indicator. The results obtained for the Spanish experiment are shown in Table 6. In this case the accuracy in estimating the maintainability was 94%; the levels of completeness and correctness were lower than in the Canadian experiment, but still very high.

Table 6. C4.5 quantitative results from the Spanish experiment
And the rules obtained with C4.5 are:

Rule 1: TS <= 5 ∧ DRT <= 2 -> class 1      [89.4%]
Rule 3: TS > 5 ∧ PCC <= 66 -> class 0      [82.3%]
Rule 2: DRT > 2 -> class 0                 [66.2%]
Default class: 1
Figure 6. C4.5 estimation model from the Spanish data

The rule set is smaller than that of the first experiment, but it confirms that (at least for the sample studied) TS and DRT are good indicators, whereas RD is not. Both experiments and both techniques establish that the table size metric (TS) is a good indicator of the maintainability of a table. The depth of the referential tree metric (DRT) is also identified as an indicator by C4.5 in both experiments, whereas the referential degree metric (RD) does not seem to have a real impact on the maintainability of a table.
6. Conclusions and future work

It is important that software products, and obviously databases, are evaluated for all relevant quality characteristics, using validated or widely accepted metrics. However, more research is needed into the aspects of software measurement ([NEI 94]), both from a theoretical and from a practical point of view ([GLA 96]). We think it is very useful to have metrics available for object-relational databases. These metrics can be used to flag outlying schemata for special attention; a strong requirement for low testing and maintenance costs would justify extra managerial attention for a quite significant fraction of object-relational database schemata. We have put forward different proposals (for internal attributes) in order to measure the complexity that affects the maintainability (an external attribute) of object-relational database schemata and consequently to control their quality. These metrics were developed and characterized in accordance with a set of sound measurement principles, applying the formal framework proposed by Zuse ([ZUS 98]), in order to obtain the scales to which the metrics pertain. We have carried out experiments to validate the proposed metrics, and others are being developed at this moment. However, controlled experiments have problems (such as the large number of variables that cause differences, or the fact that these
experiments deal with low-level issues, microcosms of reality and small sets of variables) and limits (e.g. they do not scale up, are performed in a class in training situations, are made in vitro and face a variety of threats to validity). Therefore, it is advisable to run multiple studies, mixing controlled experiments and case studies ([BAS 99]). For these reasons, a deeper empirical evaluation is under way in collaboration with industrial and public organizations in "real" situations.
7. References

[AND 83] ANDERSON, J.R. (1983), The Architecture of Cognition, Cambridge, MA, Harvard University Press.
[BAS 99] BASILI, V.R., SHULL, F., LANUBILE, F. (1999), "Building Knowledge through Families of Experiments", IEEE Transactions on Software Engineering, July/August, No. 4, p. 456-473.
[BRI 96] BRIAND, L.C., MORASCA, S., BASILI, V. (1996), "Property-based software engineering measurement", IEEE Transactions on Software Engineering, 22(1), p. 68-85.
[BRI 97] BRIAND, L.C., MORASCA, S. (1997), "Towards a Theoretical Framework for Measuring Software Attributes", Proceedings of the Fourth International Software Metrics Symposium, p. 119-126.
[ELM 99] ELMASRI, R., NAVATHE, S. (1999), Fundamentals of Database Systems, Third edition, Addison-Wesley, Massachusetts.
[FRA 92] FRAZER, A. (1992), "Reverse engineering - hype, hope or here?", in P.A.V. Hall, Software Reuse and Reverse Engineering in Practice, Chapman & Hall.
[GLA 96] GLASS, R. (1996), "The Relationship Between Theory and Practice in Software Engineering", IEEE Software, November, 39(11), p. 11-13.
[HAR 98] HARRISON, R., COUNSELL, S., NITHI, R. (1998), "Coupling Metrics for Object-Oriented Design", 5th International Symposium on Software Metrics, IEEE Computer Society, Bethesda, Maryland, 20-21 November.
[HEN 96] HENDERSON-SELLERS, B. (1996), Object-Oriented Metrics - Measures of Complexity, Prentice-Hall, Upper Saddle River, New Jersey.
[ISO 94] ISO (1994), "Software Product Evaluation - Quality Characteristics and Guidelines for their Use", ISO/IEC Standard 9126, Geneva.
[LI 87] LI, H.F., CHEN, W.K. (1987), "An empirical study of software metrics", IEEE Transactions on Software Engineering, 13(6), p. 679-708.
[LI 93] LI, W., HENRY, S. (1993), "Object-oriented metrics that predict maintainability", Journal of Systems and Software, 23, p. 111-122.
[MCA 76] McCABE, T.J. (1976), "A complexity measure", IEEE Transactions on Software Engineering, 2(5), p. 308-320.
[MCL 92] McCLURE, C. (1992), The Three R's of Software Automation: Re-engineering, Repository, Reusability, Englewood Cliffs, Prentice-Hall.
[NEI 94] NEIL, M. (1994), "Measurement as an Alternative to Bureaucracy for the Achievement of Software Quality", Software Quality Journal, 3(2), p. 65-78.
[PIG 97] PIGOSKI, T.M. (1997), Practical Software Maintenance, Wiley Computer Publishing, New York, USA.
[QUI 93] QUINLAN, J.R. (1993), C4.5: Programs for Machine Learning, Morgan Kaufmann Publishers.
[RAM 99] RAMONI, M., SEBASTIANI, P. (1999), "Bayesian methods for intelligent data analysis", in M. Berthold and D.J. Hand (eds.), An Introduction to Intelligent Data Analysis, New York, Springer.
[SIA 99] SIAU, K. (1999), "Information Modeling and Method Engineering: A Psychological Perspective", Journal of Database Management, 10(4), p. 44-50.
[SNE 98] SNEED, H.M., FOSHAG, O. (1998), "Measuring Legacy Database Structures", Proceedings of the European Software Measurement Conference FESMA 98 (Antwerp, May 6-8, 1998), Coombes, Van Huysduynen and Peeters (eds.), p. 199-211.
[STO 99] STONEBRAKER, M., BROWN, P. (1999), Object-Relational DBMSs: Tracking the Next Great Wave, Morgan Kaufmann Publishers, California.
[WEY 88] WEYUKER, E.J. (1988), "Evaluating software complexity measures", IEEE Transactions on Software Engineering, 14(9), p. 1357-1365.
[ZUS 98] ZUSE, H. (1998), A Framework of Software Measurement, Berlin, Walter de Gruyter.
Chapter 4
Measuring event-based object-oriented conceptual models

Geert Poels
Vlekho Business School, University of Sciences and Arts, Belgium
1. Introduction

A decade of research into object-oriented software measurement has produced a large number of measures for object-oriented software models and artefacts. Zuse provides a list of no fewer than 137 such measures and estimates the actual number of measures at more than three hundred [ZUS 98]. In spite of these research efforts, there are still areas that remain largely untouched by object-oriented software measurement research. In this paper we present an approach that addresses two of these research 'niches'. A first problem area, identified during the 1999 ECOOP International Workshop on Quantitative Approaches in Object-Oriented Software Engineering [BRIT 99], concerns the lack of measures for models that capture the dynamic aspects of an object-oriented software system. A typical measure suite for object-oriented software, such as MOOSE [CHI 94], focuses on the data and function dimensions of software, but ignores the behaviour dimension as captured by behavioural models like state-transition diagrams and activity diagrams. Moreover, object-interaction characteristics such as interaction coupling [EDE 93] or dynamic coupling [BRIT 96] have only been measured on the basis of collaboration diagrams and message sequence diagrams. Coupling between objects has not been measured using object-interaction diagrams that are based on other object-communication mechanisms, like event broadcasting. A second problem area concerns the lack of measures for object-oriented software specifications [BRIA 99a]. Although industry begs for measurement instruments that can be applied in the early phases of the development process (mainly for early quality control and project budgeting decisions), nearly all published object-oriented software measures can only be used after (high-level)
system design. Some exceptions known to us are measures for object-oriented analysis models (e.g. task points [GRA 95] for Graham's SOMA, the QOOD measure suite [BAD 95] for Coad and Yourdon's OOA, and the complexity measures presented in [GEN 99] for Rumbaugh's OMT). We believe these two problem areas to be somewhat related. Modern, UML-compliant approaches towards domain analysis, object-oriented analysis and design, and component-based software engineering, like for instance Catalysis [DSO 99], put emphasis on both the static and dynamic aspects of a domain or software system. However, in an object-oriented implementation the dynamic aspects become somewhat subordinate to the static aspects. Many of the 'rules' that were explicitly captured during behavioural and object-interaction modelling are translated into class invariants or into preconditions and postconditions that are attached to specific methods within the class definitions. Strangely enough, these types of assertions have not received the attention of research into object-oriented design and code measurement either. In this paper, we present part of a framework for measuring object-oriented conceptual models. Conceptual modelling is used to model, structure and analyse a (part of a) domain 1, irrespective of the software system that must be built. Defining a domain model is part of the requirements engineering step in the development of a software system. All the rules described in the domain model have to be supported by the system. Object-oriented analysis in general also aims at modelling and analysing the specific system requirements (user interface, data storage, workflow aspects, quality requirements, etc.). The conceptual model can be considered as an early object-oriented analysis artefact. As a consequence, our framework is suited for early measurement. Conceptual models are combinations of different sub-models. Generally, three sub-models are distinguished: a structural model (e.g. class diagram), an object-interaction model (e.g. message sequence chart) and a set of object-behaviour models (e.g. finite state machines). The part of the framework presented in this paper focuses on the object-interaction model. To model the interaction (and communication) between objects, the framework assumes the event broadcasting mechanism, which from a conceptual modelling point of view is to be preferred
1. The term "conceptual modelling", as used in this paper, does not necessarily refer to the concept of domain analysis or engineering. Domain analysis methods like FODA [KAN 90] are used to derive models that are common to a collection of individual organisations. An analysis of the similarities and differences between the individual enterprise models (also called business models) plays a crucial role in such methods. When we use the term "domain" or "domain model" in this paper, we mean a "domain" in a more general sense. It can refer to a real domain model (e.g. stock/inventory management, front office, manufacturing) as well as to an enterprise model that is specific to a particular organisation.
above the message passing mechanism 2. The cornerstone of the measurement framework is a formally defined object-interaction model based on event broadcasting, called the object-event table (OET). The OET provides a formal basis for a suite of measures that is defined in terms of (common) event participations. In Section 2 the object-event table is presented. A compact suite of measures, including size, coupling, inheritance, specialisation, propagation and polymorphism measures, is presented in Section 3. The results of a first measure validation experiment are briefly discussed in Section 4. Finally, Section 5 contains conclusions and topics for further research.

2. The object-event table

In conceptual modelling, the events that are modelled are real-world events, sometimes also referred to as business events. Real-world events are characterised by the following properties:
- A real-world event corresponds to something that happens in the real world. This 'real world' is the universe of discourse, i.e. the domain or relevant part of the domain that must be modelled;
- A real-world event has no duration, i.e. it occurs or is recognised at one point in time;
- The real-world events that are identified during conceptual modelling are not further decomposable. They are defined at the lowest level of granularity and cannot be meaningfully split into other, more fine-grained, events.
It is common to model events as event types, rather than referring to specific event occurrences. Since real-world events are the focal point in event-based conceptual modelling, a notation is needed to designate the set of event types that is relevant for a particular universe of discourse. We use A to denote the universe of event types associated with some universe of discourse. All event types relevant to the universe of discourse are elements of A. An example is presented of a simplified loan circulation process in the context of a library. Assume that the scope of the LIBRARY conceptual model is initially delimited such that the universe of event types is A = {start membership, end membership, acquire, catalogue, borrow, renew, return, sell, reserve, cancel, fetch, lose}.

2. In reality, objects do not pass messages to each other. For instance, if a person rents a car, then the person does not send a "rent" message to the car, nor the other way round. However, both objects (person and car) are involved in the same real-world event, i.e. the renting of the car by the person. The event broadcasting mechanism simultaneously notifies all participating objects of the event occurrence, without yet deciding on an order, as message passing does (e.g. the person notifies the car or the car notifies the person). Compared to message passing, the event broadcasting mechanism leads to more maintainable and reusable conceptual models [SNO 00].
A conceptual model also identifies the entities (persons, things, etc) in the universe of discourse that participate in real-world events. Such entities are said to be "relevant to" the universe of event types A. In object-oriented conceptual modelling these entities are represented as objects. Objects are characterised as follows: - Each object in the conceptual model corresponds to a real-world concept; - Objects are described by a number of properties. The properties of an object are specified in an object type (e.g. a UML classifier with an <