The Unified Modeling Language. UML'98: Beyond the Notation: First International Workshop, Mulhouse, France, June 3-4, 1998, Selected Papers


Author: Jean Bézivin | Pierre-Alain Muller


Lecture Notes in Computer Science Edited by G. Goos, J. Hartmanis, and J. van Leeuwen

1618

Berlin Heidelberg New York Barcelona Hong Kong London Milan Paris Singapore Tokyo

Jean Bézivin, Pierre-Alain Muller (Eds.)

The Unified Modeling Language

UML’98: Beyond the Notation

First International Workshop Mulhouse, France, June 3-4, 1998 Selected Papers


Series Editors

Gerhard Goos, Karlsruhe University, Germany
Juris Hartmanis, Cornell University, NY, USA
Jan van Leeuwen, Utrecht University, The Netherlands

Volume Editors

Jean Bézivin
Université de Nantes, Faculté des Sciences et Techniques
2, Rue de la Houssinière, B.P. 92208, F-44322 Nantes Cedex 3, France
E-mail: [email protected]

Pierre-Alain Muller
ObjeXion Software
5, Rue Gutenberg, F-68800 Vieux-Thann, France
E-mail: [email protected]

Cataloging-in-Publication Data applied for

Die Deutsche Bibliothek - CIP-Einheitsaufnahme
The unified modeling language : first international workshop ; selected papers / UML ’98: Beyond the Notation, Mulhouse, France, June 3 - 4, 1998. Jean Bézivin ; Pierre-Alain Muller (ed.). - Berlin ; Heidelberg ; New York ; Barcelona ; Hong Kong ; London ; Milan ; Paris ; Singapore ; Tokyo : Springer, 1999 (Lecture notes in computer science ; Vol. 1618) ISBN 3-540-66252-9

CR Subject Classification (1998): D.2, D.3

ISSN 0302-9743
ISBN 3-540-66252-9 Springer-Verlag Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law.

© Springer-Verlag Berlin Heidelberg 1999
Printed in Germany

Typesetting: Camera-ready by author SPIN: 10705238 06/3142 – 5 4 3 2 1 0

Printed on acid-free paper

Preface

This volume contains mainly the revised versions of papers presented at the workshop «UML»'98, "Beyond the Notation", which took place in Mulhouse, France on June 3-4, 1998. We thank all those who made this possible, and particularly all the people in Mulhouse who worked hard to make this meeting a success, with such a short delay between the announcement and the realization. We are especially grateful to Nathalie Gaertner, who put a tremendous amount of effort into the initial preparation of the workshop. We were pleasantly surprised by the quality of the submitted material and by the level of the technical exchanges at the Mulhouse meeting. More than one hundred attendees, from about twenty different countries, representing the main actors in the UML research and development scene, gathered in Mulhouse for two full study days. We would like to express our deepest appreciation to the authors of submitted papers, the editorial committee for this volume, the program committee for the initial workshop, the external referees, and the many others who contributed to the final contents of this volume.

April 1999 Jean Bézivin Pierre-Alain Muller

Editorial Committee for This Volume

F. Alizon, France
C. Atkinson, Germany
J. Bézivin, France
G. Bochmann, Canada
G. Booch, USA
M. Bouzeghoub, France
D. Coleman, USA
S. Cook, UK
L. Delcambre, USA
P. Desfray, France
D. d'Souza, USA
W. Emmerich, UK
G. Engels, Germany
J. Ernst, USA
R. France, USA
U. Frank, Germany
M. Gogolla, Germany
B. Henderson-Sellers, Australia
M. Hitz, Austria
P. Hruby, Denmark
S. Iyengar, USA
I. Jacobson, USA
J.M. Jézéquel, France
S. Kent, UK
N. Kettani, France
H. Kilov, USA
K. Kobryn, USA
P. Kruchten, Canada
K. Lano, UK
P. Laublet, France
T. Mens, Belgium
P.A. Muller, France
J. Odell, USA
G. Overgaard, Sweden
B. Paech, Germany
B. Pernici, Italy
W. Pidcock, USA
T. Reenskaug, Norway
B. Rumpe, Germany
B. Selic, Canada
J. Warmer, Netherlands
T. Wasserman, USA
R. Wirfs-Brock, USA
M. Schader, Germany
R. Soley, USA

Additional Reviewers

M. Bousse, France
B. Caillaud, France
A. Cockburn, USA
J.P. Giraudin, France
L. Helouet, France
A. Le Guennec, France
H. Mili, Canada
P. Perrin, France
H. Wai Ming, France

Table of Contents

UML: The Birth and Rise of a Standard Notation
  J. Bézivin, P.A. Muller

Developing with UML - Some Pitfalls and Workarounds
  M. Hitz, G. Kappel

Supporting and Applying the UML Conceptual Framework
  C. Atkinson

Modeling: Is It Turning Informal into Formal?
  B. Morand

Best of Both Worlds – A Mapping from EXPRESS-G to UML
  F. Arnold, G. Podehl

Porting ROSES to UML – An Experience Report
  A. Olivé, M.R. Sancho

Making UML Models Interoperable with UXF
  J. Suzuki, Y. Yamamoto

Transformation Rules for UML Class Diagrams
  M. Gogolla, M. Richters

Semantics and Transformations for UML Models
  K. Lano, J. Bicarregui

Automation of Design Pattern: Concepts, Tools and Practices
  P. Desfray

Automating the Synthesis of UML StateChart Diagrams from Multiple Collaboration Diagrams
  I. Khriss, M. Elkoutbi, R. K. Keller

Informal Formality? The Object Constraint Language and Its Application in the UML Metamodel
  A. Kleppe, J. Warmer, S. Cook

Reflections on the Object Constraint Language
  A. Hamie, F. Civello, J. Howse, S. Kent, R. Mitchell

On Using UML Class Diagrams for Object-Oriented Database Design Specification of Integrity Constraints
  Y. Ou

Literate Modelling – Capturing Business Knowledge with the UML
  J. Arlow, W. Emmerich, J. Quinn

Applying UML to Design an Inter-domain Service Management Application
  M. Mancona Kandé, S. Mazaher, O. Prnjat, L. Sacks, M. Wittig

BOOSTER*Process A Software Development Process Model Integrating Business Object Technology and UML
  A. Korthaus, S. Kuhlins

Hierarchical Context Diagram with UML: An Experience Report on Satellite Ground System Analysis
  E. Bourdeau, P. Lugagne, P. Roques

Extension of UML Sequence Diagrams for Real-Time Systems
  J. Seeman, J. Wolff v. Gudenberg

UML and User Interface Modeling
  S. Kovacevik

On the Role of Activity Diagrams in UML – A User Task Centered Development Process for UML
  B. Paech

Structuring UML Design Deliverables
  P. Hruby

Considerations of and Suggestions for a UML-Specific Process Model
  K. Kivisto

An Action Language for UML: Proposal for a Precise Execution Semantics
  S.J. Mellor, S.R. Tockey, R. Arthaud, P. Leblanc

Real-Time Modeling with UML: The ACCORD Approach
  A. Lanusse, S. Gérard, F. Terrier

The UML as a Formal Modeling Notation
  A. Evans, R. France, K. Lano, B. Rumpe

OML: Proposals to Enhance UML
  B. Henderson-Sellers

Validating Distributed Software Modeled with the Unified Modeling Language
  J.M. Jézéquel, A. Le Guennec, F. Pennanearc'h

Supporting Disciplined Reuse and Evolution of UML Models
  T. Mens, C. Lucas, P. Steyaert

Applying UML Extensions to Facilitate Software Reuse
  N.G. Lester, F.G. Wilkie, D.W. Bustard

A Formal Approach to Use Cases and Their Relationships
  G. Övergaard, K. Palmkvis

A Practical Framework for Applying UML
  P. Allen

Extending Aggregation Constructs in UML
  M. Saksena, M.M. Larrondo-Petrie, R.B. France, M.P. Evett

Author Index

UML: The Birth and Rise of a Standard Modeling Notation

Jean Bézivin¹, Pierre-Alain Muller²

¹ Laboratoire de Recherche en Sciences de Gestion, Université de Nantes, Faculté des Sciences et Techniques, 2, rue de la Houssinière, BP92208, 44322 Nantes cedex 3, France
[email protected]

² ESSAIM, Université de Haute-Alsace, 12, rue des frères Lumière, 68093 Mulhouse, France
[email protected]

Abstract. Officially the Unified Modeling Language UML is a graphical language for visualizing, specifying, constructing and documenting the artifacts of a software-intensive system. For many, UML is much more than that and symbolizes the transition from code-oriented to model-oriented software production techniques. It is very likely that, in a historical perspective, UML will be given credit for the perspectives opened as well as for the direct achievements realized. This introductory paper presents some of the characteristics of the notation and discusses some of the perspectives that have been and that are being opened by the UML proposal.

Introduction

The first few years of the 90s saw the blossoming of around fifty different object-oriented methods. This proliferation is a sign of the great vitality of object-oriented technology, but it is also the fruit of a multitude of interpretations of exactly what an object is. The drawback of this abundance of methodologies is that it encourages confusion, leading users to adopt a 'wait and see' attitude that limits the progress made by methods.

J. Bézivin and P.-A. Muller (Eds.): «UML»’98, LNCS 1618, pp. 1–8, 1999. © Springer-Verlag Berlin Heidelberg 1999


In 1996, the Object Management Group (OMG) put together a task force chartered with defining and approving a notational and meta-model standard for object-oriented analysis and design. The task force was made up of vendors of related tools that initially clustered themselves into four major camps. One of these camps aggregated around the submission originated by Rational Software and promoted the Unified Modeling Language (UML) that Rational built from the OMT, Booch and OOSE methodologies created by the three methodologists (Rumbaugh, Booch, and Jacobson) in its employ. The four proposals were submitted to OMG in January 1997. Other camps noted the absence of support for software development process, business process modeling, and real-time extensions within the UML definition. In March 1997, the factions agreed to work closely together to add the capabilities needed for the UML to satisfy their various needs, and in December 1997, the standard was formally adopted.

From the Unified Method to the Unified Modeling Language

The unification of object-oriented modeling methods became possible as experience allowed the evaluation of the various concepts proposed by existing methods. Seeing that the differences between the various methods were becoming smaller, and that the method war no longer moved object-oriented technology forward, Jim Rumbaugh and Grady Booch decided at the end of 1994 to unify their work within a single method: the Unified Method. About one year later, they were joined by Ivar Jacobson, the father of use cases, a very effective technique for determining requirements. Booch, Rumbaugh and Jacobson adopted four goals:

• To represent complete systems (instead of only the software portion) using object-oriented concepts
• To establish an explicit coupling between concepts and executable code
• To take into account the scaling factors that are inherent to complex and critical systems
• To create a modeling language usable by both humans and machines

The authors of the Unified Method rapidly reached a consensus with respect to fundamental object-oriented concepts. However, convergence on the notation elements was more difficult to obtain, and the graphical representation used for the various model elements went through several modifications. The first version of the description of the Unified Method was presented in October 1995 in a document titled Unified Method V0.8. This document was widely distributed, and the authors received more than a thousand detailed comments from the user community. These comments were taken into account in version 0.9, released in June 1996. However, it was version 0.91, released in October 1996, which represented a substantial evolution of the Unified Method. The main effort was a change in the direction of the unification effort, so that the first objective was the definition of a universal language for object-oriented modeling, and the standardization of the object-oriented development process would follow later. The Unified Method was transformed into UML (the Unified Modeling Language for object-oriented development). As we approach version 1.4 of UML [7], the OMG Revision Task Force is already thinking about a future version 2.0. At the same time the notation is now well documented, with a rapidly increasing number of textbooks (e.g. [1], [3], [4], [5]).

Model and Meta-model

The initial effort focused on the identification and definition of the semantics of fundamental concepts - the building blocks of object-oriented modeling. These concepts are the artifacts of the development process, and must be exchanged between the different parties involved in a project. To implement these exchanges, it was first necessary to agree on the relative importance of each concept, to study the consequences of these choices, and to select a graphical representation whose syntax is simple, intuitive, and expressive. To facilitate this definition work, and to help formalize UML, all the different concepts have themselves been modeled using a subset of UML. This recursive definition, called meta-modeling, has the double advantage of allowing the classification of concepts by abstraction level, by complexity and by application domain, while also guaranteeing a notation with an expressive power such that it can be used to represent itself. A meta-model formally describes the model elements, and the syntax and semantics of the notation that allow their manipulation. The rise in abstraction introduced by the construction of a meta-model facilitates the discovery of potential inconsistencies, and promotes generalization. The UML meta-model is used as a reference guide for building tools, and for sharing models between different tools.
A model is an abstract description of a system or a process - a simplified representation that promotes understanding and enables simulation. The term 'modeling' is often used as a synonym of analysis, that is, the decomposition into simple elements that are easier to understand. In computer science, modeling usually starts with the description of a problem, and then describes the solution to the problem. These activities are called respectively 'analysis' and 'design'. The form of the model depends on the meta-model. Functional modeling decomposes tasks into functions that are simpler to implement. Object-oriented modeling decomposes systems into collaborating objects. Each meta-model defines model elements, and rules for the composition of these model elements.
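The idea that a meta-model defines model elements together with rules for composing them can be made concrete with a small sketch. The following Python fragment is only an illustration under stated assumptions: the names `MetaModel`, `Element`, and `violations`, and the two toy meta-models, are hypothetical and are not taken from the UML specification.

```python
from dataclasses import dataclass, field


@dataclass
class MetaModel:
    """Names the kinds of model element and which kinds may contain which."""
    name: str
    element_kinds: frozenset
    # Maps a container kind to the kinds it is allowed to contain.
    composition_rules: dict


@dataclass
class Element:
    kind: str
    name: str
    children: list = field(default_factory=list)


def violations(element, metamodel):
    """Return a list of rule violations found in `element` and its children."""
    problems = []
    if element.kind not in metamodel.element_kinds:
        problems.append(f"{element.name}: unknown kind '{element.kind}'")
    allowed = metamodel.composition_rules.get(element.kind, frozenset())
    for child in element.children:
        if child.kind not in allowed:
            problems.append(
                f"{element.name}: '{element.kind}' may not contain '{child.kind}'"
            )
        problems.extend(violations(child, metamodel))
    return problems


# Two toy meta-models, mirroring the functional / object-oriented contrast.
functional = MetaModel(
    name="functional",
    element_kinds=frozenset({"task", "function"}),
    composition_rules={"task": frozenset({"function"}),
                       "function": frozenset({"function"})},
)
object_oriented = MetaModel(
    name="object-oriented",
    element_kinds=frozenset({"system", "object"}),
    composition_rules={"system": frozenset({"object"})},
)

billing = Element("system", "Billing", [Element("object", "Invoice"),
                                         Element("object", "Customer")])

print(violations(billing, object_oriented))  # conforms: empty list
print(violations(billing, functional))       # kinds unknown to this meta-model
```

The same model is well formed with respect to one meta-model and ill formed with respect to the other, which is the sense in which the form of a model depends on its meta-model.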


The content of the model depends on the problem. A modeling language like UML is sufficiently general to be used in all software-engineering domains and beyond - it could be applied to business engineering, for example. A model is the basic unit of development; it is highly self-consistent and loosely coupled with other models by navigation links. Depending on the development process in use, a model may relate to a specific phase or activity of the software lifecycle. A model by itself is usually not visible to users. It captures the underlying semantics of a problem, and contains data accessed by the tools to facilitate information exchange, code generation, navigation, etc. Models are browsed and manipulated by users by means of graphical representations, which are projections of the elements contained in one or more models. Many different perspectives can be constructed for a base model - each can show all or part of the model, and each has one or more corresponding diagrams.

The UML Diagrams

UML defines nine different types of diagram:

• Class diagrams
• Sequence diagrams
• Collaboration diagrams
• Object diagrams
• Statechart diagrams
• Activity diagrams
• Use case diagrams
• Component diagrams
• Deployment diagrams
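The statement that diagrams are projections of the elements of a model can be illustrated in a few lines of code. This is a minimal sketch, not how any UML tool is implemented: `ModelElement`, `project`, and the two rendering functions are hypothetical names, and the bracket and parenthesis "notations" are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelElement:
    kind: str   # e.g. "class", "actor", "use case" (illustrative kinds)
    name: str


# One underlying model: the single source of truth.
model = [
    ModelElement("class", "Order"),
    ModelElement("class", "Customer"),
    ModelElement("actor", "Clerk"),
    ModelElement("use case", "Place order"),
]


def project(model, kinds):
    """A diagram shows only the elements whose kind it is meant to display."""
    return [e for e in model if e.kind in kinds]


def render_boxes(elements):
    """One hypothetical notation: names in square brackets."""
    return " ".join(f"[{e.name}]" for e in elements)


def render_clouds(elements):
    """Another hypothetical notation for the very same elements."""
    return " ".join(f"({e.name})" for e in elements)


class_diagram = project(model, {"class"})
use_case_diagram = project(model, {"actor", "use case"})

print(render_boxes(class_diagram))      # [Order] [Customer]
print(render_clouds(class_diagram))     # (Order) (Customer)
print(render_boxes(use_case_diagram))   # [Clerk] [Place order]
```

Changing the rendering function changes only the picture; the projected elements, and therefore the semantic content, stay the same.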

Different notations can be used to represent the same model. The Booch, OMT, and OOSE notations use different graphical syntax, but they all represent the same object-oriented concepts. These different graphical notations are just views of the same model elements, so that it is quite possible to use different notations without losing the semantic content. At heart, then, UML is simply another graphical representation of a common semantic model. However, by combining the most useful elements of the object-oriented methods, and extending the notation to cover new aspects of system development, UML provides a comprehensive notation for the full lifecycle of object-oriented development. The UML notation is a fusion of Booch, OMT, OOSE and others. UML is designed to be readable on a large variety of media, such as whiteboards, paper, restaurant tablecloths, computer displays, black and white printouts, etc. The designers of the notation have sought simplicity above all – UML is straightforward, homogeneous, and consistent. Awkward, redundant and superfluous symbols have been eliminated, in order to favor a better visual rendering. UML focuses on the description of software development artifacts, rather than on the formalization of the development process itself, and it can therefore be used to describe software entities obtained through the application of various development processes. UML is not a rigid notation: it is generic, extensible, and can be tailored to the needs of the user. UML does not aim at over-specification – there is not a graphical representation for every possible concept. In the case of particular requirements, details may be added using extension mechanisms and textual comments. Great freedom remains for tools to filter the information displayed. The use of colors, drawings, and particular visual attributes is left up to the user.

Achievements and Perspectives

It is now clear that UML is being adopted, with benefits, by a variety of users. So far we have mainly presented the short-term achievements of UML, in a rather conventional way. Before concluding this introductory presentation, let us take a higher-level view of the potential long-term contribution of UML. The OMG has grown to be an adaptable organization with an ability to detect very rapidly the evolution of industrial trends in technology deployment. At a time when many were still discovering the virtues of object orientation, OMG was already working on one of the first detected bottlenecks of this technology: lack of interoperability. The answer to this has been the CORBA software bus. It is not by pure chance that the work on UML started there: a real and urgent need to define modeling standards in the domain of object-oriented analysis and design had emerged. However, the consequences of this move are generally underestimated.
What really happened then was not only the definition of another specific OMG standard recommendation, but also the starting point for a whole set of new activities. Previous activities were centered around the software transfer bus CORBA with its associated IDL language, IIOP protocol and OMA architecture. In the post-UML period, a new modeling culture is emerging, with a new knowledge bus incorporating UML, MOF, the OCL language [8] and the XMI transfer format [6]. The two buses and the two OMG activities are obviously linked, but the modeling camp is rapidly becoming important. It is now recognized that there are two ways to consider object interoperability: one is executable code interoperability and the other is model interoperability. UML is now a conceptual tool, but it has also served as an experimentation field. As previously mentioned, the self-definition of UML was an interesting exercise and was successful per se. However, it also demonstrated that the applicability of this technique could be made broader than just the handling of software artifacts. As a consequence a new architecture was defined around the MOF (Meta-Object Facility). This architecture is complex and still evolving, but it could be compared to the OMA in importance. At the heart there is this self-defined MOF, which is more or less synchronized with the core definitions of UML. The MOF uses UML in various ways, for example for graphical presentations. But the main difference is that the MOF and UML are not at the same level in the OMG four-level model architecture. The MOF is a meta-meta-model and is at the M3 level while UML is a meta-model and stands at the M2 level. The MOF is a language for defining meta-models and UML is just one of these meta-models. Other meta-models that are being defined at the M2 level are for example related to common warehouse, workflow, software process, etc. So, UML has been instrumental in triggering the development of a new modeling architecture based on the MOF. Many ideas have been successfully tested on UML and then transferred to the MOF because they were found to be of broader applicability. The first one is the OCL (Object Constraint Language [8]). OCL is an expression language that enables one to describe constraints on object-oriented models and other artifacts. The word constraint is used here with the meaning of a precisely identified restriction on one or more values of a model. We see here a pleasant property of the global OMG modeling architecture. Since a meta-meta-model is structurally similar to a meta-model, features applied to one may also be applied to the other. So OCL, which could be applied to meta-models to give more precise semantics to models, could also be applied to meta-meta-models to give more precise semantics to meta-models. And this is exactly what happens when OCL is applied at the MOF level. Another example is the recent answer to the SMIF RFP of the OMG [6]. Initially the purpose of the Stream-based Model Interchange Format was mainly to exchange UML models.
As it has finally been issued, answered and approved, the proposal is now known as XMI, a new standard for Metadata Interchange based on XML and on the MOF. Once again, there is nothing to lose if, by providing a technical solution to a UML problem, it is possible to provide a more general solution that could be applied to the UML meta-model, as well as to other meta-models already defined or yet to be proposed. Many more examples could be given of this trend. There are for example several demands to provide structured extension mechanisms for UML, going beyond single stereotypes, tagged values and constraints. Requests are being submitted for specialized UML-based meta-models on subjects like real-time or business objects. A possible answer to this would be some notion of profiles. If this improvement is granted to the UML meta-model, there is no reason why other MOF-compliant meta-models should not also benefit from these added modular modeling mechanisms. A UML profile may be defined as a subset or a superset of the basic meta-model. There is however no agreement yet on the way this notion of a profile could be defined.
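The remark that a constraint language can be reused at adjacent levels, because the levels are structurally similar, can be sketched as follows. This is a hedged illustration in Python rather than OCL: the `Constraint` type, the `check` function, and the dictionary encoding of elements are assumptions made for the example, and the two levels shown (values restricted by a model-level constraint, model elements restricted by a meta-model-level constraint) are chosen only to show the same machinery working one level apart.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Constraint:
    """A named restriction on the values of one element (hypothetical type)."""
    context: str                       # kind of element the constraint applies to
    name: str
    holds: Callable[[dict], bool]      # predicate over the element's values


def check(elements, constraints):
    """Return (element name, constraint name) for every violated constraint."""
    failures = []
    for element in elements:
        for c in constraints:
            if element["kind"] == c.context and not c.holds(element):
                failures.append((element["name"], c.name))
    return failures


# Model level. In OCL this might read: context Account inv: self.balance >= 0
accounts = [
    {"kind": "Account", "name": "savings", "balance": 120},
    {"kind": "Account", "name": "overdrawn", "balance": -5},
]
non_negative = Constraint("Account", "nonNegativeBalance",
                          lambda e: e["balance"] >= 0)

# Meta-model level: the same machinery, now restricting model elements
# themselves (here: every class in a model must have a non-empty name).
classes = [
    {"kind": "Class", "name": "Order", "class_name": "Order"},
    {"kind": "Class", "name": "anonymous", "class_name": ""},
]
named = Constraint("Class", "hasName", lambda e: len(e["class_name"]) > 0)

print(check(accounts, [non_negative]))  # [('overdrawn', 'nonNegativeBalance')]
print(check(classes, [named]))          # [('anonymous', 'hasName')]
```

Nothing in `check` knows which level it is working at; only the elements and constraints passed in differ.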


Conclusion

It is very tempting to draw a parallel between the historical development of programming languages since the early fifties and the more recent development of modeling languages. The heavy use of graphical symbols in analysis and design notations may be compared with the old-time art of flowcharting. Some of the OA&D notations were more business-oriented and others more scientific or real-time oriented, just as Cobol and Fortran were two different answers for the corresponding programming communities. We may also remember that these programming languages were usually the result of normative, industrial-oriented processes. So, should UML be considered as the PL/1 of modeling languages? The question is in fact troubling because the similarities in the definition process are numerous, especially in the way ingredients have been put together in order to satisfy the maximum of needs. If we take this resemblance for granted, what will then be the Algol 60, Algol 68, Pascal, C, C++, Occam or Java of modeling languages? As we know, the history of programming languages has not always been a linear progression according to scientific or technical criteria. At the beginning of this new period of development of modeling languages, we may hope that some lessons of the past have been learnt, but we shall not bet on this. Anyway, as we have sometimes heard in the last decade that "programming is thinking", we will surely hear in the coming years that "modeling is thinking" (or why not that "thinking is modeling"), and a good notation in which to write down one's thinking will always be most valuable. One of the recognized contributions of UML is that it has stopped many sterile wars of notations on aspects that were not highly significant. No more long discussions on the fifteen ways or so to note cardinalities or to draw classes and instances.
This does not mean that the choices have always been the best possible ones [2], only that they have grown from a general consensus and that they will allow a higher and more productive level of debate. Another important decision that has been reported above is the separation of the debate on the notation from the debate on the process. This was a decision that was not easy to take and that will probably be considered as one of the main contributions of the authors. Now the work on the notation can progress and the work on the process can start integrating known research results and practical experience. UML is not the first achievement in the modeling world. If we had to quote some earlier ones, we could choose SADT/IDEF0 for its simplicity and JSD for the principle of coupling the modeling of the system to the modeling of its environment. The next big challenge that UML will have to face is how to deal with the emerging and multifaceted notion of software component. This will be a major test in the coming years and, if successfully passed, it may well become the main qualification title of this modeling notation.


References

1. Booch, G., Rumbaugh, J., Jacobson, I. The Unified Modeling Language: User Guide. Addison Wesley (November 1998)
2. Bergner, K. et al. A Critical Look at UML 1.0. In: The Unified Modeling Language - Technical Aspects and Applications, M. Schader and A. Korthaus (eds.), Physica-Verlag (1998)
3. Fowler, M. UML Distilled: Applying the Standard Object Modeling Notation. Addison Wesley (1997)
4. Harmon, P., Watson, M. Understanding UML - The Developer's Guide with a Web-based Application in Java. Morgan-Kaufmann (1998)
5. Muller, P.A. Instant UML. Wrox Press, Chicago (December 1997)
6. OMG. XML MetaData Interchange (XMI), Proposal to the OMG OA&D TF RFP3: Stream Based Model Interchange Format (SMIF). Document ad/98-10-05 (October 20, 1998), adopted at the Washington Meeting (January 1999)
7. UML Specification. Version 1.3R9, Rational Software (January 1999)
8. Warmer, J., Kleppe, A. The Object Constraint Language: Precise Modeling with UML. Addison Wesley (October 1998)

Developing with UML - Some Pitfalls and Workarounds

Martin Hitz¹, Gerti Kappel²

¹ Department of Data Engineering, Institute of Applied Computer Science and Information Systems, University of Vienna, A-1010 Vienna, Austria
[email protected]

² Department of Information Systems, Institute of Applied Computer Science, Johannes Kepler University of Linz, A-4040 Linz, Austria
[email protected]

Abstract. The object-oriented modeling language UML offers various notations for all phases of application development. The user is left alone, however, when applying UML in up-to-date application development involving distribution, data management, and component-oriented mechanisms. Moreover, various shortcomings have been encountered, most notably w.r.t. refinement of model elements throughout the development life cycle and employment of interaction diagrams to formalize use cases. The paper will shed some light on how these issues may be handled with UML.

1 Introduction

"When it comes down to it, the real point of software development is cutting code. Diagrams are, after all, just pretty pictures." [4, p.7] This opinion is still alive among researchers working in the area of software development as well as among practitioners involved in software projects. Nonetheless, it has become more and more commonly accepted that the early phases of software development, such as requirements specification, analysis, and design, are key to the successful development and deployment of software systems. Not least due to the use of intuitive but rigorous diagrammatic notations representing the artifacts of these development phases, the software development process has been improved considerably. Object-oriented software development follows the same lines of thought. From the very beginning of requirements specification, object-oriented modeling notations provide intuitive mechanisms for representing the objects and their interactions for reaching a common goal, namely the required system functionality.

Several object-oriented modeling notations and methods were developed in the late eighties and early nineties (for an overview we refer to [5]). After different merging efforts and a request for proposals by the Object Management Group, UML (Unified Modeling Language) was adopted in November 1997 as the official industry standard for object-oriented software modeling notations [3, 4]. UML offers several advantages, of which only three shall be mentioned here. First and most importantly, the standardization of UML helps to bypass notational discussions and to concentrate on the real problems, such as modeling guidelines and design heuristics, proper development process, and proper tool support. Second, UML represents the fusion of the Booch method, Jacobson's Objectory, and Rumbaugh's OMT. As such, and thanks to Objectory, the very first step of object-oriented modeling does not consist of finding objects in the problem domain - as has been the case in most other object-oriented modeling techniques - but of identifying the system functionality as required by the users. These so-called use cases correspond to what has been depicted in level-zero data flow diagrams known from traditional structured analysis. With use cases it has been possible both to overcome the "everything is an object and everything taken from structured development is bad" mentality and to concentrate at the very beginning of software development on the user's requirements, which are just functionality and not objects. And third, the different model views supported by UML make it possible to comprehend a complex system in terms of its essential characteristics. These are its system functionality (use case view), its internal static and dynamic structure (logical view), its synchronization behavior (concurrency view), and its implementation and physical layout (deployment view, component view) [3].

In this contribution, however, we will not dig into a further discussion of UML's goodies, but rather concentrate on pitfalls (which are more interesting anyway). The main problems encountered during the development of a web-based calendar manager [8] are due to UML's partially sloppy definition of notations, which lack a precise semantic specification. The main contribution of this paper is to shed some light on some of these deficiencies and to discuss possible workarounds, some of which may be considered suggestions for future enhancements of the notation. In the next section some refinements of UML constructs are discussed. Section 3 concentrates on the employment of interaction diagrams to formalize use cases. Finally, Section 4 points to the development of data-intensive, distributed applications based on component technology. Section 5 concludes the paper.

J. Bézivin and P.-A. Muller (Eds.): «UML»’98, LNCS 1618, pp. 9–20, 1999. © Springer-Verlag Berlin Heidelberg 1999

2 Refinement of Models

Development of complex systems based on various model views requires that the modeled diagrams can be related to each other for the purpose of traceability, i.e., connecting two model elements that represent the same concept at different levels of granularity. In addition, consistency checking between various model views representing different though overlapping characteristics of the system at hand is a prerequisite for correct system development. Last but not least, most applications have to cope dynamically with changing requirements. Thus, various kinds of evolution mechanisms should be provided by the modeling notation.

To adequately support traceability, consistency checking, and evolution, UML should provide for the refinement of model elements. In this context, refinement refers to "... a historical or derivation connection between two model elements with a mapping (not necessarily complete) between them." [16, p.71]. Note that, in contrast to the official UML document, which refers to traceability as being mainly a tools and process problem, we advocate the necessity of offering some kind of "meta notation" to graphically relate model elements which are derived from each other. In the following we will question some of UML's refinement mechanisms. We will investigate use case diagrams, class diagrams, and statechart diagrams. Sequence diagrams are discussed in the context of use case diagrams, too.

2.1 Refinement of Use Case Diagrams

A use case represents some system functionality. Several use cases together, depicted in a use case diagram (not necessarily limited to a single physical page), make up the whole system to be implemented. To support both reuse and the stepwise specification of the required functionality, two use case relationships are provided by UML, the extends relationship and the uses¹ relationship. Their precise meaning, however, is only poorly specified.

[Figure 1 omitted: use cases A and B connected by an <<extends>> relationship; the lower left part shows the "inner" interpretation and the lower right part the "super" interpretation, each as a sequence diagram of A and B.]

Fig. 1. Extends relationship between two use cases

Concerning the extends relationship, in [16, p.78] it is stated that if use case A extends use case B then an instance of use case B may include the behavior specified by A. Figure 1 depicts such a use case relationship. In the object-oriented literature there are two well-known interpretations for this relationship, which are captured by the inner concept of Beta and the super concept of Smalltalk, respectively.

¹ At the time of publication of this paper (Oct. 98), the OMG UML revision task force is discussing UML 1.3, where "uses" will be renamed to "includes". Since these documents have not been officially released, we stick to the notions of the official UML 1.1 documentation.


In Beta [12], the keyword inner may be placed somewhere within the implementation of an operation B (in analogy to use case B) of some object class B'. Within some subclass A' of B', the implementation of B may be overridden. At runtime, if the operation B is invoked on an instance of A', not only the implementation of A' but also that of B' is executed: the inner construct is replaced with the specialized implementation, and the implementation of B extended in this way is executed (cf. lower left part of Figure 1, where the implementation of a use case is depicted as a sequence diagram). The inner construct in Beta specifies an unambiguous place in the implementation of an operation at which specialized code is inserted.

In Smalltalk, the keyword super may be placed somewhere within the specialized implementation of the operation A (in analogy to use case A) of some class A', and always refers to the class's superclass. By forwarding the message to super, it is possible to invoke the overridden implementation of the respective operation in the superclass from within the specialized implementation of the subclass (cf. lower right part of Figure 1). Again, the exact location of this forwarding plays a crucial role.

Both interpretations rely on an exact definition of where the behavior extension takes place, but this is not possible in UML. Although extension points may be specified in the original use case (cf. definition of extension points in [17, p.95]), these extension points are just declared within the elliptic representation of the use case, and there is no mechanism for referencing them from within the corresponding sequence diagrams.

Concerning the uses relationship, in [16, p.78] it is stated that if use case A uses use case B then an instance of use case A will also include the behavior as specified by B. Figure 2 depicts such a use case relationship. Again, in UML the exact interpretation of this uses relationship is left unspecified. There is no indication in the implementation of use case A where to include the behavior of B.
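The two interpretations of the extends relationship discussed above can be made concrete in code. The following is only a sketch in Python, which has no inner construct; the Beta-style behavior is therefore emulated with an overridable hook method, and all class and method names are hypothetical.

```python
class B:
    """'inner' interpretation: the superclass fixes where extensions run."""

    def operation(self):
        trace = ["B: before"]
        trace += self.inner()        # the single, explicit extension point
        trace.append("B: after")
        return trace

    def inner(self):
        return []                    # no extension by default


class A(B):
    def inner(self):
        return ["A: extension"]      # specialized code placed at 'inner'


class B2:
    def operation(self):
        return ["B2: base behavior"]


class A2(B2):
    """'super' interpretation: the subclass decides where the inherited code runs."""

    def operation(self):
        trace = ["A2: before"]
        trace += super().operation() # forwarding to the overridden version
        trace.append("A2: after")
        return trace
```

In both variants the position of the extension is explicit in the code, which is exactly the information that cannot be attached to a UML 1.1 sequence diagram.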

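[Figure 2 omitted: use case A connected to use case B by a <<uses>> relationship; the lower part shows the sequence diagram of A with a probe at which B is inserted.]

Fig. 2. Uses relationship between two use cases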

For both refinement relationships, probes as defined in Objectory [9] may be used as a workaround. A probe is a position in the implementation of a use case, i.e., in a sequence diagram, where an additional behavior can be inserted (cf. lower part of Figure 2). It should be easy to include an appropriate notation in UML.
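A probe can be thought of as a named insertion position inside the implementation of a use case. The sketch below illustrates this idea only; the function names, the probe name "after_first_step", and the dictionary-based registration are assumptions made for illustration and are not taken from Objectory or UML.

```python
def use_case_a(probes=None):
    """Hypothetical use case A with one named probe position."""
    probes = probes or {}
    steps = ["A: first step"]
    # Probe "after_first_step": additional behavior is inserted exactly here.
    for behavior in probes.get("after_first_step", []):
        steps += behavior()
    steps.append("A: last step")
    return steps


def use_case_b():
    """Hypothetical use case B, to be included at a probe of A."""
    return ["B: included behavior"]
```

Calling `use_case_a({"after_first_step": [use_case_b]})` runs B between the two steps of A, whereas `use_case_a()` runs A alone.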

2.2 Refinement of Class Diagrams

Although the UML standards document states that the details of specifying the refinement, i.e., the derivation, are beyond the scope of UML [17, p.46], there should be at least some notational conventions to support traceability, consistency checking, and evolution. Especially the evolution from an analysis document to a design document should be supported. A class diagram is a typical example of such a "moving target". Object classes, associations, and generalizations are deleted and added, and multiplicities and directions of associations are changed, to mention just a few. A recurring pattern of class evolution is shown in Figure 3. There, a one-to-many association α between object class X and object class Y is refined by inserting a container class Y_Set between X and Y.

[Figure 3 omitted: on the left, Class X and Class Y connected by the association α with multiplicities 1 and *; on the right, Class X connected to Class Y_Set by the association α1 with multiplicities 1 and 1, and Class Y_Set connected to Class Y by the association α2 with multiplicity *.]

Fig. 3. Refinement of class diagrams
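The refinement pattern of Figure 3 can be sketched in code as follows. This is a minimal illustration under assumptions: the attribute names, the `available` flag, and the `available_members` operation are hypothetical and only show why a set-level operation is naturally placed on the inserted container class.

```python
from dataclasses import dataclass, field


@dataclass
class Y:
    name: str
    available: bool = True


@dataclass
class YSet:
    """Container class inserted between X and Y (association α2 to many Y)."""
    members: list = field(default_factory=list)

    def available_members(self):
        # A set-level operation that has no natural home on a single Y.
        return [y for y in self.members if y.available]


@dataclass
class X:
    """After refinement, X refers to exactly one YSet (association α1)."""
    ys: YSet = field(default_factory=YSet)
```

Nothing in the code records that `ys` and `members` together replace a former direct X-to-Y association; this derivation information is what a refinement notation would have to capture.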

Since the object-oriented paradigm is strong at modeling single objects and navigation among them but falls short at working with sets of objects, container classes are heavily used helper classes. In a car reservation system, for example, if some client wants to reserve a car, the availability of all the cars has to be checked to find the optimal car. This is a typical operation to be invoked on a set of objects, namely cars. Thus, either the operation is modeled in terms of a class operation, or a container class is inserted holding sets of cars. In the latter case, the availability check would be invoked on instances of the container class. Besides constraints, which may be specified arbitrarily, UML provides no mechanism to annotate the derivation, e.g., that association α has evolved into α1 and α2, and that the class Y_Set has been inserted. Bergner et al. have drawn similar conclusions and have suggested extensions to the refinement notation [1]. To increase standardization and portability, the definition of the precise semantics of the most common derivation rules should not be left solely to UML CASE tool designers.

2.3 Inheritance of Statechart Diagrams

Refinement of statechart diagrams is properly supported as far as state refinement is concerned. State refinement comes in two different flavors: and-refinement, which implies that the original state is decomposed into a set of parallel substates, and or-refinement, which implies that the original state is decomposed into a statechart again. However, the refinement of statechart diagrams must also be seen in the light of the inheritance of statecharts. The reason is the following. In general, object classes are organized in class hierarchies, in which subclasses inherit the structure as well as the behavior of superclasses. As far as the inheritance of behavior is concerned, the discussion has mainly focused on inheritance of single operations in the past. Object behavior, however, is specified at two interrelated levels of detail: at the operation level and at the object class level. The latter is specified in terms of object life cycles that identify legal sequences of states and state changes, i.e., operations. In UML, object life cycles are modeled in terms of statechart diagrams, i.e., inheritance of object life cycles has to be treated in the realm of inheritance of statechart diagrams. Whereas there exists a common understanding of the inheritance of single operations in terms of inheriting their signatures and implementations, and specializing them [18], there exists no common understanding of how to specialize object life cycles in terms of specializing statechart diagrams and which criteria to follow. The encountered problems are briefly investigated in the following.

There are several possibilities to inherit and to specialize object life cycles, ranging from no restriction at all, called arbitrary inheritance, to allowing no specialization at all, called strict inheritance. Whereas the former does not support any notion of substitutability in the sense that an instance of a subclass can be used when an instance of a superclass is expected [18], the latter prohibits the specification of new operations in the subclass at all. Whereas the former notion is too unrestricted to build reusable and reliable systems, the latter notion is too restrictive. What would be necessary instead is a common understanding of the notion of consistent inheritance. Two alternative notions of consistent inheritance prevail: covariance and contravariance. Covariance requires that input and output parameters be restricted to subclasses and that pre- and postconditions of operations be strengthened when operations are redefined for a subclass. Contravariance requires that input parameters be generalized to superclasses and preconditions be weakened, while output parameters be restricted to subclasses and postconditions be strengthened. Covariance is favored by object-oriented modeling methods as it supports the concept of specialization in the tradition of conceptual modeling and knowledge representation [13]. Contravariance is favored by programming language folks as it supports strong type checking in the presence of type substitutability [18].

Object life cycles may be specialized by extension and by refinement. Extension means adding states and transitions. Refinement means expanding inherited states into substatechart diagrams, which consist of newly added states and transitions in turn. Whereas the latter has been treated more thoroughly in the literature (for an overview, we refer to [6, 15]), even within the UML standards document (see below), less attention has been paid to the former. We will discuss some peculiarities of inheritance by extension in the following.

Consider the unshaded states of Figure 4, which depicts the life cycle of a generic class RESERVATION (gray shaded states and incident transitions are considered below). A reservation object is created, the availability of the thing to be reserved is checked, and the reservation is either confirmed or a sorry letter is sent. After the reservation is consumed, it has to be paid. Let's assume a subclass CAR_RESERVATION, which extends the inherited life cycle in that the signing of an insurance contract is added (cf. light-gray shaded states in Figure 4). This parallel extension seems to be most intuitive and a frequently recurring pattern in reality. Parallel extension implies the covariant notion of consistent inheritance in that both the postcondition of the inherited transition confirm and the precondition of the inherited transition pay are strengthened (preconditions and postconditions of transitions are their prestates and poststates, respectively; in the example: post(confirm) = {s2} and pre(pay) = {s3} for the superclass, and post(confirm) = {s2, s5} and pre(pay) = {s3, s6} for the subclass). If one wants to adhere to the notion of type substitutability, one would have to disregard parallel extension and support only alternative extensions in subclasses. Let's assume a subclass RESERVATION_WITH_CANCEL of class RESERVATION, which extends the inherited life cycle with the possibility to cancel the reservation (cf. dark-gray shaded states in Figure 4). This alternative extension implies the contravariant notion of consistent inheritance, in that no inherited conditions and no inherited types of parameters are changed. The interpretation of covariant and contravariant inheritance is further elaborated on by Ebert and Engels [2] along the following lines: Parallel extension conforms to covariant inheritance, which implies observation consistency, i.e., any instance of a subclass may be observed like an instance of a superclass disregarding the added states and state transitions.
Alternative extension conforms to contravariant inheritance, which implies invocation consistency, i.e., on any instance of a subclass each operation of the superclass may be invoked disregarding the added states and state transitions. Observation consistency and invocation consistency exclude each other. For a detailed discussion and formal proof thereof, we refer to [2, 11, 14].

[Figure 4 omitted: statechart with states s0 to s7 and transitions create, checkAvailability, confirm, sendSorryLetter, consume, pay, makeInsurance, cancel, and payCancellationFee; the states and transitions added by the subclasses are shaded.]

Fig. 4. Statechart diagram of object class RESERVATION plus extensions

Extension mechanisms of statechart diagrams are not discussed at all in the UML standards document. Refinement of statechart diagrams is discussed only to the effect that "... state machine refinement as defined here does not specify or favor any specific policy of state machine refinement. Instead, it simply provides a flexible mechanism that allows subtyping (behavioral compatibility), inheritance (implementation reuse), or general refinement policies." [17, p.117]. With the above considerations in mind, we advocate a more complete notion of inheritance of statecharts within the realm of UML. More specifically, within the statechart diagram of a subclass, the inherited parts should be clearly distinguishable from the newly defined ones. Possible solutions may include shading of inherited states or qualifying state names with the names of the classes where they were originally defined.
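The difference between parallel and alternative extension can be explored with a small sketch that records, for each operation, its prestates and poststates in the pre(...)/post(...) style used above. Two caveats apply. First, the exact topology of Figure 4 is not recoverable from the text, so the source and target states of every transition other than post(confirm) and pre(pay) are assumptions. Second, the two check functions are deliberately simplified stand-ins, not the formal definitions of observation and invocation consistency given in [2, 11, 14].

```python
# operation -> (prestates, poststates); topology partly assumed, see lead-in.
RESERVATION = {
    "create": (set(), {"s0"}),
    "checkAvailability": ({"s0"}, {"s1"}),
    "confirm": ({"s1"}, {"s2"}),
    "sendSorryLetter": ({"s1"}, {"s4"}),
    "consume": ({"s2"}, {"s3"}),
    "pay": ({"s3"}, {"s4"}),
}

# Parallel extension: post(confirm) and pre(pay) are strengthened.
CAR_RESERVATION = dict(
    RESERVATION,
    confirm=({"s1"}, {"s2", "s5"}),
    makeInsurance=({"s5"}, {"s6"}),
    pay=({"s3", "s6"}, {"s4"}),
)

# Alternative extension: inherited transitions stay untouched.
RESERVATION_WITH_CANCEL = dict(
    RESERVATION,
    cancel=({"s2"}, {"s7"}),
    payCancellationFee=({"s7"}, {"s4"}),
)


def states(life_cycle):
    result = set()
    for pre, post in life_cycle.values():
        result |= pre | post
    return result


def observation_consistent(sup, sub):
    """Simplified check: hiding added states yields the superclass view."""
    visible = states(sup)
    if not set(sup) <= set(sub):
        return False
    for op, (pre, post) in sub.items():
        seen = (pre & visible, post & visible)
        if op in sup and seen != sup[op]:
            return False
        if op not in sup and seen[0] != seen[1]:
            return False  # an added operation changes an inherited state
    return True


def invocation_consistent(sup, sub):
    """Simplified check: every inherited operation keeps its pre/post sets."""
    return all(sub.get(op) == sup[op] for op in sup)
```

Under these assumptions, CAR_RESERVATION passes only the observation check and RESERVATION_WITH_CANCEL passes only the invocation check, which mirrors the statement that the two notions exclude each other.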

3 Formalizing Use Cases

A use case provides a high-level, rather abstract notion for representing some required system functionality. If one wants to show how this use case is realized by the underlying objects and their interactions, one has to formalize use cases in terms of sequence diagrams, or collaboration diagrams, respectively. Since sequence diagrams and collaboration diagrams are deemed equivalent in terms of expressive power, we concentrate in the following on sequence diagrams. We have extensively used them in the realm of our calendar management system. Some of the encountered problems and possible workarounds are discussed in the following. Concerning class operations, it is not specified how they are represented in sequence diagrams, besides the special class operation create. Due to the representation of time in sequence diagrams it is not possible to depict general purpose class operations like create operations, i.e., leading to the box representing the object. Instead, it would be possible to borrow the class diagram notation and underline any class operation. Another solution would be to represent the respective class as an object and thus be able to handle each class operation like any other object operation. The flaw of both solutions concerns the different notations for class operations, one for create operations, and one for all the other class operations.

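[Figure 5 omitted: sequence diagram with the lifelines ": Calendar", "i: Participant", and "c: CV", the messages sendNotify() and displayNotify(), and the iteration annotations "∀ Participants i" and "∀ CVs c of Participant i".]

Fig. 5. Implementation of use case Update_View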

Concerning set operations, the equivalent of multiobjects in collaboration diagrams has been left out of sequence diagrams. Multiobjects are a convenient mechanism, especially for data-intensive applications where sets of objects are involved. A possible solution to iterate over objects in a set is discussed below.

Sequence diagrams are used for representing either scenarios or algorithms. A scenario is an intuitive way to capture the main idea of a use case. However, only one possible execution path is depicted. If one prefers a rather complete specification of the use case's semantics, one would have to use sequence diagrams for representing whole algorithms including iterations and conditional execution paths. In particular, iterations are poorly specified within sequence diagrams. Consider the sequence diagram in Figure 5, which depicts the implementation of the use case Update_View within our calendar manager. The purpose of this use case is to inform all participants of a date that something has changed, e.g., a date has been inserted, or its start time has been moved. Thus, the operation update() is invoked on all participants of the respective date. In the UML standard document, there is no indication of how to represent messages sent to each object of a set. We suggest indexing the objects of a set by some iteration variable, and using this index also as the object name at the top of the respective lifeline (cf. ∀ Participants i and i:Participant in Figure 5). The nesting of iterations is treated in an analogous way. Referring to Figure 5, for each participant the message update() is sent to each client view of that participant displaying his/her personal calendar (cf. ∀ CVs c of Participant i in Figure 5).

[Figure 6 omitted: sequence diagram with the lifelines ": UI", ": Calendar", and ": Date" and a multiobject of one to three notifications (n1: Notification, n2: Notification, n3: Notification); the messages are insertDate(...), create(), createNot(), create() for each notification, addParticipant(), and add(PID) for i = 1..n; a probe labeled Notify marks where another sequence diagram is included.]

Fig. 6. Implementation of use case Insert_Date
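Read as an algorithm, the indexed iteration suggested for Update_View corresponds to two nested loops. The sketch below uses the message names shown in Figure 5 (sendNotify, displayNotify, written in Python style); the class layout, the attribute names, and the `change` parameter are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ClientView:
    log: list = field(default_factory=list)

    def display_notify(self, change):
        self.log.append(change)


@dataclass
class Participant:
    views: list = field(default_factory=list)

    def send_notify(self, change):
        # Inner iteration: for all client views c of participant i
        for c in self.views:
            c.display_notify(change)


@dataclass
class Calendar:
    participants: list = field(default_factory=list)

    def update_view(self, change):
        # Outer iteration: for all participants i of the date
        for i in self.participants:
            i.send_notify(change)
```

Each loop variable plays the role of the index that names the corresponding lifeline (i: Participant, c: CV) in the diagram.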

Last but not least, concerning the inclusion of component sequence diagrams into more complex sequence diagrams in analogy to subprogram calls, there is no discussion thereof in the standards document. We suggest using probes from Objectory to precisely specify where and when to include another sequence diagram (cf. discussion on the uses relationship of use case diagrams in subsection 2.1). Figure 6 shows the usage of probes. There, within the implementation of the use case Insert_Date, the use case Update_View is called. Another extension, which is depicted in Figure 6, refers to the dynamic creation of a (possibly variable) set of objects and the interaction with those objects. We borrow the notion of multiobjects from collaboration diagrams. Messages to multiobjects address the entire set (exhibiting cascading semantics in general), whereas in order to communicate with a single element of the multiobject, that element has to be explicitly depicted with a separate lifeline (not shown in Figure 6). Our system supports at most three notifications per date. The corresponding multiobject and its elements are constructed by the create message. For each participant, all notification objects are informed of the participant's existence via the add message to the multiobject, which is assumed to be cascaded to the element objects.
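The cascading semantics assumed for messages to a multiobject can be sketched as follows. The class names and the `elements` attribute are hypothetical; the bound of one to three notifications follows the description of the calendar system above.

```python
class Notification:
    def __init__(self):
        self.participants = []

    def add(self, pid):
        self.participants.append(pid)


class NotificationSet:
    """Multiobject: a message sent to the set is cascaded to every element."""

    MAX_ELEMENTS = 3  # the system described above supports at most three

    def __init__(self, count):
        if not 1 <= count <= self.MAX_ELEMENTS:
            raise ValueError("a date has between 1 and 3 notifications")
        # The create message constructs the multiobject and its elements.
        self.elements = [Notification() for _ in range(count)]

    def add(self, pid):
        for element in self.elements:  # cascading semantics
            element.add(pid)
```

Addressing one particular notification would mean reaching into `elements` directly, which corresponds to drawing a separate lifeline for that element.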

4 Component-Based Development

This section on component-based development does not provide any solutions. Rather, its purpose is to give a quick tour of various topics in component-based development, which all point to open research issues. Similar to the question posed about objects ten years ago, it still has to be clarified what a component is all about. The least common denominator may define a component as a reusable artifact. Thus, it encapsulates certain functionality and provides a clear notion of interface to use this functionality. Figure 7 depicts two dimensions to classify components, based on the kinds of artifacts and on the kinds of software development phases where components are reused. Along the artifacts axis, we may distinguish executable objects, class descriptions, patterns of reusable knowledge, frameworks in the sense of patterns with inversion of control [10], and whole executable programs. Along the phases axis, reusability may occur during all software development phases ranging from requirements specification to implementation. An interesting topic of research remains to look into each combination of the two dimensions and investigate their relevance for component technology in turn.

[Figure 7 omitted: two axes; the Artifacts axis lists Object, Class, Pattern, Framework, and Programme; the Phases axis lists Requ.Spec., Analysis, Design, and Implementation.]

Fig. 7. Kinds of reusable artifacts

UML supports the notion of components. There, "a component is a reusable part that provides the physical packaging of model elements." [17, p.45] Thus, in UML a component is a very low-level, implementation-oriented notion. In other words, it is a physical component, which comprises either source code or executable code. However, we feel that this is not enough. To explore the whole potential of reusability, there should also be the notion of a logical component with a clear interface definition supporting both the notion of a provided interface and a required interface. Examples thereof exist in the literature. Subsystems in RDD [19] have contracts, which describe the functionality provided to the "outside world". At the same time, RDD also supports the notion of collaborators, which are other object classes necessary to fulfill the functionality of the object class at hand. Thus, collaborators and their provided operations make up the required interface of the respective object class.

Another question concerns the packaging of functionality within components. Components may be fine-grained, encapsulating some small functionality, e.g., a sort algorithm, or coarse-grained, encapsulating whole applications. Concerning up-to-date application development including distribution and database functionality, we also regard components as a possible mechanism to encapsulate various levels of implementation details and to provide an easy-to-use interface to connect to some database and to use some underlying distribution mechanism, respectively. We feel that the component notation provided by UML is far from sufficient. It seems, however, that the software development community has not yet agreed upon a uniform notion of component-based development. Thus, defining a standard notation might be premature at this point in time.
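The distinction between a provided and a required interface of a logical component can be sketched with structural interfaces. This is an illustration only: the reservation domain, all class and method names, and the use of Python's `typing.Protocol` are assumptions and do not reflect any UML or RDD notation.

```python
from typing import Protocol


class Storage(Protocol):
    """Required interface: what the component needs from a collaborator."""

    def save(self, key: str, value: str) -> None: ...


class ReservationService(Protocol):
    """Provided interface: what the component offers to the outside world."""

    def reserve(self, car: str, client: str) -> bool: ...


class ReservationComponent:
    """Logical component: provides ReservationService, requires a Storage."""

    def __init__(self, storage: Storage):
        self._storage = storage
        self._taken = set()

    def reserve(self, car: str, client: str) -> bool:
        if car in self._taken:
            return False
        self._taken.add(car)
        self._storage.save(car, client)
        return True


class InMemoryStorage:
    """One possible collaborator satisfying the required interface."""

    def __init__(self):
        self.rows = {}

    def save(self, key: str, value: str) -> None:
        self.rows[key] = value
```

The component can be connected to any collaborator that satisfies `Storage`, so the database or distribution mechanism behind it stays hidden from clients of `reserve`.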

5 Conclusion

The purpose of the paper was to demonstrate that UML in its present state still suffers from a certain lack of expressive power as well as from several weaknesses in its definitions. In addition, we have tried to be constructive and have shown ways to overcome some of the problems of UML.

References

1. Bergner, K. et al.: A Critical Look at UML 1.0. The Unified Modeling Language - Technical Aspects and Applications, M. Schader and A. Korthaus (eds.), Physica-Verlag (1998)
2. Ebert, J., Engels, G.: Observable or Invokable Behavior - You have to Choose! Technical Report, Institute of Computer Science, Leiden University (1994)
3. Eriksson, H.-E., Penker, M.: UML Toolkit. John Wiley & Sons (1998)
4. Fowler, M.: UML Distilled: Applying the Standard Object Modeling Notation. Addison Wesley (1997)
5. Fowler, M.: A Survey of Object-Oriented Analyses and Design Methods. Tutorial Notes of European Conference on Object-Oriented Programming (ECOOP) 1996, Linz/Austria (1996)
6. Harel, D., Gery, E.: Executable Object Modeling with Statecharts. IEEE Computer, 30 (7), p. 31-42 (July 1997)
7. Harmon, P., Watson, M.: Understanding UML - The Developer's Guide with a Web-based Application in Java. Morgan-Kaufmann (1998)
8. Hitz, M., Kappel, G.: Software Development with UML. dpunkt Verlag (1998) (in preparation, in German)
9. Jacobson, I., Christerson, M., Jonsson, P., Oevergaard, G.: Object-Oriented Software Engineering - A Use Case Driven Approach. Addison-Wesley (1992)
10. Johnson, R.E.: Frameworks = Components + Patterns. Communications of the ACM, 40 (10), p. 39-42 (October 1997)
11. Kappel, G., Schrefl, M.: Inheritance of Object Behavior - Consistent Extensions of Object Life Cycles. Extending Information Systems Technology, Proceedings of the Second International East/West Database Workshop, J. Eder and L. Kalinichenko (eds.), Springer-Verlag, Workshop in Computing Surveys (1994)
12. Lehrmann Madsen, O., Moller-Pedersen, B., Nygaard, K.: Object-Oriented Programming in the Beta Programming Language. Addison Wesley (1993)
13. Mylopoulos, J.: Object-Oriented and Knowledge Representation. Proceedings of the IFIP TC2 Working Conference on Object-Oriented Databases (DS-4), R. Meersman and W. Kent (eds.), North-Holland (1990)
14. Schrefl, M., Stumptner, M.: Behavior Consistent Extension of Object Life Cycles. Proceedings of the International Conference on Object-Oriented and Entity-Relationship Modeling, LNCS Vol. 1021, Springer-Verlag (1995)
15. Schrefl, M., Stumptner, M.: Behavior Consistent Refinement of Object Life Cycles. Proceedings of the 16th International Conference on Entity-Relationship Modeling, Springer-Verlag LNCS (1997)
16. UML Notation Guide, Version 1.1, Rational Software (September 1997)
17. UML Semantics, Version 1.1, Rational Software (September 1997)
18. Wegner, P., Zdonik, S.B.: Inheritance as an Incremental Modification Mechanism or What Like Is and Isn't Like. European Conference on Object-Oriented Programming (ECOOP 1988), S. Gjessing and K. Nygaard (eds.), Springer LNCS 322, p. 55-77 (August 1988)
19. Wirfs-Brock, R., Wilkerson, B., Wiener, L.: Designing Object-Oriented Software. Prentice Hall (1990)

Supporting and Applying the UML Conceptual Framework

Colin Atkinson

Fraunhofer Institute for Experimental Software Engineering, D-67661 Kaiserslautern, Germany, [email protected]

Abstract. The Unified Modelling Language (UML) ostensibly assumes a four level (meta) modelling framework, both for its definition and for the conceptual context in which its users operate. In practice, however, it is still dominated by the traditional two level (model + data) view of object modelling and neither supports nor applies the four level framework properly. This not only diminishes the clarity of the UML semantics, but also complicates the task of those users who do wish to fully embrace a multi-level approach. After outlining the characteristics of the intended conceptual framework, and the problems resulting from the UML’s current two-level bias, this paper presents three simple enhancements to the UML which provide the required expressive power for multi-level modelling. The paper then goes on to discuss issues in the application of the conceptual framework within the UML’s own definition.

1 Introduction

Although the current version of the Unified Modelling Language (UML) [1] ostensibly assumes a four level conceptual framework, in reality it is very much dominated by the traditional two-level view of object modelling (i.e. model + data). The resulting asymmetry is manifest in two ways: first in the lack of generalized multi-level modelling features, and second in the failure to properly apply the four level conceptual framework in the definition of the UML semantics. To a certain extent the second problem is a symptom of the first, because the UML is used in its own definition. In other words, the first (and probably the most important) application of the UML conceptual framework is in the definition of the UML itself.

Both of these problems have consequences for users of the UML. The first problem complicates the task of users who wish to work beyond the traditional "model" and "data" levels and apply the UML framework in its full generality. Instead of being able to use a notation which recognizes and supports the fundamental symmetry between the levels, such users are forced to try to adapt features which in practice were designed to support only two levels. This results in class diagrams which are overly complicated and inconsistent in their use of object modelling principles. The second problem not only unnecessarily complicates the semantics of the UML, but also causes confusion about what the intended UML conceptual framework really is. Users are presented with a "framework" ostensibly based on four levels, but with a notation which really only supports two.

Fortunately, the required expressive power can be attained with only a few minor enhancements to the existing notation. The following section provides an overview of the conceptual framework underpinning the UML. Section 3 then proposes three simple notational enhancements which significantly increase the UML's support for this multi-level framework. Finally, Section 4 describes problems in the way the UML conceptual framework is used in its own definition.

J. Bézivin and P.-A. Muller (Eds.): «UML»’98, LNCS 1618, pp. 21–36, 1999. © Springer-Verlag Berlin Heidelberg 1999

2 The UML Conceptual Framework

The current form of the conceptual framework underpinning the UML was driven by the OMG standardization requirements, particularly the need for alignment with the jointly standardized Meta Object Facility (MOF) [2], and the desire to be compatible with prevailing industry practice [3], [4], [6]. It is no accident that the MOF and the UML share the same underlying conceptual framework, illustrated in Fig. 1. Not only did they share many of the same contributors, but the last phase in the UML and MOF development process involved an extensive “alignment” activity which aimed to bring the two proposals into agreement.

    (M3) Meta-meta-model (MOF)
          ^
          | instance_of
    (M2) Meta-model (UML Meta-Model)
          ^
          | instance_of
    (M1) Model
          ^
          | instance_of
    (M0) Data

Fig. 1. UML/MOF Conceptual Framework
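The stack in Fig. 1 can be read as a chain of instance_of links. The following Python sketch is only an illustration of that reading: the helper names `Level` and `chain` are hypothetical, and the choice to let the top level describe itself is an assumption taken from the MOF's self-describing nature.

```python
class Level:
    """One level of the framework in Fig. 1 (hypothetical helper type)."""

    def __init__(self, label, name, instance_of=None):
        self.label = label
        self.name = name
        # Assumption: a level with no explicit type is treated as describing itself.
        self.instance_of = instance_of if instance_of is not None else self


m3 = Level("M3", "Meta-meta-model (MOF)")
m2 = Level("M2", "Meta-model (UML Meta-Model)", instance_of=m3)
m1 = Level("M1", "Model", instance_of=m2)
m0 = Level("M0", "Data", instance_of=m1)


def chain(level):
    """Follow instance_of links upwards until a self-describing level is reached."""
    labels = [level.label]
    while level.instance_of is not level:
        level = level.instance_of
        labels.append(level.label)
    return labels
```

Calling `chain(m0)` walks from the data level up to the MOF, which is where the walk stops because the top level refers to itself.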

An issue which always arises in a discussion of this type of framework is the confusion surrounding the commonly used terminology. The problem is that the word “meta” tends to be used in both an absolute sense to indicate a model’s position in the level hierarchy, and in a relative sense to indicate a model’s relationship to another model. For example, the top level in such a four level hierarchy is typically called a “meta-meta-model”, because it is a meta-model for a meta-model. However, it can just as easily be viewed as (and hence called) a meta-model, or for that matter simply a model. The basic problem is the asymmetric use of terminology across the levels. To avoid such confusion in this paper we either use the alphanumeric labels shown on the diagram in Fig. 1, or the names “MOF” and “UML meta-model” for the top level and second level, respectively.

Supporting and Applying the UML Conceptual Framework


The basic purpose of the MOF, at the top (M3) level, is to facilitate the creation of object-oriented meta-models at the level below. Many of the concepts appearing in the MOF are thus familiar object-modelling concepts such as Class, Association and Operation. Fig. 2 illustrates the MOF element, Class:

    Class
    isSingleton : Boolean
    isVisible ()

Fig. 2. Typical M3 level element

Class is actually a descendant of numerous other elements in the MOF, and has various other inherited attributes and methods not shown in Fig. 2. The attribute isSingleton happens to be a local attribute of Class, while isVisible() is an inherited method. Since the concepts used in the creation of a model must all be defined somewhere (i.e. everything must be an instance of something), the MOF is viewed as being an instance of itself [2].

The UML meta-model, at the second (M2) level, is regarded as being an “instance_of” the MOF. Its function is to describe the abstract syntax of the modelling concepts provided by the UML. Not surprisingly, since the UML is also intended to support object modelling (among other things), the part of the UML meta-model describing the object modelling features is very similar to the MOF. For example, the Class concept also appears in the UML meta-model, although with different attributes:

    Class
    isActive : Boolean

Fig. 3. Typical M2 level element

Class diagrams are only one part of the UML, and the UML meta-model naturally has other packages describing the other areas. Fig. 4 is a typical example of the kind of element that might appear in an M1 level user model. This level is often called the “model” level, and is viewed as being an “instance_of” the UML meta-model:

    Person
    name : String
    birth_date : Integer
    address : String
    age() : Integer

Fig. 4. Typical M1 level element


The bottom (M0) level contains the actual entities that appear in the final object-oriented program or database. This level is therefore often referred to as the “data level”, and is viewed as being an instance of an M1 model. A typical element at this level is shown in Fig. 5.

    President : Person
    name = “Bill Clinton”
    age = 1952
    address = “White House”

Fig. 5. Typical M0 level element
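Python's own object model offers a loose analogy for the four levels of Figs. 2 to 5, which may help readers who think in code. In the sketch below, a metaclass stands in for the M2 element Class, an ordinary class for the M1 element Person, and an object for the M0 element President; the built-in `type` plays a role comparable to a self-describing top level. The mapping is an illustration only and is not part of the UML or the MOF.

```python
class Class(type):
    """Stands in for the M2 element Class (a Python metaclass)."""


class Person(metaclass=Class):
    """Stands in for the M1 element Person of Fig. 4."""

    def __init__(self, name, birth_date, address):
        self.name = name
        self.birth_date = birth_date
        self.address = address


# Values follow Fig. 5. The figure labels 1952 as "age"; storing it as
# birth_date is an assumption made to match the attributes of Fig. 4.
president = Person("Bill Clinton", 1952, "White House")

# Each element is an instance of an element on the next level up.
m1_type = type(president)   # Person
m2_type = type(m1_type)     # Class
m3_type = type(m2_type)     # type, which is an instance of itself
```

Note that Python does not stop at four levels by decree; the hierarchy terminates simply because `type(type)` is `type`.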

Fig. 5 also shows the UML convention for distinguishing an “instance” from a “type” (i.e. template). The same graphical symbol is used in both cases, but the names of instances are underlined.

2.1 Built-in Extension Mechanism

The part of the UML which most clearly illustrates its two level bias is the so-called built-in extension mechanism. The purpose of this mechanism is to enable users to customize the UML for their specific needs by extending the set of available modelling concepts. Since it is itself an object model, the UML meta-model is already inherently extensible at the M2 level through the normal object-oriented specialization mechanism. However, the UML documentation downplays this approach to extension, and instead encourages users to define extensions indirectly at the M1 level in terms of three special “built-in” extension features: stereotypes, tagged values and constraints. These essentially provide an elaborate way of simulating M2 level specializations at the M1 level.

Stereotypes

Stereotypes provide a way of classifying model elements in terms of a classifier that is an implicit component of the UML meta-model. A stereotype can therefore be thought of as a “virtual” or “pseudo” M2 class that is a specialization of an explicit M2 class. It follows that the names of stereotypes, which are shown in guillemets, cannot clash with the names of explicit UML meta-model elements. In principle, stereotypes can be applied to any kind of model element appearing in the UML meta-model. Fig. 6 shows an example of a stereotype «Testable_Class» applied to a class, Country. Stereotypes are never defined separately, but always in terms of their application to a stereotyped model element. The example indicates that Country is not an ordinary class, but a special kind of class that has the stereotype «Testable_Class». This is meant to designate the fact that the class can be tested, which is only true for executable classes. Some classes, such as abstract classes, are not executable.


    «Testable_Class»
    Country
    name : String
    creation_date : Integer
    population : Integer
    age() : Integer

Fig. 6. Stereotyped class

The stereotype applied to Country in Fig. 6 is user defined. Users can introduce new stereotypes at any time during their modelling work simply by assigning a stereotype name to a model element. The UML also has a predefined set of stereotypes known as standard elements. The names of user defined stereotypes cannot clash with those of predefined stereotypes.

Tagged Values

The UML allows arbitrary properties to be assigned to M1 level model elements at any time during the modelling process. Such properties are known as tagged values and take the form of tag/value pairs with the syntax “tag = value”. A comma-separated list of such tag/value pairs inside a pair of braces is known as a property specification, and appears under the name of the model element possessing those properties. Fig. 7 extends the class in Fig. 6 with tagged values to indicate that the class Country has two associated properties: tested, which has the value 10.12.95, and known_bugs, which has the value 2. These are properties of the class and are not passed on to its instances. They consequently correspond to M2 level attribute values.

    «Testable_Class»
    Country
    {tested = 10.12.95, known_bugs = 2}
    name : String
    creation_date : Integer
    population : Integer
    age() : Integer

Fig. 7. Stereotyped class with tagged values
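To make the shape of this M1 level extension data concrete, the sketch below stores a stereotype name and a set of tagged values on a model element and renders its name compartment in the style of Fig. 7. The class `ModelElement` and its fields are hypothetical names chosen for illustration; they are not taken from the UML meta-model.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ModelElement:
    """Hypothetical M1 model element carrying built-in extension data."""
    name: str
    stereotype: Optional[str] = None
    tagged_values: Dict[str, str] = field(default_factory=dict)

    def header(self) -> str:
        """Render the name compartment: stereotype, name, property specification."""
        lines = []
        if self.stereotype:
            lines.append(f"«{self.stereotype}»")
        lines.append(self.name)
        if self.tagged_values:
            pairs = ", ".join(f"{tag} = {value}"
                              for tag, value in self.tagged_values.items())
            lines.append("{" + pairs + "}")
        return "\n".join(lines)


country = ModelElement(
    "Country",
    stereotype="Testable_Class",
    tagged_values={"tested": "10.12.95", "known_bugs": "2"},
)
```

Nothing in this structure ties the tags to the stereotype or gives them a type, which mirrors the point that the extension is simulated at the M1 level rather than defined at the M2 level.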

Tagged values are often associated with stereotypes, as in this example. When this is the case, the assignment of a particular stereotype to a model element also mandates the provision of values for the corresponding tags. As with stereotypes, a certain number of predefined tagged values form part of the UML standard elements.

Constraints

Constraints are much like stereotypes in that they define special variants of a given type of model element. However, in contrast with stereotypes, constraints define the precise conditions that must be met by the variant. Thus, for example, “disjoint” is a constraint that can be applied to generalization relationships to indicate that the resulting subclasses have no instances in common. A generalization subject to this constraint is still a generalization, but with the additional properties specified by the constraint. In the case of stereotypes, on the other hand, the characteristics that are implied by the stereotype (e.g. Testable_Class) are not formally specified as part of the UML model. Constraints can also be attached to stereotypes directly, in which case every model element possessing that stereotype must also adhere to the associated constraint(s). Although constraints can be applied to any generalizable element, in practice they tend to be most often applied to relationships. In particular, all the predefined constraints (standard elements) apply to some kind of relationship, or relationship component (e.g. link end).

2.2 Extensions Versus Variants

Whenever one or more of these special “built-in” extension features is used within an M1 level model, the result is called a “UML extension”. A UML extension represents a customization of the UML with modelling concepts specialized for the domain of interest (e.g. Testable_Class, tested, known_bugs etc.). The UML documentation contains two predefined UML extensions, one for business process modelling, and the other for supporting the classic Objectory process [5].

As noted above, since the UML meta-model is an instance of the MOF (i.e. an object model), it can be extended directly just like any normal class diagram. Such an extension of the UML meta-model is known as a “UML variant”. There is thus a distinction between a “UML extension”, which is based on an M1 level application of the “built-in” extension features described above, and a “UML variant”, which is a direct M2 level extension of the UML meta-model. The UML documentation makes clear its preference for the former. So much so, in fact, that while it provides an elaborate set of notational features for creating UML extensions, it largely ignores the notational needs of UML variants. A clean and consistent graphical description of a UML variant, indeed of the UML meta-model itself, therefore requires the use of some minor notational enhancements of the kind described in the following section.

3 Supporting a Multi-level Modelling Framework

The UML documentation makes clear the fundamental importance of the type-instance dichotomy in the UML (page 11 in the UML notation guide) [1]. However, problems arise if this dichotomy is not applied uniformly, and with great care, when there are more than two levels in the modelling framework. The whole point of a meta-model is to define the concepts from which models in the layer below are created. Thus, by definition, every element, in every model, at every level is an instance of something else (assuming that the top level is an instance of itself). Moreover, it follows that every instantiatable model element (in levels M1 and above) is both a type and an instance. In terms of the examples in Sect. 2, not only is President an instance, but the M1 level element Person, the M2 level element Class, and the M3 element Class are also instances. Moreover, apart from President, each of these elements is both an instance and a type (i.e. both an object and a class).

The simple type-instance dichotomy and associated notation that worked in a two-level framework is consequently no longer adequate for a multi-level framework. For example, a simple application of the rule that the names of instances are underlined would result in the name of every model element being underlined. Also, the current UML notation forces the features of a model element to be depicted in different ways depending on whether it is being viewed as a type or as an instance. For example, since Person is an instance of the M2 element Class, it is perfectly legal UML to treat it as an object and provide a value for its isActive attribute:

    Person : Class
    isActive = False

Fig. 8. Instance view of Person

However, in the type view of Person (Fig. 9), the only way to show the value for the attribute isActive is in the form of a tagged value:

    Person
    {isActive = False}
    name : String
    birth_date : Integer
    address : String
    age() : Integer

Fig. 9. Type view of Person

At least there is a way to show meta-attribute values in the type view of a class. However, if Person also had a method instance (that is, if Class had a method type, such as the method Example() illustrated in Fig. 10), there is no way this could be shown in the type view of Person using the current notation.

    Class
    isActive : Boolean
    Example() : Boolean

Fig. 10. Variant of UML metaclass, Class, with example method
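In a language whose classes are themselves objects, the two facets of Person coexist without any special treatment. The sketch below uses a Python metaclass as a stand-in for the variant of Class in Fig. 10; the body of `example` is invented, since the paper gives only its signature, and the snake_case names are adaptations of isActive and Example().

```python
class Class(type):
    """Stands in for the variant of the M2 element Class in Fig. 10."""

    def __new__(mcls, name, bases, namespace, is_active=False):
        # The keyword is consumed here so that type.__new__ never sees it.
        return super().__new__(mcls, name, bases, namespace)

    def __init__(cls, name, bases, namespace, is_active=False):
        super().__init__(name, bases, namespace)
        cls._is_active = is_active

    @property
    def is_active(cls):
        """Meta-attribute value: readable on the class, not on its instances."""
        return cls._is_active

    def example(cls):
        """Hypothetical body for the meta-method Example() : Boolean."""
        return True


class Person(metaclass=Class, is_active=False):
    """Stands in for the M1 element Person."""
```

Here `Person.is_active` and `Person.example()` belong to the instance facet of Person, while anything defined in the body of `Person` belongs to its type facet and is what objects such as President would receive.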

The basic problem with the current version of the UML notation is that it fails to reconcile the type and instance facets of instantiatable model elements, and fails to apply the basic tenets of the type-instance dichotomy uniformly across the different levels of the conceptual framework. However, this can be rectified quite easily with the three simple enhancements described in the following subsections.

3.1 Instance_Of Relationship

The first enhancement does not really require an addition to the UML notation, as such, but rather an extension of the way in which an existing feature is used. As illustrated in Fig. 5, the UML already incorporates a textual representation for the instance_of relationship in the form of the traditional “:” operator. By simply generalizing the use of this feature to all levels, as in Fig. 8, the type of any model element, at whatever level, can be uniformly identified. Thus, for example, it is possible to indicate that the M1 element Person is an instance of the M2 element Class from Fig. 3 as follows:

    Person : Class
    name : String
    birth_date : Integer
    address : String
    age() : Integer

Fig. 11. "Instance_of" notation

Name Underlining

Clearly the convention of underlining the names of instances no longer makes sense in a multi-level framework because every model element is an instance. However, the underlining of names still has a useful role to play. The reason is that although every model element is an instance, not every model element is a type. Many model elements, including all those at the M0 level, are not instantiatable and thus do not have a type facet. It makes sense, therefore, to use the underlining of names to distinguish instantiatable elements (with both an instance and a type facet) from non-instantiatable elements (without a type facet). To remain faithful to the intent of the current underlining rule, it is the names of the latter that are underlined. An example of a non-instantiatable model element which occupies a level above M0 is a specific stereotype instance such as Testable_Class:

    Testable_Class : Stereotype

Fig. 12. Stereotype instance

3.2 Class/object Duality

As mentioned above, in a multi-level modelling framework, instantiatable model elements have both an instance facet and a type facet, both of which are equally valid. A way of reconciling these two facets notationally is offered by the 3D visualization of an instantiatable model element as a cube:

[Figure: a cube whose left hand face is labelled “Type (class) view” and whose right hand face is labelled “Instance (object) view”]

Fig. 13. 3D visualization of instantiatable element

The right hand face of this cube represents the instance (or object) facet of the model element, and contains the attribute values and method instances derived from the element from which it was instantiated. The left hand face represents the type view of the model element, and contains the attributes and method (types) which its instances will receive. By “flattening” this cube into two dimensions we obtain a representation of model elements capable of handling both facets of an instantiatable element. The basic convention is that “instance” related features are indented with respect to the “type” related features to convey the idea that they are on the right hand face of the cube.

    Name
    attributes
        attribute values
    method types
        method instances

Fig. 14. Generalized notation

This is the generalized notation for model elements in a multi-level modelling framework. It allows the type and instance facets of a model element to be shown together in a consistent, uniform way at any level. Fig. 15 shows how this notation would be used in the case of the class Person.

    Person : Class
    name : String
    birth_date : Integer
    address : String
        isActive = False
    age() : Integer

Fig. 15. Generalized notation applied to Person
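The indentation convention is mechanical enough to be produced by a small renderer. The following sketch is one possible text layout, assuming a hypothetical `Element` record with separate lists for the type facet and the instance facet; the four-space indent is an arbitrary choice.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Element:
    """Hypothetical container for the two facets of a model element."""
    name: str
    type_name: str
    attributes: List[str] = field(default_factory=list)        # type facet
    attribute_values: List[str] = field(default_factory=list)  # instance facet
    method_types: List[str] = field(default_factory=list)      # type facet
    method_instances: List[str] = field(default_factory=list)  # instance facet


def render(element: Element, indent: str = "    ") -> str:
    """Lay the element out as in Fig. 14: instance features are indented."""
    lines = [f"{element.name} : {element.type_name}"]
    lines += element.attributes
    lines += [indent + value for value in element.attribute_values]
    lines += element.method_types
    lines += [indent + method for method in element.method_instances]
    return "\n".join(lines)


person = Element(
    "Person", "Class",
    attributes=["name : String", "birth_date : Integer", "address : String"],
    attribute_values=["isActive = False"],
    method_types=["age() : Integer"],
)
```

Printing `render(person)` yields the compartments of Fig. 15 in order, with only the meta-attribute value indented.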


Class Scope Attributes and Methods

The UML supports the concept of so-called “class scope” attributes and methods, which in C++ correspond to static data members and static member functions respectively. A class scope feature (i.e. method or attribute) differs from a normal feature in that only one instance exists in the final running system, regardless of the number of instances of the class. In a sense, therefore, it belongs to the class rather than to the individual instances of the class. Indeed, this is precisely how class scope entities are represented in languages such as Smalltalk which allow classes to exist at run-time. In Smalltalk a class scope attribute would be implemented as a class instance variable, and a class scope method as a class method.

From the perspective of a multi-level modelling framework, class scope features essentially correspond to instances of meta-features. In the case of attributes, however, there is a slight difference between the dynamic properties of class scope attributes and those of meta-attribute instances. Class scope attributes as currently understood in the UML are allowed to change their values over time, which implies a run-time presence, whereas meta-attributes of the form discussed previously (e.g. isActive) are generally assumed to be constant. In other words, they implicitly possess the UML property “{frozen}”, which indicates that something has a constant value. Class scope attributes, therefore, are really a more general form of meta-attribute which is amenable to change over time.

Since class scope features essentially correspond to meta-features, with more general dynamic properties in the case of attributes, the notation suggested here can be applied without difficulty. Basically, class scope features are indented with respect to normal features. The convention of underlining class scope features is thus redundant, but does not clash with the indentation convention and so can be used if desired.

3.3 Level Identification

Since the generalized notation in Fig. 14 is intended to be used uniformly at all model levels, it is important to have some way of indicating which level an element occupies. A simple but effective approach is to make the level number a superscript following the element name. Thus, to indicate that the element Person inhabits the M1 model, its name would be appended with the superscript 1, as follows:

    Person¹ : Class

Fig. 16. Level identification notation

The level number is tightly bound to the name it is a superscript for. Thus, if Class were to be given a level identifier in Fig. 16 it would obviously be 2, since this is the level it occupies. However, it is rarely necessary to provide level identifiers for both the instance name and the type name of an element because it is assumed that the type occupies the level above its instances. In the rare cases where this is not so, such as in the MOF, the levels of both the instance and type should be shown explicitly. Another situation where it makes sense to show both is when the instance and the type have the same name, as in the case of the M2 element Class:

    Class² : Class³

Fig. 17. Level notation applied to both type and instance
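The rule for when the level of the type needs to be shown can be stated as a small function. The sketch below is an interpretation of the text above: it shows the type's level when the type is not exactly one level above the instance, or when instance and type share a name. The function names are hypothetical, and rendering the superscript with Unicode superscript digits is a typographic assumption.

```python
SUPERSCRIPTS = str.maketrans("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹")


def sup(level: int) -> str:
    """Render a level number as superscript digits."""
    return str(level).translate(SUPERSCRIPTS)


def label(name: str, level: int, type_name: str, type_level: int) -> str:
    """Build 'name : type' with level identifiers where they are needed."""
    # The type's level is implied when it sits exactly one level above.
    implied = type_level == level + 1
    show_type_level = (not implied) or (type_name == name)
    text = f"{name}{sup(level)} : {type_name}"
    if show_type_level:
        text += sup(type_level)
    return text
```

For example, `label("Person", 1, "Class", 2)` omits the level of Class, whereas `label("Class", 2, "Class", 3)` shows both levels because the two names coincide.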

3.4 Creating a UML Variant

These three enhancements to the UML provide all the features needed to develop models at any level in a uniform and consistent way. For example, they can be used at the M1 level as an alternative representation for the UML extension in Fig. 7:

    Country¹ : Testable_Class
    name : String
    creation_date : Integer
    population : Integer
        tested = 10.12.95
        known_bugs = 2
    age() : Integer

Fig. 18. Example UML extension

Fig. 18 has exactly the same meaning and effect as Fig. 7. It basically indicates that Country is an instance of a new kind of Class, Testable_Class, and has two attribute values. Of course, the main benefit of the enhancements is that they allow UML variants as well as extensions to be fully described graphically. For example, in Fig. 19 the new model element, Testable_Class, is shown explicitly within the M2 generalization hierarchy as a specialization of Class, and defines the additional attributes which each of its instances must possess.

    Class²
       ^
       | (generalization)
    Testable_Class² : Class
    tested : Date
    known_bugs : Integer

Fig. 19. Example UML variant
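The difference between simulating Testable_Class at the M1 level and defining it at the M2 level can be sketched with Python metaclasses. Below, `Testable_Class` is an explicit specialization of a stand-in for the M2 element Class and requires values for its two attributes whenever it is instantiated. Reading 10.12.95 as day.month.year is an assumption, as is the whole mapping onto Python.

```python
import datetime


class Class(type):
    """Stands in for the M2 element Class."""


class Testable_Class(Class):
    """Stands in for the M2 specialization Testable_Class of Fig. 19."""

    def __new__(mcls, name, bases, namespace, *, tested, known_bugs):
        cls = super().__new__(mcls, name, bases, namespace)
        cls._meta_values = {"tested": tested, "known_bugs": known_bugs}
        return cls

    def __init__(cls, name, bases, namespace, **meta_values):
        super().__init__(name, bases, namespace)

    @property
    def tested(cls):
        return cls._meta_values["tested"]      # tested : Date

    @property
    def known_bugs(cls):
        return cls._meta_values["known_bugs"]  # known_bugs : Integer


class Country(metaclass=Testable_Class,
              tested=datetime.date(1995, 12, 10), known_bugs=2):
    """Stands in for the M1 element Country of Fig. 18."""
```

Because the attributes are declared on the M2 element, omitting a value when creating an instance of `Testable_Class` is rejected, which is the kind of guarantee a stereotype name alone does not express.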

Fig. 20 summarizes the features of the generalized notation, and compares it to the way the existing UML notation has to be used to describe the two facets of a model element. Example is an imaginary M2 element.

    UML 1.1 Type View:
        Example
        {isSingleton = False}
        anAttribute
        aMethod ()

    +

    UML 1.1 Object View:
        Example : Class
        isSingleton = False
        isVisible ()

    =

    Generalized UML:
        Example² : Class
        anAttribute
            isSingleton = False
        aMethod ()
            isVisible ()

Fig. 20. Summary of generalized notation

An important point to note about the notational enhancements suggested here is that, with one exception, they represent a generalization of the existing notation rather than a change. In other words, the existing (type-oriented) UML representation of classes is a natural subset of the enhanced notation. The one exception is the representation of non-instantiatable model elements (i.e. pure objects), which typically occupy the M0 level. Strictly applying the notation suggested here would require attribute values and method instances to be indented. However, since such elements never have any type properties (i.e. attributes and method types), this requirement can be relaxed if it is felt to be too onerous.

4 Applying the UML Conceptual Framework

The previous section introduced three notational enhancements to help the UML support its conceptual framework in a more uniform way. However, in addition to the deficiencies in its support for the multi-level framework, there are also some significant problems in the way it applies the framework in its own definition. This section discusses some of the main problems.

4.1 Type and Attribute Value Specification

As mentioned in Sect. 2, every model element is an instance of some other model element, no matter where it exists in the level hierarchy. However, the UML semantics document provides no indication of the type from which any of the UML meta-model elements are instantiated. Worse still, it provides no indication of the values for the attributes defined by the type. This is a significant omission, since by definition, an instance must have values for the attributes defined by its type, even if they are default values. There is no point in defining attributes at the MOF level if none of the UML meta-model elements have values for them. Using the notation introduced in the previous section, a full description of a UML meta-model element would not only specify which MOF element it is an instance of, but also provide values for its attributes as shown in Fig. 21. The UML documentation provides no value for isSingleton, so the value here is an example.

    Class² : Class³
    isActive : Boolean
        isSingleton = False

Fig. 21. Full specification of UML meta-model element
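The omission described here amounts to a simple completeness check: every attribute defined by the type should have a value on the instance. The sketch below is a minimal illustration with hypothetical names; the attribute table contains only isSingleton from Fig. 2, and the value False is the example value of Fig. 21 rather than one taken from the UML documentation.

```python
def missing_values(type_attributes, instance_values):
    """Names of attributes defined by the type for which the instance has no value."""
    return sorted(set(type_attributes) - set(instance_values))


# Attributes defined by the M3 element Class (only the one shown in Fig. 2).
mof_class_attributes = {"isSingleton": "Boolean"}

# The M2 element Class as published: no values for MOF-level attributes.
uml_class_as_published = {}

# The M2 element Class as fully specified in Fig. 21 (example value).
uml_class_full = {"isSingleton": False}
```

With these tables, `missing_values(mof_class_attributes, uml_class_as_published)` reports isSingleton as unspecified, while the fully specified variant reports nothing.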

Note that these problems exist in the definition of the MOF as well as the UML. Since the MOF is defined to be an instance of itself, every element in the MOF must be an instance of some other element in the MOF and must thus have the corresponding attribute values.

4.2 Standard Element Location

Since the built-in extension mechanism involves no actual changes to the M2 level, the information represented by an extension has to be stored in the form of stereotype instances and tagged values. The part of the UML meta-model which describes how this is achieved is illustrated in Fig. 22.

[Figure: class diagram relating ModelElement, GeneralizableElement, TaggedValue and Stereotype, with association ends labelled extendedElement, taggedValue (*), requiredTag (*) and stereotype (0..1)]

Fig. 22. UML meta-model segment for extension mechanisms

Applying this model to the stereotyped class Country in Fig. 7 yields the following data structure.

    Country
    Known_Bugs : TaggedValue
    Tested : TaggedValue
    Testable_Class : Stereotype

Fig. 23. Example data structure for a UML extension

Notice that all the model elements in Fig. 23 except Country have their names underlined, because none of them is instantiatable. The big question raised by this strategy for representing extensions is where these model elements reside. Clearly Country occupies the M1 level since it is a regular, user-defined class. This would seem to require that the other elements in Fig. 23 also exist at the M1 level, otherwise the links between them would have to cross meta-level boundaries. However, if these stereotype and tagged value instances are viewed as M1 level elements, then surely so must all other stereotype and tagged value instances, including those defined as standard elements. However, this contradicts the view in the UML Notation Guide (page 21), which states that “the classification hierarchy of the stereotypes themselves could be displayed on a class diagram: however, this would be a metamodel diagram and must be distinguished (by user and tool) from an ordinary model diagram”. At least the notation guide makes a statement on the issue. The semantics document does not even address the question of the location of the standard elements. Since the location of Country at the M1 level is indisputable, unless it is deemed acceptable to have links crossing meta levels, Country’s stereotype and tagged value instances must also reside at the M1 level.

4.3 Strict versus Loose Meta-modelling

The previous issue is a symptom of a more fundamental problem: the use of a “loose” meta-modelling approach in the definition of the UML semantics. This approach allows instances to coexist with their types at the same level of a meta-modelling hierarchy [7]. In contrast, “strict” meta-modelling requires that an instance always resides at the level below its type, except at the top level where the rule can be relaxed in order to cleanly terminate the level hierarchy. The Common Behavior package (Fig. 14, page 67, UML Notation Guide) contains the most concrete examples of coexistent instances and types in the UML meta-model. The problem with a loose meta-modelling approach is that it erodes the integrity of the distinction between the levels because it is impossible to avoid links and associations crossing level boundaries. Consider the case of the model element Object, for example. In the UML meta-model, Object is an instance of the M2 level element Class, and itself resides at the M2 level. However, other “normal” instances of Class, such as Person and Country, clearly reside at the M1 level. Therefore, if one wished to establish relationships between Object and these classes, particularly generalization, these would have to cross the boundary between M1 and M2 [6]. The effect of loose meta-modelling is therefore to blur the boundaries between the levels so that ultimately their contents become arbitrary, and they essentially act like packages within a single model. The only way to cleanly separate the levels is to adopt a strict meta-modelling approach and ensure that associations and links never cross meta-level boundaries. However, this requires that the UML modelling framework explicitly recognize that a certain number of the predefined model elements exist at the M1 level within one or more predefined packages.
As discussed above, the elements that would reside here include the predefined standard elements, and the generalized instances of M2 level elements such as class and association. This is consistent with the approach adopted in many object-oriented language environments such as Java and Smalltalk, where user defined classes are added to a predefined inheritance hierarchy rooted in a class typically called object (or something similar). In order to adopt a strict meta-modelling approach the UML needs to define a similar predefined “library” (i.e. package) of model elements. User concepts would then be added as specializations of the M1 class hierarchy, and as instances of the M2 model elements.
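The strict rule lends itself to a mechanical check. The sketch below encodes a handful of the elements discussed in this paper together with the levels the text assigns to them, and reports every instance_of link that does not go up by exactly one level, allowing the top level to describe itself. The function name and the data tables are hypothetical; "MOF Class" is a label introduced here to distinguish the M3 element from the M2 element of the same name.

```python
def strictness_violations(levels, instance_of, top):
    """Return the (instance, type) pairs that break strict meta-modelling."""
    violations = []
    for instance, type_ in instance_of:
        # The rule is relaxed at the top level to terminate the hierarchy.
        if levels[instance] == top and levels[type_] == top:
            continue
        if levels[type_] != levels[instance] + 1:
            violations.append((instance, type_))
    return violations


levels = {
    "MOF Class": 3,
    "Class": 2,
    "Object": 2,      # resides at M2 in the UML meta-model
    "Person": 1,
    "Country": 1,
    "President": 0,
}

instance_of = [
    ("MOF Class", "MOF Class"),
    ("Class", "MOF Class"),
    ("Object", "Class"),
    ("Person", "Class"),
    ("Country", "Class"),
    ("President", "Person"),
]
```

Under these assumptions the only link reported is the one from Object to Class. Moving Object to level 1 in the table, as the predefined M1 library proposed above would do, makes the check pass.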

5 Conclusion

Due to the relatively late adoption of meta-modelling in the UML development process there are some shortcomings in the way the UML supports and applies its own conceptual framework. Although the framework ostensibly has four levels, in practice both the UML notation and the UML definition are still dominated by the traditional two level view of object modelling, particularly in the area of customization. The built-in extension mechanism (based on stereotypes, tagged values and constraints) has no advantage over the more fundamental M2 level approach to customization. In fact, the notational enhancements suggested in this paper can achieve precisely the same effects as the current “built in” mechanism but in a way that is more uniform and consistent with fundamental tenets of object modelling. Stereotypes and tagged values simply complicate what otherwise would be a very simple, clean and natural approach to customization. The main goal of this paper is to make potential users of the UML aware of the pitfalls arising in a multi-level modelling framework of the kind adopted by the UML, and to give them the tools needed to avoid them. The majority of UML users will probably never need to customize the UML or to work at any levels other than M1 or M0, but it is nevertheless useful for them to appreciate the wider picture. In particular, an understanding of the problems with the application of multi-level modelling concepts in the definition of the UML will not only help users gain a better understanding of its semantics, but also help them to avoid the same problems in their own work. For those users who wish to develop customizations of the UML, the enhancements put forward in this paper facilitate the complete and concise description of UML variants, or the description of UML extensions using a more uniform and consistent notation than that available with stereotypes and tagged values.

Acknowledgements

The author is grateful to Dilhar DeSilva of Platinum Technology for his input into the ideas expressed in this paper, and to Mr. A. L. Atkinson for his comments on early versions of the paper.

References

1. Unified Modeling Language Documentation Set, Version 1.1. Rational Software Corp. (1997)
2. Meta Object Facility (MOF) Specification, OMG Document ad/97-08-14 (1997)
3. CDIF Framework for Modeling and Extensibility (IS-107). Electronic Industries Association (1993)
4. Object Analysis and Design Facility. OMG OA&D RFP response by Platinum Technology (1997)
5. Jacobson I.: Object-Oriented Software Engineering: A Use Case Driven Approach. Addison-Wesley, Reading, MA (1994)
6. Bézivin J. and Lemesle R.: Ontology-Based Layered Semantics for Precise OA&D Modeling. In: ECOOP’97 Workshop on Precise Semantics for Object-Oriented Modeling Techniques (1997)
7. Atkinson C.: Metamodeling for Distributed Object Environments. In: First International Enterprise Distributed Object Computing Workshop (EDOC’97). Brisbane, Australia (1997)

Modeling: Is It Turning Informal into Formal?

Bernard Morand
GREYC UPRESA CNRS 6072, Université (IUT) et ISMRA, 14032 Caen Cedex, France
[email protected]
http://www.iutc3.unicaen.fr/~moranb

Abstract. This work studies the meaning of the qualifier «semi-formal», which is usually attributed to design diagrams. Starting with a UML diagram as an example, the paper deals with the three modes of expressing things about the outside world: symbols, indexes and icons. The idea that the informational process consists in formalizing an informal given is discussed with regard to the supposedly informal nature of the users’ requirements. It is also shown that a modeling language such as UML, although formalized in its inner constructions, cannot strictly formalize the connection to the outside world it intends to model. This framework, arising from C. S. Peirce’s semiotics, makes it possible to account for the modeling process as an effective interpretive reasoning on diagrams, which are themselves made of signs. Thus we go beyond the apparent contradiction between the formal and the informal, using the concept of Interpretant. We can then envisage the study of design reasoning as a dialog between a model, its interpretants and the outside world or domain.

1 Introduction

The Information Systems Design domain bears the historical marks of a great variety of approaches and modeling tools. This diversity arose from the differences between the application domains concerned as well as from the variety of Software Engineering paradigms. For example, the functional approach has long used Data Flow Diagrams [1] for data processing applications in structured programming environments (Cobol, Pascal). The data approach uses an Entity Relationship model [2] for database applications in a declarative programming environment (SQL). The event approach uses State Charts [3] to develop real-time systems with specific programming languages. This diversity has long been a source of two difficulties in modeling activities. On the one hand, an application domain rarely belongs to a pure category, and the designer must use several types of diagrams simultaneously, which raises the problem of their inter-relations and coherence. On the other hand, the haziness of the link between diagrams and their computer implementation has led to the emergence of two separate or even antagonistic cultures: one dealing with analysis, which focuses on users’ requirements, and the other dealing with system design (see [4] for a review). These difficulties have today reached a critical point in a context of

J. Bézivin and P.-A. Muller (Eds.): «UML»’98, LNCS 1618, pp. 37-48, 1999. © Springer-Verlag Berlin Heidelberg 1999


more and more distributed systems, which communicate with, and cut across, the application domains. From this point of view, the object-oriented approach offers a considerable advance by unifying the data, activity and behavior aspects. It also allows a synthesis of the different approaches that can be instantiated in various application domains. Furthermore, by offering a single language (a system of concepts) from the users’ requirements to the design of software objects, the OO approach raises the question of the relationship between analysis and design in a new way. Going from one stage to the other by means of model rewriting rules has given way to the progressive enrichment of a single system of concepts. It is in this particular sense that one should understand the adjective «Unified» in Unified Modeling Language: «One of the key motivations in the minds of the UML developers was to create a set of semantics and notation that adequately addresses all scales of architectural complexity, across all domains» [5]. These advances lead us to ask the following question: given a set of general concepts, formally and clearly defined within UML, how can they be used in practice? In other words, is it possible to single out some general rules about the usage of the language concepts that would be independent of the designers’ individual skills? Software engineering traditionally offers «methods» that articulate the models within stages in order to specify the «how-to» [6, 7, 8, 9]. UML has chosen not to set any standard in this domain, leaving each company free to adapt it to its context by means of language extensions. However, an unresolved problem remains: how do diagrams make it possible to «capture» the domain knowledge, to represent it in a model and to communicate it by means of UML? How can the language constructions be made effective in a practical modeling situation?
This paper intends to offer research directions by studying the nature of the reasoning chain applied by the designer, alone or as a group, to create a UML diagram. We argue that the modeling task does not consist in a progressive shaping by means of a language. Thus we question the expression «semi-formal». It implicitly assumes that a diagram is a transition from informal initial contents towards a fully formalized implementation schema. What is at stake is the specification of a truly intelligent tool for modeling assistance. Using an example, we show in section 2 that a diagram uses an original combination of signs containing symbols (replicas of those in the UML language) but also indexes that designate domain concepts and icons that reuse natural language terms. Section 3 presents two arguments. On the one hand, we show that the inputs to the modeling process have already been formalized in their own way: they are information. On the other hand, the process resource (UML) is not fully formalized, at least in the sense of Model Theory. Section 4 shows how the interpretant concept (an abstraction of the diagram’s author/reader) makes it possible to account for the reasoning chain and its complexity. This semiotic approach to the modeling task replaces the dyadic relation of informal-to-formal transformation with a triadic relationship between a signified, a signifier and an interpretant. Such a relationship explains the production of new signs in diagrams and defines the concept of modeling process.


2. A UML Diagram Expresses Its Subject According to Three Modes

Let us take the following informational object, an archetype of Information Systems: «customer’s orders for products». No hypothesis is made as to its nature or origin: it may be a natural language expression written in a specification, something heard in a discussion with the future users, a written document found on a desk or an icon on a computer display. A possible UML diagram for this informational object is shown in figure 1.

[Figure 1: class diagram with three classes. CUSTOMER (Name: string; Address: string; Create(), Delete()), ORDER (Ord. Date: date; Create(), Delete(), Display()) and ITEM (Quantity: integer; Product: string; Add(), Delete(), Display()). CUSTOMER and ORDER are linked by the association «place» with multiplicities 1 and 0..*; ORDER and ITEM are linked by a composition with multiplicity 1..* and the constraint {ordered}. The labels ICON, INDEX and SYMBOLS annotate parts of the diagram.]

Fig. 1. A possible UML diagram for customer’s orders for products
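As a hedged illustration of how the design of Figure 1 might be carried over to code, the following Python sketch mirrors its three classes. The method names `place` and `add`, the use of lists for the associations and the field names are assumptions made for this example; they are not prescribed by the diagram, and the 1..* lower bound on items is not enforced.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Item:
    """ITEM in Figure 1: one line of an order."""
    quantity: int
    product: str  # the product is only a simple attribute in this design


@dataclass
class Order:
    """ORDER in Figure 1; it owns its items (composition, {ordered})."""
    ord_date: date
    items: List[Item] = field(default_factory=list)  # a list keeps insertion order

    def add(self, quantity: int, product: str) -> Item:
        # Hypothetical helper: items are created through their order,
        # so an item never exists outside the order that contains it.
        item = Item(quantity, product)
        self.items.append(item)
        return item


@dataclass
class Customer:
    """CUSTOMER in Figure 1, linked to its orders by the «place» association."""
    name: str
    address: str
    orders: List[Order] = field(default_factory=list)

    def place(self, ord_date: date) -> Order:
        # Hypothetical helper standing for the «place» association.
        order = Order(ord_date)
        self.orders.append(order)
        return order
```

Even in this small sketch, the class and attribute names are natural language words chosen for the reader, while the `dataclass` machinery plays the role of the notation.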

From the semiotic point of view stemming from Peirce [10, 11, 12, 13], we consider such a diagram as a sign made of signs. According to Peirce’s definition, «a sign is something which stands to somebody for something in some respect or capacity» [10]. The sign relationship implies three correlates: the sign S as such, an object O which it stands for and an interpretant I which it stands to (in some terminologies, S is called signifier or expression, and O signified or content). This triadic relation S-O-I associates the sign S with an object O by substitution («stands for») on the one hand and, on the other hand, associates the S-O link itself with the interpretant I («stands to»). The connection S-O can be considered as a reference relationship and the connection (S-O)-I as an interpretation relationship. For example, figure 1 as a whole is a sign S, whose object O is «customer’s orders for products» for some interpretant I, a reader of the diagram. Peirce’s essential contribution lies in acknowledging the genuine nature of this triad S-O-I: it cannot be split into two distinct pairs, S-O on one side and S-I on the other (see [14] for further discussion of this point). In this section we go no further than the S-O reference link in the context of figure 1 (the interpretation link is detailed in section 4). A sign S stands for an object O in three non-exclusive ways: as an icon, as an index and as a symbol.

2.1 Symbol Replicas Stemming from the UML Notation

The diagram uses a notation prescribed by the UML language, whose concepts make it possible to express, in figure 1, classes, attributes, operations, an association, a composition link, multiplicities and a constraint. We call SYMBOLS the basic notation signs as defined by UML. The term symbol is used here according to Peirce’s definition: «a symbol is constituted as a sign merely or mainly by the fact that it is used and understood as such, whether the habit is natural or conventional, and without reference to the motives which originally governed its selection» [10]. In fact, a given diagram shows replicas of the language symbols, whose meaning results from their use in this diagram within a context. For example, the solid diamond in figure 1 shows that the ORDER-ITEM relation is a case of the composition concept. One can also say that the designer has applied the composition concept, as defined in UML, to the structure of order items. However, orders and lines are not themselves UML concepts. Although they are represented as classes, in the diagram they stand for the expression «customer’s orders for products». Consequently, in the diagram, the symbol replicas stand for the informational objects perceived in the outside world, and the latter were in the designer’s mind before their symbolization.

2.2 Icons Reminiscent of the Represented Subjects

The actualization of the UML symbols implies a second mode of expression. It is made of natural language words such as «CUSTOMER» or «Name», about which the UML constructions merely allow one to state that they are classes, attributes, etc. In themselves, these words belong to another sign category: the ICON. Unlike the symbol, whose nature is that of a law or convention, the icon is based on a pure principle of quality and isomorphism. According to this principle, a set of characters (the signifier) written in a diagram stands for something (a signified) by analogy with shapes perceived by the reader. Thus in figure 1, the letters O-R-D-E-R (the name of a class) are placed there to call to mind the natural language word «order». One can also notice that a relevant choice of icons is essential for the readability and communication of diagrams, in particular for the application’s future users.

2.3. Indexes to Exhibit What Has Been Selected from the Subject

Finally, we can notice a third mode of expression in the way a schema carves something out of the expression of reference. This is the third sign category: the INDEX. We call index a sign whose function is to exhibit the existence of an object by means of a causal connection. It works as a trace of its subject. In the example, the designer has singled out the notion of ITEM, although it does not appear explicitly in the expression «customer’s orders for products». Meanwhile, the products are relegated to second place, as they are only represented as simple attributes of ITEM. The index, as a mode of reference from sign to object, explains why there are other design alternatives even though they also use UML symbols (see figure 2). The specific property of diagrams to show and illustrate something to the observer surely explains their great popularity amongst analysts, even without any properly formalized notation. Diagrams work as geometrical shapes.

[Figure 2: class diagram with two classes. ORDER (Ord. Date: date; Customer name: string; Customer address: string; Create(), Cancel(), Display()) and PRODUCT (Product name: string; Location: string; In(quantity: integer), Out(quantity: integer)) are linked by the association «requires», with multiplicity 1..* at each end and the link attribute Quantity: integer.]

Fig. 2. Another UML diagram for customer’s orders for products
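To make the alternative of Figure 2 concrete, here is a minimal Python sketch of that second design, in which the customer is folded into ORDER and PRODUCT becomes a class of its own. Representing the «requires» association and its Quantity link attribute as a dictionary, and the helper name `require`, are assumptions of this example only.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict


@dataclass(frozen=True)
class Product:
    """PRODUCT in Figure 2 (frozen, so that it can be used as a dictionary key)."""
    product_name: str
    location: str


@dataclass
class Order:
    """ORDER in Figure 2: the customer is reduced to two attributes."""
    ord_date: date
    customer_name: str
    customer_address: str
    # «requires» association: each linked product maps to the link attribute Quantity.
    requires: Dict[Product, int] = field(default_factory=dict)

    def require(self, product: Product, quantity: int) -> None:
        # Hypothetical helper: adds to the quantity already required, if any.
        self.requires[product] = self.requires.get(product, 0) + quantity
```

The same expression of reference, «customer’s orders for products», thus leads to a different set of classes: there is no ITEM and no CUSTOMER class here, whereas Figure 1 has no PRODUCT class.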

2.4. Symbols Replicas, Indexes and Icons Contribute Together to the Diagram

The symbolic nature of the formalized UML notation cannot be put to practical use without resorting to indexes and icons. We see in this property the raison d’être of diagrams. Diagrams cannot be understood as formal, in the sense of a pure symbolic structure, but as articulating three modes of signification together. One can consider that the invention of the neologism «semi-formal», a hazy term often applied to diagrams, reflects an intuition of this result. Modeling consists in creating symbol replicas that are simultaneously associated with indexes and icons. As an example, the ORDER rectangle is a replica of the class symbol. Still, its contents are arranged in two compartments to indicate (index) that they are either static attributes or operations, and they are named by means of natural language terms (icons). Finally, the ORDER class and its attributes (symbol) show the existence (index) of orders in the expression of reference by means of an icon (the name of the class). This quick description of the modes of expression used in diagrams has shown the existence of indexes and icons behind symbols, and the latter cannot stand without the former. As a result, a diagram cannot be a simple copy of a formal system, and it does not inherit ipso facto its symbolic nature. This conclusion raises the problem of the status of the user objects level on which the UML Four-Layer Metamodeling Architecture [15] is based. The user objects (or user data) level is described as an instance of the model (level 2), itself described as an instance of the metamodel (level 3). Level 4 defines a metamodel specification language. In our view, one cannot define the user objects layer as a simple «instance of a model» [15] since, while it contains the symbol replicas (the instances) defined in the model, it also uses other meaning processes (indexes and icons).
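The «instance of» chain discussed above can be loosely illustrated with Python’s own object model, where every object is an instance of a class and every class is an instance of `type`. This is only an analogy chosen for this sketch, not the UML architecture itself; the class `Order` and the function `instance_chain` are hypothetical names.

```python
class Order:
    """A model-level class, playing the role of the model layer."""

    def __init__(self, ord_date):
        self.ord_date = ord_date


def instance_chain(obj):
    """Follow the instance-of relation upwards until it closes on itself."""
    chain = [obj]
    while type(chain[-1]) is not chain[-1]:
        chain.append(type(chain[-1]))
    return chain


order = Order("1998-06-03")       # user object
levels = instance_chain(order)    # user object, then Order, then type
```

The chain records only the symbolic side of the relation. That the identifier `Order` reads as the English word «order», or that this class was singled out rather than another, is not captured by `type()`, which is the point made in the paragraph above.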
Finally, to qualify a diagram as a formal structure, one should establish that: i) the three modes of signification presented here intuitively are necessary and sufficient to express the whole diagram, ii) there exist rules that determine the combinations allowed for these three basic elements and iii) such a structure has typical properties. This is precisely Peirce’s plan in his Sign Logics project [13].

3. The Information Process: The Informal and the Formal

We have just shown that the design result, as it appears in a diagram, cannot be reduced to a mere instantiation of a model, or to a pure syntactic combination of symbols. We now question, on the one hand, the assumed informal nature of the domain objects and, on the other hand, the assumed formal nature of the modeling language. The latter can be considered as the resource of the modeling process, while the former are the inputs of the same process. We thus intend to criticize the current view according to which modeling consists in going from the informal to the formalized by means of a language (itself formal). What is at stake is a more precise definition of the «modeling process» concept.

3.1. The Modeling Process Applies to Some Already Formalized Being

What is called the users’ requirements, the «real world» or the «Universe of Discourse» is most often considered as naturally given and thus non-formalized. This idea was in fact borrowed from Biology and Physics, which have long conceived of scientific activity as the search for and discovery of immanent laws. One implicitly admits that the inputs of the modeling process have never been constructed, and a fortiori never formalized, before the beginning of this process. This hypothesis is obviously unacceptable in all the cases, numerous nowadays, where the domain has already been computerized before the beginning of a new computerization project. Moreover, even a purely manual information system already contains formalized constructions. The expression «customer’s orders for products» refers to paper documents («forms») organized according to a format with a header, lines, preprinted text, various areas to fill in manually, etc. The same expression can refer to a drawing displayed on a computer, which will most probably bear similarities to the paper document. The primary data of the modeling process are already formalized, and we have shown [16, 17] that, from a conceptual point of view, information, diagrams and models are identical in nature: they are signs. Informing is making new information by means of information, and the genesis of this process reveals a pure sequence of formalizations. Customer’s orders for products are nothing more than a kind of «Tope là» (the French «it’s a deal», sealed with a handshake) that could be heard on yesteryear’s cattle markets, except that it has been made more complex and has been socially developed by modern organizations. This ancient sign used to testify to the promise given by both parties to the exchange and to their agreement on its conditions. The development of exchanges has only made Information Systems more complex while formalizing them: the oral has turned into the written, and usage has become law and contracts.
It is thus only by reducing the informational process to the present moment that one can maintain the illusion that its inputs are informal in nature.

3.2. The Meaning of the Word «modeling» in the Expression «Modeling Language»

The use of a formal language such as UML does not guarantee that the diagrams it is used to create are symbolic structures (section 2). We now consider the formal nature of the language itself. The idea of appealing to formal languages presided over the birth of modern logic, with its project to invent an artificial language, an ideography (Frege), able to describe concepts rigorously. The hope was to avoid the ambiguities of natural languages when expressing the articulations that link formulae in a mathematical demonstration. One finds the same idea in the expression «lingua franca», which is claimed to be the motivation for the definition of the UML language [5]. The presentation of the language semantics systematically uses two levels, respectively called Description and Basic Semantics in version 1.0 [18]. We take here as an example the definitions given for the Model Element and Element concepts [15]:

Model Element
A model element is an element that is an abstraction drawn from the system being modeled. (Description)
In the metamodel a Model Element is a named entity in a Model. (Basic Semantics)

Element
An element is an atomic constituent of a model. (Description)
In the metamodel an Element is the top metaclass in the metaclass hierarchy. (Basic Semantics)

On the one hand, the language symbols define each other in a process of successive abstractions, the result of which is a stack of concepts: Model Element is a subclass of Element. On the other hand, the definition process is itself represented at the metalanguage level in its own terms (with Element, the basic abstract class of the metamodel, which stops the recursion of definitions). Without discussing here the principle that formally defines the inner constructions of the language (Basic Semantics), we examine their relation with what they refer to outside the language (Description).
As an example, the Model Element description leaves the designer wide latitude to decide the precise nature of the abstraction made from the system to be modeled (Classifier, Association, Attribute, etc.). This has already been shown by figure 2 in section 2. That these inner constructions of the language are not determined relative to the external notions they are used to represent is also patent in the description of the Element concept. The tautological nature of its description results from the architecture of metaclasses. In fact, the tree whose root is Element guarantees good properties for the language but does not allow one to say what an element of the outer system, which we intend to model, is. This is why it is important to understand that the word «modeling» in «Unified Modeling Language» must be understood in a different sense than in Model Theory [19]. In Model Theory, a theory T is made of a set of formulae built from a set of axioms. A model is defined by a domain D and an interpretation function I. If the pair {D, I} makes all the formulae of T true, then it is said to be a model of T. One can notice here a sort of inversion, in which the model is a special case of the theory while expressing nothing about the real world. «What the recursive definition of truth entitles us, is to calculate the truth value of a formula in a given model as long as we know which individuals in the model domain satisfy the propositional functions. It does not claim at all the outrageous power to decide on questions about reality» [20]. For the UML language to belong to this type of formal language, one would have to establish that the Descriptions are a model, a {D, I} pair that makes the Basic Semantics true. This is probably impossible, since the «modeling» problem of information systems concerns objects of the outside world and its result cannot be generated on the basis of a set of axioms. It is rather an activity that belongs to the experimental sciences, even though it uses the resources of logic. The modeling language is thus, in its own way, «semi-formal»: the symbols it formally defines must be instantiated with the help of indexes and icons that refer to world objects. The function of the diagram is to establish a connection between the formalized constructions of the language and the domain to be represented. Therefore diagrams make Descriptions meaningful.
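The model-theoretic notion of «model» used above can be illustrated by a small Python sketch. It assumes a toy theory with a single binary relation symbol R and two axioms, irreflexivity and transitivity; the theory, the domains and all function names are invented for this example and do not come from the UML documents.

```python
def irreflexive(D, I):
    """Axiom 1: no individual of the domain is related to itself."""
    return all((x, x) not in I["R"] for x in D)


def transitive(D, I):
    """Axiom 2: R(x, y) and R(y, z) imply R(x, z)."""
    R = I["R"]
    return all((x, z) in R for (x, y) in R for (w, z) in R if y == w)


T = [irreflexive, transitive]  # the toy theory, given by its axioms


def is_model(D, I, theory):
    """The pair {D, I} is a model of the theory if it makes every axiom true."""
    return all(axiom(D, I) for axiom in theory)


# Two unrelated domains satisfy the same theory ...
numbers = ({1, 2, 3}, {"R": {(1, 2), (2, 3), (1, 3)}})
order_lines = ({"line 1", "line 2"}, {"R": {("line 1", "line 2")}})
# ... while this interpretation does not: transitivity would require (1, 1) in R.
not_a_model = ({1, 2}, {"R": {(1, 2), (2, 1)}})
```

Checking `is_model` only computes truth values inside a given pair {D, I}. Nothing in the check says whether the individuals of D are numbers, order lines or anything in the outside world, which is the inversion pointed out in the text.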

4. Reasoning with Diagrams: An Interpretative Dialog

To make this presentation easier, we have shown in section 2 how the reference relation between a sign S and an object O works in a diagram. However, this relation cannot be separated from the interpretation relation that we have noted (S-O)-I. We now show the three ways in which the latter relation works, basing our argument on examples from figure 1. We then develop the idea that the main function of a diagram is to set up one interpretation of the outside world amongst others. Finally, we examine the reasoning chains used to construct a diagram, in order to show that the triad (S-O)-I makes it possible to consider them as a dialog between a model and its interpretations.

4.1. The Three Modes of Interpretation

As an example, let us take the reference relation S-O between some elements (S) of figure 1 and their equivalent in the expression (O) «customer’s orders for products». This relation can be understood in three ways:
1) The ORDER rectangle means that the orders of the outside world are homogeneous and can be gathered under a single entity. Peirce names this first mode the immediate interpretant, that is, the effect as such of the sign written in the shape of the ORDER rectangle.
2) The fact that the product name is placed as an attribute of ITEM in the diagram can bring to mind the idea that a more complete description of the products would be more relevant. By similarity with the orders, which are represented by means of a class, the idea of modifying the diagram arises. Peirce names this second mode the dynamic interpretant. It is the «real» effect of the sign.
3) The diagram itself, as a whole, can be seen as a general definition of what orders are in the outside world: a group of products, each in a certain quantity, for a customer. Peirce names this third mode the logical (or final) interpretant, a habit or a guideline generated by the sign.
This trichotomic distinction of the interpretant applies to all signs. Let us verify this in the case of a symbol replica from the UML language, the solid diamond that represents the composition concept.
1) The life cycle of an order line coincides with the life cycle of the order (immediate interpretant).
2) Is a line truly a physical component of the order? Can it be split over several orders (for example a fax, a printed document and an electronic document)? In such a situation, the solid diamond would have to be replaced by a hollow diamond (aggregation), the multiplicity would have to be modified and, possibly, a super-class for orders would have to be added. Thus arises the idea of questioning the outside world in order to check which solution is best (dynamic interpretant).
3) According to the definition of the UML constructions, every solid diamond marks a composition link and every hollow diamond marks an aggregation. The diagram always obeys this rule and cannot depart from it: a gray diamond makes no sense and is forbidden (logical interpretant).

4.2. A UML Diagram Sets Up an Interpretation over the Domain

The interpretation relationship accounts for the fact that there can be no one-to-one correspondence between the symbols of the UML language and a given diagram. This was already shown by the difference between the diagram in figure 1 and the one in figure 2. As a result, a given diagram necessarily sets up one interpretation of the outside world amongst all the possible ones. As a consequence, the same diagram suggests a specific software architecture. This is shown in figure 3, which applies the idea to the case of a state chart. This modeling example of a chess game is borrowed from [21].

[Figure 3: state diagram. States: Start, White’s turn, Black’s turn, Black wins, Draw, White wins. Event labels: white move, black move, checkmate, stalemate.]

Fig. 3. A state diagram for chess game
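A possible reading of the state diagram of Figure 3 is sketched below as a Python transition table. The exact arcs are an assumption reconstructed from the state and event names of the figure (in particular, that a checkmate occurring on White’s turn leads to «Black wins»), and the Start pseudo-state is taken to lead directly to White’s turn.

```python
# (state, event) -> next state. Turns are states; moves are instantaneous events.
TRANSITIONS = {
    ("White's turn", "white move"): "Black's turn",
    ("Black's turn", "black move"): "White's turn",
    ("White's turn", "checkmate"): "Black wins",
    ("White's turn", "stalemate"): "Draw",
    ("Black's turn", "checkmate"): "White wins",
    ("Black's turn", "stalemate"): "Draw",
}


def run(events, state="White's turn"):
    """Replay a sequence of events and return the state that is reached."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} is not allowed in state {state!r}")
        state = TRANSITIONS[key]
    return state
```

For instance, `run(["white move", "black move", "white move", "checkmate"])` ends in the state «White wins». An event such as a piece being dropped in mid-move has no place in this table, because a move has no duration here.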

The diagram implicitly sets a time scale chosen by the designer. It is thus agreed (logical interpretant) that the atomic time unit in this system, the instant, is based upon a move. Between two instantaneous events, that is, two moves, there is a duration represented as a state (a turn). Since a move is considered as instantaneous, the diagram cannot deal with the case where a player inadvertently drops a piece during his move. For this purpose, one would have to consider the move as a state bounded by two events, the picking up and the final release of a piece. Moreover, if we consider the game from the point of view of a machine that implements a simple interface for two human players, the turns could be seen as events and the moves as states during which the trajectories would have to be simulated graphically. The interpretation of time would therefore be inverted!

4.3. The Modeling Process as Logical Reasoning about Interpretations

We have shown with examples how the signifier S can stand for the signified O only in relation to an interpretant I. Consequently, the usual opposition between a supposedly formal signifier and a supposedly informal signified is misleading. On the one hand, this approach reduces signification to reference by removing the problem of interpretation. It assumes at least one one-to-one interpretation, an equivalence relationship between the things in the world and the language symbols. If that were the case, we would face a paradox. Indeed, if we assume that a system of symbols can directly express the domain objects, we must conclude that these objects are given from the outset in the same terms as the language concepts. It then becomes hard to justify the raison d’être of System Design: a sufficiently powerful language would make it possible to skip this stage. This is, in our view, the founding hypothesis of formal specification methods. On the other hand, the signifier/signified opposition, transposed into the formal/informal opposition, leaves no room for reasoning in the modeling process. Yet we have shown that, even in trivial cases, this reasoning is necessarily complex. Taking the interpretant into account makes it possible, on the contrary, to study design reasoning as a dialogical process between an author and a diagram, and between a diagram and a reader. Author and reader must be understood as potentially the same person, and above all as an abstraction that we can also call interpretant. In this way, modeling reasoning can be analyzed as an interactive exchange between the signs and their interpretants, which are signs themselves. Thus, we indicate a research direction in which the formalization of models would not be examined in terms of the syntactic conformity of a result to a formal language, but in terms of the dynamic and interactive construction of diagrams by means of logical reasoning chains.

5. Conclusions

Going from users’ requirements to a symbolic structure is not a pure function, because it modifies the state of the informational world. The initial needs are not informal, since they result from the shaping of objects and information procedures which took place prior to the modeling process. The target structure, the diagram, which must be shaped by the same process, cannot be purely symbolic since it contains indexes and icons. Between the two, the modeling language cannot be strictly formal either, insofar as it offers a description of the objects of the world. The advance allowed by UML, as well as its future developments, lies as much in its capacity to supply a notation shared by an entire community of analysts, designers and users as in its strictly formalized nature. The «semi-formal» qualifier, approximate and self-contradictory, illustrates this situation. In this paper we have proposed to go beyond the formal/informal duality by bringing to the fore the nature of the objects manipulated in diagrams and by recommending a focus on the reasoning chains that run during the modeling process. The approach that we propose is based on the observation that modeling consists in producing new information from information. Its originality and novelty result from the conviction, which we have reached over the past few years, that Information Theory necessarily relies upon a more general Sign Theory. This statement may be surprising, since few previous works, in Knowledge Representation as well as in Software Engineering, have envisaged the problem in that way. Some attempts from a semiotic point of view [22, 23, 24] are worth noting, but they are often either domain specific or restricted to the graphic properties of diagrammatic tools: Conceptual Graphs [25, 26], Visual Programming, Learning, Human Computer Interfaces, etc. The main question that one can ask of a semiotic approach is: given a precise, standardized and widely acknowledged modeling language such as UML, how can the notation become operational and efficient in a practical project, regardless of the designers’ personal and varied skills? We have thus been able to identify the main characteristics of the modeling process: reasoning chains that implement three different types of signs (icons, indexes and symbols) in three types of interpretation (immediate, dynamic and logical).
By contrast, a classical formal approach can supply no other solution than advising the analysts to follow the rules of the UML language strictly. Bringing in the Interpretant concept allows us, on the contrary, to hope for a specification of the reasoning chains of modeling. This is the project we wish to develop with a view to improving the services of current CASE tools: going from computerized diagram management to intelligent design assistance.

6.

Acknowledgements

We wish to thank the anonymous reviewers whose comments helped to improve a previous version of this article. This work was supported by the GIS Sciences de la cognition (CNRS, France) within the PIC project (Processus d’Interaction en Conception Distribuée).

References

1. De Marco, T.: Structured Analysis and System Specification. Yourdon Press (1978)
2. Chen, P.P.S.: The Entity-Relationship model. Toward a unified view of data. ACM Transactions on Database Systems 1, 1 (March 1976)
3. Harel, D.: Statecharts: a visual formalism for complex systems. Science of Computer Programming 8 (1987), 231-274
4. Monarchi, D.E., Puhr G.I.: A research typology for object-oriented analysis and design. Communications of the ACM, Vol.35, n°9
5. UML Summary, version 1.1, (1/09/1997). http://www.rational.com/uml
6. Boehm, B.W.: Software engineering. IEEE Trans. Comp. C-25, (1995)
7. Boehm, B.W.: A spiral model of software development and enhancement. Reprinted in System and Software Requirements Engineering. IEEE Computer Society Press (1990)
8. Henderson-Sellers, B., Edwards, J.M.: The object-oriented systems life-cycle, Communications of the ACM, Vol.33, n°9
9. Jarke, M., Bubenko, J., Rolland, C., Sutcliffe, A., Vassiliou, Y.: Theories underlying requirements engineering. An overview of NATURE at genesis. ESPRIT Project 6353. Report AC-92-1 (1992)
10. Peirce, C.S.: Collected Papers, Harvard University Press (1931-1935, 1958)
11. Houser, N., Kloesel, Ch. (eds.): The Essential Peirce, Selected Philosophical Writings, Vol.1 (1867-1893). Indiana University Press (1992)
12. Peirce Edition Project (ed.): The Essential Peirce, Selected Philosophical Writings, Vol.2 (1893-1913), Indiana University Press (1998)
13. Houser, N., Roberts, D.D., Evra, J.V. (eds.): Studies in the Logics of C.S. Peirce. Indiana University Press (1997)
14. Morand, B.: Les sens de la signification. Pour une théorie a priori du signe. Revue Intellectica, Vol.2, n°25 (1997). http://www.iutc3.unicaen.fr/~moranb
15. UML Semantics, version 1.1, (1/09/1997). http://www.rational.com/uml
16. Morand, B.: Statut épistémologique des modèles dans la conception des systèmes d’information, Revue Ingénierie des Systèmes d’Information, Hermès, Vol. 3, n°5 (1995), 665-700
17. Morand, B.: From Data, Process and Behaviour Perspectives to Representation as a Semiotic System for IS Modeling. CESA'96, IMACS Multiconference, Lille July 9-12 (1996)
18. UML Semantics, version 1.0 (13/01/1997). http://www.rational.com/uml
19. Tarski, A.: Introduction to Logic and the Methodology of Deductive Science, Oxford University Press (1946)
20. Gochet, P., Gribomont, P.: Logique. Méthodes pour l’informatique fondamentale, Vol.1. Hermès (1990)
21. Rumbaugh, J., Blaha, M., Premerlani, W., Eddy, F., Lorensen, W.: Object-Oriented Modeling and Design. Prentice-Hall (1991)
22. Allwein, G., Barwise J. (eds.): Logical Reasoning with Diagrams. Oxford University Press (1996)
23. Glasgow, J., Narayanan, N., Chandrasekaran, B. (eds.): Diagrammatic Reasoning: Cognitive and Computational Perspectives. AAAI Press (1995)
24. Thinking with machines Workshop. http://www.mrc-cbu.cam.ac.uk/projects/twd/workshop.html
25. Sowa, J.F.: Conceptual Structures. Information Processing in mind and machine, Addison-Wesley (1984)
26. Keeler, M.: The Philosophical Context of Peirce's Existential Graphs. http://accord.iupui.edu/accord/context.txt

Best of Both Worlds
A Mapping from EXPRESS-G to UML

Florian Arnold and Gerd Podehl

Research Group for Computer Application in Engineering Design
Department of Mechanical and Chemical Engineering
University of Kaiserslautern
Erwin-Schroedinger-Str., D-67653 Kaiserslautern, Germany
http://rkk.mv.uni-kl.de
{arnold, podehl}@mv.uni-kl.de

Abstract. On the one hand, in the world of Product Data Technology (PDT), the ISO standard STEP (Standard for the Exchange of Product Model Data) gains more and more importance. STEP includes the information model specification language EXPRESS and its graphical notation EXPRESS-G. On the other hand, in the Software Engineering world in general, mainly other modelling languages are in use - particularly the Unified Modeling Language (UML), recently adopted to become a standard by the Object Management Group, will probably achieve broad acceptance. Despite a strong interconnection of PDT with the Software Engineering area, there is a lack of bridging elements concerning the modelling language level. This paper introduces a mapping between EXPRESS-G and UML in order to define a linking bridge and bring the best of both worlds together. Hereby the feasibility of a mapping is shown with representative examples; several problematic cases are discussed as well as possible solutions presented.

1 Introduction

Within the world of Product Data Technology (PDT) and Computer Aided technologies (CAx), the need to overcome the proprietary data formats of system suppliers led to the development of the ISO standard STEP (Standard for the Exchange of Product Model Data). STEP was defined specifically to deal with the information consumed or generated during the product lifecycle from design to manufacturing. STEP includes the information model specification language EXPRESS and its graphical notation EXPRESS-G.

Although the main focus of STEP and PDT is not software, there are some points of contact between the fields of PDT and Software Engineering (SE). For example, there is a need for software to convert data between the various proprietary and system specific file formats and EXPRESS for data exchange purposes. Besides this, software tools for the modelling, specification and manipulation of product data are necessary. In the case of graphical modelling, mainly EXPRESS-G is used in this field. In the SE world in general, mainly other modelling languages are in use though. Particularly the Unified Modeling Language (UML), recently adopted as a standard by the Object Management Group, will probably achieve broad acceptance. So, in spite of the strong interconnection between PDT and SE, there is a lack of bridging elements concerning the modelling language level. This paper introduces a mapping between EXPRESS-G and UML in order to define a linking bridge and bring the best of both worlds together.

In the following, some background information on EXPRESS-G is given first. Then, after a short visit to UML, some fundamentals for the mapping are explained. After this, the mapping with examples is presented, whereby several problematic cases are discussed and possible solutions presented. At the end, some conclusions are drawn and an outlook is given.

J. Bézivin and P.-A. Muller (Eds.): «UML»’98, LNCS 1618, pp. 49-63, 1999. © Springer-Verlag Berlin Heidelberg 1999

2 STEP, EXPRESS and EXPRESS-G

STEP (Standard for the Exchange of Product Model Data) is an international standard (ISO-10303, Industrial automation systems and integration - Product data representation and exchange) for the computer-interpretable representation and the exchange of product model data. Part 11 of STEP is the EXPRESS language reference manual. The formal description language EXPRESS is not a programming language but a specification language for the consistent and logical description of the information models of STEP [4]. EXPRESS contains object-oriented and procedural concepts as well as data base concepts. It enables the complete and non-ambiguous description of a mainly static product model. EXPRESS specifies an information domain in terms of entities, i.e. classes of objects sharing common properties which are represented by associated attributes and constraints. In EXPRESS, constraints are written using a mixture of declarative and procedural language elements. EXPRESS-G is a formal graphical notation of EXPRESS which can, however, only reach part of the expressiveness of EXPRESS. The static components like entities, attributes, type declarations, and hierarchies of inheritance can be represented by EXPRESS-G. But there is a lack of possibilities to visualise functional components, local or global rules, as well as algorithms. Despite this, EXPRESS-G instead of EXPRESS has been chosen for the mapping to enable the evaluation of two graphical representations with examples. Thus, it should be possible to compare EXPRESS-G and UML in the categories of readability, clearness, comprehensibility and complexity. Note that this paper is not intended to explain EXPRESS-G (or even EXPRESS) fundamentals at all. A good introduction to EXPRESS and EXPRESS-G can be found in [7].

3 Unified Modeling Language (UML)

The Unified Modeling Language (UML), defined by the three "amigos" Grady Booch, Jim Rumbaugh and Ivar Jacobson from Rational Software Corporation as a metamodel (and an incidental visual modelling language) for the specification, visualisation, construction, and documentation of the artifacts of software systems [1], is especially suited for the modelling of complex, distributed and concurrent systems [3], [5]. In November 1997, the UML in its current version 1.1 was adopted into the Object Management Architecture of the Object Management Group and was thereby accepted as a de-facto official industry standard for the metamodel in the area of object-oriented software design.

4 General Remarks on the Mapping

The mapping from EXPRESS-G to UML shows the association of model structures with adequate examples. In doing so, all elements of EXPRESS-G are covered and critical cases are highlighted.

4.1 Entities, Schemas and Entity Level Relations

Because EXPRESS-G does not model dynamics, i.e. it only allows modelling the data of a static product model, only static structural diagrams (i.e. class diagrams) are needed on the UML side. Some basic modelling elements correspond in a quite obvious way: EXPRESS-G entities are mapped onto classes and schemas onto packages (left half of Fig. 1) because they share the same semantics.

[Figure: EXPRESS-G symbols (Entity, Schema, attribute relations) shown beside their UML counterparts (Class, Package, navigable associations with multiplicities 1 and 0..1)]

Fig. 1. Mapping of some basic elements
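The element-level correspondence described above (entities onto classes, schemas onto packages) can be illustrated with a minimal Python sketch. All type and function names here (`ExpressEntity`, `ExpressSchema`, `UmlClass`, `UmlPackage`, `map_schema`) are assumptions introduced for illustration only; they belong neither to EXPRESS-G, nor to UML, nor to any existing tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ExpressEntity:
    """Hypothetical stand-in for an EXPRESS-G entity."""
    name: str


@dataclass
class ExpressSchema:
    """Hypothetical stand-in for an EXPRESS-G schema."""
    name: str
    entities: List[ExpressEntity] = field(default_factory=list)


@dataclass
class UmlClass:
    name: str


@dataclass
class UmlPackage:
    name: str
    classes: List[UmlClass] = field(default_factory=list)


def map_schema(schema: ExpressSchema) -> UmlPackage:
    """Map a schema onto a package and each contained entity onto a class."""
    return UmlPackage(
        name=schema.name,
        classes=[UmlClass(entity.name) for entity in schema.entities],
    )


# Illustrative input only; the names are placeholders.
package = map_schema(ExpressSchema("Schema", [ExpressEntity("Entity")]))
```

The sketch deliberately keeps names unchanged and ignores attributes and relations.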

EXPRESS-G relations (right half of Fig. 1) are bi-directional but nevertheless one direction is emphasised through an open circle.

The EXPRESS-G supertype-subtype relation symbol is a thick line with a circle on the subtype end of the relationship and is mapped to the generalisation arrow in UML. Regarding EXPRESS-G attribute relations, it has to be kept in mind that there is a big difference between the semantics of relational symbols on entity level and on schema level. On entity level, a dashed line with a circle at one end denotes an optional attribute and all other attribute relationships are stated as solid lines with a circle at one end. Thereby, the circle is always on the attribute type end. The explicit attribute relationship symbols of EXPRESS-G are mapped to unidirectional (navigable) associations in UML. In the case of optional attributes, a multiplicity of 0..1 is added at the arrow end (compare Fig. 1) of the navigable UML association. Derived, inverse, and redeclared attributes will be examined in the following chapters. The totally different meaning of relationship symbols on schema level will be looked at later on, too.

4.2 Some Primitive Types

The EXPRESS-G predefined simple types Binary, Boolean, Integer, Logical (TRUE, FALSE or UNKNOWN), Real, Number (Integer or Real) and String can directly be seen as primitive types in UML. EXPRESS-G user defined types (like Date, Fig. 5, and Strings, Fig. 7) are also mapped to classes. Another possibility would have been to declare them as new user defined primitive types in UML. The way of defining new user defined primitive types in UML has been chosen for the enumeration type of EXPRESS-G. So the respective enumeration definitions are not noted in UML (as they are not in EXPRESS-G).

Regarding the EXPRESS-G collection types Set, List, Array and Bag, there are two cases to be distinguished. First, a certain collection type is simply assigned to a user defined EXPRESS-G type. In this case, an aggregation association is used in UML (see class Date, Fig. 5). Second, they are used to state the type of an attribute (see attribute values of class FromEnt, Fig. 7, and attribute elements of class Mesh, Fig. 13). In all four cases of EXPRESS-G collection types, the actual kind of collection type can easily be denoted: an unordered set is the default case when denoting a multiplicity greater than one on an association end in UML. A list may be indicated in UML by adding the constraint {ordered}, an array by adding {array}, and a bag by adding {bag}.

4.3 Complex Constraints

In some cases, complex constraints which cannot be directly represented in EXPRESS-G and, therefore, have to be noted outside the EXPRESS-G diagrams (i.e. in EXPRESS) have been included in the UML representation.

4.4 Representational Aspects

While assigning names to entities, schemas, classes etc., the different naming conventions of EXPRESS-G and UML have been kept to point out the differences and to illustrate the transferability. The page referencing symbols of EXPRESS-G have deliberately not been mapped because they have no semantic content and are only used for structuring and formatting documents in the case of large models.

5 A Mapping with Examples

The following EXPRESS-G examples are essentially based on the examples from [7] and mainly have the purpose to clarify certain aspects of the usage of this language rather than being examples for good information modelling in general.

5.1 Generalisation Issues

Fig. 2 and Fig. 3 show that inheritance trees look very much the same in both notations, with EXPRESS-G being somewhat more explicit by using the (ABS) prefix for abstract classes (Fig. 2). The supertype-subtype inheritance hierarchy symbol is labelled with the digit 1, highlighting that it is a OneOf relation.

[Figure: EXPRESS-G inheritance tree with the entities Root, Leaf1, (ABS) Sub1, Leaf2 and Leaf3; the supertype-subtype symbol is labelled 1]

Fig. 2. Inheritance in EXPRESS-G

In UML (Fig. 3), this is realised through a generalisation with a {disjoint} constraint. Disjoint applies to a set of generalisations, specifying that instances may have no more than one of the given subtypes as a type of the instance [1].

[Figure: UML class diagram with the classes Root, Sub1, Leaf1, Leaf2 and Leaf3; the generalisation carries the constraint { disjoint }]

Fig. 3. Inheritance in UML
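The meaning of the {disjoint} constraint quoted above (an instance may have no more than one of the given subtypes as a type) can be stated as a small check. The following Python sketch is only an illustration; the function name `satisfies_disjoint` and the subtype set used in the example are hypothetical and do not claim to reproduce the exact tree of Fig. 3.

```python
from typing import Iterable, Set


def satisfies_disjoint(instance_types: Iterable[str], subtypes: Set[str]) -> bool:
    """Return True if the instance has at most one of the given subtypes as a type."""
    return len(set(instance_types) & subtypes) <= 1


# Hypothetical set of subtypes grouped under one {disjoint} generalisation.
disjoint_subtypes = {"Leaf2", "Leaf3"}

allowed = satisfies_disjoint({"Root", "Leaf2"}, disjoint_subtypes)
forbidden = satisfies_disjoint({"Root", "Leaf2", "Leaf3"}, disjoint_subtypes)
```

Under these assumptions `allowed` is true and `forbidden` is false, which is the OneOf behaviour that the digit 1 expresses in EXPRESS-G.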

There are three different kinds of inheritance in EXPRESS and EXPRESS-G, respectively [7]:

• Normal inheritance, i.e. inheritance of attributes and constraints. In EXPRESS, the existence of attributes is inherited by subtypes from their supertypes. Nevertheless, it is possible to redeclare an attribute declared in a supertype in its subtypes. This topic will be examined in 5.3. Subtypes also inherit all the constraints applied to their supertypes.
• Multiple inheritance: when a subtype has more than one supertype, it inherits all supertype attributes. When a subtype inherits attributes from disjoint supertypes it is possible that the supertypes have attributes that have the same name. This kind of naming ambiguity is resolved by prefixing the name of the attribute with the name of the supertype entity.
• Repeated inheritance means that a subtype may inherit the same attribute from different supertypes that in turn have inherited it from a common ancestor. In this case, the subtype only inherits the attribute once.

The usual (in the UML area) categories of implementation inheritance and interface inheritance are not really applicable to EXPRESS / EXPRESS-G because there are no operations at all to be inherited.

5.2 Attributes and Constraints

Fig. 4 shows the visualisation of a concept Person in EXPRESS-G notation. Here, a person has several characteristics, like a first name and a last name, an optional nickname, a special type of hair, a date of birth, and implicitly a certain age. Age has been prefixed with (DER), for derived, to denote that it is a derived attribute. The enumeration HairType : {bald, dyed, natural, wig} has to be noted outside the diagram. In the example, a person is either female or male. If female, the person optionally has a maiden name. (This relation surely depends on the country's laws and can be regarded as sexist.) A person may have children and up to two (living) parents who naturally are persons, too. The attribute parents is defined as being inverse to children by a preceding (INV). In EXPRESS-G, an inverse attribute denotes a bi-directional relationship between two entities: an inverse attribute of an entity A references an entity B that itself references entity A [6].

[Figure: entity Person with the attributes Children S[0:?], (INV) Parents S[0:2], Hair (HairType), BirthDate (Date, A[1:3] of INTEGER), FirstName, LastName and NickName (STRING) and (DER) Age (INTEGER); subtypes Male and Female under a symbol labelled 1; Female has MaidenName (STRING); entity Married has the attributes *Husband and *Wife]

Fig. 4. Concept Person in EXPRESS-G

A man and a woman may be married, whereby in the chosen example polygamy as well as (for equality reasons) polyandry are forbidden through uniqueness constraints, i.e. the values of husband and wife must be unique across all instances of entity Married. In EXPRESS-G, only the mere existence of these constraints can be displayed, by prefixing Husband and Wife with an asterisk, while the constraints (no_polyandry and no_polygamy) themselves have to be noted and defined outside the diagram, i.e. in EXPRESS. When translating this concept to UML (Fig. 5), there are several points to be taken into account. The relationship between Person and Male and Female, respectively, is a OneOf relationship, i.e. a person can be either female or male but not both at the same time. A Male and a Female may be married. Here the constraints described above can be denoted straightforwardly in UML. Person is modelled in UML (Fig. 5) as a class with several attributes. The attribute nickName is marked as optional by the use of [0..1], as can be seen in the attribute compartment of class Person. /age denotes that it is a derived attribute. The children and parents attributes of EXPRESS-G are modelled in UML as one association with the role names children and parents and appropriate multiplicities.

[Figure: class Person with the attributes hair : HairType, firstName : String, lastName : String, nickName[0..1] : String and /age : Integer; a self-association with the roles parents (0..2) and children (0..*); an association birthDate to class Date, which aggregates 3 { array } Integer; subclasses Female (maidenName[0..1] : String) and Male under a { disjoint } generalisation; class Married with the roles husband { no_polyandry } and wife { no_polygamy }]

Fig. 5. Concept Person in UML
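To make the UML reading of the Person concept more concrete, the following Python sketch mirrors the optional attribute, the derived attribute, the single parents/children association and the uniqueness constraints on Married. It is an illustration under stated assumptions, not the authors' implementation: Python's `datetime.date` replaces the Date class of Fig. 5, the Male and Female subclasses are omitted, and the helper names `add_child` and `uniqueness_holds` are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Tuple


@dataclass(eq=False, repr=False)
class Person:
    first_name: str
    last_name: str
    birth_date: date                                          # assumption: stands in for class Date
    nick_name: Optional[str] = None                           # nickName[0..1]
    parents: List["Person"] = field(default_factory=list)     # role parents, 0..2
    children: List["Person"] = field(default_factory=list)    # role children, 0..*

    def age(self, today: date) -> int:
        """Derived attribute (/age): computed from birth_date, never stored."""
        had_birthday = (today.month, today.day) >= (self.birth_date.month, self.birth_date.day)
        return today.year - self.birth_date.year - (0 if had_birthday else 1)

    def add_child(self, child: "Person") -> None:
        """Maintain both ends of the one parents/children association."""
        if len(child.parents) >= 2:
            raise ValueError("a person has at most two parents")
        self.children.append(child)
        child.parents.append(self)


def uniqueness_holds(marriages: List[Tuple[Person, Person]]) -> bool:
    """Each husband and each wife may occur in at most one (husband, wife) pair."""
    husbands = [id(h) for h, _ in marriages]
    wives = [id(w) for _, w in marriages]
    return len(set(husbands)) == len(husbands) and len(set(wives)) == len(wives)


# Illustrative data only.
anna = Person("Anna", "Muster", date(1960, 5, 17))
otto = Person("Otto", "Muster", date(1958, 1, 2))
ben = Person("Ben", "Muster", date(1990, 6, 4), nick_name="Benny")
anna.add_child(ben)
otto.add_child(ben)
```

Keeping `age` as a method with an explicit `today` argument reflects the idea that a derived attribute is computed on demand rather than stored.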

An additional example, mainly for the use of EXPRESS-G constraint concepts, is given in Fig. 6. Pick is of type select. An EXPRESS-G select type defines a named collection of other types (entity types or defined types). A value of a select type is a value of one of these specified types. The EXPRESS-G symbol for a select type is a dashed rectangle with two vertical lines at the left end; it can be modelled in UML by an aggregation association with an {or} constraint (Fig. 7). The {or} constraint indicates a situation in which only one of the several potential associations may be instantiated at one time for a single object [1]. For mapping the defined class Name, the {alias} constraint denotes the simple renaming of the class STRING into Name. The EXPRESS-G user defined type Strings is mapped to an aggregation association with an added {ordered} constraint to state that it corresponds to a List collection type in EXPRESS-G.

[Figure: EXPRESS-G diagram with the entities Root, Sub1, Sub2, AnEnt, FromEnt and ToEnt, the select type Pick, the defined types Name and Strings (L[1:?] of STRING), and the attributes choose, graph (BINARY), attr, description, text, *val (REAL) and values A[1:3] (REAL)]

Fig. 6. Defined classes, select classes, and constraints in EXPRESS-G
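The aggregate labels used in the EXPRESS-G figures (for example L[1:?] and A[1:3]) follow the collection-type conventions of Section 4.2: a set needs no extra constraint, a list becomes {ordered}, an array {array} and a bag {bag}. The following Python sketch shows one way such a label could be translated; the function name `map_aggregate` is hypothetical, the prefix B for Bag is assumed by analogy with S, L and A, and the treatment of array bounds as index bounds (so that A[1:3] yields a size of 3) is inferred from the UML figures of this paper.

```python
import re
from typing import Tuple

# Constraint keyword per collection kind; a set is the unconstrained default.
_CONSTRAINTS = {"S": None, "L": "ordered", "A": "array", "B": "bag"}


def map_aggregate(spec: str) -> Tuple[str, str]:
    """Translate a label such as 'L[1:?]' into (UML multiplicity, constraint)."""
    match = re.fullmatch(r"([SLAB])\[(\d+):(\d+|\?)\]", spec.replace(" ", ""))
    if match is None:
        raise ValueError(f"unrecognised aggregate label: {spec!r}")
    kind, low, high = match.groups()
    if kind == "A":
        if high == "?":
            raise ValueError("an array needs a fixed upper index")
        # Assumption: array bounds are index bounds, so the size is fixed.
        multiplicity = str(int(high) - int(low) + 1)
    else:
        multiplicity = f"{low}..{'*' if high == '?' else high}"
    keyword = _CONSTRAINTS[kind]
    return multiplicity, "" if keyword is None else "{" + keyword + "}"


list_case = map_aggregate("L[1:?]")
array_case = map_aggregate("A[1:3]")
set_case = map_aggregate("S[0:2]")
```

With these assumptions, a non-empty list maps to the multiplicity 1..* with {ordered}, matching the way Strings is described above.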

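[Figure: UML class diagram with the classes Root, Sub1, Sub2, Pick, AnEnt, FromEnt (values[3] : Real { array }), ToEnt (text, val : Real { positive : val >= 0.0 }), Name ({ alias } of String) and Strings (String 1..* { ordered }); the attributes choose and graph : Binary; the associations attr (1 and 0..1) and description (1); the alternatives of Pick are joined by an { or } constraint]

Fig. 7. UML notation for Fig. 6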
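The {or} constraint used for select types can be read as a simple invariant over the potential associations of one object. The Python sketch below is a hedged illustration: `satisfies_or` is a hypothetical helper, the alternative names in the example are placeholders, and the check requires exactly one instantiated link on the assumption that a select-typed value is always present.

```python
from typing import Any, Dict


def satisfies_or(links: Dict[str, Any]) -> bool:
    """{or}: exactly one of the potential associations is instantiated."""
    instantiated = [name for name, target in links.items() if target is not None]
    return len(instantiated) == 1


# Placeholder alternatives of a select type; None means "not instantiated".
valid = satisfies_or({"first_alternative": "some value", "second_alternative": None})
both_set = satisfies_or({"first_alternative": "some value", "second_alternative": 42})
none_set = satisfies_or({"first_alternative": None, "second_alternative": None})
```

If an optional select-typed attribute had to be modelled, the comparison would presumably be relaxed to "at most one".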


The val attribute is constrained as well (Fig. 7). The asterisk preceding an attribute denotes that there is a description of the rule in the accompanying documentation (i.e. noted in EXPRESS: WHERE positive : val >= 0.0). For this example, a positive value constraint was included into the UML description.

5.3 Attribute Redeclaration

Attribute redeclaration is a concept which is specific to EXPRESS and EXPRESS-G, respectively, and has no direct equivalent within UML. In EXPRESS-G, a subtype may redeclare its inherited attributes by preceding their name by (RT). In Fig. 8, Leaf inherits the optional attribute Attr from Middle and redeclares it in two ways: first, to be of type Sub, which must be a specialisation of the original type Super of Attr, and second, to be mandatory instead of being an optional valued attribute as stated in Middle. In the second redeclaration the attribute Num is redeclared to be of type INTEGER instead of type NUMBER (which could be REAL or INTEGER).

[Figure: EXPRESS-G diagram with the entities Root, Numero, Middle, Leaf, Super and Sub; attributes NumAttr, No (NUMBER), Attr, Attr1, (RT) Attr, Num (NUMBER) and (RT) Num (INTEGER)]

Fig. 8. Attribute redeclaration in EXPRESS-G

UML does not generally support the possibility of inheriting the structure of a class and then redeclare it. But fortunately, since the redeclaration of attributes in EXPRESS-G requires that the subtype attribute has a value domain that is a subset of the supertype attribute's value domain, it is possible to support redeclaration in UML by adding appropriate constraints to the subtype (Fig. 9).

[Figure: UML class diagram with the classes Root, Middle, Numero (no : Number), Super (num : Number), Leaf { attr[1] : Sub } and Sub { num : Integer }; the associations numAttr, attr1 and attr carry the multiplicities 1 and 0..1]

Fig. 9. Attribute redeclaration in UML
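The idea of supporting redeclaration by adding a constraint to the subtype can be sketched in Python as a subclass that narrows what it accepts. This is only an analogy under stated assumptions: the class names follow Fig. 9, but the constructor-based check is an illustrative choice introduced here, and it is enforced only when the object is created.

```python
from typing import Optional


class Super:
    pass


class Sub(Super):
    pass


class Middle:
    def __init__(self, attr: Optional[Super] = None) -> None:
        # Optional attribute of type Super (multiplicity 0..1).
        self.attr = attr


class Leaf(Middle):
    def __init__(self, attr: Sub) -> None:
        # Constraint { attr[1] : Sub }: mandatory and narrowed to Sub.
        # Sub values are a subset of Super values, so Middle's typing still holds.
        if not isinstance(attr, Sub):
            raise TypeError("Leaf.attr must be a Sub instance")
        super().__init__(attr)


middle_without_attr = Middle()
leaf = Leaf(Sub())
```

Because every Sub is also a Super, a Leaf remains usable wherever a Middle is expected, which is the subset condition that makes the constraint-based mapping possible.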

In Fig. 9 Leaf inherits two attributes from Middle: attr1 is left unchanged, while attr is constrained to be a mandatory attribute of type Sub instead of being an optional valued attribute of type Super (as stated in Middle). Analogously, attribute num of class Sub is constrained to be of type Integer instead of Number.

5.4 Schemas

All EXPRESS-G examples investigated until now are entity-level models, i.e. only the contents of one single schema has been displayed. Next we are going to examine a schema-level model, i.e. an example with multiple schemas where solely schemas and their relations are shown. Remember that on schema level, two EXPRESS-G relation symbols have a meaning which differs from the one on entity level. Sad but true! A dashed line with an open circle at one end denotes a schema-schema 'reference', i.e. instances of referenced entities can only occur when required as attribute values. A normal line with an open circle at one end displays a 'use' relation between schemas, i.e. one or more foreign entity declarations are treated as local declarations. Fig. 10 shows three schemas on schema level with schema fem using the entity property from schema mat and referencing the entity point from schema geom while giving point the alias node.

[Figure: the three schemas geom, fem and mat; the relation from fem to geom is labelled point > node, the relation from fem to mat is labelled property]

Fig. 10. Schema level model in EXPRESS-G

In UML (Fig. 11) this corresponds to a package Fem, where the class Mat::Property is explicitly imported by using its fully qualified pathname. Besides this, there is a class Node which is associated with class Point of package Geom. The property string {reference} has been added to Node to indicate the special kind of relationship between Node and Point which cannot be directly expressed in UML.

[Figure: package Fem containing the class Node { reference } and the imported class Mat::Property; Node is linked by { alias } to the class Point in package Geom]

Fig. 11. Representation of EXPRESS-G schemas in UML

Another possibility would have been to use property strings {EXPRESS-G references} and {EXPRESS-G uses} for reasons of a similar treatment of both relationships. In principle, it may also be possible to map these relationships to stereotypes in UML, but the stereotype <<uses>> has already been defined with totally different semantics so this would surely lead to misunderstandings at least.

Inter-Schema References

Fig. 12 shows an EXPRESS-G model on entity-level that corresponds to the schema-level model of Fig. 10 and should clarify the usage of inter-schema references.

[Figure: entity mesh with the attributes elements L[1:?] (to entity element) and material (to mat.property); entity element with the attributes nodes L[1:?] (to geom.point, aliased as node) and material (to mat.property)]

Fig. 12. Inter-schema references in EXPRESS-G

The entity element has two attributes. Thereby nodes is a 'reference' to entity point (aliased as node) from schema geom. The fact that it is a 'reference' is expressed by the dashed rectangle. The optional attribute material 'uses' (EXPRESS-G semantics) entity property from schema mat, denoted by the solid rectangle. Entity mesh also has two attributes: elements, a non-empty list of elements of type element, and material which also has type property from schema mat.

[Figure: classes Mesh, Element, Node { reference } and Mat::Property, plus class Point in package Geom; associations elements (1..* { ordered }) from Mesh to Element, nodes (1..* { ordered }) from Element to Node, material from Mesh (1) and from Element (0..1) to Mat::Property, and { alias } between Node and Point]

Fig. 13. Representation of EXPRESS-G inter-schema references in UML

Regarding the UML representation (Fig. 13) of this example, the following particularities are worth explaining: there is no direct counterpart in UML for the 'uses' and 'references' statements of EXPRESS-G. Thus, another way has to be chosen: the 'uses' statement is modelled by the qualified import Mat::Property of a class from another package. On the other hand, class Node is associated with class Point from package Geom and simply aliases it. The class Point itself is thereby not known inside package Fem, to match the EXPRESS-G 'reference' semantics.
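The two workarounds described for schema-level relations can be summarised in a short Python sketch: a 'use' becomes a qualified import, and a 'reference' becomes a local class tagged {reference} that is linked to the foreign class by {alias}. The record type `UmlPackage`, the helper functions `add_use` and `add_reference`, and the string encoding of constraints are assumptions made for this illustration; the capitalisation of names merely imitates the naming shown in Fig. 11 and Fig. 13.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class UmlPackage:
    """Hypothetical, much simplified view of a UML package."""
    name: str
    imports: List[str] = field(default_factory=list)    # fully qualified names
    classes: List[str] = field(default_factory=list)    # local class labels
    associations: List[Tuple[str, str, str]] = field(default_factory=list)


def add_use(package: UmlPackage, schema: str, entity: str) -> None:
    """'use': import the foreign class by its fully qualified pathname."""
    package.imports.append(f"{schema.capitalize()}::{entity.capitalize()}")


def add_reference(package: UmlPackage, schema: str, entity: str, alias: str) -> None:
    """'reference': a local class tagged {reference} that aliases the foreign one."""
    local = alias.capitalize()
    package.classes.append(f"{local} {{reference}}")
    target = f"{schema.capitalize()}::{entity.capitalize()}"
    package.associations.append((local, "{alias}", target))


fem = UmlPackage("Fem")
add_use(fem, "mat", "property")
add_reference(fem, "geom", "point", alias="node")
```

Note that the referenced class appears only as the target of an association and never in the import list, which mirrors the statement that Point is not known inside package Fem.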

6 Conclusion and Outlook

The above examples show the feasibility of a mapping between EXPRESS-G and UML. But since EXPRESS-G is not consistently object-oriented and has some peculiarities, the mapping to UML is by no means trivial or unambiguous. Truly elegant solutions are not possible in all cases, but a complete and consistent mapping can nevertheless be realised. Although the final judgement on the introduced mapping will evolve over time and has to be left to the reader, some further impressions may be noted:

• Once one has become familiar (to some degree) with both EXPRESS-G and UML, there is no general advantage of one notation over the other concerning readability and comprehensibility.
• The expressive power of UML is far from exhausted in reaching the expressiveness of EXPRESS-G (the mapping is not onto). Furthermore, the mapping is not reversible (one-to-one) because of the difference in object models. Consequently, the defined mapping enables semantic interoperability between EXPRESS-G and the corresponding subset of UML.
• Complex constraints which cannot be directly represented in EXPRESS-G and have to be attached (in pure EXPRESS notation) to the diagrams can easily be represented in UML. An example can be found in the explanatory text for Fig. 6 and Fig. 7.

A mapping from EXPRESS (including the elements which cannot be represented with EXPRESS-G) to UML is implicitly given for those elements which can be expressed by EXPRESS-G, while the rest still has to be examined in detail.

Just before delivering the final version of this paper to the «UML»'98 workshop we were notified that for a few weeks there has been a CASE tool extension "that converts your EXPRESS models to UML" [8]. This CASE tool extension is quite new (available since March 31st, 1998) and we have examined it only very roughly. It is certainly based on a mapping from EXPRESS to UML, but our impression is that this mapping is fairly different from ours and has surely been developed independently. Their mapping does not yet seem to be final in all details, e.g. definitions of select and user defined types, as well as constraints and inter-schema references, are only managed internally and are not visualised in UML. So, we assume that there are other CASE tools evolving in the same direction and that the future will show which tool (and mapping) is the most convenient to meet the various requirements.

There is also an approach on bridging UML and STEP/EXPRESS with CDIF (originally for CASE Data Interchange Format) [12] started by JTC1/SC7/WG1. In this study period [9] different types of mappings have been carried out for CDIF and EXPRESS [10]. Moreover, there is some preliminary work on the mapping between CDIF and UML resulting in the first version of a mapping which is concerned with mappings of UML to the semantic meta-model of CDIF and of the UML core package to the CDIF meta-model [11]. Further information about the CDIF approach of bridging STEP/EXPRESS and UML can be found in [2].

References

1. G. Booch, I. Jacobson, J. Rumbaugh: "The Unified Modeling Language, Documentation Set 1.1" (1997)
2. H. Davis: "Mapping between CDIF and EXPRESS for a Study Period on Mapping Modelling Languages for Analysis & Design", to appear in the proceedings of OOPSLA'98 Workshop #25: Model Engineering, Methods and Tools Integration with CDIF (1998)
3. M. Fowler, K. Scott: "UML Distilled - Applying the Standard Object Modeling Language", Addison-Wesley Object Technology Series (1997)
4. M. Holland: "Produktdatentechnologie und STEP", STEP Grundschulung, ProSTEP GmbH, Darmstadt (1995)
5. B. Oestereich: "Objektorientierte Softwareentwicklung mit der Unified Modeling Language", 3., aktualisierte Auflage (UML 1.0), Verlag R. Oldenbourg, München (1997)
6. STEP GmbH: "EXPRESS-Grundkurs - Schulungsunterlagen", ProSTEP GmbH, Darmstadt (1994)
7. D. Schenck, P. Wilson: "Information Modeling the EXPRESS Way", Oxford University Press (1994)
8. SoftLab AB (SoftLab is a subsidiary of Rational Software Corporation): "Rational Rose EXPRESS Extension" (Software), http://www.softlab.se/extern/products/express_uml/index.htm, Sweden, available since March 31st (1998)
9. JTC1/SC7: "Terms of Reference for an Initial Study Period on mapping Modelling Languages for Analysis & Design Models", http://www.CDIF.org/liaisons/07N1764.pdf (1997)
10. "Mappings of CDIF and EXPRESS", Version 3, 2nd April 1998, British Standards Institution (1998)
11. "Using the CDIF Transfer Format to exchange UML models", CDIF-JE-N34-V2, September 5th (1997)
12. For more information on CDIF mission and status, and how to obtain CDIF standards, see the CDIF Website at http://www.CDIF.org.

Porting ROSES to UML - An Experience Report

Antoni Olivé and Maria-Ribera Sancho

Universitat Politècnica de Catalunya, Dept. Llenguatges i Sistemes Informàtics
Jordi Girona Salgado 1-3, Mod. C6, E08034 Barcelona (Catalonia)
e-mail: [olive|ribera]@lsi.upc.es

Abstract. We report on our experience in porting ROSES to UML. ROSES is an information systems conceptual modeling language that we had developed prior to UML, which includes some new concepts that make the language attractive, at least in some contexts. However, the recent standardization of UML may make the adoption of non-UML languages very difficult, even if they have some features which may be of interest in general or particular contexts. But UML is in principle extensible, and able to accommodate concepts from non-UML languages. The paper explains how we have expressed those concepts in UML. We have had to give up our own notation, but we expect a considerable gain in availability of CASE tools and ease of adoption by professionals.

1 Introduction

It is well known that there are many conceptual modeling languages, which are in different stages of development, experimentation and use. Each such language has its own set of concepts and notation conventions. Given the origins and recent standardization of UML, it is likely that UML will become widely used in many projects and, on the other hand, that many UML-based CASE tools will be developed, marketed and adopted by many organizations. This fact poses a barrier to the adoption of non-UML languages, even if they have some features which may be of interest in general or in particular contexts. However, UML is a wide-spectrum modeling language that also includes some extension mechanisms. This might make UML able to accommodate concepts from non-UML languages. For a non-UML language developer, the accommodation of his/her language in UML requires, in general, changing some (or all) of the notations used by the language, but the gains might be considerable: availability of CASE tools and ease of adoption by professionals.

In this paper we report on our experience in porting ROSES to UML. ROSES is an information systems conceptual modeling language that we had developed prior to UML [3]. ROSES includes some new concepts that make the language attractive, at least in some contexts. The paper explains how we have expressed those concepts in UML. We describe the main problems we faced, and the solution we gave to them. Also, some proposals for improvement of UML are made.

J. Bézivin and P.-A. Muller (Eds.): «UML»’98, LNCS 1618, pp. 64-77, 1999. © Springer-Verlag Berlin Heidelberg 1999

Porting ROSES to UML - An Experience Report

A conceptual model consists of two (sub)models [9]: the structural model and the behavioral model. We describe the first in the next section. Sections 3 and 4 deal with the behavioral model. Section 5 summarizes the conclusions.

2 Structural Models

The structural modeling concepts of our language are based on the well-known concepts of objects, partitions, attributes, derivation rules and constraints [7,5,8]. For the purposes of this paper, it is not necessary to give the complete formal details of the structural model. Instead, we show an annotated example that introduces the main features of the language and how we have translated them into UML.

We have chosen an example which deals with persons and their marital status. Figure 1 shows its object classes, their attributes and the associations relating them. The same example specified in the ROSES language can be found in [3].

We consider four kinds of classes: base, abstract, event-selected and derived, depending on the role they play in the generalization/specialization hierarchy and on how their population is determined.

Objects are created in base classes. An insertion structural event creates a new object in the class and assigns an identifier (oid) to it. This instance continues to exist until a deletion structural event occurs. The deletion declares that the object ceases to exist in any class to which it conforms. In our example, Man and Woman are base classes. Note that they have the stereotype <>, abbreviation of "Historical Class". All base classes in ROSES/UML have this stereotype in order to represent the fact that we adopt the temporal approach, in which object existence and attribute values depend on time [4,6]. Stereotype <> has been defined as a subtype of <>. All object classes stereotyped with <> have the following properties:

− They can only be used to specify objects existing in the domain.
− They do not include any implementation aspect.
− Objects conforming to these classes know their complete history. That is, they know when they were created, when they ceased to exist and the value of their attributes at any time of their existence.
− They are subclasses of the abstract object class Object, which has the operation existsAt(t:Time):Boolean defined on it. The result of the operation applied to a given object is "true" if the object exists at t, and "false" otherwise.

For each attribute attr:Type defined in these classes we implicitly assume the existence of an operation attributeAt(t:Time):Type. The result of the operation applied to an object gives the value of the attribute for that object at t.

The second kind of classes are abstract classes. They do not have direct instances, and their population is obtained by the union of their base and abstract subclasses. We can directly represent them as UML abstract classes, with the name written in italics. As in the case of base classes, abstract classes in ROSES/UML are stereotyped with <>. In the example, Person is an abstract class.
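The temporal operations existsAt and attributeAt can be illustrated with a small Python sketch. This is only one possible reading, not the ROSES implementation: the class name HistoricalObject, the method set_attribute, the use of integer time instants and the choice that an object no longer exists at its deletion instant are all assumptions made for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class HistoricalObject:
    """Hypothetical stand-in for an object of a historical class."""
    created_at: int
    deleted_at: Optional[int] = None  # None: the object still exists
    # attribute name -> list of (time, value) pairs, kept sorted by time
    _history: dict = field(default_factory=dict)

    def exists_at(self, t: int) -> bool:
        """Counterpart of existsAt(t): true iff the object exists at t."""
        if t < self.created_at:
            return False
        # Assumption: existence covers the half-open span [created, deleted).
        return self.deleted_at is None or t < self.deleted_at

    def set_attribute(self, name: str, t: int, value: Any) -> None:
        """Record that attribute `name` has `value` from time t onwards."""
        changes = self._history.setdefault(name, [])
        changes.append((t, value))
        changes.sort(key=lambda change: change[0])

    def attribute_at(self, name: str, t: int) -> Any:
        """Counterpart of attributeAt(t): value of the attribute at t."""
        if not self.exists_at(t):
            raise ValueError("object does not exist at the requested time")
        changes = self._history.get(name, [])
        times = [time for time, _ in changes]
        index = bisect_right(times, t)  # number of changes made up to t
        if index == 0:
            return None  # no value recorded yet at time t
        return changes[index - 1][1]
```

Keeping the full list of changes, instead of only the current value, is what lets such an object answer questions about any past instant.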

Antoni Olivé and Maria-Ribera Sancho

[Figure 1: UML class diagram of the example]

Classes and partitions:
− Person (abstract), with attribute name: string(30) and an association with itself whose ends are parents (2..2) and children (*).
− Partition male_female {disjoint, complete} of Person: Man, Woman.
− Partition existence {disjoint, complete} of Person: AlivePerson, DeadPerson.
− Partition marital_status {disjoint, complete} of AlivePerson: Single, MarriedPerson, Divorced, Widowed.
− MarriedPerson has attributes dateOfMarriage: Date and spouse: MarriedPerson.

Tagged values:
− Man, Woman: population = "permanent instances"; initial membership = "always"; membership interval = "single permanent"
− Person: initial membership = "always"; membership interval = "single permanent"; key = "name"
− Person.name: changeable = "frozen"; initial value = "always"; existence interval = "single permanent"
− Person.parents: initial value = "always"; changeable = "frozen"
− Person.children: changeable = "addOnly"
− AlivePerson, Single: initial membership = "always"; membership interval = "single non-permanent"
− DeadPerson: initial membership = "never"; membership interval = "single permanent"
− MarriedPerson, Divorced, Widowed: initial membership = "never"; membership interval = "multiple"
− MarriedPerson.dateOfMarriage, MarriedPerson.spouse: initial value = "always"; existence interval = "single permanent"; changeable = "frozen"

Fig. 1. Example

In ROSES, subclass/superclass relationships are structured in partitions, which may be complete or partial. Usually, the subclasses of a partition are of the same kind. We have seen an example in the complete partition of Person into base classes Man and Woman. Figure 1 also shows two other examples: existence, the partition of abstract class Person into event-selected classes AlivePerson and DeadPerson, and marital_status, the partition of class AlivePerson into event-selected classes Single, MarriedPerson, Divorced and Widowed.

The population of an event-selected class consists of all the instances conforming to all its superclasses that have been explicitly inserted in the class and have not been removed. The elimination does not destroy the object; it just removes it from the class (and all its subclasses). The insertion/elimination of objects to/from an event-selected class is done by means of special structural events. In the example, AlivePerson, DeadPerson, Single, MarriedPerson, Divorced and Widowed are event-selected classes.

Unfortunately, the semantics of UML models (version 1.1) seems to be such that objects cannot change the set of classes to which they conform during their lifetimes, so an object created to conform to the AlivePerson class could never become a DeadPerson. To be able to model this kind of situation for objects conforming to an event-selected class, we have defined the stereotype <>, abbreviation of "Historical Role". It has been defined as a subtype of <>. All object classes stereotyped with <> have the following properties:

− They can only be used to specify roles.
− They do not include any implementation aspect.
− They have one <> as a superclass (either directly or indirectly). The instances of the <> are also instances of that <>.
− Object classes stereotyped with <> know the complete history of their membership in <> classes and of their state while belonging to each one of them.
− We implicitly assume the existence of an operation hasRoleAt(HRole:Role,t:Time):Boolean, defined in the abstract object class Object. The result of the operation applied to a given object for a given time instant t is "true" if the object conforms to HRole at t, and "false" otherwise.

Our concept of event-selected classes is similar to Syntropy's state types [1]. State types appear in the state machine that describes the behavior of their supertype. This may suggest that we could model our event-selected classes as states in a UML Statechart Diagram. However, UML's states cannot have attributes or associations with other classes, while event-selected classes and Syntropy's state types can have an arbitrary number of attributes and can be associated with other classes. This fact has forced us to use the <> stereotype mechanism.
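The idea that an object acquires and loses roles without being destroyed, and that hasRoleAt can be asked about any instant, can be sketched in Python as follows. The class RoleHistory and its methods insert and remove are hypothetical names chosen for this illustration, and the treatment of the removal instant (the object no longer has the role at that instant) is an assumption rather than something the text specifies.

```python
from dataclasses import dataclass, field


@dataclass
class RoleHistory:
    """Hypothetical record of the roles one object has played over time."""
    # role name -> list of [start, end] spans; end None means "still a member"
    intervals: dict = field(default_factory=dict)

    def insert(self, role: str, t: int) -> None:
        """The object is explicitly inserted in an event-selected class at t."""
        spans = self.intervals.setdefault(role, [])
        if spans and spans[-1][1] is None:
            raise ValueError(f"already a member of {role}")
        spans.append([t, None])

    def remove(self, role: str, t: int) -> None:
        """The object leaves the class at t; the object itself survives."""
        spans = self.intervals.get(role, [])
        if not spans or spans[-1][1] is not None:
            raise ValueError(f"not a member of {role}")
        spans[-1][1] = t

    def has_role_at(self, role: str, t: int) -> bool:
        """Counterpart of hasRoleAt(HRole, t)."""
        return any(start <= t and (end is None or t < end)
                   for start, end in self.intervals.get(role, []))
```

For instance, a person inserted in AlivePerson at time 1 and moved to DeadPerson at time 10 still answers true to has_role_at("AlivePerson", 9).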

Derived classes model another kind of partitioning in ROSES. Their population consists of all the instances conforming to all their superclasses that satisfy a given condition. This kind of class can be represented as a UML derived element, shown by placing a slash in front of its name. The corresponding derivation rule is represented as an invariant. There are no derived classes in the example of figure 1.

One of the innovative concepts of our language is that it provides, at the structural level, a number of new temporal features (related to class populations and to attributes) that make it simple to define common dynamic (or temporal) constraints [2]. As we will see in the next section, these features have a significant impact on the possible structural events. We represent them by extending UML with properties for classes and attributes, using the tagged value mechanism.

In our example we define that Person is an object class with two population features: initial membership = always means that all persons must be instances of this class when they are created (other options are never|sometimes); and membership interval = single permanent means that a person will remain a member of this class until the end of the person's lifespan (other options are single non-permanent|multiple). The population feature membership interval = single non-permanent of AlivePerson means that a person may leave this class (while still being a Person). Another population feature is illustrated in object classes Man and Woman: population = permanent instances means that once a Man/Woman is created, he/she exists until the end of the system's lifespan.

With respect to attributes, we have defined that name is a single-valued attribute of Person, with three attribute features: initial value = always means that when a person is created he/she must have a name (other options are never|sometimes); existence interval = single permanent means that once a person has a name he/she must always have a name (other options are single non-permanent|multiple); and changeable = frozen means that a person's name cannot be changed. Note that we have adopted the standard UML property changeable.

We have also defined a multivalued attribute parents of Person with a UML association. Two attribute features have been defined for the association end parents: initial value = always means that the parents must be known when the person is created, and changeable = frozen means that a person's parents cannot be changed. The other end of the association defines the inverse attribute children, with the attribute feature changeable = addOnly, meaning that additional children may be added but they may not be deleted.

Finally, attribute spouse of MarriedPerson represents the relationship between a married person and his/her spouse. Notice that it could have been modeled as a reflexive association. The problem we have here is that the spouse association is inherently symmetric, and it breaks the UML rule that all the associations emanating from a type must have distinct role names.

ROSES allows the definition of any kind of structural constraint, in a temporal framework. In general, we have been able to translate them into OCL. However, some frequent and special constraints are better expressed using the tagged value mechanism. For example, we have defined the property key = {attribute name} to specify that a set of attributes identifies an object instance.
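As an illustration of how the population features constrain what may happen to an object, the following Python sketch encodes some of the tagged values of figure 1 and checks a membership history against them. The names PopulationFeatures, FIGURE_1 and violations are hypothetical, and the reading of "single" as "at most one membership interval" is our interpretation of the feature names, not a definition taken from ROSES.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PopulationFeatures:
    initial_membership: str   # "always" | "never" | "sometimes"
    membership_interval: str  # "single permanent" | "single non-permanent" | "multiple"


# A few of the tagged values shown in figure 1.
FIGURE_1 = {
    "Person": PopulationFeatures("always", "single permanent"),
    "AlivePerson": PopulationFeatures("always", "single non-permanent"),
    "DeadPerson": PopulationFeatures("never", "single permanent"),
    "MarriedPerson": PopulationFeatures("never", "multiple"),
}


def violations(features, created_at, intervals):
    """Check one object's membership history in one class.

    `intervals` is a time-ordered list of (start, end) membership periods;
    end None means the membership lasts until the end of the lifespan.
    Returns a list of human-readable problems (empty if none).
    """
    problems = []
    member_at_creation = bool(intervals) and intervals[0][0] == created_at
    if features.initial_membership == "always" and not member_at_creation:
        problems.append("must be a member when created")
    if features.initial_membership == "never" and member_at_creation:
        problems.append("must not be a member when created")
    if features.membership_interval != "multiple" and len(intervals) > 1:
        problems.append("only one membership interval is allowed")
    if (features.membership_interval == "single permanent"
            and any(end is not None for _, end in intervals)):
        problems.append("membership may not end before the lifespan does")
    return problems
```

Under these assumptions, a person who leaves AlivePerson once is accepted, whereas a person who is a DeadPerson from the moment of creation is reported as a violation.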

3 Events

ROSES includes several concepts related to events, which are introduced in this section. Figure 2 shows the general structure of the existing event types and the way they have been modeled in UML.

[Figure 2: UML class diagram. The abstract class Event, with attribute time: Time, is partitioned {disjoint, complete} into ExtEvent, IntEvent and GenEvent. StrEvent, with operation apply(), is a subclass of ExtEvent.]

Fig. 2. Event representation

All event classes in ROSES/UML have the stereotype <>, abbreviation of "Conceptual Event". This stereotype has been defined as a subtype of <>. All event classes stereotyped with <> have the following properties:

− They can only be used to specify events existing in the domain.
− They do not include any implementation aspect.
− Attributes and associations of objects conforming to these classes are permanent. Their value is determined at object creation and they cannot be modified.

In ROSES, all events have an attribute time: Time indicating the time instant when the event occurs. This attribute has been defined in the abstract class Event. We consider three kinds of events: external, internal and generated. Classes ExtEvent, IntEvent and GenEvent are abstract classes that generalize all concrete classes of the corresponding type.

An external event corresponds to a change in the Universe of Discourse that occurs at a given time instant and which is not induced by the Information System itself. External event classes group those external events that notify the same kind of change in the Universe of Discourse. An important subtype of ExtEvent is StrEvent, an abstract class that generalizes all structural events, as we will see in the following subsection.

We also have generated events. Such events are generated according to a rule defined in some object class. Typically, they are used to signal a state of the Information Base (IB) for which some action is required.

In some complex cases, it may also be convenient to define internal events, which lie between external and structural events. Their only purpose is to give structure to the event rules.

3.1 Structural Events

Conceptual modeling languages provide (either implicitly or explicitly) a set of structural events based on the structural model. Such events allow common changes such as inserting or deleting an object instance, changing the value of attributes, and so on. How such events are invoked depends on the specific approach or language used.

In ROSES we have elaborated the concept of structural event a little further. We take into account the structure of object classes and the set of population and attribute temporal features (as well as the common static features), and automatically determine the set of possible structural events. Each structural event has an associated set of constraints and effects on the Information Base. For example, insertions of men can only be done through the structural event Insert_Man. Its definition is shown in figure 3.

[Figure 3: UML class diagram. Insert_Man, a subclass of StrEvent, has attribute name: String(30), an association parents (2..2) with Person, and operation apply().]

Fig. 3. Structural event Insert_Man

Insert_Man has the following associated constraints and effects:

− The name cannot be nil.
− A new man is created, with the given attributes. The man also becomes an instance of class Person.
− The man is made an instance of class AlivePerson (because the initial membership of this class is always).
− The man is made an instance of class Single (for a similar reason).
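The constraints and effects listed above can be read operationally, as in the following Python sketch. It is a minimal illustration under stated assumptions: the classes Man, InformationBase and InsertMan are hypothetical, time instants are integers, and parents are represented by plain strings instead of Person objects.

```python
from dataclasses import dataclass, field


@dataclass
class Man:
    name: str
    parents: tuple
    roles: dict = field(default_factory=dict)  # role name -> time acquired


@dataclass
class InformationBase:
    men: list = field(default_factory=list)


@dataclass(frozen=True)
class InsertMan:
    """Hypothetical counterpart of the structural event Insert_Man."""
    time: int
    name: str
    parents: tuple

    def apply(self, ib: InformationBase) -> Man:
        # Constraint: the name cannot be nil.
        if self.name is None:
            raise ValueError("the name cannot be nil")
        # The parents association end has multiplicity 2..2.
        if len(self.parents) != 2:
            raise ValueError("exactly two parents are required")
        # Effect: a new man is created with the given attributes.
        man = Man(self.name, self.parents)
        # Effect: initial membership = always for AlivePerson and Single.
        man.roles["AlivePerson"] = self.time
        man.roles["Single"] = self.time
        ib.men.append(man)
        return man
```

Because the event object is frozen, its attributes are fixed at creation, which matches the property that attributes of event classes are permanent.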

To obtain the desired effects, structural events have the operation apply(), which is defined in class StrEvent and refined in each structural event. We assume, through a general rule, that whenever the creation of a structural event is induced, its apply operation is immediately invoked. ROSES/UML automatically determines the effect of each apply operation. In the example of figure 3, the occurrence of a structural event Insert_Man causes the invocation of its operation apply, with the effects (defined in OCL):

    Insert_Man::apply()
    post: Man.allInstancesAt(self.time) → exists! (p |
            not (Man.allInstancesAt(self.time-1) → includes(p)) and
            p.name = self.name and
            p.parents = self.parents and
            p.hasRoleAt(AlivePerson, self.time) and
            p.hasRoleAt(Single, self.time))

Note the use of the exists! operator, which is an extension of OCL's exists, to mean "exactly one" instance. The property allInstancesAt(t:Time) results in the set of all instances of a Type existing at time t. This is for us a required extension to the allInstances predefined feature of OCL.

It is not possible to delete persons (men or women), since we have defined their classes with the population feature permanent instances. The rest of the structural events corresponding to the object model given in the previous section appear in figure 4.

[Figure 4: UML class diagram. Insert_Man and Insert_Woman (each with attribute name: String(30) and an association parents (2..2) with Person), Insert_DeadPerson, Insert_Divorced, Insert_Widowed and Insert_MarriedPerson are subclasses of StrEvent. The diagram also shows associations with AlivePerson, among them one named spouse.]

Fig. 4. Structural events

Note that there are no structural events that change attributes of persons. This is due to the attribute features defined for those attributes.

3.2 External Events

ROSES provides a rich set of language constructs to define external events, including generalization/specialization hierarchies, simple/complex events, derived attributes and constraints. However, we have found that very often our structural events themselves correspond (including their associated constraints and effects) to external events and, thus, they need not be defined explicitly as such. For instance, the above Insert_Man structural event may be used as an external event to communicate the birth of a man.

In other cases, the external events do not correspond to one of the structural events and must be defined explicitly. For instance, marriages and divorces can be communicated with external events Marriage and Divorce, as shown in figure 5.

[Figure 5: UML class diagram. Marriage and Divorce are subclasses of ExtEvent. Marriage has an association with Man (role bridegroom) and one with Woman (role bride), and carries the constraint {MarriedAlready}. Divorce has an association with AlivePerson and carries the constraint {NotMarried}. The two constraints are defined as follows.]

    MarriedAlready (on Marriage):
        not self.bridegroom.hasRoleAt(MarriedPerson, self.time-1) and
        not self.bride.hasRoleAt(MarriedPerson, self.time-1)

    NotMarried (on Divorce):
        self.alivePerson.hasRoleAt(MarriedPerson, self.time-1)

Fig. 5. External events Marriage and Divorce

Event attributes would be defined in the attributes compartment but, in this case, there are no such attributes (with the exception of the time attribute, defined in abstract class Event). On the other hand, Marriage has two associations, one with Man (indicating the bridegroom) and one with Woman (indicating the bride). Divorce also has one association, which identifies the AlivePerson that gets divorced.

Integrity constraints directly related to external events are described as UML invariants in the OCL language at the class level. In figure 5 the first invariant states that a marriage self, occurring at self.time, violates the constraint named MarriedAlready if the bridegroom (or the bride) of self was a MarriedPerson at self.time-1. Such invariants must be true for each instance of the corresponding class.
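The two invariants of figure 5 amount to checks on the state one instant before the event. A minimal Python sketch follows; the function names and the representation of role membership as a dictionary of intervals are assumptions introduced here for illustration.

```python
def has_role_at(history, person, role, t):
    """history maps (person, role) to a list of (start, end) intervals;
    end None means the role is still held."""
    return any(start <= t and (end is None or t < end)
               for start, end in history.get((person, role), []))


def marriage_is_valid(history, bridegroom, bride, time):
    """Invariant MarriedAlready: neither party was married at time - 1."""
    return (not has_role_at(history, bridegroom, "MarriedPerson", time - 1)
            and not has_role_at(history, bride, "MarriedPerson", time - 1))


def divorce_is_valid(history, person, time):
    """Invariant NotMarried: the person was married at time - 1."""
    return has_role_at(history, person, "MarriedPerson", time - 1)
```

Both checks look at time - 1 and not at time itself, because the event occurring at time is what changes the marital status.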

4 Behavioral Models

We now need to establish the relationship between external and structural events. This is done by means of event rules. The trivial case occurs when a structural event is used as an external event. In this case, the rule is implicit: when the external event occurs, the corresponding structural event is induced and, if its associated integrity constraints are satisfied, the IB is modified according to its associated effects, as specified in the apply operation. In the example, this case happens with structural events Insert_Man, Insert_Woman and Insert_DeadPerson.

In general, a ROSES event rule has the form:

    rule ruleName
        event_1 (att_1:X,...,att_n:Y,time:T)
        if event_2 (Z), Z.time = T, F;
        [duplicates [non-]allowed];
        [before|after] ruleName;
    end

where:

− event_1 is a (structural, internal or external) event with attributes att_1,...,att_n and time,
− event_2 is an (external, internal, generated or structural) event, and
− F is a formula with at least variables {X,...,Y}.

The meaning of the above rule is: each occurrence Z of event_2, with occurrence time T, induces, if formula F is true, an occurrence of event_1 with attribute values given by Z, T and the variables in F. In the rule, event_2 is the triggering event, and event_1 is the induced (or triggered) event. We also say that, in the rule, the triggering event is the cause and the induced event is the effect. When the effect is a structural event, its induction implies that the associated effect is performed on the Information Base.

There may be several solutions to model these rules in UML. One of them is to define an execute operation in each event class. The invocation of this operation for a triggering event would evaluate formula F and, if true, would create one or more instances of the induced event. The operation could be specified in any of the forms allowed in UML, including interaction diagrams. However, we rejected this solution because an event may be a triggering event in several rules and, therefore, its execute operation could become large and complex.

In the end, we chose to model event rules as UML invariants written in the OCL language, stereotyped with <<event rule>> and associated with an event class. The invariant must be true for each instance (self) of the class. For example, the rule:

    <<event rule>> Marriage {name = Marriage_1}
    Insert_MarriedPerson.allInstancesAt(self.time) → exists! (e |
        e.alivePerson = self.bridegroom and
        e.spouse = self.bride and
        e.time = self.time)

declares that the occurrence of a Marriage (external event) induces an occurrence of Insert_MarriedPerson (structural event). In this case, the rule only serves to give the values of attributes alivePerson and spouse of the induced event. In other cases it may include references to other objects. The rule must be true for each instance (self) of Marriage.

An event may be a triggering event in several rules and, on the other hand, an event may be a triggered event in several rules. If two or more rules have the same triggering event, they can be combined into a single rule. This must be understood as a syntactic simplification only, with no semantic effect. For example:

    <<event rule>> Marriage {name = Marriage}
    Insert_MarriedPerson.allInstancesAt(self.time) → exists! (e |
        e.alivePerson = self.bridegroom and
        e.spouse = self.bride and
        e.time = self.time)
    and
    Insert_MarriedPerson.allInstancesAt(self.time) → exists! (e |
        e.alivePerson = self.bride and
        e.spouse = self.bridegroom and
        e.time = self.time)

The last rule of our example shows a structural event that acts as a triggering event:

    <<event rule>> Insert_DeadPerson {name = BecomingWidowed}
    self.alivePerson.hasRoleAt(MarriedPerson, self.time-1) implies
    Insert_Widowed.allInstancesAt(self.time) → exists! (e |
        e.alivePerson = self.alivePerson.oclAsType(MarriedPerson).spouseAt(self.time-1) and
        e.time = self.time)

which defines that the death of a married person induces the structural event Insert_Widowed for the person who was his/her spouse at the instant preceding the death. Note the use of the OCL operation oclAsType to re-type self.alivePerson to MarriedPerson, where the spouseAt operation is defined.
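The event rules above are declarative invariants: they state which induced events must exist, not how to create them. Purely as an illustration of their meaning, the following Python sketch gives an operational reading in which each rule is a function from a triggering event to the list of events it induces. This is not the form adopted in the paper; the class names, the function names and the spouse_at callback are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Marriage:
    time: int
    bridegroom: str
    bride: str


@dataclass(frozen=True)
class InsertMarriedPerson:
    time: int
    alive_person: str
    spouse: str


@dataclass(frozen=True)
class InsertDeadPerson:
    time: int
    alive_person: str


@dataclass(frozen=True)
class InsertWidowed:
    time: int
    alive_person: str


def marriage_rule(event: Marriage) -> list:
    """Combined Marriage rule: one induced event per spouse."""
    return [
        InsertMarriedPerson(event.time, event.bridegroom, event.bride),
        InsertMarriedPerson(event.time, event.bride, event.bridegroom),
    ]


def becoming_widowed_rule(event: InsertDeadPerson, spouse_at) -> list:
    """BecomingWidowed rule. `spouse_at(person, t)` is assumed to return
    the spouse of `person` at time t, or None if the person was unmarried."""
    spouse = spouse_at(event.alive_person, event.time - 1)
    return [] if spouse is None else [InsertWidowed(event.time, spouse)]
```

Each function stays small because it covers a single rule, which avoids the large execute operation that motivated the choice of invariants in the first place.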
The options duplicates [non-]allowed can be used when, for a given triggering event, formula F can instantiate variables {X,...,Y} {Z,T} in two or more different ways. In this case, the option duplicates allowed means that there must be an induced event for each instantiation, while duplicates non-allowed would give a unique induced event. In UML, we do not need to define this option as a property of the rule. OCL provides the language constructs we need to specify the same effect. For example, suppose that sometimes users wish to send letters to persons that have married during the current year. We would define a new external event SendLetters that will be used at any time to tell the system to send the letters. Assume also that the sending of the letters will be associated with the internal event SendTo, with attributes name:String(30) and time:Time. The event rule expressed in OCL could be:

    <<event rule>> SendLetters {name = SendingOfLetters}
    Marriage.allInstancesAt(self.time) →
        (select(m | year(m.time) = year(self.time)).bridegroom → asSet
         union
         select(m | year(m.time) = year(self.time)).bride → asSet)
        → forAll(r |
            SendTo.allInstancesAt(self.time) → exists! (u |
                u.name = r.name and u.time = self.time))

where the asSet operation removes duplicates. Note in this case that, with our temporal approach, we can refer to Marriage external events that occurred in the past. The expression select(m | year(m.time) = year(self.time)) is satisfied by all marriages such that the year of their occurrence time is the year of the occurrence of the SendLetters event.

The options [before|after] ruleName can be used to establish a priority in rule evaluation when several orders are possible. This can be expressed as a property of the rule.

Practical experience shows that the modeling constructs:

− object class hierarchies, and derived object classes,
− external event hierarchies, and derived external events, and
− event rules, with external events as triggers and structural events as induced events,

are sufficient to model the behavior of most information systems. In particular, observe that a single external event occurrence may belong to several external event classes and, thus, it may simultaneously trigger several rules. For example, assume that we define an external event class ChangeOfMaritalStatus (see figure 6) as a generalization of events Marriage and Divorce, and that Repetition is a derived external event subclass of Marriage. Its definition is given in the following derivation rule:

    <<derivation rule>> Repetition
    allInstancesAt(t) = Marriage.allInstancesAt(t) → select (m |
        m.bridegroom.hasRoleAt(MarriedPerson, T1) and
        T1 < m.time and
        m.bridegroom.oclAsType(MarriedPerson).spouseAt(T1) = m.bride)

Then, an occurrence of Marriage is also a ChangeOfMaritalStatus event and, if the two involved persons were already married to each other in the past, it is also an occurrence of the Repetition event. In this case, a single occurrence of a marriage would trigger all rules having one of these three events as triggering event.
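The SendLetters rule and the Repetition derivation rule can both be approximated with ordinary set and filter operations, as in the following Python sketch. The tuple layout (date, bridegroom, bride), the function names, and the use of earlier marriage events in place of the spouseAt(T1) lookup are simplifying assumptions made for this illustration.

```python
from datetime import date


def letter_recipients(marriages, today):
    """Persons who married in the year of `today`, without duplicates.

    `marriages` is an iterable of (date, bridegroom, bride) tuples. Using
    sets plays the part of asSet in the OCL rule: a person who married
    twice in the same year still receives only one letter.
    """
    this_year = [m for m in marriages if m[0].year == today.year]
    return {m[1] for m in this_year} | {m[2] for m in this_year}


def is_repetition(marriage, marriages):
    """True if the same couple already married at an earlier date.

    Approximates the derivation rule of Repetition: an earlier marriage
    event stands in for "was married to this bride at some T1 < m.time".
    """
    when, bridegroom, bride = marriage
    return any(w < when and g == bridegroom and b == bride
               for w, g, b in marriages)
```

With this reading, duplicates non-allowed corresponds to building a set before inducing the SendTo events, while duplicates allowed would iterate over the unfiltered list.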

[Figure 6: UML class diagram. ChangeOfMaritalStatus is a subclass of ExtEvent and generalizes Divorce and Marriage; the derived class /Repetition is a subclass of Marriage.]

Fig. 6.

5 Conclusions

We have described the porting of the ROSES conceptual modeling language to UML. The main problems we faced, and the solutions we gave to them, are:

− Structural classes in ROSES are temporal, in the sense that they maintain a complete knowledge of their evolution through time. We have defined a new class stereotype (<>) to model such classes.
− There are many properties (temporal features) of class populations and attributes in our language. We have been able to express them using the tagged value mechanism.
− Some classes in ROSES are roles, in the sense that their objects may acquire and lose these roles during their lifetime. We have not found a similar concept in UML, and therefore we have defined a new class stereotype (<>).
− ROSES is a formal language. We have been able to maintain such formality in UML, using the OCL language, with some minor modifications.

Acknowledgments

We would like to thank the members of the ROSES group and the anonymous referees for their helpful comments. This work has been partially supported by PRONTIC CICYT program project TIC95-0735.

References

1. Cook, S.; Daniels, J. "Designing Object Systems. Object-Oriented Modeling with Syntropy", Prentice Hall 1994.
2. Costal, D.; Olivé, A.; Sancho, M.R. "Temporal Features of Class Populations and Attributes in Conceptual Models". 16th International Conference on Conceptual Modelling (ER'97), Los Angeles, November 1997 (LNCS 1331), pp. 57-70.
3. Costal, D.; Sancho, M.R.; Olivé, A.; Barceló, M.; Costa, P.; Quer, C.; Roselló, A. "The Cause-Effect Rules of Roses", First East-European Symposium on Advances in Databases and Information Systems (ADBIS'97), St. Petersburg, Russia, September 1997.
4. Gustafsson, M.R.; Karlsson, T.; Bubenko, J.A. "A Declarative Approach to Conceptual Information Modeling". In "Information Systems Design Methodologies: A Comparative Review", North-Holland, pp. 93-142.
5. Martin, J.; Odell, J.J. "Object-oriented Methods: A Foundation", Prentice Hall 1995.
6. Olivé, A. "On the design and implementation of information systems from deductive conceptual models", Proc. VLDB89, Amsterdam, pp. 3-11.
7. Peckham, J.; Maryanski, F. "Semantic Data Models", ACM Computing Surveys, 20, 3 (Sept.), pp. 153-189.
8. Rational Software Corporation, "Unified Modeling Language (UML)", Version 1.1, September.
9. van Griethuysen, J.J. "Concepts and terminology for the conceptual schema and the information base", ISO/TC97/SC5/WG3.

Making UML Models Interoperable with UXF

Junichi Suzuki and Yoshikazu Yamamoto

Department of Computer Science, Graduate School of Science and Technology, Keio University
Yokohama City, 223-8522, Japan
+81-45-563-3925 (Phone and FAX)
{suzuki, yama}@yy.cs.keio.ac.jp, http://www.yy.cs.keio.ac.jp/~suzuki/project/uxf

Abstract. Unified Modeling Language (UML) has been widely accepted in the software engineering area, because it provides most of the concepts and notations that are essential for documenting object-oriented models. However, UML intentionally does not define an explicit format for describing and interchanging its model information. This paper addresses UML model interchange and presents our efforts to make UML highly interoperable. We developed an interchange format called UXF (UML eXchange Format) based on XML (Extensible Markup Language). UXF is a simple and well-structured format for encoding UML models. It supports tool interoperability, team development and the reuse of design models by interchanging the model information with the XML standard. We also propose an open distribution platform for UML models, which provides multiple levels of interoperability of UML models. Our work is an important step in the evolution towards an interoperable UML.

1 Introduction

Unified Modeling Language (UML) [1,2,3,4,5,6,7,8] has been widely accepted as an object-oriented software analysis/design methodology in the software engineering community. It provides most of the concepts and notations that are essential for documenting object-oriented models. While UML is the union of the previously leading object modeling methodologies (Booch [9], OMT [10] and OOSE [11]), it includes additional constructs that these methods did not address, such as the Object Constraint Language (OCL) [6] and the Object Analysis & Design CORBAfacility Interface Definition [8]. It represents the state-of-the-art convergence of practices in the academic and industrial communities. Also, as a publicly available standard, UML is now in the process of standardization and revision at the Object Management Group (OMG) [12].

UML provides a series of diagrams at a fine level of abstraction to specify object models for a given problem. Complex systems can be modeled through a small set of nearly independent diagrams. UML defines two aspects for constructs in every diagram:

– Semantics: The UML metamodel defines the abstract syntax and semantics of object modeling concepts.
– Notations: UML defines graphical notations for the visual representation of its model elements.

J. Bézivin and P.-A. Muller (Eds.): UML'98, LNCS 1618, pp. 78-91, 1999. © Springer-Verlag Berlin Heidelberg 1999

While UML defines coherent model elements and their interchangeable semantics, it intentionally does not provide an explicit format for exchanging the model information. The ability to interchange models is quite important, because a development team is likely to be spread over separate places in a network environment, and because current UML models are not interoperable between development tools due to the lack of an application-neutral exchange format [13].

This paper addresses standards-based UML model interchange and presents our efforts to make UML models interoperable. We have developed an interchange format called UXF (UML eXchange Format) [14], which is based on XML (eXtensible Markup Language). We consider the use of XML as a mechanism for encoding and exchanging the structured data defined with UML.

The remainder of this paper is organized as follows. Section 2 discusses some candidate formats and their pros and cons. Then, we describe the rationale for and merits of employing XML. Section 3 outlines comparisons with related work. Section 4 defines the scope and syntax of UXF and presents some examples of processing UXF-formatted data. Section 5 presents our applications using UXF. We conclude with a note on the current status of the project and future work in Sections 6 and 7.

2 UML Model Interchange

The most important factor in interchanging UML models is that the semantics within the models are described explicitly and transferred precisely. This section describes why we chose XML from among several candidates, and presents the characteristics of UXF.

2.1 Candidate Formats

There are several candidate formats for encoding and interchanging UML models. The following paragraphs discuss their pros and cons.

Proprietary format. The first candidate is a proprietary format. It is a straightforward strategy, and allows development tools to use an optimized syntax to encode model information. However, it suffers from non-interoperability: the model information cannot be reused between different tools. Though some tools (e.g. CASE tools) provide export/import capabilities that translate one proprietary format into another, these are not a substantial solution for UML model interchange.


HTML (HyperText Markup Language). The second candidate is HTML. It is an easy-to-learn format, and has been widely accepted in the Web and documentation communities. HTML, however, cannot describe arbitrary or complex data structures because it provides only a fixed set of tags. An example of an HTML documentation tool is javadoc, included in the Java Development Kit (JDK), which translates the comments in Java source code into specification documents written in HTML. While such a tool is valuable and helpful for everyday development work, some important model information is unfortunately thrown away in the process of producing HTML documents, due to the fixed tag set. In other words, HTML documents generated by documentation tools cannot describe the semantics of model information precisely and cannot be reused in other applications.

XML (eXtensible Markup Language). XML is an emerging data format in the Web community, which is standardized by the World Wide Web Consortium (W3C) [14]. While HTML is defined using SGML (Standard Generalized Markup Language: ISO 8879), XML is a sophisticated subset of SGML, designed to describe arbitrary document structures beyond HTML. One of the goals of XML is to be suitable for use on the Web, and thus to provide a generic mechanism for the delivery of information over the Internet. XML has the following characteristics:

– application neutrality (vendor independence)
– user extensibility
– ability to represent arbitrary and complex information
– validation scheme of data structure
– human readability

As its name implies, extensibility is a key feature of XML; users or applications are free to declare and use their own tags and attributes. Therefore, XML ensures that both the logical structure and the content of semantically rich information can be retained. It emphasizes the description of information structure and content as distinct from its presentation. The data structure and its syntax are defined in a DTD (Document Type Definition), which specifies a set of tags and their constraints. Every XML document can have its content structure validated against its DTD. XML is also a text-based format. This means that editing XML documents is easy and existing text manipulation tools can be used to process them.

In contrast to data structure, presentation is addressed by XSL (Extensible Stylesheet Language) [15], a W3C specification for describing stylesheets for XML documents. XSL is based on DSSSL (Document Style Semantics and Specification Language, ISO/IEC 10179) and complements CSS (Cascading Style Sheets) [16], a style definition language for HTML. In addition, XPointer [17] and XLink [18], which are specifications for defining anchors and links within or across XML documents, are also in the process of standardization at W3C.

[Figure: UML Exchange Format linked to programming languages, reverse engineering tools, CASE tools, printed materials, visual profiling tools, hyperlinked online help, design metrics tools and a repository]

Fig. 1. UXF allows the seamless interchange of UML model information between development tools

2.2 UML eXchange Format (UXF)

As such, XML has great potential as an interchange format for UML. We have developed an XML-based format called UXF (UML eXchange Format). UXF facilitates:

– Interoperability between development tools: Software models change continually during the analysis/design, revision and maintenance phases, and the software tools used by a development team employ their own proprietary formats to describe model information. UXF allows UML models to be interoperable between development tools throughout the software development lifecycle. Once encoded in a common format, model information can be reused by a wide range of development tools with different strengths (Fig. 1). This seamless interoperability increases the productivity of UML modeling.
– Intercommunication between software developers: The Internet is a promising infrastructure for distributing and sharing software model information, because it is an effective and economical way to make information available to a dispersed group of individuals. Within an Internet/intranet environment, especially the Web, developers can represent and communicate software modeling insights and understanding with each other. For example, we may write model information into electronic mail, or use a distributed communication system to transfer UXF descriptions. UXF simplifies the circulation of UML models between software developers.
– Natural extension of the existing Web environment: UXF is a natural and transparent extension of the existing Web environment. Thus, UXF descriptions can be edited, published, accessed and exchanged as easily as is currently possible with HTML. In addition, most existing Web applications can be used for handling UXF with relatively minor modifications.

To author and view UML models encoded with UXF, existing markup languages could be converted to UXF, and most development tools, such as CASE tools, documentation tools, visual profiling tools and document repositories, can be modified so that they recognize UXF. Since many XML-aware applications already exist, it is relatively easy to extend these tools. Also, in the near future, UXF descriptions could be handled by every Web application that manipulates HTML, as well as by Web browsers and servers.

UXF also allows a variety of output representations by applying different stylesheets to a UXF document. Output formats include RTF (Rich Text Format), HTML, LaTeX and PDF (Portable Document Format). Moreover, UXF data can embed hyperlinks using the linking mechanisms of XPointer and XLink. This allows us to link UML model elements. As such, developers can use technical materials as printed, electronic or interactive documents (Fig. 1).

3 Related Work

A well-known and mature format for exchanging software modeling information is CDIF (CASE Data Interchange Format) [19]. CDIF is a generic mechanism and format for interchanging software models between CASE tools, and a family of standards defined by the Electronic Industries Association (EIA) and the International Organization for Standardization (ISO). CDIF defines a meta-metamodel, a tool interchange format, and a series of subject areas:

– CDIF Framework for Modeling and Extensibility
– CDIF Transfer Format
– General Rules for Syntaxes and Encodings
– SYNTAX.1
– ENCODING.1
– CDIF Integrated Metamodel
– Foundation Subject Area
– Common Subject Area
– Data Modeling Subject Area
– Data Flow Model Subject Area
– Data Definition Subject Area
– State/Event Model Subject Area
– Presentation Location and Connectivity Subject Area

CDIF separates the semantics and syntax from the encoding, and thus provides flexibility in the representation and transfer mechanism. SYNTAX.1 and ENCODING.1 define the means for a tool-independent exchange of models. CDIF provides a mapping to UML [20], using the Foundation Subject Area and the CDIF Transfer Format, and defining a UML Subject Area that provides the definitions of metamodel entities and their relationships in UML. The UML Subject Area depends on the CDIF Foundation Subject Area.

UXF is a UML-specific exchange format and an alternative vehicle for transferring UML models. Since it is a straightforward extension of, and transparent to, the distributed Web environment, it should be easy to learn for the large number of people who are familiar with HTML or SGML. We believe UXF is a much easier and more practical approach for interchanging UML models over the Internet. We are also investigating the possibility of integrating UXF with the CDIF effort (see Section 6).

As described in Section 1, UML is now in the process of revision. As for model interchange, OMG has issued an RFP (Request For Proposal) for the SMIF (Stream-based Model Interchange Format) specification [21]. Responses to the SMIF RFP include CDIF-based, STEP-based and XML-based proposals. At present, UXF is intentionally not compliant with SMIF, in order to keep the format simple. At the time of this writing, SMIF has only been proposed and has not been frozen. Once SMIF is frozen or more mature, we will develop a translator between UXF and SMIF. UXF is carefully designed to be [13,22,23]:

– Simple: UXF is compact because it includes only UML's semantics, while the scope of SMIF includes other specifications (e.g. the Meta Object Facility).
– Intuitive: UXF is easy to learn and readable.
– Lightweight: UXF is intended not only as an interchange format but also as a basis for a broader range of interoperability for UML models (see Section 6). UXF serves as a lightweight means for such usage.

These characteristics are also strengths of other description formats for UML [24,25].

4 UXF Design Principle

In terms of interchanging model information between development tools, two types of information should be exchanged [20]:

– Model-related information
– View-related information

Model-related information is the series of building blocks that represent a given problem domain (e.g. classes, attributes and associations), whereas view-related information describes the way in which the model elements are rendered (e.g. the shapes and positions of graphical objects). This paper concentrates on exchanging model-related information. The interchange of view-related information is outside the scope of our work. However, view-related information can easily be obtained by generating a data description for a particular rendering application, or by applying XSL stylesheets to UXF.

4.1 UXF DTDs

The UXF specification consists of a series of XML DTDs that map UML model elements to XML tags. UXF captures the model elements in the UML metamodel and defines each as a tag (or document element) in a straightforward way. The attributes of each UML element are mapped to attributes of the corresponding UXF tag.


We have specified UXF DTDs for the diagrams that are essential for analysis and design: Class, Collaboration and Statechart diagrams. Table 1 shows the mapping between UML model elements and UXF tags. UXF currently supports most elements in the Core, Collaborations and State Machines packages, and some elements in other packages, of UML version 1.1. Using UXF, most essential concepts and constructs in UML can be mapped seamlessly to a stream-based description. Complete DTDs, sample markup and other materials can be found at [26]. Note that constructs described with UXF are not shared between different diagrams. Section 6 discusses this issue.

UML Package          UML model element    UXF tag
Core                 Association
                     AssociationEnd
                     Attribute
                     Class
                     Dependency
                     Generalization
                     Interface
                     Operation
                     Parameter            <Parameter>
Auxiliary Elements   Refinement
Extension            TaggedValue
Common Behavior      Exception            <Exception>
                     Action
                     ActionSequence
                     Instance
Model Management     Model                <Model>
                     Package              <Package>
Collaborations       Collaboration
                     Interaction
                     Message              <Message>
StateMachines        CompositeState
                     Event                <Event>
                     Guard
                     State                <State>
                     Transition
                     PseudoState

Table 1. Comparison between UML model elements and UXF tags
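The mapping described above, where each model element becomes a tag and each element attribute becomes a tag attribute, can be illustrated with a minimal Python sketch. The tag names Model, Class, Attribute, Operation and Parameter follow the element names in Table 1, but the attribute names NAME and TYPE, the nesting, and the helper functions are assumptions made for illustration; they are not taken from the published UXF DTDs [26].

```python
import xml.etree.ElementTree as ET


def class_to_xml(name, attributes, operations):
    """Encode one UML class as an XML element.

    Tag nesting and the NAME/TYPE attribute names are illustrative
    assumptions, not the actual UXF DTD.
    """
    cls = ET.Element("Class", {"NAME": name})
    for attr_name, attr_type in attributes:
        ET.SubElement(cls, "Attribute", {"NAME": attr_name, "TYPE": attr_type})
    for op_name, params in operations:
        op = ET.SubElement(cls, "Operation", {"NAME": op_name})
        for param_name, param_type in params:
            ET.SubElement(op, "Parameter", {"NAME": param_name, "TYPE": param_type})
    return cls


def model_to_string(model_name, classes):
    """Wrap class elements in a Model element and serialize it as text."""
    model = ET.Element("Model", {"NAME": model_name})
    for cls in classes:
        model.append(cls)
    return ET.tostring(model, encoding="unicode")


if __name__ == "__main__":
    account = class_to_xml(
        "Account",
        attributes=[("balance", "Integer")],
        operations=[("deposit", [("amount", "Integer")])],
    )
    print(model_to_string("Bank", [account]))
```

Because the result is ordinary XML text, any XML-aware tool can read it back, which is the property the interchange format relies on.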

4.2 Processing UXF Documents

This section outlines how UXF documents might be processed. In every phase, various existing XML or SGML tools can be reused.


Fig. 2. Sample screenshots of XML editors editing UXF descriptions (XML Pro from Vervet Logic and an XML major mode for Emacs named psgml)

Authoring. UXF descriptions can be created with any text editor because UXF is a text-based format. In practice, users are expected to work with an editing tool that assists their input. Figure 2 shows sample screenshots of commercial and freely available XML editors editing UXF descriptions.

Conversion. Data conversion makes the authoring work simple and productive. UXF descriptions can be converted from and to other data (e.g. legacy documents, program source code, documentation formats or the data representation of a development tool). UXF allows such conversion programs to be written easily. Examples are described in Section 5.

Parsing. Parsing is the process of analyzing and validating the syntax of UXF documents. XML allows for two kinds of documents: valid and well-formed. Validity requires that a document refers to a proper DTD and obeys its constraints. Well-formedness is a less strict criterion and requires only that a document obeys the syntax of XML. UXF requires a validating parser when authoring UXF descriptions, and a non-validating parser when browsing or delivering documents. Any of the many existing XML parsers can be used.

Distribution. UXF is designed to distribute UML models precisely over a networked environment. It can be used in existing document distribution systems to share and manage UML model information. It can also be used in the existing Web environment, so that a Web browser downloads UXF descriptions and displays them using a stylesheet or Java applets. We have developed a distributed management system for UML models (see Section 5).

Fig. 3. Sample screenshots of XML browsers rendering UXF descriptions (an XML viewer named Jumbo and Microsoft Internet Explorer)

Rendering and Browsing. Rendering and browsing involve the delivery of stylesheets or of specialized display software such as Java applets (see also Section 5). Figure 3 shows a Web browser that displays a UXF document using an XSL stylesheet, and a hierarchical XML browser.
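The distinction drawn in the parsing step between well-formed and valid documents can be made concrete with a small Python sketch. It uses the standard library parser xml.etree.ElementTree, which is non-validating: it checks only XML syntax and does not compare a document against a DTD. The sample documents and the Class tag are assumptions for illustration, and checking validity against the UXF DTDs would require a separate validating parser.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text obeys XML syntax.

    This is the check a non-validating parser performs; it says nothing
    about conformance to a DTD.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    # Properly nested and closed tags: well-formed.
    print(is_well_formed('<Model><Class NAME="Account"/></Model>'))
    # The Class element is never closed: not well-formed.
    print(is_well_formed('<Model><Class NAME="Account"></Model>'))
```

A document can pass this check and still be invalid, for example if it uses a tag that the DTD does not declare.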

5 Applications

This section presents our applications using UXF. These applications show the potential of UXF and provide standards-based ways to share UML models between various tools or over a distributed network environment.

5.1 Source Code Documentation Tools

In general, a source code documentation tool imports program source code and generates documents in a specialized format. We have developed a documentation tool that parses source code written in Java and generates UXF-formatted documents. This tool uses the Java Development Kit (JDK) by creating a class UxfDocumentationGenerator that extends DocumentationGenerator, which is included in the JDK [13]. It translates the declarations in a Java program into the corresponding UXF representation based on the mapping in Table 1. This tool allows the model information obtained from source code to be reused by other applications, including CASE tools and repositories.

Fig. 4. Sample screenshots of UXF applications (Netscape Communicator and Rational Rose)

5.2 Translator between UXF and CASE Tools

This tool is a translator that converts a UXF description into the proprietary format of a CASE tool, and vice versa. This sort of tool is in high demand, because CASE tools are frequently used in development projects. Our tool generates the importable files (*.mdl files) of Rational Rose [22]. The left of Figure 4 shows a screenshot in which Rational Rose displays a class diagram converted from a UXF description (the graphical positions of classes, associations and labels were adjusted manually for readability).

5.3 Distributed Model Management System

The last application is a distributed model management system that shares and manages UML design information within a networked environment. Our system supports team development by allowing developers to continue their work concurrently at physically separate places. We have developed this system on top of the existing Web environment and a Java-based ORB (Object Request Broker) compliant with CORBA (Common Object Request Broker Architecture), which is a standard for distributed object middleware [27]. It is based on a three-tier deployment architecture, and provides two kinds of access to UXF documents: via HTTP and via IIOP (Internet Inter-ORB Protocol), which is a TCP/IP-based standard protocol of CORBA (Fig. 5). Communication via IIOP is achieved through CORBA standard IDL (Interface Definition Language) interfaces (Fig. 6).

Fig. 5. Deployment architecture of our distributed model management system

The HTTP access allows client applications, including Web browsers, to refer to the UXF documents that are stored in Web servers or any backend databases. Figure 3 and Figure 4 show Web browsers that display a UXF description together with different XSL stylesheets. As such, different presentations suited to a specific purpose can be displayed if different stylesheets are prepared [28]. We have also developed a Java applet to display a graphical representation of UXF [28].

The IIOP access allows developers at separate places to consistently register, refer to, process and change UXF descriptions. As depicted in Figure 6, our system transfers UXF descriptions through interfaces provided by the Document Object Model (DOM), a standard of the World Wide Web Consortium (W3C) [29]. DOM defines a set of interfaces to manipulate the content, structure and style of XML/HTML documents. We implemented the DOM APIs on top of CORBA [23]. By combining promising standards, we achieved an open system allowing UML models to be interoperable in a distributed environment. A server application parses UXF documents at the system's start-up time or on the fly, and creates their in-memory structures: tree structures of parsed UXF elements. Client applications include simple command-line tools, GUI profiling tools and development environments [23].
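To give a feel for what manipulating a parsed UXF tree through DOM interfaces looks like, the following minimal sketch uses xml.dom.minidom, the DOM implementation in the Python standard library. It runs in a single process and leaves out the CORBA layer entirely. The sample document, the Class tag with a NAME attribute, and the two helper functions are assumptions for illustration and do not reproduce the system described above.

```python
from xml.dom.minidom import parseString

# Hypothetical UXF-like document; tag and attribute names are assumptions.
SAMPLE = (
    '<Model NAME="Bank">'
    '<Class NAME="Account"><Attribute NAME="balance"/></Class>'
    '<Class NAME="Person"/>'
    '</Model>'
)


def class_names(uxf_text):
    """Parse the text into a DOM tree and list class names in document order."""
    doc = parseString(uxf_text)
    return [el.getAttribute("NAME") for el in doc.getElementsByTagName("Class")]


def rename_class(uxf_text, old_name, new_name):
    """Change a class name through the DOM interface and serialize the tree."""
    doc = parseString(uxf_text)
    for el in doc.getElementsByTagName("Class"):
        if el.getAttribute("NAME") == old_name:
            el.setAttribute("NAME", new_name)
    return doc.documentElement.toxml()


if __name__ == "__main__":
    print(class_names(SAMPLE))
    print(rename_class(SAMPLE, "Person", "Customer"))
```

Since getElementsByTagName, getAttribute and setAttribute are defined by the DOM specification itself, the same calls could in principle be served by any DOM-compliant implementation, local or remote.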

6 Current Ongoing Projects and Future Work

UXF currently supports class, collaboration and statechart diagrams. We are developing DTDs for all the UML diagrams. We are also investigating a translator from and to the CDIF XML-based Transfer Format [30] and the possibility of integration with it. In addition, we plan to use XPointer and XLink to connect logically identical model elements in different diagrams, because UXF descriptions are not shared across diagrams, as described in Section 4.1.

Fig. 6. Layered architecture based on DOM and CORBA

As for UXF-aware tools, UXF converters from and to C++, Smalltalk, Python and CORBA IDL are currently being developed. Model information implemented in different programming languages can be made fully interoperable between different development tools by using multiple source code generation tools. Diagram editing/drawing tools are also planned.

As for the distributed model management system, we are investigating the use of an object-oriented database as persistent storage for UXF descriptions. This would turn the current transient CORBA servers into persistent ones, which can maintain the tree structures of parsed UXF elements even after a server is shut down. Another enhancement is to provide revision control for UXF using two XML tags.

We are working on further projects that enhance the interoperability of UML models. Our goal is to provide multiple levels of interoperability for UML. Three levels of interoperability [23] are achieved at present:

– UXF: UXF allows UML models to be interoperable between UML-compliant tools.
– DOM: DOM allows UXF descriptions (virtually XML documents) to be interoperable between XML-compliant tools through a uniform interface.
– CORBA: CORBA provides standard interfaces that allow DOM-compliant tools to interact with each other in a networked environment, so that UXF descriptions can be transferred between distributed DOM-compliant tools.

We will use UXF as a lightweight interchange format and as a testbed for enhancing UML with emerging standards and technologies.

7 Conclusion

This paper addressed how UML models can be made interoperable and proposed a solution based on a standards-based format called UXF. We also proposed an open environment for highly interoperable UML models that combines several emerging standards: XML, DOM and CORBA. With UXF, UML models can be distributed universally. Our work shows how UML-compliant tools can be used in the near future, and provides a blueprint for the evolution of interoperable UML. Information on our project can be obtained at [26].


References

1. Rational Software et al. UML Proposal Summary. OMG document number: ad/97-08-02, 1997.
2. Rational Software et al. UML Summary. OMG document number: ad/97-08-03, 1997.
3. Rational Software et al. UML Semantics. OMG document number: ad/97-08-04, 1997.
4. Rational Software et al. UML Notation Guide. OMG document number: ad/97-08-05, 1997.
5. Rational Software et al. UML Extension for Objectory Process for Software Engineering. OMG document number: ad/97-08-06, 1997.
6. Rational Software et al. Object Constraint Language Specification. OMG document number: ad/97-08-08, 1997.
7. Rational Software et al. UML Extension for Business Modeling. OMG document number: ad/97-08-07, 1997.
8. Rational Software et al. OA&D CORBAfacility. OMG document number: ad/97-08-09, 1997.
9. G. Booch. Object-Oriented Analysis and Design, 2nd edition. The Benjamin/Cummings Publishing, 1994.
10. J. Rumbaugh et al. Object-Oriented Modeling and Design. Prentice Hall, 1991.
11. I. Jacobson. Object-Oriented Software Engineering: A Use Case Driven Approach. Addison-Wesley, 1995.
12. UML Revision Task Force in Object Management Group, at http://uml.systemhouse.mci.com/.
13. J. Suzuki and Y. Yamamoto. Making UML models exchangeable over the internet with XML. In Proceedings of UML '98, pages 65–74, Mulhouse, France, June 1998.
14. T. Bray, J. Paoli and C. M. Sperberg-McQueen (eds.). Extensible Markup Language (XML) 1.0. W3C Recommendation 10-February-1998, http://www.w3.org/TR/1998/REC-xml-19980210, 1998.
15. J. Clark and S. Deach (eds.). Extensible Stylesheet Language (XSL). W3C Working Draft 18-August-1998, http://www.w3.org/TR/WD-xsl, 1998.
16. B. Bos, H. W. Lie, C. Lilley and I. Jacobs (eds.). Cascading Style Sheets, level 2: CSS2 Specification. W3C Recommendation 12-May-1998, http://www.w3.org/TR/REC-CSS2/, 1998.
17. E. Maler and S. DeRose (eds.). XML Pointer Language (XPointer). W3C Working Draft 03-March-1998, http://www.w3.org/TR/1998/WD-xptr-19980303, 1998.
18. E. Maler and S. DeRose (eds.). XML Linking Language (XLink). W3C Working Draft 03-March-1998, http://www.w3.org/TR/1998/WD-xlink-19980303, 1998.
19. A series of CDIF specifications are available at http://www.cdif.org/.
20. Rational Software. UML-Compliant Interchange Format. OMG document number: ad/97-01-13, 1997.
21. Object Management Group. Stream based Model Interchange Format (SMIF) specification RFP. OMG document number ad/97-12-03, http://www.omg.org/library/schedule/Stream-based Model Interchange.htm, 1998.
22. J. Suzuki and Y. Yamamoto. Managing the software design documents with XML. In Proceedings of the 16th Annual International Conference of Computer Documentation (ACM SIGDOC '98), pages 127–136, Quebec City, Canada, September 1998.
23. J. Suzuki and Y. Yamamoto. Toward the interoperable software design models: quartet of UML, XML, DOM and CORBA. In Proceedings of the 4th IEEE International Software Engineering Standards Symposium (ISESS '99), to appear, May 1999.
24. UML Xchange, at http://www.cam.org/ nrivard/uml/umlxchng.html.
25. UML to Text, at http://www.ccs.neu.edu/home/nickman/com1205/umltext.html.
26. UXF project Web site, at http://www.yy.cs.keio.ac.jp/~suzuki/project/uxf.
27. Object Management Group. Common Object Request Broker Architecture version 2.2. Available at http://www.omg.org/, 1998.
28. J. Suzuki and Y. Yamamoto. Document brokering with agents: Persona approach. In Proceedings of Workshop on Interactive System and Software (WISS) '98, to appear, December 1998.
29. V. Apparao et al. (eds.). Document Object Model (DOM) Level 1 Specification version 1.0. W3C Proposed Recommendation, 18 August 1998.
30. The CDIF XML-based Transfer Format, at http://www.cdif.org/overview/xmlsyntax.html.

Transformation Rules for UML Class Diagrams

Martin Gogolla & Mark Richters

University of Bremen, FB 3, Computer Science Department
Postfach 330440, D-28334 Bremen, Germany
Phone +49-421-218-3495, Fax +49-421-218-3054

fgogolla|[email protected]

Abstract. UML is a complex language with many modeling features. In particular, the modeling of static structures with class diagrams is supported by a rich set of description primitives. We show how to transform UML class diagrams involving cardinality constraints, qualifiers, association classes, aggregations, compositions, and generalizations into equivalent UML class diagrams employing only n-ary associations and OCL constraints. This provides a better understanding of UML features. By explaining more complex features in terms of basic ones, we suggest an easy way for users to gradually extend the set of UML elements they commonly apply in the modeling process.

Keywords: UML Class Diagram, Model Transformation, OCL, Constraint.

1 Introduction

UML [BJR97c,BJR97b,BJR97a] is a complex language with many modeling features. But up to now, no generally agreed precise semantic foundation has been developed. So naturally the question arises how to deal with the variety of language elements, for example, how to relate the different language elements. In this paper we study UML class diagrams and compare the language elements available there. The general idea is to explain advanced UML features by some more basic UML constructs. We recently took a similar approach [GPP98] for UML state diagrams concerning dynamic modeling aspects.

Although UML provides a rich set of language features, many projects are forced to restrict themselves to utilizing only a small subset of the language. This is partly due to a current lack of sufficiently reliable and efficient tools supporting the whole language and partly due to the fact that more time and effort is needed to learn and adopt additional features. Indeed, using only a small set of features can reduce the complexity of a design and facilitate communication. Our approach enables a smooth transition from utilizing a basic set of language features to more sophisticated ones by explaining them in terms of already known features. On the other hand, giving a definition in terms of simple features may help to identify certain repeating patterns in existing designs which may better be represented by more suitable special features, thus emphasizing a particular design decision. Thus, what we do in this paper is, in a sense, to reverse engineer simpler UML concepts from more sophisticated ones.

J. Bézivin and P.-A. Muller (Eds.): UML'98, LNCS 1618, pp. 92-106, 1999. © Springer-Verlag Berlin Heidelberg 1999

Our work is related to recent approaches handling formal aspects of UML and other object-oriented methods. Work has been done on the basis of well-established traditional approaches to specification like Z and VDM: [SF97,FBLPS97] focuses on the UML type system, a general integration of Z with object-orientation is discussed in [Ebe97], and in [Lan96,BLM97] an object calculus enhancing the expressibility of object-oriented notations has been proposed. Other approaches treat in detail the UML predecessor OMT [BCV96,WRC97], and in particular class diagrams in connection with Larch are discussed in [BC95]. [BHH+97] sketches a general scenario for most UML diagrams without going into technical details, and [Ove98] presents a general framework for relationships in UML. We cannot give a deeper discussion here due to space limitations.

The structure of the rest of this paper is as follows. Sections 2 to 6 point out how to translate n-ary associations with cardinality restrictions, qualifiers, association classes, aggregations and compositions, and generalizations, respectively. The UML subset we use employs general n-ary associations with additional constraints formulated in OCL. These sections all have the same structure. First, we motivate the handled concept by citing respective parts from the "official" UML material: the semantics document [BJR97c], the notation guide [BJR97b], and the OCL description [BJR97a]. Then, a general translation scheme is presented, which is explained afterwards by an example. The paper ends with some concluding remarks.

2 N-ary Associations with Cardinalities

2.1 Statements from UML Material

– An n-ary association is an association among three or more classes (a single class may appear more than once). Notation, page 61, line 4.
– Multiplicity for n-ary associations may be specified but is less obvious than binary multiplicity. The multiplicity of a role represents the potential number of instance tuples in the association when the other N-1 values are fixed. Notation, page 61, line 7.

2.2 General Translation

As indicated in the original UML statements, the translation of n-ary associations with cardinality restrictions is somewhat more involved than in the binary case. The right hand side of the diagram in Fig. 1 keeps the starting situation except for the cardinality requirement.^1 The formula fixes the pair of objects on the opposite side of the cardinality specification and requires that the size of the set of related objects is restricted by the given lower and upper bounds. The formula expresses the connection between three participating objects as R(a,b,c),^2 and A->forAll(...) is short for A.allInstances->forAll(...) (analogously for exists and select).

[Fig. 1. Transformation Rule for N-ary Association. Left: ternary association R between classes A, B, and C with role names ra, rb, rc and multiplicity l..h at the C end. Right: the same association without the multiplicity, plus a constraint.]

    A->forAll( a | B->forAll( b |
      C->select( c | R(a,b,c) )->size>=l and
      C->select( c | R(a,b,c) )->size<=h ) )

Above, we have shown only the translation for a ternary association. The formula can be generalized to the arbitrary case with n participating classes by introducing a universally quantified variable for each of the n - 1 classes on the opposite side of the cardinality specification.

Footnote 1: We use the notion of transformation here. Indeed, we could also have called this rule an equivalence rule in the following sense: Both sides of the rule have the same state space in that for each left hand side state there is a corresponding right hand side state and vice versa. But some rules to follow will introduce more operations on the right hand side.

Footnote 2: The current version of OCL does not support this form of syntax. There is no way in OCL to express the existence of links for n-ary associations.

2.3 Example

[Fig. 2. Example for Cardinality Notation in N-ary Association. Ternary association Examine between Student (role examinee), Subject (role subject), and Teacher (role examiner, multiplicity 0..3).]

The example in Fig. 2 shows a ternary association where students can be examined by teachers on certain subjects. The following cardinality restriction requires that a student can be examined on a single subject at most three times (each time by a different teacher).

    Student->forAll( st | Subject->forAll( su |
      Teacher->select( t | Examine(st,su,t) )->size>=0 and
      Teacher->select( t | Examine(st,su,t) )->size<=3 ) )
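The constraint above can be checked mechanically on a concrete object population. The following is a minimal Python sketch, assuming the ternary association is stored as a set of (a, b, c) triples; the function name check_ternary_multiplicity and the sample data are hypothetical and not part of the paper.

```python
from itertools import product

def check_ternary_multiplicity(A, B, C, links, low, high):
    """Mirror of the OCL rule: for every pair (a, b), the number of c
    objects with (a, b, c) in links must lie between low and high."""
    for a, b in product(A, B):
        related = {c for c in C if (a, b, c) in links}
        if not (low <= len(related) <= high):
            return False
    return True

# Hypothetical population for the Examine association of Fig. 2.
students = {"ann", "bob"}
subjects = {"math"}
teachers = {"t1", "t2", "t3", "t4"}
examine = {("ann", "math", "t1"), ("ann", "math", "t2")}
print(check_ternary_multiplicity(students, subjects, teachers, examine, 0, 3))
```

As in the OCL formula, the quantification runs over all pairs of instances, so a lower bound greater than zero also applies to pairs that have no link at all.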

3 Association Classes

3.1 Statements from UML Material

- An association class is an association that also has class properties (or a class that has association properties). Notation, page 59, line -12.
- An association class is an association that is also a class. It not only connects a set of classifiers [classes] but also defines a set of features that belong to the relationship itself and not any of the classifiers. Semantics, page 18, line 2.

3.2 General Translation

[Fig. 3. Transformation Rule for Association Class. Left: association between A and B (role names ra, rb) with association class C. Right: ternary association R between A, B, and C (role names ra, rb), plus a constraint.]

In Fig. 3 the translation of association classes into ternary associations is shown. The specified role names induce the operations ra: C -> Set(A) and rb: C -> Set(B). The formula expresses the idea that each C object "points" to a unique pair of A and B objects. In other words, R constitutes an injective function from C into the product of A and B.

    C->forAll( c | c.ra->size=1 and c.rb->size=1 and
      C->forAll( c' | ( c.ra=c'.ra and c.rb=c'.rb ) implies c=c' ) )

3.3 Example

[Fig. 4. Example for Association Class. Classes Bank, Person, and Account; ternary association Connection.]

In Fig. 4 a bank example is modeled with a ternary association called Connection between Bank, Account, and Person. The constraint demands that (1) an account is related to exactly one bank and to exactly one person, and (2) there cannot be a different account with the same links.

    Account->forAll( a | a.bank->size=1 and a.person->size=1 and
      Account->forAll( a' | ( a.bank=a'.bank and a.person=a'.person ) implies a=a' ) )
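The injectivity requirement can be illustrated on concrete data. The sketch below is only an illustration under stated assumptions: the role operations are represented as Python dictionaries from C objects to sets, and the name is_valid_association_class is hypothetical.

```python
def is_valid_association_class(objects, ra, rb):
    """Check the association class constraint.

    objects: the C objects; ra, rb: dicts mapping each C object to the
    set of A objects and the set of B objects it is linked to.
    """
    objs = list(objects)
    # (1) every C object points to exactly one A and exactly one B object
    for c in objs:
        if len(ra[c]) != 1 or len(rb[c]) != 1:
            return False
    # (2) no two different C objects point to the same (A, B) pair
    seen = {}
    for c in objs:
        key = (frozenset(ra[c]), frozenset(rb[c]))
        if key in seen and seen[key] != c:
            return False
        seen[key] = c
    return True
```

For the bank example, objects would be the accounts, ra the mapping from an account to its bank, and rb the mapping from an account to its person.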

4 Qualifier

4.1 Statements from UML Material

- A qualifier is an attribute or list of attributes whose values serve to partition the set of objects associated with an object across an association. The qualifiers are attributes of the association. Notation, page 58, line 4.
- The multiplicity attached to the target role denotes the possible cardinalities of the set of target objects selected by the pairing of a source object and a qualifier value. Notation, page 58, line 13.
- [A] qualifier [is] an association attribute or tuple of attributes whose values partition the set of objects related to an object across an association. Semantics, page 156, line 9.

4.2 General Translation

Figure 5 shows that a qualifier is translated into an association class (if the association class is already present, the qualifier translates into an additional attribute of the association class). The respective constraint is illustrated in Fig. 6: We fix an A object, look for all AC objects with a fixed q attribute value, and restrict the number of related B objects. The formula requires that the size of the set of B objects determined by a combination of an A object and a q attribute value is restricted by the given lower and upper bound. Regarding the operations used in the defining formula, the role name rb implies we have an operation rb: A -> Set(B). Due to the additional association class AC, we have further operations ra: AC -> A, rb: AC -> B, ac: A -> Set(AC), and ac: B -> Set(AC).

[Fig. 5. Transformation Rule for Qualifier. Left: association from A (qualifier q, role name ra) to B (role name rb, multiplicity l..h). Right: association between A and B (role names ra, rb) with association class AC (role name ac) carrying attribute q, plus a constraint.]

    A->forAll( a | a.ac->forAll( ac |
      a.rb->select( b | b.ac->exists( ac' | ac'.q=ac.q and ac'.ra=a ) )->size>=l and
      a.rb->select( b | b.ac->exists( ac' | ac'.q=ac.q and ac'.ra=a ) )->size<=h ) )

[Fig. 6. Explanation for Qualifier Formula. An object a:A is connected through association class objects ac1:AC, ..., ack:AC, each with q=v, to objects b1:B, ..., bk:B, where l <= k <= h.]

4.3 Example

[Fig. 7. Example for Qualifier. Left: association from Bank (qualifier account#) to Person (multiplicity 0..1). Right: association between Bank and Person with association class Account carrying attribute account#.]

The example in Fig. 7, taken from the OCL description [BJR97b], expresses that an account number at a given bank either uniquely determines a person or the account number is not connected to a person at all. The defining formula makes use of implicit role names and induced operations as follows: person: Bank -> Set(Person), bank: Account -> Bank, person: Account -> Person, account: Bank -> Set(Account), and account: Person -> Set(Account). The formula expresses the following: (1) For all banks b and all accounts a at this bank, we select from all persons p connected with the bank b those who possess an account a' at bank b having the same account number as account a. (2) The size of this set of selected persons is either 0 or 1.

    Bank->forAll( b | b.account->forAll( a |
      b.person->select( p | p.account->exists( a' |
        a'.account#=a.account# and a'.bank=b ) )->size>=0 and
      b.person->select( p | p.account->exists( a' |
        a'.account#=a.account# and a'.bank=b ) )->size<=1 ) )
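To see what the qualifier constraint demands of a concrete population, here is a small Python sketch. It assumes that each Account object is given as a (bank, person, number) triple; this representation and the name check_qualifier are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict

def check_qualifier(accounts, low, high):
    """accounts: iterable of (bank, person, number) triples, one per
    Account object. For every bank and every account number occurring at
    that bank, the number of persons reached must lie in [low, high]."""
    persons_by_key = defaultdict(set)
    for bank, person, number in accounts:
        persons_by_key[(bank, number)].add(person)
    return all(low <= len(ps) <= high for ps in persons_by_key.values())
```

Like the OCL formula, which quantifies over b.account, the sketch only inspects qualifier values that actually occur at a bank.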

5 Aggregation and Composition

5.1 Statements from UML Material

- [A filled diamond] signifies the strong form of aggregation known as composition. Notation, page 54, line 9.
- Composition is a form of aggregation with strong ownership and coincident lifetimes of part with the whole. The multiplicity of the aggregate may not exceed one (it is unshared). Notation, page 62, line -7.
- [AggregationKind="aggregate"] The part may be contained in other aggregates. Semantics, page 18, line -17.
- [AggregationKind="composite"] The part is strongly owned by the composite and may not be part of any other composite. Semantics, page 18, line -15.
- [Aggregation is] a special form of association that specifies a whole-part relationship between the aggregate (whole) and a component part. Semantics, page 148, line -13.
- Both kinds of aggregations define a transitive, antisymmetric relationship, i.e. the instances form a directed, non-cyclic graph. Composition instances form a strict tree (or rather a forest). Semantics, page 38, line 18.

5.2 General Translation

[Fig. 8. Transformation Rule for Aggregation and Composition. Two rules: an aggregation and a composition between A and P (role name rp) are each translated into a binary association between A and P (role name rp), plus a constraint.]

Aggregate-part (or whole-part) relationships touch three areas to be discussed: Existential dependency, sharing, and instance reflexivity. (1) Existential dependency comes in two facets: The part can existentially depend on the aggregate (a part can only exist inside an aggregate), or, the other way round, the aggregate can existentially depend on the part (an aggregate can only exist if its parts exist). (2) Forbidding sharing means that two aggregates cannot have a part in common. With respect to sharing, one can further distinguish the situation where the two aggregates are objects of the same class from the situation where it is possible that the two aggregates belong to different classes. (3) Forbidding instance reflexivity means that a part cannot be directly or indirectly part of itself.

As indicated in Fig. 8, both UML versions of aggregate-part relationships, namely aggregations and compositions, are translated into binary associations. General associations differ from aggregate-part relationships in that instance reflexivity is forbidden for aggregations and compositions. For compositions we require in addition the part to be existentially dependent on the aggregate and a strong form of forbidding sharing. These requirements are summarized in Fig. 9. Below, we define the facets of aggregate-part relationships independent from these UML requirements as OCL constraints.

    UML feature    Requirement in comparison to association
    -----------    ----------------------------------------
    Association    -
    Aggregation    Forbidding instance reflexivity
    Composition    Forbidding instance reflexivity
                   Existential dependency for the part
                   Strong form of forbidding sharing

Fig. 9. Requirements for Aggregation and Composition

Existential Dependency for the Part: The part is existentially dependent on the aggregate. In technical terms, this means that a P object can only exist if it is connected to an A object.

    P->forAll( p | A->exists( a | a.rp->includes(p) ) )

Existential Dependency for the Aggregate: The aggregate is existentially dependent on the part. In technical terms, this means that an A object can only exist if it is related to a P object.

    A->forAll( a | P->exists( p | a.rp->includes(p) ) )

Weak Form of Forbidding Sharing: A P object cannot be shared by two different A objects. In technical terms, if two aggregates comprise one common part, then the aggregates coincide.

    P->forAll( p | A->forAll( a, a' |
      ( a.rp->includes(p) and a'.rp->includes(p) ) implies a=a' ) )

Strong Form of Forbidding Sharing: A P object cannot be shared by two different objects belonging to (potentially) different classes A1 and A2. As shown in Fig. 10, this constraint has to be given for any two classes A1 and A2 being potential aggregates for class P.

    P->forAll( p | A1->forAll( a1 | A2->forAll( a2 |
      ( a1.rp1->includes(p) and a2.rp2->includes(p) ) implies a1=a2 ) ) )
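The facets defined above can be tried out on small object populations. The following Python sketch is an illustration only: each role operation rp is assumed to be a dictionary from aggregate objects to sets of part objects, the function names are hypothetical, and the strong form assumes that objects of different aggregate classes are always distinct.

```python
def part_depends_on_aggregate(parts, rp):
    """Existential dependency for the part: every part object occurs in
    the rp-set of at least one aggregate."""
    owned = set().union(*rp.values())
    return set(parts) <= owned

def no_sharing_weak(rp):
    """Weak form: no part occurs in the rp-sets of two different
    aggregates of the same class."""
    owner = {}
    for agg, ps in rp.items():
        for p in ps:
            if owner.setdefault(p, agg) != agg:
                return False
    return True

def no_sharing_strong(*role_maps):
    """Strong form: no part occurs in the rp-sets of two different
    aggregates, even across aggregate classes (one role map per class)."""
    owner = {}
    for i, rp in enumerate(role_maps):
        for agg, ps in rp.items():
            for p in ps:
                if owner.setdefault(p, (i, agg)) != (i, agg):
                    return False
    return True
```

Tagging each owner with the index of its role map is one simple way to keep aggregates of different classes apart; other encodings would work equally well.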

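[Fig. 10. Transformation Rule for Composition with Strong Form of Forbidding Sharing. Left: compositions from A1 and A2 to P (role names rp1, rp2). Right: binary associations between A1 and P (role names ra1, rp1) and between A2 and P (role names ra2, rp2), plus a constraint.]

[Fig. 11. Reflexive Aggregation. Class P with an aggregation to itself; role names aggregates (multiplicity *) and parts (multiplicity *).]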

Forbidding Instance Reflexivity for Aggregation: A P object cannot be part of itself. In Fig. 11 we have a reflexive aggregation (one class participates twice) where the role names define operations parts: P -> Set(P) and aggregates: P -> Set(P). Without any further restriction, a P object p can be a direct part of itself, i.e. p.parts->includes(p) is possible. The formula below disallows this. It goes even further by forbidding that p is an indirect part of itself by using the transitive closure of the operation parts, here called partsClosure: P -> Set(P).^3 The term p.partsClosure yields the parts of p, the parts of parts of p, and so on. The situation in Fig. 11 and the corresponding constraint can be generalized to the case where the reflexivity on the class diagram level is established with some intermediate steps.

    P->forAll( p | not( p.partsClosure->includes(p) ) )
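The transitive closure used in this constraint can be computed as a least fixpoint by a simple graph traversal. The Python sketch below assumes that the operation parts is given as a dictionary from objects to sets of direct parts; the names parts_closure and is_irreflexive are hypothetical.

```python
def parts_closure(parts, p):
    """All direct and indirect parts of p (least fixpoint of the parts
    operation). parts: dict mapping an object to its direct parts."""
    closure = set()
    frontier = list(parts.get(p, ()))
    while frontier:
        q = frontier.pop()
        if q not in closure:
            closure.add(q)
            frontier.extend(parts.get(q, ()))
    return closure

def is_irreflexive(parts):
    """True if no object is a direct or indirect part of itself."""
    objs = set(parts) | set().union(*parts.values())
    return all(o not in parts_closure(parts, o) for o in objs)
```

The traversal terminates on cyclic data as well, because every object is added to the closure at most once.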

5.3 Example

[Fig. 12. Example for Ordinary Association, Aggregation, and Composition. Classes Author, Abstract, Section, Paper, and Conference. Paper has Author (1..*), Abstract (1), and Section (1..*) as parts; Paper (*) is associated with Conference (0..1).]

The example in Fig. 12 shows one ordinary association, one aggregation, and two compositions. As an example for an ordinary association, we see that a paper can be connected to a conference (for instance, submitted to a conference). The components of a paper are described by aggregation and composition: A paper has as parts (1) one or more authors (author names), (2) exactly one abstract, and (3) one or more sections. The association between a paper and an author is classified as an aggregation, the ones connecting a paper with abstracts and sections as compositions. This means two different papers can share an author, but an abstract and the paper's sections exclusively belong to one paper, thus sharing is not possible. However, a paper has coincident lifetime with its strong components (Abstract and Section), but a paper can exist without any connection to a conference (and conferences can exist without being connected to a paper). The difference between modeling a weak or strong component is motivated here by the observation that the connection between a paper and an author seems to be weaker than the one between a paper and an abstract or a section. Instance reflexivity is not applicable in this example because there is no explicit reflexive association.

Footnote 3: We assume here that recursive equations in OCL have a unique least fixpoint solution calculated on the basis of set inclusion.

Existential Dependency for Composition: A paper component classified by composition is existentially dependent on the paper, i.e. an abstract or a section cannot exist without a corresponding paper.

    Abstract->forAll( a | Paper->exists( p | p.abstract->includes(a) ) )
    Section->forAll( s | Paper->exists( p | p.section->includes(s) ) )

Forbidding Sharing for Composition: An abstract and a section cannot be shared by two different papers. In this example the weak and the strong form of forbidding sharing coincide because neither Abstract nor Section participates as a part in other compositions.

    Abstract->forAll( a | Paper->forAll( p, p' |
      ( p.abstract->includes(a) and p'.abstract->includes(a) ) implies p=p' ) )
    Section->forAll( s | Paper->forAll( p, p' |
      ( p.section->includes(s) and p'.section->includes(s) ) implies p=p' ) )

6 Generalization

6.1 Statements from UML Material

- Generalization is the taxonomic relationship between a more general element and a more specific element that is fully consistent with the first element and that adds information. Notation, page 67, line 4; Semantics, page 24, line 10; Semantics, page 152, line -5.
- An instance of the more specific element may be used where the more general element is allowed. Semantics, page 152, line 3.
- The following constraints are predefined. Overlapping: A descendent may be descended from more than one of the subclasses. Disjoint: A descendent may not be descended from more than one of the subclasses. Complete: All subclasses have been specified (...) no additional subclasses are expected. Incomplete: Some subclasses have been specified but the list is known to be incomplete. Notation, page 68, line 7.

6.2 General Translation

[Fig. 13. Transformation Rule for Generalization. Left: class S specializes class G. Right: binary association between S (multiplicity 0..1) and G (role name rg, multiplicity 1..1), plus a constraint.]

As shown in Fig. 13, UML generalizations are transformed to special binary associations. The cardinalities make sure that each specialized object is related with exactly one general object, although not necessarily every general object has a link to a special object. In other words, we have a total mapping from special to general objects. An additional constraint assures the injectivity of the mapping, or, in other words, a special object is associated with a unique general object (no two special objects are associated with the same general object). The defining formula employs rg as the role name with an induced operation rg: S -> G.

    S->forAll( s, s' | s<>s' implies s.rg<>s'.rg )

This simple translation for generalizations even handles type substitutability in the context of subtype polymorphism: Wherever an object of the general class G is expected (for example as an argument of an operation), we can substitute a specialized object of class S after applying the "type cast" rg. For example, if op: boolean is an operation in class G, the expression s.rg.op would be allowed for the S object s.

Generalization relationships can further be classified as disjoint or complete. Due to space limitations we have to refrain from presenting the details, but simply state that equivalent constraints for these cases can be formulated.

Discussion: This translation of generalization into associations seems to be controversial, at least from our referee's point of view. One might argue that Fig. 13 throws away a useful cognitive tool. And one may doubt that the transformation of generalization into an aggregation is appropriate. However, we had our reasons for making this choice:

1. We do not want to throw away generalization as a concept but we want to indicate how it could be implemented on a lower level. The translation shown is much in the spirit of translating higher level, i.e. semantic, database schemas into lower level data model schemas, for example relational data model schemas.

2. We emphasize that the translation shown does not map generalization into aggregation but into general association.

3. We are not alone in doing such a translation, because one of the standard books on Java [Fla97] mentions a similar translation by taking the viewpoint of "inheritance by delegation":

    We want to implement GraphicCircle so that it can make use of the code we've already written for Circle. One way to do that is the following:

        public class GraphicCircle {
            public Circle c;
            public double area() { return c.area(); }
            public double circumference() { return c.circumference(); }
            // new variables and methods
            public Color outline;
            public void draw(...) {...}
        }

    This approach would work, but it is not particularly elegant.

4. We also emphasize that in OCL the expression rg(s) with s being an arbitrary variable of class S can formally be used anywhere where an expression of class G is expected. However, this requires that it is always statically known whether we handle an S or a G object.

6.3 Example

The example in Fig. 14 shows a specialization of vehicles to cars. The constraint requires that two given distinct cars are mapped to distinct vehicles by the "type cast" vehicle (the operation induced by the association).

    Car->forAll( c, c' | c<>c' implies c.vehicle<>c'.vehicle )
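The injectivity constraint is easy to test on a concrete population. In the Python sketch below, the induced operation is assumed to be a dictionary from special objects to their general objects, which makes the mapping total by construction; the name is_injective_cast is hypothetical.

```python
def is_injective_cast(rg):
    """rg: dict mapping each special object to its general object.
    Returns True if no two special objects share a general object."""
    generals = list(rg.values())
    return len(generals) == len(set(generals))
```

For the example, rg would map each Car object to the Vehicle object obtained through the vehicle operation.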

[Fig. 14. Example for Generalization. Left: class Car specializes class Vehicle. Right: binary association between Car (multiplicity 0..1) and Vehicle (multiplicity 1..1).]

7 Conclusion

We have derived guiding rules that help UML designers cope with the variety of UML diagram features. Our approach can be seen as a way to give semantics to an advanced UML language layer. What remains to be done is to give semantics to the "low-level" UML layer by stating a translation into an abstract model. Such a model together with a semantics for the UML constraint language OCL has already been worked out [RG98]. Our results suggest that some of the UML class diagram concepts do not really increase the modeling power but merely serve as shortcuts for existing techniques. We have translated all discussed UML features into n-ary associations with additional constraints. Due to the modeling power of associations and OCL this seems appropriate, especially because the association concept is a very general one which is able to model many situations. But one can go even further and transform all n-ary associations into an additional class (plus the classes given before), resolving the n-ary association into n binary associations. This is subject to further consideration.

Acknowledgments

The comments of the referees have helped to improve this paper.

References

[BC95] R. Bourdeau and B. Cheng. A Formal Semantics for Object Model Diagrams. IEEE Transactions on Software Engineering, 21(10):799-821, 1995.
[BCV96] E. Bertino, D. Castelli, and F. Vitale. A Formal Representation for State Diagrams in the OMT Methodology. In K.G. Jeffery, J. Kral, and M. Bartosek, editors, Proc. Seminar Theory and Practice of Informatics (SOFSEM'96), pages 327-341. Springer, Berlin, LNCS 1175, 1996.
[BHH+97] Ruth Breu, Ursula Hinkel, Christoph Hofmann, Cornel Klein, Barbara Paech, Bernhard Rumpe, and Veronika Thurner. Towards a Formalization of the Unified Modeling Language. In Mehmet Aksit and Satoshi Matsuoka, editors, Proc. 11th European Conf. Object-Oriented Programming (ECOOP'97), pages 344-366. Springer, Berlin, LNCS 1241, 1997.
[BJR97a] Grady Booch, Ivar Jacobson, and James Rumbaugh, editors. Object Constraint Language Specification (Version 1.1). Rational Corporation, Santa Clara, 1997. http://www.rational.com.
[BJR97b] Grady Booch, Ivar Jacobson, and James Rumbaugh, editors. UML Notation Guide (Version 1.1). Rational Corporation, Santa Clara, 1997. http://www.rational.com.
[BJR97c] Grady Booch, Ivar Jacobson, and James Rumbaugh, editors. UML Semantics (Version 1.1). Rational Corporation, Santa Clara, 1997. http://www.rational.com.
[BLM97] J.C. Bicarregui, Kevin Lano, and Tom S.E. Maibaum. Objects, Associations and Subsystems: A Hierarchical Approach to Encapsulation. In Mehmet Aksit and Satoshi Matsuoka, editors, Proc. 11th European Conf. Object-Oriented Programming (ECOOP'97), pages 324-343. Springer, Berlin, LNCS 1241, 1997.
[Ebe97] Jürgen Ebert. Integration of Z-Based Semantics of OO-Notations. In Haim Kilov and Bernhard Rumpe, editors, Proc. ECOOP'97 Workshop on Precise Semantics for Object-Oriented Modeling Techniques. Technische Universität München, Informatik-Bericht TUM-I9725, 1997.
[FBLPS97] R.B. France, J.M. Bruel, M. Larrondo-Petrie, and M. Shroff. Exploring the Semantics of UML Type Structures with Z. In H. Bowman and J. Derrick, editors, Proc. 2nd IFIP Conf. Formal Methods for Open Object-Based Distributed Systems (FMOODS'97), pages 247-260. Chapman and Hall, London, 1997.
[Fla97] D. Flanagan. Java in a Nutshell. O'Reilly, Cambridge, 1997.
[GPP98] Martin Gogolla and Francesco Parisi-Presicce. State Diagrams in UML: A Formal Semantics using Graph Transformations. In Bernhard Rumpe, Manfred Broy, Derek Coleman, and Tom S.E. Maibaum, editors, Proc. ICSE'98 Workshop on Precise Semantics of Modeling Techniques (PSMT'98), 1998. http://www4.informatik.tu-muenchen.de/~rumpe/icse98-ws.
[Lan96] Kevin Lano. Enhancing Object-Oriented Methods with Formal Notations. Theory and Practice of Object Systems, 2(4):247-268, 1996.
[Ove98] Gunnar Övergaard. A Formal Approach to Relationships in the Unified Modeling Language. In Bernhard Rumpe, Manfred Broy, Derek Coleman, and Tom S.E. Maibaum, editors, Proc. ICSE'98 Workshop on Precise Semantics of Modeling Techniques (PSMT'98), 1998. http://www4.informatik.tu-muenchen.de/~rumpe/icse98-ws.
[RG98] Mark Richters and Martin Gogolla. On Formalizing the UML Object Constraint Language OCL. In Tok-Wang Ling, editor, Proc. Int. Conf. Entity-Relationship Approach (ER'98), LNCS, Springer, 1998.
[SF97] M. Shroff and R.B. France. Towards a Formalization of UML Class Structures in Z. In Proc. 21st Annual Int. Computer Software and Applications Conference (COMPSAC'97), pages 646-651. IEEE, 1997.
[WRC97] Enoch Y. Wang, Heather A. Richter, and Betty H. C. Cheng. Formalizing and Integrating the Dynamic Model within OMT. In Proc. 19th Int. Conf. on Software Engineering (ICSE'97), pages 45-55. ACM Press, 1997.

Semantics and Transformations for UML Models

K. Lano and J. Bicarregui
Dept. of Computing, Imperial College
180 Queens Gate, London SW7 2BZ
[email protected]

Abstract. This paper presents a semantic framework for a large part of UML, and gives a set of transformations on UML models based on this semantics. These transformations can be used to enhance, rationalise, refine or abstract UML models.

1 Introduction

A semantically-based transformation calculus for UML [19] and related OO notations is useful in a number of ways:

1. it provides a set of correct transformations which are equivalences or enhancements of models, and can be used to support forward or reverse engineering [12];
2. the transformations clarify the meaning of the modelling notations, without the developer needing to manipulate the mathematical formalisms underpinning the transformations.

A more rigorous development approach is essential for applications in critical areas, such as medical database and robotic systems [16], defence [17] and chemical process control [13]. Although our semantic model is not a complete semantics for UML, it provides a sufficient basis to justify transformations which are expected to be model enhancements or refinements. It is a step towards a full semantics.

The transformational approach is consistent with the presentation of UML in [19] (which includes, for example, equivalences on notations for composition aggregation), and lends itself to CASE tool support. The transformations could themselves be expressed in UML as refinements (typically with subdependencies) in which the new model is the client and the old model the supplier.

In this paper we present extracts from the proposed semantic framework and show how it can be used to justify some example transformations on the main modelling notations of UML.

2 Basic Semantic Elements

A mathematical semantic representation of UML models can be given in terms of theories in a suitable logic, as in the semantics presented for Syntropy in [3] and VDM++ in [15]. In order to reason about real-time specifications we will use the more general version of this formal framework, termed Real-time Action Logic (RAL), presented in [15]. A RAL theory has the form:

    theory Name
    types       local type symbols
    attributes  time-varying data, representing instance or class variables
    actions     actions which may affect the data, such as operations, statechart transitions and methods
    axioms      logical properties and constraints between the theory elements.

The logical notation which can be used in theories is first order predicate logic using Z notations such as $\mathbb{F}(T)$, the set of finite subsets of $T$, together with temporal operators $\bigcirc$ (next), $\Box$ (henceforth), $\Diamond$ (eventually). There are also terms $\leftarrow(\alpha, i)$, $\rightarrow(\alpha, i)$, $\uparrow(\alpha, i)$ and $\downarrow(\alpha, i)$ denoting the request send, request arrival, initiation and termination times respectively of an action invocation $(\alpha, i)$ for action $\alpha$ and $i : \mathbb{N}_1$. Theories can be used to represent classes, instances, associations and general submodels of a UML model.

J. Bézivin and P.-A. Muller (Eds.): «UML»'98, LNCS 1618, pp. 107-119, 1999. © Springer-Verlag Berlin Heidelberg 1999

2.1 Example Semantic Representation

An example UML class diagram is shown in Figure 1.

[Fig. 1. UML Class Diagram. Classes Person and Company. Association between Person (role employee, multiplicity *) and Company (role employer, multiplicity 0..1). Association from Person to itself with roles worker (multiplicity *) and boss (multiplicity 0..1). Constraint: {Person.employer = Person.boss.employer}.]

The corresponding theory is:

    theory Employment
    types Person, Company
    attributes
      $\overline{Person} : \mathbb{F}(Person)$
      $\overline{Company} : \mathbb{F}(Company)$
      $employee\_employer : Person \leftrightarrow Company$
      $employee : Company \rightarrow \mathbb{F}(Person)$
      $employer : Person \rightarrow \mathbb{F}(Company)$
      $worker\_boss : Person \leftrightarrow Person$
      $worker : Person \rightarrow \mathbb{F}(Person)$
      $boss : Person \rightarrow \mathbb{F}(Person)$

$\overline{Person}$ represents the finite set of existing objects of class Person, the extension ext(Person) of Person in the terms of [18]. Instance variables of class $C$ are modelled as attributes of a function type $C \rightarrow T$. Associations between classes are modelled as relations between their types.

    actions

Standard predefined actions to modify classes and associations:

      $create_{Person}(p : Person)$  $\{\overline{Person}\}$
      $kill_{Person}(p : Person)$  $\{\overline{Person}\}$
      $create_{Company}(c : Company)$  $\{\overline{Company}\}$
      $kill_{Company}(c : Company)$  $\{\overline{Company}\}$
      $add\_link_{employee\_employer}(p : Person, c : Company)$  $\{employee\_employer, employer, employee\}$
      $delete\_link_{employee\_employer}(p : Person, c : Company)$  $\{employee\_employer, employer, employee\}$
      $add\_link_{worker\_boss}(p : Person, q : Person)$  $\{worker\_boss, worker, boss\}$
      $delete\_link_{worker\_boss}(p : Person, q : Person)$  $\{worker\_boss, worker, boss\}$

We present the write frame of each action as a set after the action declaration. This is the set of attributes which it may change. Query operations in the sense of UML are therefore represented by actions with an empty write frame.

    axioms

The association links only existing persons and companies:

      $employee\_employer \in \overline{Person} \leftrightarrow \overline{Company}$

The two directions of the association are derived from the set of pairs in its relation:

      $\forall p : Person;\ c : Company \cdot$
        $c \in employer(p) \Leftrightarrow (p, c) \in employee\_employer\ \wedge$
        $p \in employee(c) \Leftrightarrow (p, c) \in employee\_employer$

Cardinality constraints:

      $\forall p : Person \cdot card(employer(p)) \leq 1$
      $\forall p : Person \cdot card(boss(p)) \leq 1$

There are similar axioms for $worker\_boss$. The constraint of the model is expressed by the formula:

      $\forall p : Person \cdot employer(p) = employer(\!| boss(p) |\!)$

$f(\!| X |\!)$ denotes the set of values $f(x)$ for $x \in X$. OCL notation could be used for the axioms, but would be more prolix in general.

Theories can be linked by theory morphisms [9,7], which enable the theory of a complete model to be assembled from theories of submodels and eventually from the theories of specific elements, classes, states, associations, etc.
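The axioms that derive the two directions of an association from its set of pairs, together with the cardinality axiom, can be illustrated on concrete data. This Python sketch assumes the relation is given as a set of (person, company) pairs; the function names derive_directions and cardinality_ok are hypothetical and not part of the RAL framework.

```python
from collections import defaultdict

def derive_directions(employee_employer):
    """Derive employer(p) and employee(c) from the set of (p, c) pairs,
    following the axiom that both directions are determined by the pairs."""
    employer = defaultdict(set)
    employee = defaultdict(set)
    for p, c in employee_employer:
        employer[p].add(c)
        employee[c].add(p)
    return employer, employee

def cardinality_ok(persons, employer):
    """Cardinality axiom: every person has at most one employer."""
    return all(len(employer.get(p, ())) <= 1 for p in persons)
```

A person without any pair in the relation simply has an empty employer set, which is consistent with the 0..1 multiplicity.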

Generalisation of class $C$ by class $D$ in UML is directly represented by the theory $T(D)$ of $D$ being the source of a signature morphism into $T(C)$ which is the identity (each symbol of $T(D)$ is interpreted by itself in $T(C)$). Dashed generalisation of $C$ by $D$ is directly represented by an interface morphism (a signature morphism which only maps action symbols of the first theory to action symbols of the second theory) from $T(D)$ to $T(C)$ which is the identity on the action symbols of $D$ and their signature types.

A theory morphism is a signature morphism $s$ from $T1$ to $T2$ which preserves all the axioms of the source theory. That is, $T2$ proves $s(P)$ for each axiom $P$ of $T1$. The simplest form of theory morphism is the inclusion of one theory (all its symbols and axioms) in another. This is denoted by writing includes $T1$ after the header of theory $T2$. Using this we can re-express theory Employment above as:

    theory Employment
    includes WorkerBoss, EmployeeEmployer
    axioms
      $\forall p : Person \cdot employer(p) = employer(\!| boss(p) |\!)$

where WorkerBoss, etc. are theories of the associations which themselves include the theories of Person and Company (Figure 1).

3 Static Structure Diagrams

A UML class $C$ is semantically represented by a theory $T(C)$ of the form:

    theory T(C)
    types C
    attributes
      $\overline{C} : \mathbb{F}(C)$
      $self : C \rightarrow C$
      $att_1 : C \rightarrow T_1$
      ...
    actions
      $create_C(c : C)$  $\{\overline{C}\}$
      $kill_C(c : C)$  $\{\overline{C}\}$
      $op_1(c : C, x : X_1) : Y_1$
      ...
    axioms
      $\forall c : C \cdot self(c) = c \wedge [create_C(c)](c \in \overline{C}) \wedge [kill_C(c)](c \notin \overline{C})$

The notation $[action]P$ denotes that every execution of action terminates with the predicate $P$ being true. Thus $create_C(c)$ always adds $c$ to the set of existing $C$ objects, and $kill_C(c)$ removes it.

Each instance attribute $att_i : T_i$ of $C$ gains an additional parameter of type $C$ in the class theory $T(C)$ and similarly for operations. The class theory can be generated from a theory of a typical $C$ instance by means of an A-morphism [3]. Class attributes and actions do not gain the additional $C$ parameter as they are independent of any particular instance. We denote $att(a)$ for attribute $att$ of instance $a$ by the standard OO notation $a.att$, and similarly denote actions $act(a, x)$ by $a!act(x)$.

We will refer to the conjunction of all the properties of the attributes of $C$ as the invariant $Inv_C$ of the class. We include the axiom $\forall a : \overline{C} \cdot a.Inv_C$ in $T(C)$ to express this, where $a.P$ is $P$ with $a$ added as the first parameter of all instance attributes and actions of $C$ in $P$.

Similarly each association $lr$ can be interpreted by a theory which contains an attribute $lr$ representing the current extent of the association (the set of pairs in it) and actions add_link and delete_link to add and remove pairs (links) from this. Axioms define the cardinality of the association ends and other properties of the association.

If $D$ inherits from $C$ then $T(D)$ is constructed by including $T(C)$, adding symbols and axioms for the new features of $D$, and adjoining the axioms $D \subseteq C \wedge \overline{D} \subseteq \overline{C}$ which ensure that attributes and operations of $C$ can be applied to instances of $D$.

If class $C$ has subclasses $S_1, \ldots, S_n$, we can assert that objects cannot migrate from one subclass to another by axioms:

      $\forall x : \overline{S_i} \cdot x \notin \overline{S_j} \Rightarrow \Box(x \notin \overline{S_j})$

for $j \neq i$. However, if $S_i$ and $S_j$ arise as states in a statechart, then such subtype migration is permitted.

That two subclasses $S_1$ and $S_2$ are disjoint is expressed by axioms $\overline{S_1} \cap \overline{S_2} = \varnothing$ in a theory which contains both class theories. If a class $C$ is abstract with a complete set of subclasses $S_1, \ldots, S_n$ then we can assert that

      $\overline{C} = \overline{S_1} \cup \ldots \cup \overline{S_n}$

in a theory containing all of these class theories. A complete set of subclasses for $C$ prevents the application of any transformation to introduce new direct subclasses of $C$. Likewise, if a class is asserted to be a leaf, then no transformation can introduce subclasses of this class, and no superclasses can be introduced for a root class.

3.1 Rationalising Inheritance Hierarchies

If two classes $A$ and $B$ are both subclasses of another class $D$, then it is valid to introduce a subclass $C$ of $D$ which acts as an abstract superclass of both $A$ and $B$ (Figure 2). This transformation is valid because $\overline{A} \subseteq \overline{D} \wedge \overline{B} \subseteq \overline{D}$ imply that $\overline{C} = \overline{A} \cup \overline{B}$ is a subset of $\overline{D}$.

Fig. 2. Minimal Superclass Transformation

3.2 Rationalising Disjoint Associations

The following transformation (Figure 3) can be applied to object models to eliminate some cases of optional association ends.

Fig. 3. Rationalising Disjoint Associations (constraint shown on the transformed model: { r = r1 union r2 })

This transformation is logically valid as r1 and r2 are disjoint and function-like by definition of the "or" constraint [19]:

$\forall a \in A \cdot (\exists b \in B \cdot (a, b) \in r_1) \vee (\exists c \in C \cdot (a, c) \in r_2)$
$\wedge\ \forall a \in A \cdot \neg((\exists b \in B \cdot (a, b) \in r_1) \wedge (\exists c \in C \cdot (a, c) \in r_2))$

and B and C are disjoint. Thus the abstract generalisation class BorC which has $BorC = B \cup C$ can be constructed, and $r = r_1 \cup r_2$ has the specified cardinality at the BorC end. A similar transformation works for any cardinality combination at the A end: the resulting association has as its cardinality the generalisation of the separate r1 and r2 cardinalities at this end.
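As a small check of this argument, the sketch below represents the associations as sets of pairs and tests the "or" constraint and the resulting cardinality. The instance names and the helper functions are assumptions made for illustration; they are not part of the formal semantics.

```python
# Hypothetical instances; r1 and r2 are sets of (source, target) pairs.
A = {"a1", "a2", "a3"}
B = {"b1", "b2"}
C = {"c1"}
r1 = {("a1", "b1"), ("a2", "b2")}   # A to B, 0..1 at the B end
r2 = {("a3", "c1")}                 # A to C, 0..1 at the C end

def targets(rel, a):
    """Objects linked to a by the relation rel."""
    return {y for (x, y) in rel if x == a}

def satisfies_or_constraint(sources, rel1, rel2):
    """Each source is linked by exactly one of rel1, rel2, to exactly one object."""
    return all(
        sorted([len(targets(rel1, a)), len(targets(rel2, a))]) == [0, 1]
        for a in sources
    )

assert B.isdisjoint(C)
assert satisfies_or_constraint(A, r1, r2)

# The transformation: generalise B and C, and merge the two associations.
BorC = B | C
r = r1 | r2

# r has cardinality exactly 1 at the BorC end.
assert all(len(targets(r, a)) == 1 for a in A)
assert all(y in BorC for (_, y) in r)
```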

3.3 Refining Class Invariants

Logically strengthening a class invariant is a refinement transformation. If class C has invariant $Inv_C$, then adding extra constraints or restating $Inv_C$ in a logically stronger manner to produce a predicate $Inv'_C$ yields a refined class. The theory interpretation is the identity.

3.4 Transitivity of Composition Aggregation

One proposed meaning [5] of composition aggregation of B instances into A via an association ab is that the B instances are frozen in their relationship with a particular A instance: the inverse image $ab^{-1}(|\{b\}|)$ is constant for each b : B for the duration of its membership in ab. If ab is a one-many association this means that b cannot move from one container to another ($*$):

$\forall a, a' : A;\ b : B \cdot (a, b) \in ab \wedge \Diamond((a', b) \in ab) \Rightarrow a = a'$

$\Diamond P$ denotes that P holds at the current or some future time.

The relational composition of two one-many composition aggregations is then itself a composition aggregation because:

$(a, c) \in ab ; bc \Rightarrow \exists b : B \cdot (a, b) \in ab \wedge (b, c) \in bc$

((1) $\Rightarrow$ (2)) and

$\Diamond((a', c) \in ab ; bc) \Rightarrow \Diamond(\exists b' : B \cdot (a', b') \in ab \wedge (b', c) \in bc)$

((3) $\Rightarrow$ (4)). But then (1) $\wedge$ (3) implies (2) $\wedge$ (4), so by ($*$) applied to b, c we have b = b'. Therefore, applying ($*$) to a, b we have a = a' as required.


3.5 Deduction Transformations

If we know that a diagram $M_1$ ensures that the properties of an enhanced diagram $M_2$ also hold, then we say that $M_2$ can be deduced from $M_1$: $M_1 \vdash M_2$. This is just the same as asserting that there is a refinement transformation from $M_2$ to $M_1$. A particular example is that the composition of 'selector' associations remains a selector of the composed association. In other words, if we know that $r_1 \subseteq R_1$ and $r_2 \subseteq R_2$, then also the composition $r_1 ; r_2$ is a subset of $R_1 ; R_2$.
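The selector property is the monotonicity of relational composition, which can be illustrated on finite relations. In this sketch the relations and the function name compose are invented for the example.

```python
def compose(r, s):
    """Relational composition: pairs (a, c) with (a, b) in r and (b, c) in s."""
    return {(a, c) for (a, b1) in r for (b2, c) in s if b1 == b2}

# Hypothetical associations R1, R2 and selectors r1, r2 of them.
R1 = {(1, "x"), (1, "y"), (2, "y")}
R2 = {("x", True), ("y", False)}
r1 = {(1, "x")}
r2 = {("x", True)}

assert r1 <= R1 and r2 <= R2
# The composed selector is a selector of the composed association.
assert compose(r1, r2) <= compose(R1, R2)
```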

4 Sequence and Collaboration Diagrams

A sequence diagram defines constraints on the timing of method requests, activations and terminations. For example, a timing mark a at the source point of a message m sent from object s to object t represents the time $a = \leftarrow(t!m, i)$ of some request send of m. If this arrow is horizontal this is also the time $a = \rightarrow(t!m, i)$ of arrival of this request at t. A timing mark at the destination of a signal arrow represents a request arrival time $\rightarrow(t!m, i)$, or the termination time $\downarrow(t!m, i)$ of an invocation in the case that the arrow represents the return of a procedural call t!m (ie, the arrow is dashed with source t). For example, Figure 4 translates to the following assertions, where each message execution lifeline is interpreted by a particular message instance:

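$\forall i : \mathbb{N}_1 \cdot \exists j, k, l, l' : \mathbb{N}_1 \cdot$
$\rightarrow(Op, i) = \uparrow(create_{C1}(ob1), l) \le \downarrow(create_{C1}(ob1), l)$
$\le \leftarrow(ob3!bar(x), j) = \rightarrow(ob3!bar(x), j)$
$\le \leftarrow(ob4!do(w), k) = \rightarrow(ob4!do(w), k)$
$\le \downarrow(ob4!do(w), k) \le \downarrow(ob3!bar(x), j)$
$\le \downarrow(kill_{C1}(ob1), l') = \downarrow(Op, i)$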

These assertions can then be checked for consistency against detailed implementation level statecharts. Replacing such constraints by logically stronger formulae (eg, reducing the range of possible time delays between a request arrival and a result signal) is therefore a refining transformation. It is also valid to introduce new objects and calls on these provided that the existing model elements are preserved.

The structural elements of a collaboration diagram simply represent particular instances of classes and their links, and so may be expressed in suitable extensions of class or submodel theories. The interaction aspects can be modelled using composite actions [15] such as ; (sequential composition), := (assignment), || (concurrent composition), for all (iteration over a set), if (conditional execution), $\sqcap$ (binary choice of actions), create and kill, etc.

Fig. 4. Example Sequence Diagram with Annotations (objects ob1:C1, ob3:C3 and ob4:C4; messages Op(), bar(x) and do(w))

5 Statecharts

A statechart specification of the behaviour of instances of a class C can be formalised as an extension of the class theory T(C) of C, as follows. We use the relationship "$\alpha$ calls $\beta$", written $\alpha \supset \beta$, for action symbols $\alpha$ and $\beta$ to denote that every occurrence of $\alpha$ coincides with an occurrence of $\beta$:

$\forall i : \mathbb{N}_1 \cdot \exists j : \mathbb{N}_1 \cdot \uparrow(\alpha, i) = \uparrow(\beta, j) \wedge \downarrow(\alpha, i) = \downarrow(\beta, j)$

Then the extended theory of C has the additional axioms:

1. Each state S is represented in the same manner as a subclass of C, and in general, nesting of state $S_1$ in state $S_2$ is expressed by axioms $S_1 \subseteq S_2$ and $\overline{S_1} \subseteq \overline{S_2}$ as for class generalisation.

2. Each transition in the statechart and each event for which the statechart defines a response yields a distinct action symbol. The occurrence of an event e is equivalent to the occurrence of one of its transitions $t_i$ (it is the abstract generalisation of the transition actions):

   $t_1 \supset e \wedge \ldots \wedge t_n \supset e$

3. The axiom for the effect of a transition t from state $S_1$ to state $S_2$ with label e(x)[G]/Post ^ Act, where Post is some postcondition constraint on the resulting state, is

   $\forall a : C \cdot a.G \wedge a \in S_1 \Rightarrow [a!t(x)](a.Post \wedge a \in S_2)$

4. The transition only occurs if the trigger event occurs whilst the object is in the correct state:

   $\forall a : C \cdot a \in S_1 \wedge a.G \Rightarrow (a!e(x) \supset a!t(x))$

5. The generated actions must occur at some future time (after t has occurred):

   $a!t(x) \Rightarrow \Diamond\, a.Act$

Transitions g with labels of the form after(t) from source state S have an alternative axiom 4 defining their triggering, which asserts that they are triggered t time units after the most recent entry time to state S [14]. Axiom 5 adopts the semantics given in Syntropy [5] for generated actions: the new state must be established before generated actions can be executed. In contrast to the Statemate semantics of statecharts [10], these actions can be executed in steps other than the immediately following step. This appears to be the correct interpretation of asynchronously generated signals in UML [19]. Synchronously invoked actions have the alternative axiom

$a!t(x) \supset a.Act$

If state S is a concurrent composition of substates, we require that each occurrence of an event $\alpha$ results in an occurrence of one transition $t_i$ for $\alpha$ in each distinct concurrent sub-region of S which has a transition for this event. For example, if there are transitions $t_2$ and $t_3$ for $\alpha$ in region 1, and transition $t_1$ in region 2 of a state S, then we have the axioms:

$a \in S \Rightarrow (a!t_1 \supset a!t_2 \sqcap a!t_3)$
$a \in S \Rightarrow (a!t_2 \supset a!t_1)$
$a \in S \Rightarrow (a!t_3 \supset a!t_1)$

Thus changing the isConcurrent attribute of a composite state from false to true represents a theory extension and therefore a refinement. Some typical transformations on statecharts are then as follows:

5.1 Source and Target Splitting

These transformations [5] can be shown to be valid for UML given the above semantics. Similarly, adding a nested state machine to a simple state S is generally a refinement provided that existing transitions from S are not overridden by transitions from substates of S which go to new destination states partly or fully disjoint from the original destinations.

5.2 Abstracting Events

In UML signal events can be arranged in a generalisation hierarchy. For example, an event g(x) can be represented as a generalisation of events h(x, y) and f(x, z) on a class diagram (x, z are the attributes of event f, etc). The semantic meaning is that every occurrence of a specialised event is also an occurrence of every event it generalises (1):

$h(x, y) \supset g(x)$
$f(x, z) \supset g(x)$

This means that transitions for h and f can be replaced by transitions for g, if g is an abstract generalisation of these two actions, since each axiom

$a \in S \wedge a.G \Rightarrow (a!g(x) \supset a!t(x))$

for a transition t of g yields the corresponding axiom for h or f. This transformation is useful to reduce the number of events which a control system must respond to, eg, to replace separate events "switch on" and "switch off" by "toggle" [2].
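The "toggle" example can be sketched in code by letting the specialised events inherit from the generalised one, so that a transition written for the general event also fires for each specialisation. The class names, the transition table and the step function below are assumptions made for illustration only.

```python
class Toggle:                 # generalised event g
    pass

class SwitchOn(Toggle):       # specialised event h
    pass

class SwitchOff(Toggle):      # specialised event f
    pass

# Transition table after the abstraction: only transitions for Toggle remain.
TRANSITIONS = {
    ("Off", Toggle): "On",
    ("On", Toggle): "Off",
}

def step(state, event):
    """Fire the first transition whose event class generalises the event."""
    for (source, event_class), target in TRANSITIONS.items():
        if source == state and isinstance(event, event_class):
            return target
    return state  # no transition defined for this event in this state
```

Every occurrence of SwitchOn or SwitchOff is an occurrence of Toggle, so step("Off", SwitchOn()) and step("Off", Toggle()) take the same transition.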

5.3 Strengthening Transition Guards

The guard G of a transition from state S to state T may be strengthened by the invariant of S, since this invariant inevitably holds in the source state at points where the system is waiting for input events.

5.4 Eliminating Transitions

A transition t with a logically false guard can be eliminated, since it can never be taken. Its effect axiom has the form $a \in S \wedge a.G \Rightarrow [t]Post$ but this is trivially always true if $a.G \equiv false$. Such transitions may arise as the result of source and target splitting. For example, in Figure 5, we target split the Finished state and transition finish, and then source split the Filling state and the two transitions for finish, yielding 6 separate transitions for finish. However, all but two of the resulting transitions are now impossible, so can be eliminated: the first transition for finish, with guard $level \ge min \wedge level < norm$, cannot occur from either the F1 or F3 states, and the second transition cannot occur from either the F1 or F2 states. A similar step is carried out in the first refinement of Abrial's development of a distributed protocol [1].
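The elimination step for the Figure 5 example can be replayed mechanically. The sketch below is only an illustration: it assumes concrete values for min and norm and a finite range of levels, and tests each of the six split transitions by brute force rather than by proof.

```python
from itertools import product

MIN, NORM = 10, 50             # assumed threshold values
LEVELS = range(0, 101)         # assumed finite range of levels to test

# Invariants of the substates obtained by source splitting Filling.
SOURCES = {
    "F1": lambda lv: 0 <= lv < MIN,
    "F2": lambda lv: MIN <= lv < NORM,
    "F3": lambda lv: lv >= NORM,
}
# Guards of the two finish transitions obtained by target splitting Finished.
TARGETS = {
    "Normal": lambda lv: MIN <= lv < NORM,
    "Overfull": lambda lv: lv >= NORM,
}

def satisfiable(invariant, guard):
    """True if some level satisfies both the source invariant and the guard."""
    return any(invariant(lv) and guard(lv) for lv in LEVELS)

all_transitions = list(product(SOURCES, TARGETS))   # 3 x 2 = 6 candidates
remaining = [
    (source, target)
    for (source, target) in all_transitions
    if satisfiable(SOURCES[source], TARGETS[target])
]
```

Under these assumptions only the transitions from F2 to Normal and from F3 to Overfull survive, in agreement with the text.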

Fig. 5. Successive Splitting and Elimination Transformations

Conclusions

This paper has illustrated the use of transformations on UML models as a means of rigorous development and re-engineering based on a detailed semantics of these models. Real-time extensions of these models and corresponding transformations are currently under development. An international collaborative project on the UML semantics is underway to combine other related approaches, such as [6,4], into a common framework. Tool support for transformations as part of a general CASE tool for UML will also be developed. A library of proved transformations will be provided, eliminating the need for developers to reason directly in RAL when applying transformations as development steps.

Suggestions for improvement of UML which have come from this work are:

1. Consider statechart states as classifiers, whose instances are those objects currently in the state. This unifies similar concepts in the same metamodel entity.
2. Attach constraints to packages or subsystems which enclose the submodel on which the constraint applies, in preference to attaching the constraint to a possibly large number of elements in this submodel.

References

1. J Abrial, L Mussat. Specification and Design of a Transmission Protocol by Successive Refinements using B, 1997.
2. M Awad, J Kuusela, and Jürgen Ziegler. Object-oriented Technology for Real-time Systems. Prentice Hall, 1996.
3. J C Bicarregui, K C Lano, T S E Maibaum, Objects, Associations and Subsystems: a hierarchical approach to encapsulation, ECOOP 97, LNCS, 1997.
4. R Breu, U Hinkel, C Hofmann, C Klein, B Paech, B Rumpe, V Thurner, Towards a Formalization of the Unified Modeling Language, ECOOP 97 proceedings, LNCS 1241, Springer-Verlag, 1997.
5. S Cook and J Daniels. Designing Object Systems: Object-Oriented Modelling with Syntropy. Prentice Hall, Sept 1994.
6. A Clark and A Evans, Foundations of the Unified Modeling Language. In D Duke and A Evans, editors, BCS FACS - 2nd Northern Formal Methods Workshop, Workshops in Computing, Springer Verlag, 1997.
7. J Fiadeiro and T Maibaum. Temporal Theories as Modularisation Units for Concurrent System Specification, Formal Aspects of Computing 4(3), pp. 239-272, 1992.
8. R France, A Evans, K Lano, The UML as a Formal Modelling Notation, OOPSLA 97 Workshop on Object-Oriented Behavioral Semantics, 1997.
9. J Goguen and R Burstall. Introducing Institutions. In Clarke and Kozen, eds. Logics of Programs, pp. 221-256, Springer-Verlag, 1984.
10. D Harel and A Naamad, The Statemate Semantics of Statecharts, technical report, i-Logix, Inc, 1995.
11. K Lano, S Goldsack, J Bicarregui and S Kent. Integrating VDM++ and Real-Time System Design, Z User Meeting, 1997.
12. K Lano, N Malik, Reengineering Legacy Applications using Design Patterns, STEP '97, IEEE Computer Society Press, 1997.
13. K Lano, A Sanchez, Design of Reactive Control Systems for Event-driven Operations, FME '97, LNCS, Springer-Verlag, 1997.
14. K Lano, Transformations on Syntropy and UML Models, Technical Report, "Formal Underpinnings for Object Technology" project, Dept. of Computing, Imperial College, 1997.
15. K Lano, Logical Specification of Reactive and Real-Time Systems, to appear in Journal of Logic and Computation, 1998.
16. N Leveson, Safeware: system safety and computers, Addison-Wesley, 1995. ISBN 0-201-11972-2.
17. Ministry of Defence, The Procurement of Safety Critical Software in Defence Equipment, DEF-STAN 00-55, Issue 1, Part 2. Room 5150, Kentigern House, 65 Brown St., Glasgow G2 8EX, 1997.
18. R Wieringa, W de Jonge, P Spruit, Roles and Dynamic Subclasses: A Modal Logic Approach, IS-CORE report, Faculty of Mathematics and Computer Science, Vrije Universiteit, Amsterdam, 1993.
19. The UML Notation version 1.1, UML resource center, http://www.rational.com, 1997.

Automation of Design Pattern: Concepts, Tools and Practices

Philippe Desfray

Softeam, 8 rue germain soufflot, 78184 Saint Quentin en Yvelines, France
Tel: 331 30 12 16 60, Fax: 331 30 43 86 06
Email: [email protected], Web: http://www.softeam.fr

Abstract. Model transformation is a technique that makes it possible to automate design patterns. Applied to UML, the result is highly promising. However, model transformation rules have to be structured by a specific organization mechanism called viewpoint, and be coupled with the UML model extension features (tagged values, stereotypes, etc.). This has been done through a specific technique, called "hypergenericity", which is implemented by a CASE tool and has been in use since 1994.

1 Presentation

1.1 Model Transformation Becomes a Major Technology

Model transformation is a technology capable of automating the transition from analysis to design and from design to the final code. This kind of technology is attracting wide interest because of its strong automation capabilities, and because the necessary underlying layers (object oriented models, metamodel definition, CASE tool support) are becoming stable or standardized. In this paper, such a technology will be presented through a specific approach called "hypergenericity" [1, 2, 3]. This technology, first developed in 1992, has been supported by a dedicated CASE tool called "Objecteering" since 1993, and has since evolved through feedback from a significant number of applications in several software projects. In addition, design patterns are successful due to their ability to communicate technical solutions to developers. Therefore, the ability to automate design patterns through transformations to well recognized models becomes a major technology.

J. Bézivin and P.-A. Muller (Eds.): «UML» ’98, LNCS 1618, pp. 120–131, 1999. © Springer-Verlag Berlin Heidelberg 1999

Fig 1 - Hypergenericity is a technology for implementing model transformation

1.2 The Inherent Difficulty of Software Design

Fig 2 - What guarantee do we have that this design is a correct one ? (classes shown: MyAnalysisClass, PersistentMyAnalysisClass, ClientMyAnalysisClass, ServerMyAnalysisClass, WindowMyAnalysisClass)

The preliminary design activities are among the most difficult activities in software development. Analysis is a difficult and decisive task, but check points can be defined, such as defining traceability links to user terms (glossary), using a Use Case approach for describing the user needs, etc. At least the user can check that needs are taken into account, and that important notions are managed. During the design phase, the developers are in a purely virtual world, unintelligible to the final user. It is very hard to know what they are doing, what notions they are handling, and what technical features (performance, storage volume, reliability, etc.) the final software will have. During both the design phase and model implementation, many technical details are added to the model in the form of classes, attributes, methods, and so on. Figure 2 shows an example of the design model for one concept (MyAnalysisClass) which must be stored in a database, distributed in a client-server context, and have a GUI representation. The resulting application model is complex, very abstract, mixes analysis and implementation notions, and cannot be "formally" proven to be a good model. Experience of comparable application development really matters at this stage. Consequently, the end user or author of the user requirements can review the analysis phase but no longer understands design and implementation documents.


Very often, developers are novices in one or several of the techniques they use (RDB, OO technology, Client Server, GUI, etc.). During design, they will have to imagine solutions which are rarely obvious, specifically in the object oriented field. This is the reason why design patterns are so successful: they provide already proven solutions to developers, who can benefit from the skill of the design pattern authors. In order to gain confidence in the final result, project managers only need to know that design will be based on predefined patterns that have been used in similar contexts.

1.3 Writing Code Is a Tedious and Repetitive Task, Which Can Be Automated to a Great Proportion

Looking closely at the hand written code of applications, it turns out that most of it is systematically written code, which has to deal with technical problems such as managing an increment of a list, implementing a "handler" of something, mapping data to specific formats, initializing data, etc. Only a small portion (say 10% at most) of the code is of real "functional" interest. Consider for example how many lines of code are needed to implement a GUI for any managed concept (say a "Customer" or an "Order", etc.) that we also want to store in a database or to distribute on a network. We have only mentioned a concept, and we already have to consider hundreds of lines of code, filled with plenty of technical tricks dedicated to the usage of a specific programming language and technical environments. Those lines of code depend entirely on the technical choices that have been made and can be automated.

2 Implementing Design Patterns

2.1 Design Patterns Are Everywhere in Programs

Design patterns range from "micro" design patterns to "macro" design patterns. A micro design pattern is for example the iterator technique for handling lists. This kind of pattern, named "idioms" by Buschmann (Buschmann 96), is often solved through programming techniques, such as "templates" or inheritance, or through the usage of libraries. At a higher level, some design patterns are implemented by CASE tools having code generation capabilities. Consider the example in Figure 3, for the transformation of an "association" into specific management members (lists, accessors, etc.).
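To make the association pattern concrete, the sketch below shows what the generated management members could look like when written by hand. It is a Python analogue assumed for illustration, not the output of the tool: the member names follow those of Figure 3, and the behaviour of set_developed is a guess.

```python
class Product:
    def __init__(self, name):
        self.name = name

class Company:
    """Company after transformation of its 'developed' association end."""

    def __init__(self):
        # Storage generated for the 0..* association end. A list stands in
        # for the Set of the figure so that elements can be indexed.
        self._developed_product = []

    def get_developed(self, index):
        return self._developed_product[index]

    def set_developed(self, index, new_elt):
        # Assumed behaviour: replace an existing element, or append when
        # index is one past the end.
        if index == len(self._developed_product):
            self._developed_product.append(new_elt)
        else:
            self._developed_product[index] = new_elt

    def card_developed(self):
        return len(self._developed_product)
```

None of these members carries any functional meaning of its own; they follow mechanically from the association declared in the model.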

Figure 3 - Pattern for implementing an association (transformation of the "Company" class). The transformed Company class gains the members developedProduct : Set, getDeveloped(index) : Product, setDeveloped(index, newElt) and cardDeveloped() : integer.

At a higher level, there exist design patterns for implementing the vast majority of design features. Buschmann simply calls them "design patterns". For example, typical design patterns are recommended in Java for implementing typical programming cases. Figure 4 shows the "Event modeling pattern" recommended for Java. In this example, any class that needs to notify a specific event (here the "Order" event) to the other classes (here the "Product" class) must possess these specific operations, and these specific associations to these specific complementary classes. Such a pattern can be automated. The user only needs to introduce the dedicated "JavaEventSender" annotation. Every model element (for example the Listener class or the addListener method) can be deduced from the initial model.

Fig 4 - Transformation of a class for the "Java Event Pattern". The class Product { JavaEventSender(Order) } gains the operations addOrderListener(listener), removeOrderListener(listener) and notify(event), and a Notification association to the new class OrderListener (a util::EventListener with operation orderOccurence(event)); the new class OrderEvent is a util::EventObject.
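The claim that every added element can be deduced from the annotation can be illustrated by a small transformation function. This is a hedged sketch: the dictionary representation of a class and the function name are invented for the example, and only the element names are taken from Figure 4.

```python
def apply_java_event_sender(class_model):
    """Return the model elements deduced from a JavaEventSender(X) tag.

    class_model is a hypothetical dict with keys "name", "tags" and
    "operations"; this is not the data model of any real tool.
    """
    event = class_model["tags"].get("JavaEventSender")
    if event is None:
        return [class_model]              # no annotation: nothing to add
    handler = event[0].lower() + event[1:] + "Occurence(event)"
    listener = {
        "name": event + "Listener",
        "extends": "util::EventListener",
        "operations": [handler],
        "tags": {},
    }
    event_class = {
        "name": event + "Event",
        "extends": "util::EventObject",
        "operations": [],
        "tags": {},
    }
    sender = dict(class_model)
    sender["operations"] = class_model["operations"] + [
        "add" + event + "Listener(listener)",
        "remove" + event + "Listener(listener)",
        "notify(event)",
    ]
    return [sender, listener, event_class]

product = {"name": "Product", "tags": {"JavaEventSender": "Order"}, "operations": []}
design = apply_java_event_sender(product)
```

The only input specific to the application is the event name carried by the annotation; everything else is fixed by the pattern.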

The higher level identified in Buschmann 96, called "architectural patterns", is too general to be directly automated. Only specific cases can be automated through the hypergenericity technique.

2.2 Model Transformation Becomes Necessary for the Most Sophisticated Patterns

The "iterator" pattern is implemented through libraries and templates. The "association pattern" is implemented through code generators. However, the "Event" pattern cannot be implemented simply with these techniques. A pattern does not necessarily provide a complete detailed design model, ready for a comprehensive code generation. It rather provides an intermediate design model that must be reworked by the developer in order to get the final detailed model. Model transformation is in that case the right solution: the developer asks the tool for a specific transformation, corresponding to the selected pattern. The developer still has the opportunity to apply a new model transformation at each step, or to add new details manually.

Fig 5 - Design is a succession of automated or manual model transformations (Analysis model, Pattern Transformation1, Design model1, Human design work, Design model2, Pattern Transformation2, Detailed Design model)

2.3 Using Model Transformation

Automating the development process directly from analysis to implementation, just by pushing a magic "generate" button, is not possible. The main reason is that from the same analysis model there can exist an infinite number of implementations, depending on the design choices and on the chosen execution environment. Implementers must choose between many options at each design stage. Design patterns provide the elements of choice to the developer. What can be automated is the following process: "Given a specific model, the technical target, and a specific implementation technique (design pattern), the developer must make detailed choices in order to express what role each part of the initial model plays in the pattern, and what the desired implementations are for each element of the model." For example, the final design in Figure 4 has been deduced from the "JavaEventSender" annotation added to the initial class.

Annotations must be made at the model level, in order to indicate what the technical choices are for each element of a model. In UML, tagged values provide a convenient means for annotating a model. For example, Figure 6 shows a class annotated with the tagged value {persistent} in order to express that its instances must be stored in a database, one of its attributes being annotated {identifier}, denoting that its value identifies the object, and one being tagged {transient}, denoting that it has only a "dynamic meaning", and that it therefore should not be stored in a database. Tagged values provide the means to introduce added semantics to the model, which are specific to the target environment (for example a relational database), to the selected design pattern, and to the set of transformation rules automating the pattern. Tagged values are one of the three extension mechanisms of UML, the others being stereotypes and constraints.

Product { persistent }
  + Number : integer { identifier }
  + State : undefined
  + Price : real
  + ToBeDeleted : boolean { transient }

Fig 6 - Annotating model elements for a specific target environment

With all these elements, a set of adapted transformation rules can be applied to the model, in order to transform it into a more accurate design model, based on all the contextual information.
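As an illustration of how such annotations can drive a rule, the sketch below derives a table description from the annotated Product class of Figure 6. The dictionary layout and the function table_for are assumptions made for the example; they do not reproduce the rules of any actual tool.

```python
# The Product class of Figure 6, in a hypothetical dictionary form.
product = {
    "name": "Product",
    "tags": {"persistent"},
    "attributes": [
        {"name": "Number", "type": "integer", "tags": {"identifier"}},
        {"name": "State", "type": "undefined", "tags": set()},
        {"name": "Price", "type": "real", "tags": set()},
        {"name": "ToBeDeleted", "type": "boolean", "tags": {"transient"}},
    ],
}

def table_for(cls):
    """Derive a table description from a class tagged {persistent}."""
    if "persistent" not in cls["tags"]:
        return None                      # the class is not stored
    stored = [a for a in cls["attributes"] if "transient" not in a["tags"]]
    return {
        "table": cls["name"],
        "columns": [a["name"] for a in stored],
        "primary_key": [a["name"] for a in stored if "identifier" in a["tags"]],
    }
```

Here table_for(product) keeps Number, State and Price as columns, uses Number as the key, and leaves out the transient ToBeDeleted attribute.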

3 Hypergenericity

3.1 Definition

Fig 7 - Hypergenericity principle (domain experts supply the model, the technical expert supplies the transformation rules, and hypergenericity produces the implementation)

Hypergenericity is the ability to automatically refine or transform a model by applying external knowledge to it. This "external knowledge", which can be for example design patterns, is expressed through a specific language called H, and using a dedicated structuring mechanism called "viewpoint". This knowledge applies orthogonally to the model: it can be applied to several models, whereas the same model can use different modeling techniques.


Hypergenericity lets technical specialists develop transformation rules, apart from the domain specialists, who must focus on the analysis part, and on the best application of rules for their needs (Figure 7).

3.2 H : The Language for Managing Model Transformation

Fig 8 - H runs at the metamodel level, and handles objects which are model elements from a user perspective.

An OO language called H has been developed to declare the Hypergeneric rules. H gives access to the model's information, and can change the model elements. H runs at the "metamodel" level (Figure 8). The "metamodel" is the model of the model itself. H runs on several metamodels, including the UML metamodel currently in a standardization process at OMG. H instructions are inserted into methods declared on the metaclasses provided by the metamodel. H is an interpreted language, which drives a CASE tool kernel, just like interactive user's actions do. Every user's action can be realized through H or by the user, for example extending a model, or inserting lines of code. At the metamodel level, transforming a model only means creating new objects and changing current values, all actions that a classical language can do. For example, the following instructions create a new operation called "print", and add it to the "C" class.

    C : Class ;
    m : Method ;
    m := Method.create() ;
    m.setName ("print") ;
    C.addComponent (m) ;
    --Note : Every term used here such as " Class ", " Method ",
    --" Component ", " Name ", is defined by the metamodel
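For readers unfamiliar with H, the same steps can be mimicked in a general purpose language. The sketch below is a Python analogue written under the assumption of a two-class toy metamodel; the classes Class and Method here are stand-ins invented for the example and are not the metaclasses of any real tool.

```python
class Method:
    """Toy metaclass standing in for the metamodel's Method."""
    def __init__(self):
        self.name = None

    def set_name(self, name):
        self.name = name

class Class:
    """Toy metaclass standing in for the metamodel's Class."""
    def __init__(self, name):
        self.name = name
        self.components = []

    def add_component(self, element):
        self.components.append(element)

c = Class("C")
m = Method()            # corresponds to: m := Method.create()
m.set_name("print")     # corresponds to: m.setName("print")
c.add_component(m)      # corresponds to: C.addComponent(m)
```

The point of the analogy is that the transformation is ordinary object manipulation: the objects being created and linked simply happen to be model elements.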

This is exactly what developers do manually: creating classes, adding operations and parameters, inserting programming code, defining associations, etc. All these operations are simply model transformation actions.


H is designed for metamodel handling: it provides powerful and simple mechanisms for navigating inside a model, handling sets, and selecting subsets, in order to strongly limit the usage of classical control structures such as "for", "while", "if". In addition to the now classical object oriented mechanisms, it provides specific features like message diffusion on a set, and "anonymous methods". H methods are organized by the "viewpoint" structuring mechanism. This mechanism extends the usual "method redefinition" principle, by providing an additional "viewpoints lookup mechanism" which ensures pattern reuse and extension.

3.3 Viewpoints: Structuring Rules into Different Centers of Interest

The metamodel is predefined (and soon standardized). Users can extend the metamodel (for example in order to provide a technical target metamodel, such as a relational database metamodel), but the vast majority will only use the predefined one. If every pattern and every rule were defined together, then metaclasses would own hundreds of methods. A structuring mechanism therefore becomes necessary, for obvious management reasons. Classical structuring mechanisms such as the UML packages cannot be used, because the vast majority of metaclasses remain the same for different technical areas and patterns.

Fig 9 : Different viewpoints consider the same metamodel from different target perspectives (viewpoints shown: DEFAULT, CODE GEN, RDB GEN, C++, JAVA)

Viewpoint is a mechanism for structuring rules, which provides a means to look at the same metaclass from different angles of view, depending on the metamodel usage that is expected. For example, documentation generation rules, C++ pattern rules, and relational database rules will each have a specific interest in the metamodel. These different usage needs are materialized in the "viewpoint" concept. Viewpoints are defined at the metamodel level. They are a mechanism for structuring "usage domains" applied to the metamodel. As such, they are not defined in UML. From a UML designer's perspective (model level), the viewpoint represents the context of his current work, such as the current development phase (analysis, design, etc.) or the problem domain (business modeling, RDB architecture, C++ programming, etc.). Under a certain viewpoint, a certain set of tagged values, stereotypes and work products are allowed, a certain set of modeling rules must be applied, etc.

Fig 10 - Every project can choose its specific viewpoints. (matrix of projects P1 to P4 against generation viewpoints such as C++ GEN, DOC GEN and GUI GEN)

The viewpoint structure is a hierarchical structure, showing refinement links between different viewpoints. A sub-viewpoint inherits the H rules of the parent viewpoint, and is able to refine or extend the parent rules. Thus, a more specific pattern can reuse a general pattern and adapt the specific features. Viewpoints are very important for the developers, but are also necessary to the final users: at the project level, users choose which already defined viewpoints they want. The project becomes customized by the users' choice for their specific need, having specific rules and generation dedicated to their need (RDB, ODB, Client/Server, etc.).
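The inheritance of rules between viewpoints can be pictured as a lookup that walks up the viewpoint hierarchy. The sketch below is a guess at the general idea only: the Viewpoint class, its methods and the example rules are all invented for illustration, and the actual lookup mechanism of H may differ.

```python
class Viewpoint:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.rules = {}          # (metaclass name, rule name) -> callable

    def define(self, metaclass, rule_name, rule):
        self.rules[(metaclass, rule_name)] = rule

    def lookup(self, metaclass, rule_name):
        """Find a rule in this viewpoint, else in the nearest ancestor."""
        viewpoint = self
        while viewpoint is not None:
            rule = viewpoint.rules.get((metaclass, rule_name))
            if rule is not None:
                return rule
            viewpoint = viewpoint.parent
        raise LookupError(f"no rule {rule_name} for {metaclass} in {self.name}")

# Hypothetical hierarchy: C++ and JAVA refine a general CODE GEN viewpoint.
code_gen = Viewpoint("CODE GEN")
cpp = Viewpoint("C++", parent=code_gen)
java = Viewpoint("JAVA", parent=code_gen)

code_gen.define("Class", "header", lambda name: f"// class {name}")
code_gen.define("Class", "declare", lambda name: f"class {name}")
cpp.define("Class", "declare", lambda name: f"class {name} {{ }};")
```

With these definitions the C++ viewpoint redefines "declare" but inherits "header", while the JAVA viewpoint inherits both rules unchanged.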

Fig 11 - The metamodel level tool drives the model level tool, and structures the rules into viewpoints (the model explorer shows packages, classes and methods; the viewpoints and metamodel explorer shows viewpoints, metaclasses, H methods and tagged value definitions)

At the tool level, this selection will change the appearance of the tool, providing specific icons, a specific annotation system (tagged values), specific menu items, specific consistency rules, etc., for every project. All these elements, together with the H methods, are structured with viewpoints (Figure 11).

Automation of Design Pattern: Concepts, Tools and Practices


3.4 Persistent or Temporary Model Transformation

[Figure 12: the analysis class Human, annotated {printable}, with attributes Name : string and Age : integer, is transformed into a design class Human with an added print() operation; generation then produces the C++ implementation (a class Human with a print() method and the members Name and Age).]

Fig 12 - Code generation is immediately triggered after the "temporary" model transformation

We distinguish persistent transformations from temporary transformations. If a pattern is accurate enough to produce the final code, the model can be transformed temporarily, for code generation purposes only. For example, if the user knows that a "print" method will be automatically added to his classes, he does not need to see this method at the model level and simply expects it to be in the final code. The transformation is done internally by the CASE tool, just before the code generation process that produces the final code, and is then forgotten. The user sees an unchanged model, but many generated elements in the corresponding final code. This is generally the case with micro design patterns and with code generators. In the "association pattern" example, users only represent the associations, and get the final code with its accessors and list attributes.

Conversely, if the pattern is sophisticated and needs further description from the designer, the transformation must be persistent: the model is visibly transformed into a more accurate model, just as if the designer had made the changes manually.

Temporary transformation hides implementation aspects and keeps the model uncluttered by highly specific technical details. This is not always wanted, especially when the designers do not know the transformation rules well. Persistent transformation needs strong traceability management mechanisms: users may want to come back to earlier models (before transformation), for example in order to introduce new analysis aspects. If a model has been transformed and the designer has introduced new elements, the designer may want to "undo" the transformation, add new analysis information, transform the new version of the model again, and recover the design elements previously introduced. A transformation must therefore not be defined as a "one shot transformation", but must incorporate traceability elements in its processing.
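The difference between the two kinds of transformation can be made concrete with a toy sketch. It is written in Python under simplifying assumptions and does not reproduce the tool's mechanisms: the dictionary-based model, the rule `add_print_method`, the one-line "generator" and the trace format are all invented for the illustration.

```python
import copy


def add_print_method(model: dict) -> dict:
    """Illustrative rule: every class tagged 'printable' gets a print method."""
    for cls in model["classes"]:
        if "printable" in cls["tags"] and "print" not in cls["methods"]:
            cls["methods"].append("print")
            # Traceability: remember which rule introduced the element.
            cls.setdefault("trace", []).append(("print", "add_print_method"))
    return model


def generate_code(model: dict) -> str:
    """Toy generator: one line per class, listing its methods."""
    return "\n".join(
        f"class {c['name']}: {', '.join(c['methods']) or '(no methods)'}"
        for c in model["classes"]
    )


def generate_temporary(model: dict) -> str:
    """Transform a throwaway copy, generate code, and forget the copy."""
    return generate_code(add_print_method(copy.deepcopy(model)))


def transform_persistent(model: dict) -> dict:
    """Transform the model itself; the designer sees and can extend the result."""
    return add_print_method(model)


def undo(model: dict, rule: str) -> None:
    """Remove the elements that a given rule introduced, using the recorded trace."""
    for cls in model["classes"]:
        for method, origin in list(cls.get("trace", [])):
            if origin == rule:
                cls["methods"].remove(method)
                cls["trace"].remove((method, origin))


analysis = {"classes": [{"name": "Human", "tags": ["printable"], "methods": []}]}
code = generate_temporary(analysis)   # the analysis model itself is left unchanged
```

In the temporary case the generated code contains the print method while the model does not; in the persistent case the recorded trace is what makes a later "undo" possible without losing the elements the designer added by hand.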


4 Conclusion

4.1 Experience Feedback

More than fifty projects have already used this kind of approach, in a wide range of technical settings: client/server applications, network supervision systems, GUI generation, test generation, CASE tool architecture, many kinds of frameworks, database storage, etc. We have had good results, but we have also encountered some failures. Both kinds of experience helped us improve this new technology.

This technology is very powerful, especially for application families based on similar implementation principles. Users get a production tool that very easily generates new versions of applications. The cost benefit is high for maintenance, reusability and extensibility.

This technology has also been used for iterative development: it allows technical developments to be separated from the analysis work. An analysis model can be rapidly implemented using a first set of simple rules, and then implemented again using a more sophisticated set of rules (Figure 13). For example, a client/server application can first be implemented as a monolithic application, and then be implemented in a client/server configuration. This kind of approach would often be unfeasible in classical development, because of the extra cost of multiple implementations.

[Figure 13: diagram relating the analysis model, the architecture rules, the annotations, the design model, and the lower layers (libraries, frameworks).]

Fig 13 - Architecture rules are defined apart from the model, and thus can be interchanged

Failures came, as very often with new technologies, from misuse by enthusiastic beginners: a bulldozer is more powerful than a spade, but its misuse can be disastrous. Precursors in object orientation know about this kind of drawback. For example, some developments implemented so many systematic model transformations that the final code became unmanageable. We have encountered "overskilled" people who introduced such tricky implementation transformations that no ordinary developer could understand what had happened to their model. This becomes a disaster at the final implementation stage.


The most important lesson is to define design patterns that are accessible to the developer community, which needs to understand what a tool does automatically. This goal has been reached by defining fine-grained stepwise transformations, and by providing sophisticated traceability management mechanisms.

4.2 Model Transformation in Software Development

The model transformation technique is a powerful tool that provides mechanisms complementary to those currently used in the OO field (inheritance, templates, frameworks, etc.). As is often the case, this technique has been known for a long time (Shlaer & Mellor use it in their approach, the "metamodeling community" is well aware of such techniques, and metaclass languages support many features of model transformation). The novelty is that, at last, the computer science community has the standards and the tool basis for employing it widely. It reinforces the modeling approach and design pattern technology. Hypergenericity is a technology that provides a convenient language for model transformation, accompanied by the necessary structuring mechanisms.

Software lifecycles can be adapted more flexibly using these techniques, which entirely separate the design-oriented tasks from the analysis tasks (Figure 13). Reuse is broadly favored by the analysis/design separation. Reuse can be ensured from the implementation aspect (reusing design rules, interchanging these rules) or from the analysis aspect (reusing an analysis model apart from its implementation).

This paper focuses on the "design automation aspect", but this technology has wider interests, such as automatically enforcing quality rules on a model, or reporting the impacts of a change in a model. We believe that the advent of a standardized object model (UML) will strongly promote model-driven development approaches, and thus that design and code automation will be a major trend in the coming decade.

References

1. "Design Patterns, Elements of Reusable Object-Oriented Software" - Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides - Addison Wesley 1995
2. "Pattern Oriented Software Architecture" - Frank Buschmann, Regine Meunier, Hans Rohnert, Peter Sommerlad & Michael Stal - John Wiley & Sons 1996
3. "Pattern Languages of Program Design vol. 2" - John M. Vlissides, James O. Coplien, Norman L. Kerth eds. - Addison Wesley 1996
4. "Pattern Languages of Program Design vol. 3" - Robert Martin, Dirk Riehle & Frank Buschmann eds. - Addison Wesley 1998
5. "Recursive design of an application independent architecture" - Sally Shlaer and Stephen J. Mellor - IEEE Software, January 1997
6. "Object Engineering: The fourth dimension" - Philippe Desfray - Addison Wesley 1994
7. "Hypergenericity: Automating Object Oriented development" - Philippe Desfray - Object Expert, Nov/Dec 1995
8. "Automated object design: the Client/Server case" - Philippe Desfray - COMPUTER, February 1996

Automating the Synthesis of UML Statechart Diagrams from Multiple Collaboration Diagrams 1

Ismaïl Khriss, Mohammed Elkoutbi, and Rudolf K. Keller

Département d'informatique et de recherche opérationnelle
Université de Montréal
C.P. 6128, succursale Centre-ville, Montréal, Québec H3C 3J7, Canada
voice: (514) 343-6782
fax: (514) 343-5834
e-mail: {khriss, elkoutbi, keller}@iro.umontreal.ca
http://www.iro.umontreal.ca/~{khriss, elkoutbi, keller}

Abstract. The use of scenarios has become a popular technique for requirements elicitation and specification building. Since scenarios capture only partial descriptions of system behavior, an approach for scenario composition and integration is needed to produce more complete specifications. The Unified Modeling Language (UML), which is emerging as a unified notation for object-oriented modeling, provides a suitable framework for scenario acquisition using Use Case diagrams and Collaboration diagrams and for behavioral specification using Statechart diagrams; yet it does not propose any specific modeling process, let alone a process for transforming scenarios into behavioral specifications. In this paper, we suggest a four-step process for synthesizing behavioral specifications from scenarios. It generates from a given set of Collaboration diagrams the Statechart diagrams of all the objects involved. Our approach is incremental and is fully compliant with the UML. Furthermore, it provides an elegant solution to the problem of scenario interleaving. The underlying algorithm has been implemented and validated with several examples, and is fit for integration into CASE tools supporting the UML.

1 This work is in part supported by FCAR (Fonds pour la formation des chercheurs et l'aide à la recherche au Québec) and by the SPOOL project organized by CSER (Consortium Software Engineering Research) which is funded by Bell Canada, NSERC (National Sciences and Research Council of Canada), and NRC (National Research Council of Canada).

J. Bézivin and P.-A. Muller (Eds.): «UML» '98, LNCS 1618, pp. 132–147, 1999. © Springer-Verlag Berlin Heidelberg 1999


1 Introduction

Recently, two aspects have received a lot of attention in object-oriented development: the emergence of the Unified Modeling Language (UML) as a unified notation for object-oriented analysis and design, and a growing consensus on Use Case (or scenario) approaches to software development.

The UML [18] can be seen as the successor of the wave of object-oriented analysis and design methods that appeared in the late 80s and early 90s. It unifies the methods of Booch [Boo94], Rumbaugh (OMT) [19], and Jacobson (OOSE) [12]. The UML is expected to be the standard modeling language in the future. It gives notations for describing a system in various views, but does not define any specific process for software development, beyond some preliminary process description reported, for instance, in [17].

Scenarios have received significant attention and have been used for different purposes such as understanding requirements [16], Human Computer Interaction analysis [15], specification or prototype generation [1], and object-oriented analysis and design [3, 19, 20, 12].

In this paper, we extend our prior work [22] by proposing an incremental approach for dynamic modeling. It provides a four-step process with limited manual intervention for deriving the dynamic specification of objects from scenarios. Our work, in contrast to others such as [14], supports UML Collaboration diagrams with all their facets (iteration, condition, concurrency) for scenario acquisition and leverages the expressiveness of UML Statechart diagrams (concurrency, hierarchy) for capturing the resultant specifications. We also resolve the problem of interleaving between scenarios, so that the generated specifications capture exactly the behavior given in the input scenarios. For example, if two scenarios share a common state, the resultant specification may capture more than the two given scenarios: execution may initially follow the first scenario and, on reaching the common state, continue according to the second scenario. Our approach precludes the generation of such overly general specifications.

The integration of our approach into CASE tools that support the UML will make these tools more powerful. The tedious activity of translating the dynamic models of the UML will become a mostly automatic, tool-supported activity.

Section 2 of this paper gives a brief overview of the UML diagrams relevant for our work and introduces a running example. Section 3 presents the four activities of our approach. Section 4 addresses related work in the area of specification generation from scenarios. In section 5, we discuss several aspects of our work. Finally, section 6 provides some concluding remarks and points out future work.

2 Unified Modelling Language

The UML [18] provides a syntactic notation to describe all views of a system using different kinds of diagrams. In this section, we discuss only the diagrams we have used in our approach: the Use Case diagram (UsecaseD), the Collaboration diagram (CollD) and the Statechart diagram (StateD). As a running example, we have chosen to study a part of a library system.

2.1 Use case diagram (UsecaseD)

The UsecaseD is concerned with the interaction between the system and actors (objects outside the system that interact directly with it). It presents a collection of use cases and their corresponding external actors. A use case is a generic description of an entire transaction involving several objects of the system. Use cases are represented as ellipses, and actors are depicted as icons connected by solid lines to the use cases with which they interact. One use case can call upon the services of another use case. Such a relation is called a uses relation and is represented by a directed dashed line. The direction of a uses relation does not imply any order of execution. Figure 1 shows an example of a use case diagram for the library system. In this UsecaseD, two actors (Attendant and Manager) interact with seven use cases (Reader-check, Document-check, Reader-registration, Document-registration, Lend-service, Return-service and Statistics). There are also several uses relations; for example, the use case Lend-service uses the services of the Reader-check and Document-check use cases.
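As a small illustration, the content of this UsecaseD can be written down as plain data. The sketch below is in Python and records only what the text states; the actor-to-use-case links are not listed in the text and are therefore left out, and the helper `used_by` is an invented name.

```python
# Actors and use cases named in the text for the library system.
actors = {"Attendant", "Manager"}
use_cases = {
    "Reader-check", "Document-check", "Reader-registration",
    "Document-registration", "Lend-service", "Return-service", "Statistics",
}

# Directed 'uses' relations: (client use case, used use case).
# Only the two relations spelled out in the text are listed here.
uses = {("Lend-service", "Reader-check"), ("Lend-service", "Document-check")}


def used_by(use_case: str) -> set:
    """Use cases whose services the given use case calls upon."""
    return {target for source, target in uses if source == use_case}
```

Storing the relation as ordered pairs reflects that a uses relation is directed, while saying nothing about execution order.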

Fig. 1. UsecaseD for the library system.

A UsecaseD is very helpful in visualizing the context of a system and the boundaries of the system's behavior. A given use case is typically characterized by multiple scenarios.

2.2 Collaboration diagram (CollD)

A scenario shows a particular series of interactions among objects in a single execution of a use case of a system (an execution instance of a use case). Scenarios can be viewed in two different ways, through SequenceDs (Sequence diagrams) or CollDs. Both types of diagrams rely on the same underlying semantics, and conversion from one to the other is possible. A SequenceD shows interactions among a set of objects in temporal order, which is good for understanding timing issues. A CollD concentrates on the structure of the interaction between objects and on their inter-relationships rather than on the temporal dimension of a scenario.

A CollD is a graph whose nodes are the objects participating in the scenario and whose edges represent structural relations between objects (association, aggregation, inheritance, etc.). Messages sent between objects are labelled with a text string and a direction arrow. One edge can carry many messages in both directions. Each message label contains a sequence number representing the nested procedural calling sequence throughout the scenario. A sequence number is a list of sequence elements separated by dots. Each element can have the following parts:

- a letter indicating a concurrent thread (see the message 1.4.2.1a in Figure 2a),
- a number showing the sequential position of the message,
- an iteration indicator * (see the message 1.4 in Figure 2a) indicating that several messages of the same form are sent sequentially to a single target or concurrently to a set of targets, etc.

Figures 2a, 2b and 2c give three scenarios (CollDs) of the use case Lend-service. Figure 2a represents the scenario where the loan is correctly registered, Figure 2b the case where the document loaned is not registered in the system, and Figure 2c the scenario where the user is not yet registered in the system.
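The structure of sequence numbers described above lends itself to a short parsing sketch. The Python code below handles only a simplified form, assumed for this illustration: each dotted element is a number, optionally followed by one thread letter and by the iteration indicator. Iteration clauses and guards in square brackets are not parsed, and the names `SequenceElement` and `parse_sequence_number` are invented.

```python
import re
from typing import NamedTuple, Optional


class SequenceElement(NamedTuple):
    position: int          # sequential position of the message
    thread: Optional[str]  # letter naming a concurrent thread, if any
    iterated: bool         # True when the element carries the '*' indicator


# One element: digits, an optional thread letter, an optional '*'.
_ELEMENT = re.compile(r"(\d+)([a-z])?(\*)?")


def parse_sequence_number(text: str) -> list:
    """Split a dotted sequence number such as '1.4.2.1a' into its elements."""
    elements = []
    for part in text.strip().rstrip(".").split("."):
        match = _ELEMENT.fullmatch(part)
        if match is None:
            raise ValueError(f"unrecognized sequence element: {part!r}")
        number, thread, star = match.groups()
        elements.append(SequenceElement(int(number), thread, star is not None))
    return elements


def nesting_depth(text: str) -> int:
    """Depth of the message in the nested procedural calling sequence."""
    return len(parse_sequence_number(text))
```

For instance, `parse_sequence_number("1.4.2.1a")` yields four elements, the last of which carries the thread letter "a", and the final element of `parse_sequence_number("1.4*")` is marked as iterated.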

[Figures 2a, 2b and 2c: CollDs for the three scenarios of the use case Lend-service, involving among others the objects Attendant, Terminal and Documents. Legible message labels include 1: create-loan(); 1.1: user-id-entered(); 1.3 [u=ok]: display-user-info(); 1.4*[i:=1..nbr-doc]: process-doc(i); 1.4.1: document-id-entered(i); 1.4.2: d:=check-document(i); 1.4.3.1a [d=ok]: display-doc-info(i); 1.5: new(). The guards [d=notok] and [u=not ok] appear in the scenarios for an unregistered document and an unregistered user.]

2.3 StateChart diagram (StateD)

A StateD shows the sequence of states that an object goes through during its life cycle in response to stimuli. Generally, a StateD may be attached to a class of objects with an interesting dynamic behavior. The formalism (notation and semantics) used in StateDs is derived from Statecharts as defined by Harel [11]. Statecharts extend state-event diagrams with hierarchy and concurrency. Any state in a Statechart can be recursively decomposed into exclusive states (or-state) or concurrent states (and-state). When a transition in a Statechart is triggered (event received and guard condition tested), the object leaves its current state, initiates the action(s) for that transition and enters a new state. Any internal or external event is broadcast to all states of all objects in the system. Transitions between concurrent states are not allowed, but synchronization and information exchange are possible through events. As an illustration, Figure 5a gives a StateD for the object Terminal, in which we can see an example of hierarchy and concurrency. The state Processing-document-list (or-state) is composed of two sub-states, check-list-doc and Processing-document (and-state), the latter itself containing two concurrent sub-states separated by a dashed line.
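The recursive decomposition into or-states and and-states can be sketched as a small tree structure. The Python code below is only an illustration: the class `State`, treating Terminal as an or-state at the top level, and the names "region-a" and "region-b" for the two concurrent sub-states (which the text does not name) are assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class State:
    name: str
    kind: str = "basic"  # "basic", "or" (exclusive substates) or "and" (concurrent substates)
    substates: list = field(default_factory=list)

    def level_of(self, name: str, level: int = 0) -> Optional[int]:
        """Hierarchy level at which a named state occurs, or None if it is absent."""
        if self.name == name:
            return level
        for sub in self.substates:
            found = sub.level_of(name, level + 1)
            if found is not None:
                return found
        return None


# Hierarchy of the Terminal StateD as described in the text.
terminal = State("Terminal", "or", [
    State("Processing-document-list", "or", [
        State("check-list-doc"),
        State("Processing-document", "and", [
            State("region-a"),  # placeholder name for a concurrent sub-state
            State("region-b"),  # placeholder name for a concurrent sub-state
        ]),
    ]),
])
```

Knowing the level at which a state name occurs is what a later consistency check between two StateDs would compare.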

3 Description of the approach

In this section, we describe the overall process for deriving a system behavior specification. This process provides an automatic way to transform requirements information into a formal specification. We consider that the behavior specification of a system is given by the behavior specifications of its constituent objects. The approach we define here consists of four major activities (see Figure 3):

1. Requirement acquisition
2. Generation of partial object specifications from scenarios
3. Analysis of partial specifications
4. Object specifications integration.

3.1 Requirement acquisition

Scenarios are one of the techniques most used in this activity. They are used in object-oriented methodologies [12, 19 and 3] as an approach to requirements engineering. The UML (which represents the unification of these OO methodologies) proposes a suitable framework for scenario acquisition, using the UsecaseD for capturing system functionalities and SequenceDs or CollDs for describing scenarios. In this phase, the analyst begins by elaborating the UsecaseD for the system, which consists in identifying the use cases and the external actors interacting with them. An example of such a diagram was given in Figure 1. Then, he acquires scenarios as CollDs for each use case in the UsecaseD. Figures 2a, 2b and 2c have already shown three CollDs corresponding to the use case Lend-service of the library system. At the end of this activity, we obtain:

- a set of use cases $UC = \{uc_1, uc_2, \ldots, uc_n\}$ representing all functionalities of the studied system,
- a set of objects $OB = \{o_1, o_2, \ldots, o_m\}$ participating in the different scenarios of the system,
- a set of scenarios corresponding to each $uc_i$; the union of these sets is $CD = \{cd_1, cd_2, \ldots, cd_k\}$.


Fig. 3. Overview of the approach.

3.2 Generation of partial object specifications from scenarios

In this step, we repeatedly apply the CTS (CollD To StateD) algorithm [22] to each element $cd_i$ of the set $CD$ in order to automatically generate partial specifications for the objects participating in the scenarios of the system. Transforming one CollD into StateDs is a process of five steps [22]. Step 1 creates a StateD for every distinct class implied by the objects in the CollD. Step 2 introduces as state variables all variables that are not attributes of the objects of the CollD. Step 3 creates transitions for the objects from which messages are sent. Step 4 creates transitions for the objects to which messages are sent. Finally, step 5 brings the set of generated transitions of each StateD into correct sequences, connecting them by states, split bars and merge bars. The sequencing follows the type of messages in a CollD: iteration messages, conditional messages, concurrent messages and messages with multiple predecessors. Note that the CTS algorithm generates StateDs with concurrent states (see Figure 4a). The result of this step is a set of StateDs $SD = \{sd_{ij},\ 1 \le i \le k,\ 1 \le j \le m\}$, where $i$ refers to the CollD $cd_i$ and $j$ to the object $o_j$. For illustration, Figures 4a, 4b and 4c show the StateDs generated by the CTS algorithm for the object Terminal.
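To give a feel for steps 3 to 5, here is a deliberately reduced sketch in Python. It is not the CTS algorithm of [22]: it assumes a purely sequential scenario (no iteration, conditions or concurrency), chains numbered unlabeled states in message order, and uses an invented label format. That Attendant sends both messages to Terminal is also an assumption about the running example.

```python
from collections import defaultdict


def partial_statecharts(messages):
    """Build one linear transition chain per object from an ordered message list.

    messages: list of (sender, receiver, event) tuples in sending order.
    Returns {object: [(from_state, label, to_state), ...]} with unlabeled,
    numbered states.
    """
    transitions = defaultdict(list)
    next_state = defaultdict(int)  # next free state index per object

    def add(obj, label):
        src = next_state[obj]
        next_state[obj] += 1
        transitions[obj].append((f"s{src}", label, f"s{src + 1}"))

    for sender, receiver, event in messages:
        add(sender, f"/ send {event}")  # transition for the sending object
        add(receiver, event)            # transition for the receiving object
    return dict(transitions)


# Assumed reading of the first two messages of the Lend-service scenario.
scenario = [
    ("Attendant", "Terminal", "create-loan"),
    ("Attendant", "Terminal", "user-id-entered"),
]
result = partial_statecharts(scenario)
```

The states produced this way carry no meaningful names, which is consistent with the need for the analyst to label them before integration.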

[Figures 4a to 4c: the StateD drawings are not recoverable from the extraction; legible labels include the state variables i, d, u and the transitions process-doc, document-id-entered, user-id-entered and display-user-error[u=Notok].]

Fig. 4a. StateD for object Terminal generated from scenario 1 given in Figure 2a.

Fig. 4b. StateD for object Terminal generated from scenario 2 given in Figure 2b.

Fig. 4c. StateD for object Terminal generated from scenario 3 given in Figure 2c.

3.3 Analysis of partial specifications

The previous activity generates StateDs with unlabeled states. In preparation for the fourth activity, the analyst must add state names to the generated StateDs; indeed, our integration algorithm is based on state names, as we will see later. He can also add structural information such as state groupings. Figures 5a, 5b and 5c show the information added to the StateDs generated for the object Terminal (Figures 4a, 4b and 4c). The result of this step is a set of StateDs SD', which is the same as SD plus additional textual and structural information.

[Figures 5a to 5c: the StateD drawings are not recoverable from the extraction; legible labels include the state variables i, d, u and the transitions display-user-info[u=ok], document-id-entered and display-user-error[u=NotOk].]

Fig. 5a. The modified StateD for object Terminal corresponding to the StateD in Figure 4a.

Fig. 5b. The modified StateD for object Terminal corresponding to the StateD in Figure 4b.

Fig. 5c. The modified StateD for object Terminal corresponding to the StateD in Figure 4c.

3.4 Object specification integration

This activity is the most important part of our approach. We have defined a new algorithm for merging scenarios [13] based on the Statechart formalism. As the algorithm is incremental, we discuss here the integration of two StateDs, and we refer for illustration to the StateDs of the object Terminal (StateD1 in Figure 5a and StateD2 in Figure 5b). The integration algorithm is a process of five steps:

1. validation of state names
2. incorporation of scenario variables
3. integration of state variable lists
4. integration of substate lists
5. integration of transition lists.


Step 1 consists in validating the state names introduced by the analyst in the third activity of our approach (see section 3.3). It checks the absence of conflicts between StateDs (a conflict occurs when one state appears in the two StateDs at different levels of hierarchy). Step 2 incorporates composition variables in each StateD. The goal of composition variables is to solve the problem of interleaving between scenarios (see below). Step 3 merges the state variable lists of the two StateDs into one. Step 4 hierarchically merges the states of the two StateDs. Two kinds of merging are considered: an or-merging and an and-merging. An or-merging, as at the Terminal state (the top level of StateD1 and StateD2) of Figure 6, occurs when one state has the same initial substates in the two StateDs. An and-merging, as at the Processing-document-list state of Figure 6, occurs when one state has different initial substates in the two StateDs. Figure 6 shows the result of applying the algorithm to StateD1 and StateD2.

Finally, step 5 integrates the transition lists of the two StateDs into the single transition list of the resultant StateD. Let transList1 be the transition list of StateD1 and transList2 the transition list of StateD2. For each transition trans2 in transList2, the algorithm looks for a transition trans1 in transList1 that has the same triple of fromNode state, toNode state and event field. If trans1 does not exist, trans2 is added to transList1. Otherwise, this step checks the guardCondition and action parts of trans1 and trans2. Three cases have to be considered. In the first case, trans1 and trans2 have the same guardCondition and different action parts: the algorithm adds trans2 to transList1 and outputs a message indicating that the resultant StateD will have a non-deterministic behavior. In the second case, trans1 and trans2 have different guardCondition fields and the same action parts: the guardCondition field of trans1 becomes [trans1{guardCondition} OR trans2{guardCondition}]. In the last case, trans1 and trans2 have different guardCondition fields and different action parts: trans2 is added to transList1. The transition list of the resultant StateD is therefore transList1. Figure 7 shows the resultant StateD of the object Terminal after integration of the three scenarios.
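Step 5 can be sketched in a few lines of Python. This is an illustrative reading of the description above, not the authors' Java implementation: the `Transition` class and its field names are assumptions, guards are treated as plain strings, and the case where both guard and action are identical (not listed among the three cases) is assumed to require no change.

```python
from dataclasses import dataclass, replace


@dataclass
class Transition:
    from_node: str
    to_node: str
    event: str
    guard: str = ""
    action: str = ""

    def key(self):
        """The triple used to decide whether two transitions correspond."""
        return (self.from_node, self.to_node, self.event)


def merge_transitions(trans_list1, trans_list2):
    """Merge trans_list2 into a copy of trans_list1; return (merged, warnings)."""
    merged = [replace(t) for t in trans_list1]  # copies, so inputs stay unchanged
    warnings = []
    for t2 in trans_list2:
        t1 = next((t for t in merged if t.key() == t2.key()), None)
        if t1 is None:
            merged.append(replace(t2))
        elif t1.guard == t2.guard and t1.action != t2.action:
            # Case 1: same guard, different actions.
            merged.append(replace(t2))
            warnings.append(f"non-deterministic behavior on {t2.key()}")
        elif t1.guard != t2.guard and t1.action == t2.action:
            # Case 2: different guards, same action.
            t1.guard = f"{t1.guard} OR {t2.guard}"
        elif t1.guard != t2.guard:
            # Case 3: different guards, different actions.
            merged.append(replace(t2))
        # Same guard and same action: identical transition, nothing to add.
    return merged, warnings
```

For example, merging two transitions that differ only in their guards `g1` and `g2` yields a single transition guarded by `g1 OR g2`, while merging two that differ only in their actions keeps both and reports the non-determinism.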

Technique for avoiding interleaving problem

To solve the problem of interleaving between scenarios in a StateD, we have defined three composition variables: scenarioList, dynamicScenarioList and transScenarioList. scenarioList is a set of scenario names; it holds the names of the scenarios that the StateD captures. dynamicScenarioList is also a set of scenario names. It is initialized to scenarioList and can change during the execution of the StateD. At each point of the execution, it holds the names of the scenarios that remain possible for the rest of the execution. transScenarioList is an array of sets of scenario names. It holds, for each transition of the StateD, the names of the scenarios concerned by that transition. For each transition in a StateD, we introduce a special condition sc, equal to

$[(\mathit{transScenarioList}[tr] \cap \mathit{dynamicScenarioList}) \neq \emptyset]$

(tr is the index of a transition), and a special action sa, equal to

$\mathit{dynamicScenarioList} := \mathit{dynamicScenarioList} \cap \mathit{transScenarioList}[tr]$

except for transitions that end a scenario, where we introduce a re-initialization action ra, equal to

$\mathit{dynamicScenarioList} := \mathit{scenarioList}$.


In Figure 7, the StateD of the object Terminal can never execute, for example, the sequence [T1, T2, T3, T5, T8, T9, T11, T6], which differs from the three input scenarios.

Fig. 6. The resultant StateD for the Terminal object after integration of StateD1 and StateD2.

[Figure 7: StateD for Terminal with state variables i, d, u and the following initializations:
scenarioList := {1, 2, 3}
dynamicScenarioList := scenarioList
transScenarioList := [{1,2,3}, {1,2,3}, {1,2}, {3}, {1,2}, {1,2}, {1}, {1,2}, {1}, {1}, {2}]]

Fig. 7. The resultant StateD for the Terminal object after integration of the three scenarios.
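The effect of the composition variables can be checked with a short simulation. The Python sketch below is an illustration under stated assumptions: the class `ScenarioGuard` is invented, the eleven sets read from Figure 7 are assumed to correspond in order to transitions T1 to T11, and the re-initialization action ra is assumed to replace sa on transitions that end a scenario (none are marked here, since the figure does not show which ones they are).

```python
class ScenarioGuard:
    """Tracks which input scenarios remain possible while a StateD executes."""

    def __init__(self, scenario_list, trans_scenario_list, ending=()):
        self.scenario_list = frozenset(scenario_list)
        self.dynamic_scenario_list = set(scenario_list)
        self.trans_scenario_list = trans_scenario_list  # index tr -> set of scenarios
        self.ending = set(ending)                       # transitions that end a scenario

    def can_fire(self, tr):
        """Condition sc: the transition belongs to a scenario that is still possible."""
        return bool(self.trans_scenario_list[tr] & self.dynamic_scenario_list)

    def fire(self, tr):
        """Apply sa (or ra on an ending transition); return False if sc fails."""
        if not self.can_fire(tr):
            return False
        if tr in self.ending:
            self.dynamic_scenario_list = set(self.scenario_list)        # action ra
        else:
            self.dynamic_scenario_list &= self.trans_scenario_list[tr]  # action sa
        return True


# Values as read from Figure 7; entry tr - 1 is assumed to belong to transition T<tr>.
TRANS = [{1, 2, 3}, {1, 2, 3}, {1, 2}, {3}, {1, 2}, {1, 2},
         {1}, {1, 2}, {1}, {1}, {2}]
trans_scenario_list = {tr: TRANS[tr - 1] for tr in range(1, 12)}


def first_blocked(sequence):
    """Return the first transition of the sequence that the guard refuses, if any."""
    guard = ScenarioGuard({1, 2, 3}, trans_scenario_list)
    for tr in sequence:
        if not guard.fire(tr):
            return tr
    return None
```

Under these assumptions, `first_blocked([1, 2, 3, 5, 8, 9, 11, 6])` returns 11: after T9 only scenario 1 remains possible, so T11, which belongs to scenario 2 alone, can no longer fire. This matches the statement that this sequence can never execute.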


Note that, in this work, we are more interested in generating specifications that capture exactly the input scenarios than in verification aspects such as coherence and completeness, which we plan to study in future work.

4 Related Work

In the area of scenario integration, most research has addressed only the integration of sequential scenarios, and few researchers have been interested in more general forms of integration. In this section, we describe some related work and point out its advantages and weaknesses.

Koskimies et al. [14] present an algorithm SMS (state machine synthesis) for synthesizing state machines (Statechart diagrams) from a set of scenarios. They address synthesis as an inductive problem, basing their algorithm on Biermann's method [2]. The main idea behind SMS is to infer a Statechart diagram able to execute all the given input traces. The SMS algorithm cannot be used for concurrent systems, and the resultant state machines present no structural information such as hierarchies.

Desharnais et al. [8] define a scenario as the union of two relations Re and RS, where Re is the relation of the environment, which captures all the possible actions of the environment, and RS is the relation corresponding to the system reaction. Scenario integration is given by the composition of the scenario relations. This approach uses the Z notation to represent scenario relations (Re and RS) and exhibits limitations similar to the ones described above. In particular, it supports neither hierarchies nor concurrency.

Glinz [10] describes an approach for composing scenarios represented by Statecharts using some operators (conditional, iterative, and concurrent). In his work, hierarchies and concurrency are well supported. He does not give a method to solve the semantic integration problems that arise when separate scenarios are brought together (scenario overlapping). Moreover, he considers a scenario as a use case, and consequently his approach is more related to use case composition than to scenario integration.

Some et al. [21] represent scenarios as timed automata. In their work, they are mostly interested in timing issues in describing scenarios. They define a scenario integration algorithm to generate a timed automaton modeling the behavior of the entire system. The generated specification does not capture hierarchy and concurrency features.

Dano et al. [6] propose a formalization of scenarios with Petri nets, and compose them using a list of temporal relations (begin at the same time, end at the same time, before, just before, etc.). Since their approach is based on Petri nets, they address the problem of concurrency between scenarios.

Citrin et al. [4] give a formalism for Temporal Message-Flow Diagrams (TMFDs), which are similar to sequence diagrams. They provide a set of tools, collectively called Cara, that support many aspects of using TMFDs in developing communicating systems, such as editing, simulation and derivation of rule-based specifications from TMFDs. As the generated specifications are rule-based, concurrency and hierarchies are not supported.


Elkoutbi and Keller [9] describe an approach based on colored Petri nets. They also address the problems of concurrency, hierarchies and scenario interleaving. However, in their work, hierarchies are limited to two levels (use case level and scenario level).

5 Discussion of Approach

Below we discuss our approach with respect to several aspects: validation of the approach, state name based integration, the interleaving problem, and the relevance of the approach in the development process.

Validation of approach

The two algorithms that form the basis of our approach have been implemented in the Java language. For validation purposes, we have adopted a textual format for scenario acquisition and for the presentation of the resulting specifications (in the absence of graphical editors for CollDs and StateDs). Note that the two algorithms have polynomial complexity. Our approach has been successfully applied to a number of examples such as the library system presented in this paper, a gas station simulator [5], an ATM (Automatic Teller Machine) system [19], and a filing system [7].

State name based integration

We have seen that our integration algorithm is based on state names. This way of integrating was chosen after studying other possibilities, such as changing the CTS algorithm [22] to make it incremental. But we found that even if we introduce additional information such as relations between scenarios, the problem of overlapping between scenarios remains unsolved. Recall that we were interested in a more general form of integration. Moreover, the information included in CollDs does not enable us to generate labelled StateDs. One solution consists in extending the notation of CollDs with the state names of objects in events. We refrained from this solution for two reasons. Firstly, we do not want to extend the UML without a major gain in expressiveness. Secondly, we think that adding object states to CollDs is counter to the spirit of such diagrams.

Problem of interleaving between scenarios

In this work, we have solved the problem of interleaving between scenarios by defining and introducing composition variables, without changing the syntax and semantics of UML Statechart diagrams. Note that interleaving is sometimes needed to capture scenarios that have not already been considered. By executing or not executing step 2 of our algorithm, one has the choice of preventing or allowing the interleaving between scenarios.

Relevance of approach in the development process. Current object-oriented CASE tools support various graphical notations for modeling a system from different views, but lack the possibility of automatic transformations between models. The incorporation of our work into such CASE tools will ease the activity of dynamic modeling. Furthermore, the incrementality of our algorithm enables us to have an iterative process for dynamic modeling. In case of changes in some scenarios that have already been integrated, new partially labelled StateDs are generated after reapplication of activities two and three of our approach. The integration algorithm (activity four) is then reapplied over the partially labelled StateDs corresponding to the unchanged scenarios and the new ones. Note that the partially labelled StateDs should always be kept, even after integration.

6 Conclusion and Future Work

The work presented here proposes a new approach based on the UML for generating system specifications from scenarios. Scenarios are acquired as collaboration diagrams. These collaboration diagrams are transformed into partial object specifications through an existing algorithm. Then, these partial object specifications are merged using a new algorithm that we have defined. The most interesting features of our approach can be summarized in two points. The first point concerns the general nature of the integrated scenarios which may exhibit concurrent behavior. The second point consists in solving the problem of interleaving between scenarios in the resultant specifications. As future work, we aim to provide automatic support for verification tasks such as coherence and completeness checking in scenarios. Furthermore, we will extend our approach in order to generate user interface prototypes from object specifications. Also, we plan to develop graphical editors for Statechart diagrams and Collaboration diagrams, to ease scenario acquisition and allow for the visualization of the generated behavior specifications.

References

1. Anderson, J.S., Durney, B.: Using Scenario in Deficiency-driven Requirements Engineering. In Requirements Engineering’93. IEEE Computer Society Press (1993) 134-141.
2. Biermann, A.W., Ramachandran Krishnaswamy, R.: Constructing programs from example computations. IEEE Transactions on Software Engineering, Vol. 2 Num. 3 (1976) 141-153.
3. Booch, G.: Object Oriented Analysis and Design with Applications. Second edition. Benjamin/Cummings Publishing Company Inc., Redwood City, CA (1994).
4. Citrin, W., Cockburn, A., Khe, J.V., Hauser, R.: Using Formalized Temporal Messageflow Diagrams. Software-Practice & Experience, Vol. 25 Num. 12 (1995) 1367-1401.
5. Coleman, D., Arnold, P., Bodoff, S., Dollin, C., Gilchrist, H., Hayes, F., Jeremaes, P.: Object-Oriented Development: The Fusion Method. Prentice-Hall, Inc. (1994).
6. Dano, B., Briand, H., Barbier, F.: An Approach Based on the Concept of Use Cases to Produce Dynamic Object-Oriented Specifications. In Proceedings of the Third IEEE International Symposium on Requirements Engineering (1997).
7. Derr, K.W.: Applying OMT: A practical step-by-step guide to using the Object Modelling Technique. SIGS Books/Prentice Hall (1996).
8. Desharnais, J., Frappier, M., Khedri, R., Mili, A.: Integration of sequential scenarios. IEEE Transactions on Software Engineering, Vol. 24 Num. 9 (1998) 695-708.
9. Elkoutbi, M., Keller, R.K.: Modeling Interactive Systems with Hierarchical Petri Nets. To appear in a High-Performance Computing Workshop of Advanced Simulation Technology Conference, Boston (1998).
10. Glinz, M.: An integrated formal model of scenarios based on StateCharts. In Fifth European Software Engineering Conference. Lecture Notes in Computer Science, Vol. 989. Springer-Verlag (1995) 254-271.
11. Harel, D.: StateCharts: A visual formalism for complex systems. Science of Computer Programming, Vol. 8 (1987) 231-274.
12. Jacobson, I., Christerson, M., Jonsson, P., Overgaard, G.: Object-Oriented Software Engineering, A Use Case Driven Approach. Addison-Wesley (1992).
13. Khriss, I., Elkoutbi, M., Keller, R.K.: A New Approach to the Synthesis of Behavioral Specifications from Scenarios. Technical Report GELO-82, Universite de Montreal, Montreal, Quebec, Canada (1998).
14. Koskimies, K., Systa, T., Tuomi, J., Mannisto, T.: Automatic support for modeling OO software. IEEE Software, Vol. 15 Num. 1 (1998) 42-50.
15. Nardi, B.A.: The Use Of Scenarios In Design. SIGCHI Bulletin, Vol. 24 Num. 4 (1992).
16. Potts, C., Takahashi, K., Anton, A.: Inquiry-Based Scenario Analysis of System Requirements. Technical Report GIT-CC-94/14, Georgia Institute of Technology (1994).
17. Rational Software Corporation: Rational Objectory Process 4.1 - Your UML Process. Santa Clara, CA (1998).
18. Rational Software Corporation, Microsoft, Hewlett-Packard, Oracle, Sterling, MCI, Unisys, ICON, IntelliCorp, i-Logix, IBM, ObjecTime, Platinum, Ptech, Taskon, Reich Technologies, Softeam: UML notation guide, version 1.1, Rational Software Corporation, Santa Clara, CA (1997).
19. Rumbaugh, J., Blaha, M., Premerlani, W., Eddy, F., Lorensen, W.: Object-oriented Modeling and Design. Prentice-Hall, Inc. (1991).
20. Rubin, K.S., Goldberg, A.: Object Behavior Analysis. Communications of the ACM, Vol. 35 Num. 9 (1992) 48-62.
21. Some, S., Dssouli, R., Vaucher, J.: Toward an automation of requirements engineering using scenarios. Journal of Computing and Information, Vol. 2 Num. 1 (1996) 1110-1132.
22. Schonberger, S., Keller, R.K., Khriss, I.: Algorithmic Support for Transformations in Object-Oriented Software Development. Technical Report GELO-83, Universite de Montreal, Montreal, Quebec, Canada (1998).

Informal Formality? The Object Constraint Language and Its Application in the UML Metamodel

Anneke Kleppe1, Jos Warmer2, and Steve Cook3

1 Klasse Objecten, Postbus 3082, 3760 DB Soest, The Netherlands [email protected]
2 IBM Netherlands, Watsonweg 2, 1423 ND Uithoorn, The Netherlands [email protected]
3 IBM UK Ltd, 1 New Square, Bedfont Lakes, Feltham, Middlesex TW14 8HB, UK [email protected]

Abstract. Within the field of object technology it is becoming recognised that constraints are a good way to produce more precise and formal specifications than with diagrams alone. Evidence of this is that UML incorporates a standard constraint language called OCL (Object Constraint Language). The availability of OCL will encourage UML users to add constraints to their UML models. This paper explains OCL and demonstrates its applicability. Probably the largest application of OCL to date was its use to define the metamodel of UML, and the experiences gained in this application are discussed.

1 Introduction

Modelling object oriented systems up to now has by-and-large been a very informal task: make some diagrams, talk about them, make more diagrams and finally decide to put it all into code. In the authors’ opinion it would be a big improvement to have models that are - even a bit - more precise than the diagrammatic sketches of systems typically used today. The Object Constraint Language (OCL) offers UML modellers a means to express accurately a lot more about what the modelled system should be than with diagrams alone, or only supported by descriptive text. OCL is a language in which one can write constraints that contain extra information about, or restrictions to, UML diagrams. Its expressions look rather similar to navigation expressions that have been proposed in OMT [1], in UML 1.0 [2], and in the object navigation language defined in [3]. However, OCL is substantially more comprehensive. The principles of OCL were based on set theory. It could be argued that OCL is a formal language, although at the time of writing no complete formal semantics exist for it. In this paper we do not address the question of whether or not OCL is a formal language. Instead we demonstrate the possibilities of OCL as a precise modelling tool in combination with the UML diagrammatic techniques, and illustrate these with several examples taken from the use of OCL to model UML itself.

J. Bézivin and P.-A. Muller (Eds.): «UML» ’98, LNCS 1618, pp. 148–161, 1999. © Springer-Verlag Berlin Heidelberg 1999

2 What Is the Object Constraint Language?

2.1 OCL in Perspective

The origin of OCL is the IBM/ObjecTime Limited proposal to the OMG’s call for an object-oriented analysis and design standard [4]. OCL was part of that proposal. Subsequently all of the proposals were merged into the final standard (UML 1.1 [5]), and as a result OCL has become part of the OMG standard for object oriented analysis and design. OCL is an optional part of this standard, in the sense that an implementation of the standard need not contain an implementation of OCL.

The roots of OCL are considerably older. In a preliminary form OCL expressions can be found in the work of Cook & Daniels [6] which itself borrowed heavily from the formal specification language Z [7]. The language Eiffel [8] was a strong influence. OCL was developed further within the IBM Insurance Industry Solutions Development Center, until it was submitted to the OMG in January ’97.

OCL is intended to be simple to read and write. Its syntax is familiar, and there are many predefined operations which simplify its use.

Let’s first consider why you should use constraints at all in object-oriented modelling. Constraints augment the visual models defined by various analysis and design methods. A constraint is like a caption to a figure, in that without it you might understand the figure fully, but the chances are high that you might not. Constraints come in different forms:

• An invariant is a constraint on a class or type that must always hold for the class to behave correctly.
• A pre-condition is a constraint that must hold before the execution of an operation.
• A post-condition is a constraint that will hold after the execution of an operation.
• A guard is a constraint on the transition of an object from one state to another.

Any of these constraints may be expressed with an OCL expression. Because an OCL expression is formal, it can be interpreted by a computer, and thus OCL constraints could be checked at “run-time” if required. However, OCL is not particularly designed for run-time checking: if it were, it would have to take into account several operational issues, such as what to do whenever a constraint fails or is undefined. The value of OCL is primarily in the act of writing it, and in the kind of simple automatic checking that can be done on the model itself, as we illustrate later in this paper.

OCL is part of the new OMG standard. As such it is the standard constraint language for UML, and to conform fully to the standard, any constraint written in the context of a UML diagram should be written in OCL. Of course the fact that something is proclaimed a standard is not necessarily a reason to use it. The next subsection will introduce OCL and - hopefully - provide enough reasons to try it out. There is not space in this paper to give a full specification of OCL; such a specification can be found in the UML submission [5] or downloaded from http://www.ibm.software.com/ad/ocl. The full specification covers several aspects of OCL not mentioned in this paper.
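As a rough illustration of the four constraint forms listed in this subsection, the following Python sketch shows where each one would be checked if it were enforced at run-time. It is an analogy only, not OCL: the class Flight, its attributes, the operations board and depart, and the exception ConstraintViolation are all hypothetical names chosen for the sketch.

```python
class ConstraintViolation(Exception):
    """Raised when one of the illustrative constraints does not hold."""


class Flight:
    """Hypothetical class, used only to illustrate the four constraint forms."""

    def __init__(self, max_passengers):
        self.max_passengers = max_passengers
        self.passengers = []
        self.state = "boarding"

    def invariant(self):
        # Invariant: must hold whenever the object is observed.
        return len(self.passengers) <= self.max_passengers

    def board(self, name):
        # Pre-condition: must hold before the operation executes.
        if not len(self.passengers) < self.max_passengers:
            raise ConstraintViolation("pre-condition of board() failed")
        size_before = len(self.passengers)
        self.passengers.append(name)
        # Post-condition: must hold after the operation has executed.
        if len(self.passengers) != size_before + 1:
            raise ConstraintViolation("post-condition of board() failed")

    def depart(self):
        # Guard: the transition boarding -> departed is taken only if it holds.
        if self.state == "boarding" and self.passengers:
            self.state = "departed"
            return True
        return False
```

Raising an exception is just one possible answer to the operational question of what to do when a constraint fails; OCL itself deliberately leaves that question open.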

2.2 Use of Constraints with UML Diagrams

As defined in the UML 1.1 standard, OCL constraints are always coupled to a UML diagram. This could be a class diagram, a state transition diagram, or any other diagram defined in the standard. Because few readers will already be familiar with OCL, we will explain the constructs that can be used, with the help of the example class diagram in figure 1.

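[Figure: UML class diagram. Classes: Airport (name: String; determineTax(): Integer); Flight (departTime: Time; /arrivalTime: Time; duration: Time; maxnrPassengers: Integer; depart(); arrive()); Airline (name: String; nationality: String); Passenger (name: String; age: Integer; gender: {male, female}; needsAssistance: Boolean; checkIn()). Associations: Airport (role origin) and Flight (role departing flight, multiplicity *); Airport (role destination) and Flight (role arriving flight, multiplicity *); Airline and Flight (multiplicity *); Flight and Passenger (multiplicity *, {ordered}).]

Fig. 1. Class diagram for simple airline domain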

To emphasise the independent identity of every object in an object-oriented system, every constraint that can be written for a UML diagram is always linked to a certain context object. From this object, which is called the constraint context or for short the context, we can ‘view’ the objects in its environment and express constraints for them. Whenever the constraint is not written in a note-box in the UML diagram but, as in this paper, in a separate document, the context is by convention indicated by putting the name of the class of the context underlined before the constraint. The next statement is an example of an invariant on every instance of class Passenger.

Passenger
    age > 0

In OCL the special word self is used to denote the context object. So the constraint on Passenger can also be written as

Passenger
    self.age > 0

Normally, as here, self is obvious and can be omitted, but this is not always the case.
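To make the role of the context object concrete, here is a minimal Python sketch of the same invariant. It is only an analogy: the dataclass Passenger stands in for the class of figure 1, the functions age_invariant and check_all are hypothetical helpers, and the sample passengers are invented for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Passenger:
    """Simplified stand-in for the Passenger class of figure 1."""
    name: str
    age: int


def age_invariant(self: Passenger) -> bool:
    # Counterpart of the OCL invariant `self.age > 0`; the parameter
    # plays the role of the context object.
    return self.age > 0


def check_all(instances) -> bool:
    # An invariant has to hold for every instance of the context class.
    return all(age_invariant(p) for p in instances)


# Invented sample data: the second passenger violates the invariant.
passengers = [Passenger("Ms. Jansen", 34), Passenger("Mr. Smit", 0)]
print([age_invariant(p) for p in passengers])  # [True, False]
print(check_all(passengers))                   # False
```

The parameter is named self on purpose, to mirror the OCL keyword; in OCL the context is given once, before the constraint, rather than passed as an argument.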


This example also shows the common basic types like Integer, Real, Boolean and String that can all be used in OCL expressions. Most predefined operators on these types are well known and will not be explained in this paper. Two operators may however need some clarification: the implies and the if-then-else operators.

The implies operator takes two Boolean expressions and combines them. The combined expression is true if, when the first expression is true, the second expression is true also. If the first expression is false the combined expression is always true no matter what the value of the second expression is. Take as example again an invariant on instances of the Passenger class:

Passenger
    age >= 90 implies needsAssistance = true

If the age of the passenger object is less than 90 the invariant is never broken, but if the age is 90 or more the attribute ‘needsAssistance’ of Passenger must be true.

The second operator that may need explanation is the if-then-else. When evaluating an OCL expression the type of the result must always be clear. Therefore the else-part of an if-then-else expression may never be omitted. It is always mandatory that the type resulting from the then-part is equal to the type resulting from the else-part of the expression, as the next example illustrates. Note that the value of the enumeration type is indicated by the prefix ‘#’.

Passenger
    if gender = #male then
        name.substring(1,3) = "Mr."
    else
        name.substring(1,3) = "Ms."
    endif

For these operators, and for other logical operators such as or, and, etc., whenever a sub-expression does not affect the overall answer, that sub-expression does not need to have a defined value. That is, the operators are not “strict”.

More interesting types are the types and classes defined in the UML model. In the diagram in figure 1 a class called ‘Time’ is assumed to be present in the model. From the context of Flight all attributes and query operations defined for this class may be used. A query operation is an operation that does not alter any values in the complete system. Its result is merely a statement of the current values. Suppose, for the sake of example, that the Time class has an operation called ‘after’ to compare two times, which returns true when the called object represents a point in time that lies after the parameter object. Then the next constraint could be an invariant on instances of the Flight class:

Flight
    arrivalTime.after(departTime)

This invariant states that the arrival time of any flight is a point in time later than the time of departure. Note that when evaluating this constraint the state of the system is not altered.

From one constraint context we may express constraints on other objects associated with the context object. To indicate the associated object we use the rolename of that object in the association, or when the rolename is not present the name of the class in lower case letters. One of the airport objects associated with the context object of class Flight can be indicated using ‘origin’, the other may be indicated with ‘destination’. To forbid round-trip flights the following invariant simply states that the ‘origin’ object must be different from the ‘destination’ object.

Flight
    origin <> destination

All attributes and query operations of associated objects may be used in constraints too. The following example expresses the fact that all flights must be destined to the Paris Orly airport.

Flight
    destination.name = "Paris Orly"

If in our example the operation ‘determineTax’ in the class ‘Airport’ is a query operation, we can state that the total of taxes for every flight to be paid to the airport of origin and the destination airport may not exceed a certain amount, in this case 2000.

Flight
    origin.determineTax() + destination.determineTax() < 2000

Often, associations in the UML diagram specify a one-to-many or many-to-many relationship. In those cases using a rolename in an OCL expression will result not in a single object but in a collection of objects. OCL has a predefined type called ‘Collection’ to deal with these cases. In the following constraint the number of incoming flights at any airport is limited to 100. The operation ‘size’ used here is one of the standard operations on collections. To indicate the use of a collection operation versus the use of an operation defined in the UML model an arrow is used instead of a dot between the collection and the operation.

Airport
    self.arrivingFlight->size < 100

The Collection type in OCL is the supertype of Set, Bag and Sequence types. An instance of the Set type is a mathematical set. Bag represents collections in which an element may be present more than once. Sequence represents an ordered Bag. In the next constraint the operation ‘notEmpty’ is used on a set. This constraint states that every airline must have some flight objects associated with it.

Airline
    self.flight->notEmpty

The following example states that if the name of the airline is ‘KLM’ then all flights will depart from ‘Amsterdam’. In this constraint ‘self.flight’ represents a set. When we use the collect operation on this set to collect all airport objects associated through the ‘origin’ link the result is a bag. Each ‘origin’ airport may be present a number of times in the collection.


Airline
    (self.name = "KLM") implies
        self.flight->collect( origin )->forAll ( name = "Amsterdam" )

With this example we can introduce a convenient syntactic shortcut. When a collect appears in an expression in such a way that the need for the collect can be inferred unambiguously from the context, the collect may be omitted. In this case, the example can be re-written as:

Airline
    (self.name = "KLM") implies
        self.flight.origin->forAll ( name = "Amsterdam" )

Lastly, an example of a sequence:

Flight
    self.passenger->select( needsAssistance )->size < 10

Here ‘self.passenger’ results in a sequence. From this sequence all elements for which the attribute ‘needsAssistance’ is true are selected. The constraint as a whole states that the number of these may not exceed 10.

2.3 Characteristics of OCL

OCL is rich in expressive power. Using OCL it is possible to specify a great deal more information about a model than by just using the visual notations of UML. Without OCL a visual model, to be precisely understood, must be surrounded by paragraphs of natural language explaining all of the additional information about the structure and behaviour of the modelled system which cannot be expressed by the diagrams alone. Such rules as “a pilot may only fly an aircraft for which he/she has received appropriate training”, which are almost impossible to represent visually, are straightforward to represent and may be specified with absolute precision in OCL. In some cases, natural language alone may be sufficient to specify such additional information accurately and concisely. However, natural language is often ambiguous. OCL is never ambiguous, and the act of creating an OCL specification can show up areas of inconsistency or incompleteness which the act of writing in natural language will most likely leave undiscovered.

It might perhaps be proposed that a solution to the specification of this information might be with program code in a common programming language such as Java. There are several difficulties with this idea. Firstly, the effort required to write accurate program code greatly exceeds the effort to write OCL constraints. Secondly, using program code over-specifies the behaviour, requiring the programmer to make all kinds of design decisions which are not appropriate at the time the specification is created. Thirdly, program code is a very poor way to express simple structural invariants. And finally, specifications written using program code are hard to read.

OCL is tightly coupled to UML diagrams. Every OCL expression is directly linked to a UML diagram. The expression states a fact about the model-elements on the diagram, and correspondingly the model-elements may be used in the OCL expression. This synergy between visual and formal textual specification has great advantages over the use of either technique alone. An experienced modeller can look at a UML diagram and understand within a few moments the objects and the simple relationships between them. OCL allows the modeller to “drill down” progressively into the meaning from the diagram, adding detail and removing ambiguity to the desired degree of precision. The diagram alone is relatively weak in expressive power, as we have observed; the textual specifications alone do not provide a simple overview of the meaning, and because they do not indicate what needs to be understood first, can be hard to grasp. It is the combination of the two that provides the real power. Furthermore, because the linkage is formal, automatic consistency checking can be applied between the diagrams and the OCL statements.

OCL is free of side-effects. Operations used in OCL expressions may not have any effect on the state of the modelled system. OCL expressions just express unchanging facts about the structure and behaviour of the modelled system. If operators with side-effects were provided it would be much more difficult to understand a specification, because to understand any OCL expression the modeller would have to reason about the overall consequences of the side-effects, which might be complicated and hard to understand. Without side-effects each expression can be fully understood locally in its context.

OCL has a familiar ‘look and feel’. OCL is essentially predicate logic applied to object models. However, modellers can easily be put off by the traditional vocabulary of logic which uses unfamiliar and perhaps intimidating symbols such as ∀, ∃, ⇒ etc. One of the design goals of OCL was to provide the semantic equivalent of these symbols in a more digestible form, using operators such as forAll, exists, implies, includes and so on. The syntax of OCL is very familiar for a person used to Smalltalk, C++ or Java, and can be learnt quite easily by somebody used to programming notations.

OCL has a large number of predefined operations. The OCL specification provides the following built-in types: Real, Integer, String, Boolean, Enumeration, Collection, Set, Bag, Sequence. Also OclAny is the implicit supertype of all modelled types, and OclType is the type of all modelled types (giving access to the meta-level of OCL). Each of these types comes with a large number of pre-defined operations, for example Set has 22 predefined operations: {size, includes, count, includesAll, isEmpty, notEmpty, sum, exists, forAll, iterate, union, =, intersection, -, including, excluding, symmetricDifference, select, reject, collect, asSequence, asBag}. This extensive vocabulary means that the modeller can express model constraints without the need to define common basic operations.
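For readers who think in programming notations, several of the predefined collection operations have close counterparts in Python. The sketch below re-expresses three of the airline constraints from section 2.2 in that way. It is an analogy under stated assumptions, not a definition of OCL semantics: the dataclasses are simplified stand-ins for figure 1, and the helper names (implies, has_flights and so on) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Airport:
    name: str


@dataclass
class Passenger:
    name: str
    needs_assistance: bool = False


@dataclass
class Flight:
    origin: Airport
    destination: Airport
    passengers: list = field(default_factory=list)  # ordered, like a Sequence


@dataclass
class Airline:
    name: str
    flights: list = field(default_factory=list)


def implies(a: bool, b: bool) -> bool:
    # Counterpart of `implies`. Python evaluates both arguments before
    # the call, so this helper is strict, unlike the OCL operator.
    return (not a) or b


def has_flights(airline: Airline) -> bool:
    # self.flight->notEmpty
    return len(airline.flights) > 0


def klm_departs_from_amsterdam(airline: Airline) -> bool:
    # (self.name = "KLM") implies
    #     self.flight->collect(origin)->forAll(name = "Amsterdam")
    origins = [f.origin for f in airline.flights]  # collect: duplicates are kept
    return implies(airline.name == "KLM",
                   all(a.name == "Amsterdam" for a in origins))  # forAll


def few_assisted_passengers(flight: Flight) -> bool:
    # self.passenger->select(needsAssistance)->size < 10
    selected = [p for p in flight.passengers if p.needs_assistance]  # select
    return len(selected) < 10  # size
```

A list is used for every collection here for simplicity; a closer model would use a Python set where OCL yields a Set and keep lists for Bag and Sequence.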


OCL is based on a formal approach. At the time of writing, OCL has no complete formal semantics defined. However, OCL is based on well-known logical and set-theoretic concepts, and we see no difficulty in principle in giving it complete formal semantics if this were seen as necessary. Such an exercise would flush out any remaining inconsistencies in the OCL definition.

OCL may be extended further. Various extensions to OCL are possible, and have been proposed by several authors. The following have been suggested and could be considered for a future release of the standard:

• Declaring values to be constant, i.e. unchanging over time.
• Declaring values to be only increasing or decreasing over time.
• Declaring values to be unique within a given set of objects.
• Introducing let-clauses with subsidiary variables within OCL expressions (although as observed elsewhere in this paper, introducing additional operators into the model makes these superfluous).

3 Use of OCL in the UML Metamodel

When UML 1.0 was presented at the January 1997 meeting of the OMG, one of the first things mentioned by one of its authors was that the UML designers would like to use OCL in the next version of UML. The text of the UML 1.0 specification, although carefully written in precise English, lacked the rigour needed for an unambiguous interpretation. The IBM/ObjecTime Limited proposal demonstrated that OCL was an effective way to increase the precision of such a specification. When IBM joined the UML core team, it was agreed that OCL would be used to help specify the structure of UML.

The structure of UML 1.1 is described in several packages. The package specification is structured in separate sections:

• Abstract syntax, described as a visual UML class model. This shows the metaclasses, their attributes and relationships. The model is supported by a natural language definition of each metaclass, attribute and relationship.
• Well-formedness rules, constructed using OCL. Each well-formedness rule is accompanied by some explanatory English text.
• Semantics, which describes the meaning of the metaclasses in natural language.
• Standard elements describing stereotypes of the metaclasses defined above.
• Notes describing rationale for choices, examples etc.

The well-formedness rules describe a set of invariants for each UML metaclass. Each instance of the metaclass must satisfy these invariants to be meaningful. The well-formedness rules add specific constraints to the visual model. To help the reader in understanding the invariants, each OCL rule has a textual explanation. For example the metaclass Interface is specified in UML 1.1 using the following three well-formedness rules:


Interface

[a] An Interface can only contain Operations.
    self.allFeatures->forAll(f | f.oclIsKindOf(Operation))

[b] An Interface cannot contain any Classifiers.
    self.allContents->isEmpty

[c] All Features defined in an Interface are public.
    self.allFeatures->forAll ( f | f.visibility = #public )

The UML specification contains 137 OCL invariants and is, as far as the authors are aware, the biggest application of OCL up to now. About ten people were involved in writing the OCL expressions for UML 1.1. A clear choice was made at the start to provide the team with guidelines on the style to be used in the OCL constraints. The style guidelines are:

• Always use self, even though it could be left out.
• Always use an iterator in collection operations like select, forAll etc. (The 'f' in the above well-formedness rules [a] and [c] is an iterator.)
• Use the shortcut for collect.
• Use so-called "additional operations" to simplify writing OCL. An additional operation is an extra attribute of the metaclass, which is defined purely for the purpose of specifying OCL constraints. These operations are therefore not shown in the graphical models. The use of these additional operations renders the need for variables and “let” clauses in OCL superfluous.

3.1 Simplifying the UML Metamodel

The use of OCL influenced the structure of the metamodel of UML in a number of places. As a result of using OCL, the metamodel could be made more generic with features being promoted to (abstract) superclasses. The restrictions on the specific subclasses were specified using OCL constraints. For example, the Namespace metaclass acts as a generic container that can own all kinds of ModelElements. This ownership is represented by the operation allContents, which results in the set of owned ModelElements. Namespace has many subclasses, most of which are containers as well. These subclasses have specific restrictions on the kind of things they are allowed to contain. For example the metaclasses Class, Interface and Datatype are (indirect) subclasses of Namespace and the restrictions on their contents are stated in the following invariants in the UML specification:

• A Class can only contain Classes, Associations, Generalizations, UseCases, Constraints, Dependencies, Collaborations, and Interfaces as a Namespace.
    self.allContents->forAll(c | c.oclIsKindOf(Class) or
                                 c.oclIsKindOf(Association) or
                                 c.oclIsKindOf(Generalization) or
                                 c.oclIsKindOf(UseCase) or
                                 c.oclIsKindOf(Constraint) or
                                 c.oclIsKindOf(Dependency) or
                                 c.oclIsKindOf(Collaboration) or
                                 c.oclIsKindOf(Interface))

• A DataType cannot contain any other ModelElements.
    self.allContents->isEmpty

• An Interface cannot contain any Classifiers.
    self.allContents->isEmpty ¹
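The design idea, a generic container whose subclasses only add restrictions, can be sketched in Python as follows. This is an illustration under assumptions rather than the UML metamodel itself: the class hierarchy is heavily simplified, the intermediate class Classifier and the method name well_formed are choices made for the sketch, and the list of kinds allowed inside a Class is abbreviated.

```python
class ModelElement:
    """Simplified stand-in for the UML metaclass of the same name."""


class Namespace(ModelElement):
    """Generic container: it may own any kind of ModelElement."""

    def __init__(self, contents=()):
        # Plays the role of the additional operation allContents.
        self.all_contents = list(contents)

    def well_formed(self) -> bool:
        # The generic container places no restriction on its contents.
        return True


class Classifier(Namespace):
    """Assumed intermediate class between Namespace and its subclasses."""


class Association(ModelElement):
    pass


class Constraint(ModelElement):
    pass


class Interface(Classifier):
    def well_formed(self) -> bool:
        # self.allContents->isEmpty
        return len(self.all_contents) == 0


class DataType(Classifier):
    def well_formed(self) -> bool:
        # self.allContents->isEmpty
        return len(self.all_contents) == 0


class Class(Classifier):
    def well_formed(self) -> bool:
        # self.allContents->forAll(c | c.oclIsKindOf(Class) or ...)
        # Abbreviated list; the rule in the text names eight kinds.
        allowed = (Class, Association, Constraint, Interface)
        return all(isinstance(c, allowed) for c in self.all_contents)
```

Here isinstance plays the role of oclIsKindOf. No separate container class is needed per kind of content; each subclass inherits the container and states its own restriction.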

Another example of simplification is the metamodel for state transition models. The metaclass StateVertex has one or more incoming and outgoing Transitions. Therefore its subclass Pseudostate has those as well. However, the well-formedness rules for Pseudostate place restrictions on the incoming and outgoing transitions based on the value of the meta-attribute 'kind' of Pseudostate.

[Figure: UML class diagram. Classes: StateVertex; Pseudostate (kind: PseudostateKind); Transition; Guard; Event. Role names on the associations: source, target, outgoing, incoming, guard, trigger; multiplicity shown: *.]

Fig. 2. Meta-model for UML state transition models

Constraints on the presence of incoming and outgoing transitions are specified with OCL, allowing Pseudostate and Transition to be defined in a very generic way. Without these constraints, there would have been a need for different subclasses of Transition and different subclasses of Pseudostate to specify the (non-) presence of associated objects. This would have made the metamodel much more complex. The following well-formedness rules from the UML specification express the restrictions:

¹ In this example, taken literally from the UML 1.1 specification, the text and OCL appear not to match, or at least raise a question about their mutual consistency. In fact the OCL is correct, which again underlines the message of this paper.


PseudoState

[a] An initial vertex can have at most one outgoing transition and no incoming transitions.
    (self.kind = #initial) implies
        ((self.outgoing->size <= 1) and (self.incoming->isEmpty))

[b] A final pseudo state cannot have outgoing transitions.
    (self.kind = #final) implies (self.outgoing->isEmpty)

[c] History vertices can have at most one outgoing transition.
    ((self.kind = #deepHistory) or (self.kind = #shallowHistory)) implies
        (self.outgoing->size <= 1)

[d] A join vertex must have at least two incoming transitions and exactly one outgoing transition.
    (self.kind = #join) implies
        ((self.outgoing->size = 1) and (self.incoming->size >= 2))

[e] A fork vertex must have at least two outgoing transitions and exactly one incoming transition.
    (self.kind = #fork) implies
        ((self.incoming->size = 1) and (self.outgoing->size >= 2))

[f] A branch vertex must have one incoming transition segment and at least two outgoing transition segments with guards.
    (self.kind = #branch) implies
        ((self.incoming->size = 1) and
         ((self.outgoing->size >= 2) and
          self.outgoing->forAll(t | t.guard->size = 1)))
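To make the reading of rules [a] to [f] concrete, the following is a minimal Python sketch that evaluates them over an in-memory pseudostate. The classes Transition and Pseudostate and the function violations are hypothetical stand-ins for the metaclasses, not part of the UML specification or of any tool; a guard is simplified to a boolean flag.

```python
from dataclasses import dataclass, field


@dataclass
class Transition:
    # Simplification (assumption): a guard is present or absent.
    has_guard: bool = False


@dataclass
class Pseudostate:
    kind: str
    incoming: list = field(default_factory=list)
    outgoing: list = field(default_factory=list)


def violations(ps):
    """Return the labels of the rules [a]-[f] that the pseudostate violates."""
    n_in, n_out = len(ps.incoming), len(ps.outgoing)
    # "p implies q" is written as "not p or q".
    rules = {
        "a": ps.kind != "initial" or (n_out <= 1 and n_in == 0),
        "b": ps.kind != "final" or n_out == 0,
        "c": ps.kind not in ("deepHistory", "shallowHistory") or n_out <= 1,
        "d": ps.kind != "join" or (n_out == 1 and n_in >= 2),
        "e": ps.kind != "fork" or (n_in == 1 and n_out >= 2),
        "f": ps.kind != "branch" or (
            n_in == 1 and n_out >= 2 and all(t.has_guard for t in ps.outgoing)
        ),
    }
    return [label for label, holds in rules.items() if not holds]
```

A single generic Pseudostate class plus six small predicates replaces what would otherwise be one subclass per kind, which is the simplification the text describes.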

A conclusion from this is that use of OCL can lead to simplified visual models. The visual models will then fulfil the purpose of specifying the basic structure, to which the OCL constraints add more specific information.

3.2 Recursive Definition of Features

Properties cannot be defined recursively in visual UML. At several places in the UML metamodel there was a need for recursion. For example in the metaclass GeneralizableElement the 'supertype' is defined by an additional operation, as shown below. The additional operation 'allSupertypes' recursively defines all the supertypes of the GeneralizableElement.


[a] The operation supertype returns a Set containing all direct supertypes.
    supertype : Set(GeneralizableElement);
    supertype = self.generalization.supertype

[b] The operation allSupertypes returns a Set containing all the GeneralizableElements inherited by this GeneralizableElement (the transitive closure), excluding the GeneralizableElement itself.
    allSupertypes : Set(GeneralizableElement);
    allSupertypes = self.supertype->union(self.supertype.allSupertypes)

Such a recursive definition should be read as a logical equation where the solution is the smallest set satisfying the equation. Note that the alternative to define 'allSupertypes' would be some cryptic English text. The well-formedness rules for GeneralizableElement can now use the operations defined above, as can be seen in [c] below.

GeneralizableElement

[a] A root cannot have any Generalizations.
    self.isRoot implies self.generalization->isEmpty

[b] No GeneralizableElement can have a supertype Generalization to an element which is a leaf.
    self.supertype->forAll(s | not s.isLeaf)

[c] Circular inheritance is not allowed.
    not self.allSupertypes->includes(self)

[d] The supertype must be included in the Namespace of the GeneralizableElement.
    self.generalization->forAll(g | self.namespace.allContents->includes(g.supertype))
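The "smallest set satisfying the equation" reading of allSupertypes can be illustrated by computing the transitive closure iteratively. This is only a sketch under stated assumptions: the class GeneralizableElement below is a hypothetical Python stand-in holding its direct supertypes in a list, and the function names are ours.

```python
class GeneralizableElement:
    def __init__(self, name, supertypes=()):
        self.name = name
        # Direct supertypes, i.e. the result of the operation 'supertype'.
        self.supertypes = list(supertypes)


def all_supertypes(element):
    """Least solution of allSupertypes = supertype U supertype.allSupertypes."""
    result = set()
    frontier = list(element.supertypes)
    while frontier:
        s = frontier.pop()
        if s not in result:          # visit each element once, so cycles terminate
            result.add(s)
            frontier.extend(s.supertypes)
    return result


def has_circular_inheritance(element):
    """Negation of rule [c]: the element appears among its own supertypes."""
    return element in all_supertypes(element)
```

Because the set only grows until nothing new is reachable, the computation terminates even on an ill-formed model with an inheritance cycle, which is exactly the case rule [c] is meant to detect.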

3.3 Specifying UML Metamodel Variants Using OCL

A specific use of OCL is the possibility to attach constraints to stereotypes. The meaning of an OCL invariant which is attached to a stereotype, is that the model element which is labelled with the stereotype needs to fulfil the constraint. For example we could introduce a UML stereotype called <>, with constraint "visibility = #public", since we are only interested in public features during analysis. Whenever the stereotype <> is applied to a feature in a UML model, this constraint applies. The above technique allows one to specialise the UML metamodel by defining a collection of stereotypes with attached invariants.

3.4 Tool Availability

During the final few weeks before submitting UML 1.1 an OCL Parser, developed by IBM, became available. Checking all the well-formedness rules with the parser revealed about eighty errors. Many of those errors were due to the fact that names of attributes and association ends had been changed a number of times. Another group of errors consisted of erroneous OCL, where the author of the constraint had not written down his intention correctly. Those errors often revealed questions of semantics. Answering them helped make the UML specification clearer and more consistent.

Although the checks on naming problems did not seem very important at first glance, they were actually of great importance. We all know about the documentation problems in software development: how to keep the different items of documentation up to date and consistent with each other. Natural language documentation is inherently difficult to maintain, because every change to another part of the documentation has to be checked manually against the natural language text. Having documentation in the form of a precise language with a well-defined syntax means that we can automate these consistency checks and verify whether the documentation is internally consistent. This advantage was not expected when we started the UML specification, but turned out to be considerable.

Taking the previous point further, the use of OCL will enable automated impact analysis. Whenever a change is made to a UML model, the constraints that make use of the specific part of the model can be found. These are the constraints that are potentially invalidated by the change and need attention.
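The idea of impact analysis can be approximated very crudely by a textual search: find every constraint that mentions the name that changed. The sketch below is an illustration only; it is not the IBM parser, it does no type checking, and the function names and the constraint labels are assumptions made for the example.

```python
import re


def referenced_names(ocl_text):
    """All identifier-like tokens occurring in an OCL expression."""
    return set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", ocl_text))


def impacted(constraints, changed_name):
    """Labels of the constraints that mention changed_name as a whole token."""
    return [
        label
        for label, text in constraints.items()
        if changed_name in referenced_names(text)
    ]


rules = {
    "GeneralizableElement[a]": "self.isRoot implies self.generalization->isEmpty",
    "GeneralizableElement[c]": "not self.allSupertypes->includes(self)",
    "GeneralizableElement[d]": "self.generalization->forAll(g | "
                               "self.namespace.allContents->includes(g.supertype))",
}
```

Renaming the association end generalization would flag rules [a] and [d] for review. A real tool would resolve names against the metamodel rather than match tokens, so this sketch can both over- and under-report.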

4. Conclusion

OCL is a significant step towards improving the engineering of object-oriented software. It may be a small step, as of course OCL itself must develop and grow as time goes by, but the use of a precise and expressive constraint language integrated with UML diagrams is definitely an improvement of the software development process. Using OCL in the specification of the UML 1.1 metamodel proved to be beneficial. During the creation of the metamodel it sharpened the minds of the people involved. Presently it fulfils an important role in answering questions about the structure of UML and in clarifying the intent of the metamodel. Without the use of OCL constraints the UML metamodel would lack the precision needed for such a standard.

References

1. James Rumbaugh, Michael Blaha, William Premerlani, Frederick Eddy, William Lorensen, “Object-oriented modeling and design”, Prentice-Hall, 1991.
2. OMG document ad/97-01-14: UML 1.0 Proposal
3. Michael Blaha and William Premerlani, “Object-oriented modeling and design for database applications”, Prentice-Hall, 1998.
4. OMG document ad/97-01-18 IBM/ObjecTime Limited joint submission for AD RFP1
5. OMG documents ad/97-08-02 through ad/97-08-11: UML 1.1 Proposal
6. Steve Cook and John Daniels, “Designing Object Systems: Object-oriented modelling with Syntropy”, Prentice-Hall, 1994.
7. John B. Wordsworth, “Software Development with Z”, Addison-Wesley, 1992.
8. Bertrand Meyer, “Object-oriented software construction”, Prentice-Hall, 1988.
9. Desmond D’Souza and Alan Cameron Wills, “Objects, Components, and Frameworks with UML: the Catalysis Approach”, forthcoming.
10. Kevin Lano and Howard Haughton (eds), “Object-oriented specification case studies”, Prentice-Hall, 1994.
11. Derek Coleman, Patrick Arnold, Stephanie Bodoff, Chris Dollin, Helena Gilchrist, Fiona Hayes, Paul Jeremaes, “Object-oriented development: the Fusion method”, Prentice-Hall, 1994.
12. Jos Warmer and Anneke Kleppe, “The Object Constraint Language: precise modeling with UML”, Addison-Wesley, 1998.

Reflections on the Object Constraint Language

Ali Hamie, Franco Civello, John Howse, Stuart Kent, Richard Mitchell

Distributed Information Systems Research Group, IT Faculty, University of Brighton, Brighton BN2 4GJ, UK.
http://www.biro.brighton.ac.uk/, a.a.hamie@brighton.ac.uk

Abstract. The Object Constraint Language (OCL), which forms part of the UML set of modelling notations, is a precise, textual language for expressing constraints that cannot be shown diagrammatically in UML. This paper reflects on a number of aspects of the syntax and semantics of the OCL, and makes proposals for clarification or extension. Specifically, the paper suggests that: the concept of flattening collections of collections is unnecessary, state models should be connectable to class models, defining object creation should be made more convenient, OCL should be based on a 2-valued logic, set subtraction should be covered more fully, and a "let" feature should be introduced.

1 Introduction

The Object Constraint Language [12] is a precise, textual language designed to complement the largely graphical UML [11]. Specifically, OCL supports the expression of invariants, preconditions and postconditions, allowing the modeller to define precise constraints on the behaviour of a model, without getting embroiled in implementation details. OCL is the culmination of recent work in object-oriented modelling [1, 2, 3, 8] which has selected ideas from formal methods to combine with diagrammatic, object-oriented modelling, resulting in a more precise, robust and expressive notation. Syntropy [1] extended OMT [13] with a Z-like textual language for adding invariants to class diagrams and annotating transitions on state diagrams with preconditions and postconditions. Catalysis [2, 3] has done something very similar. OCL adopts a simple non-symbolic syntax and restricts itself to a small set of core concepts.

One of the most important aspects of OCL is that it is part of the Unified Modelling Language, which has recently become a standard modelling language, under the auspices of the Object Management Group. As a result, it is likely to get much greater exposure and use than previously proposed formal specification languages such as VDM [9] and Z [14], and work invested in ensuring that it is correct and appropriate for its purpose is therefore more likely to reap a dividend than work on the aforementioned languages. However, the OCL is an optional part of UML specifications.

The purpose of this paper is to contribute to discussions on the correctness and appropriateness of OCL. We identify a number of issues which, in our opinion, need to be resolved; where possible we suggest a solution, or at least an outline direction for further investigation.

The paper is organised as follows. Section 2 deals with navigation in object-oriented modelling, in particular navigating from collections. Section 3 considers object states. Section 4 considers object creation and the feature allInstances. Section 5 looks at the issue of undefined values. Section 6 proposes adding more collection operations. Section 7 suggests allowing local definitions. Section 8 briefly summarises the issues examined and proposes that future semantics work on OCL be driven by the needs of CASE tool builders and users.

J. Bézivin and P.-A. Muller (Eds.): «UML» ’98, LNCS 1618, pp. 162–172, 1999. © Springer-Verlag Berlin Heidelberg 1999

2 Navigation in OO Modelling

Navigation in OO modelling means following links from one object to locate another object or a collection of objects. It is possible to navigate across many links, and hence to navigate from a collection to a collection. Navigation is at the core of OCL. OCL expressions allow us to write constraints on the behaviour of objects identified by navigating from the object or objects which are the focus of the constraint. At the specification level, the expressions appear in invariants, preconditions and postconditions. In this section we review some of the issues concerning the meaning of navigation expressions, and outline a semantics for them which takes account of these issues. We conclude by examining what the OCL specification says about navigation expressions and suggest that the notion of flattening collections of collections is not needed.

2.1 Example Model

Figure 1 presents a small, contrived example of a class model in UML for a simple system that supports scheduling of offerings of seminars to a collection of attendees by presenters who must be qualified for the seminars they present. A full description of the notation can be found in [11] and a distilled description can be found in [4].

[Figure: class diagram; recoverable labels: SeminarSchedulingSystem, name: String, is-cancelled: Boolean, date: Date.]

Figure 1: A class diagram for a seminar scheduling system

2.2 Navigating from single objects

Navigation expressions start with an object, which can be explicitly declared or given by a context. For example, a declaration such as s : Seminar means that s is a variable that can refer to an object taken from the set of objects conforming to type Seminar. Here, the type name is used to represent the set of objects in the model that conform to the type. A navigation expression is written using an attribute or role name, and an optional parameter list. Given the earlier declaration, the OCL expression s.title represents the value of the attribute title for the object represented by s. An OCL expression can also use the name self to refer to a contextual instance. In the following example, self refers to an instance of Seminar.

self.title

Navigating from an object via an association role can result in a single object or a collection, depending on the cardinality annotations of the association role. A collection is, by default, a set. For example, given the declaration p : Presenter, the expression p.qualifiedFor results in the set of seminars p is qualified to present. The association between Seminar and Offering has the annotation {ordered} on the offering role. As a result, the expression s.offering, where s is a seminar, results in a sequence. Notice that this means that the operator "." is overloaded, because it can map from an object to a set, to a bag, or to a sequence.

2.3 Navigating from collections

Assume we have the declaration p : Presenter. The OCL navigation expression p.qualifiedFor.title (which is an abbreviation of the expression p.qualifiedFor->collect(title)) involves navigating first from a single object and then from a collection, namely the set of seminars for which presenter p is qualified. This is because the expression parses as (p.qualifiedFor).title. The result of this expression is obtained by applying title to each member of the set p.qualifiedFor. Similarly, navigating from a bag yields a bag and navigating from a sequence yields a sequence (but see Section 2.4). This means that every property (attribute or association role) must, in general, be applicable to a set, a bag or a sequence, and this can be seen in terms of overloading of the navigation operators. For example, within the model of Figure 1, we have the following overloaded versions of the "_.name" and "_.date" operators (the symbol "_" indicates the position of the argument):

_.name : Presenter → String
_.name : Set(Presenter) → Bag(String)
_.name : Bag(Presenter) → Bag(String)

_.date : Offering → Date
_.date : Sequence(Offering) → Sequence(Date)

Hence, the OCL expressions p.name, (p.qualifiedFor).name, (p.qualifiedFor->asBag).name, and (s.offering).date are well-typed. The operator asBag converts a set or a sequence into a bag.

The overloaded versions of the operator _.property (property is an attribute or association role) must satisfy the axioms:

Set{}.property = Bag{}
(s->including(e)).property = (s->excluding(e).property)->including(e.property)

Bag{}.property = Bag{}
(b->including(e)).property = (b.property)->including(e.property)

Sequence{}.property = Sequence{}
(q->including(e)).property = (q.property)->including(e.property)
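The axioms above can be read operationally: applying a property to a collection applies it element by element, turning a set into a bag and preserving bags and sequences. The Python sketch below illustrates that reading for single-valued properties only. The mapping of OCL collections onto Python types (set, collections.Counter as bag, list as sequence) and the names Presenter and navigate are assumptions made for the example.

```python
from collections import Counter, namedtuple

# Hypothetical stand-in for the Presenter class of Figure 1.
Presenter = namedtuple("Presenter", ["ident", "name"])


def navigate(collection, prop):
    """Apply a single-valued property to every element of a collection."""
    if isinstance(collection, (set, frozenset)):
        # Set -> Bag: distinct elements may share a property value.
        return Counter(prop(e) for e in collection)
    if isinstance(collection, Counter):
        # Bag -> Bag: multiplicities are carried over.
        result = Counter()
        for element, count in collection.items():
            result[prop(element)] += count
        return result
    # Sequence -> Sequence: order is preserved.
    return [prop(e) for e in collection]
```

Navigating from a set has to yield a bag rather than a set: two presenters with the same name contribute that name twice, which is what the second axiom requires.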

Intuitively, these axioms define that applying property to a collection yields a second collection, obtained by applying property to each element of the original collection. The property can be an attribute or an association role. In the axioms, s is a set, b is a bag and q is a sequence, e is some element. Here e.property returns a single element; we can give similar axioms for the case where e.property returns a collection.

OCL specifies navigation from collections by using the feature collect, which takes a collection and an expression as arguments and yields a collection obtained by applying the expression to each element in the collection. When the type of the expression is also a collection then the result can be seen as a collection of collections. According to the OCL documentation, a collection of collections is automatically flattened. Such a view is easy to teach to modellers, but hard to define without falling into traps. For instance, a well-defined function will satisfy

x = y implies f(x) = f(y)

where x and y are values and f is a function. Consider the following OCL navigation expression.

sss.presenter->collect(qualifiedFor)

where sss is an object of type SeminarSchedulingSystem. The first part of the expression

sss.presenter

yields a set of presenters. The full expression, without flattening, yields a bag of sets of seminars, such as

Bag{ Set{s1, s2}, Set{s2, s3} }

With flattening, the full expression yields a bag of seminars, such as

Bag{ s1, s2, s2, s3 }

In the flattening step, no elements are lost or gained (we just lose structure). The two expressions above are of types Bag(Set(Seminar)) and Bag(Seminar), respectively. Thus, any well-defined function we wish to specify on elements of type Bag(Seminar) will not apply to elements of type Bag(Set(Seminar)), unless we specify it in various overloaded forms. There would be as many overloaded forms as there are possible levels of structure in the model. If, instead, OCL defined the result of navigating via collections simply in terms of left-to-right parsing, there would be no need for any concept of flattening. For instance,

sss.presenter.qualifiedFor.offering

is parsed as

((sss.presenter).qualifiedFor).offering

whose meaning can be found by repeated application of navigation from one collection to another. Each application of navigation yields a collection, which is the source of the next navigation. This does not entail building a collection of collections of collections and then flattening it.
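A small Python sketch may help to show why left-to-right navigation never needs a nested collection. Each step maps a bag of objects directly to a bag of objects. The dictionaries qualified_for and offering, the string identifiers and the function step are hypothetical sample data and names, chosen to reproduce the Bag{ s1, s2, s2, s3 } example above.

```python
from collections import Counter


def step(bag, role):
    """Navigate from a bag via a collection-valued role, yielding a bag."""
    result = Counter()
    for obj, count in bag.items():
        for target in role(obj):
            # Each source occurrence contributes each of its targets once.
            result[target] += count
    return result


# Hypothetical links: presenter -> seminars, seminar -> offerings.
qualified_for = {"p1": {"s1", "s2"}, "p2": {"s2", "s3"}}
offering = {"s1": ["o1"], "s2": ["o2", "o3"], "s3": []}

presenters = Counter(["p1", "p2"])                      # sss.presenter
seminars = step(presenters, lambda p: qualified_for[p]) # (...).qualifiedFor
offerings = step(seminars, lambda s: offering[s])       # (...).offering
```

At no point is a bag of sets constructed, so no flattening operation has to be defined; the intermediate result seminars is already the bag containing s2 twice.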

2.4 Navigating from sequences

According to the OCL document, navigating from a sequence yields another sequence. For example, given the declaration s : Seminar, the expression s.offering results in the sequence of offerings for seminar s. The expression s.offering.attendee results in the sequence of attendees for all offerings of seminar s. The value of this expression is obtained by applying the association role attendee to each element of the sequence s.offering. This results in a sequence of sets which is then flattened to give the desired sequence. However, there are many ways to flatten a sequence of sets, which would result in different sequences. OCL does not indicate how such collections of collections are flattened. In addition, there are situations where it is not appropriate to get a sequence when navigating from a sequence. For example, given a seminar s we would be more interested in the bag of all attendees for all offerings of s rather than in the (underspecified) sequence.
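The underspecification can be seen in a few lines of Python. The sample data and the two flattening orders below are assumptions chosen for illustration; nothing in the OCL document selects either of them.

```python
from collections import Counter
from itertools import chain

# Hypothetical value of s.offering.attendee before flattening:
# a sequence (one entry per offering) of sets of attendees.
attendees_per_offering = [{"ann", "bob"}, {"bob", "cy"}]

# Two equally plausible flattenings: each set is unordered, so any
# order of its elements is as good as any other.
flatten_ascending = list(
    chain.from_iterable(sorted(s) for s in attendees_per_offering)
)
flatten_descending = list(
    chain.from_iterable(sorted(s, reverse=True) for s in attendees_per_offering)
)

# The bag of attendees does not depend on the choice.
bag_of_attendees = Counter(flatten_ascending)
```

The two lists differ as sequences but agree as bags, which supports the suggestion that a bag is the more natural result here.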

3 States

In object-oriented modelling, class diagrams can be supplemented by state diagrams. A state diagram for a given object type shows the possible states an object of this type can be in, together with the transitions that move an object from one state to another. A state diagram contributes to the behavioural specification of a type in a model. An object state is an abstraction of its detailed property values. Figure 2 shows a state diagram of Offering with two states, Scheduled and Cancelled, meaning that an offering of a seminar can be scheduled or cancelled but not both.

There are several ways of connecting class diagrams and state diagrams. One approach is taken by Syntropy [1], which amounts to treating states as dynamic subtypes, so that an object can move from one type to another. A second approach is to treat states as if they were boolean attributes in class diagrams. In UML it is not clear how to connect class diagrams and state diagrams, and OCL does not clarify the issue.

If UML allows states to be represented as dynamic subtypes on a class diagram then the OCL feature oclIsKindOf can be used to assert that an object is in a given state. For example, we could use o.oclIsKindOf(Scheduled) to assert that offering o is in the state Scheduled. If states are represented as boolean attributes then the corresponding attributes could be used to represent states in OCL. For example the expression p.Scheduled would be true if p is in state Scheduled, and false otherwise. These state-model attributes can be related to other properties by means of invariants. For example, the state Cancelled in Figure 2 can be related to the attribute goingAhead in Figure 1 by an obvious invariant. Yet another way would be to introduce a function in with the signature:

[Figure: state diagram for Offering with states Scheduled and Cancelled and the event cancel().]

Figure 2: A state diagram for a seminar offering

_in_ : Presentation, State → Boolean

where (p in Scheduled) is true if p is in state Scheduled, and false otherwise, and where State would be an enumerated type of object states. From the point of view of using OCL, the mapping to boolean attributes is, perhaps, the easiest to explain to modellers. However, from the point of view of providing an integrated semantics for UML, treating states as dynamic types might be the most elegant approach: substating then has the same semantics as inheritance, dynamic classes in class diagrams are just states in state diagrams, there can be associations targeted and sourced on states (dynamic classes), and so on (see footnote 1). Whichever approach is chosen, it should be clear to modellers how the names of states can be defined in terms of class model properties, and how they can be used in OCL expressions.
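Two of the readings discussed above, states as boolean attributes and a two-argument "in" function over an enumerated State type, can be sketched side by side in Python. The class Offering, the enumeration State and the function in_state are hypothetical names for this illustration; the dynamic-subtype reading is not shown.

```python
from enum import Enum


class State(Enum):
    SCHEDULED = "Scheduled"
    CANCELLED = "Cancelled"


class Offering:
    def __init__(self):
        self.state = State.SCHEDULED

    def cancel(self):
        # The single transition of Figure 2.
        self.state = State.CANCELLED

    # Boolean-attribute reading: one derived attribute per state.
    @property
    def scheduled(self):
        return self.state is State.SCHEDULED

    @property
    def cancelled(self):
        return self.state is State.CANCELLED


def in_state(obj, state):
    """Function reading: (obj in state) as a boolean-valued function."""
    return obj.state is state
```

Both readings are derived from the same underlying value, so the invariant "exactly one of Scheduled and Cancelled holds" is satisfied by construction in this sketch; in a model where the attributes are independent it would have to be stated explicitly.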

4 Object creation

OCL provides a type operation allInstances, which delivers a set of all instances of a given type. For example, Presentation.allInstances would be a set of all instances of type Presentation in the model at a given point in time. Although the italicised condition is not explicitly covered in the OCL documentation, it has been inferred from a private communication on object creation with Jos Warmer, one of the authors of the OCL. In general, for a given type T, the meaning of T.allInstances is the set of all elements of type T at some moment in the life of a model containing type T. The set T.allInstances can change as a result of creation operations associated with the type T.

One use of allInstances is in the postcondition of an operation specification to assert that an object has been created. In the example system, one result of executing an operation schedule is the creation of a new offering. In order to assert that a new offering o is created, we need to assert that it did not exist prior to executing the operation but does exist after executing the operation. We can use the allInstances operation, as follows:

(Offering.allInstances - Offering.allInstances@pre)->includes(o)

Footnote 1: Note that this semantics is not necessarily in accordance with the semantics of state diagrams as currently described in the UML 1.1 documentation. Discussion of the relationship between these two approaches appears in [10].


where Offering.allInstances@pre is the set of offerings that existed in the model prior to executing schedule. Asserting that a new object has been created is such a common thing to do that we propose the introduction of a limited number of convenient abbreviations. Here are two candidates.

post: self.seminar.offering->exists(o : Offering |
          (Offering.allInstances - Offering.allInstances@pre)->includes(o) and
          o.seminar = s and
          o.date = d and
          o.attendee->isEmpty and
          o.presenter->isEmpty and
          o.goingAhead)

Figure 3: Specification of operation schedule

The first recognises that asserting creation in a postcondition often involves saying "there is a new object o of type T and it has the following properties". For example, in the model of Figure 1, the postcondition of an operation to schedule a new presentation of a seminar is given in Figure 3. Loosely, this begins by saying that after the schedule operation there exists an offering which was not in the set of offerings before the operation, and continues by defining four properties of the new offering (seminar, date, attendee and presenter). This is such a common idiom that a combined operator to assert existence and newness would be useful, as in Figure 4.

post: self.seminar.offering->existsNew(o : Offering |
          o.seminar = s and
          o.date = d and
          o.attendee->isEmpty and
          o.presenter->isEmpty and
          o.goingAhead)

Figure 4: Alternative specification of operation schedule

Now the newness is captured in the operator and the body of the quantified expression concentrates on defining what properties the new object should have. Our second candidate for a convenient operator associated with creation is inspired by the allInstances operator. An operator newInstances, as in, for example,

Offering.newInstances

could be used in postconditions to mean exactly those instances of type Offering that did not exist in the pre-state. The Catalysis method [3] has something similar. We see no harm in having several overlapping ways to talk about new objects.
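Both proposed abbreviations reduce to set difference between the instances after and before the operation. The following Python sketch shows that reduction under stated assumptions: instances are modelled as hashable tuples, the pre-state is passed in explicitly as a set, and the function names new_instances and exists_new are ours, not OCL syntax.

```python
def new_instances(pre, post):
    """T.newInstances: instances present after the operation but not before,
    i.e. T.allInstances - T.allInstances@pre."""
    return post - pre


def exists_new(pre, collection, predicate):
    """collection->existsNew(o | predicate(o)): some element of the
    collection did not exist in the pre-state and satisfies the predicate."""
    return any(o not in pre and predicate(o) for o in collection)


# Hypothetical offerings, modelled as (seminar, date) pairs.
before = {("s1", "1998-06-03")}
after = before | {("s1", "1998-06-04")}
```

With this data, new_instances(before, after) contains only the offering dated 1998-06-04, and exists_new is true for a predicate selecting that date but false for one selecting the offering that already existed.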


5 Undefined Values

The OCL document [12] (p7) admits the possibility that some expressions may be undefined when evaluated. Having an undefined value could be important for a number of purposes. It could serve as the result of an illegal operation such as dividing by zero; or, as indicated in the OCL definition (p15), when asking for the property of an object that has been destroyed in the postcondition of an operation; or for the @pre property of one that has just been created; or when type casting (p6). In addition, an undefined value could be used to stand for a non-terminating computation such as an infinite loop.

Several approaches have been used in other languages to deal with undefined expressions. One approach is to regard undefined expressions as being unknown or underspecified. In this case the result of, for instance, dividing 1 by 0 is an integer but its value is unknown. This is similar to declaring a variable of a given type: the variable has a value of the declared type, but the precise value is unknown. In this approach, boolean expressions are either true or false, resulting in a two-valued logical system. It is the approach generally adopted in classical mathematics, which admits only total functions, and in some formal specification languages, such as the Larch Shared Language [5].

Another approach is to include a special value ⊥ to denote that something is undefined. If the logical connectives are treated as boolean functions then the undefined value propagates into logical expressions. For example, b and ⊥ = ⊥. This results in a 3-valued logic, as in, for instance, VDM.

Yet another approach, adopted by Z, is to maintain the distinction between logical operators and expressions. Undefined expressions are interpreted as meaningless, that is, they do not denote anything in the interpretation domain. Since logical expressions are not treated as expressions within the language, their truth values are unknown if they involve undefined expressions.

In OCL, expressions can be undefined. However, it is not clear from the documentation what is meant by being undefined. One possibility is that undefined is not interpreted as unknown. Let ⊥ stand for the undefined value. According to OCL, if a subexpression of an expression evaluates to undefined then the whole expression is undefined. The only exceptions to this are:

true or ⊥ = true
⊥ or true = true
false and ⊥ = false
⊥ and false = false

that is, true OR-ed with anything is true, and false AND-ed with anything is false. With other boolean operations we deduce the following:

false implies ⊥ = true
⊥ implies true = true
not(⊥) = ⊥

The boolean operations agree with the classical logical connectives on the ordinary truth values, i.e., true and false. However, when ⊥ is involved they reflect a model of computation which is mainly strict. For example, with the operation not, if the argument is undefined then the whole expression is undefined, that is to say not is strict in its argument. The operation or, however, is not strict in either the first or the second argument. In addition we have the following axiom:

⊥ or ⊥ = ⊥

which implies that the law of excluded middle does not always hold, that is, a boolean expression can be true, false or undefined. (From the definition of b implies b2, i.e., not(b) or (b and b2), given on p24 of the OCL document, we could deduce that ⊥ implies true = ⊥, which is not consistent with either 2-valued or 3-valued logic. However, this definition is probably erroneous and should have been not(b) or b2.)

There is one place in OCL where undefinedness definitely is not required: when navigating over an optional association (cardinality 0..1). By forcing the result of navigation to be a set, the equivalent of a 'null' or 'nil' reference is the empty set (and similarly for optional attributes). Thus 'null' does not correspond to an undefined value.

Both 2-valued and 3-valued logics have advantages. However, we would suggest that OCL be based on a 2-valued logic, for the following reasons. If the logic is to be used for specifying properties without reasoning about partial functions, 2-valued logic seems appropriate and simpler. In addition, reasoning with 3-valued logic is harder because of the absence of some logical laws, e.g., the law of excluded middle. We would suggest that an understanding of 3-valued logic is not required by users, so perhaps references to 3-valued logic are an unnecessary complication if practitioners are the audience.
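The deductions in this section can be checked mechanically with a small three-valued evaluator. The Python sketch below represents the undefined value by None, which is an encoding chosen for the example; the function names are ours. It compares the definition of implies quoted from p24 with the corrected definition suggested above.

```python
def not3(a):
    """Strict negation: not(undefined) = undefined."""
    return None if a is None else (not a)


def and3(a, b):
    """false AND-ed with anything is false; otherwise undefined propagates."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def or3(a, b):
    """true OR-ed with anything is true; otherwise undefined propagates."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def implies_p24(b, b2):
    """The definition quoted from the OCL document: not(b) or (b and b2)."""
    return or3(not3(b), and3(b, b2))


def implies_fixed(b, b2):
    """The suggested correction: not(b) or b2."""
    return or3(not3(b), b2)
```

The two definitions agree on true and false but differ when the antecedent is undefined and the consequent is true: the quoted definition yields undefined, the corrected one yields true.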

6 Completing the set of collection operators

In its current form, the Object Constraint Language contains an includes operation, as in p.qualifiedFor->includes(s), which says that seminar s is an element of the set p.qualifiedFor (the set of seminars presenter p is qualified to present), but there is no p.qualifiedFor->excludes(s). Perhaps more importantly, there is p.qualifiedFor->includesAll(p1.qualifiedFor), saying that the set p1.qualifiedFor of seminars is a subset of p.qualifiedFor, but no p.qualifiedFor->excludesAll(p1.qualifiedFor). Instead the latter has to be expressed using the rather cumbersome expression:

(p.qualifiedFor->intersection(p1.qualifiedFor))->isEmpty

There is, however, an operation p.qualifiedFor->excluding(s), and the set subtraction operator "-" found in traditional mathematical notation. We suggest that the set of operations on collections could be extended so that the inclusive operators all have their exclusive counterparts.
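The proposed exclusive counterparts are straightforward to define in terms of membership. As an illustration, here is a minimal Python sketch over built-in sets; the function names excludes and excludes_all mirror the proposed OCL operations and the sample sets are hypothetical.

```python
def excludes(collection, element):
    """collection->excludes(element): the element is not a member."""
    return element not in collection


def excludes_all(collection, other):
    """collection->excludesAll(other): no element of other is a member."""
    return all(e not in collection for e in other)


# Hypothetical values of p.qualifiedFor and p1.qualifiedFor.
p_qualified_for = {"s1", "s2"}
p1_qualified_for = {"s3"}
```

For sets, excludes_all(a, b) is equivalent to the cumbersome form "the intersection of a and b is empty", which is the equivalence the text relies on.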

7 Local definitions

In VDM [9], "let" expressions have the following syntax:

let (x = expr : oclExpression) in (expr1 : oclExpression) end : expr1.evaluationType

The value of a let expression is obtained by evaluating expression expr and then using the result in the evaluation of expr1. This is equivalent to expr1[expr/x] (the expression expr1 with expr substituted for x). Let expressions are useful when the same expression needs to be used a number of times in the same assertion. This is particularly true when long navigation expressions are combined with operators on collections to identify particular sets of objects. Then having to repeat such expressions several times is cumbersome, and can obscure the meaning of the overall assertion. We therefore recommend that some form of local definition mechanism be included.

8 Further work

In this paper we have considered some issues related to the OCL language. We believe that the ideas we have presented about navigation should be tested by including them in a proper formal semantics for OCL.

With regard to object states, we have commented on the fact that there is a problem in UML with the integration of state and class diagrams, and no attempt has been made to resolve this in OCL. We have sketched some approaches to providing an integrated semantics. However, there is semantic work to be done here, too. For instance, the approach based on dynamic subtypes is at odds with the (informally described) semantics provided as part of the UML 1.1. In particular, it takes no account of events and requires the restriction that all transitions must be atomic and at the same level of granularity to be lifted. We believe that work in this area is crucial if UML is to proceed any further, especially when one considers that UML-RT (Real Time) is likely to provide us with yet another possible semantics for state diagrams and, at least initially, seems to be taking a "bolt on" rather than "integrative" approach. In general, the integration of the UML notation set, including OCL, needs attention.

We have highlighted a range of approaches in the formal methods literature for dealing with undefinedness. We do not believe this issue can be resolved without providing a formal semantics for OCL, and the way it is resolved will depend on the semantics approach taken. We believe that a semantics should be built for a purpose, which in our view should be to support CASE tools for reasoning about and checking the integrity of models specified using UML and OCL.

Acknowledgements This work was supported by funds from the UK EPSRC under grant number GWK67304.

References

1. Cook, S., Daniels, J.: Designing Object Systems: Object-Oriented Modelling with Syntropy. Prentice Hall, UK (1994)
2. D'Souza, D., Wills, A.: Extending Fusion: practical rigor and refinement. In R. Malan et al., OO Development at Work, Prentice Hall (1996)
3. D'Souza, D., Wills, A.: Objects, Components and Frameworks with UML: The Catalysis Approach. Addison-Wesley, to appear 1998. Draft and other related material available at http://www.trireme.com/catalysis
4. Fowler, M., Scott, K.: UML Distilled. Addison-Wesley (1997)
5. Guttag, J., Horning, J.: Larch: Languages and Tools for Formal Specifications. Springer-Verlag (1993)
6. Hamie, A., Howse, J., Kent, S.: Navigation Expressions in Object-Oriented Modelling. Lecture Notes in Computer Science, Vol. 1382. Springer-Verlag (1998) 123-137
7. Hamie, A., Howse, J., Kent, S.: Compositional Semantics for Object-Oriented Models. In Duke, D. and Evans A., editors, 3rd Northern Formal Methods Workshop, electronic Workshops in Computing, UK, Springer-Verlag (1998)
8. Meyer, B.: Eiffel the Language. Prentice Hall (1992)
9. Jones, B. C.: Systematic Software Development using VDM. Prentice Hall (1990)
10. Kent, S.: UML: What does it all mean? 1 day tutorial at ETAPS'98, Lisbon, Portugal. Notes available from http://www.it.brighton.ac.uk/staff/Stuart.Kent (1998)
11. Rational Software Corporation: The Unified Modeling Language Version 1.1. Available from http://www.rational.com (1997)
12. Rational Software Corporation: The Object Constraint Language Specification, Version 1.1. Available from http://www.rational.com (1997)
13. Rumbaugh, J., Blaha, M., Premerlani, W., Eddy, F., Lorensen, W.: Object-Oriented Modelling and Design. Prentice Hall (1991)
14. Spivey, M.: The Z Notation. 2nd ed. Prentice Hall, UK (1992)

On Using UML Class Diagrams for Object-Oriented Database Design
Specification of Integrity Constraints

Yongzhen Ou

University of Konstanz Department of Computer Science, Fach D188, D-78457 Konstanz, Germany [email protected]

Abstract. In the course of object-oriented software engineering, UML class diagrams are used to specify the static structure of the system under study, such as classes, types and various kinds of static relationships among them. Objects of the persistent classes can be stored in object-oriented databases or in relational databases. In the former case, the UML class diagrams are actually used for conceptual object-oriented database designs. However, the standard UML class diagram lacks the ability to specify some inherent integrity constraints, such as keys and uniqueness, for object-oriented databases. This paper proposes an extension to the UML metamodel, i.e., the introduction of two new model elements (Key and IConstraint) and some new attributes to the existing metamodel, to accommodate additional features for constraint specification. On the model level, a compartment CONSTRAINT of the class notation and some property strings for displaying the integrity constraints are added. The database design is then mapped to the extended ODMG-ODL schema definition.

Keywords: Conceptual Data Modeling, Integrity Constraints, Object-Oriented Database Design

1 Introduction

The Unified Modeling Language (UML) [16,18] has become the de facto industry standard for object modeling. In the process of object-oriented analysis and design, a model and a few diagrams are produced. The model contains all of the underlying elements of information about a system under consideration, while diagrams capture different perspectives, or views, of the system model. A class diagram shows types, classes, and their relationships. It is the backbone of a software system. The objects of the persistent classes of a model can be stored in object-oriented databases or in relational databases. In the former case, a UML class diagram can be regarded as a conceptual model of an object-oriented database design.

The author thanks the anonymous reviewers and Prof. M. H. Scholl for their helpful comments.

J. Bézivin and P.-A. Muller (Eds.): «UML»’98, LNCS 1618, pp. 173–188, 1999. © Springer-Verlag Berlin Heidelberg 1999

A conceptual object-oriented data model represents the concepts in the domain under study. This model should be transformed to a database design (represented by a database schema definition), which can be implemented in an object-oriented database (OODB) system. An OODB, like a relational database, is supposed to serve as a repository of “correct” data. The accuracy or correctness of the data in the database is referred to as the integrity of the database [6]. During the last two decades, a lot of research has been conducted on the identification, specification, generation, and maintenance of integrity, both in the context of relational and object-oriented databases. Some examples are [3,8,19,13,12,4].

Database integrity can be enforced by integrity rules (or constraints). In relational databases, Date [6] identifies four categories of rules classified by their “scope”, namely domain rules, attribute rules, relation rules and database rules. These rules specify the legal values for a given domain, a given attribute, a given relation, and a given database, respectively. The specification of integrity rules falls naturally within the responsibility of the database design and not of the database applications.

In the context of OODBs, according to the above scheme, integrity rules can also be classified into four categories: domain rules, attribute rules, class rules, and database rules. However, by virtue of object orientation, the domain rules and attribute rules are represented and maintained “for free” in an OODB by the class hierarchy and the type system [13]. Therefore, only class rules and database rules need to be specified. Class rules apply to the objects of a given class only, while database rules apply to objects from two or more distinct classes. In case an object belongs to more than one class (especially in the superclass/subclass relationship), a database rule becomes a class rule.
Moreover, rules that interrelate two or more classes can generally be transformed into class rules that may reference objects from other classes. As a result, we only deal with class rules in this paper. A similar approach is also taken in [13], where the integrity constraints are classified into intra-object constraints that apply within an object and inter-object constraints that apply across objects. In this paper, we assume that each object is associated with its most specific class, hence an object only belongs to one class. Taking this approach, we achieve the specification of class rules and database rules by adding attribute-level and class-level constraints to a UML class diagram.

Though the UML class diagrams offer many concepts to describe the structure of a database model, they lack the ability to specify some inherent integrity constraints for object-oriented databases. This work proposes two ways to specify integrity constraints in a class. One is to use a property string to specify the constraints on the attribute level. The other is to add a compartment CONSTRAINT to the class notation to accommodate the specification of class-level constraints.

The layout of this paper is as follows. Section 2 proposes an extension to the UML metamodel for the specification of integrity constraints, followed by a discussion of specifying integrity constraints in UML class diagrams in Sect. 3. Section 4 presents an extension to the ODMG Object Model to accommodate constraint specification, and Sect. 5 gives the detailed procedure of deriving an
ODMG database design from a class diagram with deﬁnitions of constraints. Finally, Sect. 6 discusses some important issues regarding database design and concludes the work.

2 Extending the UML Metamodel for the Specification of Integrity Constraints

In UML, a constraint can be specified for any model element by putting a text string inside braces ({}); this constraint can then be attached to the constrained element. Such a constraint is very general in purpose: it can be written in natural language or in a formal language such as the UML OCL (Object Constraint Language) [17]. A constraint can be a note or a comment as well. In this work, however, we restrict the constraints to database integrity constraints.

The UML metamodel itself offers some mechanisms to specify integrity constraints. For example, the attribute multiplicity of metaclass Attribute can be used to define whether the attribute is optional or mandatory, single-valued or multivalued. Let the multiplicity of an attribute be m..n with n >= m, m >= 0, and n > 0. If m = 0, the attribute is optional (may be NULL or a default value); and if m = 1, it is mandatory. If n = 1, the attribute is single-valued; and if n > 1, it is multivalued. The properties of an AssociationEnd, such as aggregation, isNavigable, multiplicity, and qualifier, can also be used to specify integrity constraints.

However, the core UML metamodel does not cover the specification of integrity constraints in general. For example, the uniqueness constraint, which requires that every object of a certain class have a unique value for some attribute, cannot be specified in the UML models. Though such a constraint can be formulated by using OCL, it is more convenient for database designers to specify the constraint in the modeling process with an explicit language device (or a single mouse click in a graphical tool), as in many Entity-Relationship modeling tools. Other general constraints related to classes should be definable in the structure-modeling phase as well.

To accommodate the specification of integrity constraints, we introduce two new model elements and some attributes to the core package of the UML metamodel:

1. New model element Key. Key is introduced as a subclass of ModelElement that can be associated with Constraint and is subject to Namespace. A class consists of zero or more Keys, and a Key is composed of one or more Attributes and/or AssociationEnds.
2. New model element IConstraint. IConstraint is a subclass of Feature. It is used to state user-defined integrity constraints for a Class. It has a Name attribute, which names the constraint. A class consists of zero or more IConstraints.
3. New attribute multiplicity for Class. The multiplicity of a class states the possible number of objects that may be maintained in the extent of the class. The default value is 0..*.
4. New attributes isUnique and isKey for Attribute. Both isUnique and isKey have Boolean as their type with default value false. If isUnique is true, then every object of the class has a unique value for this attribute. If isKey is true, then every object of the class can be uniquely identified by this attribute. isKey implies isUnique but not necessarily vice versa (a unique attribute may take the NULL value but a key must not).
5. New attribute rfIntegrity for AssociationEnd. rfIntegrity stands for referential integrity, which requires the existence of the target object referenced by a source object. When placed on a target end, it specifies the policy that can be used to enforce referential integrity [13]:
   (a) abort: the deletion of a referenced object is disallowed;
   (b) cascade: the deletion of a referenced object causes the deletion of the referencing object;
   (c) nullify: the deletion of a referenced object causes the deletion of the reference in the referencing object.
6. New attribute rlIntegrity for AssociationEnd. rlIntegrity stands for relational integrity, which enforces the consistency of a binary relationship. It is useful only in the case of a bi-directional association. When placed on a source end, it specifies the action to be taken when the relational integrity is violated [13]:
   (a) abort: abort the transaction;
   (b) cascade: fix the reverse reference.

After the extension, the changed part (in italic) of the metamodel is shown in Fig. 1.
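The three rfIntegrity policies can be made concrete with a small simulation. The following Python sketch is an assumption-laden illustration rather than part of the proposal: referencing objects are modeled as dictionaries with a `"ref"` entry, and the names `delete_referenced` and `IntegrityError` are hypothetical.

```python
class IntegrityError(Exception):
    """Raised when the abort policy rejects a deletion."""


def delete_referenced(target, referencing, policy):
    """Delete `target` and apply an rfIntegrity policy.

    `referencing` is a list of dicts, each with a "ref" entry.
    Returns the list of surviving referencing objects.
    """
    holders = [o for o in referencing if o["ref"] is target]
    if policy == "abort":
        # The deletion is disallowed while references exist.
        if holders:
            raise IntegrityError("object is still referenced")
        return referencing
    if policy == "cascade":
        # Referencing objects are deleted together with the target.
        return [o for o in referencing if o["ref"] is not target]
    if policy == "nullify":
        # Only the reference is removed; the referencing object survives.
        for o in holders:
            o["ref"] = None
        return referencing
    raise ValueError(f"unknown policy: {policy}")
```

The sketch only models the effect of each policy on the referencing objects; transaction handling is left out.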

3 Specifying Constraints in Conceptual Modeling Using UML Class Diagrams

Nowadays, the Entity-Relationship (ER) model is the most widely used approach for conceptual data modeling [1]. Since its introduction by Chen [5], many extensions have been proposed to it. There exist a number of variants of Enhanced (or Extended) Entity-Relationship (EER) models. One of those EERs is defined in [9]. During the last two decades, many tools have been developed to support conceptual modeling based on ER. ERwin [10] from Logic Works and DB-MAIN [7] from the University of Namur, Belgium, are two of the ER CASE tools. Although the ER models can also be used to aid OODB design, they are very imprecise in certain respects and lack the ability to specify a number of details, especially integrity details [6]. In [15], the OMT (Object Modeling Technique) is developed as an enhanced form of ER with new concepts (such as qualification). The UML class diagram is derived from OMT with some renaming and extensions. With the extension we made to the UML metamodel in the above section, we can use the UML class diagrams to model various aspects of a domain of interest, including integrity rules.

Fig. 1. Extended UML Core Package

3.1 Objects

In a UML class diagram, objects with similar structure, behavior, and relationships are grouped together into a class. The structure of a class is defined by its attributes and the behavior of a class by its operations. In addition to operations, some integrity constraints can be specified in the class to control the behavior of the instances of the class. The relationships among objects may be defined by association, generalization, and dependency, which are the subject of the next subsection. For example, an Employee class is shown in Fig. 2. In the class diagram, the constraints attached to the class name and to attribute names are displayed as property strings. Other constraints concerning all objects of the class (within a class extent) are stated in a separate compartment following the operation compartment. The property string 0..100 in the upper right corner of the Employee class indicates that only up to 100 objects are allowed to be maintained in the class extent. The property strings 1..1 and 0..1 specify the cardinalities of the associated attributes as mandatory single-valued and optional single-valued, respectively. A mandatory attribute may not be NULL (or hold the default value in some cases), while an optional one may. The property string 0..3 attached to attribute phone indicates that an employee can have up to three phone numbers. The uniqueness and key specification of a single attribute can be done either on the attribute level (as a property string) or on the class level (as an explicit constraint), while the uniqueness and key of composite attributes can only be specified by explicit constraints. We offer both facilities for the sake of convenience to the modeler but only show them in the constraint compartment in this paper. Moreover,
those attributes with a minimum cardinality of 1 are collected together in a not_null list and shown in the constraint compartment. This should be done automatically by a corresponding tool implementing our proposal. In the constraint compartment, the key constraint specifies the attribute id as a key of the class extent, while the unique constraint enforces that no two Employee objects have the same name and the same birthday values. Since both name and birthday are specified as mandatory attributes as well, their combination can also serve as a key. The CS1 constraint states that all employees must be older than 18 and younger than 65 years. The CS2 constraint says that the years-of-schooling of an employee must be at least 5 less than the age. Both CS1 and CS2 are intra-object constraints in the sense of [13], which only apply to a single Employee object. However, the unique constraint is an inter-object constraint that refers to all objects of the class. Note that in the constraint compartment, we only show the constraints but no actions specifying the intended behavior in the case of violation of the constraints. Actions are either the default of aborting the transaction or a user-defined procedure that may be specified in the modeling phase or later in the design phase. In the class diagram, we can choose to hide such details to improve its readability.

Fig. 2. The Employee class

The syntax for key and uniqueness specification is the same except for the difference in keywords (key or unique). We hence only show the syntax for key in the following in BNF (similar to the ODL of ODMG [2], but we allow roles as a part of a key):

<key_spec> ::= key: <key_list>
<key_list> ::= <key> | <key>, <key_list>
<key> ::= <property_name> | (<property_list>)
<property_list> ::= <property_name> | <property_name>, <property_list>
<property_name> ::= <attribute_name> | <role_name>
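To make the key grammar tangible, the following Python sketch parses a key specification into a list of property-name tuples. It is a minimal illustration under assumptions: the function name `parse_key_spec` is hypothetical, property names are assumed to be plain identifiers, and the grammar is read as allowing comma-separated names inside parentheses.

```python
import re


def parse_key_spec(text, keyword="key"):
    """Parse `key: k1, (p1, p2), ...` into a list of property-name tuples."""
    head, sep, body = text.partition(":")
    if sep != ":" or head.strip() != keyword:
        raise ValueError(f"expected '{keyword}:' prefix")
    keys = []
    # A key is either a parenthesised property list or a single name,
    # followed by a comma or by the end of the input.
    token = re.compile(r"\s*(?:\(([^()]*)\)|([A-Za-z_]\w*))\s*(,|$)")
    pos = 0
    while True:
        m = token.match(body, pos)
        if m is None:
            raise ValueError(f"malformed key list at offset {pos}")
        if m.group(1) is not None:
            names = tuple(n.strip() for n in m.group(1).split(","))
            if not all(re.fullmatch(r"[A-Za-z_]\w*", n) for n in names):
                raise ValueError("malformed property list")
            keys.append(names)
        else:
            keys.append((m.group(2),))
        pos = m.end()
        if m.group(3) == "":
            return keys
```

Since the unique specification differs only in its keyword, the same function can be reused with `keyword="unique"`.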

The syntax for specifying constraints in UML class diagrams is:

constraint_name : constraint_condition[: action]

The general constraints may be expressed using any ODMG OQL predicates, or any OQL queries of type Boolean. In this case, the transformation of constraints from UML class diagrams to ODMG ODL is straightforward. As Gogolla and Richters [24] pointed out, the UML-OCL can also be used to specify constraints and queries. However, according to the same reference, the present concepts of OCL and their semantics need to be improved, because the interpretation of OCL is partly incomplete and in some cases inconsistent.
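The constraints of the Employee example can be read as predicates over single objects (intra-object) and over the whole class extent (inter-object). The Python sketch below illustrates this reading. Since Fig. 2 is not reproduced here, the attribute names `age` and `years_of_schooling`, the use of strings for birthdays, and the function name `violations` are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Employee:
    id: int
    name: str
    birthday: str
    age: int
    years_of_schooling: int
    phone: list = field(default_factory=list)  # multiplicity 0..3


# Intra-object constraints: each is checked on a single object.
INTRA = {
    "CS1": lambda e: 18 < e.age < 65,
    "CS2": lambda e: e.years_of_schooling <= e.age - 5,
    "phone 0..3": lambda e: len(e.phone) <= 3,
    "not_null": lambda e: None not in (e.id, e.name, e.birthday),
}


def violations(extent, max_objects=100):
    """Return the names of all constraints violated by a class extent."""
    failed = [n for n, ok in INTRA.items() if not all(ok(e) for e in extent)]
    # Inter-object constraints need the whole extent.
    if len({e.id for e in extent}) != len(extent):
        failed.append("key")
    if len({(e.name, e.birthday) for e in extent}) != len(extent):
        failed.append("unique")
    if len(extent) > max_objects:
        failed.append("multiplicity 0..100")
    return failed
```

The sketch reports violations only; the action to take (abort or a user-defined procedure) is deliberately left open, as in the constraint compartment.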

3.2 Relationships

A relationship is a semantic connection among classes. There exist several different kinds of relationships in UML, namely association, generalization, dependency, and derived element. In this section, we only discuss the first two relationships.

Associations. Besides the fact that a UML association is more expressive w.r.t. constraints than a relationship in ER models, a UML association that links objects of classes together is basically a relationship of the ER model. The associations which only relate two (not necessarily distinct) classes are called binary associations. Those associations relating more than two classes are called n-ary associations. An association path may have an association name and/or an association class as adornments. The following properties (among others) may be defined for an association end: multiplicity, ordering, qualifier, navigability, aggregation indicator, role name, rfIntegrity, and rlIntegrity. If a qualifier is defined, the association is called a qualified association. If an aggregation indicator is available, the association is an aggregation. Depending on the navigability, associations can be classified into bi-directional and uni-directional associations. Theoretically, any n-ary (n > 2) association may be decomposed into n binary associations and an artificial class. Therefore we only discuss binary associations here. Moreover, the or-association in UML can be modeled by the introduction of a generalization with the complete and disjoint constraints. Hence we don't discuss the or-association hereafter.

Fig. 3. Binary association works_for

Figure 3 shows the association works_for between class Employee and class Department. The class diagram indicates that one employee may only (and must) work for one department while one department may consist of one to many employees.
The referential integrity (rfI) of nullify for the association end attached to the Employee class states that if an Employee object is deleted, the reference in a Department object to the deleted object must be set to NULL. The referential integrity (rfI) of cascade for the association end attached to the Department class states that if a Department object is deleted, the Employee objects associated with this deleted object must also be deleted (all Employee objects will disappear along with the Department object they work for). On the other hand, the relational integrity (rlI) of cascade attached to the Department end means that a Department object may modify its links to the Employee objects and the modification must be propagated to the related Employee objects. The relational integrity (rlI) of abort attached to the Employee end means that a change of the working department within an Employee object causes the abort of the transaction.
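The two rlIntegrity settings of the works_for association can be sketched as follows. This Python fragment is an illustration under assumptions: the method `add_employee`, the exception `TransactionAborted`, and the use of a read-only style property on the Employee side are hypothetical devices, not constructs of the extended metamodel.

```python
class TransactionAborted(Exception):
    """Stands in for aborting the enclosing transaction."""


class Department:
    def __init__(self, name):
        self.name = name
        self.employees = set()

    def add_employee(self, emp):
        # rlI = cascade: fix the reverse reference on the Employee side.
        if emp.department is not None:
            emp.department.employees.discard(emp)
        self.employees.add(emp)
        emp._department = self


class Employee:
    def __init__(self, name):
        self.name = name
        self._department = None

    @property
    def department(self):
        return self._department

    @department.setter
    def department(self, dept):
        # rlI = abort: a change made from the Employee side is rejected.
        raise TransactionAborted("works_for may only be changed via Department")
```

A change made through the Department end keeps both directions of the link consistent, whereas an assignment on the Employee end raises the exception that models the abort.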

Generalization. Generalization is used to model the superclass/subclass relationship. The superclass is more general than its subclass while the subclass is more specific than its superclass. The subclass inherits the structure and behavior of its superclass. Multiple inheritance is allowed in UML, where a subclass may have more than one superclass. A generalization path may have a discriminator as a text label to define the name of a partition for the subtypes of the superclass. Moreover, the four predefined constraints of overlapping or disjoint and complete or incomplete may also be used to indicate semantic constraints among the subclasses. Figure 4 shows an example of a generalization.

Fig. 4. A generalization
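The predefined generalization constraints can be read as conditions on class extents. The Python sketch below uses the usual set-based reading (disjoint: no object is in two subclass extents; complete: every superclass instance is in some subclass extent). The function names and the representation of extents as sets of object identifiers are assumptions for illustration.

```python
from itertools import combinations


def is_disjoint(sub_extents):
    """True if no object belongs to two subclass extents."""
    return all(a.isdisjoint(b) for a, b in combinations(sub_extents, 2))


def is_complete(super_extent, sub_extents):
    """True if every superclass instance is in some subclass extent."""
    return super_extent <= set().union(*sub_extents)
```

Under this reading, overlapping and incomplete are simply the cases in which the corresponding check is not required to hold.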

4 Extending the ODMG Object Model for Constraint Specification

There exist quite a few object database management systems (ODBMSs) on the market, such as O2 [20], ObjectStore [21], GemStone [22], and Ontos/DB [23], to name a few. Each of them is based on its own data model. A database design must always target a specific ODBMS. Recently, the ODMG standard has provided a framework in which the core aspects of an ODBMS can be defined in a system-independent way [11]. The language binding provided by ODMG may then be used to map the ODBMS to a specific system. Therefore, this work maps the UML conceptual design to a database design based on the ODMG Object Model.

The ODMG Object Model (ODMG/OM) is defined by the Object Database Management Group (ODMG) [2]. It provides a standard for object database management systems (ODBMSs). The constructs specified by the ODMG Object Model include:

1. Object and literal. An object has a unique identifier while a literal has none.
2. Type. A type is used to describe the common range of states and the common behavior of objects or literals. An object can be regarded as an instance of its type.
3. Property. A set of properties is used to describe the state of an object. Attributes and relationships are both properties of objects.
4. Operation. Operations are used to describe the behavior of an object.
5. Database. Objects are stored in a database. A database is based on a schema that is defined in ODL (Object Definition Language). A database is an instance of its schema.

The specification language used to define the object types for the ODMG/OM is called Object Definition Language (ODL). In ODL, a type can be specified by its interface or its class. A class can be instantiated, while an interface cannot. Interfaces and classes may inherit from other interfaces. However, interfaces
may not inherit from classes, and classes may not inherit from other classes either. But a class may EXTEND another class. A class may have attributes and relationships describing the state of its instances and operations describing the behavior of its instances. An object is an instance of a class, and a literal can be an attribute of a class. A database schema contains a set of class definitions.

In ODMG 2.0, the combination of key and extent specification can be used to enforce entity (object) integrity. The referential integrity may be “guaranteed” by the definition of relationships. However, what response should the application take if an integrity constraint is violated? How can other, more general, constraints be defined in the schema definition? These questions are not addressed in the current standard yet. Hence we make some extensions to the ODMG/OM to accommodate constraint specification.

1. Referential integrity for object-valued attributes. In ODMG/OM, an attribute's value may be either a literal or an object identifier. In the latter case, the attribute is actually a reference to another object. In such a situation, the referential integrity should be maintained. In the case of integrity violation, an action should be taken, which may be specified in UML conceptual modeling using the rfIntegrity. To record this specification in the ODMG schema definition, we add the referential integrity definition to the ODL-BNF of the attribute definition:

   <att_dcl> ::= [readonly] attribute <domain_type> <attribute_name> [<fixed_array_size>] [reference abort|cascade|nullify]

2. Referential integrity and relational integrity for relationships. The referential integrity specifies the actions which should be taken in case of integrity violation, while the relational integrity (the rlIntegrity in UML) maintains the consistency of the recording of references. The semantics of the nullify option of rfIntegrity is already implied by the relationship definition, hence we ignore it here. To include the integrity definitions in the ODMG-ODL, the BNF for relationship declaration is extended accordingly:

   <rel_dcl> ::= relationship <target_of_path> <identifier> inverse <inverse_traversal_path> [abort|cascade] [reference abort|cascade]

3. The definitions of o-constraint and t-constraint. An o-constraint is an object-level constraint (cf. hard constraint in Ode [13]), which is checked immediately after the update of the associated object. Such a constraint usually involves only one single object. A t-constraint is a transaction-level constraint (cf. soft constraint in Ode), which is checked just before the commit of a transaction. Such a constraint usually involves more than one object. We regard the constraint definitions as a part of the class definition and extend the BNF for class declaration accordingly:

   <class> ::= <class_header> {<interface_body> [<cons_dcl>]}
   <cons_dcl> ::= <cons_spec> <condition> : <action>
   <cons_spec> ::= o_constraint | t_constraint
   <condition> ::= <query>
   <action> ::= savepoint | abort | <user_defined_procedure>

We use an ODMG-OQL (Object Query Language) query of type Boolean to deﬁne the condition of a constraint. If the condition of a constraint evaluates to false, the deﬁned action is taken. Two predeﬁned actions are the transactional commands savepoint and abort (rollback). Other actions may be deﬁned by application users.
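The difference in checking time between the two kinds of constraints can be sketched with a toy transaction. The Python fragment below is an assumption-based illustration: the class `Transaction`, the exception `Abort`, and the representation of objects as dictionaries are hypothetical, and rollback of already applied changes is not modeled.

```python
class Abort(Exception):
    """Signals that the predefined action `abort` was taken."""


class Transaction:
    def __init__(self, o_constraints, t_constraints):
        self.o_constraints = o_constraints  # predicates over one object
        self.t_constraints = t_constraints  # predicates over the whole extent
        self.extent = []

    def update(self, obj, **changes):
        obj.update(changes)
        if not any(o is obj for o in self.extent):
            self.extent.append(obj)
        # o-constraints are checked immediately after the update.
        for check in self.o_constraints:
            if not check(obj):
                raise Abort("o-constraint violated")

    def commit(self):
        # t-constraints are checked just before the commit.
        for check in self.t_constraints:
            if not check(self.extent):
                raise Abort("t-constraint violated")
        return True
```

A duplicate key value is thus tolerated between updates and only detected at commit time, while an object-level violation is reported as soon as the update is made.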

5 Deriving ODMG Database Design from UML Class Diagrams

The first step of semantic modeling is to identify useful semantic concepts [6]. In this work we assume the semantic concepts of interest are captured in UML class diagrams in conceptual modeling. After that, formal objects, formal integrity rules, and formal operators should be devised. The formal definition is done with ODMG-ODL. The UML is a powerful modeling language with which all steps, from conceptual modeling to system specification and system implementation, can be done. In our work, we mainly use the UML class diagrams to design an OODB conceptually. Therefore, some concepts in UML, such as parameterized classes and visibility, are not that interesting to us.

As Rumbaugh et al. [15] pointed out, a qualified association can be considered a form of ternary association. Figure 5 (mu_i stands for the i-th multiplicity) shows the transformation of a qualified association to an association class. After the transformation, the semantics of the qualifier should be implemented by methods of the target class. Furthermore, as we have mentioned in Sect. 3, any n-ary (n > 2) association can be equivalently decomposed into n binary associations and an auxiliary class. Figure 6 illustrates the transformation of a ternary association to three binary associations. Similarly, an association with a class can be transformed into two binary associations without an association class. As a result, we only need to map binary associations and other necessary concepts from a UML class diagram to ODMG-ODL in the following.

Fig. 5. The transformation of a qualified association to a ternary association

Fig. 6. The transformation of a ternary association to three binary associations

The derivation of an ODMG schema begins from a package. If there exists any dependency among packages, the import/export schema facilities are used to
make those dependent classes visible. Within a package, the following steps are needed to map a UML class diagram to an ODMG-ODL schema:

1. For each class in UML, create a class definition in ODMG-ODL with the same class name. Then examine the UML class specification:
   (a) If the isAbstract attribute is false, then add an extent definition to the ODMG class. The name of the extent is the plural of the class name.
   (b) For each attribute in the UML class with a cardinality of the form m..1 (m >= 0), define the attribute as a single value; otherwise, define the attribute as a set.
   (c) For each operation in the UML class, make a corresponding operation definition in the ODL class.
   (d) For each constraint of the UML class:
       i. If the name of the constraint is key, add the corresponding key definition to the ODL class.
       ii. If the name of the constraint is unique with attribute list (attrl1, attrl2, ..., attrli, ..., attrln), where attrli is of the form (attri1, attri2, ...), add the following t-constraint to the class definition (Cextent stands for the name of the class extent):
           t-constraint: for all o1 in Cextent: for all o2 in Cextent: o1 = o2 or (o1.attr11 != o2.attr11 or ...) and ... and (o1.attri1 != o2.attri1 or ...) and ...
       iii. If the name of the constraint is not_null with attribute list (attr1, attr2, ...), add the following o-constraint to the class definition (a literal with default value is evaluated to nil):
           o-constraint: for all o in Cextent: o.attr1 != nil and o.attr2 != nil and ...
       iv. For all other constraints, transform them either to an o-constraint or to a t-constraint in the form of (for all x in e1: e2) or (exists x in e1: e2) with the proper class extents as scope and the constraints as where conditions.
       v. If the class has a cardinality constraint of the form m..n with n a finite integer, the following t-constraint should be added to the ODL class:
           t-constraint: count(Cextent) >= m and count(Cextent) <= n

2. For each interface in UML, create an interface definition in ODMG-ODL with the same interface name.
3. For each binary association with association_end1 attached to class1 and association_end2 attached to class2, let role name role1, multiplicity mult1, constraint cons1, navigability navi1, referential integrity rfI1, and relational integrity rlI1 be the attributes of association_end1, and let role2, mult2, cons2, navi2, rfI2, and rlI2 be those of association_end2 (when a role name is not available, the class name is used instead).
   (a) If only one of navi1 and navi2, say navi2, is true, then
       i. if mult2 is of the form m..1, then add a single object-valued attribute definition (1) to the class1 definition;
       ii. if mult2 is of the form m..n with n > 1, then add the attribute definition (2) if cons2 = ordered, and otherwise (3), to the class1 definition.
           (1) attribute class2 role2 reference rfI2
           (2) attribute list<class2> role2 reference rfI2
           (3) attribute set<class2> role2 reference rfI2

(b) If both navi1 and navi2 are true, then i. depends on the value of mult2 and cons2, add one of the following relationship deﬁnition to the class1 deﬁnition: (1) relationship class2 role2 inverse class2 :: role1 rlI1 reference rf I2 (2) relationship list < class2 > role2 inverse class2 :: role1 rlI1 reference rf I2 (3) relationship set < class2 > role2 inverse class2 :: role1 rlI1 reference rf I2

ii. depending on the values of mult1 and cons1, add one of the following relationship definitions to the class2 definition:
(1) relationship class1 role1 inverse class1::role2 rlI2 reference rfI1
(2) relationship list<class1> role1 inverse class1::role2 rlI2 reference rfI1
(3) relationship set<class1> role1 inverse class1::role2 rlI2 reference rfI1

4. For each generalization between a supertype stype and one or more subtypes type1, type2, ...:
(a) If stype is a class, then add the EXTENDS relationship from type1, type2, ... to stype:
type1 extends stype
type2 extends stype
...

(b) If stype is an interface, then add the ISA relationship from type1, type2, ... to stype:
type1 : stype
type2 : stype
...

(c) If there are any predefined constraints attached to the generalization, add the corresponding t-constraint definitions to the class definition of stype. Table 1 gives the mapping of predefined constraints to ODL-constraints (Sext stands for the extent of stype, Texti for the extent of typei).

Figure 5 gives an example ODL class definition for class Employee from Fig. 2 and Fig. 3.
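As an illustration only, rule 3 above (binary associations) can be sketched in Python. The names AssociationEnd, odl_type and map_association are hypothetical and do not come from the paper or from any ODMG tool; the sketch merely assembles the attribute and relationship strings given in 3(a) and 3(b), with a trailing semicolon added as in the Employee example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class AssociationEnd:
    """One end of a binary UML association (hypothetical helper type)."""
    cls: str              # class attached to this end
    role: str             # role name; the class name if no role is given
    upper: Optional[int]  # upper bound of the multiplicity, None if unbounded
    ordered: bool         # True if the end carries the constraint "ordered"
    navigable: bool       # navigability of this end
    ref_integrity: str    # referential integrity action (rfI)
    rel_integrity: str    # relational integrity action (rlI)


def odl_type(end: AssociationEnd) -> str:
    """Single-valued type for m..1, otherwise list<> (ordered) or set<>."""
    if end.upper == 1:
        return end.cls
    return f"{'list' if end.ordered else 'set'}<{end.cls}>"


def map_association(end1: AssociationEnd,
                    end2: AssociationEnd) -> Dict[str, List[str]]:
    """Return the ODL members to add to each class for one association."""
    members: Dict[str, List[str]] = {end1.cls: [], end2.cls: []}

    def relationship(owner: AssociationEnd, target: AssociationEnd) -> str:
        # Rule 3(b): a relationship with its inverse on the other class.
        return (f"relationship {odl_type(target)} {target.role} "
                f"inverse {target.cls}::{owner.role} "
                f"{owner.rel_integrity} reference {target.ref_integrity};")

    def attribute(target: AssociationEnd) -> str:
        # Rule 3(a): a one-way reference becomes an object-valued attribute.
        return (f"attribute {odl_type(target)} {target.role} "
                f"reference {target.ref_integrity};")

    if end1.navigable and end2.navigable:
        members[end1.cls].append(relationship(end1, end2))
        members[end2.cls].append(relationship(end2, end1))
    elif end2.navigable:
        members[end1.cls].append(attribute(end2))
    elif end1.navigable:
        members[end2.cls].append(attribute(end1))
    return members
```

Calling map_association with an Employee end and a Department end of multiplicity upper bound 1, both navigable, yields a relationship member for each class, each naming the other as its inverse. The integrity actions are passed through as opaque strings because the sketch makes no assumption about their permitted values.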

6 Discussion and Conclusion

This paper proposed a framework for extending UML and ODMG-ODL for database design with greater capability for specifying integrity constraints. Generally speaking, the specification of integrity constraints may be done in the

On Using UML Class Diagrams for Object-Oriented Database Design

Table 1. Mapping of predefined constraints to ODL-constraints

predefined constraint   ODL-constraint
complete                count(Sext) == count(Text1) + count(Text2) + ...
incomplete              count(Sext) >= count(Text1) + count(Text2) + ...
disjoint                0 = count(Text1 intersect Text2 intersect ...)
overlapping             0 <= count(Text1 intersect Text2 intersect ...)

conceptual modeling phase or in the design phase. We prefer the former approach because it makes it easy to understand the inherent constraints of the conceptual model. We provide some flexibility for defining key and uniqueness constraints either on the attribute level or on the class level. General constraints may be defined using ODMG-OQL with some abbreviations.

As an alternative to the extension to the UML metamodel, we can also use the extension mechanism of UML, namely stereotypes and tag values, to achieve the goal of specifying integrity constraints during UML modeling. For example, Rational Rose Oracle8 [14] introduces a few stereotypes, such as <> and <>, and property sets for projects, classes, operations, and attributes, to facilitate the object-relational database design and the specification of integrity constraints. Instead of using ODMG-OQL to specify integrity constraints, one may directly use the UML-OCL [24]. The possibility of using stereotypes and UML-OCL to accommodate the specification of constraints and the translation of UML-OCL to ODMG-OQL is currently being investigated by the author.

The quality of a UML schema has a great impact on the quality of the resulting ODMG database schema. In [1], several quality criteria, such as completeness, correctness, expressiveness, readability, self-explanation, extensibility, and normality, are proposed to validate a database schema. Normally, the quality of a schema can be improved by applying a sequence of schema transformations to it. Due to the object-orientation of UML, some issues, such as expressiveness and self-explanation, are already more or less implied by the modeling language itself. Some other issues, such as normality, are not that relevant to object-oriented modeling. However, another quality check, i.e. integrity, should be done with each UML schema. For instance, constraints can be specified for classes, attributes and associations using the extended UML. These constraints should be checked carefully to ensure consistency and minimality of the schema. For example, if an attribute is specified as key, its cardinality should be 1..1.

A prototype database design tool based on the extended UML is going to be implemented. Such a tool should allow database designers to use UML class diagrams to capture the semantic concepts of interest and specify integrity constraints in the semantic model. After finishing the conceptual modeling, the tool guides the designer in the transformation of qualified associations to association classes, and the replacement of n-ary associations and association classes with equivalent binary associations. In the end, an ODMG-ODL schema with constraints should be (semi-)automatically generated. Though we have chosen

class Employee (extent Employees, key id) {
    // attributes
    attribute string name;
    attribute Date birthday;
    attribute short id;
    attribute float salary;
    attribute set<string> phone;
    attribute integer years_of_schooling;
    // relationships
    relationship Department department
        inverse Department::employee abort reference cascade;
    // operations
    integer age();
    void hire();
    void fire();
    t-constraint:
        count(Employees) <= 100;    // class cardinality constraint
        // uniqueness
        for all o1 in Employees: for all o2 in Employees:
            o1 = o2 or (o1.name != o2.name or o1.birthday != o2.birthday);
    o-constraint:
        // not_null constraints
        for all o in Employees: o.name != nil and o.birthday != nil and o.id != nil
            and o.years_of_schooling != nil and o.department != nil;
        // CS1 and CS2
        for all o in Employees: o.age > 18 and o.age < 65;
        for all o in Employees: o.years_of_schooling <= o.age - 5;
        for all o in Employees: count(o.phone) <= 3;    // attr. card. constraint
}

Fig. 7. A derived class deﬁnition with constraint speciﬁcations

the ODMG/OM as target data model, with minor modiﬁcation, the derivation method may be used to produce database schemas based on other languages as well. However, because of the diﬀerences in expressive power, there are some concepts in UML that cannot be exactly mapped to ODMG/OM. For example, bi-directional associations and aggregations in UML are both mapped to relationships in ODMG/OM. Moreover, there is no distinction between shared aggregation and composite aggregation in ODMG/OM. Such subtle diﬀerences between data models can be expressed in an attached documentation in the target system so that the precise semantics of the system can be more or less preserved. Another way to amend this is to use constraints in ODMG/OM to implement the exact semantics of the original UML concepts so that the transformation of a UML class diagram to an ODMG database schema leads to no loss of information. We plan to automatically generate such constraints in the

future. Another aspect that needs more investigation is the algorithm for deriving constraints in ODMG from a UML class diagram. In the case of constraints involving objects from more than one class and constraints with parameters, the derivation rules will become very complex.

Furthermore, it is possible to use UML to specify methods that implement operations. However, ODMG itself does not provide any object manipulation language (OML) but only provides language bindings to C++, Smalltalk, and Java. As a result, if we want to map the method specifications in UML to a formal language as well, we need to make the schema transformation language-specific. Visual modeling of database updates targeting the Java programming language is under consideration.

Last but not least, the problems of how to enforce and when to check the constraints in an OODB efficiently must be further studied. By putting the constraints into one central place, namely into the database, rather than scattering them around in the application code, the quality of the database can be greatly improved and the application development can be accelerated. On the other hand, the performance of the database may be slowed down due to the extra checking of constraints at execution time. This can be remedied by localizing database operations and optimizing the checking of constraints. There is good hope for that to succeed, since constraint checking is essentially processing (Boolean) queries.
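The closing remark, that constraint checking amounts to evaluating Boolean queries, can be illustrated with a minimal Python sketch. It is not the enforcement mechanism of any OODB. The Employee class below is a simplified, hypothetical stand-in for the ODL class of Fig. 7 (age is stored as a plain attribute rather than computed), and the function names are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Employee:
    """Simplified stand-in for the Employee class (hypothetical)."""
    name: Optional[str]
    age: int
    years_of_schooling: Optional[int]
    phone: List[str] = field(default_factory=list)


def check_o_constraints(extent: List[Employee]) -> Dict[str, bool]:
    """Each o-constraint is a 'for all o in extent' Boolean query."""
    return {
        "not_null": all(o.name is not None and o.years_of_schooling is not None
                        for o in extent),
        "age_range": all(18 < o.age < 65 for o in extent),
        "schooling": all(o.years_of_schooling is not None
                         and o.years_of_schooling <= o.age - 5
                         for o in extent),
        "phone_cardinality": all(len(o.phone) <= 3 for o in extent),
    }


def check_t_constraints(extent: List[Employee],
                        max_count: int = 100) -> Dict[str, bool]:
    """A t-constraint is a Boolean query over the extent as a whole."""
    return {"class_cardinality": len(extent) <= max_count}
```

Each entry in the returned dictionaries is the truth value of one query, so a violated constraint can be reported by name. Deciding when to run these checks (per update, per transaction, or in batch) is exactly the open efficiency question raised above and is not addressed by the sketch.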

References

1. Batini, C., Ceri, S., and Navathe, S. B.: Conceptual Database Design: An Entity-Relationship Approach. The Benjamin/Cummings Publishing Company, Inc., 1992.
2. Cattell, R.G.G., Barry, Douglas: The Object Database Standard: ODMG 2.0. Morgan Kaufmann Publishers, Inc., 1997.
3. Ceri, S., Widom, J.: Deriving production rules for constraint maintenance. In: Proc. 16th Intl. Conf. in Very Large Data Bases. Morgan Kaufmann, San Mateo, Calif., 567-577, 1990.
4. Widom, J. and Ceri, S. (editors): Active Database Systems: Triggers and Rules for Advanced Database Processing. Morgan Kaufmann Publishers, San Francisco, CA, 1996.
5. Chen, P. P.: The Entity-Relationship Model - Toward a Unified View of Data. ACM Transactions on Database Systems, Vol. 1, No. 1, pp. 9-36, 1976.
6. Date, C. J.: An Introduction to Database Systems, Sixth Edition. Addison-Wesley Publishing Company, Inc., 1995.
7. The DB-MAIN Database Engineering CASE Tool developed by Professor J-L. Hainaut. http://www.info.fundp.ac.be/ dbm/index.html
8. Diaz, O.: Deriving rules for constraint maintenance in an object-oriented database. In: Ramos I, Tjoa AM (eds.) Proc. Intl. Conf. on Databases and Expert Systems DEXA. Springer, Berlin, 332-337, 1992.
9. Elmasri, R. and Navathe, S.B.: Fundamentals of Database Systems, Second Edition. The Benjamin/Cummings Publishing Company, Inc., 1994.
10. ERwin Features Guide. Logic Works, Inc., 1997.
11. Fahrner, C., Vossen, G.: Transforming Relational Database Schemas into Object-Oriented Schemas according to ODMG-93. In Proceedings of the 4th International Conference on Deductive and Object-Oriented Database, National University of Singapore, December 1995, 429-446.
12. Gehani, N. and Jagadish, H.V.: Ode as an active database: Constraints and triggers. In Proceedings of the Seventeenth International Conference on Very Large Data Bases, 327-336, Barcelona, Spain, Sep. 1991.
13. Jagadish, H. V. and Qian, X.: Integrity Maintenance in an Object-Oriented Database. In Proc. of the 18th Int'l Conf. on Very Large Databases, Vancouver, BC, Canada, Aug. 1992.
14. Rational Rose 98: Using Rational Rose / Oracle8. Rational Software Corporation, 1998.
15. Rumbaugh, J., Blaha, M., Premerlani, W., Eddy, F., and Lorensen, W.: Object-Oriented Modeling and Design. Englewood Cliffs, New Jersey, Prentice-Hall, 1991.
16. UML Notation Guide, Version 1.1, Rational Software Corporation, 1 September 1997. http://www.rational.com/uml/
17. UML Object Constraint Language Specification, Version 1.1, Rational Software Corporation, 1 September 1997.
18. UML Semantics, Version 1.1, Rational Software Corporation, 1 September 1997.
19. Urban, S.D., Desiderio, M.: Translating constraints to rules in CONTEXT: A CONstrainT EXplanation Tool. In: Kent W, Meersman RA, Khosla S (eds.) Object-Oriented Databases: Analysis, Design and Construction. North-Holland, Amsterdam, 373-392, 1991.
20. The O2 System. O2 Technology, Paris, France. http://www.o2tech.com.
21. Object Design Inc., Burlington, MA, USA. http://www.odi.com.
22. Servio logic Inc., Beaverton, USA. http://www.gemstone.com.
23. ONTOS Inc., Burlington, MA, USA. http://www.ontos.com.
24. Gogolla, M. and Richters, M.: On Constraints and Queries in UML. In: Schader M. and Korthaus A. (eds.), The Unified Modeling Language: Technical Aspects and Applications. Physica-Verlag, 109-121, 1997.

Literate Modelling - Capturing Business Knowledge with the UML

Jim Arlow (1), Wolfgang Emmerich (2), and John Quinn (1)

(1) British Airways Plc, TBE (E124), Viscount Way, Hounslow, UK [email protected]
(2) Dept. of Computer Science, University College London, London WC1E 6BT, UK [email protected]

Abstract. At British Airways, we have found during several large OO projects documented using the UML that non-technical end-users, managers and business domain experts find it difficult to understand UML visual models. This leads to problems in requirement capture and review. To solve this problem, we have developed the technique of Literate Modelling. Literate Models are UML diagrams that are embedded in texts explaining the models. In that way end-users, managers and domain experts gain useful understanding of the models, whilst object-oriented analysts see exactly and precisely how the models define business requirements and imperatives. We discuss some early experiences with Literate Modelling at British Airways where it was used extensively in their Enterprise Object Modelling initiative. We explain why Literate Modelling is viewed as one of the critical success factors for this significant project. Finally, we propose that Literate Modelling may be a valuable extension to many other object-oriented and non object-oriented visual modelling languages.

J. Bézivin and P.-A. Muller (Eds.): UML'98, LNCS 1618, pp. 189-199, 1999. © Springer-Verlag Berlin Heidelberg 1999

1 Introduction

In working with several large companies such as British Airways, we have become aware of significant deficiencies in visual modelling as it is currently practised. OO was introduced into BA in 1991 [1], and since then there have been about ten significant projects that used Object Technology. CASE tools, such as Rational Rose and System Architect, support easy construction of models which capture business processes and rationale. Once embedded in the model, however, this information becomes somewhat inaccessible. In order to extract the information from the model, one generally needs a working knowledge of the visual modelling languages and some knowledge of the operation of the CASE tool.

The visual modelling languages that we have used at British Airways include Booch's Design Notation [3] and OMT [9]. We have also used the UML [7] in several large projects at British Airways and, since last year, UML has become the standard notation for Object-Oriented Analysis at British Airways. We gained the experience reported in this paper when we developed the enterprise object model for the airline. This enterprise object model has been developed during the past two years. The work was started using Booch diagrams, and we switched to the UML subset supported by Rational Rose 4.0 in Summer 1997. The enterprise object model captures the semantics of concepts, such as flights, segments, aircraft

and crew at such a level of abstraction that they can be re-used across different projects of the airline. The enterprise object model is thus an analysis object model that fully abstracts from implementation details.

Even with UML knowledge, it is often impossible to uncover the important business requirements and imperatives underlying the model, as these become invisible when taken out of business context and expressed in a visual notation. It is quite common for an Analyst or Designer to look at a UML model constructed just a few months ago, and be unable to explain the forces and business requirements that shaped that model. We call this the trivialisation of business requirements, and it is not unique to the UML: other visual languages suffer precisely the same defect. The reason for this is that one of the strengths of visual modelling, its conciseness and terseness, is also a weakness. Important business requirements may be expressed as a class, relationship, method, constraint or multiplicity, and so become lost amidst other similar modelling elements, which have less business impact.

In this paper, we explore this problem from the perspective of the UML, using the successful Enterprise Object Modelling work recently carried out at British Airways (BA) to provide examples. We then go on to describe Literate Modelling, our extension to the UML developed during our work at BA, which can alleviate these problems.

This paper is structured as follows. In the next section, we assess how well different groups of people involved in a system development can understand UML. This assessment is based on anecdotal evidence that we gained from using the UML in various substantial and mission-critical projects at British Airways. We then argue why the UML tends to trivialise important business requirements and give an example of this. We introduce the concept of Literate Modelling that we have successfully employed to avoid such trivialisation.
Literate Modelling is used to produce Business Context documents that are hybrid documents in which different types of UML diagrams are interleaved with carefully produced natural English explaining them. We conclude by indicating requirements for tool support for Literate Modelling.

2 Accessibility and Comprehension of the UML Models

The UML provides a variety of powerful and useful models, and each of these models targets, and is accessible and comprehensible to, a limited audience. In this context accessibility means “ability to understand the syntax and work the CASE tool”. From our perspective comprehension is the more important issue, and it is often contingent upon accessibility. By comprehension we mean “understanding the business semantics of the model”.

Table 1 illustrates comprehension for a logical Analysis model. The table rates comprehensibility of diagrams in analysis models by different stakeholder groups on a scale between 0 and 6. We obtained the figures when we observed comprehension of different stakeholders during structured walkthroughs and presentations of the enterprise object model. Zero indicates the lowest and 6 indicates the highest comprehensibility.

At this point, we do not consider the physical models (Component diagrams and Deployment diagrams). We also defer discussion of design models to a later date. We agree with [8] that Activity diagrams would be useful to specify workflow. However, because our most commonly used CASE tool does not support them, we have insufficient experience and data to include Activity diagrams in this discussion. We expect that their comprehensibility and accessibility will be quite high, similar to Sequence Diagrams.

                        Manager  User  Domain Expert  Analyst  Designer  Programmer
Use Case Descriptions   4        6     6              6        3         2
Use Case Diagrams       3        5     5              6        4         3
Sequence Diagrams       2        2     4              6        5         4
Collaboration Diagrams  1        1     3              6        6         5
Class Diagrams          1        1     2              6        6         5
State Diagrams          0        0     0              4        6         5

Table 1. Comprehensibility of UML Diagrams in Analysis Models

We have considered six common classes of customer for UML models, and rated their comprehension on a scale of zero to six. On this scale zero means no comprehension at all and six denotes full comprehension of a UML model. These estimates are averages based on our private communications with many individuals performing roughly these roles over the course of many different UML projects in many businesses. Clearly, the chart is subjective, but it serves to illustrate several important observations.

2.1 Use Case Descriptions

What is immediately apparent from Table 1 is that Use Case Descriptions have the highest comprehensibility. This is because:
1. They are usually written in plain English, and so there is no accessibility problem.
2. They are often already familiar to non-OO practitioners, as they are just descriptions of business processes from the point of view of the user. It is quite easy to step inside the Use Case Description and role play in order to enhance comprehension.

However, one of the problems with Use Cases in general is that they are descriptions of specific business processes from the perspective of a particular Actor. As such they do not give a clear picture of the overall business context and imperatives that actually generate the requirements for these business processes. This means that they can be quite incomprehensible to non-domain experts. We can cite many examples of this from our work at British Airways and other companies where Use Cases, taken individually, make little sense to the uninitiated. This is because there is a business context, such as rules of the air, operational imperatives, interoperation with partners, interoperation with Actors which are legacy systems and standard jargon. This context is not well captured or explained by the Use Case Description or any other UML construct.
More technical Designer and Programmer roles can have problems with comprehension: Understanding a Use Case often requires domain knowledge that the designer or programmer simply may not have. What is disturbing is that the UML provides no

formal mechanism (except for the catch-alls of the Note and free-text annotations to diagrams) to capture and present this important information. Embedding this information in the Use Cases themselves is not really an option, as this business context is in many ways orthogonal to any particular business processes.

2.2 Use Case Diagrams

Generally we find these to have similar comprehensibility to Use Case Descriptions, and, although diagrammatic in nature, their accessibility is very high because of the simplicity of the notation. Compared to Use Case Descriptions the Use Case Diagram is semantically weak, and we have found that they are not comprehensible without explanation or reference to the Use Case Description. We have therefore given them a lower comprehensibility than Use Case Descriptions for the non-technical roles: Non-Technical Manager, User and Domain Expert. We expect that the Analyst will have sufficient mastery of the CASE tool to navigate to the Use Case Descriptions from the Use Case Diagram, and so there is no loss of comprehensibility. Again, more technical roles, such as Designer and Programmer, may not have sufficient business domain knowledge and knowledge of the business context to appreciate the Use Cases fully.

2.3 Sequence Diagrams

We are now in the realm of object-orientation, and comprehensibility falls sharply for non-OO literate participants. We have found that Non-Technical Managers and Users find raw Sequence Diagrams very difficult to follow, and they don't really understand the details of object interaction. Comprehension is slightly higher for Domain Experts, as these roles often have some exposure to object-orientation through working with Analysts. Adorning the Sequence Diagrams with scripts increases comprehensibility markedly for this group, but comprehensibility is now of the script rather than the visual model, which remains largely obscure.
Because Sequence Diagrams express interaction between objects, Designers and Programmers naturally understand the interactions even though they might not be so sure about the underlying business processes that drive the interaction.

2.4 Collaboration Diagrams

Non-technical roles such as Non-Technical Manager, User and Domain Expert typically find these confusing, and there is no reasonable opportunity for adornment with a script to increase comprehensibility. We give these a very low comprehensibility for this audience although again, comprehension is higher for the Domain Expert. However, Analysts, Designers and Programmers find these diagrams useful and comprehensible.

2.5 Class Diagrams

For comprehensibility these require:
1. Some basic OO training

2. Knowledge of UML syntax
3. Ability to use the CASE tool to uncover class and relationship semantics

We have found that comprehensibility of these diagrams is typically very low for Non-Technical Managers and Users. It is slightly higher for Domain Experts, as these roles often have some exposure to object-orientation through working with Analysts. Naturally, for the technical group, Analysts, Designers and Programmers, comprehensibility is very high, although we have noticed that many programmers do not have sufficient understanding of UML syntax and object-oriented analysis to fully appreciate them. Hence, we argue that the Programmers' comprehension is lower than that of Analysts and Designers. Analysts will tend to understand the class diagram from the business perspective. Designers will often know less about the business, and their comprehension will be more in terms of object-oriented design issues such as patterns and idioms. Programmers are in many organisations more junior than Analysts and Designers. They will tend to know little about the business and little about good object-oriented design principles. This leads to a lack of comprehension of all aspects of the model, and is particularly dangerous.

2.6 State Diagrams

State Diagrams are quite specialised and have a particularly elegant yet terse syntax, which is rarely understood by the non-technical group of Non-Technical Managers, Users and Domain Experts. Comprehensibility is effectively zero on our scale for this group. In fact, we have found that it is only really Designers, and not all Designers at that, who have a clear grasp of State Diagrams. The essence of this problem is that we are trying to capture a dynamic system in a static notation. It is our suspicion that State Diagrams will only increase in comprehensibility when they can be executed and animated in the CASE tool.

2.7 The Problem

Several important issues arise from the above discussion:
1. As we move from Use Cases to State Diagrams, the non-technical group gradually loses comprehension of the models.
2. As we move from Use Cases to State Diagrams, the emphasis shifts from a focus on business requirements and imperatives to a focus on the intricacies of object modelling.
3. There is a traceability issue. As the non-technical group, who understand the business requirements best, lose comprehension of the UML Sequence, Collaboration, Class and State diagrams, the traceability of high level requirements to class diagrams becomes more and more problematic.
4. We have found that the technical group of Designers and Programmers often have little understanding of the actual business and its needs, and so we cannot rely on them to capture business requirements correctly in their models.

5. Important business requirements become trivialised to classes, relationships, constraints or multiplicity, which may be hidden amongst others of their kind. We call this process trivialisation, because important requirements are translated into a context in which their importance is no longer apparent. We discuss this issue in the next section.
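The first observation can be checked arithmetically against the ratings of Table 1. The short Python sketch below is illustrative only: it averages the ratings of the non-technical group (Manager, User, Domain Expert) for each diagram type, in the order in which the diagrams are discussed. The names RATINGS, ROLES and group_mean are hypothetical, and the averaging itself is an assumption of this example rather than an analysis performed in the paper.

```python
# Ratings copied from Table 1, in the column order given by ROLES.
ROLES = ["Manager", "User", "Domain Expert", "Analyst", "Designer", "Programmer"]
RATINGS = {
    "Use Case Descriptions": [4, 6, 6, 6, 3, 2],
    "Use Case Diagrams": [3, 5, 5, 6, 4, 3],
    "Sequence Diagrams": [2, 2, 4, 6, 5, 4],
    "Collaboration Diagrams": [1, 1, 3, 6, 6, 5],
    "Class Diagrams": [1, 1, 2, 6, 6, 5],
    "State Diagrams": [0, 0, 0, 4, 6, 5],
}
NON_TECHNICAL = ["Manager", "User", "Domain Expert"]


def group_mean(diagram: str, roles: list) -> float:
    """Mean comprehensibility rating of one diagram type for a set of roles."""
    row = dict(zip(ROLES, RATINGS[diagram]))
    return sum(row[role] for role in roles) / len(roles)


# Mean rating of the non-technical group, from Use Cases to State Diagrams.
non_technical_trend = [group_mean(d, NON_TECHNICAL) for d in RATINGS]
```

Under these assumptions the list never increases from one diagram type to the next and ends at zero for State Diagrams, which matches the gradual loss of comprehension described in point 1.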

3 The Trivialisation of Business Requirements by Visual Modelling

Some business requirements are more important than others. Often, there is no way to tell from a blunt statement of the requirement how important it is to the overall operation of the business. We need to see the requirement in its business context to correctly gauge its importance. It is precisely the business context that is lacking in all UML meta models. As well, in the real world, important business requirements are often highlighted by a certain amount of ceremony: there may be papers, working groups investigating the requirement and discussion at managerial level. This activity is typically how we know that something is perceived to be important to the business. All of this valuable contextual information is absent from the UML model. Although we may have a statement of a particular requirement as part of a UML Use Case, we have no formal mechanism to highlight the importance of this requirement or set it in its true business context. As well, when the requirement is expressed in a class diagram, it becomes a cluster of modelling elements much like any other, and so essential requirements, rather than being highlighted in the visual model, may fade into the background.

We have a specific example from our work at British Airways, which nicely illustrates this point. The 90s have been the age of the global airline, and there has been a great deal of activity forming various alliances so that one partner may sell seating capacity on another partner's flight. This is known as codeshare, and is good for passengers, as they can complete a complex journey using a set of co-operating companies. It also improves customer service and can generate new business worth millions of pounds sterling. There is a clear and important business requirement at BA and other alliance partners to support codeshare in their systems. How is codeshare represented in Alliance systems?
The key to this is that each flight must have the capability to have many flight numbers. Otherwise codeshare is not supported by the system, and millions of pounds may be lost. How is the business requirement for codeshare represented in a UML model? Clearly, there will be a Use Case where a passenger travels on a flight which has a BA flight number but which is operated by an Alliance partner, and vice versa. Already, we lose sight of the requirement in the sea of Use Cases. In the class diagram codeshare is represented as shown in Fig. 1.

So a multimillion-pound business requirement, affecting an alliance of companies together worth billions, is represented as an asterisk on a UML Analysis class diagram. This is exactly and precisely what we mean by the trivialisation of business requirements by visual modelling.

[Figure: class Flight (multiplicity 1) associated with class FlightNumber (multiplicity 1..*)]

Fig. 1. Example of Trivialisation
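To make the point concrete, the association of Fig. 1 can be written down as a small Python sketch. It is an illustration only: the attributes carrier and number, and the partner code "XX", are hypothetical and do not come from the British Airways model. Note how the entire codeshare requirement reduces to a list type and a single length check.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FlightNumber:
    """A marketing flight number (attribute names are hypothetical)."""
    carrier: str
    number: str


@dataclass
class Flight:
    """One operated flight carrying one or more flight numbers (1..*)."""
    flight_numbers: List[FlightNumber]

    def __post_init__(self) -> None:
        # The lower bound of the 1..* multiplicity in Fig. 1.
        if len(self.flight_numbers) < 1:
            raise ValueError("a Flight needs at least one FlightNumber")


# A codeshare flight: one operated flight sold under two flight numbers.
codeshare = Flight([FlightNumber("BA", "0001"), FlightNumber("XX", "1234")])
```

Nothing in this code signals that the list is worth millions of pounds to the business, which is the trivialisation described above.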

4 Literate Modelling

We aim to solve the dilemma of precision and conciseness versus comprehensibility by applying a technique that we refer to as Literate Modelling. Literate Modelling is the application of Knuth's idea of Literate Programming [6] to object-oriented analysis models. Similar to Knuth's approach, we aim at interleaving models with text that explains the model to both the author and externals so that the models can be better appraised and changed. Literate Modelling attempts to address both of the issues we have raised: the accessibility and comprehensibility of the UML models and the trivialisation of business requirements by visual modelling. It does this by providing the missing business context for the models. We extend the UML by adding new documents, Business Context documents, which explain the model in light of the business context that has generated the forces that have shaped it. Important business requirements are highlighted and unambiguously mapped to parts of the model. As well, important features of the model are discussed in these documents and it is explained why particular modelling choices were made, again always in terms of the business requirements and context. It is our experience that this provides the missing information that makes a UML model accessible and comprehensible to a wide audience whilst simultaneously resolving the trivialisation issue.

4.1 The Business Context Document

When we had completed the first iteration of the Enterprise Object Model at British Airways, we then had to present these results to a wide audience. This audience ranged from non-technical Senior Managers to Programmers. We even presented it to Alliance Partners, who did not belong to the Airline. We found that we were having great difficulty presenting the standard UML models. Often our audience had no understanding of UML syntax or of object-orientation at all.
We performed model walkthroughs that have been suggested for other Analysis techniques [4]. We took a business perspective highlighting key features of the model that supported specific, key business requirements, introducing the bare minimum of UML syntax to create understanding as we went along. In many cases, we also had to explain the business context and forces that generated these requirements and our realisation of them in UML models. We found that even skilled UML practitioners were not always able to consider a UML modelling element and relate that back to a specific business requirement such as codeshare. The level of detail they were confronted with in relatively short periods overwhelmed the non-technical audience. They regularly missed the important points we tried to make during a walkthrough. We concluded


Jim Arlow, Wolfgang Emmerich, and John Quinn

that walkthroughs through UML diagrams are not the right way to communicate with a non-technical audience about essential business requirements. We found that a more permanent representation was needed to explain the business requirements that led to UML models. We decided to extend the UML with Business Context documents that capture this information. We used the following approach:

1. Any UML class model is partitioned into packages, which map onto essential, recognised business areas. For example, we might have Products, Accounts and Orders.
2. Within each of these packages, UML class diagrams are created with the criterion that each diagram should “tell a story”. By this, we mean that the diagrams should illustrate important business processes and requirements.
3. Business Context documents are created which discuss in general terms each of these business areas. They discuss background, general principles and concepts, essential requirements and the forces that shape this part of the business.
4. The Business Context documents are always worded in terms of the classes in one of the class diagrams. For example, if we are writing a Business Context about the Products area, then it will contain phrases such as “Each Product has a PriceList which contains zero or more Prices”. It will then go on to explain exactly why this has to be the case, often presenting simple examples. The words in italics are the names of specific classes on one of the diagrams in the Products package. In a similar way, all attributes, operations and relationships that are discussed in the Business Context are explicitly named in the model, and are always referenced by name. In this way, we tie a high-level description of the business requirements and context directly to the class diagram. This gives a high degree of requirements traceability without introducing formal requirements engineering techniques.
5. UML diagrams are embedded in the Business Context documents as figures.
6. When a UML diagram is discussed, a brief explanation of the relevant UML syntax is included in a footnote. The discussions are carefully constructed so that:
   (a) The non-technical reader can follow them without referring to the diagram.
   (b) The readers with limited object-orientation, database or UML knowledge can glean some understanding of the diagram by referring to the footnotes.
7. Any description of any part of the model is always from a business perspective.
8. Interesting or subtle use of UML and use of design patterns is referenced in footnotes for the technical reader.
9. We have found that all our readers like real, yet simple, examples to illustrate specific points. Again, the example will be couched, wherever possible, in terms introduced in the UML model. If we find we have to use business terms which do not exist in our model, this indicates that either:
   (a) The example exceeds the scope of the model in some way.
   (b) The scope of our model needs to be expanded.
10. We find that the Class Diagram, Use Case Diagram and Sequence Diagram are quoted most often in the Business Context document. However, in some cases, we have found it expedient to include references to State Diagrams for more technical readers.
11. Informal diagrams can be used liberally wherever they enhance the text.
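The naming rule in step 4 and the scope check in step 9 lend themselves to a mechanical check. The following is only a minimal sketch of that idea, not something the approach above prescribes: the asterisk markup standing in for italics, the `MODEL` dictionary, the `Discount` term and all function names are assumptions made for illustration.

```python
import re

# Hypothetical model: package name -> class names shown on its diagrams.
MODEL = {
    "Products": {"Product", "PriceList", "Price"},
    "Orders": {"Order"},
    "Accounts": {"Account"},
}


def marked_terms(text):
    """Return the terms marked *like this* (standing in for italics)."""
    return re.findall(r"\*([A-Za-z_]\w*)\*", text)


def resolve(term, known):
    """Map a marked term to a class name, allowing a simple plural 's'."""
    if term in known:
        return term
    if term.endswith("s") and term[:-1] in known:
        return term[:-1]
    return None


def check_traceability(text, model):
    """Split the marked terms into model classes found and terms missing."""
    known = set().union(*model.values())
    found, missing = [], []
    for term in marked_terms(text):
        name = resolve(term, known)
        if name is None:
            missing.append(term)
        else:
            found.append(name)
    return found, missing


context = ("Each *Product* has a *PriceList* which contains zero or more "
           "*Prices*. A *Discount* may apply.")
found, missing = check_traceability(context, MODEL)
```

A non-empty `missing` list corresponds to the situation in step 9: either the example exceeds the scope of the model, or the model needs to be expanded.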

Literate Modelling - Capturing Business Knowledge with the UML


A good Business Context is actually quite difficult to write, as the author must have a very sound and broad overview of the business, excellent UML modelling and communication skills, and be highly articulate and literate. One of us is a licensed practitioner of Neuro-Linguistic Programming [5], and was able to bring these skills directly to bear in the creation of the Business Contexts. Neuro-Linguistic Programming is a model of communication, and a set of specific techniques, which facilitate and improve the quality of communication. We find that we get the best results when the Business Context document is lively, involving, direct, provocative, precise, concise and, if possible, humorous. It is, however, difficult to incorporate humour well and it should be avoided if in doubt. The passive voice should never be used as it has hypnotic qualities [2], disassociates the reader from the story line of the document and leads to boredom and low comprehension. The purpose of the Business Context is communication, and this requires capturing and involving the attention of the reader. In particular, we have found in many different companies that high-level managers often have quite short attention spans.

4.2 Business Processes, Business Context and Packages

It is well known that a business workflow might cut across the Package and Use Case structure of a UML model. For example, selling a Product will require collaboration between classes in the Products, Orders and Accounts packages as a minimum and will involve many Use Cases. UML Activity Diagrams cut across Packages and Use Cases, and model workflow in which several Use Cases participate. Similarly, our Business Context documents cut across Packages and Use Cases. A Business Context document associated with Orders is primarily associated with the Orders package, but it actually quotes from diagrams that belong to the Products, Accounts and Orders packages.
4.3 Making Enterprise Object Modelling succeed with Literate Modelling

We believe that the issues of accessibility and comprehensibility are two critical reasons why Enterprise Object Modelling has largely failed so far. In our experience, Literate Modelling successfully addresses these issues. Literate Modelling renders object-oriented notations, such as the UML, useful for Enterprise Object Modelling. It is particularly important in that application domain that the ideas and concepts expressed in a model can be explained to, and discussed with, a large body of non-technical stakeholders. The textual descriptions provided as part of a literate model render formal UML diagrams understandable and make them accessible for review by these non-technical people. The transient descriptions provided during a walkthrough are largely inappropriate for that purpose.

There is a second advantage to using Literate Modelling for Enterprise Objects: the textual explanations of business semantics that result from literate models are extremely precise. When applying Literate Modelling in the Enterprise Object Model of British Airways, we found that the textual descriptions that accompany a UML diagram explain business semantics more precisely than those that can be elicited from stakeholders. A marketing manager at British Airways greeted one of our literate models as the best description of codeshare he had ever seen. We attribute this to the fact that


analysts develop a very precise and unambiguous understanding of concepts during the development of UML models. This enables them to write very clear and well-structured natural language descriptions, which are largely free of ambiguity.

4.4 Literate Modelling - The Future

We see a bright future for Literate Modelling, and will certainly apply it in future projects. We hope to develop the technique in the following ways:

1. CASE support. At present, the Business Context is maintained in a word processor; ideally, we would like it to be managed by the CASE tool.
2. Even high-level models can rapidly become quite complex. We feel the need to introduce a Business Roadmap document that provides a high-level overview of the Business Context documents and, as its name suggests, provides a way for interested parties to rapidly find the information they need.
3. We would like to publish the Literate Model on the Internet/Intranet:
   (a) We would like to hyperlink key words and phrases in the Business Context document directly to the appropriate UML diagrams.
   (b) We would like to hyperlink the Business Contexts and model to other important documents that may already exist.
   (c) A search engine would be a useful addition. Users could then look up key words and phrases in the Business Contexts.
   (d) We would like discussion forums where readers can discuss the model.
   (e) FAQs (frequently asked questions) have proven to be a very powerful way to lower support costs, so we would like to develop FAQs for our model. We have found that the same questions come up repeatedly. We expect that we will need a Business FAQ for each Business Context document and a Technical FAQ for UML modelling in general.
4. Attempt to define a more formal structure for the Business Context document and generate guidelines for writing Business Contexts.
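Item 3(a) could be approached along the following lines. This is a rough sketch under stated assumptions rather than a description of any existing tool: the `DIAGRAM_URLS` table, its URLs and the `hyperlink` function are hypothetical, and only whole-word occurrences of class names are linked.

```python
import html
import re

# Hypothetical table: class name -> page that shows the relevant diagram.
DIAGRAM_URLS = {
    "Product": "diagrams/products.html#Product",
    "PriceList": "diagrams/products.html#PriceList",
}


def hyperlink(text, urls):
    """Return HTML in which each whole-word class name links to its diagram."""
    if not urls:
        return html.escape(text)
    # Try longer names first so a long name wins over a shorter one it contains.
    names = sorted(urls, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    out = []
    # With one capturing group, re.split puts the matched names at odd indices.
    for i, part in enumerate(pattern.split(text)):
        if i % 2:
            out.append('<a href="{}">{}</a>'.format(
                html.escape(urls[part]), html.escape(part)))
        else:
            out.append(html.escape(part))
    return "".join(out)


linked = hyperlink("Each Product has a PriceList.", DIAGRAM_URLS)
```

Splitting the original text before escaping keeps the link markup and the escaped prose separate, so class names are never matched inside HTML entities.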

5 Summary

Our initial application of Literate Modelling has been very successful and we believe that the technique addresses accessibility and comprehensibility issues in the UML. The embedding of UML diagrams into explanatory text renders the diagrams comprehensible, even for audiences who are not literate in object-oriented concepts and notation. UML analysts are able to describe business semantics more precisely once they have formalised them from different perspectives.

Literate Modelling is highly advantageous when UML models have to be used as a basis for communication between those with a very detailed understanding of object-oriented concepts and those who are not object-oriented literate. This is particularly the case when object-oriented analysts need to interact with stakeholders who have a non-technical understanding. This type of communication is required during object-oriented analysis, enterprise object modelling and domain modelling.


We conclude by pointing out that we believe that Literate Modelling is also beneficial for visual notations other than the UML. All visual object-oriented analysis notations that we are aware of suffer from precisely the same problems as the UML when they need to be used for communication with an audience that is not object-oriented literate. Yet they are all amenable to Literate Modelling, as they all produce diagrams of different forms that can be embedded into explanatory text. We suspect that the technique is just as applicable to non-object-oriented formalisms, though we suspect that many of these lack some of the expressiveness of the UML and other object-oriented notations. This makes them less suitable for enterprise and domain modelling.

References

1. J. Arlow and M. Phoenix. Introducing Object Technology into British Airways. In Proc. of the Object Expo Europe 1995. SIGS Conferences Ltd, 1995.
2. R. Bandler and J. Grinder. Patterns of the Hypnotic Techniques of Milton H Erickson. Meta Publications, 1977.
3. G. Booch. Object Oriented Analysis and Design with Applications. Benjamin Cummings, 1994.
4. T. de Marco. Structured Analysis and System Specification. Yourdon, 1978.
5. R. Dilts. Applications of Neuro-linguistic Programming. Meta Publications, 1983. ISBN 0-916990-13-0.
6. D. Knuth. Literate Programming. The Computer Journal, pages 97–111, 1984.
7. Object Management Group, 492 Old Connecticut Path, Framingham, Mass. UML Notation Guide, ad/97-08-05 edition, Nov 1997.
8. B. Paech. On the Role of Activity Diagrams in UML. In J. Bezivin and P.-A. Muller, editors, UML 98 – Beyond the Notation, Proc. of the Int. Workshop, Lecture Notes in Computer Science. Springer, 1999.
9. J. Rumbaugh, M. Blaha, W. Premerlani, F. Eddy, and W. Lorensen. Object-Oriented Modeling and Design. Prentice Hall, 1991.

Applying UML to Design an Inter-Domain Service Management Application

Mohamed Mancona Kandé¹, Shahrzade Mazaher², Ognjen Prnjat³, Lionel Sacks³, and Marcus Wittig¹

¹ GMD-Fokus, Kaiserin-Augusta-Allee 31, D-10589 Berlin, Germany {kande, Wittig}@fokus.gmd.de
² Norwegian Computing Center, P.O.Box 114 Blindern, 0314 Oslo, Norway [email protected]
³ University College of London, Dept. of Electrical Engineering, Gower Str., London WC1E6BT, UK {oprnjat, lsacks}@eleceng.ucl.ac.uk

Abstract. We present a component-oriented approach to demonstrate the use of the Unified Modeling Language (UML) and Open Distributed Processing (ODP) concepts to design service components of a telecommunications management system. This paper is based on the work done in the design phase of the ACTS project “TRUMPET” (Inter-Domain Management with Integrity). This project undertook to produce a service management architecture suitable for emerging liberalised telecommunications markets. The criteria were that the system should be highly distributed both technologically and administratively -- i.e., across many kinds of organisations. The TRUMPET project presents a good model environment to develop not only the service architecture itself but also the methodologies for producing such designs. In our approach, we use the conceptual framework of ODP and discuss some methodological issues related to ODP-viewpoint modelling of distributed systems using UML notations. We conclude by recognising the power of combining UML and ODP so as to manage the complexity of the problem.

1 Introduction

Designing and implementing complicated distributed control systems in large international groups or consortia is non-trivial. Techniques are required to ensure that all the participants understand where their work applies, the work of their collaborators, and the relationship between these. Further, when building systems which are expected to have some longevity, it is important that people looking at the documentation can understand what is involved. These considerations motivate the use of architectural design methodologies and semi-formal methods in system modelling. Another important consideration is that the methodologies used should facilitate close analysis of the design to ensure, at early stages, that the system is complete in meeting its requirements and consistent in its operation. This should yield robust, well-engineered products. Finally, when presenting the product to users, its functionality and roles within their business practices must be clearly documented and defined (following the motivations behind the Network Management Forum OMNIPoint Ensembles [9]). This paper illustrates how two popular techniques can be combined to achieve these goals, and how this was done in the context of the TRUMPET service management architecture [3].

J. Bézivin and P.-A. Muller (Eds.): «UML» ’98, LNCS 1618, pp. 200–214, 1999. © Springer-Verlag Berlin Heidelberg 1999

The TRUMPET project undertook to produce a service management architecture suitable for emerging liberalised telecommunications markets. The criteria were that the system should be highly distributed, both technologically and administratively -- i.e., across many kinds of organisations. Specifically, these organisations include: Public Network Operators (PNOs), Customer Premises Network centres (CPNs) and third party Value Added Service Providers (VASPs) or bandwidth retailers. Moreover, the TRUMPET consortium consisted of a number of engineers from several companies across Europe who had to collaborate to design, build and run the system. Thus the TRUMPET project presented a good model environment to develop not only the service architecture itself, but also the approach for producing such an architecture.

The two techniques used were the ITU-T Open Distributed Processing (ODP) [10] architecture and the Unified Modeling Language (UML) [11]. ODP is a framework that supports the building of distributed systems. At the top level, ODP requires that the system at hand be modelled from a number of different viewpoints defined within the ODP Reference Model (RM-ODP) [10].
UML is a semi-formal modelling language which provides a diagrammatic notation and its semantics using mature object-oriented concepts. By using the appropriate UML diagrams for each of the ODP viewpoints, it is possible to design a system with clear traceability between the different viewpoints. UML and ODP, both being well documented and widely accepted within industry, can be understood by engineers from many countries and companies. Using the two together therefore makes it possible to present the designs in an -- almost -- universal language both to collaborating designers and to customers.

The rest of this paper is arranged as follows. In Section 2, a brief overview of the TRUMPET scenario is given. Section 3 describes the ODP viewpoints and discusses the rationale behind using UML for describing ODP viewpoints. In Section 4, a Virtual Private Network service is specified in each of the ODP viewpoints by using the appropriate UML diagram types. Section 5 concludes the paper by assessing the approach used.

2 The TRUMPET Scenario

The ACTS project TRUMPET (Inter-Domain Management with Integrity) focuses on the secure operation of inter-domain management systems within the Open Network Provisioning framework. The TRUMPET scenario in Fig. 1 involves the following players: two (or more) Public Network Operators (PNOs), a Value Added Service Provider (VASP), and a number of customers at various sites -- Customer Premises Networks (CPNs) [2], [15]. The customers see an end-to-end connection and are not necessarily aware of which PNOs are contributing to establish the connection. The VASP sees the connection as a set of segments, each supported by a different PNO, but does not know how each segment has been set up within the corresponding PNO (i.e., what Asynchronous Transfer Mode-based switches are used).

The management systems of the players mentioned above form a service provisioning system for management and provision of broadband (Asynchronous Transfer Mode-based) network connections between two customers/end users. The CPN is a dedicated service in the customer organisation, which already has a contract with the VASP. The VASP management system provides network connectivity to customers by utilising the resources of one or more Public Network Operators. The VASP allows customers to create, modify and delete connections, thus effectively providing the Virtual Private Network (VPN) service to the customers. PNOs provide the physical infrastructure, i.e., the network, and the adequate management interface to interact with it.

[Figure: Customer1 and Customer2 (end users), the CPN OS in Customer Premises Network 1 and Customer Premises Network 2, the VASP OS, and the PNO OS of PNO A and of PNO B, with interfaces labelled Xuser’, Xuser’’ and Xcoop.]

Fig. 1. TRUMPET Scenario

The TRUMPET management system thus represents a typical component-based application for distributed service management. On entering the design phase, a need emerged for a suitable methodology and supporting notation for modelling such a distributed system. Having critically reflected on existing methodologies and notation schemes, the consortium decided to adapt the Unified Modeling Language concepts to the ODP framework and design the system in such a fashion.


3 Design Approach: UML and ODP

This section is concerned with the methodology and the notation chosen for the analysis and design of the TRUMPET management system. It presents the methodology and discusses the rationale for the choice of the notation to be used. A Methodology represents a set of rules for structuring the task of developing a system, by grouping the analysis and design information and defining steps to be followed during system refinement. Notation refers to the way, either textual or diagrammatic, of representing the information, and of describing the structure and functionality of the system and its components.

The methodology adopted in TRUMPET is that of the Open Distributed Processing Reference Model (RM-ODP) [10]. ODP provides a general architectural framework to which distributed systems, operating in an open, heterogeneous environment, must conform. The basis of this architectural framework is the development methodology encompassing five distinct viewpoints. These viewpoints allow different participants in system development to observe the system from different perspectives and levels of abstraction:

• The Enterprise viewpoint is directed at the needs of the users of a system. The system is modelled in terms of the required functionality, the domains involved, the actors and their roles.
• The Information viewpoint describes a consistent, common view of information resources that support the information requirements of the Enterprise viewpoint. It also defines relationships between information elements and the information processing.
• The Computational viewpoint is concerned with the functional decomposition of a system, i.e., the computational objects, the interfaces they offer (the functionality they support), and the interactions between them.
• The Engineering viewpoint describes the infrastructure needed to support distribution, i.e., the provisioning of transparencies. The Engineering viewpoint defines the communication needs and the deployment of functionality.
• The Technology viewpoint is concerned with the details of the components and platforms from which the system is constructed.

The viewpoints are partial views of the complete system specification, and the description of the same component can exist in different viewpoints. This gives rise to a viewpoint consistency issue, i.e., the consistency of specifications across different viewpoints. For each viewpoint, ODP has defined a language, meaning a set of concepts and rules to be used when specifying the system in the corresponding viewpoint. However, it has left open the issue of which notation to use for these viewpoint languages.

There are some major concerns when choosing notations for the ODP viewpoints. First, since ODP is based on the object paradigm, the notation for each viewpoint should support that paradigm as well. Second, the notation should be able to express the concepts defined in that viewpoint. Third, it should be possible to check the consistency of the different viewpoint specifications of a system. It would be an added advantage if one could trace an entity that occurs in more than one viewpoint.


Another consideration is that the notation should be intuitive and easy to use and understand. In other words, the notation should ease the communication among development team members and others outside the team (such as customers). This latter consideration becomes even more important in a distributed, international environment such as that of TRUMPET.

Different semi-formal and formal languages are used for specifying different ODP viewpoints. Formal descriptions are employed in the ODP framework to enable a precise and unambiguous definition and interpretation of ODP standards. But usually, different languages are used for different viewpoints, making it difficult to check consistency between different viewpoint specifications and to trace system components across viewpoints. Applying a single language/notation to all of the ODP viewpoints will solve these problems, but it requires that the language/notation have a rich set of core concepts to cover all the viewpoints.

There are several (semi-) formal languages/notations that can be considered for the purpose of specifying ODP viewpoints. Among those, the notation that best fulfils the requirements above is the Unified Modeling Language (UML). Among the other languages, the Specification and Description Language (SDL) [12] is also object-oriented and has a graphical notation, but it lacks the richness of concepts available in UML. Moreover, SDL does not have any extension mechanism, such as UML’s stereotype, to compensate for the missing concepts. The lack of an extension mechanism also applies to the other formal languages considered. The formal languages based on mathematical notations, e.g., LOTOS [8] and Z [16], are not only difficult to understand and communicate, but also have a limited set of basic concepts.

The approach proposed in [7] for mapping the different ODP viewpoints to UML diagram types has been developed and adopted in the TRUMPET project. This approach is illustrated in Fig. 2. The major benefit of such a mapping consists in supporting both an object-oriented modelling and design process, and the design of reusable components and distributed services [6] in the sense of distributed systems as described in the Reference Model for Open Distributed Processing (RM-ODP) [10]. A combination of ODP and UML also helps to map and to implement some ODP functions in different technologies like the Common Object Request Broker Architecture (CORBA) and the Telecommunications Information Networking Architecture - Distributed Processing Environment [13].

Fig. 2 depicts the links between ODP viewpoints [1], and shows the relationships between these viewpoints and the UML diagrams, as well as between UML diagrams themselves (which are mapped to the same viewpoint, specifying different aspects of the same objects). The design of a distributed management application starts with capturing the requirements of the system in terms of Use Case Diagrams. The results obtained from a use case model may be used to present high-level Static Structure Diagrams/Class Diagrams as indicated in the Enterprise viewpoint, and these diagrams can be specified in more detail in the Information viewpoint. The step from Class Diagrams to Statechart Diagrams allows the specification of the dynamic behaviour of significant information objects.

[Figure: the five ODP viewpoints (Enterprise, Information, Computational, Engineering, Technology) with the UML diagram types placed in them: Use Case, Static Structure, Statechart, Sequence, Collaboration, Activity, Component and Deployment Diagrams. The legend distinguishes three kinds of link: relationship between specifications of different viewpoints; relationship between different models of same objects; relationship between viewpoints in development lifecycle.]

Fig. 2. Relationships between UML Diagrams and ODP Viewpoints

The Computational viewpoint maps to the UML Collaboration, Sequence, Activity and Component Diagram types. While the UML Collaboration Diagram shows the interactions among instances and their links to each other, the Sequence Diagram describes object interactions arranged in a time sequence. The Activity Diagram allows the specification of the order in which activities (such as operations provided by a computational object interface) have to be executed. The Component Diagram shows the organisations and dependencies among components. In addition to the Component Diagram, the Deployment Diagram is also mapped to the Engineering viewpoint. This latter diagram type shows how components and objects are distributed and moved around the system [4]. There is no mapping between UML and the Technology viewpoint offered by this approach.
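The mapping described in this section can be written down as a small lookup table. The sketch below is only an illustration: it records the assignments stated in the text (not every box of Fig. 2), and the names `VIEWPOINT_DIAGRAMS`, `viewpoints_using` and `shared_diagram_types` are assumptions introduced here.

```python
# Viewpoint -> UML diagram types, as stated in the surrounding text.
# The Technology viewpoint has no UML mapping in this approach.
VIEWPOINT_DIAGRAMS = {
    "Enterprise": ["Use Case", "Static Structure"],
    "Information": ["Static Structure", "Statechart"],
    "Computational": ["Collaboration", "Sequence", "Activity", "Component"],
    "Engineering": ["Component", "Deployment"],
    "Technology": [],
}


def viewpoints_using(diagram):
    """Return the viewpoints in which the given diagram type is used."""
    return [vp for vp, diagrams in VIEWPOINT_DIAGRAMS.items()
            if diagram in diagrams]


def shared_diagram_types():
    """Return diagram types used in more than one viewpoint, in table order."""
    seen = []
    for diagrams in VIEWPOINT_DIAGRAMS.values():
        for diagram in diagrams:
            if diagram not in seen:
                seen.append(diagram)
    return [d for d in seen if len(viewpoints_using(d)) > 1]


shared = shared_diagram_types()
```

Diagram types that occur in two viewpoints are natural places to trace an entity from one viewpoint specification to the next.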

4 Case Study

This section describes how, within the ODP framework, different UML concepts and notations were used to design the TRUMPET management system. A separate section is dedicated to each of the five ODP viewpoints by projecting the Virtual Private Network service in the corresponding viewpoint using the appropriate UML notation schemes. Each viewpoint first gives an overall view of the TRUMPET management system, and then focuses on a more detailed description of the VPN service within the Value Added Service Provider domain.

4.1 Enterprise Viewpoint

The ODP Enterprise viewpoint represents an overview of the system’s aims, constraints and functionality as seen by the enterprise. This viewpoint models the basic system decomposition into components, identifies actors, policies and domains, and describes the general scenarios of the system’s use. The TRUMPET system incorporates three domains (or enterprise objects in ODP terminology), namely the Customer Premises Network, CPN, the Value Added Service Provider, VASP, and the Public Network Operator, PNO. The PNO domain is further subdivided into PNO Service Layer and PNO Network Layer. These four entities were modelled as UML Packages with interdependencies, using UML Class Diagram notation as shown in Fig. 3.

[Figure: package TrumpetManagementSystem containing the <<domain>> packages PNO (with PNOService Management and PNONetwork Management), VASP and CPN.]

Fig. 3. The TRUMPET Class Diagram Enterprise Package

Next, the desired functionality of the system is described. In the ODP context, this is done by specifying the scenarios, or Use Cases, that describe how actors/entities interact in the context of using the system.

[Figure: Use Case diagram of TrumpetManagementSystem with <<user>> actors and the Use Cases Reserve Connection, Release Connection, Modify, Status Request, Notify Activation and Connection Release Notification.]

Fig. 4. The Use-Case Diagram

The TRUMPET scenarios were specified using the UML Use Case diagram, depicting the actors, sets of Use Cases (ellipses) within a system, and associations between actors and Use Cases, as illustrated in Fig. 4. Note that the UML stereotype <<community>> has been used to classify the high-level enterprise object “TrumpetManagementSystem” as a community in the sense of ODP.


Fig. 4 depicts the functionality (or Management Functions) identified in the Value Added Service Provider management service as Use Cases, and the interaction of the different users with the Use Case package. As shown, there are six Use Cases. Customers/end users are capable of reserving end-to-end connections (of a given duration and desired Quality of Service), modifying them (changing duration, Quality of Service, or both), and releasing, i.e., deleting these connections. Public Network Operators are capable of notifying the users via the Value Added Service Provider of connection activation, or notifying the users of the connection release due to a segment/link failure.

4.2 Information Viewpoint

In the ODP Information viewpoint, the information objects of the system are identified and their structures and relationships described. UML Class Diagrams were used to describe the static structure of the information objects.

[Figure: class diagram with the <<domain>> packages CPN, VASP and PNO; classes Cpn, Vasp, Pno, EndUser, VPNContract, VPNConnection and Segment; associations labelled uses, maps/retrieves, establishes/maintains (1 to 1..*) and connects to.]

Fig. 5. The Class Diagram of the VPN Service

Fig. 5 illustrates, by means of a UML Class Diagram, the overall structure of the Virtual Private Network service: what entities are involved, what their relationships are, and how they interact. In the following, the emphasis is put on the Value Added Service Provider (VASP) entity of the above diagram, and a more detailed description of this entity is given. In the Class Diagram of Fig. 6, the VASP is decomposed into its three main components: the CustomerServer, which handles the communication with the customer domain (CPN); the ControlServer, which, after negotiations with the involved Public Network Operators, establishes, modifies, or releases the Virtual Private Network connections; and the MIB (Management Information Base), which contains all the Managed Objects. These objects contain information about the different entities that the VASP either interacts with or manages, e.g., objects containing information about the VASP customers or objects representing the connections that the VASP manages. Both the CustomerServer and the ControlServer have access to the MIB for either retrieving information from it or updating it.
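The decomposition just described can be sketched in code to make the roles of the three components concrete. This is a toy illustration, not the TRUMPET design: the method names, the dictionary-based Mib, the rule that a connection needs at least one segment, and the CustomerServer delegating to the ControlServer are all assumptions, and negotiation with the PNOs is omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Segment:
    pno: str  # name of the Public Network Operator supporting this segment


@dataclass
class VPNConnection:
    connection_id: str
    segments: List[Segment]

    def __post_init__(self):
        # Assumption: the VASP uses the resources of one or more PNOs.
        if not self.segments:
            raise ValueError("a connection needs at least one segment")


@dataclass
class Mib:
    """Shared store of managed objects, read and updated by both servers."""
    customers: Dict[str, dict] = field(default_factory=dict)
    connections: Dict[str, VPNConnection] = field(default_factory=dict)


class ControlServer:
    """Establishes and releases connections and records them in the MIB."""

    def __init__(self, mib: Mib):
        self.mib = mib

    def establish(self, connection_id: str, pnos: List[str]) -> VPNConnection:
        conn = VPNConnection(connection_id, [Segment(p) for p in pnos])
        self.mib.connections[connection_id] = conn
        return conn

    def release(self, connection_id: str) -> None:
        del self.mib.connections[connection_id]


class CustomerServer:
    """Handles requests from the customer domain (CPN)."""

    def __init__(self, mib: Mib, control: ControlServer):
        self.mib = mib
        self.control = control

    def reserve(self, customer: str, connection_id: str,
                pnos: List[str]) -> VPNConnection:
        # Assumption: only customers recorded in the MIB may reserve.
        if customer not in self.mib.customers:
            raise PermissionError("unknown customer: " + customer)
        return self.control.establish(connection_id, pnos)
```

A reservation by a known customer then leaves one connection object in the shared Mib, made up of one Segment per PNO involved.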

[Figure: class diagram of the VASP showing the classes Vasp, ControlServer, CustomerServer, Mib, CustomerMib and ConnectionMib, with further classes elided (...).]

Fig. 6. The Overall Class Diagram of the VASP

Among the different components of the VASP, the Management Information Base represents the pure informational objects. The other two entities can be regarded as information processing units (manipulating the MIB), although each of them contains information about the current state of the VASP. The customer MIB contains information pertaining to the VASP customers, their respective service profiles, and the terms of their subscriptions. A corresponding MIB exists for the Public Network Operators with which the VASP deals. These MIBs are rather static, in the sense that the information they contain is seldom updated. The Connection MIB contains information about all the connections that the VASP is currently supporting. Furthermore, its structure reflects the view that the VASP has of a connection, i.e., a connection consisting of segments, each individually supported by a Public Network Operator. This MIB is constantly updated (and is therefore dynamic) as requests for new Virtual Private Connections and for the change or release of existing ones are received from the customers.

4.3 Computational Viewpoint

The Computational viewpoint describes how the management functions, identified via Enterprise Use Cases, are performed by the management system. Each management function is described in terms of computational objects and computational activities, the latter representing sequences of operations invoked on computational objects. As a starting point for the computational design, the components identified in the Enterprise viewpoint can be mapped to computational objects which provide an abstract, coarse-grained computational view of the management system. Each component can then be broken down further into a set of computational objects representing the detailed computational object model. At this level, UML Class Diagrams have been used to describe the structure of computational objects, their interfaces, and the relationships between them. At the abstract level, UML Component Diagrams have been used to describe the system components and their external interfaces. Fig. 7 depicts the design of the VaspVpnManager component (referred to in the Information Viewpoint as the Control Server). We distinguish the VaspVpnManagerFacade package, containing the external structure (client view of the component), from the VaspVpnManagerImpl package, which contains the implementation details about the internal class structure.

[Figure: two packages. VaspVpnManagerFacade shows the clients Customer-Server and PnoConnectionManager, the interfaces VPNService and VPNEventHandler, and the abstract class VaspVPNMngr, linked by "calls" and "offers" relationships. VaspVpnManagerImpl shows the same two clients together with the classes VaspVpnMngr, VaspVPConnection, VaspVPSegment, RouteFinder, VPNEventHandler, and CustomerEndPoint.]

Fig. 7. Internal Structure of the VASP-VPN-Manager Computational Object Type

The VaspVpnManagerImpl package realises the interfaces contained in the “façade” package. It does so by implementing objects that directly support the management functions offered by the interfaces, and objects that support the former objects in carrying out their task.

[Figure: the VASP-VPN-Manager, used by the CustomerServer, with its two interfaces. VPNService: reserveConnection(), modify(), getStatus(), releaseConnection(). VPConnServEventHandler: activateConnectionNotify(), releaseConnectionNotify(), connectionNotify().]

Fig. 8. The VASP-VPN-Manager Computational Object Type Diagram with Interfaces

Note that the interface type diagrams of Fig. 8 and Fig. 9 provide the information necessary to easily produce an OMG-IDL (Interface Definition Language) file. This would be the first step in mapping the computational design to a concrete CORBA-compliant implementation platform. The UML diagrams thus directly support and ease the implementation task for those platforms.

[Figure: the two interface types. VPNService: reserveConnection(), modify(), getStatus(), releaseConnection(). VPConnServEventHandler: activateConnectionNotify(), releaseConnectionNotify(), connectionNotify().]

Fig. 9. Computational Object Interface Type for VASP-VPN-Manager
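
To illustrate how directly such interface types translate into code, the two interfaces of Fig. 9 can be written as abstract classes. This is a sketch in Python rather than the OMG-IDL mentioned above. The operation names follow the figure (spelled in snake_case), while every parameter, return type, and the toy class `InMemoryVpnService` are assumptions, since the figure shows empty parameter lists.

```python
from abc import ABC, abstractmethod
from itertools import count
from typing import Dict, Optional


class VPNService(ABC):
    """Operations offered to clients such as the CustomerServer."""

    @abstractmethod
    def reserve_connection(self, duration: int, bandwidth: int) -> str: ...

    @abstractmethod
    def modify(self, conn_id: str, duration: Optional[int] = None,
               bandwidth: Optional[int] = None) -> None: ...

    @abstractmethod
    def get_status(self, conn_id: str) -> str: ...

    @abstractmethod
    def release_connection(self, conn_id: str) -> None: ...


class VPConnServEventHandler(ABC):
    """Notifications received about connections."""

    @abstractmethod
    def activate_connection_notify(self, conn_id: str) -> None: ...

    @abstractmethod
    def release_connection_notify(self, conn_id: str) -> None: ...

    @abstractmethod
    def connection_notify(self, conn_id: str) -> None: ...


class InMemoryVpnService(VPNService):
    """Hypothetical toy implementation keeping reservations in a dictionary."""

    def __init__(self) -> None:
        self._ids = count(1)
        self._connections: Dict[str, dict] = {}

    def reserve_connection(self, duration: int, bandwidth: int) -> str:
        conn_id = f"conn-{next(self._ids)}"
        self._connections[conn_id] = {
            "duration": duration, "bandwidth": bandwidth, "status": "reserved"}
        return conn_id

    def modify(self, conn_id, duration=None, bandwidth=None) -> None:
        conn = self._connections[conn_id]   # KeyError for unknown identifiers
        if duration is not None:
            conn["duration"] = duration
        if bandwidth is not None:
            conn["bandwidth"] = bandwidth

    def get_status(self, conn_id: str) -> str:
        return self._connections[conn_id]["status"]

    def release_connection(self, conn_id: str) -> None:
        del self._connections[conn_id]
```

Separating the abstract interface from the implementing class follows the same split as the façade and implementation packages of Fig. 7: clients depend only on `VPNService`, never on the class behind it.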


Next, the computational activities are described. Computational activities are the interactions between the computational objects in order to perform the management functions defined through Use Cases in the Enterprise viewpoint. Interaction between computational objects is described in terms of an operation invocation initiated by a client object requesting an operation to be performed by a server object. Precedence rules are used to define the sequence of operations performed when an interaction takes place. To describe the interactions between computational objects, UML Collaboration diagrams and Sequence diagrams have been used. These two diagram types convey mostly the same information. Depending on how important the time dimension or the lifetime of computational objects is, one could use one or both of the diagram types.

[Figure: sequence diagram with the lifelines :CustomerServer, :RouteFinder, :VASP-VPN-Manager, :VASP-VP-Conn, :PnoConnectionManager, :VASP-VP-Segment, and :CustomerEndPoint, and the messages
1: reserveConnection()
2: findRoute(CustId, CustId)
3: create(VaspId, CustId, CustId, Duration, Bw)
4: reserveConnection(VaspId, VaspId, AccessP, AccessP, Duration, Bw)
5: create(ConnId, AccessP, AccessP, Bw)
6: allocateConnection(VaspId, VaspId, AccessP, AccessP, Duration, Bw)
7: create(CustId, CPNConnId, AccessP, Bw)]

Fig. 10. The Reserve Connection Sequence Diagram

The UML Sequence and Collaboration Diagrams depicted in Fig. 10 and Fig. 11 describe the Reserve Connection Use Case defined in the Enterprise viewpoint. These diagrams illustrate the interactions, sequences of messages, and relationships among computational components (such as PnoConnectionManager and VaspVpnManager) as well as programming-level objects (e.g., instances of UML objects within the VaspVpnManager package) [7].

[Figure: collaboration diagram showing the same seven numbered messages as Fig. 10, exchanged among :Customer-CPN, :CustomerServer, :VASP-VPN-Manager, :RouteFinder, :VASP-VP-Conn, :PnoConnectionManager, :VASP-VP-Segment, and :CustomerEndPoint.]

Fig. 11. The Reserve Connection Collaboration Diagram
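
The first five messages of the Reserve Connection interaction can be traced in a few lines of code. The Python sketch below is a reading of the diagrams under stated assumptions: which object sends and receives each message, the simplified parameter lists, the stub return values, and the `trace` list are all hypothetical, and messages 6 and 7 (allocateConnection and the creation of the CustomerEndPoint) are left out.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class VaspVpSegment:
    conn_id: str
    access_a: str
    access_b: str
    bw: int


@dataclass
class VaspVpConnection:
    vasp_id: str
    cust_a: str
    cust_b: str
    duration: int
    bw: int
    segments: List[VaspVpSegment] = field(default_factory=list)


class RouteFinder:
    def __init__(self, trace: List[str]) -> None:
        self.trace = trace

    def find_route(self, cust_a: str, cust_b: str) -> List[Tuple[str, str]]:
        self.trace.append("2: findRoute")
        # Hypothetical route: a single segment between two access points.
        return [(f"ap-{cust_a}", f"ap-{cust_b}")]


class PnoConnectionManager:
    def __init__(self, trace: List[str]) -> None:
        self.trace = trace

    def reserve_connection(self, vasp_id: str, access_a: str, access_b: str,
                           duration: int, bw: int) -> str:
        self.trace.append("4: reserveConnection")
        # Stub: always accepts and returns a made-up connection identifier.
        return f"{vasp_id}:{access_a}->{access_b}"


class VaspVpnManager:
    def __init__(self, route_finder: RouteFinder, pno: PnoConnectionManager,
                 trace: List[str], vasp_id: str = "vasp-1") -> None:
        self.route_finder = route_finder
        self.pno = pno
        self.trace = trace
        self.vasp_id = vasp_id

    def reserve_connection(self, cust_a: str, cust_b: str,
                           duration: int, bw: int) -> VaspVpConnection:
        self.trace.append("1: reserveConnection")
        route = self.route_finder.find_route(cust_a, cust_b)
        self.trace.append("3: create VaspVpConnection")
        conn = VaspVpConnection(self.vasp_id, cust_a, cust_b, duration, bw)
        for access_a, access_b in route:
            conn_id = self.pno.reserve_connection(
                self.vasp_id, access_a, access_b, duration, bw)
            self.trace.append("5: create VaspVpSegment")
            conn.segments.append(VaspVpSegment(conn_id, access_a, access_b, bw))
        return conn
```

Calling `reserve_connection` on a manager built with a shared list records the messages in the numbered order of the diagrams, which makes the precedence rules easy to inspect.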

4.4 Engineering Viewpoint

This viewpoint focuses on the actual realisation of interactions between distributed objects and on the resources needed to accomplish this interaction. It comprises concepts, rules and structures for the specification of the system viewed from the engineering perspective. This viewpoint introduces three main concepts, namely nodes, clusters and capsules. ODP nodes match UML nodes quite closely; therefore, as depicted in Fig. 12, the concept of UML nodes was applied to design PNO-Host, VASP-Host, and CPN-Host. An ODP capsule is a grouping of engineering objects forming a single unit for the purpose of encapsulation, processing and storage (e.g., VpnManagerServer and PNOConnMngrServer in Fig. 12). Capsules can be thought of as runtime modules. The UML Component Diagram is therefore the appropriate candidate to specify the set of runtime modules and their interactions. The overall view of the system, i.e., how the capsules are distributed across the nodes, is then most naturally conveyed by the UML Deployment Diagram. An ODP cluster refers to a group of objects that are always together and can migrate only as a whole from one capsule to another (whether on the same node or not). The tight coupling that exists among the cluster objects is best conveyed by the UML Composition concept of Class Diagrams. In Fig. 12, the capsule CustomerServer contains a cluster, a composite object tagged with the stereotype <<Cluster>>, that can migrate from the VASP-Host to the CPN-Host as indicated by the <> stereotype.

[Figure: deployment diagram with the nodes PNO-Host, VASP-Host, and CPN-Host. The components shown are PNOConnMngrServer, VpnManagerServer, CustomerServer, and UserApplication, connected by <<calls>> dependencies; a <<Cluster>> containing GUIa and GUIb appears twice.]

Fig. 12. Deployment Diagram
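
The three engineering concepts and the rule that a cluster migrates only as a whole can be sketched as follows. The Python classes and the `migrate` function are hypothetical names introduced for illustration; placing the migrated cluster in the UserApplication capsule on the CPN-Host is likewise an assumption, since the text names only the hosts involved.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Cluster:
    """Engineering objects that always stay together."""
    name: str
    objects: List[str]


@dataclass
class Capsule:
    """Runtime module grouping clusters for encapsulation and processing."""
    name: str
    clusters: Dict[str, Cluster] = field(default_factory=dict)


@dataclass
class Node:
    """Host on which capsules are deployed."""
    name: str
    capsules: Dict[str, Capsule] = field(default_factory=dict)


def migrate(cluster_name: str, source: Capsule, target: Capsule) -> None:
    """Move a cluster as a whole; its objects are never split up."""
    cluster = source.clusters.pop(cluster_name)   # KeyError if not present
    target.clusters[cluster_name] = cluster


# Usage mirroring Fig. 12 (the cluster name "gui" is made up).
customer_server = Capsule("CustomerServer",
                          {"gui": Cluster("gui", ["GUIa", "GUIb"])})
user_application = Capsule("UserApplication")
vasp_host = Node("VASP-Host", {"CustomerServer": customer_server})
cpn_host = Node("CPN-Host", {"UserApplication": user_application})

migrate("gui", customer_server, user_application)
```

Because `migrate` moves the `Cluster` object itself rather than its members, the composition between the cluster and its objects is preserved across capsules and nodes.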

4.5 Technology Viewpoint

The ODP Technology viewpoint describes the choice of implementation technologies used to bring the design accomplished through the four previous viewpoints to life. This viewpoint describes the configuration of the hardware and software on which the distributed system relies. Although there are no dedicated UML diagrams to describe this viewpoint, the object-oriented concepts of encapsulation and abstraction provided by UML in previous design steps allow the system to be implemented in a heterogeneous environment in terms of computer architectures, programming languages and operating systems. This is one of the main advantages of the object-oriented approach adopted by UML. The Technology viewpoint was therefore described using plain English.

The Customer Premises Network (CPN) is realised as a group of Java objects providing an interface to the Value Added Service Provider’s Virtual Private Network functionality. An HTML-based user interface is also provided as the end-user interface to the CPN Java objects. The CPN-VASP communications are implemented in Voyager, a Java-based communications mechanism. The VASP is fully implemented in Java, apart from the Management Information Base, which is based on the Lightweight Directory Access Protocol (LDAP); LDAP effectively supports the functionality required for a Telecommunications Management Network (TMN)-like Management Information Base [5]. The Public Network Operator’s management system is a TMN-OSI platform, HP-OpenView, which communicates via the Common Management Information Protocol (CMIP). Thus, the VASP implementation requires a Java-to-CMIP gateway, which is realised as a platform-independent CORBA gateway. These technologies were chosen so as to fulfil the trial requirements and project aims.

5 Conclusion: Experiences and Lessons Learned

This paper discussed the rationale behind the choice of UML as a notation scheme for describing ODP viewpoints and its application in the context of the ODP framework for the design of telecommunications management services. A case study describing the development of the TRUMPET inter-domain service management system illustrated the approach. This approach proved to have some advantages that validated the choice of UML, as well as some drawbacks.

There were two main advantages in using the ODP-UML combination. First, there was no need to shift paradigm when trying to represent ODP core concepts in UML, since both are based on object-oriented principles. This eased the task of mapping by allowing us to focus solely on finding the most suitable UML representation for the entities to be specified; different paradigms would have entailed first mapping (the building blocks of) the two paradigms to each other. Second, both UML and ODP look at a system from different perspectives. UML’s versatile diagram types made it possible to use only one notation to specify all of ODP’s five viewpoints, as illustrated by the example used throughout the paper (Use Cases were used in the Enterprise, Class Diagrams in the Information, Collaboration Diagrams and interface types in the Computational, and Component Diagrams and Deployment Diagrams in the Engineering viewpoints).


Furthermore, the use of a single notation for all ODP viewpoints resulted in a shorter start-up phase for the project. It also had the benefit that consistency checks between the different viewpoint specifications could be done more easily, as needed, by the project members. Conversely, the ODP framework proved to be an efficient way of structuring the different UML notations and thus of managing the potentially high complexity of large models.

Although the UML notation has many attractive aspects for use in the design of a distributed system, it also has some drawbacks. Many of the ODP core concepts are not directly supported by UML. For such cases, UML introduces the concept of stereotypes to provide for extensibility. In the example of the previous sections, stereotypes have been used extensively to map ODP concepts that did not have a direct counterpart in the UML notation, such as enterprise objects, communities, clusters, etc. Moreover, although the concept of an interface is part of UML, its description uses the same notation as for a class. Again a stereotype, <<interface>>, has been used to differentiate between the class and the interface descriptions. The existence of an extension mechanism compensates for the lack of suitable basic concepts, but the extensive use of the same notation to express quite different concepts becomes confusing at best. One of the main benefits of a pictorial notation is that a reader can get a general understanding of a given diagram without having to rely on the annotation text. This benefit is lost when the same representation is used for many different core entities. UML also did not prove to have enough expressive power to fully describe the ODP concept of the computational object. During the design, only the external interfaces provided by a component were specified, and concepts like binding rules and lifetime aspects were not included.
In conclusion, the use of the ODP-UML methodology in TRUMPET proved efficient in supporting collaborative work in a large, geographically distributed consortium. After the initial methodology was decided upon, the work assignment was agreed on and understood within an afternoon of discussions. Once the division of labour was made, the consortium undertook to design the system according to the approach defined. Since UML is quite widespread in both industry and academia, most people involved in the project had had some kind of experience with it before. Those less fluent in UML picked it up rather easily, partly because of its graphical notation and intuitiveness and partly because of their knowledge of other similar notations. The introduction of UML in the project was therefore quite smooth and established a common base of communication among the members of the development team. The design was developed within the contractual deadline (three months). There were 15 individuals involved in producing the design document. A high-quality design was achieved despite the size of the TRUMPET system and the limited development resources available. The documents produced in this phase were also used extensively by the trials team, which helped run the system in the operational environment. The shortcomings of the approach were mainly due to the lack of direct support in UML for ODP’s core concepts. These would be greatly alleviated if UML offered a graphical extension mechanism.


Acknowledgements

This paper is based on the original work developed by the ACTS project TRUMPET. The authors wish to thank all the partners of the TRUMPET consortium who contributed to this work. Ognjen Prnjat wishes to acknowledge the financial support provided by the British Council Overseas Research Scholarship. More information on the TRUMPET project can be found at http://ascom.eurecom.fr/ASRL/TRUMPET/Trumpet_public/.

References

1. Berquist, K., Berquist, A. (Eds.): Managing Information Highways. The PRISM book: Principles, Methods and Case Studies for Designing Telecommunications Management Systems. Lecture Notes in Computer Science, Vol. 1164, Springer-Verlag, Berlin Heidelberg New York (1996).
2. ACTS Project AC112 TRUMPET: NIL-Security Prototype Report, Deliverable 6 (1997).
3. ACTS Project AC112 TRUMPET: Detailed Component and Scenario Designs, Deliverable 8 (1997).
4. Fowler, M., Scott, K.: UML Distilled: Applying the Standard Object Modeling Language. Addison-Wesley (1995).
5. Hall, J. (Ed.): Management of Telecommunication Systems and Services: Modeling and Implementing TMN-based Multi-domain Management. Springer-Verlag, Berlin New York Tokyo (1996).
6. Jacobson, I., Griss, M., Jonsson, P.: Software Reuse: Architecture, Process and Organization for Business Success. Addison-Wesley (1997).
7. Kandé, M. M., Tai, S., Wittig, M.: On the Use of UML for ODP-Viewpoint Modeling. In OOPSLA 97 Workshop on Object Oriented Technology for Service, System and Network Management. Atlanta, Georgia (1997).
8. ISO 8807: LOTOS: A formal description technique based on the temporal ordering of the observational behaviour (1989).
9. The Network Management Forum: The OMNIPoint Strategic Framework. A Service-Based Approach to the Management of Network and Systems. NJ (1993).
10. ISO/IEC 10746-1/2/3: Reference Model for Open Distributed Processing. Part 1: Overview / Part 2: Foundations / Part 3: Architecture (1995).
11. Rational Software, Microsoft, Hewlett-Packard, Oracle, Texas Instruments, MCI Systemhouse, Unisys, ICON Computing, IntelliCorp: The Unified Modeling Language, Joint Submission, OMG TC doc ad/97-01-01 - ad/97-01-14.
12. CCITT: Recommendation Z100 Specification and Description Language (SDL) (1992).
13. Graubmann, P., Mercouroff, N.: Engineering Modeling Concepts (DPE Architecture). In TINA Baseline document TB_NS0005_2.0_0.94 (1994).
14. Sacks, L., et al.: TRUMPET Service Management Architecture. In Proceedings of the 2nd International Enterprise Distributed Object Computing Workshop (1998).
15. ISO JTC1/SC22/WG19: Z Notation (draft version 1.4) (1998).

Booster*Process
A Software Development Process Model Integrating Business Object Technology and UML

Axel Korthaus and Stefan Kuhlins

Universität Mannheim, Lehrstuhl für Wirtschaftsinformatik III, Schloß, D-68131 Mannheim
Phone: +49 621 292 5075, Fax: +49 621 292 5701
{korthaus, kuhlins}@wifo.uni-mannheim.de

Abstract. This paper describes a UML-based process model (called Booster*Process) for system development founded on business object technology. It integrates business and software engineering aspects, describes the specific modeling activities needed for business and software system modeling, and connects the various UML diagrams, particularly taking into consideration the requirements of business objects and their component character. It propagates a multi-level approach, starting with use case, activity and class modeling at the organizational level, and then shifting to analysis and design of business applications.

1 Promises of UML and Business Object Technology

Nowadays, new component technologies are emerging rapidly as successors to the object-oriented ideas that have become the mainstream in the software industry. They build upon the most successful concepts of object-orientation, but go further, e.g. by defining larger-grained units of reuse compared with conventional objects in object technology. These component technologies, e.g. OCX/ActiveX [15] or (Enterprise) Java Beans [23], are no longer limited to the field of GUI components, but begin focusing on the implementation of business concepts and business logic. One of the most interesting developments in this area is OMG’s standardization effort for so-called Common Business Objects (CBOs) and a Business Object Facility (BOF) [17], on which we will concentrate in this paper as a representative of those technologies. While CBOs are “objects representing those business semantics that can be shown to be common across most businesses”, the BOF is “the infrastructure (application architecture, services, etc.) required to support business objects operating as cooperative application components in a distributed object environment” [17]. According to the CBO Working Group of the OMG Business Object Domain Task Force (BODTF), business objects capture “information about a real world’s (business) concept, operations on that concept, constraints on those operations, and relationships between that concept and other business concepts. [...] a business application can be specified in terms of interactions among a configuration of implemented business objects.” [18] This definition reflects the object-oriented foundation of business object technology.

J. Bézivin and P.-A. Muller (Eds.): «UML»’98, LNCS 1618, pp. 215–226, 1999. © Springer-Verlag Berlin Heidelberg 1999

The vision of CBOs as both design-time and run-time constructs comprises, above all, the goals of interoperability of business object components, including the possibility of ad-hoc integration (i.e. “plug-and-play”), and simplicity of design, implementation, configuration and deployment, so that an average developer is helped to build business object systems easily [17,22]. The CBO vision aims at a marketplace for standardized “off-the-shelf” business objects which are easily integrated with other business objects through the BOF and are able to interact with each other in order to perform some business function, even if the collaboration was not planned or foreseen by their developers (for ad-hoc integration cf. the Corba Component Initiative [19]). The eventual achievement of these goals would bring software engineering a significant step closer to meeting the increased requirements on modern information systems development.

For the purpose of modeling business and software systems with business objects, a suitable object-oriented analysis and design (OOA&D) method is needed [14]. Object-oriented modeling is another area where the OMG has sought standardization and has been successful recently through the adoption of the Unified Modeling Language (UML) in Nov. 1997 [20]. UML combines common concepts of some earlier analysis and design methods, enhances this set with additional modeling concepts meeting the requirements of current modeling tasks, and defines notational symbols for those concepts. As a general purpose language, UML is designed to model a wide range of different types of systems, from purely technical (non-software) through software to business systems.
In contrast to its predecessors, UML is merely a modeling language, not a complete methodology, because the standard does not include a specification of a particular software development process with recommendations on how to deal with the UML elements. The structure of a UML-based process for modeling systems is strongly dependent on the kind of system under development (e.g. business/software/technical system) as well as on other determinants (e.g. project size). Process definitions have to state which techniques are appropriate at various levels of detail during development, which deliverables have to be produced, who should produce them, and which inspections, standards, metrics, and tests should be used to control quality and certify system correctness [1,13]. A UML-based process for systems development founded on the business object paradigm must span the whole range from business engineering (using business object models) to application engineering (with an emphasis on reusing pre-built business object software components) and business object component engineering (in order to build new business objects). In this paper, we will describe the basics of a process model (Booster*Process) which meets these requirements.

Booster*Process is part of a project called Booster, which has just been launched at the University of Mannheim. Booster is an acronym for “Business Object Oriented Software Technology for Enterprise Reengineering”. Booster will examine topics around object-oriented business and software engineering, OMG business object and infrastructure standards, architecture, analysis, design, and implementation of distributed business object systems, component technologies (e.g. Enterprise Java Beans [23]), business object frameworks (e.g. IBM San Francisco [9]) etc.

2 Multi-Level UML-Based Business Systems Development

According to [7], processes must be viewed from four aspects: process context, process user, process steps, and process evaluation. In the following subsections, we will concentrate on the process steps as well as on the foundations and basic principles of Booster*Process, which describe the activities to be taken and the UML elements to be applied during the process. Although UML does not include a specific process, its designers had certain basic process principles in mind which are derived from the most popular existing methodologies, above all Booch ’93 [4], OMT [21], and OOSE/Objectory [11]. The UML documentation [20] mentions that processes using UML should be use-case-driven, architecture-centric, iterative, and incremental. Booster*Process sticks to these basic principles, which are common practice in object-oriented development, although the concept of use cases is not over-emphasized.

[Figure: the Booster*Process architecture levels and their central UML diagrams.
Business Engineering: Use Case Diagrams, Activity Diagrams, Class/Collaboration Diagrams.
System Architecture Engineering: Component Diagrams, Deployment Diagrams, Class Diagrams.
Application Engineering: Use Case Diagrams, Class/Object Diagrams, State Diagrams, Sequence Diagrams, Collaboration Diagrams.
Business Object Component Engineering: Component Diagrams, Deployment Diagrams.]

Fig. 1. Process architecture of Booster*Process

The most important foundation of Booster*Process is a multi-level approach to business object system development, which is represented by a process architecture. This multi-level process architecture defines a framework for the activities which have to be performed. The macro process constituted by the architecture levels of Booster*Process is broken down into micro processes, which roughly reflect the well-known activities of requirements gathering, object-oriented analysis, design, implementation, and test. The macro process architecture of Booster*Process is shown in Fig. 1. As can be seen from the figure, the levels business engineering, system architecture engineering, application engineering, and business object component engineering are distinguished. Each of these levels is described by a micro process (cf. Subsect. 2.1-2.4). The different kinds of arrows indicate more or less significant directions of information flow and express the principle of iterations and increments, which can be found even in the macro process depicted in Fig. 1. On the right side, those UML diagrams are listed that are most important for the respective engineering activities.

While business and software engineering activities have different viewpoints and different levels of abstraction, they are not independent of each other. Modeling business goals and processes is the basis for deriving requirements on the information systems needed to support the business. New information system technologies, on the other hand, influence the way in which the business processes are to be shaped to provide the best customer value. Therefore, it is very important to integrate these different viewpoints within a comprehensive process, using the same underlying technologies, i.e. UML modeling and business object concepts (see e.g. Taylor’s ideas [24] of “convergent engineering”).

System architecture engineering serves for designing a stable system architecture as a basis for the development of a number of related applications. Fortunately, the BOF RFP [17] already suggests a basic system architecture for OMG business object systems (see Subsect. 2.2). This basic layered architecture has to be refined by company-specific enhancements. Application engineering is an activity resulting in a new software application for the organization. Its most important characteristic is the reuse of pre-existing business objects, at the modeling level as well as at the software component level. Business object component engineering is the process of designing and implementing new business object components for reuse.
This activity may be independent of a concrete application engineering process. Similar distinctions between these two kinds of processes can be found in several approaches: [1], for example, distinguish between solution projects and component projects, while [13] speak of application system engineering and component system engineering. The normal course of activities begins with a business engineering process as the starting point, followed by a system architecture engineering process. When the system architecture is defined, several application engineering processes will be performed, concurrently with several business object component engineering processes, which either result from the requirements generated by the application engineering processes or independently produce components for future needs, in the sense of a domain engineering activity. In the following subsections, the individual levels of Booster*Process are described in detail.

2.1 Business Engineering

A key characteristic of Booster*Process is that it starts at the enterprise level with a business (re-)engineering activity. Part of a successful business (re-)engineering activity is the modeling of the business with its goals, rules, resources, actions, workflow etc. UML provides a number of diagrams which are very useful for this purpose, namely use case diagrams, class diagrams, and activity diagrams (cf. [14]). [12] describe a process for object-oriented business engineering, which builds almost exclusively on use cases for modeling business processes. Booster*Process takes up those ideas, but supplements the use case models with activity diagrams and high-level class models. The basic steps in business engineering follow the pattern of those described by [13], but are adapted to the needs of Booster*Process:

– Formulate a business vision: Define the rationale and the goals of the BPR activity, discuss it with managers and employees, consider new technologies which might be helpful to improve the business, formulate objectives and high-level descriptions of future business processes.
– Reverse engineer the existing business: Build use case models, class models, and activity models of the existing business structures and processes to be improved in order to facilitate a detailed problem identification and analysis. Transform perceived business concepts into suitable business object types, i.e. entity, process and event business objects.
– Forward engineer and implement the new business: Produce a detailed description of the new processes and the internal organization of the business in the form of new versions of the use case, class and activity models. Identify suitable business objects and map them to standardized Common Business Objects and existing domain-specific business objects as early as possible in the process. Identify areas of operation which can be supported by business information systems. Implement the new business incrementally and develop the associated software systems.

In the context of business engineering, use cases appear as business use cases. Business use cases model sequences of work steps performed in a business system which produce a result of perceived and measurable value to business actors.
Business actors are roles that people or external systems in the environment play in relation to the business. The business use cases, which model business processes, should be detailed with the help of high-level class and collaboration models, expressing the internal realization of the business processes by workers with appropriate competencies and a number of business objects the workers work with.

In order to facilitate the distinction between UML models at the business and the software level, UML will be adjusted appropriately for business engineering activities in Booster*Process by making use of suitable enhancement techniques such as stereotypes, tagged values and constraints, similar to the UML Extension for Business Modeling which is part of the UML documentation [20]. Furthermore, we recommend defining new stereotypes to be able to express a taxonomy of special kinds of business objects (see above). All models produced during business engineering should be arranged in a top-level package labeled with the stereotype business system.

The realization of use cases should not only be modeled by class and collaboration diagrams, but also by activity diagrams, which are very suitable for expressing workflow and parallel activities. Activity diagrams are similar to conventional approaches to modeling business processes (e.g. Event Driven Process Chains, see [16]), thus making additional modeling techniques apart from UML superfluous. Like use cases, activity diagrams are not object-oriented in nature and thus render the mapping to object-oriented concepts more difficult. Activity diagrams can help identify activities in the business processes that can be executed or supported by information systems. This is where the transition to application engineering takes place.

A rule of thumb regarding the mapping from the business models to models of the information systems could be that each business use case, described by an activity diagram, might be supported by and mapped to several information system use cases. Some of the internal workers identified and even some of the business actors might become actors in the information system use cases. Information system use cases might correspond with process business objects, and some of the business objects identified in the high-level class models might be represented by entity business object packages in the information system models.

To enable the extensive reuse of existing business object components, their integration must be considered as early in the system life cycle as possible. Therefore, existing business object specifications must be matched with the business concepts identified during business engineering. What is needed in order to support this is the standardization of documentation and specification techniques for business object components. The BOCA (Business Object Component Architecture) submission [6] to the CBO/BOF RFP [17] represented the first step in this direction, because it comprised a Component Definition Language (CDL), which was designed to rigorously specify business object components. Unfortunately, the work had to be stopped because of technical difficulties, but at the time of writing, new Business Object RFPs are being prepared which will probably continue the work on BOCA and CDL (so we will refer to the current BOCA proposal in this paper).
For convenience and to be able to express more of the semantics, CDL specifications should be supplemented by suitable UML models describing interfaces, structure, and behavior of the components. While the standardization of a business object specification technique is a basic requirement, the standardization of CBOs has the additional advantage that business terminology and semantics are clearly defined as well. Using these standardized semantics and integrating the UML diagrams associated with existing business objects with the models of the system under development, it should be possible to begin reuse activities as early as the business engineering level. The earlier the mapping to existing business object components occurs, the better the chances of quick information systems implementation through assembly of existing software components. Thus, business modeling activities should be performed with strict adherence to standardized CBO terminology and semantics where possible.

2.2 System Architecture Engineering

The intrinsic complexity of modern large-scale information systems can be managed best with a good software architecture, which, for example, enables parallel development activities. The goal of architectural modeling is to define a robust framework within which applications and subsystems such as business object components may be developed. The system architecture provides the context for defining how applications and business object components interact with one another to perform the needed business functions. As a common base from which all project teams work, a good architecture increases reusability across system development projects. For applications built from business object components, the basic architecture required will be part of the OMG standard.

Figure 2 shows the proposed architecture for business objects. Integrated in OMG's Object Management Architecture (OMA), which includes a reference model for distributed object computing and defines OMG's objectives and terminology, the Business Object Facility (BOF), based on CORBA, provides the infrastructure for CBOs, domain specific business objects, and enterprise specific business objects. This is the basic layered architecture on which all business object systems will be based, and the software is organized in layers accordingly. Objects and components in lower levels are more general than those in higher levels and encapsulate technical details about transactions, persistence etc. with which the developers and users of higher-level components do not want to be concerned. The application engineering process assembles the different kinds of business objects into software applications. The business object component engineering process produces business objects fitting the architecture of Fig. 2.

[Figure: layered architecture for business objects. Layers from top to bottom: Enterprise Specific Business Objects; Financial Business Objects, Manufacturing Business Objects, Other Business Objects; Common Business Objects; Business Object Facility; CORBA, CORBAservices, CORBAfacilities.]

Fig. 2. Architecture for business objects [17]

The generic architecture for business objects must be enhanced by an enterprise-speciﬁc, change-tolerant application systems architecture, which deﬁnes subsystems (using UML packages and components) and deﬁnes clear interfaces to reduce communication overhead and to allow graceful system evolution over time. It should be decided which parts of the system are most stable and which will change frequently in order to arrive at a good subsystem organization.
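The layering of Fig. 2 can be illustrated with a small consistency check. The following Python sketch is not part of Booster*Process: the layer names are taken from Fig. 2 and the surrounding text, while the rule that a layer may only depend on layers at or below its own level, the function name `violations`, and the sample dependencies are assumptions made purely for illustration.

```python
# Layers from lowest (most general) to highest (most specific).
# The ordering follows Fig. 2; the dependency rule below is an assumption.
LAYERS = [
    "CORBA, CORBAservices, CORBAfacilities",
    "Business Object Facility",
    "Common Business Objects",
    "Domain Specific Business Objects",
    "Enterprise Specific Business Objects",
]
RANK = {name: level for level, name in enumerate(LAYERS)}


def violations(dependencies):
    """Return the (user, used) pairs in which a layer depends on a higher layer."""
    return [(user, used) for user, used in dependencies if RANK[user] < RANK[used]]


# Hypothetical dependencies; the last one deliberately breaks the layering.
deps = [
    ("Enterprise Specific Business Objects", "Common Business Objects"),
    ("Common Business Objects", "Business Object Facility"),
    ("Business Object Facility", "Enterprise Specific Business Objects"),
]
bad = violations(deps)
print(bad)
```

A check of this kind could, for instance, be run over the dependency relationships between UML packages once they have been exported from a modeling tool; how such an export would look is outside the scope of this sketch.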


UML provides sufficient modeling capabilities to clearly distinguish between the logical and the physical architecture of the system. While the logical architecture is expressed in the form of class diagrams, for the most part containing only packages, interfaces, and dependency relationships, the physical architecture is modeled by component and deployment diagrams (cf. Fig. 1). UML packages on the logical level and components on the physical level are very important during architectural modeling, because they allow the partitioning and control of the overall software structure. At the logical level, both legacy systems that must be wrapped to fit into the architecture and large-grained business objects identified during business modeling are modeled as packages. Clear interfaces between these packages have to be defined, so that the packages can be allocated to different teams. Apart from class and component diagrams, deployment diagrams can be useful in structuring the physical architecture and in initial consideration of the run-time distribution of the components already known.

The iterative and incremental micro process of system architecture engineering comprises the following activities:

– Capturing requirements: Using the results of business engineering as input and doing further research (e.g. interviews), the global needs and expectations of the (internal) customers and end users must be gathered and modeled, often with the help of use case diagrams. Furthermore, non-functional requirements have to be analyzed in terms of an overall quality plan.
– Performing global analysis: Through examination of the requirements, candidate applications and business object components should be identified and modeled as packages. Domain analysis activities as well as feedback from previous application engineering projects about needs for reusable business object components can contribute to the results.
– Architectural design: As much as possible of the overall architecture should be specified on a design level. This includes the precise definition of facades (see Subsect. 2.4) and interfaces of the applications and business object components to be developed, the legacy systems to be integrated, and the technology components for lower system levels (e.g. ORBs, BOF). For this purpose, it might be necessary to begin a more detailed behavioral modeling involving the use of UML interaction diagrams (not mentioned in Fig. 1).
– Implementation and test of the layered architecture: At this point the packages identified have to be implemented (if not already in existence), which involves application engineering and business object component engineering. Finally, the system functionality has to be tested on a global level.

In analogy to the business engineering level, UML should be adjusted appropriately for software engineering based on business object technology. This means that a suitable version of UML has to be defined to meet the given needs. Provided that the approach of the BOCA proposal [6] is followed up, business object systems will have to be specified in CDL (similar to IDL specifications of CORBA objects). The BOCA metamodel would thus become a design target for the UML models, so that an appropriate UML extension must be defined to facilitate the mapping between UML and CDL.

2.3 Application Engineering

Application engineering is the process during which single business applications are implemented that directly serve the business by supporting the business processes. Since the vision of OMG business objects comprises a considerable simpliﬁcation of developing business information systems, the goals of application engineering in Booster*Process are to develop solutions quickly, but on a sound, evolving architectural base, thus producing applications that confer early user beneﬁts at minimum costs and leverage existing legacy systems where possible, while maintainability is retained. These goals are sometimes subsumed under the term Rapid Architectural Application Development (RAAD), as opposed to Rapid Application Development (RAD), where no models are produced and no system architecture is designed [1].

[Figure: left side: Business Object UML Models and Business Object Components (e.g. .class); right side: the activities Analysis, Design, and Implementation, resulting in an executable (.exe).]

Fig. 3. Reuse-oriented micro process for application engineering

Application engineering is very much like the conventional software engineering approaches described in the literature, extended by aspects of reusing existing business objects. The micro process to be followed is composed of the classical activities of object-oriented analysis, design, implementation, and test, performed iteratively and partly concurrently, and involving the complete set of UML diagrams to express structure, behavior, and algorithms of the application system. On the right side of Fig. 3, these activities are shown (except for testing) according to the baseball model of object-oriented software engineering described in [5]. The left side shows how the process uses a combination of model-based reuse and component-based reuse. During analysis and design, the developers continually assess the possibilities of reusing existing business object specifications (the types of UML diagrams and modeling elements used for this purpose are described in Subsect. 2.4). If appropriate specifications can be found and integrated in the modeling process, this will lead to reuse and assembly of the corresponding business object components during implementation.

Analysis starts with the definition of use cases and of the actors who will interact with the application. As already stated, these requirements can be partially derived from the results of business modeling. More information has to be uncovered in cooperation with the customers and end users to specify the different usage scenarios of the application. If possible, existing use case descriptions belonging to large-grained business object components should be retrieved and used. The second step is to build an analysis model, which should be independent of the technical details of the specific implementation environment. The analysis model shows structural and behavioral aspects of the problem domain. Therefore, class and object diagrams, collaboration diagrams, sequence diagrams, and state diagrams are produced. There are several heuristics about how to identify the modeling elements in this phase, starting from the specified requirements. The realization of the use cases, for example, should be shown by sequence diagrams and collaboration diagrams. The UML notation for patterns can be of help here, and available business patterns stemming from business object documentation should be searched for and integrated at this point in preparation for component reuse.

During design, technical details are added and the models are adjusted to fit the concrete conditions of implementation. Component diagrams, which contain representations of the runtime business object components, and deployment diagrams are added to the set of models. In order to prepare for the implementation in the BOF environment, the business objects have to be specified in CDL at this point. Future UML CASE tools will probably provide features for transforming UML models into textual CDL specifications. In the implementation phase, those parts of the system that could not be assembled from pre-existing components have to be implemented, existing business object components have to be customized, and some glue code may have to be written.

2.4 Business Object Component Engineering

Business object component engineering is the process which delivers common, domain specific or enterprise specific business objects. Provided that a marketplace for business objects evolves, non-software enterprises will primarily have to deal with the production of their own enterprise specific business objects, while more generic business objects will be purchased from component suppliers. Delivery of business object components is iterative and incremental, no less than application delivery, but the level of rigor and detail is much greater, since the components have to have a high level of quality in order to be reused frequently.

If business object component engineering were restricted to delivering runtime components, reuse would be very hard. Instead, the components must be documented in a way suitable to facilitate understanding and retrieval by reusers to allow reuse in early phases of system development, before too many design decisions have been made that cannot be matched with the available components in the implementation phase. While textual CDL specifications are the most rigorous and system-specific tool for documentation, they need to be supplemented by UML models that can be built into the software models during application engineering. It is not sufficient to model the software units as UML components with clearly defined interfaces, because more information about the semantics is needed for reuse. Typical usage patterns, modeled as collaborations, allowed sequences of messages, represented by state diagrams and illustrated by exemplary sequence diagrams, and the business concepts implemented, modeled with the help of class diagrams, can ensure the usability of the components.

What can be seen and used by a reuser of the component is called the external design of the business object component. It does not necessarily need to reveal the internal implementation of the component. The internal implementation again builds upon the complete set of UML diagrams. Only those internal aspects of a business object component that must be known to be able to reuse it should be exported via facades (see [8]), which represent a simplified model of the component and reveal only those parts that need to be directly visible and understood by the application developer.

Inputs to the business object component engineering process are the requirements, models and documents produced during business engineering, potential domain analysis activities and, primarily, the needs of application systems under development. A new business object component finally has to be modeled as a UML package that contains the implementation files as well as the documentation and usage guidelines.
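The distinction between external design and internal implementation can be sketched with the Facade pattern from [8]. The Python example below is only an illustration under stated assumptions: the "Account" component, its internal classes `_Ledger` and `_AuditLog`, and all method names are hypothetical and do not come from the BOCA proposal or from any existing business object.

```python
class _Ledger:
    """Internal part of a hypothetical 'Account' business object component."""

    def __init__(self):
        self._entries = []

    def post(self, amount):
        self._entries.append(amount)

    def total(self):
        return sum(self._entries)


class _AuditLog:
    """Another internal part; reusers never address it directly."""

    def __init__(self):
        self.records = []

    def record(self, text):
        self.records.append(text)


class AccountFacade:
    """External design: the simplified view exported to the application developer."""

    def __init__(self):
        self._ledger = _Ledger()
        self._audit = _AuditLog()

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._ledger.post(amount)
        self._audit.record(f"deposit {amount}")

    def balance(self):
        return self._ledger.total()


account = AccountFacade()
account.deposit(100)
account.deposit(50)
print(account.balance())
```

The application developer sees only `deposit` and `balance`; the ledger and the audit log can be restructured without affecting reusers, which is the property the facade is meant to provide.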

3 Conclusion

In this paper, we have described basic elements of a new business and software system development process model, which uses the UML as its modeling language and focuses on the concepts of OMG business object technology (but could easily be adapted to other business component technologies, e.g. Enterprise Java Beans [23]). We have tried to emphasize the necessity of an integrated approach to business and software engineering, building on a clearly deﬁned and stable underlying business object system architecture in order to provide for ease of system enhancement and modiﬁcation and to allow the seamless integration of business object components conforming to OMG standards, and we have suggested the usability of activity diagrams for business process modeling. Booster*Process will have to evolve and to be adapted to the emerging and still changing business object technology. Possibilities of tool support for the process have to be considered explicitly and the role of business object speciﬁcation has to be clariﬁed. Furthermore, the relationships to important non-OMG standards such as RM-ODP [10] have to be examined in order to provide conformance. It appears that the specialized architecture of Booster*Process can easily be mapped to the more general viewpoints architecture of RM-ODP. An important element of our future work will be the evaluation of Booster*Process in the context of a number of software development projects.

References

1. Allen, P. and Frost, S. (1998): Unravelling the Unified Modeling Language. Application Development Advisor, SIGS Publications, Jan.
2. Atkinson, C. (1997): Adapting the Fusion Process. Object Mag., Nov., 32–39.
3. Boehm, B.W. (1976): Software Engineering. IEEE Transactions on Computers 25 (12), 1226–1241.
4. Booch, G. (1994): Object-Oriented Analysis and Design with Applications. 2nd edn. The Benjamin/Cummings Publishing Company, Redwood City, CA.
5. Coad, P. and Nicola, J. (1993): Object-Oriented Programming. Yourdon Press, Englewood Cliffs, New Jersey.
6. CBOF (1998): Combined Business Object Facility – Business Object Component Architecture (BOCA) Proposal. OMG Business Object Domain Task Force BODTF-RFP 1 Submission. Rev. 1.1. OMG Doc. bom/98-01-07.
7. Eriksson, H.-E. and Penker, M. (1998): UML-Toolkit. Wiley Computer Publishing, New York.
8. Gamma, E., Helm, R., Johnson, R., and Vlissides, J. (1994): Design Patterns – Elements of Reusable Object-Oriented Software. Addison-Wesley, Reading, Massachusetts.
9. IBM (1998): IBM San Francisco. IBM Inc. http://www.ibm.com/Java/Sanfrancisco/ (May 1998).
10. ISO/IEC (1995): Reference Model of Open Distributed Processing. ISO/IEC 10746-1 – 10746-4. http://www.iso.ch:8000/RM-ODP/ (May 1998).
11. Jacobson, I., Christerson, M., Jonsson, P., and Övergaard, G. (1992): Object-Oriented Software Engineering – A Use Case Driven Approach. Addison-Wesley, Wokingham, England.
12. Jacobson, I., Ericsson, M. and Jacobson, A. (1995): The Object Advantage – Business Process Reengineering with Object Technology. Addison-Wesley, Wokingham, England.
13. Jacobson, I., Griss, M. and Jonsson, P. (1997): Software Reuse – Architecture, Process and Organization for Business Success. Addison Wesley Longman, Harlow, England.
14. Korthaus, A. (1998): Using UML for Business Object Based Systems Modeling. In: Schader, M. and Korthaus, A. (eds.): The Unified Modeling Language – Technical Aspects and Applications, Physica, Heidelberg (1998), 220–237.
15. Microsoft (1998): Microsoft COM Homepage. http://www.microsoft.com/cominfo/ (May 1998).
16. Nüttgens, M., Feld, T., and Zimmermann, V. (1998): Business Process Modeling with EPC and UML – Transformation or Integration? In: Schader, M. and Korthaus, A. (eds.): The Unified Modeling Language – Technical Aspects and Applications, Physica, Heidelberg (1998), 250–261.
17. OMG (1996): Common Business Objects and Business Object Facility. Common Facilities RFP-4. Object Management Group. OMG Doc. cf/96-01-04.
18. OMG (1997): Business Object DTF – Common Business Objects. Object Management Group. OMG Doc. bom/97-11-11 Version 1.3.
19. OMG (1997): CORBA Component Model RFP. Request for Proposal. Object Management Group. OMG Doc. orbos/96-06-12.
20. OMG (1997): The Unified Modeling Language. Vers. 1.1, 1 Sept. 1997, Docu. Set, Rational Software Corp. et al., OMG Doc. ad/97-08-03 – ad/97-08-08.
21. Rumbaugh, J., Blaha, M., Premerlani, W., Eddy, F., and Lorensen, W. (1991): Object-Oriented Modeling and Design. Prentice Hall, Englewood Cliffs, NJ.
22. Sims, O. (1994): Business Objects – Delivering Cooperative Objects for Client-Server. IBM McGraw-Hill series. McGraw-Hill Book Company, London.
23. Sun (1998): Enterprise Java Beans 1.0 Specification. Sun Microsystems Inc. http://java.sun.com/products/ejb/ (May 1998).
24. Taylor, D. (1995): Business Engineering with Object Technology. Wiley Computer Publishing, New York.

Hierarchical Context Diagrams with UML: An Experience Report on Satellite Ground System Analysis

Eric Bourdeau¹, Philippe Lugagne¹, and Pascal Roques²

¹ Alcatel Space, 26 Av. J.F. Champollion, B.P. 1187, 31037 Toulouse Cedex 1, France
² Valtech, Tersud, Bât. B, 5 av. Marcel Dassault, 31500 Toulouse, France

Abstract. Although the UML was mainly designed for software development, we have introduced its use in the requirements analysis phase of the ground segment of a complete satellite system. In the first part of the paper, we present the subset of UML, consisting mainly of Use Cases and Interaction diagrams, that we have used for the requirements analysis phase. We stress the need to define precisely the scope of the problem and its environment. In particular, we assert that the "Context diagram" from traditional structured methods is still relevant in an object-oriented approach, and furthermore that it can be adequately represented by a special usage of the UML Collaboration diagram. This Context diagram can even progressively take into account the underlying physical architecture of satellite systems. The second part of the paper is more prospective: it aims at extracting Context diagram patterns, and proposes specific stereotypes for satellite system analysis.

J. Bézivin and P.-A. Muller (Eds.): «UML» ’98, LNCS 1618, pp. 227–239, 1999. © Springer-Verlag Berlin Heidelberg 1999

1 Introduction: Satellite Systems

Modern satellite systems incorporate increasingly sophisticated information-processing and control systems. These systems contain both hardware and software, and must meet stringent requirements for reliability and availability, as well as extensive functionality. The aerospace industry is mature in managing huge projects, but is struggling harder and harder to deliver more capable systems with shorter development cycles and lower costs. The importance of good modeling techniques is now widely recognized as a key factor of a project's success, but the state of the art in the aerospace industry for system analysis is still functional modeling with well-known techniques such as SADT and SA/RT.

However, object-oriented (OO) concepts and techniques have received considerable interest over the last several years, with, for instance, NASA and ESA mandating the Ada programming language, coupled with design methodologies such as Booch or HOOD. The evolution towards object-oriented analysis (OOA) is much more recent, even though the OMT method has already been tried out on several satellite projects managed by the French CNES. But the growing number of OO methodologies in the early 90's discouraged new users from doing OO modeling. That is where the UML comes in, providing a standard modeling language that incorporates the object-oriented community's consensus on core modeling concepts. So it is no surprise that Alcatel Space, one of the major companies in the world-wide space industry, now has a strong interest in introducing these standardized object-oriented concepts right from the requirements analysis of its increasingly complex satellite systems.

2 UML for the Requirements Analysis Phase

Requirements specification is both the most difficult and the most important part of any system development process. Requirements are a description of needs or desires for a system. The primary goal of the requirements phase is to identify and document what is really needed, in a form that clearly communicates to the client and to development team members. The challenge is to define the requirements unambiguously, so that the risks are identified and there are no surprises when the system is finally delivered.

2.1 Use Cases

An excellent technique to improve understanding of requirements is the creation of so-called "Use Cases". The term was coined by I. Jacobson [1] and the concept included in the UML: "A use case is a coherent unit of functionality provided by a system as manifested by sequences of messages exchanged among the system and one or more outside interactors (called actors) together with actions performed by the system" [2]. The notion of actor is also defined in the UML: "An actor is a role of object or objects outside of a system that interacts directly with it as part of a coherent work unit (a use case). An actor element characterizes the role played by an outside object; one physical object may play several roles and therefore be modeled by several actors" [2].

A use case diagram then represents the set of use cases for a system, the actors, and the relations between the actors and the use cases. Use cases are drawn as ellipses, actors as stick figures. The purpose of this diagram is to present a kind of context diagram by which one can quickly understand the external actors of a system and the key ways in which they use it (Fig. 1 shows an example).

2.2 Context Diagram

More generally, the description of a system's context establishes the boundaries of the project and distinguishes between those elements that are inside and those that are outside.
In complex projects, for instance large real-time systems, specifying a system's boundary also includes listing the messages and data provided and consumed by all actors. As the use case diagram does not cover this part, we need something more: a real context diagram!

[Figure: use case diagram. System: Ground System. Use cases: Manage Payload Configuration, Monitor Downlink Traffic, Manage Reservation Requests. Actors: Space System, Customer.]

Fig. 1. Simplified example of a Use Case diagram for a satellite ground system

UML does not explicitly support such a "Context diagram", as could be found in the SADT and SA/RT structured methodologies. But as G. Booch already suggested in [3]: "Notationally, an object message diagram can serve as a context diagram. In such a diagram, one object denotes the system as a whole. This central object is surrounded by other objects that represent external hardware or software agents at the fringe of the system and with which the system interacts". Drawing a context diagram with Rational Rose (3.0) is also explained step by step in [4]. The context diagram is defined there as "a high-level object message diagram. The system and all external actors are shown as objects. Inputs and outputs are shown as messages to and from the object representing the system". The same approach is advocated by B. Douglass in his very recent book [5]: "The external event context is expressed as an object model in which the system itself is treated as a single black-box composite object sending and receiving messages to external actor objects". And he explains further: "The context diagram from traditional structured methods can be used directly in an object-oriented context because it really shows objects rather than functional processes"… "The benefit of creating the context diagram is that it captures the environment of the system, including the actors with which the system must interact. Just as important, it captures and allows you to characterize the messages and events flowing between the system and its environment".


Fig. 2 shows a simplified version of our top context diagram for a complete satellite system, represented in UML by a collaboration diagram, without message numbering:

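[Figure: collaboration diagram. Central object: Entire Satellite System. External objects: : Customer and : User's Station. Message labels between : Customer and the system: payment, reservation requests, billing, request status. Message labels between : User's Station and the system: uplink traffic, downlink traffic.]

Fig. 2. Simplified version of the top Context diagram for the complete satellite system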

Then, we looked for a way to expose progressively the characteristic architecture of satellite systems, distinguishing between Space and Ground systems, then opening the Ground system to show the so-called segments inside it, and further down to the main equipment. We wanted to draw simple diagrams, resembling what satellite system engineers were used to drawing in their project documentation, while at the same time abiding by UML 1.1. After some brainstorming, we came upon the simple idea of showing progressively the contents of the composite object representing the global system by exposing one more level at a time. This is done graphically by nesting the component objects in the composite (as is allowed in the UML), and then distributing the external message flows to the components, as well as adding internal message flows. The component objects can in turn be considered as composites, so we have a straightforward means of representing the high-level system architecture, just by exploiting the UML concepts of composite object and collaboration diagram! A simplified result of this process is visible in the following "hierarchical context diagrams". In Fig. 3, the Satellite System object is shown as a composite, made of two component objects: Space System and Ground System. Of course, the external message flows are the same as in the preceding context diagram (Fig. 2).


[Figure: collaboration diagram. Composite object: Entire Satellite System, containing the component objects Space System and Ground System. External objects: : User's Station and : Customer. Message labels: downlink traffic, uplink traffic, TM, TC, request status, billing, payment, reservation requests.]

Fig. 3. Hierarchical Context diagram of the Entire Satellite System

Then, the Ground System in turn is considered as a composite, exposing its three segments: Ground Control, Mission Control, and Business Control (see Fig. 4).

[Figure: collaboration diagram. Composite object: Ground System, containing Ground Control Segment, Mission Control Segment, and Business Control Segment. External objects: : Space System and : Customer. Message labels: TC, TM, payload commands, payload configuration, downlink traffic, traffic reports, reservation requests, request status, billing, payment.]

Fig. 4. Hierarchical Context diagram of the Ground System


This Context diagram (Fig. 4) is to be compared with the Use Case diagram (Fig. 1). It is clearly another viewpoint on the Ground System, both being useful. Interestingly, the mapping from the functional view to the architectural view is not one-to-one, but rather shows the central role of the Mission Control Segment:
• UC "Manage Reservation Requests" -> Business Control + Mission Control,
• UC "Manage Payload Configuration" -> Ground Control + Mission Control,
• UC "Monitor Downlink Traffic" -> Mission Control.
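The mapping above is small enough to be checked by hand, but it can also be written down as data. The following Python sketch uses the use case and segment names from the list; the dictionary name `REALIZED_BY` and the notion of a segment being "central" when it takes part in every use case are assumptions made for this illustration.

```python
from collections import Counter

# Use case -> segments taking part in its realization (taken from the list above).
REALIZED_BY = {
    "Manage Reservation Requests": ["Business Control", "Mission Control"],
    "Manage Payload Configuration": ["Ground Control", "Mission Control"],
    "Monitor Downlink Traffic": ["Mission Control"],
}

# Count in how many use cases each segment is involved.
usage = Counter(seg for segments in REALIZED_BY.values() for seg in segments)

# A segment is called "central" here if it takes part in every use case.
central = [seg for seg, count in usage.items() if count == len(REALIZED_BY)]
print(central)
```

On a real project with many more use cases, the same count gives a quick indication of which segments deserve the earliest and most detailed analysis.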

Fig. 5 shows the critical Mission Control Segment as the composite object to analyze. The other segments (BCS and GCS) now appear as actors.

[Figure: collaboration diagram. Composite object: Mission Control Segment, containing MCC and CSM. External objects: : GCS, : BCS, and : Space System. Message labels: payload commands, traffic reports, payload configuration, downlink traffic, reservation requests, request status, traffic monitoring plans.]

Fig. 5. Hierarchical Context diagram of the Mission Control Segment

Then we arrive at the "equipment" level: the MCC (Mission Control Center) is under study. This diagram is especially interesting, as it introduces a new notion: the equipment operators, acting as "internal actors". We discuss this further below.

Hierarchical Context Diagrams with UML

Fig. 6. Hierarchical Context diagram of the Mission Control Center

So we have a set of five context diagrams in a top-down approach, showing respectively:
• The context of the Entire Satellite System, seen as a black-box,
• The context of the Entire Satellite System, but with its two main components,
• The context of the Ground System, but with its three main segments,
• The context of the Mission Control Segment, but with its two main components,
• The context of one equipment of the Mission Control Segment: the MCC.

3 Recommendations

3.1 Hierarchical Context Diagrams: A General Approach

Fig. 7 represents a generic Context diagram, with the system under study as a composite object, connected to three possible types of actors:
• "Actor 1" is a general actor, able both to send messages to the system (m1) and to receive messages from it (m3),
• "Actor 2" is only a sender of messages (m2),
• "Passive external entity" is only a receiver of messages (m4).

Fig. 7. Generic Context diagram

Then, to draw the next level, a systematic process can be applied (and was implemented with Rational/Rose 4.0):
• Copy / Paste the existing context diagram into an empty new one,
• Add new objects to represent main components, and nest them graphically into the composite,
• Dispatch the actors / composite links to the relevant components,
• Add new links between the components themselves.

This simple process ensures a "manual" level of consistency between context diagrams. The strict rule to apply is obviously to keep the same external messages from one level to another, but the tool we used did not particularly help. Problems arise as soon as you have a correction to make: all subsequent levels of decomposition are affected, and you have to propagate the update manually. It is ironic that this was a main issue managed by all the structured analysis tools, a long time ago!

It would also be interesting to be able to decompose a message into sub-messages when going down one level. This idea seems coherent with the fact that UML "signals" may appear in a generalization hierarchy, as indicated in [2], p. 109-110. But why not also a composition hierarchy for messages?

Anyway, the next context diagram, showing the main components of our generic composite, could look like Fig. 8.
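The rule of keeping the same external messages from one level to the next is easy to check mechanically once the diagrams are available as data. The sketch below is only an illustration under stated assumptions: the `Link` tuple, the function names, and the way actor links are dispatched to components are all hypothetical, and none of this reflects a feature of the tool mentioned above.

```python
from typing import NamedTuple


class Link(NamedTuple):
    sender: str
    receiver: str
    message: str


def external_messages(links, actors):
    """Collect (actor, direction, message) for every link crossing the system boundary."""
    crossing = set()
    for link in links:
        if link.sender in actors and link.receiver not in actors:
            crossing.add((link.sender, "sends", link.message))
        elif link.receiver in actors and link.sender not in actors:
            crossing.add((link.receiver, "receives", link.message))
    return crossing


def compare_levels(parent_links, child_links, actors):
    """Report external messages lost or invented when going down one level."""
    parent = external_messages(parent_links, actors)
    child = external_messages(child_links, actors)
    return {"missing": parent - child, "unexpected": child - parent}


ACTORS = {"Actor 1", "Actor 2", "Passive external entity"}

# Level n: the composite is a single object (messages as in the generic diagram).
level_n = [
    Link("Actor 1", "Composite", "m1"),
    Link("Composite", "Actor 1", "m3"),
    Link("Actor 2", "Composite", "m2"),
    Link("Composite", "Passive external entity", "m4"),
]

# Level n+1: links dispatched to components (dispatch chosen for illustration),
# plus one internal link, which must not affect the comparison.
level_n_plus_1 = [
    Link("Actor 1", "Component i", "m1"),
    Link("Component j", "Actor 1", "m3"),
    Link("Actor 2", "Component k", "m2"),
    Link("Component k", "Passive external entity", "m4"),
    Link("Component i", "Component j", "mij"),
]

report = compare_levels(level_n, level_n_plus_1, ACTORS)
print(report)
```

An empty report means the two levels agree on their external messages; any entry points at a correction that was not propagated.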

Fig. 8. Generic Context diagram with components

Then, in turn, any component can be thought of as a composite itself, and it is possible to draw its own context diagram. The diagrams of the components can be deduced from that of the composite, which ensures consistency between successive levels. For instance, Fig. 9 represents the Context diagram of "Component j".
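Deducing a component's context diagram amounts to keeping the links that touch the component and turning every other endpoint into an actor. The following sketch assumes that a message named mxy flows from component x to component y and that m3 is sent by Component j; both are assumptions made for illustration, as is the function name `component_context`.

```python
def component_context(links, component):
    """Keep only the links touching `component`; every other endpoint becomes an actor."""
    kept = [link for link in links if component in (link[0], link[1])]
    actors = sorted({end for link in kept for end in link[:2] if end != component})
    return kept, actors


# Links of the composite with its components, as (sender, receiver, message).
# Directions and the dispatch of m1..m4 are assumed for this example.
LEVEL = [
    ("Actor 1", "Component i", "m1"),
    ("Component j", "Actor 1", "m3"),
    ("Actor 2", "Component k", "m2"),
    ("Component k", "Passive external entity", "m4"),
    ("Component i", "Component j", "mij"),
    ("Component j", "Component i", "mji"),
    ("Component i", "Component k", "mik"),
    ("Component k", "Component i", "mki"),
    ("Component j", "Component k", "mjk"),
    ("Component k", "Component j", "mkj"),
]

links_j, actors_j = component_context(LEVEL, "Component j")
print(actors_j)
print(sorted(message for _, _, message in links_j))
```

Under these assumptions the result contains the same messages and neighbours as the deduced diagram of "Component j".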

Fig. 9. Deduced Context diagram of Component j


3.2 Other Solutions for System Decomposition

The approach we have employed, based on the notion of composite objects and their use in collaboration diagrams, is not the only one possible. A first simple idea would be to draw Class diagrams, using a set of uni-directional associations (navigable in one way), each one labeled with the name of a message passed along it.

A more natural approach would have been to represent a system or subsystem by a package. A package in UML is a general purpose mechanism for organizing elements into groups. Packages may be nested within other packages. Different stereotypes of package are used for a variety of grouping purposes. A system may be thought of as a single high-level package, with everything else in the system contained in it. A subsystem is a kind of package, marked using the «subsystem» keyword, used to divide the system into smaller parts. Subsystems may in turn contain other subsystems.

But in UML, the only example of relationship between packages is the dependency. This is not sufficient to draw our context diagrams. What we would need is the following kind of representation: a Collaboration diagram showing interactions between subsystems (Fig. 10).

Fig. 10. Hierarchical Context diagram with Packages

This diagram seems to fit exactly what we need, even better than with our composite objects and actor instances, as we want to represent all the potential interactions between the subsystems and the actors. But unfortunately it does not seem to conform to [2], and cannot be drawn with Rational/Rose … However, it is interesting to note that Jacobson himself ([6] p.201, 205, etc.) uses packages in sequence diagrams! "Sequence diagrams can be used to define how each use case for the superordinate system is divided among the design subsystems that correspond to the application and component systems".


3.3 Context Diagram Patterns

For a piece of equipment in a satellite ground segment, the actors to consider systematically are:
• Its operator,
• Its administrator,
• Other equipment (from the same segment or another) with which it interacts directly,
• External entities (from the entire system) with which it interacts directly.

It is also interesting to distinguish between:
• The equipment to develop,
• The whole, consisting of the equipment plus its operators, which provides services.

Both represent valid points of view, one for the development team, the other for the operations team. Moreover, an implicit high-level design principle states that external equipment can be directly connected to the equipment under study, but external operators are usually only connected to its operator. This leads to a Context diagram pattern, as in Fig. 11.
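The design principle stated above (external equipment may connect to the equipment directly, external operators only to its operator) can be expressed as a simple check over the links of a context diagram. This is a hedged sketch: the kind labels, the link list, and the function `principle_violations` are hypothetical, and the links are one plausible reading of the pattern rather than a transcription of the figure.

```python
# Kind of each participant in the pattern (labels are illustrative).
KINDS = {
    "XXX Equipment": "equipment",
    "XXX Operator": "operator",
    "XXX Administrator": "administrator",
    "External equipment": "external_equipment",
    "External Operator": "external_operator",
}

# Undirected links of the pattern, as assumed for this example.
PATTERN_LINKS = [
    ("XXX Operator", "XXX Equipment"),
    ("XXX Administrator", "XXX Equipment"),
    ("External equipment", "XXX Equipment"),
    ("External Operator", "XXX Operator"),
]


def principle_violations(links, kinds):
    """Return links in which an external operator talks to anything but the operator."""
    bad = []
    for a, b in links:
        pair = {kinds[a], kinds[b]}
        if "external_operator" in pair and pair != {"external_operator", "operator"}:
            bad.append((a, b))
    return bad


print(principle_violations(PATTERN_LINKS, KINDS))
```

Adding a direct link between an external operator and the equipment would be reported as a violation of the principle.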

Fig. 11. Context diagram pattern for a satellite ground segment equipment

An even more drastic architectural decision would be to impose that every interaction goes through the equipment operator (see Fig. 12).


Fig. 12. Alternative Context diagram pattern

3.4 UML Extension for Satellite Analysis

The types of actors that we identified in the previous paragraph should preferably be visually distinguishable. The UML way to achieve this consists in stereotyping classes and actors, as user-defined stereotypes can come with their associated icons. This presents the big advantage that our context diagrams would look like pure drawings to domain experts, even though they are in fact compliant with the UML! So we can propose a first draft of UML extension for satellite system analysis, defined in terms of stereotypes (see Table 1).

Table 1. Proposed stereotypes for UML extension

Metamodel Class    Stereotype Name
Class              Satellite
Class              GroundStation
Class              Segment
Class              Equipment
Actor              Operator
Actor              Administrator
Actor              Customer
Actor              EquipmentActor
Collaboration      ContextDiagram


The icons have yet to be standardized, but future diagrams could look like Fig. 13, for the benefit of the reader (compare with Fig. 3!).

Fig. 13. Context diagram with stereotypes as icons

References

1. Jacobson, I.: Object-Oriented Software Engineering: A Use Case Driven Approach, Addison-Wesley (1992)
2. Rational et al.: UML Notation Guide, version 1.1, www.rational.com/uml (09/1997)
3. Booch, G.: Object Solutions: Managing the Object-Oriented Project, Addison-Wesley (1996)
4. Lockheed Martin Advanced Concepts Center and Rational Software Corporation: Succeeding with the Booch and OMT Methods: A Practical Approach, Addison-Wesley (1996)
5. Douglass, B.: Real-Time UML: Developing Efficient Objects for Embedded Systems, Addison-Wesley (1998)
6. Jacobson, I., Griss, M., Jonsson, P.: Software Reuse: Architecture, Process and Organization for Business Success, Addison-Wesley (1997)

Extension of UML Sequence Diagrams for Real-Time Systems

J. Seemann, J. Wolff v. Gudenberg

Würzburg University¹
Am Hubland, D 97074 Würzburg
+49-931-888-5517
+49-931-888-4602
{seemann | jwvg}.acm.org

Abstract. The behavior of real-time systems is specified by a number of interaction scenarios between tasks or active objects. Each scenario may be illustrated by a UML sequence diagram. We use the newly developed, textual language UMLscript-RT as input language for our tool AVUS, mainly a compiler, that automatically generates standard UML sequence diagrams. UMLscript-RT extends UML sequence diagrams in two aspects. Firstly, we introduce loops and suggest a graphical notation very similar to that used in Message Sequence Charts. Secondly we give a precise grammar for timing constraints which are mandatory for real-time applications. AVUS generates a directed graph whose vertices are the events and associates the constraints as weights to the arrows. Consistency of the timing constraints is then checked by examining the cycles of that graph.

1. UMLscript-RT

UMLscript-RT defines a concrete syntax for UML collaborations. In this paper we concentrate on the elements concerning the dynamic behavior, i.e. exactly that part which is usually shown in sequence diagrams. Although sequence diagrams are part of the dynamic model, we prefer to consider them as executable diagrams specified by a visual programming language. UMLscript-RT does not give a textual representation of the graphical artifacts of sequence diagrams, nor is it given by means of a graph grammar or any other visual formalism. Instead, it is treated like an ordinary programming language defined by a simple LL(1) grammar. Some of the language elements convey information about the relative order of statements that is important for the simulation and the graphical representation. Others describe the real-time constraints. Sequence diagrams in textual form can easily be entered with ordinary text editors, and in some cases existing specification or documentation files can be transformed automatically into the desired input format.

¹ This work was partially supported by a contract between Würzburg University and 3soft GmbH, Erlangen.

J. Bézivin and P.-A. Muller (Eds.): «UML» ’98, LNCS 1618, pp. 240–252, 1999. © Springer-Verlag Berlin Heidelberg 1999


In this section we describe the syntax by EBNF rules and some comments. We informally specify a semantics as a foundation for a complementary simulation tool. Due to page limits, however, we do not give a comprehensive, detailed description. The next section explains how the layout for UML sequence diagrams is obtained, and in section 3 we introduce our extension for real-time applications and explain the consistency checking algorithm. In summary, the paper considers three different topics. The first is the definition of a scripting language for UML sequence diagrams, the second is their automatic visualization, and the last is an extension by real-time constraints which follows the guidelines of the RTL approach [JM86].

A UML sequence diagram illustrates a collaboration of interacting objects, where the interactions are invoked by exchange of messages. Its focus is on the temporal order of the message flow. Each object is assigned a column, the messages are shown as horizontal, labeled arrows, and a vertical time axis is assumed. There are two different diagram modes. An instance diagram describes exactly one scenario without any alternatives. A generic diagram illustrates a complete use case with conditional branches; it hence represents a set of related scenarios. Diagrams may be drawn with rectangular activation areas or without. Objects may be created or destroyed during the scenario. Concurrent or sequential execution of threads is possible, and synchronous as well as asynchronous messages may occur. Arbitrary constraints or comments may be added. Usually a diagram is read and entered from top to bottom, where the object in the leftmost column starts the collaboration. This is reflected by the following syntax rules of UMLscript-RT:

    SeqDia = ["GENERIC"] ["DIAGRAM" Name] ["WITH" "ACTIVATION"] Statements ["END"] EOF.

SeqDia is the start symbol of the grammar describing one sequence diagram. Each diagram consists of a sequence of statements; optionally, a name may be specified. Some global switches have to be set. A statement is an interaction, a loop or an alternative.

    Statement = Interaction
              | "TIME" TimeConstraint
              | "REPEAT" Interactions "UNTIL" Constraint
              | "WHILE" Constraint "DO" Interactions
              | "IF" Constraint "THEN" Interactions
                {"ELSIF" Constraint Interactions}
                "ELSE" Interactions.

A UML collaboration only consists of interactions. The loops and alternatives are our extensions. We also provide for timing constraints concerning the whole scenario and define their syntax and semantics (see section 3). Each interaction specifies a communication between two not necessarily distinct objects by invoking an action, i.e. sending a message of corresponding type.

    Interaction = Object ['->' Object] [Position] Action [Eventdef].

Position contains information about the relative ordering of the interactions and also some data to arrange them properly on the vertical time axis (see section 2).


Each interaction occurs at a fixed moment in time. If interactions are entered without positional information, they occur in sequential order with groups of concurrent occurrences according to the delimiters of statements or interactions. A semicolon increments the time by one unit. Since this is the default, an end of line symbol is sufficient. The "parallel" delimiter double bar || does not increment the time. Hence groups of concurrent messages can easily be specified. By default, messages are considered to be atomic, i.e. sending time is equal to the reception time.

Actions are frequently repeated in most real-time systems. Examples are the periodic control of sensors, waiting for an event, and sequences of sub-activities. UML collaborations only provide a general repetition mechanism by a guarded action that has no graphical representation and an open semantics. We keep this possibility and intentionally specify its semantics informally: the action may be repeated several times. Loops with clearly stated conditions are provided additionally. A repeat loop is executed at least once, until the constraint is fulfilled. A while loop is executed as long as the constraint is fulfilled, possibly never. Infinite loops are possible. Note that nested loops are not allowed. Objects that participate in an interaction inside a loop must not communicate with objects outside the loop at the same time. Loops are drawn as large boxes enclosing all interactions. Loops destroy the vertical flow of time, which may be restored by unrolling.

In an alternative statement exactly one of the interaction sequences is performed, namely that whose guard is met first when the constraints are evaluated one after the other. Alternative statements are also drawn as large boxes, where each case is now assigned a compartment. They also break the time axis. But since only one alternative is scheduled, the sequential order can be obtained by erasing the other cases.
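The delimiter rule described earlier (a semicolon or an end of line advances the clock by one unit, the double bar || does not) can be illustrated with a few lines of Python. This is only a sketch, not the AVUS compiler: the function name `assign_times`, the start value 0, and the treatment of interactions as opaque strings are assumptions.

```python
import re


def assign_times(script, start=0):
    """Attach a discrete time to each interaction.

    ';' or a line break advances the clock by one unit before the next
    interaction; '||' keeps the next interaction at the current time.
    """
    time = start
    seen_any = False
    advance = False
    parallel = False
    result = []
    for raw in re.split(r"(\|\||;|\n)", script):
        if raw == "||":
            parallel = True
        elif raw in (";", "\n"):
            advance = True
        elif raw.strip():
            if seen_any and advance and not parallel:
                time += 1
            result.append((time, raw.strip()))
            seen_any, advance, parallel = True, False, False
    return result


script = "A -> B x()\nB -> C y() || B -> D z()\nC -> A done()"
timeline = assign_times(script)
for moment, interaction in timeline:
    print(moment, interaction)
```

Here the two interactions joined by || share one time value, which is how a group of concurrent messages is formed.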
Graphical representation as well as the underlying semantics have been chosen to be similar to Message Sequence Charts [MSC96], where, however, nested loops or case statements are possible, in contrast to our approach.

    Action = Guard [("create" | "return" | "terminate" | "destroy")] [("asynch" | "synch")] ["sleep" | "final"] Message.

According to the UML semantics there are the following kinds of actions:
• create: creates the target instance
• destroy: destroys the target instance
• terminate: destroys the source instance
• call: (the default kind) calls a target operation
• return: returns from an operation
• send: sends a signal to the target or to an unspecified set of receivers
• local: a call of a local operation not resulting in an event
• uninterpreted: every other action

The latter three actions are not visible in sequence diagrams and hence not supported. We also slightly changed the meaning of return actions. We add the "return" classifier to the call action; its execution time is determined by specific timing marks. If returns are to be considered as independent actions, as in standard UML, they have to be modeled as calls. All actions can be performed asynchronously or synchronously; the latter is the default. A "sleep" characterization indicates that the caller's activation is suspended, whereas "final" actions delay the deactivation of the object until all actions invoked by them are finished.

    Guard = ['*'] [Constraint].

Surprisingly, guards are not mentioned in the UML collaboration semantics, although they appear in the notation guide. A guard consists of a constraint and/or an optional asterisk indicating that the action is performed several times. In UMLscript-RT this means an unspecified repetition; otherwise the new loop constructs should be used. The semantics differ for instance and generic diagrams. If a guard is not fulfilled in an instance diagram, this diagram describes another scenario and is not valid for the current case. In a generic diagram describing a full use case, messages with false guards are ignored. Guarded actions may be used to model the sending of alternative messages at the same time. Mutual exclusion of the guards is not necessary. But because the alternatives are rather hidden in this manner, we have added the genuine alternative statement.

    Message = MessageName [ParList] [ReturnValue] [Constraint] [MsgComment].

A message defines how a particular request is used. The receipt of a message is always an event. Parameters or return values of signals or operations may be specified. We suggest explicitly denoting empty parameter lists as () for better readability. As an example illustrating various features, we quote a simple database scenario.

    diagram database with activation
    actor -> DBController return newGetChangedValues()
    DBController -> Material final register()
    Material asynch check() "comment1"
    Material privSetIndirectAttr()
    Material -> DBController privSetIndirectAttr()
    DBController -> UnitOfWork create asynch unitOfWork()
    DBController -> UnitOfWork addDoneEvent()
    Material terminate startStore()
    End

That UMLscript-RT source is visualized as an ordinary sequence diagram (see diagram 1) by the algorithm described in section 2.


    Object = ["active"] ( ObjectName [ClassSpec] | ClassSpec ) | "actor".
    ClassSpec = ":" ClassName [ "ATTRIBUTES" Attribute { "," Attribute }].

Objects are given in the usual notation, and a particular actor icon is provided. UML distinguishes between active and passive objects, but often this distinction is not carried out strictly.

Fig. 1. Diagram 1

Active objects may own a thread of control. Passive objects do not, but they may respond to a request by sending a message, i.e. if an active object calls a method of a passive object, it hands over its thread of control as a loan. Usually objects in sequence diagrams are active. Our active flag is used in diagrams with activation to indicate that one of the object's threads is already running.


We do not give a detailed syntax for the basic constructs. Note only that constraints have to be enclosed in bars and comments in double quotes.

2. Generation of Sequence Diagrams

The standard appearance of UML sequence diagrams is supported and two enhancements are added. For a loop we draw a rectangular box around all participating messages with the loop condition in a separate compartment at top or bottom. Alternatives are also included in one rectangle; the different cases are separated by dashed horizontal lines and contain the constraint in a hexagon in the top left corner.

Fig. 2. Diagram 2

Note that this obviously contradicts the vertical flow of time, but if only the selected case is kept and the others are omitted, everything looks fine. The other choice, placing the alternatives from left to right, is worse, because most often the same objects are used and lifelines would then have to be duplicated. A simple loop is shown in the above diagram.


If an object supports multiple threads, each of them is given its own horizontal position. The position syntax provides data which are usually entered as constraints or comments concerning relative timings. This information is used to model the time axis, where we maintain vertical distances proportional to the number of discrete time steps.

    Position = ["AT" Mark [',' Mark]] ["TILL" MarkExpr].

The two marks after the AT keyword distinguish the sending and reception times of the message and can be used to state the moment of the event explicitly. A symbolic name may be declared for each instant of time.

    Mark = ("DEFINE" MarkName "=" Term) | MarkExpr.
    Term = MarkName | Number | "current".

We already mentioned that we count the time from the beginning of the scenario; current is the actual value of that counter. Arbitrary positive numbers may be assigned, so the order of interactions may be disturbed. A timing mark may be given as a simple expression relative to existing marks.

    MarkExpr = Term {("+"|"-") Number}.

The timing marks provide a second way to express concurrency and may be used for synchronization. Concurrent messages sent to the same object lead to a spawning of several threads in that object. Note that we draw the arrows of concurrent messages as one line, since they have the same vertical time position, but with two heads, one to each thread box. The predecessor lists given in the UML documents may be modelled by explicit time marks. The timing mark expression after the TILL keyword specifies the end of the action invoked by the message, i.e. its return time. This information is necessary to draw the activation areas for sequential or synchronous flow of control. Inside loops the interactions occur in each cycle; the proper moments are determined by unrolling the loop. Since the compiler cannot always know how many cycles are performed, and hence cannot find out what other interactions may interfere with the execution of the loop, we recommend explicitly marking the first interaction after a loop with the counter value current.

During the compilation all messages are collected in a linked list. From that list we generate an object list and split each method according to send and receive time into an event list. For each object a list of possible threads is managed. The event list is ordered by the value of the timing marks.

From the event list we calculate the vertical positions of the messages and the activation areas. Additional space requirements for loops and alternative statements are registered. The horizontal order of the objects is taken as the order of occurrence in the event list, and hence keeps very close to the usual drawing of a diagram. It turns out that it is not necessary to minimize the number of crossings. The strings describing the object's name and features determine its horizontal extension. The distance between two lifelines is calculated from the maximal length of a message label.
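The Mark, Term and MarkExpr rules given above are small enough to evaluate with a hand-written routine. The sketch below is an illustration only and not the AVUS implementation: the function names, the dictionary `marks` holding declared names, and the choice to pass the counter value `current` as a parameter are assumptions.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[A-Za-z_]\w*|[+\-=])")


def tokenize(text):
    """Split a mark into numbers, names and the symbols + - =."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def eval_term(token, current, marks):
    """Term = MarkName | Number | "current"."""
    if token == "current":
        return current
    if token.isdigit():
        return int(token)
    return marks[token]  # KeyError for an undeclared mark name


def eval_mark_expr(tokens, current, marks):
    """MarkExpr = Term {("+"|"-") Number}."""
    value = eval_term(tokens[0], current, marks)
    rest = tokens[1:]
    if len(rest) % 2:
        raise ValueError("operator without a number")
    for op, number in zip(rest[0::2], rest[1::2]):
        if op not in ("+", "-") or not number.isdigit():
            raise ValueError("expected '+' or '-' followed by a number")
        value += int(number) if op == "+" else -int(number)
    return value


def eval_mark(text, current, marks):
    """Mark = ("DEFINE" MarkName "=" Term) | MarkExpr."""
    tokens = tokenize(text)
    if tokens and tokens[0].upper() == "DEFINE":
        if len(tokens) != 4 or tokens[2] != "=":
            raise ValueError("expected: DEFINE name = term")
        marks[tokens[1]] = eval_term(tokens[3], current, marks)
        return marks[tokens[1]]
    return eval_mark_expr(tokens, current, marks)


marks = {}
print(eval_mark("DEFINE t0 = current", 5, marks))
print(eval_mark("t0 + 3 - 1", 9, marks))
print(eval_mark("current+1", 9, marks))
```

A declared name keeps its value while `current` moves on, which is what allows later interactions to be placed relative to an earlier instant.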


We then proceed in the following manner:
1. draw object lifelines
2. draw message lines
3. draw activation areas
4. draw message arrow heads, labels, and comments
5. redraw hidden parts of activation areas
6. draw object boxes and termination symbols
7. draw loops

Constraints are enclosed in vertical bars in the input text for lexical reasons, but they are drawn at the message arrows in the usual brackets. Comments on the other hand are entered in string quotes and are placed at the right margin of the diagram. This algorithm leads to a nice and proper appearance of a diagram.

3. Real-Time Constraints

The timing marks provide the same information which is usually mentioned in examples for sequence diagrams. We use it for a qualitative management of time and to adjust the drawing. Of course consistency checks are performed. We check whether a message is received before it is sent, whether existing objects are created, whether the execution time of an action outlasts the activation period, and so on.

When modeling a real-time system, not only structure and dynamic behavior have to be considered; timing constraints are also mandatory for correctness. Often some end-to-end timing constraints given by certain tolerance intervals induce quite a lot of intermediate constraints not explicitly mentioned during system design. For a more formal treatment of real-time requirements we rely on a procedure introduced in the framework of the RTL development. RTL (Real Time Logic) [JM86] is a formal language that describes absolute timing of events and allows reasoning and graph-based checking of consistency. We assume a global clock with a discrete clock-rate. In RTL different kinds of events, like external signals or changes of certain variables, mark significant points in time. Events have unique names; different occurrences are distinguished by an index value. An event consists of the static event name, the dynamic context, and a time stamp. For our purposes the sending and reception of a message define the basic events. Their names are built from the message name by appending ".SND" or ".RCV"; their dynamic context is the sending or receiving object, respectively. More convenient names may be introduced in UMLscript-RT.

    Eventdef = ["SND" '=' Eventname] ["RCV" '=' Eventname].
    TimeConstraint = Condition { ("AND"|"OR") Condition}.
    Condition = (Eventexp Relop Eventexp) | ( "(" TimeConstraint ")" ).
    Eventexp = '@' Eventocc {("+"|"-") (Name|Number)}.
    Eventocc = Eventname [ "[" ("+"|"-") Number "]"].
    Eventname = Name [".SND"|".RCV"].

The constraints are boolean expressions composed of comparisons of event occurrences. Each event occurrence consists of the event name and an optional index. A missing index or the value zero denotes the current occurrence; negative indices denote past occurrences and positive indices future ones. Thus events occurring in different instances of loops may easily be distinguished. The function @ assigns an integer time value to each event occurrence. We allow for addition and subtraction of integer variables or constants. Conditions may describe maximum as well as minimum time spans between two events. Let A and B be events and t1, t2 positive numbers. Then @A <= @B - t1 specifies that at least t1 time units lie between A and B, whereas @A >= @B - t2 means that B occurs at most t2 units later than A.
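To make the two reading rules for minimum and maximum time spans concrete, the sketch below evaluates conditions of the form @A <= @B - t or @A >= @B - t against given time stamps. It handles only this restricted form; the regular expression, the function `holds`, and the sample times are assumptions made for illustration and do not cover the full TimeConstraint grammar.

```python
import operator
import re

# Only the two relational operators used in the text are supported here.
RELOPS = {"<=": operator.le, ">=": operator.ge}

CONDITION = re.compile(r"@(\w+)\s*(<=|>=)\s*@(\w+)\s*([+-])\s*(\d+)$")


def holds(condition, times):
    """Evaluate '@X relop @Y +/- n' for the event times given in `times`."""
    match = CONDITION.match(condition.strip())
    if not match:
        raise ValueError(f"unsupported condition: {condition!r}")
    left, relop, right, sign, number = match.groups()
    offset = int(number) if sign == "+" else -int(number)
    return RELOPS[relop](times[left], times[right] + offset)


# Event A occurs at time 10 and event B at time 25, i.e. 15 units apart.
times = {"A": 10, "B": 25}

print(holds("@A <= @B - 10", times))  # at least 10 units between A and B
print(holds("@A <= @B - 20", times))  # at least 20 units between A and B
print(holds("@A >= @B - 20", times))  # B at most 20 units later than A
print(holds("@A >= @B - 10", times))  # B at most 10 units later than A
```

With a distance of 15 units, the first and third conditions hold and the second and fourth do not.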

Fig. 3. Events

As an example, we model a GUI which controls two PID devices. Data are requested from PID_1 and at most 100 ms later from PID_2. The acquisition of data from PID_1 lasts at most 40 ms, and a time span of 15 ms is necessary for processing these data.

    diagram GUI with activation
    GUI -> PID_1 at(current, current+1) asynch requestData() snd = START rcv = RCV_PID
    PID_1 -> GUI at(current, current+1) actualData() snd = SND_PID rcv = RCV_GUI
    GUI processData() till current+1
    GUI -> PID_2 at(current, current+1) asynch requestData() snd = END
    PID_2 -> GUI at(current, current+1) actualData()
    TIME @RCV_PID >= @SND_PID - 40
    TIME @START >= @END - 100 AND @RCV_GUI <= @END - 15
    end


Clearly these constraints can be fulfilled. In general the graph-based constraint checking algorithm works as follows [CJD91]. The constraints are transformed into disjunctive normal form where each condition has the form @A <= @B - t1.

Fig. 4. Constraint graph (vertices @START, @RCV_PID, @SND_PID, @RCV_GUI, @END; arrow labels +40, +100, -15)

Then a directed graph is created (see figure 4) whose vertices are the events and an arrow labeled with -t1 leads from A to B. Note that the elementary timing constraints induced by the temporal order of the events are also considered. They are depicted as dotted arrows in figure 4. If we now find cycles with negative weight, the constraints are obviously contradictory and cannot be fulfilled. An efficient algorithm is described in [TW96]. In our example there are two cycles, each of which has a positive sum of weights; hence the system is consistent. During the compilation we also collect the event names and the time constraints, but only the former are shown in the diagram (see diagram 3). The constraint system is built and can be solved by a separate tool.
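The cycle test can be sketched with the textbook Bellman-Ford procedure for difference constraints. This is a stand-in chosen for brevity, not the algorithm of [TW96] or the AVUS implementation, and the edge orientation used here is the standard one (a constraint time[u] - time[v] <= w becomes an edge from v to u with weight w), which need not match the drawing conventions of figure 4. The ordering constraints between consecutive events are modeled with weight 0, which is also an assumption.

```python
def consistent(events, constraints):
    """Check difference constraints for feasibility.

    Each constraint (u, v, w) means time[u] - time[v] <= w and becomes an
    edge v -> u with weight w. The system is infeasible exactly when the
    graph contains a cycle of negative total weight.
    """
    # Starting every distance at 0 acts like a virtual source vertex.
    dist = {event: 0 for event in events}
    edges = [(v, u, w) for (u, v, w) in constraints]
    for _ in range(len(events)):
        changed = False
        for src, dst, weight in edges:
            if dist[src] + weight < dist[dst]:
                dist[dst] = dist[src] + weight
                changed = True
        if not changed:
            return True   # distances settled: no negative cycle
    return False          # still relaxing after |V| passes: negative cycle


EVENTS = ["START", "RCV_PID", "SND_PID", "RCV_GUI", "END"]


def gui_constraints(max_span):
    """Constraints of the GUI example, with the START..END bound as a parameter."""
    return [
        ("SND_PID", "RCV_PID", 40),     # @RCV_PID >= @SND_PID - 40
        ("END", "START", max_span),     # @START >= @END - max_span
        ("RCV_GUI", "END", -15),        # @RCV_GUI <= @END - 15
        # Temporal order of the events (assumed non-strict).
        ("START", "RCV_PID", 0),
        ("RCV_PID", "SND_PID", 0),
        ("SND_PID", "RCV_GUI", 0),
        ("RCV_GUI", "END", 0),
    ]


print(consistent(EVENTS, gui_constraints(100)))
print(consistent(EVENTS, gui_constraints(10)))
```

With a bound of 100 the constraints are satisfiable; tightening it to 10 leaves less time than the 15 units required for processing, so a negative cycle appears.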


Fig. 5. Diagram 3

If timing constraints appear in loops, the event occurrences may be specified with respect to the current iteration of the loop. @START[0] denotes the current occurrence, @START[-1] the previous, and @START[+1] the following. It is not allowed to combine event occurrences inside and outside of a loop, because the number of iterations cannot be determined in advance. For example, we construct a loop reaching from @START through @RCV_GUI in order to allow for multiple requests from PID_1. Each iteration must be finished within 100 ms.

    diagram GUI with activation
    REPEAT
    GUI -> PID_1 at(current, current+1) asynch requestData() snd = START rcv = RCV_PID
    PID_1 -> GUI at(current, current+1) actualData() snd = SND_PID rcv = RCV_GUI
    UNTIL finished
    GUI processData() till current+1
    GUI -> PID_2 at(current, current+1) asynch requestData() snd = END
    PID_2 -> GUI at(current, current+1) actualData()
    TIME @RCV_PID >= @SND_PID - 40
    TIME @START[0] >= @START[+1] - 100
    end

This leads to a very similar constraint graph for the loop.

Fig. 6. Constraint graph (vertices @START[0], @RCV_PID, @SND_PID, @RCV_GUI, @START[+1]; arrow labels +40, +100)

The time provided for the execution of a loop may be specified by a constraint combining event occurrences before and after the loop.

4. Conclusion

We have defined a textual language UMLscript-RT to describe UML sequence diagrams. The language may be used as an intermediate format that is portable between different systems or CASE tools. Because all layout information has been skipped, the files are rather concise. They can easily be modified by a conventional text editor. The syntactical correctness and soundness of the diagrams is checked by a compiler. That compiler also is responsible for the drawing of the diagram. Currently we have implemented a very simple drawing algorithm that, nevertheless, has the advantage that the user can influence the layout in an obvious manner. We have added explicit loop and alternative statements, two very helpful constructs for the simulation of real-time systems. The extensions mirror those for the message sequence charts [MSC96].


Constraints, intentionally left open in the UML, have been formalized for two purposes. Timing marks control the simulation. They are also interpreted to determine the exact drawing positions of messages on the now-scaled time axis. Timing constraints, on the other hand, are used to model the dynamic system behavior. In a high-level UML sequence diagram, timing constraints specify the system requirements. During software design these diagrams are refined into a number of UML sequence diagrams containing many intermediate timing constraints that depend on each other. Their consistency can be checked by investigating the cycles in an associated constraint graph. Currently a prototype of the AVUS tool is being tested for industrial use.

5. References

[CJD91] S. Chodrow, F. Jahanian, M. Donner: Run Time Monitoring of Real-Time Systems, Proc. Real Time Systems Symposium, pp. 74-83, 1991.
[GKS 96] M. Gergeleit, J. Kaiser, H. Streich: Checking Timing Constraints in Distributed Object-Oriented Programs, OOPS Messenger, Vol. 7, No. 1, pp. 51-58, 1996.
[JM86] F. Jahanian, A. Mok: Safety Analysis of Timing Properties in Real-Time Systems, IEEE Trans. Software Eng., Vol. SE-12, No. 9, pp. 890-904, 1986.
[MSC96] Z.120 (1996), Message Sequence Chart (MSC), ITU-T, Geneva, 1996.
[SWvG98] J. Seemann, J. Wolff v. Gudenberg: UMLscript, A Programming Language for Object-Oriented Design, in M. Schader, A. Korthaus (eds.): The Unified Modeling Language, Technical Aspects and Applications, Physica-Verlag, pp. 160-169, 1998.
[TW96] J. Tsai, T. Weigert: A Logic-Based Requirements Language for the Specification and Analysis of Real-Time Systems, Proc. 2nd Workshop on OO Real-Time Dependable Systems, IEEE, pp. 8-16, 1996.
[UML97] G. Booch, I. Jacobson, J. Rumbaugh: The Unified Modeling Language, http://www.rational.com.

UML and User Interface Modeling

Srdjan Kovacevic
Aonix, 595 Market Street, San Francisco, CA 94105
[email protected]

Abstract. UML and traditional CASE tools still focus more on application internals and less on application usability aspects. A user interface (UI) is modeled in terms of its internal structure and objects comprising it, the same as the rest of the application. The adoption of use cases and interaction scenarios acknowledges the importance of recognizing user tasks when developing an application, but it is still used mainly as a starting point for designing software implementing usage scenarios rather than focusing on modeling user tasks to improve application usability. Explicit modeling of user interface domain knowledge can bring important benefits when utilized by a CASE tool: additional design assistance with exploring UI design alternatives, support for evaluating and critiquing UI designs, as well as increased reuse and easier maintenance. UML can provide a notation framework for integrating user interface modeling with mainstream software engineering OO modeling. The built-in extensibility mechanisms (stereotypes, tagged values and constraints) allow the introduction of new modeling constructs with specialized semantics for UI modeling while staying within UML. The paper identifies modeling constructs needed for UI modeling and proposes a direction for extending UML to better address UI design.

1 Introduction

In an interactive application, over 50% of code is typically devoted to the user interface (UI). Yet UML and traditional CASE tools still focus more on application internals and less on application usability aspects. A user interface is modeled in terms of its internal structure and objects comprising it, the same as the rest of the application. The adoption of use cases and interaction scenarios acknowledges the importance of recognizing user tasks when developing an application, but it is still used mainly as a starting point for identifying application internals and designing software that implements usage scenarios rather than focusing on modeling user tasks to improve application usability.

Application usability goes beyond interaction techniques and the widgets used in its UI. How its UI is structured and whether it is appropriate for a user’s task at hand can be even more important. That is why good graphical design alone does not guarantee a good UI. UI design must also incorporate the results of task analysis and modeling. Modeling user tasks and evaluating user interfaces using models such as GOMS [8,11] can help detect potential problems in UI design and bring significant savings, both in development costs (e.g., fewer prototype/evaluate cycles needed due to model-driven evaluations) and users’ productivity (dialogs optimized for critical tasks). Model-driven UI development can bring other benefits as well, including run time management of user interfaces, providing different kinds of help and design space exploration [2,3,5,9,10,12,13,14,16,18,19,22,23]. On the other hand, a big obstacle to the wider adoption of model-based user interface development is the complexity (due to lack of adequate support) and perceived overhead of creating a model.

UML provides a rich notation that goes beyond pretty pictures. Although UML is by no means complete and includes several areas that are still undergoing revisions (use cases and activity diagrams among them), it nevertheless provides enough semantics to enable CASE tools to assist in software development tasks. Currently, no such support is possible for UI design tasks. Instead of leveraging UI domain knowledge and the semantics captured in the application model, CASE tools still require UI designers to work directly with low-level UI components, such as dialog boxes, menus and callbacks. The UI is modeled in terms of its component objects, not in terms of user tasks and desired look and feel. It is as if programmers were asked to program their loops using explicit registers, conditionals and branch statements instead of using for/while/repeat constructs.

UML can provide a notation framework for integrating user interface modeling with mainstream software engineering OO modeling. The built-in extensibility mechanisms (stereotypes, tagged values and constraints) allow the introduction of new modeling constructs with specialized semantics for UI modeling, while staying within UML. In this paper, I identify elements needed for modeling user interfaces and propose a direction for extending UML to better address UI design needs.

The next two sections briefly discuss the main concepts used in UI modeling and the place of UI design in the software development life cycle. Section 4 describes a minimal set of extensions to UML needed for modeling application UIs. Section 5 discusses how these extensions fit into the overall UML framework. Section 6 offers conclusions and directions for future work.

J. Bézivin and P.-A. Muller (Eds.): «UML» ’98, LNCS 1618, pp. 253–266, 1999. © Springer-Verlag Berlin Heidelberg 1999

2 UI Modeling and Design

2.1 What Defines an Application User Interface

Each user interface is a product of two sets of requirements: (1) application information requirements and (2) look and feel requirements. To fully define a UI, we need to specify both. The primary role of a user interface is to serve its underlying application. It must meet all its information requirements, both in terms of its inputs and outputs. A user interacting with an application must be able to specify what action to perform and all input values for action parameters. The application must also be able to present all relevant information back to the user. These are mandatory requirements for any user interface – if they are not satisfied, the application may not be fully functional (i.e., parts of it may not work, either because it is not possible to specify all required parameters or results cannot be presented to the user).


Typically, there are many different ways in which the application information requirements can be satisfied. For instance, different interaction techniques can be used to specify an input value, or values can be specified in varying order. Which particular user interface is selected is determined by look and feel requirements. These requirements are optional in the sense that even if they are not fully satisfied (e.g., the desired interaction technique is not available on a target platform and a different technique must be used), the application will still be functional. The look and feel requirements affect the application's usability and not its functionality (though one may argue that an application with poor usability is effectively not fully functional, as there are parts that are hard to access and exercise).

[Figure: Applications 1, 2 and 3, placed along an "Application Information Requirements" axis, mapped onto user interfaces UI 1a, UI 1b, UI 1c and UI 1d, which differ in "UI Details".]

Fig. 1. Applications and their user interfaces

Figure 1 illustrates that each application can be mapped onto more than one UI. If we change some of the application requirements, we map it to different interfaces. Furthermore, the mapping can be decomposed into two or more steps. The first step (the solid arrow from Application_1 to UI_1a) maps an application to a default UI that meets all its information requirements. The subsequent steps apply different look and feel requirements, mapping a given UI to different designs (e.g., the solid arrow from UI_1a to UI_1d). This is equivalent to transforming a UI to another UI while preserving aspects derived from the application information requirements. Design transformations allow exploration of different UI designs in a model-based UI environment. More details on design transformations can be found in [5,12].

2.2 Application Conceptual Model

As already pointed out, the basic requirements on an application’s UI are to (1) enable a user to specify all inputs that may be needed to perform any given task and (2) be able to present all necessary information back to the user. These application information requirements can be derived from specification of application actions and objects that users interact with and operate on. In model-based UI systems (e.g., [5,14,18,24]), this information is captured in an application conceptual model, which integrates an object domain model and a task model.


An object domain model describes the types of concepts in a given domain and the various kinds of static relationships among these concepts.

[Figure: task tree for Circuit Design (SEQ) with the subtasks Open Diagram (ANY: Create New Diagram, Existing Diagram), Edit Diagram (marked *; ANY: Create Gate, Move Gate, Rotate Gate, Delete Gate, Connect Gates) and Save Diagram. Move Gate (AND) has the parameters Select Object and Select Position and the constraint <<precondition>> {Exists gate}. Legend: Task/Action, Parameter, Pre- & Postconditions.]

Fig. 2. Hierarchical Task decomposition.

A task model describes tasks that users need to perform and how they are structured. Tasks are typically decomposed in a hierarchy, with additional information indicating temporal and logical constraints among tasks, such as sequence, choice, concurrence, and enabling [2,19,25]. Figure 2 shows a partial decomposition of a task for creating circuit designs. It involves three subtasks (Open, Edit, and Save Diagram) that have to be carried out in sequence, where editing is optional and can be iterated. Editing a diagram involves any one of five actions (create, move, rotate, delete, connect). Only parameters for the move-gate action are shown here.

2.3 Look and Feel Aspects of UI

A UI is not fully defined by the application conceptual model; additional information is needed to get a working UI. This includes details on how to present object attributes to a user, what kind of feedback to show, how to present tasks and actions to a user, how to activate an action and specify its parameters, and other information that influences the look and feel aspects of a UI. For instance, from the task hierarchy in figure 2 we know that the move-gate action requires two parameters and that it is active only if there is at least one gate object. However, the hierarchy does not specify how to select the action, nor how to provide its parameter values. Deciding how these lowest level (interaction) tasks in fig. 2 are satisfied is part of UI design. (Note that UI design goes beyond this and also includes deciding how to present application objects, what metaphors to use, etc., but we will not discuss these issues here.)

Interaction tasks are satisfied by interaction techniques that specify compositions of interface actions. Essentially, each interaction task can be hierarchically decomposed into interface actions in the same way application tasks are decomposed in figure 2. Interaction techniques can also have pre- and post-conditions. For instance, a mouse-drag technique requires that the object to be dragged is visible. Hierarchical decomposition of interaction tasks and techniques can be used to evaluate an application’s UI, as well as provide help on how to use it (e.g., see [4,12,19,23]).

[Figure: Move Gate (AND) with the interaction tasks Select Action, Select Object and Select Position. Select Action is satisfied by the Menu ITec interaction technique (SEQ: Mouse to menu, Button down, Open Menu, Mouse to item, Button up); Select Object and Select Position are satisfied by Mouse-Drag. Legend: Task/Action, Parameter, Interaction Technique, Interface Action, Feedback.]

Fig. 3. Explicit action selection

Figures 3 and 4 illustrate two different UI designs for the move-gate action. The first design (fig. 3) uses a menu interaction technique to explicitly select the action and mouse-drag to select parameters: an object and a position to move it to. The design in fig. 4 also uses mouse-drag to move an object, but does not require the action to be explicitly selected. The first design allows using mouse-drag for the rotate action as well (if it is explicitly selected from a menu and only one action can be active); the second one does not (unless we reserve a different mouse button and/or a modifier to distinguish the move and rotate actions). If a designer tries to make both actions (move and rotate) implicit and to use the same interaction technique (e.g., mouse-drag), a UI design tool can detect this as an inconsistent/ambiguous design and warn the designer. Note that the above examples are simplified (e.g., the interaction tasks confirm and cancel-action are not shown, nor are pre- and post-conditions). Nevertheless, they demonstrate how using different interaction techniques to accomplish interaction tasks (of selecting an action and providing parameter values) results in different UI designs sharing the same underlying functionality. Tools that take advantage of this can support exploratory UI design that preserves application semantics [5,12], as well as additional assistance, such as checking designs for consistency and completeness and providing different forms of help (e.g., see [23] for a discussion of how hierarchical task decomposition facilitates animated context-sensitive help).
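The ambiguity check mentioned above can be sketched in a few lines. This is only an illustration under simplifying assumptions, not the mechanism of any particular tool: the `ActionBinding` record, its `trigger` field (standing for a mouse button or modifier) and the function `ambiguous_pairs` are hypothetical names introduced here.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple

@dataclass(frozen=True)
class ActionBinding:
    action: str            # application action, e.g. "Move Gate"
    technique: str         # interaction technique used for its parameters
    explicit: bool         # True if the action is first selected explicitly (e.g. from a menu)
    trigger: str = "left"  # hypothetical: mouse button/modifier that starts the gesture

def ambiguous_pairs(design: List[ActionBinding]) -> List[Tuple[str, str]]:
    """Return pairs of implicitly selected actions that share the same gesture."""
    clashes = []
    for a, b in combinations(design, 2):
        same_gesture = (a.technique, a.trigger) == (b.technique, b.trigger)
        if same_gesture and not a.explicit and not b.explicit:
            clashes.append((a.action, b.action))
    return clashes

# Design in the style of fig. 3: both actions are selected from a menu first.
explicit_design = [ActionBinding("Move Gate", "mouse-drag", explicit=True),
                   ActionBinding("Rotate Gate", "mouse-drag", explicit=True)]
# Both actions implicit with the same technique: the ambiguous case.
implicit_design = [ActionBinding("Move Gate", "mouse-drag", explicit=False),
                   ActionBinding("Rotate Gate", "mouse-drag", explicit=False)]
# Implicit, but rotate is reserved a different mouse button.
modifier_design = [ActionBinding("Move Gate", "mouse-drag", explicit=False),
                   ActionBinding("Rotate Gate", "mouse-drag", explicit=False, trigger="right")]
```

Only the design in which both actions are implicit and share the same gesture is reported; a tool could turn each reported pair into a warning for the designer.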

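[Figure: Move Gate (AND) with the interaction tasks Select Action, Select Object and Select Position. Select Action is marked {implicit}; Mouse-Drag satisfies Select Object and Select Position. Legend: Task/Action, Parameter, Interaction Technique, Interface Action, Feedback.]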

Fig. 4. Implicit action selection

Explicitly specifying every detail for each interaction task and technique is quite tedious, although it is exactly what traditional application development often entails. However, a UI designer does not have to explicitly specify this information; it can be considered a specification freedom [1] and a UI tool can provide it if it is missing. A UI tool can map a conceptual model into a working UI by using some reasonable defaults for missing information, where defaults are selected based on built-in UI domain knowledge.

3 UI Design and Software Development Life Cycle

Currently, a software development life cycle based on a mainstream development process model, such as Fusion or Objectory, does not take into account the specifics of UI design: it neither addresses the usability requirements of interactive applications, nor does it leverage UI domain knowledge to provide any of the support discussed earlier. There are three dimensions of integrating UI design into a software development life cycle, as illustrated in Figure 5 (which is in part based on a draft meta-model in [27]): notation, process, and architecture.

Notation and Semantics. The same argument that drove the adoption of UML (to enable tool interoperability at the semantic level and provide a common language for specifying, visualizing and documenting software artifacts) applies to bridging the gap between UI and OO design. Both UI and OO designers/developers should work on the same model (the domain model in fig. 5), focusing on different aspects but collaborating in developing it. While OO A&D focuses on refining a domain model toward the implementation of a functional core, UI design focuses on an interactive model complementing the domain model. Domain models capture application semantics and define information requirements on the application UI. Interactive models capture UI specifics (look and feel). Together, the two models define a model of an interactive system. The notation must facilitate this collaboration by supporting any additional UI-related modeling views and providing underlying semantics that tie UI design view constructs to OO A&D constructs (in the same manner as UML ties existing views).

Architecture. Interactive systems require an architecture that maximizes the leveraging of UI domain knowledge and reuse. For instance, fig. 5 illustrates an architecture that clearly (conceptually, not necessarily physically) separates the functional core from a UI component that builds on UI domain knowledge in providing design assistance (evaluation, exploration) and run time services (e.g., UI management and context-sensitive help). Note that the functional core (the non-interactive part of the application) depends on the domain model, but not on the interactive model. The application UI depends on both the interactive and domain models.

[Figure: diagram with the labels Interactive System Model, Method/Process, Notation Issues, Domain Model, Interactive Model, Architecture, User Interface, Internal Interface, Functional Core, Implementation.]

Fig. 5. Separating a UI component and the application functional core.

Process. UI design must be incorporated into the software development process as an integral part, not an afterthought, so that the process facilitates collaboration between UI designers and software developers. Note that we intentionally do not propose a specific process for creating task models; that is beyond the scope of this paper. Refer to [6] for more details on process issues.

4 Extending UML with Constructs for Modeling Application UIs

UML already provides sufficient support for object domain modeling: a domain model can be specified using class diagrams. However, task modeling requires a number of extensions to UML. Although Use Cases can be used for task modeling (they capture functional goals of a target system and are intended to capture task domain knowledge [7]), they are insufficient to represent all necessary information pertaining to temporal and logical relationships among tasks. For instance, Use Cases allow only two relationships, uses and extends, which are not enough. Behavioral diagrams, in particular interaction diagrams, can supplement Use Case diagrams and capture information about some relationships between tasks (e.g., using sequence values, guards, iterations and predecessor information, as well as explicit timing information), but this is neither sufficient nor adequate for task modeling. These diagrams are geared toward modeling behavioral aspects of operations. For instance, collaborations focus on how a use case is realized by a set of cooperating classifiers, but not on its usage in a user domain.


Activity diagrams provide better constructs for expressing relationships between tasks, but they are still not adequate for the task modeling typically used in UI design (e.g., see [10,17,19,25]). Specifically, activity diagrams are geared toward modeling procedural flow of control and business process analysis, whereas task analysis focuses on goal-driven hierarchical decomposition. When procedural aspects are modeled, that is done in a separate diagram and only for lowest level tasks [10]. None of the UML diagrams has all the constructs needed to support task modeling for interactive applications, so extensions are required.

Given that UML is still evolving and that use case and activity diagrams are the two most controversial areas and the most likely to change further, the extensions are defined and presented in terms of the goals they are to fulfill rather than as specific constructs in the use case or activity diagrams. That is, the focus is on identifying what constructs are needed and why. Only when there is agreement on these two issues can we effectively deal with the technical issues of how best to define them and which existing constructs to use as a base.

In this section, we first identify the minimal set of constructs we need and then discuss some of the possible ways of implementing them and their relationship to the existing UML constructs and views. The set is minimal in the sense that it aims to address UI design needs incrementally, by focusing first on task modeling and not trying to cover all possible uses of UI models. For instance, additional properties will be needed to support all kinds of evaluations and UI generation.

4.1 Proposed Constructs/Concepts

This subsection describes the proposed additional modeling constructs and how each construct contributes to UI design support.

Task
The Task entity represents user tasks/subtasks (as compositions of actions) and actions in a task model. The distinction between tasks and actions is that an action has a corresponding semantic action routine that realizes the action in the application functional core, whereas a task is realized through its component subtasks and/or actions. The distinction is not emphasized, especially since in the early design phases decisions on how to realize user tasks may not yet have been made. However, it becomes important when integrating the UI with the application functional core (AFC). The following standard properties can be defined for Tasks:
• Name.
• Ordering (determines what temporal and logical ordering to apply to its subtasks, e.g., parallel, sequential, AND, choice). If subtasks require different types of ordering, then the hierarchy needs to be reorganized by introducing new subtasks for each type of ordering.
• Feedback (this is user interface-specific information that may not be used in early phases, but allows a designer to identify some look and feel aspects associated with a particular task).
• Semantic action routine (SAR) that realizes this task, if any. If there is one, then this is actually an application action and it is part of the internal interface between the UI and the application functional core (AFC).
• Interruptable (can this task be interrupted by another one). This property affects both the UI and the internal interface toward the AFC (e.g., an interruptable task carried out by the system needs additional UI controls for interrupting/suspending/resuming the task).
• Resumable (can this task be resumed if interrupted).
• Kind (whether it is performed by a user, by a system, or interactively).
Additional properties may be needed depending on intended use (e.g., see [2,25]).

Parameter
The Parameter entity defines information requirements for performing a task. Parameters are typically attached to a task as leaf nodes in a task tree, but they can also be further decomposed if we want to model low level interactions (at the syntactic and lexical levels, representing interaction tasks [23,25]), as illustrated in figs. 3 and 4. Also, if a parameter corresponds to a composite type, it may be represented as a structured task consisting of two or more subtasks (which in turn may be structured as well). Whether this level of precision is required depends on the intended use. The following standard properties can be defined for task Parameters:
• Name.
• Type corresponding to a predefined type or an object type defined in a conceptual model (i.e., object domain model).
• Value (default value, if any).
• Kind determines whether it is an optional parameter (has a default value), implicit (defined elsewhere) or explicit (has to be explicitly provided) [5].
Additional properties may be introduced for modeling look and feel aspects and run time support (e.g., see [12,25]).

Pre-condition and Post-condition
Pre-conditions define requirements for each (sub)task. Post-conditions define how a task changes the relevant context. Pre-conditions and Post-conditions implicitly capture dependencies among tasks. They can be used in combination with explicit ordering and dependency relations between tasks. The advantage of Pre-conditions and Post-conditions is that they scale up better (when dependencies are many-to-many, e.g., tasks a1, a2, …, an depend on b1, b2, …, bm). Post-conditions can also be used to represent Goals (to be satisfied by performing a task). Note that UML already has constructs for Pre-conditions and Post-conditions (defined as stereotypes of Constraint), but they are defined only for Operation. The proposal is to define the same construct for tasks and parameters.

Associations
There are two kinds of associations between the constructs in a task model:
Aggregations between a task and its subtasks, parameters, and Pre- and Post-conditions. The cardinality of a subtask indicates whether it is optional or mandatory and whether it can be repeated (iterated).
Dependency explicitly represents a dependency between two tasks or two parameters. This can be used as an alternate way of representing (or specifying) dependencies between tasks captured in Pre- and Post-conditions. While Pre- and Post-conditions scale up better for complex hierarchies, explicit relationships might work better in simple cases. Regardless of the way the dependency information is specified, a tool should be able to infer this information and show it on demand. Dependencies may also represent relationships between parameters. For instance, if there is a dependency between parameter Pa of task A and parameter Pb of its subtask B, there is no need to specify input values twice, and this information can be used in evaluating a given UI design.

Run Time Implementation Components
As Figures 3 and 4 illustrate, a task model may be further decomposed to add details not captured in the high-level constructs discussed so far. To specify low-level details determining look and feel aspects of a UI, we need additional components representing building blocks that can be assembled to implement a UI. It is beyond the scope of this paper to discuss all implementation components; [12] provides in-depth coverage of UI building blocks.
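As an illustration of how the constructs above fit together, the following sketch encodes Tasks, Parameters and Pre-conditions as plain data structures and evaluates the pre-conditions against a context. It is a sketch under stated assumptions, not a proposed metamodel: the class and field names mirror the property lists above, while the SAR names, the type names `Gate` and `Point`, the context dictionary and the function `enabled_actions` are hypothetical. The traversal deliberately ignores the Ordering property and only evaluates pre-conditions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Context = Dict[str, object]  # hypothetical run-time context the conditions refer to

@dataclass
class Parameter:
    name: str
    type: str                       # predefined type or object type from the domain model
    value: Optional[object] = None  # default value, if any
    kind: str = "explicit"          # "optional", "implicit" or "explicit"

@dataclass
class Task:
    name: str
    ordering: str = "SEQ"           # e.g. "SEQ", "AND", "ANY"
    sar: Optional[str] = None       # semantic action routine; set only for actions
    kind: str = "interactive"       # performed by "user", "system" or "interactive"
    interruptable: bool = False
    resumable: bool = False
    parameters: List[Parameter] = field(default_factory=list)
    subtasks: List["Task"] = field(default_factory=list)
    preconditions: List[Callable[[Context], bool]] = field(default_factory=list)

    def is_action(self) -> bool:
        # An action is a task realized by a semantic action routine.
        return self.sar is not None

    def enabled(self, ctx: Context) -> bool:
        return all(cond(ctx) for cond in self.preconditions)

def enabled_actions(task: Task, ctx: Context) -> List[str]:
    """Names of actions reachable through tasks whose pre-conditions hold."""
    if not task.enabled(ctx):
        return []
    if task.is_action():
        return [task.name]
    names: List[str] = []
    for sub in task.subtasks:
        names.extend(enabled_actions(sub, ctx))
    return names

# Fragment of the task tree in fig. 2 (SAR names are made up for the example).
move_gate = Task("Move Gate", ordering="AND", sar="move_gate",
                 parameters=[Parameter("object", "Gate"), Parameter("position", "Point")],
                 preconditions=[lambda ctx: len(ctx["gates"]) > 0])  # {Exists gate}
edit = Task("Edit Diagram", ordering="ANY",
            subtasks=[Task("Create Gate", sar="create_gate"), move_gate])
```

With an empty diagram only Create Gate is offered; once a gate exists, Move Gate becomes enabled as well, which is the behavior a UI tool could use to enable or disable the corresponding controls.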

4.2 Implementation

As pointed out earlier, none of the existing UML diagrams provides the modeling constructs proposed here. Although a combination of existing constructs can be used to define a task model, that would be at too low a level of detail, and the information would be spread across multiple diagrams. For instance, by using a combination of use case, sequence and collaboration diagrams, one can capture different paths through the task model; but specifying decompositions in interaction diagrams requires using conditions, guards and sequence numbers. Activity diagrams appear to be a better choice, but they are geared toward modeling workflow, and representing a hierarchical decomposition would require nesting diagrams. Yet another alternative is to represent all UI modeling constructs in a class diagram. In either case, we can:
• Define stereotypes to distinguish different modeling constructs that have specific meanings in UI modeling, and
• Use tagged values to define additional properties for each stereotype.
For instance, a top-level task in fig. 2 can be represented as a stereotype «Task» with a tagged value “Ordering=SEQ”.

When we extend UML and provide a new notation with additional semantics by using the UML extensibility mechanisms, these semantics are expressed in terms of the existing UML modeling constructs and thus remain integrated with the rest of the notation set. Obviously, a tool that “understands” the new semantics can provide more assistance during UI design and relate the new constructs to the standard elements used to model the application functional core. For instance, it may relate actions in the task model to external events in the interaction diagrams and check whether all semantic action routines used by those actions have corresponding operations defined. Even a tool that does not understand the new semantics and is not able to reason about the new constructs will still be able to support visual modeling of UI designs, as long as it supports the UML extensibility mechanisms. While such a tool would not be able to automate refinement steps or provide any process guidelines, a user would still be able to create, manipulate and review diagrams.

4.3 Example

Specifying implementation details for each interaction task can be done manually or, preferably, automatically by a CASE tool leveraging general UI domain knowledge and application-specific knowledge captured in a task model. TACTICS [12,15] is such a tool. TACTICS (Transformation- And Composition-based Tool for Interface Creation and Support) integrates a compositional model of user interfaces and a transformational model of the UI design space. The TACTICS model is compositional because it views a user interface as a collection of primitives structured based on the application and on the desired dialogue style; the model identifies user interface components and structuring principles for assembling components into a coherent interface. The model is transformational because it defines a set of transformations changing UI designs. Transformations modify UI structures to achieve a different look and feel, making it easy for a designer to generate and try alternative designs. Another important characteristic of TACTICS is that it shifts the boundary between the application user interface and its functional core by identifying reusable components that allow a UI tool to control look and feel aspects of an application’s UI without affecting its functionality [12]. In that sense, it provides the (conceptual) separation between a UI and the application functional core (fig. 5) discussed in Section 3.

TACTICS maps a high-level task model into a working UI based on a method outlined as follows. For each application action, TACTICS generates a goal tree (fig. 6). The top-level subgoals (e.g., SG1, SG2) correspond to information units required by an action. These include all action parameters, as well as subgoals for selecting the action and confirming selected values. They form a partially ordered AND-subtree: the confirmation subgoal requires the other subgoals to be satisfied first, whereas the other subgoals may or may not be ordered. Each subgoal has an OR-subtree representing different ways of satisfying the subgoal. For instance, in the goal tree shown in figure 6, subgoal SG1 can be satisfied by SG11 or by L1; SG11 can in turn be satisfied by L2. Subgoal SG4 is initially satisfied (it has a default value, represented by the + branch). TACTICS can also capture dependencies between different actions, as illustrated in the figure: subgoal SG5 requires a value provided by another action. When propagating solutions up the tree at run time, a proposed value can be rejected, even if it is of the required type, if it does not meet all constraints concerning other subgoals. For instance, two parameters can be constrained so that their values must satisfy a specific relationship, such as "be different".

[Figure: goal tree. Region labels: Information Requirements, Look and Feel, User Interface, Interaction Techniques. Node labels: Actions (Action-1, Action-2), Parameters, SG1 to SG5, +, SG11, SG12, L1 to L5.]

Fig. 6. Task/action goal tree.
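The goal-tree idea can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions, not the TACTICS implementation: the class names `Subgoal` and `ActionGoal`, the helper `different`, and the sample values are hypothetical. It shows an AND-node whose subgoals must all be satisfied, a default value standing in for the + branch, and a proposed value being rejected because it violates a constraint involving another subgoal.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Subgoal:
    """One information unit of an action (hypothetical structure)."""
    name: str
    # OR-branches: names of alternative ways to obtain a value
    alternatives: list[str] = field(default_factory=list)
    default: Optional[object] = None  # the "+" branch: initially satisfied
    value: Optional[object] = None

    def current(self) -> Optional[object]:
        return self.value if self.value is not None else self.default

    def satisfied(self) -> bool:
        return self.current() is not None


def different(a: str, b: str) -> Callable[[dict], bool]:
    """Constraint: subgoals a and b must not hold the same value."""
    def check(values: dict) -> bool:
        va, vb = values.get(a), values.get(b)
        return va is None or vb is None or va != vb
    return check


@dataclass
class ActionGoal:
    """AND-node: all subgoals must be satisfied before confirmation."""
    name: str
    subgoals: dict[str, Subgoal]
    constraints: list[Callable[[dict], bool]] = field(default_factory=list)

    def propose(self, subgoal: str, value: object) -> bool:
        """Accept the value only if every constraint still holds."""
        trial = {n: s.current() for n, s in self.subgoals.items()}
        trial[subgoal] = value
        if not all(check(trial) for check in self.constraints):
            return False  # rejected although the type may be right
        self.subgoals[subgoal].value = value
        return True

    def ready_to_confirm(self) -> bool:
        return all(s.satisfied() for s in self.subgoals.values())


# Illustrative data only; SG1's alternatives follow the text above.
goal = ActionGoal(
    "Action-1",
    {
        "SG1": Subgoal("SG1", alternatives=["SG11", "L1"]),
        "SG2": Subgoal("SG2"),
        "SG4": Subgoal("SG4", default=10),
    },
    constraints=[different("SG1", "SG2")],
)
```

With this data, proposing the same value for SG1 and SG2 is refused for the second subgoal, and `ready_to_confirm()` becomes true only once both have distinct values; SG4 counts as satisfied from the start because of its default.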

More details on TACTICS and a discussion of the mapping process are in [12,13,15]. The mapping from task models to presentations is discussed in [2]. Reference [4] describes how a task model can be extended with the low-level details needed for NGOMSL (Natural GOMS Language) evaluation. Even if automatic generation or evaluation is not available, following the mapping process manually can still be beneficial, since the model identifies all the critical components and their dependencies. Using your favorite toolkit, you can define a dialog box for an action with controls for each parameter and with dependencies that implement all constraints between parameters, as well as the pre- and post-conditions identified in the model. The model also defines the internal interface between the UI and the application functional core in the form of semantic action routines and feedback.

5 New UML View or Overloading Existing Ones

UI modeling constructs can be defined using different existing elements as a base. One possibility is to define all new UI modeling constructs based on a class element and to use class diagrams for UI modeling. To distinguish task model/UI design diagrams from other class diagrams, we can stereotype the diagram as a whole. This would allow a modeling tool that is aware of these extensions to recognize task and UI models and provide support for UI design with minimal impact on non-UI design activities. Regardless of what existing elements are used as a base, the new constructs would have additional properties and semantics (in the context of the Task Model) that may interfere with non-UI design activities. This is one of the reasons why a different view for UI modeling is justified; UI design also often requires a different perspective from traditional OO A&D.

UML enables modeling a system at different levels of abstraction and from different perspectives, using different views to emphasize different aspects of the system. UML also provides extensibility mechanisms to accommodate the different needs and different methods that have grown out of UML. Adding UI modeling is best done by adding a new task modeling perspective/view to the existing ones. In the spirit of UML, the task view would be optional but, if used, would be integrated with other views on the same underlying model. For instance, non-interactive applications would clearly not require this view. On the other hand, it is very valuable for methodologies geared toward interactive applications that emphasize task analysis and modeling.

Depending on the method, it may not be necessary to start with a task model. For instance, a tool can infer (create) an initial task model from use case diagrams, corresponding interaction diagrams and class diagrams (modeling classes that handle relevant external events), as well as activity diagrams. Similarly, a task model can be used to initialize the use case and interaction and/or activity diagrams.

6 Conclusion

There is growing interest in the HCI community in bridging the gap between UI design and software engineering (e.g., see [20,27]). UML provides a solid foundation for integrating UI design into the software development life cycle, and this workshop is an opportunity to gain momentum and bring these efforts together. This paper identifies a minimal set of extensions to UML to support task modeling. I consider this a step in the continued evolution of UML to better address the needs of different domains and applications, such as real-time modeling and user interface modeling. The proposal presented here is by no means exhaustive and does not pretend to address all requirements of UI modeling. It is intended to address minimal requirements and to be a starting point that would lead to a standard notation for UI modeling as part of UML. Standard UML extensions, as opposed to ad hoc ones, are important because they facilitate interoperability not only between different UI tools, but also between UI tools and OOA&D tools, and thus make it possible to fully leverage a model-driven development approach.

Acknowledgments. I want to thank Tony Wasserman for his feedback and discussions related to methodologies for development of interactive systems; Sarah Satterlee, for her editorial comments; and the anonymous reviewers, who provided many valuable suggestions on earlier versions of this paper.

References

1. Balzer, R., "A 15 year perspective on automatic programming," IEEE Transactions on Software Engineering, vol. SE-11, pp. 1257-1267, 1985.
2. Bodart, F., Hennebert, A.-M., Leheureux, J.-M., Vanderdonckt, J., "A Model-Based Approach to Presentation: A Continuum from Task Analysis to Prototype," in F. Paterno (Ed.), Interactive Systems: Design, Specification, and Verification, Springer, 1995.
3. Braudes, R., A Framework for Conceptual Consistency Verification, D.Sc. Dissertation, Dept. of EE&CS, The George Washington University, Washington, DC 20052, 1990.
4. Byrne, M. D., Wood, S. D., Foley, J. D., Kieras, D. E., and Sukaviriya, P. N., "Automating interface evaluation," in Proceedings of Human Factors in Computing Systems, CHI'94, ACM Press, 1994.
5. Foley, J., Kim, W.C., Kovacevic, S. and Murray, K., "UIDE - An Intelligent User Interface Design Environment," in Architectures for Intelligent Interfaces: Elements and Prototypes, Sullivan, J., Tyler, S. (Eds.), Addison-Wesley, 1991.
6. Hix, D., and Hartson, R., Developing User Interfaces – Ensuring Usability Through Product and Process, John Wiley & Sons, 1993.
7. Jacobson, I., "The Use Case Construct in Object-Oriented Software Engineering," in Scenario-Based Design – Envisioning Work and Technology in System Development, Carroll, J.M. (Ed.), John Wiley & Sons, 1995.
8. John, B. E., and Kieras, D. E., "Using GOMS for user interface design and evaluation: Which Technique?" ACM Transactions on Computer-Human Interaction, Vol. 3(4), December 1996, pp. 287-319.
9. Johnson, P., Wilson, S., Markopoulos, P. and Pycock, J., "ADEPT – Advanced DEsign Environment for Prototyping with Task Models," in Proc. of INTERCHI'93, pp. 56-56, 1993.
10. Johnson, P., Johnson, H., and Wilson, S., "Rapid Prototyping of User Interfaces Driven by Task Models," in Scenario-Based Design – Envisioning Work and Technology in System Development, Carroll, J.M. (Ed.), John Wiley & Sons, 1995.
11. Kieras, D. E., Wood, S. D., and Meyer, D. E., "Predictive Engineering Models Based on the EPIC Architecture for a Multimodal High-Performance Human-Computer Interaction Task," ACM Transactions on Computer-Human Interaction, Vol. 4(3), September 1997.
12. Kovacevic, S., A Compositional Model of Human-Computer Interaction, D.Sc. Dissertation, The George Washington University, 1992.
13. Kovacevic, S., "TACTICS – A Model-Based Framework for Multimodal Interaction," in Proceedings of the AAAI Spring Symposium on Intelligent Multi-Media Multi-Modal Systems, 1994.
14. Kovacevic, S., "Flexible, Dynamic User Interfaces for Web-Delivered Training," in Proceedings of the International Workshop on Advanced Visual Interfaces - AVI'96, 1996.
15. Kovacevic, S., "Model-Driven User Interfaces Development," to appear in Proceedings of the 10th International Conf. on Software Engineering and Knowledge Engineering (SEKE'98).
16. Lonczewski, F. and Schreiber, S., Generating User Interfaces with the FUSE System, TUM-I9612, Technische Universitaet Muenchen, 1996.
17. Tarby, J.-C., Barthet, M.-F., "The DIANE+ Method," in Proceedings of the 2nd International Workshop on Computer-Aided Design of User Interfaces CADUI'96 (Namur, 5-7 June 1996), J. Vanderdonckt (Ed.), Presses Universitaires de Namur, Namur, 1996.
18. Neches, R., et al., "Knowledgeable Development Environments Using Shared Design Models," in Proceedings of 1993 International Workshop on Intelligent User Interfaces, pp. 63-71, 1993.
19. Pangoli, S. and Paterno, F., "Automatic Generation of Task-oriented Help," in Proceedings of the ACM Symposium on User Interface Software and Technology (UIST'95), pp. 181-187, 1995.
20. Rosson, M. B., and Carroll, J. M., "Integrating Task and Software Development for Object-Oriented Applications," in Rosson, M. B. and Nielsen, J. (Eds.), Proceedings of Human Factors in Computing Systems, CHI'95, ACM Press, 1995, pp. 377-384.
21. van Harmelen, M., et al., "Object Models in User Interface Design: A CHI'97 Workshop," SIGCHI Bulletin 29(4), October 1997.
22. Sukaviriya, P. and Foley, J., "Coupling a UI Framework with Automatic Generation of Context-Sensitive Animated Help," in Proc. of the ACM Symp. on User Interface Software and Technology (UIST'90), 1990.
23. Sukaviriya, P., Foley, J. D., and Griffith, T., "A second generation user interface design environment: the model and the runtime architecture," in Proceedings of INTERCHI'93, pp. 375-382, 1993.
24. Szekely, P., Luo, P. and Neches, R., "Facilitating the Exploration of Interface Design Alternatives: The HUMANOID Model of Interface Design," in CHI '92 Conference Proceedings, pp. 489-498, 1992.
25. Szekely, P., Sukaviriya, P., Castells, P., Muthukumarasamy, J., Salcher, E., "Declarative interface models for user interface construction tools: the Mastermind approach," in Engineering for Human-Computer Interaction, L. Bass and C. Unger (Eds.), Chapman & Hall, 199

On the Role of Activity Diagrams in UML
A User Task Centered Development Process for UML*

B. Paech
Institut für Informatik, Technische Universität München
Arcisstr. 21, D-80290 München
++49/89-28928186 ++49/89-28928183
[email protected]

Abstract.

Activity diagrams can be used to describe internal processing as well as action-object flow. Since they do not focus on events and object interaction, it is not clear how to combine them with the typical object-oriented diagrams like class and statechart diagrams. In this paper we propose to use activity diagrams as a bridge between use case diagrams and class diagrams. This gives three benefits: a smooth transition from business processes to use cases, an abstract specification of complex object interactions, and a succinct description of system functions affecting several objects. This use of activity diagrams is embedded in an overall software development process characterized by a focus on user tasks during analysis and by incremental class diagram development.

Keywords Analysis and Design, Use Cases, Activity Diagram

1 Introduction

UML defines a set of graphical diagrams providing multiple perspectives of the system under analysis or development. As recognized in the UML Summary, version 1.1 [UML1.1], activity diagrams play a special role among these diagrams. They do not correspond to the typical object-oriented techniques from the predecessor methods Booch [Boo94], OMT [RBP+91], and OOSE [Jac92]. In particular, they incorporate concepts from data flow diagrams used in structured methods. Usually it is claimed that structured and object-oriented concepts do not fit well together, since object-orientation focusses on event flow and object interaction, while structured methods focus on data and control flow between processes. Thus, it is not clear how and when to use activity diagrams in an object-oriented software development process.

* This work was funded by the Forschungsverbund ForSoft supported by the Bayerische Forschungsstiftung.

J. Bézivin and P.-A. Muller (Eds.): «UML»’98, LNCS 1618, pp. 267-277, 1999. © Springer-Verlag Berlin Heidelberg 1999

In this paper we propose to use activity diagrams as a bridge between use case diagrams and class diagrams. First, activity diagrams are used to describe the flow between the system and the actors within the use cases. We call these descriptions work processes. They make the user tasks and the division of work between user and system explicit. Work processes can be viewed as a refinement of business process descriptions.

Second, activity diagrams are attached to the system to describe the data effects of system functions without fixing the object interaction necessary to achieve the effects. We call these descriptions function processes. They are particularly helpful for a succinct description of functions affecting several objects. Typically, there is a wide variety of possible designs for such functions, since the control can be attributed to several entity or interface objects, as well as to a separate control object [Jac92]. Thus, it is often difficult to recover the effects of such functions from the class diagram.

Along with the development of work and function processes, the class diagram can be developed incrementally. On the level of work processes, a class diagram without operations and navigability indication is used. This preliminary class diagram serves as an illustration of the terminology, making explicit the major dependencies between different entities. On the level of function processes, navigability is added. This is necessary in order to determine which data is affected by a system function. Otherwise it is not clear, for example, whether the update of a binary association affects one or two references. Finally, operations corresponding to the activities of the function processes are added to the object model.
The rationale behind this use of activity diagrams is a software development process focusing on user tasks in the early phases and a smooth transition from task-oriented diagrams to class diagrams. There is evidence that class diagrams are not suitable for requirements engineering, since they are not intuitive for the users (cf. [Moy94]). While use cases are easy for the users to understand, they lack methodical guidance for their integration with class diagram development [WPJH98]. In our view, work and function processes are an adequate enhancement of use cases, since they are also intuitive for users but more accurate, and since they give methodical guidance by separating work issues from data effect considerations.

The paper is structured as follows: in section 2 we introduce work and function processes by way of an example and discuss necessary extensions to activity diagrams. We show how to develop the class diagram in parallel with work and function processes. In section 3 we sketch the user-oriented software development process in which work and function processes should be embedded. Section 4 contains the conclusions. Related work is discussed along the way.

2 Work and Function Processes

In this section we introduce work and function processes by way of the ubiquitous library example. In particular, we treat the book return use case whose textual description is shown in Figure 1. Use cases contain a lot of details about the interaction of the users and the system. This makes it difficult to determine whether all important information has been captured. Also, it is a very big step to the class diagram, in which the behaviour described in the use cases has to be distributed between the classes. Thus, we propose to use two intermediary activity diagrams, namely work and function processes, to separate the information of use cases. Along with these intermediary levels the class diagram can be developed incrementally.

In the example, we start with the textual use case, which is then captured in work and function processes. Of course, one could also imagine the reverse, where work and function processes are developed first and then enriched to the full textual use case. Which description should be used depends on the project and especially on the user community. For visionary use cases, the textual description might be easier to understand. If complex user task performance is to be captured, work and function processes might be more adequate, because they separate several concerns into different descriptions.

The course of events starts when a reader hands a book to the librarian to return it to the library. The librarian enters the book number. The system retrieves the title and author of the book, as well as the reader identity, for the librarian to acknowledge that the correct book is returned from the correct reader. In reaction to the acknowledgement of the librarian the system updates book and reader data and checks whether the book has been reserved. If so, an Email message is sent to the owner of the reservation. Finally, the librarian is notified of the success of the whole transaction.

Figure 1: Book Return Use Case

2.1 Work Process and Problem Domain Class Diagram

A work process is an activity diagram describing the division of work between the user and the system as captured in the use case. Figure 2 shows the work process corresponding to the book return use case. As usual in activity diagrams, we use swimlanes to separate the activities of different actors. Control flow is shown by a solid arrow, object flow by a dashed arrow. As an extension of activity diagrams, we use one swimlane for the whole system. Also, we separate object and control flow. Additionally, we allow messages as labels on the control flow arrows, similar to sequence diagrams. It is not possible to use events (in particular the reception of a message, as is typical for statechart diagrams) instead of messages, since we label an arrow between two swimlanes A and B with the message sent from A to B and not with the message received by A.

The Book Return work process consists of the Return Book activity of the user, from which the Book flows to the Librarian. The Librarian accepts the Book and commands the Software System to execute the corresponding update. The System checks whether a Reservation exists and, if so, notifies the next Reader. The success of the transaction is signaled to the Librarian, who in turn acknowledges the Return to the Reader.
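To make the notation concrete, the work process can be written down as plain data. The following Python sketch is only one reading of the prose description above, not a transcription of Figure 2: the type names `Activity` and `Flow`, the function `messages_between_lanes`, and the exact set of arrows are assumptions made for illustration. It keeps object flow and control flow apart and extracts the messages that label control flow arrows crossing swimlanes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Activity:
    name: str
    swimlane: str  # actor performing the activity; one lane for the whole system


@dataclass(frozen=True)
class Flow:
    source: str
    target: str
    kind: str                    # "control" (solid arrow) or "object" (dashed arrow)
    label: Optional[str] = None  # message sent (control) or object passed (object)
    guard: Optional[str] = None


# Assumed reading of the Book Return work process.
ACTIVITIES = [
    Activity("Return Book", "Reader"),
    Activity("Accept Book", "Librarian"),
    Activity("Update", "Software"),
    Activity("Notify Reservation", "Software"),
    Activity("Acknowledge Return", "Librarian"),
    Activity("Accept Acknowledgement", "Reader"),
]

FLOWS = [
    Flow("Return Book", "Accept Book", "object", label="Book"),
    Flow("Accept Book", "Update", "control", label="Book Return (Book Data)"),
    Flow("Update", "Notify Reservation", "control", guard="reserved"),
    Flow("Update", "Acknowledge Return", "control", guard="not reserved"),
    Flow("Notify Reservation", "Acknowledge Return", "control"),
    Flow("Acknowledge Return", "Accept Acknowledgement", "control"),
]


def messages_between_lanes(activities, flows):
    """Return (sender lane, receiver lane, message) for labelled control
    flows that cross a swimlane boundary."""
    lane = {a.name: a.swimlane for a in activities}
    return [
        (lane[f.source], lane[f.target], f.label)
        for f in flows
        if f.kind == "control" and f.label and lane[f.source] != lane[f.target]
    ]
```

Each extracted triple names the sending lane first, which mirrors the convention that an arrow from swimlane A to swimlane B carries the message sent by A.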

[Figure: activity diagram with swimlanes Reader, Librarian, and Software. Activities: Return Book, Accept Book, Update, Notify Reservation, Acknowledge Return, Accept Acknowledgement. Message label: Book Return (Book Data). Guards: [reserved], [not reserved].]

Figure 2: Work Process Book Return

Figure 3 shows the problem domain class diagram corresponding to the work process. Similar to OOSE, we only use a preliminary class diagram without operations and navigation indication. Only the major entities and their important relationships should be shown. For the Book Return use case, these are the Book and the Reader connected by the borrowing relationship, and the Reservation connected to a unique Book and Reader. Note that we do not include the actors of the work process in the class diagram (e.g. the Librarian), but only the entities which are handled in the work process.

[Figure: class diagram with classes Reservation (attribute Date), Book (attribute Title), and Reader. Associations: "for" between Reservation (0...*) and Book (1), "for" between Reservation (0...*) and Reader (1), and "borrowed by" between Book (0...*) and Reader (0...1).]

Figure 3: Problem Domain Class Diagram

Work processes and the problem domain class diagram do not show interaction details like the separate acknowledgement of the updates by the librarian or class details like operations. Only the major activities, decisions and entities are shown. At this level one can experiment with different possibilities of work division between the user and the system. The similarity of work process diagrams to business process descriptions, e.g. [Sch92], allows for an easy transition from strategic business processes to work place descriptions. This is in line with the adaptation of the use case model to business processes in [JEJ94]. Based on a more detailed activity description, e.g. regarding priority, frequency, degree of freedom, resources and the like (cf. [BJ94]), the complete set of user activities can be analyzed to determine the adequacy of work design for human labour (e.g. [Uli94]).

2.2 Function Process and Navigation Class Diagram

A function process is an activity diagram describing the data effects captured in the use case. More specifically, function processes detail the software activities identified in the work processes. Figure 4 shows the function process for book return. Function processes do not contain swimlanes, since they describe only activities of one actor, namely the software system. Each activity describes one kind of data change. We do not associate the activities with the objects at this level, since the flow between the objects might be changed through the addition of control objects anyway. The behaviour of functions affecting several objects can be described more succinctly by concentrating on the activities. Again, we extend activity diagrams with labels for the control flow arrows. In this case they represent data dependencies between the activities, namely messages or object flow. In the Book Return function process, Reader and Reservation are data flowing from the Update Book activity to the other two activities.
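Because the arrow labels of a function process are data dependencies, an admissible execution order can be derived mechanically. The sketch below is an illustration only: the dictionary encoding and the function name `execution_order` are assumptions, while `graphlib.TopologicalSorter` is part of the Python standard library (3.9 and later). The dependencies follow the description of the Book Return function process given above.

```python
from graphlib import TopologicalSorter

# activity -> {prerequisite activity: data item carried by the arrow}
FUNCTION_PROCESS = {
    "Update Book": {},
    "Update Reader": {"Update Book": "Reader"},
    "Notify Reservation": {"Update Book": "Reservation"},
}


def execution_order(process):
    """Return one order in which every activity runs after the
    activities that supply its input data."""
    graph = {activity: set(deps) for activity, deps in process.items()}
    return list(TopologicalSorter(graph).static_order())


order = execution_order(FUNCTION_PROCESS)
```

Update Book always comes first in the result. The relative order of Update Reader and Notify Reservation is left open by the dependencies, which reflects the fact that the function process does not fix more sequencing than the data requires.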

[Figure: activity diagram with activities Update Book, Update Reader, and Notify Reservation; flow labels Book, Reader, and Reservation.]

Figure 4: Function Process Book Return

To understand the function processes, one has to know the navigation class diagram, which details the problem domain class diagram with navigation indication. Figure 5 shows the navigation class diagram for book return. In comparison with Figure 3, the borrowing relationship has been resolved into two references, while the relationships between Reservation and Book and Reader, respectively, have been resolved into one-way references from Book to its Reservations and from Reservation to the Reader who owns the Reservation.

[Figure: class diagram with classes Reservation (attribute Date), Book (attribute Title), and Reader. References: "reserved" (0...*) from Book to Reservation, "for" (1) from Reservation to Reader, "borrowed by" (0...1) from Book to Reader, and "has borrowed" (0...*) from Reader to Book.]

Figure 5: Navigation Diagram for Book Return

Because of the two-way reference between Book and Reader, the Book Return function process details the Update activity of the work process into Book and Reader Update. In principle, Book Update, Reader Update and Reservation Notification could be executed in parallel. However, as shown in the navigation class diagram of Figure 5, the reference to the borrowing Reader and the Reservations can only be gained from the Book object. Therefore, Update Book should be executed first.

Figure 6 shows two possible collaboration diagrams describing the data effects of book return through object interaction. There are essentially two possibilities: either the Book object triggers the Reader and the Reservation object, or there is another control object sequencing Book Update, Reader Update and Notification. In general, there are many more possibilities. This example demonstrates that function processes are more abstract than collaboration diagrams. While function processes only show the major data effects and their sequencing, in collaboration diagrams the data effects are mixed with object interaction details. In our view, collaboration diagrams are more adequate for design than for requirements capture and analysis.

[Figure: two collaboration diagrams. In one, Return(b) is sent to b: Book, which sends 1: RReturn(b) to r: Reader, 2: res := findfirst to :Reservations, and 3: Notify to res: Reservation. In the other, Return(b) is sent to c: Control, which sends 1: (res,r) := BReturn(b) to b: Book, 2: RReturn(b) to r: Reader, and 3: Notify to res: Reservation.]

Figure 6: Two Possible Collaboration Diagrams for Book Return

2.3 Analysis Class Diagram

Based on the function processes and the navigation class diagram, the analysis class diagram can be completed. In particular, control objects are added and operations associated with the classes. Each activity of the function processes leads to an operation at the class corresponding to the affected data. If no control object is added, these operations also contain the sequencing between the operations of the different classes. Otherwise, this is localized in the control object.

[Figure: class diagram with classes Return (operation Return), Reservation (attribute Date, operation Notify), Book (attribute Title, operation BReturn), and Reader (operation RReturn). Association labels: reserved (0...*), for (1), of (1), borrowed by (0...1), has borrowed (0...*).]

Figure 7: Analysis Class Diagram for Book Return

Figure 7 shows the analysis class diagram for the Book Return use case. In the example, we have added the control object with the central Return function. Book and Reader have auxiliary functions for updating their references according to the return. Use cases also contain information about interface objects which is not captured in the work and function processes and the corresponding class diagrams. In [Pae98a] we describe how to derive dialogue models from use cases which serve as input for the identification of interface classes and their operations. Here we only concentrate on the use of activity diagrams.
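As an illustration of how the analysis class diagram with a control object might look in code, consider the following Python sketch. It is a hedged reading of Figure 7 and not a prescribed design: the method names `b_return`, `r_return`, `notify`, and `return_book` are Python renderings of BReturn, RReturn, Notify, and Return, and the attribute names, the `notified` flag, and the choice to leave the reservation in the list are assumptions. The sequencing lives in the control object, and Update Book runs first because the borrowing Reader and the Reservations are reachable only from the Book.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)  # entities are compared by identity
class Reader:
    name: str
    has_borrowed: list["Book"] = field(default_factory=list)

    def r_return(self, book: "Book") -> None:  # RReturn
        self.has_borrowed.remove(book)


@dataclass(eq=False)
class Reservation:
    date: str
    for_reader: Reader
    notified: bool = False

    def notify(self) -> None:  # Notify
        self.notified = True  # stand-in for sending the e-mail message


@dataclass(eq=False)
class Book:
    title: str
    borrowed_by: Optional[Reader] = None
    reserved: list[Reservation] = field(default_factory=list)

    def b_return(self):  # BReturn: update the Book, hand back (res, r)
        reader, self.borrowed_by = self.borrowed_by, None
        first = self.reserved[0] if self.reserved else None
        return first, reader


class ReturnControl:
    """Control object that localizes the sequencing of the three steps."""

    def return_book(self, book: Book) -> bool:
        reservation, reader = book.b_return()  # 1: Update Book first
        if reader is not None:
            reader.r_return(book)              # 2: Update Reader
        if reservation is not None:
            reservation.notify()               # 3: Notify Reservation
        return reservation is not None         # True if someone was notified
```

A short usage example under the same assumptions:

```python
alice, bob = Reader("Alice"), Reader("Bob")
book = Book("UML", borrowed_by=alice)
alice.has_borrowed.append(book)
book.reserved.append(Reservation("1998-06-03", bob))
ReturnControl().return_book(book)
```

If no control object were added, the same three steps would move into `Book`, which would then call the Reader and the Reservation itself.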

2.4 Evaluation of Activity Diagrams

Above we have shown by way of example the use of activity diagrams to model work and function processes. These two uses are quite different. Work processes use swimlanes and are therefore somewhat similar to sequence diagrams. Function processes describe the behaviour of one actor. As discussed in [BHH+97], it is not possible to give a common formal semantics to both uses. By allowing data labels in function processes we have given them yet another flavour, more similar to the effect/correspondence diagrams of structured methods like SSADM [DCC92]. Also, for work processes we have separated object and control flow. In our view, these are the essential usages of data flow in the early development stages. Thus, while we only needed slight extensions to the UML notation for activity diagrams, a formal semantics of work processes and function processes requires further effort. However, this effort will be worthwhile, since we will show in the next section that both play an important role in the early phases of a task- and object-oriented software development process.

3 A Task- and Object-Oriented Software Development Process

UML only provides a notation, not a process. Of course, the development processes of the predecessor methods OOSE, OMT, and Booch can be adapted to the new notation. We are interested in a user-oriented process which allows for close interaction with the user in fixing the system functions supporting the user tasks and the user interface details. Class diagrams, as proposed by most object-oriented methods for requirements capture, are not suitable, since they do not support the notion of user task. As exemplified by the rich literature on work psychology (e.g. [Uli94], [Dia89]), task is the central notion for describing work places. This is the main reason for the success of use cases, which bring the concept of user task into object-orientation. However, as discussed in the last section, in our view textual use cases mix different levels of task description. While they give the user and the developers a good intuition about the system support for user tasks (this is called stakeholder requirements in [SS97]), they are not adequate as a system specification, which has to make explicit the specification of the individual system functions. This detailed system specification is important for the contract between developers and customers, and for project management issues like time and budget planning. Therefore, we propose in the following an adaptation of the OOSE process by work and function processes, where the function processes are part of the system specification. This process has not yet been applied in the development of an industrial software system, but the products of the process discussed in the following have proven very useful in the redocumentation of a legacy system.

Figure 8 depicts the major products of our task- and object-oriented software development process [Pae98b]. We only deal with business and requirements engineering, as well as the analysis stage. Similar to [JEJ94], we use the same kind of models for business and requirements engineering. The Business Use Case Model gives an overview of the services of the company to the customer. Business Processes represented as activity diagrams detail the business use cases. At this stage, one can already start the development of the Problem Domain Class Diagram. Business engineering is completed by identifying the actors involved in the software system use (called User Roles) and by fitting the scope of the software system into the overall IT-strategy of the company.

To identify the Stakeholder Requirements, on the one hand the prospective user community is classified according to their competences regarding IT-usage. On the other hand, business processes are detailed into Work Processes. This means that the user tasks identified in the business processes are divided between the software system and the users. As discussed in section 2, the activities of the work processes should be described in sufficient detail for an Evaluation of the Work Places for the user.

[Figure: products of the process, grouped by stage. Business Options: Business Use Case Model, Business Processes, Problem Domain Class Diagram, User Roles, Embedding into IT-Strategy. Stakeholder Requirements: User Properties, Work Processes, Activity Description, Work Evaluation. System Specification: Use Case Model, Function Processes, Navigation Class Diagram, Non-functional Requirements. Usage Design: Usage Scenarios, Prototyping, Data Views, Dialogue Models. Analysis Model: Analysis Class Diagram, Interaction Diagrams, Prototyping, Statechart Diagrams. User Interface Design: Interface Class Diagram, Interaction Diagrams, Prototyping, Statechart Diagrams.]

Figure 8: Task- and Object-Oriented Software Development

The System Specification collects the work processes into the Use Case Model. It also details the software activities of the work processes into Function Processes in combination with a Navigation Class Diagram. Of course, a textual system function description and a Non-Functional Requirements description, as standardized e.g. in [IEEE93], also have to be added. Based on the navigation class diagram, one can derive Data Views (cf. [Zie97]). These views are preliminary interface classes, but without operations. Similarly, Dialogue Models (cf. [Den92]) can be devised from the function processes. These dialogue models describe the possible user inputs to control system function execution depending on the data views presented to the user. To tailor the usage options to the specific needs of the users, Prototyping and Scenario Techniques as described in [Car95] should be used.

For the Analysis Model, the navigation class diagram is complemented with operations and control objects as described in section 2. To support the addition of the control objects, the realization of the function processes through object interaction can be examined with the help of Sequence and Collaboration Diagrams. Statechart Diagrams are used to describe the integration of several function processes within the different classes. Similarly, the User Interface Design consists of the Interface Class Diagram, which completes the data views identified in the usage design with operations. Again, Interaction Diagrams and Prototypes are useful for the examination of different realizations of the global dialogue models through object interaction. Statechart Diagrams again integrate the different dialogues within the interface classes. Based on the analysis and interface class diagrams, the OOSE design stage, as well as other object-oriented design stages, can be added to the development process.

In the process, we have not included the textual use case description of [Jac92]. The information of the textual use cases is separated into different models, namely the work processes and activity descriptions, function processes, usage scenarios and dialogue models. As discussed in section 2, one could also use only the textual use cases.
The separation into different models gives a guidance on how to develop the use case text incrementally and on how to check whether all important information has been captured by the use case descriptions.

4 Conclusions

In this paper we have shown how to make use of UML activity diagrams in a user-oriented software development process aiming at an object-oriented analysis and interface model. In particular, we incorporated new variants of data flow into the activity diagram. Based on that, we enhanced the notion of use cases by work processes and function processes, which separate work description from data effect description. This allows for an incremental development of the class diagram, as well as for an explicit system specification. The latter is missing in many object-oriented development methods [Dav95]. Work processes are inspired by business process and task descriptions. Function processes are inspired by structured methods. Both diagrams serve as a bridge between the task- and business-oriented requirements capture and the object-oriented analysis and interface model.


5 Literature

[BHH+97] R. Breu, U. Hinkel, C. Hofmann, C. Klein, B. Paech, B. Rumpe and V. Thurner, Towards a Formalization of the Unified Modeling Language, in ECOOP, LNCS 1241, pg. 344-366, Springer, 1997
[BJ94] A. Beck and Ch. Janssen, TASK - Technik der Aufgaben- und Benutzerangemessenen Software-Konstruktion, Technical Report, IAT, 1994
[Boo94] G. Booch, Object-Oriented Analysis and Design with Applications, Redwood City, 1994
[Car95] J.M. Carroll, Scenario-Based Design, John Wiley & Sons, 1995
[Dav95] A. Davis, Object-Oriented Requirements to Object-Oriented Design: An Easy Transition?, Journal of Systems and Software, 30, pg. 151-159, 1995
[Den92] E. Denert, Software-Engineering, Springer, 1992
[DCC92] E. Downs, P. Clare and I. Coe, Structured Systems Analysis and Design Method: Application and Context, Prentice-Hall, 1992
[Dia89] D. Diaper, Task Analysis for Human-Computer Interaction, Ellis Horwood Limited, 1989
[IEEE93] IEEE Std. 830-1993, Recommended Practice for Software Requirements Specification
[JEJ94] I. Jacobson, M. Ericsson and A. Jacobson, The Object Advantage: Business Process Reengineering with Object Technology, Addison-Wesley, 1994
[Jac92] I. Jacobson, Object-Oriented Software Engineering, Addison-Wesley, 1992
[Moy94] T. Moynihan, Objects versus Functions in User-Validation of Requirements: Which Paradigm Works Best?, in OOIS'94, pg. 54-73
[Pae98a] B. Paech, The Four Levels of Use Case Description, in REFSQ'98, 1998
[Pae98b] B. Paech, Aufgabenorientierte Softwareentwicklung, Habilitationsschrift, eingereicht an der TU München, April 1998
[RBP+91] J. Rumbaugh, M. Blaha, W. Premerlani, F. Eddy and W. Lorensen, Object-Oriented Modeling and Design, Prentice-Hall, 1991
[Sch92] A. Scheer, Architecture of Integrated Information Systems: Foundations of Enterprise Modelling, Springer, 1992
[SS97] I. Sommerville and P. Sawyer, Requirements Engineering - A Good Practice Guide, Wiley & Sons, 1997
[Uli94] E. Ulich, Arbeitspsychologie, Schaeffer-Poeschel Verlag, 1994
[UML1.1] G. Booch, J. Rumbaugh and I. Jacobson, The Unified Modeling Language for Object-Oriented Development, Version 1.1, 1997
[WPJH98] K. Weidenhaupt, K. Pohl, M. Jarke and P. Haumer, Scenario Usage in System Development, A Report on Current Practice, in ICRE'98, IEEE, 1998
[Zie97] J. Ziegler, Viewnet - Konzeptionelle Gestaltung und Modellierung von Navigationsstrukturen, in Software Ergonomie'97, pg. 343-350

Structuring UML Design Deliverables

Pavel Hruby
Navision Software a/s
Frydenlunds Allé 6, 2950 Vedbæk, Denmark
Tel.: +45 45 65 50 00, Fax: +45 45 65 50 01
Internet: www.navision.com (click services)
E-mail: [email protected]

Abstract. The idea of using Unified Modeling Language (UML) appeals to people, but actually using it can be challenging. Many would like to use UML for software development, but do not know how to structure design models and what the relationships between various UML diagrams are. This paper introduces a structure for design deliverables that can be used for software development with UML. The structure is based on a pattern of four deliverables describing classifier relationships, interactions, responsibilities and state machines. The pattern can be applied to different levels of abstraction and to different views on a software product. The paper also discusses practical considerations for documenting software design in the project repository as well as cases in which UML may not be the most appropriate notation to use. The conference presentation with speaker notes is available at this address: www.navision.com (click services).

1 Motivation

To define the behavior of your system, some methods suggest describing scenarios, and other methods suggest creating sequence diagrams. What is the correct approach? To answer this question, we must realize that there is a difference between a design deliverable and its representation. The deliverable determines the information about the software product, and the representation determines how the information is presented. For example, a lifecycle can be represented by a statechart diagram, an activity diagram or a state transition table. The system behavior mentioned above is determined by the system interaction model, the subsystem interaction model or the object interaction model. In UML, each of these models can be represented by a set of sequence diagrams or a set of collaboration diagrams. Useful design documentation is based on precisely defined deliverables¹, rather than on diagrams. This paper introduces a simple structure of design deliverables that traces design information and makes it possible to customize the size of the design specification. It can easily be extended in a predictable way if we want to specify information not covered in the UML Notation Guide and the available literature.

¹ In this paper I use the term deliverable to mean a unit of information about a software product. A deliverable has a representation, properties, responsibilities, attributes, methods and relationships to other deliverables. Some methodologists substitute the term deliverable by the term model or artifact. However, I want to stress that a deliverable is a piece of information about the software product. Deliverables are not, for example, a consistency check, process phase or activity, which are, however, process artifacts.

J. Bézivin and P.-A. Muller (Eds.): «UML» ’98, LNCS 1618, pp. 278–293, 1999. © Springer-Verlag Berlin Heidelberg 1999

2 Pattern of Four Deliverables

Software products can be described at various levels of abstraction and in various views. Some examples of levels of abstraction are the system level, the architectural level and the class level. Some examples of views are the logical view, the use case view and the component view. At each level of abstraction and in each view, the software product can be described by four artifacts: static relationships between classifiers, dynamic interactions between classifiers, classifier responsibilities and classifier state machines. Each of these artifacts can be represented either by UML diagrams or by text. The pattern is illustrated in Fig. 1.

[Figure 1: grid of views (Analysis, Use Case, Logical, Component, Testing, Reuse) against levels of abstraction (System, Architectural, Class, Procedural) for a software system; each cell contains the four linked deliverables Classifier Model, Classifier Interaction Model, Classifier and Classifier Lifecycle.]

Fig. 1. At each level of abstraction and in each view, the software product can be described by four deliverables. UML classifiers are class, interface, use case, node, subsystem and component.

The classifier model specifies static relationships between classifiers. The classifier model can be represented by a set of static structure diagrams (if classifiers are subsystems, classes or interfaces), a set of use case diagrams (if classifiers are use cases and actors), a set of deployment diagrams (if classifiers are nodes) or a set of component diagrams in their type form (if classifiers are components). The classifier model can also be represented by tables.


The classifier interaction model specifies interactions between classifiers. The classifier interaction model can be represented by sequence diagrams or collaboration diagrams. The UML Notation Guide describes only interaction diagrams in which classifiers are objects; it does not describe interaction diagrams in which classifiers are use cases, subsystems, nodes or components. These diagrams are discussed in section 6 of this paper. The deliverable called classifier specifies classifier responsibilities, roles, and static properties of classifier interfaces (for example, a list of classifier operations with preconditions and postconditions). Classifiers can be represented by structured text, for example, in the form of a CRC card. The classifier lifecycle specifies the classifier state machine and dynamic properties of classifier interfaces (for example, the allowable order of operations and events). The classifier lifecycle can be represented by a statechart diagram, an activity diagram, a state transition table or Backus-Naur form (see reference [7]).

The models in Fig. 1 represent types of deliverables. They define the structure and the relationships of deliverable instances, which contain the actual information about the software product. A model can consist of a large number of deliverable instances. For example, a class model can consist of several static structure diagrams, each of them representing a small part of the system structure; a system interaction model can consist of many interaction diagrams describing various usage scenarios. An instance of the classifier model can be linked to several instances of the classifier interaction model. All of these instances are linked to instances of the classifier. An instance of the classifier is linked to an instance of the classifier lifecycle. See reference [3] for more information about object-oriented deliverable models.
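The links between deliverable instances described above can be sketched as a small data structure. The following Python sketch is an illustration only, not part of the method itself: the class and field names, the optional lifecycle and the helper `unlinked_participants` are assumptions introduced here, and the sample classifier names are borrowed from Fig. 10 with invented responsibilities.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ClassifierLifecycle:
    """State machine and dynamic interface properties of one classifier."""
    representation: str  # e.g. "statechart diagram" or "state transition table"


@dataclass
class Classifier:
    """Responsibilities, roles and static interface properties."""
    name: str
    responsibilities: List[str] = field(default_factory=list)
    # Assumption: the lifecycle is optional for classifiers without behavior.
    lifecycle: Optional[ClassifierLifecycle] = None


@dataclass
class ClassifierInteractionModel:
    """One interaction diagram, for example one usage scenario."""
    scenario: str
    participants: List[Classifier] = field(default_factory=list)


@dataclass
class ClassifierModel:
    """One static structure diagram with its linked deliverable instances."""
    name: str
    classifiers: List[Classifier] = field(default_factory=list)
    interactions: List[ClassifierInteractionModel] = field(default_factory=list)

    def unlinked_participants(self) -> List[str]:
        """Names used in linked interaction models but absent from this model."""
        known = {c.name for c in self.classifiers}
        return sorted({p.name
                       for i in self.interactions
                       for p in i.participants
                       if p.name not in known})


# Hypothetical sample data (names from Fig. 10, responsibilities invented).
form = Classifier("Form", ["display rows"], ClassifierLifecycle("statechart diagram"))
row_set = Classifier("RowSet", ["hold the rows being edited"])
paging = ClassifierInteractionModel("page down", [form, row_set])
model = ClassifierModel("presentation classes", [form], [paging])

# RowSet takes part in a scenario but is missing from the static model.
print(model.unlinked_participants())
```

Keeping the four deliverables as separate types mirrors the distinction between a deliverable and its representation: the `representation` field can change without affecting the links.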

3 Applying the Pattern

Figs. 2 and 3 show the pattern applied in the use case, logical, component and deployment views, because UML is intended primarily for use in these areas. The logical view describes the logical structure of the product in terms of objects and classes, their responsibilities, relationships and interactions. The use case view identifies collaborations of the system, subsystems, classes, components and nodes with each other. The analysis view describes design suggestions in terms of analysis objects, their responsibilities, relationships and interactions. Unlike the software entities in other views, the software entities in the analysis view do not specify the design of the product. The purpose of the analysis view is to record requirements or to record alternative solutions to design problems. Analysis objects and classes may, but need not, correspond to logical or physical software entities existing in the product. The component view describes the physical structure of the software system in terms of software modules, their responsibilities, relationships and interactions. The deployment view describes the physical structure of the software system in terms of physical devices, their responsibilities, relationships and interactions. The reuse view describes reusable elements, their responsibilities, relationships, interactions and lifecycles.

Many large systems are described in additional views. The test view describes the structure of the tests and test suites, their responsibilities, algorithms, relationships and interactions. The user documentation view describes the structure of the help system in terms of documents (Help pages), their responsibilities, relationships and interactions.

In Figs. 1, 2 and 3, the product is described at four levels of abstraction: the system, architectural, class and procedural levels. Section 5 discusses application of the pattern at several other levels of abstraction and in other views on the software product. The system level describes the context of the system. It specifies the responsibilities of the system being designed and of the systems that collaborate with it; the responsibilities of physical devices and software modules outside the system; and the static relationships and dynamic interactions between them and the system being designed. The architectural level describes subsystems, software modules and physical devices inside the system and their static relationships and dynamic interactions. The class level describes classes and objects, their relationships and interactions, and the procedural level describes procedures and their algorithms.

Many large systems have additional abstraction levels, which, for the sake of simplicity, are not shown in Figs. 1, 2 and 3. For example, systems with layered architecture have an extra tier level between the system level and the architectural level. The tier level specifies system layers, their relationships and interactions. In a layered system each layer contains subsystems and components, which are specified at the architectural (subsystem) level.

As an example, the text in the following paragraphs describes deliverables and their relationships at the architectural level. At all other levels of abstraction, the pattern is applied in a very similar way. The only exception is the procedural level, which does not contain the procedure model (relationships between procedures) or the procedure interaction model (interactions between procedures).
The reason for the absence of these models is the principle of object-oriented design, in which the class model and the object interaction model replace procedure relationships and procedure interactions, respectively.

The subsystem model, subsystem component model, and subsystem node model specify static relationships between subsystems, software modules and physical devices inside the system. The subsystem use case model describes use cases with subsystem scope and their relationships to collaborating subsystems. The subsystem use case model specifies how the subsystem, its software modules and physical devices collaborate with other subsystems or external actors. The dependency with the stereotype «collaborations» in Figs. 2 and 3 indicates that the use case model specifies collaborations² of subsystem, component and node.

The subsystem interaction model, subsystem component interaction model and subsystem node interaction model describe interactions between subsystems, interactions between software modules and interactions between nodes inside the system. The dependency with the stereotype «instance» in Figs. 2 and 3 indicates that interactions specified in these models are instances of subsystem use cases.

The deliverables subsystem, component and node specify responsibilities of subsystems, software modules and physical devices inside the system. These deliverables also specify the roles and static properties of the subsystem, component and node interfaces (for example, a list of operations and events). A dependency with the stereotype «refine» indicates that the deliverables class model, object interaction model, class lifecycle and class represent the detailed design of the subsystem.

² UML 1.1 does not have any specific symbol for collaboration. Therefore, in this article I assume that collaborations are specified by use cases.

The subsystem lifecycle, subsystem component lifecycle and subsystem node lifecycle specify the behavior of subsystems, software modules and physical devices inside the system. In particular, they specify dynamic properties of their interfaces, for example, the allowable order of their operations and events.
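The "allowable order of operations and events" that a lifecycle specifies can be written down as a state transition table, one of the representations mentioned in section 2. The sketch below is a minimal illustration under stated assumptions: the states and operations of the database subsystem are invented for the example, and `is_allowed` is a hypothetical helper, not something defined by UML or by this paper.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical lifecycle of a database subsystem:
# (current state, operation) -> next state.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("Closed", "open"): "Open",
    ("Open", "read"): "Open",
    ("Open", "update"): "Open",
    ("Open", "close"): "Closed",
}


def is_allowed(operations: Iterable[str], start: str = "Closed") -> bool:
    """Return True if the operations may occur in the given order."""
    state = start
    for operation in operations:
        key = (state, operation)
        if key not in TRANSITIONS:
            # The lifecycle has no transition for this operation in this state.
            return False
        state = TRANSITIONS[key]
    return True


print(is_allowed(["open", "read", "update", "close"]))  # a permitted order
print(is_allowed(["read", "open"]))                     # read before open
```

A table like this captures only the dynamic properties of the interface; the static properties (the list of operations itself) belong to the subsystem deliverable.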

[Figure 2: deliverables in the logical view and the use case view at the system, architectural, class, procedural and code levels. Each level holds a cluster of four deliverables per view (for example Subsystem Model, Subsystem Interaction Model, Subsystem and Subsystem Lifecycle), connected by dependencies with the stereotypes «refine», «realize», «instance» and «collaborations»; the procedural level holds only Procedure and Procedure Lifecycle, and the code level holds Source Code.]

Fig. 2. Deliverables describing the software product in use case and logical views.

The subsystem use case describes responsibilities of a use case with subsystem scope. This deliverable specifies static properties of the use case, for example, the use case goal, pre- and postconditions, a list of subsystem operations that are called within this use case, or a list of objects and attributes that are accessed or modified by the use case. The dependency with the stereotype «instance» indicates that interaction models at the subsystem level represent instances of the subsystem use case. The dependency with the stereotype «realize» indicates that a cluster of four deliverables at the class level represents the realization of the subsystem use case.

The subsystem use case lifecycle specifies the behavior of the subsystem within the scope of the use case. It specifies subsystem state transitions and the allowable order of subsystem operations and events that are relevant for this use case. The use case lifecycle can divide a potentially complex lifecycle of the subsystem into several simpler lifecycles of subsystem use cases. The scope of the use case lifecycle is limited to a particular use case, in contrast to the subsystem lifecycle, which completely describes the behavior of the entire subsystem. Another difference is that the subsystem lifecycle is associated with the subsystem, while the use case lifecycle is associated with the use case.

[Figure 3: deliverables in the component view, the use case view and the deployment view at the system and architectural levels (for example System Component Model, System Component Interaction Model, System Component and System Component Lifecycle, with the corresponding node deliverables), connected by dependencies with the stereotypes «refine», «realize», «instance» and «collaborations».]

Fig. 3. Deliverables describing the software product in component and deployment views.

The subsystem use case interaction model specifies typical sequences of use case instances. In contrast to the subsystem, component and node interaction models, where a scenario is described as a sequence of messages, the use case interaction model describes the scenario as a sequence of use cases. This model is the only UML deliverable that can describe a scenario consisting of other scenarios. This deliverable also differs from the use case lifecycle. The use case lifecycle completely describes the subsystem behavior within the use case, and it is related to the subsystem use case. The use case interaction model describes only typical scenarios, consisting of subsystem use cases, and it is related to the subsystem use case model. There are more details about the use case interaction model in section 6.2.

The system of deliverables discussed in this section can be simplified in various ways. Typically, instances of deliverables are separate documents. However, there might be pragmatic reasons for creating documents containing several closely related deliverables. For instance, classifier responsibilities and state machines are always closely related and can be joined into one document (Fill pattern A in Fig. 4). It is also possible to join system, subsystem and class use case models into one use case diagram (Fill pattern B in Fig. 4), provided that use case levels and relationships between use cases and other deliverables are clearly distinguished. Similarly, component and node models at all levels can be joined into one implementation diagram document, provided that levels of components and nodes are distinguished. It might also be reasonable to create one static structure model within each level and show static relationships between use cases, actors, subsystems, classes, components and nodes in one diagram (Fill pattern C in Fig. 4), although the UML Notation Guide does not mention such a combined static structure diagram.

[Figure 4: the deliverables of the use case, logical, component and deployment views at the system, architectural, class, procedural and code levels, with fill patterns A, B and C marking the groups of closely related deliverables that can be joined into one document.]

Fig. 4. Several ways to simplify the structure by joining closely related deliverables.

The UML system of diagrams is not orthogonal. In other words, the same information can be specified in two or more different UML diagrams. For example, both the static structure diagram and the object collaboration diagram specify relationships between objects, and both statecharts and interaction diagrams specify messages between objects. Because the same information can be specified in several places, models either have to be checked for consistency, or users must produce only a certain subset of the deliverables identified in Figs. 2 and 3.
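One form such a consistency check could take is sketched below: every message between two classes in an interaction diagram should be backed by a relationship between those classes in the static structure diagram. This is only an illustration under stated assumptions; the function name, the data layout and the sample relationships and messages are hypothetical and are not taken from any UML tool or from the figures.

```python
from typing import FrozenSet, List, Set, Tuple

# (sender class, receiver class, operation)
Message = Tuple[str, str, str]


def unsupported_messages(relationships: Set[FrozenSet[str]],
                         messages: List[Message]) -> List[Message]:
    """Messages whose sender and receiver are unrelated in the static structure."""
    return [m for m in messages
            if m[0] != m[1] and frozenset((m[0], m[1])) not in relationships]


# Hypothetical static structure: Form is associated with RowSet only.
relationships = {frozenset(("Form", "RowSet"))}

# Hypothetical messages taken from an interaction diagram.
messages = [
    ("Form", "RowSet", "GetRow"),
    ("Form", "Database", "Update"),
]

# The second message has no supporting relationship in the static structure.
print(unsupported_messages(relationships, messages))
```

The alternative named in the text, producing only a subset of the deliverables, avoids the need for such checks at the cost of less redundancy in the documentation.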

4 Structuring Design Deliverables

In well-structured design documentation, the required information about software products can be easily located and closely related information is linked together. Such documentation also gives an overview of the completeness of the documentation and of the consistency between deliverables. This section proposes three rules that help to structure project deliverables in an efficient way. The rules are based on the relationships between the deliverables identified in sections 2 and 3.

The first rule is that the relationships among the four deliverables in the pattern shown in Fig. 1 are the closest relationships between deliverable instances. For example, an instance of the class model is linked to several instances of the object interaction model. All of them are linked to several instances of the class, and each instance of the class is linked to an instance of the class lifecycle. Structuring deliverables in this way provides an overview of the product within the scope of the level of abstraction and the view. However, this rule is not sufficient in cases in which some of the models consist of large numbers of deliverable instances. In such cases, the following two rules, which describe relationships crossing levels of abstraction and views, must be applied.

[Figure 5: associations at the system and architectural levels between System Use Case Model, System Use Case, System and System Interaction Model, labeled system collaborations, system responsibility, use case responsibility, use case instance and use case realization (top); an example projection as a document tree containing System Collaborations, Package of System Use Cases, System Responsibility, System Use Case Model, System Use Case, System Interaction Model, Subsystem Model, Subsystem Interaction Model, Subsystem Responsibility and Subsystem State Model (bottom).]

Fig. 5. Structuring deliverables according to collaborations specified in the use case model. Associations between deliverables are at the top of the figure and an example of their projection is at the bottom of the figure.

The second rule structures deliverables according to collaborations. These relationships are shown in Fig. 2 and Fig. 3 as dependencies with the stereotypes «instance», «realize» and «collaborations». In Fig. 5, these dependencies are refined to associations because associations are more descriptive than dependencies. For example, the system use case model contains a package of use cases. This package is linked to the deliverable system, which specifies the system responsibility in the scope of this use case package. Responsibility of each use case in the package is specified in the use case. Instances of these use cases are shown in the system interaction model, and their realizations are specified in the logical, implementation and deployment views as a cluster of four deliverables at the architectural level.


Structuring deliverables according to collaborations (their relationships to a use case) is useful for understanding the system functionality in a particular context. However, it can make it difficult to see the overall structure and functionality of the system, component or class. Therefore, the third rule structures design deliverables according to their refinement between levels of abstraction. These relationships are shown in Fig. 2 and Fig. 3 as dependencies with the stereotype «refine», and in Fig. 6 these dependencies are refined to associations between deliverables. For example, system responsibilities and system interfaces are defined in the deliverable system. The subsystem model specifies the static structure of the system, and the subsystem interaction model specifies the design of each operation in the system interface in terms of subsystem interactions. The dependency «conform» indicates that the operation design has to match the dynamic properties of the system interface specified in the system lifecycle.

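[Figure 6: associations between System, System State Model, System Operation, Subsystem Model and Subsystem Interaction Model, labeled responsibility, static structure and operation realization, with a «conform» dependency (top); an example projection as a document tree containing System Refinement, System Responsibility, System Operation, Subsystem Interaction Model, Subsystem Model, Subsystem Responsibility and Subsystem State Model (bottom).]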

Fig. 6. Structuring deliverables according to their refinement between levels of abstraction. Associations between deliverables are at the top of the figure and an example of their projection is at the bottom of the figure.

All three rules (relationships within the view and level of abstraction, collaborations, and refinement between levels of abstraction) can be combined if a project repository uses these rules as indexes. If project documentation is saved in a version control system with a single index, or if the documentation is paper based, then a designer must choose one of these rules. Typically, it is useful to structure high-level documents according to the collaborations and low-level documents according to their refinements.
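A repository that uses all three rules as indexes could be sketched as follows. The code is a minimal illustration, assuming a hypothetical in-memory repository: the `Deliverable` fields, the `Repository` class and the use case name "Browse rows" are invented for the example, while the deliverable names follow Figs. 2, 5 and 6.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import DefaultDict, List, Optional, Tuple


@dataclass(frozen=True)
class Deliverable:
    name: str
    level: str                       # rule 1: level of abstraction
    view: str                        # rule 1: view
    use_case: Optional[str] = None   # rule 2: collaboration it belongs to
    refines: Optional[str] = None    # rule 3: deliverable it refines


class Repository:
    """Keeps one index per structuring rule."""

    def __init__(self) -> None:
        self.by_cell: DefaultDict[Tuple[str, str], List[str]] = defaultdict(list)
        self.by_use_case: DefaultDict[str, List[str]] = defaultdict(list)
        self.by_refined: DefaultDict[str, List[str]] = defaultdict(list)

    def add(self, d: Deliverable) -> None:
        self.by_cell[(d.level, d.view)].append(d.name)
        if d.use_case is not None:
            self.by_use_case[d.use_case].append(d.name)
        if d.refines is not None:
            self.by_refined[d.refines].append(d.name)


repo = Repository()
for d in [
    Deliverable("System", "system", "logical"),
    Deliverable("System Use Case", "system", "use case", use_case="Browse rows"),
    Deliverable("System Interaction Model", "system", "logical",
                use_case="Browse rows"),
    Deliverable("Subsystem Model", "architectural", "logical", refines="System"),
    Deliverable("Subsystem Interaction Model", "architectural", "logical",
                use_case="Browse rows", refines="System"),
]:
    repo.add(d)

print(repo.by_cell[("architectural", "logical")])  # rule 1
print(repo.by_use_case["Browse rows"])             # rule 2
print(repo.by_refined["System"])                   # rule 3
```

With a single-index store, such as a directory tree, only one of the three dictionaries can be mirrored in the layout, which is why a designer then has to choose one rule.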

5 Other Applications of the Pattern

The pattern can be applied in different areas to describe various aspects of the system. This section discusses application of the pattern in designing software tests, user documentation and user interfaces.

5.1 Testing

The pattern can be used for designing tests. Deliverables in the test view are the test model (static relationships between tests), the test interaction model (interactions between tests), the test case (description of the test), and the test algorithm (test lifecycle describing the test algorithm). Test deliverables can be described at various levels such as the test suite level, the test level and the test script level. Deliverables at the test suite level are the test suite (a set of tests), the test suite lifecycle (the sequence of tests run within a test suite), the test suite model (static relationships between test suites) and the test suite interaction model (interactions between test suites). The dependency with the stereotype «trace» in Fig. 7 indicates that test cases can be based on use cases.

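[Figure 7: test view with the Test Suite and Test Suite Activity Model at the test suite level, refined into the Test Model, Test Interaction Model, Test Case and Test Algorithm at the test level and the Test Script at the script level; a «trace» dependency links the test case to a Use Case in the use case view.]

Fig. 7. Deliverables for test design.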

5.2 User Documentation

The pattern can be used for designing online user documentation. Documents (pages in online Help or Internet pages) are shown as stereotyped components in UML. Deliverables for designing user documentation are the document model (static relationships between documents), the document interaction model (typical scenarios that arise in searching for particular information), document responsibility (short descriptions of document purpose and its contents) and document lifecycle (if the document has behavior). Deliverables for user documentation can also be described at various levels: the book level, the document level and the text level.

[Figure 8: Document Model, Document Interaction Model, Document Responsibility and Document State Model at the document level, refined into Text at the text level.]

Fig. 8. Design of user documentation.

5.4 User Interface

The pattern can be used for documenting user interface design. Screens (windows) can be shown as stereotyped classes in UML. Deliverables for designing the user interface are the screen model (static relationships between screens), the screen interaction model (typical sequences of activation of screens), the screen (responsibility of a screen with a drawing, for example), and the screen lifecycle (if the screen has behavior). The dependency with the stereotype «instance» in Fig. 9 indicates that screen interactions are instances of use cases.

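[Figure 9: user interface view with the Screen Model, Screen Interaction Model, Screen and Screen State Model; an «instance» dependency links the screen interaction model to a Use Case in the use case view.]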

Fig. 9. Deliverables for design of user interface.

6 Less Common UML Diagrams

Fig. 2 and Fig. 3 show several models that can be represented by UML, although diagrams of them are not explicitly mentioned in the UML Notation Guide (see reference [5]). They are the use case interaction model, the subsystem interaction model, the node interaction model and the component interaction model. These models can be represented by sequence or collaboration diagrams in which classifier roles are use case, subsystem, node and component roles.

In UML 1.1, classifier roles in sequence and collaboration diagrams are shown as objects. This might lead to confusion in cases of interactions between classifiers of different kinds. For example, the symbols on a collaboration diagram that represents interactions between an object, a subsystem and a component are all shown as objects. Sequence and collaboration diagrams would be easier to understand if the object symbol representing the classifier role were replaced by the symbol of the actual classifier, as shown in Figs. 10 and 11.

6.1 Interaction Diagrams for Subsystem, Component and Node Interactions

Interaction diagrams for subsystem, component and node interactions are sequence and collaboration diagrams in which classifiers are subsystems, components and nodes. These diagrams represent interactions between subsystem, component and node instances, without it being necessary to specify the actual objects that send or receive messages. Fig. 10 shows a collaboration diagram representing interactions between objects and subsystems.

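[Figure 10: collaboration diagram with the participants «actor» User, «subsystem» Presentation Object Manager, Form, RowSet, «subsystem» Database and «utility» MS Windows, and the messages 1: PageDown, 1.1: Row=GetRow, 1.2: Update(Row), 1.2.1: Paint and 1.3: Update.]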

Fig. 10. Collaboration diagram representing subsystem interaction model. The notation is modified UML. In UML 1.1, all symbols are replaced by rectangles.
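
The nested sequence numbers on the messages in Fig. 10 determine the order in which the messages are sent. As a minimal sketch, the message labels from the figure can be put into that order as follows. The helper names `sequence_key` and `order_messages` are illustrative assumptions and are not part of UML or of the notation proposed here.

```python
from typing import List, Tuple

# Illustrative sketch: order collaboration-diagram messages by their
# nested sequence numbers. The helper names are hypothetical.


def sequence_key(label: str) -> Tuple[int, ...]:
    """Turn a label such as '1.2.1: Paint' into the sortable tuple (1, 2, 1)."""
    number, _, _ = label.partition(":")
    return tuple(int(part) for part in number.strip().split("."))


def order_messages(labels: List[str]) -> List[str]:
    """Return the labels sorted by nested sequence number."""
    return sorted(labels, key=sequence_key)


# Message labels as they appear in Fig. 10 (spacing normalized).
FIG_10_MESSAGES = [
    "1.2: Update(Row)",
    "1: PageDown",
    "1.3: Update",
    "1.1: Row=GetRow",
    "1.2.1: Paint",
]

if __name__ == "__main__":
    for label in order_messages(FIG_10_MESSAGES):
        print(label)
```

Comparing tuples rather than strings keeps a nested message such as 1.2.1 directly after 1.2 and before 1.3, and it would also order a hypothetical message 1.10 after 1.9.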

6.2 Diagrams for Use Case Interactions

Use case interaction diagrams are sequence and collaboration diagrams in which classifier roles are use case roles. This type of diagram can represent scenarios consisting of sequences of use cases. An actor can use a system in a way that initiates use cases in a particular order. Such a scenario (a sequence of use cases) can provide useful information about the system, and it can be shown in use case interaction diagrams.

Use cases in UML can interact only with actors and not with each other. Consequently, use cases in UML are always initiated by a signal from the actor. The label invoke in Fig. 11 therefore means that an actor can invoke a use case while executing another use case. Invocations on the diagram map to signals from an actor to a use case and to static relationships between use cases: generalizations «uses» and «extends», dependencies «invokes» and «precedes», or constraints {invokes} and {precedes}.

Please note that the order of use case instances belonging to the use case package can also be specified in the lifecycle of this use case package. The lifecycle of the use case package is represented by a state diagram or activity diagram in which states or action states map to the use cases at the lower level of abstraction. However, there is a significant difference between the use case interaction diagram and the use case package lifecycle. The use case package lifecycle (an activity diagram) completely describes the behavior of the use case package. The lifecycle is precise; however, it can be difficult to develop correctly, especially in complex cases. The use case interaction model describes only typical scenarios consisting of subsystem use cases.

[Figure: a sequence diagram and a collaboration diagram in which the actor Customer invokes the use cases "Customer requests an item", "Company ships an item", "Customer pays for an item" and "Customer returns an item". The invocations are labeled 1: invoke, 1.1 [request OK]: invoke, 2 [customer not satisfied]: invoke and 3: invoke; the use cases are related by «extends» and {precedes}.]

Fig. 11. Example of sequence and collaboration diagram representing use case interaction model. The notation is modified UML. In UML 1.1, ellipses are replaced by rectangles.
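
A {precedes} constraint restricts the order in which an actor may invoke use cases. The following sketch shows one way such constraints could be checked against a scenario. The constraint pairs are assumptions chosen for illustration (Fig. 11 does not state them in this form), and the function name `violations` is hypothetical.

```python
# Hypothetical sketch: check a use case scenario against {precedes}
# constraints. The pairs below are assumed for illustration only.

PRECEDES = {
    ("Customer requests an item", "Company ships an item"),
    ("Company ships an item", "Customer pays for an item"),
}


def violations(scenario, precedes=PRECEDES):
    """Return the (earlier, later) pairs that the scenario violates.

    A pair is violated when `later` is invoked although `earlier`
    has not been invoked at any previous point in the scenario.
    """
    seen = set()
    broken = []
    for use_case in scenario:
        for earlier, later in sorted(precedes):
            if use_case == later and earlier not in seen:
                broken.append((earlier, later))
        seen.add(use_case)
    return broken


if __name__ == "__main__":
    scenario = [
        "Customer requests an item",
        "Company ships an item",
        "Customer pays for an item",
    ]
    print(violations(scenario))
```

A check of this kind covers only the typical scenarios it is given; it does not replace the complete description provided by the use case package lifecycle.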

7 Systems of Deliverables of Other Development Processes

Depending on which aspects of software design they focus on, different UML-based development processes use only certain subsets of the deliverables identified in section 3. This section compares the design deliverables of three major development processes: the Objectory method, the Shlaer-Mellor method and the Fusion method.

[Figure: matrix of design deliverables arranged by view (Use Case, Logical, Component, Deployment) and by level of abstraction (Domain, Architectural, Class, Procedural, Code), with «instance», «realization» and «refine» dependencies between deliverables.]

Fig. 12. Deliverables of the Shlaer-Mellor method are shown in gray color.

The Shlaer-Mellor method (see reference [6]) has one of the best systems of deliverables. Unlike the system in Figs. 2 and 3, the deliverable system of the Shlaer-Mellor method is orthogonal, which means that each fact about the product is stated in only one place. Analysis in the Shlaer-Mellor method (hereafter SM) is focused on the logical view, and therefore the method does not produce any deliverables in the use case, component and implementation views. The Shlaer-Mellor method does not produce any deliverables at the system level. The method recognizes an extra domain level (see section 5) with the domain model (called domain chart in SM). At the subsystem level, the method produces the subsystem model (subsystem relationship model and subsystem access model in SM), the subsystem interaction model (subsystem communication model in SM) and the subsystem (subsystem description in SM). At the class level, the Shlaer-Mellor method produces the class model (object information model and object access model in SM), the object interaction model (object communication model and thread of control chart in SM), the class (object description in SM) and the class lifecycle (state transition diagram and class structure chart in SM). At the procedure level, Shlaer-Mellor produces the procedure (action specification in SM) and the procedure algorithm (action data flow diagram in SM). Please note that the procedure (action specification) is related directly to the state in SM, and not first to the class and then to the state as it is in Fig. 2.

The Fusion method (see reference [2]) has a succinct and consistent system of deliverables that is also orthogonal and significantly simpler than that of Shlaer-Mellor. Fusion focuses on deliverables in the logical view at the system, subsystem and class levels. At the system level, Fusion delivers the system model (object model in Fusion), the system interaction model (scenario in Fusion), the system (operation model in Fusion) and the system lifecycle (lifecycle model in Fusion). At the subsystem level, Fusion delivers only the subsystem model (system object model in Fusion). At the class level, Fusion delivers the class model (visibility graphs and inheritance graphs), the object interaction model (object interaction graphs) and the class (class descriptions in Fusion). Fusion does not produce any lifecycles except for the system lifecycle. The new Fusion Engineering process (also known as Team Fusion) also produces the use case model and use cases. Deliverables are structured according to the refinement between levels of abstraction.

[Figure: matrix of design deliverables arranged by view (Use Case, Logical, Component, Deployment) and by level of abstraction (System, Architectural, Class, Procedural, Code), with «instance», «realization» and «refine» dependencies between deliverables.]

Fig. 13. Deliverables of the Fusion method are shown in gray color.

[Figure: matrix of design deliverables arranged by view (Use Case, Logical, Component, Deployment) and by level of abstraction (System, Architectural, Class, Procedural, Code), with «instance», «realization» and «refine» dependencies between deliverables.]

Fig. 14. Deliverables of the Objectory method are shown in gray color.

Although the Objectory method (see reference [4]) specifies deliverables with a wide scope, from a product vision to release notes and training materials, it is quite superficial in its specification of the structure of deliverables containing information about the design of the software product. The deliverables are structured on use case, logical, deployment, implementation and process views, and tier, architectural, and class levels. Deployment and implementation views contain only component and node models and component responsibilities. All interaction models are considered a specific view called the process view. The method produces only use cases at the system level; it does not produce any lifecycles except for the use case lifecycle and the class lifecycle. The deliverables are structured according to their relationships to use cases (in other words, according to their collaborations with external actors).
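
The comparison in this section can be made concrete with a small lookup. The sketch below records the Fusion names for the deliverables of the pattern in the logical view, as given in the text above, and reports which deliverables of the pattern the method leaves out at each level. The dictionary layout and the function name `missing_deliverables` are illustrative assumptions.

```python
# Illustrative sketch: which deliverables of the four-part pattern does
# a method produce at each level? Names of keys and functions are assumed.

# The four deliverables of the pattern: relationships, interactions,
# responsibilities (the classifier itself) and lifecycle.
PATTERN = ("model", "interaction model", "classifier", "lifecycle")

# Fusion's own names for the deliverables, as listed in the text.
FUSION = {
    ("system", "model"): "object model",
    ("system", "interaction model"): "scenario",
    ("system", "classifier"): "operation model",
    ("system", "lifecycle"): "lifecycle model",
    ("subsystem", "model"): "system object model",
    ("class", "model"): "visibility graphs and inheritance graphs",
    ("class", "interaction model"): "object interaction graphs",
    ("class", "classifier"): "class descriptions",
}


def missing_deliverables(method, levels=("system", "subsystem", "class")):
    """For each level, list the pattern deliverables the method does not produce."""
    return {
        level: [d for d in PATTERN if (level, d) not in method]
        for level in levels
    }


if __name__ == "__main__":
    for level, missing in missing_deliverables(FUSION).items():
        print(level, "->", missing or "complete")
```

The same structure could be filled in for the Shlaer-Mellor or Objectory deliverables to compare the methods level by level.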

9 Summary

This paper introduced a pattern of four mutually related design deliverables that represent classifier relationships, interactions, responsibilities and lifecycles. The pattern was applied at different levels of abstraction and to different views of a software product. Application of the pattern helped to identify new interaction diagrams not documented in the UML Notation Guide: the use case interaction diagram, the subsystem interaction diagram, the node interaction diagram and the component interaction diagram. The paper outlined the purpose, relationships and representation of deliverables often used to document software design. The paper also discussed three rules for structuring project deliverables, based on (1) relationships among the four deliverables in the pattern, (2) collaborations and (3) refinement between levels of abstraction. The pattern can be easily extended to document various aspects of software design. The paper discussed four of these aspects: domain and analysis models, documentation of test design, design of user interface and design of online user documentation.

References

1. Cockburn, A.: Using Goal-Based Use Cases, Journal of Object Oriented Programming, November 1997, also available at http://members.aol.com/acockburn/papers/usecases.htm
2. Coleman, D. et al.: Object-Oriented Development: The Fusion Method, Prentice Hall, Inc. 1994
3. Hruby, P.: The Object-Oriented Model for a Development Process, OOPSLA97, also available at http://www.navision.com/services/default.asp
4. Rational Objectory Process 4.1, demo version, available at http://www.rational.com
5. UML Notation Guide, version 1.1, Rational, 1 September 1997, also at http://www.rational.com/uml
6. Shlaer, S., Mellor, S. J.: Object Lifecycles: Modeling the World in States, Prentice Hall, Inc. 1992
7. Thibault, E.: What is BNF Notation? Available at http://cuiwww.unige.ch/dbresearch/Enseignement/analyseinfo/AboutBNF.html

Considerations of and Suggestions for a UML-Specific Process Model

Kari Kivisto
MSG Software, P.O. Box 28, FIN-90101 Oulu, Finland
eMail: [email protected]
http://www.msg.fi

Abstract. The developers of the Unified Modeling Language (UML) promote (but do not describe) a development process model that is use case-driven, architecture centric, and iterative and incremental. This paper analyzes these features and suggests some extra features needed in developing object-oriented client/server applications (including Internet applications). The paper is heavily based on practical experience from projects in which object-oriented client/server applications were built with these three features in mind. The paper outlines a process model that has the stated features. In particular, it connects the roles of the development team with the tasks in the process model.

KEYWORDS: Process model, modeling language, role.

1 Introduction

The Unified Modeling Language (UML) [19], [20], [21] is the new standard for describing artifacts of an object-oriented development process. It was created by a group of researchers, including Booch [2] [3] [4], Rumbaugh et al. [18] and Jacobson et al. [13].

‘The Unified Modeling Language (UML) is a language for specifying, visualizing, constructing, and documenting the artifacts of software systems, as well as for business modeling and other non-software systems. The UML represents a collection of best engineering practices that have proven successful in the modeling of large and complex systems.’ UML Summary ([21], p. 5).

The UML does not include a process model, but the authors favor a development process model that is use case-driven, architecture centric, and iterative and incremental ([21], p. 9). They also point out that different organizations and problem domains require different processes. This paper focuses on a process model suitable for object-oriented client/server application development, i.e., applications that are

• user-centric, i.e., most of the system’s functions are carried out using user interfaces,
• data-centric, i.e., most of the system’s functions use databases,
• operative systems as opposed to embedded systems and
• tailored applications, not off-the-shelf products.

The three main features proposed by the UML authors are examined here from a practical point of view, drawing on experience gained from combining the process models of Jacobson et al. [13], Rumbaugh et al. [18] and Booch [2] [3] [4] as early as 1993-1994. The derived process model, called the OOCS (Object-Oriented Client/Server) model, is briefly outlined, and some extra features of the model are discussed.

J. Bézivin and P.-A. Muller (Eds.): «UML»’98, LNCS 1618, pp. 294-306, 1999. © Springer-Verlag Berlin Heidelberg 1999

2 A UML-Specific Process Model for Object-Oriented Client/Server Development

A team-based OOCS model has been developed during the last four years. It is heavily based on the earlier work of the UML authors (Booch, Rumbaugh and Jacobson). Figure 1 depicts the evolution of the UML, the OOCS model and the TB model (Team-Based role model).

[Figure: timeline from 1990 to 1998 showing the source methods (Goldberg and Rubin 1990; Booch 1991; Rumbaugh et al. (OMT) 1991; Jacobson et al. (OOSE) 1992; Lorenz 1993; Microsoft 1993 and 1994; Booch 1994 and 1995; Goldberg & Rubin 1995), the UML versions 0.8, 0.9 and 1.x, and the OOCS and TB versions 0.1, 0.9, 1.0, 1.1 and 2.0, with feedback from practice and company-level development feeding the later versions.]

Fig. 1. The evolution of the UML, the OOCS model and the TB model.


As can be seen from the figure, the OOCS model used the work of Jacobson et al. [13], Rumbaugh et al. [18] and Booch [2] before they joined forces and began to create the UML. The TB model, i.e., the Team-Based role model, defines the roles of the project team. These two models combine the developers and the process they act in by means of roles, their activities and artifacts. The OOCS model is outlined in Figure 2 and it is briefly defined in this chapter.

[Figure: the OOCS model in three columns: Project Management, Quality and Application Development. The application development phases are System Definition, Analysis, Architecture, Design, Construction, Testing, Deployment and Maintenance, each annotated with the roles involved. Project management covers the project plan, resources, risks, changes management, project group meetings, steering group meetings and reporting at project and company level; quality covers the quality assurance plan, reviews, test specifications, and test reports and logs.
Roles: PM = Project Manager, TL = Team Leader, BO = Business Object Developer, AD = Application Developer, DB = Database Developer, QA = Quality Assurance, TE = Tester, US = User, EX = Expert.]

Fig. 2. The OOCS model.


The roles of the developers are also named in the figure. The model is in use in some Finnish IT departments and has been adapted to the needs of their organizations. Experiences with the adaptation process are reported in Kivisto [14], where the reader can also find descriptions of version 1.0 of the OOCS model and the TB model.

The team-based OOCS model focuses on client/server architecture. First of all, this means that client/server architecture consists of three sub-architectures: technology architecture, data architecture and application architecture. Secondly, client/server application architecture means a division of the application into presentation, business logic and data management. This topic is discussed in more depth later in this paper.

2.1 Use Case-Driven Development

The use case-driven approach was first popularized by Jacobson et al. [13], and it means that use cases control overall system development. The use case approach has turned out to be a very promising solution to the central problem in application development: how to make the user’s invisible know-how visible in the software system. Having used use cases successfully in recent years, we can recommend them. It has been noticed that after a short preparation period, most users are able to write and update use cases by themselves.

There has been criticism of use cases and scenarios. Martin and Odell ([16], p. 314) state that use cases should capture the desired state of the system (what it should be) and not the present state (what it is). This is a good point, and developers should be aware of it. They also say that the use cases should cover all the functionality of the system and not just parts of it. Their last concern is user involvement in application development. This has nothing to do with use cases, but is merely a general observation. On the contrary, the strength of the use case lies in its ability to tie users to application development without introducing notations or graphs that are unfamiliar to them.

Graham ([7], p. 286-287) claims that use cases are an old invention having roots in DFD (Data Flow Diagram) process bubbles, stereotypical scripts and hierarchical task analysis in HCI (Human Computer Interaction). These claims do not invalidate use cases in any way. Graham’s own concept is the task script. He also claims that use cases are better for user interface description than for internal analysis and design of the system. This is true, but it must be remembered that use cases are not intended for internal structure definitions, although some developers have used them that way.

When using use cases in practice, the following ‘chain’ has proved to be successful (Figure 2). User roles (actors of the system) and their main functions in the OOCS are described in the System Definition Phase. This in turn guides work in the Analysis Phase of the OOCS process, i.e., the use cases are written from the main functions of the actors. These descriptions in turn guide the object modeling process. The developers must notice that there is a strong connection between the use cases and the object model, and that the user interfaces are designed only after first versions of the use cases and the object model exist. This order is intended to avoid the common problem where the user interface design starts to rule the Analysis Phase, leading the work in the wrong direction.


[Figure: elements of the System Definition Phase (user roles, users’ main functions, system definition, business objects) and of the Analysis Phase (use cases, ‘User’s Manual’, object model with attributes and operations, user interfaces, operations in more detail, data model).]

Fig. 3. The Analysis Phase

Table 1 gives an example of how roles and activities are joined in the model.

Table 1. Example from the model: The Analysis Phase.

Roles     Activities                                        Artifacts
US, AD    Use case analysis                                 Use case descriptions
US, AD    Write user manual draft and on-line help texts    User manual and on-line help draft
US, AD    Define user interfaces and reports                User interfaces and reports descriptions
BO        Define business objects                           Class descriptions
AD, BO    Define operations                                 Operations descriptions
DB        Carry out data modeling                           Data model description

UML Diagrams: Use Case Diagram, Class Diagram.
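
The link between roles, activities and artifacts in Table 1 can be expressed as a simple data structure, so that each team member can look up the activities belonging to a role. This is a minimal sketch: the structure and the function name `activities_for` are illustrative assumptions, and the role assignment for the last two rows follows one reading of Table 1.

```python
# Illustrative sketch of Table 1 as data. The layout and function name
# are hypothetical; roles use the abbreviations of Fig. 2.

ANALYSIS_PHASE = [
    # (roles, activity, artifact)
    (("US", "AD"), "Use case analysis", "Use case descriptions"),
    (("US", "AD"), "Write user manual draft and on-line help texts",
     "User manual and on-line help draft"),
    (("US", "AD"), "Define user interfaces and reports",
     "User interfaces and reports descriptions"),
    (("BO",), "Define business objects", "Class descriptions"),
    # Assumed split of the roles for the last two rows.
    (("AD", "BO"), "Define operations", "Operations descriptions"),
    (("DB",), "Carry out data modeling", "Data model description"),
]


def activities_for(role):
    """Return the Analysis Phase activities in which the given role takes part."""
    return [activity for roles, activity, _ in ANALYSIS_PHASE if role in roles]


if __name__ == "__main__":
    for role in ("US", "AD", "BO", "DB"):
        print(role, activities_for(role))
```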

Use cases were included in the OOCS immediately after Jacobson et al. published their book [13]. Figure 2 depicts their role in the Analysis Phase of the model. Use cases are good for communication between users and developers. With a little training, users are able to write and update use cases. Use cases are also inputs to testing. The link between the use cases and the object model is a very close one (cf. Figure 2), and user interfaces are defined after both the use cases and the object model have been designed. This does not mean that the use cases and the object model have to be finished before going on, however, since the process is iterative in nature. The use cases should be available for user interface design and the object model for the data modeling, not vice versa.

It was rather hard to utilize use cases in the first projects, because they represented a new approach to most of the developers. In addition, OMT by Rumbaugh et al. [18] was the best known method at that time (at least in Finland), and instead of use cases it had scenarios, which had different semantics. Scenarios were also meant to be used in the Design Phase, not in the Analysis Phase. Of the other models, only Lorenz [15] included use cases in his model. Later on, use cases were also adopted by other models (for instance, OMT++ by Jaaksi [12]). Now that they have been adopted into the UML, every process model will probably include advice on their use. All the UML diagrams are used in the OOCS model.

2.2 Architecture Centric Process

The UML defines the architecture as an organizational structure of a system ([20], p. 3). Booch [4] uses concepts such as micro and macro architecture when referring to the architecture issue. As systems continuously become more and more complex, special attention should be paid to the architecture. The concept of architecture mentioned above refers only to one part of the whole architecture issue, i.e., to the application architecture. However, the client/server architecture should be seen as constructed of three sub-architectures: technological architecture, application architecture and data architecture (Microsoft [17]) (Figure 3).

[Figure: the Application Architecture (Presentation, Business Logic and Data Management connected by messages; distribution of logic; reusable components), the Data Architecture (distribution, replication, access strategy) and the Technological Architecture (run-time environment, development tools selection, security).]

Fig. 3. Three-tier client/server architecture.


The Technological Architecture defines both the run-time environment and the development environment. The Data Architecture design (distribution of databases, access strategies, replication strategies, etc.) will gain more and more attention, since companies are breaking up their centralized organizations (and databases as well) into more self-controlled departments, which may be located throughout the world. The Application Architecture is based on both the technology and the data architecture, and it defines how the application is built and divided into three parts: presentation (user interfaces), business logic (local and corporate business logic) and data management (object-oriented or relational databases). The Application Architecture is the most interesting of the three architectures.

A client/server development model should clearly state the commitment to the client/server architecture. All three architectural parts (technology, data, and application) should be described in the documentation, as the design phase cannot start if the architectural decisions are missing. The three-tier application architecture (presentation, business logic, and data/object management) must be kept in mind in all phases of the development. This is one conclusion that has emerged from real projects, in addition to the observation that this approach may increase development time but reduce maintenance and redevelopment effort.

The overall architecture needs a phase of its own. This phase, the Architecture Phase, is carried out between the Analysis and Design Phases (Figure 4). Component and Deployment diagrams from the UML are used in this phase to clarify the document.

[Figure: the Architecture Phase follows the System Definition Phase and the Analysis Phase. Its inputs are the architecture draft in the System Definition document and the first drafts of the Analysis Phase; supporting material includes the old system, other relevant systems, technology reports, tool reports, books and articles. Its results are the Technology Architecture, the Application Architecture and the Data Architecture.]

Fig. 4. The Architecture Phase
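
The division of the application architecture into presentation, business logic and data management can be illustrated with a minimal sketch. All class and method names below are hypothetical, and an in-memory dictionary stands in for an object-oriented or relational database; the point is only that each tier talks to the tier directly beneath it.

```python
# Minimal three-tier sketch with hypothetical names. Each tier sends
# messages (method calls) only to the tier directly beneath it.


class DataManagement:
    """Data management tier: an in-memory dict stands in for a database."""

    def __init__(self):
        self._rows = {}

    def save(self, key, value):
        self._rows[key] = value

    def load(self, key):
        return self._rows.get(key)


class BusinessLogic:
    """Business logic tier: applies rules and uses only data management."""

    def __init__(self, data):
        self._data = data

    def register_order(self, order_id, quantity):
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        self._data.save(order_id, quantity)

    def order_quantity(self, order_id):
        return self._data.load(order_id)


class Presentation:
    """Presentation tier: formats output and uses only business logic."""

    def __init__(self, logic):
        self._logic = logic

    def show_order(self, order_id):
        quantity = self._logic.order_quantity(order_id)
        if quantity is None:
            return f"Order {order_id}: not found"
        return f"Order {order_id}: {quantity} item(s)"


if __name__ == "__main__":
    logic = BusinessLogic(DataManagement())
    logic.register_order("A1", 3)
    print(Presentation(logic).show_order("A1"))
```

Because the presentation tier never touches the data tier directly, the storage could be replaced without changing the user interface code, which is one reason the division is worth keeping in mind in all phases.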


The concept of architecture centricity is broadened in the OOCS. First of all, the OOCS is based on three-tier architecture, meaning that the Analysis, Design and Construction Phases are carried out with this division in mind. For instance, in the Analysis Phase (Figure 2) use case analysis and user interface design belong to presentation, object modeling and design of operations to business logic, and data modeling to data management.

The architecture issue has been very relevant in recent years, as companies have adopted new hardware and software systems. There has been considerable difficulty, and many projects have failed due to insufficient knowledge of architecture issues. The Architecture Phase was added to the OOCS because of these problems. All three architectures (cf. Figure 3) are defined in this phase (Figure 4). The technological architecture defines the development and run-time environment. These decisions then guide the design of the application and data architectures. The application architecture describes the structure of the application and how it will be divided into presentation, business logic and data management. It also describes the application by means of executables, dynamic link libraries, and so on. If the business logic is to be divided between client and server, this division is decided here. Reusable components are also relevant here: one should look for parts of the system that should be constructed with reusability in mind. These parts could later be moved into the company’s reusable component library. One should also look for parts that could be constructed from the company’s own reusable library or bought from vendors.

2.3 Iterative and Incremental Development

There are many definitions for the terms iterative and incremental, and depending on the source the same ideas are covered slightly differently. The following definitions are given by Cockburn ([5], p. 423) from an object-oriented point of view. Cockburn defines incremental development as ‘a scheduling and staging strategy which allows portions of the system to be developed at different times or rates and to be integrated as they are completed.’ Both Jacobson et al. [13] and Cockburn emphasize that developing the system in portions is quite natural. The team(s) of a project can focus on one part at a time, and users will receive the overall system in parts that are easier to adopt. Iterative development is (according to Cockburn) a scheduling and staging strategy supporting predicted reworking of portions of a system. While incremental development spans phases, iterative development is used inside one phase (the analysis phase and the design & test phase in Lorenz’s model [15]). However, a process can be iterative even if it spans phases.

Berard [1] favors a recursive/parallel development process. This approach is based on the fact that the parts of a system are usually at different abstraction levels, i.e., one part of the system might be ready for implementation soon after a short analysis and design session, while other parts might need several analysis, design, and implementation iterations. This development process decomposes the system to be built into independent components (business objects or collaborative objects), after which each component is (recursively) decomposed into smaller components. The chosen components are decomposed in parallel. Booch’s ‘round-trip-gestalt design’ [2] and Henderson-Sellers’s fountain model [8], [9], [10] are variations of this theme.

From the definitions it follows that iterative/incremental and recursive/parallel mean essentially the same thing, but the terms iterative/incremental are more commonly used. As a conclusion to this topic, we may refer to Cockburn [5]: ‘the precise distinction between the incremental and iterative (or some other) development processes is not critical’. Goldberg and Rubin make a similar remark: ‘Large organizations have more than one product process model because they build several types of software products’ ([6], p. 91). However, project teams must be aware of what these terms mean and which of the process models are to be used in their project.

Another obvious conclusion can also be made: object-oriented client/server application development is done iteratively and incrementally, or recursively and in parallel. When developers build object-oriented client/server applications, the best way to construct the application is to do it incrementally, i.e., in small manageable portions. These small parts of the system are constructed iteratively, i.e., by defining them in more and more detail in each round of iteration. This is nothing new, and this approach is recommended in nearly every book.

2.4 Roles of the Developers

An extra feature suggested in the OOCS deals with the roles of the developers. Process models seldom speak about the roles of the developers, and if the roles are discussed, the discussion is separated from the model (for example, Jacobson et al. [13], Lorenz [15], Booch [4], Goldberg and Rubin [6], Rumbaugh et al. [18]). Herbsleb et al. ([11], p. 289) conclude their large study by noting that ‘It is going to take careful analysis of the interplay of cognitive and organizational factors across a range of studies to determine how best to organize OOD teams’.

The OOCS process model defines the roles needed in each phase and connects them to the activities. This way project members can concentrate on the tasks they are responsible for. The role issue is normally separated from the process issue, although they should be handled in parallel, since the developers carry out the process. It is also important when the project starts that the developers know what they should do, when they should do it, how they should do it, and what the deliverables are.

The OOCS model includes the roles in the model. The roles are assigned to the members of the project team when the project starts. This way everyone knows their own responsibilities and can concentrate on them in each phase. The role model is based on small teams. Each team is responsible for its portion of the overall system. The idea behind the roles is the three-tier architecture, which means that there is a role for presentation, a role for business logic and a role for data management. The client/server application architecture and team roles are combined in Figure 5.

[Figure 5 diagram. Labels recovered from the figure: Team Leader, Users, Business Object Developers, Application Developers, Database Developers, Experts, QA Tester; architecture tiers: Presentation, Business Logic, Data Management.]

Fig. 5. Roles of the team and object-oriented client/server architecture

In addition, there should be two leading roles, one responsible for leading the project and its members and the other for leading application development or a portion of it. Users are an integral part of a good, successful project. As the size of the project increases, the team should not be allowed to grow beyond six persons. Instead, a new team should be established. A project manager is needed to control these teams. This is a role in which the tasks and responsibilities are virtually the same as those of a traditional project manager (resources, planning, scheduling, etc.). Two teams are also needed in a case where the application covers several business objects, as it is natural to establish teams around business objects. Each team is responsible for developing the parts of the application which deal with its business objects. In object-oriented terms, teams are responsible for contracts. The roles in the TB model are listed next.

Project Manager (PM)

In this approach, a clear distinction is made between human management and application development. The project manager is responsible for the former, including management of the project’s resources (human and

technical), tasks, deliverables, schedule and planning. He/she controls the project and the teams and determines the rhythm of the project.

Team Leader (TL)

This role works as a peer to the project manager. The team leader is responsible for directing the application development process. He/she is the architect or technical controller of the project. The team leader’s skills are measured at two critical points: in the system definition phase, where the team leader is the visionary for the new system, and in the architecture phase, where he/she designs the application architecture. These two activities call for experience and knowledge.

Business Object Developer (BO)

The business object developer designs, develops and maintains reusable business objects, and will ‘own’ some business objects. If there is more than one team on the project, each team will have at least one business object developer. As there are visions that business objects may some day be purchased from software vendors, a business object developer may be responsible for searching for these reusable components from different sources. The business object developer’s key activities are to analyze, design and construct business objects, which are a company’s key assets.

Application Developer (AD)

These developers analyze, design and develop the requested application using reusable components whenever possible. During the project they may design and develop new reusable components or drafts of them. Note that application developers are able to work in all phases. This thesis disagrees with role models where there are different persons in different phases. There is always an information loss in such situations.

Database Developer (DB)

This role is responsible for data modeling and database design, and he/she acts as an expert in data architecture definition. The role is an elementary part of the project for two reasons. First, the applications are database-intensive. Second, new versions of database management systems include object-oriented features, providing possibilities of placing parts of the functional logic in the databases.

Quality Assurancer/Tester (QA/TE)

This role has two sides. The quality assurancer is an outsider in the project who comes from the QA department and reviews, inspects and audits the

quality of the project. The tester is an insider who is responsible for making test specifications and testing. Application developers also take part in testing and may write test specifications. This is always true in small projects.

User (US)

Users are an essential part of the development team. User involvement in application development should never be underestimated. They take part in system definition, analysis and testing and, with a little training, they are also capable of writing use cases, on-line help and user’s manuals.

Expert (EX)

There are at least two categories of experts, namely domain experts and technical experts. Domain experts work with business object developers, and they can also help application developers. Technical experts take part in architectural phase activities and they are interviewed during the system definition phase when architectural issues are under discussion. These experts may also check the installation instructions, test the installation and perform system tests.

These are the main roles needed in object-oriented client/server application development. Another interesting issue is the reuse team and its roles. In our view, business object developers belong to the reuse team if a company has one. In larger projects with two or more teams there should be common roles that are responsible for the object model and data model. These roles do not belong to a reuse team.

3 Conclusions

This paper studied features that an object-oriented client/server application development process model should have. Some of the authors of the Unified Modeling Language promote a development process that is use case-driven, architecture-centric, and iterative and incremental. These features and their backgrounds were studied and extended so that they would be more useful in interface-centric and data-centric application development. In the OOCS process model, use cases have been part of the analysis phase from the very beginning. Architecture centricity was extended to cover three architectures (technology, data and solution). A phase for architecture was also suggested because of the importance of architecture issues in new technology projects. The UML supports documentation of the architecture phase through its Deployment and Component Diagrams. The iterative and incremental development process was discussed and a parallel development possibility was added for large applications built in portions. The roles in the team were defined and they were connected to the activities in each phase.

The OOCS model has been used for object-oriented client/server application development for about four years. There were some problems in the beginning because the developers had to combine different notations and diagramming techniques. The UML removed these problems and even gave new diagrams for the Architecture phase. The emphasis on architecture has been one of the benefits of the OOCS model. The emerging technologies include high risks that can be avoided if the technology is properly tested and evaluated. The OOCS model forces developers to focus on technology (i.e., client/server architecture) before it is too late. Also, the roles have been accepted and their relevance has been acknowledged.

References

1. Berard, E.: Essays On Object-Oriented Software Engineering, Vol 1. Prentice-Hall, Englewood Cliffs, NJ, (1993)
2. Booch, G.: Object-Oriented Design. Benjamin/Cummings, Menlo Park, CA, (1991)
3. Booch, G.: Object-Oriented Analysis and Design with Applications. Benjamin/Cummings, Redwood City, CA, (1994)
4. Booch, G.: Object Solutions: Managing the Object-Oriented Project. Addison-Wesley, Menlo Park, CA, (1995)
5. Cockburn, A., A., R.: The Impact of Object-Orientation on Application Development. IBM Systems Journal, vol 32, no 3, (1993) 420-444
6. Goldberg, A., Rubin, K.: Succeeding with Objects. Decision Frameworks for Project Management. Addison-Wesley, Reading, Mass., (1995)
7. Graham, I.: Migrating To Object Technology. Addison-Wesley, Wokingham, (1995)
8. Henderson-Sellers, B.: Book of Object-Oriented Knowledge. Prentice-Hall, Sydney, Australia, (1992)
9. Henderson-Sellers, B., Edwards, J.: The Object-Oriented Systems Lifecycle. Communications of the ACM, vol 33, no 9, (1990) 142-159
10. Henderson-Sellers, B., Edwards, J.: Book Two of Object-Oriented Knowledge: The Working Object. Prentice-Hall, Sydney, Australia, (1994)
11. Herbsleb, J., Klein, H., Olson, G., Brunner, H., Olson, J., Harding, J.: Object-Oriented Analysis and Design in Software Project Teams. Human-Computer Interaction, vol 10, (1995) 249-292
12. Jaaksi, A.: Object-Oriented Development of Interactive Systems. Thesis for Doctor of Technology. Tampere University of Technology, (1997)
13. Jacobson, I., Christerson, M., Jonsson, P., Övergaard, G.: Object-Oriented Software Engineering - A Use Case Driven Approach. Reading, MA: Addison-Wesley; New York: ACM Press, (1992)
14. Kivisto, K.: Team-Based Development of Object-Oriented Client/Server Applications: The Role Perspective. Licentiate thesis. Institute of Information Processing Science, University of Oulu, Finland, (1997)
15. Lorenz, M.: Object-Oriented Software Development: A Practical Guide. Prentice Hall, Englewood Cliffs, NJ, (1993)
16. Martin, J., Odell, J.: Object-Oriented Methods: A Foundation. Prentice Hall, Englewood Cliffs, NJ, (1995)
17. Microsoft Corporation: Analysis and Design of Client/Server Systems. Course Material, (1993)
18. Rumbaugh, J., Blaha, M., Premerlani, W., Eddy, F., Lorensen, W.: Object-Oriented Modeling and Design. Prentice Hall, Englewood Cliffs, NJ, (1991)
19. UML Notation Guide, ver 1.1. Rational Software Co., (1997)
20. UML Semantics, ver 1.1. Rational Software Co., (1997)
21. UML Summary, ver 1.1. Rational Software Co., (1997)

An Action Language for UML: Proposal for a Precise Execution Semantics

Stephen J. Mellor¹, Stephen R. Tockey², Rodolphe Arthaud³, Philippe Leblanc³

¹ Project Technology, Inc. [email protected]
² Rockwell International srtockey@cca.rockwell.com
³ Verilog, SA [email protected], [email protected]

Abstract. This paper explores the requirements for complementing the UML with a compatible, software-platform-independent executable action language that enables mapping into efficient code. This language is henceforth referred to as an action language. The user of the action language will be able to specify the structure of the algorithms for a problem domain precisely without making unnecessary assumptions about the detailed organization of the software. An action language will enable precise specification of the structure of actions on a UML State Chart and the operations on a UML Class Diagram. A precise language that allows specification of the structure of algorithms for carrying out UML actions and operations without otherwise constraining possible software implementations enables:

Early Verification. An action language can perform specification-based simulation and formal proofs of correctness early in the software lifecycle. Problems detected early can be removed with much less rework, leading to a reduction in both project cost and time-to-market.

Domain Level Reuse. With appropriate tooling, the system specification can be mapped into multiple different implementation technologies at significantly reduced cost.

1 The Problem

The UML is a rich and powerful notation that can be used for problem conceptualization and software system specification as well as implementation. The UML also covers a wide range of issues, from use cases and scenarios to state behavior and operation declaration. However, the UML uses ‘uninterpreted strings’ to describe the behavior of actions and operations. To provide for sharing of semantics of action and operation behavior between UML modelers and UML tools, there needs to be a way to define this behavior precisely: an action language.

J. Bézivin and P.-A. Muller (Eds.): «UML» ’98, LNCS 1618, pp. 307–318, 1999. © Springer-Verlag Berlin Heidelberg 1999

2 Why An Action Language?

Model precision and level of implementation detail are two separate things.

An action language, in conjunction with UML, can be used to specify a computing problem completely without actually programming it.

An action language, in conjunction with UML, can be used to build complete and precise models that specify problems at a higher level of abstraction than a programming language or a “graphical programming” system.

An action language can support formal proofs of correctness of a problem specification.

An action language, in conjunction with UML, makes possible high-fidelity model-based simulation and verification.

An action language enables reuse of domain models, because each domain is completely specified, though not embedded in code.

An action language, in conjunction with UML, provides a stronger basis for model design and eventual coding.

An action language, in conjunction with UML, supports code generation to multiple software platforms.

3 Why Not Use An Existing Language?

To be useful, the action language should be abstract, so that the user can state the behavior minimally without duplicating implementation information. Because the action language should take a view focused on the high level policies in the domain, it should employ only a restricted conceptual subset of the UML. This issue is explored below in the context of existing programming languages.

To be useful, the action language should allow for smooth incorporation of executable code, so that tools can map actions and operations into code efficiently. This issue is explored below in the context of declarative languages.

To be useful, the action language should take a perspective that is precise and detailed enough to specify the policy and the high level algorithms of a system unambiguously, but without requiring the user to make any decisions about the structure of the software. This issue is explored below as Software-Platform Independence.

3.1 Why Not Use An Existing Programming Language?

Existing programming languages already provide a precise specification of the structure of actions and operations at the implementation level. Why not use one of them and avoid inventing another language? Concepts such as Class, Package, and Exception exist in both the UML and in many OO languages, but existing programming languages provide much, much more

than an action language needs. Consequently, any action language would have to be a subset of an existing programming language. On the other hand, many existing programming languages limit implementation options. For example, Java has only one way to represent associations, namely object references in one, the other or both of the associated classes. An appropriately abstract action language must represent the meaning of the association rather than its implementation. Similarly, existing programming languages do not support directly many UML concepts such as Association or State. An action language should support directly UML concepts that are appropriate at the level required for a system specification. Existing programming languages have serial, sequential execution models, while an action language should define minimally constrained execution. Note that this is conceptually equivalent to allowing for concurrent execution within an action, though in practice, it merely provides the design mapping with the information required to determine the possible orders of execution.
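The point about associations can be illustrated with a small Python sketch (Python is used here purely for illustration; the paper prescribes no notation). The hypothetical `Association` class records links between instances without deciding whether they end up as object references inside either class. The names `Association`, `link`, `unlink` and `related` are assumptions made for this sketch, not part of any proposed action language.

```python
from collections import defaultdict


class Association:
    """Hypothetical binary association kept outside the associated classes."""

    def __init__(self, name):
        self.name = name
        self._forward = defaultdict(set)   # source instance -> linked targets
        self._backward = defaultdict(set)  # target instance -> linked sources

    def link(self, source, target):
        self._forward[source].add(target)
        self._backward[target].add(source)

    def unlink(self, source, target):
        self._forward[source].discard(target)
        self._backward[target].discard(source)

    def related(self, instance):
        """Instances linked to `instance`, whichever end it is on."""
        return self._forward.get(instance, set()) | self._backward.get(instance, set())


class Customer:
    def __init__(self, name):
        self.name = name


class Account:
    def __init__(self, number):
        self.number = number


owns = Association("owns")
alice = Customer("Alice")
acct1, acct2 = Account(1), Account(2)
owns.link(alice, acct1)
owns.link(alice, acct2)
```

Neither `Customer` nor `Account` holds a reference to the other, so a generator would remain free to choose references, a link table, or a merged record as the implementation.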

3.2 Why Not Use A Declarative Language?

Declarative languages provide for precise specifications without overly constraining implementation. Why not use one of them and avoid inventing another language? When combined with the Object Constraint Language [1], UML actions and operations can be specified in terms of precise pre-conditions and post-conditions. But the OCL is intentionally declarative (side effect-free). Declarative languages allow the ultimate freedom in implementation independence, but there is often a need to include some level of algorithmic specification to ensure efficient execution.

Consider, for example, a pre-condition of a collection of values and a post-condition that the collection of values be sorted in ascending order. It is possible, of course, to implement this by building a list of all possible permutations of the collection and selecting the permutation that is properly sorted, but this solution is not practical for real systems. Consequently, it is desirable to be able to replace a specification stated abstractly using pre- and post-conditions by the code that implements it without affecting the remainder of the system specification. This implies that the definition of a computation should be separated from the remainder of the structure of the action language. Hence, a computation can be defined using pre- and post-conditions, an imperative function definition language, or a programming language, while data access and signal generation remain independent.

Some Vocabulary. This paper uses action to refer to a grouping of various data accesses, signal generators, and computations that are all executed as a unit on receipt of an event, or an invocation of an operation. The paper uses computation to refer to a side-effect free transformation of data inputs into outputs (i.e. not a data access or signal generator), which is a part of an action.
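The sorting example can be sketched in Python as follows. The function names are hypothetical and chosen only for this illustration: `postcondition` states what a correct result is, `sort_by_search` is the impractical direct reading of that specification, and `sort_efficiently` is a replacement computation that satisfies the same post-condition.

```python
from collections import Counter
from itertools import permutations


def precondition(values):
    """Pre-condition (assumed for this sketch): a collection of numbers."""
    return all(isinstance(v, (int, float)) for v in values)


def postcondition(values, result):
    """Post-condition: result is an ascending permutation of values."""
    ascending = all(a <= b for a, b in zip(result, result[1:]))
    return ascending and Counter(result) == Counter(values)


def sort_by_search(values):
    """Direct reading of the specification: try permutations until one fits."""
    for candidate in permutations(values):
        if postcondition(values, list(candidate)):
            return list(candidate)


def sort_efficiently(values):
    """Replacement computation; callers rely only on the post-condition."""
    return sorted(values)


data = [3, 1, 2, 1]
assert precondition(data)
slow = sort_by_search(data)
fast = sort_efficiently(data)
```

Because both functions meet the same post-condition, one can be swapped for the other without touching the rest of a specification that uses the result.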

3.3 Software-Platform Independence

Software-platform independence is analogous to hardware-platform independence. A hardware-platform independent language must enable the writing of a specification that can execute on a variety of hardware platforms with no change. Similarly, a software-platform independent language must enable the writing of a specification that can execute on a variety of software platforms, or designs, with no change. For example, a software-platform independent specification should be able to be mapped into a multi-processor/multi-tasking CORBA environment or a client-server relational database environment with no change in the specification.

When the concepts Customer and Account exist in the problem domain under analysis, they can be modeled in UML on a Class Diagram. The vocabulary of UML, including the name ‘class diagram’, suggests that the software solution should be expressed in terms of classes named Customer and Account. But there are many possible software designs that can meet these requirements, many of which are not even object-oriented.

This goal of software-platform independence suggests several general implications.

The action language must enable the generation of a system with a different structure from the model. The organization of the data and processing implied by a conceptual model may not be the same as the organization of the data and processing implied by the model in implementation. For example, between concept and implementation an attribute may become a reference; a class may be divided into sets of object instances according to some sorting criteria; classes may be merged or split; state charts may be flattened, merged or separated; and so on. The action language must be at a level of abstraction that enables such model reorganization.

Consequently, access to a UML attribute should be stated as an atomic operation, so that any potential reorganization of the data in the model can be generated as a single access. Consider, for example, the OCL data access statement in the context of a Customer: self.address. This statement might be implemented as a data member of customer or redundantly as a data member of each of the customer’s accounts. The action language must not assume that the structure of the model is the structure of the implementation.

Similarly, access across an association must be atomic. Consider, for example, the OCL data access statement in the context of an Account: self.customer.address. This statement, because it is a single unit, can be optimized to any selected organization for the data. Note that splitting the access into two, say by finding the customer and then the customer’s address, will be much more difficult to map into an efficient design.

The same principle applies for behavior. If we view the system as a set of interacting state charts that signal each other to coordinate behavior, then any implementation that has the same behavior, whether based on state charts, state machines, threaded code or linear code, would meet the application requirement.

The action language must enable reorganization of fundamental problem-oriented computation. Because the conceptual model of a problem may be implemented in a

variety of ways, the action language must not allow the user to specify computation in a manner that depends on assumptions about data access and organization. Consider, for example, a problem in which we compute the monthly interest for all accounts belonging to a subset of customers. One way to program this is to build a double loop that iterates over customers, searching for ones that qualify, then iterates over those customers’ accounts, adding some percentage to the balance. This approach assumes a certain data organization, and it would be dreadfully inefficient using some other data organization. The fundamental, problem-oriented computation is simply the interest computation applied to the relevant subset of accounts; all else is concerned with managing the data organization.

One approach to enabling reorganization of fundamental problem-oriented computation is to separate such computation from data access, and vice versa. This approach does not embed data access or control structure within any computation, but instead places data access and control structure outside the computation.

This approach also suggests that computations should be context-free. A context-free computation is one that has no side effects and no (internal) state memory. It is a function in the mathematical sense of the word.

The approach also suggests that collections of object instances or collections of data values should be treated as a unit. In a specification, certain collections may be identified as fundamental to the problem; these necessary collections often become the basis for optimization in the implementation.

In short, the action language should avoid structures that inhibit mapping the problem specification into implementations with different organizations.
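The separation described above can be sketched in Python. All names here (`Account`, `monthly_interest`, `select_accounts`, `apply_to_balances`) are hypothetical, and the flat list of accounts is only one assumed data organization. The interest computation is a context-free function, while selection and write-back live outside it.

```python
from dataclasses import dataclass


@dataclass
class Account:
    owner: str
    balance: float


def monthly_interest(balance, rate):
    """Context-free computation: no data access, no side effects."""
    return balance * (1 + rate)


def select_accounts(accounts, qualifies):
    """Data access: returns the relevant accounts; the layout is hidden here."""
    return [a for a in accounts if qualifies(a.owner)]


def apply_to_balances(selected, computation, *args):
    """Reads each balance, applies the computation, writes the result back."""
    for account in selected:
        account.balance = computation(account.balance, *args)


accounts = [Account("ann", 100.0), Account("bob", 200.0), Account("ann", 50.0)]
selected = select_accounts(accounts, lambda owner: owner == "ann")
# The rate is an arbitrary value chosen so the arithmetic is exact.
apply_to_balances(selected, monthly_interest, 0.25)
```

If the data were reorganized, for instance with accounts stored per customer, only `select_accounts` would need to change; `monthly_interest` would stay as it is.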

4 Requirements

This section summarizes requirements for an action language.

Types. The action language should enable manipulation of all UML types and definition of domain-specific types, and should enforce strict typing within the context of subtyping. Because the action language is an adjunct to the UML, types may be defined as a part of the definition of a class, and not necessarily as linear text.

Object Lifecycles. The action language should support object instance creation, including initialization of attributes and state (in the Statechart sense), by providing values or from defaults. The action language should provide the ability to refer to the current instance (self or this in Smalltalk and C++ respectively). The action language should provide object instance deletion of single instances or collections.

Associations. The action language should support creation and deletion of associations between instances of classes.

Instance Selection. The action language should support the ability to produce collections of instances based on complex selection criteria and traversal over associations.

The action language should support the ability to define selection criteria in terms of attributes*, and/or in terms of associative relationship participation with source object instance(s). The action language should support the ability to specify ordering criteria (e.g., ascending with respect to property X or descending with respect to property Y). Object instance selection should be separate from physical data organization and access.

Collection Operators. We define a collection to be a group of similar object instances, or of similar data values about object instances. Such a collection is created as a result of a selection identifying some number of object instances, or data values from instances. The collection may be a bag or a set. The action language should provide for the creation of a collection in a specified order. The action language should support the ability to manipulate collections in appropriate ways, including, for example: union and intersection for unique-membership collections (sets), and sum and difference for non-unique-membership collections. The action language should support the ability to apply a specified action to all members of some collection. For ordered collections, the order of repetition shall be pre-defined by the collection’s ordering. For unordered collections, the order of application shall be undefined.

Attributes. The action language should support the ability to access the attributes of selected object instances for both read and write of data. Since the action language must manipulate collections of object instances as a unit, this implies that data access must also act on collections of values. The action language should support the ability to access several data elements of an object instance in a single action language clause.

Compounds. The action language shall support manipulation of groups of data values or of object instances. This manipulation should include the ability to read and write dissimilar elements as a unit, and the ability to define computations that accept compounds and produce compounds. (For example, a computation SquareRoot may be defined to produce a compound of the two roots of a positive number.) The action language should support the ability to read, write, and manipulate collections of compounds.

Flows. The action language should support a first-class mechanism to maintain a clean separation between data access and computation. The problem statement calls for a clean separation between data access and computation. Therefore, a computation should not access any data except that which is passed to it, and a data access operation may not incorporate any implicit computation. As a consequence, only (sets/bags of) data values must be passed to and from computations. Further, the action language must support a mechanism to refer to the locations of the data values so that new values can be written back.

* Defined broadly to include attribute values, state chart state, or operation return values of the target class.

Sequence. The action language must enable minimal specification of the order of execution within a single action.

Assertions. The action language must enable specifications of assertions such as pre-conditions, post-conditions, and consistency for both classes and states.

State Chart. The action language must support UML state chart specifications, including event conditions and guard conditions.

Signals. The action language must support specification of signal timing, information accompanying a signal, and association traversal paths for signals.

Computations. The action language must support specification of side-effect free functions which operate on instances, collections (ordered or not), and on all members of a collection as a unit. Such computations may act on data values to produce either data values, or values used to direct the sequence of further processing.
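The collection operators named in the requirements above have close analogues in Python's standard library, which gives a rough feel for the intended semantics. This is only an analogy under the assumption that sets model unique-membership collections and `collections.Counter` models bags; the sample data is invented for the sketch.

```python
from collections import Counter

# Unique-membership collections (sets): union and intersection.
premium = {"ann", "bob"}
overdrawn = {"bob", "cy"}
either = premium | overdrawn
both = premium & overdrawn

# Non-unique-membership collections (bags): sum and difference.
morning = Counter({"deposit": 2, "withdrawal": 1})
evening = Counter({"deposit": 1, "withdrawal": 3})
total = morning + evening
remaining = total - morning

# Applying one action to every member of an ordered collection:
# the order of application follows the collection's ordering.
weights = sorted([12.0, 7.5, 30.0])
doubled = [w * 2 for w in weights]
```

For an unordered collection such as `either`, iteration order carries no meaning, which matches the requirement that the order of application be undefined.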

5 Examples

It is beyond the scope of this paper to provide a specification for an action language. However, some examples in an action language that meets the requirements described above can be useful in understanding the issues. Action languages based on classical third generation languages (such as C and C++) leave much to be desired. The fundamental problems are that they (1) are too low-level and (2) provide too much power and choice. On one hand, they require the analyst to over-specify some aspects of an action (for example, statements in these languages are generally executed sequentially); on the other, constructs such as for loops, if statements and switch statements embed computational code within control structures, inhibiting reorganization of the model into an efficient implementation. Consider, for example, the following third-generation style action language:

    /* In the context of a DogOwner named myDogOwnerID, find all the owned dogs */
    Select many dog from instances of object Dog 'owned by' myDogOwnerID;
    For each dogID in dog
        // Generate a UML signal to the StateChart for each Dog
        Signal D1: ComeToDinner() to dogID;
        dogID(Weight) := ProportionalIncrement(dogID(Weight));
    End for;
    If myDogOwnerID.FoodLeft < SafeAmount then
        Signal D04: BuyMoreFood to myDogOwnerID;
    End if;

Some of the problems here are:

Because of the sequential nature of the language, control has been over-specified. There is no apparent reason why the if clause should be after (or before) the for loop.

The for loop is a general structure into which we could place any number of statements. In the example, two unrelated analysis ideas have been related by being placed in the for loop. This obscures the fact that such statements may, in fact, constitute reusable processes. The if statement has the same problem: unrelated analysis thoughts can be placed in the body of the statement.

By examining other code fragments one can find additional problems. Perhaps the best summary conclusion is this: Textual action languages based on third-generation languages tend to produce only a thinly disguised form of the implementation, and do not provide the level of abstraction needed for clear, complete analysis and intelligent code generation.

One alternative approach is to base an action language on data and object flow. This approach has the advantage that data access and computation are completely separated, and therefore that computational code is not embedded in control structures. This approach matches, to some extent, both the Activity Diagram defined as a part of the UML, and the Shlaer-Mellor approach of constructing a data flow diagram for each action.

Here are some general properties of the example action language:

Each chain of computations, connected by object flow, is a statement. Each statement is like a chain of pipes and filters.

Within each statement, data is viewed as being active and flowing. Data is assumed to be flowing in collections, and no distinction is made between collections and single data values.

The details of the computations are defined separately from the body of the action. We do not provide here any examples of the language required for this. It could be in pre- and post-conditions, an imperative language, or an existing programming language constrained to prohibit data access.

Execution proceeds in parallel² for all statements, except where constrained by data or control flow. Consequently, if two statements write the same variable and their order of execution has not been constrained, it is indeterminate which value will be used in subsequent processing.

Guards can be set and then used to initiate (or not initiate) the execution of a chain of computations.

Execution of the action terminates when the last statement that can execute has completed.

We can now re-write the dog-feeding example in the example action language:

/* In the context of a DogOwner named myDogOwnerID, find all the owned dogs, pipe that collection of Dogs to a process that generates a signal to each */
myDogOwner.OwnedBy | Signal D1: ComeToDinner;

/* In the context of a DogOwner, get the weight of each, pipe that to a process that computes the new weight for each member of the collection and write it back using the '>' operator. The 'Dog( )' refers to the set of Dogs found by the expression 'myDogOwner.OwnedBy'. That expression is then dereferenced by '.Weight'. The computation ProportionalIncrement is defined separately. */
myDogOwner.OwnedBy.Weight | ProportionalIncrement > Dog( ).Weight;

(Footnote 2) The statements proceed "in parallel" from the analyst's perspective. The architecture can serialize the statements as desired for optimization or other purposes.

/* Get the FoodLeft attribute of the owner, and the variable SafeAmount, pipe the pair to a test process, TestLess, which has two guard conditions */
(myDogOwner.FoodLeft, -SafeAmount) | TestLess? !Less, !GreaterEqual;

/* If the 'Less' guard is set, pipe the DogOwner object instance to a process that generates a signal */
!Less: myDogOwner | Signal D04: BuyMoreFood;

6 A proof of concept

We will illustrate the potential benefits of an action language through the example of SDL and MSC [4, 5], languages defined by the ITU, widely used in the telecom industry and also used for the development of embedded systems. The approach described here has been used successfully in ObjectGEODE [2] for the design, validation and fully automatic generation of code running in cellular phones, satellites and smart cards. SDL is similar to UML in that it describes a system in terms of interacting processes owning state machines and exchanging messages. The parallel could be pushed much further, but the aim of this section is simply to show what has been made possible by the use of a language with precise execution semantics.

6.1 100% Code Generation

Although SDL became a formal language only in 1988, code generators were developed before that, often by users themselves. Of course, there are code generators for UML, too. How do SDL and UML code generators differ?

A first major difference is that you can generate 100% of the code from your standard SDL (though you do not have to) [3]. All SDL code generators allow you to combine SDL and user code, but this is mainly used as a way to connect the generated code with legacy code or libraries, or to use code that cannot be generated from SDL (a man-machine interface, access to a database, etc.), since SDL is dedicated to the real-time aspects of systems only. In other words, users do not write code simply to fill the holes left by SDL here and there; instead, they isolate whole, meaningful subsystems to be entirely described in SDL. This is not only a productive way to work, but also a safe one, since the generated code will behave consistently with what you have simulated before, as described further in this paper.

A second major difference is that you do not need to rewrite all actions (in transitions or in the body of operations) simply because you have changed tools. You can even change programming language without changing the design. By contrast, since UML does not even support assignments, all manipulations of even a simple association have to be mapped in a proprietary, tool-dependent fashion (perhaps manually) to a programming language and/or library. As a consequence, you must rewrite all actions if you ever change from one UML tool to another. This is not the case with SDL.

6.2 Early Simulation

SDL allows the representation of informal actions, decisions and operators. This means that you can write in an abstract way what can happen, using semi-natural language, without losing the ability to simulate or to generate a prototype. The actual behavior of informal parts during simulation can be driven in various ways: interactively at run time, through lists of predefined values, randomly, with probabilities, and so on. The key points are that you can simulate or generate code without having to go into the details of implementation, and that this possibility is an integral part of SDL, not a vendor-specific extension. Simulation is not only about executing actions, but also about simulating performance, failure, etc. For example, ObjectGEODE uses SDL formal comments (similar to UML's tagged values) to tell how much time an action takes, and can use these values to compute the throughput of the system, or the average interval between two events. An action language is necessary, because there must be something to tag with a duration or an MTBF: an action.
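The idea of tagging actions with durations can be illustrated with a small Python sketch. It does not reproduce ObjectGEODE or SDL formal comments; the action names, the millisecond values and the assumption that one transaction is processed at a time, with its actions executed in sequence, are all hypothetical.

```python
# Hypothetical duration tags, in milliseconds, one per action.
DURATIONS_MS = {"ReadSensor": 2.0, "ComputeRoute": 5.0, "SendCommand": 3.0}


def transaction_time_ms(actions, durations=DURATIONS_MS):
    # Time to execute the tagged actions one after another.
    return sum(durations[action] for action in actions)


def throughput_per_second(actions, durations=DURATIONS_MS):
    # Transactions completed per second if they are processed back to back.
    return 1000.0 / transaction_time_ms(actions, durations)
```

Without actions there would be nothing to attach the durations to, which is the point made in the text.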

6.3 Automatic Test Generation, Exhaustive Verification and more

Since SDL has formal and executable semantics, vendors have been able to develop very powerful services around it, unparalleled for UML today [2]. Some examples:

• verifying that a given sequence of events can or cannot happen, or that a given scenario can still be "played" by the system (non-regression testing, compliance with requirements);
• verifying that a given property, possibly distributed over several objects and nodes, is always true and, if not, automatically generating a sequence that "breaks the law" (exhaustive, aggressive testing);
• causing failures and alterations automatically (fault tolerance).
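The second kind of service, finding a sequence that "breaks the law", can be sketched as a breadth-first search over the states of an executable model. This is a generic illustration, not the algorithm of any SDL tool; the function name and the toy transition table are assumptions.

```python
from collections import deque


def find_violation(initial, transitions, holds):
    """Return the shortest event sequence leading to a state where
    holds(state) is False, or None if every reachable state satisfies it."""
    queue = deque([(initial, [])])
    seen = {initial}
    while queue:
        state, path = queue.popleft()
        if not holds(state):
            return path
        for event, target in transitions.get(state, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, path + [event]))
    return None


# Hypothetical toy model: state -> list of (event, next state).
TOY_MODEL = {
    "Idle": [("start", "Running")],
    "Running": [("stop", "Idle"), ("overheat", "Failed")],
    "Failed": [],
}
```

For the toy model, asking whether the state "Failed" is never reached yields the counterexample start, overheat. Such a search is only possible because each transition of the model has a precise, executable meaning.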

An Action Language for UML


This cannot be done with today's UML because many UML concepts have to be implemented before you can see something running, and because it is not possible, for theoretical reasons, to develop such services on top of programming languages.

6.4 Combining Tools From Several Vendors

Of course, a vendor could extend UML to make all the features previously described available for UML. The user would then be trapped: because these extensions are not part of the standard, any UML actions written with tool A would be impossible to use with tool B (even when the Stream-based Model Interchange Format is available). SDL users can choose one tool for modeling and simulating, and another for code generation. This is often the case, since many of our customers have developed their own code generators for specific targets. Still, they use our simulator, because there is one language and one interpretation of it. We are convinced that extending UML would bring a lot to the UML community, and is a key to making better systems. It is interesting to note that SDL's language is not a complex one (you cannot write complex algorithms in SDL, for example), but that, under pressure from major users, the ITU is currently making extensions to make it much more powerful.

7 Issues To Be Addressed By An Action Language Specification

There are a number of semantic issues that an action language must resolve.

• The action language specification should describe how the action language treats composition (i.e. if a 'whole' is deleted by an action language statement, what does the action language do with the 'parts'?).
• The action language specification should describe how the action language treats multiplicity and conditionality (i.e. if an object instance is deleted by an action language statement contrary to the multiplicity and conditionality specified on an association, what is the responsibility of the action language?).
• The action language specification should define precisely the semantics of an object instance's lifetime. Is the user responsible for deleting object instances to maintain consistency of the association?
• The action language specification should indicate if and how an action can invoke an operation. Although the action language may be used to define the semantics of an operation, it is far from clear what the semantics of invoking a non-data-access operation of a target class are in the context of the run-to-completion semantics of the state chart.
• The action language specification should indicate which elements of the action language can be extended using stereotypes or tagged values.
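The composition question in the first item can be made concrete by contrasting two candidate answers. The Python sketch below is purely illustrative: the class names and the two policies are hypothetical, and the text above does not say which, if either, an action language should adopt.

```python
class Part:
    def __init__(self):
        self.deleted = False


class Whole:
    def __init__(self, parts):
        self.parts = list(parts)
        self.deleted = False


def delete_cascade(whole):
    # Candidate policy A: deleting the whole also deletes its parts.
    for part in whole.parts:
        part.deleted = True
    whole.parts.clear()
    whole.deleted = True


def delete_restrict(whole):
    # Candidate policy B: refuse to delete a whole that still owns parts.
    if whole.parts:
        raise ValueError("whole still owns parts")
    whole.deleted = True
```

Whichever policy is chosen, the specification has to state it, since the same delete statement leaves the model in different states under each one.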


8 Conclusion

We believe that complementing the UML with a compatible, software-platform-independent, executable action language that enables mapping into efficient code brings undoubted benefits to the community. This paper has explored the requirements on such an action language and the rationale for the benefits. The next step is to gather support for the action language effort. Since the UML is an adopted technology of the Object Management Group, the action language is best promoted by that same group. We encourage and solicit support for this effort. This can be done in several ways:

• by giving us your ideas on action languages and their requirements;
• by participating in the OMG Request For Proposal process as submitters or reviewers;
• by pestering your favorite tool vendor to participate in the RFP; and
• by pestering your favorite tool vendor to offer the Action Language in their tool set.

Acknowledgements

Johannes, Tom, Sally

References

1. Object Constraint Language Specification 1.1, OMG Document ad/97-08-08 (OCL reference)
2. Philippe Leblanc: OMT and SDL based Techniques and Tools for Design, Simulation and Test Production of Distributed Systems. In: STTT, International Journal on Software Tools for Technology Transfer, Springer, Dec. 1997, http://link.springer.de or http://eti.cs.uni-dortmund.de
3. Vincent Encontre: How to use modeling to implement verifiable, scaleable and efficient real-time application programs. In: Real-Time Engineering Journal, Vol. 4, No. 3, Fall 1997
4. CCITT, Recommendation Z.100, Specification and Description Language SDL, 1993
5. ITU-T, Recommendation Z.120, Message Sequence Chart (MSC), 1993

Real-Time Modeling with UML: The ACCORD Approach

Agnès Lanusse (1), Sébastien Gérard (2), François Terrier (1)

(1) LETI (CEA - Technologies Avancées) DEIN - CEA/Saclay, F-91191 Gif sur Yvette Cedex, France. Phone: +33 1 69 08 62 59, Fax: +33 1 69 08 83 95. [email protected] [email protected]
(2) PSA - Peugeot Citroën / Direction des Technologies de l'Information et de l'Informatique, 18, rue des Fauvelles, 92250 La Garenne Colombes, France. [email protected]

Abstract. Adopting object oriented modeling in the real-time domain appears to be essential in order to cope with rapidly changing market conditions. In the past, the main obstacles have been the lack of standards and a poor fit with real-time needs. With the standardization of UML 1.1, the first obstacle is being removed. The challenge is then to handle these notations properly and to provide a clean, general framework for real-time design that remains object oriented. This goal may be reached through different approaches; the most common one consists in defining new versions of UML dedicated to real-time. The ACCORD approach, described in this paper, advocates the idea that only very few adaptations, centered on a small subset of concepts, are necessary to make UML a good framework for real-time design. We show how this can be done thanks to the real-time active object concept, which encapsulates concurrency control and the handling of temporal constraints.

Keywords: Real-time, UML, Concurrent programming, Active Object.

J. Bézivin and P.-A. Muller (Eds.): «UML» ’98, LNCS 1618, pp. 319–335, 1999. © Springer-Verlag Berlin Heidelberg 1999

1. Introduction

Classical real-time development of software systems is reaching its limits in a world where target hardware cannot be known in advance, versions evolve increasingly fast, and time to market must be shortened drastically in order to meet economic requirements. Reusability and evolvability become even more important in this domain than in other software fields. In such a context, real-time systems development cannot be achieved efficiently without strong methodological support and accompanying tools. In parallel, a consensus has been reached that object oriented techniques succeed in providing the required flexibility. Up to now, however, the real-time community has been reluctant to cross the Rubicon, for two main reasons:
• the object oriented offering was not mature enough to provide stable solutions (methods, tools, ...),
• real-time specifics were generally not well covered by the methods.

With the standardization of the UML notation, the signal that many tool vendors were waiting for has appeared; a first step has been achieved that will permit the spread of a new generation of tools. But the main question remains the adequacy of object oriented methods for real-time specifics. In the past years, a number of solutions have been investigated; they have resulted in commercially available tools (and methods) such as Stood (HOOD), ObjectGeode [1, 8] or SDT, ObjecTime (ROOM [10]), and Rhapsody [4]. Although a great effort has been made to provide a good compromise between task oriented modeling and object oriented modeling, these proposals reflect a great diversity in the methodological choices offered to handle real-time and object orientation. In particular, they often propose specialized models or paradigms to model the real-time point of view of an application, which makes communication and interaction between ordinary object oriented modeling and real-time modeling difficult. They generally maintain two separate conceptual models: the object model and the task model. This requires high-level skills from the designers to keep these two views consistent throughout the development process.

In this paper we claim that real-time development can be fully object oriented and handled quite easily with classical object oriented approaches, in the same way and with the same concepts and notations as any ordinary software, for most development steps. This can be achieved by providing high-level abstractions for communication and concurrency management.
Thanks to the real-time object paradigm, these matters can be handled transparently by the underlying object system; a real-time application can then be described simply in terms of communicating objects, possibly with constraints attached to requests. The development process can stay quite close to classic object oriented ones, and most classic tools can be used, provided they offer meta-model facilities compatible with the UML extension package. Of course, this implies that the execution environment provides specific active object patterns that support the semantics defined for real-time active objects. The ACCORD execution environment provides libraries that support these paradigms, but the numerous design patterns proposed in the literature in recent years can also be used. In the following we present the ACCORD method and the UML extensions that permit its integration within the Objecteering CASE tool.

2. ACCORD Overview

The ACCORD method provides an object oriented framework for real-time design and development that is as close as possible to classic object oriented environments [13, 14]. The idea is to make a real-time object oriented application look as much as possible like a classical object oriented application, thanks to high-level abstractions (namely the real-time active object concept) that handle concurrency control, communications, scheduling and task management transparently. The original motivation is to give real-time developers a way to use almost classic object oriented design and development techniques instead of proposing yet another specialized method. The ACCORD method is an extension of the CLASS-RELATION method [2, 3]. It is supported by the Objecteering CASE tool and uses Hypergenericity to specialize the UML meta-model. This meta-model is not yet finalized, however, since possible real-time extensions to UML are still under discussion within the OMG working group and might bring new description facilities.

The ACCORD development process quite classically follows three main stages: analysis, design, and implementation (including deployment). UML notations and diagrams are used throughout this process [5]. In the early stages, almost no modification is made to the UML standard, since the ACCORD approach consists in promoting classical object oriented analysis and design. In a specific real-time design stage, ACCORD extensions to UML are provided and design rules are added. Some diagrams are specialized in order to provide better visibility of the real-time characteristics of the application. In the next section, we briefly present each main step of the process; a specific section then develops the real-time design stage, and finally the ACCORD meta-model is briefly presented in the last section.

2.1 Development Process

The ACCORD development process is incremental and iterative. Its backbone consists of three models: one static model, the structural model, and two complementary dynamic models, the interaction model and the behavior model. Throughout the development process these three models are concurrently refined and specialized in a consistent manner.

(Footnote) Class-Relation is a method proposed by Ph. Desfray from SOFTEAM.


[Diagram: the stages Analysis, Object Design, Real-time Design and Implementation, each shown with its STRUCTURE, BEHAVIORS and INTERACTIONS models over the example classes Train, Circuit and Control.]

Figure 1: ACCORD Models and Process

The structural model describes the classes involved in the application and their dependencies. It is described with UML Class diagrams. The interaction model describes the possible interactions between objects. It is described with UML Use Cases and Sequence diagrams. The behavior model describes, for each class or operation, its possible behaviors, characterized by states and the possible transitions in each state. It is described with UML Statechart diagrams.

2.2 Analysis

The analysis stage is fully standard. A Use Case analysis is conducted in order to identify interactions between the system and its environment, represented by actors. Sequence diagrams are then built; they make it possible to identify both the main classes of the application and their interactions (with actors and with each other) in the Class diagrams. On this basis, a first version of the structural model is built. In parallel, an analysis of the vocabulary used in the requirements documents helps to refine, in the Class diagrams, the classes and operations resulting from the Use Case analysis. One specificity of ACCORD is its ability to specify behavioral information very early in the process. Temporal information can be captured in the interaction model by the Sequence diagrams, and object behavior can be specified at the class and operation levels in the behavior model by Statechart diagrams.


[Diagram: Requirements feed Use Cases and a Dictionary for the System (Preliminary Analysis), leading to Class Diagrams, Sequence Diagrams and State Charts for the classes Train, Circuit and Control (Detailed Analysis).]

Figure 2: Analysis Step

2.3 Design

The design stage is decomposed into an object design step and a real-time design step.

2.3.1 Object Design

The object design again follows a pure object oriented design style. The idea is to define the full logical design of the application through iterative steps. It requires UML Class diagrams, Sequence diagrams and Statechart diagrams for developing the three facets of the ACCORD modeling strategy. At this stage, communications between classes are not yet specialized into signals or operations, and communication modes are not yet decided. This model is actually a common platform for various possible real-time design models. The idea behind it is to postpone as long as possible any design decision that might reduce reusability. Class definitions from the previous stage are refined and new classes are added. Sequence diagrams are completed in order to take the new classes into account. Communication specifications extracted from the Sequence diagrams are used to determine potential operations attached to classes. Statechart diagrams are defined for the new classes and the previous ones are enriched. Consistency checking is performed in order to keep the three design models consistent with each other and with the three analysis models.

2.3.2 Real-Time Design

The real-time design stage is devoted to the real-time specialization of the object design. During this stage, communications are specialized (signals / operations, synchronous / asynchronous) and possible concurrent objects are identified. Time constraints are attached to requests and real-time behaviors are detailed (triggering conditions on statecharts, periodic operations, ...). It is this stage that requires a specialization of UML in order to support the ACCORD methodology.


2.4 Implementation

System design and implementation stages are greatly facilitated by the use of the real-time active object paradigm [12] and the ACCORD libraries that support it, defined as an object oriented Virtual Machine. Most implementation issues can be automated thanks to the high-level code generation facilities offered by the Objecteering tool and its Hypergenericity component [2], and to the ACCORD execution environment, which provides specific components for task creation and management, communication mechanisms, synchronization, protection of shared resources, scheduling and so on [11], [6].

The implementation stage in ACCORD covers instantiation, deployment and code generation. Instantiation concerns first the identification of objects from the structural model and the instantiation of the various temporal constraints. It also concerns the definition of specific scripts such as application initialization, termination, and/or reconfiguration. The deployment process comprises three main steps:

• Partitioning determines the number of nodes and the allocation of objects to nodes.
• Implementation tailoring is a very important step related to optimization issues, such as determining whether an object defined as possibly concurrent will eventually be attached to several threads, to a single thread, or will use the threads of calling objects.
• Actual implementation is finally obtained by automatic code generation on a given Virtual Machine instance.

We provide solutions for multi-threaded implementations; other schemes optimized for embedded systems are under development.

[Diagram labels: ACCORD Method; Objecteering (Class-Relation, UML); Requirements; Modeling; Model of the application; Code generation; ACCORD coding rules; Source C++ of the application; Compilation, Linking; ACCORD kernel; Runtime of the application; ACCORD virtual machine on VxWorks, Solaris 2.5 and Windows-NT SoftKernel.]

Figure 3: ACCORD development environment.


3. Real-Time Design

ACCORD was initially a specialization of the CLASS-RELATION method for real-time, which means that CLASS-RELATION can also be used directly for the first two steps: analysis and object design. Thanks to the Hypergenericity technique, which can be used to extend the meta-model and adapt code generation accordingly, very little work is required to provide support for ACCORD, either for the CLASS-RELATION method or for a UML-based one. This very powerful technique had already been used in a previous version of ACCORD, which used the CLASS-RELATION notation. We are currently porting it in order to be UML compatible. The main ACCORD specialization of classical object oriented methods concerns a specific design stage named the real-time design, which is described below. This is possible thanks to a specialization of the UML meta-model specific to this stage, which is presented in Section 4. Real-time characteristics are obtained from the object design model mainly through specialization of communications and classes. We will thus consider in sequence the communications, the identification of the possible sources of parallelism that will determine real-time objects, the refinement of their behaviors, the refinement of temporal issues (deadlines on requests, periodicity of operations, ready times, watchdogs, ...) and the refinement of operation descriptions through State Machines.

3.1 Communications Specialization

The goal of this step is twofold:
• Identification of signals and operations, represented up to now as messages in the previous steps.
• Identification of communication modes (asynchronous / synchronous). One part of the problem is already solved by the previous point, since signals are asynchronous in UML.
As a consequence of the specialization of communications in the Sequence diagrams, several updates occur both in the Class diagrams and in the Behavior diagrams attached to classes. In Sequence diagrams, requests (messages) are specialized into signals or operations.


[Diagram: two sequence diagrams, before and after communication specialization, each involving :Sensor, :ControlSystem, :Switch and :PowerSystem, with the messages TrainDetected, CheckAhead, SwitchToPosition and PowerOnCircuit exchanged between the time marks t0 and t1.]

Figure 4: Sequence diagram refinement with communication specialization. The diagram carries the timing constraint t1 - t0 < 110 ms.

In Class diagrams, the list of operations attached to classes is updated and the sensitivity of the class to signals is specified. A specific iconic form has been defined in order to make them appear clearly in the diagram. In Class statechart diagrams, the class State Machines impacted by the change of requests into signals are updated, and the trigger conditions of transitions are completed.

3.2 Concurrency Specification

The goal of this step is to identify the possible sources of parallelism in the application and to provide object oriented means to support and handle this parallelism. For that purpose we rely on the active object paradigm, which is specialized in ACCORD to handle real-time constraints. We proceed as follows.

3.2.1 Real-Time Objects Identification

Once the communication analysis is done, a certain number of asynchronous requests have been specified. This reveals an implicit potential parallelism in the application. The most natural way of handling such parallelism between classes, and later on between objects, is to introduce the concept of concurrent object, that is, an object able to manage its own computational resources. Such an object can thus process requests concurrently with other objects. Concurrent objects can be considered as servers offering various services provided by operations. The concept of active object exists in UML; we make it more precise in the ACCORD meta-model. In ACCORD, intra-object parallelism is possible, and specific concurrency control mechanisms are provided so that data protection can be performed automatically (Section 4). At this level, concurrency is an abstraction used to make the application look simpler. The real implementation may differ depending on the code generation options chosen. We have developed several implementation schemes, which are not described in this paper since we focus on the design methodology. The concurrency analysis results in the identification of active objects, called real-time objects in ACCORD and stereotyped with « RealTimeObjects ».
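As a rough illustration of an object that manages its own computational resources, the Python sketch below gives each active object a thread and a mailbox of asynchronous requests. It is not the ACCORD kernel: the class and method names are hypothetical, and for simplicity the sketch serializes all requests inside one object, whereas ACCORD also permits intra-object parallelism.

```python
import queue
import threading


class ActiveObject:
    """An object that processes asynchronous requests on its own thread."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, operation, *args):
        # Asynchronous request: enqueue it and return immediately.
        self._mailbox.put((operation, args))

    def stop(self):
        # Ask the worker to finish the pending requests, then wait for it.
        self._mailbox.put(None)
        self._thread.join()

    def _run(self):
        while True:
            request = self._mailbox.get()
            if request is None:
                break
            operation, args = request
            getattr(self, operation)(*args)


class Switch(ActiveObject):
    # Hypothetical server object offering one service.
    def __init__(self):
        self.position = None
        super().__init__()

    def switch_to_position(self, position):
        self.position = position
```

A caller such as a control system would write `switch.send("switch_to_position", "left")` and continue without waiting, which is the implicit parallelism that asynchronous requests reveal.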


3.2.2 Operations Access Modes Specification

For each active object, a specific presentation has been defined in ACCORD to group Readers and Writers operations. The specification of operation access modes uses specific tagged values associated with operations. Readers operations can be executed in parallel with other Readers operations within a real-time object, while Writers operations are always serialized with both Readers and Writers. Other operations, which access the object neither in write mode nor in read mode, can be executed concurrently with any other operation within the same real-time object.

3.2.3 Resource Sharing Identification (Protected Passive Objects)

Some objects, typically data objects, may be associated through several links with other objects (active or passive). In order to facilitate the sharing of such objects by several active objects, a new class stereotype has been introduced: « ProtectedPassiveObjects ». Classes with this stereotype automatically ensure data access protection thanks to a concurrency control mechanism. During this step we identify such classes. Finally, all real-time objects and protected passive objects are updated in the Class diagram.
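The Readers and Writers access modes correspond to the classical readers-writer discipline, which can be sketched in Python as follows. This is a generic illustration, not the concurrency control mechanism of ACCORD; the class name is hypothetical and this simple variant lets a steady stream of readers delay a writer.

```python
import threading


class ReadWriteLock:
    """Many concurrent readers, or one writer, never both."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writing = False

    def acquire_read(self):
        with self._cond:
            while self._writing:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            # A writer is serialized with readers and with other writers.
            while self._writing or self._readers > 0:
                self._cond.wait()
            self._writing = True

    def release_write(self):
        with self._cond:
            self._writing = False
            self._cond.notify_all()
```

A generated Readers operation would be bracketed by acquire_read and release_read, and a Writers operation by acquire_write and release_write; operations tagged as neither would take no lock at all.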

[Diagram: class DisplayPannel, stereotyped «RealTimeObjects». Signals handled by the class: OnOff, EltTriggered, EltNewState, TrainDetected, TrainBreakdown. «PublicWriter» operations: ResetDisplay(), InitDisplay(). «PublicReader» operations: DisplaySwitchState(eltId, state), DisplayTrainPosition(eltId, pos), DisplayTrainBreakdown. A 1-1 association links it to the «ProtectedPassiveObjects» class PAPort.]

Figure 5: Class diagrams specialization

3.3 Behaviors Refinement

Once each class has been properly defined, an iteration is performed on the Behavior models in order to complete them. The Class statechart diagrams used are restrictions of UML statecharts. The action part of a transition label is systematically restricted to an operation name (no decomposition is allowed at this level). Operations themselves are in turn described through Operation statechart diagrams, where transitions represent individual actions whose type is one of those specified by UML.

3.3.1 Class Statechart Diagrams

In ACCORD, a Class statechart diagram describes the possible states and transitions associated with a class. In a given state, a class may execute an operation depending on occurring events or satisfied conditions. The run-to-completion (RTC) semantics of UML statecharts is observed. The action performed on a transition is the activation of the method associated with the operation specified in the statechart. Its execution runs to completion. During the real-time design step, triggering conditions are systematically specified. In particular, each incoming signal is associated with an operation name that determines the action to be performed on arrival of this signal when an object that is an instance of this class is in that particular state.

A few additions have been made to the UML notation in order to specify exception situations in which signals or operation calls are not supposed to activate operations in particular states. The only possible action with UML is to declare the corresponding triggering events in a deferredEvent list associated with the state; otherwise they are lost (this is also the case in SDL and in classical statecharts [7]). One might want to specify explicitly what to do with these unexpected events. We have introduced three possible actions, namely ignore, reject and defer. The defer option corresponds to the UML defer; ignore explicitly declares that the message will be discarded if received while the object is in that state; finally, the reject option discards the message and produces an exception (an error signal).

EltTriggered / DisplaySwitchState{dl=300}

OnOff / InitDisplay {deadline =50}

ON

OFF delete

OnOff / ResetDisplay

TrainDetected/DisplayTrainPosition {deadline=300} EltNewState/DisplaySwitchState {deadline=300}

TrainBreakdown / DisplayTrainBreakdown{deadline=300}

Figure 6: Class statechart diagram Moreover, the Objecteering CASE tool provides high level masking facilities, that is used in ACCORD to offer the possibility of getting a simplified view of a statechart if wanted. The simplified statechart shows only the possible states and transitions but do not display triggering conditions. Though it is not an obligation for the designer, having two views of the same statechart provides certain advantages. In particular, it permits to separate the generic behavior of the class that can be reused in an other application from a specific one depending on specific signals, present in the current application. This way, a distinction is made between the fundamental behavior of a class that actually represents potential behaviors, and the actual behavior that is the set of reachable states and fireable transitions within the particular application that depends on the arrival of a possibly restricted set of signals.
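The ignore, reject and defer options described above for Class statechart diagrams can be sketched in a few lines of Python. This is only an illustrative sketch, not the ACCORD implementation: the class names `State`, `ClassStateMachine` and `RejectedEvent` are hypothetical, and the choice of which events are deferred or rejected in the OFF state is an assumption made for the example (the event and operation names are borrowed from Figure 6). Events listed nowhere in a state are ignored.

```python
from collections import deque
from dataclasses import dataclass, field


class RejectedEvent(Exception):
    """Raised when an event arrives in a state that lists it as rejected."""


@dataclass
class State:
    name: str
    # event name -> (operation name, target state name)
    transitions: dict = field(default_factory=dict)
    deferred: set = field(default_factory=set)   # kept for a later state
    rejected: set = field(default_factory=set)   # produce an error signal
    # Events that appear in none of the three collections are ignored.


class ClassStateMachine:
    def __init__(self, states, initial):
        self.states = {s.name: s for s in states}
        self.current = initial
        self.pending = deque()   # deferred events, in arrival order
        self.log = []            # names of the operations activated so far

    def dispatch(self, event):
        state = self.states[self.current]
        if event in state.transitions:
            operation, target = state.transitions[event]
            self.log.append(operation)       # the operation runs to completion
            changed = target != self.current
            self.current = target
            if changed:
                self._replay_deferred()
        elif event in state.deferred:
            self.pending.append(event)
        elif event in state.rejected:
            raise RejectedEvent(f"{event} rejected in state {state.name}")
        # Otherwise the event is ignored: it is simply discarded.

    def _replay_deferred(self):
        queued, self.pending = self.pending, deque()
        for event in queued:
            self.dispatch(event)


def build_display_machine():
    # Assumed policy for OFF: defer TrainDetected, reject TrainBreakdown.
    off = State("OFF",
                transitions={"OnOff": ("InitDisplay", "ON")},
                deferred={"TrainDetected"},
                rejected={"TrainBreakdown"})
    on = State("ON",
               transitions={"OnOff": ("ResetDisplay", "OFF"),
                            "TrainDetected": ("DisplayTrainPosition", "ON")})
    return ClassStateMachine([off, on], initial="OFF")
```

With this sketch, a TrainDetected signal received in OFF is kept and replayed once the object reaches ON, whereas an event that OFF does not mention at all is silently dropped.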

Real-Time Modeling with UML: The ACCORD Approach


3.3.2 Operation Statechart Diagrams

Operations are in turn described through statecharts that decompose each of them into sequences of UML elementary actions (SendAction, CallAction, InvokeAction, ...), following the action sequences defined in the UML statechart package.

3.4 Time Constraints Specifications

Throughout the development process, time constraints are expressed in the Sequence diagrams, but it is during the real-time design stage that real-time constraints are systematically completed and checked. Real-time constraints may take different forms. They are generally attached to requests, in which case they specify a deadline by which the invoked operation must complete its execution. A ready time may also be specified. These constraints are taken into account at runtime by the scheduler component of the ACCORD execution environment.

Other temporal constraints concern the definition of periodic operations. In ACCORD they are specified in the Class statechart diagrams as cyclic transitions. This specification is taken into account during code generation in order to instantiate specific mechanisms devoted to the periodic processing of operations.
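As an illustration of how a deadline and a ready time attached to a request might be used at runtime, here is a minimal sketch. The text does not say which policy the ACCORD scheduler applies; the sketch assumes earliest-deadline-first among the requests whose ready time has passed, and the names `Request`, `pick_next` and `periodic_requests` are hypothetical. Times are plain integers in an arbitrary unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    operation: str
    deadline: int        # absolute time by which the operation must complete
    ready_time: int = 0  # earliest time at which the operation may start


def pick_next(requests, now):
    """Return the ready request with the earliest deadline, or None.

    Earliest-deadline-first is an assumption of this sketch, not a
    documented property of the ACCORD scheduler.
    """
    ready = [r for r in requests if r.ready_time <= now]
    return min(ready, key=lambda r: r.deadline, default=None)


def periodic_requests(operation, period, relative_deadline, horizon):
    """Generate one request per period for a periodic operation.

    Each release becomes ready at a multiple of the period and must
    complete within relative_deadline of its release.
    """
    release = 0
    while release < horizon:
        yield Request(operation,
                      deadline=release + relative_deadline,
                      ready_time=release)
        release += period
```

A periodic operation is modeled here simply as a stream of ordinary requests, which keeps a single selection rule for both kinds of constraint.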

4. Extensions to UML

As introduced above, the ACCORD specialization of UML notations is mostly restricted to the real-time design step of the development process, except for a small restriction on the use of statecharts during the analysis and object design steps. In the next figure we illustrate the main specialized diagrams introduced by ACCORD. We do not detail the entire process here but stress the most significant extensions defined with respect to real-time concerns. Their envisaged mappings are briefly described and discussed. They are not yet fully stabilized and some questions are still open. We believe that the UML'98 workshop is a good opportunity to discuss them.


[Figure 7: ACCORD extensions within the development process. The diagram shows the three stages (Analysis, Object Design, Real-time Design) for the Train Circuit Control example, with the labels: UML Use Cases + Interaction Diagrams; UML Class Diagrams; UML Interaction Diagrams; ACCORD Classes StateCharts; ACCORD RTClassDiagram; ACCORD Operations State Machines.]

This figure contrasts the use of standard UML notations with the ACCORD specializations. It shows that most UML notations and diagrams are used as is. In the first two stages only small restrictions apply to statecharts (a limitation on the possibility of detailing transitions). Apart from this, no extension of existing UML notations is made. However, rules restricting the use of some specific attributes related to concurrency and synchronization within the first two stages are entered in the tool. During real-time design, on the contrary, a real specialization takes place and concerns the three types of diagrams. These extensions or presentation choices are made in order to make the real-time behavior of the application as explicit as possible.

4.1 Extensions Summary

The main extensions required at the real-time design level are the following:

UML meta-model element    ACCORD stereotype
Class                     « RealTimeObjects », « ProtectedPassiveObjects »
Operation                 « Parallel », « Reader », « Writer »
StateMachine              « ClassStateMachine », « OperationStateMachine »
State                     « A-State »
Transition                « A-ClassTransition », « A-OperationTransition »

4.2 Stereotypes Descriptions

4.2.1 Class

The « RealTimeObjects » stereotype is applied to classes in order to specify that instances of these classes are able to handle their own threads of control and run concurrently with other real-time objects. In UML, the isActive attribute of Class already makes it possible to specify that a thread of control is attached to the instances of the Class. However, the semantics seems restrictive, since nothing is said about the possibility of associating several threads with the same object. Moreover, this facility is often used to implement tasks, while ACCORD real-time objects seem more general, since they can be considered as task servers and are able to handle several resources. If the isActive semantics of UML can actually cover this one, then this stereotype might disappear in the future.

The « ProtectedPassiveObjects » stereotype is applied to classes in order to specify that instances of these classes are able to handle concurrency control on shared data. This facility does not exist in UML, although it is quite useful for implementing concurrent accesses of several active objects to the same object. A protected passive object, like an active object, possesses concurrency control mechanisms, but its operations are executed in the thread of the caller.

4.2.2 Operation

The policy adopted in ACCORD for managing concurrent accesses to the same resource is "one Writer and N Readers". Moreover, the concurrency question can be treated at two levels of granularity: either each attribute and role of a class is considered as a distinct resource, or the whole object (that is, the set of attributes and roles of the object) is considered as a single resource. In the first case, two lists are attached to each operation of a class: a list of attributes and roles used only in read mode, and a list of attributes and roles used in read and write mode. In the second case, a three-valued attribute is attached to each operation, specifying whether the operation uses the object only for read accesses, sometimes for write accesses, or not at all (neither for read nor for write accesses). In order to take this point of view into account, we need to be able to distinguish Reader and Writer operations at the level of each attribute or role, and not simply at the level of the object as a whole.

In UML, the concurrency problem is taken into account through an attribute on operations named concurrency. It may take three different values: sequential, guarded and concurrent.


Sequential means that if concurrent calls to the same passive instance occur, nothing is guaranteed by the system. It implies that callers have to control their calls to the passive objects of the application and have to synchronize with each other. In ACCORD, we do not want concurrency control to be exported to the caller objects; we place the control in the called object, because we think that the called object is better able to take responsibility for the management of its encapsulated data (attributes and roles).

Guarded means that concurrent calls to any guarded operation of an instance might occur but will be executed one at a time, in mutual exclusion. This mechanism does not allow selective exclusion between subsets of the guarded operations of a class. For example, if we want to specify that m1 and m2 must not be executed concurrently with m3, using the guarded value on the m1, m2 and m3 operations will ensure that m1 and m2 are not executed concurrently with m3. But it will also force m1 and m2 to be executed in mutual exclusion, while there is perhaps no particular reason to add this constraint (for example, m1 and m2 can use different attributes of the object even if m3 uses the attributes used by both m1 and m2).

Concurrent means that concurrent calls can be executed in parallel with any other operation of the object, that is, with concurrent, sequential or guarded operations.

The main point here is that all values taken by the concurrency attribute are defined from the caller's point of view, exporting the management of concurrency control outside of the called object. This is why we have introduced in ACCORD the three stereotypes « Writer », « Reader » and « Parallel », in order to keep the concurrency control entirely inside the object that might be the target of concurrent calls. In this way, we ensure the fundamental encapsulation property of the object model.
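The m1, m2, m3 example above can be made concrete with the two per-operation lists introduced in Section 4.2.2 (attributes and roles used in read mode only, and those used in read and write mode). The sketch below is illustrative only: the `Operation` class, the attribute names `x` and `y`, and the rule that derives a stereotype from the two lists are assumptions of this example, not definitions taken from ACCORD.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    name: str
    reads: frozenset = frozenset()    # attributes/roles used in read mode only
    writes: frozenset = frozenset()   # attributes/roles used in read and write mode

    @property
    def stereotype(self):
        # Assumed derivation: any write access makes a Writer, read-only
        # access makes a Reader, no access at all makes a Parallel operation.
        if self.writes:
            return "Writer"
        if self.reads:
            return "Reader"
        return "Parallel"


def may_run_concurrently(a, b):
    """True if nothing written by one operation is read or written by the other."""
    a_conflicts = a.writes & (b.reads | b.writes)
    b_conflicts = b.writes & (a.reads | a.writes)
    return not a_conflicts and not b_conflicts


# m1 and m2 touch different attributes; m3 touches both of them.
m1 = Operation("m1", writes=frozenset({"x"}))
m2 = Operation("m2", writes=frozenset({"y"}))
m3 = Operation("m3", writes=frozenset({"x", "y"}))
r = Operation("r", reads=frozenset({"x"}))
p = Operation("p")
```

Under this rule m1 and m2 may overlap while each of them excludes m3, which is the selective exclusion that the guarded value cannot express.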
In addition, UML introduces the isQuery attribute on BehavioralFeatures, namely operations. It does not meet our needs for expressing the ACCORD concurrency management policy, for the following two reasons:

• it does not support fine-grained concurrency specification, that is, at the attribute and role level;

• it does not completely distinguish the kind of access performed on the resource (the object): if isQuery is false, the operation can perform a write access on the object state; if isQuery is true, this only indicates that the object state is not changed, but not whether the operation uses the object state in read mode or does not use the object state at all. This last case is very important, because it allows such an operation to be executed concurrently with any operation, and especially with write operations. However, this last case could be specified in UML by setting isQuery to true and concurrency to concurrent.

So, in order to express our point of view on concurrency issues, we introduce the three following stereotypes.

A « Writer » declaration implies that multiple calls from concurrent threads may occur simultaneously and will be treated as soon as concurrency on the attributes and roles that the operation uses in write accesses allows its execution. That is to say, if two « Writer » operations are called simultaneously and do not use the same attributes or roles, they are not in conflict and are allowed to execute simultaneously.

A « Reader » declaration implies that multiple calls from concurrent threads may occur simultaneously and will be executed simultaneously if no « Writer » operation is using one or more of the attributes or roles that the « Reader » operation needs.

Finally, we wanted to introduce the possibility that an operation does not use any attribute or role of a class, neither in read mode nor in write mode. This can be specified by the « Parallel » stereotype.

4.2.3 StateMachine

The StateMachine package proposes modeling concepts that make it possible to specify behaviors through finite state-transition systems. These state-transition systems are in fact an object-oriented variant of statecharts. In ACCORD, we specify two types of the StateMachine element, ClassStateMachine and OperationStateMachine. In both cases, we use reduced statecharts in which the State and Transition elements are redefined.

4.2.4 State

An « A-State » state element owns, among other things, three associations: deferredEvent, rejectedEvent and ignoredEvent.

• deferredEvent is the same as in the UML State element.

• rejectedEvent is a list of Events whose occurrence during the current state is considered faulty and causes the receiving object to generate an exception in the application.

• ignoredEvent is a list of Events whose occurrence during the current state is unexpected and causes them to be ignored and lost. By default, events belong to this list.

Transition

An « A-ClassTransition » Transition is used to design state machines describing the class behavior. Such transitions have a single type of possible action: a LocalInvocation. When the trigger event is a CallEvent, the operation linked to the LocalInvocation is the operation linked to the CallEvent. When the trigger event is a SignalEvent, the operation linked to the LocalInvocation must always be specified explicitly (by a single local invocation of an operation of the class).

An « A-OperationTransition » Transition is used to design state machines describing an operation. It provides a description of the operation in terms of Actions. The restriction here is that the only possible trigger event of this statechart is a particular ChangeEvent: the availability of a reply in a reply box following the sending of a message


with output parameter(s) to active objects. It is the only possible synchronization point with other objects inside the specification of an operation.

5. Conclusion

Integrating the active object paradigm within an object-oriented method permits a simple, complete and homogeneous modeling of the requirements and needs for multitasking within an application. The ACCORD proposal for extending UML to real-time is based on this paradigm, enriched for real-time concerns. On this basis, with very few specializations and enrichments of UML, we obtain a framework for methods supporting real-time development whose approaches and concepts are very close to those used in ordinary object-oriented modeling.

Moreover, this proposal makes it possible to postpone all parallelism, real-time and implementation choices until very late in the design. Thus changing the granularity of parallelism will not cause the redesign or reanalysis of the whole application, but will just require tuning the implementation choices and the last real-time design iteration of the application.

In addition to an obvious benefit in terms of quality, due to the fact that constraints related to the multitasking and real-time behavior of the application can be expressed in the model itself, the use of the active object concept spares users from needing a deep knowledge of the fine synchronization mechanisms used in multitasking programming. In this way they can concentrate on the modeling and on application-specific code rather than wasting precious time tuning low-level mechanisms. Development time is saved: it becomes possible to develop quite rapidly multitasking applications whose behavior can be as reliable and efficient as code obtained with classical techniques using ad hoc system solutions for each application.

Obviously, this implies that specific design patterns for the implementation of the concepts related to real-time active objects be provided. Our execution environment provides libraries and code generation options for various implementation strategies (fully or partially multi-threaded), and many others are described in the literature. However, the optimization of such implementation schemes, especially when the target must meet the constraints of embedded systems in terms of limited memory, must be treated quite carefully. Current developments at CEA concern the implementation of such optimized schemes. An evaluation of ACCORD is being carried out on a PSA application, and further, more theoretical work is focused on the refinement and validation of this approach, and on the definition of formal analysis techniques allowing the validation of the real-time behavior of an application from its model.

6. References

[1] R. Arthaud, OMT-RT: Extensions of OMT for better Describing Dynamic Behavior, in proc. TOOLS Europe'95, Versailles, France, February 1995.
[2] P. Desfray, Object Engineering - The Fourth Dimension, Addison-Wesley, 1994.
[3] P. Desfray, Modélisation par objets : La fin de la programmation, Masson, MIPS, France, 1996.
[4] B. P. Douglass, Real-Time UML, Object Technology Series, Addison-Wesley, 1998.
[5] S. Gérard et al., Modélisation à objets temps réel d'un système de contrôle de train avec la méthode ACCORD, in proc. Real-Time Systems (RTS'98), Tekna Pub., Paris, France, January 1998.
[6] S. Gérard et al., Developing applications with the Real-Time Object paradigm: a way to implement real-time system on Windows-NT, in Real-Time Magazine, 3Q, 1998.
[7] D. Harel, Statecharts: A Visual Formalism for Complex Systems, Science of Computer Programming, vol. 8, pp. 231-274, 1987.
[8] P. Leblanc, V. Encontre, ObjectGeode: Method Guidelines, VERILOG SA, 1996.
[9] UML Proposal to the Object Management Group, Version 1.1, September 1997.
[10] B. Selic et al., Real-Time Object-Oriented Modeling, John Wiley, 1994.
[11] L. Rioux et al., Scheduling Mechanisms for Efficient Implementation of Real-Time Objects, in ECOOP'97 Workshop Reader, S. Mitchell & J. Bosch Ed., Springer Verlag, December 1997.
[12] F. Terrier et al., A real time object model, in proc. TOOLS Europe'96, Paris, February 1996.
[13] F. Terrier et al., Des objets concurrents pour le multitâche, Rev. L'Objet, Ed. Hermès, V3-2, 1997.
[14] F. Terrier et al., Développement multitâche par objet : la solution ACCORD, in proc. Génie Logiciel'97, Paris, December 1997.

The UML as a Formal Modeling Notation

Andy Evans¹, Robert France², Kevin Lano³, and Bernhard Rumpe⁴

¹ Department of Computing, Bradford University, UK
² Department of Computer Science & Engineering, Florida Atlantic University, USA
³ Department of Computing, Imperial College, London, UK
⁴ Department of Computer Science, Munich University of Technology, Germany
[email protected]

Abstract. The Unified Modeling Language (UML) is rapidly emerging as a de-facto standard for modelling OO systems. Given this role, it is imperative that the UML have a well-defined, fully explored semantics. Such a semantics is required in order to ensure that UML concepts are precisely stated and defined. In this paper we motivate an approach to formalizing UML in which formal specification techniques are used to gain insight into the semantics of UML notations and diagrams, and we describe a roadmap for this approach. The authors initiated the Precise UML (PUML) group in order to develop a precise semantic model for UML diagrams. The semantic model is to be used as the basis for a set of diagrammatical transformation rules, which enable formal deductions to be made about UML diagrams. A small example shows how these rules can be used to verify whether one class diagram is a valid deduction of another. Because these rules are presented at the diagrammatical level, it will be argued that UML can be successfully used as a formal modelling tool without the notational complexities that are commonly found in textual specification techniques.

1 Introduction

The popularity of object-oriented methods such as OMT [RBP+91] and the Fusion Method [CAB+94] stems primarily from their use of intuitively appealing modelling constructs, rich structuring mechanisms, and the ready availability of expertise in the form of training courses and books. Despite their strengths, the use of OO methods on nontrivial development projects can be problematic. A significant source of problems is the lack of semantics for the modelling notations used by these methods. A consequence of this is that understanding of models can be more apparent than real. In some cases, developers can waste considerable time resolving disputes over the usage and interpretation of notations. While informal analyses, for example requirements and design reviews, are possible, the lack of precise semantics for OO modelling makes it difficult to develop rigorous, tool-based validation and verification procedures.

The Unified Modeling Language (UML) [Gro97c] is a set of OO modelling notations that has been standardized by the Object Management Group (OMG).

J. Bézivin and P.-A. Muller (Eds.): «UML»'98, LNCS 1618, pp. 336–348, 1999. © Springer-Verlag Berlin Heidelberg 1999


It is difficult to dispute that the UML reflects some of the best modelling experiences and that it incorporates notations that have been proven useful in practice. Yet, the UML does not go far enough in addressing problems that relate to the lack of precision. The architects of the UML have stated that precision of syntax and semantics is a major goal. The UML semantics document (version 1.1) [Gro97b] is claimed to provide a "complete semantics" that is expressed in a "precise way" using meta-models and a mixture of natural language and an adaptation of formal techniques that improves "precision while maintaining readability". The meta-models do capture a precise notion of the (abstract) syntax of the UML modelling techniques (this is what meta-models are typically used for), but they do little in the way of answering questions related to the interpretation of non-trivial UML structures. It does not help that the semantic meta-model is expressed in a subset of the notation that one is trying to interpret. The meta-models can serve as a precise description of the notation and are therefore useful in implementing editors, and they can be used as a basis to define semantics, but they cannot serve as a precise description of the meaning of UML constructs.

The UML architects justify their limited use of formal techniques by claiming that "the state of the practice in formal specifications does not yet address some of the more difficult language issues that UML introduces". Our experiences with formalizing OO concepts indicate that this is not the case. While this may be true to some extent, we believe that much can be gained by using formal techniques to explore the semantics of UML. On the other hand, we do agree that current text-based formal techniques tend to produce models that are difficult to read and interpret and, as a result, can hinder the understanding of UML concepts. This latter problem does not diminish the utility of formal techniques; rather, it obligates one to translate formal expressions of semantics into a form that is digestible by users of the UML notation.

In a previous paper [FELR98], we discussed how experiences gained by formalizing OO concepts can significantly impact the development of a precise semantics for UML structures. We motivated an approach to formalizing UML concepts in which formal specification techniques are used primarily to gain insight into the semantics of UML notations. In this paper we present the roadmap we are using to formalize the UML, and describe the results of its application to the formalization of UML static models.

The primary objective of our work is to produce rigorous development techniques based on the UML. A first step is to make UML models amenable to rigorous analyses by providing a precise semantics for the models. This paves the way for the development of formal techniques supporting the rigorous development of systems through the systematic enhancement and transformation of OO models. In this paper we show how formalized static models can be rigorously manipulated to prove properties about them and their relationships to other static models.

In Section 2, we present an overview of work on the formalization of OO modelling concepts and notations, and outline the PUML formalization approach.


As we firmly believe that the added value lies not in the formalization itself but in the resulting manipulation techniques and consistency checks, we give only a small example formalization of UML static models in Section 3 to demonstrate how our formalization approach is applied. In Section 4 we discuss how Class Diagrams can be formally manipulated and what the benefits of such manipulation techniques are. We conclude in Section 5 with a summary and a list of some of the open issues that have to be tackled if our approach is to bear meaningful results.

2 Formalizing OO Concepts: Overview and Roadmap

2.1 Classification of Approaches

In [FELR98] we identified three general approaches to formalizing OO modelling concepts: the supplemental, OO-extended formal notation, and methods integration approaches. In the supplemental approach, more formal statements replace parts of the informal models that are expressed in natural language. Syntropy [CD94a,CD94b] uses this approach. In the OO-extended formal language approach, an existing formal notation (e.g. Z [Spi92]) is extended with OO features (e.g. Z++ [Lan91] and Object-Z [DKRS91]). In the methods integration approach, informal OO modelling techniques are made more precise and amenable to rigorous analysis by integrating them with a suitable formal specification notation (e.g., see [FBLP97,BC95,Hal90]).

Most methods integration work involving OO methods focuses on the generation of formal specifications from less formal OO models. This is in contrast to the PUML objectives, where the OO models are the precise (even formal) models. The degree of formality of a model is not necessarily related to its form of representation. In particular, graphical notations can be regarded as formal if a precise semantics is provided for their constructs. A formal semantics for a modelling notation can be obtained by defining a mapping from syntactic structures in the (informal) modelling domain to artifacts in the formally defined semantic domain. This mapping, often called a meaning function, is used to build interpretations of the informal models.

Rather than generating formal specifications from informal OO models and requiring that developers manipulate these formal representations, a more workable approach is to provide formal semantics for graphical modelling notations and to develop rigorous analysis tools that allow developers to directly manipulate the OO models they have created. Defining meaning functions provides opportunities for exploring and gaining insight into appropriate formal semantics for graphical modelling constructs. The method developers (and not the application developers) should use these mappings to justify the correctness of the analysis tools and procedures provided in a CASE tool environment.

However, diagrams alone are usually not expressive enough to define all properties. Therefore it is to be expected that a textual language, such as OCL or Z, will be used to supplement the diagrams. In this case the supplementary textual


language is used as a syntactic notation by the developer, but not as a notation to define an appropriate semantics for the syntactic notation (we will use Z this way).

2.2 Roadmap to Formalization

Our experiences with formalizing OO modelling notations indicate that a precise and useful semantics must be complete (i.e., meanings must be associated with each well-formed syntactic structure), must preserve the intended level of abstraction (i.e., the elements in the semantic domain must be at the same level of abstraction as their corresponding modelling concepts), and must be understandable by method developers. Furthermore, the formalization of a heterogeneous set of modelling techniques requires that the notations be integrated at the semantic level. Such integration is required if dependencies across the modelling techniques are to be defined. The following are the steps of the formalization approach that we use in our work on formalizing the UML:

1. In this step, a formal language for describing syntax and semantics is chosen. For the UML formalization we chose Z because it is a mature, expressive and abstract language that is well supported by tools. Our experiences with using Z to formalize OO concepts indicate that it is expressive enough to characterize OO concepts in a direct manner (i.e., without introducing unwanted detail).

2. In this step, the abstract syntax of the graphical OO notation is defined. Here, we will refer to this notation as (language) L. Language L, like conventional textual languages, needs to have a precise syntax definition. Whereas grammars are well suited for text, the UML meta-model [Gro97a] works well as a description of the structure of UML diagrams. However, a Z characterization of the abstract syntax is better able to capture constraints on the syntactic structures that can be formed using the graphical constructs.

3. This step is concerned with characterizing the notion of a system in terms of its constituent parts, interactions, and static and behavioral properties. The characterization defines the elements of the semantic domain, which we denote by S. The elements of the semantic domain correspond to modelling concepts that are independent of particular modelling techniques. In the OO modelling realm this is possible because objects have certain properties that are independent of the modelling techniques, and are thus intrinsic to "being an object". In [KRB96] and [Rum96] a system model is defined, and used as the semantic domain for OO notations in papers such as [BHH+97] and [Rum96]. In this paper, the semantic domain is characterized using the language Z.

4. This step is concerned with defining the meaning function for the OO notation. A mapping between the syntactic domain L and the semantic domain S is defined. The system model domain formally defines the set of all possible systems. The semantics of a model created using a given description technique is obtained by applying the meaning function to its syntactic elements. The semantics of a model is given by a subset of the system model domain. This subset consists of all the systems that possess the properties specified in the model.

5. In the final step, analysis techniques are developed for the formalized OO notation. These techniques enable us to constructively enhance, refine and compose models expressed in the language L, and also allow us to introduce verification techniques at the diagrammatic level.

An important aspect of our formalization approach is the separation of concerns reflected in the language-independent formulation of the semantic domain S. This leads to a better understanding of the developed systems, allows one to understand what a system is independently of the notation used, and allows one to add and integrate new OO diagramming forms. Though we speak of one language L, this language can be heterogeneously composed of several different notations. However, it is important to note that integration of these notations is more easily accomplished if the semantic domain S is the same for all these sub-languages.

In the following sections, we illustrate the application of this formalization approach using a small subset of the UML class diagram notation.
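Step 4 of the roadmap above, in which the meaning of a model is the subset of all possible systems that satisfy it, can be illustrated with a deliberately tiny example. Everything in the sketch is an assumption made for illustration and is far simpler than the Z formalization used by the authors: a "model" is just a set of declared class names, a "system" is a set of (object, class) pairs over two hypothetical objects, and `meaning` keeps the systems whose objects all belong to declared classes.

```python
from itertools import product

CLASSES = ("Train", "Switch", "Signal")   # hypothetical class names
OBJECTS = ("o1", "o2")                    # hypothetical object identities


def all_systems():
    """Toy semantic domain S: each object is typed by one class or is absent."""
    choices = CLASSES + (None,)
    for typing in product(choices, repeat=len(OBJECTS)):
        yield frozenset((o, c) for o, c in zip(OBJECTS, typing) if c is not None)


def meaning(model):
    """Toy meaning function from L to subsets of S.

    A model is a set of declared class names; its meaning is the set of
    systems in which every existing object is an instance of a declared class.
    """
    return {s for s in all_systems() if all(c in model for _, c in s)}


def at_least_as_constraining(m1, m2):
    """m1 admits no system that m2 excludes (set inclusion of meanings)."""
    return meaning(m1) <= meaning(m2)
```

Because meanings are plain sets of systems, comparing two models reduces to set inclusion, which is the kind of diagram-level check that step 5 aims to support.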

3 A Formalization Example

In this section we formally define a small subset of the abstract syntax of the UML static model notation, characterize an appropriate semantic domain for its components, and define a meaning function for the formally defined syntax. The focus of this paper is not to present this formalization, but to illustrate the roadmap of the last section by example and to have a basis for arguing about the benefits of a formalization in the next section. Please note that different formalizations, as well as different denotations of the same formalization, are possible. Whereas the former differ in their essential semantics, the latter just denote the same semantics in different ways.

3.1 Abstract Syntax

In the UML semantics document (version 1.1), the core package - relationships gives an abstract syntax for the static components of the UML. This is described at the meta-level using a class diagram with additional well-formedness rules given in OCL. For reasons given in the previous section, we use the Z notation to define the abstract syntax. Unlike the OCL, Z provides good facilities for proof. In our work we treat the UML semantics document as a requirements statement from which a fully formal model can be obtained.

As an example, the following schemas define some of the UML static model constructs. Specifically, they define a set of classifiers, associations and a generalization hierarchy, and attach a set of attributes to each classifier. We start by introducing a set of classifiers (e.g. class names) and a set of other names (e.g. attribute and method names).


[Classifier, Name]

An association end connects an association to a classifier, and has a unique name and a multiplicity⁵:

AssociationEnd
    name : Name
    classifier : Classifier
    multi : ℙ ℕ

Each association has a name of its own and is connected to a number (typically two) of association ends:

Association
    name : Name
    connects : 𝔽 AssociationEnd

The abstract syntax of class diagrams contains a set of classes, a set of abstract classes, a set of associations, and a supertype relation between classes. Each class is attached to a set of attribute names (denoted as Name). The components of the abstract syntax of class diagrams can be formalized as follows⁶:

Static1
    abstract, classifiers : 𝔽 Classifier
    associations : 𝔽 Association
    attributes : Classifier → 𝔽 Name
    supertype_of : Classifier ↔ Classifier
    ----
    abstract ⊆ classifiers
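As an informal illustration, the abstract syntax can be mirrored in an executable form. The following is a minimal Python sketch, not part of the Z formalization: the class and field names follow the schemas, but the use of strings for Classifier and Name, the finite set of integers standing in for the multiplicity, and the example diagram (with end names "university" and "students") are assumptions made here.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Set, Tuple


@dataclass(frozen=True)
class AssociationEnd:
    name: str                 # Name
    classifier: str           # Classifier
    multi: FrozenSet[int]     # assumption: finite stand-in for a set of naturals


@dataclass(frozen=True)
class Association:
    name: str
    connects: FrozenSet[AssociationEnd]   # finite set of association ends


@dataclass
class Static1:
    classifiers: Set[str]
    abstract: Set[str]
    associations: Set[Association]
    attributes: Dict[str, Set[str]]       # Classifier -> set of Name
    supertype_of: Set[Tuple[str, str]]    # assumption: (supertype, subtype) pairs

    def __post_init__(self) -> None:
        # Schema predicate: abstract is a subset of classifiers.
        if not self.abstract <= self.classifiers:
            raise ValueError("abstract must be a subset of classifiers")


# Hypothetical example loosely based on the university/student diagram.
enlightens = Association(
    "enlightens",
    frozenset({
        AssociationEnd("university", "University", frozenset({1})),
        AssociationEnd("students", "FullTime", frozenset(range(0, 100))),
    }),
)
diagram = Static1(
    classifiers={"University", "Student", "FullTime", "PartTime"},
    abstract=set(),
    associations={enlightens},
    attributes={"Student": {"name"}},
    supertype_of={("Student", "FullTime"), ("Student", "PartTime")},
)
```

Constructing a Static1 value whose abstract set is not contained in its classifiers raises an error, which corresponds to the single predicate of the schema.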

Well-formedness of the abstract syntax is ensured by further constraints⁷:

Static
    Static1
    ----
    supertype_of ∈ (classifiers ↔ classifiers)
    supertype_of⁺ ∩ id(classifiers) = ∅
    ∀ c1, c2 : classifiers • c1 supertype_of c2 ⇒ attributes(c2) ⊆ attributes(c1)
    ∀ a1, a2 : associations • a1 ≠ a2 ⇒ a1.name ≠ a2.name
    ∀ a : associations • {e : a.connects • e.classifier} ⊆ classifiers

The above schema describes the constraints governing how elements of the abstract syntax can be combined (more constraints are possible). These constraints state that:

– the collection of classifiers in the supertype hierarchy forms a directed acyclic graph;
– association names are unique and associations link classifiers.

⁵ A Z schema is similar to a record. It introduces a schema name and the elements of the schema. They can be referred to when the schema is used.
⁶ Schemas in addition allow axioms to be stated that must hold between their elements.
⁷ Referring to another schema name includes the elements of the referred schema in the new one. All operations, and especially equality, are mathematical set and function operations.

3.2 Semantic Domain

Semantically, a classifier is represented as a set of objects. We distinguish between object identifiers (oValues) and normal values (integers etc.):

[Value]

Values
    oValues, nValues : ℙ Value
    ----
    oValues ∩ nValues = ∅

An object is owned by a classifier, has a unique identity, and maps a set of attribute names (denoted as Name) to values:

Object
    classifier : Classifier
    self : Value
    attvals : Name → Value

At any point in time, a system can be described as a set of objects, where each object is referenced by its identity self:


SM1
    Values
    objects : Value → Object
    ----
    dom objects ⊆ oValues
    ∀ o : Value • o ∈ dom objects ⇒ (objects(o)).self = o

From that snapshot, we can derive sets of links (instances of associations):

SM
    SM1
    links : Name → (Value ↔ Value)
    ----
    ∀ at : Name; o1, o2 : Value • (o1, o2) ∈ links(at) ⇔ o2 = ((objects(o1)).attvals)(at)
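The snapshot and the derived links can be sketched in Python as follows. This is only an illustration under stated assumptions: object identities and values are plain strings, the schema component self is renamed self_id, and the example objects are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Hashable, Set, Tuple


@dataclass
class Obj:
    classifier: str
    self_id: str                    # the schema's 'self' component
    attvals: Dict[str, Hashable]    # Name -> Value


def is_snapshot(objects: Dict[str, Obj], o_values: Set[str]) -> bool:
    """SM1 predicates: dom objects is within oValues, and each object is
    stored under its own identity."""
    return set(objects) <= o_values and all(
        obj.self_id == oid for oid, obj in objects.items())


def derive_links(objects: Dict[str, Obj]) -> Dict[str, Set[Tuple[str, Hashable]]]:
    """SM predicate: (o1, o2) is in links(at) exactly when o2 is the value
    of attribute 'at' of the object o1."""
    links: Dict[str, Set[Tuple[str, Hashable]]] = {}
    for oid, obj in objects.items():
        for at, value in obj.attvals.items():
            links.setdefault(at, set()).add((oid, value))
    return links


# Hypothetical snapshot: one university and one full-time student.
objects = {
    "u1": Obj("University", "u1", {"name": "Mulhouse"}),
    "s1": Obj("FullTime", "s1", {"university": "u1"}),
}
links = derive_links(objects)
```

In this sketch links is computed from the objects rather than stored, which reflects the statement that links are derived from the snapshot.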

3.3 Semantic Mapping

The semantic mapping determines how the syntactic elements of the UML static model, for example, abstract, classifier, and association, are to be interpreted in the semantic domain. The semantic mapping that takes the concepts given in the syntactic domain AbstractSyntax to elements in the semantic domain SM is characterized by a Z schema that takes the characterizations of the syntactic and semantic domains as parameters.

Semantics
    Static
    SM
    ----
    {o : ran objects • o.classifier} ⊆ classifiers \ abstract
    ∀ o : dom objects • attributes((objects(o)).classifier) ⊆ dom((objects(o)).attvals)
    ∀ a : associations; o : dom objects • ∀ e : a.connects •
        e.classifier = (objects(o)).classifier ⇒
            e.name ∈ dom((objects(o)).attvals) ∧ #((links(e.name))(| {o} |)) ∈ e.multi
    ∀ s1, s2 : Classifier • s1 supertype_of s2 ⇒
        {o : Value | (objects(o)).classifier = s2} ⊆ {o : Value | (objects(o)).classifier = s1}

The axioms state that each object is assigned to a non-abstract classifier. Furthermore, the objects have at least the set of attributes explicitly mentioned in the classifier definitions. We also interpret association ends as attributes and restrict the multiplicities. Finally, the supertype relationship requires that the set of objects assigned to a subtype is a subset of the objects assigned to its supertype.

We have now given a formalization of (a subset of) the abstract syntax of class diagrams and an appropriate semantic domain. In particular, the semantic domain is defined in terms of the abstract syntax. If a concrete class diagram is filled in for the schema Static, then the semantics of this class diagram is given by the resulting schema Semantics. We have therefore implicitly defined a mapping from the syntax to the semantic domain without defining this mapping explicitly. An explicit form of the semantic mapping can be expressed as follows:

M : Static → ℙ SM
----
∀ st : Static • M(st) = {Semantics | st = θStatic • θSM}

It can be used to prove properties of this mapping. One such property is the consistency of the mapping, which is stated by the property ∀ st : Static • M(st) ≠ ∅.
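To make the role of the Semantics axioms more concrete, the following Python sketch checks whether a given snapshot belongs to the meaning of a given class diagram. It is a simplified illustration only. It covers the first three axioms and omits the supertype axiom. As an assumption made here so that multiplicities other than one can be exercised, an association-end attribute holds a set of linked identities rather than a single value. The dictionary layout and all example names are hypothetical.

```python
def satisfies(static, objects):
    """Return True if the snapshot 'objects' meets the first three axioms.

    static:  dict with keys 'classifiers', 'abstract', 'attributes' and
             'associations' (association name -> list of
             (end name, classifier, multiplicity set) triples)
    objects: dict mapping identity -> (classifier, attvals dict)
    """
    concrete = static["classifiers"] - static["abstract"]
    for cls, attvals in objects.values():
        # Axiom 1: every object belongs to a non-abstract classifier.
        if cls not in concrete:
            return False
        # Axiom 2: the object has at least its classifier's attributes.
        if not static["attributes"].get(cls, set()) <= set(attvals):
            return False
        # Axiom 3: ends attached to the object's classifier appear as
        # attributes, and the number of links lies within the multiplicity.
        for ends in static["associations"].values():
            for end_name, end_cls, multi in ends:
                if end_cls != cls:
                    continue
                if end_name not in attvals:
                    return False
                if len(attvals[end_name]) not in multi:
                    return False
    return True


static = {
    "classifiers": {"University", "Student", "FullTime"},
    "abstract": {"Student"},
    "attributes": {"FullTime": {"name"}},
    "associations": {
        "enlightens": [("university", "FullTime", {1}),
                       ("students", "University", set(range(0, 10)))],
    },
}
good = {
    "u1": ("University", {"students": {"s1"}}),
    "s1": ("FullTime", {"name": "Ada", "university": {"u1"}}),
}
abstract_instance = {"x": ("Student", {})}
missing_link = {"s1": ("FullTime", {"name": "Ada", "university": set()})}
```

Here satisfies(static, good) holds, whereas the other two snapshots are rejected: one instantiates an abstract classifier and the other violates the multiplicity of an association end. Under this reading, M(st) would be the set of all snapshots for which satisfies(st, snapshot) holds.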

4 Analyzing UML Diagrams

As discussed above, a central part of the PUML group's work is to develop a formal version of UML that can be used to build precise and analyzable models. However, how can a UML model be analyzed? In the case of a textual notation such as Z, analysis is carried out by constructing proofs to determine the truth or falsity of some property being asserted about a specification. Each proof involves applying a sequence of inference rules and axioms to the specification to derive the required conclusion. At each step, a new formula is derived either from the original specification or as a result of applying an inference rule to previous formulas. To analyze UML models, a very similar approach can be adopted [Eva98]. However, because UML is a diagrammatical modelling language, a set of deductive rules for UML will consist of a set of diagrammatical transformation rules. Thus, proving a property about a UML model will involve applying a sequence of transformation rules to the model diagrams until the desired conclusion is reached.

This approach is briefly illustrated by a simple (toy) example. Consider the left hand class diagram D in Figure 1, which describes the relationship between a university and its students. Given that full-time students are enlightened by a university, it is an interesting question what the relationship between universities and students in general is. One (obvious) conjecture is that some students are enlightened, but not all. This is expressed by the right hand class diagram D'.

[Figure 1: class diagrams D (left) and D' (right). In D, Student has the subclasses Part-time and Full-time, and the association enlightens links University (multiplicity 1) to Full-time (multiplicity 0..*). In D', enlightens links University (multiplicity 0..1) to Student (multiplicity 0..*).]

Fig. 1. Transforming a class diagram D to D' to derive information

Using a suitable sequence of transformation rules, we should be able to transform the original diagram into the second diagram, thereby proving that the derivation is valid. In this simple case, only three steps are required to carry out the proof. One transformation rule allows us to move an association end from a subclass to a superclass, but requires that the opposite association end becomes optional. This rule is justified because a superclass may contain objects that are not in its subclasses, and these objects need not participate in the association. The second transformation rule permits the deletion of the full-time and part-time classes, as they are of no further interest in the current derivation. By applying only correct transformations, the derivation is automatically correct, and a proof of its correctness exists. Please note that this is not a mathematical or textual proof, but a diagrammatic proof that deals with diagrams as axioms and diagrammatic rules as transformation rules. Nevertheless, it can be regarded as fully formal, provided a formal syntax, semantics and set of transformation rules exist. This is of course just indicated here, but not fully carried out⁸.

4.1 Satisfaction Conditions

Whenever a transformation rule is applied to a diagram it must be shown that the resulting diagram is a valid deduction of the original diagram. The condition under which this is true is known as the satisfaction condition. This states that if every meaning satisfying one model also satisfies another model, then whatever property holds for the first model must also hold for the second. Thus, the second diagram follows from (or is a logical deduction of) the first diagram. Of course, for this result to be valid, both models must be well formed. This condition can be expressed in Z as follows. Assume that a transformation rule T is given. It is formally represented as a modification of the syntax, in this case of a static model:

T : Static → Static

Such a transformation can, for example, be the erasure of a classifier or association, or the weakening of a multiplicity. This syntactic transformation needs a semantic counterpart, which relates elements of the semantic domain. This is known as the satisfaction relation, and it has the general form:

_ ⊨ _ : ℙ(SM) ↔ ℙ(SM)
----
∀ s, s' : ℙ(SM) • s ⊨ s' ⇔ s ⊆ s'

Thus, a semantic model s' will satisfy all the properties of s provided that every property of s is in s'. Finally, the formal proof of correctness of a transformation can now be described within Z (and therefore can be proven within Z). A transformation T is correct iff

∀ st : Static • M(st) ⊨ M(T(st))

This strongly corresponds to the commuting diagram, first stated in [Rum96] and also in [KR97].

⁸ Full details of the transformation rules can be found in [Eva98].
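The correctness condition can be explored on the university example with a small brute-force check. The Python sketch below is an illustration under strong simplifying assumptions, not the method of [Eva98]: a snapshot is reduced to a tuple of students, each described by a kind and by the number of universities that enlighten it; the satisfaction relation is read as set inclusion; and all function names are hypothetical.

```python
from itertools import product

KINDS = ("full-time", "part-time")
COUNTS = (0, 1, 2)   # number of universities linked to a student


def snapshots(n_students=2):
    """All snapshots with n students; a student is a (kind, count) pair."""
    student_states = list(product(KINDS, COUNTS))
    return set(product(student_states, repeat=n_students))


def meaning(model, universe):
    """M(model): the snapshots in the universe that satisfy the model."""
    return {snap for snap in universe
            if all(model(kind, count) for kind, count in snap)}


def satisfies(s, s_prime):
    """s |= s' read as: every system in s is also in s'."""
    return s <= s_prime


def model_d(kind, count):
    # D: only full-time students take part, each with exactly one university.
    return count == 1 if kind == "full-time" else count == 0


def model_d_prime(kind, count):
    # D': any student is enlightened by at most one university (0..1).
    return count in (0, 1)


def model_faulty(kind, count):
    # Faulty result: the end is moved to Student but multiplicity 1 is kept.
    return count == 1


universe = snapshots()
step_is_correct = satisfies(meaning(model_d, universe),
                            meaning(model_d_prime, universe))
faulty_is_correct = satisfies(meaning(model_d, universe),
                              meaning(model_faulty, universe))
```

In this finite setting step_is_correct is true and faulty_is_correct is false: a part-time student with no university satisfies D but not the faulty diagram, which is why the opposite association end has to become optional when the end is moved to the superclass.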

5 Summary and Open Issues

In this paper we outlined and illustrated an approach to formalizing the UML. The objective of our efforts is to make the UML itself a precise modelling notation so that it can be used as the basis for a rigorous software development method. However, it must first be determined how such a formalization can best be carried out, and what practical purpose it can serve. This paper aims to contribute to this ongoing discussion. The benefits of formalization can be summarized as follows:

– Formalization leads to a deeper understanding of OO concepts, which in turn can lead to a more mature use of technologies.
– The UML models become amenable to rigorous analysis. As we have illustrated, diagrammatical analysis techniques can be developed.
– Rigorous refinement techniques can be developed.

An interesting avenue to explore is the impact a formalized UML can have on OO design patterns and on the development of rigorous domain-specific software development notations. Domain-specific UML patterns can be used to bring UML notations closer to a user's real-world constructs. Such patterns can ease the task of creating, reading, and analyzing models of software requirements and designs. An integrated approach to formalization of UML models is needed in order to provide a practical means of analyzing these models. Current work on compositional semantics [BLM97] has used techniques for theory composition to combine semantic interpretations of different parts of an OO model set. Some of the other issues that have to be addressed in our work follow:

– How does one gauge the appropriateness of an interpretation of UML constructs? In practice an 'accepted' interpretation is obtained by consensus within a group of experts. Formal interpretations can facilitate such a process by providing clear, precise statements of meaning.
– Should a single formal notation be used to express the semantics for all the models? The advantage of a single notation is that it provides a base for checking consistency across models, and for refinement of the models. This is necessary if analysis and refinement are done at the level of the formal notation. On the other hand, if the role of the formal notation is to explore the semantic possibilities for the notations, and analysis and refinement are carried out at the UML level, then there seems to be no need to use a single formal notation.
– How will the use of textual constraints (expressed in OCL for example) interact with and impact on diagrammatical analysis and refinement techniques?

It is anticipated that, as our work progresses, additional issues will arise that have to be tackled.

Acknowledgements

The authors thank their colleagues for fruitful discussions and the referees for helpful comments.

References

BC95. Robert H. Bourdeau and Betty H.C. Cheng. A formal semantics for object model diagrams. IEEE Transactions on Software Engineering, 21(10):799–821, October 1995.
BHH+97. Ruth Breu, Ursula Hinkel, Christoph Hofmann, Cornel Klein, Barbara Paech, Bernhard Rumpe, and Veronika Thurner. Towards a formalization of the unified modeling language. In Satoshi Matsuoka Mehmet Aksit, editor, ECOOP'97 Proceedings. Springer Verlag, LNCS 1241, 1997.
BLM97. J. Bicarregui, K. Lano, and T. Maibaum. Objects, associations and subsystems: A hierarchical approach to encapsulation. In Proceedings of ECOOP 97, LNCS 1489. Springer-Verlag, 1997.
CAB+94. Derek Coleman, Patrick Arnold, Stephanie Bodoff, Chris Dollin, Helena Gilchrist, Fiona Hayes, and Paul Jeremaes. Object-Oriented Development: The Fusion Method. Prentice Hall, Englewood Cliffs, NJ, Object-Oriented Series edition, 1994.
CD94a. Steve Cook and John Daniels. Designing Object Systems: Object-Oriented Modeling with Syntropy. Prentice Hall, Englewood Cliffs, NJ, September 1994.
CD94b. Steve Cook and John Daniels. Let's get formal. Journal of Object-Oriented Programming (JOOP), pages 22–24 and 64–66, July–August 1994.
DKRS91. Roger Duke, Paul King, Gordon A. Rose, and Graeme Smith. The Object-Z specification language. In Timothy D. Korson, Vijay K. Vaishnavi, and Bertrand Meyer, editors, Technology of Object-Oriented Languages and Systems: TOOLS 5, pages 465–483. Prentice Hall, 1991.
Eva98. Andy Evans. Reasoning with UML class diagrams. In WIFT'98 Proceedings. IEEE, 1998.
FBLP97. Robert B. France, Jean-Michel Bruel, and Maria M. Larrondo-Petrie. An Integrated Object-Oriented and Formal Modeling Environment. To appear in the Journal of Object-Oriented Programming (JOOP), 1997.
FELR98. Robert France, Andy Evans, Kevin Lano, and Bernhard Rumpe. The UML as a formal modeling notation. Computer Standards & Interfaces, to appear, 1998.
Gro97a. The UML Group. UML Metamodel. Version 1.1, Rational Software Corporation, Santa Clara, CA-95051, USA, September 1997.
Gro97b. The UML Group. UML Semantics. Version 1.1, Rational Software Corporation, Santa Clara, CA-95051, USA, July 1997.
Gro97c. The UML Group. Unified Modeling Language. Version 1.1, Rational Software Corporation, Santa Clara, CA-95051, USA, July 1997.
Hal90. J. Anthony Hall. Using Z as a specification calculus for object-oriented systems. In D. Bjørner, C. A. R. Hoare, and H. Langmaack, editors, VDM and Z – Formal Methods in Software Development, volume 428 of Lecture Notes in Computer Science, pages 290–318. VDM-Europe, Springer-Verlag, New York, 1990.
KR97. Haim Kilov and Bernhard Rumpe. Summary of ecoop'97 workshop on precise semantics of object-oriented modeling techniques. In J. Bosch and S. Mitchell, editors, Object-Oriented Technology – ECOOP'97 Workshop Reader. Springer Verlag Berlin, LNCS 1357, 1997.
KRB96. Cornel Klein, Bernhard Rumpe, and Manfred Broy. A stream-based mathematical model for distributed information processing systems - SysLab system model - . In Jean-Bernard Stefani Elie Naijm, editor, FMOODS'96 Formal Methods for Open Object-based Distributed Systems, pages 323–338. ENST France Telecom, 1996.
Lan91. Kevin C. Lano. Z++, an object-orientated extension to Z. In John E. Nicholls, editor, Z User Workshop, Oxford 1990, Workshops in Computing, pages 151–172. Springer-Verlag, 1991.
RBP+91. J. Rumbaugh, M. Blaha, W. Premerlani, F. Eddy, and W. Lorensen. Object-Oriented Modeling and Design. Prentice Hall, 1991.
Rum96. Bernhard Rumpe. Formal Method for Design of Distributed Object-oriented Systems. Ph.D. thesis (in German), Technische Universität München, 1996.
Spi92. J. Michael Spivey. The Z Notation: A Reference Manual. Prentice Hall, Englewood Cliffs, NJ, Second edition, 1992.

OML: Proposals to Enhance UML

B. Henderson-Sellers

Centre for Object Technology Applications and Research
School of Information Technology
Swinburne University of Technology
John Street, PO Box 218
Hawthorn, Victoria 3122, Australia
Phone: +61 3 9214 8524
Fax: +61 3 9819 0823
Email: [email protected]

Abstract. While the UML metamodel and notation aim to be comprehensive, there are a number of areas in which this modelling language is seen to be deficient. The proposals in OML (Firesmith et al., 1997) contain a number of advanced metamodelling and notational techniques which could also be of use in enhancing UML. In particular, contributions can be made in the areas of modelling responsibilities and aggregations and in the provision of notational elements underpinned by semiotics and usability concerns. Other areas of potential contribution include a more consistent and thorough treatment of abstraction foci in terms of class versus type versus instance (applicable not only at the classifier level but also to packages, scenarios etc.) and the ability to discriminate clearly between the various types of inheritance and to represent these notationally. It is critical that any standard support not only a use-case and a data-driven mindset but also a responsibility-driven modelling process, and that the results of these modelling endeavours be communicated as effectively as possible both to other developers and to users.

J. Bézivin and P.-A. Muller (Eds.): «UML» '98, LNCS 1618, pp. 349–364, 1999. © Springer-Verlag Berlin Heidelberg 1999

1. Two Modelling Languages: UML and OML

The OMG's recent RFP for a modelling language resulted in the endorsement in November 1997 of the Unified Modeling Language or UML. Initially one of six proposals in response to the OMG, the UML has now been released (Sept 1997) as Version 1.1. At present (March 1998), there is continuing activity under the auspices of the Revision Task Force (RTF), whose job it is to correct the remaining errors but which has no remit to enhance or improve the current version of UML other than by minor revisions.

A second modelling language, known as OML (OPEN Modelling Language), has been developed over the same period. Its metamodel has the same roots as one of the six submitted proposals to the OMG - in the COMMA metamodel (e.g. Henderson-Sellers, 1994; Henderson-Sellers and Bulthuis, 1996, 1998). Version 1.0 of the OPEN Modelling Language (OML) was published in March 1997 (Firesmith et al., 1997) in the same week that saw the agreement at the OMG Meeting in Austin to coalesce the six proposals in answer to the OADTF RFP. Following the submission deadline of January 1997, OML continued to develop alongside the UML, with members of the development team also participating as members of the OMG OADTF. While having received no official sanction from OMG, OML continues to offer support beyond that offered by the UML. In addition, with the release of the OMG UML Version 1.1 late in 1997, the published version of OML (Version 1.0: Firesmith et al., 1997) was revised to become more compatible with this OMG standard, particularly in terms of the metamodel and the basic notation for a class (a rectangle). In this new Version 1.1 (fully described in Firesmith and Henderson-Sellers, 1998b; Henderson-Sellers and Firesmith, 1998), OML now offers a superset of UML. The additional components include full support for responsibilities, a clear iconic representation of three sorts of inheritance, a clearer metamodel for aggregation (Firesmith and Henderson-Sellers, 1998a) and, in its notation, icons to replace some UML stereotypes along with more semiotically based notations for referential relationships, types/classes/objects etc.

The contrast is that the UML is more pragmatic and influenced by hybrid, non-OO concepts, such as C++, Java and relational databases; whereas the major influences on OML have been the ideas underpinning a responsibility-driven approach to modelling, semiotics, and intuitive, easily usable and learnable notations. In addition, the precursors of UML (particularly OMT and Booch) had a much larger existing user base, which has led to some decisions in UML necessarily being taken for market reasons of upward compatibility - OML, on the other hand, was able to take the opportunity to be designed from scratch and took a more purist approach.

While both UML and OML have a similar scope, their different philosophies sometimes result in different metamodelling structures and constructs. Two examples will suffice:

(1) In UML, aggregation is played down as a kind of association and denoted by setting attribute values on the metaclass AssociationEnd; in OML, aggregation is a relationship in its own right (though still a-kind-of Association) but is the subject of multiple stereotyping (Firesmith and Henderson-Sellers, 1998a), which permits the removal of several ambiguities by use of orthogonal and overlapping characteristics of the relationship.

(2) In UML, there is no icon for implementation inheritance - only a stereotype <<subclass>> is available - on the grounds that this is poor design (which it is) so no-one will do it; the creators of OML are obviously more circumspect (or cynical) and take the opposite viewpoint - for the same reason - and supply a symbol for implementation inheritance specifically to draw the designer's attention (and that of his/her colleagues) to the poor (and hopefully temporary) design.

Other differences include the degree to which support is offered for exceptions and exception handling mechanisms (extensive in OML, limited in UML); the support offered for Activity Diagrams (full support in UML, none in OML); and the underlying philosophy regarding navigability on associations (bidirectional default in UML, unidirectional in OML).


With regard to notation, basic icons are in common. OML's notation, COMN (Common Object Modelling Notation), was designed from scratch, so its symbols could be chosen for their intuitive nature with a focus on usability issues; whereas UML's notation was created specifically as an acceptable enhancement and merger of (predominantly) the existing Booch and OMT notations so that existing users had an easy migration path. Although the two languages were developed in parallel, the advent of the OMG standard in late 1997 (OMG, 1997b) led to the OML Ver 1.1 alignments wherein options for notational emphasis within the documentation were chosen, sometimes differently from those in the UML (as described, for instance, by Fowler and Scott, 1997). This resulted in the following differences:

• OML uses icons, UML uses stereotypes for abstract class, implementation class and type/interface
• OML uses an icon, UML underlines the name for instances (objects)
• OML uses Option 1 for navigation arrows (OMG, 1997b, p55) such that the absence of an arrowhead indicates no navigation (or TBD navigation), whereas UML users usually choose Option 3 (arrows suppressed means bidirectional navigability - no navigability has no distinguishing notation).

Finally, although both Modelling Languages endorse the use of stereotypes, UML suggests their use more frequently while, at the same time, advocating user definition. In comparison, OML also defines a number of stereotypes but does not advocate user definition to the same degree (although this is of course permitted). Both UML and OML fully support collaborations - UML in the Collaborations package of the Behavioral Elements package - both using Collaboration diagrams as a form of Interaction diagram. Indeed, the overall suite of diagrams is almost identical in both Modelling Languages - although OML has more versions of the static architectural diagrams (called Class diagrams in UML and Semantic nets in OML) and UML has more detailed advice on Implementation diagrams.

2. The Metamodels for Structure

Both the UML and OML metamodels offer a full specification of the metaclasses needed for creating a modelling language appropriate for OOAD. UML tends to offer more detailed support in detailed design/coding and less in analysis whereas OML is more strongly weighted in favour of analysis - although both are usable across the full lifecycle. While a modelling language does not (and should not) describe process, it is clear (from linguistic studies) that the availability (or absence) of specific language elements can influence people in their thinking and mode of expression. Within this context, it is fair to say that OML has a responsibility focus whereas UML favours a use-case driven approach with an emphasis on data-driven modelling at the detailed design level.

There are many elements to the metamodel for both UML and OML. Here, we will only describe the most crucial ones, focussing on the modelling elements (such as classes) and relationships (such as associations, aggregations and inheritance). As far as is feasible, the diagrams are displayed using the same layouts for both UML and OML for ease of comparison and evaluation.

2.1 UML

In the UML core metamodel (Figure 1), there are three major architectural elements to represent concepts in the model (e.g. nodes in a class diagram). A class is a description of objects sharing the same attributes, operations, relationships etc. An interface is a cohesive collection of operations; it has no attributes, associations or methods. A datatype is "a type whose values have no identity" and is distinguished (as in OMT) from regular classes. These three metatypes are generalized by the term Classifier (defined as "a mechanism that describes behavioral and structural features") which is thus an abstract metatype. It is a kind of GeneralizableElement which also includes other concepts which may participate in generalization relationships (at the metalevel), such as UML associations (see later discussions).

[Figure 1: UML core metamodel diagram. Labels recoverable: GeneralizableElement; Classifier with Class, Interface and Datatype; Feature (visibility) with BehaviouralFeature (Operation, Method) and StructuralFeature (Attribute); Association; AssociationEnd (aggregation); AssociationClass.]

Fig. 1. There are three main classifiers in the UML metamodel: class, interface and datatype. In the UML notation (used in this figure), concepts (at the metalevel) are represented by rectangular boxes, associations by lines, aggregations by black diamonds and generalization by a white arrowed line.

Classifiers contain features. Some of the features are externally visible, others are private or hidden. Features may be structural or behavioural. Structural features in UML are restricted to being attributes. Thus attributes may represent visible attributes or hidden attributes depending upon the inherited value of the visibility attribute on the Feature metatype. Behavioural features are operations and methods, although generally it is considered that operations are the externally visible part - in other words, a method implements an operation (as indicated by an association between these two metatypes in the UML metamodel).

The other elements of the core structural metamodel shown in Figure 1 are associations. While these are relationships (and thus discussed further below), UML also permits their elevation to classifiers and thus we include them in this diagram.


This is done by including a metatype of AssociationClass which is a specialization of the metatype Class and also of the metatype Association, i.e. multiple generalization. Finally, we show in Figure 1 that Associations have AssociationEnds (an aggregation relationship) and these have Attributes themselves. Of particular importance are the Attributes isNavigable, aggregation and multiplicity. The isNavigable attribute permits uni- and bidirectionality on association relationships; the aggregation attribute is the way that UML models aggregation (which may be one of two types) and the multiplicity (the number of objects which can participate in the relationship) is considered to be an integral part of the AssociationEnd rather than the Association. Objects do not appear in the core UML metamodel but rather in the BehavioralElements Package describing common behaviour. An object is defined to be "an instance which originates from a class".

2.2 OML

There are four concepts in OML's metamodel (Figure 2) which equate to the Classifiers of UML (Figure 1). These are class, instance, role and type. Collectively they are called CIRTs (rather than classifiers) and, as we shall see later, CIRTs have their own notational icon (they are concrete metaclasses) and play a greater role in object modelling than do UML's Classifiers (which are abstract metaclasses: OMG, 1997a, p21).

[Figure 2: OML metamodel diagram. Labels recoverable: GeneralizableElement; CIRT with Class, Instance, Role and Type (interface); Responsibility; Characteristics (visibility); Property with Attribute, Exception, Entry, Link and Part; Assertion; Operation; Association; relationship labels "encapsulates/exports", "has" and "implemented by".]

Fig. 2. In the OML metamodel, there are four main classifiers (called CIRTs): class, instance, role and type. In this diagram, which uses the OML notation, called COMN (Common Object Modelling Notation), concepts (at the metalevel) are represented by rectangular boxes, associations by directed lines and generalization by a double, arrowed line (after Henderson-Sellers et al., 1998).

A class describes a "template" by which instances can be created. A class has an interface which consists of one or more types together with its internal details - its implementation (a similar description to that in UML). Interfaces and types relate to that part of the class which is visible externally to the class. OML does not specially

354

B. Henderson-Sellers

distinguish Datatype as does UML but does introduce Instance as a separate metatype which is a peer to Class. It also includes Role (from OOram and MOSES), also as a peer to Class, to represent a partial object which is used to represent a role that can be played by instances of unrelated classes. This is supported in UML by the ClassifierRole which is part of the Collaboration Package. Furthermore, While classes, types etc. are usually thought of as being in the "object domain", they can also be superimposed on the scenario, package and association domains (Firesmith and Henderson-Sellers, 1998b). In other words, we can have scenario class, scenario instance (or plain "scenario") and scenario type etc. - these are all supported in OML V1.1. The UML has a similar, yet incomplete, classification dichotomy (for example, use case instances are available but no distinction is made between the class-level and type-level for use cases). In OML Ver 1.0, there was no metatype equivalent to UML's GeneralizableElement. To improve alignment with this OMG standard, in OML Ver 1.1 we now introduce this metaclass (Firesmith and Henderson-Sellers, 1998b) as shown in Figure 2. In addition, Association becomes a kind of GeneralizableElement. However, since OML supports aggregation as a separate metaclass, the need for an AssociationEnd metaclass (a la UML) is diminished - only multiplicities are needed and they are deemed part of the referential relationship (see discussion below and in detail in Firesmith and Henderson-Sellers, 1998a,b). CIRTs have Characteristics which are totally equivalent to UML's Features. They also have visibility and can be subclassified (as does the UML). In OML, not only are there structural features (called Properties) and Behavioural Features (Operations and Methods as before) but also Assertions. Assertions, as supported for instance in Eiffel (Meyer, 1992), are Boolean descriptors of rules. 
Assertions may be either Preconditions, Postconditions or Invariants (the unlabelled boxes in Figure 2). Preconditions must be true before a method is executed; postconditions must be true following method execution and invariants constrain the whole class. [In UML, this is also possible by use of stereotypes (user-defined metatypes) on a constraint rather than UML (predefined) metatypes.]

In addition, OML offers five kinds of Property in comparison to the single type (Attribute) of UML. OML thus extends the OMG metamodel by proposing the addition of Exception, Entry, Link and Part as types of Property. (An Entry is an element in a Container, a Link is an instance of an Association to other classes and a Part is an element in an Aggregation).

One important addition to UML suggested in OML is the Responsibility metatype. (UML supports only a tagged value on Classifier for Responsibility). OPEN as a method is unashamedly a responsibility-driven method. Consequently, in the OML metamodel it is not surprising to find a Responsibility metatype. CIRTs have Responsibilities and Responsibilities are implemented by Characteristics. As depicted here and in the COMMA core metamodel of Henderson-Sellers and Firesmith (1997) and Henderson-Sellers and Bulthuis (1998), responsibilities may be public or hidden, as can Characteristics. This means that an additional level of abstraction is supported in OML since responsibilities provide a modelling/analysis/requirements engineering focus which complements the design/code focus of the metamodel of classifiers having features/characteristics.
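The three kinds of Assertion named above (Precondition, Postcondition, Invariant) can be illustrated with a small sketch. This is not OML or Eiffel syntax; it is a Python approximation using plain `assert` statements, and the `Account` class with its `withdraw` method is a hypothetical example invented for illustration.

```python
class Account:
    """Hypothetical class used only to illustrate the three assertion kinds."""

    def __init__(self, balance=0):
        self.balance = balance
        self._check_invariant()

    def _check_invariant(self):
        # Invariant: constrains the whole class, so it is re-checked
        # whenever an instance reaches a stable state.
        assert self.balance >= 0, "invariant violated: negative balance"

    def withdraw(self, amount):
        # Precondition: must be true before the method body is executed.
        assert 0 < amount <= self.balance, "precondition violated"
        old_balance = self.balance
        self.balance -= amount
        # Postcondition: must be true following method execution.
        assert self.balance == old_balance - amount, "postcondition violated"
        self._check_invariant()
        return self.balance


account = Account(100)
remaining = account.withdraw(30)
```

Note that Python strips `assert` statements when run with the `-O` flag, so a sketch like this documents intent rather than enforcing a contract in production.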

OML does not support AssociationClass, preferring instead to use a regular class where UML would proffer an AssociationClass. In reifying an association, OML creates a new Class which does not inherit from Association; whereas in UML the new Class (an AssociationClass) is not only a class but also remains as a-kind-of Association. AssociationEnds in OML are not used since there is no need for an aggregation attribute, aggregation being regarded as a relationship in its own right. Associations are also restricted to being binary relationships so that the multiplicity on the metamodel is two - in UML it is 2 or more since ternary (or higher) relationships are permitted. OML also avoids OMT's/UML's roles as names attached to the association and instead supports the richer concept of role as described in OOram (Reenskaug et al., 1996) in which a role is a partial object participating in a collaboration which creates a role model which can then be instantiated by specific objects. Roles in OML may be static or dynamic (for further details see Henderson-Sellers et al., 1998) and are incorporated into the core architectural metamodel rather than, as in UML, being part of the dependent Behavioral Elements package.
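The OML position on reified associations described above can be sketched in code. The following Python fragment is only an illustration under stated assumptions: `Person`, `Company` and `Employment` are hypothetical names, and the point is simply that the reified association is an ordinary class that holds one reference to each of its two participants and inherits from no Association base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str


@dataclass(frozen=True)
class Company:
    name: str


@dataclass
class Employment:
    """A reified association modelled as a regular class.

    It is binary (exactly two participants) and carries the data that
    belongs to the relationship itself, here a salary.
    """
    employee: Person
    employer: Company
    salary: int


alice = Person("Alice")
acme = Company("Acme")
job = Employment(employee=alice, employer=acme, salary=50_000)
```

In the UML view, by contrast, the corresponding AssociationClass would be both a class and a kind of Association; that dual nature has no direct counterpart in this sketch.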

3. The Metamodels for Relationships

3.1 UML

An association is defined in UML as a “semantic relationship between classifiers” wherein the instances of an association (called links) “are a set of tuples relating instances of the classifiers”. Each association has two or more AssociationEnds which themselves have attributes, including isNavigable, aggregation, multiplicity and changeable. It is also noted (OMG, 1997a, p17) that the “bulk of the structure of an Association is defined by its AssociationEnds”. While no statement is made regarding assumed or preferred directionality in the metamodel, the UML notation document (OMG, 1997b) states a preference for two-way associations in which the arrowheads are suppressed.

While aggregations are not a metaclass in the metamodel, two forms are defined in the UML Semantics document: (i) composite aggregation - "a strong form of aggregation which requires that a part instance be included in at most one composite at a time, although the owner may be changed over time" (page 38) such that if the whole is deleted then so are the parts (i.e. dependent lifetimes); (ii) shared aggregation - "weak ownership, i.e. the part may be included in several aggregates, and its owner may also change over time". In this case, deletion of the whole does not imply deletion of the parts.

While Associations are a major relationship in UML, subsuming aggregations as they do, there are other important relationships. Of the static/architectural relationships, generalizations and dependencies are used in the static diagrams (a.k.a. class diagrams); links (a link is an instance of an association) are used in object diagrams. Generalization is used to represent both "is-a-kind-of" or specialization inheritance as well as subtyping or specification inheritance. This is the default. If implementation inheritance is used, this may be shown by application of a predetermined stereotype.

The Dependency relationship and its stereotypes form one hierarchy for relationships, while Associations (and aggregations) and Generalization form a second, which has GeneralizableElement as a direct ancestor. A Dependency relationship is one in which "a change to one modeling element (the independent element) will affect the other modeling element (the dependent element)" (OMG, 1997a, p151). It also holds only between the model elements themselves rather than instances of them (p45).
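The lifetime semantics of the two aggregation forms quoted earlier in this section can be made concrete with a short sketch. It is a Python approximation, not part of the UML specification: the classes `Part`, `Composite` and `SharedAggregate`, the `alive` flag and the explicit `delete` method are all assumptions introduced here to stand in for object deletion.

```python
class Part:
    def __init__(self, name):
        self.name = name
        self.alive = True
        self.composite_owner = None  # at most one composite at a time


class Composite:
    """Composite (strong) aggregation: parts share the lifetime of the whole."""

    def __init__(self):
        self.parts = []

    def add(self, part):
        # The owner may change over time, but there is never more than one.
        if part.composite_owner is not None:
            part.composite_owner.parts.remove(part)
        part.composite_owner = self
        self.parts.append(part)

    def delete(self):
        # Deleting the whole deletes its parts (dependent lifetimes).
        for part in self.parts:
            part.alive = False
            part.composite_owner = None
        self.parts.clear()


class SharedAggregate:
    """Shared (weak) aggregation: a part may sit in several aggregates."""

    def __init__(self):
        self.parts = []

    def add(self, part):
        self.parts.append(part)

    def delete(self):
        # Deleting the whole does not imply deletion of the parts.
        self.parts.clear()


engine = Part("engine")
car_a, car_b = Composite(), Composite()
car_a.add(engine)
car_b.add(engine)   # ownership moves from car_a to car_b
car_b.delete()      # engine is deleted with its owner

player = Part("player")
team, club = SharedAggregate(), SharedAggregate()
team.add(player)
club.add(player)
team.delete()       # player survives and is still part of club
```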

3.2 OML

In OML, an attempt has been made to create a comprehensive hierarchy at the metalevel for relationships. This has been done by identifying four types of relationship (all relationships being regarded as binary, unidirectional dependency relationships): definitional, referential, scenario and transitional. Here we discuss only the two static relationship types: referential and definitional. The distinction made here is also reflected in COMN, the OML-preferred notation (Henderson-Sellers et al., 1997) - see below. Elaborating on the Referential relationship strand

[Figure: REFERENTIAL RELATIONSHIP with subtypes NORMAL ASSOCIATION, CONTAINMENT and WHOLE-PART (MERONYMIC) RELATIONSHIP, the last having subtypes AGGREGATION and MEMBERSHIP. OML/COMN symbols shown: U, +, ε.]

Fig. 3. Metamodel for REFERENTIAL RELATIONSHIP showing three major subtypes of NORMAL ASSOCIATION, CONTAINMENT and WHOLE-PART (with its corresponding subtypes)

In OML, all referential relationships are uni-directional. They consist of (a) normal associations, (b) containment (or topological inclusion) and (c) whole-part (or meronymic) relationships (Figure 3). Whole-part relationships are grouped into aggregation and membership relationships (Henderson-Sellers, 1997a). These form one partition. Aggregations are configurational relationships i.e. the parts bear some functional or structural relationship to each other - as in the parts of a car. In contrast, membership is a relationship in which, while still being whole-part or meronymic, the elements are not related to each other, but only to the whole e.g. participants in a sports club.

[Figure: REFERENTIAL RELATIONSHIP with partition labels INDEPENDENT, NONINDEPENDENT, BY VALUE, BY REFERENCE, USED, NOT USED, UNIDIRECTIONAL, BIDIRECTIONAL, TBD, VARIABLE, CONSTANT, MANDATORY, OPTIONAL, SEQUENTIAL, SYNCHRONOUS, ASYNCHRONOUS.]

Fig. 4. Metamodel for REFERENTIAL RELATIONSHIP showing seven orthogonal partitions.

Superimposed on the appropriate referential relationship (whole-part, containment or association) are seven other, value-adding partitions (Figure 4) together with a further (eighth) partition (not shown here) which describes the abstraction level as either being class-level or instance-level. Some of these are useful in modelling (analysis); others, like by-value/by-reference, relate very specifically to implementation. Others, such as navigability (unidirectional/bidirectional/TBD directionality), are pertinent across the lifecycle - although the balance of their use will vary in time - for instance, early analysis diagrams will often have many TBD directionalities, while by the time of design/implementation these should all have been resolved into uni- or bi-directional navigability. Firesmith and Henderson-Sellers (1998a) show that the use of these partitions, and in particular the independent plus configurational plus constant/variable (and, for implementation, by value versus by reference), disambiguates UML’s black diamond notation. Furthermore, they show that the UML white-diamond aggregation cannot be a property of the aggregation relationship but is a property of an object which acts as a server to two or more client objects.

Definitional relationships are classification, implementation and inheritance. There are three classification relationships and two for implementation, while the most common is inheritance. Three types of inheritance are supported in both metamodel and notation: generalization, specification inheritance (blackbox inheritance or subtyping) and implementation inheritance (or subclassing).
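One way to picture how the orthogonal partitions sit on top of a referential relationship is as independent fields on a single record. The sketch below is an assumption-laden illustration in Python, not the OML metamodel itself: only three of the partitions are modelled, and the names `Kind`, `Navigability`, `Storage`, `ReferentialRelationship` and `unresolved` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Kind(Enum):
    ASSOCIATION = "association"
    CONTAINMENT = "containment"
    AGGREGATION = "aggregation"   # whole-part, configurational
    MEMBERSHIP = "membership"     # whole-part, parts relate only to the whole


class Navigability(Enum):
    UNIDIRECTIONAL = "unidirectional"
    BIDIRECTIONAL = "bidirectional"
    TBD = "tbd"                   # acceptable in early analysis diagrams


class Storage(Enum):
    BY_VALUE = "by value"
    BY_REFERENCE = "by reference"


@dataclass
class ReferentialRelationship:
    source: str
    target: str
    kind: Kind
    navigability: Navigability = Navigability.TBD
    storage: Optional[Storage] = None  # implementation concern, unset in analysis


def unresolved(relationships):
    """Return the relationships whose directionality is still TBD."""
    return [r for r in relationships if r.navigability is Navigability.TBD]


model = [
    ReferentialRelationship("Car", "Engine", Kind.AGGREGATION),
    ReferentialRelationship("Club", "Member", Kind.MEMBERSHIP,
                            Navigability.UNIDIRECTIONAL),
]
pending = unresolved(model)
```

A check like `unresolved` reflects the lifecycle remark above: an analysis model may legitimately return a non-empty list, whereas a design or implementation model should return an empty one.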

[Figure: hierarchy of connexions with a statics branch and a dynamics branch; elements are annotated as belonging to UML, OML or Kilov's approach.]

Fig. 5. Proposed hierarchy for connexions encompassing those proposed in UML (OMG, 1997a), those in OML (Firesmith et al., 1997) and those proposed by Kilov and Ross (1994) (after Henderson-Sellers, 1997b with minor correction)

Finally, an attempt has been made to find an underpinning metamodel for these static relationships of both OML and UML that will provide a single framework. An initial proposal was made in Henderson-Sellers (1997b) - see Figure 5. This figure consolidates and extends the discussion above. It also indicates a possible linkage between UML's Dependency and Association. This was analyzed partly in response to Robert Martin's extensive discussions on the OTUG listserv in late 1997. That this figure readily maps to both UML and OML is demonstrated in Henderson-Sellers (1998).

4. The Notation for Structure

4.1 UML

The basic icon for a class in UML is the rectangular box. It typically has three compartments: classname, attributes and operations - for objects the shape is the same, but there are only two compartments (an object has no operations in UML) and the name is underlined. This is in keeping with implementation concepts (operational information is stored once at class level) rather than with the conceptual level of analysis and design (when it is important to associate an object's behaviour with its attributes and name). Other boxes (e.g. for responsibilities) can be added by the user. In addition, it is permissible to include different detail and number of boxes depending upon requirements. So, for instance, everything except the class name could be suppressed: in analysis only names could be displayed, whereas in implementation full details on arguments, visibility etc. would be required. UML interfaces are shown as "lollipops" extending from the main class box - or explicitly as a link to an interface class icon.

While Classes are denoted by rectangles in UML, it is actually Classifier that is the basic concept - all its “variants” are shown by stereotypes or meta-subtypes. Since the meta-subtype of Class is so common, UML permits the omission of the <> keyword label. This is not done for the other variants which are labelled as <> (meta-subtype), <> (stereotype) or <> (stereotype). As can be seen, the stereotype name or keyword is placed above the class name in the Classname box. For a stereotyped relationship or keyword the name, within guillemets, is placed above the relationship symbol.

[Figure: icons for CIRT, Instance/object, Class, Role, Type and Class Implementation, connected by relationships labelled "is an instance of", "is implemented by", "may play the role of", "conforms to" and "implements".]

Fig. 6. Metamodel and notation for Object, Class, Type, Role and Implementation. The Class icon is "torn apart" into the Type (external/interface) and the Class Implementation (internals). All icons can have a drop down box in which information pertinent to the lifecycle stage is displayed (after Henderson-Sellers et al., 1998)

4.2 OML

The basic icon for a class in OML Ver 1.1 is the rectangular box, as in UML. Unlike UML, which uses underlining or stereotypes for objects, types, implementation classes and CIRTs/Classifiers, OML chooses the option within the OMG documentation that permits the creation of new icon shapes for these concepts (Figure 6). An additional advantage given by this choice is that there is a visual clue in the shapes chosen for type and implementation: they fit together like jigsaw pieces to make a full rectangle, the class. In addition, roles are explicitly designated by use of a Greek tragedy role mask. Object icons are class icons with a pointed top (visually like a house symbol). The rationale is that a blueprint (rectangular sheet of paper) represents the template/structure for one or more houses (the instances).

In OML, the same definition of stereotype as in UML (essentially a user-defined meta-subtype) is used. It can be applied to any of OML's traits (Firesmith et al., 1997) where TRAIT is an OML metatype. Stereotypes in OML are indicated by the stereotype name in braces {} rather than guillemets, placed below the class name - the rationale being that the class name and not the stereotype name is more important and is thus given a more pre-eminent position.

5. The Notation for Relationships

5.1 UML

The major referential and definitional relationships in UML are summarized in Figure 7. Inheritance (actually generalization) is an arrowed line where the arrowhead is a white triangle. If needed, <<subclass>> and <<subtype>> stereotype labels can be added. Association is an undirected line (although navigability can be added by an open arrowhead) with optional name and aggregations are decorated associations. The Dependency relationship is a dashed line with an open arrowhead and the realizes relationship is a dashed inheritance arrow.

[Figure: line styles for inheritance, shared aggregation, composite aggregation, association, dependency and realizes.]

Fig. 7. Notation for the major relationships in UML summarized (after Henderson-Sellers et al., 1998)

Discriminators can be added to specialization arrows for cases when subclassing is overlapping i.e. multiple, concurrent partitioning is occurring. There is no separate notation for specification inheritance as opposed to specialization inheritance since the difference is not recognized in the semantics document (OMG, 1997a). There is also no notation for implementation inheritance, other than a stereotype. This is because it is so strongly discouraged that there is seen to be no need for such a notation (Rumbaugh, p.c., 1997).

In the preferred option of UML, an association is a bidirectional connexion represented by a single, undirected line. An optional name and black arrowhead can be added, together with multiplicities (actually part of the AssociationEnd rather than the Association). If unidirectionality is required, a navigability arrowhead can be added. AssociationClasses are represented by rectangles (the class icon) linked to the association arrow by a dotted line.

Aggregation is a whole-part relationship depicted as a black diamond for strong aggregation and a white diamond for shared aggregation where the diamond is at the "whole" end of the relationship. Use of a black diamond also states that lifetimes are dependent, whereas a white diamond states that the deletion of the aggregate does not lead to deletion of the parts. An alternative for a black diamond is a series of nested class icons.

[Figure: arrow styles for definitional relationships (classification, implementation or inheritance; the unlabelled default is specialization, "a kind of"), subtyping (blackbox inheritance), subclassing (whitebox inheritance), association/mapping, aggregation (+), membership (ε) and containment (U).]

Fig. 8. OML relationships are indicated by arrows, some with adornments. The major relationships are illustrated here and form part of the COMN Light notation (after Henderson-Sellers et al., 1998)

5.2 OML

The major referential and definitional relationships in OML are summarized in Figure 8. Referential relationships are the most common and therefore get the easiest arrow style to draw (single arrow) while definitional relationships are shown with a double lined arrow (a strong, binding relationship) - both arrow styles give a visual reminder. The unlabelled default is specialization inheritance and blackbox and whitebox inheritance are shown with a black and white box at the source end of the arrow respectively. Discriminators can also be used, as in UML, although the published format (Firesmith et al., 1997) is more efficient.

Association is a directed arrow with a name which is more important (but still not mandatory) than in UML since it helps define the relationship. While different from standard UML as described above, this is an option available within the OMG documentation in which OML chooses to show directional arrows whereas standard UML chooses to only show directionality when it is unidirectional. In addition, the metamodel for association in OML is more clearly a unidirectional mapping (as advocated in Graham et al., 1997a) rather than a tuple as in UML's preferred modelling approach. Dependency or Usage (the dynamic view) is purposefully united with Association/Mapping (the static view) so no additional notation is needed. As with UML, in OML aggregation is regarded as a special type of association so that an adorned association arrow is used (Henderson-Sellers, 1997a). The adornment is (a) a plus in a circle for aggregation (the sum is greater than or equal to the sum of its parts) or (b) an epsilon (or set membership symbol) in a circle for Membership (Henderson-Sellers, 1997b) - see Figure 4. These symbols have been carefully chosen (by usability evaluations) to give a visual connotation, rather than having to memorize arbitrary symbols like black and white diamonds. For detailed design/coding, five further pieces of information are added in Ver 1.1: stereotypes to the relationship to indicate by reference or by value, variable versus constant, and used versus not used; a tombstone icon at the target end if the part must be destroyed along with the whole; and mandatory versus optional (expressed via the multiplicities) (Firesmith and Henderson-Sellers, 1998a) - these clarify and dissociate the concepts underlying the black and white diamonds of UML. Containment, or topological inclusion, is only supported in OML (and not in UML) and is shown as a stylized cup within the same circle as for aggregation and membership. 
Containment is a referential relationship which is not whole-part/meronymic.
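The contrast drawn in this section between an association as a set of tuples (UML's preferred view) and an association as a unidirectional mapping (the OML view) can be sketched as two data representations. The Python below is only an illustration: the instance names, `to_mapping` and `inverse` are hypothetical, and treating a bidirectional association as a pair of mutually inverse mappings is an assumption made here for the example.

```python
from collections import defaultdict

# Tuple view: the association is a set of links, each a tuple of instances.
links = {("alice", "acme"), ("bob", "acme"), ("carol", "initech")}


def to_mapping(link_set):
    """Unidirectional mapping view: each source maps to its set of targets."""
    mapping = defaultdict(set)
    for source, target in link_set:
        mapping[source].add(target)
    return dict(mapping)


def inverse(mapping):
    """Build the mapping that runs in the opposite direction."""
    inverted = defaultdict(set)
    for source, targets in mapping.items():
        for target in targets:
            inverted[target].add(source)
    return dict(inverted)


works_for = to_mapping(links)   # person -> companies
employs = inverse(works_for)    # company -> persons
```

With the mapping view, navigation from a source is a direct lookup, while navigation in the other direction needs the second mapping to be built and kept consistent with the first.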

6. To Enhance UML

While the UML metamodel and notation aim to be comprehensive, there are a number of areas in which this modelling language is seen to be deficient. The proposals in OML (Firesmith et al., 1997; Firesmith and Henderson-Sellers, 1998b; Henderson-Sellers and Firesmith, 1998) support several clearer approaches and contain a number of advanced metamodelling and notational techniques which could also be of use in enhancing UML. The areas discussed here have focussed on responsibility metamodelling, aggregation discrimination and semiotic notations.

The OPEN Consortium, and in particular those members involved in creating OPEN’s preferred modelling language, OML, have identified a number of areas where, currently, OML supports more sophisticated approaches. These could easily be incorporated into a next version of UML. These are:

• full incorporation, with semantics, of responsibilities (Figure 2). At present UML supports them only as a tagged value independent of the rest of the metamodel
• a more consistent and thorough treatment of abstraction foci in terms of class versus type versus instance - applicable not only at the classifier level but also to packages, scenarios etc.
• a full aggregation metamodel discriminating between aggregation, membership and containment (Figure 3) together with a number of partitions addressing both analysis and design/code issues (Figure 4)
• the ability to discriminate clearly between the various types of inheritance and represent these notationally (top of Figure 8)
• incorporation of semiotic and usability concepts into the notational elements by using icons to represent instance, type and class implementation rather than the current UML stereotypes (Figure 6)
• notational support for the metamodel elements above, e.g. for aggregation/membership/containment (bottom of Figure 8)

It is crucial that the current use-case focus of UML is widened to permit support (through a process-focussed methodology like OPEN: Graham et al., 1997b) of a responsibility-driven mindset for the modelling component of an OO development and that the results of these modelling endeavours are communicated as effectively as possible both to other developers and to users.

Acknowledgements

I wish to thank Don Firesmith for his insightful comments on an earlier draft of this paper as well as Grady Booch for his comments regarding the OMG process and the UML. This is Contribution no 98/8 of the Centre for Object Technology Applications and Research (COTAR).

References

Firesmith, D.G. and Henderson-Sellers, B., 1998a, Clarifying specialized forms of association in UML and OML, JOOP, 11(2), 47-50
Firesmith, D.G. and Henderson-Sellers, B., 1998b, Upgrading OML to Version 1.1: Part 1. Referential relationships, JOOP/ROAD, 11(3)
Firesmith, D.G., Henderson-Sellers, B. and Graham, I., 1997, The OPEN Modeling Language (OML) Reference Manual, SIGS Books, NY, 271pp
Fowler, M. and Scott, K., 1997, UML Distilled. Applying the standard object modeling language, Addison-Wesley, Reading, MA, 179pp
Graham, I., Bischof, J. and Henderson-Sellers, B., 1997a, Associations considered a bad thing, J. Obj.-Oriented Prog., 9(9), 41-48
Graham, I., Henderson-Sellers, B. and Younessi, H., 1997b, The OPEN Process Specification, Addison-Wesley, London, UK, 314pp
Henderson-Sellers, B., 1994, COMMA: an architecture for method interoperability, Report on Object Analysis and Design, 1(3), 25-28
Henderson-Sellers, B., 1997a, OPEN relationships - composition and containment, JOOP, 10(7), 51-55
Henderson-Sellers, B., 1997b, Towards the formalization of relationships for object modelling, Procs. TOOLS Pacific 1997, 253-265
Henderson-Sellers, B., 1998, OPEN relationships - associations, mappings, dependencies, and uses, JOOP, 10(9), 49-57
Henderson-Sellers, B. and Bulthuis, A., 1996, The COMMA project, Object Magazine, 6(4), 24-26
Henderson-Sellers, B. and Bulthuis, A., 1998, Object-Oriented Metamethods, Springer-Verlag, New York, USA, 158pp
Henderson-Sellers, B. and Firesmith, D.G., 1997, COMMA: proposed core model, J. Obj.-Oriented Prog., 9(8), 48-53
Henderson-Sellers, B. and Firesmith, D.G., 1998, Upgrading OML to Version 1.1: Part 2. Additional concepts and notations, JOOP, 11(5)
Henderson-Sellers, B., Firesmith, D. and Graham, I.M., 1997, The benefits of Common Object Modeling Notation, J. Obj.-Oriented Prog., 10(5), 28-34
Henderson-Sellers, B., Simons, A.J.H. and Younessi, H., 1998, The OPEN Toolbox of Techniques, Addison-Wesley, London
Kilov, H. and Ross, J., 1994, Information Modeling. An Object-Oriented Approach, Prentice Hall, Englewood Cliffs, New Jersey, USA, 268pp
Meyer, B., 1992, , Prentice Hall, New York, 594pp
OMG, 1997a, UML Semantics. Version 1.1, 15 September 1997, OMG document ad/97-08-04
OMG, 1997b, UML Notation. Version 1.1, 15 September 1997, OMG document ad/97-08-05
Reenskaug, T., Wold, P. and Lehne, O.A., 1996, Working with Objects. The OOram Software Engineering Manual, Manning, Greenwich, CT, USA, 366pp

Validating Distributed Software Modeled with the Unified Modeling Language

Jean-Marc Jézéquel, Alain Le Guennec, and François Pennaneac’h

Irisa/CNRS, Campus de Beaulieu, F-35042 Rennes Cedex, FRANCE
{jezequel,aleguenn,pennanea}@irisa.fr

Abstract. The development of correct OO distributed software is a daunting task as soon as the distributed interactions are not trivial. This is due to the inherent complexity of distributed systems (latency, error recovery, etc), leading to numerous problems such as deadlocks and race conditions, and to many difficulties in trying to reproduce such error conditions and debug them. The OO technology is ill-equipped to deal with this dimension of the problem. On the other hand, the desire to master this complexity in the context of telecommunication protocols gave birth to specific formal verification and validation tools. The aim of this paper is to explore how the underlying technology of these tools could be made available to the designer of OO distributed software. We propose a framework allowing the integration of formal verification and validation technology in a seamless OO life-cycle based on UML, the Unified Modeling Language.

1 Introduction

J. Bézivin and P.-A. Muller (Eds.): «UML»’98, LNCS 1618, pp. 365–377, 1999. © Springer-Verlag Berlin Heidelberg 1999

It is now widely accepted [8] that only system development based on “real-world” modeling is able to deal with the complexity and the versatility of large software systems. Once the idea of analyzing a system through modeling has been accepted, there is little surprise that the object-oriented (OO) approach is brought in, because its roots lie in Simula-67, a language for simulation designed in the late 1960s, and simulation basically relies on modeling. This is the underlying rationale of the numerous object-oriented analysis and design (OOAD) methods that have been documented in the literature [15]. OOAD methods allow the same conceptual framework (based on objects) to be used during the whole software life-cycle. This seamlessness should yield considerable benefits in terms of flexibility and traceability. These properties would translate to better quality software systems (fewer defects and delays) that are much easier to maintain because a requirement shift usually may be traced easily down to the (object-oriented) code. But today many such large software systems have acquired a distributed nature. This distributed nature may be either a constraint from the problem statement, or may be introduced as the consequence of a design decision to handle performance problems and/or fault tolerance. Frameworks such as CORBA help

in deploying distributed solutions, but any experienced software engineer recognizes that the design, implementation and maintenance of correct distributed software is still a very difficult exercise. Distributed systems have indeed an inherent complexity resulting from fundamental challenges such as latency of asynchronous communications, error recovery, service partitioning and load balancing. Furthermore, being intrinsically concurrent, distributed software faces race conditions, deadlocks, starvation problems, etc. This complexity is quite orthogonal to the programming-in-the-large problems addressed by OO technology, including CORBA. There are currently no approaches to deal with this aspect of the problem in an OO context (see [13, 16] for a good overview of current approaches to Validation and Verification for OO systems).

The nature of the complexity of distributed systems has been widely explored in many academic (and other) circles for several years. In the context of telecommunication protocols, the desire to master this complexity gave birth to the development of standardized Formal Description Techniques (FDT) and to a set of associated formal verification and validation tools. Unfortunately, for several reasons that we explore later in this paper, these tools usually cannot be easily used in an integrated OO life-cycle.

The aim of this paper is to explore a way by which the underlying technology of these formal verification and validation tools could be made available to the designer of OO distributed software. We start in Sect. 2 by recalling what the principles of formal verification and validation tools are, and how they address the inherent complexity of distributed systems. We then try to analyze why they are still seldom used. In Sect. 3, building on this analysis, we outline a tentative OO framework making possible the use of formal verification and validation technology. We illustrate this approach with a simple yet significant case study (a distributed diary system). In Sect. 4 we describe the various formal verification and validation activities that may be conducted on the case study within our framework. Finally, we conclude on the applicability of our approach for real size cases, and on the perspectives of the integration of formal verification and validation technology in the OO life-cycle.

2 Validating Distributed Software with Formal Description Techniques

2.1 A Set of Complementary Techniques

Validation techniques vary widely in their forms and their abilities, but they always need a formal description of the distributed software system. They output data on properties of the system under consideration that can be viewed with some confidence level. Basically, the designer may attack his/her software by three complementary techniques. We list here their advantages and major drawbacks:

– formal verification of properties: it gives a definite answer about validity by formally checking that all possible executions of the distributed software respect some properties (e.g. no deadlock). But existing methods, such as model-checking (that is, the construction of the graph of all the states the distributed system could reach), can only be easily applied to the analysis of very simplified models of the considered problem [5]. Otherwise there is a combinatorial explosion of the number of states that forbids such a brute force verification. This forces the distributed software to be described at a high abstraction level, so its formal verification leaves the problem of property preservation during its refinement course widely open.
– intensive simulation, using a simulated (and centralized) environment: it can deal with more refined models of the problem and can efficiently detect errors (even tricky or unexpected ones) on a reasonable subset of the possible system behaviors. Formally, it consists in randomly walking the reachability graph of the distributed software. The main difficulty is to formally describe and simulate the execution environment. This is generally quite simplified, because it would not be realistic (nor interesting) to take into account all the parameters of a real system, such as the exact influence of message size on transmission delays, or the exact operation durations (which are not computable without execution).
– observation and test of an implementation: here, the execution environment is a real one. But since there is a lack of tools to observe a distributed system as a whole, it will be difficult to actually validate the software. It will also be difficult to generalize the possible behaviors from the observation. Even something as simple as trying to reproduce a test result is not straightforward, because the asynchronous nature of the communications makes the distributed system look non-deterministic.

It appears that these approaches are more complementary than in competition, and that a well-advised project manager would try to use them all.
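The first two techniques listed above can be contrasted on a toy example. The following Python sketch is not taken from any FDT tool; it assumes a hypothetical system of two processes, A and B, that acquire two locks in opposite order, and the names `ORDER`, `successors`, `find_deadlocks` and `random_walk` are introduced here for illustration. Exhaustive exploration builds the full graph of reachable states, whereas simulation follows one random path through the same graph.

```python
import random
from collections import deque

# Hypothetical model: a state is (pc_A, pc_B), where pc counts the locks a
# process has acquired so far (0..2) and 3 means it released both and finished.
ORDER = {"A": ("L1", "L2"), "B": ("L2", "L1")}
FINAL = (3, 3)


def held_locks(state):
    """Map each currently held lock to the process that owns it."""
    owners = {}
    for name, pc in zip("AB", state):
        if pc < 3:
            for lock in ORDER[name][:pc]:
                owners[lock] = name
    return owners


def successors(state):
    """All states reachable in one step (one process moves at a time)."""
    owners = held_locks(state)
    result = []
    for index, name in enumerate("AB"):
        pc = state[index]
        if pc == 3:
            continue                      # process already finished
        if pc < 2 and ORDER[name][pc] in owners:
            continue                      # next lock is taken: blocked
        nxt = list(state)
        nxt[index] = pc + 1               # acquire next lock, or release all
        result.append(tuple(nxt))
    return result


def find_deadlocks(initial):
    """Exhaustive verification: breadth-first walk of the reachability graph."""
    seen, queue, deadlocks = {initial}, deque([initial]), []
    while queue:
        state = queue.popleft()
        nexts = successors(state)
        if not nexts and state != FINAL:
            deadlocks.append(state)       # stuck before normal termination
        for nxt in nexts:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen, deadlocks


def random_walk(initial, rng, max_steps=20):
    """Simulation: follow a single randomly chosen execution path."""
    state = initial
    for _ in range(max_steps):
        nexts = successors(state)
        if not nexts:
            break
        state = rng.choice(nexts)
    return state


reachable, deadlocks = find_deadlocks((0, 0))
end_state = random_walk((0, 0), random.Random(0))
```

On this model the exhaustive search visits every reachable state and reports the state (1, 1), where each process holds one lock and waits for the other. A single random walk ends either in that deadlock or in normal termination, which is why simulation gives evidence about a subset of behaviors rather than a definite answer.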
However, this is hard in practice because the formalisms used in these various stages differ widely. Most of these techniques have been developed in the context of the Formal Description Techniques (FDTs) for protocols, where they have been successfully applied to various toy and real problems.

2.2 Difficulties in Using FDTs

It is very disappointing to see that formal validation based on standard FDTs (such as SDL [2], Estelle [7] and Lotos [6]) never achieved widespread use in industry, despite excellent results on most of the pilot projects where it has been used [10]. While the interest of formal techniques is widely acknowledged (at least in the context of mission-critical distributed software), their use is still deferred for various reasons:

– their learning curve is very steep, because they rely on non-trivial formalisms and unusual syntaxes and semantics,

– they require the analysis to be much more accurate in the early stages (which is not necessarily a bad thing, but it is a matter of fact that few projects are prepared to pay the additional cost early),


Jean-Marc Jézéquel, Alain Le Guennec, and François Pennaneac’h

– and there is a lack of integration of this promising technology in widely used software development methods and life-cycles.

In our experience, this last point is probably the most important one. Because standard FDTs lack basic support for modern software engineering principles, it is extremely clumsy to try to use them as implementation languages for real, large scale distributed applications. Furthermore, being fully formal implies that FDTs are based on a closed-world assumption, making them awkward for dealing with the open nature of much distributed software: specifiers become prisoners of the FDT's underlying semantic choices. For example, all FDTs force a given communication semantics (multi rendez-vous for Lotos, FIFO for Estelle) upon the user, who has to painfully reconstruct the set of communication semantics needed for a given distributed system starting from the FDT's one, sometimes with a high performance cost (Estelle FIFOs between protocol layers are difficult to circumvent, for instance).

Using FDT validation technology thus imposes a model rupture in the usual life-cycle: the formal model for the validation has to be built and maintained separately from the analysis and design model (expressed in e.g., OMT or UML). For example, this implies that formal validation technology may be used during the maintenance phase of a system only after a costly reverse engineering effort. Each time you make a modification in your distributed software, you have to propagate it to the separate model described with your formal description technique, and start your formal validation all over again, which is quite impracticable in the real world. Since maintenance costs for large, long-lived systems can amount to 3 or 4 times the initial development cost, this counts against FDTs. As a consequence, formal validation rarely gets beyond the status of a side task (more or less a toy one) that gets low priority and low budget.

3 Alternative: Integrate Validation in an OO Life-Cycle

3.1 A New Vision for the OO Life-Cycle

OOAD methods along with an OO implementation allow the same conceptual framework (based on objects) to be used during the whole software life-cycle. It should be stressed that the boundaries between analysis, design and implementation are not rigid. We advocate for extending this seamless OO development process to also encompass validation, not as a post facto task (as promoted in the classical vision of the waterfall or the V-model of the life-cycle), but as an integrated activity within the OO development process, as shown in Fig. 1. The key point in implementing this idea is to rely on the sound technological basis that has been developed in the context of formal validation based on FDTs, and to make it available to the OO designer through a dedicated framework. Our proposal is based on UMLAUT, a tool that can manipulate the UML representation of the system being designed.


Formal validation usually takes place on a separate simulation model of the system. This separate model must be updated (and revalidated) each time the design model is changed, which is both costly and error prone. UMLAUT, on the contrary, automatically exposes the properties of the system that are relevant to the validation by directly processing its UML representation. An equivalent UML model is automatically produced that explicitly shows the protocol entities involved in asynchronous communications and the new system states that result from those communications.

[Figure 1 labels: Problem; UML Analysis Model; UML Design Model; Implementation; Validation Framework UMLAUT/VALOODS; Validation code; Model Checking; Intensive simulation; CADP Graph API; Validation Results; Test Cases; Test Results]

Fig. 1. OO Life-Cycle

UMLAUT can then proceed to the validation of the UML design: code fragments are derived from the modified UML model and are "plugged" into a validation framework, called VALOODS. This framework comprises a validation engine that will carry out the actual validation. Since this engine is parameterized, one can try the model checking road, or do an intensive simulation, by just choosing the appropriate engine. Moreover, the framework can also serve as a bridge toward more sophisticated validation toolboxes such as CADP [3] (by adapting the framework so as to output a transition graph in a format suitable for such a toolbox).

UMLAUT uses CDIF (CASE Data Interchange Format) as its exchange format when communicating with other parts of the development environment, which ensures interoperability and independence from CASE tool vendors. Therefore UMLAUT can become a part of the development environment while preserving the investment represented by the other tools already used in the project. Of course, the CDIF output of UMLAUT can be injected back into any CASE tool that supports this format, to see what transformations were actually applied to the original UML model.

3.2 A Validation Framework for OO Distributed Systems

We now outline the principle of the VALOODS framework (VALidation of Object Oriented Distributed Software). Its purpose is to be a testbed for OO designs of distributed software, after UMLAUT has put them into a form suitable for the application of formal validation technology. A framework consists of a collection of classes together with many patterns of collaboration among instances of these classes. It provides a model of interaction among several objects that belong to classes defined by the framework. The basic abstractions in VALOODS are:

– Reactive objects, that inherit from the class REACTIVE and must define the method receive (e : MESSAGE) to handle messages. Messages can be asynchronous messages or signals, or notifications of a timer expiration.

– Pro-active objects, that inherit from the class ACTIVABLE and must define the methods activable and action. Pro-active objects would be run in parallel, using an interleaving semantics for their actions (the method action being atomic).

– The network interfaces (modeled through the class PORT), coming in several flavors (that is, subclasses) in the VALOODS library. This is to model the various addressing schemes and qualities of service (e.g. reliable, unreliable, etc.) available to the designer of distributed software.

The idea of VALOODS is that any class that interacts with a remote site in the distributed system must be a subclass of REACTIVE or ACTIVABLE (or both), and use a subclass of PORT for its remote communications. Once the complete OO distributed software design has been implemented in this framework, we get an accurate formal representation of the behavior of the distributed software as a whole. Furthermore we get reversibility for free: if the design needs to be changed, it is easy to validate it again in the VALOODS framework. We no longer have to maintain a model of the distributed application for formal validation purposes separately from the application itself.

3.3 OO Modeling of a Distributed Diary

We could have chosen a multimedia application full of bells and whistles to illustrate our approach, but this might have led to unnecessary obfuscation. Or we could have chosen the famous example of the Alternating Bit Protocol, which is commonly used as a textbook case (cas d'école) to evaluate protocol validation tools in the protocol engineering community. However, showing the applicability of our approach only on this minimal example may not seem very convincing with respect to its scalability when it comes to more realistic and sophisticated distributed applications. Therefore a trade-off had to be made between the two extreme situations, and the application that we will present in the following sections is a Distributed Diary system, which was originally proposed as a shared case study for the Workshop on Models, Formalisms and Methods for Object-Oriented Distributed Computing (ECOOP'97 Workshop #6) [12].


[Figure 2 labels: classes DIARY_INTERFACE, USER, PROXY, DIARY and EVENT (attributes day : DATE, begin : TIME, end : TIME, comment : STRING = ""); associations and roles: manipulate, /manipulate, local proxy, remote peers, proxy's local diary, negociate with]

Fig. 2. The UML Model of the Distributed Diary (Class Diagram)

We are using the Proxy Design Pattern [4] so as to implement a two phase commit protocol in a manner transparent to the user. The USER accesses its local proxy as if it were the diary itself. The proxy then makes sure that all diary copies are kept in sync. The two phase commit protocol consists in a negotiation between the proxies on the network before transactions are actually committed. Validity constraints can be described at this level with assertions, e.g., on the consistency of the Diary contents. Figure 3 describes the messages that the coordinator (the proxy on the site where the transaction is started) exchanges with its peers. In this particular case, all sites agree to commit the transaction.

[Figure 3 labels: objects the local Proxy : PROXY, a remote Proxy : PROXY, another remote Proxy : PROXY and the local Diary : DIARY; messages 1, 2, 4, 5: add (EVENT); 3, 6, 7: ready-to-commit ( ); 8, 10, 11: commit ( ); 9, 12, 13: done ( )]

Fig. 3. Two phase commit negotiation between Proxies (UML Collaboration Diagram)

For the purpose of demonstrating our approach, the interest of this case study is that there are known problems with the two phase commit protocol. Deadlocks can occur under certain circumstances (see [1]). We will see in Sect. 4.3 how VALOODS can be used to find these problems.

3.4 Making the Distributed Diary Fit into VALOODS

Now let us see how UMLAUT transforms the original UML model into a new one suitable for validation, where reactive objects, pro-active objects and network interfaces appear explicitly. The starting point of the transformation is to determine which entities may interact with another one on a remote site. This information is provided by the deployment diagram. The deployment diagram of the UML model indeed shows the physical location of each component in the delivered distributed system and the relationships among them. Based on this information, the transformations are carried out for both the static and dynamic views of the UML model of the Distributed Diary.

Static Model Transformations. A class whose instances may communicate through the network is considered as the top-level layer of a protocol stack. Since each layer in a protocol stack may have a lower layer and an upper layer and must be able to send information to these layers, this common behavior can be factorized into an abstract class PROTOCOL_ENTITY, which is an heir of the REACTIVE class. Some actions, having a specific behavior depending on the actual layer level, are not described in the class PROTOCOL_ENTITY but in each subclass corresponding to a different layer level. Layers of a protocol stack can be connected using the insert_on_top operation, and messages are forwarded from one layer to another through the send_up and send_down operations. The bottom level layer of a stack is made of an instance of the PORT class, which is in charge of the actual network communications.

– UMLAUT first adds a generalization (inheritance) association between each class that can have asynchronous communications with remote sites and the PROTOCOL_ENTITY class, making them explicit heirs of PROTOCOL_ENTITY.

– Since network communications are handled by instances of the PORT class, a protocol_stack relationship is established between PORT (playing the role of the lower layer) and the classes representing the upper layer (see Fig. 4).

– MESSAGEs are reified, to be later exchanged through PORTs.

– Finally, classes stereotyped as <> are also made subclasses of ACTIVABLE.

Objects of the ACTIVABLE class provide a set of stimuli to exercise the dynamic properties of the system. An activable object (e.g. a USER consulting the diary in Fig. 4) is just an heir of the abstract class ACTIVABLE, which features an entry point called action that may be called from time to time by, e.g., a scheduler, provided the method activable returns true. This way, a validation engine can call the action operation for Users or for network interfaces (i.e. PORTs) to test the system arbitrarily.
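The class structure described above can be sketched in a few lines. This is only an illustrative Python transcription, not the Eiffel code of VALOODS: the operation names receive, activable, action, insert_on_top, send_up and send_down come from the text, while transmit, connect, the Recorder class and the choice of a reliable FIFO port are assumptions made for the example.

```python
from abc import ABC, abstractmethod
from collections import deque


class Reactive(ABC):
    """Counterpart of REACTIVE: handles incoming messages."""

    @abstractmethod
    def receive(self, message):
        ...


class Activable(ABC):
    """Counterpart of ACTIVABLE: an atomic action guarded by activable()."""

    @abstractmethod
    def activable(self):
        ...

    @abstractmethod
    def action(self):
        ...


class ProtocolEntity(Reactive):
    """Counterpart of PROTOCOL_ENTITY: one layer of a protocol stack."""

    def __init__(self):
        self.upper = None
        self.lower = None

    def insert_on_top(self, lower):
        # Connect this layer on top of `lower`.
        self.lower = lower
        lower.upper = self

    def send_down(self, message):
        # `transmit` is a hypothetical hook offered by the lower layer.
        self.lower.transmit(message)

    def send_up(self, message):
        self.upper.receive(message)


class Port(ProtocolEntity, Activable):
    """Counterpart of PORT; here a reliable FIFO link to one peer (assumed)."""

    def __init__(self):
        super().__init__()
        self.peer = None
        self.in_transit = deque()  # sent by the peer, not yet delivered

    def connect(self, peer):
        self.peer = peer
        peer.peer = self

    def transmit(self, message):
        self.peer.in_transit.append(message)

    def receive(self, message):
        # Bottom layer: a message from the network is simply forwarded upward.
        self.send_up(message)

    def activable(self):
        return bool(self.in_transit)

    def action(self):
        # Atomic step: deliver one pending message.
        self.receive(self.in_transit.popleft())


class Recorder(ProtocolEntity):
    """Hypothetical top-level layer that just records what it receives."""

    def __init__(self):
        super().__init__()
        self.received = []

    def receive(self, message):
        self.received.append(message)


def demo():
    a, b = Recorder(), Recorder()
    port_a, port_b = Port(), Port()
    a.insert_on_top(port_a)
    b.insert_on_top(port_b)
    port_a.connect(port_b)
    a.send_down("add(EVENT)")
    before = list(b.received)  # nothing delivered until the port is activated
    if port_b.activable():
        port_b.action()
    return before, b.received
```

The point of the design is visible in demo: a message only reaches the remote entity when some engine decides to fire the port's action, which is what lets a validation engine control the interleaving.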


[Figure 4 labels: classes PROTOCOL_ENTITY (send_down( ), send_up( ), insert_on_top( )), MESSAGE, DIARY_INTERFACE (consult( ), add( ), cancel( ), replace( ), ready-to-commit( ), done( ), commit( ), abort( )), ACTIVABLE (isActivable( ), action( )), PROXY, DIARY, USER, PORT and EVENT; associations and roles: protocol stack, /protocol stack, upper layer, lower layer, send/receive, /manipulate, proxy's local diary, peer]

Fig. 4. The transformed UML Model of the diary (Class Diagram)

Dynamic Model Transformations. In our framework, the transitions of the UML dynamic model are translated to methods of an OO language according to the following design rules:

– the triggering event of the transition is transformed into a method parameter,

– the starting state, plus optional conditions on the event parameters or other conditions on local variables, can be specified in a precondition,

– the arrival state is specified in the postcondition,

– the method body must be implemented so as to guarantee the postcondition.

Then we have to provide some implementation details for the control of the automaton. In our framework, the receive method is the "engine" of the automaton associated with protocol entities. It fires transitions (i.e. calls the relevant method) depending on both the received event type and the state in which the object is when the receive method is invoked. The implementation of such a method thus needs a double dispatch operation, for which several well-known implementation techniques exist (see e.g. the State design pattern from [4]).

3.5 Generation of the Validation Code

Once all necessary transformations have been carried out, UMLAUT proceeds with the generation of the code that will eventually be plugged into VALOODS for the actual validation to take place. This is done by walking through the connected graph stored in UMLAUT in the form of instances of UML Meta-Model classes. The principle is very similar to the generation of implementation code as usually performed by various CASE tools, except that the code is produced to fit the VALOODS framework.


Because the first prototype of VALOODS was written with Eiffel [14], UMLAUT also generates Eiffel code. As a side effect, the validity constraints defined on the model (see Sect. 3.3) and the pre- and post-conditions introduced in the dynamic model (see Sect. 3.4) directly map to Eiffel assertions whose violation can be trapped by the Eiffel execution environment.
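To make the mapping from transitions to asserted methods concrete, here is a minimal sketch in Python rather than Eiffel, with plain assert statements standing in for Eiffel's require and ensure clauses. The Participant class, its two states and its event names are hypothetical and much simpler than the Diary automaton; only the design rules (event as parameter, starting state as precondition, arrival state as postcondition, receive as dispatcher on state and event type) are taken from the text.

```python
class ProtocolError(Exception):
    """Raised when an event arrives in a state that has no transition for it."""


class Participant:
    """Hypothetical two-state automaton: IDLE --add--> READY --commit--> IDLE."""

    def __init__(self):
        self.state = "IDLE"
        self.pending = None
        self.committed = []

    def on_add(self, event):
        # Precondition: the starting state of the transition.
        assert self.state == "IDLE", "precondition violated"
        self.pending = event
        self.state = "READY"
        # Postcondition: the arrival state of the transition.
        assert self.state == "READY" and self.pending == event

    def on_commit(self, _payload):
        assert self.state == "READY", "precondition violated"
        self.committed.append(self.pending)
        self.pending = None
        self.state = "IDLE"
        assert self.state == "IDLE" and self.pending is None

    def receive(self, message):
        # Dispatch on both the current state and the type of the event.
        kind, payload = message
        transitions = {
            ("IDLE", "add"): self.on_add,
            ("READY", "commit"): self.on_commit,
        }
        handler = transitions.get((self.state, kind))
        if handler is None:
            raise ProtocolError(f"no transition for {kind!r} in state {self.state}")
        handler(payload)
```

A lookup table keyed by (state, event type) is only one way to realise the double dispatch; the State pattern cited in the text is another.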

4 Validation in the VALOODS Framework

Within this framework, a validation process may be carried out in a seamless way. Since our system can now be compiled to a reactive program offering a set of transitions (guarded by activation conditions) located in the activable objects, we have many opportunities to apply the basic technologies that have been developed in the context of FDT based formal validation.

4.1 Model-Checking

If we want to try the model-checking road, we can use a driver setting the system in its initial state and then constructing its reachability graph by exploring all the possible paths allowed by activable transitions. The only problem is to be able to externalize the relevant global state (made of the states of the Proxy, Diary and User objects, plus the state of the communication queues). We basically solve this problem by leveraging the Memento pattern [4]. The main drawback of this approach is that global state manipulations (comparison, insertion in the table, etc.) are then very costly operations that could compromise large scale model-checking.

4.2 Intensive Simulation

For larger systems, an intensive simulation (randomly following paths in the reachability graph) would probably be a more fruitful avenue. Running such a simulation involves the use of a scheduler object implementing a redefinable scheduling policy among the activable transitions (e.g., random selection). It is also possible to observe the system, using an observer, as in Veda [9]. An observer is a program that captures and analyzes information about the execution. It can see every interaction exchanged in the system, and also every internal state of a module. A protocol sequencing error is detected as a precondition violation on the observer (such an error is detected e.g. when the PORT corrupts data). In the case of VALOODS, the execution environment then allows the user to precisely locate and delimit the responsibility for the error, by providing an exception history trace including a readable dump of the call stack. For more abstract or complicated properties to be checked on real systems (e.g., that a service behaves like a FIFO and is live), the observer object could be derived automatically from higher level specification languages (e.g., temporal logic specifications).
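A random-selection scheduler with an observer can be sketched as follows. This is a minimal Python illustration under assumptions: the Producer and Channel classes, the FIFO property being observed and the simulate function are all hypothetical; only the ideas of a scheduling policy over activable transitions and of an observer checked after every step come from the text.

```python
import random


class Channel:
    """Hypothetical FIFO channel: its action delivers the oldest message."""

    def __init__(self):
        self.queue = []
        self.delivered = []

    def activable(self):
        return bool(self.queue)

    def action(self):
        self.delivered.append(self.queue.pop(0))


class Producer:
    """Hypothetical pro-active object sending the numbers 0..count-1."""

    def __init__(self, channel, count):
        self.channel = channel
        self.count = count
        self.next = 0

    def activable(self):
        return self.next < self.count

    def action(self):
        self.channel.queue.append(self.next)
        self.next += 1


def fifo_observer(channel):
    # Observed property: messages are delivered in the order they were sent.
    assert channel.delivered == sorted(channel.delivered), "FIFO order violated"


def simulate(objects, observer, seed, max_steps=1000):
    """Random walk: fire one randomly chosen activable object per step."""
    rng = random.Random(seed)  # seeded, so that a run can be reproduced
    steps = 0
    while steps < max_steps:
        enabled = [obj for obj in objects if obj.activable()]
        if not enabled:
            break  # no fireable transition left
        rng.choice(enabled).action()
        observer()
        steps += 1
    return steps
```

Seeding the random generator is a deliberate choice here: it makes a failing run replayable, which addresses the reproducibility difficulty mentioned for tests on real distributed executions.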

4.3 Validation Results on the Diary

The scenario presented in Fig. 3 illustrates the ideal situation of a transaction that is conducted without any problem. However, a major problem with the two phase commit protocol is that the coordinator may block, indefinitely waiting for the answer of a site that is permanently down. After the coordinator has started a transaction (with a broadcast to all sites), it waits for an answer from each site to know whether they accept the transaction (ready-to-commit). If they do, the coordinator enters the second phase and sends a commit message to the remote sites and then waits for an acknowledgment. The problem arises if a remote site crashes (and stays down) after it has agreed to accept the transaction, but before it could process the commit message sent by the coordinator. If this happens, the coordinator stalls. This scenario can easily be deduced from Fig. 3 by just removing one of the done messages, i.e. message number 12 or 13.

When the scheduler reproduces this particular sequence of messages by triggering the corresponding activable objects, it detects that no more transitions can be fired to leave the waiting state because the guard (all done received) is false. The only fireable transitions left consist in starting a new transaction from a remote site that has committed the previous transaction. Since new transactions are bound to fail in a similar manner, all non-crashed sites will successively come into a blocked state. Eventually, there is no fireable transition left and VALOODS issues a deadlock diagnosis and is able to produce an execution trace.

Ideally, when the scheduler has driven the system into such a faulty state, the framework would transpose the trace (which may not be easy to read) into an equivalent UML interaction diagram (a sequence diagram or a collaboration diagram such as the one in Fig. 3) representing the critical scenario. This interaction diagram would then be integrated in the original UML model of the system, providing the designer with a diagnosis of the problem in the notation that they are familiar with. This feedback allows for correction of the UML design so as to solve the problem, as outlined in Fig. 1.
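The blocking scenario can be reproduced on a toy model. The following Python sketch is an assumption-laden simplification, not the generated Diary code: there is one coordinator and two remote sites, a crashing site goes down right after answering ready-to-commit, the schedule is fixed rather than random, and sites never start transactions of their own. The class and function names are hypothetical.

```python
from collections import deque


class Site:
    """A remote proxy; may crash right after answering ready-to-commit."""

    def __init__(self, name, crash_after_ready=False):
        self.name = name
        self.inbox = deque()
        self.crash_after_ready = crash_after_ready
        self.crashed = False

    def activable(self):
        return not self.crashed and bool(self.inbox)

    def action(self, coordinator):
        message = self.inbox.popleft()
        if message == "add":
            coordinator.inbox.append(("ready-to-commit", self.name))
            self.crashed = self.crash_after_ready
        elif message == "commit":
            coordinator.inbox.append(("done", self.name))


class Coordinator:
    def __init__(self, sites):
        self.sites = sites
        self.inbox = deque()
        self.state = "idle"
        self.answers = {"ready-to-commit": set(), "done": set()}

    def activable(self):
        return self.state == "idle" or bool(self.inbox)

    def broadcast(self, message):
        for site in self.sites:
            site.inbox.append(message)

    def action(self):
        if self.state == "idle":
            self.broadcast("add")
            self.state = "waiting-ready"
            return
        kind, name = self.inbox.popleft()
        self.answers[kind].add(name)
        if len(self.answers[kind]) < len(self.sites):
            return  # the guard "all answers received" is still false
        if kind == "ready-to-commit":
            self.broadcast("commit")
            self.state = "waiting-done"
        else:
            self.state = "committed"


def run(crashing_site=None):
    """Fire transitions until none is left; report the verdict and a trace."""
    sites = [Site(n, crash_after_ready=(n == crashing_site)) for n in ("A", "B")]
    coordinator = Coordinator(sites)
    trace = []
    while True:
        if coordinator.activable():
            coordinator.action()
            trace.append(("coordinator", coordinator.state))
            continue
        enabled = [site for site in sites if site.activable()]
        if not enabled:
            break  # no fireable transition left
        enabled[0].action(coordinator)
        trace.append((enabled[0].name, "step"))
    verdict = "committed" if coordinator.state == "committed" else "deadlock"
    return verdict, coordinator.state, trace
```

Calling run() ends in the committed state, while run("B") stops with the coordinator still waiting for the missing done answer; the returned trace plays the role of the execution trace mentioned above.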

5 Conclusion

We have shown the interest and feasibility of integrating formal verification and validation techniques in an established OO life-cycle for the construction of correct OO distributed software systems. On the Distributed Diary system example, we have described how a continuous validation framework can be set up to go smoothly from the OO analysis to the OO implementation of a validated distributed system. But this approach is not limited to simple problems: the intensive simulation techniques have already been used on real OO systems, e.g., the implementation of a parallel SMDS server where it has allowed us to detect non-trivial problems at early stages of the life-cycle [11].


One of the current limitations of UMLAUT/VALOODS for model-checking is the cost of global state manipulations. A solution that we are currently investigating is to interface the VALOODS framework with open validation tools such as the CADP environment [3], in order to leverage the huge know-how they have accumulated to deal with this kind of problem. UMLAUT can be extended to automatically generate from the transformed model a set of routines sufficient to implement the API needed by tools such as those in CADP. The set typically comprises routines to get to the initial state, fire transitions or compare arbitrary states. This API is also used by some test-generation tools that could be integrated in the development life-cycle presented in Fig. 1. Concurrently, we plan to consolidate and extend VALOODS to deal with higher level interactions between distributed objects (e.g. in the context of CORBA). Once UMLAUT/VALOODS is a truly usable validation framework, we will make it widely available (see http://www.irisa.fr/pampa/UMLAUT/).
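The three kinds of routines mentioned above (initial state, transition firing, state comparison) are enough to drive an exhaustive exploration. The Python sketch below is a generic illustration and does not reproduce the actual CADP interface: the explore function, the successors callback and the toy two-counter system are assumptions, and state comparison is delegated to hashing and equality of tuples.

```python
from collections import deque


def explore(initial, successors):
    """Breadth-first construction of a reachability graph.

    `successors(state)` yields (label, next_state) pairs; states must be
    hashable so that already visited states can be recognised.
    """
    seen = {initial}
    edges = []
    terminal = []  # states in which no transition can be fired
    frontier = deque([initial])
    while frontier:
        state = frontier.popleft()
        outgoing = list(successors(state))
        if not outgoing:
            terminal.append(state)
        for label, nxt in outgoing:
            edges.append((state, label, nxt))
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen, edges, terminal


def two_counters(state):
    """Toy system: two independent counters, each may step once from 0 to 1."""
    a, b = state
    if a < 1:
        yield ("inc_a", (a + 1, b))
    if b < 1:
        yield ("inc_b", (a, b + 1))
```

For the toy system, explore((0, 0), two_counters) visits four states through four transitions and reports (1, 1) as the only state without a fireable transition. The seen set is where the cost of global state manipulation discussed above shows up once states become large.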

References

[1] P. A. Bernstein and N. Goodman. An algorithm for concurrency control and recovery for replicated databases. ACM Transactions on Database Systems, 9(4), December 1984.
[2] CCITT. SDL, Recommendation Z.100, 1987.
[3] J.-C. Fernandez, H. Garavel, L. Mounier, C. Rodriguez, A. Rasse, and J. Sifakis. A toolbox for the verification of programs. In International Conference on Software Engineering, ICSE'14, Melbourne, Australia, pages 246–259, May 1992.
[4] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison Wesley, 1994.
[5] S. Graf, J.L. Richier, C. Rodriguez, and J. Voiron. What are the limits of model checking methods for the verification of real life protocols? In Proceedings of the International Workshop on Automatic Verification Methods for Finite State Systems, Grenoble, France, June 1989. Springer-Verlag, LNCS #407, pages 189–196.
[6] ISO. LOTOS, A Formal Description Technique Based on the Temporal Ordering of Observational Behaviour. ISO/DP 8807, March 1985.
[7] ISO. Estelle: a Formal Description Technique based on an Extended State Transition Model. ISO 9074 TC97/SC21/WG6.1, 1989.
[8] M.A. Jackson. System Development. Prentice-Hall International, Series in Computer Science, 1985.
[9] C. Jard, R. Groz, and J.F. Monin. Development of VEDA: a prototyping tool for distributed algorithms. In IEEE Trans. on Software Engin., volume 14,3, pages 339–352, March 1988.
[10] J.-M. Jézéquel. Experience in validating protocol integration using Estelle. In Proc. of the Third International Conference on Formal Description Techniques, Madrid, Spain, November 1990.
[11] J.-M. Jézéquel, X. Desmaison, and F. Guerber. Performance issues in implementing a portable SMDS server. In IFIP, editor, 6th International IFIP Conference On High Performance Networking, pages 267–278. Chapman & Hall, London, September 1995.
[12] Jean-Marc Jézéquel and François Pennaneac'h. Preliminary ideas for validating distributed oo software. In Workshop on Models, Formalisms and Methods for Object-Oriented Distributed Computing (ECOOP'97 Workshop #6), Finland, June 1997.
[13] S. Kirani. Specification and Verification of Object-Oriented Programs. PhD thesis, University of Minnesota, 1994.
[14] B. Meyer. Eiffel: The Language. Prentice-Hall, 1992.
[15] D. E. Monarchi and G. I. Puhr. A research typology for object-oriented analysis and design. Communications of the ACM, 35(9):35–47, September 1992.
[16] R. M. Poston. Automated testing from object models. Communications of the ACM, 37(9):48–58, September 1994.

Supporting Disciplined Reuse and Evolution of UML Models

Tom Mens, Carine Lucas, Patrick Steyaert
Programming Technology Lab, Vrije Universiteit Brussel
Pleinlaan 2 - 1050 Brussels - BELGIUM
{ tommens | clucas | prsteyae }@vub.ac.be

Abstract. UML provides very little support for modelling evolvable or reusable specifications and designs. To cope with this problem, the UML needs to be extended with support for reuse and evolution of model components. As a first step, this paper enhances the UML metamodel with the “reuse contract” formalism to deal with evolution of collaborating class interfaces. Such a formal semantics for reuse allows us to detect evolution and composition conflicts automatically.

J. Bézivin and P.-A. Muller (Eds.): «UML» '98, LNCS 1618, pp. 378–392, 1999. © Springer-Verlag Berlin Heidelberg 1999

1 Introduction

During the last two decades an entire range of mechanisms has been developed to support reuse and evolution of implementation-level components such as classes and objects. Some indicative examples are inheritance [13], late-binding polymorphism, object-oriented frameworks [14], meta-object protocols [4] and aspect-oriented programming [6]. Although significantly more benefits can be gained from reuse during the analysis and design phase than during the implementation phase, there is much less support for reuse during these early phases of the software life cycle.

Some support for model reuse does exist. Examples are "facades" and "variation points" [2] and "synthesis" of role model components [11]. In general, however, it is not clear what an analysis or design component is, and even less clear how such model components can be composed and reused. Taking UML 1.1 [10] as a representative example, we observe that it does not provide adequate support for dealing with reusable and evolvable model elements. To go beyond the reuse of single classes, packages can be used to encapsulate model elements and pattern structures, and templates can be used to define generic models. However, customisation of packages containing arbitrary nested model elements is poorly supported in UML.

Experience with implementation reuse has taught us that building and reusing components is best supported by an iterative process. The reuser can only gain insight into the qualities of reusable components by reusing them in new applications. The provider can only improve the qualities of components if the experience of reuse is fed back to him. Unlike what is sometimes believed, this iteration must be sustained beyond the initial iterations for making reusable components [1]. Successful components can have a long life-span and thus need to evolve and adapt to new reusers and new requirements. To be able to sustain the iteration that underlies successful reuse, the issue of how reusers can be supported in upgrading their applications to improved components must be addressed.

Therefore, we introduce disciplined reuse as a form of reuse where a maximal degree of consistency is maintained between reusable components and the systems in which they are reused. In the absence of disciplined reuse, the provider of a component cannot easily benefit from the improvements made by a reuser or from the knowledge that is gained by reusing the component. The reuser does not benefit from improvements made to the component afterwards, nor from the improvements made by other reusers. Moreover, non-disciplined reuse leads to serious maintenance and version management problems. No consistency exists between the model used by the provider and the modified model employed by the reuser. So both models need to be maintained separately, ultimately leading to a proliferation of different versions.

With few exceptions, the most widespread form of reuse at analysis and design level is copy-and-paste reuse. With this kind of reuse the reuser takes a copy of a model and changes it to fit new requirements without maintaining any form of consistency with the original model. Obviously, copy-and-paste reuse of analysis and design models as practised today is not adequate for the needs of organisations that want to employ reuse in a systematic way. Today, organisations are investing in corporate-wide models, in order to be able to reuse them in their applications. Even more, industry-wide initiatives such as OMG are defining models that can be reused by large sections of the industry.
These models are intended to be shared by a large number of people over a long period of time, thereby yielding a maximum return on investment but also a maximum potential for applications that are open towards each other (because they share a common model). For this to become reality, these models must not only be shared (rather than copied) by different reusers, but the model itself must be able to evolve, while keeping reusers consistent with it.

The remainder of this paper is outlined as follows. Section 2 discusses the basic concepts and terminology for reuse and evolution of model components. Section 3 presents the reuse contract formalism as a notation and semantics for dealing with modification of UML class collaborations, and shows how it can be incorporated in the UML metamodel. The ideas are based on our previous work on reuse contracts: in [12] reuse contracts were defined as a means for maintaining the consistency between an evolving parent class and its subclasses, and in [7] this idea was extended and formally defined for collaborating classes. Section 4 explains how the reuse contract formalism allows us to detect conflicts during evolution and reuse of model components. Finally, section 5 concludes and discusses some future work.

Although we will not discuss practical applications in this paper, the inspiration for what is presented comes from very practical problems, such as how to maintain the consistency in a family of analysis and design models, and how to deal with reuse and evolution as early as possible in the life-cycle. This is also why UML was chosen to express our ideas, as UML has become a standard modelling notation.

2 Reuse and Evolution of Model Components

2.1 Components Are the Units of Reuse

The term component is interpreted very broadly by most members of the OO community. A component can be a single class, a library of classes, a set of objects that collaborate, a full-fledged framework, and so on. It can be a piece of code, or a model, or even a combination of both. The common characteristic is that components are the units of reuse. Obviously, what is considered a unit of reuse heavily depends on the kind of reuse technique that is adopted: with a copy-and-paste reuse mechanism any part of an object model can be considered a component. Since we are only interested in disciplined forms of reuse we will restrict ourselves to more coherent components. Moreover, we will only look at components defined at the modelling level.

In this paper we focus on model components consisting of collaborating class interfaces. We have chosen this kind of model component because collaborations are good building blocks for modelling object-oriented application families. In fact, collaboration patterns are the modelling-level equivalent of object-oriented frameworks. Their static aspects are represented semantically in UML by means of collaborations. At specification level, a collaboration describes the entities (called the classifier roles) that participate in the collaboration, their interfaces, and the relationships between them (described by association roles). At instance level, the collaboration presents instances and links conforming to the classifier roles and association roles in the specification. The dynamic aspects of the collaboration are represented by interactions that describe the object interaction behaviour.

2.2 Incremental Component Modification

Most components need to be adapted or customised for reuse. The simplest way to modify a component is by editing the component itself.
This has obvious deficiencies: it is not clear how the component is adapted, and the original component is no longer available afterwards. So, in general, an incremental modification mechanism is preferred. Incremental modification has been explored mostly at the programming level with inheritance being the most wide-spread incremental modification mechanism [13]. At analysis and design level the notion of incremental modification is less well developed. The generalisation relationship can be used to specify incremental modification, but only to add more information (since the more specific element needs to be substitutable with the more general element). This is too restrictive for our purposes. An alternative is to use the dependency relationship. This is a common mechanism that can be used to indicate a situation in which a change to the target element (the “supplier”) requires a change to the dependent source element (the


Tom Mens, Carine Lucas, and Patrick Steyaert

“client”). In the UML metamodel, the dependency relationship, depicted by a dashed arrow from client to supplier, is specialised to “refinement”, “usage”, “trace” and “binding” relationships. Unfortunately, these relationships are not directly suitable to express reuse or evolution. For this reason, we need to define our own specialisation of the dependency relationship.

2.3 Evolution and Composition Conflicts

As discussed in section 1, developing reusable components is an iterative process. It is therefore very important that components can evolve. However, component evolution involves a certain cost, since reusers must frequently upgrade to the new version. This can lead to several problems: the behaviour of the evolved component has changed, properties of the component that were valid before do not hold anymore, and so on. These kinds of conflicts are referred to as evolution conflicts.

Furthermore, a component that is reused improperly may cause unexpected behaviour, both in the reuser and in the component itself. Or, even worse, two components that exhibit correct behaviour when reused separately may cause errors when reused together in the same system. These kinds of conflicts are called composition conflicts.

Conflicts show up during evolution or composition because properties that were relied on by reusers have become invalid. At the programming level composition and evolution conflicts result in erroneous or unexpected behaviour [5, 12]. From a modelling perspective, composition and evolution conflicts may result in a model that is inconsistent, or in a model that does not have the meaning intended by the different reusers. An example will be given in section 4.

3 Reuse Contracts

In the rest of this paper we show how the formalism of reuse contracts [7, 12] can be incorporated into the UML language by directly extending its metamodel. As in the UML semantics document, the abstract syntax will be given by means of UML class diagrams, while additional well-formedness rules will be expressed in semi-formal natural language. We have deliberately chosen not to express the constraints in OCL because OCL does not have a complete formal semantics, which leads to many problems, ambiguities and open questions [9, 15]. Moreover, there is virtually no support for OCL in existing CASE tools that have adopted the UML notation.

3.1 Informal Discussion

The idea behind reuse contracts is that components are reused on the basis of an explicit contract between the provider of the component and a reuser that modifies this component. The purpose of a contract is to make reuse more disciplined. For this


purpose, the provider and the reuser have obligations. The primary obligation of the provider is to document how the component can be reused. The reuser needs to document how the component is reused or how the component evolves. Both the provider's and reuser's documentation must be in a form that makes it possible to detect the impact of changes, and the actions the reuser must undertake to "upgrade" if a certain component has evolved. In summary, a reuse contract helps in keeping the model of the provider consistent with the model of the reuser.

To deal with evolution, the provider needs to document what properties of the component can be relied on at a particular point in time. These properties are specified in the provider clause. This provider clause provides only a certain view on the component. Thus a component can participate in different reuse contracts that address different concerns of the provided component. The reuser clause is used to document the changes made to the provided component. The contract type expresses how the provided component is reused. Possible contract types include extension, cancellation, refinement and coarsening. The contract type imposes obligations, permissions and prohibitions onto the reuser. For example, the extension contract type obliges reusers to add new elements, but prohibits overriding of existing elements. It permits adding multiple elements at once. Contract types and the obligations, permissions and prohibitions they impose are fundamental to disciplined reuse, as they are the basis for detecting conflicts when components evolve.

3.2 Provider Clause Notation

In order to deal with model components consisting of collaborating class interfaces, the provider clause must be a stereotyped «provider clause» package consisting of two parts: a collaboration specification and a set of interactions owned by the collaboration. Their corresponding diagrams are encapsulated in stereotyped «collaboration» and «interaction» packages respectively¹, as illustrated in the left part of Fig. 1. Such a collaboration between template class interfaces (indicated by a dashed rectangle) can be instantiated to an actual collaboration between existing classes by using the so-called pattern notation. This pattern fills in the different roles of the collaboration (Browser and Document) with links to actual classes (or class interfaces). As a result, these classes must satisfy the behaviour imposed by the collaboration. The notation for this is shown on the right in Fig. 1.

The example of Fig. 1 represents part of the design for navigation in a web browser. There are only two important participants: Browser and Document. These classifier roles can communicate with each other over the association roles browser and doc. The Document interface contains two operations: mouseClick describes what happens when the mouse is clicked in some part of the document, and resolveLink expresses what happens when a hyperlink is followed in the document.

¹ We are aware of the fact that “collaboration” and “interaction” are reserved keywords, but they are the most obvious names to choose for a stereotyped package containing a Collaboration and an Interaction respectively.

The Browser interface also contains operations that are important for navigation: handleClick and getURL.

[Figure: the «provider clause» package WebNavigation contains a «collaboration» package (roles Browser, with operations handleClick and getURL, and Document, with operations mouseClick and resolveLink, connected by the association roles browser and doc) and two «interaction» packages: mouseClicking (1: mouseClick, 1.1: resolveLink) and linkResolving (1: getURL). On the right, the WebNavigation pattern with its roles Browser and Document.]

Fig. 1. WebNavigation provider clause and pattern notation for instantiating it.

There are two interactions corresponding to the collaboration. In linkResolving part of the navigation behaviour is made explicit: when resolveLink is invoked in an instance of Document, a message getURL is sent to an instance of Browser to fetch the contents of the web page pointed to by the hyperlink. The mouseClicking interaction describes what happens if a mouse click is detected by the browser. When this click occurs inside a document, handleClick sends a mouseClick message to an instance of Document, which determines if this mouse click causes a link to be followed. If this is the case, the resolveLink self send is issued.
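As an illustration only, the WebNavigation provider clause described above can be written down as plain data. The class names `Message` and `ProviderClause`, the field layout, and the direction assigned to the association roles `doc` and `browser` are assumptions of this sketch; they are not part of the UML metamodel or of the reuse contract formalism.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    """A message send: `activator` (an operation of `sender`) invokes `operation` on `receiver`."""
    sender: str
    activator: str
    receiver: str
    operation: str


@dataclass
class ProviderClause:
    """Hypothetical container for one collaboration and the interactions it owns."""
    name: str
    # classifier role name -> operations of its interface
    roles: dict[str, list[str]] = field(default_factory=dict)
    # association role name -> (source role, target role); the direction is assumed
    associations: dict[str, tuple[str, str]] = field(default_factory=dict)
    # interaction name -> messages shown in that interaction
    interactions: dict[str, list[Message]] = field(default_factory=dict)


web_navigation = ProviderClause(
    name="WebNavigation",
    roles={
        "Browser": ["handleClick", "getURL"],
        "Document": ["mouseClick", "resolveLink"],
    },
    associations={
        "doc": ("Browser", "Document"),
        "browser": ("Document", "Browser"),
    },
    interactions={
        "mouseClicking": [
            Message("Browser", "handleClick", "Document", "mouseClick"),
            Message("Document", "mouseClick", "Document", "resolveLink"),  # self send
        ],
        "linkResolving": [
            Message("Document", "resolveLink", "Browser", "getURL"),
        ],
    },
)

# List every message send, grouped by interaction.
for name, messages in web_navigation.interactions.items():
    for m in messages:
        print(f"{name}: {m.sender}.{m.activator} -> {m.receiver}.{m.operation}")
```

Keeping the collaboration (roles and association roles) separate from the interactions mirrors the two parts of the provider clause package.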

[Figure: class diagram with the classes Application, BrowsableDoc, InternetBrowser, pdf-Doc and html-Doc, on which the collaboration patterns WebNavigation and PDFNavigation are instantiated through their Browser and Document roles.]

Fig. 2. Instantiation of provider clauses WebNavigation and PDFNavigation. The Browser role of WebNavigation is filled in by InternetBrowser, and the Document role is filled in by BrowsableDoc. PDFNavigation depends on WebNavigation, as will be shown in Fig. 5.

Fig. 2 shows how the WebNavigation collaboration pattern can be used to express an actual collaboration between different classes in a class diagram. Different collaboration patterns can be used to indicate the many different roles that classes play (in the figure we have mentioned only WebNavigation and PDFNavigation), and the same pattern can be used in different places of the diagram, with different classes filling in the roles of the collaboration pattern.

3.3 Provider Clause Semantics

In the UML metamodel of Fig. 3, a ProviderClause is defined as a specialisation of Package. The part about Collaborations and Interactions is taken over from the UML semantics document. Collaboration specifies which entities participate in the reusable component. Each participant, called ClassifierRole, is connected to an Interface that has a name and owns an ordered set of Operations. The relation between the different ClassifierRoles is specified by means of AssociationRoles. The Collaboration owns a set of Interactions (encapsulated in «interaction» packages) that describe how instances of the ClassifierRoles interact by means of message sends. Many different Interactions can correspond to the same Collaboration.

[Figure: UML class diagram relating ProviderClause, «collaboration» Package, «interaction» Package, Collaboration, Interaction, ClassifierRole, AssociationRole, AssociationEndRole, Interface, Operation, Message and Request.]

Fig. 3. Collaborations and interactions in UML metamodel

Many constraints need to be satisfied by the different elements of Fig. 3. For example, consistency needs to be maintained between all Interactions and the Collaboration specified in the provider clause. Fortunately, many of these consistency constraints are already defined in the UML semantics. Since we are dealing with reuse contracts, we need to impose a number of additional requirements²:

1. Each Interaction corresponding to a certain Collaboration must be owned (indirectly) by the same ProviderClause.
2. The base of each ClassifierRole must be an Interface. Consequently, each availableFeature in a ClassifierRole must be an Operation.
3. The specification of each Message in an Interaction must be an Operation that is an availableFeature in the ClassifierRole which is the receiver of the Message.
4. The specification of the activator of each Message must be an Operation that is an availableFeature in the ClassifierRole which is the sender of the Message.
5. An Interaction can only contain a Message if its sender and receiver ClassifierRoles are connected by means of an AssociationRole in the Collaboration which is the context of the Interaction.

A final issue that is not covered in the UML semantics is how Interactions can be kept mutually consistent. In our view, when two or more Interactions correspond to the same Collaboration, they need to describe independent behaviour. In other words, the same behaviour should not be written twice in different Interactions, and the same operation may not exhibit different behaviour in different Interactions. This can be guaranteed by prohibiting each operation from playing the role of sender in more than one Interaction³. In other words, all messages sent by the same operation must be shown in the same Interaction. For example, the interactions linkResolving and mouseClicking of Fig. 1 are independent. The linkResolving interaction only describes the behaviour of resolveLink, while mouseClicking only describes the operations invoked by handleClick and mouseClick.

² Some of these well-formedness rules have been given in mathematical notation in [7] and [8].
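Rules 3 to 5 and the independence requirement lend themselves to a mechanical check. The following sketch assumes a hypothetical plain-data encoding of the WebNavigation provider clause (the constant names, the tuple layout, and the function `check_provider_clause` are all illustrative). It also assumes that a self send needs no association role, which the rules above do not state explicitly.

```python
from collections import defaultdict

# Hypothetical plain-data encoding of the WebNavigation provider clause (Fig. 1).
ROLES = {
    "Browser": {"handleClick", "getURL"},
    "Document": {"mouseClick", "resolveLink"},
}
# Pairs of classifier roles connected by an association role (direction ignored here).
ASSOCIATIONS = {("Browser", "Document")}
# interaction name -> messages as (sender role, activator, receiver role, invoked operation)
INTERACTIONS = {
    "mouseClicking": [
        ("Browser", "handleClick", "Document", "mouseClick"),
        ("Document", "mouseClick", "Document", "resolveLink"),
    ],
    "linkResolving": [
        ("Document", "resolveLink", "Browser", "getURL"),
    ],
}


def check_provider_clause(roles, associations, interactions):
    """Return a list of violations of rules 3-5 and of the independence requirement."""
    errors = []
    sent_in = defaultdict(set)  # (role, operation) -> interactions in which it sends messages
    for name, messages in interactions.items():
        for sender, activator, receiver, operation in messages:
            if operation not in roles.get(receiver, set()):  # rule 3
                errors.append(f"{name}: {receiver} has no operation {operation}")
            if activator not in roles.get(sender, set()):  # rule 4
                errors.append(f"{name}: {sender} has no operation {activator}")
            # Rule 5. Assumption: a self send needs no association role.
            connected = (
                sender == receiver
                or (sender, receiver) in associations
                or (receiver, sender) in associations
            )
            if not connected:
                errors.append(f"{name}: no association role between {sender} and {receiver}")
            sent_in[(sender, activator)].add(name)
    for (role, operation), names in sent_in.items():
        if len(names) > 1:  # an operation may be a sender in one interaction only
            errors.append(f"{role}.{operation} sends messages in {sorted(names)}")
    return errors


print(check_provider_clause(ROLES, ASSOCIATIONS, INTERACTIONS))  # prints []
```

Rules 1 and 2 concern package ownership and the kind of each base classifier; they hold by construction in this flat encoding and are therefore not checked.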

3.4 Reuser Clauses

Besides a provider clause, a reuse contract also contains a reuser clause that describes the incremental modifications that are made to the provider clause. The modifications can be made by adding or removing information in different places in the provider clause. The easiest way to express this in UML is by using a tag-value pair with tag modification that can have two values: added and removed. A modelling element that is annotated with {modification=removed} will be removed in the reused model. If annotated with {modification=added} it will be added in the reused model. In the rest of this paper we will use the abbreviations {removed} and {added}. An example is given in Fig. 5.

Similar to ProviderClauses, ReuserClauses are defined in the extended UML semantics as a specialisation of Package, and are stereotyped with «reuser clause». Since reuser clauses enumerate the changes only, they contain only partial information. As a result, some of the well-formedness rules that were needed for provider clauses are no longer needed for reuser clauses. The structural part of the semantics of provider and reuser clauses is shown in Fig. 4.

³ This rather strong restriction could be weakened, but we have chosen not to do this here for the sake of simplicity.


3.5 Contract Types

The contract type is an annotation that expresses in which way the reuser clause incrementally modifies the provider clause. Contract types are specified by attaching an attribute contractType to the ReuseContract (see Fig. 4). The value of this attribute imposes constraints on the reuse contract in which it occurs, in the sense that the reuser clause must satisfy the requirements specified by the contract type. For example, a reuser clause corresponding to contract type “interaction refinement” is only allowed to add operation invocations, not to remove them. See [7] for a more detailed discussion of all possible contract types and the constraints they impose.

In Table 1 we have presented a basic set of orthogonal contract types. Together with the constraints they impose, they describe the primitive modifications that can be made to a provider clause. Observe that these basic contract types are sufficient to model all changes to our current model. When new kinds of model elements are added, however, extra contract types might be required. One example from [7] is the addition of contract types “participant concretisation” and “participant abstraction” when an annotation abstract or concrete is added to operations.

Table 1. Basic set of contract types

Contract type | Meaning | Constraint
collaboration extension | adding classifier roles to the collaboration | new classifiers must have a name that differs from existing ones
collaboration cancellation | removing classifier roles from the collaboration | the classifiers should not be referred to in the interactions
collaboration refinement | adding association roles between classifier roles | new association roles must have a name that differs from existing ones
collaboration coarsening | removing association roles from the collaboration | there should be no message sends over this association role in the interactions
participant extension | adding operations to classifier roles | new operations must have a name that differs from existing ones
participant cancellation | removing operations from classifier roles | the operations should not be referred to in any of the interactions
interaction extension | adding instances of classifier roles to an interaction | the added instances must correspond to an existing classifier in the collaboration
interaction cancellation | removing instances of classifier roles from an interaction | there should be no operation invocations to or from these instances
interaction refinement | adding operation invocations to an interaction | there should be an association role in the collaboration over which the invocation can take place, and the operations should be present in the classifier roles
interaction coarsening | removing operation invocations from an interaction | no constraint
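Three of the constraints in Table 1 can be sketched as predicates over a provider clause. The data layout and function names below are hypothetical, and the treatment of self sends (no association role required) is an assumption of the sketch rather than something the table states.

```python
# Hypothetical plain-data encoding of the WebNavigation provider clause (Fig. 1).
ROLES = {
    "Browser": {"handleClick", "getURL"},
    "Document": {"mouseClick", "resolveLink"},
}
ASSOCIATIONS = {("Browser", "Document")}
# Messages as (sender role, activator, receiver role, invoked operation).
MESSAGES = [
    ("Browser", "handleClick", "Document", "mouseClick"),
    ("Document", "mouseClick", "Document", "resolveLink"),
    ("Document", "resolveLink", "Browser", "getURL"),
]


def participant_extension_ok(roles, role, operation):
    """New operations must have a name that differs from existing ones."""
    return role in roles and operation not in roles[role]


def participant_cancellation_ok(messages, role, operation):
    """The operation must not be referred to in any of the interactions."""
    return all(
        (sender, activator) != (role, operation)
        and (receiver, invoked) != (role, operation)
        for sender, activator, receiver, invoked in messages
    )


def interaction_refinement_ok(roles, associations, message):
    """An association role must carry the invocation, and both operations must exist."""
    sender, activator, receiver, invoked = message
    connected = (
        sender == receiver  # assumption: self sends need no association role
        or (sender, receiver) in associations
        or (receiver, sender) in associations
    )
    return (
        connected
        and activator in roles.get(sender, set())
        and invoked in roles.get(receiver, set())
    )


print(participant_extension_ok(ROLES, "Document", "gotoPage"))    # True
print(participant_cancellation_ok(MESSAGES, "Browser", "getURL"))  # False
```

The second call fails because getURL is still invoked by resolveLink in the linkResolving interaction, so it cannot be cancelled before that invocation is removed.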


3.6 Reuse Contracts

In the extended UML metamodel of Fig. 4, a ReuseContract is modelled as a specialisation of a Dependency relationship. The supplier of the ReuseContract must be a ProviderClause, while the client of the ReuseContract must be a ReuserClause. Moreover, the ReuseContract must contain a contractType attribute. The complete definition of BasicProvider in Fig. 4 has been given earlier. The definition of BasicReuser is similar, except that it is not necessarily well-formed, and that its elements may be tagged with the value added or removed. Also, a BasicReuser does not require the presence of both collaboration and interactions. Only the diagram that is subject to modification needs to be mentioned. The parts in Fig. 4 that deal with composite providers and reusers are explained in the following subsection.

[Figure: UML class diagram. ReuseContract, with attribute contractType, specialises Dependency (which has an owningDependency and subDependencies). Its supplier is a ProviderClause and its client a ReuserClause, both specialisations of Package. ProviderClause is specialised into CompositeProvider (with a subcontract association to ReuseContract) and BasicProvider; ReuserClause is specialised into CompositeReuser (with an ordered subreuser association) and BasicReuser.]

Fig. 4. Extension of the UML metamodel with reuse contracts

3.7 Composite Reuse Contracts

The difference between a model component and its reused version can be quite large. Complex reuser clauses might be necessary to describe this difference. In order to reduce the complexity of reuser clauses, and to increase their understandability, we introduce composite reuser clauses. A CompositeReuser is an ordered sequence of reuser clauses, which can be composite reuser clauses again. The only restriction is that no cycles are introduced, i.e. a composite reuser clause cannot contain itself (either directly or indirectly). The order of the subclauses is important, since the provider clause will be incrementally modified by applying the subclauses in the specified order.

To be able to deal with composite reusers, we also need a notion of composite reuse contracts. For this we can make use of the fact that in UML a Dependency is allowed to have any number of subDependencies. With this in mind, we can define a composite ReuseContract as a dependency between a ProviderClause and a CompositeReuser clause, where this dependency contains as many subdependencies as there are subclauses in the composite reuser clause. Reuse contracts can thus be defined at different levels of granularity.

Finally, a reuse contract can not only be applied to a BasicProvider, but also to a reuse contract itself, by wrapping it inside a CompositeProvider. This is for example needed to deal with successive incremental modifications at different points in time.

Fig. 5 depicts a composite reuse contract that incrementally modifies the WebNavigation provider clause of Fig. 1 with a composite reuser clause PDFNavigation. The idea is to introduce a new kind of document that only contains hyperlinks that point to places within the document itself. For this reason, the targets of these links can be retrieved by the document itself. This is achieved by a composite reuser clause with three different subclauses. The first one adds an operation gotoPage to Document, the second one removes the operation invocation from resolveLink to getURL, and the last one adds an operation invocation from resolveLink to gotoPage. Each of these reuser clauses is part of a reuse contract, with contract type “participant extension”, “interaction coarsening” and “interaction refinement” respectively. These three reuse contracts are in turn subdependencies of a larger composite reuse contract with contract type “composite”.

[Figure: composite reuse contract, with {contract type = composite}, between «provider clause» WebNavigation and «reuser clause» PDFNavigation. It has three subcontracts: {contract type = participant extension}, which adds gotoPage {added} to the Document interface; {contract type = interaction coarsening}, which marks 1: getURL {removed} in the linkResolving interaction; and {contract type = interaction refinement}, which adds the self send 1: gotoPage {added} issued by resolveLink in the linkResolving interaction.]

Fig. 5. Example of a composite reuse contract that modifies WebNavigation to PDFNavigation. In an actual class diagram, both collaborations and the relation between them can be instantiated as illustrated in Fig. 2.
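The ordered application of the three PDFNavigation subclauses can be sketched as follows. The dictionary layout and the functions `apply_basic` and `apply_composite` are hypothetical, only the three contract types used in Fig. 5 are implemented, and the constraints of Table 1 are deliberately not checked here.

```python
import copy

# Hypothetical encoding of the WebNavigation provider clause (Fig. 1).
WEB_NAVIGATION = {
    "roles": {
        "Browser": {"handleClick", "getURL"},
        "Document": {"mouseClick", "resolveLink"},
    },
    # interaction -> messages as (sender role, activator, receiver role, invoked operation)
    "interactions": {
        "mouseClicking": [
            ("Browser", "handleClick", "Document", "mouseClick"),
            ("Document", "mouseClick", "Document", "resolveLink"),
        ],
        "linkResolving": [("Document", "resolveLink", "Browser", "getURL")],
    },
}

# The composite reuser clause PDFNavigation as an ordered list of basic subclauses.
PDF_NAVIGATION = [
    ("participant extension", ("Document", "gotoPage")),
    ("interaction coarsening",
     ("linkResolving", ("Document", "resolveLink", "Browser", "getURL"))),
    ("interaction refinement",
     ("linkResolving", ("Document", "resolveLink", "Document", "gotoPage"))),
]


def apply_basic(model, contract_type, change):
    """Apply one basic subclause and return a new model (the input is not modified)."""
    model = copy.deepcopy(model)
    if contract_type == "participant extension":
        role, operation = change
        model["roles"][role].add(operation)
    elif contract_type == "interaction coarsening":
        interaction, message = change
        model["interactions"][interaction].remove(message)
    elif contract_type == "interaction refinement":
        interaction, message = change
        model["interactions"][interaction].append(message)
    else:
        raise NotImplementedError(contract_type)
    return model


def apply_composite(model, subclauses):
    """Apply the subclauses in the specified order."""
    for contract_type, change in subclauses:
        model = apply_basic(model, contract_type, change)
    return model


pdf_model = apply_composite(WEB_NAVIGATION, PDF_NAVIGATION)
print(pdf_model["interactions"]["linkResolving"])
```

The order matters in this example: the refinement that invokes gotoPage would violate its Table 1 constraint if it were applied before the participant extension that introduces gotoPage.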

To deal with composite reuse contracts, the metamodel presented in Fig. 4 has to satisfy the following well-formedness rules:

1. The subreuser relationship (on ReuserClause), the subcontract relationship (on ReuseContract) and the derived subDependencies relationship (on ReuseContract) should not introduce any cycles.
2. If a ReuseContract has subDependencies, then its client must be a CompositeReuser whose subreusers are clients of each subDependency of the original ReuseContract. Moreover, the supplier of each subDependency must be owned by the supplier of the ReuseContract.
3. If the contractType of a ReuseContract is any of the basic contract types of Table 1, then the client is a BasicReuser. If the contractType is something else, then the client is a CompositeReuser.
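The first rule amounts to cycle detection in a directed graph. A minimal sketch for the subreuser relationship is given below, using a depth-first search; the mapping from clause name to subclause names, and the subclause names themselves, are hypothetical. The same function could be applied to the subcontract and subDependencies relationships.

```python
def has_cycle(subreusers):
    """Return True if a reuser clause contains itself, directly or indirectly.

    `subreusers` maps the name of a composite reuser clause to the ordered
    list of names of its subclauses; basic clauses need not appear as keys.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {}

    def visit(node):
        colour[node] = GREY
        for child in subreusers.get(node, []):
            state = colour.get(child, WHITE)
            if state == GREY:  # back edge: the child is one of our ancestors
                return True
            if state == WHITE and visit(child):
                return True
        colour[node] = BLACK
        return False

    return any(
        colour.get(node, WHITE) == WHITE and visit(node)
        for node in list(subreusers)
    )


# PDFNavigation of Fig. 5 with three basic subclauses (names invented for the example).
well_formed = {"PDFNavigation": ["addGotoPage", "removeGetURLCall", "addGotoPageCall"]}
ill_formed = {"A": ["B"], "B": ["A"]}

print(has_cycle(well_formed))  # False
print(has_cycle(ill_formed))   # True
```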

4 Reuse Conflicts

4.1 Evolution and Composition Conflicts

Fig. 6 presents two seemingly orthogonal modifications of the WebNavigation provider clause. The first one is the PDFNavigation reuser clause of Fig. 5. The second reuser clause, HistoryNavigation, adds history functionality to the original web browser. Each time a hyperlink is followed through getURL, the URL of this link is stored somewhere (by invoking addURL), to be able to return to this location at a later time.

[Figure: two reuse contracts with the same «provider clause» WebNavigation. «reuser clause» HistoryNavigation, with {contract type = composite}, consists of a “participant extension” that adds addURL «added» to the Browser interface and an “interaction refinement” that adds the self send 1.1: addURL «added» to the linkResolving interaction. «reuser clause» PDFNavigation consists of a “participant extension”, an “interaction coarsening” and an “interaction refinement”.]

Fig. 6. Detecting a conflict between two modifications of WebNavigation.

Both modifications work fine separately, but when they are combined a conflict arises. Since link resolving is done by the PDF document itself, the addURL operation in Browser will never be invoked, so the history will not be updated when a link is followed in Document. This is called the inconsistent operations problem. It occurs when one modification adapts a certain operation (here getURL) assuming that other operations invoke it, while another modification removes one of these invocations (here the invocation of getURL by resolveLink). Inconsistent operations can only appear when operation invocations are removed. This can only be achieved by a reuse contract with contract type “interaction coarsening”. For the conflict to occur, the second reuser clause should change the operation from which the invocation is removed, so it can only be a reuse contract with contract type “interaction refinement” or “interaction coarsening”.

In general, problems can occur when two independent changes are made to one model, regardless of whether this is achieved through composition, during evolution or by different developers. We therefore try to detect conflicts between two reuse contracts with the same provider clause, as this models two modifications of the same component.

Conflicts can be detected in different ways. In most cases, conflicts can be detected by comparing the two contract types and reuser clauses. Sometimes, however, the provider clause needs to be consulted or the result of applying the modifications needs to be computed. This is the case for conflicts involving a transitive closure of operation invocations. [7] catalogues a number of conflicts that can occur and describes how reuse contracts aid in detecting them. For each conflict a formal rule can be set up to detect the conflict. As the conflicts depend on the contract type, tables can be set up in which both the rows and columns represent contract types and the fields specify which conflicts can occur for a certain combination of types. Such a table can be filled in by comparing all contract types two by two, and determining under which conditions they interact in an undesired way. Using these tables it becomes possible to detect a number of conflicts automatically. This approach is of course not restricted to basic reuser clauses: it also scales to composite reuse contracts and reuser clauses. This is crucial to the usefulness of our approach, as real modifications usually require composite reuser clauses.

One should note that we are not able to detect all possible conflicts between different modifications of the same provider clause. For some conflicts, more behavioural information is needed. This is one of the trade-offs we made when developing the reuse contract approach. It should detect as many conflicts as possible, yet remain intuitive enough to be directly usable by software developers.

4.2 Instantiation Conflicts

Since provider clauses represent reusable components (in our case template collaborations between class interfaces), they are not stand-alone, but interact with the environment in which they are used. For example, in Fig. 2 the WebNavigation and PDFNavigation provider clauses were instantiated to a collaboration between actual classes in a class diagram. This instantiation can lead to so-called instantiation conflicts. For example, if the mouseClick operation were removed by editing the BrowsableDoc class, this would result in a conflict with the WebNavigation instantiation, since WebNavigation requires the presence of mouseClick in its Document role. Even worse, due to the change in BrowsableDoc, similar conflicts might arise in all its subclasses because of the inheritance mechanism. In this case, a conflict will also arise in pdf-Doc, since pdf-Doc plays the role of Document in the PDFNavigation instantiation, and PDFNavigation also requires the presence of mouseClick.


Nevertheless, by using provider clauses to document reusable components, and by carefully specifying how these components are reused, it becomes possible to detect instantiation conflicts too.
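Both kinds of conflict discussed in this section can be approximated mechanically. The sketch below is a simplified reading of the inconsistent operations problem and of instantiation conflicts, not the formal rules of [7]. All names and data layouts are assumptions, reuser clauses are reduced to their interaction changes, and single inheritance is assumed for the class diagram.

```python
def inconsistent_operations(clause_a, clause_b):
    """Operations adapted by one clause whose invocation the other clause removes.

    Each clause is a list of (contract type, message) pairs, where a message is
    (sender role, activator, receiver role, invoked operation).
    """
    interaction_types = ("interaction refinement", "interaction coarsening")
    conflicts = set()
    for first, second in ((clause_a, clause_b), (clause_b, clause_a)):
        # An operation counts as adapted when the messages it sends are changed.
        adapted = {(m[0], m[1]) for kind, m in first if kind in interaction_types}
        for kind, m in second:
            if kind == "interaction coarsening" and (m[2], m[3]) in adapted:
                conflicts.add((m[2], m[3]))
    return conflicts


def instantiation_conflicts(required, bindings, class_ops, parent):
    """Operations a role requires but the bound class neither defines nor inherits."""
    conflicts = []
    for pattern, role, cls in bindings:
        available, current = set(), cls
        while current is not None:  # walk up the (single) inheritance chain
            available |= class_ops.get(current, set())
            current = parent.get(current)
        for operation in sorted(required[pattern][role] - available):
            conflicts.append((pattern, role, cls, operation))
    return conflicts


# Interaction changes of the two reuser clauses of Fig. 6.
HISTORY = [("interaction refinement", ("Browser", "getURL", "Browser", "addURL"))]
PDF = [
    ("interaction coarsening", ("Document", "resolveLink", "Browser", "getURL")),
    ("interaction refinement", ("Document", "resolveLink", "Document", "gotoPage")),
]
print(inconsistent_operations(HISTORY, PDF))  # {('Browser', 'getURL')}

# Fig. 2 after mouseClick has been edited out of BrowsableDoc.
REQUIRED = {
    "WebNavigation": {"Document": {"mouseClick", "resolveLink"}},
    "PDFNavigation": {"Document": {"mouseClick", "resolveLink", "gotoPage"}},
}
BINDINGS = [
    ("WebNavigation", "Document", "BrowsableDoc"),
    ("PDFNavigation", "Document", "pdf-Doc"),
]
CLASS_OPS = {"BrowsableDoc": {"resolveLink"}, "pdf-Doc": {"gotoPage"}}
PARENT = {"pdf-Doc": "BrowsableDoc", "html-Doc": "BrowsableDoc"}
print(instantiation_conflicts(REQUIRED, BINDINGS, CLASS_OPS, PARENT))
```

The first check reports getURL, the operation HistoryNavigation adapts while PDFNavigation removes its only invocation. The second reports the missing mouseClick for both BrowsableDoc and its subclass pdf-Doc.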

5 Conclusion and Future Work

This paper introduced a framework for reasoning about definition, modification and reuse of model components consisting of a collaboration specification and a set of associated interactions. To this end, the UML metamodel was extended with the mechanism of reuse contracts. UML packages allowed us to encapsulate the internal details of the components. The dependency relationship was used to document incremental modification of model components. A contract type was attached to this relationship to express different kinds of incremental modification. To allow reuse and evolution at different levels of abstraction (cf. composite provider and reuser clauses) the nesting facility of packages was exploited, together with the possibility for dependencies to have subdependencies. Finally, the template mechanism was used to allow generic components that can be reused in different places. This allowed us to detect not only conflicts upon evolution and composition, but also instantiation conflicts.

How easily our ideas can be integrated in a CASE tool that supports UML will depend on the extensibility of the tool, and on the availability of a metamodel that can be directly accessed and modified from within the tool. Also, a CASE tool can assist in making the approach more user-friendly. We have already done some very promising experiments with scripts for adding reuse contracts to Rational Rose.

One of the current shortcomings is that the notion of reuse and evolution needs to be defined manually for each kind of component, and for each kind of UML diagram. Another shortcoming is that it is not clear how different kinds of diagrams are related to one another. In this paper, we only gave a solution for components containing a collaboration and a set of associated interactions. The same approach could also be used for dealing with components consisting of use case diagrams, state diagrams, etc. In a forthcoming paper we will show how the UML metamodel can be extended to deal with evolution of all kinds of UML diagrams in a uniform way.

Another challenge is to manage dependencies between components in different phases of the software life-cycle. The transition from model components in one phase to model components in another phase could be made explicit by means of reuse contracts and contract types. This should lead to better traceability between phases, and should enable assessing the impact of changes to higher level components on the associated lower level components, and vice versa.

Finally, we are currently applying this approach with an industrial partner, to deploy a set of domain-specific models in different projects. This will provide crucial feedback to improve our framework and make it a true methodology for disciplined reuse and evolution.


Acknowledgements

We thank Theo D'Hondt for supporting our work, Stephane Ducasse for the fruitful discussions, and of course Kim Mens, Koen De Hondt and Roel Wuyts for the great collaboration. We also express our gratitude to the anonymous referees for reviewing our paper and providing some valuable feedback.

References

1. Codenie, W., De Hondt, K., Steyaert, P. and Vercammen, A.: From Custom Applications to Domain-Specific Frameworks. Communications of the ACM, Special Issue on Application Frameworks, 40(10). ACM Press (1997) 70-77.
2. Jacobson, I., Griss, M. and Johnson, P.: Software Reuse, Architecture, Process and Organization for Business Success. ACM Press (1997).
4. Kiczales, G., des Rivières, J. and Bobrow, D.G.: The Art of the Meta-object Protocol. MIT Press (1991).
5. Kiczales, G. and Lamping, J.: Issues in the design and documentation of class libraries. Proceedings of OOPSLA '92, ACM SIGPLAN Notices, 27(10). ACM Press (1992) 435-451.
6. Kiczales, G., Lamping, J., Mendhekar, A., Maeda, C., Lopes, C., Loingtier, J.-M. and Irwin, J.: Aspect-oriented Programming. Proceedings of ECOOP '97, LNCS 1241. Springer-Verlag (1997) 220-242.
7. Lucas, C.: Documenting Reuse and Evolution with Reuse Contracts. PhD Dissertation. Vrije Universiteit Brussel (1997).
8. Mens, T., Lucas, C. and Steyaert, P.: Giving Precise Semantics to Reuse in UML. Proceedings of ICSE '98 Workshop on Precise Semantics for Software Modeling Techniques, Technical Report TUM-I9803. Technische Universität München (1998) 73-89.
9. Gogolla, M. and Richters, M.: On constraints and queries in UML. The Unified Modeling Language - Technical Aspects and Applications. Physica-Verlag (1998).
10. Object Management Group: Unified Modeling Language 1.1 Document Set. OMG Documents ad/97-08-01 to ad/97-08-08 (1997).
11. Reenskaug, T., Wold, P. and Lehne, O. A.: Working With Objects. Manning Publications, Greenwich, CT (1996).
12. Steyaert, P., Lucas, C., Mens, K. and D'Hondt, T.: Reuse Contracts - Managing the Evolution of Reusable Assets. Proceedings of OOPSLA '96, SIGPLAN Notices, 31(10). ACM Press (1996) 268-286.
13. Wegner, P. and Zdonik, S. B.: Inheritance as an Incremental Modification Mechanism, or what like is and isn't like. Proceedings of ECOOP '88, LNCS 276. Springer-Verlag (1988) 55-77.
14. Wirfs-Brock, A.: Designing Reusable Designs - Experiences Designing Object-Oriented Frameworks. Addendum to the OOPSLA/ECOOP '90 Proceedings, SIGPLAN Notices Special Issue. ACM Press (1990) 19-24.
15. Hamie, A., Howse, J., Kent, S., Mitchell, R. and Civello, F.: Reflections on the OCL. Proceedings of UML'98 International Workshop. Springer-Verlag (1998).

Applying UML Extensions to Facilitate Software Reuse

N.G. Lester, F.G. Wilkie, and D.W. Bustard

School of Information and Software Engineering, University of Ulster, Newtownabbey, Co Antrim BT37 0QB
Tel: +44 - (0)1232 - 368197, Fax: +44 - (0)1232 - 366068
E-mail: [email protected]

Abstract. The benefits of reuse during the analysis and design stages of software development are well understood. This paper examines the contribution which UML can make to such reuse through its ability to document reusable structures. In particular, the application of the UML concepts of stereotypes, tagged values, class compartments and association roles in the definition of search criteria for reuse candidates is explored. An iterative development process, within which UML can be used, is presented and discussed. An initial implementation in a CBR environment, and results from this experimental prototype, are also presented.

J. Bézivin and P.-A. Muller (Eds.): «UML» ’98, LNCS 1618, pp. 393–405, 1999. © Springer-Verlag Berlin Heidelberg 1999

1 Introduction

Reuse is one way to achieve significant cost reduction in software development. It can be particularly beneficial during the analysis and design phases of the life cycle [4], [11]. Poulin [7] believes that the success of reuse technology depends critically on the integration of reuse repositories with design and programming development environments. UML supports such integration during these development phases through inclusion of software components and design patterns in its models. It also has the potential to provide much closer integration of reuse in analysis and design through prescriptive use of the UML extension mechanism in the generation of search criteria for sought reuse candidates.

Classification and specification are two sides of the same coin. The structure of the repository and the classification techniques used to categorize reuse artifacts determine the way in which reuse requirements (search criteria) must be specified. This paper is concerned with the construction of appropriate search criteria. The techniques discussed in the sections that follow form part of a systematic reuse approach taken in the Esprit REBOOT project [4], [11] applied to an enhanced V life cycle model [13]. A Case Based Reasoning (CBR) tool is used as an implementation testbed for evaluating the proposals presented.

2 An Iterative Reuse Process

REBOOT [4] identifies the individual steps required to facilitate reuse in each phase of the development life cycle. Our work integrates the REBOOT approach with an enhanced V life cycle model to provide an iterative and structured approach to reuse. In the basic V life cycle model the development process moves through a series of phases, each generating a phase product. In the reuse enhanced model, each development phase has a corresponding reuse co-activity which can influence other activities and products. Each reuse activity feeds ideas into the phase concerned, through materials found as a result of searches performed, and leads to iterative refinement. So, for example, during requirements specification, a search could be made for similar existing software systems or even design material, which in turn might influence the analyst or client’s concept of what the system could or should be (figure 1). Just as an information system may perturb the business domain within which it is intended to operate, existing reuse material can affect and shape the view of how a system should operate. The iteration at each phase continues until both the developer and client are satisfied with the phase product.

[Figure 1: diagram labels include Requirements Specification, Specification, Architectural Design, Design and Reuse Analysis.]

Fig. 1. Segment of Iterative Reuse Framework

3 Applying UML to Reuse

The following sections discuss how UML modeling elements can support the documentation of reusable material. Extensions to these elements to increase UML’s support for reuse are then detailed. These modeling elements include stereotypes, tagged values, association roles, and class compartments.

1. The stereotype modeling element is one of three mechanisms in UML for extending the semantics of existing UML modeling elements. If a required concept does not exist, stereotypes can be used to add semantics to an existing model element that is similar to the concept concerned. A stereotype is represented by placing guillemets around the stereotype name, e.g. <<Stereotype_Name>> [2].
2. The tagged value extension mechanism provides a means of attaching additional information or properties to the model elements. These properties can be interpreted by a human reader or provide information for other tools. There are a number of predefined tagged values in UML, but others can be introduced as necessary. A tagged value is represented by a name-value pair inside braces, e.g. {tagName = tagValue} [2].
3. The association role name indicates the role a class plays in an association. A class participating in several associations may have several different roles. These roles help specify the context within which the class is used.
4. A class can have three standard class compartments, with others defined as required. These provide additional information that can help in the understanding of artifacts but not in the search process.

3.1 Building Search Criteria for Reusable Artifacts from UML Models

The main purpose of this work is to build search criteria for reusable artifacts from UML models, in particular UML class models. Information from outline class diagrams is used to search for, and locate, reusable artifacts which can help improve a design or lead on to the discovery of related artifacts associated with later stages of the development life cycle. The specification of search criteria centers on classifiers that are applied to elements within the class model. Without classifiers, the search for reusable artifacts relies on the names assigned to elements. This can lead to missed reuse opportunities because artifacts in a repository must have the same name as the elements used to form the search criteria. When a classifier is assigned to an element, the search covers artifacts with the same or a similar classifier, which allows for synonyms.

One possible way to use the classifier and element name to search for reusable artifacts is through the attribute-value classification technique [6]. With this approach, artifacts are described by a set of attributes and their values, where the classifier is the attribute, and the value is the name of the element.
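The difference between a name-only search and a classifier-based search can be illustrated with a minimal Python sketch. The repository contents, the `Artifact` type and both search functions are hypothetical and are not taken from the paper; they only show why a search on the element name alone misses synonyms, while a search on the classifier does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    classifier: str  # the attribute, e.g. "Staff"
    name: str        # the value, i.e. the element name


# Hypothetical repository contents, for illustration only.
REPOSITORY = [
    Artifact("Staff", "LecturerRecord"),
    Artifact("Staff", "ProfessorInfo"),
    Artifact("Resource", "RoomDetails"),
]


def search_by_name(repo, name):
    """Name-only search: succeeds only on an exact (case-insensitive) name match."""
    return [a for a in repo if a.name.lower() == name.lower()]


def search_by_classifier(repo, classifier):
    """Classifier search: returns every artifact sharing the classifier."""
    return [a for a in repo if a.classifier.lower() == classifier.lower()]


# The developer calls the class 'TeacherInfo', a synonym of the stored names.
name_hits = search_by_name(REPOSITORY, "TeacherInfo")
classifier_hits = search_by_classifier(REPOSITORY, "Staff")
```

Here the name-only search finds nothing, whereas the classifier search returns both staff-related artifacts as reuse candidates.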
The attribute-value approach has some problems, however, which are outlined below, along with possible solutions.

• Control of attribute values. The major problem with this form of classification is the control of the vocabulary used as values for the attributes. Without some form of controlled vocabulary the developer searching for an artifact may use a synonym of the value from the original classification, possibly resulting in a missed reuse opportunity. It is impossible to restrict the vocabulary available to the developer when assigning names to elements. One way to reduce this side effect of attribute-value classification, however, is to allow one or more terms to be selected with the classifier. A term in this context is a classification word more specific than, and appropriate to, the classifier. This means that each classifier will have a distinct list of related terms. This concept is similar to the faceted classification technique developed by Prieto-Diaz for categorizing reusable software components [8], [9]. The faceted classification approach builds up or synthesizes the categories from the domain of interest by selecting pre-defined keywords from a list of terms for the facet. Each element is therefore described by values for a triple of attributes consisting of the classifier, term(s) and object name. The controlled vocabulary is introduced at a meta-level to describe the element name and thereby qualify the search.
• Variation in classifiers and terms across domains. Moving from one domain to another requires a new set of classifiers and terms to be used. The differences between the domains will largely be reflected in the terms used rather than the classifiers. For example, the classifier ‘staff’ is applicable in both the education and health care domain, but the associated terms will differ. Within the education domain, the terms are likely to include lecturer, researcher, technician, and student, while in the health care domain the terms might be doctor, nurse, cleaner, porter, manager, surgeon, and so on.

There are different approaches to representing these classifiers within UML. Two are outlined below. The first uses the UML stereotype element and the second tagged values.

3.2 Stereotypes

The stereotype modeling element has great potential for helping with the construction of search criteria for reusable artifacts during analysis and design. There are predefined stereotypes in UML, but the developer can extend UML by defining other stereotypes. The developer is therefore able to use the concept of a stereotype to tailor UML for individual reuse needs.

Stereotypes Applied to Classes

Figure 2 provides an example of a stereotype in use. It describes a situation involving students enrolling on university courses and includes information on the staffing and accommodation resources required for course delivery. This example demonstrates how the use of a stereotype to qualify an object can make the search for material related to that object more accurate. The class ‘ProfessorInfo’ has been assigned the user defined stereotype <<Staff>>. Without the stereotype the search would be directed by the key word ‘ProfessorInfo’. However, whilst the class name ‘ProfessorInfo’ provides an indication of the precise staff type required, many synonyms may apply, resulting in difficulties when searching on the class name. When the stereotype is assigned to the object the search can be restricted to objects having ‘staff’ as part of their classification.

Stereotypes Applied to Association Roles

The roles which objects play in associations provide important contextual information which can be useful in identifying potentially reusable material. Context is very important in identifying relevant reuse candidates. The roles which an object plays help clarify its context.

[Figure 2: class diagram. Classes, each shown with its stereotype and classification term: <<Resource>> RoomDetails 'Room'; <<Process>> Booking 'Scheduling'; <<Staff>> ProfessorInfo 'Lecturer'; <<Commodity>> Module 'Module'; <<Client>> StudentInfo 'Student'; <<Commodity>> CourseInfo 'Course'; Faculty. Association roles of ProfessorInfo: <<Staff>> Employee, <<Academic>> Teacher, <<Administration>> Manager. The associations carry 1..* multiplicities.]

The classification terms are displayed in the compartments below the class names. The stereotypes applied to association roles are displayed above the role names.

Fig. 2. Class Diagram Representing Student Registration and Resource Allocation

Attribute-value classification can be used to provide a structured way to integrate association roles into the search criteria of an object. Appropriate stereotypes can be defined to classify each association role. Thus the role is described by the combination of attribute values for the stereotype along with its role name. In the example provided in figure 2 the role of the ‘ProfessorInfo’ class in its relationship with the ‘CourseInfo’ class is specified as: stereotype <<Administration>>, with association role name ‘Manager’. Figure 2 shows two other roles for the ProfessorInfo object, namely ‘Teacher’, in the association with a Module, and ‘Employee’ in the association with a Faculty object. Stereotypes would also be assigned to these roles in creating the overall description of the ProfessorInfo object. Applying terms to the stereotype of a role provides little additional information for any subsequent search and has therefore not been adopted. In figure 2, stereotypes attached to association roles are displayed directly above the role name.

By combining the faceted description of the object with that of its role(s) a more context specific description of the overall object is produced. This object description is therefore composed of the values for the three attributes, i.e. stereotype, terms, and object name for the object, plus stereotype and role name for each of its association roles. Table 1 provides an example specification for the ProfessorInfo object.

OBJECT                                   ROLE
Name            Stereotype   Terms       Name       Stereotype
ProfessorInfo   Staff        Lecturer    Manager    Administration
                                         Teacher    Academic
                                         Employee   Staff

Table 1. Faceted and Attribute Classification for ProfessorInfo Object
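As a rough illustration, the description in Table 1 could be held in a small data structure and flattened into attribute-value search criteria. The sketch below is in Python; the class names, the `VOCABULARY` layout and the `search_criteria` output format are assumptions made for this example and do not come from the paper. Only the values (ProfessorInfo, Staff, Lecturer, the three roles, and the education-domain terms for 'staff') are taken from the text.

```python
from dataclasses import dataclass

# Hypothetical controlled vocabulary for the education domain: each classifier
# has its own list of terms. Only the 'Staff' terms are named in the text.
VOCABULARY = {
    "Staff": {"Lecturer", "Researcher", "Technician", "Student"},
}


@dataclass(frozen=True)
class RoleDescription:
    name: str        # association role name, e.g. "Manager"
    stereotype: str  # classifier of the role, e.g. "Administration"


@dataclass(frozen=True)
class ObjectDescription:
    name: str
    stereotype: str
    terms: tuple
    roles: tuple = ()

    def validate(self, vocabulary):
        """Reject terms that are not in the classifier's controlled term list."""
        allowed = vocabulary.get(self.stereotype)
        if allowed is None:
            raise ValueError(f"unknown classifier: {self.stereotype}")
        unknown = [t for t in self.terms if t not in allowed]
        if unknown:
            raise ValueError(f"terms not allowed for {self.stereotype}: {unknown}")

    def search_criteria(self):
        """Flatten the description into (attribute, value) pairs."""
        criteria = [("stereotype", self.stereotype)]
        criteria += [("term", t) for t in self.terms]
        criteria.append(("name", self.name))
        criteria += [("role", f"{r.stereotype}:{r.name}") for r in self.roles]
        return criteria


# The ProfessorInfo row of Table 1.
professor = ObjectDescription(
    name="ProfessorInfo",
    stereotype="Staff",
    terms=("Lecturer",),
    roles=(
        RoleDescription("Manager", "Administration"),
        RoleDescription("Teacher", "Academic"),
        RoleDescription("Employee", "Staff"),
    ),
)
professor.validate(VOCABULARY)
criteria = professor.search_criteria()
```

Validating terms against a per-classifier list is one simple way to enforce the controlled vocabulary at the meta-level while leaving the element name itself unrestricted.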

3.3 Tagged Values

Tagged values are another extension mechanism in UML. They allow the explicit definition of a property for a UML model element. A tagged value is a name-value pair, written {tagName = tagValue}, where the name is the tag. As with stereotypes, there are predefined tagged values within UML, with the option to define others. Tagged values can be used to introduce classifiers in a similar way to stereotypes. The name or tag indicates the use of the classification facet and the value is the chosen classifier. For example, with the ‘ProfessorInfo’ class from figure 2, the tagged value {facet = "Lecturer"} can be attached. The tagged values are used in the same way as stereotypes and can be applied to association roles. Eriksson [2] outlines six steps to defining a tagged value.

1. ‘Investigate the purpose of the tagged value.’ The purpose of the ‘Facet’ tagged value is to provide a general description of the element to which it is attached. The general description of the element is used to build search criteria for reusable artifacts.
2. ‘Define the elements it may be attached to.’ Currently this work concentrates on the static class model in UML. Therefore, the ‘Facet’ tagged value can be attached to classes, objects, packages or complete class models.
3. ‘Name the tagged value appropriately.’ The tagged value is called ‘Facet’.
4. ‘Define the type of the value.’ The values for the ‘Facet’ tagged values will be predefined strings representing important concepts within the domain of interest.
5. ‘Know who or what will interpret it…’ A tool will use the descriptions provided by the tagged values to build search criteria for reusable artifacts. There is, however, also a benefit to human readers: a model which includes the ‘Facet’ tagged values will be more readable.
6. ‘Document one or more examples on how to use the tagged value.’ Figure 3 gives an example of how this might be presented.

[Figure 3: class RoomDetails, tagged {facet = "Resource", term1 = "Room"}, associated (multiplicity 1..*) with class Booking, tagged {facet = "Process", term1 = "Scheduling"}.]

Fig. 3. Example of tagged value used as classifier
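A tool interpreting such tagged values first has to read the name-value pairs. The following Python sketch shows one possible way to do so for the textual form used in figure 3; the function names are hypothetical, and the parser assumes quoted values that contain no commas, which is a simplification of this example rather than a rule of UML.

```python
import re

# One name = "value" pair, with optional surrounding whitespace.
_PAIR = re.compile(r'\s*(\w+)\s*=\s*"([^"]*)"\s*')


def parse_tagged_values(text):
    """Parse '{facet = "Resource", term1 = "Room"}' into a dict of tag -> value."""
    text = text.strip()
    if not (text.startswith("{") and text.endswith("}")):
        raise ValueError("tagged values must be enclosed in braces")
    tags = {}
    # Simplification: values are assumed not to contain commas.
    for part in text[1:-1].split(","):
        match = _PAIR.fullmatch(part)
        if match is None:
            raise ValueError(f"malformed tagged value: {part!r}")
        tags[match.group(1)] = match.group(2)
    return tags


def classifier_and_terms(tags):
    """Split parsed tags into the classifier ('facet') and its terms ('term1', ...)."""
    classifier = tags["facet"]
    terms = [value for key, value in sorted(tags.items()) if key.startswith("term")]
    return classifier, terms


room = parse_tagged_values('{facet = "Resource", term1 = "Room"}')
booking = parse_tagged_values('{facet = "Process", term1 = "Scheduling"}')
```

Calling `classifier_and_terms(room)` then yields the classifier and term list that would feed the search criteria for RoomDetails.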

3.4 Stereotypes Versus Tagged Values

The main concern with using stereotypes as classifiers is the number required. This can increase rapidly because only one stereotype can be assigned to any UML element. Thus, for example, if an element requires two classifiers to describe it accurately, a new stereotype must be derived from the stereotypes for each classifier. The tagged value approach to classifiers does not suffer from this problem because more than one tagged value can be assigned to an element. Therefore, if another classifier is required, another tagged value can be attached. The ability to attach more than one tagged value to an element also means that ‘terms’ can be represented using the same approach as classifiers. Using tagged values to represent both classifiers and terms simplifies the overall scheme.

Stereotypes do have the advantage of having a formal definition. This would, for example, allow the stereotype to be defined by its ‘terms’. Also, stereotypes are generalisable elements and can have a classification hierarchy. The hierarchy of stereotypes formalizes the derivation of new stereotypes and so new classifiers can be created in a controlled way. Although there are advantages and disadvantages to each approach, and both stereotypes and tagged values can be used to introduce classifiers, overall, tagged values seem to be the most appropriate for this purpose.

3.5 Class Compartments and Refinement Relationships

There are three standard compartments in a UML class: (i) class name, (ii) class attributes and (iii) class operations. It is possible to have additional non-standard class compartments. Class compartments cannot form part of the initial search for appropriate reusable artifacts, but they can provide additional information which will help with understanding artifacts and thus with selecting the best candidate. Examples include compartments devoted to business rules and responsibilities.
The ability to trace artifacts through the software development lifecycle is important to the reuse process. The UML refinement relationship is a relationship between two descriptions of the same thing at different levels of detail. It provides a means of tracing the different guises of a given artifact as it appears through various life cycle phases. For example, a collaboration at the design phase is a refinement of a use case at the analysis phase.

3.6 Keyword Specification

Poulin [6] states that “facets alone cannot adequately provide all the information needed to fully classify and understand a reusable component.” A combination of classification techniques is required to help software developers locate, assess and integrate reusable components into their products. Keyword specification complements the faceted classification technique outlined in the preceding sections. It involves selecting keywords which best describe the artifacts required. These selected keywords are compared to information held about each artifact in the repository. A best match of keywords locates the most suitable artifacts.

The keywords can be extracted from any model elements but some are more fruitful than others. ‘Notes’ attached to modeling elements and models provide a description of the item and an insight into its context. These keywords may not necessarily appear elsewhere on the model. The names of other elements, such as types, packages and swimlanes from activity diagrams, can also provide suitable keywords. Furthermore, a developer may wish to add keywords which are significant but which do not appear on the UML model. This is facilitated by the use of a tagged value ‘keyword’. There are three ways of extracting and using these keywords:

1. The simplest is to allow the developer to identify keywords from the model. The keywords entered then become part of the search for reusable material.
2. The second technique not only involves the developer selecting keywords from the model, but also evaluating their relevance. Each keyword selected is given a value of 1, 2 or 3 depending on its relevance, with 1 indicating high relevance and 3 indicating low relevance. The results of the search can then be ranked according to the values of the associated keywords.
3. The third technique involves automatic extraction of the keywords. Techniques have been proposed which identify suitable classification keywords from text [5]. Similar techniques can be used to identify keywords from ‘notes’ and other specified model elements, specifically those elements discussed previously. This approach releases the developer from the arduous task of identifying the keywords.
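The second technique, ranking by keyword relevance, might be sketched as follows in Python. The scoring rule (a matched keyword contributes 4 minus its relevance value, so that relevance 1 weighs most) and the sample repository are assumptions of this example; the paper does not prescribe a particular ranking formula.

```python
def rank_by_keywords(weighted_keywords, repository):
    """Rank artifacts by the relevance of the query keywords they match.

    weighted_keywords maps keyword -> relevance (1 = high, 3 = low).
    repository maps artifact name -> set of keywords held for that artifact.
    A matched keyword contributes 4 - relevance (an assumed weighting).
    """
    for keyword, relevance in weighted_keywords.items():
        if relevance not in (1, 2, 3):
            raise ValueError(f"relevance of {keyword!r} must be 1, 2 or 3")

    scores = {}
    for artifact, artifact_keywords in repository.items():
        lowered = {k.lower() for k in artifact_keywords}
        score = sum(
            4 - relevance
            for keyword, relevance in weighted_keywords.items()
            if keyword.lower() in lowered
        )
        if score > 0:
            scores[artifact] = score
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda artifact: (-scores[artifact], artifact))


# Hypothetical query and repository, for illustration only.
query = {"timetable": 1, "room": 2, "invoice": 3}
repository = {
    "RoomBooking": {"room", "timetable", "scheduling"},
    "Billing": {"invoice", "payment"},
    "StaffDirectory": {"lecturer"},
}
ranking = rank_by_keywords(query, repository)
```

Artifacts that match no keyword are dropped from the ranking, so the developer only sees candidates with at least one point of contact with the query.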

4 Implementation and Results

Initial experiments were performed to investigate the feasibility of classifiers and terms forming the search criteria for reusable artifacts. The experiments were designed to investigate several aspects of the proposed concepts: the classifier concept, the term concept and the concept of association role descriptions.


A Case Based Reasoning (CBR) environment was used to carry out the experiments. CBR was chosen because of its support for similarity reasoning. CBR allows reasoning to be performed in the manner of an expert, that is, by using past experience to solve a new problem. A commercially available CBR system, ReCall, was used to implement and test a case base of UML class model descriptions of varying sizes and degrees of complexity. Artifact descriptions (or cases) were entered into the case base and various searches or targets were submitted to the system. ReCall is a structured CBR system, which means that cases and targets can have a structured rather than a single vector representation. Any submitted class model or ‘case’ can therefore be made up of an unspecified number of UML packages, which in turn can be made up of an unspecified number of classes. In producing search results, ReCall was configured to order the cases in the case base according to how similar it judged them to be to the target. This ordering was important as it allowed us to check our own ideas of case similarity against the similarity reasoning of the package.

Preliminary investigations were conducted with ten cases in the repository (the case base). Some initial tests were carried out to ensure that the CBR package performed in a sensible manner. For example, given a target which completely matched one of the stored cases, the CBR package correctly ordered the search results. The second stage introduced cases containing multiple class descriptions. This enabled us to evaluate some granularity concerns. For example, figure 4 contains a simple scheme for describing a ‘case’ in terms of the classifier and term for a class and likewise for an association role.

[Figure 4: a case element is either a Class, described by a Classifier and a Term, or an Association Role, described by a Classifier and a Term.]

Fig. 4. A generalized scheme for describing the elements of a case.
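The scheme of figure 4, together with the idea of ordering cases by similarity to a target, can be approximated in a few lines of Python. This is only a stand-in: the similarity measure used by ReCall is not described here, so the scoring below (1.0 for matching classifier and term, 0.5 for matching classifier only, averaged over the target elements) is an assumption, as are the case names and contents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """A class or association role, described by classifier and term (figure 4)."""
    classifier: str
    term: str


def element_similarity(a, b):
    # Assumed scoring: full match 1.0, same classifier but different term 0.5.
    if a.classifier != b.classifier:
        return 0.0
    return 1.0 if a.term == b.term else 0.5


def case_similarity(target, case):
    """Average, over the target elements, of the best match found in the case.

    Extra elements in the case are not penalised (an assumption of this sketch).
    """
    if not target:
        return 0.0
    total = sum(
        max((element_similarity(t, c) for c in case), default=0.0) for t in target
    )
    return total / len(target)


def rank_cases(target, case_base):
    """Order case names from most to least similar; ties broken by name."""
    return sorted(
        case_base, key=lambda name: (-case_similarity(target, case_base[name]), name)
    )


# Hypothetical case base; the contents are illustrative only.
case_base = {
    "A": [Element("Staff", "Lecturer"), Element("Commodity", "Module")],
    "B": [Element("Commodity", "Module"), Element("Commodity", "Course"),
          Element("Client", "Student")],
    "C": [Element("Client", "Student"), Element("Staff", "Lecturer"),
          Element("Resource", "Room"), Element("Process", "Scheduling")],
}
target = [Element("Commodity", "Module"), Element("Client", "Student"),
          Element("Resource", "Room")]
ranking = rank_cases(target, case_base)
```

With these illustrative cases, B and C each match two of the three target elements and A matches one, so A is ranked last.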

Consider the small case base shown in figure 5, which contains three cases. For illustrative purposes, only classes are described in each case, but in practice many cases would also contain association roles; the principle is the same. In the actual case base the cases had varying numbers of classes and relationships (via the association roles). Several targets were defined to test the reasoning powers of the CBR tool. These targets involved multiple classes. An example is given in figure 6. In this example, the CBR tool ordered the cases ‘2’ then ‘3’ and finally ‘1’, reflecting the degree of similarity between the required model and what was available. The order of these results is consistent with what a human, visually comparing the cases, would expect.¹ When association roles were added to the models and target, the precision of the ordered results was improved. As well as returning exact matches, cases which had matching elements but different roles, and elements which had different descriptions but matching roles, were shown to be similar. This demonstrates that a CBR approach to solving this kind of problem is appropriate and useful.

[Figure 5: three cases (Case 1, Case 2, Case 3), each listing classes with their classifier and term. The class descriptions that appear are Module (Commodity, Module), CourseInfo (Commodity, Course), StudentInfo (Client, Student), ProfessorInfo (Staff, Lecturer), RoomDetails (Resource, Room) and Booking (Process, Scheduling).]

Fig. 5. A small illustrative case base.

[Figure 6: a target with the classes Module (Commodity, Module), StudentInfo (Client, Student) and RoomDetails (Resource, Room).]

Fig. 6. An example CBR target.

¹ The CBR tool was configured so that cases with large amounts of irrelevant material were not disadvantaged in the matching process.

5 Related Work

Case Based Reasoning (CBR) has been identified by other workers, such as those involved with the ROSA project [14], as a possible vehicle for retrieving suitable analysis-level specification models. Their work differs from ours in that they are especially interested in the contribution of analogical reasoning to the identification of reuse candidates. A later report [1] indicates that they used the OOram [10] method and paid attention to the semantic knowledge present in OOram ‘roles’ through the role names, associations (called the knows_about relation in OOram) and the messaging interactions between roles. UML class models do not include messaging interactions, beyond the identification of class relationships which imply some form of messaging. Such messaging is depicted in various forms of UML dynamic models including Sequence and Collaboration diagrams. Our current research is restricted to the contributions to reuse from the UML class relationship model only. However, it is acknowledged that there may be useful semantic information in dynamic models that could contribute to the successful identification of reuse candidates.

Fernández-Chamizo et al. [3] have also employed CBR techniques in the retrieval of program code. In their scheme, each case is represented by three features: (i) its descriptions, (ii) its associated solution and (iii) the justification. The description has two parts: the lexical description obtained from documentation using “automatic freetext indexing techniques” and the conceptual descriptions of the component functionality using a conceptual framework based on an ontology of programming knowledge. The associated solution represents the component’s source code and the justification includes things like dependencies or restrictions. As this scheme deals with low level design information such as the operations of a class, it is only suitable for reuse of source code artifacts. Although we are concerned with reuse at higher levels of abstraction, more recent work (discussed further in the conclusions) draws upon the use of ontologies of domain analysis and high-level design knowledge. In this respect the work is taking a similar approach to that of Fernández-Chamizo et al.

6 Conclusions

This work is part of an ongoing research project. The paper has identified specific UML elements that can be usefully employed in the construction of search criteria for reusable artifacts. In so doing, a methodical approach to specifying the criteria for required reuse artifacts is created. The techniques discussed are part of an iterative reuse framework that supports reuse at each stage of the development life cycle. The paper demonstrates how faceted classification can be combined with either the UML stereotype or tagged value to overcome the problem of synonyms which limits the efficiency of searching reuse repositories using keywords alone. In particular, both UML stereotypes and tagged values have been applied to classes and association roles. The association roles enable the importance of an object’s context, in identifying suitable reuse candidates, to be accommodated. UML stereotypes could also be applied to packages and components as well as to classes, reflecting the need to locate reusable artifacts on varying levels of granularity.


The initial experiments with CBR technology are quite encouraging. Our results demonstrate the feasibility of using a CBR approach to implement the scheme proposed within this paper. The problem of synonyms is usually addressed via some form of thesaurus. Thesauri form part of many reuse classification and retrieval systems [8]. In this paper a limited vocabulary approach is outlined which does not rely on a thesaurus. The limited vocabulary is most suitable for reuse within a given domain or when users of the system are familiar with the vocabulary. In order to allow reuse across several domains, and to extend the limited vocabulary, a Terminological Ontology Framework is currently under construction. This framework is based around the limited vocabulary but employs three different views or dimensions, one of which is the ‘synonym’ dimension. This dimension relates the vocabulary to possible synonyms. The second dimension supports definitions of each classifier and the third dimension provides a hierarchical perspective of the vocabulary. These three dimensions are used in combination to perform similarity reasoning. An implementation of this framework using a variation on CBR is currently under way.

Acknowledgments

The first named author gratefully acknowledges support for this work through a studentship funded by the Department of Education for Northern Ireland.

References

1. Solveig Bjørnestad: Relations to Support Reuse of Specifications, Norwegian Informatics Conference 1997, NIK'97, pp. 31-42, 1997.
2. Hans-Erik Eriksson, Magnus Penker: UML Toolkit, Wiley Computer Publishing, 1998, ISBN 0-471-19161-2
3. Carmen Fernández-Chamizo, Pedro Antonio González-Calero, Mercedes Gómez-Albarrán and Luis Hernández-Yáñez: Supporting Object Reuse Through Case-Based Reasoning, EWCBR-96, Switzerland, November 1996, Proceedings, in Lecture Notes in Artificial Intelligence 1168, editors Ian Smith and Boi Faltings
4. Even-André Karlsson: Software Reuse - A Holistic Approach, Wiley & Sons, 1995, ISBN 0-471-95489-6
5. Jian-Yun Nie, Francois Paradis, Jean Vaucher: Using Information Retrieval for Software Reuse, in Proc. Fifth International Conference on Computing and Information, pp. 448-452, 1993
6. Jeffrey S. Poulin, Kathryn P. Yglesias: Experiences with a Faceted Classification Scheme in a Large Reusable Software Library (RSL), Seventeenth Annual International Computer Software and Applications Conference, Phoenix, AZ, 3-5 November 1993, pp. 90-99
7. Jeffrey S. Poulin: Integrated Support for Software Reuse in Computer-Aided Software Engineering (CASE), ACM Software Engineering Notes, 18(4), Oct 1993, pp. 75-82
8. Ruben Prieto-Diaz, Peter Freeman: Classification of Reusable Modules, IEEE Software, Jan 1987, pp. 6-16
9. Ruben Prieto-Diaz: Implementing Faceted Classification for Software Reuse, Communications of the ACM, 34(5), pp. 88-97, May 1991
10. Trygve Reenskaug: Working With Objects, The OOram Software Engineering Method, Manning, 1996, ISBN 1-884777-10-4
11. Danielle Ribot: Development Life-cycle WITH Reuse, in Proc. 9th ACM Symposium on Applied Computing, Software Reusability Track, 1994
12. Keith Robinson, Graham Berrisford: Object-Oriented SSADM, Prentice Hall, 1994, ISBN 0-13-309444-8
13. Paul Rook: Controlling Software Projects, Software Engineering Journal, January 1986, pp. 7-16.
14. Bjørnar Tessem, Solveig Bjørnestad, Knut Martin Tornes, Gorm Steine-Eriksen: ROSA = Reuse of Object-oriented Specifications through Analogy: A Project Framework, IFI Report 16, ISSN 0803-6489, 1994, http://www.ifi.uib.no/projects/rosa/publikasjoner.html

A Formal Approach to Use Cases and Their Relationships

Gunnar Overgaard and Karin Palmkvist

Royal Institute of Technology, Electrum 204, SE-164 40 Stockholm, Sweden
[email protected]
Rational Software Scandinavia AB, Box 1128, SE-164 22 Stockholm, Sweden
karin@rational.com

Abstract. The Use Case construct is one of the most important constructs for modelling the dynamics of a system. In this paper we describe the Use Case construct of the Unified Modeling Language (UML) together with two important kinds of relationships between Use Cases, namely Uses and Extends. These two relationships are specializations of the Generalization relationship in UML. We give a detailed presentation of the semantics of these constructs using the UML constructs Operation and Method. The presentation is used as a basis for formalization of the semantics. The formal specification uses an object-oriented specification technique specially designed for the formalization of modelling languages.

J. Bézivin and P.-A. Muller (Eds.): «UML» ’98, LNCS 1618, pp. 406–418, 1999. © Springer-Verlag Berlin Heidelberg 1999

1 Introduction

In the Unified Modeling Language (UML) [5], one of the key tools for behaviour modelling is the Use Case construct, originating from OOSE [2]. A Use Case specifies one way of using a system without revealing the internal structure of the system. The construct can for example be used for specification of the functional requirements of a system, for deriving Classes, and for tracing requirements between different parts of the system. Two useful constructs in Use-Case modelling are the Uses and the Extends relationships, both included in UML as stereotypes of the Generalization relationship.

These constructs are, together with the rest of the constructs of UML, defined in the UML semantics document [6]. This description is “semi-formal”, i.e. parts of it are specified with well-defined languages, while other parts have been described informally in English. The abstract syntax of the different language constructs in UML is specified with the graphical notation of Class diagrams in UML itself, while the well-formedness rules of UML are given in OCL, an object-oriented constraint language (see [5] for more details). Ordinary English is chosen for describing the semantics of UML. This makes the structure of the language rigorous whereas the semantics of the language is still quite informal.

It is commonly accepted that a language needs a formal specification to be unambiguous. Furthermore, the semantics of the language must be precise if tools are to perform intelligent operations on models expressed in the language, like consistency checks and transformations from one model to another. During the development of UML 1.1, the UML 1.1 semantics task force, in which the authors participated, had to find a balance between formalism and readability to avoid making the language definition overly complex for the current needs. In the future a more formal specification of UML will probably be needed, however [6,10].

In this paper we focus on the semantics of the Use Case, the Uses, and the Extends constructs, and present some of the results we have achieved in a project formalizing UML. We have chosen to use an object-oriented, operational semantics. Since the intended user will be familiar with the object-oriented paradigm, the use of an object-oriented specification method instead of a traditional one will cause no conceptual shift between the specification language and UML.

This paper is organized as follows. The next section includes a brief presentation of the specification technique we have used. In section 3 the Use Case construct is described, while sections 4 and 5 present the Uses and the Extends relationships. The paper ends with some concluding remarks.

2 Specification Technique

Before discussing the semantics of the Use Case construct in UML, we introduce the technique used for the specification of the semantics of these constructs. We give an operational semantics using an object-oriented specification language named ODAL, which has been formalized using the π-calculus [3]. ODAL is a simple, strongly typed language with a familiar syntax. It is used as the specification language in a framework for formal specification of modelling languages. In this paper we use a simplified version in which e.g. most type information has been omitted; we have also omitted the description of the underlying mechanism for invocation of the operations.

The syntax of some of the ODAL constructs:

  ClassDef   ::= CLASS ClassName [ SUPERCLASS ClassName ]
                 [ VARIABLES VarDef* ] [ METHODS MethDef* ]
  VarDef     ::= VarName : Type
  Type       ::= ClassName | BOOLEAN | Type*
  MethDef    ::= MethodName ( [ VarDef* ] ) [ | VarDef* | ] Expr
  Expr       ::= ExprSeq | AssignExpr | CondExpr | MsgExpr | IterExpr | FindExpr | ...
  ExprSeq    ::= Expr ; Expr
  AssignExpr ::= VarName := Expr
  CondExpr   ::= IF Expr THEN Expr [ ELSE Expr ]
  MsgExpr    ::= Expr MethodName ( [ Expr* ] )
  IterExpr   ::= FOREACH VarName IN Exprs [ , VarName IN Exprs ]* DO Expr
  FindExpr   ::= FIND VarName IN Exprs SUCHTHAT Expr

The meaning of most of these expressions is quite obvious, and we will therefore explain only the last two of them. An iteration expression involves two steps. First the Exprs expression is evaluated once, which results in a collection of values. Then, the Expr expression is evaluated for each of the resulting values. The current value is available during the evaluation of the Expr expression under the name of VarName. The value of the complete expression is the same as the value of the last evaluation of the sub-expression. It is possible to iterate over several collections simultaneously. The meaning of a find expression is similar, but its result is the subset of the values for which the SUCHTHAT expression evaluates to true.

A fundamental principle in our specification is that the semantics of a construct should be localized and not implicitly included in the specifications of other constructs. The reason for this is not only the need for readability and understandability but also for changeability of the specification. For example, UML includes the Stereotype construct which, when applied to a construct, may modify the semantics of that construct. It is easier to understand the implication of applying a particular stereotype and to keep the specification consistent if the specification of the construct is localized.

This localization principle implies a specification technique in which the semantics of the different kinds of relationships are separated from other kinds of constructs. For example, the specification of an object construct states that an object has, among other things, a set of relationships, but the specification does not include their semantics. The meaning of a particular kind of relationship is defined separately. Hence, when a new kind of object construct is defined in the modelling language, features like atomic transactions, persistence etc. are included in the definition, while features based on a relationship of a specific kind are left out. Similarly, the definition of a class construct is made without presumptions about e.g. the existence of an inheritance construct.

Each construct specification contains a mandatory operation named consistent, which specifies the invariant of the construct, i.e. properties which must always be true for a particular occurrence of that kind. The operation takes no argument, and for an occurrence of a construct to be correct from a semantic point of view, the operation must result in true. This technique implies that the static semantics of a construct, i.e. the well-formedness rules, is expressed with the consistent operation, while the dynamic semantics is expressed with the other operations. Note that the dynamic semantics of a modelling language includes the semantics used in the development of the model as well as the run-time semantics of the model. This implies that one operation may specify the semantics of both the development of the model and the execution of the model.
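The iteration and find expressions described in this section can be mimicked in an ordinary programming language. The following Python sketch is only an illustration, not part of ODAL: the function names foreach and find are hypothetical, "simultaneously" is assumed to mean pairwise iteration, and returning None for an empty collection is likewise an assumption.

```python
from typing import Any, Callable, Iterable, List, Optional


def foreach(body: Callable[..., Any], *collections: Iterable[Any]) -> Optional[Any]:
    """Sketch of FOREACH v IN c [, w IN d]* DO body.

    Each collection expression is evaluated once (materialised as a list).
    The body is then evaluated for each value; with several collections the
    values are taken pairwise (an assumption about "simultaneously").
    The result is the value of the last evaluation of the body.
    """
    materialised = [list(c) for c in collections]
    result: Optional[Any] = None  # assumption: empty iteration yields None
    for values in zip(*materialised):
        result = body(*values)
    return result


def find(condition: Callable[[Any], bool], collection: Iterable[Any]) -> List[Any]:
    """Sketch of FIND v IN c SUCHTHAT condition: the values satisfying it."""
    return [value for value in list(collection) if condition(value)]
```

For instance, foreach(lambda v: v * 10, [1, 2, 3]) evaluates the body three times and yields the value of the last evaluation, while find(lambda v: v % 2 == 0, [1, 2, 3, 4]) yields the even values only.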


3 Use Case

The Use Case construct is a tool for describing how a system is to be used. A Use Case describes one service provided by a system, i.e. a specific way of using the system. The complete set of Use Cases specifies all possible ways in which the system can be used, without revealing how this is to be implemented by the system. This makes Use Cases suitable for defining functional requirements in the early stages of system development, when the inner structure of the system has not yet been defined. Also, Use Cases can be used as a basis for defining this structure in terms of Classes, Packages etc., and for defining test cases. Since Use Cases do not deal with technicalities inside the system but focus on how the system is perceived from the outside, they are most useful in discussions with end-users to make sure that there is agreement on the requirements on the system, on its delimitation etc.

More specifically, a Use Case specifies a set of complete sequences of actions which the system can perform. An action sequence of a Use Case is complete in the sense that after its performance the system resumes the state in which the sequence was initiated, and the Use Case may be initiated once more. Each sequence is initiated by a user of the system, and it includes the interaction between the system and its environment as well as the system’s response to these interactions. Each time the Use Case is invoked by a user, one of these sequences is performed. The set of sequences of a Use Case is usually divided into a normal case, i.e. the sequence of actions which a user will expect to be performed, and a set of variants. These include alternative choices, exceptions and error handling. Note that in this context we use the word “action” in an informal, intuitive way.

[Figure: the Use Case Local Call]

Example 1. In a telecommunication exchange, Local Call is a Use Case which describes how a call is performed between two subscribers connected to the exchange. A call is initiated when the A-subscriber lifts the handset. The subscriber receives a dialling tone, and the system registers the subscriber as busy. When the first digit is dialled, the dialling tone is disconnected. The received digits are analyzed and eventually, when all the digits have been received, the A-subscriber and the B-subscriber are connected through the network, and the B-subscriber is registered as busy. At this time the B-subscriber receives ring signals, while the A-subscriber receives a ring tone. When the B-subscriber lifts the handset, both the ring tone and the ring signal are disconnected. The two subscribers can now communicate with each other. The call is terminated when both the A-subscriber and the B-subscriber have hung up. The network connection is disconnected, and both subscribers become idle. One alternate sequence of the Use Case is that, by hanging up, the A-subscriber can terminate the call at any time before the B-subscriber has answered.

A Use-Case instance is a performance of a Use Case. It is initiated by a Message instance sent from a user, and it continues according to one of the sequences of actions specified in the Use Case. It usually includes more communication with users than the initiating Message instance, as in Example 1 above. Note, however, that a Use-Case instance can never communicate with another Use-Case instance, since each of them is a complete sequence.

Apart from defining the services offered by a whole system, Use Cases can be used to specify how to use for example Classes or Subsystems (see [6], p. 93). The Use Cases of a Subsystem or a Class then define the functional requirements on this entity. In this way, Use Cases can be used for the distribution and tracking of requirements from the system level down to its constituents at different levels.

Use Cases can be described in several different ways. In practice, ordinary text is often used, describing the different sequences in an informal manner as in the description in Example 1. However, since a Use Case describes a set of action sequences, the actions form a partially ordered set. This implies that a Use Case can be specified in a more formal way by a State machine. Then, the Use Case sequences are defined by the transitions between the states in the State machine.

It is also possible to describe a Use Case more formally by means of a set of Operations and Methods. Since a Use Case is a kind of Classifier, it has a collection of Operations and a collection of Methods describing its behaviour. The Operations of a Use Case describe what Message instances an Instance of that Use Case may receive, while the Methods describe what sequences of actions are performed by Instances of the Use Case. A complete sequence of the Use Case consists of one or several Operations/Methods, performed in a pre-defined order. In this paper we use Operations and Methods for describing the Use Cases. By using this technique, we are able not only to give a more precise meaning to the Use Case, Uses, and Extends constructs than was presented in the UML 1.1 documentation; we also achieve a description of the constructs that is suitable as a basis for formalization.

[Figure: the Use Case A with the Operation x]

Example 2. A simple Use Case A, with user input only at the initiation of the action sequence, is described with one Operation x and a Method with the same name which realizes the Operation. The sequence of actions of the Method consists of two actions: k and l.

A concrete Use Case is an instantiable Use Case, i.e. all its sequences of actions are complete. This implies that all its Operations are realized by Methods. This is not necessarily the case for an abstract Use Case, in which the collection of Operations and the collection of Methods do not necessarily match (cf. the difference between concrete and abstract Classes - an abstract Class may contain Operations which are not realized by a Method). It should be noted, however, that there are two major differences between Use Cases and Classes regarding the usage of Operations and Methods:

• In a Class, all Methods have a name, including those of an abstract Class, while an abstract Use Case may contain Methods which have no name. This is due to the fact that the name of a Method is used for matching the name of the invoking Message, whereas an abstract Use Case may contain fragments of sequences of actions which do not necessarily start with an input. These unnamed Methods are used together with the Uses and Extends relationships and not for matching names, see sections 4 and 5 below.
• A Class contains no Methods which do not realize an Operation, while an abstract Use Case may do so. For example, an unnamed Method is not a realization of an Operation. In an inheriting (i.e. using) Use Case the actions of such a Method are included in the sequences of actions of other Methods.

To specify the order between the Methods of a Use Case, we use a technique with a state variable. Each Method has a value representing the state in which the Method may be invoked, and each of the action sequences within a Method is ended by giving the state variable its new value, thus specifying which Methods are the successors of that sequence.

A Use-Case instance behaves like an ordinary Instance with the exception that when it receives a Message instance it has to take the current value of the state variable into consideration. A Use-Case instance may have several Methods with the same name, since the action sequence to be performed may vary depending on the state in which the invoking Message instance is received. The Use-Case instance will look up the Method in the same way as an Instance, but since this check might yield a collection of Methods, the Use-Case instance will have to find the Method that is applicable in the current state:

  CLASS UseCaseInstance SUPERCLASS Instance
  VARIABLES
    currentstate : State
  METHODS
    lookupMethod (sign)
      FIND m IN SUPER lookupMethod (sign)
      SUCHTHAT m state () = currentstate
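The state-variable technique above can be illustrated with a small executable sketch. The Python below is not the authors' ODAL specification; the class layout, the state names and the action names are hypothetical, chosen loosely after Example 1, and picking the first applicable Method is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Method:
    name: str            # matched against the name of the invoking message
    state: str           # state in which the Method may be invoked
    actions: List[str]   # the action sequence of the Method
    next_state: str      # new value given to the state variable at the end


@dataclass
class UseCaseInstance:
    methods: List[Method]
    current_state: str
    trace: List[str] = field(default_factory=list)

    def lookup_method(self, sign: str) -> List[Method]:
        # Look up by name first, as an ordinary Instance would ...
        by_name = [m for m in self.methods if m.name == sign]
        # ... then keep only the Methods applicable in the current state.
        return [m for m in by_name if m.state == self.current_state]

    def receive(self, sign: str) -> None:
        applicable = self.lookup_method(sign)
        if not applicable:
            raise LookupError(f"no Method {sign!r} in state {self.current_state!r}")
        method = applicable[0]  # assumption: take the first applicable Method
        self.trace.extend(method.actions)
        self.current_state = method.next_state


# Hypothetical fragment of Local Call: "hang up" has one Method per state.
LOCAL_CALL_METHODS = [
    Method("lift handset", "idle", ["send dialling tone", "register A as busy"], "dialling"),
    Method("hang up", "dialling", ["register A as idle"], "idle"),
    Method("hang up", "ringing", ["disconnect ring tone", "register both as idle"], "idle"),
]
```

Two Methods share the name "hang up"; which one is performed depends only on the value of the state variable when the message arrives.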

In UML a Use Case is a kind of Classifier, which implies that the semantics of Use Cases includes the semantics defined for Classifiers. However, in some cases this semantics is more restricted than for Classifiers in general. For example, unlike a Use Case a Classifier constitutes a namespace, and as such it may contain other entities, e.g. other Classifiers and Associations. Obviously, this does not hold for Use Cases. Moreover, like any Classifier a Use Case can be related to other entities in different ways. For example, since the performance of a Use Case involves communication with users, there are Associations between the Use Case and the Classes modelling the users. However, as mentioned above, relationships of this kind cannot exist between Use Cases describing the same system (or Subsystem or Class), since each sequence of a Use Case is complete in itself. This semantics is formalized as follows:

  CLASS UseCase SUPERCLASS Classifier
  METHODS
    allowedAssociations ()
      c := TRUE;
      FOREACH r IN relationships DO
        IF r isKindOf (Association) THEN
          FOREACH n IN r getAllOppositeNames (SELF) DO
            c := c AND SELF owner () ≠ r lookupType (n) owner ();
      c

    consistent ()
      SUPER consistent () AND
      SELF allowedAssociations () AND
      SELF contents () = NULL

Between Use Cases there are two types of relationships that are of interest in this paper, namely the Uses and the Extends relationships. In the following sections we present the Uses and the Extends constructs. In UML both Uses and Extends are defined as stereotypes on the Generalization relationship, but in this paper we treat both of them as specific kinds of relationships since our focus is on their semantics and not on how to describe the stereotypes as such.

4 The Uses Relationship

Commonalities between Use Cases are expressed with the Uses relationship, defined as a special kind of Generalization. It means that the sequences of actions specified by the used Use Case are included in the sequences of the using Use Case(s). The latter Use Case may both introduce new sequences and add new actions to the sequences specified by the used Use Case, as long as it does not change the ordering of the original actions. An Instance of a Use Case thus performs the actions specified in its originating Use Case as well as the actions specified in its used Use Cases.

Uses relationships, like ordinary Generalization relationships, must not be defined in cycles, i.e. a Use Case must not, directly or indirectly, be used by itself. An additional restriction on the Uses relationship is that used parts must not be replaced by other parts in the using Use Case. Only additions are allowed. This is due to the fact that the Uses relationship expresses commonalities between Use Cases, i.e. the same fragments of action sequences appear in several Use Cases.

[Figure: the Use Cases Local Call and Order Wake-up Call each have a <<uses>> relationship to the Use Case Call Initialization and Termination]

Example 3. In our telecommunication exchange there are two Use Cases, one describing a local call between two subscribers, and one for ordering a wake-up call. The initial sequences of these two Use Cases are the same: the subscriber lifts the handset, receives a dialling tone and dials a sequence of digits. Moreover, the termination sequences of both Use Cases are also the same once both parties have hung up. This commonality is described in a separate Use Case, named Call Initialization and Termination, which is an abstract Use Case, i.e. this sequence is never performed on its own but only as part of those of other Use Cases.

In terms of Operations and Methods, the Uses relationship implies that a using Use Case may:

• add new Operations
• add new Methods
• redefine already existing Methods

A using Use Case must not redefine Operations, however, since this would imply redefinition of message reception. The Uses relationship implies that all the Operations of a used Use Case are applicable also to Instances of the using Use Case. If a Method redefines an existing Method, then all the actions of the original Method must still be present in the using Use Case and their original partial order must still be satisfied. It should be noted, though, that the action sequences of a Method are not necessarily kept within a single Method in a using Use Case. The requirement is that the complete partial order of the actions defined in the used Use Case is kept. Therefore, the action sequences of one Method may be split into several different Methods in a using Use Case - as long as the order of the actions is kept.
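The requirement in this section that Uses relationships must not be defined in cycles can be checked mechanically. The following Python sketch uses a depth-first search; the dictionary representation of the Uses relationships and the function name are hypothetical and are not taken from the formal specification in [4].

```python
from typing import Dict, List, Set


def uses_is_acyclic(uses: Dict[str, List[str]]) -> bool:
    """Return True if no Use Case uses itself, directly or indirectly.

    `uses` maps the name of a using Use Case to the names of the
    Use Cases it uses (a hypothetical representation).
    """
    on_path: Set[str] = set()   # Use Cases on the current search path
    done: Set[str] = set()      # Use Cases already known to be cycle-free

    def visit(use_case: str) -> bool:
        if use_case in done:
            return True
        if use_case in on_path:
            return False        # reached a Use Case that is already on the path
        on_path.add(use_case)
        ok = all(visit(used) for used in uses.get(use_case, []))
        on_path.discard(use_case)
        if ok:
            done.add(use_case)
        return ok

    return all(visit(use_case) for use_case in list(uses))
```

With the Use Cases of Example 3, where Local Call and Order Wake-up Call both use Call Initialization and Termination, the check succeeds; it fails as soon as any Use Case can reach itself by following Uses relationships.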

[Figure: the Use Case B has a <<uses>> relationship to the Use Case A. A contains an unnamed Method with the ActionSequence check accuracy, calculate. B contains the Operation receive data, realized by a Method with the ActionSequence check accuracy, output ConfirmationRequest, and the Operation receive confirmation, realized by a Method with the ActionSequence calculate, output result.]

Example 4. Assume that there is an abstract Use Case A with a Method consisting of the two actions check accuracy and calculate. Since neither of these actions is an input, the Method does not have a name. This Use Case is used when we define the Use Case B, which has the following sequence: check the accuracy of received data, ask for and receive a confirmation, perform some calculations, return the result. Here input is received twice, implying that the sequence is defined in two Operations and two corresponding Methods. The order of the sequence fragment defined in the Use Case A is maintained even though the actions are included in different Methods in B, since the Operation receive data is applied before the Operation receive confirmation (the order is not shown in the figure).
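The ordering requirement illustrated by Example 4 can be sketched as a subsequence check. The Python below is a simplification made for illustration: it treats the actions of the used Use Case as one totally ordered sequence rather than a general partial order, and the function names are hypothetical.

```python
from typing import List, Sequence


def flatten(methods: Sequence[Sequence[str]]) -> List[str]:
    """Concatenate the action sequences of Methods in their invocation order."""
    return [action for method in methods for action in method]


def preserves_order(used: Sequence[str], using: Sequence[str]) -> bool:
    """True if every action of `used` occurs in `using`, in the same order.

    Additional actions in `using` are allowed anywhere in between.
    """
    remaining = iter(using)
    # `action in remaining` consumes the iterator up to the first match,
    # so each later action has to be found after the earlier ones.
    return all(action in remaining for action in used)


# The Use Cases of Example 4.
A_METHOD = ["check accuracy", "calculate"]
B_METHODS = [
    ["check accuracy", "output ConfirmationRequest"],  # realizes receive data
    ["calculate", "output result"],                    # realizes receive confirmation
]
```

Here preserves_order(A_METHOD, flatten(B_METHODS)) holds because B only adds actions between and after those of A. Applying the two Methods of B in the opposite order would violate the requirement.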

A formal specification of the Uses relationship is given in [4]. Therefore, we do not include such a specification in this paper.

5 The Extends Relationship

Use Cases may also have Extends relationships to each other. An Extends relationship states that a Use Case may be extended with the behaviour specified in another Use Case. The relationship defines not only the connection between the two Use Cases but also a condition that must be satisfied if the extension is to take place, and a reference to the extension point in the extended Use Case where the additional behaviour is to be inserted. Once an Instance of a Use Case reaches an extension point, the condition of the relationship is tested. If the condition is met, the Instance continues according to the sequence of the original Use Case extended with the sequence of the other Use Case.

In the general case, the whole extending sequence is not inserted at the same place in the original sequence (see [6], p. 95; also [2], p. 401). Instead, the extension consists of fragments, each defined in a separate Method and inserted at different points. Since an extending sequence of actions must be inserted either in its entirety or not at all, the condition is checked only at the first extension point, and if it is fulfilled, the additions to the following extension points are all made.

[Figure: the Use Case Charging has an <<extends>> relationship to the Use Case Local Call]

Example 5. In our small telecommunication exchange example, the operator of the exchange will be interested in measuring the duration of a call, e.g. for the charging of the call. How and when the measurement is done is not part of an ordinary local call; it is therefore not included in the Local Call Use Case. How to perform the measurement and how to calculate the cost is specified in a separate Use Case, namely Charging, which is an extension of the Local Call Use Case. In principle, the Charging Use Case will have the following structure:

  Unnamed Method: reset clock, start clock
  Unnamed Method: stop clock, update the charging of the subscriber

The Extends relationship specifies that this extension should always occur, i.e. the condition is true, and the extension points are:

  1. Both subscribers off hook
  2. One subscriber on hook

If the condition is fulfilled when the first extension point is reached (as it will always be in this case), the actions of the first Method are inserted at the first extension point and the actions of the second Method at the second extension point.
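The all-or-nothing rule of Example 5 can be sketched as a function that weaves fragments into a base sequence. This Python sketch is an illustration under stated assumptions: extension points are represented as named steps of the base sequence, fragments are inserted directly after their extension point, and the step names of Local Call are hypothetical.

```python
from typing import Callable, Dict, List


def extend_sequence(
    base: List[str],
    fragments: Dict[str, List[str]],
    first_point: str,
    condition: Callable[[], bool],
) -> List[str]:
    """Insert extension fragments into `base` at the named extension points.

    The condition is evaluated only when `first_point` is reached. Its
    result decides whether all fragments are inserted or none of them.
    """
    result: List[str] = []
    extending = False
    for step in base:
        result.append(step)
        if step == first_point:
            extending = condition()      # tested at the first extension point only
        if extending and step in fragments:
            result.extend(fragments[step])
    return result


# Hypothetical rendering of Local Call and Charging from Example 5.
LOCAL_CALL = [
    "both subscribers off hook",
    "conversation",
    "one subscriber on hook",
    "disconnect network connection",
]
CHARGING = {
    "both subscribers off hook": ["reset clock", "start clock"],
    "one subscriber on hook": ["stop clock", "update the charging of the subscriber"],
}
```

Calling extend_sequence(LOCAL_CALL, CHARGING, "both subscribers off hook", lambda: True) yields the Local Call sequence with both Charging fragments inserted; with a condition that is false, the base sequence is returned unchanged.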

An extension point is “a location at which the use case can be extended” (see [6], p. 91). In this paper we have refined the definition of extension point to be an action which has a name and contains a collection of action sequences. This collection is by default empty, but action sequences are added if a Use-Case instance is extended at this extension point:

  CLASS ExtensionPoint SUPERCLASS Action
  VARIABLES
    name : ExtensionPointName,
    actionSeq : Action

One major idea of the Extends relationship is that the extended Use Case is not dependent on the fact that it is extended by another Use Case [1]. However, the semantics of the Extends relationship depends on what happens in an Instance of the extended Use Case. Thus, when formalizing the Extends construct we use a general technique of sending notifications when specific conditions are met. In this slightly simplified presentation each Use Case has a collection of entities which are notified each time an Instance of the Use Case performs an action. This collection will of course consist of all Extends relationships that are connected to the Use Case. However, the definition of the Use Case does not depend on the existence of or on the kinds of these entities - its Instances just send notifications to the entities in the (possibly empty) collection. We use this technique not only when specifying the Extends construct but also when specifying some other kinds of relationship constructs, like the subscribesTo relationship in OOSE.

When an Extends relationship is established between two Use Cases in a model, it is given a condition which must be satisfied if the extension is to take place, as well as a sequence of extension points where the extensions are to be made. At that time the relationship is also registered in the list of entities which will be notified when an Instance of the extended Use Case executes an action. When a Use-Case instance is to execute an action, it first sends a notification to each entity in the notification list, which may cause one or several Extends relationships to extend the Instance, and after that the Instance continues its evaluation of the action.

If a notification is received by an Extends relationship, and if the action that the Instance is to execute is an extension point, the condition of the relationship is tested in the environment of the Instance. If it evaluates to true and the current extension point is the one where the extension is to take place as defined in the Extends relationship, the Instance is extended with the behaviour of the extending Use Case. When the extension consists of several fragments, each fragment is added to its own extension point.
An extending fragment can either be part of a sequence, in which case it is expressed with an unnamed Method, or it can be a new Method which will extend the collection of Methods of the Instance. Once all the fragments have been added, the extension is done and the Instance can resume its execution. In accordance with the principles above, the Extends relationship is formalized as follows:

  CLASS Extends SUPERCLASS Entity
  VARIABLES
    fr : UseCase,
    to : UseCase,
    co : Action,
    en : ExtensionPointName*
  METHODS
    create (f, t, c, e)
      fr := f; to := t; co := c; en := e;
      to register (SELF)

    remove ()
      to deregister (SELF)

    notify (a, e, i)
      IF a isKindOf (ExtensionPoint) THEN
        IF a name () = en HEAD name () AND i evaluate (co, e) THEN
          FOREACH m IN fr getAllMethods (), n IN en DO
            IF m name () = "" THEN
              (e getExtensionPoint (n)) addActions (m body ())
            ELSE
              (FIND s IN i segments () SUCHTHAT s class () = to) addMethod (m)

    consistent ()
      SUPER consistent () AND
      fr ≠ NULL AND to ≠ NULL AND
      co ≠ NULL AND en HEAD ≠ NULL

The required additions to the Use Case and the Use-Case instance constructs are as follows:

  CLASS UseCaseInstance SUPERCLASS Instance
  METHODS
    evaluate (a, e)
      FOREACH n IN SELF class () getNotificationCollection () DO
        n notify (a, e, SELF);
      SUPER evaluate (a, e)

  CLASS UseCase SUPERCLASS Classifier
  VARIABLES
    nc : Entity*
  METHODS
    register (c)
      nc := nc ADD c

    deregister (c)
      nc := nc WITHOUT c

    getNotificationCollection ()
      nc
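The notification technique formalized above resembles the observer pattern, and a rough Python analogue may help in reading the ODAL text. This is a sketch under several assumptions, not a translation of the specification: actions and extension points are plain strings, the evaluation environment is omitted, only unnamed fragments are handled, and the attribute names are hypothetical.

```python
from typing import Callable, Dict, List


class UseCase:
    """Knows only a (possibly empty) collection of entities to notify."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.notification_collection: List["Extends"] = []

    def register(self, entity: "Extends") -> None:
        self.notification_collection.append(entity)

    def deregister(self, entity: "Extends") -> None:
        self.notification_collection.remove(entity)


class UseCaseInstance:
    def __init__(self, use_case: UseCase) -> None:
        self.use_case = use_case
        self.added: Dict[str, List[str]] = {}  # extension point -> added actions
        self.performed: List[str] = []

    def evaluate(self, action: str) -> None:
        # First notify every registered entity, then perform the action.
        for entity in list(self.use_case.notification_collection):
            entity.notify(action, self)
        self.performed.append(action)
        self.performed.extend(self.added.get(action, []))


class Extends:
    def __init__(
        self,
        extended: UseCase,
        condition: Callable[[UseCaseInstance], bool],
        points: List[str],
        fragments: List[List[str]],
    ) -> None:
        self.extended = extended
        self.condition = condition
        self.points = points
        self.fragments = fragments
        extended.register(self)

    def remove(self) -> None:
        self.extended.deregister(self)

    def notify(self, action: str, instance: UseCaseInstance) -> None:
        # The condition is tested only at the first extension point; if it
        # holds, every fragment is added to its own extension point at once.
        if action == self.points[0] and self.condition(instance):
            for point, fragment in zip(self.points, self.fragments):
                instance.added.setdefault(point, []).extend(fragment)
```

Note that UseCase never refers to the details of Extends beyond calling notify, which mirrors the idea that the extended Use Case does not depend on being extended. A simplification worth keeping in mind: if an Instance reached the first extension point twice, this sketch would add the fragments twice.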

It should be noted that if two Use Cases extend the same Use Case at the same extension point, nothing can be said about their internal order. Furthermore, unlike the Uses relationship, the Extends relationship may be cyclic: to an ongoing sequence of actions may be added another occurrence of that very same sequence.


6 Concluding Remarks

In this paper we have presented the semantics of the Use Case construct and its specific kinds of relationships, namely Uses and Extends. The presentation of the semantics is both more formal and more detailed compared to that of the language definition [6]. There are also other kinds of relationships that are applicable to Use Cases, such as Association, Dependency and Constraint. Most of these kinds of relationships have been covered in another paper [4]. Furthermore, due to the scope of this paper we have not included any of the semantics related to the relationship between Use Cases and Collaborations.

We hope that the deficiencies in the definition of UML will be removed in future versions of the language, and we believe that a good way of finding these deficiencies is by providing a formal specification of the language. The current specification of UML is done partly in OCL and partly in natural language. OCL is suitable for expressing well-formedness rules, but a specification of the dynamic semantics expressed with constraints would not be easily understood. Moreover, since OCL lacks a formal specification of its semantics, it is unsuitable for formalizing UML. The usage of a language like ODAL, with its formal semantics, is suitable for the specification of both the well-formedness rules and the dynamic semantics of UML.

The specification technique used in this project is a general technique for specification of object-oriented modelling languages. We have previously used it for specification of a subset of OOSE [2], where the technique proved to be suitable not only for the development of the specifications but also for later modifications. Our experiences from the current project are similar.

References

1. Jacobson, I.: Basic Use-Case Modeling (Continued). Report on Object Analysis & Design, 1(3), Sept-Oct 1994
2. Jacobson, I., Christerson, M., Jonsson, P., Övergaard, G.: Object-Oriented Software Engineering: A Use Case Driven Approach. Addison-Wesley, 1993
3. Milner, R., Parrow, J., Walker, D.: A Calculus of Mobile Processes, I. Information and Computation, 100:1-40, 1992
4. Övergaard, G.: A Formal Approach to Relationships in the Unified Modeling Language. In: Broy, M., Coleman, D., Maibaum, T. S. E., Rumpe, B. (eds.): Proceedings PSMT’98 Workshop on Precise Semantics for Software Modelling Techniques. Technische Universität München, Germany, TUM-I9803, April 1998
5. Unified Modeling Language, version 1.1, September 1997. On-line documentation is found at http://www.rational.com/uml/documentation.html
6. UML Semantics, version 1.1, September 1997. Part of [5]

A Practical Framework for Applying UML

Paul Allen
Vice President of Methods
SELECT Software Tools, Westmoreland House, 80-86 Bath Road, Cheltenham, Gloucs., GL53 7JT, UK
phone: (UK) 1242 229835/1737 813911
Fax: (UK) 1242 229701/1737 814279
Email: [email protected]

Abstract. This paper presents practical strategies and techniques for putting UML to work based on the author’s project experiences. A design philosophy is outlined for realizing the vision of black-box reuse in terms of a service-based architecture that capitalizes on the emergence of component technology. The paper discusses the role of the architecture in providing a framework for applying UML in the practical contexts of business process improvement and the need to leverage existing software and models. The need for an iterative and incremental delivery process is a well established theme of object-orientation (Booch, 1993). This paper is mainly concerned with the equally important, but less well-understood, need to employ an architectural cycle across projects. A simplified scenario is used to examine the role that different UML modeling techniques play throughout the process, to show how the architecture helps shape UML techniques, and to focus on some of the key issues and pitfalls that typically confront the developer.

Keywords: architecture, component, process

Background

It is significant that UML is a modeling language, not a method (as it was in its early days). The reach of UML has grown much wider since the OMG (Object Management Group) took over responsibility for its development in early 1996, agreeing on V1.1 (OMG, 1997) late in 1997. A good method should also contain advice on how to go about software development: the steps to follow (a process) and how to structure the models (an architecture). The move from method to language clarified UML’s more focused position as a standard for mainly graphical notations (syntax) and definition of terms (semantics). Process and architecture are deliberately excluded, and rightly so, for reasons of scope.

J. Bézivin and P.-A. Muller (Eds.): «UML» ’98, LNCS 1618, pp. 419–433, 1999. © Springer-Verlag Berlin Heidelberg 1999


In this paper strategies and techniques are described for putting UML to work, based on project experiences:
• an architecture is described which provides an overall design strategy.
• a process is briefly outlined for employing the modeling techniques.
• step-by-step examples are provided to show how the UML models are used with respect to architecture and process.

The focus is not on UML diagram technicalities, as the main purpose is to provide a road map of how the UML models fit into the software process, together with some illustrations of how to use UML at an architectural level. The paper concentrates on “black-box” reuse; “white-box” reuse of interfaces, though valuable, is excluded for reasons of scope.

Components as Enabling Technology

Before embarking on the details of this paper, the role of component technology in underpinning the approach needs some discussion. This is important because the term “component” has come to mean many things to many people. A component has three key features as understood in this paper:
• it is an executable unit of code that provides physical black-box encapsulation of related services.
• its services can only be accessed through a consistent, published interface that includes an interaction standard.
• it must be capable of being connected to other components (through a communications interface) to form a larger group.

Components represent a significant gear change from objects: encapsulation of objects is often imperfect because implementation dependencies are often exposed through programming language interfaces. Component technology such as the OMG’s Common Object Request Broker Architecture (CORBA) and Interface Definition Language (IDL) can help to resolve this issue by providing component interfaces that are separate from implementations, which isolates consumers from implementation changes. In fact, provided its interfaces conform to standards, the component doesn’t have to be written in objects at all.

A service is a user-defined stereotype of operation that has a published specification of interface and behavior and that represents a contract between the provider of the capability and the potential consumers. Using the description, an arm’s-length deal can be struck that allows the consumer to access the capability. The shift to components and services opens up interesting possibilities to address software reuse, including legacy systems. Rather than starting each project with a “clean slate,” sets of service features can be examined to see which can be reused to solve a business problem.
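The separation of a published interface from its implementation can be illustrated with a minimal Python sketch. It is not taken from the paper: the names `VenueLister`, `InMemoryVenues` and `first_free_venue` are hypothetical, and a Python `Protocol` merely stands in for an interface that would, in the paper's setting, be published in something like IDL.

```python
from __future__ import annotations

from typing import Protocol


class VenueLister(Protocol):
    """Hypothetical published interface: the only thing a consumer depends on."""

    def list_available_venues(self, start_date: str) -> list[str]:
        ...


class InMemoryVenues:
    """One possible provider; consumers never need to name this class."""

    def __init__(self, bookings: dict[str, set[str]]) -> None:
        # Maps a venue name to the set of dates on which it is already reserved.
        self._bookings = bookings

    def list_available_venues(self, start_date: str) -> list[str]:
        return sorted(
            venue for venue, dates in self._bookings.items() if start_date not in dates
        )


def first_free_venue(service: VenueLister, start_date: str) -> str | None:
    """Consumer code written purely against the contract."""
    venues = service.list_available_venues(start_date)
    return venues[0] if venues else None
```

Because `first_free_venue` mentions only the interface, the provider could be swapped for a wrapper around a legacy system without touching the consumer, which is the isolation of change that the paragraph above describes.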


Service-Based Architecture

An effective software architecture provides an overall structure and set of rules for managing the scale and complexity inherent in enterprise software development. The keynotes of a good architecture are software interoperability, adaptability and consistency of design. Services are a key element of good software architecture and are categorized as follows:
• User services provide human-computer interfaces to meet the needs of a particular business process or department. User services link together with business services to deliver the business capabilities to users in terms of a visual interface.
• Business services convert data received from data services and user services into information by coupling related business tasks with the relevant business rules. Business services are technically neutral and may apply at different levels of generality across several business processes or departments.
• Data services are used for the manipulation of data in a way that is independent of the underlying physical storage implementation. Data services commonly apply to generic data access requirements across several business processes or departments.

Business components are abstracted out into their own layer, ensuring the implementation independence that is so important for a component-based approach. This architecture also helps close the all-too-prevalent gap between business process improvement and software engineering:
• A user service delivers business capability through the software interface to a business process.
• At a lower level, a user service may be split into a set of smaller (more reusable) user services.
• A user service can call on business and data services via operation calls.
• Business and data services are usually shared by different user services.
• A use case breaks down into a set of business service requests.
• Business and data services may be split using operations into sets of smaller (more reusable) business and data services.
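As a rough illustration of the three service categories, the following Python sketch stacks a user service on a business service on a data service. All class names, the `MAX_SEATS` rule and the sample data are assumptions made for the example; the paper does not prescribe any of them.

```python
class CourseDataService:
    """Data service: hides the physical storage (here just a dict)."""

    def __init__(self, rows):
        self._rows = dict(rows)

    def seats_booked(self, course_id):
        return self._rows[course_id]


class CourseBusinessService:
    """Business service: couples a task with a business rule, with no UI or storage detail."""

    MAX_SEATS = 12  # assumed business rule, for illustration only

    def __init__(self, data_service):
        self._data = data_service

    def seats_remaining(self, course_id):
        return max(0, self.MAX_SEATS - self._data.seats_booked(course_id))


class CourseUserService:
    """User service: presents the business capability (plain text stands in for a GUI)."""

    def __init__(self, business_service):
        self._business = business_service

    def render(self, course_id):
        return f"{course_id}: {self._business.seats_remaining(course_id)} seats left"
```

A different user service (for example a menu-based one) could reuse the same business service unchanged, and the dict could be replaced by a database wrapper without affecting either layer above it.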

Two Architectural Ground-Rules

There are two further ground-rules that are followed in layering services:
• aim for increasing generality downward through each layer; for example, a core business component like Product Rules provides services to a specific business component like Product Sale.
• take advantage of the best work of others. There are several important industry initiatives in this area, including OMG work through the business object task force (BOTF), as well as useful work on business-related patterns (Fowler, 1997), not only design patterns, which are now well covered in the literature (Gamma et al., 1995).


It is important to understand that the service-based architecture is not a panacea. Developing software is still hard! However, at least we have a start: the service-based architecture provides an overall context for shaping modeling techniques as we seek to apply good design principles.

Modeling Software Architecture

UML package diagrams are used for modeling software architecture. A package is “a general purpose mechanism for organizing elements into groups” (OMG, 1997). Such elements can range from model items (classes, use cases and so on) to legacy assets (for example, compiled code, transaction, or database libraries). A package provides a scope for a set of names of the elements contained within it. The name of a package must be unique. In the component marketplace, this name must be unique worldwide in similar fashion to a URL (an Internet address of the form http://ahost/adirectory/afile). The package name must not change once it is published.

A service package is a user-defined stereotype of package that provides a set of services belonging to a single service category. Service packages provide a mechanism for grouping objects into cohesive units and achieving an effective granularity of reuse. Service packages are also used to wrap legacy assets. One of the major benefits of this is that proposed reuse of legacy systems, software packages and databases is addressed as part of architectural design, and not left as an afterthought, as in many methods.

Three stereotypes of class are used: user, business, and data. User and data classes are technology classes. The former provide the screens and controls that let people employ user services, as well as other non-interactive interfaces such as interfaces to other systems or batch reporting. By assigning responsibility for external communication in this way, changes to user interfaces (which are quite common given ongoing developments in graphical user interface (GUI) technology) will not affect the underlying business classes. The data classes provide data services by controlling access to database storage and converting this data to a clean, non-database-specific interface for use by business and user services. Again, this protects the business classes from the effects of changes to databases.

The service package provides the required services through one or more service classes. A service class is a type of class, providing one or more services, which represents an abstraction of one or more component interfaces. A service class is used to help retain a domain focus in modeling without becoming embroiled in the potential complications of implementation: it gives the developer a mechanism for identifying interfaces “early in the process” in preparation for mapping to components. A service class marshals service requests and encapsulates access to a component in the spirit of responsibility-based design by contract (Wirfs-Brock, 1991). Typically, it is a control class (Jacobson, 1992) exclusively designed to provide services through one or more interfaces, although services can be allocated to any appropriate class, which then has the role of service class. The fact that a class is a service class is derived from the fact that at least one of its operations is a service.
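The rule that a class counts as a service class as soon as one of its operations is a service can be sketched as a derived property. The following Python model is a hypothetical illustration only: the dataclass names and fields are invented for the example and are not UML metamodel elements.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Operation:
    name: str
    is_service: bool = False  # models the user-defined "service" stereotype


@dataclass
class ModelClass:
    name: str
    stereotype: str = "business"  # "user", "business" or "data"
    operations: list = field(default_factory=list)

    @property
    def is_service_class(self) -> bool:
        # Derived, not declared: true when at least one operation is a service.
        return any(op.is_service for op in self.operations)


@dataclass
class ServicePackage:
    name: str      # must be unique and stable once published
    category: str  # a single service category: "user", "business" or "data"
    classes: list = field(default_factory=list)

    def services(self):
        """Qualified names of every service the package offers."""
        return [
            f"{cls.name}.{op.name}"
            for cls in self.classes
            for op in cls.operations
            if op.is_service
        ]
```

Keeping `is_service_class` derived rather than stored means that allocating a service to any appropriate class automatically gives that class the role of service class, as the text describes.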

A Component-Based Process

Although details of the software process are outside the scope of this paper, it is important to understand the basic principles in order to put the example that follows into perspective. The depth of modeling techniques needs to be tuned according to the potential for reuse. Although a single model structure applies to all types of projects, the modeling techniques applied vary according to the project type. Broadly speaking, there are two types of project: solution projects focus on fast assembly of software increments that supply user services driven by specific business requirements. The more rigorous component process acquires and develops the supporting business and data services driven by generic business requirements. Both processes are iterative and incremental. The two processes are ideals; commonly it will be necessary to use elements of both processes adapted to one’s own specific needs. Each process contains guidelines on team structures, as well as which modeling techniques apply at different stages in development.

Effective component-based development also requires an architectural cycle that is ongoing and proactive (see figure 1). The architectural cycle takes a strategic cross-project view, assesses possible services (including legacy assets) and plans and coordinates delivery projects with respect to the business plan. Full details of this process are described in (Allen and Frost, 1998).

Figure 1: A Component-Based Process. (Diagram labels: Architectural Cycle, Scope, Feedback, Delivery Cycle, Iteration.)


A Modeling Example

The following simplified scenario illustrates how UML can be applied in the context of the architectural and delivery cycles discussed above.

Architectural Cycle:

Understand the Business Processes: Business requirements must drive the process and a good place to start architectural scoping is with business process models. Many business process models are based on function decomposition. As a result, existing organizational structures are “reinvented” and the subsequent designs do not capitalize on the best features of object orientation. In contrast, an event-driven approach is applied in the search for services. Actors are treated as roles, as the exchanges between actors will provide a basis for identifying services. The services must be as generic as the business will allow, not constrained by current use of technology or people. In a software education business, a Plan Courses process is triggered on a regular quarterly basis by the event Start of Period. Plan Courses is analyzed in terms of a value chain of event-driven actions:
• Review course history
• Maintain course types
• Design course schedule
• Add scheduled course
• Produce course schedule

The end result of this chain of actions is that a course schedule is published. One of the actions along this value chain is Add Scheduled Course. There are two people currently responsible for Add Scheduled Course: the Course Administrator and the Marketing Manager. The role played however, regardless of who or where, is that of Course Planner. Add Scheduled Course is described in terms of the steps the actor (Course Planner) must achieve in order to complete it:
• Establish course type
• Find available venue
• Schedule course
• Reserve venue
• Raise requisition for course materials

Plan the Software Architecture: A package diagram is used to scope architectural dependencies (shown as dashed arrows); the UML note symbol is used to aid explanation (see figure 2). We follow the convention that a package is by default a service package if no stereotype symbol is shown. Service categories are used to guide the layering of service packages. User service packages often correspond to particular individuals or departments. In the example the Course Administrator and Marketing Manager have different user interface requirements; the former needs a sophisticated GUI whereas the latter requires a simple menu-based interface. Therefore there are two folder service packages corresponding to these different needs.

Figure 2. Example package diagram. (Packages shown: Course Admin Folder, Event Folder, Course Booking, Sales Promotion, Course Planning, Purchasing, Locations, Consultants, Purchasing Data Mgt. Notes shown: user service packages: solution development; new component dev. (twice); software package wrapping; legacy database wrapping.)

At the next level are business service packages that correspond to the needs of particular business processes (Course Booking, Course Planning and Sales Promotion). Beneath these are more generic business service packages that reflect the needs of the business domain (Consultants, Locations, Purchasing). Techniques such as domain analysis are used to help partition these packages.
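The layering described above can be checked mechanically once packages and their dependencies are written down. The sketch below is an illustration under stated assumptions: the package names come from the text, but the specific dependency arrows are assumed for the example (the figure's arrows are not recoverable here), and `upward_dependencies` is a hypothetical helper.

```python
# Lower numbers are nearer the user; dependencies should point downward (to larger numbers).
LAYER = {"user": 0, "business": 1, "data": 2}

packages = {
    "Course Admin Folder": "user",
    "Event Folder": "user",
    "Course Planning": "business",
    "Locations": "business",
    "Purchasing": "business",
    "Purchasing Data Mgt": "data",
}

# Assumed dependencies (source depends on target), for illustration only.
depends_on = [
    ("Course Admin Folder", "Course Planning"),
    ("Event Folder", "Course Planning"),
    ("Course Planning", "Locations"),
    ("Course Planning", "Purchasing"),
    ("Purchasing", "Purchasing Data Mgt"),
]


def upward_dependencies(packages, depends_on):
    """Return dependencies that point from a lower layer back up to a higher one."""
    return [
        (src, dst)
        for src, dst in depends_on
        if LAYER[packages[src]] > LAYER[packages[dst]]
    ]
```

An empty result means every dependency runs from a more specific layer to a more generic one; any pair returned is a candidate for review rather than an automatic error, since the paper allows layering to be traded against other concerns.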


In parallel, we also work bottom-up by assessing existing software assets. In particular, if reuse of a particular legacy database or software package is mandated up-front, then it is important to declare this within the overall architecture. In the example, an existing purchasing data model is used as an input to creating a class model for the Purchasing business service package. The Purchasing Data Mgt service package provides a set of interfaces which sit “on top” of the existing database. The Locations service package wraps an existing software package.

Typically, service packages are allocated to different teams. Project management is facilitated by architecting service packages in early phases of development. This also has the advantage that incremental design can focus on specific implementation details without being overloaded with wider architectural concerns. Of course, no partitioning will be ideal, but at least we have a starting context in which to perform the more detailed work.

Class modeling is used at the enterprise level to understand the business domain. High-level business classes are allocated to packages (see figure 3). We follow the convention that a class is by default a business class if no stereotype symbol is shown. At this stage, the diagram is simply intended to provide a sketch; it is not necessary to show attributes, operations, or association names. It is useful to show cardinality constraints, however, as these are important in exposing strong dependencies. For example, the fact that a Course must be associated with a single Venue means there is a strong dependency of the former on the latter, which is probably acceptable as Venue is likely to be a very stable class. At this stage the class diagram is “rough cut”: it is important to examine the semantics of classes and associations carefully, as well covered in the literature (Booch, 1993), and to ask questions! For example, is Course Material specific to an occurrence of a particular Course (I need 12 sets of binders for the component modeling course in London on 12 Feb) or does it apply to a Course Type (I need 12 sets of binders for any component modeling course)? If it turns out to be the latter (as we shall assume) then the associations from Course Material to Course and Requisition are not meaningful, though an association from Course to Requisition may well be.

Appraise Software Architecture: No architecture is perfect; making trade-offs is a necessary part of the process. The price paid for a pure service-based architecture can be increased numbers of calls across the layers. This problem is exacerbated in the case of a highly constrained implementation, where services are fragmented across several platforms. These issues need to be balanced against the flexibility, maintainability and reuse that come with a service-based approach. It is important to understand that service layers guide the partitioning but are sometimes sacrificed in the interests of non-functional requirements such as performance or security. The main theme is to partition to minimize undesirable dependencies and to critically appraise associations between classes in different service packages. It is also important to keep pattern aware, as discussed earlier.
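The points about cardinality constraints and about appraising associations that cross service packages can be combined in a small check. This Python sketch is illustrative only: the allocation of classes to packages and the multiplicities listed are assumptions chosen to mirror the Course and Venue example, and both function names are hypothetical.

```python
def lower_bound(multiplicity: str) -> int:
    """Lower bound of a UML multiplicity such as '1', '0..1', '*' or '1..*'."""
    first = multiplicity.split("..")[0].strip()
    return 0 if first == "*" else int(first)


# Assumed allocation of classes to service packages.
package_of = {
    "Course": "Course Planning",
    "Course Type": "Course Planning",
    "Venue": "Locations",
    "Requisition": "Purchasing",
}

# (source class, target class, multiplicity at the target end); sample values.
associations = [
    ("Course", "Venue", "1"),
    ("Course", "Course Type", "1"),
    ("Course", "Requisition", "0..1"),
]


def strong_cross_package(associations, package_of):
    """Mandatory associations (lower bound of at least 1) that cross a package boundary."""
    return [
        (src, dst)
        for src, dst, mult in associations
        if lower_bound(mult) >= 1 and package_of[src] != package_of[dst]
    ]
```

With these sample values only the Course to Venue association is reported, matching the observation that Course depends strongly on Venue; whether that is acceptable remains a judgment about how stable the target class is.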

Figure 3. A high-level class diagram with packages. (Packages shown: Course Planning, Locations, Purchasing. Classes shown: Course Type, Course Material, Course, Venue, Requisition, connected by associations with multiplicities.)

A basic guideline is to separate operational from core functionality. For example, it may well be best to factor Course Type into its own package as it represents core rules, rather than operational requirements like Course. Whether to split into separate service packages is essentially a trade-off between the resulting package reusability and the number of inter-dependencies incurred. In this case we decide to split Course Type into a Course Rules package which provides core functionality; for example, all courses of a certain type share the same material requirements. Course splits to a separate Course Planning service.

Other trade-offs occur at the service-layer boundaries. For example, if a certain business policy applies locally at different user sites to Course Planning, then it may well be expedient to hold the local business logic in the user service layer, especially if faced with a distributed implementation. Regardless of implementation constraints, it is a good strategy to separate out common interfaces within the business service layer. The reason is twofold: technical neutrality and reuse of common interfaces. That is why it is so important to model actors as roles, both in business process modeling and use case modeling (see below). In the example, a Course Scheduler service class is introduced to provide a common interface for the Course Planner role. This is used by both the Course Admin Folder and the Events Folder.

Service packages are often “orthogonal” in the sense that it can be difficult to judge in which package a particular class resides. For example, does Course belong in Course Booking or Course Planning? Other guidelines that help in making such decisions include:
• avoiding circular dependencies
• minimal inter-associations between packaged classes
• maximum intra-associations between packaged classes
• aiming for cohesive packages, more by classes commonly reused together, rather than simply functionally cohesive
• achieving good granularity; for example, 5-15 classes per service package is a rough rule of thumb
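Two of the guidelines above, avoiding circular dependencies and keeping to a rough 5-15 classes per package, lend themselves to simple automated checks. The following Python sketch is one possible way to do this, not a method from the paper; `find_cycle` is an ordinary depth-first search and `outside_granularity` is a hypothetical helper whose default bounds simply restate the rule of thumb.

```python
def find_cycle(graph):
    """Return one dependency cycle as a list of package names, or None.

    graph maps each package name to the packages it depends on.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {node: WHITE for node in graph}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in graph.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path, so the path closes a cycle.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for node in list(graph):
        if colour[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


def outside_granularity(class_counts, low=5, high=15):
    """Packages whose class count falls outside the rough 5-15 rule of thumb."""
    return sorted(name for name, n in class_counts.items() if not low <= n <= high)
```

Both results are prompts for discussion rather than verdicts: the text presents these guidelines as aids to judgment, and the other guidelines (cohesion by common reuse, for instance) are not captured by such counts.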

Delivery Cycle:

Identify the Use Cases: Use cases help establish the system scope for the business activities under study. Use cases are not a good tool for creating a system architecture but are a very good tool for testing out an architecture. In this case the Course Planning service package provides a context for solution development. There is one use case for each business action (see figure 4) except for Design Course Schedule, which is excluded as it is totally manual. Note that this need not always be the case; a business action can correspond to several use cases and vice-versa, depending on the granularity of the business actions.

Use case modeling needs to be applied carefully to get maximum value from the technique. As well as modeling actors as roles, it is vital to get use case granularity right. Granularity is the functional level of a use case. A very high level would be functions such as Purchasing and Accounting. A very low level would be specific atomic tasks such as Find Customer Address and Validate Stock Level. To find the right level of granularity for a use case we need to again consider the definition: “A behaviorally related sequence of interactions performed by an actor in a dialogue with the system to provide some measurable value to the actor” (Jacobson, 1994). There are four rules of thumb that are usefully applied with respect to this definition.
• “Behaviorally related” means that the interactions should as a group be a self-contained unit which is an end in itself with no intervening time delays imposed by the business.
• The use case must be performed by a single actor in a single place, although it might result in output flows which are sent to other passive actors.
• “Measurable value” means that the use case must achieve some business goal. If we cannot find a business-related objective for the use case then we should think again.
• The use case must leave the system in a stable state; it cannot be left half done.
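For this example, the step from business actions to candidate use cases can be written as a simple filter. The sketch below is illustrative: the action names and the fact that Design Course Schedule is manual come from the text, while the tuple layout and the function name are assumptions, and the one-to-one mapping holds only for this scenario.

```python
# (business action, is it entirely manual?)
plan_courses_actions = [
    ("Review course history", False),
    ("Maintain course types", False),
    ("Design course schedule", True),  # totally manual, so no use case
    ("Add scheduled course", False),
    ("Produce course schedule", False),
]


def candidate_use_cases(actions):
    """One candidate use case per business action that involves the system."""
    return [name for name, manual in actions if not manual]
```

Each candidate would still need to be tested against the four rules of thumb above, since an action at the wrong granularity may map to several use cases or to part of one.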

Figure 4. Example use case diagram: Plan Courses. (Diagram labels: actor Course Planner; use cases Maintain Course Types, Review Course History, Produce Course Schedule, Add Scheduled Course; Services.)

In describing the use case, the focus is on how the business action is to work in practice using a proposed software solution. Attention is restricted to the steps of the action that require actor-computer interaction. A first-cut use case description for Add Scheduled Course follows:
• Find required course type from list of available course titles.
• For the required run date, enter start date. Find venue from list of available venue names.
• Request course to be scheduled, for chosen date and venue, including requisition of course materials.

Analyze the Need for Services: Depending on the scale of the system and the priority of use cases, a number of use cases are now selected for further analysis. A single use case is analyzed in terms of required services (see Table 1). Note that all the services shown are business services which may be reused by other use cases. Component management software can assist greatly in cataloging and accessing the services (Allen and Frost, 1997). Further detail, such as non-functional requirements and alternative courses, is also captured but not shown here for reasons of scope.

USE CASE STEP | POTENTIAL SERVICE
Find required course type from list of available course titles | List course types
For the required run date, enter start date. Find venue from list of available venue names | List available venues
Request course to be scheduled, for chosen date and venue, including requisition of course materials | Schedule course

Table 1. Potential services for Add Scheduled Course use case.

Services must be identified before committing to detail about objects. The use case descriptions are completed fully; otherwise, assumptions will be made about which services to reuse before the user’s problem is understood. Function decomposition is a common pitfall if the definition of a use case is not carefully applied. Whereas function decomposition attempts to break down an abstract description into increasing levels of detail (until eventually we reach precise specifications), a service-based approach identifies interactions between user and system at a consistent level of detail. Work is done “outside-in” to identify required services, rather than top-down.

Apply Use Cases to Domain Objects: Collaboration diagrams are used to explore how objects identified in our domain analysis work together to provide services to support use cases (see Figure 5). Collaboration diagrams are focused on single scenarios rather than whole use cases. The collaboration diagram can be a victim of excess with further conditional notation; a picture may be worth a thousand words, but not if it has a thousand words written all over it. This technique is akin to CRC cards (Wilkinson, 1995) and can be applied that way using a white-board. Note the Course Scheduler service class introduced to provide the Course Planner’s interface.
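The outside-in analysis described above, in which use case steps are mapped to potential services as in Table 1, can be held as plain data and compared with whatever services are already catalogued. In this hypothetical Python sketch the step-to-service mapping follows Table 1, while the `catalog` contents and the helper `split_by_reuse` are assumptions made for illustration.

```python
# Use case step -> potential service, following Table 1.
add_scheduled_course = {
    "Find required course type from list of available course titles": "List course types",
    "Enter start date and find venue from list of available venue names": "List available venues",
    "Request course to be scheduled, including requisition of course materials": "Schedule course",
}

# Assumed: services already offered by existing components or wrapped legacy assets.
catalog = {"List available venues"}


def split_by_reuse(step_to_service, catalog):
    """Split the required services into those found in the catalog and those still to build."""
    required = sorted(set(step_to_service.values()))
    reuse = [s for s in required if s in catalog]
    build = [s for s in required if s not in catalog]
    return reuse, build
```

Keeping the mapping at the level of whole use case steps, rather than decomposing each step further, reflects the consistent level of detail that distinguishes this approach from function decomposition.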

Figure 5. Example collaboration diagram for Add Scheduled Course use case (normal scenario). (Diagram labels: actor Course Planner; objects :Course Scheduler, :Course Type, :Course, :Course Material, :Venue, :Requisition; messages 1 List, 2 List, 3 Schedule, 3.1 Create, 3.1.1 Reserve, 3.1.2 Find Mat, 3.1.3 Raise.)

Refine the Architecture: The architecture is developed in more detail to support the use cases. There are multiple occurrences of Venue and Requisition, both of which represent operational data that should be encapsulated with its function. Venue and Requisition are therefore not suitable as service classes; Locator and Purchaser service classes are accordingly introduced. A Rulebook service class is created to encapsulate the Course Rules service package. These decisions are reflected in figure 6; service classes are evident because their services are shown in italics (the UML notation for an abstract operation) and additionally in bold.

The above changes illustrate the iteration that occurs between solution delivery and architecture. On early projects (like the one described here) this is fine, as part of the exercise is reconnaissance in understanding the problem, but on later projects it must be carefully managed for “knock-on” effect as the architecture should be as stable as possible.

Figure 6. Example refined class diagram for Course Planning. (Packages shown: Course Planning, Purchasing, Locations, Course Rules. Class labels recoverable from the diagram: Course Scheduler (Schedule); Course (Start Date; Create); Purchaser (Raise Req); Requisition (Date, Number; Raise); Locator (List Venues, Reserve Venue); Venue (Address, Name; List, Reserve); Rulebook (List Course Types, Find Mat); Course Type (Price, Title; List); Course Material (Code, Description); an association labeled Requires with multiplicities 1 and *.)

Design Business Services: A plan is now constructed for design and delivery of Course Planning increments; this is excluded here for reasons of scope. Object interaction modeling is used to further examine the required object messaging for the Add Scheduled Course use case (see Figure 7), which is selected as the first increment for delivery. Note how the use case stays as free as possible from implementation details at this stage. That way, business services are exposed which are “pluggable” into different possible user interfaces. For example, both the Events Folder and the Course Admin Folder are able to apply these business services according to user interface requirements.

Figure 7. Example sequence diagram for Add Scheduled Course use case. (Objects shown: :Course Scheduler, :Course, :Course Rules::Rulebook, :Locations::Locator, :Purchasing::Purchaser. Description column: Request Add Scheduled Course; List possible course types; List possible venues; Select course and venue for chosen date; Create scheduled course; Reserve venue; Ascertain materials; Raise requisition. Messages: List Course Types, List Venues, Schedule, Create, Reserve, Find Mat, Raise.)

As development unfolds, further UML diagrams are brought into play. In particular it is important to use deployment modeling (OMG, 1997) to explore distribution of service packages in terms of physical components across physical platforms to ensure the design meets detailed implementation constraints and quality criteria.
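The message flow for Add Scheduled Course can be approximated in code to show how a Course Scheduler service class delegates to the Rulebook, Locator and Purchaser service classes. This Python sketch is a simplification under stated assumptions: the class and message names follow the text, but all method signatures and stub data are invented, and the nested calls that the diagrams attribute to the Course object are flattened into the scheduler for brevity.

```python
class Rulebook:
    """Stand-in for the Course Rules service class; the data is made up."""

    def __init__(self, log):
        self.log = log

    def list_course_types(self):
        self.log.append("LIST COURSE TYPES")
        return ["Component Modeling"]

    def find_mat(self, course_type):
        self.log.append("FIND MAT")
        return ["binder"]


class Locator:
    """Stand-in for the Locations service class."""

    def __init__(self, log):
        self.log = log

    def list_venues(self, date):
        self.log.append("LIST VENUES")
        return ["London"]

    def reserve(self, venue, date):
        self.log.append("RESERVE")


class Purchaser:
    """Stand-in for the Purchasing service class."""

    def __init__(self, log):
        self.log = log

    def raise_requisition(self, materials):
        self.log.append("RAISE")


class CourseScheduler:
    """Common interface for the Course Planner role."""

    def __init__(self, rulebook, locator, purchaser, log):
        self.rulebook, self.locator, self.purchaser, self.log = rulebook, locator, purchaser, log

    def list_course_types(self):
        return self.rulebook.list_course_types()

    def list_venues(self, date):
        return self.locator.list_venues(date)

    def schedule(self, course_type, date, venue):
        self.log.append("SCHEDULE")
        self.log.append("CREATE")  # stands in for creating the Course object
        course = {"type": course_type, "date": date, "venue": venue}
        self.locator.reserve(venue, date)
        materials = self.rulebook.find_mat(course_type)
        self.purchaser.raise_requisition(materials)
        return course


def add_scheduled_course(date):
    """Play the normal scenario and return the course plus the message trace."""
    log = []
    scheduler = CourseScheduler(Rulebook(log), Locator(log), Purchaser(log), log)
    course_type = scheduler.list_course_types()[0]
    venue = scheduler.list_venues(date)[0]
    return scheduler.schedule(course_type, date, venue), log
```

Because the caller touches only `CourseScheduler`, either user interface described in the paper could drive the same business services, which is the sense in which they are "pluggable".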

Summary

This paper has presented a pragmatic approach to using UML within the context of a service-based architecture and component-based process. The approach is grounded in business requirements. Domain modeling is used to drive the architecture. The architecture is governed by some basic principles including the concept of service layering, separation of operational from core functionality, and the application of patterns and design guidelines. The service-based architecture is particularly appropriate for addressing the issue of legacy systems, sadly ignored by many methods. However, as Winston Churchill once said, “Perfection is spelt P-A-R-A-L-Y-S-I-S”. Sometimes the principles conflict not only with each other but with non-functional requirements. A good software architecture is a matter of making well-reasoned trade-offs. Use cases are employed not to create the architecture but to test it out and as a vehicle for solution delivery. Use cases give a largely external perspective. Other techniques such as collaboration diagramming and sequence diagramming are applied to help with the internal organization of the system.

References

Allen, P. and S. Frost. Component Manager, Select Software Tools White Paper, 1997.
Allen, P. and S. Frost. Component-Based Development for the Enterprise: Applying The SELECT Perspective, SIGS Publications/Cambridge University Press, New York, NY, 1998.
Booch, G. Object Oriented Analysis and Design with Applications, 2nd Ed., Benjamin/Cummings, 1993.
Fowler, M. Analysis Patterns: Reusable Object Models, Addison-Wesley, 1997.
Gamma, E., Helm, R., Johnson, R., Vlissides, J. Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley, 1995.
Jacobson, I., Christerson, M., Jonsson, P., Overgaard, G. Object Oriented Software Engineering: A Use Case Driven Approach, Addison-Wesley, 1992.
OMG. Unified Modeling Language Version 1.1, OMG, Framingham, Mass., 1997.
Wilkinson, N. Using CRC Cards: An Informal Approach to Object-Oriented Development, SIGS Books, New York, NY, 1995.
Wirfs-Brock, R., B. Wilkerson, and L. Weiner. Designing Object-Oriented Software, Prentice Hall, Upper Saddle River, NJ, 1991.

Extending Aggregation Constructs in UML

Monika Saksena¹, Maria Larrondo-Petrie², Robert B. France³, and Matthew P. Evett²

¹ Powervision Corporation, 220 Congress Park Drive, Suite 130, Delray Beach, FL 33445-4605, USA
Phone: (561) 279 2890 Fax: (561) 279 2810
[email protected]

² Department of Computer Science & Engineering, Florida Atlantic University, Boca Raton, FL 33431-0991, USA
Phone: (561) 297 3855 Fax: (561) 297 2800
[email protected]

³ Computer Science Department, Colorado State University, Fort Collins, CO 80523, USA
Phone: (970) 491 6356 Fax: (970) 491 2466
[email protected] e.edu

Abstract. In this paper we provide a characterization of aggregation as used in static conceptual modeling of applications. Based on this characterization we suggest changes to UML notation that allow for more precise characterization of aggregate structures.

1 Introduction

Many authors in the software engineering, knowledge representation, conceptual modeling, and database communities have explored the notion of aggregation. For example, Winston et al. [7] characterized different kinds of aggregations from a knowledge representation perspective, Odell mapped this work to the OO modeling perspective [3], and Kilov and Ross [2] provide a rigorous treatment of aggregation in terms of invariant properties. The Methods Integration Research Group (MIRG) is developing a characterization of aggregation that unifies and extends the above works on aggregation. Our initial focus is on aggregation constructs in static OO models (i.e., as used in Class Diagrams), and the results of our initial work are detailed in [6]. Our characterization of aggregation in static models is expressed in terms of primary (essential) and secondary (non-essential) characteristics. Combinations of the secondary properties result in varieties of aggregation. In this paper we propose extensions to the Unified Modeling Language (UML) [4] definition of aggregation that allow developers to explicitly express additional properties of the aggregate structures. Fig. 1 shows our proposed classification of aggregation properties in terms of primary and secondary properties. The leaf nodes of the classification tree are properties. This characterization is independent of UML.

J. Bézivin and P.-A. Muller (Eds.): «UML» ’98, LNCS 1618, pp. 434–441, 1999. © Springer-Verlag Berlin Heidelberg 1999

Fig. 1. Properties of Aggregation

2 Characteristics of Aggregation

In this section we give an overview of our proposed primary and secondary characteristics of aggregation. A primary characteristic is a property that holds for all forms of aggregation, while a secondary characteristic is neither unique nor essential to aggregation, but is used to distinguish different kinds of aggregation structures.

2.1 Primary Characteristics

The primary characteristics are grouped under three categories: structural, lifetime binding, and ownership.


Structural. The structural category includes static structural properties of aggregates. The mathematical properties required in an aggregate structure are antisymmetry and irreflexivity. We do not consider transitivity an essential property for all aggregations. For example, if a club is composed of persons and persons are composed of legs, then inferring that the legs of club members also belong to the club may not be desired. An aggregate structure must also have attributes based on Resultant and Emergent properties [2]. A resultant property of an aggregate structure is one that is dependent on a subset of the properties of the aggregate's parts. An emergent property is one that is independent of the properties of the component instances.

Lifetime Binding. Lifetime binding refers to the relationship between the lifetimes of the whole and its parts. The strong form of aggregation in the UML is defined as coincident lifetime binding between the part and the whole. The weaker form of aggregation relaxes this rule to allow sharing of parts across wholes. Civello [1] proposes that parts and wholes can outlive each other. In the Rumbaugh et al. [5] characterization of aggregation, the lifetime of the part is contained within the lifetime of the whole (i.e., parts do not outlive the whole). In our characterization of aggregation, we use the following weaker form of lifetime binding:

In an aggregate structure, the lifetime of the part must overlap the lifetime of the whole.

The possible relationships between lifetimes of parts and wholes allowed by our characterization are given in Fig. 2. Case 4 corresponds to the UML notion of coincident lifetimes. Cases 1 to 4 are scenarios of lifetime bindings where the lifetime of the part is contained within the lifetime of the whole. In these cases the parts cannot be separated from their wholes and must always be associated with a whole. We refer to this group of lifetime dependencies as inseparable parts in this paper.

Cases 5 to 8 allow parts to exist without associated wholes. This group of dependencies is termed separable parts. In Cases 5 to 7 the death of a part is independent of the death of the whole. In Case 8, a part can exist on its own, but when associated with a whole, it dies at the time the whole dies.
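The lifetime rule can be made concrete with a small sketch. The Python below is illustrative only: the `Lifetime` class, the numeric time axis, and the function names are assumptions introduced here, not part of the characterization itself. It checks the overlap requirement for a single part-whole pair and, judging only from the two observed lifetimes, labels the part as inseparable (its lifetime is contained in that of the whole) or separable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lifetime:
    """Closed interval [birth, death] on an abstract time axis (assumed model)."""
    birth: float
    death: float

    def __post_init__(self) -> None:
        if self.birth > self.death:
            raise ValueError("birth must not come after death")


def overlaps(part: Lifetime, whole: Lifetime) -> bool:
    """Primary requirement: the two lifetimes share at least one instant."""
    return part.birth <= whole.death and whole.birth <= part.death


def is_contained(part: Lifetime, whole: Lifetime) -> bool:
    """The part never exists outside the lifetime of the whole."""
    return whole.birth <= part.birth and part.death <= whole.death


def is_coincident(part: Lifetime, whole: Lifetime) -> bool:
    """Part and whole are created together and die together."""
    return part == whole


def classify(part: Lifetime, whole: Lifetime) -> str:
    """Label one observed part-whole pair by its lifetime relationship."""
    if not overlaps(part, whole):
        return "not an aggregation"
    if is_contained(part, whole):
        return "inseparable"
    return "separable"
```

For instance, `classify(Lifetime(0, 6), Lifetime(1, 6))` describes a part that exists before its whole and dies with it, and is therefore labeled separable. A part that moves between several wholes would need one such check per whole.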

Ownership. Ownership pertains to the control the whole has over the behavior of its parts. According to Odell, it is a term that describes the destiny of the whole with respect to its parts [3]. We consider the following notion of ownership as essential to aggregation:

An object owns its part in the sense that it controls the behavior of its parts, that is, the whole controls how and when the services of its parts are used. This requires that an aggregate class have methods that call the methods of the parts. Note that the above notion of ownership does not preclude other objects from interacting directly with the part. A part may be owned by one or more wholes at a particular time, and its owners may change over the part’s lifetime.
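As a minimal sketch of this notion of ownership, the Python below uses a hypothetical `Car` whole and `Engine` part; the method names are assumptions chosen for illustration. The whole decides when the service of the part is used, other code can still call the part directly, and the owner of the part can change over time.

```python
from typing import Optional


class Engine:
    """Part: offers services but does not decide when they are used."""

    def __init__(self) -> None:
        self.running = False

    def ignite(self) -> None:
        self.running = True

    def shut_down(self) -> None:
        self.running = False


class Car:
    """Whole: its methods call the methods of its part."""

    def __init__(self, engine: Optional[Engine] = None) -> None:
        self.engine = engine

    def start(self) -> bool:
        """The whole controls how and when the part's service is used."""
        if self.engine is None:
            return False
        self.engine.ignite()
        return True

    def hand_over_engine(self, other: "Car") -> None:
        """Ownership may change: the part moves to another whole."""
        other.engine, self.engine = self.engine, None
```

Nothing in the sketch prevents a mechanic object from calling `engine.shut_down()` directly, which matches the remark that ownership does not preclude direct interaction with the part.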

[Figure: timelines of Whole and Part object lifetimes along a time axis for eight cases. Cases 1 to 4 are grouped as INSEPARABLE PARTS (labels: contained, coincident); Cases 5 to 8 are grouped as SEPARABLE PARTS (labels: unconstrained, constrained).]

Fig. 2. Lifetime Bindings between the part and the whole

2.2 Secondary Characteristics

The secondary characteristics of aggregation are parts sharing, homeomerousity (similarity of part and whole), and propagation of features.

Sharing. Sharing of parts occurs when a part is simultaneously associated with more than one whole. Shared parts are necessarily separable. Three types of parts sharing can be identified:

- Homogeneous Sharing: An instance of a part class can be shared among different instances of the same aggregate class. For example, an instance of Person can belong to two different instances of ResearchGroup.
- Heterogeneous Sharing: An instance of a class can be shared by wholes that are instances of different classes. An instance of Person can be simultaneously shared by an instance of Family and by an instance of ResearchGroup.
- Conceptual Sharing: The concept represented by a class is shared. Conceptual sharing does not imply sharing of instances. For example, an Engine can be a part of two different aggregate classes: Car and Plane. However, one particular instance of Engine of an instance of a Plane cannot be used as a part of an instance of Car. The concept of the Engine is being shared, not instances of Engine.
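The two instance-level kinds of sharing can be told apart mechanically. The following Python sketch is an illustration under assumptions: the function `sharing_kinds` and the empty stand-in classes are introduced here and are not part of the proposal. Given the wholes that one part instance currently belongs to, it reports homogeneous sharing when two wholes have the same class and heterogeneous sharing when the wholes span more than one class. Conceptual sharing is not covered, since it concerns classes rather than instances.

```python
from collections import Counter


class ResearchGroup:
    """Stand-in aggregate class from the running example."""


class Family:
    """Stand-in aggregate class from the running example."""


def sharing_kinds(wholes):
    """Return the kinds of instance sharing implied by a part's current wholes."""
    counts = Counter(type(whole) for whole in wholes)
    kinds = set()
    if any(n > 1 for n in counts.values()):
        kinds.add("homogeneous")    # two wholes of the same aggregate class
    if len(counts) > 1:
        kinds.add("heterogeneous")  # wholes of different aggregate classes
    return kinds
```

A Person who belongs to two research groups and one family would yield both kinds at once, while a part attached to a single whole yields the empty set, meaning it is not shared.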

Homeomerousity. Parts in an aggregate structure are said to be homeomerous when they have properties in common with their whole or with other parts in the structure. For example, a homeomerous part class can be a specialization of its whole class. Such parts are often used to form recursive aggregate structures.

Propagation. Properties of the whole (attributes, operations, references) can propagate to its parts. Such propagation is a powerful and concise way of specifying an entire continuum of behavior. For an example of behavior propagation, consider a triangle modeled as an aggregate with edges as parts. Moving the triangle requires the movement of the edges. The move operation of the whole is propagated to its parts. For an example of attribute propagation consider a car composed of an engine, body and chassis. The color of the body is the color of the car; the color of the car propagates to the body.
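Both propagation examples can be sketched in a few lines of Python. The class layout, the coordinate representation, and the use of a read-only property are assumptions made for illustration; the text only fixes the behavior, namely that moving the triangle moves its edges and that the body takes its color from the car.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class Edge:
    start: Point
    end: Point

    def move(self, dx: float, dy: float) -> None:
        self.start = (self.start[0] + dx, self.start[1] + dy)
        self.end = (self.end[0] + dx, self.end[1] + dy)


@dataclass
class Triangle:
    edges: List[Edge]

    def move(self, dx: float, dy: float) -> None:
        """Operation propagation: the whole forwards move to every part."""
        for edge in self.edges:
            edge.move(dx, dy)


class Car:
    def __init__(self, color: str) -> None:
        self.color = color
        self.body = Body(self)


class Body:
    def __init__(self, car: Car) -> None:
        self._car = car

    @property
    def color(self) -> str:
        """Attribute propagation: the body reads its color from the car."""
        return self._car.color
```

Deriving the body color from the car, rather than copying it, keeps the two values consistent when the car is repainted. Copying the value on every change would be an equally valid reading of propagation.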

3 Proposed extensions for UML

The information conveyed by existing graphical notations for aggregation is inadequate for representing various shades of aggregation. In this section we propose extensions to the standard UML notation to support a more precise modeling of aggregation characteristics. The primary structural properties of aggregation do not require any annotations because their semantics are implicit in the graphical representation.

While all aggregate structures have lifetime bindings, the precise nature of the bindings can vary across structures. The current UML notation does not directly support a more precise expression of the lifetime binding property of an aggregate. We propose that the parts in an aggregate be explicitly marked as being separable or inseparable. An inseparable component is marked by an {inseparable} annotation adjacent to the component class. For example, an inseparable TableOfContents is depicted in Fig. 3.

[Figure: aggregate class Article with part class TableOfContents annotated {inseparable}.]

Fig. 3. Annotated Inseparable Part


A separable part of Property is shown in Fig. 4. In such cases, the cardinality of the whole class implies 0 as a lower limit for separable parts. This notation provides for aggregate constructs in which one component is separable and another is inseparable. The secondary properties require annotations because these connotations are not implicit in the definition of aggregation. The syntax for secondary properties is described next.

Fig. 4. Annotated Separable Part
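One way a modeling tool might record these annotations is sketched below in Python. Everything in the block is an assumption made for illustration: the `Binding` enumeration, the `PartAnnotation` record, the `{separable}` spelling of the second tag, and the part name `Shed` (the part class of Fig. 4 is not named in the text). The sketch renders the curly-brace tag and derives the lower multiplicity bound at the whole end: 0 for a separable part, and 1 for an inseparable part, which must always be associated with a whole.

```python
from dataclasses import dataclass
from enum import Enum


class Binding(Enum):
    INSEPARABLE = "inseparable"
    SEPARABLE = "separable"


@dataclass(frozen=True)
class PartAnnotation:
    """Lifetime-binding annotation attached to the part end of an aggregation."""
    whole: str
    part: str
    binding: Binding

    def tag(self) -> str:
        """Text of the annotation as drawn next to the part class."""
        return "{" + self.binding.value + "}"

    def whole_lower_bound(self) -> int:
        """Minimum number of wholes a part instance must be attached to."""
        return 0 if self.binding is Binding.SEPARABLE else 1


toc = PartAnnotation("Article", "TableOfContents", Binding.INSEPARABLE)
shed = PartAnnotation("Property", "Shed", Binding.SEPARABLE)  # "Shed" is hypothetical
```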

3.1 Sharing

Parts can be shared homogeneously, heterogeneously or conceptually. Homogeneous sharing is depicted by multiplicities on the whole. For example, a Person object belonging to two ResearchGroup objects is indicated by placing a 0..* multiplicity at the ResearchGroup end of the aggregation. Heterogeneous sharing is implicit when a class is part of two different aggregate structures. Conceptual sharing is illustrated in Fig. 5, where the part class is annotated with {conceptual} at the top of the component class.

Fig. 5. Conceptual sharing


3.2 Homeomerousity

Homeomerousity is a special feature that is often applicable to certain cases where the part is similar to the whole. No annotation is required for homeomerousity, because it is evident in the definition and (recursive) structure of classes. An example is given in Fig. 6. An annotation is given in the example even though it is not needed.

Fig. 6. Homeomerous parts and wholes
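A recursive structure of this kind can be sketched as follows. The classes `Region` and `SubRegion` are hypothetical names chosen for illustration and are not taken from Fig. 6. The part class specializes its whole class, so every part can itself act as a whole, and the irreflexivity requirement is enforced by refusing to add an object to itself.

```python
class Region:
    """Whole class of a recursive aggregate (hypothetical example)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.parts = []  # SubRegion instances

    def add(self, part: "SubRegion") -> "SubRegion":
        if part is self:
            raise ValueError("an object cannot be a part of itself")
        self.parts.append(part)
        return part

    def depth(self) -> int:
        """Number of aggregation levels, counting this object as level 1."""
        return 1 + max((part.depth() for part in self.parts), default=0)


class SubRegion(Region):
    """Homeomerous part: a specialization of its whole class."""
```

The check in `add` covers only direct self-containment. Rejecting longer cycles, as antisymmetry would require, needs a walk over the existing parts and is left out of the sketch.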

3.3 Propagation

Propagation of structure is implicit in the aggregate configuration. Operations and properties need special annotations since not all attributes and/or operations are propagated in any given structure. An arrow in the direction of propagation indicates propagation of the attribute to that class. For example, if the color of the Body is propagated to the aggregate class Car, then the attribute color has an arrow in the direction of propagation adjacent to it (Fig. 7).

Fig. 7. Propagation of properties

Similarly, an arrow next to the affected operation can be used to depict propagation of operations. For example, Fig. 8 shows propagation of the operation COPY.

[Figure: aggregation with multiplicities 1 and 1..*, showing the class Character and the operation COPY.]

Fig. 8. Propagation of operations

4 Conclusions

We provide a characterization of aggregation that enhances the UML interpretation of aggregation. The characterization is expressed in terms of primary and secondary properties. Predefined UML annotations are proposed for rendering additional aggregation properties. Future work will focus on providing a precise semantics for aggregation and identifying patterns of aggregate structures that are useful in conceptual modeling.

References

1. F. Civello. Roles for composite objects in object-oriented analysis and design. In Proceedings of OOPSLA '93, 1993.
2. H. Kilov and J. Ross. Information Modeling: An Object-Oriented Approach. Object-Oriented Series. Prentice Hall, 1994.
3. J. Odell. Six different kinds of composition. Journal of Object-Oriented Programming, 6(8), January 1994.
4. The UML Partners. Unified Modeling Language. Version 1.1, Rational Software Corporation, Santa Clara, CA-95051, USA, January 1997.
5. J. Rumbaugh, M. Blaha, W. Premerlani, F. Eddy, and W. Lorensen. Object-Oriented Modeling and Design. Prentice Hall, 1991.
6. Monika Saksena. The notion of aggregation. Master's thesis, Department of Computer Science & Engineering, Florida Atlantic University, Boca Raton, Florida, 1998.
7. M. E. Winston, R. Chaffin, and D. Hermann. A taxonomy of whole-part relations. Cognitive Science, 11, 1987.

