Lecture Notes in Computer Science Commenced Publication in 1973 Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board David Hutchison Lancaster University, UK Takeo Kanade Carnegie Mellon University, Pittsburgh, PA, USA Josef Kittler University of Surrey, Guildford, UK Jon M. Kleinberg Cornell University, Ithaca, NY, USA Alfred Kobsa University of California, Irvine, CA, USA Friedemann Mattern ETH Zurich, Switzerland John C. Mitchell Stanford University, CA, USA Moni Naor Weizmann Institute of Science, Rehovot, Israel Oscar Nierstrasz University of Bern, Switzerland C. Pandu Rangan Indian Institute of Technology, Madras, India Bernhard Steffen TU Dortmund University, Germany Madhu Sudan Microsoft Research, Cambridge, MA, USA Demetri Terzopoulos University of California, Los Angeles, CA, USA Doug Tygar University of California, Berkeley, CA, USA Gerhard Weikum Max Planck Institute for Informatics, Saarbruecken, Germany
4980
Zbigniew Huzar Radek Koci Bertrand Meyer Bartosz Walter Jaroslav Zendulka (Eds.)
Software Engineering Techniques
Third IFIP TC 2 Central and East European Conference, CEE-SET 2008
Brno, Czech Republic, October 13-15, 2008
Revised Selected Papers
Volume Editors

Zbigniew Huzar
Wrocław University of Technology
Wybrzeże Wyspiańskiego 27, 50-370 Wrocław, Poland
E-mail: [email protected]

Radek Koci, Jaroslav Zendulka
Brno University of Technology
Božetěchova 2, 612 66 Brno, Czech Republic
E-mail: {koci, zendulka}@fit.vutbr.cz

Bertrand Meyer
ETH Zurich
Clausiusstr. 59, 8092 Zurich, Switzerland
E-mail: [email protected]

Bartosz Walter
Poznań University of Technology
Piotrowo 2, 60-965 Poznań, Poland
E-mail: [email protected]
ISSN 0302-9743 e-ISSN 1611-3349 ISBN 978-3-642-22385-3 e-ISBN 978-3-642-22386-0 DOI 10.1007/978-3-642-22386-0 Springer Heidelberg Dordrecht London New York Library of Congress Control Number: 2011930847 CR Subject Classification (1998): D.2, K.6.3, K.6, K.4.3, D.3, F.3.2, J.1 LNCS Sublibrary: SL 2 – Programming and Software Engineering
© IFIP International Federation for Information Processing 2011

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
Preface
The predecessor of CEE-SET was the Polish Conference on Software Engineering, KKIO, organized annually since 1999. In 2006 KKIO changed to an international conference on Software Engineering Techniques SET 2006 sponsored by Technical Committee 2 (Software: Theory and Practice) of the International Federation for Information Processing, IFIP [http://www.ifip.org/]. In 2007 the conference got a new name: IFIP TC2 Central and East-European Conference on Software Engineering Techniques, CEE-SET. In 2008 the conference took place in Brno, Czech Republic, and lasted for three days, October 13–15, 2008 (the details are on the conference website http://www.cee-set.put.poznan.pl/2008/). The conference aim was to bring together software engineering researchers and practitioners, mainly from Central and East-European countries (but not only), and to allow them to share their ideas and experiences. The conference was technically sponsored by:

– IFIP Technical Committee 2, Software: Theory and Practice
– Czech Society for Cybernetics and Informatics (CSKI)
– Gesellschaft für Informatik, Special Interest Group Software Engineering
– John von Neumann Computer Society (NJSZT), Hungary
– Lithuanian Computer Society
– Polish Academy of Sciences, Committee for Informatics
– Polish Information Processing Society
– Slovak Society for Computer Science
The financial support was provided by Visegrad Fund, CSKI, and Brno University of Technology. The conference program consisted of three keynote speeches given by Krzysztof Czarnecki (University of Waterloo, Canada), Thomas Gschwind (IBM Zurich Research Lab, Switzerland), and Mauro Pezzè (University of Milano Bicocca, Italy), 21 regular presentations selected from 69 submissions (success rate was about 30%), and 20 work-in-progress presentations. The International Program Committee decided that the Best Paper Award would be presented to Łukasz Olek, Jerzy Nawrocki and Miroslaw Ochodek for the paper “Enhancing Use Cases with Screen Designs.” This volume contains a keynote speech by Thomas Gschwind and regular presentations given at the conference. We believe that publishing these high-quality papers will support a wider discussion on software engineering techniques.

Zbigniew Huzar
Radek Koci
Bertrand Meyer
Bartosz Walter
Jaroslav Zendulka
Organization
Program Committee

Pekka Abrahamsson         VTT Technical Research Centre of Finland
Vincenzo Ambriola         University of Pisa, Italy
Nathan Baddoo             University of Hertfordshire, UK
Vladimír Bartík           Brno University of Technology, Czech Republic
Hubert Baumeister         Technical University of Denmark
Maria Bielikova           Slovak University of Technology in Bratislava, Slovakia
M. Biro                   John von Neumann Computer Society and Corvinus University of Budapest, Hungary
Pere Botella              Universitat Politecnica de Catalunya, Spain
Albertas Caplinskas       Institute of Mathematics and Informatics, Lithuania
Gabor Fazekas             University of Debrecen, Hungary
K. Geihs                  Universität Kassel, Germany
Janusz Gorski             Gdansk University of Technology, FETI, DSE, Poland
Bogumila Hnatkowska       Institute of Applied Informatics, Wroclaw University of Technology, Poland
Petr Hnetynka             Charles University in Prague, Czech Republic
Tomas Hruska              Brno University of Technology, Czech Republic
Zbigniew Huzar            Wroclaw University of Technology, Poland
Paul Klint                Centrum voor Wiskunde en Informatica, The Netherlands
Jan Kollar                Technical University Kosice, Slovakia
Laszlo Kozma              Eötvös Loránd University, Hungary
Leszek Maciaszek          Macquarie University Sydney, Australia
Jan Madey                 Warsaw University, Poland
Lech Madeyski             Wroclaw University of Technology, Poland
Zygmunt Mazur             Wroclaw Institute of Technology, Poland
Bertrand Meyer            ETH Zurich, Switzerland
Matthias Müller           IDOS Software AG, Germany
Jürgen Münch              Fraunhofer Institute for Experimental Software Engineering, Germany
Jerzy Nawrocki            Poznań University of Technology, Poland
Miroslav Ochodek          Poznań University of Technology, Poland
Łukasz Olek               Poznań University of Technology, Poland
Janis Osis                Riga Technical University, Latvia
Erhard Ploedereder        University of Stuttgart, Germany
Saulius Ragaisis          Vilnius University, Lithuania
Felix Redmill             Newcastle University, UK
Marek Rychlý              Brno University of Technology, Czech Republic
Krzysztof Sacha           Warsaw University of Technology, Poland
Wilhelm Schaefer          University of Paderborn, Germany
Giancarlo Succi           Free University of Bolzano-Bozen, Italy
Tomasz Szmuc              AGH University of Science and Technology, Poland
Marcin Szpyrka            AGH University of Science and Technology, Poland
Andrey Terekhov           St. Petersburg State University, Russia
Richard Torkar            Blekinge Institute of Technology, Sweden
Corrado Aaron Visaggio    University of Sannio, Italy
Tomas Vojnar              Brno University of Technology, Czech Republic
Bartosz Walter            Poznań University of Technology, Poland
Jaroslav Zendulka         Brno University of Technology, Czech Republic
Krzysztof Zielinski       AGH University of Science and Technology, Poland
Additional Reviewers

Adamek, Jiri
Bleul, Steffen
Fryzlewicz, Zbigniew
Gall, Dariusz
Habermehl, Peter
Henkler, Stefan
Hirsch, Martin
Holik, Lukas
Khan, Mohammad Ullah
Krena, Bohuslav
Letko, Zdenek
Michalik, Bartosz
Ochodek, Miroslaw
Parizek, Pavel
Reichle, Roland
Skubch, Hendrik
Sudmann, Oliver
von Detten, Markus
Vranic, Valentino
Table of Contents
Keynote

Towards a Compiler for Business-IT Systems: A Vision Statement Complemented with a Research Agenda
    Jana Koehler, Thomas Gschwind, Jochen Küster, Hagen Völzer, and Olaf Zimmermann

Requirements Specification

Towards Use-Cases Benchmark
    Bartosz Alchimowicz, Jakub Jurkiewicz, Jerzy Nawrocki, and Miroslaw Ochodek

Automated Generation of Implementation from Textual System Requirements
    Jan Franců and Petr Hnětynka

Enhancing Use Cases with Screen Designs
    Łukasz Olek, Jerzy Nawrocki, and Miroslaw Ochodek

Design

Mining Design Patterns from Existing Projects Using Static and Run-Time Analysis
    Michal Dobiš and Ľubomír Majtás

Transformational Design of Business Processes for SOA
    Andrzej Ratkowski and Andrzej Zalewski

Service-Based Realization of Business Processes Driven by Control-Flow Patterns
    Petr Weiss

Modeling

SMA—The Smyle Modeling Approach
    Benedikt Bollig, Joost-Pieter Katoen, Carsten Kern, and Martin Leucker

Open Work of Two-Hemisphere Model Transformation Definition into UML Class Diagram in the Context of MDA
    Oksana Nikiforova and Natalja Pavlova

HTCPNs–Based Tool for Web–Server Clusters Development
    Slawomir Samolej and Tomasz Szmuc

Software Product Lines

Software Product Line Adoption – Guidelines from a Case Study
    Pasi Kuvaja, Jouni Similä, and Hanna Hanhela

Refactoring the Documentation of Software Product Lines
    Konstantin Romanovsky, Dmitry Koznov, and Leonid Minchin

Code Generation

Code Generation for a Bi-dimensional Composition Mechanism
    Jacky Estublier, Anca Daniela Ionita, and Tam Nguyen

Advanced Data Organization for Java-Powered Mobile Devices
    Tomáš Tureček and Petr Šaloun

Developing Applications with Aspect-Oriented Change Realization
    Valentino Vranić, Michal Bebjak, Radoslav Menkyna, and Peter Dolog

Project Management

Assessing the Quality of Quality Gate Reference Processes
    Thomas Flohr

Exploratory Comparison of Expert and Novice Pair Programmers
    Andreas Höfer

State of the Practice in Software Effort Estimation: A Survey and Literature Review
    Adam Trendowicz, Jürgen Münch, and Ross Jeffery

Quality

Testing of Heuristic Methods: A Case Study of Greedy Algorithm
    A.C. Barus, T.Y. Chen, D. Grant, Fei-Ching Kuo, and M.F. Lau

A Framework for Defect Prediction in Specific Software Project Contexts
    Dindin Wahyudin, Rudolf Ramler, and Stefan Biffl

Meeting Organisational Needs and Quality Assurance through Balancing Agile and Formal Usability Testing Results
    Jeff Winter, Kari Rönkkö, Mårten Ahlberg, and Jo Hotchkiss

Author Index
Towards a Compiler for Business-IT Systems
A Vision Statement Complemented with a Research Agenda

Jana Koehler, Thomas Gschwind, Jochen Küster, Hagen Völzer, and Olaf Zimmermann

IBM Zurich Research Laboratory
Säumerstrasse 4, CH-8038 Rüschlikon, Switzerland
Abstract. Business information systems and enterprise applications have continuously evolved into Business-IT systems over the last decades, directly linking and integrating Business Process Management with recent technology evolutions such as Web services and Service-Oriented Architectures. Many of these technological evolutions include areas of past academic research: Business rules closely relate to expert systems, Semantic Web technology uses results from description logics, attempts have been made to compose Web services using intelligent planning techniques, and the analysis of business processes and Web service choreographies often relies on model checking. As such, many of the problems that arise with these new technologies have been solved at least in principle. However, if we try to apply these “in principle” solutions, we are confronted with their failure in practice: many proposed solution techniques do not scale to real-world requirements or they rely on assumptions that are not satisfied by Business-IT systems. As has been observed previously, research in this area is fragmented and does not follow a truly interdisciplinary approach. To overcome this fragmentation, we propose the vision of a compiler for Business-IT systems that takes business process specifications described at various degrees of detail as input and compiles them into executable IT systems. As in any classical compiler, the parsing, analysis, optimization, code generation and linking phases are supported. We describe a set of ten research problems that we see as critical to bring our compiler vision to reality.
Z. Huzar et al. (Eds.): CEE-SET 2008, LNCS 4980, pp. 1–19, 2011. © IFIP International Federation for Information Processing 2011

1 Introduction

Business processes comprise sequences of activities that bring people, IT systems, and other machines together to act on information and raw materials. They allow a business to produce goods and services and deliver them to its customers. A business is a collection of business processes, and thus (re-)engineering business processes to be efficient is one of the primary functions of a company’s management. This is one of the most fundamental mechanisms that drive advances in our society.

Business Process Management (BPM) is a structured way to manage the life cycle of business processes including their modeling (analysis and design), execution, monitoring, and optimization. Good tools exist that allow business processes to be analyzed and designed in an iterative process. Creating a model for a process provides insights that
allow the design of a process to be improved, in particular if the modeling tool supports process simulation. A Service-Oriented Architecture (SOA), which is often implemented using Web services, is an architectural style that allows IT systems to be integrated in a standard way, which lends itself to the efficient implementation of business processes. SOA also enables efficient process re-engineering as the use of standard programming models and interfaces makes it much simpler to change the way in which the components of business processes are integrated. Once a process is implemented, it can be monitored, e.g., measured, and then further optimized to improve the quality, the performance, or some other aspect of the process. Despite the progress that has been made recently in business process design and modeling on the one hand, and their execution and monitoring on the other, there is a significant gap in the overall BPM life cycle, which severely limits the ability of companies to realize the benefits of BPM. No complete solution exists to automate the translation from business process models to executable business processes. While partial solutions exist that allow some process models to be mapped to implementations (workflows), scalable and automated approaches do not exist that would support businesses in the full exploitation of the BPM life cycle to achieve rapid improvements of their business processes. Thus, many of the benefits of process modeling and SOA cannot be realized and effective BPM remains a vision. As has been observed previously, research in this area is fragmented and does not follow a truly interdisciplinary approach. 
This lack of interdisciplinary research is seen as a major impediment that limits added economic growth through deployment and use of services technology [1]:

    The subject of SOC¹ is vast and enormously complex, spanning many concepts and technologies that find their origins in diverse disciplines that are woven together in an intricate manner. In addition, there is a need to merge technology with an understanding of business processes and organizational structures, a combination of recognizing an enterprise’s pain points and the potential solutions that can be applied to correct them. The material in research spans an immense and diverse spectrum of literature, in origin and in character. As a result research activities at both worldwide as well as at European level are very fragmented. This necessitates that a broader vision and perspective be established one that permeates and transforms the fundamental requirements of complex applications that require the use of the SOC paradigm.

¹ SOC stands for Service-Oriented Computing, a term that is also used to denote research in the area of BPM and SOA. The still evolving terminology is a further indicator of the emerging nature of this new research field.

This paper presents a concrete technological vision and foundation to overcome the fragmentation of research in the BPM/SOA area. We propose the vision of a compiler for Business-IT systems that takes business process specifications described at various degrees of detail as input and compiles them into executable IT systems. This may sound rather adventurous; however, recall how the first compiler pioneers were questioned when they suggested to programmers that they should move from hand-written
assembly code to abstract programming languages from which machine-generated code would then be produced. The emerging new process-oriented programming languages such as the Business Process Execution Language (BPEL) [2] or the Business Process Modeling Notation (BPMN) [3] are examples of languages that are input to such a compiler. In particular, the upcoming version 2 of BPMN [3] can be considered as a language that allows non-technical users to describe business processes in a graphical notation, while technical users can enrich these descriptions textually until the implementation of the business process is completely specified. With its formally defined execution semantics, BPMN 2.0 is directly executable. However, direct execution in the literal sense means following an interpreter-based approach with all its shortcomings, which is not desirable.

The envisioned compiler does not take a high-level business process model, magically add all the missing pieces of information, and then compile it into executable code. The need to go from an analysis model to a design model in an iterative refinement process that involves human experts does not disappear. The compiler enables human experts to more easily check and validate their process model programs. The validation helps them in determining the sources of errors as well as the information that is missing. Only once all the required information is available can the code be completely generated.

The compiler approach also extends the Model-Driven Architecture (MDA) vision: beyond model transformations that provide a mapping between models with different abstractions, we combine code generation with powerful analytical techniques. Static analysis is performed, yielding detailed diagnostic information, and structural representations similar to the abstract syntax tree are used by the Business-IT systems compiler.
This provides a more complete understanding of the process models, which is the basis for error handling, correct translation, and runtime execution.

The paper is organized as follows: In Section 2, we motivate the need for a Business-IT systems compiler by looking at BPM life cycle challenges. In Section 3, we summarize the ten research problems that we consider as particularly interesting and important to solve. In Section 4, we discuss the problems in more detail and review selected related work. Section 5 concludes the paper.
2 Life Cycle Challenges in Business-IT Systems

The IT infrastructure underlying a business is a critical success factor. Even when IT is positioned as a commodity, such as by Nicholas Carr in “IT Doesn’t Matter,” it is emphasized that a disruptive new technology has arrived, which requires companies to master the economic forces that the new technology is unleashing. Smaller companies in particular are not well positioned in this situation. The new technology in the form of Business Process Management (BPM) and Service-Oriented Architecture (SOA) is complex to use and still undergoing significant changes, and it is difficult even for the expert to distinguish hype from mature technology development.

There is wide agreement that business processes are the central focus area of the new technology wave. On the one hand, business processes are undergoing dramatic change made possible by the technology. On the other hand, increasing needs in making business processes more flexible, while retaining their integrity and compliance with legal regulations, continue to drive technology advances in this space. A prominent failure in process integrity is the financial crisis that emerged throughout 2008.

Fig. 1. Driving Forces behind the Life Cycle of Business Processes

Figure 1 reflects our current view of the driving forces behind the life cycle of business processes. Two interleaved trends of commoditization and innovation have to be mastered, which involves the solution of many technical problems. Let us discuss this picture to explain why this is a challenge for most businesses today and why underpinning business process innovation with compiler technology is essential to master the innovation challenge.

Let us begin in the lower left corner with the As-Is Process box. This box describes the present situation of a business. Any business has many processes implemented, and many of them run today in an IT-supported environment. As such they were derived from some As-Is design model and deployed. Sometimes, the design model is simply the code. The As-Is design model is linked to an As-Is analysis model. Here, we adopt the terminology from software modeling that distinguishes between an analysis model (in our case, the business view on the processes) and the design model (in our case, the implemented processes). The analysis model is usually an abstraction from the design model, i.e., a common view of the business on the implemented processes exists in many companies.² The direct linkage between the business view on the processes and their implementation constitutes the Business-IT system.

² By monitoring or mining the running processes or analyzing and abstracting the underlying design model or code in some form, an analysis model can also be produced in an automatic or semi-automatic manner, but this is beyond the focus of this paper.
The As-Is processes implemented by the players in an industry represent the state of the art of the Business-IT systems. Industries tend to develop a solid understanding of good and bad practices and often develop best practice reference solutions. Very often, consulting firms also specialize in helping businesses understand and adopt these best practice processes. Today, one can even see a trend beyond adoption. For example, in the financial industry one can see first trends towards standardized processes that are closely linked to new regulations. This clearly creates a trend of commoditization forcing companies to adopt the new regulation solutions. With that we have arrived at the upper right corner of the picture.

The commoditization trend is pervasive in the economic model of the western society and it is as such not surprising that it now reaches into business processes. However, in a profit-driven economy, commoditization is not desirable as it erodes profit. Businesses are thus forced to escape the commoditization trap, which they mostly approach by either adopting new technologies or by inventing new business models. Both approaches directly lead to innovations in the business processes. In the picture above, the new business processes that result from the innovation trend are shown as the To-Be processes, which have to accommodate commoditization requirements and include innovation elements at the same time. The To-Be innovation must also be evident with respect to the As-Is process and the best practice process, which is illustrated in the picture with the reference to a delta analysis involving the three process models.

The To-Be process is usually (but not always) initiated at the analysis level, i.e., the business develops a need for change and begins to define this change. The To-Be analysis model must be refined into a To-Be design model and then taken through a business-driven development and IT-architectural decision process that is very complex today. With the successful completion of the development, the To-Be process becomes the new As-Is process. With that the trends of commoditization and innovation repeat within the life cycle of business processes [4,5].

This paper focuses on the technological underpinnings for business process innovation, i.e., the adoption of best practice processes, their combination with innovative elements, and the replacement of the As-Is process by the To-Be process. We investigate these challenges from a strictly technological point of view and identify a number of specific problems that are yet unsolved, but have to be solved in order to support businesses in their innovation needs. Problems of process abstraction, harvesting, and standardization are also of general interest, but are outside the scope of this vision, as is a study of the economic or social effects of what has been discussed above.
3 Compilation Phases and Associated Research Problems

Our main goal is to understand how a compiler for Business-IT systems works. At its core, we see the compilation of business process models, which constitutes a well-defined problem. In the following, we relate the principal functionalities of a programming language compiler to the corresponding problems of compiling a business process model. Following Muchnick [6], “compilers are tools that generate efficient mappings from programs to machines”. Muchnick also points out that languages, machines, and target architectures continue to change and that the programs become ever more ambitious in their scale and complexity. In our understanding, languages such as BPMN are the
new forms of programs and SOA is a new type of architecture that we have to tackle with compilers. A compiler-oriented approach helps to solve the business problems and to address the technical challenges around BPM/SOA. For example, verifying the compliance and integrity of a business with legal requirements must rely on a formal foundation. Furthermore, agility in responding to innovation requires a higher degree of automation. At a high level, a compiler works in the following five phases:

1. Lexical analysis and parsing
2. Structural and semantic analysis
3. Translation and intermediate code generation
4. Optimization
5. Final assembly and linking and further optimization
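The five phases above can be sketched as a chain of functions. The following is only a minimal illustration under strong assumptions that are not part of the paper: process models are written as hypothetical `a -> b` text lines, only purely sequential processes are translated, the "optimization" merely drops tasks declared as no-ops, and "linking" looks up task names in a hypothetical service registry with placeholder URLs.

```python
def parse(source):
    """Phase 1: turn 'a -> b' lines into a list of (source, target) flow edges."""
    edges = []
    for line in source.strip().splitlines():
        left, right = (part.strip() for part in line.split("->"))
        edges.append((left, right))
    return edges


def analyze(edges, start):
    """Phase 2: reject models that contain tasks unreachable from the start task."""
    succ = {}
    for a, b in edges:
        succ.setdefault(a, []).append(b)
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(succ.get(node, []))
    unreachable = {task for edge in edges for task in edge} - seen
    if unreachable:
        raise ValueError(f"unreachable tasks: {sorted(unreachable)}")
    return succ


def translate(succ, start):
    """Phase 3: emit intermediate code; this sketch handles sequences only."""
    code, node, seen = [], start, set()
    while node is not None:
        following = succ.get(node, [])
        if node in seen or len(following) > 1:
            raise ValueError("this sketch only translates sequential processes")
        seen.add(node)
        code.append(("invoke", node))
        node = following[0] if following else None
    return code


def optimize(code, noops):
    """Phase 4: drop steps for tasks that are declared to have no effect."""
    return [step for step in code if step[1] not in noops]


def link(code, registry):
    """Phase 5: bind each remaining task name to a service endpoint."""
    return [(op, registry[task]) for op, task in code]


def compile_process(source, start, noops, registry):
    """Run all five phases in order."""
    succ = analyze(parse(source), start)
    return link(optimize(translate(succ, start), noops), registry)


if __name__ == "__main__":
    registry = {"receive": "http://example.org/receive",
                "ship": "http://example.org/ship"}
    print(compile_process("receive -> log\nlog -> ship", "receive", {"log"}, registry))
```

Each phase consumes the output of the previous one, so diagnostics such as the unreachable-task error surface before any code is generated, which mirrors the role the paper assigns to analysis ahead of translation.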
We envision the compiler for Business-IT systems to work in the same five phases. While we consider the parsing and lexical analysis phase as essentially being solved by our previous and current work, we propose ten specific research problems that address key problems for the subsequent phases. The list below briefly summarizes the ten problems. In Section 4, they are discussed in more detail.

1. Lexical analysis and parsing: We developed the Process Structure Tree (PST) as a unique decomposition of the workflow graph, which underlies any business process model, into a tree of fragments that can be computed in linear time. The PST plays the same role in the Business-IT systems compiler as the Abstract Syntax Tree (AST) in a classical compiler.

2. Structural and semantic analysis: We developed a control-flow analysis for workflow graphs that exploits the PST and demonstrates its usefulness, but which can still be significantly expanded in terms of the analysis results it delivers as well as the scope of models to which it can be applied.

   Problem 1: Clarify the role of orchestrations and choreographies in the compiler. Process models describe the flow of tasks for one partner (orchestration) as well as the communication between several partners (choreographies). Structural and semantic analysis must be extended to choreographies and orchestration models. Furthermore, it must be clarified which role choreography specifications play during the compilation process.

   Problem 2: Solve the flow separation problem for arbitrary process orchestrations. Process orchestrations can contain specifications of normal as well as error-handling flows. Both flows can be interwoven in an unstructured diagram, with their separation being a difficult, not yet well-understood problem.

   Problem 3: Transfer and extend data-flow analysis techniques from classical compilers to Business-IT systems compilers. Processes manipulate business data, which is captured as data flow in process models. Successful techniques such as Concurrent Static Single Assignment (CSSA) must be transferred to the Business-IT systems compiler.

   Problem 4: Solve the temporal projection problem for arbitrary process orchestrations. Process models are commonly annotated with information about states and events. This information is usually available at the level of a single task, but must be propagated over process fragments, which can exhibit a complex structure including cycles.
Problem 5: Develop scalable methods to verify the termination of a process choreography returning detailed diagnostic information in case of failure. Correctly specifying the interaction between partners that execute complex process orchestrations in a choreography model is a challenging modeling task for humans. In particular, determining whether the orchestration terminates is a fundamental analysis technique that the compiler must provide in a scalable manner. 3. Translation and intermediate code generation: Many attempts exist to translate business-level process languages such as BPMN to those languages used by the runtime such as BPEL. None of the proposed approaches is satisfying due to strong limitations in the subsets of the languages that can be handled and the quality of the generated code, which is often verbose. Major efforts have to be made to improve the current situation. Problem 6: Define a translation from BPMN to BPEL and precisely characterize the maximal set(s) of BPMN diagrams that are translatable to structured BPEL.3 4. Optimization: Code generated today or attempts to natively execute processoriented languages are very limited with respect to the further optimization of the code. Specific characteristics of the target IT architecture are rarely taken into account. Problem 7: Define execution optimization techniques for the Business-IT systems compiler. Until today, business processes are usually optimized with respect to their costs. No optimization of a process with respect to the desired target platform happens automatically as it is available in a classical compiler. It is an open question which optimizations should be applied when processes are compiled for a Service-Oriented Architecture. 5. Final assembly, linking and further optimization: Assembly and linking problems in a Service-Oriented Architecture immediately define problems of Web service reuse and composition, for which no satisfying solutions have been found yet. 
Problem 8: Redefine the Web service composition problem such that it is grounded in realistic assumptions and delivers scalable solutions. Web service composition is studied today mostly from a Semantic Web perspective assuming that rich semantic annotations are available that are provided by humans. A compiler, however, should be able to perform the composition and linking of service components without requiring such annotations.
Problem 9: Redefine the adapter synthesis problem by taking into consideration constraints that occur in business scenarios. Incorrect choreographies have to be repaired. Often, this is achieved by not changing the processes that are involved in the choreography, but by synthesizing an adapter that allows the partners to successfully communicate with each other. Such an adapter often must include comprehensive protocol mediation capabilities. So far, no satisfying solutions have been found for this problem and we argue that it must be reformulated under realistic constraints.
³ We only define a single problem focusing on the challenge of BPMN-BPEL translation, although the question of an adequate BPEL-independent “byte-code” level for BPMN is also very interesting and deserves further study.
J. Koehler et al.
Problem 10: Demonstrate how IT architectural knowledge and decisions are used within the compiler. The target platform for the Business-IT systems compiler is a Service-Oriented Architecture. Architectural decision making is increasingly done with tools that make architectural decisions explicit and manage their consistency. These decisions can thus become part of the compilation process, making it easier to compile processes for different back end systems.
The positioning of these ten problems within the various compilation phases makes it possible for researchers to tackle them systematically, study their interrelationships, and solve the problems under realistic boundary constraints. Our vision allows us to position problems that have previously been tackled in isolation in a consistent and comprehensive framework. This can lead to synergies between the various possible solution techniques and allows researchers to transfer techniques that were successful in one problem space to another. Our vision provides researchers with continuity in the technological development, with compilers tackling increasingly complex languages and architectures. Solving the ten research problems would have a significant impact on integrity, agility, and automation within BPM/SOA.
4 A Deeper Dive into the Research Problems
A compiler significantly increases the quality of the produced solution and provides clearer traceability. Approaches of manual translation are envisioned to be replaced by tool-supported refinement steps guided by detailed diagnostic information. The optimization of Business-IT systems with respect to their execution becomes possible, which can be expected to lead to systems with greater flexibility, making it easier for businesses to follow the life cycle of process innovation. A compiler can also help in automating many manual steps and can be expected to produce higher-quality results than those obtained by manual, unsupported refinement and implementation steps.
With the compiler approach, we propose to go beyond the Model-Driven Architecture (MDA) vision, which proposes models at different levels of abstraction and model transformations to go from a more abstract to a more refined model. Two problems prevent MDA from being fully workable for BPM/SOA. First, model transformations, if used at all, are written mostly in an ad-hoc manner in industrial projects today. They rarely use powerful analytical techniques such as the static analysis performed by compilers, nor do they exploit structural representations similar to the abstract syntax tree that a compiler builds for a program. Second, too many different models result from the transformations, and traceability between these models remains an unsolved problem.
In the following, we review selected related work in the context of the five phases. The review is not a comprehensive survey of the state of the art. We focus on where we stand in our own research with respect to the compiler for Business-IT systems and point by example to other existing work in various fields of computer science that we consider relevant when tackling the problems that we define for the five phases.
Fig. 2. Workflow graph of an example process model in a UML Activity Diagram-like notation
4.1 Parsing
The parsing problem for business process models has not yet been widely recognized by the BPM community as an important problem. Figure 2 shows a typical workflow graph underlying any business process model. It includes activities ai, decisions di, merges mi (for alternative branching and joining) as well as forks fi and joins ji (for parallel branching and joining). Today, process-oriented tools treat such models as large, unstructured graphs. No data structure such as the Abstract Syntax Tree (AST) used by compilers is available in these tools. In our own research, we developed the Process Structure Tree (PST) [7,8], which we consider to be the analogue of the AST for Business-IT systems compilers. The PST is a fundamental data structure for all the subsequent phases of a compilation. By applying techniques from the analysis of program structure trees [9,10,11,12] to business process models, a unique decomposition of process models into a tree of fragments can be computed using a linear-time algorithm. The PST is a significant improvement compared to approaches that use graph grammars to parse the visual language, which takes exponential time in most cases [13]. Figure 3 shows the PST for the workflow graph of Figure 2.
Fig. 3. Process Structure Tree (PST) for the workflow graph of Figure 2
With this, we believe that the parsing problem for the Business-IT systems compiler is solved for the near future. Additional improvements can be imagined, but in the
following we concentrate on the other phases of the compiler, which helps to validate that the PST is indeed as powerful as the AST.
4.2 Structural and Semantic Analysis
In our own research, we have developed two types of analysis based on the PST: a) a control-flow analysis [7,14] and b) an approach to the structural comparison and difference analysis of process models [15]. Both demonstrated that the PST is an essential prerequisite and a powerful data structure to implement various forms of analyses. In the following, we briefly summarize our current insights into the analysis problem and identify a set of concrete problems that we consider especially relevant and interesting.
A business process model is also often referred to as a process orchestration. A process orchestration (the control- or sequence flow) describes how a single business process is composed out of process tasks and subprocesses. In a SOA implementation, each task or subprocess is implemented as a service, where services can also be complex computations encapsulating other process orchestrations. In contrast to an orchestration, a process choreography describes the communication and thus the dependencies between several process orchestrations. Note that the distinction between orchestration and choreography is a “soft” one and usually depends on the point of view of the modeler. An example of a simple process orchestration and choreography specification in BPMN is shown in Figure 4, taken from the BPMN 1.1 specification [3]. The figure shows an abstract process Patient and a concrete process Doctor’s office. The Doctor’s office process orchestration is a simple sequence of tasks. The dotted lines between the two processes represent an initial and incomplete description of the choreography by showing the messages flowing between the two processes.⁴
Fig. 4. Choreographies and orchestrations in the Doctor’s Office example process
⁴ Note that the clarification and formal definition of the semantics of BPMN is another focus area of our work. However, developing the fundamental techniques for a Business-IT compiler does not require BPMN as a prerequisite. Related well-defined languages such as Petri nets or workflow graphs can also be assumed. Nevertheless, we plan to apply our techniques to BPMN due to the growing practical relevance of the language.
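The workflow graphs discussed in Section 4.1 can be made concrete with a small data structure. The following is a minimal sketch, not the representation used in [7,8]: it assumes a graph with the five node kinds named in the text (activities, decisions, merges, forks, joins) and checks only that each node's in- and out-degree fits its kind. The class name `WorkflowGraph`, the method `degree_violations`, and the tiny example graph are hypothetical and chosen for illustration; the example is not the graph of Figure 2.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    ACTIVITY = "activity"
    DECISION = "decision"  # alternative branching
    MERGE = "merge"        # alternative joining
    FORK = "fork"          # parallel branching
    JOIN = "join"          # parallel joining


@dataclass
class WorkflowGraph:
    kinds: dict = field(default_factory=dict)  # node name -> Kind
    succ: dict = field(default_factory=dict)   # node name -> successor names
    pred: dict = field(default_factory=dict)   # node name -> predecessor names

    def add_node(self, name, kind):
        self.kinds[name] = kind
        self.succ[name] = []
        self.pred[name] = []

    def add_edge(self, src, dst):
        self.succ[src].append(dst)
        self.pred[dst].append(src)

    def degree_violations(self):
        """Return the nodes whose in/out degree does not fit their kind."""
        bad = []
        for name, kind in self.kinds.items():
            ins, outs = len(self.pred[name]), len(self.succ[name])
            if kind in (Kind.DECISION, Kind.FORK):
                ok = ins <= 1 and outs >= 2   # one way in, several ways out
            elif kind in (Kind.MERGE, Kind.JOIN):
                ok = ins >= 2 and outs <= 1   # several ways in, one way out
            else:
                ok = ins <= 1 and outs <= 1   # activities do not branch
            if not ok:
                bad.append(name)
        return bad


def build_example():
    """Hypothetical graph: a1, then a2 and a3 in parallel, then a4."""
    g = WorkflowGraph()
    for name, kind in [("a1", Kind.ACTIVITY), ("f1", Kind.FORK),
                       ("a2", Kind.ACTIVITY), ("a3", Kind.ACTIVITY),
                       ("j1", Kind.JOIN), ("a4", Kind.ACTIVITY)]:
        g.add_node(name, kind)
    for src, dst in [("a1", "f1"), ("f1", "a2"), ("f1", "a3"),
                     ("a2", "j1"), ("a3", "j1"), ("j1", "a4")]:
        g.add_edge(src, dst)
    return g
```

A degree check of this kind is only a syntactic precondition. It says nothing about soundness, which is the subject of the control-flow analysis discussed in this section.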
Problem 1: Clarify the role of orchestrations and choreographies in the compiler. Our compiler needs to be able to analyze orchestrations as well as choreographies. However, it is not fully clear at which phase choreography information is relevant for the compilation. It is clearly relevant in the assembly and linking phase when an entire Business-IT system is built, but one can also imagine that the optimization of an orchestration can be specific to a given choreography in order to better address the desired target architecture.
Another fundamental question for the analysis is the detection of control- and data-flow errors. In the context of a process orchestration, verification techniques have been widely used, e.g., [16], to find errors in the specified control flows. To the best of our knowledge, compiler techniques have not yet been considered. Verification of business processes is an area of research that has established itself over the last decade. Locating errors in business processes is important in particular because of the side effects that processes have on data. Processes that do not terminate correctly because of deadlocks or processes that exhibit unintended execution traces due to a lack of synchronization often leave data in inconsistent states [17]. Common approaches to process verification usually take a business process model, translate it into a Petri net or another form of state-based encoding, and then run a Petri-net analysis tool or model checker on the encoding. Examples are the Woflan tool [18] or the application of SMV or Spin to BPEL verification, e.g., [19,20]. In principle, these approaches make it possible to detect errors in business processes. However, there are severe limitations that have so far prevented the adoption of the proposed solutions in industrial tools:
– Encodings are usually of exponential size compared to the original size of the process model.
– The verification tools in use do not give diagnostic information detailed enough to allow an end user to easily correct errors; it has turned out in practice that counterexample traces only rarely point to the real cause of an error.
– The approaches often make restrictive assumptions on the subclass of process models that they can handle.
Consequently, the currently available solutions are only partially applicable in practice due to their long runtimes, the lack of suitable diagnostic information, and the restrictions on the defined encodings.
In our own work, we have followed a different approach. First, we analyzed hundreds of real-world business processes and identified commonly occurring so-called anti-patterns [21]. Secondly, we used the PST as the unique parse tree of a process model to speed up the verification of the process [7,14]. Each fragment in a PST can be analyzed in isolation because the tree decomposition ensures that a process model is sound if each fragment is sound. Many fragments exhibit a simplified structure and their soundness (i.e., the absence of deadlocks and lack of synchronization errors) can be verified by matching them against patterns and anti-patterns. Only a small number of fragments remains to which verification methods such as model checking must be
Fig. 5. Example of a business process model in BPMN showing an error-handling flow and data
applied. Furthermore, the size of a fragment is usually small in practice, which results in a significant state-space reduction. Consequently, the resulting combination of verification techniques with structural analysis leads to a complete verification method that is low polynomial in practice with worst-case instances only occurring rarely. As each error is local to a fragment, this method also returns precise diagnostic information. Implementation of the work [14] showed that the soundness of even the largest business process models that we observed in practice can be completely analyzed within a few milliseconds. Consequently, the technology can be made available to users of modeling tools where they obtain instant feedback. Patterns and refactoring operations [22,23] can be provided to users to help them correct the detected modeling errors easily. The patterns and refactoring operations take advantage of the fine-grained diagnostic information and the PST to support users in accomplishing complicated editing steps in a semi-automatic and correct manner. With these results, a major step forward has been made. Still, two problems remain. First, the control-flow analysis must be extended to process models that are enriched with the description of error-handling or compensation flow as it is possible in BPMN. Secondly, no sufficient data-flow analysis techniques are yet available to analyze business processes. Figure 5 illustrates two more problems that we propose to investigate in more detail.
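The fragment-wise soundness check described above can be sketched as a lookup of each fragment's entry and exit gateway against a small table of patterns and anti-patterns. This is an illustrative simplification, not the algorithm of [7,14]: it assumes that every fragment is acyclic, that a decision activates exactly one branch while a fork activates all of them, and that a fragment is fully characterized by its entry and exit node kinds. The `Fragment` class, the `VERDICTS` table, and the example tree are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Fragment:
    name: str
    entry: str = "sequence"  # "decision", "fork", or "sequence"
    exit: str = "sequence"   # "merge", "join", or "sequence"
    children: list = field(default_factory=list)


# Simplified heuristics, assumed for illustration:
# a join waits for all branches but a decision activates only one (deadlock);
# a merge fires once per branch but a fork activates all (lack of synchronization).
VERDICTS = {
    ("sequence", "sequence"): "sound",
    ("decision", "merge"): "sound",
    ("fork", "join"): "sound",
    ("decision", "join"): "deadlock",
    ("fork", "merge"): "lack of synchronization",
}


def check(fragment):
    """Return (fragment name, verdict) for every fragment not known to be sound."""
    findings = []
    verdict = VERDICTS.get((fragment.entry, fragment.exit), "needs model checking")
    if verdict != "sound":
        findings.append((fragment.name, verdict))
    for child in fragment.children:
        # Each fragment is judged in isolation, so diagnostics stay local.
        findings.extend(check(child))
    return findings


# Hypothetical fragment tree with two modeling errors.
root = Fragment("root", children=[
    Fragment("A", "fork", "join", children=[Fragment("B", "decision", "join")]),
    Fragment("C", "fork", "merge"),
])
```

Because every finding names the fragment in which it occurs, a modeling tool could highlight exactly that region of the diagram. Fragments that match no table entry fall through to "needs model checking", mirroring the small remainder of fragments mentioned in the text.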
Problem 2: Solve the flow separation problem for arbitrary process orchestrations. Figure 5 shows a repetitive process where a task T1 is executed followed by a task T2. T1 has some data object as output. During the execution of T1, some compensation event can occur that requires task T3 to execute. When the compensation is finished, the process continues with T2. BPMN allows business users to freely draw “normal” flows as well as error-handling flows within the same process model. An error-handling flow can branch off in some task interrupting the normal flow and then merge back later into the normal flow. For a process without cycles, it is relatively easy to tell from the process model where normal and error-handling flows begin and end. For processes with cycles, this is much more complicated and constitutes an unsolved problem that we denote as the “flow separation problem.” A solution to this problem requires the definition of the semantics of error-handling flows. Furthermore, an error-handling flow must always be properly linked to a well-defined part of the normal flow, which is usually called the
scope. Computing the scope of an error handling flow from an unstructured process model is an open problem.
Problem 3: Transfer and extend data-flow analysis techniques from classical compilers to Business-IT systems compilers. Data-flow analysis for unstructured business process models is also a largely unsolved problem. Figure 5 shows some data object as an output of task T1. Large diagrams often refer to many different types of data objects as the inputs and outputs of tasks. Furthermore, decision conditions in the branching points of process flows often refer to data objects. Users who work with process models are interested in answering many questions around data such as whether data input is available for a task, whether data can be simultaneously accessed by tasks running in parallel in a process, or whether certain decision conditions can ever become true given certain data, i.e., whether there are flows in the process that can never execute. An immediate candidate technique is the Concurrent Static Single Assignment approach [24], which we have begun to explore. Data-flow analysis is also a prerequisite to answer questions such as whether a compensation flow really compensates for the effects of a failed normal flow.
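One of the questions listed under Problem 3, whether data input is available for a task, can be approximated with a classical forward "must" analysis from compiler construction. The sketch below is not the Concurrent Static Single Assignment approach of [24]; it is a plain fixed-point iteration that computes, for each node, the data objects written on every path from the start node. It conservatively treats every branching as alternative, so data written in only one parallel branch is reported as possibly missing after a join. The function name `missing_inputs` and the example process are hypothetical.

```python
def missing_inputs(succ, reads, writes, start):
    """Map each task to the data objects it reads that may not have been written.

    succ:   node -> list of successor nodes (every node must appear as a key)
    reads:  node -> set of data objects the node needs as input
    writes: node -> set of data objects the node produces
    """
    nodes = list(succ)
    pred = {n: [] for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            pred[t].append(n)

    # Start optimistic (everything available) and shrink to a fixed point.
    universe = set().union(*writes.values())
    avail_in = {n: set(universe) for n in nodes}
    avail_in[start] = set()

    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == start:
                continue
            incoming = [avail_in[p] | writes.get(p, set()) for p in pred[n]]
            # Available on entry = available at the end of *every* predecessor.
            new_in = set.intersection(*incoming) if incoming else set()
            if new_in != avail_in[n]:
                avail_in[n] = new_in
                changed = True

    report = {}
    for n in nodes:
        missing = reads.get(n, set()) - avail_in[n]
        if missing:
            report[n] = missing
    return report


# Hypothetical repetitive process: a decision d leads either to T1 or to T3,
# both merge in m, T2 follows, and the process may loop back to d.
succ = {"start": ["d"], "d": ["T1", "T3"], "T1": ["m"], "T3": ["m"],
        "m": ["T2"], "T2": ["d", "end"], "end": []}
reads = {"T2": {"invoice"}}
writes_faulty = {"T1": {"invoice"}}                       # T3 writes nothing
writes_fixed = {"T1": {"invoice"}, "T3": {"invoice"}}     # both branches write
```

With `writes_faulty`, the analysis reports that T2 may run without its input because the branch through T3 never produces it. With `writes_fixed`, the report is empty. The iteration terminates on cyclic graphs because the sets only shrink.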
Problem 4: Solve the temporal projection problem for arbitrary process orchestrations. Recently, additional knowledge about the process behavior has been added to process models in the form of semantic annotations. These annotations take the form of formally specified pre- and postconditions or simple attribute-value pairs. A tool should be able to reason about these semantic annotations, for example to conclude what pre- and postconditions hold for a complex process fragment containing cycles when the control flow is specified and the pre- and postconditions of the individual tasks are known. This problem of computing the consequences of a set of events has been studied as the so-called Temporal Projection problem in the area of Artificial Intelligence (AI) planning [25], and regression and progression techniques have been developed. Unfortunately, AI plans exhibit a much simpler structure than process models; in particular, they are acyclic, so the existing techniques are not directly applicable. A solution to the temporal projection problem is important for the analysis of data flows as well as for the composition of processes (and services).
These four research problems address major challenges for the analysis phase of the compiler when investigating a process orchestration, i.e., a single process model. For process choreographies that describe the interaction and communication between several processes, we are mostly interested in termination problems. Can two processes successfully communicate with each other such that both terminate?
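The progression technique mentioned under Problem 4 can be illustrated for the simple case that existing AI planning work covers: a linear, acyclic sequence of tasks. The sketch below assumes STRIPS-style annotations, where each task has a precondition set, an add list, and a delete list. It does not address cycles or branching, which are exactly what makes the problem open for process models. The `Task` type, the `progress` function, and the task names are hypothetical.

```python
from typing import NamedTuple


class Task(NamedTuple):
    name: str
    pre: frozenset     # facts that must hold before the task runs
    add: frozenset     # facts that hold after the task
    delete: frozenset  # facts that no longer hold after the task


def progress(state, tasks):
    """Push a set of facts forward through a linear sequence of tasks."""
    facts = set(state)
    for task in tasks:
        missing = task.pre - facts
        if missing:
            raise ValueError(
                f"precondition of {task.name} not met: {sorted(missing)}")
        facts = (facts - task.delete) | task.add
    return facts


# Hypothetical order-handling sequence.
tasks = [
    Task("ReceiveOrder", frozenset(), frozenset({"order_open"}), frozenset()),
    Task("CheckCredit", frozenset({"order_open"}),
         frozenset({"credit_ok"}), frozenset()),
    Task("Ship", frozenset({"order_open", "credit_ok"}),
         frozenset({"shipped"}), frozenset({"order_open"})),
]
final_facts = progress(set(), tasks)
```

The resulting fact set is the postcondition of the whole sequence, here the facts `credit_ok` and `shipped`. Extending this computation to fragments with decisions, parallelism, and cycles would require merging fact sets at joins and iterating to a fixed point, which this sketch deliberately leaves out.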
Problem 5: Develop scalable methods to verify the termination of a process choreography returning detailed diagnostic information in case of failure. If a process choreography is fully specified, this question can be precisely answered. Even in the case of abstract models and underspecified choreographies such as in the
example of Figure 4, interesting questions can be asked and answered. For example, which flow constraints must the abstract Patient process satisfy such that a successful communication with the Doctor’s office is possible? Previous work, notably the research on operating guidelines [26,27,28], has provided an initial answer to these questions. The proposed analysis techniques are based on Petri nets, but do not yet scale sufficiently well. Similar to the case of process orchestrations, we are also interested in precise diagnostic information when verifying choreographies.
4.3 Translation and Intermediate Code Generation
For the translation phase, we consider one problem as especially important and want to restrict ourselves to this problem, namely the translation from unstructured BPMN to structured BPEL [2].
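To illustrate what the target of such a translation looks like, the following sketch maps a process that is already block-structured onto nested BPEL-style structured activities. It covers only the easy direction: the input is assumed to be a tree of well-structured blocks (for example, PST fragments that each match a sequence, parallel, or choice pattern), so the hard part of the problem, handling unstructured BPMN, is not addressed. The element names `sequence`, `flow`, `if`, `elseif`, `else`, `condition`, and `invoke` are taken from WS-BPEL 2.0 [2]; the dictionary format, the function `to_bpel`, and the task names are hypothetical, and the output omits namespaces, partner links, and variables, so it is not a deployable BPEL process.

```python
import xml.etree.ElementTree as ET


def to_bpel(block):
    """Translate a block-structured process tree into nested BPEL-like XML."""
    kind = block["kind"]
    if kind == "task":
        return ET.Element("invoke", {"name": block["name"]})
    if kind in ("sequence", "parallel"):
        # Sequential blocks become <sequence>, parallel blocks become <flow>.
        element = ET.Element("sequence" if kind == "sequence" else "flow")
        for child in block["children"]:
            element.append(to_bpel(child))
        return element
    if kind == "choice":
        element = ET.Element("if")
        for index, (condition, branch) in enumerate(block["branches"]):
            # The first branch lives directly in <if>, later ones in <elseif>.
            holder = element if index == 0 else ET.SubElement(element, "elseif")
            ET.SubElement(holder, "condition").text = condition
            holder.append(to_bpel(branch))
        if "otherwise" in block:
            ET.SubElement(element, "else").append(to_bpel(block["otherwise"]))
        return element
    raise ValueError(f"unsupported block kind: {kind}")


# Hypothetical block-structured process.
example = {"kind": "sequence", "children": [
    {"kind": "task", "name": "ReceiveRequest"},
    {"kind": "parallel", "children": [
        {"kind": "task", "name": "CheckStock"},
        {"kind": "task", "name": "CheckCredit"},
    ]},
    {"kind": "choice",
     "branches": [("$approved", {"kind": "task", "name": "Ship"})],
     "otherwise": {"kind": "task", "name": "Reject"}},
]}
bpel = to_bpel(example)
xml_text = ET.tostring(bpel, encoding="unicode")
```

Because the translation is a single recursive descent over the tree, its result does not depend on the order in which rules are applied. Any block kind outside the three supported patterns raises an error, which marks the point where a real translator would need a strategy for unstructured fragments.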
Problem 6: Define a translation from BPMN to BPEL and precisely characterize the maximal set of BPMN diagrams that are translatable to structured BPEL. An example of relevant related work is [29]. The approach exploits a form of structural decomposition, but one that is not as rigorous as the PST, and it therefore leads to non-uniform translation results, i.e., the order of application of the translation rules determines the translation output. It is also important to further improve on the initial insights into the classes of BPMN diagrams that are translatable into structured BPEL. Beyond BPEL, one can also imagine that the translation of BPMN to other runtimes, e.g., ones that use communicating state machines, is of major practical relevance.
4.4 Optimization
The optimization phase for the Business-IT systems compiler is a completely unaddressed research area so far.
Problem 7: Define execution optimization techniques for the Business-IT systems compiler. Classical process optimization, which is mostly performed during a Business Analysis phase, usually focuses on cost minimization. For the compiler, we are envisioning an optimization of processes with respect to their execution on the planned target architecture, but not so much a cost optimization of the process itself. One advantage of compilers is their ability to support multiple platforms. Different architectures, including different styles of SOA, require and enable differences in the process implementation. Optimizations such as load balancing or clustering have for example been studied in the context of J2EE applications [30]. We have some initial insights, but first of all the main goal must be to clarify what can and should happen during the optimization phase.
4.5 Final Assembly, Linking and Further Optimization
For the assembly and linking phase, we see two problem areas that are of particular interest. First, we propose to further study several well-defined synthesis problems. In the literature, two instances of process synthesis problems have been investigated so far. The first is the Web service composition problem, which is mostly tackled using AI planning techniques [31,32].
Problem 8: Redefine the Web service composition problem such that it is grounded in realistic assumptions and delivers scalable solutions. Web service composition tries to assemble a process orchestration from a predefined set of services. It is commonly assumed that the goal for the composition is explicitly given and that services are annotated with pre- and postconditions. Unfortunately, both assumptions are rarely satisfied in practice. In particular, business users usually have a rather implicit understanding of their composition goals. We cannot expect these users to explicitly formulate their goals in some formal language. Furthermore, the processes returned by the proposed methods for service composition are very simple and resemble more those partially-ordered plans as studied by the AI planning community than those processes modeled by BPMN diagrams. The second problem is the adapter synthesis problem, which is addressed by combining model checking techniques with more or less intelligent “guess” algorithms [33,34]. Adapter synthesis tries to resolve problems in a faulty choreography by generating an additional process that allows existing partners to successfully communicate.
Problem 9: Redefine the adapter synthesis problem by taking into consideration constraints that occur in business scenarios. The problem is inherently difficult, in particular due to the unconstrained formulation in which it is studied. Usually, the goal is to generate “some” adapter without formulating any further constraints. Consequently, an infinite search space is opened up and the methods are inherently incomplete. In addition, the synthesized adapters must be verified, because the correctness of the synthesis algorithms is usually not guaranteed. There is thus a wide gap between the currently proposed techniques and the needs of a practically relevant solution. A first goal must therefore be to formulate practically relevant variants of the service composition and adapter synthesis problems. Secondly, solutions to these problems must be worked out that make realistic assumptions, scale to real-world problems and are accepted by the commercial as well as the academic world. An initial goal for these last two research problems is thus to identify realistic problem formulations. For the Web service composition problem, this means replacing the assumptions of explicit goals and pre- and postconditions by the information that is available in real-world use cases of service composition. Furthermore, the composition methods must be embedded into an approach based on iterative process modeling where a human user is involved, similar to what has been studied by the AI planning community under the term of so-called mixed-initiative approaches. It also seems promising to combine such approaches with pattern-based authoring methods similar in
spirit to those known from the object-oriented software engineering community [35], i.e., to provide users with predefined composition problems and proven solutions in the form of composition patterns that they “only” need to instantiate and apply to their problems. The second problem area for the assembly and linking phase focuses on those architectural design decisions that must be taken when compiling business processes to IT systems.
Problem 10: Demonstrate how IT architectural knowledge and decisions are used within the compiler. Today, these decisions are taken by IT architects mostly working with paper and pen. Decisions are not formally represented in tools and no decision-making support is available. Consequently, architectural decisions are not available in a form in which they can really be used by the Business-IT systems compiler. Recent work by others and us has shown that architectural decision making can be systematically supported and that decision alternatives, drivers and dependencies can be explicitly captured in tools and injected into a code-generating process [36,37,38,39]. By separating and validating the architectural decisions, design flaws can be more easily detected and a recompilation of a system for a different architecture becomes more feasible.
With this list of ten specific research problems, the vision of a compiler for Business-IT systems is broken down into a specific set of key problems. We believe that solutions to these problems constitute the essential cornerstones for such a compiler. The positioning of the problems within the various compilation phases makes it possible to tackle them systematically as well as study their relationships and dependencies, and to solve the problems under realistic boundary constraints.
We believe that the compiler vision is a key to overcoming the most urgent problems in the BPM and SOA space. Today, BPM and SOA applications are built from business process models that were drawn in modeling tools that offer little analytical or pattern-based support. From the process analysis models, design models are created by hand by manually translating and refining the information contained in the analysis model. Usually, the direct linkage between analysis and design gets lost during this step. Changes made at the design level are rarely reflected back at the analysis level. Commonly, the business processes are modeled in isolation.
Their interdependencies and communication, their distributed side effects on shared data are rarely captured in models, but remain hidden in hand-written code. Thus, building the applications is expensive, resource-intensive, and often ad-hoc. The resulting BPM and SOA systems are hard to test, to maintain, and to change. A compiler significantly increases the quality of the produced solution and provides clearer traceability. Approaches of manual translation are replaced by tool-supported refinement steps guided by detailed diagnostic information. When embedding the compiler into a development environment supporting the life cycle of process models in horizontal (distributed modeling) and vertical (refinement) scenarios, versions of the process models can be tagged, compared, and merged. Alternative views on the processes for different purposes can also be more easily provided.
The optimization of Business-IT systems with respect to their execution becomes possible, which can be expected to lead to systems with greater flexibility making it easier for businesses to follow the life cycle of process innovation.
5 Conclusion
In this paper, we proposed the vision of a compiler for Business-IT systems that takes business process specifications described at various degrees of detail as input and compiles them into executable IT systems. We defined ten research problems that have to be solved towards creating a compiler for Business-IT systems. Our vision allows us to position problems that have previously been tackled in isolation in a consistent and comprehensive framework.
None of the presented research problems is new. In fact, many research projects have been initiated around them. However, as we tried to outline in the previous discussion, none of these projects has been truly successful, because the developed solutions commonly fail in practice: they either do not scale to the size of real-world examples, do not provide users with the information that they need, or rely on assumptions that do not hold in practice. Nevertheless, many of these research projects have delivered interesting partial solutions that are worth preserving and integrating into a compiler for Business-IT systems. Consequently, many of these results have to be combined with novel “gap-closing” technology that still has to be developed and placed within the vision of the compiler. In many cases, the gap is in fact quite wide, requiring researchers to leave established solution approaches and develop much more than a small delta on top of existing research results.
The ten research problems have been defined at different levels of abstraction. Some are concrete, while others first have to be addressed at the conceptual level before they can be refined into a concrete set of problems. We believe that this mix makes the proposed problems particularly interesting and will enable researchers to drive progress in complementary strands of work.
The positioning of these ten problems within the various compilation phases makes it possible for researchers to tackle them systematically, study their interrelationships, and solve the problems under realistic boundary constraints.
References
1. Papazoglou, M.P., Traverso, P., Dustdar, S., Leymann, F., Krämer, B.J.: Service-oriented computing: A research roadmap. In: Dagstuhl Seminar Proceedings on Service-Oriented Computing (2006)
2. Jordan, D., et al.: Web services business process execution language (WSBPEL) 2.0 (2007), http://www.oasis-open.org/committees/wsbpel/
3. OMG: Business Process Modeling Notation Specification, Version 1.1 (2007)
4. Reichert, M., Dadam, P.: ADEPT Flex - supporting dynamic changes of workflows without losing control. Journal of Intelligent Information Systems 10(2), 93–129 (1998)
5. Rinderle, S., Reichert, M., Dadam, P.: Disjoint and overlapping process changes: Challenges, solutions, applications. In: Chung, S. (ed.) OTM 2004. LNCS, vol. 3290, pp. 101–120. Springer, Heidelberg (2004)
6. Muchnick, S.: Advanced Compiler Design and Implementation. Morgan Kaufmann, San Francisco (1997)
7. Vanhatalo, J., Völzer, H., Leymann, F.: Faster and more focused control-flow analysis for business process models through SESE decomposition. In: Krämer, B.J., Lin, K.-J., Narasimhan, P. (eds.) ICSOC 2007. LNCS, vol. 4749, pp. 43–55. Springer, Heidelberg (2007)
8. Vanhatalo, J., Völzer, H., Koehler, J.: The Refined Process Structure Tree. In: Dumas, M., Reichert, M., Shan, M.-C. (eds.) BPM 2008. LNCS, vol. 5240, pp. 100–115. Springer, Heidelberg (2008)
9. Johnson, R., Pearson, D., Pingali, K.: The program structure tree: Computing control regions in linear time. In: ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 1994), pp. 171–185 (1994)
10. Ananian, C.S.: The static single information form. Master’s thesis, Massachusetts Institute of Technology (September 1999)
11. Valdes-Ayesta, J.: Parsing Flowcharts and series-parallel graphs. PhD thesis, Stanford University (1978)
12. Tarjan, R.E., Valdes, J.: Prime subprogram parsing of a program. In: 7th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pp. 95–105. ACM, New York (1980)
13. Ehrig, H., Engels, G., Kreowski, H.-J., Rozenberg, G.: Handbook of Graph Grammars and Computing by Graph Transformation, vol. 2. World Scientific, Singapore (1999)
14. Favre, C.: Algorithmic verification of business process models. Master’s thesis, École Polytechnique Fédérale de Lausanne (August 2008)
15. Küster, J.M., Gerth, C., Förster, A., Engels, G.: Detecting and resolving process model differences in the absence of a change log. In: Dumas, M., Reichert, M., Shan, M.-C. (eds.) BPM 2008. LNCS, vol. 5240, pp. 244–260. Springer, Heidelberg (2008)
16. Baresi, L., Nitto, E.D.: Test and Analysis of Web Services. Springer, Heidelberg (2007)
17. Leymann, F., Roller, D.: Production Workflow. Prentice Hall, Englewood Cliffs (2000)
18. Verbeek, H.M.W., Basten, T., van der Aalst, W.M.P.: Diagnosing workflow processes using WOFLAN. The Computer Journal 44(4), 246–279 (2001)
19. Fu, X., Bultan, T., Su, J.: Analysis of interacting BPEL Web Services. In: 13th Int. Conference on the World Wide Web (WWW 2004), pp. 621–630. ACM, New York (2004)
20. Trainotti, M., Pistore, M., Calabrese, G., Zacco, G., Lucchese, G., Barbon, F., Bertoli, P.G., Traverso, P.: ASTRO: Supporting Composition and Execution of Web Services. In: Benatallah, B., Casati, F., Traverso, P. (eds.) ICSOC 2005. LNCS, vol. 3826, pp. 495–501. Springer, Heidelberg (2005)
21. Koehler, J., Vanhatalo, J.: Process anti-patterns: How to avoid the common traps of business process modeling. IBM WebSphere Developer Technical Journal 10(2+4) (2007)
22. Koehler, J., Gschwind, T., Küster, J., Pautasso, C., Ryndina, K., Vanhatalo, J., Völzer, H.: Combining quality assurance and model transformations in business-driven development. In: Schürr, A., Nagl, M., Zündorf, A. (eds.) AGTIVE 2007. LNCS, vol. 5088, pp. 1–16. Springer, Heidelberg (2008)
23. Gschwind, T., Koehler, J., Wong, J.: Applying patterns during business process modeling. In: Dumas, M., Reichert, M., Shan, M.-C. (eds.) BPM 2008. LNCS, vol. 5240, pp. 4–19. Springer, Heidelberg (2008)
24. Lee, J., Midkiff, S.P., Padua, D.A.: Concurrent static single assignment form and const propagation for explicitly parallel programs. In: Huang, C.-H., Sadayappan, P., Sehr, D. (eds.) LCPC 1997. LNCS, vol. 1366, pp. 114–130. Springer, Heidelberg (1998)
25. Nebel, B., Bäckström, C.: On the computational complexity of temporal projection, planning, and plan validation. Artificial Intelligence 66(1), 125–160 (1994)
26. Massuthe, P., Reisig, W., Schmidt, K.: An operating guideline approach to the SOA. AMCT 1(3), 35–43 (2005)
Towards a Compiler for Business-IT Systems
19
27. Lohmann, N., Massuthe, P., Wolf, K.: Operating guidelines for finite-state services. In: Kleijn, J., Yakovlev, A. (eds.) ICATPN 2007. LNCS, vol. 4546, pp. 321–341. Springer, Heidelberg (2007) 28. Massuthe, P., Wolf, K.: An algorithm for matching nondeterministic services with operating guidelines. Int. Journal of Business Process Integration and Management 2(2), 81–90 (2007) 29. Ouyang, C., Dumas, M., ter Hofstede, A.H.M., van der Aalst, W.M.P.: From BPMN process models to BPEL Web Services. In: IEEE Int. Conference on Web Services (ICWS 2006), pp. 285–292 (2006) 30. Sriganesh, R.P., Bose, G., Silvermanohn, M.: Mastering Enterprise JavaBeans 3.0. John Wiley, Chichester (2006) 31. Rao, J., Su, X.: A survey of automated web service composition methods. In: Cardoso, J., Sheth, A.P. (eds.) SWSWPC 2004. LNCS, vol. 3387, pp. 43–54. Springer, Heidelberg (2005) 32. Hoffmann, J., Bertoli, P., Pistore, M.: Web service composition planning, revisted: In between background theories and initial state uncertainty. In: 22nd AAAI Conference on Artificial Intelligence (AAAI 2007), pp. 1013–1018 (2007) 33. Bertoli, P., Hoffmann, J., L´ecu´e, F., Pistore, M.: Integrating discovery and automated composition: From semantic requirements to executable code. In: IEEE Int. Conference on Web Services (ICWS 2007), pp. 815–822. IEEE, Los Alamitos (2007) 34. Brogi, A., Popescu, R.: Automated generation of bpel adapters. In: Dan, A., Lamersdorf, W. (eds.) ICSOC 2006. LNCS, vol. 4294, pp. 27–39. Springer, Heidelberg (2006) 35. Gamma, E., Helm, R., Johnson, R., Vlissides, J.: Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, Reading (1995) 36. Kruchten, P., Lago, P., van Vliet, H.: Building up and reasoning about architectural knowledge. In: Hofmeister, C., Crnkovi´c, I., Reussner, R. (eds.) QoSA 2006. LNCS, vol. 4214, pp. 43–58. Springer, Heidelberg (2006) 37. Jansen, A., Bosch, J.: Software architecture as a set of architectural design choices. 
In: 5th IFIP Conference on Software Architecture (WICSA 2005), pp. 109–120. IEEE, Los Alamitos (2005) 38. Tyree, J., Akerman, A.: Architecture decisions: Demystifying architecture. IEEE Software 22(2), 19–27 (2005) 39. Zimmermann, O., Zduhn, U., Gschwind, T., Leymann, F.: Combining pattern languages and architectural decision models into a comprehensive and comprehensible design method. In: 8th IFIP Conference on Software Architecture (WICSA 2008), pp. 157–166. IEEE, Los Alamitos (2008)
Towards Use-Cases Benchmark

Bartosz Alchimowicz, Jakub Jurkiewicz, Jerzy Nawrocki, and Mirosław Ochodek

Poznan University of Technology, Institute of Computing Science, ul. Piotrowo 3A, 60-965 Poznań, Poland
{Bartosz.Alchimowicz,Jakub.Jurkiewicz,Jerzy.Nawrocki,Miroslaw.Ochodek}@cs.put.poznan.pl
Abstract. This paper presents an approach to developing a use-cases benchmark. The benchmark itself is a referential use-case-based requirements specification with a typical profile observed in real projects. To obtain this profile, an extensive analysis of 432 use cases coming from 11 projects was performed. Because the developed specification represents those found in real projects, it can be used to present, test, and verify methods and tools for use-case analysis. This is especially important because industrial specifications are in most cases confidential and cannot be used by researchers who would like to replicate studies performed by their colleagues.

Keywords: Use cases, Requirements engineering, Metrics, Benchmark.
1 Introduction
Functional requirements are very important in software development. They affect not only the product but also test cases, cost estimates, the delivery date, and the user manual. One form of functional requirements is use cases, introduced by Ivar Jacobson about 20 years ago. They are becoming more and more popular in the software industry and are also a subject of intensive research (see e.g. [2,3,8,9,20]).

Ideas presented by researchers must be empirically verified (in the best case, using industrial data), and it should be possible for any other researcher to replicate a given experiment [19]. For use-case research this means that one should be able to publish not only the results but also the functional requirements (use cases) that have been used during the experiment. Unfortunately, that is very seldom done because it is very difficult. If a given requirements specification has real commercial value, it will be hard to convince its owner (company) to publish it. Thus, to make experiments concerning use cases replicable, one needs a benchmark that represents use cases used in real software projects.

In this paper an approach to constructing a use-cases benchmark is presented. The benchmark itself is a use-cases-based requirements specification which has a typical profile observed in requirements coming from real projects. To derive such a profile, an extensive analysis of 432 use cases was performed.

The paper is organised as follows. In Section 2 a model of a use-cases-based requirements specification is presented. This model is further used in Section 3, which presents an analysis of the use cases coming from eleven projects. Based on the analysis, a profile of the typical use-case-based specification is derived, which is used to create a benchmark specification in Section 4. Finally, a case study is described in Section 5, which presents a typical usage of the benchmark specification in order to compare tools for use-case analysis.

This research has been financially supported by the Polish Ministry of Science and Higher Education grant N516 001 31/0269.

Z. Huzar et al. (Eds.): CEE-SET 2008, LNCS 4980, pp. 20–33, 2011. © IFIP International Federation for Information Processing 2011
2 Benchmark-Oriented Model of Use-Case-Based Specification
Although use cases have been successfully used in many industrial projects (Neil et al. [16] reported that 50% of projects have their functional requirements presented in that form), they have never been the subject of any recognisable standardisation. Moreover, since their introduction by Ivar Jacobson [13], many approaches to their development have been proposed. Russell R. Hurlbut [11] gathered fifty-seven contributions concerning use-case modelling. Although all approaches share the same idea of presenting an actor's interaction with the system in order to achieve the actor's goal, they vary in the level of formalism and in the presentation form. Thus it is very important to state what is understood by the term use-cases-based requirements specification.

To mitigate this problem, a semi-formal model of a use-cases-based requirements specification has been proposed (see figure 1). It incorporates most of the best practices presented in [1,5]. It still allows one to create a specification that contains only a scenario (or a non-structured story) and actors, which is enough to compose a small but valuable use case, as well as to extend it with other elements, for example extensions (using steps or stories), pre- and post-conditions, notes, or triggers.

Unfortunately, understanding the structure of a specification is still not enough to construct a typical use-cases-based specification. What is missing is quantifiable data concerning the number of model elements and the proportions between them¹. In other words, one meta-question has to be asked for each element of the model: "how many?".
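The model described above can be sketched as a small set of data classes. This is only an illustrative reading of the model, not the authors' implementation: the class and field names, the `count` helper, and the example use case ("Submit application") are assumptions introduced here. The sketch shows that a use case with only actors and a scenario is already valid, and that "how many?" questions become simple counting queries.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Step:
    actor: str  # name of the actor performing the step
    text: str


@dataclass
class Extension:
    event_text: str
    steps: List[Step] = field(default_factory=list)  # structured form (scenario)
    story: Optional[str] = None                      # non-structured form (story)


@dataclass
class UseCase:
    title: str
    goal_level: str  # "business", "user" or "sub-function"
    main_actors: List[str]
    main_scenario: List[Step] = field(default_factory=list)
    extensions: List[Extension] = field(default_factory=list)
    pre_conditions: List[str] = field(default_factory=list)
    post_conditions: List[str] = field(default_factory=list)
    triggers: List[str] = field(default_factory=list)


@dataclass
class RequirementsSpecification:
    actors: List[str]
    business_objects: List[str]
    use_cases: List[UseCase] = field(default_factory=list)

    def count(self, predicate: Callable[[UseCase], bool]) -> int:
        """Answer a 'how many?' question: use cases satisfying the predicate."""
        return sum(1 for uc in self.use_cases if predicate(uc))


# Hypothetical minimal use case: only an actor and a main scenario.
uc = UseCase(
    title="Submit application",
    goal_level="user",
    main_actors=["Candidate"],
    main_scenario=[
        Step("Candidate", "Candidate fills in the application form."),
        Step("System", "System stores the application."),
    ],
)
spec = RequirementsSpecification(
    actors=["Candidate", "System"], business_objects=[], use_cases=[uc]
)
with_extensions = spec.count(lambda u: len(u.extensions) > 0)
user_level = spec.count(lambda u: u.goal_level == "user")
```

Optional elements default to empty lists, which mirrors the idea that extensions, conditions, and triggers can be added to a use case later without changing its core.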
3 Analysis of Use-Cases-Based Specifications Structure
In order to discover the profile of a typical specification, data from various projects were collected and analysed. As a result, a database called UCDB (Use Cases Database) [6] has been created. At this stage it stores data from 11 projects with a total of 432 use cases (basic characteristics of the requirements specifications and project descriptions are given in table 1).

¹ A number/proportion of any element of the model will be further referred to as a property of the specification.
[Figure 1: UML class diagram relating the model elements Requirements Specification, Business Object, Actor, Use Case, Referenced Element, Scenario, Step, Story, Extension, Trigger, Condition (pre-conditions, post-conditions), and Goal Level (Business, User, Sub-function).]

Fig. 1. Use-Cases-based functional requirements specification model
Table 1. Analysed projects requirements-specifications (origin: Industry - project developed by a software development company, S2B - project developed by students for an external organisation, External - specification obtained from an external source which is freely accessible through the Internet; this refers to two specifications: UKCDR [21], PIMS [18]; projects D and K come from the same organisation). The columns All, Business, User, and Sub-function give the number of use cases, in total and by goal level.

Project  Language  Origin    All  Business  User  Sub-function  Description
A        English   S2B       17   0%        76%   24%           Web & standalone application for managing members of organization
B        English   S2B       37   19%       46%   35%           Web-based Customer Relationship Management (CRM) system
C        English   External  39   18%       44%   33%           UK Collaboration for a Digital Repository (UKCDR)
D        Polish    Industry  77   0%        96%   4%            Web-based e-government Content Management System (CMS)
E        Polish    S2B       41   0%        100%  0%            Web-based Document Management System (DMS)
F        Polish    Industry  10   0%        100%  0%            Web-based invoices repository for remote accounting
G        English   External  90   0%        81%   19%           Protein Information Management System (PIMS)
H        Polish    Industry  16   19%       56%   25%           Integration of two sub-systems in ERP scale system
I        Polish    Industry  21   38%       57%   5%            Banking system
J        Polish    Industry  9    0%        67%   33%           Single functional module for the web-based e-commerce solution
K        Polish    Industry  75   0%        97%   3%            Web-based workflow system with Content Management System (CMS)

All of the specifications have been analysed in order to populate the model with information concerning the average number of occurrences of its elements and some additional quantifiable data. One of the interesting findings is that 79.9% of the main scenarios in the use cases consist of 3-9 steps, which means that they follow the guidelines proposed by Cockburn [5]. What is more, 72.9% of the analysed use cases are augmented with alternative scenarios (extensions). There are projects which contain extensions for all steps in the main scenario; however, on average a use case contains 1.5 extensions. Detailed information regarding the number and distribution of steps in both the main scenario and extensions is presented in figure 2.

Another interesting observation concerning the structure of use cases is that sequences of steps performed by the same actor frequently contain more than one step. What is even more interesting, this tendency is more visible for the main actor's steps (37.3% of the main actor's step sequences are longer than one step, compared with only 19.4% for the secondary actor, in most cases the system being built). This is probably because actions performed by the main actor are more important from the business point of view. This contradicts the concept of transactions in use cases presented by Ivar Jacobson [12]. Jacobson enumerated four types of actions which together may form a use-case transaction. Only one of them belongs to the main actor (user request action). The rest are system actions: validation, internal state change, and response. It might seem that those actions should frequently form longer sequences together. However, 80.6% of the step sequences performed by the system consist of a single step only. The distributions of the lengths of the actors' step sequences are presented in figure 3.

If we look deeper into the textual representation of the use cases, some interesting observations concerning their semantics can be made. One of them regards the way use-case authors describe validation actions. Two different approaches are observed. The most common is to use extensions to represent alternative system behaviour in case of a verification failure (46.6% of extensions are of this nature). The second, less frequent one is to incorporate validation actions into steps (e.g. System verifies data). Such actions are observed in only 3.0% of steps. There is yet another semantic structure used to represent alternative execution paths: conditional clauses (which are, in fact, rather deprecated). Fortunately, such statements are observed in only 3.2% of steps (and they occur intensively in only 2 projects).
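The step-sequence statistic discussed above can be made concrete with a short sketch. It assumes that each main scenario has already been reduced to the ordered list of actors performing its steps; the labels "main" and "system" and the two toy scenarios are hypothetical and are not taken from UCDB. A sequence is a maximal run of consecutive steps by the same actor, and lengths above four are pooled into a ">4" bucket, as in figure 3.

```python
from collections import Counter
from itertools import groupby

BUCKETS = ["1", "2", "3", "4", ">4"]


def run_lengths(actors):
    """Return (actor, length) for every maximal run of consecutive steps by one actor."""
    return [(actor, len(list(run))) for actor, run in groupby(actors)]


def length_distribution(scenarios, actor):
    """Share of the given actor's step sequences falling into each length bucket."""
    counts = Counter()
    for actors in scenarios:
        for who, length in run_lengths(actors):
            if who == actor:
                counts[str(length) if length <= 4 else ">4"] += 1
    total = sum(counts.values())
    if total == 0:
        return {bucket: 0.0 for bucket in BUCKETS}
    return {bucket: counts[bucket] / total for bucket in BUCKETS}


# Two toy main scenarios, each given as the actor of every step in order.
scenarios = [
    ["main", "main", "system", "main", "system"],
    ["main", "system", "system", "main"],
]
main_dist = length_distribution(scenarios, "main")
system_dist = length_distribution(scenarios, "system")
```

In this toy input the main actor has four sequences (one of length 2 and three of length 1) and the system has three (two of length 1 and one of length 2), so the shares can be checked by hand.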
[Figure 2: four panels. Panels a) and c) are histograms with density on the vertical axis; panels b) and d) are box plots with one box per project.]

Fig. 2. Scenario lengths in the analysed use cases: a) histogram of the number of steps in the main scenario (data aggregated from all of the projects), b) box plot of the number of steps in the main scenario (for each project), c) histogram of the number of steps in an extension (data aggregated from all of the projects), d) box plot of the number of steps in an extension (for each project; note that project E was excluded because it has all alternative scenarios written as stories)
The analysed properties can be divided into two classes. The first contains properties which are observed in nearly all of the projects with comparable intensity. Such properties seem to be independent of the specification they belong to (and of its author's writing style). The second class is the opposite and includes properties which are either characteristic only of a certain set of use cases or whose occurrence varies between specifications from full presence to marginal or none. Such properties are project-dependent. A more detailed description of the analysed requirements specifications is presented in table 2.
[Figure 3: two histograms with density on the vertical axis and sequence length (1, 2, 3, 4, >4) on the horizontal axis. a) Main actor: 62.7%, 20.1%, 9.8%, 3.9%, 3.5%. b) Secondary actor: 80.6%, 13.3%, 3.3%, 1.9%, 0.9%.]

Fig. 3. Distributions of the lengths of actors' step sequences in the analysed use cases: a) histogram of the lengths of the step sequences performed by the main actor, b) histogram of the lengths of the step sequences performed by the secondary actor - e.g. System
4 Building Referential Specification
Combining the use-cases model with the average values coming from the analysis, a profile of the typical use-cases-based requirements specification can be derived. In such a specification all properties would appear with typical intensity. In other words, analysing such a document could be perceived as an everyday task for use-case analysis tools.

There are at least two approaches to acquiring an instance of such a specification. The first one would be to search for an industrial set of use cases which fulfils the given criteria. Unfortunately, most industrial documents are confidential, so they cannot be used on a large scale. The second, obvious approach would be to develop such a specification from scratch. If it is built according to the obtained typical-specification profile, it can be used instead of an industrial specification. Since it would describe an abstract system, it can be freely distributed and used by anyone. Therefore, we propose an approach to developing an instance of the referential specification for benchmarking purposes. The document is available at the web site [6] and may be freely used for further research.

4.1 Specification Domain and Structure
It seems that a large number of tools for use-case analysis have their roots in the research community. Therefore the domain of the developed specification should be easy to understand, especially for academic staff. One of the well-recognised processes at universities is the student admission process. Although the developed specification describes a hypothetical process and tools, it should be easy to understand for anyone who has ever participated in a similar process.

Another important decision concerns the structure of the specification (number of actors, use cases and business objects). Unfortunately, the decision is rather arbitrary, because the number of use cases, actors and business objects may depend on the size of the system and the level of detail incorporated into its description. In this case the median number of use cases and actors from the UCDB set has been used as the number of use cases and actors in the constructed specification. The final specification consists of 7 actors (Administrator, Candidate, Bank, Selection committee, Students Management System, System, User), 37 use cases (3 business, 33 user, and 1 sub-function level), and 10 business objects.

Table 2. Use-Cases Database analysis according to the presented requirements specification model (see section 2). Sequence lengths refer to step sequences in the main scenario.

Property                                 Overall       A       B       C       D       E       F       G       H       I       J       K

Requirements specification independent
Number of use cases                          432      17      37      39      77      41      10      90      16      21       9      75
Number of steps in main scenario (mean)     4.87    4.76    4.92    2.95    4.34    5.78    3.90    5.10    3.38    4.33    3.67    6.36
Number of steps in main scenario (SD)       2.48    0.97    1.46    2.69    2.25    1.26    1.20    2.41    0.50    2.01    1.32    3.25
Use cases with extensions                  72.9%   94.1%   51.4%   41.0%   55.8%   92.7%    100%   68.9%   81.3%   90.5%   55.6%   98.7%
Number of extensions in use case (mean)     1.50    1.29    0.57    0.95    0.92    1.63    1.00    1.28    2.94    2.33    1.00    2.69
Number of extensions in use case (SD)       1.84    0.77    0.60    1.72    1.12    0.62    0.00    1.30    2.98    1.58    1.58    2.91
Number of steps in extension (mean)         2.51    1.80    2.52    1.00    1.45     N/A    1.10    2.77    2.30    1.64    1.44    3.09
Number of steps in extension (SD)           1.62    0.84    0.87    0.00    0.59     N/A    0.32    1.87    0.65    0.67    0.53    1.80
Steps with validation actions               3.0%    3.7%    5.5%   11.3%    1.8%    0.0%    0.0%    0.0%   27.8%   17.6%    3.0%    0.0%
Extensions which are validations           46.6%    9.1%   76.2%   64.9%   63.4%   40.3%    100%   50.4%   78.7%   89.8%    100%   15.3%
Main actor's sequence length: 1            62.7%   42.3%   93.1%   88.6%   37.0%   61.9%   87.5%   91.7%   47.4%   52.4%   50.0%   52.6%
Main actor's sequence length: 2            20.1%   23.1%    3.5%    6.8%   34.8%   36.1%   12.5%    0.0%    0.0%   19.1%   42.9%   16.4%
Main actor's sequence length: 3             9.8%   26.9%    3.5%    4.6%   16.3%    0.0%    0.0%    8.3%   31.6%   23.8%    7.1%   12.9%
Main actor's sequence length: 4             3.9%    7.7%    0.0%    0.0%    6.5%    0.0%    0.0%    0.0%   21.1%    0.0%    0.0%    8.6%
Main actor's sequence length: >4            3.5%    0.0%    0.0%    0.0%    5.4%    2.1%    0.0%    0.0%    0.0%    4.8%    0.0%    9.5%
Secondary actor's sequence length: 1       80.6%   82.6%   96.3%   72.2%   67.9%    100%    100%   90.0%   50.0%   12.5%   62.5%   72.1%
Secondary actor's sequence length: 2       13.3%   17.4%    3.7%   22.2%   29.6%    0.0%    0.0%    8.0%   16.7%   43.8%   37.5%   14.8%
Secondary actor's sequence length: 3        3.3%    0.0%    0.0%    0.0%    0.0%    0.0%    0.0%    2.0%   33.3%   25.0%    0.0%    7.4%
Secondary actor's sequence length: 4        1.9%    0.0%    0.0%    5.6%    2.5%    0.0%    0.0%    0.0%    0.0%    0.0%    0.0%    4.1%
Secondary actor's sequence length: >4       0.9%    0.0%    0.0%    0.0%    0.0%    0.0%    0.0%    0.0%    0.0%   18.8%    0.0%    1.6%

Requirements specification dependent
Use cases with additional description      38.0%    0.0%    0.0%    100%    0.0%    0.0%    0.0%    100%   56.3%    100%    0.0%    6.7%
Use cases with sub-scenario                13.7%    0.0%    0.0%    0.0%   32.5%    0.0%    0.0%    0.0%    0.0%    0.0%    0.0%   44.2%
Number of steps in sub-scenario (mean)      1.98    0.00    0.00    0.00    3.20    0.00    0.00    0.00    0.00    0.00    0.00    1.09
Number of steps in sub-scenario (SD)        1.69     N/A     N/A     N/A    2.02     N/A     N/A     N/A     N/A     N/A     N/A    0.29
Use cases with pre-conditions              48.5%   82.4%    100%    0.0%    0.0%   97.6%    0.0%    0.0%    0.0%   23.8%   77.8%    0.0%
Use cases with post-conditions             10.6%    0.0%    0.0%   97.4%    0.0%    0.0%    0.0%    0.0%    0.0%    0.0%    0.0%    0.0%
Use cases with triggers                    36.3%    100%    100%    100%    0.0%    100%    0.0%    0.0%   68.8%   57.1%    0.0%    0.0%
Steps with conditional clauses              3.2%    1.2%    1.1%   17.4%    0.3%    0.0%    0.0%    0.0%    0.0%    0.0%    9.0%    0.0%
Steps with reference to use cases           6.3%    0.0%    0.5%    0.0%    0.0%    0.0%   25.6%   14.2%    0.0%    0.0%    6.1%   33.3%
Extensions with scenario                   71.9%   27.3%    100%    8.1%   87.3%    0.0%    100%    100%    100%    100%    100%    100%
Extensions with stories                    28.1%   72.7%    0.0%   91.9%   12.7%    100%    0.0%    0.0%    0.0%    0.0%    0.0%    0.0%

4.2 Use Cases Structure
A breadth-first rule has been followed in order to construct the referential specification. Firstly, all use cases were titled and augmented with descriptions. Secondly, all of them were filled with main scenarios and corresponding extensions. During that process all changes made in the specification were recorded in order to check conformance with the typical profile. This iterative approach was used until the final version of the specification was constructed. Its profile is presented in table 3, in comparison with the corresponding average values from the UCDB projects analysis (the main actor's step-sequence lengths for the referential specification are additionally presented in figure 4).

The most important properties are those which are observed in all of the specifications (specification-independent). For those metrics most values in the referential document are very close to those derived from the analysis. The situation is different for properties which were characteristic only of certain projects (or whose variability between projects was very high). If the constructed specification is supposed to be a typical one, it should not be biased by features observed only in some of the industrial specifications. Therefore dependent properties were also incorporated into the referential use cases; however, they were not subject to the tuning process.
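The conformance check performed in each iteration can be illustrated with a minimal sketch. It is not the authors' procedure: the function names and the toy draft are hypothetical, only three specification-independent properties are covered, and the sample standard deviation is used although the text does not say which estimator was applied. The reference values are the overall UCDB figures reported in the paper (mean 4.87 and SD 2.48 steps per main scenario, 72.9% of use cases with extensions).

```python
from statistics import mean, stdev

# Overall UCDB values reported in the paper for three independent properties.
UCDB_PROFILE = {"steps_mean": 4.87, "steps_sd": 2.48, "with_extensions": 0.729}


def profile(use_cases):
    """use_cases: list of (main_scenario_steps, number_of_extensions) pairs."""
    steps = [n_steps for n_steps, _ in use_cases]
    return {
        "steps_mean": mean(steps),
        "steps_sd": stdev(steps),  # assumption: sample standard deviation
        "with_extensions": sum(1 for _, n_ext in use_cases if n_ext > 0) / len(use_cases),
    }


def deviations(observed, reference=UCDB_PROFILE):
    """Signed difference between the draft's profile and the reference profile."""
    return {name: observed[name] - reference[name] for name in reference}


# Hypothetical draft with four use cases.
draft = [(3, 0), (5, 1), (7, 2), (5, 1)]
observed = profile(draft)
diff = deviations(observed)
```

After each batch of edits the deviations show which properties still need tuning; in this toy draft the standard deviation of the scenario length is well below the reference value, which suggests adding both shorter and longer scenarios.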
5 Benchmarking Use Cases - Case Study
Table 3. Referential specification profile in comparison to the average profile derived from the UCDB use-cases analysis

Property                                    Referential specification   Observed in Use-Cases Database
                                            (Admission System)
Number of steps in main scenario (mean)     4.89                        4.87
Number of steps in main scenario (SD)       1.47                        2.48
Use cases with extensions                   70.3%                       72.92%
Number of extensions in use case (mean)     1.50                        1.50
Number of extensions in use case (SD)       0.69                        1.84
Number of steps in extension (mean)         2.49                        2.51
Number of steps in extension (SD)           2.12                        1.62
Steps of validation nature                  2.2%                        3.0%
Extensions of validation nature             46.2%                       46.6%
Steps with conditional clauses              0.6%                        3.2%
Main actor's sequence length: 1             62.3%                       62.7%
Main actor's sequence length: 2             21.3%                       20.1%
Main actor's sequence length: 3             9.8%                        9.8%
Main actor's sequence length: 4             3.3%                        3.9%
Main actor's sequence length: >4            3.3%                        3.5%
Secondary actor's sequence length: 1        80.7%                       80.6%
Secondary actor's sequence length: 2        12.9%                       13.3%
Secondary actor's sequence length: 3        4.8%                        3.3%
Secondary actor's sequence length: 4        1.6%                        1.9%
Secondary actor's sequence length: >4       0.00%                       0.9%

Having an example of a typical requirements specification, one may wonder how it could be used. Firstly, researchers who construct methods and tools for analysing use cases often face the problem of evaluating their ideas. Using a specification coming from an industrial project is a typical approach; however, it does not allow one to conclude that the given solution would give the same results for other specifications. The situation is different when the typical specification is considered: researchers can then assume that their tool would work for most industrial specifications. Secondly, analysts who want to use tools to analyse their requirements may find it difficult to choose the tool that best meets their needs. With the typical specification in mind it can be easier to evaluate the available solutions and choose the most suitable one. To conclude, the example of the typical requirements specification can be used:
– to compare two specifications and assess how similar a given specification is to the typical one,
– to assess the time required for a tool to analyse a requirements specification,
– to assess the quality of a tool/method,
– to compare tools and choose the best one.

In order to demonstrate the usage of the constructed specification, we have conducted a case study. As a tool for managing requirements in the form of use cases we chose the UC Workbench tool [15], which has been developed at Poznan University of Technology since 2005. Additionally, we used a set of tools for defect detection [4] to analyse the requirements. The aim of these tools is to find potential defects and present them to the analyst during the development of the requirements specification. The tools are based on Natural Language Processing (NLP) methods and use the Stanford parser [7] and/or OpenNLP [17] to perform the NLP analysis of the requirements.
[Figure 4: two histograms with density on the vertical axis and sequence length (1, 2, 3, 4, >4) on the horizontal axis. a) Main actor: 62.3%, 21.3%, 9.8%, 3.3%, 3.3%. b) Secondary actor: 80.7%, 12.9%, 4.8%, 1.6%, 0.0%.]

Fig. 4. Distributions of the lengths of actors' step sequences in the referential specification: a) histogram of the lengths of the step sequences performed by the main actor, b) histogram of the lengths of the step sequences performed by the secondary actor - e.g. System
5.1 Quality Analysis
In order to evaluate the quality of the used tools, a confusion matrix will be used [10] (see table 4).

Table 4. Confusion matrix

                    Expert answer
System answer       Yes     No
Yes                 TP      FP
No                  FN      TN
The symbols used in table 4 are described below:
– T (true) - tool answer that is consistent with the expert decision
– F (false) - tool answer that is inconsistent with the expert decision
– P (positive) - system positive answer (defect occurs)
– N (negative) - system negative answer (defect does not occur)
On the basis of the confusion matrix, the following metrics [14] can be calculated:

– Accuracy (AC) - proportion of the total number of predictions that were correct. It is determined using equation (1).

$$AC = \frac{TP + TN}{TP + FN + FP + TN} \tag{1}$$

– True positive rate (TP rate) - proportion of positive cases that were correctly identified, as calculated using equation (2).

$$TP\ rate = \frac{TP}{TP + FN} \tag{2}$$

– True negative rate (TN rate) - proportion of negative cases that were classified correctly, as calculated using equation (3).

$$TN\ rate = \frac{TN}{TN + FP} \tag{3}$$

– Precision (PR) - proportion of the predicted positive cases that were correct, as calculated using equation (4).

$$PR = \frac{TP}{TP + FP} \tag{4}$$
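Equations (1)-(4) translate directly into code. The sketch below is only an illustration: the function name is hypothetical and the counts in the example are made up to show the arithmetic; they are not the counts behind the results reported in this paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute AC, TP rate, TN rate and PR from confusion-matrix counts.

    Raises ZeroDivisionError if a denominator is zero, for example when
    the tool never reports a defect (tp + fp == 0).
    """
    return {
        "AC": (tp + tn) / (tp + fn + fp + tn),  # equation (1)
        "TP rate": tp / (tp + fn),              # equation (2)
        "TN rate": tn / (tn + fp),              # equation (3)
        "PR": tp / (tp + fp),                   # equation (4)
    }


# Hypothetical counts, chosen only to illustrate the calculation.
metrics = classification_metrics(tp=45, fp=10, fn=5, tn=940)
```

With these made-up counts the accuracy is 985/1000 and the precision is 45/55: because true negatives dominate, a tool can reach a very high AC while its PR is noticeably lower. The same pattern is visible in the aggregated results of the case study.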
The referential use-case specification was analysed by the mentioned tools (the Stanford library was used for the NLP processing) in order to find the 10 types of defects described in [4]. The aggregated values of the above accuracy metrics are as follows:
– AC = 0.99
– TP rate = 0.96
– TN rate = 0.99
– PR = 0.82
This shows that the developed tools for defect detection are rather good; however, the precision could be better, and the researchers could conclude that more investigation is needed in this area. If the researchers worked on their tool using only defect-prone or defect-free specifications, the results could be distorted and could lead to misleading conclusions (e.g. that the tool is significantly better or that it gives a very poor outcome). Having access to the typical specification allows the researchers to explore the main difficulties the tool can encounter when working with industrial specifications.

5.2 Time Analysis
One of the main characteristics of a tool being developed is its efficiency. Not only developers are interested in this metric, but also users: e.g. when analysts have to choose between two tools which give the same results, they would choose the one which is more efficient. However, it can be hard to assess the efficiency just by running the tool with arbitrary data, as this may result in a distorted outcome. A use-cases benchmark gives the opportunity to measure the time required by a tool to complete its computations in an objective way. It can also be valuable for researchers constructing tools who consider using different third-party components in their applications. For instance, in the case of the mentioned defect-detection tool it is possible to use one of several available English grammar parsers. In this case study two of them are considered: the Stanford parser and OpenNLP. If one exchanges those components, the tool will still be able to detect defects, but its efficiency and accuracy may change. Therefore, a time analysis was performed to assess how using those libraries influences the efficiency of the tool. The overall time required to analyse the typical specification² was 54.37 seconds for the tool using the Stanford parser and 31.21 seconds for the one using OpenNLP. Although the first value seems large, the examined tool needed on average 290 ms to analyse a single use-case step, so it should not be a problem to use it in an industrial environment.

Of course, when comparing efficiency, one has to remember that other factors (such as the quality of the results, memory requirements, or the time needed to initialise the tools) should be taken into account, as they can also have a significant impact on the tool itself. Therefore, the time, memory usage and quality analysis is presented in table 5. These summarised statistics allow researchers to choose the best approach for their work.

Although this specification describes an abstract system, it shows the typical phenomena of industrial requirements specifications. The conducted case study shows that the developed referential use-case specification can be used in different ways by both researchers and analysts.

² Tests were performed for the tools in two versions: with the Stanford and OpenNLP parsers. Each version of the tools analysed the referential specification five times, on a computer with a Pentium Core Duo 2.0GHz processor and 2GB RAM.

Table 5. Case-study results (tools are written in Java, so memory was automatically managed by garbage collection)

Summarised for all components
English grammar parser used   Overall processing time [s]   Mean time needed to analyse one step [s]   Maximal memory utilization [MB]
Stanford                      54.37                         0.29                                       212
OpenNLP                       31.21                         0.17                                       339

English grammar parser only
English grammar parser used   Overall processing time [s]   Startup time [s]   Initial memory usage [MB]   Memory usage while processing single element [KB]
Stanford                      37.44                         0.8                27                          547
OpenNLP                       14.58                         15.6               226                         3925

Quality
English grammar parser used   AC     TP rate   TN rate   PR
Stanford                      0.99   0.96      0.99      0.82
OpenNLP                       0.99   0.84      0.99      0.79
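A timing measurement of the kind described in this section can be sketched as follows. This is not the harness used in the case study (the examined tools are written in Java): `benchmark`, `analyse_step`, and the two sample steps are hypothetical stand-ins. The only detail taken from the paper is that the referential specification was analysed five times per tool version.

```python
import time
from statistics import mean


def benchmark(analyse_step, steps, repetitions=5):
    """Time a step-level analysis over all steps, averaged over several runs."""
    totals = []
    for _ in range(repetitions):
        start = time.perf_counter()
        for step in steps:
            analyse_step(step)
        totals.append(time.perf_counter() - start)
    overall = mean(totals)
    return {"overall_s": overall, "per_step_s": overall / len(steps)}


# Stand-in for a real analysis; a defect-detection tool would parse the step text here.
def count_words(step_text):
    return len(step_text.split())


steps = ["Candidate fills in the application form.", "System verifies data."]
result = benchmark(count_words, steps)
```

Running the same fixed set of steps through each tool variant is what makes the per-step times comparable. Startup time and memory usage, which table 5 reports separately, would need their own measurements.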
6 Conclusions
In this paper an approach to creating a referential use-cases-based requirements specification was presented. In order to derive such a specification, 432 use cases were analysed. The analysis showed that some of the use-case properties are project-independent (they are observed in most of the projects). They have been used for creating the referential specification. On the other hand, there are also a number of properties which depend very much on the author. In order to present a potential usage of the benchmark specification, a case study was conducted. Two sets of tools for defect detection in requirements were compared from the point of view of efficiency and accuracy. In the future it would be beneficial to extend the use-case database with more use cases coming from commercial projects. This would require an iterative approach to updating the typical profile as well as the referential specification.
Acknowledgments. We would like to thank Piotr Godek, Kamil Kwarciak and Maciej Mazur for supporting us with industrial use cases. This research has been financially supported by the Polish Ministry of Science and Higher Education under grant N516 001 31/0269.
Automated Generation of Implementation from Textual System Requirements
Jan Franců and Petr Hnětynka
Department of Distributed and Dependable Systems, Faculty of Mathematics and Physics, Charles University in Prague
Malostranske namesti 25, 118 00 Prague 1, Czech Republic
[email protected],
[email protected]
Abstract. The initial stage of software development is the specification of the system requirements. Frequently, these requirements are expressed in UML and consist of use cases and a domain model. A use case is a sequence of tasks which have to be performed to achieve a specific goal. The tasks of the use case are written in a natural language. The domain model describes objects used in the use cases. In this paper, we present an approach that allows automated generation of executable code directly from use cases written in a natural language. Use of the generation significantly accelerates the system development, e.g. it makes immediate verification of requirements completeness possible, and the generated code can be used as a starting point for the final implementation. A prototype implementation of the approach is also described in the paper.
Keywords: Use cases, code generation, requirement engineering, natural language.
1 Introduction
Development of software consists of several stages, of which one of the most important is the initial stage, the collection of system requirements. These requirements can be captured in many forms, but using the Unified Modeling Language (UML) has become an industry standard, at least for large and medium enterprise applications. Development with UML [6] is based on modeling the developed system at multiple levels of abstraction. Such a separation helps developers to reflect specific aspects of the designed system on different levels and therefore to get a “whole picture” of the system. Development with UML starts with defining the goals of the system. Then, the main characteristics of the system requirements are identified and described. The behavior of the developed system is specified as a set of use cases. A use case is a description of a single task performed in the designed system [2]. The task itself is further divided into a sequence of steps that are performed by communicating entities. These entities are either parts of the system or users of the system. Each step of a use case is specified by a natural language sentence. The use cases of the system are complemented by a domain model that describes the entities which together form the designed system and which are referred to in the use cases.
Z. Huzar et al. (Eds.): CEE-SET 2008, LNCS 4980, pp. 34–47, 2011. © IFIP International Federation for Information Processing 2011
Bringing a system from the design stage to the market is a very time-consuming and costly task. The possibility to generate an implementation draft directly from the system requirements would be very helpful for both requirement engineers and developers: it would significantly speed up development of the system, shorten the time required to deliver the system to the market, and reduce the cost. The system use cases contain work-flow information and, together with the domain model, capture all important information; they therefore seem to be sufficient for such a generation. The problem is that use cases are written in natural language, and there is a gap to overcome before system code can be generated.
1.1 Goals of the Paper
In this paper, we describe an approach that allows generating an implementation of a system from use cases written in a natural language. The process proposed in the paper enables software developers to take advantage of carefully written system requirements in order to accelerate the development and to provide immediate feedback to the project’s requirement engineers by highlighting missing parts of the system requirements. The process fits into incremental development, where in each iteration developers can eliminate shortcomings in the design. The process can be customized to fit any enterprise application project. The described approach is implemented in a proof-of-concept tool and tested on a case study system.
To achieve the goals, the paper is structured as follows. Section 2 provides an overview of the UML models and technologies required for a use case analysis. Section 3 shows how our generation tool is employed in the application development process and Sec. 4 describes the tool and all generation steps in detail. Section 5 evaluates our approach and the paper is concluded in Sec. 6, where future plans are also outlined.
2 Specification of Requirements
The Unified Modeling Language (UML) is a standardized specification language for software development. Development with UML is based on modeling a system at multiple levels of abstraction in separate models. Each model, represented as a set of documents, clarifies the abstraction on a particular level and captures different aspects of the modeled system. UML-based methodologies standardize the whole development process and help ensure that the designed system meets all the requirements. UML also increases the possibilities to reuse existing models and simplifies reuse of implemented code. In this paper, we work with the UML documents created during the initial stage of the system development, i.e. the requirement specification. The results of this stage are captured in use cases and a domain model.
2.1 Use Cases
A use case in the context of UML is a description of a process in which a set of entities cooperates to achieve the goal of the use case. The entities in the use case can refer to the whole system, parts of the system, or users. Each use case has a single entity called the system under discussion (SuD); the whole use case is written from the perspective of this entity. An entity with which SuD primarily communicates is called a primary actor (PA). Other entities involved in the use case are called supporting actors (SA). Each use case is a textual document written in a natural language. The book [2] recommends the following structure of the use case: (1) header, (2) main scenario, (3) extensions, and (4) sub-variations. The header contains the name of the use case, the SuD entity, the primary actor and the supporting actors. The main scenario (also called the success scenario) defines a list of steps (also called actions), written as sentences in a natural language, that are performed to achieve the goal of the use case. An action can be extended with a branch action, which reflects possible diversions from the main scenario. There are two types of branch actions: extensions and sub-variations. In an extension, actions are performed in addition to the extended action, while in a sub-variation, actions are performed instead of the extended action. The first sub-action of a branch action is called a conditional label and describes the necessary condition under which the branch action is performed. The structure described above is not the only possible one; designers can use any structure they like. In our approach, we assume that the use cases follow these recommendations, as this allows us to process the use cases automatically and generate the system implementation.
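The recommended structure can be pictured as a small data model. The following is only an illustrative sketch in Python (the tool described in this paper is written in Java); the class and field names (`Step`, `Branch`, `UseCase`) and the example sentences are hypothetical and merely mirror the four parts listed above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    """One action of a scenario, written as a natural-language sentence."""
    sentence: str


@dataclass
class Branch:
    """An extension or sub-variation attached to a main-scenario step."""
    kind: str               # "extension" (in addition to) or "sub-variation" (instead of)
    extends_step: int       # 1-based index of the extended main-scenario step
    condition_label: Step   # first sub-action: condition under which the branch runs
    steps: List[Step] = field(default_factory=list)


@dataclass
class UseCase:
    # Header
    name: str
    sud: str                # system under discussion
    primary_actor: str
    supporting_actors: List[str] = field(default_factory=list)
    # Main (success) scenario and its branch actions
    main_scenario: List[Step] = field(default_factory=list)
    branches: List[Branch] = field(default_factory=list)

    def extensions(self) -> List[Branch]:
        return [b for b in self.branches if b.kind == "extension"]

    def sub_variations(self) -> List[Branch]:
        return [b for b in self.branches if b.kind == "sub-variation"]


# Hypothetical example; the sentences are invented for illustration only.
example = UseCase(
    name="Buyer selects an offer",
    sud="Clerk",
    primary_actor="Buyer",
    supporting_actors=["Computer system"],
    main_scenario=[
        Step("Buyer submits the selected offer."),
        Step("Clerk validates the offer in the Computer system."),
    ],
    branches=[
        Branch("extension", 2, Step("The offer is no longer valid."),
               [Step("Clerk informs Buyer.")]),
    ],
)
```

Keeping the conditional label as a separate field reflects the rule that it is always the first sub-action of a branch action.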
The assumption that use cases follow the recommended structure does not limit the whole approach in a significant way, since the book [2] is widely considered a “bible” for writing use cases (in addition, we already have an approach for using use cases with in fact any structure; see Sec. 6).
In the rest of the paper we use as an example a Marketplace project for online selling and buying. A global view of the application entities is depicted in Figure 1. There are several actors which communicate with the system. Sellers enter offers into the system and Buyers search for interesting offers. Both of them mainly communicate directly with the Computer system; in a few cases, they have to communicate through a Clerk, who passes information to the Computer system. There is also a Supervisor, who maintains the Computer system. A Credit verification agency verifies the Seller’s and Buyer’s operations and, finally, a Trade commission confirms the offers. The use case in Figure 2a is a part of the Marketplace specification (the whole specification has 19 use cases) and describes a communication between the Buyer (as PA), the Clerk (as SuD), and the Computer system. It is prepared according to the recommendations.
2.2 Domain Model
A domain model describes entities appearing in the designed system. Typically, the domain model is captured as a UML class diagram and consists of three types of elements: (1) conceptual classes, (2) attributes of conceptual classes, and (3) associations among conceptual classes. Conceptual classes represent objects used in the system use cases. The attributes are features of the represented objects and associations describe relations among the classes. Figure 2b shows the Marketplace domain model.
Fig. 1. The Marketplace project entities
Fig. 2. Marketplace requirements: (a) use case example, (b) domain model
2.3 Procasor Tool and Procases
The Procasor [5] is a tool for automated transformation of natural language (English) use cases into a formal behavior specification. The transformations are described in [7] and further extended in [3], where almost all restrictions on the use case step syntax were removed. As the formalism into which the natural language use cases are transformed, the Procasor uses procases [7], which are a special form of behavior protocols [10]. In addition to procases, a UML state machine diagram is also generated. A procase is a regular-expression-like specification which can describe the behavior of an entity as well as the behavior of the whole system [9]. Procases generate so-called traces, which represent all possible valid sequences of the actions described by the use cases. Figure 4 shows a procase derived from the use case shown in Figure 2a.
A procase is composed of operators (i.e. *, ;, +), procedure calls ({, }), action tokens, and supporting symbols (i.e. round parentheses for specifying operators’ precedence). Each action token represents a single action that has to be performed, and its notation is composed of several parts. First, there is a single character representing the type of the action. The possible types are ? for a request receiving action, ! for a request sending action, # for internal actions (unobservable by anyone other than SuD), and % for special actions. The action type is followed by the name of the entity on which the action is performed. Finally, the name of the action itself is the last part (separated by a dot). If the entity name is omitted, the action is internal. For example, ?B.submitSelectOffer is the submitSelectOffer action in which SuD waits for a request from the B (Buyer) entity. Procases use the same set of operators as regular expressions: * for iteration, ; for sequencing, and + for alternatives. In this paper, we call the alternative operator a branch action, its operands (actions) branches, and the iteration operator together with its operand a loop action. A special action is NULL, which means no activity and is used in places where there is no activity but the procase syntax requires an action to be specified (e.g. with the alternative operator). Another special action is the first action inside a branch that is not the main scenario; it is called the condition branch label and expresses the condition under which the branch is triggered. Procedure calls (written as curly brackets) represent the behavior (mostly composed of inner actions) of the request receive action after which they are placed (this action is called the trigger action).
2.4 Goals Revisited
As described in the sections above, the Procasor tool parses use cases written in a natural language and generates a formal specification of the behavior of the designed system. A natural next question is why to stop at the generated behavior description and not also generate an implementation of the system that realizes the work-flow captured in the use cases. The goal of this paper is to present an extension of the Procasor tool that, based on the use cases, generates an executable implementation of the designed system.
3 Generating Process
The development process with our generating tool is as follows. First, requirement engineers collect all requirements and describe them in the form of use cases. Then the Procasor tool automatically generates procases. In parallel, the requirement engineers create a project domain model. As a next step of validating the use cases, the generated procases can be reviewed. Then, our generator is employed and produces an implementation of the developed system. The generated implementation consists of three main parts: (i) use case objects where work-flow captured in a use case is generated, (ii) pages which are used to communicate with users of the system, and (iii) entity data objects which correspond with manipulated objects.
Fig. 3. Generator employment
The generated implementation is only an initial draft and serves primarily for testing the use cases and the domain model. It can also be used as a skeleton for the actual implementation and/or to let customers gain first impressions of the application. The whole process is illustrated in Figure 3. At this point a common mistake has to be emphasized (it is also emphasized in [6]): the system requirements cannot be understood as final and unchangeable. Especially in incremental development, the requirements are created in several iterations and the first versions are inevitably incomplete. Therefore, if the generator is used on such use cases, it can generate a completely wrong implementation. This implementation can nevertheless be used to validate the use cases, repair them, and regenerate the implementation.
4 Generating Tool in Detail
The generator of the implementation takes as input the procases generated by the Procasor and the created domain model of the designed system. From these inputs, it generates the executable implementation. The generation is automated and consists of three steps:
1. First, the procases generated by the Procasor are rearranged into a form in which they still follow the procase syntax but are more suitable for generating the implementation (Sec. 4.1).
2. Then, a relation between the words used in the use cases and the elements of the domain model is obtained, and the parameters of the methods (i.e. their numbers and types) are identified (Sec. 4.2).
3. Finally, the implementation of the designed system is generated (Sec. 4.3).
4.1 Procase Preprocessing
The procases produced by the Procasor do not contain procedure call brackets (see Sec. 2.3), which are crucial for a successful transformation of the procases into code. Except for several marginal cases, each use case is a request-response sequence between SuD and PA (for enterprise applications). In the procase, a single request-response element is represented as a sequence of actions, of which the first one is the request receive action (i.e. it starts with ?), followed by zero or more other actions (i.e. request sending actions, internal actions, etc.). In other words, SuD receives the request action, then performs a list of other actions, and finally returns the result (i.e. the end of the initial request receive action). Hence, the sequence of actions after the request receive action can be modeled as the content of a procedure and enclosed in the procedure call brackets. The following example is a simple procase in the form produced by the Procasor: ?PA.a; #b; !SA.c; ?PA.d; #e; #f. After identifying the procedure calls, the procase is modified into the following form: ?PA.a{#b; !SA.c};?PA.d{#e; #f}. In the end, the code generated from this procase consists of two procedures: the first one is generated from the ?PA.a action and internally calls the procedures resulting from #b and !SA.c, and the second one is generated from ?PA.d and calls #e and #f.
The approach described in the paragraph above works fine except for several cases. In particular, these are: (1) the first action of the use case is not a request receive action, (2) a request receive action is in a branch, and (3) a request receive action is anywhere inside a loop. If the first action of the use case is not a request receive action, a special action INIT is prepended to the use case and the actions up to the first request receive action are enclosed in the procedure call brackets. In the generated code, the procedure generated from the INIT action is called automatically before the other actions. The two other cases cannot be solved directly and require more complex preprocessing. To solve these cases, we enhance procases with so-called conditional events, which allow us to “cut” branches of the procase and arrange them in a sequence, but which do not modify the procase syntax.
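Before turning to conditional events, the basic grouping step and the INIT rule described above can be sketched as follows. This is a minimal illustration in Python, not the generator's actual code: it assumes a flat procase without branches or loops, the function names are hypothetical, and leaving a request receive action with no following actions without brackets is a choice made here.

```python
from typing import List, Tuple


def split_actions(procase: str) -> List[str]:
    """Split a flat procase (no branches or loops) into action tokens."""
    return [token.strip() for token in procase.split(";") if token.strip()]


def is_request_receive(action: str) -> bool:
    """Request receive actions start with '?'."""
    return action.startswith("?")


def identify_procedure_calls(procase: str) -> str:
    """Enclose the actions that follow each request receive action in braces."""
    groups: List[Tuple[str, List[str]]] = []  # (trigger action, inner actions)
    for action in split_actions(procase):
        if is_request_receive(action):
            groups.append((action, []))
        else:
            if not groups:
                # The first action is not a request receive action:
                # prepend the special INIT action as the trigger.
                groups.append(("INIT", []))
            groups[-1][1].append(action)

    rendered = []
    for trigger, inner in groups:
        if inner:
            rendered.append(trigger + "{" + "; ".join(inner) + "}")
        else:
            rendered.append(trigger)  # assumption: no empty brackets
    return ";".join(rendered)


print(identify_procedure_calls("?PA.a; #b; !SA.c; ?PA.d; #e; #f"))
# prints: ?PA.a{#b; !SA.c};?PA.d{#e; #f}
```

For the example procase from the text, the sketch reproduces the bracketed form given above; a procase starting with an internal action, such as `#x; ?PA.a; #b`, is rendered as `INIT{#x};?PA.a{#b}`.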
Conditional events allow marking branches of the alternatives with a boolean variable (written in the procase just as a name without any prefix symbol, followed by a colon, e.g. D:). Such a mark modifies the behavior of the procase in such a way that only the traces which contain the action declaring the variable (as true) continue with the marked branch. When the value of the variable is false, the traces continue with the unmarked branches. A variable can be set to true via an action written as the variable name prefixed with the $ symbol (e.g. $D), or to false via its name with the $ and ∼ symbols (e.g. ∼$D). At the beginning of each procase, all variables are undeclared.
Branch transformation. First, we sketch how to rearrange a procase with the request receive action placed in a branch. The general approach of identifying procedures described above does not work here, as it would result in nested procedures. To avoid them, it is necessary to rearrange the procase so that the affected branches are placed sequentially. In its basic form, the rearrangement is as follows. The actions that follow the processed request receive action are cut out and wrapped into a new procedure call. The new procedure call is placed in a new branch action alongside NULL and marked with a new conditional event variable. The declaration of the new conditional event variable is placed at the former position of the request receive action. Figure 4 shows a procase before and after the preprocessing transformation. The processed procase is completely equivalent to the former one (they generate the same set of traces) and does not contain the problematic branch.
Fig. 4. Procase example before and after preprocessing
Loop transformation. The transformation of procases with the request receive action located in a loop action is quite similar to the previous case. Again, the transformation guarantees that the resulting procase generates the same traces as the original one. The following procase is an example with the request receive action inside the loop: ?PA.a;#b;(#c;?PA.d;#e)*#f. The resulting transformed procase is: ?PA.a {#b; (#c; $D + #f )}; (D: ?PA.d {#e; (#c + #f; ∼$D)} + NULL)*.
Unresolved cases. If the request receive action is located in two or more nested loops or in a loop nested in branches, the previous two transformations do not work. The procase is then marked as unresolved, excluded from further processing, and has to be handled manually. On the other hand, such use cases are very hard to read (see [6] for suggestions about avoiding nested branches and loops), and therefore the skipped use cases are candidates for rewriting in a simpler and more readable way.
4.2 Determining Arguments
Once the procases have been preprocessed into sequences of actions grouped as procedure calls, the next step is to determine the arguments of the identified procedures, the types of the arguments, and how their values are assigned. The arguments are subsequently used as arguments of methods in the final generated code. As mentioned in [6], noun phrases appearing in the use cases are tied to the labels of the domain model elements, and this connection is used in our approach for identifying the arguments. The process of determining arguments is as follows.
The noun phrases which may refer to the data manipulated in the use case step are extracted by the Procasor from the use case step sentence. The list of extracted words is matched against the keywords of the domain model (by keywords we mean the names of the classes, attributes, and associations). There are many options for matching the keywords; currently our implementation uses a simple case-insensitive equality of strings. The determined types are compared with the arguments of previous procedures (if any) and the already used arguments are copied (their values). If the previous procedures are located in a branch parallel to the NULL action, they are excluded from processing, as they may not have been called before the processed one.
Now, the process behaves differently based on the type of the entity on which the action is called. The types are (1) human user entities (UE) such as buyer, seller, etc. and (2) parts of the system or other computer systems (SE). For the trigger action and actions with UE SuD, the unmatched determined types are used as arguments (i.e. parameters which have to be entered by users). For actions with SE SuD, the unmatched determined types are also added as arguments but with default values (during the development of the final application, developers have to provide correct values for them).
4.3 Application Generation
Generation of the final application employs multiple commonly used design patterns. Based on these patterns, the generated code is structured into three layers: the presentation layer, the middle (business) layer, and the data layer. In the following text, we refer to objects of the presentation layer as pages, because the most commonly used presentation layer in contemporary large applications employs web pages, but any type of user interface can be generated in the same way. The middle layer consists of so-called use case objects, which contain the business logic of the application. The middle layer also contains entity objects, where the internal logic (implementation of the basic actions) is generated (the use case objects implement the ordering of the actions and call the entity objects). We do not describe generation of the data layer, as it is well covered by common UML tools and frameworks (generation of classes from class diagrams etc.; see Sec. 5).
The generation depends on the type of entity: pages are generated for UE, while for SE only non-interactive code is generated. Thus, a page is generated for every action performed by UE (procedure call triggering actions and procedure call internal actions). If the use case has SuD as UE, the elements generated from the internal actions of a procedure call are named with the suffix “X” to make them easier to identify during future development, as in most cases they have to be modified.
Based on the combination of communicating actors, the generation distinguishes four cases of how the code is generated from a procedure call. (i) If PA and/or SA is UE, a page is generated for every procedure call which is triggered by this UE. Otherwise (i.e. PA/SA is SE), (ii) an action implementation method is generated in the actor entity object and the body of the action method contains a call to the corresponding use case object. (iii) If SuD is UE, a method in the corresponding use case object is generated for each procedure call of SuD. The method body calls the actor entity object and redirects to “X” pages, which manage the internal procedure call actions. Internal procedure call actions are generated in a similar way to the request receive action with UE PA: the “X” page and a method inside the “X” use case object are generated. The method inside the “X” use case object is generated as a simple delegation method to the corresponding entity object, followed by a redirection to the particular page. Finally, (iv) if SuD is SE, a method whose body contains the internal procedure call actions is generated inside the corresponding use case object. Figure 5 shows the procase of the Buyer-buys-selected-offer use case and the generated elements of this use case. The following sections describe each type of the generated objects in more detail.
Fig. 5. Generated elements
Pages. As generated, pages are intended for testing the use cases and are expected to be reimplemented during further development. A single page is generated for each action interacting with UE. In the case of UE PA there is a page for every triggering action, and in addition, for UE SuD there is also a page for every procedure call internal action. If the action has arguments which can be entered, an input field is generated for each of them. Values are assigned by humans while testing the generated system. For the UE PA actions, the corresponding pages have a button (an input control element) that allows the user to continue to the next page, i.e. to continue in the use case (there is only a single button, as there is no other way to continue).
On the pages belonging to the UE SuD actions, there are several buttons, which reflect the possibilities of continuation in the original use case. For a sequence of actions, the page contains the “continue” button; if the next action is a branch action, the page contains a button for each branch (the default button is for the main scenario branch, and the buttons for the rest of the branches are labeled with the branch condition label); if the next action is a loop action, the page has a button to enter the loop and another button to skip the loop (following the definition of the loop operation).
Use Case Objects. The use case objects contain the business logic (work-flow) of the use case, i.e. the order of actions in the main scenario and all possible branches. The bodies of the generated methods differ according to SuD.
UE SuD: As described above, a method in the corresponding use case object is generated for each procedure call of UE SuD. For each trigger action, the method body contains a call to the particular entity object and a redirection to the page of the subsequent action. For internal procedure call actions, a similar method body is created in the “X” use case object.
SE SuD: The body of the method generated for the procedure call trigger action from a use case with SE SuD contains the internal procedure calls. For the SuD internal actions, methods are called on the SuD entity object of the use case, and for request send actions, methods are called on the entity objects triggered by the action. Branch actions are generated as a sequence of condition statements (i.e. if () ... else if () ...) with as many elements as there are branches in the branch action. In each if statement the actions of the particular branch are generated, while the last else statement contains the main scenario actions. A similar construction, but with a loop statement (while), is created for the loop action.
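The shape of the statements generated for SE SuD can be illustrated by a small emitter. The sketch below is written in Python and produces Java-like text, since the tool targets Java; it is a hedged illustration only, not the generator's real templates. The function names, the rendering of each action as a parameterless method call, and the condition names (`OFFER_EXPIRED`, `REPEAT_SEARCH`) are all hypothetical placeholders.

```python
from typing import List, Tuple


def render_branch_action(main_actions: List[str],
                         branches: List[Tuple[str, List[str]]],
                         indent: str = "    ") -> str:
    """Emit an if / else-if chain; the final else holds the main scenario."""
    if not branches:
        raise ValueError("a branch action needs at least one branch")
    lines: List[str] = []
    for index, (condition, actions) in enumerate(branches):
        opener = "if" if index == 0 else "} else if"
        lines.append(f"{opener} ({condition}) {{")
        lines.extend(f"{indent}{action}();" for action in actions)
    lines.append("} else {")
    lines.extend(f"{indent}{action}();" for action in main_actions)
    lines.append("}")
    return "\n".join(lines)


def render_loop_action(condition: str,
                       body_actions: List[str],
                       indent: str = "    ") -> str:
    """Emit a while statement wrapping the actions of a loop action."""
    lines = [f"while ({condition}) {{"]
    lines.extend(f"{indent}{action}();" for action in body_actions)
    lines.append("}")
    return "\n".join(lines)


print(render_branch_action(
    main_actions=["confirmOffer"],
    branches=[("OFFER_EXPIRED", ["informBuyer"])],
))
```

The conditions are passed in as opaque strings because the use case itself does not say when a branch is taken or how often a loop repeats.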
The number of iterations in the loop statement and the choice of the particular branch in the condition statements cannot be determined from the use case. Therefore, the statements are generated with predefined but configurable constants inside the use case object.
Entity Data Objects. The internal logic of actions is captured neither by the use cases nor by the domain model. Therefore, the entity objects are generated with almost empty methods containing only calls to a logger, and they have to be finished by developers. For testing purposes, the logging methods seem to be the most suitable ones, as designers can immediately check the traces of the generated system.
4.4 Navigation
Navigation (transitions) between the pages is an important part of the application internal logic as it determines the order of procedure calls and sequence of actions. The navigation is derived from the procases as a set of navigation rules. The pages/objects have associated these rules that contain under which circumstances which transition has to be chosen.
In general, the navigation rules are created from special actions that can change transitions (aborts, etc.), branch actions, and loops. In the case of a use case with the SE SuD, the rules are applied to determine transitions between calls of the actions. In the case of the UE SuD, the rules determine how the pages are generated, i.e. which buttons are placed on them.

4.5 Implementation Details
To show that our approach is feasible, we have implemented the proposed generator. As the technology in which applications are generated, we chose the Java EE platform with Enterprise Java Beans as the business layer and Java Server Faces as the presentation layer. The generator itself is written in plain Java. It generates sources together with an Ant build file, which can immediately compile and deploy the application to the JBoss application server; this allows users to inspect and modify the code and to test the application iteratively. Based on the chosen technologies and the design patterns used, the first two application layers are mapped to the following five tiers. The pages result in two tiers: (i) JSF pages and (ii) backing beans. The middle layer results in three tiers: (iii) the business delegator tier, (iv) the Enterprise Java Bean tier, and finally (v) the manager tier. Generation of the data persistence layer is not currently supported, but it is a simple task, which will be added soon (see Sec. 6).
5 Evaluation and Related Work
To verify our generator, we used it on the Marketplace application (see Sec. 2.1). The generated implementation was compiled and directly deployed to an application server. The implementation consists of approx. 70 classes with 13 EJBs. The complete application has 92 actions, of which only 16 were generated with wrong arguments and had to be repaired manually (we are working on an enhanced method of argument detection; see Sec. 6). Testing of the generated application revealed the need to add one use case and two missing extensions in another use case, and also suggested restructuring two other use cases. All these defects could also have been detected directly from the use cases, but with the generated application they became evident immediately.

Our tool can also be viewed as an ideal application of the Model-driven Architecture (MDA) [8] approach. In this view, the use cases and the domain model serve as a platform independent model, which is transformed via several transformations directly into executable code, i.e. a platform specific model. Currently, the existing tools usually generate just data structures (source code files, database tables, or XML descriptors) from UML class diagrams but no interaction between entities (i.e. they handle just the class diagrams), and as far as we know, there is no tool/project that generates an implementation from a description in a natural language. Below, we list several projects and tools that take more than class diagrams as input, but they still work with diagrams and not with a natural language.
AndroMDA [1] is a generator framework which transforms UML models into an implementation. It supports transformations into several technologies, and new transformations can be added. In general, it works with class diagrams and generates the source code based on the class stereotypes. Moreover, it can be extended to work with other diagram types. A similar generator (made as an Eclipse extension) is openArchitectureWare [11], which is a general model-to-model transformation framework. In [12], sequence diagrams together with a class diagram are used to generate fragments of code in a language similar to Java. The generation is based on the order of messages captured in the sequence diagram and on the structure of the class diagram. The authors also propose an algorithm for checking consistency between these two types of diagrams. Similarly, in [4], Java code fragments are generated from collaboration and class diagrams. The authors use enhanced collaboration diagrams in order to allow better management of variables in the generated code. In [13], the use cases are automatically parsed and, together with a domain model, used to produce a state transition machine, which reflects the behavior of the system. From a high-level view, this approach is very similar to our solution, but it can process only very restricted use cases and is thus quite limited.
6 Conclusion and Future Work
The approach proposed in the paper allows for automated generation of executable code directly from a requirements specification written as use cases in a natural language. We have also developed a prototype, which generates JEE applications via the proposed approach. Applications generated by our tool are immediately ready to be deployed and launched, and they are suitable for testing the use cases (i.e. whether the requirements specification is complete and well structured) and as a starting point for the development of the real implementation.

The proposed generator has several shortcomings, which leave room for further improvement. An important issue is connected with associations among the classes in the domain model. The current implementation correctly handles just one-to-one associations. One-to-many and many-to-many associations result in arrays in the code of an application, and therefore the determination of the arguments is more complex. We plan to solve this limitation by analysing the sentence to determine whether a method argument is an array or the object itself. We also plan to add a categorization of verbs to allow better management of the arguments of the procedures. Finally, we plan to add the generation of a data layer to the application. The required structure of the use cases (based on the recommendations in [2]) can be seen as another limitation, but we already have an approach which allows processing of use cases with almost any structure (see [3]), and we are incorporating it into the implementation.
Acknowledgements The authors would like to thank Vladimir Mencl and Jiri Adamek for valuable comments. This work was partially supported by the Grant Agency of the Czech Republic project 201/06/0770.
References
1. AndroMDA, http://galaxy.andromda.org
2. Cockburn, A.: Writing Effective Use Cases, 1st edn. Addison-Wesley, Reading (2000)
3. Drazan, J., Mencl, V.: Improved Processing of Textual Use Cases: Deriving Behavior Specifications. In: van Leeuwen, J., Italiano, G.F., van der Hoek, W., Meinel, C., Sack, H., Plášil, F. (eds.) SOFSEM 2007. LNCS, vol. 4362, pp. 856–868. Springer, Heidelberg (2007)
4. Engels, G.: UML collaboration diagrams and their transformation to Java. In: France, R.B. (ed.) UML 1999. LNCS, vol. 1723, pp. 473–488. Springer, Heidelberg (1999)
5. Fiedler, M., Francu, J., Mencl, V., Ondrusek, J., Plsek, A.: Procasor Environment: Interactive Environment for Requirement Specification, http://dsrg.mff.cuni.cz/~mencl/procasor-env
6. Larman, C.: Applying UML and Patterns: An Introduction to Object-Oriented Analysis and Design and the Unified Process, 2nd edn. Prentice Hall PTR, Upper Saddle River (2001)
7. Mencl, V.: Deriving Behavior Specifications from Textual Use Cases. In: Proc. of WITSE 2004, Linz, Austria (September 2004)
8. OMG: Model Driven Architecture (MDA), OMG document ormsc/01-07-01 (July 2001)
9. Plasil, F., Mencl, V.: Getting “Whole Picture” Behavior in a Use Case Model. In: Proceedings of IDPT, Austin, Texas, U.S.A. (December 2003)
10. Plasil, F., Visnovsky, S.: Behavior Protocols for Software Components. IEEE Transactions on Software Engineering 28(11) (November 2002)
11. openArchitectureWare, http://www.openarchitectureware.org
12. Quan, L., Zhiming, L., Xiaoshan, L., Jifeng, H.: Consistent Code Generation from UML Models, UNU-IIST Rep. No. 319, The United Nations University (April 2005)
13. Somé, S.S.: Supporting Use Cases based Requirements Engineering. Information and Software Technology 48(1), 43–58 (2006)
Enhancing Use Cases with Screen Designs

Łukasz Olek, Jerzy Nawrocki, and Mirosław Ochodek

Poznań University of Technology, Institute of Computing Science, ul. Piotrowo 3A, 60-965 Poznań, Poland
{Lukasz.Olek,Miroslaw.Ochodek}@cs.put.poznan.pl
Abstract. This paper presents a language called ScreenSpec that can be used to specify screens at the requirements elicitation phase. ScreenSpec was successfully applied in 8 real projects. It is very effective: the average time needed to specify a screen is 2 minutes, and it takes about an hour to become proficient in using it. The visual representation generated from ScreenSpec can be attached to a requirements specification (e.g. as adornments to use cases).

Keywords: Use cases, GUI Design, Prototyping, ScreenSpec.
1 Introduction
Use cases are the most popular way of specifying functional requirements. A survey published in IEEE Software in 2003 [12] shows that over 50% of software projects elicit requirements as use cases. A use case is a good way of describing the interaction between a user and a system at a high level, so the number may be even higher now. At the same time, many practitioners (in about 40% of projects [12]) draw user interfaces to better visualise how the future system will behave. This is wise, since showing user interface designs (e.g. prototypes [14,17,8,18], storyboards [10]) together with use cases helps to detect more problems with requirements ([13], see footnote 1). Unfortunately, user interface details would clutter the use-case description and should be kept apart from the steps [6,7], but they can be attached to use cases as adornments [6].

Much has been said about writing use cases [6,7,16,11,9] (e.g. how to divide them into a main scenario and extensions, what type of language to use); however, it is not clear how to specify UI details as adornments. Practitioners seem either to draw screens in graphical editors and attach the graphical files to use cases, or just to describe them in a natural language. Both approaches have advantages and disadvantages. The graphical approach is easier for humans to analyse, but more difficult to prepare and maintain. The textual approach, on the other hand, is much easier to prepare, but not so easy to perceive.

The goal of this paper is to propose a simple formalism called ScreenSpec for specifying user interface details. It has the advantages of both approaches mentioned earlier: it is inexpensive to prepare and maintain, and it can be automatically converted to a graphical form (which, attached to use cases as adornments, can stimulate readers visually). Currently the language is limited to describing user interfaces of web applications.

It is easy to propose a new formalism, but much more difficult to prove that it is useful. ScreenSpec has been successfully used in 8 real projects. An investigation was carried out to find out how much effort is needed to use ScreenSpec and how much time it takes to learn it.

The plan of this paper is as follows. Section 2 describes some approaches to UI specification that are popular in software engineering. Section 3 describes the ScreenSpec language. Section 4 describes case studies that were conducted to check whether ScreenSpec is complete and flexible enough to specify real applications and how much effort it takes to specify screens at the requirements elicitation phase. Section 5 describes a way to generate graphical files representing particular screens from ScreenSpec. Section 6 presents how the visual representation of screens can be embedded in requirements documents: as adornments or as a mockup. The paper is concluded in Section 7.

This research has been financially supported by the Polish Ministry of Science and Higher Education grant N516 001 31/0269.
1 See the experiment conclusions in the section “Mockup helps to unveil usability problems”.

Z. Huzar et al. (Eds.): CEE-SET 2008, LNCS 4980, pp. 48–61, 2011. © IFIP International Federation for Information Processing 2011
2 Related Work
There are many outstanding technologies and formalisms used to describe user interfaces. One of them is XUL [2] (XML User Interface Language) from the Mozilla Foundation. This is an interesting language used to describe platform-independent user interfaces. It has many interesting features, e.g. separation of presentation from application logic, easy customisation, localisation and branding. However, this language seems to be too implementation-oriented to be used at the requirements elicitation phase. There are two important impediments to using XUL for this purpose. Firstly, one has to know the component type before a component can be specified. Therefore it is not possible to start by specifying the structure of the information connected with a particular screen and then add details about control types (see Section 3.3). Secondly, labels for components have to be defined explicitly. This means that two declarations are required for each component: one for the label and one for the control. However, it would be interesting to generate (or round-trip-generate) XUL from ScreenSpec. The produced XUL could easily be used for implementation.

Other technologies are becoming more and more popular nowadays, for instance MDSD (Model Driven Software Development) approaches, which provide the ability to generate a whole application from a set of models (e.g. WebML [3], UWE/ArgoUWE [5]). Unfortunately, this approach still seems to require too much effort to be used successfully at the requirements elicitation phase. Poznań University of Technology cooperates with companies that use such an approach. According to their experience, it takes at least several hours to describe a single use case with MDSD models. This is definitely too long for the requirements elicitation phase, so they use generic text editors to specify use cases and screens.
3 ScreenSpec - Language for Screen Specification
The goal of the ScreenSpec language is to allow an analyst to specify the structure of a user interface very efficiently. Since this approach is supposed to be used at the early stages of the requirements elicitation phase, it is wise to focus on the structure of screens and the information exchanged between a user and a system, rather than on attributes such as colours, fonts, or the layout of components. This is called a lo-fidelity approach [19,15] and is used in ScreenSpec. It is best to explain what a ScreenSpec specification looks like using a simple example. Let us imagine an Internet shop with a view of all categories (see Fig. 1). It contains a list of categories (each of them is represented by a link with its name) and lists of subcategories. There are also some links to other shopping views (by brand, by store, etc.).

3.1 Language Definition
Screens. Each screen is specified with a set of lines that describe its structure. The definition starts with the keyword SCREEN followed by the screen’s identifier (each screen must have a unique ID). The following lines are indented and describe the components that belong to the screen:

  SCREEN Screen_ID:
    Component_definition_1
    Component_definition_2
    ...
where Component_definition_n is a definition of a basic component or a group.

  SCREEN Shop_By_Category:
    Categories(LIST):
      Name(LINK)
      Subcategories(LIST):
        Name(LINK)
    Shop_by_Brand(LINK)
    Shop_by_Store(LINK)

Fig. 1. A screen showing a list of categories (screenshot taken from shopping.yahoo.com), and its corresponding specification in ScreenSpec

Basic components. Basic components are mainly simple widgets taken from the HTML language. They are specified using the following syntax:

  Component_ID[(CONTROL_TYPE)]
where:
– Component_ID is a string that identifies the component, e.g. Login_name, Password, Author. It has to be unique in the scope of the screen or parent group.
– CONTROL_TYPE is one of: BUTTON, LINK, IMAGE, STATIC_TEXT, DYNAMIC_TEXT, EDIT_BOX, COMBO_BOX, RADIO_BUTTON, LIST_BOX, CHECK_BOX, CUSTOM. This parameter is optional (EDIT_BOX by default) and can be defined by an analyst later (see Section 3.3).

Three controls do not come from the standard set of HTML controls and thus require additional comment: (1) STATIC_TEXT is used to display a text that will be the same on each screen instance (e.g. a comment or instructions); (2) DYNAMIC_TEXT is a block of text generated dynamically by the system (e.g. a total value of an invoice); and (3) CUSTOM is used to represent non-standard components (e.g. a date picker).

Groups. Groups of components are containers for widgets, usually providing additional features. They are defined using the following syntax:

  Group_ID[(GROUP_TYPE)]:
    Component_definition_1
    Component_definition_2
    ...
where:
– Group_ID is a string that identifies the group.
– Component_definition_n is a definition of a basic component or another group of components.
– GROUP_TYPE determines the type of the group:
  • SIMPLE or omitted - such a group is just used to introduce structure to the screen, but it does not provide any additional semantics to its components
  • LIST - its components are repeated in a list; all child components form a single list item
  • TABLE - its components are repeated in rows, as a table (similar to LIST, but with a different layout)
  • TREE - similar to LIST, but its items can also contain other lists of the same structure (e.g. used to create a tree of categories or a site map tree).

3.2 ScreenSpec Advanced Elements
Static values are used to specify that a component will have the same values on each screen. For example, a combo box that allows the user's sex to be chosen would always have two values: “Male” and “Female”. Such a case can be expressed in ScreenSpec using the following structure:
Component_ID(...):Value_1|Value_2|...
e.g. Sex(COMBO_BOX):Male|Female

The template mechanism can be used to build more complex screens. Two elements are needed to do this: a template definition and a reference to the template in the particular screen. This can be done using the following structure (shown in the example):

  TEMPLATE Main:
    Header:
      Sign_in(LINK)
      Register(LINK)
      ...
    Contents(CUSTOM)
    ...
  SCREEN All_categories(Main):
    ...
Each template is defined with the keyword TEMPLATE followed by a template name. The structure of a template is the same as the structure of a screen. The main difference is that it requires the additional component Contents(CUSTOM), which is a placeholder for the specific screen that will use the template. When a screen is supposed to use a particular template, the name of the template has to be given in parentheses, e.g. SCREEN All_Categories(Main).

Include common screen elements. When the same controls (functional blocks) are used in several screens, it is wise to declare them once and then refer to them. This can be done using the INCLUDE keyword:

  SCREEN Search_box:
    ...
  SCREEN Search_results:
    INCLUDE Search_box
    Results(LIST):
      ...
Select clause. It is common that a screen has different variants depending on the situation. For example, a screen can display a list of all accepted papers or a message that no papers are accepted yet. When one of several variants can be used, it is specified by a select clause:

  SELECT:
    Component_definition_1
    Component_definition_2
This means that either Component_definition_1 or Component_definition_2 will be displayed on a real screen. More than two variants are possible (additional lines should be added within the SELECT clause). The components specified here can be either simple components or groups.
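The line-oriented syntax of Sections 3.1 and 3.2 can be sketched as a small indentation-based parser. This is a minimal sketch under stated assumptions, not the authors' tool: nesting is assumed to be expressed by leading spaces, the node dictionaries and the function names `parse_component` and `parse` are hypothetical, and TEMPLATE, INCLUDE, and SELECT are deliberately left out. It covers screens, basic components (with EDIT_BOX as the default control type), groups, and static values.

```python
import re

GROUP_TYPES = {"SIMPLE", "LIST", "TABLE", "TREE"}
# Component_ID, optional "(TYPE)", optional ":", optional static values.
LINE = re.compile(r"^(?P<id>\w+)(?:\((?P<type>\w+)\))?(?P<colon>:)?(?P<rest>.*)$")


def parse_component(text):
    """Turn one component or group line into a node dictionary."""
    m = LINE.match(text)
    if m is None:
        raise ValueError(f"cannot parse line: {text!r}")
    ctype, rest = m.group("type"), m.group("rest").strip()
    if rest and not m.group("colon"):
        raise ValueError(f"cannot parse line: {text!r}")
    if m.group("colon") and not rest:
        # "Group_ID[(GROUP_TYPE)]:" opens a group; SIMPLE when the type is omitted.
        kind = ctype or "SIMPLE"
        if kind not in GROUP_TYPES:
            raise ValueError(f"unknown group type: {kind}")
        return {"id": m.group("id"), "kind": kind, "children": []}
    # Basic component; the control type defaults to EDIT_BOX.
    node = {"id": m.group("id"), "kind": ctype or "EDIT_BOX"}
    if rest:
        node["values"] = rest.split("|")  # static values: Value_1|Value_2|...
    return node


def parse(source):
    """Parse ScreenSpec text into a dictionary {screen_id: screen_node}."""
    screens, stack = {}, []  # stack holds (indent, container_node) pairs
    for raw in source.splitlines():
        text = raw.strip()
        if not text or text == "...":
            continue
        indent = len(raw) - len(raw.lstrip(" "))
        if text.startswith("SCREEN "):
            screen_id = text[len("SCREEN "):].rstrip(":").strip()
            node = {"id": screen_id, "kind": "SCREEN", "children": []}
            screens[screen_id] = node
            stack = [(indent, node)]
            continue
        node = parse_component(text)
        # Close every container that is not shallower than this line.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        if not stack:
            raise ValueError(f"line outside of a screen: {text!r}")
        siblings = stack[-1][1]["children"]
        # IDs must be unique within the screen or parent group.
        if any(s["id"] == node["id"] for s in siblings):
            raise ValueError(f"duplicate component ID: {node['id']}")
        siblings.append(node)
        if "children" in node:
            stack.append((indent, node))
    return screens


EXAMPLE = """SCREEN Shop_By_Category:
  Categories(LIST):
    Name(LINK)
    Subcategories(LIST):
      Name(LINK)
  Shop_by_Brand(LINK)
  Shop_by_Store(LINK)
"""

if __name__ == "__main__":
    print(parse(EXAMPLE)["Shop_By_Category"])
```

Keeping the control type optional in the parser is what lets a specification start as a bare list of component names and be refined later without changing its structure.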
3.3 Iterative Approach to Screen Specification
ScreenSpec is designed to be used by an analyst at the requirements elicitation phase. This phase is exploratory, which means that change-involving decisions are made frequently. It is therefore important to support an incremental approach to the screen specification process: an analyst should be able to describe a screen only roughly at the beginning (just the structure of the information) and add more details later (when a customer confirms that it is correct). Therefore ScreenSpec has 3 levels of detail:

L1 Component names - need to be specified at the beginning.
L2 Types of controls and groups - specify the types of information connected with each screen.
L3 Static values and templates.

These levels can be mixed throughout the specification process: some fragments of screens can be written at one level of detail, whereas others are written at another level.
4 Experience with ScreenSpec

4.1 Specifying Screens for the Real Projects – Case Studies
Analysts usually use word processors and sheets of paper to author requirements. Keeping this in mind, introducing formalised requirements models can be risky: it may happen that some features of the developed system are too difficult to describe. To make sure that the ScreenSpec formalism is complete and flexible enough to describe real systems, eight case studies were conducted. They covered a large variety of projects. Some of them were internally complex (a large number of sub-function requirements, see footnote 2), with a small amount of interaction with a user (e.g. Project A, Project C). Others were interaction-oriented, with a great number of use cases and screens (e.g. Project D, Project G).

The first 6 projects were selected from the Software Development Studio course at Poznań University of Technology. These projects were developed for external customers by students of the Master of Science in Software Engineering programme. The students used the ScreenSpec approach to specify screens successfully. They also raised some minor suggestions for the ScreenSpec language, and small simplifications were introduced afterwards. Screens for two commercial projects were then also written in ScreenSpec. In both cases all screens were successfully specified.

The number of lines of code (LOC) per screen differs depending on the screen complexity. In the analysed projects the average LOC per screen varies from 3.0 to 14.5 (see Table 1).

2 After Cockburn [7]: a sub-function requirement is a requirement that is below the main level of interest to the user, e.g. “logging in”, “locate a device in a DB”.
Table 1. Eight projects selected for the case study

            Number of Use Cases              ScreenSpec Screens
Project     Business  User  Subfunctional    Screens  Average LOC/Screen  Total LOC
Project A   0         4     2                4        14.5                58
Project B   3         13    2                5        9.4                 47
Project C   0         5     0                4        3                   12
Project D   0         16    0                27       4.7                 128
Project E   0         4     0                7        3.9                 27
Project F   1         3     2                3        13                  39
Project G   0         44    39               92       9.5                 917
Project H   2         12    0                7        5.1                 36

4.2 ScreenSpec Efficiency Analysis
Although the average amount of code required to specify a screen with ScreenSpec seems to be rather small, two important questions arise:
– Q1: how much effort is required to specify (see footnote 3) a screen?
– Q2: how much time is required to learn how to use ScreenSpec?

The second question is also important because practitioners tend to choose solutions which provide business value and are inexpensive to introduce. If extensive training is required in order to use ScreenSpec efficiently, there is a serious threat that the language will not be attractive to potential users. In order to answer these questions, a controlled case study was conducted (see footnote 4). Eight participants were asked to specify a sequence of 12 screens coming from a real application (provided as a series of application screenshots). The time required for coding each of the screens was measured precisely (to the second). The code was written manually on sheets of paper. Participants were also asked to copy a sample screen specification, in order to examine their writing speed. Before they started to specify screens, they had been introduced to ScreenSpec during a 15-minute lecture, and each of them was also provided with a page containing the ScreenSpec specification in a nutshell. All materials provided to participants are published at [1].

3 The term "specifying" is understood here as the process of transcribing the vision of the screen into the ScreenSpec code.
4 The case study is labeled here as controlled because the methodology was similar to that used in controlled experiments; however, the nature of the questions being investigated refers to "common sense" rather than to some obtainable values (e.g. comparing the average learning time to the one which is acceptable for the industry).
Table 2. Effort and lines of code for each participant and task (the sample screen refers to the task measuring the participants' writing speed)

Time [min]
Participant  Sample Screen  1    2    3    5    6    7    8    9    11   12
P1           0.9            4.3  2.0  2.0  6.2  2.3  4.7  7.7  3.3  5.5  6.3
P2           1.6            2.8  2.0  3.6  3.6  1.2  2.0  4.1  3.1  3.1  4.2
P3           1.2            1.8  2.0  3.0  4.6  1.5  2.4  6.2  4.8  3.9  4.4
P4           1.3            1.7  1.4  5.7  2.5  1.8  1.9  3.3  3.5  4.7  3.9
P5           1.0            2.4  2.2  1.0  2.5  1.4  1.8  3.6  4.3  3.0  4.2
P6           1.0            2.5  1.6  1.5  3.7  1.3  1.4  4.1  3.0  3.3  3.8
P7           1.0            1.9  1.5  1.6  2.0  1.1  1.3  2.3  1.8  3.2  3.5
P8           1.6            5.4  4.2  3.2  5.8  3.7  2.5  4.7  5.6  4.0  6.7
Mean         1.2            2.9  2.1  2.7  3.9  1.8  2.3  4.5  3.7  3.8  4.6
SD           0.3            1.3  0.9  1.5  1.6  0.9  1.1  1.7  1.2  0.9  1.2

Lines of code - LOC
Participant  Sample Screen  1    2    3    5     6    7    8     9     11    12
P1           8              14   6    8    17    7    20   35    20    25    32
P2           8              9    7    9    12    6    8    21    16    18    19
P3           8              8    6    9    10    6    7    22    16    18    25
P4           8              7    6    9    10    6    7    15    17    20    19
P5           8              11   8    8    11    6    8    16    16    15    23
P6           8              9    7    8    9     7    8    17    14    17    23
P7           8              10   7    9    10    6    7    17    15    21    21
P8           8              5    7    7    10    6    7    14    16    13    18
Mean         8.0            9.1  6.8  8.4  11.1  6.3  9.0  19.6  16.3  18.4  22.5
SD           0.0            2.7  0.7  0.7  2.5   0.5  4.5  6.8   1.8   3.7   4.5
Descriptive analysis and data clearing. During the completion of each task (single screen specification) two values were measured:
– the time required to finish the task,
– the lines of code developed to specify the screen.

The screen specifications developed by the participants differed in size, because they were based only on screenshots, which were perceived slightly differently by different people. Moreover, some of the ScreenSpec structures are optional. The detailed results of the case study are presented in Table 2. Before proceeding to further analysis, the results for all tasks were carefully analysed in order to find potential outliers. A task was marked as suspicious if the variability in the lines of code provided by the participants was high (or there were outlying observations). According to the box plots presented in Figure 2, tasks 1, 4, 5, 7, 8, 9, 10, 11, 12 were chosen for further investigation in order to find the reasons for the LOC variability. It turned out that tasks 4 and 8 were ambiguous, because in both cases there were two possible interpretations of the screen's semantics. Moreover, the amount of code required to specify each of the two versions differed significantly. Therefore those tasks were excluded from further analysis.
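The Mean and SD rows of Table 2 can be recomputed from the per-participant values, which also gives a rough numeric view of the LOC variability discussed above. The sketch below uses Python's `statistics` module with the sample standard deviation; the coefficient of variation is an assumed indicator added here for illustration, since the paper itself relies on box plots and states no numeric threshold.

```python
from statistics import mean, stdev

# Lines of code written by participants P1..P8 (values taken from Table 2).
LOC_TASK_2 = [6, 7, 6, 6, 8, 7, 7, 7]
LOC_TASK_8 = [35, 21, 22, 15, 16, 17, 17, 14]


def summarize(values):
    """Return (mean, sample standard deviation) of the observations."""
    return mean(values), stdev(values)


def variation(values):
    """Coefficient of variation: SD relative to the mean (assumed indicator)."""
    m, sd = summarize(values)
    return sd / m


if __name__ == "__main__":
    for label, values in (("task 2", LOC_TASK_2), ("task 8", LOC_TASK_8)):
        m, sd = summarize(values)
        print(f"{label}: mean={m:.1f} SD={sd:.1f} CV={variation(values):.2f}")
```

With these inputs the relative spread for task 8 is several times larger than for task 2, which is consistent with task 8 being singled out as ambiguous.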
Fig. 2. Effort and size of code (LOC) variability for each task: a) box plot of effort (Time [min]) for each task (screen), b) box plot of lines of code (Size [LOC]) for each task (screen)
Productivity analysis. Based on the effort and code size measured for each task performed by each participant, a productivity factor can be derived. It is defined here as the time required to produce a single line of code and can be calculated using equation 1:

$$PROD = \frac{Effort}{Size} \qquad (1)$$

where:
– PROD is the productivity factor, understood as the number of minutes required to develop a single line of code,
– Effort is the effort required to complete the task (measured in minutes),
– Size is the size of the code developed to specify the screen (measured in LOC).

The effort measured during the case study consists of two components: (1) the time required for thinking and (2) the time required for writing down the screen. It would be difficult to measure both of them precisely; however, knowing the writing speed of each participant (see equation 2), it is possible to calculate the approximate effort spent only on thinking. This can be further used to estimate the cognitive productivity factor (see equation 3), which can be understood as the productivity of thinking while coding the screen. It is independent of the tool (it reflects the effort needed to mentally produce the screen-specification code).

$$V_{writing} = \frac{Size_{sample}}{Effort_{sample}} \qquad (2)$$

where:
– V_writing is the writing speed (measured in LOC per minute),
– Effort_sample is the effort required to copy the code for the sample screen (measured in minutes),
– Size_sample is the size of the code for the sample screen (8 LOC).

$$PROD_{cognitive} = \frac{Effort - Size / V_{writing}}{Size} \qquad (3)$$

where:
– PROD_cognitive is an estimate of the cognitive productivity factor, understood as the number of minutes spent on thinking in order to produce a single line of code,
– Effort is the effort required to complete the task (measured in minutes),
– Size is the size of the code developed to specify the screen (measured in LOC),
– V_writing is the writing speed (measured in LOC per minute).

Cognitive and standard productivity factors were calculated for each task performed by the participants. The mean values for each task are presented in Figure 3.
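As a worked instance of equations 1 to 3, the sketch below evaluates them for a single observation: participant P1 on task 1 (4.3 minutes, 14 LOC), whose sample screen of 8 LOC took 0.9 minutes to copy according to Table 2. The function names are chosen here for illustration only.

```python
def writing_speed(size_sample, effort_sample):
    """Equation 2: writing speed in LOC per minute."""
    return size_sample / effort_sample


def productivity(effort, size):
    """Equation 1: minutes needed per line of code."""
    return effort / size


def cognitive_productivity(effort, size, v_writing):
    """Equation 3: minutes of thinking per line of code."""
    return (effort - size / v_writing) / size


if __name__ == "__main__":
    v = writing_speed(size_sample=8, effort_sample=0.9)
    prod = productivity(effort=4.3, size=14)
    cog = cognitive_productivity(effort=4.3, size=14, v_writing=v)
    print(f"V_writing={v:.2f} LOC/min, PROD={prod:.3f}, PROD_cognitive={cog:.3f} min/LOC")
```

Dividing the numerator of equation 3 by Size term by term shows that PROD_cognitive equals PROD minus 1/V_writing, so the cognitive factor is simply the standard factor with the pure writing time per line removed.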
Fig. 3. Mean (cognitive and standard) productivity factors for each task (screen). Vertical axis: Productivity [min/LOC]; horizontal axis: Task (screen); series: mean productivity factor and mean cognitive productivity factor.
Q1: how much effort is required to specify a screen? Comparing the mean productivities for the first and the last task, an average beginner produces around 2.81 LOC/minute, while a person with some experience produces around 4.7 LOC/minute (this may of course vary depending on the screen complexity). This means that the total effort of specifying all of the screens for the largest project included in the case studies, Project G (92 screens with a total of 917 LOC of screen specifications), would vary from 3.2 to 5.4 hours depending on the analyst's skill. Moreover, the average screen size is around 8 LOC (average from Table 1), which could be specified in less than 2 minutes by an experienced analyst and in less than 3 minutes by a beginner. Therefore it seems that the ScreenSpec notation might be used directly during meetings with a customer. It is also worth mentioning that if an efficient editor (with high usability) were available, the productivity factor for a potential user would be closer to the cognitive one. This means that an 8 LOC screen would be specified in about 30 seconds.

Q2: how much time is required to learn how to use ScreenSpec? The learning process can be investigated by looking at the productivity chart presented in Figure 3. The ratio between the productivity factors calculated for the ending and beginning tasks is 1.69. In addition, it seems that the learning process saturates after completing 8-10 tasks. Therefore it seems that participating in a single training session, which includes a short lecture and ten practical tasks (about an hour), should be enough to start using ScreenSpec effectively. An interesting observation concerns task number 5, because the productivity factor suddenly increased at this point (more time was required to produce one line of code). This issue was further investigated, and the finding was that the screen for that task contained interactive controls which appeared for the first time in the training cycle (edit boxes, check boxes etc.). Thus an important suggestion for the preparation of a training course is to cover all of the components available in the ScreenSpec language.
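The Q1 estimates above follow from dividing a specification size by a writing rate. The short sketch below reproduces that arithmetic with the figures quoted in the text (2.81 and 4.7 LOC/minute, 917 LOC for Project G, about 8 LOC for an average screen); the function and constant names are illustrative only.

```python
BEGINNER_SPEED = 2.81     # LOC per minute, figure quoted in the text
EXPERIENCED_SPEED = 4.7   # LOC per minute, figure quoted in the text
PROJECT_G_LOC = 917       # total ScreenSpec LOC of Project G (Table 1)
AVERAGE_SCREEN_LOC = 8    # approximate average screen size


def effort_minutes(loc, loc_per_minute):
    """Estimated time in minutes to write `loc` lines of ScreenSpec."""
    return loc / loc_per_minute


def effort_hours(loc, loc_per_minute):
    """Estimated time in hours to write `loc` lines of ScreenSpec."""
    return effort_minutes(loc, loc_per_minute) / 60.0


if __name__ == "__main__":
    for label, speed in (("beginner", BEGINNER_SPEED), ("experienced", EXPERIENCED_SPEED)):
        hours = effort_hours(PROJECT_G_LOC, speed)
        minutes = effort_minutes(AVERAGE_SCREEN_LOC, speed)
        print(f"{label}: Project G {hours:.1f} h, average screen {minutes:.1f} min")
```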
5 Visual Representation of Screens
ScreenSpec can be authored using a dedicated tool. This is a simple editor that detects each change and automatically regenerates graphics files (PNG) that can be attached to requirements documents. The generator uses simple rules to transform ScreenSpec into a visual representation:

1. For each component:
– EDIT_BOX, COMBO_BOX, LIST_BOX, CUSTOM - a label (equal to the component ID) is displayed on the left side of the control; the control's value is taken from the defined static value, or it is left empty. A CUSTOM component is displayed as an EDIT_BOX.
– BUTTON, LINK - displays a control with a caption equal to the defined static value or the component ID.
– STATIC_TEXT, DYNAMIC_TEXT - displays a piece of text equal to the static value or the component ID.
– RADIO_BUTTON, CHECK_BOX - displays a control followed by a label (the label's value equals the static value or the component ID).
– IMAGE - displays a label on the left (equal to the component ID) and an empty image frame on the right.

2. For each group:
– SIMPLE - a header and a frame are created; all child components are placed inside this frame.
– LIST - a header and a frame are created. Three rows are displayed in the frame (this visualises that a list can have more elements): two rows containing the child components, and a third one containing "...".
– TABLE - similar to a LIST, but a new table column is created for each child component. Its label is displayed in the table header rather than on the left (near its control).
– TREE - similar to a LIST, but a nested, smaller list is displayed for each row.

The following example (Figure 4) shows a visual representation of a simple screen specified in ScreenSpec.

Fig. 4. An example of visual representation for a LIST group component
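The transformation rules listed above can be sketched as a small dispatch on the component or group type. This is only an illustration under stated assumptions: it renders plain text instead of PNG graphics, covers only the SIMPLE and LIST groups, and the function names and the tuple-based representation of components are hypothetical, not the format used by the actual ScreenSpec editor.

```python
from typing import List, Optional, Tuple

# Hypothetical in-memory form of a component: (type, component ID, static value).
Component = Tuple[str, str, Optional[str]]

LABELLED_INPUTS = {"EDIT_BOX", "COMBO_BOX", "LIST_BOX", "CUSTOM"}


def render_component(kind: str, comp_id: str, static: Optional[str] = None) -> str:
    """Render one component as a line of text, following rule 1."""
    text = static if static is not None else comp_id
    if kind in LABELLED_INPUTS:
        # Label on the left, value taken from the static value or left empty.
        return f"{comp_id}: [{static or ''}]"
    if kind in {"BUTTON", "LINK"}:
        return f"<{text}>"
    if kind in {"STATIC_TEXT", "DYNAMIC_TEXT"}:
        return text
    if kind in {"RADIO_BUTTON", "CHECK_BOX"}:
        return f"( ) {text}"
    if kind == "IMAGE":
        return f"{comp_id}: [image]"
    raise ValueError(f"unknown component type: {kind}")


def render_group(kind: str, group_id: str, children: List[Component]) -> List[str]:
    """Render a SIMPLE or LIST group as a header followed by indented rows."""
    rows = [render_component(*child) for child in children]
    if kind == "SIMPLE":
        body = rows
    elif kind == "LIST":
        # Two rows with the child components and a third row containing "...".
        line = " | ".join(rows)
        body = [line, line, "..."]
    else:
        raise ValueError(f"group type not covered by this sketch: {kind}")
    return [f"== {group_id} =="] + ["  " + row for row in body]


for line in render_group("LIST", "Items", [("STATIC_TEXT", "Name", None),
                                           ("BUTTON", "remove", "Remove")]):
    print(line)
```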
6 ScreenSpec Meets Use Cases
Visual screens generated from ScreenSpec can be directly inserted into a requirements specification, in the adornments section of particular use cases. Having up-to-date graphics files makes it very easy to update the specification, because many modern text editors (e.g. Microsoft Word, OpenOffice) can link to external files and refresh them each time the document is opened.

6.1 Mockup
A mockup is an interesting artefact created by connecting screens to particular steps of use cases. It is rendered as a simple web application that displays use cases and screens at the same time. The use case (displayed on the left side) shows the interaction between an actor and the system (see Figure 5). After a particular step is selected, the corresponding screen is displayed (on the right side). This artefact seems to be useful in practice; initial feedback from commercial projects using mockups is very positive.
[Figure 5 content: the labels "Actors", "Business Processes", "Use Cases" and "Business Objects"; the use case text reproduced below; and the screen elements "Username", "Password", "Login" and "Cancel".]

UC5: Login to the system
Main Scenario:
1. Customer chooses login action.
2. Customer fills in login form.
3. System checks if data is correct and authorizes Customer.
Extensions:
3.A. Customer cannot be authorized.
3.A.1. System prints warning info and asks Customer to repeat authorization process.
2.A. There is no account for Customer.
2.A.1. Customer would like to add a new account to the

Fig. 5. A screenshot of Mockup showing a use case and the corresponding screen at the same time
It is difficult to connect screens to use case steps in a generic text editor, so a dedicated tool called UC Workbench [4] was developed at Poznań University of Technology.
7 Conclusions
User interface designs are often attached to use cases as adornments, because they help IT laymen to understand the requirements. However, it is not clear how to specify UI details. This paper proposes a language called ScreenSpec that can be used for this purpose. ScreenSpec is a formalism that was thoroughly validated: it was used to describe the UI in eight real software projects. ScreenSpec allows analysts to work incrementally on screen designs, starting with the general structure of information on a particular screen and then adding more details about widgets. It is efficient: specifying a screen takes about 2 minutes on average. ScreenSpec is also easy to learn: a person who has never seen ScreenSpec needs about an hour to become proficient in using it.

Although it is interesting to use ScreenSpec at the requirements elicitation stage, it could be even more interesting to use it at later stages. One can think about generating skeleton user interface code (in XUL, SWT, Swing or other technologies) that could be refined during implementation. Appropriate research will be conducted as future work.
Acknowledgements. The authors would like to thank the companies that cooperate with Poznań University of Technology: Polsoft and Komputronik. They found the time and courage to try our ideas in practice and provided us with substantial feedback. This research has been financially supported by the Polish Ministry of Science and Higher Education under grant N516 001 31/0269.
References

1. A web page containing all materials for a ScreenSpec evaluation case study, http://www.cs.put.poznan.pl/lolek/homepage/ScreenSpec.html
2. Home page for Mozilla XUL, http://www.mozilla.org/projects/xul/
3. The Web Modeling Language Home Page, http://www.webml.org/
4. UC Workbench project homepage, http://ucworkbench.org
5. UWE - UML-based Web Engineering Home Page, http://www.pst.informatik.uni-muenchen.de/projekte/uwe/index.html
6. Adolph, S., Bramble, P., Cockburn, A., Pols, A.: Patterns for Effective Use Cases. Addison-Wesley, Reading (2002)
7. Cockburn, A.: Writing Effective Use Cases. Addison-Wesley, Reading (2001)
8. Constantine, L.L., Lockwood, L.A.D.: Software for Use: A Practical Guide to the Models and Methods of Usage-Centered Design. ACM Press/Addison-Wesley Publishing Co., New York, NY, USA (1999)
9. Jacobson, I.: Object-Oriented Software Engineering: A Use Case Driven Approach. Addison Wesley Longman Publishing Co., Inc., Redwood City (2004)
10. Landay, J.A., Myers, B.A.: Sketching storyboards to illustrate interface behaviors. In: CHI 1996: Conference Companion on Human Factors in Computing Systems, pp. 193–194. ACM Press, New York (1996)
11. Leffingwell, D., Widrig, D.: Managing Software Requirements: A Use Case Approach, 2nd edn. Addison-Wesley Professional, Reading (2003)
12. Neill, C.J., Laplante, P.A.: Requirements Engineering: The State of the Practice. IEEE Software 20(6), 40–45 (2003)
13. Olek, Ł., Nawrocki, J., Michalik, B., Ochodek, M.: Quick prototyping of web applications. In: Madeyski, L., Ochodek, M., Weiss, D., Zendulka, J. (eds.) Software Engineering in Progress, pp. 124–137. NAKOM (2007)
14. Pressman, R.: Software Engineering - A Practitioner's Approach. McGraw-Hill, New York (2001)
15. Rudd, J., Stern, K., Isensee, S.: Low vs. high-fidelity prototyping debate. Interactions 3(1), 76–85 (1996)
16. Schneider, G., Winters, J.P.: Applying Use Cases: A Practical Guide. Addison-Wesley, Reading (1998)
17. Snyder, C.: Paper Prototyping: The Fast and Easy Way to Define and Refine User Interfaces. Morgan Kaufmann Publishers, San Francisco (2003)
18. Sommerville, I., Sawyer, P.: Requirements Engineering. A Good Practice Guide. Wiley and Sons, Chichester (1997)
19. Walker, M., Takayama, L., Landay, J.A.: High-Fidelity or Low-Fidelity, Paper or Computer? Choosing Attributes When Testing Web Applications. In: Proceedings of the Human Factors and Ergonomics Society 46th Annual Meeting, pp. 661–665 (2002)
Mining Design Patterns from Existing Projects Using Static and Run-Time Analysis

Michal Dobiš and Ľubomír Majtás

Faculty of Informatics and Information Technologies, Slovak University of Technology, Bratislava, Slovakia
Abstract. Software design patterns are documented best-practice solutions that can be applied to recurring problems. Information about the patterns used and their placement in a system can be crucial when trying to add a new feature without degrading the system's internal quality. In this paper a new method for recognizing given patterns in object-oriented software is presented. It is based on static and semidynamic analysis of intermediate code, which is refined by run-time analysis. It uses its own XML-based language for pattern description and a graph-theory-based approach for the final search. The proof of concept is provided by a tool that searches for patterns in the .Net framework intermediate language and presents the results using common UML-like diagrams, text and tree views.

Keywords: Design patterns, reverse engineering.
1 Introduction
It is not easy to develop a good design of a complex system. Experienced designers of object-oriented software often rely on design patterns, which are known as good solutions to recurring problems. A design pattern can affect the scalability of a system and defines how the architecture can be modified without decreasing its quality. Patterns are based on the experience of many software developers and are abstractions of effective, reliable and robust solutions to recurring problems. A pattern describes a problem and its context, and supplies some alternative solutions, each of them consisting of collaborating objects, their methods and services.

Introducing patterns to software design provides one more benefit. While a class in the object-oriented paradigm makes the code easier to understand by encapsulating functionality and data, an instance of a design pattern identifies the basic idea behind the relations between classes. It is an abstract term that can help to understand the design decisions made by the original developer. But how can instances of design patterns be found in an implemented system when they represent an idea rather than just an implementation template?

Z. Huzar et al. (Eds.): CEE-SET 2008, LNCS 4980, pp. 62–75, 2011. © IFIP International Federation for Information Processing 2011

A large system usually has a long lifetime, and the original architect of a particular component can leave the team before the system life cycle ends. Modification of the internal structure of a software component is a common need during maintenance or long-term development. Therefore precise documentation of design decisions, describing the design patterns used, becomes necessary. Unfortunately, such documentation is mostly obsolete or entirely absent.

There can be plenty of design pattern instances in a large system, and each pattern can have many implementation variants. Moreover, a code fragment can play a role in more than one pattern. A human being can hardly be expected to find their way around a large mesh of connected classes forming a system. Thus, automatic analysis and search are needed.

We focused our effort on the analysis of intermediate code produced by compilers of modern object-oriented programming languages like C#. This code is also directly executable, so its run-time behavior can be inspected as well. With the proposed method and the implemented tool, valid pattern instances can be found more precisely, so developers can become familiar with the analyzed system more easily and have all the information necessary to make correct design decisions while modifying it. Besides that, the library of patterns is kept open and new ones can be added using a special editor or by directly editing XML files.
2 Related Work
There are some tools that support reverse engineering with design pattern support, but this feature is far from perfect and more research is still needed. There are three main questions to answer when developing a solid design pattern mining tool.

The first question that arises is "What to find?" It represents the challenge of defining a formal language for design pattern description that would allow dynamic editing of the catalogue. For humans, the most suitable is natural language, as used in works like the GoF catalogue [7]. It is good for describing a pattern's variability, but it is insufficient for computer processing. The UML-based "Design Pattern Modeling Language" [9] is a visual language consisting of 3 models (specification, instantiation and class diagram) that uses the extension capability of UML, but its authors propose only the forward engineering direction of model transformations. The power of mathematical logic is introduced in the "LanguagE for Patterns' Uniform Specification" (LePUS) [5]. It is, however, difficult to capture the leitmotif of a pattern using LePUS - even the GoF patterns are not fully written in it. A progressive way of pattern description applicable in the context of pattern mining utilizes XML: the "Design Pattern Markup Language" [2] defines basic requirements on each class, operation and attribute, so that a group of them can be considered a pattern instance.

The answer to the question "Where to search?" starts with the creation of a system model. There are three basic ways to obtain it automatically. First, static analysis of the source code comes to mind. It is straightforward, but the drawback is that it would require the development of a new parser for each language the analyzed system could use. Thus, a lower-level language would be useful, that is, an intermediate language like Java byte-code (JBC) or Common Intermediate Language (CIL).
All the languages of the .Net family (C#, Visual Basic, managed C++, J#, DotLisp, etc.) are translated to CIL when compiled. There also exists a transformation between JBC and CIL using IKVM.NET [6]. Some behavioral analysis can be done directly from the statically written code. Good examples are tools like Pinot (Pattern INference and recOvery Tool) [11] or Hedgehog [3], which identify blocks of code and the data flow between them to gain a simple image of the idea behind the code and its dynamic nature. A more precise perspective of a system can be gained by dynamic analysis, realizable through a profiling API (e.g. CLR Profiler uses it). Running the system and monitoring its behavior might not be the most reliable method, but it is the only way to gain some very useful information about it (e.g. Wendehals uses this, but his tool is not really automatic [14]).

The last of the three questions is "How to search?" There have been some works utilizing graph theory to search for patterns in a graph model of the analyzed system, mostly taking only static information into account [2]. Similarly, a distance between the graphs of an analyzed system and a pattern (a measure of the similarity of the graphs) was introduced in the literature [13]. A quick method of pattern recognition, useful mostly to speed up the search process in combination with another technique, comes from the use of object-oriented software metrics known from algorithms that look for "bad smells" in the code [1]. Finally, rules and logic programming are employed in the "System for Pattern Query and Recognition" [12], in cooperation with the already mentioned language LePUS with all its benefits and drawbacks.
3 Our Approach
The main reason for design pattern mining is to make the system understandable and easier to modify. The problem of many mining approaches presented before is their relatively high false acceptance ratio; they often identify elements that merely look like a pattern instance but are not one. Therefore we decided to focus on the precision of the search method, which tries to identify only correct pattern instances that meet all the constraints one could define on a pattern. The collaborating classes, methods and fields are marked as an instance of a particular design pattern only when they really fit the design pattern definition, so that the software engineer can benefit from this information.

The first information that a software developer learns about a design pattern is mostly its structure. This static information is very important, but it is useless without knowledge of the idea the pattern represents. What we want from our products - systems and programs - is behavior rather than structure, and behavior is what specifies a pattern most accurately. To provide reliable results we decided to use three different types of analysis that need to agree on whether the examined elements comply with the pattern constraints or not: structural, semidynamic and dynamic (run-time) analysis.
3.1 Structural Analysis
The first and the easiest phase of the precise system analysis is the recognition of all types, their operations and data (attributes, local variables, method parameters). The search algorithm is based on graph theory, so we form three types of vertices from these elements. The following information is recognized for each of them:

– Unique identifier of the element.
– Recognition whether it is static.
– Identification of the parent - the element inside which the current one is defined.

Type

The Type vertex represents a class, abstract class, interface, delegate, enumeration or generic parameter. It contains the identification of the element type (Boolean values indicating whether it is a class, interface, etc.). A special type of edge identifies the types the element derives from - this list contains the base type and all the interfaces the element implements. Finally, the child elements (inner classes, operations, attributes, etc.) are recognized, forming new vertices connected to the current one with a "parent-identification" edge.

Operation

The business logic of applications is defined inside methods and constructors. For these types of elements we recognize the following information:

– Identification whether the operation is virtual. If so, the overridden virtual or abstract element is searched for.
– Recognition whether it is abstract.
– Distinction between constructors and common methods.
– Returned type.
– Parameters and local variables of the operation.

Data

Attributes, parameters and local variables are represented as vertices of type data, with the original scope stored as a vertex attribute. Each data field has its type (identified by an edge connected to the vertex standing for the type). The type of a collection is not important when looking for design pattern instances. The important information is the type of the elements in the collection, combined with the additional information that the reference (the data field) does not hold a single instance but stands for a group of elements of the recognized type.
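The vertex kinds described above can be pictured with a small in-memory model. The following is a minimal sketch, not the authors' implementation: the class names, the attribute keys (`is_constructor`, `visibility`, `returns`, `type`) and the example element identifiers are all assumptions made for illustration. The check at the end encodes purely structural conditions for a Singleton candidate: no public constructor, at least one private constructor, a static attribute of the class's own type, and a static method returning that type.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Vertex:
    id: str                       # unique identifier of the element
    kind: str                     # "type", "operation" or "data"
    is_static: bool = False
    parent: Optional[str] = None  # id of the element this one is defined in
    attrs: Dict[str, object] = field(default_factory=dict)


@dataclass
class SystemGraph:
    vertices: Dict[str, Vertex] = field(default_factory=dict)
    edges: List[Tuple[str, str, str]] = field(default_factory=list)

    def add(self, vertex: Vertex) -> Vertex:
        """Store the vertex and, if it has a parent, a 'parent' edge to it."""
        self.vertices[vertex.id] = vertex
        if vertex.parent is not None:
            self.edges.append((vertex.id, "parent", vertex.parent))
        return vertex

    def members(self, parent_id: str, kind: str) -> List[Vertex]:
        """All vertices of the given kind defined directly inside parent_id."""
        return [v for v in self.vertices.values()
                if v.parent == parent_id and v.kind == kind]


def structurally_singleton(graph: SystemGraph, type_id: str) -> bool:
    """Structural conditions only; behavior is not examined here."""
    ops = graph.members(type_id, "operation")
    ctors = [o for o in ops if o.attrs.get("is_constructor")]
    methods = [o for o in ops if not o.attrs.get("is_constructor")]
    no_public_ctor = all(c.attrs.get("visibility") != "public" for c in ctors)
    private_ctor = any(c.attrs.get("visibility") == "private" for c in ctors)
    static_self_field = any(d.is_static and d.attrs.get("type") == type_id
                            for d in graph.members(type_id, "data"))
    static_getter = any(m.is_static and m.attrs.get("returns") == type_id
                        for m in methods)
    return no_public_ctor and private_ctor and static_self_field and static_getter


g = SystemGraph()
g.add(Vertex("Singleton", "type", attrs={"is_class": True}))
g.add(Vertex("Singleton.instance", "data", is_static=True, parent="Singleton",
             attrs={"type": "Singleton", "scope": "attribute"}))
g.add(Vertex("Singleton.ctor", "operation", parent="Singleton",
             attrs={"is_constructor": True, "visibility": "private"}))
g.add(Vertex("Singleton.GetInstance", "operation", is_static=True,
             parent="Singleton", attrs={"returns": "Singleton"}))
print(structurally_singleton(g, "Singleton"))
```

Such a check accepts any class with the right shape, which is why structure alone cannot confirm a pattern instance.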
Event

Some modern object-oriented languages define a language-level implementation of the Observer design pattern. In .Net, the definition of an event is similar to that of a common class property - it has its type (a delegate) and a name. The difference is that during compilation two new methods are defined on the class containing the event. These are used to attach and detach listeners (observers). From the design pattern mining point of view we can say that events are single-purpose: the definition of an event and the references to it exactly define a correctly implemented Observer pattern instance. Clearly, when looking for this kind of pattern implementation, we can simply mark each occurrence of the keyword. Thus, we store this information in the system model, but do not discuss it further in this paper.

3.2 Semidynamic Analysis
Application behavior is represented by methods, invocations between them and object creations. Some information can be gathered already from the static representation of an analyzed system - the most important is the call from one operation to another. The necessity of information about calls was introduced in many recent works [2] [3] [11] concerning design pattern mining. However, just a few of them inspected the whole body of operations in more detail. We find the placement of a method call (whether inside a condition, a loop or directly in the method body) as important as the basic recognition of the call. Many design patterns require that a particular call be invoked only when an if-clause evaluates to true (e.g. Singleton, Proxy) or that it be made inside a loop (e.g. Composite). A pattern definition should not, however, forbid such placement where it does not require it: a call may be enclosed in a condition for many other reasons that do not affect the design pattern instance. For some design patterns, the number of calls to a single operation placed inside the current operation is also important (e.g. Adapter).

During the semidynamic analysis we also analyze the class attributes. The access to a member variable from an operation brings some very useful information that enhances the image of the behavior we can gain from the statically written system. If a method is to play the role of the operation GetInstance in the Singleton pattern, structural information alone is insufficient: the method needs to reference the static field instance and modify it during lazy initialization. Fig. 1 presents example results of the static and semidynamic analysis.

3.3 Dynamic Analysis
The only time we can really observe the behavior of a system is at run time. When we execute the analyzed system, we can see what is really happening inside it. To do so, we decided to enrich the model (graph) with four simple kinds of information that are collected automatically.
Fig. 1. Graph of the system structure enhanced with semidynamic information
Count of Instances

Several design patterns state how many instances a particular class should have during one system execution. The simplest example is the Singleton pattern - obviously, there needs to be exactly one instance of the class. The usefulness of counting instances of concrete types is thus clear. We decided to increase the number also for class A when an instance of class B is created and B derives from A. The reason is that patterns usually do not forbid deriving from classes that play roles in a pattern instance. Thus, there might be a class that derives from Singleton, but the method GetInstance still has to return the same object - Singleton needs to have just one instance no matter how many subclasses of it are defined in the system.

Count of Operation Invocations

In the previous section we have already introduced the importance of knowing how many times an operation was invoked. The Prototype pattern illustrates the comparison of the number of invocations of the operation Clone with the number of instances of a particular type. More common, however, is the comparison with the number of invocations of calls made from a particular operation (useful when searching for e.g. Composite or Proxy). Thanks to this information it is easy to verify whether placing a call inside a loop really increases, or placing it inside a condition really decreases, the number of call executions. Using this information we can also check whether e.g. a Composite really does its job and groups several components to form a tree structure.
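The two counters described so far can be sketched in a few lines. This is an analogy in Python under stated assumptions, not the mechanism used by the tool, which injects logging through the .Net profiling API: here the instrumentation is written by hand, and the names `record_instance`, `counted`, `Shape` and `Circle` are hypothetical. Note how creating an instance of a subclass also increments the counter of every base class, as the text above requires.

```python
import functools
from collections import Counter

instance_counts = Counter()  # class name -> instances created (incl. subclasses)
invoke_counts = Counter()    # qualified operation name -> number of invocations


def record_instance(obj) -> None:
    """Count the new object for its own class and for each base class."""
    for cls in type(obj).__mro__:
        if cls is not object:
            instance_counts[cls.__name__] += 1


def counted(func):
    """Decorator that counts how many times an operation is invoked."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        invoke_counts[func.__qualname__] += 1
        return func(*args, **kwargs)
    return wrapper


class Shape:
    def __init__(self):
        record_instance(self)

    @counted
    def clone(self):
        # Prototype-style copy: creates a new object of the same class.
        return type(self)()


class Circle(Shape):
    pass


original = Circle()
copy = original.clone()

print(dict(instance_counts))
print(dict(invoke_counts))
```

Comparing `invoke_counts["Shape.clone"]` with `instance_counts["Circle"]` is the kind of Clone-versus-instances comparison mentioned for Prototype.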
Less common is the comparison of the invocation counts of two separate operations. The reason is that we mostly do not know whether the target operation really gets executed - there might be late binding, and the pattern definition would then be useless. We could increase the invocation count not only for the operation really called, but also for the overridden one. This might seem similar to the situation with classes (see Count of Instances above), but there are some differences. First of all, it is sometimes useful to distinguish exactly which operation was called, and modifying the count of the overridden method would make this a little harder. The most important difference, however, is that we have another useful piece of information when considering operation invocations - we can identify the source and the desired target by monitoring the invocations of calls from one operation to another.

Count of Call Invocations

A call from an operation is mostly polymorphic - its target is a virtual method that can, but (if it is not abstract) does not have to, be overridden in a subclass. As described in the previous section, this information can serve as verification of the real usage of a loop or a condition at run time. A good example is the Proxy pattern, where we can observe whether the condition really leads to fewer calls to the RealSubject and is not just a result of coding standards or an accident.

Count of Changes in Attributes

Some patterns require tracking value changes in data variables. It might be interesting to monitor changes in all of them, including local variables and method parameters, but we do not find this really useful and it would lead to performance problems due to excessive logging. What we find useful and sufficient is to count how many times the value of a class attribute changes.
The importance of tracking the changes can be seen in several design patterns - even for the very simple Singleton pattern we can propose the restriction that the value stored in the static field instance should change just once. More changes would cause the program to lose the reference to the only instance, and thus the pattern instance would behave incorrectly. Most importantly, however, this value helps to identify the patterns Strategy and State. They have been discussed many times in the literature [2]. When trying to recognize these patterns in a system automatically, most of the approaches presented earlier simply label the instance as "Strategy or State" - since they used only static analysis, they were unable to distinguish one from the other. The approaches that took up the challenge and proposed a way to distinguish them relied on additional restrictions of their own that do not always hold [3] [11]. Very common is the requirement that a concrete state needs to know about its successor (one concrete state has a reference to another concrete state) or that the context simply creates the concrete states. This is usually true and is a very good assumption. We have identified a different assumption that allows us to recognize these patterns: states should change, whereas strategies should not. This means that we need to know whether the attribute in the Context class (its reference to the abstract state or strategy) changes or not.

The Strategy pattern describes an algorithm that a context can be configured with. A strategy encapsulates behavior used by a system, a part of it or a single class, and it should not change during the lifetime of the client (Context). Thus, the restriction we placed is that there cannot exist any class containing a field of type Strategy (a reference to the abstract strategy) that changes too many times - the highest possible count of changes in the field referencing the Strategy from the Context is the count of instances of the Context class. Having the Strategy clear, we know how to find a State: when all the other requirements (structural, semidynamic and other dynamic restrictions) are met, there needs to be a Context class that has an attribute of type State whose value was changed more times than the Context class was instantiated.

3.4 Design Pattern Model and the Search
We have introduced several constraints that can be used for exact recognition of design pattern instances. Because of these new constraints, we had to develop a new pattern description that includes improved structural and behavioral information. As mentioned earlier, the analyzed system is represented as a graph with four kinds of vertices: type, operation, data and call. Each type of vertex has its own attributes (as described in Sections 3.1, 3.2 and 3.3) and can be connected with other vertices using typed edges. A pattern is a rule that describes constraints on a subgraph of this graph - its vertices, including the values of their attributes, their types and edges.

Patterns usually define roles that might (or should) be played by more than one system element - e.g. they define a concrete strategy, but expect that there will be several of them in a single instance of the pattern. This means that the multiplicity of items playing the role of concrete strategy should be at least 2. Multiplicity, defined by a minimal and a maximal value for each pattern element, is crucial for all patterns: clearly, there should not be a public constructor available in the class Singleton, thus its multiplicity needs to be exactly 0. There needs to be at least one private constructor, so the multiplicity of this element has a minimum of 1. Multiplicity is even more interesting when we look for Abstract Factory instances - if we take the definition of the pattern very precisely, we can propose a constraint requiring the count of abstract products to be the same as the count of factory methods that create objects of different types. The minimal and maximal values are also used for all the information gathered by the dynamic analysis, and each can be set either as a constant number or as a reference to another numeric value evaluated on the pattern instance.
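The idea of bounds that are either constants or references to other measured values can be sketched as follows. This is an illustrative model under stated assumptions, not the DPMLd format used by the tool: the `Constraint` class, the `satisfied` function and the names of the measured values are hypothetical. The Singleton constraints use constant bounds, while the Strategy constraint bounds the number of changes of the context's field by the number of Context instances, as described in Section 3.3.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Union

# A bound is a constant, the name of another measured value, or None (unbounded).
Bound = Union[int, str, None]


@dataclass
class Constraint:
    role: str
    minimum: Bound = 0
    maximum: Bound = None


def resolve(bound: Bound, measured: Dict[str, int]) -> Optional[int]:
    """Turn a bound into a number, looking up references in `measured`."""
    if isinstance(bound, str):
        return measured[bound]
    return bound


def satisfied(constraint: Constraint, measured: Dict[str, int]) -> bool:
    value = measured[constraint.role]
    lo = resolve(constraint.minimum, measured)
    hi = resolve(constraint.maximum, measured)
    return (lo is None or value >= lo) and (hi is None or value <= hi)


SINGLETON = [
    Constraint("public_constructors", 0, 0),      # exactly 0
    Constraint("private_constructors", 1, None),  # at least 1
]

# Strategy: the field may change at most once per Context instance.
STRATEGY_FIELD = Constraint("context_field_changes", 0, "context_instances")


def state_or_strategy(measured: Dict[str, int]) -> str:
    """Assumes all other structural and behavioral requirements already hold."""
    return "Strategy" if satisfied(STRATEGY_FIELD, measured) else "State"


print(all(satisfied(c, {"public_constructors": 0, "private_constructors": 1})
          for c in SINGLETON))
print(state_or_strategy({"context_field_changes": 3, "context_instances": 3}))
print(state_or_strategy({"context_field_changes": 7, "context_instances": 3}))
```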
4 The Tool
To prove our concept we have developed a complex reverse engineering tool that is able to analyze any system built on the .Net platform and search for instances of design patterns defined in an editable catalogue (see Fig. 2 for a screenshot of the tool).

Fig. 2. The results dialog window of the tool

The analysis process starts with the selection of assemblies (DLL and EXE files) and the specification of the namespaces that should be included. The content of the assemblies is written in Common Intermediate Language, which is produced by the compilers of all languages of the .Net family. CIL can be read in various ways. The simplest one is reflection, which is commonly available in the .Net environment and is sufficient for the needs of structural analysis. The content of the operations, which is investigated by the rest of the analysis process, is available as plain CIL. There we find all the references to attributes (data fields) and method calls, and recognize whether they are positioned inside a condition, inside a loop or directly in the method body. Finally, the dynamic analysis is done using the .Net profiling API, which allows us to inject logging instructions into every operation of the analyzed system. This is done before the Just-In-Time compilation (from CIL to machine code) of each method starts.

Before we could search for something, we needed to define it. Therefore we have created an editable catalogue of design patterns stored in an XML-based language that we derived from the original Design Pattern Markup Language (DPML) [2], enhancing it with the features we added (we call it DPMLd). The search for design pattern instances itself is based on graph theory and utilizes GrGen.Net [8]. GrGen.Net allows a developer to define a custom graph model that contains typed vertices, each type of vertex having its own attributes and being connected with other vertices using typed edges. A graph-rewrite rule can then be created upon the defined graph. The rule consists of a conditional part and a changing part and can be used to specify conditions on the subgraphs we want to search for. The tool we created transforms the graph of the analyzed system into the GrGen.Net form and produces the graph-rewrite rules from the specification of design patterns.
Finally, it collects the found subgraphs - basic fragments of candidate design pattern instances (e.g. a group of vertices that represent an instance of State with only one concrete state). These fragments are grouped together if they have common "group-defining" elements - for example, the abstract state class or the template method. Which elements are "group-defining" is defined by the pattern, and it can be any type, operation or data. The last step of the search is performed once the subgraphs have been grouped into candidates. It consists of multiplicity checks verifying the correct count of elements in the pattern instance.
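The grouping of fragments and the final multiplicity check can be sketched as below. This is a simplified illustration, not the GrGen.Net-based implementation: each fragment is assumed to be a plain mapping from role name to matched element, and the function names, role names and class names in the example are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple

Fragment = Dict[str, str]                 # role name -> matched element
Candidate = Dict[str, Set[str]]           # role name -> all elements in the role
Bounds = Dict[str, Tuple[int, Optional[int]]]


def group_fragments(fragments: List[Fragment],
                    group_roles: List[str]) -> Dict[tuple, Candidate]:
    """Merge fragments that share the same group-defining elements."""
    groups: Dict[tuple, Candidate] = defaultdict(lambda: defaultdict(set))
    for fragment in fragments:
        key = tuple(fragment[role] for role in group_roles)
        for role, element in fragment.items():
            groups[key][role].add(element)
    return groups


def check_multiplicity(candidate: Candidate, bounds: Bounds) -> bool:
    """Verify that each constrained role has an allowed number of elements."""
    for role, (lo, hi) in bounds.items():
        count = len(candidate.get(role, ()))
        if count < lo or (hi is not None and count > hi):
            return False
    return True


# Three matched fragments, each with exactly one concrete state.
fragments = [
    {"context": "Order", "abstract_state": "OrderState", "concrete_state": "NewOrder"},
    {"context": "Order", "abstract_state": "OrderState", "concrete_state": "PaidOrder"},
    {"context": "Door", "abstract_state": "DoorState", "concrete_state": "Open"},
]

candidates = group_fragments(fragments, ["abstract_state"])
bounds = {"concrete_state": (2, None)}   # at least two concrete states
accepted = [key for key, candidate in candidates.items()
            if check_multiplicity(candidate, bounds)]
print(accepted)
```

In this example only the candidate grouped around `OrderState` passes, because the one around `DoorState` has a single concrete state.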
5 Results
Fig. 3 illustrates the benefits of the individual steps of the analysis we proposed in this article. It contains incorrect and correct implementations of the simplest of the GoF design patterns, the Singleton.
Fig. 3. Experiments with singleton pattern
Structural analysis is the first thing that comes to mind when we talk about mining design patterns from existing projects. It checks many conditions. For the Singleton, it checks that the class has no public constructor and at least one private one, and that it contains a static attribute of its own type and a static method returning that type. However, all these conditions are already met by the class NearlySingleton1 (part A in the figure). Moreover, its method GetInstance references the instance data field and calls the private constructor. The semidynamic analysis checks all this and adds another requirement: the constructor call has to be placed inside a condition (if-clause). This might seem enough, but as we can see, NearlySingleton2 (part B in the figure) also satisfies these constraints and still is not a correct instance of the Singleton pattern. Finally, the dynamic analysis adds a check that ensures correctness: it checks the count of instances of the class (more precisely, it also checks the count of value changes in the instance field and the count of method calls) and verifies that Singleton1 and Singleton2 are valid Singleton pattern instances in the system. Please note that, as the code is written, the call to the private constructor in Singleton2 is not placed in a condition. However, there is a hidden (suppressed) else block, because the previous if-block ends with a return statement (the tool recognizes this correctly). We have filled the catalogue of the implemented tool with all GoF design patterns, with one group of implementation variants for each pattern. This means that all common implementation variants are found, but some correct pattern instances that rely on transitive relations might stay undiscovered. This can be resolved by adjusting the level of accuracy of the pattern definition or by adding definitions of additional variants (all using the GUI of the tool; automatic transitive relations are to be added in future work). The Observer and Iterator patterns are also included. We, however, decided not to search for their language-level implementations, since this would be just a keyword-based search for event or foreach.
We defined these two as they are described in the GoF catalogue and search only in the client code. As expected, none of the projects we tested was "reinventing the wheel" (they used the language-level features instead of coding the patterns on their own). Thus, we do not show these results in the tables below. We have checked our tool on the code of the tool itself (PatternFinder). We chose our own tool for examination because it is a nontrivial piece of code and we know about all pattern instances occurring in it. The tool was able to find all the pattern instances correctly (except Mediator, which it did not find because its colleagues are system types that were skipped during the analysis process). It was also able to find some pattern instances we had included unintentionally. The other tests were made on both free (mostly open source) and commercial projects on the .Net platform (we do not need the sources, just the DLL and EXE files). Table 1 presents the results of the examination executed on the different projects. Please note that we looked for just a single variant of each design pattern (or a small group of variants covered by a single pattern description in the DPMLd pattern language we proposed). The variants can be modified, or new ones can be added to the tool, using the editor it contains or by writing them as XML. The table columns marked with S are based on static (structural and semidynamic) analysis and the D columns contain results that came from complete
Table 1. Experiments on existing projects

Pattern           PatternFinder               Paint.Net                   SharpDevelop                NUnit
                  S             D             S             D             S             D             S             D
Abstract Factory  0/0/0         0/0/0         14/1/0        7/1/0         6/2/0         2/1/0         14/1/0        4/1/0
Adapter           389/25/7      152/16/7      1585/73/24    451/30/5      13568/285/71  1249/100/39   247/28/11     30/12/7
Bridge            0/0/0         0/0/0         188/2/2       0/0/0         3040/7/0      193/3/0       96/1/1        27/1/1
Builder           9/1/1         6/1/1         20/1/1        12/1/1        624/14/14     66/3/3        27/1/1        0/0/0
Chain of Respon.  0/0/0         0/0/0         0/0/0         0/0/0         3/1/1         0/0/0         0/0/0         0/0/0
Command           54/1/0        20/1/0        43/3/0        4/1/0         7054/16/3     2565/5/1      398/4/1       147/4/1
Composite         96/1/1        48/1/1        0/0/0         0/0/0         0/0/0         0/0/0         0/0/0         0/0/0
Decorator         0/0/0         0/0/0         0/0/0         0/0/0         0/0/0         0/0/0         0/0/0         0/0/0
Facade            1/1/0         0/0/0         16/5/0        0/0/0         3/1/0         0/0/0         0/0/0         0/0/0
Factory method    9/1/1         4/1/1         25/1/1        1/1/1         24/2/2        0/0/0         0/0/0         0/0/0
Flyweight         0/0/0         0/0/0         0/0/0         0/0/0         0/0/0         0/0/0         0/0/0         0/0/0
Interpret         0/0/0         0/0/0         12672/2/2     0/0/0         0/0/0         0/0/0         0/0/0         0/0/0
Mediator          23/1/1        14/1/1        0/0/0         0/0/0         0/0/0         0/0/0         0/0/0         0/0/0
Memento           0/0/0         0/0/0         0/0/0         0/0/0         0/0/0         0/0/0         0/0/0         0/0/0
Prototype         0/0/0         0/0/0         8/2/2         2/1/1         0/0/0         0/0/0         0/0/0         0/0/0
Proxy             17/3/3        17/3/3        33/3/3        0/0/0         15/4/4        1/1/1         1/1/1         0/0/0
Singleton         6/2/2         6/2/2         0/0/0         0/0/0         7/2/2         7/2/2         0/0/0         0/0/0
State             42/1/0        0/0/0         325/18/12     110/2/1       1528/43/25    428/23/14     771/13/6      108/5/4
Strategy          394/11/11     237/9/9       591/43/23     171/16/11     7924/127/47   1476/75/26    161/18/11     212/18/10
Template Method   14/2/1        14/2/1        87/14/2       41/5/1        17/5/0        0/0/0         7/3/0         4/1/0
Visitor           82/1/1        51/1/1        0/0/0         0/0/0         0/0/0         0/0/0         0/0/0         0/0/0
static and dynamic analysis. Each table cell contains three numbers formatted as x / y / z, where:
x - count of all subgraphs (the smallest fragments that meet the basic requirements),
y - count of all candidates that arose from grouping the subgraphs,
z - count of the found pattern instances.
6 Conclusions and Future Work
In the previous sections we have introduced a method for searching for instances of design patterns in intermediate code, and a tool that serves as a proof of concept. We have focused on the precision of the search results; therefore we have proposed a method based on three different types of analysis. All of them need to agree before a pattern instance is marked, so together they provide results with a low false acceptance ratio. The implemented tool identifies pattern instances in the intermediate language of the .Net platform. Choosing the intermediate language as the input to the analysis removes the need for access to the source code. As a result we are able to check any software developed for the .Net platform, including COTS products provided by third parties. The search process is driven by a pattern specification written in an XML-based language. The tool provides a GUI editor for the language, which allows users to modify pattern descriptions or specify other patterns. By modifying a pattern description, a user can even configure the level of search precision. In the future, we would like to improve our specification language to allow searching for more variants of design patterns. An interesting approach in this area has been introduced by Nagl [10]. He uses feature modeling [4] for the specification of design pattern variability. Another improvement can be gained by allowing transitive relationships, which are sometimes part of the software design (for example indirect inheritance, multiple code delegation, etc.). Acknowledgement. This work was partially supported by the project of APVV (Slovak Research and Development Agency), No. APVV-0391-06 "Semantic Composition of Web and Grid Services", and the Scientific Grant Agency of Republic of Slovakia, grant No. VG1/3102/06.
References
1. Antoniol, G., Fiutem, R., Cristoforetti, L.: Using Metrics to Identify Design Patterns in Object-Oriented Software. In: Proceedings of the 5th International Symposium on Software Metrics, pp. 23–34. IEEE Computer Society, Washington DC (1998)
2. Balanyi, Z., Ferenc, R.: Mining Design Patterns from C++ Source Code. In: Proceedings of the International Conference on Software Maintenance, pp. 305–314. IEEE Computer Society, Washington DC (2003)
3. Blewitt, A., et al.: Automatic Verification of Design Patterns in Java. In: Proceedings of the 20th IEEE/ACM International Conference on Automated Software Engineering, ACM, New York (2005)
4. Czarnecki, K., Helsen, S., Eisenecker, U.: Staged Configuration Through Specialization and Multi-Level Configuration of Feature Models. Software Process Improvement and Practice, Special Issue on Software Variability: Process and Management 10(2), 143–169 (2005)
5. Eden, A.H.: Precise Specification of Design Patterns and Tool Support in Their Application. PhD thesis, University of Tel Aviv (1999)
6. Frijters, J.: IKVM.NET Home Page, http://www.ikvm.net/index.html (accessed May 5, 2007)
7. Gamma, E., Helm, R., Johnson, R., Vlissides, J.: Design Patterns, Elements of Reusable Object-Oriented Software. Addison-Wesley professional computing series (1995) ISBN 0-201-63361-2
8. Geiss, R., et al.: GrGen.NET, www.grgen.net (accessed November 4, 2007)
9. Mapelsden, D., Hosking, J., Grundy, J.: Design Pattern Modelling and Instantiation using DPML. In: Proceedings of the Fortieth International Conference on Tools Pacific: Objects for Internet, Mobile and Embedded Applications, pp. 3–11. CRPIT Press, Sydney (2002)
10. Nágl, M. (supervised by Filkorn, R.): Catalog of software knowledge with variability modeling, Master thesis, Slovak university of technology, Faculty of informatics and software engineering (2008)
11. Shi, N., Olsson, R.A.: Reverse Engineering of Design Patterns from Java Source Code. In: Proceedings of the 21st IEEE/ACM International Conference on Automated Software Engineering. IEEE Computer Society, Washington (2006)
12. Smith, McC, J., Stotts, D.: SPQR: Flexible Automated Design Pattern Extraction From Source Code. In: 18th IEEE International Conference on Automated Software Engineering, pp. 215–225. IEEE Computer Society, Washington DC (2003)
13. Tsantalis, N., et al.: Design Pattern Detection Using Similarity Scoring. IEEE Transactions on Software Engineering 32(11), 896–909 (2006)
14. Wendehals, L.: Improving design pattern instance recognition by dynamic analysis. In: Proceedings of the ICSE 2003 Workshop on Dynamic Analysis, pp. 29–32. IEEE Computer Society, Portland (2003)
Transformational Design of Business Processes for SOA
Andrzej Ratkowski and Andrzej Zalewski
Warsaw University of Technology, Department of Electronics and Information Technology
{[email protected], [email protected]}
Abstract. By describing business processes in BPEL (Business Process Execution Language) one can make them executable. A problem then arises: how to ensure that non-functional requirements, concerning e.g. the performance of these processes, are met. In this paper a transformational approach to the design of business processes is presented. To check the equivalence of business processes resulting from the transformations, a BPEL description is converted to Process Algebra (Lotos version) and model-checking techniques are applied. The paper also contains an example of applying the proposed approach in a real-life situation.
Keywords: SOA, BPEL, business process design.
1 Introduction
Z. Huzar et al. (Eds.): CEE-SET 2008, LNCS 4980, pp. 76–90, 2011. © IFIP International Federation for Information Processing 2011
The ability to define and execute business processes seems to be one of the most important advantages introduced by the research and commercial developments on Service-Oriented Architectures (SOA). The two worlds of business modeling and software systems development have never been closer to each other: it is now possible to express software requirements and business processes in terms of services. BPEL has become a standard for defining executable business processes. This in turn triggered extensive research on modeling and verification techniques suitable for BPEL-like notations. Recent research concentrates on converting BPEL processes to one of the formal models that can be analysed with model-checking techniques. A survey of such approaches can be found in [3]. It reveals that all the most important formal modeling techniques developed for concurrent systems are applicable here: Petri nets (basic model, high-level, coloured) (see e.g. [23], [21], [28]), Process Algebras (see e.g. [12], [11]), Lotos (see e.g. [9], [25]), Promela and LTL (see e.g. [13], [15]), Abstract State Machines (see e.g. [7], [26]) and Finite State Automata (see e.g. [11]). These conversions make deadlock and livelock detection, as well as reachability analysis, possible with automated model checkers. The approaches presented above, accompanied by appropriate verification techniques, can detect certain flaws in BPEL processes. However, they are not methods of business process design: they do not provide any guidance on how to
improve the quality attributes of designed systems, such as maintainability, performance and reusability. This is what our approach is aimed at. In this paper we advocate the idea of transformational design of BPEL business processes, in which the specified behavior is preserved while quality attributes are improved. Our approach has three basic roots:
1. Software refactoring – the approach introduced by Opdyke in [20] and further developed in [17], in which transformations of source code are defined so as to improve its quality attributes;
2. Business process design – in the realm of SOA, informal or semiformal methods dominate the research carried out so far, e.g. the Service Responsibility and Interaction Design Method (SRI-DM) [14];
3. Business process equivalence – several notions of equivalence between business processes have already been developed, based on Petri nets [16] and Process Algebras [24].
Transformations of business processes are at the core of our approach and represent a concept similar to the popular software refactorings. Within a formal Process Algebra model of business processes we have introduced our original notion of business process equivalence (explained and discussed in section Behavioral Equivalence), and it has been proved that the defined transformations create processes equivalent to the one being transformed. The transformed processes are compliant in terms of their behavior, while their quality attributes have changed. This provides a foundation for the transformational design method, in which a starting BPEL process is subjected to a series of transformations that yield a behaviorally compatible model with improved non-functional properties such as modifiability, maintainability, performance and reusability.
The rest of the paper is organized as follows: the section Transformational Approach describes the general concept of our transformational approach, the section Behavioral Equivalence drills down into the formal aspects of our method, and finally the section Process Design Example shows a practical example of the application of the proposed approach.
2 Transformational Approach
The process of transformational design of business processes is depicted in figure 1.
Fig. 1. Process transformation algorithm
1. The transformational design starts from establishing a reference process that represents only the functionality of the process: the order of the activities, the relations between them, the exchanged data and the external service invocations. In the following iterations the original process is gradually changed by refactoring transformations [22], [17] such as:
– Service split – dividing one complex service into two or more simpler ones,
– Service aggregation – the opposite of service split: composing two or more services into one larger service,
– Parallelization – making serial activities run in parallel,
– Asynchronization – replacing a synchronous communication protocol with an asynchronous one.
The above transformations are referred to as refactorings or transformations, and are only examples of possible refactorings.
2. A few independent refactorings or transformations executed on a given process create a few alternative processes, which should be equivalent to the original one, or at least the changes in behaviour should be known.
3. The behavioral equivalence verification step is aimed at checking behaviour preservation. In this step formal methods of Process Algebra (PA) [5] are used. The result of the verification is either the elimination of a non-equivalent alternative or the acceptance of the changes in behaviour that the transformations caused. How a transformation changes the behaviour is known exactly thanks to the PA formalism.
4. After eliminating some of the alternatives or accepting all of them, the alternatives are evaluated against non-functional properties such as:
– performance,
– safety,
– maintainability,
– availability,
– or any other important property.
The measure of each property is calculated e.g. with metrics [18], models [8] or simulation.
5. A single alternative is selected from the set of acceptable processes. The selection is based on the evaluation performed in the previous phase as well as on the predefined design preferences.
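One iteration of the steps listed above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' tooling: a process is modeled as a list of stages (sets of activities run in parallel), equivalence is reduced to respecting a set of assumed data dependencies, and the cost function, activity names and timings are hypothetical.

```python
def transformational_design(reference, transformations, is_equivalent, evaluate):
    """One iteration: transform, verify, evaluate, select (lower cost wins)."""
    alternatives = [transform(reference) for transform in transformations]
    accepted = [alt for alt in alternatives if is_equivalent(reference, alt)]
    return min([reference] + accepted, key=evaluate)


# Hypothetical activity durations and data dependencies (producer, consumer).
TIMES = {"shipping": 3, "invoicing": 5, "scheduling": 4}
DEPENDS = {("shipping", "invoicing"), ("shipping", "scheduling")}


def parallelize_tail(process):
    """Run the last two stages in parallel."""
    return [process[0], process[1] | process[2]]


def parallelize_all(process):
    """Run all three stages in parallel."""
    return [process[0] | process[1] | process[2]]


def respects_dependencies(_reference, process):
    """Toy equivalence check: every producer runs in an earlier stage."""
    stage_of = {act: i for i, stage in enumerate(process) for act in stage}
    return all(stage_of[a] < stage_of[b] for a, b in DEPENDS)


def response_time(process):
    """Stages run one after another; activities within a stage run in parallel."""
    return sum(max(TIMES[act] for act in stage) for stage in process)


reference = [{"shipping"}, {"invoicing"}, {"scheduling"}]
best = transformational_design(
    reference, [parallelize_tail, parallelize_all],
    respects_dependencies, response_time,
)
```

In this sketch the fully parallel variant is rejected by the equivalence check, and the partially parallel variant is selected because it has the lowest estimated cost.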
The above steps should lead from a process that is correct from the functional point of view to a process that has the desired non-functional quality attributes. All steps of the algorithm are guided by a human designer and supported by automatic tools that may:
– suggest possible transformations of the reference process,
– verify behavioral equivalence,
– compute quality metrics of the alternatives,
– point out quality attribute trade-off points.
3 Behavioral Equivalence
Behavioral equivalence verification is based on the transformation of BPEL processes to Process Algebra [5], a formal model particularly suitable for modeling concurrent, loosely coupled and asynchronously communicating systems such as BPEL business processes.
3.1 Process Algebra for Behavioral Equivalence
We use the LOTOS [2] implementation of PA as a model of the BPEL process. To achieve this we have devised a mapping from BPEL to PA terms. A number of BPEL to PA mappings have already been proposed, e.g. [10] or [4]; however, they do not meet the needs of transformational design, as they define a full semantic equivalence that preserves every detail of the internal structure of the process (i.e. the order of activities and their relations with other activities). This strictness narrows the possibility of changing the process structure or even makes it impossible. Therefore, we need a looser definition of equivalence to ensure that enough freedom is given to the transformation of business processes. A separate issue is that transformations should produce simple models with the smallest possible state space (the mappings referred to above produce rather complicated models). During the design procedure a few alternative process structures are considered and the equivalence of each of them has to be verified; this should not be too time-consuming if the method is to be of practical importance. The most important mappings of BPEL activities to PA formulas are presented in table 1. The mappings take into account neither data values nor the fulfillment of conditions. This is motivated by the simplicity of the model and its more efficient verification with a model checker. We have also added an artificial mapping for an activity that is not an explicit part of BPEL but is necessary for equivalence verification: the activity dependency mapping. Let us assume that there are two activities in a BPEL process that are not directly attached to each other (e.g. by <sequence> or <switch>) but are connected by a shared variable, as in the following example:
...
Table 1. Sample mappings of BPEL activities to PA formulas

BPEL: external service invocation
LOTOS Process Algebra:
    process invoke_invokeName [inputName,outputName] :=
        hide tau in ( inputName;tau;outputName;ended;exit )
    endproc

BPEL: receive message
LOTOS Process Algebra:
    process receive_receiveName [variableName] :=
        hide tau in ( variableName;tau;ended;exit )
    endproc

BPEL: assign variable value
LOTOS Process Algebra:
    process assign_assignName [fromVar, toVar] :=
        hide tau in ( fromVar;tau;toVar;ended;exit )
    endproc

BPEL: parallel execution
    < ... name="activityA"/> < ... name="activityB"/> [...]
LOTOS Process Algebra:
    process flow_flowName[dmmy] :=
        hide ended in ( activityA |[ended]| activityB ... )
    endproc

BPEL: sequential execution
    <sequence name="seqName"> < ... name="activityA"/> < ... name="activityB"/> [...]
LOTOS Process Algebra:
    process sequence_seqName[dmmy] :=
        activityA >> activityB >> ...
    endproc

BPEL: conditional execution
    <switch name="switchName"> < ... name="activityA"/> < ... name="activityB"/>
LOTOS Process Algebra:
    process switch_switchName[dmmy] :=
        hide ended in ( activityA [] activityB ... )
    endproc
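The templates of Table 1 can be instantiated mechanically. The sketch below is a hypothetical string generator written only to illustrate the shape of the mappings; it is not the XSLT-based translation used by the authors, the function names `basic` and `structured` are invented for this example, and the generated text is not checked against a LOTOS parser.

```python
# Composition operators taken from the flow, sequence and switch rows of Table 1.
OPERATORS = {"flow": "|[ended]|", "sequence": ">>", "switch": "[]"}


def basic(kind, name, gates):
    """LOTOS text for invoke, receive and assign, following the Table 1 templates."""
    events = list(gates[:1]) + ["tau"] + list(gates[1:]) + ["ended", "exit"]
    body = ";".join(events)
    return (f"process {kind}_{name} [{','.join(gates)}] := "
            f"hide tau in ( {body} ) endproc")


def structured(kind, name, children):
    """LOTOS text for flow, sequence and switch over already named activities."""
    joined = f" {OPERATORS[kind]} ".join(children)
    if kind == "sequence":
        return f"process {kind}_{name}[dmmy] := {joined} endproc"
    return f"process {kind}_{name}[dmmy] := hide ended in ( {joined} ) endproc"


receive_text = basic("receive", "ReceivePurchase", ["PurchaseOrder"])
assign_text = basic("assign", "assignOrder", ["PurchaseOrder", "ShippingRequest"])
sequence_text = structured("sequence", "seqName", ["activityA", "activityB"])
flow_text = structured("flow", "flowName", ["activityA", "activityB"])
```

Keeping the basic and structured activities in two small functions mirrors the two groups of rows in the table: the first group fixes the event order inside one activity, the second only chooses the composition operator.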
Then the activity dependency mapping will be:
    process act_dependency[dummy] :=
        receive_ReceivePurchase[PurchaseOrder]
        |[PurchaseOrder]|
        assign_assignOrder[PurchaseOrder, ShippingRequest]
    endproc
The activity dependency expresses an indirect dependency between two activities, in which one needs the output data of the other, regardless of the structural dependencies (sequence or parallel) between them in the process.
3.2 BPEL Behavioral Equivalence
The key structure of our definition of behavioral equivalence is the minimal dependency process (MDP). The MDP is a process that is as simple as possible but still gives the same response to a given stimulus as the original process. It is constructed as a set of activities executed in parallel that do not interact with each other. This is achieved by relaxing the unnecessary structural activities present in the original process. The current section supplies the theoretical basis for the construction of the MDP and the evidence that it can be used for the definition of process equivalence. There are a few approaches to determining behavioral equivalence (in other words, behaviour preservation) of refactored processes. In [20] the author proposes a definition in which two systems are equivalent when the response to each request is the same from both systems. According to [19], communication-oriented systems are equivalent if they send messages in the same order. In the case of transformational design we assume that every service fulfills the stateless assumption. This means that when a BPEL process invokes an external service, the response to a given request is always the same and does not depend on history. This assumption leads to the conclusion that the state of external services (and of the whole environment) is encapsulated inside the invoking service. To make this assumption usable and to show how it can be used, we need some basic PA theory.
$$B \xrightarrow{x} B' \tag{1}$$
The above formula means that process $B$ reaches state $B'$ after receiving event (message) $x$. PA semantics is defined using inference rules of the form:
$$\frac{\text{premises}}{\text{conclusions}}\;(\text{side condition}) \tag{2}$$
For example, parallel execution (without synchronization) $\|$ has two symmetric rules:
$$\frac{B_1 \xrightarrow{x} B_1'}{B_1 \| B_2 \xrightarrow{x} B_1' \| B_2} \quad\text{and}\quad \frac{B_2 \xrightarrow{x} B_2'}{B_1 \| B_2 \xrightarrow{x} B_1 \| B_2'} \tag{3}$$
and precedes (sequential composition) $\gg$ has two rules:
$$\frac{B_1 \xrightarrow{x} B_1'}{B_1 \gg B_2 \xrightarrow{x} B_1' \gg B_2} \quad\text{and}\quad \frac{B_1 \xrightarrow{\sigma} B_1'}{B_1 \gg B_2 \xrightarrow{i} B_2} \tag{4}$$
where $\sigma$ is successful termination and $i$ is an unobservable (hidden) event. If an external service $S$ is stateless, then:
$$\forall y \in Y:\; S \xrightarrow{y} S \tag{5}$$
where $Y$ is the set of all events. This means that no event generated externally by the subject service changes the state or the answer of the service. To analyse a BPEL process using PA terms, the BPEL process has to be translated into PA using the mapping mentioned above. The result of the translation is a set of PA processes that are sequentially ordered by BPEL steering instructions: sequences, flows, switches and so on. Additionally, part of the mapping is the activity dependency process. This artifact symbolizes a data dependency between activities (one needs the data generated by the other). Let us denote it with the dependency operator
$$A]x]B \tag{6}$$
which means that state $B$ can be started after $A$ has successfully terminated and event $x$ has been emitted (or received). Below we illustrate the foundations of the behavioral equivalence concept. Let us consider a process that has a set of operations connected by a dependency sequence:
$$(A]x]C]z]D) \tag{7}$$
$C$ waits for the result of $A$, and $D$ for the result of $C$. Apart from the above dependency, the process also has a structural sequence defined by the <sequence> instruction, A -> B -> C -> D, where B is an instruction that is not connected by any activity dependency. We can relax the structural sequence and consider the process as:
$$(A]x]C]z]D) \,\|\, B \tag{8}$$
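To make the relaxation in (8) concrete, the following sketch enumerates the execution orders allowed when the dependency chain A, C, D runs in parallel with the independent activity B. It is a simplified illustration only: events x and z, termination and the external service are ignored, and activities are treated as atomic steps.

```python
def interleavings(left, right):
    """Yield every merge of two sequences that keeps each sequence's own order."""
    if not left:
        yield tuple(right)
        return
    if not right:
        yield tuple(left)
        return
    for rest in interleavings(left[1:], right):
        yield (left[0],) + rest
    for rest in interleavings(left, right[1:]):
        yield (right[0],) + rest


chain = ("A", "C", "D")   # activities ordered by data dependencies
independent = ("B",)      # activity with no dependency on the chain
traces = set(interleavings(chain, independent))
structural_sequence = ("A", "B", "C", "D")
```

The original structural sequence is one of the four orders produced, which matches the intuition that the relaxed process admits the original behaviour together with the other orders that respect the data dependencies.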
Formula (8) means that we can treat $(A]x]C]z]D)$ and $B$ as two parallel, independent activities. Proof that (8) holds for stateless services:
1. If there is no external service, (8) is true by definition, because there is no interaction between $(A]x]C]z]D)$ and $B$.
2. If there is a stateless external service $S$, then:
$$\forall y:\; (A]x]C]z]D) \| S \xrightarrow{y} (A]x]C]z]D)' \| S \tag{9}$$
and
$$\forall y:\; S \| B \xrightarrow{y} S \| B' \tag{10}$$
which leads to:
$$(A]x]C]z]D) \xrightarrow{y} (A]x]C]z]D)' \;\Rightarrow\; (A]x]C]z]D) \| B \xrightarrow{y} (A]x]C]z]D)' \| B \tag{11}$$
and
$$B \xrightarrow{y} B' \;\Rightarrow\; (A]x]C]z]D) \| B \xrightarrow{y} (A]x]C]z]D) \| B' \tag{12}$$
Equations (11) and (12) have the form of the parallel execution inference rules (3), which proves (8). If $S$ were stateful, then
$$\exists y:\; (A]x]C]z]D) \| S \xrightarrow{y} (A]x]C]z]D)' \| S' \tag{13}$$
and then
$$(A]x]C]z]D) \| B \xrightarrow{y} (A]x]C]z]D)' \| B' \tag{14}$$
This would mean that there are some interactions between $(A]x]C]z]D)$ and $B$, and that they cannot be treated independently. The above theory makes it possible to break the whole BPEL process into a set of subprocesses that depend on each other only through data dependencies represented as activity dependencies. This technique can be related to program slicing [1], which is broadly used in source code refactoring. A BPEL service with defined activity dependencies and without structural constraints (sequences, flows, conditionals and so on) is called a minimal dependency process (MDP). Such an MDP becomes the basis for verifying the equivalence of a transformed (refactored) process with the requirements specified by the MDP. After refactoring, the new (refactored) process has to be translated to PA, and its PA image must fulfill the preorder relationship specified by the MDP; therefore, the state graph of the refactored process has to be a subgraph of the MDP's state graph.
3.3 Application of PA in Refactoring
As mentioned in the introduction to the transformational approach, the PA formalism can help to decide whether an examined process is equivalent to the reference one, or to point out the changes between the processes. This is done by comparing the execution state graphs of the reference and the refactored process. There are two possible results of this examination:
– the execution graph of the refactored process is a subgraph of the MDP of the reference process, which means that the refactored process is equivalent to the reference process, or
– the execution graph is not a subgraph of the MDP of the reference process, so the processes are not equivalent. The comparison then tells us which edges or nodes of the execution graph are present in the refactored process but do not exist in the MDP graph. Those extra edges and/or nodes are related to the instructions or parts of code that make an important difference between the original and the refactored process. The human designer can then decide whether these differences are acceptable in the context of the refactored process.
3.4 Algorithm and Tools for Equivalence Verification
The algorithm of equivalence verification consists of three steps:
1. Translating the BPEL process to the minimal dependency process (MDP); this step is performed only once, at the beginning of the refactoring process;
2. Translating the BPEL process to its PA image;
3. Checking the preorder relationship of the PA image with the minimal dependency process.
For the purposes of the test, the translation from BPEL to PA was made using an XSLT [27] processor, and the Concurrency Workbench of the New Century (CWB-NC) [6] was used as the PA processor. The structure of the verification system is presented in figure 2.
Fig. 2. Structure of verification process
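The subgraph comparison behind step 3 can be illustrated with a deliberately simplified model. The sketch below is not the preorder check performed by CWB-NC: states are assumed to be identified by the set of completed activities, the activity names and dependencies are hypothetical, and the check is reduced to plain edge-set inclusion.

```python
def mdp_graph(activities, depends):
    """Edges (done, activity, done + activity) for every order the data dependencies allow."""
    edges, seen, frontier = set(), {frozenset()}, [frozenset()]
    while frontier:
        done = frontier.pop()
        for act in activities:
            ready = all(pre in done for pre, post in depends if post == act)
            if act not in done and ready:
                nxt = done | {act}
                edges.add((done, act, nxt))
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
    return edges


def sequence_graph(order):
    """Execution graph of a purely sequential process."""
    edges, done = set(), frozenset()
    for act in order:
        edges.add((done, act, done | {act}))
        done = done | {act}
    return edges


def extra_edges(refactored, mdp):
    """Edges of the refactored process that the MDP graph does not contain."""
    return refactored - mdp


activities = ["shipping", "invoicing", "scheduling"]
depends = {("shipping", "invoicing"), ("shipping", "scheduling")}
mdp = mdp_graph(activities, depends)

good = sequence_graph(["shipping", "scheduling", "invoicing"])
bad = sequence_graph(["invoicing", "shipping", "scheduling"])
```

An empty result from `extra_edges` corresponds to the first outcome described in section 3.3, while a non-empty result lists the transitions a designer would have to inspect.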
4 Process Design Example
The entire approach can be illustrated with a practical example of an order handling process. During the transformations two quality attributes are taken into account: performance and reusability. The first is measured by the response time under a given load, the second by the number of interfaces that the whole considered system provides.
4.1 Reference Process
The order handling process service is composed of three basic activities: invoicing, order shipping and production scheduling, which operate as follows:
1. A purchase order is received; it defines the purchased product type, the quantity and the desired shipping method.
2. The shipping service is requested, which returns the shipping costs.
3. The invoice service provides the process with an invoice.
4. Production of the ordered goods is scheduled with a request to the scheduling service.
All the activities are performed consecutively. The reference process with the accompanying services is depicted in fig. 3.
Fig. 3. Purchase order handling reference process
4.2 Process Alternatives
The designer considers three alternatives to the reference process, shown in fig. 4:
1. Alternative (1): the purchase process first makes a request to the shipping service, and then requests the scheduling and invoice services in parallel.
2. Alternative (2): the process runs all three requests, to the invoice, shipping and scheduling services, in parallel.
3. Alternative (3): a slightly more sophisticated one, in which the reference service is split into three separate services. The first invokes the shipping service, the second invokes the invoice and scheduling services, and the third composes the other two subservices.
4.3 Equivalence Verification
Each of the three alternatives is verified for behavioral equivalence with the reference process. The verification technique has been described in section 3. The results of the verification are as follows:
– Alternative (1) is behaviorally equivalent unconditionally.
– Alternative (2) is not equivalent, because the requests to the invoicing and scheduling services depend on the data received from the shipping service. When all three requests start at the same time, we cannot guarantee that the data from the shipping service is received before the requests to the scheduling and invoicing services are made. This alternative cannot be accepted.
– Alternative (3) is behaviorally equivalent.
4.4 Alternatives Evaluation – Performance
As mentioned at the beginning of the section, the performance and reusability of the alternatives are assessed in order to choose a preferred process structure. Performance is measured as the mean response time under a certain load level. The web services and the connections between them can be modeled with queueing theory as an M/M/1//∞ system [8]. This means that requests arrive at the system independently, with exponentially distributed inter-arrival times, and that the response time is also exponentially distributed. Thanks to the above assumptions, the average response time of the whole system can be estimated as the sum of the average response times of its components: the services and the links between them. To make the evaluation simpler, we assume that every network connection has the same average latency $R_N$. The average response time of the reference process is then:
$$R_{RP} = R_{BPEL_{RP}} + R_{shipping} + R_{invoicing} + R_{scheduling} + 7R_N \tag{15}$$
Please note that the average response times of invoicing, shipping and scheduling are simply added, because the services are invoked consecutively. Let us additionally assume that the values of the parameters are as follows:
– $R_{BPEL_{RP}} = 2$ ms (average processing time of the main BPEL process)
– $R_{shipping} = 3$ ms (average response time of the shipping service)
– $R_{invoicing} = 5$ ms (average response time of the invoicing service)
– $R_{scheduling} = 4$ ms (average response time of the scheduling service)
– $R_N = 1$ ms (average network latency)
Fig. 4. Discussed alternatives for the reference process
That gives $R_{RP} = 21$ ms. For alternative (1) the average response time is:
$$R_{A1} = R_{BPEL_{A1}} + R_{shipping} + \max(R_{invoicing}, R_{scheduling}) + 7R_N \qquad (16)$$
The difference between alternative (1) and the reference process is that the invoicing and scheduling services are requested in parallel, so the response time of the parallel part is the maximum of the response times of the invoicing and scheduling services. Assuming that $R_{BPEL_{A1}} = R_{BPEL_{RP}}$, we obtain $R_{A1} = 17$ ms. Finally, the average response time of alternative (3) is given by:
$$R_{A3} = R_{BPEL_{A31}} + R_{BPEL_{A32}} + R_{BPEL_{A33}} + R_{shipping} + \max(R_{invoicing}, R_{scheduling}) + 11R_N \qquad (17)$$
which yields $R_{A3} = 25$ ms. This value follows if each of the three BPEL processes is assumed to take 2 ms, like the main BPEL process of the reference process.
4.5 Alternatives Evaluation – Reusability
The total number of interfaces provided by the considered system (i.e. the process and its accompanying services) is used as the reusability metric: the more interfaces, the higher the reusability. The reference process and alternative (1) deliver four interfaces: one to the purchase process and three to the elementary services (invoicing, shipping and scheduling). Alternative (3) delivers six interfaces: three to the elementary services, one to the composite service (process) and two new interfaces to the two subservices. All of the above data is gathered in Table 2.
Table 2. Quality metrics for the reference process and its alternatives
Metric                  Reference process   Alternative 1   Alternative 3
Average response time   21 ms               17 ms           25 ms
Reusability             4                   4               6
Services quantity       1                   1               3
4.6 Alternatives Selection
There is a trade-off between the two considered quality attributes: systems consisting of more basic services are more reusable at the expense of performance, and vice versa. The designer has to decide according to the assumed design preferences: when reusability is preferred, alternative (3) should be chosen; when performance is the most important attribute, alternative (1) should be chosen.
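The estimates of Section 4.4 and the selection rule above can be reproduced with a few lines of Python. This is only a sketch: the function and variable names are illustrative, and the value used for the three BPEL processes of alternative (3) is an assumption (2 ms each), chosen because it reproduces the 25 ms reported in the text.

```python
# Parameter values from Section 4.4, all in milliseconds.
R_BPEL = 2.0        # processing time of one BPEL process
R_SHIPPING = 3.0
R_INVOICING = 5.0
R_SCHEDULING = 4.0
R_N = 1.0           # average latency of one network connection


def reference_process() -> float:
    """Eq. (15): the three services are invoked one after another."""
    return R_BPEL + R_SHIPPING + R_INVOICING + R_SCHEDULING + 7 * R_N


def alternative_1() -> float:
    """Eq. (16): invoicing and scheduling are requested in parallel."""
    return R_BPEL + R_SHIPPING + max(R_INVOICING, R_SCHEDULING) + 7 * R_N


def alternative_3() -> float:
    """Eq. (17): three BPEL processes and eleven network connections.

    Assumption: each of the three BPEL processes costs R_BPEL.
    """
    return 3 * R_BPEL + R_SHIPPING + max(R_INVOICING, R_SCHEDULING) + 11 * R_N


# (average response time in ms, number of interfaces) for the
# behaviorally equivalent candidates; alternative (2) was rejected.
CANDIDATES = {
    "reference": (reference_process(), 4),
    "alternative 1": (alternative_1(), 4),
    "alternative 3": (alternative_3(), 6),
}


def select(prefer: str) -> str:
    """Pick a candidate according to the preferred quality attribute."""
    if prefer == "performance":
        return min(CANDIDATES, key=lambda name: CANDIDATES[name][0])
    if prefer == "reusability":
        return max(CANDIDATES, key=lambda name: CANDIDATES[name][1])
    raise ValueError(f"unknown preference: {prefer}")


if __name__ == "__main__":
    for name, (time_ms, interfaces) in CANDIDATES.items():
        print(f"{name}: {time_ms:.0f} ms, {interfaces} interfaces")
    print("performance ->", select("performance"))
    print("reusability ->", select("reusability"))
```

With the values above, the performance preference selects alternative (1) and the reusability preference selects alternative (3), matching the recommendation in the text.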
5 Summary
The paper presents a transformational approach to the design of electronic business processes denoted in BPEL. The approach is founded on a novel concept of business process equivalence, which makes it possible to construct business processes through gradual transformations. Processes resulting from those transformations can be formally verified against their specification as well as against typical properties of concurrent systems such as liveness or reachability. The designer, who steers the transformations by evaluating non-functional attributes, is informed either that the transformed process meets the predefined requirements or which parts of the process's behaviour have changed. The designer can accept or reject such a non-equivalent transformation, and is also assisted by supporting tools. The usefulness of our approach has been demonstrated on a non-trivial example. Directions for the development of tool support have also been provided; some of these tools have already been implemented, e.g. the BPEL to LOTOS transformation tool. Further research can include the development of predefined transformations that have been proved a priori to preserve properties such as process equivalence, as well as the development of a business process design environment (software tools) integrating all the tools needed to support the presented approach. The challenge in building such tools is to make one consistent process out of all the separate method steps and to build an integrated development environment. An integrated tool can be built as an extension of one of the open-source software development environments such as Eclipse or NetBeans.
References
1. Binkley, D., Gallagher, K.B.: Program slicing. Advances in Computers 43, 1–50 (1996)
2. Bolognesi, T., Brinksma, E.: Introduction to the ISO specification language LOTOS. Comput. Netw. ISDN Syst. 14(1), 25–59 (1987)
3. Koshkina, M., Breugel, F.: Models and verification of BPEL (2006)
4. Cámara, J., Canal, C., Cubo, J., Vallecillo, A.: Formalizing WSBPEL business processes using process algebra. Electr. Notes Theor. Comput. Sci. 154(1), 159–173 (2006)
5. Cleaveland, R., Smolka, S.: Process algebra (1999)
6. Cleaveland, R.: Concurrency workbench of the new century (2000), http://www.cs.sunysb.edu/~cwb/
7. Fahland, D., Reisig, W.: ASM-based semantics for BPEL: The negative control flow. In: Abstract State Machines, pp. 131–152 (2005)
8. D'Ambrogio, A., Bocciarelli, P.: A model-driven approach to describe and predict the performance of composite services, pp. 78–89 (2007)
9. Ferrara, A.: Web services: a process algebra approach. In: ICSOC 2004: Proceedings of the 2nd International Conference on Service Oriented Computing, pp. 242–251. ACM Press, New York (2004)
10. Ferrara, A.: Web services: a process algebra approach, pp. 242–251 (2004)
11. Foster, H., Kramer, J., Magee, J., Uchitel, S.: Model-based verification of web service compositions. In: 18th IEEE International Conference on Automated Software Engineering, ASE (2003)
12. Foster, H., Uchitel, S., Magee, J., Kramer, J., Hu, M.: Using a rigorous approach for engineering web service compositions: A case study. In: SCC 2005: Proceedings of the 2005 IEEE International Conference on Services Computing, pp. 217–224. IEEE Computer Society, Washington, DC (2005)
13. Fu, X., Bultan, T., Su, J.: Analysis of interacting BPEL web services. In: WWW 2004: Proceedings of the 13th International Conference on World Wide Web, pp. 621–630. ACM, New York (2004)
14. Hofacker, I., Vetschera, R.: Algorithmical approaches to business process design. Computers & OR 28(13), 1253–1275 (2001)
15. Holzmann, G.J.: The SPIN Model Checker: Primer and Reference Manual. Addison-Wesley Professional, Reading (2003)
16. Martens, A.: Simulation and equivalence between BPEL process models (2005)
17. Fowler, M.: Refactoring: Improving the Design of Existing Code. Addison-Wesley Longman Publishing Co., Inc., Boston (1999)
18. Mills, E.E.: Software metrics. SEI-CM-12-1.1, Seattle University (1988)
19. Moore, I.: Automatic inheritance hierarchy restructuring and method refactoring, pp. 235–250 (1996)
20. Opdyke, W.F.: Refactoring Object-Oriented Frameworks. PhD thesis, Urbana-Champaign, IL, USA (1992)
21. Ouyang, C., Verbeek, E., van der Aalst, W.M.P., Breutel, S., Dumas, M., ter Hofstede, A.H.M.: Formal semantics and analysis of control flow in WS-BPEL. Sci. Comput. Program. 67(2-3), 162–198 (2007)
22. Ratkowski, A., Zalewski, A.: Performance refactoring for service oriented architecture. In: ISAT 2007: Information Systems Architecture And Technology (2007)
23. Hinz, S., Schmidt, K., Stahl, C.: Transforming BPEL to Petri Nets. In: van der Aalst, W.M.P., Benatallah, B., Casati, F., Curbera, F. (eds.) BPM 2005. LNCS, vol. 3649, pp. 220–235. Springer, Heidelberg (2005)
24. Salaün, G., Bordeaux, L., Schaerf, M.: Describing and reasoning on web services using process algebra, p. 43 (2004)
25. Salaün, G., Ferrara, A., Chirichiello, A.: Negotiation among web services using LOTOS/CADP. In: Zhang, L.-J., Jeckle, M. (eds.) ECOWS 2004. LNCS, vol. 3250, pp. 198–212. Springer, Heidelberg (2004)
26. Reisig, W.: Modeling- and Analysis Techniques for Web Services and Business Processes. In: Steffen, M., Tennenholtz, M. (eds.) FMOODS 2005. LNCS, vol. 3535, pp. 243–258. Springer, Heidelberg (2005)
27. W3C: XSL Transformations (XSLT) version 1.0 (1999), http://www.w3.org/tr/xslt
28. Yang, Y., Tan, T., Yu, J., Liu, F.: Transformation BPEL to CP-nets for verifying web services composition. In: Proceedings of NWESP 2005, p. 137. IEEE Computer Society, Washington, DC (2005)
Service-Based Realization of Business Processes Driven by Control-Flow Patterns
Petr Weiss
Department of Information Systems, Faculty of Information Technology, Brno University of Technology, Bozetechova 2, 612 66 Brno, Czech Republic
[email protected]
Abstract. This paper deals with Service-Oriented Analysis and Design (SOAD). SOAD comprises requirements elicitation, analysis of the requirements to identify services and business processes, and finally implementation of the services and processes. It is a heterogeneous approach that combines a number of well-established practices. The key question in SOAD is how to integrate these practices to model a true service-oriented system with all its requirements. The approach presented in this paper helps to bridge the semantic gap between business requirements and IT architecture by using a method for transforming business process diagrams into service diagrams. The service diagrams describe how the process is realized by using services. In detail, the method transforms diagrams denoted in Business Process Modeling Notation into a set of UML diagrams. The principle of the method is demonstrated on an example.
Keywords: Service-Oriented Architecture, Service-Oriented Analysis and Design, Business Process Modeling, Service Specification, Composite Services.
1 Introduction
Service-Oriented Architecture (SOA) is a complex solution for the analysis, design, maintenance and integration of enterprise applications that are based on services. Services are autonomous, platform-independent entities that provide one or more functional capabilities via their interfaces. In other words, SOA is an architectural style for aligning business and IT architectures. Any new service has to fulfill both the business requirements and the fundamental SOA properties such as loose coupling, service independence, statelessness and reusability [4]. This means that Service-Oriented Analysis and Design (SOAD) should focus not only on services but also on business requirements. This paper introduces an approach for integrating Business Process Modeling and Service Modeling. The approach is based on a technique that transforms business process models into models of service orchestration by using control-flow patterns. A service orchestration represents the realization of a business process. To be more specific, business process diagrams are denoted in Business Process Modeling Notation (BPMN) and service orchestrations are modeled by means of UML. The remainder of this paper is organized as follows. Section 2 contains an overview of the main approaches relevant to the subject. Section 3 introduces the concept of primitive and composite services and presents the idea of service specification. Next, Section 4 explains the transformation rules. They are illustrated with an exemplary business process in Section 5. Finally, Section 6 discusses the advantages and disadvantages of our approach compared with the approaches presented in Section 2. Conclusions and future work are contained in Section 7.
Z. Huzar et al. (Eds.): CEE-SET 2008, LNCS 4980, pp. 91–102, 2011. © IFIP International Federation for Information Processing 2011
2 Related Work
The work presented in this paper has been influenced by several different proposals. First of all, we should mention UML profiles for SOA. [1] introduced a simple UML profile intended for modeling web services. [6] provided a well-defined profile for modeling service-oriented solutions. [11] shows how services and their extra-functional properties can be modeled by using UML. The ideas presented in these approaches are a good starting point for modeling both the structural and the behavioral properties of services by means of UML. However, Service Modeling needs to take some additional aspects into account. Primarily, we have to know the connection between business requirements and the functional capabilities of the modeled services. [13] introduced 21 patterns that describe the behavior of a business process. Next, [16] showed how these patterns can be modeled by means of UML and BPMN, and discussed the readability and technical properties of each model. Furthermore, [17] analyzed the use of Control-flow, Data and Resource patterns in BPMN, UML and BPEL. BPMN itself is defined in [9]. The relationship between BPMN and UML is also introduced in [2], whose author describes a high-level mapping of a business process to a UML 2.0 Business Service Model (BSM) that represents a service specification between business clients and information technology implementers. Lastly, [7] addressed the main problems of transformation between business process modeling languages; in particular, the ADONIS Standard Modeling Language, BPMN, Event-driven Process Chains and UML 2.0 Activity Diagrams were considered.
3 Service Specification: Primitive and Composite Services
Within this paper, each service is modeled with the stereotype service, which extends the UML class Component [8]. This concept is introduced in [15]. Every service interacts with its environment via interfaces. During an interaction, a service can play two different roles: service provider or service consumer. These two roles are distinguished in the service model by means of ports. The provider port (stereotyped as <<prov>>) of a service implements interfaces that specify functional capabilities provided to possible consumers of the service, while the consumer port (stereotyped analogously) requires interfaces of defined services in order to consume their functionality.
In our approach, we prefer a flat model to a hierarchical model of the service-oriented architecture. For this purpose we introduce two types of services: primitive services and composite services. Primitive services are derived from service invocation tasks and are responsible for providing the functional capabilities defined by a task. A composite service is an access point to an orchestration of other primitive or composite services. This means that a composite service does not enclose the internal services participating in the orchestration and does not delegate its interfaces to them. It only acts as a controller of those services, i.e. the composite service communicates with its neighboring services at the same level of hierarchy in a producer-consumer relationship. The flat model provides better reusability of services, because the context of a service is defined only by its implemented (i.e. provided) and required interfaces, not by its position in the hierarchy.
The specification of each service includes a description of its architecture and internal behavior. The architecture is defined by the interfaces that the service provides and requires. Each service provides at least one interface at the consumer port. Such an interface comprises the service operations that realize the functional capability of the service. The parameters of these operations describe the format of an incoming message. The internal behavior describes which actions are executed when an operation of the service is invoked. In the case of a composite service, the behavior describes the orchestration that this service provides, i.e. which actions have to be performed in order to execute the corresponding orchestration.
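The flat model described above can be illustrated with a small data structure. The following Python sketch is not part of the method itself; the class, function and interface names (Service, connect, context, iProcessA, iTask1) are hypothetical and serve only to show that a service's context reduces to its provided and required interfaces.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Service:
    """A service with a provider port and a consumer port (illustrative model)."""
    name: str
    composite: bool = False
    provided: set[str] = field(default_factory=set)  # interfaces at the provider port
    required: set[str] = field(default_factory=set)  # interfaces at the consumer port


def connect(consumer: Service, provider: Service, interface: str) -> None:
    """Assembly connector: `consumer` requires an interface that `provider` provides."""
    if interface not in provider.provided:
        raise ValueError(f"{provider.name} does not provide {interface}")
    consumer.required.add(interface)


def context(service: Service) -> tuple[frozenset[str], frozenset[str]]:
    """In the flat model the context is just (provided, required) interfaces."""
    return frozenset(service.provided), frozenset(service.required)


if __name__ == "__main__":
    # A composite service controls a primitive one without enclosing it.
    service_a = Service("serviceA", composite=True, provided={"iProcessA"})
    service_1 = Service("service1", provided={"iTask1"})
    connect(service_a, service_1, "iTask1")
    print(context(service_a))
```

Because no service stores a reference to a parent, the same primitive service can be connected to a different composite service without any change to its own specification.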
4 The Transformation
As mentioned above, SOAD provides the techniques required for the identification, specification and realization of services and service flows. The initial activity in the development of a new SOA-based system is service identification. It consists of a combination of top-down, bottom-up, and middle-out techniques of domain decomposition of legacy systems, existing asset analysis, and goal-service modeling. The result of service identification is a set of candidate services. In the context of this paper, service identification is a prerequisite for the transformation. Since this paper is mainly focused on the transformation, the details of service identification are omitted here; more about it can be found in [5]. The transformation consists of two basic steps. The objective of the first step is to identify which tasks from the Business Process Diagram (BPD) represent service invocations and will therefore be transformed into services. This decision takes into account such aspects as which service providers provide which services, Quality of Service (QoS) requirements, security issues, etc. [3], and is closely related to the results of service identification. Such an analysis is beyond the scope of this paper.
Table 1. Transformation rules for basic BPMN elements (columns: BP pattern, Service pattern, Communication protocol)
The second step, the transformation process itself, generates an orchestration of the identified services. The orchestration represents the realization of the given business process. The transformation is based on transformation rules. Each rule defines how to transform a control-flow pattern (or a basic BPMN element) into a UML service pattern. Each service pattern is described by its structure and by a communication protocol. Due to space limitations, only the basic rules are introduced (see Table 1 to Table 3). Some rules include context information that is needed to understand how the rule is used; in such cases, the actual transformation step is enclosed in a red-bordered area. To distinguish a composite service from a primitive one, the convention used in this paper is to name a composite service with a capital letter (e.g. serviceA) and a primitive service with a number (e.g. service1). Besides, some BPD elements can be transformed in more than one way (see e.g. Table 2).
Table 1 contains transformation rules for basic BPD elements. The first row of the table depicts the rule for a pool. A pool represents a business entity in the given process. Graphically, a pool is a container for the control flow between BPD elements. Each pool in a BPD is mapped to a corresponding composite service in a service diagram. The composite service is then responsible for the execution of the part of the business process that is modeled within the pool. Hereafter, we assume that all BPD elements belong to the processA pool, unless otherwise indicated. A pool can contain one or more lanes. Lanes model internal sub-partitions (usually called roles) within a business entity. These roles encapsulate a given part of the business process and ensure its execution. In the context of service diagrams, this relationship is modeled by means of packages. Next, a message start event is mapped to a message (implicitly asynchronous) and a message end event is mapped to a reply.
The last row in Table 1 contains the transformation rule for a single service invocation task. Generally, a task represents some unit of work performed in a process that is not modeled further. A service task is a task that provides some sort of service. Graphically, a service task is depicted with a small sprocket symbol in the upper right corner to distinguish it from an ordinary task. A service task in a business process diagram is represented by a primitive service in a service diagram. As mentioned above, Task1 (from Table 1) belongs to processA, which means that Task1 is executed within the scope of processA. As a result, a new service1 is created and connected to serviceA by means of an assembly connector. The connector defines that service1 provides some capabilities that serviceA requires. In this case, it means that service1 is controlled by serviceA. In other words, serviceA is an access point to service1. The corresponding sequence diagram describes the details of the communication between serviceA and service1. As we can see, the invocation of Task1 is modeled by means of an incoming message to service1.
Table 2 shows how to transform a sequence of two service invocation tasks. The transformation can be done in three different ways.
The first way is to add a new primitive service to the existing service diagram and to specify the communication protocol in the corresponding sequence diagram (see Table 2, the first row). The second way is to create a new primitive service and a new composite service. These two services are added to the service diagram as follows (see Table 2, the second row): service2 is the primitive service that represents Task2, and serviceB is the composite service that acts as a gate to service2 and service1. Hence, service2 is connected to serviceB, and service1 has to be switched to serviceB if it was connected to another service before this transformation step. Interactions among these services are depicted in the sequence diagram. Lastly, the sequence of two service tasks can be transformed into a single primitive service. This is achieved by adding a new interface to the provider port of the existing primitive service and connecting this interface to the corresponding composite service. Such a service therefore provides functional capabilities that correspond to both tasks. When the service is invoked, the capability is chosen according to the message that is sent to the service. See Table 2 (the third row) for details. Generally, the first rule is used when we need to create a new standalone service. The second rule is used when we need to control the primitive services for some reason, e.g. because of QoS or security issues. If we want to use an existing service with only some modifications, we can apply the third transformation rule.
Table 2. Transformation rules for the Sequence Pattern (columns: BP pattern, Service pattern, Communication protocol)
Transformation rules for passing data in a BPD are given in Table 3. In this case, a data object is sent from one task to another via a sequence flow. The first row describes the transformation of general data passing. As we can see in the sequence diagram, the composite service is responsible for forwarding data from one primitive service to the other. This rule can also be used for tasks that are not adjacent to each other in a BPD. The alternative rule (the second row) can be used only for two adjacent service tasks. In this case, service1 just produces data that needs to be passed to service2. For this purpose, service1 has to be switched directly to service2 if it was connected to another service before this transformation step. The sequence diagram then shows the details of the communication among these services. At first, serviceA invokes service2. After that, service2 calls service1 and waits for the data. As soon as the data is passed, service2 can complete its processing and send the result message back to serviceA.
Table 3. Transformation rules for the Data-Passing Pattern (columns: BP pattern, Service pattern, Communication protocol)
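The three variants of the Sequence Pattern rule can be sketched as operations on a very small service diagram. The Python code below is an illustration under stated assumptions, not the paper's formal definition: the Diagram class, the rule function names and the interface names (iTask1, iTask2, iGate) are hypothetical, and the wiring between the new composite service and the pool's composite service, which the text does not spell out, is left open.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Diagram:
    """Flat service diagram: provided interfaces and controlling services."""
    interfaces: dict[str, list[str]] = field(default_factory=dict)  # service -> provided interfaces
    controller: dict[str, str] = field(default_factory=dict)        # service -> service that controls it

    def add_service(self, name: str, interface: str,
                    controlled_by: str | None = None) -> None:
        self.interfaces.setdefault(name, []).append(interface)
        if controlled_by is not None:
            self.controller[name] = controlled_by


def rule_standalone(d: Diagram, new: str, interface: str, composite: str) -> None:
    """Variant 1: a new primitive service attached to the existing composite."""
    d.add_service(new, interface, controlled_by=composite)


def rule_gate(d: Diagram, new: str, interface: str,
              gate: str, gate_interface: str, existing: str) -> None:
    """Variant 2: a new primitive service plus a new composite service as a gate."""
    d.add_service(gate, gate_interface)             # gate wiring to the pool is left open
    d.add_service(new, interface, controlled_by=gate)
    d.controller[existing] = gate                   # switch the existing primitive to the gate


def rule_extend(d: Diagram, existing: str, interface: str) -> None:
    """Variant 3: no new service; the existing primitive gains one more interface."""
    d.interfaces[existing].append(interface)


if __name__ == "__main__":
    d = Diagram()
    d.add_service("serviceA", "iProcessA")
    rule_standalone(d, "service1", "iTask1", "serviceA")
    rule_gate(d, "service2", "iTask2", "serviceB", "iGate", "service1")
    print(d.controller)
```

Representing control as a plain mapping keeps the model flat: switching service1 from serviceA to serviceB is a single reassignment and does not touch the specification of service1 itself.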
5 Example
Figure 1 shows an exemplary “Purchase Order” business process model. This example is adopted from [10]. As we can see, three roles are responsible for the realization of the “Purchase Order” process: “Invoicing”, “Shipping” and “Scheduling”. Processing starts by receiving a purchase order message. Afterwards, the “Invoicing” role calculates an initial price. This price is not yet complete, because the total price depends on where the products are produced and on the shipping cost. In parallel, the “Shipping” role determines when the products will be available and from what locations. At the same time, the process requests a shipping schedule from the “Scheduling” role. After the shipping information is known, the complete price can be evaluated. Finally, when the complete price and the shipping schedule are available, the invoice can be completed and sent to the customer.
Fig. 1. BPMN model of the Purchase Order process
In this example, we assume that the following tasks (from Figure 1) were identified as service invocation tasks: “Initiate Price Calculation”, “Complete Price Calculation”, “Request Shipping” and “Request Production Scheduling”. We can generate a service orchestration (see Figure 2) from this BPD by using the transformation rules described in Section 4. The orchestration is composed of one composite and three primitive services. The composite service, the PurchaseOrder service, represents the business process itself. This service controls the remaining primitive services. The PriceCalculation service provides two interfaces (i.e. two functional capabilities), iInitialPrice and iTotalPrice, corresponding to the “Initiate Price Calculation” and “Complete Price Calculation” tasks. Similarly, ProductionScheduling and Shipping provide the functional capabilities specified by “Request Production Scheduling” and “Request Shipping”, respectively. Figure 3 depicts the service topology that belongs to the orchestration.
Fig. 2. The derived service orchestration
Fig. 3. The service topology
In Figure 2, we can also see that the PurchaseOrder service provides the iAsyncReplay interface at its consumer port. Each service that enables asynchronous communication has to implement this auxiliary interface. Consumed services use this interface to send replies back to the consumer. As a composite service, the PurchaseOrder service is responsible for keeping its subordinate services independent and stateless. This is done by controlling the passing of data between its subordinate services. The principle can be shown on the collaboration between PurchaseOrder and PriceCalculation (see Figure 4). In order to complete the invoice, the price has to be calculated. Consequently, the PurchaseOrder service sends a request to the PriceCalculation service. PriceCalculation calculates the initial price and returns it to PurchaseOrder. PriceCalculation does not store the calculated price for later use, i.e. it does not hold any state information. When the service is invoked again (in order to calculate the final price), its initial state has to be set by sending it the initial price.
Fig. 4. The behavior of the PurchaseOrder service
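The stateless collaboration between PurchaseOrder and PriceCalculation can be sketched in Python as follows. This is a simplified illustration: the calls are synchronous (the asynchronous replies via iAsyncReplay are not modeled), and the pricing formulas and function names are assumptions made for the example, not taken from the process specification.

```python
from __future__ import annotations


def initial_price(order_lines: list[tuple[int, float]]) -> float:
    """Stands in for iInitialPrice: computes a price and remembers nothing.

    Assumption: the initial price is the sum of quantity * unit price.
    """
    return sum(quantity * unit_price for quantity, unit_price in order_lines)


def total_price(initial: float, shipping_cost: float) -> float:
    """Stands in for iTotalPrice: the caller must send the initial price again.

    Assumption: the total price is the initial price plus the shipping cost.
    """
    return initial + shipping_cost


class PurchaseOrder:
    """Composite service: the only place where process data is kept."""

    def __init__(self, order_lines: list[tuple[int, float]]) -> None:
        self.order_lines = order_lines
        self.initial: float | None = None

    def run(self, shipping_cost: float) -> float:
        # First invocation of PriceCalculation: the result is stored here,
        # in the composite service, not in the primitive service.
        self.initial = initial_price(self.order_lines)
        # Second invocation: the needed state travels with the request.
        return total_price(self.initial, shipping_cost)


if __name__ == "__main__":
    order = PurchaseOrder([(2, 10.0), (1, 5.0)])
    print(order.run(shipping_cost=4.0))
```

Modeling the two capabilities as pure functions makes the statelessness explicit: any instance of the price calculation could serve the second request, because everything it needs arrives in the message.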
Because this paper deals mainly with generating models of service orchestration from a BPD, less attention is paid to the specification of individual services. Details about the messages passed between services and about the architecture of each service mentioned in this example can be found in [12].
6 Discussion
The transformation technique presented in this paper is motivated by several approaches introduced in Section 2, especially by [16], [17] and [5]. Compared to these approaches, the technique proposed in this paper is primarily focused on the realization of business processes rather than on modeling business processes by means of UML. [2] provides a method for integrating Business Process Modeling and Object Modeling by treating each business process as a contract. The contract is a mediator between business clients and IT implementers. It specifies service providers playing roles in a collaboration that fulfills some business rules in order to achieve some business goals. Our technique follows this method by describing how to implement the business process by services in the context of SOA. [10] proposes a concept similar to our approach. However, that concept does not describe the relation between business process diagrams and UML models. What is more, it contains a number of inaccuracies. The most significant one is that the statelessness of services (a fundamental SOA principle) is ignored: services are designed in such a way that they store data affecting their functionality between two incoming requests. Our approach solves this problem by using a composite service (see Sections 3 and 5). Furthermore, in our approach, each service interface is restricted to providing only one functionality. It is still possible to implement services with more interfaces (i.e. with more functional capabilities), but the interfaces have to provide independent functionality. This makes it possible to add, remove or alter any capability provided by the service without large modifications to the service specification. This provides better reusability and leads to easy extensibility of the modeled services.
7 Conclusion
This paper concerns Service-Oriented Analysis and Design, especially the integration of Business Process Modeling and Service Modeling. We outlined a transformation technique that defines how a business process should be transformed into an orchestration of services and how the services should collaborate to fulfill the business goals. The transformation is based on control-flow patterns and is designed in conformity with fundamental SOA principles such as loose coupling, service independence, statelessness and reusability. Some of these features are novel and have not been integrated into existing service-oriented modeling methods. The presented ideas are part of a larger project, which deals with the modeling and formal specification of SOA and the underlying component-based systems.
The approach presented in this paper is aimed at modeling the high-level abstract layers of the SOA hierarchy. Future work will focus on a formal description of the proposed transformation and on integration with formal component models. This will allow formal verification of a whole modeled system (e.g. tracing changes in a business process model to the changes in the components' structure). Furthermore, the formal description of the transformation will allow validation of the generated service orchestration. Besides, the ongoing research includes the design of additional transformation rules, especially rules for business patterns based on gateways and events, and the application of the transformation in a real-world case study.
References 1. Amir, R., Zeid, A.: A UML profile for service oriented architectures. In: OOPSLA 2004: Companion to the 19th Annual ACM SIGPLAN Conference on ObjectOriented Programming Systems, Languages, and Applications, pp. 192–193. ACM, Vancouver (2004) 2. Amsden, J.: Business services modeling: Integrating WebSphere Business Modeler and Rational Software Modeler, http://www.ibm.com/developerworks/ rational/library/05/1227_amsden/ 3. Arsanjani, A.: Service-oriented modeling and architecture: How to identify, specify, and realize services for your SOA, http://www.ibm.com/developerworks/ webservices/library/ws-soa-design1/ 4. Erl, T.: Service-Oriented Architecture: Concepts, Technology, and Design. Prentice Hall PTR, Upper Saddle River (2005) 5. Inaganti, S., Behara, G.K.: Service Identification: BPM and SOA Handshake. BPTrends (March 2007), http://www.bptrends.com/publicationfiles/ THREE%2003-07-ART-BPMandSOAHandshake-InagantiBehara-Final.pdf 6. Johnston, S.: UML 2.0 Profile for Software Services, http://www.ibm.com/ developerworks/rational/library/05/419_soa/ 7. Murzek, M., Kramler, G.: Business Process Model Transformation Issues - The Top 7 Adversaries Encountered at Defining Model Transformations. In: Proceedings of the Ninth International Conference on Enterprise Information Systems, pp. 144– 151. INSTICC, Paphos (2007) 8. Object Management Group: UML Superstructure Specification, version 2.0, http://www.omg.org/cgi-bin/doc?formal/05-07-04 9. Object Management Group: Business Process Modeling Notation (BPMN) Specification, http://www.bpmn.org/Documents/OMG%20Final%20Adopted%20BPM %201-0%20Spec%2006-02-01.pdf 10. Object Management Group: UML Profile and Metamodel for Services (UPMS), Request For Proposal, http://www.omg.org/docs/soa/06-09-09.pdf 11. Ortiz, G., Hernandez, J.: Toward UML Profiles for Web Services and their ExtraFunctional Properties. In: IEEE International Conference on Web Services (ICWS 2006), pp. 889–892. 
IEEE Computer Society, Chicago (2006) 12. Rychly, M., Weiss, P.: Modeling Of Service Oriented Architecture: From Business Process To Service Realisation. In: ENASE 2008 Third International Conference on Evaluation of Novel Approaches to Software Engineering Proceedings, pp. 140–146. INSTICC, Funchal (2008)
13. van der Aalst, W.M.P., Barros, A.P., ter Hofstede, A.H.M., Kiepuszewski, B.: Advanced Workflow Patterns. In: Scheuermann, P., Etzion, O. (eds.) CoopIS 2000. LNCS, vol. 1901, pp. 18–29. Springer, Heidelberg (2000)
14. Weiss, P.: Using UML 2.0 in Service-Oriented Analysis and Design. In: Proceedings of the 12th Conference and Competition STUDENT EEICT 2006, vol. 4, pp. 420–424. Faculty of Information Technology BUT, Brno (2006)
15. Weiss, P., Zendulka, J.: Modeling of Services and Service Collaboration in UML 2.0. In: Information Systems and Formal Models, pp. 29–36. Faculty of Philosophy and Science in Opava, Silesian University in Opava, Opava (2007)
16. White, S.: Process Modeling Notations and Workflow Patterns. BPTrends (February 2008), http://www.bptrends.com/deliver_file.cfm?fileType=publication&fileName=03%2D04%20WP%20Notations%20and%20Workflow%20Patterns%20%2D%20White%2Epdf
17. Wohed, P., van der Aalst, W.M.P., Dumas, M., ter Hofstede, A.H.M., Russell, N.: Pattern-based Analysis of BPMN - An extensive evaluation of the Control-flow, the Data and the Resource Perspectives (revised version), http://www.workflowpatterns.com/documentation/index.php
SMA – The Smyle Modeling Approach

Benedikt Bollig (1), Joost-Pieter Katoen (2), Carsten Kern (2), and Martin Leucker (3)

(1) LSV, ENS Cachan, CNRS
(2) RWTH Aachen University
(3) TU München
Abstract. This paper introduces SMA, the Smyle Modeling Approach, a model-based software development methodology centered around Smyle, a dedicated learning procedure that helps engineers interactively obtain design models from requirements, which are characterized as either desired (positive) or unwanted (negative) system behavior. The learning approach is complemented by scenario patterns with which the engineer can specify clearly desired or unwanted behavior. This way, user interaction is reduced to the interesting scenarios, which limits the design effort considerably. In SMA, the learning phase is followed by an effective analysis phase that allows design flaws to be detected at an early design stage. This paper describes the approach and reports on first practical experiences.

Keywords: Requirements elicitation, design model, learning, software engineering lifecycle, Message Sequence Charts, UML.
1 Introduction

To put it bluntly, software engineering (under the assumption that requirements acquisition has taken place) amounts to bridging the gap between requirements, typically stated in natural language, and a piece of software. To ease this step, model-driven design (like MDA) introduces architecture-independent design models as intermediaries between requirement specifications and concrete implementations. These design models typically describe the control flow, basic modules or components, and their interaction. Design models are then refined towards executable code, typically in several design steps in which system details are incorporated progressively. Correctness of these design steps may be checked using, e.g., model checking or deductive techniques.

Problems in software engineering cycles occur if, e.g., the requirements are numerous, ambiguous, or contradictory, or if they change over time. Evolving requirements may be due to changing user requirements or to anomalous system behavior detected at later design stages, and thus occur at all stages of the development process.

This paper presents a novel software engineering lifecycle model based on a new approach towards requirement specification and high-level design. It is tailored to the development of communicating distributed systems whose behavior can be specified using sequence diagrams exemplifying either desired or undesired system runs. A widespread notation for sequence diagrams is that of message sequence charts (MSCs). They have
This work is partially supported by EGIDE/DAAD (Procope 2008), DOTS (ANR-06-SETIN003), and P2R MODISTECOVER.
Z. Huzar et al. (Eds.): CEE-SET 2008, LNCS 4980, pp. 103–117, 2011. © IFIP International Federation for Information Processing 2011
been adopted by the UML, are standardized by the ITU [22], and are part of several requirements elicitation techniques such as CREWS [26].

At the heart of our approach, called the Smyle Modeling Approach (SMA), dedicated learning techniques support the engineer in interactively obtaining implementation-independent design models from MSCs exemplifying the system's behavior. These techniques are implemented in the tool Smyle (Synthesizing models by learning from examples, cf. [5]). The incremental learning approach allows requirements to be developed, refined, and completed gradually, and supports evolving requirements in a natural manner, rather than requiring a full-fledged set of requirements up front. Importantly, Smyle does not rely only on given system behaviors but progressively asks the engineer to classify certain corner cases as either desired or undesired behavior whenever the examples provided so far do not uniquely determine a (minimal) system model. As abstract design models, Smyle synthesizes distributed finite-state automata (referred to as communicating finite-state machines, or CFMs for short) [5]. This model is implementation-independent and describes the local control flow as finite automata which communicate via unbounded order-preserving channels. The learning approach is complemented by so-called scenario patterns, with which the engineer can specify clearly desired or unwanted behavior via a dedicated formula editor. This way, user interaction is reduced to the interesting scenarios, which limits the design effort considerably.

Once an initial high-level design has been obtained by learning, SMA suggests an intensive analysis of the obtained model, first by comprehensive simulation and second by checking elementary correctness properties of the CFM, for example by means of model checking or dedicated analysis algorithms [6]. This allows for an early detection of design flaws.
In case of a flaw, i.e., when some observed behavior should be ruled out or some expected behavior cannot be realized by the current model, the learning phase can be continued with the corresponding scenarios, yielding an adapted design model that reflects the expected behavior for the given scenarios. A satisfactory high-level design may subsequently be refined or translated into, e.g., Stateflow diagrams [17], from which executable code is automatically generated using tools such as Matlab/Simulink.

The final stage of SMA is a model-based testing phase [10] in which it is checked whether the software conforms to the high-level design description. The MSCs used for formalizing requirements now serve as abstract test cases. Moreover, supplementary test cases are generated in an automated way. This systematic on-the-fly test procedure is supported by tools such as TorX and TGV [2] that can easily be plugged into our design cycle. Again, any test failure can be described by MSCs, which may be fed back to the learning phase.

Related work. To the best of our knowledge, there is no related work on defining lifecycle models based on learning techniques. However, several approaches for synthesizing models based on scenarios are known. In [32,31], Uchitel et al. recommend the use of high-level MSCs (HMSCs) as input for model synthesis. High-level MSCs aim at specifying the overall system behavior, yet are hard to adapt when unwanted behavior has to be removed or wanted behavior has to be defined. The same problem arises for Live Sequence Charts or related formalisms [13,19,8]. In general, whenever the overall global system behavior is modeled, a modification due to changing requirements is cumbersome and error-prone.
The approaches taken in [25] and [12] are, like Smyle, based on learning techniques. The general advantage of learning techniques is that changing requirements can be incorporated into the learning process. However, the algorithms of [25] and [12] both have the drawback that the resulting design model does not necessarily conform to the given examples, and they require that unwanted “[...] implied scenarios should be detected and excluded” [12] manually, while Smyle does conform to the given examples. A very interesting prospect is described in [18], where Harel presents his ideas and dreams about scenario-based programming and proposes to use learning techniques for system synthesis. In his vision, “[the] programmer teaches and guides the computer to get to know about the system’s intended behavior [...]”, which is exactly our intention. An extended journal version of this paper will be available as [7].

Outline. In Section 2, the ingredients of our learning approach are described and complemented by a theoretical result on its feasibility. Section 3 describes SMA in detail and compares it to traditional and modern software engineering lifecycle models. In Section 4, we apply SMA step by step to a simple example, followed by insights on an industrial case study in Section 5.
2 Ingredients of the SMA

We now recall message sequence charts (MSCs) and communicating finite-state machines, describe the gist of Smyle, and present a logic for specifying sets of MSCs.

2.1 Message Sequence Charts

Message Sequence Charts (MSCs) are an ITU-standardized notation [22] for describing message exchange between concurrent processes. An MSC depicts a single partially ordered execution sequence of a system. It defines a collection of processes, which are drawn as vertical lines and interpreted as top-down time axes. Labeled horizontal arrows between these lines represent message exchanges, cf. Figure 1 (a). An MSC can be understood as a graph whose nodes represent communication actions, e.g., the graph in Figure 1 (b) represents the MSC of Figure 1 (a). A node or event represents the communication action indicated by its label, where, e.g., 1!2(a) stands
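Fig. 1. An MSC (a) and its graph (b). Panel (a) shows the processes 1, 2, and 3 exchanging the messages a, b, c, and c; panel (b) shows the events 1!2(a), 1!3(b), 2?3(c), 2?3(c), 2?1(a), 3!2(c), 3?1(b), and 3!2(c), connected by proc and msg edges, with two nodes marked u and v.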
for sending a message a from 1 to 2, whereas 2?1(a) is the complementary action of receiving a from 1 at process 2. The edges reflect causal dependencies between events. An edge can be of two types: it is either a process edge (proc), describing progress of one particular process, or a message edge (msg), relating a send event with its corresponding receive event. This graph can be represented as a partial order of communication events.

In this work we abstract from several features provided by the standard. Many of them (e.g., local actions, co-regions, etc.) can easily be included. Some of them, however, are excluded on purpose: loops and alternatives are not allowed, as MSCs are meant to specify single executions. Note that, in accordance with the ITU standard but in contrast to most works on learning MSCs, we consider the communication of an MSC to be asynchronous, meaning that sending and receiving of a message may be delayed.

A (finite or infinite) set of MSCs, which we call an MSC language, may represent a system in the sense that it contains all possible scenarios that the system may exhibit. MSC languages can be characterized and represented in many ways. Here, the notion of a regular MSC language is of particular interest, as it comprises languages that are learnable. Regularity of MSC languages is based on linearizations: a linearization of an MSC M is a total ordering of its events that does not contradict the transitive closure of the edge relation. Any linearization can be represented as a word over the set of communication actions. Two sample linearizations of the MSC from Figure 1 are

l1 = 1!2(a) 3!2(c) 2?3(c) 1!3(b) 3?1(b) 3!2(c) 2?3(c) 2?1(a) and
l2 = 3!2(c) 2?3(c) 1!2(a) 1!3(b) 3?1(b) 3!2(c) 2?3(c) 2?1(a).

Let Lin(M) denote the set of linearizations of M and, for a set $\mathcal{L}$ of MSCs, let $\mathrm{Lin}(\mathcal{L})$ denote $\bigcup_{M \in \mathcal{L}} \mathrm{Lin}(M)$.
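The notion of a linearization can be made concrete with a small check. The following is a minimal sketch, not part of Smyle: the names `PROCESS_ORDER`, `parse`, and `is_linearization` are hypothetical, the per-process event order is read off the two sample linearizations above, and message edges are assumed to follow FIFO matching per channel (the i-th send from p to q belongs to the i-th receive at q from p).

```python
from collections import defaultdict

# Events of the MSC from Figure 1, per process in top-down order
# (assumed from the sample linearizations l1 and l2).
PROCESS_ORDER = {
    1: ["1!2(a)", "1!3(b)"],
    2: ["2?3(c)", "2?3(c)", "2?1(a)"],
    3: ["3!2(c)", "3?1(b)", "3!2(c)"],
}

L1 = ["1!2(a)", "3!2(c)", "2?3(c)", "1!3(b)", "3?1(b)", "3!2(c)", "2?3(c)", "2?1(a)"]
L2 = ["3!2(c)", "2?3(c)", "1!2(a)", "1!3(b)", "3?1(b)", "3!2(c)", "2?3(c)", "2?1(a)"]


def parse(action):
    """Split an action such as '1!2(a)' into (process, kind, peer, message)."""
    kind = "!" if "!" in action else "?"
    proc, rest = action.split(kind)
    peer, msg = rest.rstrip(")").split("(")
    return int(proc), kind, int(peer), msg


def is_linearization(word, process_order=PROCESS_ORDER):
    """Check that word respects all process edges and all message edges."""
    seen = defaultdict(list)      # actions executed so far, per process
    channels = defaultdict(list)  # pending messages per (sender, receiver)
    for action in word:
        proc, kind, peer, msg = parse(action)
        seen[proc].append(action)
        # Process edges: what proc has done so far must be a prefix of its axis.
        expected = process_order.get(proc, [])
        if seen[proc] != expected[: len(seen[proc])]:
            return False
        if kind == "!":
            channels[(proc, peer)].append(msg)
        else:
            # Message edges: a receive needs a matching earlier send (FIFO).
            queue = channels[(peer, proc)]
            if not queue or queue.pop(0) != msg:
                return False
    # A linearization contains every event of the MSC.
    return all(seen[p] == order for p, order in process_order.items())
```

Both sample words pass this check, whereas a word that starts with the receive event 2?3(c) is rejected because the corresponding send has not yet happened.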
2.2 Communicating Finite-State Machines

Regular MSC languages can be naturally and effectively implemented in terms of communicating finite-state machines (CFMs) [9]. CFMs constitute an appropriate automaton model for distributed systems where processes are represented as finite-state automata that can send messages to one another through reliable FIFO channels. We omit a formal definition of CFMs and instead refer to the example depicted in Figure 2, which illustrates the Alternating Bit Protocol [24,30]. There, a producer process (p) and a consumer process (c) exchange messages from {0, 1, a}. Transitions are labeled with communication actions such as p!c(0), p?c(a), etc. (abbreviated by !0, ?a, and so on). For a concise description of this protocol, see Section 4. A CFM accepts a set of MSCs in a natural manner. For example, the language of the CFM from Figure 2 contains the MSCs depicted in Figure 4. Using CFMs, we account for asynchronous communication behavior, whereas other approaches usually use synchronous communication. This complicates the underlying theory of learning procedures but results in a model that does exactly what the user expects and does not represent an over-approximation.

The formal justification for using regular MSC languages is given by the following theorem, which states that a set of MSCs is implementable as a CFM if its set of linearizations is regular, or if it can be represented by a regular set of linearizations.

Theorem 1 ([20,15]). Let $\mathcal{L}$ be an MSC language. There is a CFM accepting precisely the MSCs from $\mathcal{L}$ if one of the following holds:
Fig. 2. Example CFM with a producer p (transitions labeled !0, !1, ?a) and a consumer c (transitions labeled ?0, ?1, !a)
1. The set $\mathrm{Lin}(\mathcal{L})$ is a regular set of words.
2. There is a channel bound B and a regular subset L of $\mathrm{Lin}(\mathcal{L})$ such that (i) any MSC from $\mathcal{L}$ exhibits a linearization that does not exceed B, and (ii) L contains precisely the linearizations from $\mathrm{Lin}(\mathcal{L})$ that do not exceed B.

If the regular languages are given as finite automata, we can compute a corresponding CFM effectively.

2.3 The Gist of Smyle

Smyle is the learning procedure underlying SMA and has recently been described in [5]. As input, Smyle is given a set $\mathcal{M}^+$ of positive scenarios, which are desired executions of the system to be, and a set $\mathcal{M}^-$ of negative scenarios, which should not be observed as system executions. If the given examples do not indicate a single conforming model, Smyle saturates both sets by asking further queries, which are successively presented to the user, who in turn has to classify each of them as either positive or negative, resulting in $\overline{\mathcal{M}}^+$ and $\overline{\mathcal{M}}^-$. Otherwise, a minimal deterministic finite automaton and a corresponding CFM accepting the MSCs of $\overline{\mathcal{M}}^+$ and rejecting those of $\overline{\mathcal{M}}^-$ are computed. If a subsequent analysis of the obtained CFM shows that it does not conform to the user's intention, it can be refined by providing further examples to be added to $\overline{\mathcal{M}}^+$ or $\overline{\mathcal{M}}^-$ and reinvoking the learning procedure. This process eventually converges to any intended CFM [5].

At first sight, one might think that inconsistencies could be introduced by the classifications of the presented MSCs. However, this is not possible due to the simple nature of MSCs: we do not allow branching, if-then-else, or loop constructs. Thus, they cannot overlap and generate inconsistencies. Note moreover that the learning algorithm is deterministic in the following sense: for every (saturated) set of examples, the learning algorithm computes a unique CFM. Within SMA, this makes it possible to rely only on the classified MSCs throughout a long-term project and to resume learning whenever new requirements arise. Moreover, reclassification in case of user errors is likewise simple.
An important aspect that distinguishes Smyle from others [25,12] is that the resulting CFM is consistent with the set of MSCs that served as input. Other approaches project their learning result onto the processes involved, accepting that the resulting system is a (coarse) over-approximation.
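What consistency with the classified samples means can be illustrated on the level of linearizations. The sketch below is only an illustration under simplifying assumptions: it checks a plain deterministic finite automaton over words of communication actions rather than a CFM, it does not reproduce Smyle's learning algorithm, and the names `accepts`, `is_consistent`, `ROUND`, and `TOY_DFA` are hypothetical.

```python
def accepts(dfa, word):
    """Run a DFA, given as a dict, on a sequence of communication actions."""
    state = dfa["start"]
    for action in word:
        state = dfa["delta"].get((state, action))
        if state is None:  # no transition defined: the word is rejected
            return False
    return state in dfa["accepting"]


def is_consistent(dfa, positive, negative):
    """True iff all positive words are accepted and all negative ones rejected."""
    return (all(accepts(dfa, w) for w in positive)
            and not any(accepts(dfa, w) for w in negative))


# Toy automaton for a single round: p sends 0, c receives it,
# c acknowledges with a, and p receives the acknowledgment.
ROUND = ["p!c(0)", "c?p(0)", "c!p(a)", "p?c(a)"]
TOY_DFA = {
    "start": 0,
    "accepting": {4},
    "delta": {(i, action): i + 1 for i, action in enumerate(ROUND)},
}
```

With the complete round as a positive sample and the round without the final receipt as a negative sample, `is_consistent(TOY_DFA, [ROUND], [ROUND[:3]])` holds; classifying the same word as both positive and negative can never be satisfied by any automaton.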
2.4 MSC Patterns

In order to significantly reduce the number of scenarios the user has to classify during a learning phase, it is worthwhile to consider a formalism in which (un)desired behavior can be specified a priori in terms of logical formulas. Due to space constraints, we only give a superficial description of how to apply such a logic within the SMA. A more detailed introduction can be found in [7].

The logic we employ is used as follows: positive and negative sets of formulas Φ+ and Φ− are input by the user, either directly or by annotating MSCs. An example of a negative statement would be, say, “there are two receipts of the same message in a row”. An annotated MSC for this example formula is given in Figure 6 (c). The learning algorithm can then efficiently check, for all formulas ϕ+ ∈ Φ+, ϕ− ∈ Φ−, and every unclassified MSC M, whether M ⊭ ϕ+ or M ⊨ ϕ−. If this is the case for some formula, the set of negative samples is updated to $\{M\} \cup \mathcal{M}^-$; otherwise, the question is passed to the user.
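The effect of patterns on the number of user queries can be sketched as follows. This is a hedged illustration, not the logic of [7]: patterns are modeled as plain Python predicates instead of formulas, an MSC is represented as a dictionary from process names to their top-down action lists, and the names `two_equal_receipts_in_a_row`, `ends_with_acknowledgment`, and `preclassify` are hypothetical. An MSC is classified as negative automatically when it violates some positive pattern or satisfies some negative one; everything else is left to the user.

```python
def two_equal_receipts_in_a_row(msc):
    """Negative pattern (one possible reading): some process performs
    the same receive action twice in direct succession."""
    for events in msc.values():
        for first, second in zip(events, events[1:]):
            if "?" in first and first == second:
                return True
    return False


def ends_with_acknowledgment(msc):
    """Positive pattern: the last producer event is the receipt of a."""
    events = msc.get("p", [])
    return bool(events) and events[-1] == "p?c(a)"


def preclassify(msc, positive_patterns, negative_patterns):
    """Return 'negative' if the patterns decide, else None (ask the user)."""
    violates_positive = any(not holds(msc) for holds in positive_patterns)
    matches_negative = any(holds(msc) for holds in negative_patterns)
    if violates_positive or matches_negative:
        return "negative"
    return None


# Example MSCs between a producer p and a consumer c.
ONE_ROUND = {"p": ["p!c(0)", "p?c(a)"], "c": ["c?p(0)", "c!p(a)"]}
DOUBLE_ACK = {"p": ["p!c(0)", "p?c(a)", "p?c(a)"],
              "c": ["c?p(0)", "c!p(a)", "c!p(a)"]}
```

Here `preclassify(DOUBLE_ACK, [ends_with_acknowledgment], [two_equal_receipts_in_a_row])` yields `"negative"` without any user interaction, while `ONE_ROUND` is not decided by the patterns and would still be shown to the user.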
3 The Smyle Modeling Approach

It is common knowledge [14] that traditional engineering lifecycle models like the waterfall model [28,29,27,16] or the V-model [27,29] suffer from some severe deficiencies, despite their wide use in today's software development. One of the problems is that both models (implicitly) assume that a complete set of requirements can be formulated at the beginning of the lifecycle. Although both approaches make it possible to revisit a previously passed phase, this is considered a backward step that involves time-consuming reformulation of the documents, models, or code produced and causes high additional costs. The nature of a typical software engineering project is, however, that requirements are usually incomplete, often contradictory, and frequently changing. A high-level design, on the other hand, is typically a complete and consistent model that is expected to conform to the requirements. Thus, especially the step from requirements to a high-level design is a major challenge: the incomplete set of requirements has to be made complete, and inconsistencies have to be eliminated. An impressive example of inconsistencies in industrial-size applications is given by Holzmann [21]: for the design and implementation of a part of Signaling System 7 in the 5ESS® switching system (the ISDN User-Part protocol defined by the CCITT), “almost 55% of all requirements from the original design requirements [...] were proven to be logically inconsistent [...]”. Moreover, later stages of the development process often require additional modifications of the requirements and the corresponding high-level design, either due to changing user requirements or due to unforeseen technical difficulties. Thus, a lifecycle model should support an easy adaptation of the requirements and the conforming design model at later stages as well. The SMA is a new software engineering lifecycle model that addresses these goals.
3.1 A Bird’s-eye View on SMA

The Smyle Modeling Approach (SMA) is a software engineering lifecycle model tailored to communicating distributed systems. A prerequisite is, however, that the participating units (processes) and their communication actions can be fixed in the first steps of the
development process, before actually deriving a design model. Requirements for the behavior of the involved processes, however, may be given vaguely and incompletely at first but are made precise within the process. While clearly not every development project fits these needs, a considerable number of systems, especially in the automotive domain, do.

Within SMA, our goal is to round off requirements, to remove inconsistencies, and to provide methods catering for modifications of requirements in later stages of the software engineering lifecycle. One of the main challenges in achieving these goals is to come up with simple means for concretizing and completing requirements as well as for resolving conflicts in requirements. We attack this intrinsically hard problem using the following rationale:

While it is hard to come up with a complete and consistent formal specification of the requirements, it is feasible to classify exemplifying behavior as desired or illegal. (SMA rationale)

This rationale builds on the well-known experience that human beings prefer to explain, discuss, and argue in terms of example scenarios but are often overstrained when having to give precise and universally valid definitions. Thus, while the general idea of formalizing requirements, for example using temporal logic, is desirable, this formalization is often too cumbersome and therefore not cost-effective, and the result is, unfortunately, often too error-prone. This also justifies our restriction to MSCs without branching, if-then-else, and loops when learning design models: it may be too error-prone to classify complex MSCs as either wanted or unwanted behavior. Our experience with requirements documents shows that especially requirements formulated in natural language are often explained in terms of scenarios expressing wanted or unwanted behavior of the system to be developed.
Additionally, it is evident that it is easier for the customer to judge whether a given simple scenario is intended or not than to answer whether a formal specification matches the customer's needs.

The key idea of SMA is therefore to incorporate the novel learning algorithm Smyle (with supporting tool) [5] for synthesizing design models based on scenarios explaining requirements. Thus, the requirements phase and the high-level design phase are interwoven. Smyle's nature is to extend the initially given scenarios to consider, for example, corner cases: it generates new scenarios whose classification as desired or undesired is indispensable for completing the design model and asks the engineer to classify exactly these scenarios. Thus, the learning algorithm actually causes a natural iteration of the requirements elicitation and design model construction phases. Note that Smyle synthesizes a design model that is indeed consistent with the given scenarios and thus exhibits precisely the scenario behavior.

While SMA's initial objective is to elaborate on the inherent correspondence of requirements and design models by asking for further exemplifying scenarios, it also provides simple means for modifying requirements later in the design process. Whenever, for example in the testing phase, a mismatch between the implementation's behavior and the design model is witnessed that can be traced back to an invalid design model, it can be formulated as a negative scenario and given to the learning algorithm to update the design model. This will, possibly after considering further scenarios, modify the design model to disallow the unwanted behavior. Thus, necessary
modifications of the current software system in later phases of the software engineering lifecycle can easily be fed back to update the design model. This high level of automation is aimed at a substantial reduction of development costs.

3.2 The SMA Lifecycle Model in Detail

The Smyle Modeling Approach, cf. Figure 3, consists of a requirements phase, a high-level design phase, a low-level design phase, and a testing and integration phase. Following modern model-based design lifecycle models, the implementation model is transformed automatically into executable code, as is increasingly done in the automotive and avionics domains. In the following, the main steps of the SMA lifecycle model are described in more detail, with a focus on the phases depicted in Figure 3.

Derivation of a design model. According to Figure 3, the derivation of design models is divided into three steps. The first phase is called the scenario extraction phase.
Fig. 3. The Smyle Modeling Approach: SMA
Based on the usually incomplete system specification, the designer has to infer a set of scenarios which will be used as input to Smyle.¹ In the learning and simulation phase, the designer and the client (referred to as stakeholders in the following) work hand in hand according to the designing-in-pairs paradigm. The advantage is that both specific knowledge about requirements (contributed by the customer) and solutions to abstract design questions (contributed by the designer) coalesce into one model. With its progressive nature, Smyle attempts to derive a model by interactively presenting new scenarios to the stakeholders, who in turn have to classify them as either positive or negative system behavior. Due to the evolution of requirements implied by this categorization, the requirements document should automatically be updated to incorporate the new MSCs. Additionally, the most important scenarios are to be annotated by the user with the reason for the particular classification to complement the documentation. When the internal model is complete and consistent with regard to the scenarios classified by the stakeholders, the learning procedure halts and Smyle presents a frame for simulating and analyzing the current system. In this dedicated simulation component, depicted in Figure 5 (a) and (c), the designer and the customer pursue their designing-in-pairs task and try to obtain a first impression of the system to be by executing events and monitoring the resulting system behavior, depicted as an MSC. In case missing requirements are detected, the simulator can extract a set of counterexample MSCs, which should again be augmented by the stakeholders to complete the documentation. These MSCs are then introduced to Smyle, whereupon the learning procedure continues until reaching the next consistent automaton. The designer then advances to the synthesis and analysis phase, where a distributed model (a CFM) is synthesized in an automated way.
To get diagnostic feedback as early as possible in the software engineering lifecycle, a subsequent analysis phase calls for an intensive analysis of the current design model. Consulting model-checking-like tools² such as MSCan [6], which are designed for checking dedicated properties of communicating systems, might lead to additional knowledge about the current model and its implementability. With MSCan, the designer is able to check for potential deficiencies of the forthcoming implementation, like non-local choice or non-regularity [3,20], i.e., process divergence. The counterexamples generated by MSCan are again MSCs and as such can be fed back to the learning phase. If the customer and the designer are satisfied with the result, the client's presence is no longer required and their direct collaboration terminates. Note that the design model obtained at this stage may also serve as a legal contract for the system to be built.

Enhancing the learning process. While it is hard to come up with a universally valid specification right at the beginning of the design phase, typical patterns of clearly allowed or disallowed MSCs are usually observed during the learning phase. An unclassified MSC has to fulfill all positive patterns and must not fulfill any negative pattern in order to be passed to the designer. In case some positive pattern is not fulfilled or some negative pattern is fulfilled, the scenario is classified as negative without user
¹ It is worthwhile to study the results from [23] in this context, which allow MSCs to be inferred from requirements documents by means of natural language processing tools, potentially yielding (premature) initial behavior.
² Note that currently there are no general-purpose model checkers for CFMs available.
interaction. Roughly speaking, employing a set of formulas in the learning procedure further eases the designer's task because she has to classify fewer scenarios.

Transformation to an implementation model. The engineer's task now is to semi-automatically transform the design model into an implementation model. For this purpose, the SMA proposes to employ tools like Matlab/Simulink, which takes as input, for example, a so-called Stateflow diagram [17] and transforms it into an implementation model. Hence, the manual effort the designer has to perform in the current phase reduces to transforming the CFM (as artifact of the design phase) into the input language (e.g., Stateflow).

Conformance testing. As early as possible, the implementation model should pass a testing phase before being transformed into real code, to lower the risk of severe design errors and supplementary costs. SMA employs model-based testing [10], as it allows a much more systematic treatment by mechanizing the generation of tests as well as the test execution phase. Moreover, in the SMA, the MSCs classified during the learning phase and contained in the requirements document, enriched by additional MSCs, form a natural test suite (a set of tests). If the designer detects a failure during the testing phase, counterexamples are automatically generated, and again the requirements document is updated accordingly to include the new scenarios and the corresponding requirements derived by the designer. Finally, the generated scenarios are introduced into Smyle to refine the model. In practice, model-based testing has been implemented in several software tools and has demonstrated its power in various case studies [11,10].

Synthesis of code and maintenance. Once the process has converged to a final, consistent implementation model, a code generator is employed for generating code skeletons or even entire code fragments for the distributed system.
These fragments then have to be completed by programmers so that the software can finally be installed at the client's site. If new requirements arise after the system has been in operation for some time, the old design model can be upgraded by restarting the SMA.

3.3 SMA vs. Other Lifecycle Models

This section briefly compares the SMA to other well-known traditional and modern lifecycle models. Due to lack of space, an extended comparison including coarse descriptions of the lifecycle models mentioned below can be found in [7].

In contrast to traditional lifecycle models like the well-known waterfall model and V-model, in SMA requirements need not be fixed in advance but can be derived interactively while evolving towards a final conforming and validated model. Intensive simulation and analysis phases reduce the need for costly and time-consuming backward steps during the software development process. While in many processes the documentation is not regularly updated, the SMA provides means for extending this documentation whenever additional information becomes available.

Like several modern lifecycle models such as the spiral model [4] and rapid prototyping [14,29], SMA adopts periodic prototype generation in order to iteratively improve the design model by constantly learning from the insights gained during the previous iteration. In our opinion, however, it has the extra benefit of only demanding a classification of automatically derived scenarios, whereas in other models these scenarios have to be derived
manually first. However, the spiral model describes a more general process, as it aims at developing large-scale projects, while the main application area of SMA is the development of software for embedded systems where the number of communication entities is fixed a priori. Another advantage of SMA compared to, e.g., rapid prototyping is that closing the gaps between requirements and design model does not necessarily require highly experienced and thus very expensive design personnel. Requirements engineers with specific domain knowledge are sufficient because design questions are mainly solved automatically by the learning procedure.

A last model we would like to compare SMA to is the extreme programming model [1], where, similarly, user stories (i.e., scenarios) are planned for implementation in each iteration and regular and early testing phases are stipulated. As a further risk reduction technique, both models employ designing-in-pairs and programming-in-pairs, thus lessening the danger of errors and lowering the costs of possible redesign or implementation.
4 SMA by Example

Our goal now is to derive a model for the well-known Alternating Bit Protocol (ABP). Along the lines of [24,30], we start with a short requirements description in natural language. Examining this description, we identify the participating processes and formulate some initial MSCs exemplifying the behavior of the protocol. These MSCs are used as input for Smyle, which in turn asks us to classify further MSCs before deriving a first model of the protocol. Eventually, we come up with a design model for the ABP matching the model from [30]. However, we refrain from implementing and maintaining the example due to resource constraints.

Problem description. The main aim of the ABP is to assure the reliability of data transmission initiated by a producer through an unreliable FIFO (first-in-first-out) channel to a consumer. Here, unreliable means that data can be corrupted during transmission. We suppose, however, that the consumer is capable of detecting such corrupted messages. Additionally, there is a channel from the consumer to the producer, which is assumed to be reliable. The protocol works as follows: initially, a bit b is set to 0. The producer keeps sending the value of b until it receives an acknowledgment message a from the consumer. This affirmation message is sent some time after a message of the producer containing the message content b is obtained. After receiving such an acknowledgment, the producer inverts the value of b and starts sending the new value until the next affirmation message is received at the producer. The communication can terminate after any acknowledgment a that was received at the producer side.

Fig. 4. Two input scenarios for Smyle
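The protocol described above can be illustrated with a small simulation. The following is only a minimal sketch under simplifying assumptions: the names `run_abp`, `corruption_rate` and `seed` are hypothetical and not part of Smyle or of the original paper, and the exchange is modeled synchronously, so overlapping retransmissions caused by channel latency are not reproduced.

```python
import random

def run_abp(rounds, corruption_rate=0.3, seed=0):
    """Simulate `rounds` successful transmissions and return the message trace.

    Assumption: corruption_rate < 1, otherwise a round never completes.
    """
    rng = random.Random(seed)   # seeded so that runs are reproducible
    bit = 0                     # initially, the bit b is set to 0
    trace = []                  # entries are (sender, receiver, content)
    for _ in range(rounds):
        acknowledged = False
        while not acknowledged:
            trace.append(("p", "c", str(bit)))     # producer (re)sends b
            corrupted = rng.random() < corruption_rate
            if not corrupted:                      # consumer accepted the message
                trace.append(("c", "p", "a"))      # reliable acknowledgment channel
                acknowledged = True
        bit = 1 - bit                              # producer inverts b
    return trace
```

With `corruption_rate=0.0`, two rounds yield the exchange 0, a, 1, a between p and c. With a positive rate, the producer may repeat a bit several times before the acknowledgment arrives.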
Fig. 5. Smyle’s simulation window: (a) intermediate internal model with missing behavior (b) missing scenario (c) final internal model
Applying the SMA. We start by identifying the participating processes in this protocol: the producer p and the consumer c. Next, we turn to the scenario extraction phase and have to come up with a set of initial scenarios. Following the problem description, we first derive the MSC shown in Figure 4 (a). Let us now consider the behavior caused by the non-reliability of the channel. We could imagine that p sends a message 0 but, due to channel latency, does not receive a confirmation within a certain time bound and thus sends a second 0 while the first one is already being acknowledged by c. This yields the MSC in Figure 4 (b).

Within the learning phase, Smyle asks us to classify further scenarios (most of which we can easily reject) before providing a first design model. Now the simulation phase is activated (cf. Figure 5 (a)), where we can test the current model. We execute several events as shown in the right part of Figure 5 (a) and review the model's behavior. We come across an execution where, after an initial phase of sending a 0 and receiving the corresponding affirmation, we expect to observe behavior similar to that in Figure 4 (b) (but now containing the message content b = 1). According to the problem description, this is a feasible protocol execution, but it is not yet contained in our system. Thus, we have encountered a missing scenario. We therefore enter the scenario extraction phase again, formulate the missing scenario (cf. Figure 5 (b)), and input it into Smyle as a counterexample.

As before, Smyle presents further MSCs that we have to classify. Among others, we are confronted with MSCs that (1) do not end with an acknowledgment (cf. Figure 6 (a)) and with MSCs that (2) have two subsequent acknowledgment events (cf. Figure 6 (c)). Neither kind of behavior is allowed according to the problem description. We identify a pattern in each of these MSCs by marking the parts of the MSCs as shown in Figure 6 (a) and (c), yielding the patterns:

1. Every system run has to finish with an acknowledgment a.
2. There must never be two subsequent sends or receipts of an acknowledgment a.

To tell Smyle to discard all MSCs matching the marked patterns, we flag these patterns as unwanted behavior. Thus, the MSCs from Figure 6 (b) and (d) are automatically classified as negative later on. In addition, we reflect these patterns in the requirements documents by adding, for example, the explanation that every system run has to end with an acknowledgment (cf. (1)), together with its formal specification.

Fig. 6. Some patterns for (un)desired behavior

With the help of these two patterns, we continue our learning effort and arrive at the next hypothesis after a total of 55 user queries. Without patterns, we would have needed 70 queries. Moreover, by identifying three more obvious patterns at the beginning of the learning process, we could have inferred the correct design model with only 12 user queries in total. One can argue that this is a high number of scenarios to classify, but this is the price one has to pay for obtaining an exact system and not an approximation (which can indeed be arbitrarily inaccurate) as in related approaches.

At the end of the second iteration, an intensive simulation (cf. Figure 5 (c)) does not give any evidence of wrong behavior. Thus, we enter the analysis phase to check the model with respect to further properties. For example, we check whether the resulting system can be implemented with a maximum channel capacity fixed in advance. MSCan tells us that the system does not fulfill this property. Therefore, we need to add a (fair) scheduler to make the protocol work in practice. According to Theorem 1, a CFM is constructed, which is exactly the one from Figure 2.
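The two patterns identified in this section can be read as predicates over a linearized scenario. The sketch below is an illustration only and does not reflect Smyle's internal pattern representation, which is not described here. The event encoding `(kind, process, message)` and the function names are assumptions, and "two subsequent sends or receipts" is interpreted as two adjacent events on the same process line.

```python
def ends_with_ack(events):
    """Pattern 1: the run finishes with p receiving an acknowledgment a."""
    return bool(events) and events[-1] == ("recv", "p", "a")

def has_double_ack(events):
    """Pattern 2 violated: two adjacent sends (on c) or receipts (on p) of a."""
    for proc, kind in (("c", "send"), ("p", "recv")):
        # Project the scenario onto the events of a single process line.
        line = [(k, m) for (k, p, m) in events if p == proc]
        for first, second in zip(line, line[1:]):
            if first == second == (kind, "a"):
                return True
    return False

def classify(events):
    """Return "negative" if a pattern rules the scenario out, else None.

    None means the patterns do not decide; the user still has to classify.
    """
    if not ends_with_ack(events) or has_double_ack(events):
        return "negative"
    return None
```

A scenario rejected this way never has to be shown to the user, which is how patterns reduce the number of queries.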
5 SMA in an Industrial Case Study

This section examines a real-world industrial case study derived within a project with a Bavarian automotive manufacturer. The main goal of this section is not to present a detailed report of the underlying system and the way the SMA was employed, but to share insights acquired while inferring the design model using the SMA.

The case study describes the functionality of the automotive manufacturer's onboard diagnostic service integrated into their high-end product. If the climate control unit (CCU) of the automobile does not operate as expected, a report is sent to the onboard diagnostic service, which in turn initiates a CCU self-diagnosis and waits for the response to the query. After the reply, the driver has to be briefed about the malfunction of the climate control via the car's multi-information display. The driver is asked to halt at the next gas station, where the onboard diagnostic service communicates the problems to the automotive manufacturer's central server. A diagnostic service is downloaded
from the server and executed locally on the vehicle's onboard computer. The diagnostic routine locates the faulty component within the CCU and sends the problem report back to the central server. In case of a hardware failure, a car garage could be informed and the replacement part reordered to minimize the CCU's downtime. If no hardware failure is detected, a software update (if available) is installed and the CCU is reset. By applying the SMA to the given problem, we were able to infer, in less than one afternoon, a system model fulfilling exactly the requirements imposed by our customer.

Lessons learned. Throughout the entire process, we applied the designing-in-pairs paradigm to minimize the danger of misunderstandings and resulting system flaws. The early feedback from simulation and analysis led to the discovery of missing system behavior and to continuously growing insights, even on our customer's side, into the client's needs. The automated scenario derivation was found to be a major gain because even corner cases (i.e., exceptional scenarios the client did not consider) were covered. As requirements in the SMA are accumulated in an iterative process, growing system knowledge could be applied to derive new patterns that ease the design task and to obtain increasingly elaborate design models. Last but not least, the on-the-fly completion of the requirements document resulted in a complete system description after finishing the design phase, which could then be used as a contract for the final implementation.

Besides all the positive aspects, we also faced inconveniences using the SMA. Finding an initial set of scenarios turned out in some cases to be a difficult task. This could be eased in the future by integrating an approach proposed in [23], where scenarios represented as MSCs are derived from natural language specifications. These could then smoothly be fed to Smyle. Moreover, the simulation facilities have to be improved, for example by allowing for random simulations.
Additional details on lessons learned can be found in [7].
6 Conclusion

This paper presented a software engineering lifecycle model centered around learning and early analysis in the design trajectory. The model has been described, compared with the main development models, and applied to a toy example as well as an industrial one. Further applications are planned to show its feasibility and to refine the method.
References

1. Ambler, S.W., Jeffries, R.: Agile Modeling: Effective Practices for Extreme Programming and the Unified Process. Wiley, Chichester (2002)
2. Belinfante, A., Frantzen, L., Schallhart, C.: Tools for test case generation. In: Model-Based Testing of Reactive Systems, pp. 391–438 (2004)
3. Ben-Abdallah, H., Leue, S.: Syntactic detection of process divergence and non-local choice in message sequence charts. In: Brinksma, E. (ed.) TACAS 1997. LNCS, vol. 1217, pp. 259–274. Springer, Heidelberg (1997)
4. Boehm, B.W.: A spiral model of software development and enhancement. IEEE Computer 21, 61–72 (1988)
5. Bollig, B., Katoen, J.P., Kern, C., Leucker, M.: Replaying play in and play out: Synthesis of design models from scenarios by learning. In: Grumberg, O., Huth, M. (eds.) TACAS 2007. LNCS, vol. 4424, pp. 435–450. Springer, Heidelberg (2007)
6. Bollig, B., Kern, C., Schlütter, M., Stolz, V.: MSCan: A tool for analyzing MSC specifications. In: Hermanns, H. (ed.) TACAS 2006. LNCS, vol. 3920, pp. 455–458. Springer, Heidelberg (2006)
7. Bollig, B., Katoen, J.-P., Kern, C., Leucker, M.: SMA—The Smyle Modeling Approach. Computing and Informatics (2009) (to appear)
8. Bontemps, Y., Heymand, P., Schobbens, P.-Y.: From live sequence charts to state machines and back: a guided tour. IEEE TSE 31, 999–1014 (2005)
9. Brand, D., Zafiropulo, P.: On communicating finite-state machines. J. of the ACM 30, 323–342 (1983)
10. Broy, M., Jonsson, B., Katoen, J.-P., Leucker, M., Pretschner, A. (eds.): Model-Based Testing of Reactive Systems. LNCS, vol. 3472. Springer, Heidelberg (2005)
11. Craggs, I., Sardis, M., Heuillard, T.: Agedis case studies: Model-based testing in industry. In: Eur. Conf. on Model Driven Softw. Eng., pp. 106–117 (2003)
12. Damas, C., Lambeau, B., Dupont, P.: Generating annotated behavior models from end-user scenarios. IEEE TSE 31, 1056–1073 (2005)
13. Damm, W., Harel, D.: LSCs: Breathing life into message sequence charts. Formal Methods in System Design 19(1), 45–80 (2001)
14. Easterbrook, S.M.: Requirements engineering (2004) (unpub. manuscript), http://www.cs.toronto.edu/~sme/papers/2004/FoRE-chapter03-v8.pdf
15. Genest, B., Kuske, D., Muscholl, A.: A Kleene theorem and model checking algorithms for existentially bounded communicating automata. I&C 204(6), 920–956 (2006)
16. Ghezzi, C., Jazayeri, M., Mandrioli, D.: Fundamentals of Software Engineering, 2nd edn. Prentice-Hall, Englewood Cliffs (2002)
17. Hamon, G., Rushby, J.M.: An operational semantics for Stateflow. In: Wermelinger, M., Margaria-Steffen, T. (eds.) FASE 2004. LNCS, vol. 2984, pp. 229–243. Springer, Heidelberg (2004)
18. Harel, D.: Can programming be liberated, period? Computer 41, 28–37 (2008)
19. Harel, D., Marelly, R.: Come, Let's Play. Springer, Heidelberg (2003)
20. Henriksen, J.G., Mukund, M., Kumar, K.N., Sohoni, M., Thiagarajan, P.S.: A theory of regular MSC languages. Inf. and Comput. 202(1), 1–38 (2005)
21. Holzmann, G.J.: The theory and practice of a formal method: Newcore. In: IFIP Congress (1), pp. 35–44 (1994)
22. ITU: ITU-TS Recommendation Z.120 (04/04): Message Sequence Chart (2004)
23. Kof, L.: Scenarios: Identifying missing objects and actions by means of computational linguistics. In: 15th IEEE RE, pp. 121–130 (2007)
24. Lynch, N.: Distributed Algorithms. Morgan Kaufmann, San Francisco (1997)
25. Mäkinen, E., Systä, T.: MAS – An interactive synthesizer to support behavioral modeling in UML. In: ICSE, pp. 15–24. IEEE Computer Society, Los Alamitos (2001)
26. Nuseibeh, B., Easterbrook, S.: Requirements engineering: a roadmap. In: ICSE, pp. 35–46. ACM, New York (2000)
27. Pressman, R.S.: Software Engineering: A Practitioner's Approach. McGraw-Hill, New York (2004)
28. Royce, W.: Managing the development of large software systems: concepts and techniques. In: ICSE, pp. 328–338. IEEE CS Press, Los Alamitos (1987)
29. Sommerville, I.: Software Engineering, 8th edn. Addison-Wesley, Reading (2006)
30. Tanenbaum, A.S.: Computer Networks. Prentice Hall, Englewood Cliffs (2002)
31. Uchitel, S., Brunet, G., Chechik, M.: Behaviour model synthesis from properties and scenarios. In: ICSE, pp. 34–43. IEEE Computer Society, Los Alamitos (2007)
32. Uchitel, S., Kramer, J., Magee, J.: Synthesis of behavioral models from scenarios. IEEE TSE 29, 99–115 (2003)
Open Work of Two-Hemisphere Model Transformation Definition into UML Class Diagram in the Context of MDA

Oksana Nikiforova and Natalja Pavlova

Department of Applied Computer Science, Riga Technical University, Riga, Latvia
{oksana.nikiforova,natalja.pavlova}@rtu.lv
Abstract. Model Driven Architecture (MDA) is based on models and distinguishes between the specification of system functionality and the realization of this specification on a given technological platform. MDA consists of four models: CIM (Computation Independent Model), PIM (Platform Independent Model), PSM (Platform Specific Model) and code model. All of these are parts of the MDA transformation line: CIM->PIM->PSM->code. A PIM has to be created using a language that is able to describe a system from various points of view: system behavior, the system's business objects, system actors, system use cases and so on. This paper discusses the application of the two-hemisphere model to the construction of a UML class diagram as a part of the PIM. Several solutions for determining elements of the class diagram from the two-hemisphere model are currently being researched and are described in the paper. The application of the transformations is illustrated by an example from the insurance problem domain.

Keywords: Business process diagram, class diagram, MDA, PIM, relationships among classes.
Z. Huzar et al. (Eds.): CEE-SET 2008, LNCS 4980, pp. 118–130, 2011. © IFIP International Federation for Information Processing 2011

1 Introduction

Model Driven Architecture (MDA) is a framework being built under the supervision of the Object Management Group (OMG) [1]. MDA separates system business aspects from system implementation aspects. MDA defines the approach and tool requirements for specifying systems independently of platforms, specifying platforms, choosing particular platforms for the systems, and transforming specifications of business domains into specifications that include information specific to the chosen platforms. MDA proposes a software development process in which the key notions are models and model transformations [2]. In this process, software is built by constructing one or more models and transforming them into other models. The common view of this process is as follows: the input is platform independent models and the output is platform specific models. The platform specific models can be easily transformed into an executable format [3]. There is also a more generic view of Model Driven Architecture [4], in which the difference between platform independent models and platform specific models is not dominant. The key of this view is that the software development process is
implemented by an intricate sequence of transformation executions that are combined in various ways. This makes system development much more open and flexible [3]. The MDA idea of raising the level of abstraction on which systems are developed is promising: it should make it possible to develop qualitatively more complex systems. The basic goal of recent research in the area of MDA is to achieve a system representation that corresponds to business requirements at the highest possible level of abstraction. Nowadays, MDA tools support the formalization of transformations between the PIM and PSM stages, and researchers try to “raise” it as high as possible to fulfill the main statement of MDA [5], [6].

The paper shows the state of the art in applying the two-hemisphere model to the generation of elements of the class diagram and presents several solutions that address the problems stated above and try to “raise” the level of transformations to transformations inside the PIM. Section 2 describes the two-hemisphere model and defines its main components suitable for generating elements of a UML class diagram. The structure of the source and target models is also defined in Section 2. Section 3 defines all possible transformations from elements of the two-hemisphere model into elements of the UML class diagram, which is defined as the target. A practical experiment with the transformations from the two-hemisphere model into the class diagram is carried out in Section 4. Overall statements of the research are discussed in the conclusions.
2 Model Transformations in Terms of Two-Hemisphere Model

The two-hemisphere model driven approach [7] proposes using business process modeling and concept modeling to represent systems in a platform independent manner and describes how to transform business process models into several elements of UML diagrams. The strategy was first proposed in [8], where the general framework for object-oriented software development was presented and the idea of using two interrelated models for software system development was stated and discussed. The two-hemisphere approach proposes starting the software development process from a two-hemisphere problem domain model, where one model reflects functional (procedural) aspects of the business and software system, and the other model reflects the corresponding concept structures. The co-existence and inter-relatedness of the models enables knowledge transfer from one model to the other, as well as the use of particular knowledge completeness and consistency checks [7].

MDA introduces an approach to system specification that separates the views on three different layers of abstraction: a high-level specification of what the system is expected to do (Computation Independent Model or CIM); the specification of system functionality (Platform Independent Model or PIM); and the specification of the implementation of that functionality on a specific technology platform (Platform Specific Model or PSM). Currently available methods do not support a formal transformation from CIM to PIM in which the PIM would be sufficient for PSM generation [5]. The PIM received from the CIM should be refined in order to obtain the correct transformation into the PSM. Moreover, investigation of the PIM requires a more detailed description of it. That is why it was decided to divide the solution domain into the two above-mentioned levels: the first one closer to the CIM and the second one closer to the PSM. As
previously mentioned, the second level of the solution domain is an application level. Because this level is closer to the PSM, models with further automation should be represented here. The model on the application level of the solution domain is the class diagram. A model on the application level should be sufficient for formal transformation into the Platform Specific Model. Therefore, data structure, attributes, system functions and main algorithms should be presented in the PIM in a form ready for transformation into the PSM [5].

The CIM presents the specification of the system at the problem domain level and can be transformed into elements of the PIM. The PIM provides a formal specification of the system structure and functions that abstracts from technical details and thus presents solution aspects of the system to be developed, which enables model transformation to the platform level (PSM), named the implementation domain in Figure 1.

Left column: Definition of MDA principles in terms of MDA
Right column: Definition of MDA principles in terms of 2HMD approach for software architecture development

Problem domain (CIM)
  In terms of MDA: Presentation of problem domain elements suitable for further transformation
  In terms of 2HMD: Process model and conceptual model of problem domain

Derivation of solution elements from problem domain
  In terms of 2HMD: Derivation of automated processes from process model and definition of structure of their information flow based on concepts in conceptual model

Solution domain (PIM)
  In terms of MDA: Presentation (and required transformations) of derived elements suitable for further transformation
  In terms of 2HMD: Selected processes and their information flow structure

Transformation of solution elements into implementable elements
  In terms of 2HMD: Application of 2HMD transformation algorithm for defined processes and concepts

Implementation domain (PSM)
  In terms of MDA: Presentation (and required transformations) of implementation elements suitable for further transformation
  In terms of 2HMD: Development of class diagram based on elements received after 2HMD transformation algorithm application
Fig. 1. Model transformation from the problem domain level of knowledge representation into the implementation domain level according to the 2HMD approach
The details in the right column of the table in Figure 1 correspond to the two-hemisphere approach, which addresses the construction of information about the problem domain by using two interrelated models at the problem domain level, namely the process model and the conceptual model. The conceptual model is used in parallel with the process model to cross-examine the software developers' understanding of procedural and semantic aspects of the problem domain. The main idea of MDA is to achieve a formal system representation at as high a level of abstraction as possible. One of the most important and problematic stages in MDA realization is the derivation of PIM elements from a problem domain and the construction of the PIM in a form that is suitable for the PSM [5]. It is necessary to find a way to develop the PIM using a formal representation while keeping the level of abstraction high enough. The PIM should represent static and dynamic aspects of the system. The class diagram shows the static structure of the developed system and is the central component of the PIM. But UML is a modeling language and does not have all the means to specify the context and the way of modeling, which always has to be
defined in a methodology. Therefore, the construction of the class diagram has to be based on well-defined rules for generating its elements from a problem domain model presented in a form suitable for that purpose. The MDA framework implies system development based on modeling, not on programming activities. System development is divided into three stages according to the level of abstraction. Every stage is denoted by a model. The model is often presented as a combination of drawings and text [1]. A transformation tool or approach takes a model as input and creates another model as output, see Figure 2 [4]. The two-hemisphere model, together with the mapping rules, is the input; the class diagram and the transformation trace are the output. The transformation trace shows how an element of the two-hemisphere model is transformed into the corresponding element of the class diagram, and which parts of the mapping are used for the transformation of every part of the two-hemisphere model [1]. Figure 2 shows how a transformation tool takes the two-hemisphere model as input and produces the class diagram as output.
Fig. 2. Structure of model transformation tool in the framework of MDA
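The structure of such a transformation tool (source model and mapping rules in, target model and transformation trace out) can be sketched as follows. This is only an illustration under stated assumptions: the function `transform`, the `(kind, name)` encoding of model elements and the contents of `RULES` are hypothetical simplifications, and the data store name "Archive" in the usage note is invented for the example.

```python
def transform(source_elements, mapping_rules):
    """Apply mapping rules to (kind, name) pairs; return (target, trace)."""
    target, trace = [], []
    for kind, name in source_elements:
        target_kind = mapping_rules.get(kind)
        if target_kind is None:
            # No rule for this kind: record in the trace that nothing was produced.
            trace.append(((kind, name), None))
            continue
        element = (target_kind, name)
        target.append(element)
        # The trace links every source element to the element derived from it.
        trace.append(((kind, name), element))
    return target, trace

# Illustrative rules only (assumption); the actual mapping is defined in Section 3.
RULES = {"concept": "class", "attribute": "attribute",
         "process": "operation", "performer": "actor"}
```

For instance, `transform([("concept", "Policy"), ("process", "Pay sum"), ("data store", "Archive")], RULES)` yields a class and an operation, while the trace records that the data store was left unmapped.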
All elements of the source model are shown in Table 1. These are the elements of the business process model and the concept model. The notation of the business process model is optional; however, it must reflect the following components of the business process model: processes, performers, information flows, and information (data) stores [7]. Real-world classes relevant to the problem domain and their relationships are presented in the concept model. It is a variation of the well-known ER diagram notation [9] and consists of concepts (i.e., entities or objects) and their attributes. The notational conventions of the business process diagram make it possible to relate concepts in the concept model to information flows (e.g., events) in the process model. The elements of the source model are listed in the first column of Table 1. The second column describes the main elements of the source model. The elements of the source model are important for further system analysis and design. The information carried by these elements is significant, and none of it should remain unused at the lower levels of abstraction. Business processes or tasks represent system functionality, its activities and operations. If this information is lost, the system functions needed for implementing the operations of classes have to be found again in the description of the problem domain. Events, data stores and data objects are parts of the data structure model. These elements represent the initial static structure in the two-hemisphere model, which should be carried over into the class diagram. Associations and attributes in the concept model are useful for defining relationships between objects of the system's static structure (classes).
Table 1. Elements of source model

Element of source model: Business process diagram / Process (name, description, triggering condition, performer expression, duration, start option, end option, no start option, tag assignment)
Description: Business Process usually means a chain of tasks that produce a result which is valuable to some hypothetical customer. A business process is a gradually refined description of a business activity (task) [10].

Element of source model: Business process diagram / Event (name, transfer name, set option, repeat option)
Description: Events are defined (as a rule) in the moment when they are mentioned for the first time in BP or TD diagrams. Events are an input/output object (or more precisely, the arrival of an input object and departure of an output object) of certain business process. These objects can be material things or just information [10].

Element of source model: BP diagram / Data store (store name, comment, ER model<ER name>)
Description: The data store is a persistent (independent of the current task) storage of data or materials. In the case of an information system, the data store most likely will be converted to a database with a certain data structure (Entity Relationship Model). On the highest levels of business models, the data store can be used to denote an archive of data or it can also be used to represent a warehouse or stock of goods [10].

Element of source model: Concept model / Concept (name)
Description: Conceptual classes that are software (analysis) class candidates in essence. A conceptual class is an idea, thing, or object. A conceptual class may be considered in terms of its symbols (words or images), intensions (definitions), and extensions (the set of examples) [11].

Element of source model: Concept model / Attribute (name, type)
Description: An attribute is a logical data value of an object [11].
The elements of the target model are listed in Table 2. Only the main elements of the class diagram are shown there. It is necessary to find the way how source model elements can be transformed into target model elements according to the definition of transformations in the framework of MDA.
Table 2. Elements of target model

Element of target model: Class diagram / Class
Description: A class is the descriptor for a set of objects with similar structure, behavior, and relationships [11].

Element of target model: Class diagram / Class / Attribute (name, type)
Description: An attribute is a logical data value of an object [11].

Element of target model: Class diagram / Class / Operation (name, return type, argument, precondition, postcondition)
Description: The UML formally defines operations. To quote: "An operation is a specification of a transformation or query that an object may be called to execute" [12].

Element of target model: Class diagram / Relationship (type, multiplicity, role)
Description: A relationship between instances of the two classes. There is an association between two classes if an instance of one class must know about the other in order to perform its work [11].

Element of target model: Class diagram / Class / Stereotype
Description: Stereotypes, which provide a way of extending UML, are new kinds of model elements created from existing kinds [11].

Element of target model: Class diagram / Constraint
Description: A constraint is a condition that every implementation of the design must satisfy [11].
3 Application of Two-Hemisphere Model for Obtaining of Elements of Class Diagram

In this research, the source model is the two-hemisphere model, i.e., the business process and concept diagrams, and the target model is the UML class diagram. The possible combinations of transitions from the two-hemisphere model to the class diagram are discussed in this section. Any transformation requires an input and an output. The transformation discussed here is a transformation inside the PIM, one of the MDA models. In other words, it is a transformation from the two-hemisphere model to the class diagram.

3.1 General Schema of Transformation Abilities

The detailed transformations between models proposed in this paper for the application of the two-hemisphere model are shown in Figure 3. The two-hemisphere model consists of a business process model (graph G1 in Figure 3) and a concept model (graph G2 in Figure 3). The notation of the business process model is not significant; the main requirement is that the notation makes it possible to define business processes, performers, events and data flows among business processes. The current research uses a business process model constructed with the GRAPES [10] notation. The second hemisphere is the concept model. C. Larman defines the concept model as follows: “The concept model captures real-world concepts (objects), their attributes, and the associations between these concepts.” [11].
In the two-hemisphere model, the authors avoid relations between classes in the concept model at the business level (the problem domain); the relations are defined according to the system realization at the software level (the implementation domain). To perform the transformation to the class diagram, an intermediate model (graph G3 in Figure 3) is introduced. The intermediate model is used to simplify the transition between the business process model and the object interaction model, which is presented in the form of a UML collaboration diagram (graph G4 in Figure 3). Figure 3 shows all the transformations from the business process model (G1) and the concept model (G2) into the class diagram (G5). The transformations are based on the hypothesis that elements of the class diagram can be obtained from the two-hemisphere model by applying defined techniques of graph transformation [13].
Fig. 3. Essence of application of two-hemisphere model for generation of elements of class diagram
The intermediate model is generated from the business process model using methods of directed graph transformation, in which arcs of one graph (G1 in Figure 3) are transformed into nodes of another graph (G3 in Figure 3), and nodes of the first graph (G1) are transformed into arcs of the second graph (G3) [14]. Figure 3 presents the sequence of transformations from the two-hemisphere model to the class diagram with dotted arrows. The arrows show how the following elements are obtained:

• Business process “perform action 1” is transformed into the arc “perform action 1” of the intermediate model (graph G3). The next transformation creates the method “perform action 1()” in the collaboration diagram (graph G4) from the arc of the intermediate model. The last transformation of this business process defines the class responsible for this method in the class diagram (graph G5).
• The element “performer 1” is transformed into a node of the intermediate model and into “actor 1” of the collaboration diagram. This element is defined as “actor 1” in the class diagram.
• The data type of the elements “event 1” and “event 3” is defined as “DataType A” or “Concept A” of the concept model. Events are transformed into nodes of the intermediate model and then into objects like “Event1: Class A” in the collaboration diagram, which serves as a base for defining the classes of the class diagram.
• All attributes of classes are determined based on the attributes defined in the concept model.

A summary of the mapping of the source model into the target model is shown in Figure 4.
Fig. 4. Mapping of elements of source model into elements of target model
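The node and arc swap described above can be illustrated with a minimal Python sketch. It is one possible reading of the transformation, not the rule set of [14]: every information flow (arc) of the process graph becomes a node, and every process (node) becomes an arc between each pair of its incoming and outgoing flows. The process names “receive request” and “archive” and the helper `to_intermediate` are hypothetical and serve only as illustration.

```python
from collections import namedtuple

# An information flow (arc) of the business process graph.
Flow = namedtuple("Flow", ["name", "source", "target"])
# An arc of the intermediate model: a process linking two flow nodes.
Arc = namedtuple("Arc", ["name", "source", "target"])


def to_intermediate(processes, flows):
    """Swap the roles of nodes and arcs of a directed process graph.

    Every flow becomes a node; every process becomes one arc for each
    pair (incoming flow, outgoing flow) that meets at that process.
    """
    nodes = sorted(flow.name for flow in flows)
    arcs = []
    for process in processes:
        incoming = [f.name for f in flows if f.target == process]
        outgoing = [f.name for f in flows if f.source == process]
        for src in incoming:
            for dst in outgoing:
                arcs.append(Arc(process, src, dst))
    return nodes, arcs


# Hypothetical fragment: two events around the process "perform action 1".
processes = ["receive request", "perform action 1", "archive"]
flows = [
    Flow("event 1", "receive request", "perform action 1"),
    Flow("event 3", "perform action 1", "archive"),
]
nodes, arcs = to_intermediate(processes, flows)
print(nodes)
print(arcs)
```

In this reading, processes without both an incoming and an outgoing flow produce no arc, which is a simplification of the sketch rather than a statement about the original approach.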
As can be seen in Figure 4, not all the elements of the class diagram defined in Table 2 are obtained from the two-hemisphere model: stereotype and constraint are still under research. However, the main components of the class diagram, namely classes, operations, methods and different types of relationships between classes, are generated from different elements of the two-hemisphere model or their combinations. An illustrative example of the definition of classes and their attributes and operations is described in Section 3.2, and the transformation rules for the definition of different types of relationships between classes are discussed in detail in [15]. Conversely, not all the elements of the two-hemisphere model are used for identification of elements of the class diagram: data stores are omitted because they are duplicated by the element “task” with a different meaning.

3.2 An Illustrative Example of Class Diagram Generation from Two-Hemisphere Model

For better understanding of the main idea, an example of such model transformations is shown for a fragment of a problem domain concerned with insurance activities [16]. Figure 5 presents only a fragment of the transformation. There is one process, “Pay sum”, which has the output “policy”. The concept “policy” defines the data type for the output of the process. It is transformed into a fragment of the intermediate model with arc “Pay sum” and node “Policy”. The intermediate model makes it possible to obtain the collaboration diagram, where the initial process “Pay sum” is a method of the object “Policy”. The concrete object “Policy” belongs to the class “Policy”, which is defined by the corresponding concept. Once a collaboration of objects is defined, it is possible to construct the class diagram according to the rules of object-oriented system modeling [8].
Fig. 5. An example of process and concept elements transformation into class elements
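The “Pay sum” example can be traced in a few lines of Python. This is only a sketch under simplifying assumptions: each process has exactly one output, the concept typing that output becomes the class, and the process becomes a method of that class. The attribute names `number` and `sum` and the names `ClassSpec` and `classes_from_processes` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ClassSpec:
    """A class of the generated class diagram."""
    name: str
    attributes: list = field(default_factory=list)
    methods: list = field(default_factory=list)


def classes_from_processes(process_outputs, concepts):
    """Derive class specifications from processes and concepts.

    process_outputs maps a process name to the concept typing its output;
    concepts maps a concept name to its attribute names.
    """
    classes = {}
    for process, concept in process_outputs.items():
        spec = classes.setdefault(
            concept, ClassSpec(concept, list(concepts.get(concept, [])))
        )
        # The process becomes a method of the object that it outputs.
        spec.methods.append(process.lower().replace(" ", "_") + "()")
    return classes


concepts = {"Policy": ["number", "sum"]}   # attribute names are hypothetical
process_outputs = {"Pay sum": "Policy"}
policy = classes_from_processes(process_outputs, concepts)["Policy"]
print(policy)
```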
4 Practical Experiment with the Processing of Transformations from Two-Hemisphere Model into Class Diagram

During the investigation of obtaining the class diagram from the two-hemisphere model, all possible combinations of the number and types of incoming and outgoing information flows of process nodes were examined [8]. Different combinations make it possible to obtain different relations among classes.

The business process modeling tool GRADE [17] makes it possible to construct two interrelated models (the business process model and the concept model) and to generate a text description of the models with a fixed structure. It was therefore chosen as the tool for developing the two-hemisphere model and for generating the text files that define all the elements of the model and their relations to each other. The generated text files serve as input to the algorithm that processes the transformations among graphs defined in Section 3. The result is an XML file that contains a description of the structure of the class diagram generated from the source model. The XML format of the class specification makes it possible to obtain a visual representation of the class diagram in any class diagram development tool that supports import from XML.

To check that the proposed transformations are independent of the problem domain, an experiment with the two-hemisphere model of insurance was performed. The transformations were applied to generate a class diagram from the two-hemisphere model developed for the insurance problem domain (shown in Figure 6); the resulting class diagram is shown in Figure 7.
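The last step of this tool chain, writing the generated class structure to XML, might look as follows. The sketch uses Python's standard `xml.etree.ElementTree` module. The element and attribute names (`classDiagram`, `class`, `attribute`, `operation`, `name`) are assumptions made for illustration; the paper does not specify the schema produced by the authors' tool.

```python
import xml.etree.ElementTree as ET


def class_diagram_to_xml(classes):
    """Serialise {class name: (attributes, operations)} into an XML string.

    The tag names used here are hypothetical, not the authors' format.
    """
    root = ET.Element("classDiagram")
    for name, (attributes, operations) in classes.items():
        cls = ET.SubElement(root, "class", name=name)
        for attribute in attributes:
            ET.SubElement(cls, "attribute", name=attribute)
        for operation in operations:
            ET.SubElement(cls, "operation", name=operation)
    return ET.tostring(root, encoding="unicode")


xml_text = class_diagram_to_xml({"Policy": (["number"], ["pay_sum"])})
print(xml_text)
```

A modelling tool with XML import would need a mapping from such a file to its own exchange format; that mapping is outside the scope of this sketch.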
Fig. 6. Initial business process and concept models for Insurance problem domain
Fig. 7. Class diagram for Insurance business
Since the business process and concept models are the built-in demo example for system development in GRADE, the authors assume that the models (i.e. the source information) are correct and were constructed independently of the authors' participation. Therefore the experiment can be regarded as an unbiased verification. The class diagram in Figure 7 has undefined relations and unrelated classes, for which additional, more detailed business process models are required. The classes highlighted in gray in Figure 7 are classes that have a restriction. At the current level of detail it is impossible to define the relations of these classes and the ownership of the method “manage_assets ()” without creating a sub-process diagram for the corresponding fragment of the business process. After the detailed elaboration of the process it is possible to apply the transformations once more and obtain more correct relations among classes. The restriction of the graph transformation with multiple inputs and multiple outputs makes it possible to identify processes that require additional detailed elaboration. This feature is implemented only partly, by reporting the places of restriction in the output text file. Visualization indicating such processes is not implemented, because the tool developed for processing the transformations from the two-hemisphere model into the class diagram does not yet support diagramming, but the generated XML code makes it possible to create the class diagram with any tool that supports import from XML.
5 Conclusions

The elements of the class diagram are obtained in a formal way during the transformation from the two-hemisphere model into the class diagram. Elements of the business process and concept models are used to generate the class diagram elements. Obtaining the elements of the class diagram makes it possible to define a class diagram at the conceptual level, which could serve as a base for further development of the system architecture.
Not all elements of the class diagram are obtained from the two-hemisphere model at the current stage of research. The definition of such elements as operation arguments, operation return types, stereotypes, constraints and so on is still being researched. It is possible that defining these elements will require an extension of the initial two-hemisphere model. The proposed transformations were applied to the two-hemisphere model of insurance, and classes with attributes and different kinds of relationships were identified based on elements of the process and concept models. The ability to define all the types of transformations in a formal way makes it possible to automate the process of class diagram development from a correct and precise two-hemisphere model. The title of the paper begins with “Open work of …”, which means that the research is still under development. The authors are trying to find a way to obtain the remaining elements of the class diagram and, moreover, to define the system dynamics in a more precise way.
Acknowledgements

The research reflected in the paper is supported by research grant No. R7389 of the Latvian Ministry of Education and Science in cooperation with Riga Technical University, “Development of tool prototype for generation of software system class structure based on two-hemisphere model”, and by the European Social Fund within the National Programme “Support for the carrying out doctoral study programs and postdoctoral researches”.
References

1. MDA Guide Version 1.0.1, http://www.omg.org/docs/omg/03-05-01.pdf
2. Kent, S.: Model driven engineering. In: Butler, M., Petre, L., Sere, K. (eds.) IFM 2002. LNCS, vol. 2335, p. 286. Springer, Heidelberg (2002)
3. Kleppe, A.: MCC: A model transformation environment. In: Proceedings of the ECMDA, pp. 173–187. Springer, Heidelberg (2006)
4. Kleppe, A., Warmer, J., Bast, W.: MDA Explained: The Model Driven Architecture: Practice and Promise, p. 192. Addison Wesley, Reading (2003)
5. Nikiforova, O., Kuzmina, M., Pavlova, N.: Formal Development of Platform Independent Model in the Framework of MDA: Myth or Reality. In: Scientific Proceedings of Riga Technical University, 5th Series, Computer Science, Applied Computer Science, vol. 22, pp. 42–53. RTU, Riga (2005)
6. Pavlova, N.: Several Outlines of Graph Theory in Framework of MDA. In: Magyar, G., Knapp, G., Wojtkowski, W., Wojtkowski, W.G., Zupancic, J. (eds.) Advances in Information Systems Development, New Methods and Practice for the Networked Society, vol. 2, pp. 25–36. Springer Science+Business Media, LLC (2007)
7. Nikiforova, O., Kirikova, M.: Two-Hemisphere Model Driven Approach: Engineering Based Software Development. In: Persson, A., Stirna, J. (eds.) CAiSE 2004. LNCS, vol. 3084, pp. 219–233. Springer, Heidelberg (2004)
8. Nikiforova, O.: General Framework For Object-Oriented Software Development Process. In: Proceedings of Conference of Riga Technical University, Computer Science, Applied Computer Systems, 3rd Thematic Issue, Riga, Latvia, pp. 132–144 (2002)
9. Chen, P.: The entity relationship model – towards a unified view of data. ACM Trans. Database Systems 1, 9–36 (1976)
10. GRADE Business Modeling, Language Guide. INFOLOGISTIK GmbH (1998)
11. Larman, C.: Applying UML And Patterns: An Introduction To Object-Oriented Analysis And Design. Prentice Hall, New Jersey (2000)
12. Rumbaugh, J., Jacobson, I., Booch, G.: The unified modeling language reference manual. Addison-Wesley, Reading (1999)
13. Grundspenkis, J.: Causal Domain Model Driven Knowledge Acquisition for Expert Diagnosis System Development. Kaunas University of Technology Press, Kaunas (1997)
14. Pavlova, N., Nikiforova, O.: Formalization of Two-Hemisphere Model Driven Approach in the Framework of MDA. In: Proceedings of the 9th Conference on Information Systems Implementation and Modeling, Czech Republic, Prerov, pp. 105–112 (2006)
15. Nikiforova, O., Pavlova, N.: Foundations of Generation of Relationships between Classes Based on Initial Business Knowledge. In: The Proceedings of the 17th International Conference on Information Systems Development (ISD2008). Towards a Service-Provision Society (accepted for publication) (2008)
16. Pavlova, N.: Approach for Development of Platform Independent Model in the Framework of Model Driven Architecture, Ph.D. thesis, Riga Technical University (2008)
17. GRADE tools, GRADE Development Group (2006), http://www.gradetools.com/
HTCPNs–Based Tool for Web–Server Clusters Development

Slawomir Samolej¹ and Tomasz Szmuc²

¹ Department of Computer and Control Engineering, Rzeszow University of Technology, ul. W. Pola 2, 35-959 Rzeszów, Poland
² Institute of Automatics, AGH University of Science and Technology, al. Mickiewicza 30, 30-059 Kraków, Poland
[email protected], [email protected]
Abstract. A new software tool for web–server cluster development is presented. The tool consists of a set of predefined Hierarchical Timed Coloured Petri Net (HTCPN) structures, called patterns. The patterns make it possible to naturally construct typical and experimental server–cluster structures. The basic patterns are executable queueing systems. A simulation-based methodology of web–server model analysis and validation is proposed. The paper focuses on presenting the construction of the software tool and the guidelines for applying it in cluster–based web–server development.

Keywords: Hierarchical Timed Coloured Petri Nets, Web–Server Clusters, Performance Evaluation.
1 Introduction
The Internet is gradually becoming the most important medium for conducting business, selling services and remotely controlling industrial processes. Typical modern software applications have a client–server logical structure in which the predominant role is played by an Internet server offering data access or computation abilities to remote clients. The hardware of an Internet or web–server is now usually designed as a set of (locally) deployed computers, a server cluster [3,8,13,19]. This design approach makes it possible to distribute services among the nodes of a cluster and to improve the scalability of the system. The redundancy that intrinsically exists in such a hardware structure provides higher system dependability.

To improve the quality of service of web–server clusters, two main research paths are followed. First, the software of individual web–server nodes is modified to offer an average response time to dedicated classes of consumers [7,11,12]. Second, some distribution strategies of cluster nodes are investigated [3] in conjunction with searching for load balancing policies for the nodes [5,18,21]. In several research projects reported in [8,17,19], load balancing algorithms and modified cluster node structures are analyzed together.

It is worth noticing that in some of the abovementioned works the search for a solution to the problem goes together with the search for an adequate formal language to express the system developed [2,8,17,18,19,21]. In [2,18,19,21] Queueing Nets, whereas in [17] Stochastic Petri Nets, are applied for system model construction and examination. However, the most mature and expressive language proposed for web–cluster modelling seems to be Queueing Petri Nets (QPNs) [8]. These nets combine coloured and stochastic Petri nets with queueing systems [1] and consequently make it possible to model relatively complex web–server systems in a concise way. Moreover, there exists a software tool for simulating the nets [9]. The research results reported in [8] include a systematic approach to applying QPNs in modelling and evaluating distributed applications. The modelling process has been divided into the following stages: modelling of system components and resources, workload modelling, modelling of intercomponent interactions and processing steps, and finally model parameterization. The final QPN–based model can be executed and used to predict the performance of the modelled system.

Z. Huzar et al. (Eds.): CEE-SET 2008, LNCS 4980, pp. 131–142, 2011. © IFIP International Federation for Information Processing 2011

The successful application of QPNs in web–cluster modelling became the motivation for the research reported in this paper. The aim of the research is to provide an alternative methodology and software tool for the development of cluster–based hardware/software systems.
The main features of the methodology are as follows:

– The modelling language will be Hierarchical Timed Coloured Petri Nets (HTCPNs) [6],
– A set of so-called HTCPN design patterns (predefined net structures) will be prepared and validated to model typical web cluster components,
– The basic patterns will be executable models of queueing systems,
– A set of design rules will be provided to cope with the patterns during the system model creation,
– The final model will be an executable and analyzable Hierarchical Timed Coloured Petri Net,
– The well-established Design/CPN and CPN Tools software toolkits will be used for the design patterns construction and validation,
– The toolkits will also be used as a platform for the web–server modelling and development,
– Performance analysis modules of the toolkits will be used for capturing and monitoring the state of the net during execution.

The choice of the HTCPN formalism as a modelling language comes from the following premises. First, HTCPNs have an expressive power comparable to QPNs. Second, the available software toolkits for HTCPN composition and validation seem to be more popular than “SimQPN” [9]. Third, there exists a rich knowledge base of successful HTCPN applications in modelling and validation of a wide range of software/hardware systems [6], including web–servers [13,14,20]. The remaining features of the design methodology introduced in this paper result both from the generally known capabilities of software toolkits for HTCPN modelling and from previous experience gained by the authors in applying HTCPNs to real–time systems development [15,16].

This paper is organized as follows. Section 2 describes some selected design patterns and rules for applying them to web–server cluster model construction. An example queueing system, a web–server subsystem and top–level system models are presented. Section 3 touches on simulation–based methods of HTCPN model validation. Conclusions and a future research program complete the paper. It has been assumed that the reader is familiar with the basic principles of Hierarchical Timed Coloured Petri Nets theory [6]. All the Coloured Petri Nets in the paper have been edited and analysed using the Design/CPN tool.
2 Cluster Server Modelling Methodology
The main concept of the methodology lies in the definition of reusable timed coloured Petri net (TCPN) structures (patterns) that make it possible to compose web–server models in a systematic manner. The basic set of patterns includes TCPN implementations of typical queueing systems, e.g. –/M/PS/∞ and –/M/FIFO/∞ [13,14]. Packet distribution TCPN patterns constitute the next group of reusable blocks. Their primary role is to provide some predefined web–server cluster substructures composed of the queueing systems. At this stage of subsystem modelling the queueing systems are represented as substitution transitions (compare [13,14]). Separate models of system arrival processes are also members of this group. The packet distribution patterns, represented as substitution transitions, are in turn used for the composition of the general top–level system model. As a result, a 3–level web–server model composition has been proposed. The top–level TCPN represents the general view of the system components. The middle–level TCPN structures represent the interconnections of the queueing systems. The lowest level includes executable queueing system implementations.

The modelling methodology assumes that the actual state of Internet request servicing in the system can be monitored. Moreover, from the logical point of view the model of the server cluster is an open queueing network, so the requests are generated, serviced and finally removed from the system. As a result, an important component of the software tool for server cluster development is the logical representation of the requests.

In the next subsections the following features of the modelling methodology will be explained in detail. First, the logical representation of Internet requests will be shown. Second, queueing system modelling rules will be explained. Third, an example cluster subsystem with an individual load–balancing strategy will be proposed. Fourth, the structure of the Internet request generator will be examined. Finally, the top–level HTCPN structure of an example cluster–server model will be shown.

2.1 Logical Request Representation
In the server–cluster modelling methodology introduced in this paper, the structure of the HTCPN represents the hardware/software architecture of the web–server. The dynamics of the modelled system behaviour, in turn, is determined by the state and allocation of tokens in the net structure. Two groups of tokens have been proposed for model construction. The first group consists of so-called local tokens that “live” in individual design patterns. They provide local functions and data structures for the patterns. The second group of tokens represents Internet requests that are serviced in the system. They are transported through several cluster components. Their internal state carries data that may be used for timing and performance evaluation of the modelled system. As the tokens representing the requests have the predominant role in the modelling methodology, their structure will be explained in detail.

Each token representing an Internet request is a tuple

PACKAGE = (ID, PRT, START_TIME, PROB, AUTIL, RUTIL),

where ID is a request identifier, PRT is a request priority, START_TIME is the value of simulation time at which the request was generated, PROB is a random value, AUTIL is an absolute request utilization value, and RUTIL is a relative request utilization value. The request identifier makes it possible to give each request a unique number. The request priority is an integer value that may be taken into consideration when the requests are scheduled according to a priority-driven strategy [7]. The START_TIME parameter stores a simulation time value and can be used for timing validation of the requests. The absolute and relative request utilization values are exploited in some queueing system execution models (e.g. with processor sharing service).
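For readers who prefer code to tuples, the request token can be mirrored by a small Python record. This is only an illustration of the PACKAGE fields, not part of the HTCPN tool; the field types, the example values and the helper `response_time` are assumptions.

```python
from typing import NamedTuple


class Package(NamedTuple):
    """One Internet request token, mirroring the PACKAGE tuple."""
    id: int            # ID: unique request identifier
    prt: int           # PRT: request priority
    start_time: int    # START_TIME: simulation time of generation
    prob: int          # PROB: random value drawn at generation
    autil: float       # AUTIL: absolute request utilization
    rutil: float       # RUTIL: relative request utilization


def response_time(package, now):
    """Time the request has spent in the system up to simulation time `now`."""
    return now - package.start_time


# Example values are arbitrary.
request = Package(id=1, prt=1, start_time=0, prob=29, autil=0, rutil=0)
print(response_time(request, now=811))
```

Because START_TIME travels with the token, any component that sees the request can compute how long it has been in the system, which is the property used later for timing validation.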
2.2 Queueing System Models
The basic components of the software tool for web–server cluster development introduced in this paper are executable queueing system models. At the current state of the software tool construction, the queueing system models can have a FIFO, LIFO, processor sharing or priority based service discipline. For each queue an arbitrary number of service units may be defined. Additionally, the basic queueing systems have been equipped with auxiliary components that make it possible to monitor the internal state of the queue during its execution.

An example HTCPN–based queueing system model is shown in fig. 1. The model is an HTCPN subpage that communicates with the parent page via the INPUT_PACKS, OUTPUT_PACKS and QL port places. The request packets (which arrive through the INPUT_PACKS place) are placed into a queue structure within the PACK_QUEUE place after execution of the ADD_FIFO transition. The TIMERS place and the REMOVE_FIFO transition constitute a clock-like structure and make it possible to model the duration of packet execution. When the REMOVE_FIFO transition fires, the first packet from the queue is withdrawn and directed to the service procedure. The packets under service acquire time stamps generated according to the assumed random distribution of the service time. The time stamps associated with the tokens prevent the packet tuples (the tokens) from being used for any transition firing until the stated simulation time elapses (according to the firing rules defined for HTCPNs [6]). The packets are treated as serviced when they can leave the OUTPUT_PACKS place because their time stamps have expired. The number of tokens in the TIMERS place defines the quantity of queue servicing units in the system.

Fig. 1. HTCPNs based –/1/FIFO/∞ queueing system model

The main parameters that define the dynamics of the queueing system model are the queue mean service time, the service time probability distribution function and the number of servicing units. The capacity of the queue is currently not taken into consideration and theoretically may be unlimited.

For future applications, the primary queueing system design pattern explained above has been equipped with an auxiliary “plug–in”. The COUNT_QL transition and the TIMER_QL, QL and COUNTER places make it possible to measure the queue length and export the measured value to the parent CPN page during the net execution. The TIMER_QL place includes a timer token that periodically enables the COUNT_QL transition. The QL port place includes a token storing the last measured queue length and the individual number of the queueing system in the system. The COUNTER place includes a counter token used for synchronization purposes.
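The behaviour of the FIFO pattern can be approximated outside the Petri net formalism by a few lines of Python. The sketch below is not the authors' CPN ML code: it is a plain FIFO queue with a configurable number of service units and an unlimited waiting room, and the names `simulate_fifo` and `queue_length` as well as the mean service time of 5 time units are hypothetical.

```python
import heapq
import random


def simulate_fifo(arrivals, service_time, servers=1):
    """Return (start, finish) service times of requests served in FIFO order.

    arrivals: sorted arrival times; service_time: callable returning one
    service duration; servers: number of service units (the role played
    by the number of tokens in the TIMERS place).
    """
    free_at = [0.0] * servers            # when each service unit becomes idle
    schedule = []
    for arrival in arrivals:
        start = max(arrival, heapq.heappop(free_at))   # earliest idle unit
        finish = start + service_time()
        heapq.heappush(free_at, finish)
        schedule.append((start, finish))
    return schedule


def queue_length(arrivals, schedule, now):
    """Number of requests that have arrived but not yet started service."""
    return sum(1 for arrival, (start, _) in zip(arrivals, schedule)
               if arrival <= now < start)


# Exponential service times with an assumed mean of 5 time units.
mean_service_time = 5.0
arrival_times = [0.0, 1.0, 2.0]
schedule = simulate_fifo(arrival_times,
                         lambda: random.expovariate(1.0 / mean_service_time))
print(queue_length(arrival_times, schedule, now=2.0))
```

Sampling `queue_length` at regular intervals corresponds to what the queue length “plug–in” does periodically in the net.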
2.3 Cluster Load–Balancing Model
Given a set of queueing system design patterns, some packet distribution HTCPN structures may be proposed. In [13] a typical homogeneous multi–tier web–server structure pattern was examined, whereas in [14] a preliminary version of a server structure with feedback-like admission control of Internet requests was introduced. The packet distribution pattern presented in this paper addresses the problem of load balancing in a web–server cluster. Fig. 2 shows an example cluster load–balancing HTCPN model. The cluster consists of 3 computers represented as the FIFO1...FIFO3 substitution transitions, where each transition is attached to a FIFO queueing pattern. The Internet requests serviced by the cluster arrive through the PACKS2 port place. A load balancer decides where the currently acquired request should be sent to achieve a uniform load for all, even heterogeneous, nodes of the cluster.

Fig. 2. Server cluster with load balancing model

Generally, the load balancing procedure can follow the Fewest Server Processes First [17] or the Adaptive Load Sharing [5] algorithms. Both algorithms need some feedback information about the state of the cluster nodes being balanced. The QL1, QL2 and QL3 places (connected to the corresponding QL port places of the FIFO queueing system models, compare section 2.2) provide the queue lengths of each cluster node to the load balancing procedure. The less loaded server nodes have the highest probability of getting a new request to serve. The HTCPN implementation of the algorithms involves periodic firing of the BALANCE transition. During the firing, a set of threshold values is generated and stored in the B_TABLE place. Finally, the threshold values are mapped to the guard functions associated with the T3, T4 and T5 transitions and can be understood as a kind of bandwidth for the request streams. At the current state of the design pattern composition, the only load balancer parameter that influences its dynamics is the frequency at which the load of the cluster nodes is measured.
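The threshold mechanism can be sketched in Python as follows. The weighting rule used here (a band width proportional to 1/(1 + queue length)) is an assumption chosen for illustration; it is neither the Fewest Server Processes First nor the Adaptive Load Sharing algorithm. The sketch also assumes that the random value PROB carried by each request lies in the range 1..100. The names `count_bands` and `route` are hypothetical.

```python
def count_bands(queue_lengths, scale=100):
    """Split 1..scale into one band per node; shorter queues get wider bands."""
    weights = [1.0 / (1 + length) for length in queue_lengths]
    total = sum(weights)
    bands, low, acc = [], 1, 0.0
    for index, weight in enumerate(weights):
        acc += weight
        last = index == len(weights) - 1
        # The last band always closes the range; others end at the floor
        # of the cumulative share.
        high = scale if last else int(scale * acc / total)
        bands.append((low, high))
        low = high + 1
    return bands


def route(prob, bands):
    """Return the index of the node whose band contains prob."""
    for node, (low, high) in enumerate(bands):
        if low <= prob <= high:
            return node
    raise ValueError("prob lies outside every band")


bands = count_bands([0, 4, 0])     # the second node has 4 waiting requests
print(bands, route(50, bands))
```

Recomputing the bands at a fixed period and testing each request against them plays the role of the periodically fired balancing step and the guarded routing transitions described above.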
2.4 Request Generator Model
According to one of the main assumptions of the web–server cluster modelling methodology presented in this paper, the system model can be treated as an open queueing network. Consequently, a crucial model component must be a network arrival process simulating the Internet service requests that are sent to the server. Fig. 3 shows an example HTCPN subpage that models a typical Internet request generator. The core of the packet generator is a clock composed of the TIMER0 place and the T0 transition. The code segment attached to the T0 transition produces values of time stamps for the tokens stored in the TIMER0 place. The values follow a defined probability distribution function. As a result, the Internet requests appear in the PACKS1 place at random moments of simulation time. The frequency at which tokens appear in the PACKS1 place follows the distribution function mentioned. The PACKS1 place has port place status, and therefore tokens appearing in it can be consumed by other model components (e.g. the server cluster model). The Internet request frequency can have any standard probability distribution function or can be individually constructed as proposed in [20].

Fig. 3. Web–server arrival process model
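An arrival process of this kind can be imitated with a short Python generator. The sketch assumes exponentially distributed inter-arrival times and a PROB value drawn uniformly from 1..100; both the range and the function name `generate_requests` are assumptions, and the tuple layout simply follows the PACKAGE fields (ID, PRT, START_TIME, PROB, AUTIL, RUTIL).

```python
import random


def generate_requests(count, mean_interarrival, rng=None):
    """Yield (ID, PRT, START_TIME, PROB, AUTIL, RUTIL) request tuples.

    Inter-arrival times are exponentially distributed with the given mean;
    PROB is drawn uniformly from 1..100 (an assumed range).
    """
    rng = rng or random.Random()
    now = 0.0
    for request_id in range(1, count + 1):
        now += rng.expovariate(1.0 / mean_interarrival)   # next arrival time
        yield (request_id, 1, now, rng.randint(1, 100), 0, 0)


# A seeded generator makes the run repeatable.
requests = list(generate_requests(5, mean_interarrival=10.0,
                                  rng=random.Random(1)))
for request in requests:
    print(request)
```

Replacing `expovariate` with another sampling function changes the arrival distribution without touching the rest of the sketch.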
2.5 Example Top–Level Cluster Server Model
Given an adequate set of design patterns, a wide range of server cluster architectures can be modelled and tested at an early stage of the development process. In the top–level modelling process, each of the main components of the system can be represented as an HTCPN substitution transition. The modelling methodology presented in the paper suggests that the arrival process and the main server cluster layers should be highlighted during the top–level model construction. After that, each of the main components (main substitution transitions) should be decomposed into an adequate packet distribution subpage, where queueing system models will be attached under some of the transitions. It is easy to notice that a typical top–down approach to software/hardware system modelling has been adopted in the web–server modelling methodology proposed in the paper.

Fig. 4 shows an example top–level HTCPN model of a server cluster that follows the abovementioned model development rules. The HTCPN in fig. 4 consists of 2 substitution transitions. The INPUT_PROCS transition represents the arrival process for the server cluster, whereas the SERVER_CLUSTER transition represents an example one–layer web–server cluster. The modelling process can easily be continued by attaching the request generator model of section 2.4 under the INPUT_PROCS transition and by attaching the cluster model with the load balancing module of section 2.3 under the SERVER_CLUSTER transition. The final executable model can be acquired by attaching FIFO design patterns under the FIFO1, FIFO2 and FIFO3 transitions in the load balancing module (compare sections 2.2 and 2.3).

Fig. 4. Example top–level cluster server model
3 Model Validation Capabilities
Typical elements of HTCPNs modelling software tools are performance evaluation routines, e.g.: [10] . The routines make it possible to capture the state of dedicated tokens or places during the HTCPN execution. A special kind of log files showing the changes in the state of HTCPN can be received and analyzed offline. At the currently reported version of web–server cluster modelling and analysis software tool, queue lengths and service time lengths can be stored during the model execution. Detecting the queue lengths seem to be the most natural load measure available at typical software systems. The service time lengths are measurable in the modelling method proposed because of a special kind P ACKAGE type tokens construction (compare section 2.1). The tokens “remember” the simulation time at witch their appear at the cluster and thereafter the time at each state of their service may be captured. In real systems the service time is one of predominant quality of service parameters for performance evaluation. The performance analysis of models of web servers constructed according the proposed in the paper methodology can be applied in the following domains. First, the system instability may be easily detected. The stable or balanced queueing system in a steady state has an approximately constans average queue length and average service time. On the contrary, when the arrival process is to intensive for the queueing systems to serve, both queue lengths and service times increase. This kind of analysis is possible because there are not limitations for queue lengths in the modelling method proposed. Fig. 5 shows the queue lengths
[Figure 5: panel (a) “Queue 1,2,3 Lengths” plots Queue Length for Fifo1_length, Fifo2_length and Fifo3_length; panel (b) “Service Time Lengths” plots Service Time Length for servers 1, 2 and 3; both against Time [sim. time units].]
Fig. 5. Queue lengths (a) and service times (b) under overload condition
[Figure 6: panel (a) “Queue 1,2,3 Lengths” plots Queue Length for Fifo1_length, Fifo2_length and Fifo3_length; panel (b) “Service Time Lengths” plots Service Time Length for servers 1, 2 and 3; both against Time [sim. time units].]
Fig. 6. Queue lengths (a) and service times (b) under stable system execution
(fig. 5a) and service time lengths (fig. 5b) when the example web server cluster model presented in the paper experiences permanent overload. Second, the average values of queueing system parameters, such as the average queue length and the average service time of the balanced model, can be estimated. Provided that the parameters of the arrival process model and the server node models are acquired from real devices as in [11,17,19,20], the software model can be used to derive the system properties under different load conditions. Fig. 6 shows queue lengths (fig. 6a) and service times (fig. 6b) under stable system execution. The cluster had a heterogeneous structure, in which server 2 (fifo 2 model) had 4 times lower performance. The load balancing procedure, executing the Adaptive Load Sharing [5] algorithm, tried to reduce the number of Internet requests sent to the second server. The average queue length of FIFO1 and FIFO3 was 1.7, whereas the average queue length of FIFO2 was 4.4. The average service time for the FIFO1 and FIFO3 cluster nodes was 811 time units, whereas for FIFO2 it was 7471 time units. Third, some individual properties of cluster node structures or load balancing strategies may be observed. For example, in some load balancing strategies mentioned in [5], the load of a cluster node is estimated without
[Figure 7: panel (a) “Queue 1,2,3 Lengths” plots Queue Length for Fifo1_length, Fifo2_length and Fifo3_length; panel (b) “Service Time Lengths” plots Service Time Length for servers 1, 2 and 3; both against Time [sim. time units].]
Fig. 7. Example queue lengths (a) and service times (b) of unbalanced web–server cluster
any feedback information from the node. Such a load balancing strategy may easily fail when node performance is reduced for some external reason (e.g. a hardware failure or some extra node load). Fig. 7 shows a possible web–server cluster reaction to an unreported performance reduction of one server node. Under such disadvantageous conditions, the loss or unavailability of feedback information about the current state of a cluster server can lead to unstable behavior of the cluster. The Internet requests scheduled to the server 2 node after its performance reduction (at approximately 1000000 time units) may not be serviced because of the server overload.
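The instability symptom described in this section, queue lengths that keep growing instead of settling around a constant average, can be checked mechanically on a logged trace. The following is a minimal Python sketch, not part of the reported tool: `simulate_fifo`, `queue_trend`, `looks_overloaded` and the `slope_threshold` value are hypothetical names and parameters introduced here for illustration, and the toy single-server FIFO with fixed arrival and service times only stands in for an HTCPN simulation log.

```python
from statistics import mean


def simulate_fifo(arrival_period, service_time, horizon, sample_every):
    """Toy deterministic FIFO (an assumption, not the HTCPN model).

    One request arrives every arrival_period ticks and needs service_time
    ticks on a single server. Returns (time, requests_in_system) samples.
    """
    samples = []
    in_system = 0   # requests queued or in service
    remaining = 0   # ticks left for the request currently in service
    for t in range(horizon):
        if t % arrival_period == 0:
            in_system += 1
        if remaining == 0 and in_system > 0:
            remaining = service_time
        if remaining > 0:
            remaining -= 1
            if remaining == 0:
                in_system -= 1
        if t % sample_every == 0:
            samples.append((t, in_system))
    return samples


def queue_trend(samples):
    """Least-squares slope of queue length versus time."""
    times = [t for t, _ in samples]
    lengths = [q for _, q in samples]
    t_mean, q_mean = mean(times), mean(lengths)
    denom = sum((t - t_mean) ** 2 for t in times)
    if denom == 0:
        raise ValueError("need samples at two or more distinct times")
    return sum((t - t_mean) * (q - q_mean) for t, q in samples) / denom


def looks_overloaded(samples, slope_threshold=1e-3):
    """Flag a queue whose length keeps growing over the observed window.

    slope_threshold (requests per time unit) is a hypothetical tuning value
    that has to be chosen for the time scale of the model at hand.
    """
    return queue_trend(samples) > slope_threshold


stable = simulate_fifo(arrival_period=10, service_time=5,
                       horizon=10_000, sample_every=100)
overloaded = simulate_fifo(arrival_period=10, service_time=20,
                           horizon=10_000, sample_every=100)
print(looks_overloaded(stable), looks_overloaded(overloaded))
```

In this sketch the first trace stays flat because each request is served before the next one arrives, while the second grows steadily because requests arrive twice as fast as they are served. The same trend test could be applied to logged service times.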
4 Conclusions and Future Research
The paper introduces an HTCPNs–based software tool that makes it possible to construct and validate executable models of some web–server clusters. The main concept of the tool lies in the definition of reusable HTCPN structures (patterns) covering typical components of cluster–based server structures. The preliminary patterns are executable models of typical queueing systems. The queueing system templates may be arranged into server cluster subsystems by means of packet distribution patterns. Finally, the subsystem patterns may be naturally used for top level system modelling, where individual substitution transitions “hide” the main components of the system. The final model is a hierarchical timed coloured Petri net. Simulation and performance analysis are the predominant methods that can be applied for model validation. The queueing system templates were checked against theoretically derived performance functions. The analysis of HTCPN simulation reports makes it possible to predict the load of the modelled system under a given arrival request stream, to detect whether the system is stable, and to test new algorithms for Internet request redirection and for request service within cluster structures.
Currently, the software tool presented in the paper can be applied to the modelling and validation of a limited set of web–server cluster structures. Therefore, the main stream of the authors’ future research will concentrate on developing models of further web–server node structures. This may bring the following advantages. First, an open library of already proposed web–server cluster structures could be created and applied by future web–server developers. Second, some new solutions for distributed web–server systems may be proposed and validated.
References
1. Bause, F.: Queueing Petri Nets – a formalism for the combined qualitative and quantitative analysis of systems. In: PNPM 1993, pp. 14–23. IEEE Press, Los Alamitos (1993)
2. Cao, J., Andersson, M., Nyberg, C., Khil, M.: Web Server Performance Modeling Using an M/G/1/K*PS Queue. In: 10th International Conference on Telecommunications, ICT 2003, vol. 2, pp. 1501–1506 (2003)
3. Cardellini, V., Casalicchio, E., Colajanni, M.: The State of the Art in Locally Distributed Web-Server Systems. ACM Computing Surveys 34(2), 263–311 (2002)
4. Filipowicz, B.: Stochastic models in operations research, analysis and synthesis of service systems and queueing networks. WNT, Warszawa (1997) (in Polish)
5. Guo, J., Bhuyan, L.N.: Load Balancing in a Cluster-Based Web Server for Multimedia Applications. IEEE Transactions on Parallel and Distributed Systems 17(11), 1321–1334 (2006)
6. Jensen, K.: Coloured Petri Nets, Basic Concepts, Analysis Methods and Practical Use. Springer, Heidelberg (1996)
7. Kim, D., Lee, S., Han, S., Abraham, A.: Improving Web Services Performance Using Priority Allocation Method. In: Proc. of International Conference on Next Generation Web Services Practices, pp. 201–206. IEEE, Los Alamitos (2005)
8. Kounev, S.: Performance Modelling and Evaluation of Distributed Component–Based Systems Using Queuing Petri Nets. IEEE Transactions on Software Engineering 32(7), 486–502 (2006)
9. Kounev, S., Buchmann, A.: SimQPN–A tool and methodology for analyzing queueing Petri net models by means of simulation. Performance Evaluation 63(4-5), 364–394 (2006)
10. Linstrom, B., Wells, L.: Design/CPN Perf. Tool Manual. CPN Group, Univ. of Aarhus, Denmark (1999)
11. Liu, X., Sha, L., Diao, Y., Froehlich, S., Hellerstein, J.L., Parekh, S.: Online Response Time Optimization of Apache Web Server. In: Jeffay, K., Stoica, I., Wehrle, K. (eds.) IWQoS 2003. LNCS, vol. 2707, pp. 461–478. Springer, Heidelberg (2003)
12. Liu, X., Zheng, R., Heo, J., Wang, Q., Sha, L.: Timing Performance Control in Web Server Systems Utilizing Server Internal State Information. In: Proc. of the Joint Internat. Conf. on Autonomic and Autonomous Systems and International Conference on Networking and Services, p. 75. IEEE, Los Alamitos (2005)
13. Samolej, S., Rak, T.: Timing Properties of Internet Systems Modelling Using Coloured Petri Nets. Systemy czasu rzeczywistego–Kierunki badan i rozwoju, Wydawnictwa Komunikacji i Lacznosci, 91–100 (2005) (in Polish)
14. Samolej, S., Szmuc, T.: Dedicated Internet Systems Design Using Timed Coloured Petri Nets. Systemy czasu rzeczywistego–Metody i zastosowania, Wydawnictwa Komunikacji i Lacznosci, 87–96 (2007) (in Polish)
15. Samolej, S., Szmuc, T.: TCPN–Based Tool for Timing Constraints Modelling and Validation. In: Software Engineering: Evolution and Emerging Technologies. Frontiers in Artificial Intelligence and Applications, vol. 130, pp. 194–205. IOS Press, Amsterdam (2005)
16. Samolej, S., Szmuc, T.: Time Constraints Modeling And Verification Using Timed Colored Petri Nets. In: Real–Time Programming 2004, pp. 127–132. Elsevier, Amsterdam (2005)
17. Shan, Z., Lin, C., Marinescu, D.C., Yang, Y.: Modelling and performance analysis of QoS–aware load balancing of Web–server clusters. Computer Networks 40, 235–256 (2002)
18. Spies, F.: Modeling of Optimal load balancing strategy using queueing theory. Microprocessing and Microprogramming 41, 555–570 (1996)
19. Urgaonkar, B.: An Analytical Model for Multi–tier Internet Services and Its Applications. ACM SIGMETRICS Performance Evaluation Review 33, 291–302 (2005)
20. Wells, L.: Simulation Based Performance Analysis of Web Servers. In: Proc. of the 9th Internat. Workshop on Petri Nets and Perf. Models, p. 59. IEEE, Los Alamitos (2001)
21. Zhang, Z., Fan, W.: Web server load balancing: A queuing analysis. European Journal of Operation Research 186, 681–693 (2008)
Software Product Line Adoption – Guidelines from a Case Study
Pasi Kuvaja1, Jouni Similä1, and Hanna Hanhela2
1 Department of Information Processing Science, University of Oulu, FI-90014 University of Oulu, Finland
[email protected], [email protected]
2 Digia, Sepänkatu 20, FI-90100 Oulu
[email protected]
Abstract. It is possible to proceed with software product line adoption only once without major reinvestments and loss of time and money. In the literature, reported experiences of using the adoption models are not to be found, and especially the suitability of the models has not been reported. The purpose of this research is to compare known adoption models by formulating general evaluation criteria for the selection of an adoption model. Next an adoption model is selected for empirical research based on the context of a multimedia unit of a global telecommunication company. The empirical part consists of a case study analyzing the present state of adoption and producing plans for proceeding with the adoption. The research results can be utilized when selecting an adoption model for an empirical case and adopting a software product line in a software intensive organization.
Keywords: software product line, adoption, adoption model, adoption strategy, guidelines.
Z. Huzar et al. (Eds.): CEE-SET 2008, LNCS 4980, pp. 143–157, 2011. © IFIP International Federation for Information Processing 2011
1 Introduction
Over the last decade, software product line engineering has been recognized as one of the most promising software development paradigms, which substantially increases the productivity of IT-related industries, enables them to handle the diversity of global markets, and reduces time to market [1]. In addition, the software product line approach can be considered the first intra-organizational software reuse approach that has proven to be successful [2], and it is a key strategic technology in attaining and maintaining unique competitive positions [3]. A software product line is “a set of software intensive systems sharing a common, managed set of features that satisfy the specific needs of a particular market segment or mission and that are developed from a common set of core assets in a prescribed way” [4]. Thus, the different systems in a product line are built by exploiting existing core assets. However, not all of the existing core assets need to be used in any one system.
The transition from a conventional system development mode towards product line engineering requires adoption of a new approach. In software product line adoption, an organization changes its operational mode to develop product lines consisting of several products instead of developing products separately in-house. Adopting the new approach is not, however, effortless. During the adoption, planning and coordination of technical, management, organizational, and personnel changes are required [2, 5]. Furthermore, an adoption can be made either by starting a product line from scratch or by exploiting existing systems [6, 7, 8, 9, 10]. If the former strategy is used, the changes needed are even larger than with the latter strategy.
There are stories about successful adoption. After adopting a software product line, an organization can benefit in many different ways. Many studies have reported that development time has shortened as efficiency has increased, fewer personnel are needed to produce more systems, more software is reused, overall and maintenance costs have decreased, and defects have been reduced without compromising customer satisfaction [11,12,13,14,15]. The success stories of software product line adoption share three common characteristics: exploring commonalities and variability, architecture-centric development, and a two-tiered organization in which one part develops the reusable assets and the other develops products using the assets [9]. Despite the reported successful adoption stories, no reported experiences of using the adoption models can be found in the literature. This study fills that gap with respect to one adoption model.
2 Research Questions and Study Setup
There are two main research problems considered in this study. The problems are further elaborated into sub-questions which, when answered, will solve the main research problems. The research problems with their sub-questions are as follows.
(1) How can a software product line be adopted? How to choose an adoption strategy and model among the existing strategies and models presented in the literature? What are the general evaluation criteria for the selection of the adoption models for an empirical case?
(2) How does the chosen adoption model fit the context of this empirical research? What are the experiences of using the adoption model? Are there any missing characteristics in the model which would have been essential in this particular context?
The research consists of two parts: a literature review (first research question) and an empirical study (second research question). Based on the literature, general evaluation criteria are identified for evaluating the adoption models. By evaluating the adoption models, the model most suitable for the needs of the target organization is selected. The selected adoption model is then applied in an empirical case study performed in a multimedia unit of a global telecommunication company.
3 State of the Art of Software Product Line Engineering and Adoption
Principles of software product line engineering are presented first to establish a common understanding of the approach and its terminology. Thereafter, different factors impeding successful adoption are described. Different adoption strategies are introduced to illustrate the ways in which a software product line can be adopted. Finally, the different adoption models found in the literature are presented with their common phases and development parties. In addition, general evaluation criteria for the selection of the adoption model are identified, and the adoption models are evaluated to find the most suitable one to be used in the empirical part of this research.
3.1 Software Product Line Engineering
Software product line engineering is based on the idea that software systems in a particular domain share more common characteristics than unique ones and that those systems are built repeatedly, releasing product variants by adding new features [1]. Therefore, the scope of the product line is defined so that the products in it have a high degree of common characteristics and the implementation of a component is shared over multiple products [2, 9]. Generally, upfront investments of approximately three or four products are required to get a return on those investments [9, 16, 17]. Nevertheless, by using an incremental transition approach with a large legacy code base in a large organization, it is possible to adopt a software product line without a large upfront investment and without disrupting ongoing product schedules [18]. Important issues related to software product line engineering are the definition of scope [19] and the consideration of variability and commonalities [20, 21, 22]. Architectures have a key role in software product line engineering [9, 19, 23, 24, 25]. There are three kinds of architectures in the context of product lines: platform architecture, product line architecture, and product-specific architecture. Platform architecture is used to build reusable assets within a platform, and it focuses on the internal structure of the platform [26]. A software platform is a set of software subsystems and interfaces that form a common structure from which a set of derivative products can be efficiently developed and produced [15].
A platform is the basis of the product line and, in many cases, is built from components evolving through the lifecycle of the product line. If developers cannot obtain the assets they need from the platform itself, they must develop them. Afterwards, the new, single-product assets might be integrated into the platform [23]. To achieve the benefits of software product line engineering, an organization first has to adopt the approach. The adoption is a major change process affecting different groups in the organization [2]. According to Bosch, the different alternatives of adoption should be understood and evaluated rather than blindly following a standard model [27]. The adoption itself starts with an assessment of the current state [7]. Therefore, it is essential to understand the different challenges, strategies and models related to software product line adoption.
3.2 Software Product Line Adoption
Both organizational and technical skills are key for a product line introduction in existing domains [28]. Challenges related to technical aspects are wrong or incomplete requirements for the platform and a wrong platform architecture [26]. In addition, a lack of either architecture focus or architecture talent can kill an otherwise promising product line effort [19]. Software product line adoption affects employees’ roles and responsibilities. When an organization learns to operate in a new mode, this is usually not achieved without problems [2]. There is resistance within the adopting organization, which can affect the success of the product line adoption [26].
When moving from conventional software development towards software product line engineering, the selected adoption strategy defines how much investment is needed at the beginning of the adoption and what the development time of the products is. During the adoption, the change effort needed is usually underestimated and timetables are often too tight [8, 26]. This is made more challenging by the fact that resources normally have to be shifted from existing projects, which rarely have resources to spare [7]. Organizations are typically hesitant to invest in changes that do not have an obvious, short-term return on investment (ROI) [2]. One of the most essential issues to take into consideration during the adoption is management commitment. Without explicit and sufficient management commitment and involvement, product line business practices cannot be influenced and cannot succeed [3, 19]. In addition, management commitment needs to be long-term [13], and the need for it does not depend on the size of an organization [29].
According to Krueger [30], minimally invasive transitions eliminate the adoption barriers. This means that while moving from conventional one-system development towards software product line engineering, only minimal disruption of ongoing production schedules is allowed. Minimally invasive transitions rely on two main techniques. The first technique focuses on exploiting existing systems: the existing assets, processes, infrastructure, and organizational structures of an organization are carefully assessed in order to exploit them as much as possible. The second technique concentrates on incremental adoption, in which a small upfront investment creates an immediate and incremental return on investment (ROI).
In such a case, the returns of the previous incremental step fund the next incremental step, and the organization adopts a software product line without much disruption to ongoing production. In addition, to lower the adoption barriers, the organization’s current strengths and interests should be taken into consideration together with a reasonable speed of change [28].
3.3 Software Product Line Adoption Strategies
Software product line adoption strategies appear in the literature under different names. McGregor et al. [9] present two main types of adoption strategies, which they call heavyweight and lightweight strategies. Krueger [8] discusses proactive, reactive, and extractive adoption models. Further, according to Schmid and Verlage [10], there are four types of situations when adopting a product line: independent, project-integrating, reengineering-driven, and leveraged. Böckle et al. [7] divide transition strategies into four groups, which are called incremental introduction, incremental investment, pilot first, and big bang. Bosch [6] divides the adoption process into two approaches, evolutionary and revolutionary, for two different situations depending on whether existing items are utilized or not.
Although there are many different strategies for adopting a software product line, they share common characteristics. Common to all the mentioned adoption strategies is that the adoption either starts from scratch or exploits existing systems. The main differences between the two strategies relate to the duration of the adoption and the upfront investment needed. In the starting from scratch strategy the adoption time (and thus the development time of one product) is shorter, but higher upfront investments are needed than in the latter strategy, and returns on investment can only be seen when products are developed and maintained. In addition, the cumulative costs are reduced faster in the starting from scratch strategy than in the exploitation of existing products [9, 31]. The starting from scratch strategy resembles the waterfall approach in conventional software engineering, whereas exploiting existing systems corresponds to incremental software development [31].
There are also differences between the strategies in exploiting commonalities and variability, in architecture development, and in organizational structure. In the starting from scratch strategy, the adoption starts with creating assets which satisfy the specifications of the platform architecture. After that, the creation of products takes place. In addition, the product line architecture is defined completely before the first products are delivered. When the starting from scratch strategy is used, there are particular teams which produce assets such as architecture and components. In the exploiting existing systems strategy, assets are created from existing products and products currently under development, and the product line architecture is not complete when the first products are delivered. In that strategy, the organizational structure does not change until the first few products have been delivered [9].
The choice of the adoption strategy may depend on the situation of the organization and on market demand. If the organization can afford to freeze conventional software development while adopting the software product line, it can choose a starting from scratch strategy. That strategy would also be good in cases where the organization has additional resources for adoption, or where the transition does not need to be done quickly. In cases where the organization already has products, or even a product line, which are worth utilizing, it may choose an exploitation of existing systems strategy.
That strategy can also lower the adoption barrier of large-scale reuse, as the organization can reuse existing items (software, tools, people, organization charts, and processes) to establish a product line [8].
3.4 Software Product Line Adoption Models
The adoption of a software product line requires changes in technical, management, organizational, process, and personnel aspects [2, 5, 7]. Consequently, an adoption model needs to take these aspects into consideration, if not all of them then at least most. Adoption models focusing on only certain aspects are not discussed in this research, for example the ones where adoption is based on legacy products [32], architecture [33, 34], organizational structure [27], or separation of concerns [35].
Böckle et al. [7] have introduced a General Adoption Process Model for adopting a software product line. It has four main phases focusing on stakeholders, business cases, adoption plan, and launching and institutionalizing. In addition to the main phases, the model includes different factors contributing to the adoption: goals, promotion, and adoption decision.
Software product line adoption requires many decisions which have to be made in the adoption phase by the adopting organization. These decisions concern which components are developed and in which order, how the architecture is harmonized, and how the development teams are organized. For that purpose, the Decision Framework introduces five decision dimensions: feature selection, architecture harmonization, R&D organization, funding, and shared component scoping [2]. In addition, the model contains three stages through which product line adoption typically evolves: initial adoption, increasing scope, and increasing maturity.
The Product Line Software Engineering (PuLSE) methodology has a strong product-centric focus for the conception and deployment of software product lines [36]. It comprises three main elements: deployment phases, technical components, and support components. The deployment phases involve the activities needed when adopting and using a product line. There are four deployment phases: PuLSE initialization, product line infrastructure construction, product line infrastructure usage, and product line infrastructure evolution and management. The purpose of the technical components, the second element of the PuLSE methodology, is to offer the technical knowledge needed in all the phases of product line development. There are six technical components: customizing, scoping, modelling, architecting, instantiating, and evolving and managing. The support components are information packages or guidelines whose purpose is to enable better adoption, evolution, and deployment of the product line; they are used by the deployment phase components. There are three support components: project entry points, maturity scale, and organization issues.
The Business, Architecture, Process, and Organization (BAPO) model is a four-dimensional evaluation framework which organizations can use for determining the current state of product family adoption and the improvement priorities [37]. The dimensions concern business, architecture, process, and organization. Each dimension can be at one of five levels, which are defined with different evaluation aspects. For example, in the business dimension at the reactive level, the identity of an organization is implicit (software product line engineering is not visible), there is only a short-term vision, and both objectives and strategic planning are missing.
Adoption Factory, like the Decision Framework, comprises three main phases (Establish Context, Establish Production Capability, and Operate Product Line) for software product line adoption (Figure 1). Its focus areas (Product, Process, and Organisation) are separated by horizontal dashed lines, and arrows indicate information flows and shifts of emphasis among the elements [5, 38].
Fig. 1. Adoption Factory [5]
4 Conduct of the Study
At the beginning of the study the idea was that the main evaluation criterion for the selection of the adoption model would be derived from experiences reported by organizations that had used the models. However, no reports were found in the literature describing the pros and cons of using the models in the adoption phase. Some of the models had reported experiences in the literature: PuLSE [29, 39, 40], Adoption Factory [41], and BAPO [26]. These reports nevertheless did not discuss the applicability of the models.
The empirical research was carried out as a case study. According to Yin, a case study is suited for research which is focused on finding answers to “how”, “why” or exploratory “what” questions, when the investigator has little control over the events, and when a contemporary phenomenon is investigated in some real-life context [43]. A case study is either single-case or multiple-case, and the data gathering methods for a case study are surveys, interviews, observation, and use of existing materials. This research focused on a single case. The empirical data was collected through semi-structured interviews and by analyzing existing materials of the organization.
4.1 Choosing the Adoption Model
Because no such reports were available, more general evaluation criteria were derived from the literature: supported adoption strategy, customization, separation of core asset and product development, current state evaluation, and guidelines. The supported adoption strategy defines to which strategies the model is applicable: starting from scratch, exploiting existing systems, or both. Customization means the ability of an organization to tailor the adoption model to its own needs. Separation of core asset and product development defines whether these two development phases are illustrated separately in the adoption model.
The current state evaluation describes how easy the evaluation is to perform at a high level, and may have the values easy or not easy. The last evaluation criterion, guidelines, indicates whether the adoption model provides guidelines that can be followed when proceeding with the adoption. Customization, separation of core asset and product development, and guidelines may have the values yes or no.
Table 1. Evaluation of the Adoption Models
Adoption Model           | Supported Adoption Strategy | Customization | Separation of Core Asset and Product Development | Current State Evaluation | Guidelines
General Adoption Process | Both                        | Yes           | No                                               | Not Easy                 | No
Decision Framework       | Exploiting Existing Systems | No            | No                                               | Easy                     | Yes
PuLSE                    | Both                        | Yes           | No                                               | Easy                     | No
Adoption Factory         | Both                        | Yes           | Yes                                              | Easy                     | Yes
BAPO                     | Both                        | Yes           | No                                               | Not Easy                 | Yes
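The criteria of Table 1 can be read as a simple decision matrix. The following Python sketch transcribes the table values and filters the models with one possible selection rule, namely that a model should support both adoption strategies and score favourably on every other criterion. That rule, and the names `ModelRating`, `TABLE_1` and `candidates`, are assumptions made here for illustration; the study itself states only that the selection was based on the evaluation and the research context.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRating:
    strategy: str          # "Both" or "Exploiting Existing Systems"
    customization: bool    # can the model be tailored?
    separation: bool       # core asset and product development shown separately?
    easy_evaluation: bool  # current state evaluation rated "Easy"?
    guidelines: bool       # does the model give guidelines for proceeding?


# Values transcribed from Table 1.
TABLE_1 = {
    "General Adoption Process": ModelRating("Both", True, False, False, False),
    "Decision Framework": ModelRating("Exploiting Existing Systems",
                                      False, False, True, True),
    "PuLSE": ModelRating("Both", True, False, True, False),
    "Adoption Factory": ModelRating("Both", True, True, True, True),
    "BAPO": ModelRating("Both", True, False, False, True),
}


def candidates(ratings, required_strategy="Both"):
    """Models meeting every criterion (hypothetical selection rule)."""
    return [
        name
        for name, r in ratings.items()
        if r.strategy == required_strategy
        and r.customization
        and r.separation
        and r.easy_evaluation
        and r.guidelines
    ]


print(candidates(TABLE_1))
```

Under this rule only one model remains, and it is the one that scores favourably in every column of the table. A weaker rule, for example one that does not require separation of core asset and product development, would leave more candidates and would need the research context to break the tie.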
The adoption models were evaluated according to the defined criteria in order to find the most suitable one for use in the empirical study (Table 1). Based on the evaluation of the adoption models and the research context, Adoption Factory1 was selected for the empirical case.
4.2 Interviews
The themes for the interviews were selected from the Adoption Factory on the basis of two requirements. As the purpose was to find out the current status and future plans of the software product line adoption, the selected themes had to cover the model as extensively as possible (within the resource limitations of the research), and the interviewees had to have knowledge about them. The themes are marked with arrows in Figure 2. The structured questions for the interviews were derived from the selected themes. The questions were partly planned beforehand, but not in great detail. In addition, there were questions about the experiences gathered, which were utilized when defining the adoption guidelines for the target organisation.
Fig. 2. Selected Themes for the Interviews from the Adoption Factory
Before the interviews, the interviewees were divided into categories according to the generic development phases of the organization in question. The categories were road-mapping, product management, architecture, and requirements engineering. The reason for these categories was that possible gaps between them, for example in communication, could be found in order to minimize the gaps when proceeding with the adoption. Another reason was to find out whether all aspects and steps of maturing market needs into implementable requirements were covered.

In the interviews, the themes varied according to the category the interviewee belonged to. Table 2 clarifies the relationships between the themes and the interviewees. As in most cases all the selected themes belonging to one sub-pattern were discussed with the interviewee, the sub-patterns are used instead of the themes in Table 2.

¹ The Adoption Factory is discussed in some more detail in the empirical section. A detailed description of the Adoption Factory may be found on SEI’s web pages [42].

Table 2. Summary of the interviewees

| Interviewee | Category | Role of the Interviewee | Date |
|---|---|---|---|
| 1 | road-mapping | Senior Manager, Portfolio Management | 7.8.2007 |
| 2 | road-mapping | Senior Product Manager, Road-mapping | 7.8.2007 |
| 3 | product management | Product Manager | 8.8.2007 |
| 4 | product management | Product Manager | 8.8.2007 |
| 5 | architecture | Engine Product Manager | 10.8.2007 |
| 6 | architecture | Product Chief Architect | 13.8.2007 |
| 7 | requirements engineering | Product Requirement Manager | 14.8.2007 |
| 8 | requirements engineering | SW Technology Manager, Requirements | 23.8.2007 |
| 9 | requirements engineering | SW Requirements Operational Manager | 23.8.2007 |
| 10 | requirements engineering | SW Implementation Operational Manager | 23.8.2007 |
4.3 Data Collection

In addition to the interviews, existing documents were analyzed to clarify the current state and future plans related to the software product line adoption. The analyzed materials were mainly ones mentioned by the interviewees during the interviews, so the interviews had an open-ended nature. Such material was, for example, a process description of a certain development phase. The existing documents were analyzed after the interviews.

After the adoption model had been selected, an e-mail was sent to 10 persons who had participated in the development of the product line and of one product belonging to it, to inform them about the research. The e-mail contained general information about the research and the Adoption Factory, together with the purpose of the research. Two days later, a second e-mail was sent to arrange an interview. That e-mail contained a list of the themes and the topics that would be covered in every theme: current situation, experiences, and future plans. The interviewees could therefore prepare well beforehand [43]. No one declined the interview.

Among the interviewees there were two persons from each of the road-mapping, product management, and architecture categories, and four persons from requirements engineering. The roles of the interviewees varied according to the category they belonged to. Overall, the interviewees covered the interview themes well. The interviews were conducted in the same order as they are presented in Table 2.
All the interviews were face-to-face interviews with one interviewee at a time. At the beginning of each interview, a short introduction was given to familiarize the interviewee more closely with the research. The introduction covered the Adoption Factory, which was gone through in more depth than in the e-mails, the research problems (and the fact that the interviews would answer the second research question), how the research is conducted, and how the results are constituted. The interviewees had the possibility to ask for more details, if necessary. The interviews themselves lasted an average of one and a half hours. All the interviews were recorded with a digital voice recorder, so that none of the information given would be lost and only correct information would be used when analyzing the results. After the interviews, the recordings were transcribed. Later, the data gathered from the interviews and from the existing material were read through several times together with the Adoption Factory to form a clear general view before analyzing the results in more depth.

4.4 Data Analysis and Results

In this study, the data was analyzed by classifying it according to the themes used. In this way, both the current situation and the future plans could be compared to the model.

Table 3. Main findings with their possible reasons

| Category | Finding | Reason |
|---|---|---|
| Road-mapping | Period requirements described at too high a level of abstraction | Period requirements defined for several product lines |
| Road-mapping | No commonalities for one product line | Period requirements defined for several product lines |
| Road-mapping | Period requirements cannot be implemented in a required timeframe | Processing of the period requirements has not been defined |
| Product Management | Each product goes through the period requirements by itself | No commonalities for one product line |
| Product Management | Documentation requires a lot of effort | Each product team writes its own documentation |
| Product Management | Documents are not comparable between the products | No common structure for the documents |
| Product Management | Inefficient communication | No clear roles and responsibilities |
| Architecture | Architecture definition could not be started before certain decisions related to it were made | Insufficient management commitment |
| Requirements Engineering | Confusion among stakeholders | No clear roles and responsibilities |
| Requirements Engineering | Lots of data is collected but it is utilized poorly | No common structure for the metrics |
| - | Product line was established after establishing the products | - |
| - | Adoption plan has not been defined | - |
| - | No common place for data distribution | - |
| - | No training related to software product lines | - |
Table 4. How and why to take the guidelines into the daily practices

Guideline: Product line is established before the products (new short term operational mode)
Aspects to Consider:
• scoping for several product lines
• period requirements defined for the product lines
• supplier starts to process the period requirements immediately
• products of the product line implement the same period requirements
• product line is more responsible for documentations
Benefit:
• development of products is more efficient
• no need to cancel products development (significant cost savings)
• no unrealistic requirements
• diminished multiple work loads
• utilization of documents is more efficient
• data collection is more efficient

Guideline: Core asset development (new long term operational mode)
Aspects to Consider:
• core asset development by exploiting existing systems
• attached processes for core assets
• establishment of a core asset base
Benefit:
• reuse of core assets
• utilization of core assets
• development of products is more efficient

Guideline: Adoption plan
Aspects to Consider:
• definition of practices, roles, and responsibilities
• definition of different requirement types
• definition of usage of period requirements
• separation of product line and products
Benefit:
• clear practices, roles, and responsibilities
• helps with new operational mode
• mitigates adverse effects relating to the changes
• utilization of period requirements
• communication and cooperation is more efficient
• development of products is more efficient
• valuable for future product lines

Guideline: Place for data distribution
Aspects to Consider:
• existing data is collected to the same place (e.g. to a web page)
• possible pilot project
• hierarchical order
Benefit:
• utilization of existing data is more efficient
• data can be found more easily
• helps in employee networking

Guideline: Training
Aspects to Consider:
• trainings should cover principles of product line engineering, relations between different requirement types, each practice area
• other training needs should be clarified
Benefit:
• sharing of knowledge is more efficient
• development of products is more efficient

Guideline: Data collection
Aspects to Consider:
• definition of data collectors
• definition of review points
• separation of product line and products
• similar structure for the metrics
Benefit:
• helps in following the software product line adoption
• needed changes to refine the product line practices can be identified
• efforts for developing products can be seen
• decreases duplicate work
• metrics are more comparable and utilizable
With these classifications it is possible to see whether some focus area of the adoption, or a part of it, has not been considered. Together with the categories, the flow between road-mapping, product management, architecture, and requirements engineering could be seen and possible gaps were discovered. In particular, conflicts between different categories were noted, as those would negatively affect the success of the adoption.

The findings were classified according to the same categories that were used in the categorization of the interviewees: road-mapping, product management, architecture, and requirements engineering. In addition, the findings related to several or all categories are discussed at the end of this section. Table 3 summarizes the findings according to the category they belong to, with a possible reason for each finding. Findings and reasons were derived from the research data (interviews and existing material). The Adoption Factory was also considered when defining the reasons for the findings.

Based on the findings, guidelines for correcting and improving the situation in the case organization were formulated. Table 4 describes which aspects to consider when following the guidelines, and what benefits the guidelines would give. Both were formulated based on the research data (interviews and existing material) as well as on the Adoption Factory model. In the formulation, one aspect can give several benefits and one benefit can be a consequence of several aspects. The first two guidelines were named in temporal order as the new short term operational mode and the new long term operational mode. Formulation of the adoption plan may be started immediately while changing to the new short term operational mode. This applies equally to the place for data distribution, training, and data collection.
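The classification step described in this section can be illustrated with a small sketch. It is not the authors' analysis procedure: the tuple layout and the helper names `by_category` and `cross_category_reasons` are assumptions made here, and the data is a subset of Table 3 as far as its rows can be read. The second helper looks for reasons that recur in more than one category, which is one simple way to surface issues that cut across the flow from road-mapping to requirements engineering.

```python
from collections import defaultdict

# (category, finding, reason) triples: a subset of Table 3.
FINDINGS = [
    ("road-mapping",
     "Period requirements described at too high a level of abstraction",
     "Period requirements defined for several product lines"),
    ("road-mapping",
     "No commonalities for one product line",
     "Period requirements defined for several product lines"),
    ("product management",
     "Inefficient communication",
     "No clear roles and responsibilities"),
    ("architecture",
     "Architecture definition could not be started before certain "
     "decisions related to it were made",
     "Insufficient management commitment"),
    ("requirements engineering",
     "Confusion among stakeholders",
     "No clear roles and responsibilities"),
]


def by_category(findings):
    """Group (finding, reason) pairs under their category."""
    grouped = defaultdict(list)
    for category, finding, reason in findings:
        grouped[category].append((finding, reason))
    return dict(grouped)


def cross_category_reasons(findings):
    """Return the reasons that occur in more than one category."""
    categories_by_reason = defaultdict(set)
    for category, _finding, reason in findings:
        categories_by_reason[reason].add(category)
    return {
        reason: sorted(categories)
        for reason, categories in categories_by_reason.items()
        if len(categories) > 1
    }


print(cross_category_reasons(FINDINGS))
```

For this subset, the only reason shared between categories is the lack of clear roles and responsibilities, which appears under both product management and requirements engineering.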
5 Conclusions

In answering the first research question and two sub-questions, the following was found in the literature analysis. There are two basic alternatives, called adoption strategies, for adopting software product lines. The first alternative is to do everything from the very beginning without utilizing any existing systems, which is called the starting from scratch strategy. With the starting from scratch strategy, the development time of one product is shorter, but higher upfront investments are needed than in the other alternative, which is called the exploiting existing systems strategy. In that strategy, existing systems are utilized as much as possible and the cumulative costs are reduced faster than in the starting from scratch strategy. Compared to conventional software development, the starting from scratch strategy is like the waterfall approach and the exploiting existing systems strategy corresponds to incremental software development.

Based on the literature review, five evaluation criteria were found for the selection of adoption models. As the situation of the organization and market demand, as well as the adoption time and the needed upfront investments, are aspects that should be taken into consideration when selecting a suitable adoption model, the supported adoption strategy is the first criterion. To clarify whether an adoption model can be adapted to the organizational needs, the customization of the adoption model is the second evaluation criterion. Further, the software product line organization has two different roles: the first role is to develop core assets and the other is to produce products by exploiting the core assets. Due to this, the third evaluation criterion is called the
separation of core asset and product development. In addition, according to the theory, the adoption should start with a current state evaluation, so the possibility of evaluating the current state with the adoption model needs to be considered when selecting the adoption model. The last evaluation criterion is called guidelines. It means that the adoption model should support the creation of guidelines whose purpose is to help keep the adoption on the right track.

In answering the second research question and two sub-questions, the empirical part of the study concluded the following. First of all, five guidelines were defined to be taken into consideration when proceeding with the adoption. The first is to change the operational mode towards software product line engineering. As a short term guideline, the operational mode is changed to establish the product line before the products involved in it. As a long term guideline, the operational mode is changed to develop core assets, and the products are developed based on the core assets. At the same time as the new operational mode is introduced, an adoption plan should be created. Its purpose is to define the new practices, roles, and responsibilities needed to adopt a software product line. After these, a place for data distribution is needed to utilize existing systems as extensively as possible. In addition, training is needed to ensure that the products of the product lines can be built efficiently. The last guideline is called data collection, which helps to measure whether the adoption plan is working and whether the efforts needed to develop products are available. These three guidelines should be considered in reverse order: data collection should be considered first, then training, and last, but not least, the place for data distribution should be established.

Secondly, the adoption model used, Adoption Factory, was found to be the most suitable one for this research context based on the literature review.
The overall impression is that the model was usable in the empirical part of the research. The phases and the focus areas of the model enabled the analysis of the organization in question. In addition, the practice areas of the model were clear and understandable when defining the interview themes and questions as well as the guidelines. Based on the model, the current state could be estimated and future guidelines could be formulated. The model suited the context of the research well. As no reported empirical experiences of using the adoption models were found in the literature, this study fills that gap for the Adoption Factory model, although more case studies should be carried out to understand in which context a certain adoption model would suit best.

Two missing characteristics in Adoption Factory were found during the empirical study. First, the model is meant for pure software product line adoption. It does not consider cases where software needs hardware components for its operation; therefore, a totally new practice area could be included in the Establish Context phase for considering architectural aspects of embedded software product lines. Secondly, a new practice area or even an alternative phase could also be added to the Establish Context phase to show how to share the results of the marketing analysis between several product lines.
References

1. Sugumaran, V., Park, S., Kang, K.C.: Software product line engineering. Communications of the ACM 49(12), 28–32 (2006)
2. Bosch, J.: On the development of software product-family components. LNCS, pp. 146–164. Springer, Berlin (2004)
3. Birk, A., Heller, G., John, I., Schmid, K., von der Massen, T., Muller, K., et al.: Product line engineering: The state of the practice. IEEE Software 20(6), 52–60 (2003)
4. Clements, P., Northrop, L.: Software product lines: Practices and patterns. Addison-Wesley, Boston (2002)
5. Clements, P.C., Jones, L.G., McGregor, J.D., Northrop, L.M.: Getting there from here: A roadmap for software product line adoption. Communications of the ACM 49(12), 33–36 (2006)
6. Bosch, J.: Maturity and evolution in software product lines: Approaches, artefacts and organization. In: Chastek, G.J. (ed.) SPLC 2002. LNCS, vol. 2379, pp. 257–271. Springer, Heidelberg (2002)
7. Böckle, G., Munoz, J.B., Knauber, P., Krueger, C.W., do Prado Leite, S., Cesar, J., van der Linden, F., et al.: Adopting and institutionalizing a product line culture. LNCS, pp. 1–8. Springer, Heidelberg (2002)
8. Krueger, C.: Eliminating the adoption barrier. IEEE Software 19(4), 29–31 (2002)
9. McGregor, J.D., Northrop, L.M., Jarrad, S., Pohl, K.: Initiating software product lines. IEEE Software 19(4), 24–27 (2002)
10. Schmid, K., Verlage, M.: The economic impact of product line adoption and evolution. IEEE Software 19(4), 50–57 (2002)
11. Brownsword, L., Clements, P.C.: A case study in successful product line development. Carnegie Mellon University, Software Engineering Institute, Pittsburgh (1996)
12. Donohoe, P.: Software product lines: Experience and research directions. Springer, Heidelberg (2000)
13. Jaaksi, A.: Developing mobile browsers in a product line. IEEE Software 19(4), 73–80 (2002)
14. Kiesgen, T., Verlage, M.: Five years of product line engineering in a small company. In: Proceedings of the 27th International Conference on Software Engineering, pp. 534–543 (2005)
15. Meyer, M.H., Lehnerd, A.P.: The power of product platforms: Building value and cost leadership. Free Press, New York (1997)
16. Pohl, K., Böckle, G., van der Linden, F.: Software product line engineering: Foundations, principles, and techniques. Springer, Berlin (2005)
17. Weiss, D.M., Lai, C.T.R.: Software product-line engineering: A family-based software development process. Addison-Wesley, Reading (1999)
18. Hetrick, W.A., Krueger, C.W., Moore, J.G.: Incremental return on incremental investment: Engenio’s transition to software product line practice. In: Conference on Object Oriented Programming Systems Languages and Applications, pp. 798–804 (2006)
19. Northrop, L.M.: SEI’s software product line tenets. IEEE Software 19(4), 32–40 (2002)
20. Bosch, J., Florijn, G., Greefhorst, D., Kuusela, J., Obbink, H., Pohl, K.: Variability issues in software product lines. In: van der Linden, F.J. (ed.) PFE 2002. LNCS, vol. 2290, pp. 13–338. Springer, Heidelberg (2002)
21. Jaring, M., Bosch, J.: Representing variability in software product lines: A case study. In: Chastek, G.J. (ed.) SPLC 2002. LNCS, vol. 2379, pp. 15–22. Springer, Heidelberg (2002)
22. Coplien, J., Hoffman, D., Weiss, D.: Commonality and variability in software engineering. IEEE Software 15(6), 37–45 (1998)
23. van der Linden, F.: Software product families in Europe: The esaps & cafe projects. IEEE Software 19(4), 41–49 (2002)
24. Bosch, J.: Design and use of software architectures: Adopting and evolving a product-line approach. ACM Press/Addison-Wesley Publishing Co., New York (2000)
25. Mohagheghi, P., Conradi, R.: Different aspects of product family adoption. In: Software Product Family Engineering, pp. 459–464. Springer, Heidelberg (2004)
26. Wijnstra, J.G.: Critical Factors for a Successful Platform-Based Product Family Approach. In: Chastek, G.J. (ed.) SPLC 2002. LNCS, vol. 2379, pp. 68–3349. Springer, Heidelberg (2002)
27. Bosch, J.: Software product lines: Organizational alternatives. In: Proceedings of the 23rd International Conference on Software Engineering, pp. 91–100 (2001)
28. Stoermer, C., Roeddiger, M.: Introducing Product Lines in Small Embedded Systems. In: van der Linden, F.J. (ed.) PFE 2002. LNCS, vol. 2290, pp. 101–112. Springer, Heidelberg (2002)
29. Knauber, P., Muthig, D., Schmid, K., Wide, T.: Applying product line concepts in small and medium-sized companies. IEEE Software 17(5), 88–95 (2000)
30. Krueger, C.W.: New methods in software product line practice. Communications of the ACM 49(12), 37–40 (2006)
31. Frakes, W.B., Kang, K.: Software reuse research: Status and future. IEEE Transactions on Software Engineering 31(7), 529–536 (2005)
32. Simon, D., Eisenbarth, T.: Evolutionary introduction of software product lines. In: Chastek, G.J. (ed.) SPLC 2002. LNCS, vol. 2379, pp. 1611–3349. Springer, Heidelberg (2002)
33. Myllymäki, T., Koskimies, K., Mikkonen, T.: Structuring product-lines: A layered architectural style (2002)
34. Thiel, S.: On the Definition of a Framework for an Architecting Process Supporting Product Family Development. In: van der Linden, F.J. (ed.) PFE 2002. LNCS, vol. 2290, pp. 47–125. Springer, Heidelberg (2002)
35. Krueger, C.W.: Easing the transition to software mass customization. In: Proceedings of the Dagstuhl Seminar No. 01161: Product Family Development (2002)
36. Bayer, J., Flege, O., Knauber, P., Laqua, R., Muthig, D., Schmid, K., et al.: PuLSE: A methodology to develop software product lines. In: SSR 1999: Proceedings of the 1999 Symposium on Software Reusability, Los Angeles, California, United States, pp. 122–131 (1999)
37. van der Linden, F., Bosch, J., Kamsties, E., Känsälä, K., Obbink, H.: Software product family evaluation. LNCS, pp. 110–129. Springer, Heidelberg (2004)
38. Northrop, L.M.: Software product line adoption roadmap. Carnegie Mellon University, Software Engineering Institute, Pittsburgh (2004)
39. Kolb, R., Muthig, D., Patzke, T., Yamauchi, K.: A case study in refactoring a legacy component for reuse in a product line. In: Proceedings of the 21st IEEE International Conference on Software Maintenance, pp. 369–378 (2005)
40. Schmid, K., John, I., Kolb, R., Meier, G.: Introducing the PuLSE approach to an embedded system population at Testo AG. In: Proceedings of the 27th International Conference on Software Engineering, pp. 544–552 (2005)
41. Donohoe, P., Jones, L., Northrop, L.: Examining product line readiness: Experiences with the SEI product line technical probe. In: Proceedings of the 9th International Software Product Line Conference (2005)
42. Northrop, L.M., Clements, P.C.: A framework for software product line practice (2007), http://www.sei.cmu.edu/productlines/framework.html (retrieved 05/03, 2007)
43. Yin, R.K.: Case study research: Design and methods, 3rd edn. Sage Publications, Thousand Oaks (2003)
Refactoring the Documentation of Software Product Lines* Konstantin Romanovsky, Dmitry Koznov, and Leonid Minchin Saint-Petersburg State University, Universitetsky pr. 28, Peterhof, Saint-Petersburg, 198504 Russia {kromanovsky,dkoznov,len-min}@yandex.ru
Abstract. One of the most vital techniques in the context of software product line (SPL) evolution is refactoring: extracting and refining reusable assets and improving SPL architecture in such a way that the behavior of existing products remains unchanged. We extend the idea of SPL refactoring to technical documentation, because reuse techniques can be applied effectively in this area, and reusable assets evolve and need to be maintained. Various XML-based technologies for documentation development are widespread today, and XML specifications appear to be a good field for formal transformations. We base our research on the DocLine technology, the main goal of which is to introduce adaptive reuse into documentation development. We define a model of a refactoring-based documentation development process and a set of refactoring operations, and describe their implementation in the DocLine toolset. We also present an experiment in which we applied the proposed approach to the documentation of a telecommunication systems SPL.

Keywords: Software Product Line, Refactoring, Documentation.
* This research is partially supported by RFFI (grant 08-07-08066-з).

Z. Huzar et al. (Eds.): CEE-SET 2008, LNCS 4980, pp. 158–170, 2011. © IFIP International Federation for Information Processing 2011

1 Introduction

Technical documentation is an important part of commercial software. The development and maintenance of requirements and design specifications, user manuals, tutorials, etc. is a labor-intensive part of any software development process. Exactly like software, documentation can be volatile and multi-versioned, and it may also have a complex structure. Moreover, documentation is often developed in several natural languages and in different target formats, like HTML, PDF, and HTML Help. The documentation of a software product line (SPL) [1], which is a set of software applications sharing a common set of features, appears to be even more complicated than the documentation for stand-alone applications, since it contains multiple repetitions that should be explicitly managed to reduce documentation development effort.

In [2] we presented DocLine, a technology for developing SPL documentation which supports planned adaptive reuse. By providing an XML language for documentation development, DocLine allows a three-level representation of documentation, namely as diagrams of reuse structure, as an XML specification, and as generated target documents in PDF, HTML or another format. DocLine also provides process guidelines and is supported by an Eclipse-based toolset.

SPL development is a complicated evolutionary process: while new products are developed, existing ones need maintenance and enhancement. The refactoring of SPL architecture and common assets is a popular approach to improving SPLs [3, 4, 5, 6, 7]. The purpose of this paper is to extend the idea of SPL refactoring to documentation development. Indeed, XML-based approaches to documentation development (like DocBook [8] and DITA [9]) are becoming more and more popular, turning documentation into a kind of formal specification. If we extract and explicitly mark up reusable text fragments to be used in newly created documents, the target representation of existing documents should not change (though the XML specification would change). Therefore, we consider it reasonable to use the term refactoring for such transformations.

In this paper, we propose a refactoring-based documentation development process model and a set of refactoring operations. We also describe their implementation in the context of the DocLine toolset, and discuss an experiment in which we applied the proposed approach to the documentation of a telecommunication systems SPL.
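The central property stated above, that extracting a reusable fragment changes the source representation but not the generated documents, can be illustrated with a small sketch. This is a hypothetical Python model and not DRL or the DocLine toolset: the `Ref` class, the `render` and `extract_fragment` functions, and the sample documents are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Reference to a shared (reusable) text fragment."""
    name: str


def render(parts, fragments):
    """Produce the target text: expand references, keep literal text."""
    return " ".join(
        fragments[part.name] if isinstance(part, Ref) else part
        for part in parts
    )


def extract_fragment(docs, fragments, text, name):
    """Refactoring: replace each literal occurrence of `text` by a reference."""
    new_fragments = {**fragments, name: text}
    new_docs = {
        title: [Ref(name) if part == text else part for part in parts]
        for title, parts in docs.items()
    }
    return new_docs, new_fragments


# Two hypothetical documents that repeat the same sentence.
docs = {
    "Guide A": ["Install the unit.", "Connect the power cable.", "Press Start."],
    "Guide B": ["Connect the power cable.", "Configure the network."],
}

before = {title: render(parts, {}) for title, parts in docs.items()}
new_docs, fragments = extract_fragment(
    docs, {}, "Connect the power cable.", "connect_power"
)
after = {title: render(parts, fragments) for title, parts in new_docs.items()}

# The refactoring is behavior-preserving if the rendered output is identical.
print(before == after)
```

Comparing the rendered output before and after the transformation plays the role that regression testing plays for code refactoring: the source now contains a shared fragment, while the target documents stay the same.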
2 Background

2.1 Software Product Line Development

Back in 1976, Parnas noted that it was efficient to create whole product families [10] instead of creating stand-alone systems. At present, this idea is being actively developed. In [11], a software product line is defined as “a set of software-intensive systems that share a common, managed set of features satisfying the specific needs of a particular market segment or mission and that are developed from a common set of core assets in a prescribed way”.

Product line evolution is an important issue in product line development. At first, top-down methods of product line development were created, for example DSSA [12]. Top-down methods start development with an in-depth domain analysis, identify potential reuse areas, and develop common assets, and only afterwards move on to developing the product line members. Such methods require a lot of investment, but ensure flexible reuse, efficient maintenance and efficient creation of new products [13], thus bringing significant economic (and time-to-market) gain after the development of several products. Ultimately, new product development can be done just by selecting and configuring common assets [14]. These methods, however, involve serious risks, because, should the number of developed products remain small, the investment will not pay off.

Light-weight bottom-up methods were designed to mitigate these risks. They suggest starting by developing a single product and moving to developing a product line only when the product's prospects become clear [15]. To facilitate this move, common assets are extracted from “donors” (stand-alone products) and form the basis of further development. This approach significantly reduces cost and time to market for the first product, but brings in less profit as the number of products increases.
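The trade-off between the two families of methods can be made concrete with a deliberately simplified linear cost model. The model and all numbers below are hypothetical and are not taken from the cited literature: it assumes a fixed upfront investment plus a constant cost per product, which is only a rough sketch of the economics described above.

```python
def cumulative_cost(upfront, per_product, n_products):
    """Total cost of building n_products under a simple linear model."""
    return upfront + per_product * n_products


def break_even(upfront_a, per_a, upfront_b, per_b, max_products=1000):
    """Smallest number of products at which strategy A costs no more than B.

    Returns None if A never catches up within max_products.
    """
    for n in range(1, max_products + 1):
        cost_a = cumulative_cost(upfront_a, per_a, n)
        cost_b = cumulative_cost(upfront_b, per_b, n)
        if cost_a <= cost_b:
            return n
    return None


# Hypothetical cost units: top-down invests heavily before the first product
# but makes each product cheap; bottom-up has no upfront investment but a
# higher cost per product.
TOP_DOWN = {"upfront": 100, "per_product": 20}
BOTTOM_UP = {"upfront": 0, "per_product": 50}

n = break_even(
    TOP_DOWN["upfront"], TOP_DOWN["per_product"],
    BOTTOM_UP["upfront"], BOTTOM_UP["per_product"],
)
print(n)
```

With these illustrative figures the top-down investment pays off from the fourth product onwards; if fewer products are ever built, the bottom-up approach remains cheaper, which is the risk that the light-weight methods are meant to mitigate.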
The proponents of both approaches agree that the need for change in SPL architecture or common assets arises regularly during product line evolution, but this change should not affect the behavior of products. Moreover, in bottom-up approaches such changes are the foundation of the development process.

2.2 Refactoring

Refactoring is the process of changing a software system in such a way that it does not alter the external behavior of the code, yet improves its internal structure [16]. It became popular in agile software development methods because it provides an alternative to expensive preliminary design by allowing for constant improvement of software architecture while preserving the behavior of software. Refactoring helps to fulfill particular tasks, like code structure and code representation refinement (dead code elimination, simplification of conditional expressions, method extraction, etc.) and OO-hierarchy refinement (moving a field between classes, class extraction, pulling a field up or down the hierarchy, etc.). There are also so-called “big refactorings”, for example, the transition from procedural to object-oriented design.

Refactoring can be done manually, but ensuring that the behavior of a system remains unchanged is not an easy task. Trivial as it may seem, renaming a method involves finding and modifying all of its calls (including calls via objects of all derived classes) and could affect the entire application source code. There are tools that facilitate refactoring by automating typical refactoring operations while ensuring the correctness of source code transformations (correctness here means that the code remains compilable and its external behavior remains the same). Moreover, some toolsets support manual refactoring, including big refactoring, by providing means for automated regression testing (including, but not limited to, unit testing).

2.3 Refactoring in Product Line Development

A lot of research focuses on SPL refactoring.
In [4], a feature model refactoring method is proposed as a way to improve the set of all possible product line configurations, that is, to maximize the potential for creating new products. In [3], the authors introduce a method of decomposing an application into a set of features for use in product line development. In [6], a set of metrics and a tool for refactoring SPL architecture are offered. All the methods above aim at better product line variability by extracting common assets and improving their configurability.

2.4 XML-Based Approaches to Technical Documentation Development

Many documentation development approaches employ the concept of separating content and formatting, which means that meaningful document constructions (for example parts, chapters, sections, tables) are separated from their formatting (for example, information about the font and size of a title is not part of the content; therefore it is defined separately). This idea was introduced well before XML and was implemented, for example, in the TeX typesetting system [17] by Donald Knuth. Modern XML-based approaches to documentation development make use of this idea and also support single sourcing, that is, developing several documents on the basis of a single source representation [18]. The appropriateness of using such technologies is widely discussed by technical writers [19]. Research data in this area show that the advantages of using XML technology in middle-size and large companies justify the cost of its adoption (see, for instance, [20]). In practice, XML-based technologies such as DocBook [8] and DITA [9] are actively adopted by the industry in dozens of large companies and projects, including IBM, PostgreSQL, Adobe, Sun, Cray Inc. [21], Unix system distributions, and window environments (GNOME, KDE) [22].
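Content and formatting separation together with single sourcing can be sketched in a few lines. The example below is an illustration only and does not use DocBook or DITA: the tuple-based content model and the two renderers `to_html` and `to_plain` are assumptions made here, with only `html.escape` coming from the Python standard library.

```python
from html import escape

# Single source: structure and text only, no formatting decisions.
DOCUMENT = [
    ("title", "Installation"),
    ("para", "Connect the power cable."),
    ("para", "Press <Start>."),
]


def to_html(doc):
    """Render the source as HTML; formatting is chosen here, not in the source."""
    tags = {"title": "h1", "para": "p"}
    return "\n".join(
        f"<{tags[kind]}>{escape(text)}</{tags[kind]}>" for kind, text in doc
    )


def to_plain(doc):
    """Render the same source as plain text with an underlined title."""
    lines = []
    for kind, text in doc:
        if kind == "title":
            lines += [text.upper(), "=" * len(text)]
        else:
            lines.append(text)
    return "\n".join(lines)


print(to_html(DOCUMENT))
print(to_plain(DOCUMENT))
```

Changing how titles look requires editing only a renderer, and adding a third target format requires only a third renderer; the source stays untouched in both cases.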
3 The DocLine Approach
3.1 Basic Ideas
The DocLine approach [2] was designed for developing and maintaining SPL technical documentation. One distinct characteristic of such documentation is that there are a lot of text reuse opportunities, both within the documentation of a single product and among similar documents for different products. DocLine introduces planned adaptive reuse of documentation fragments. Reuse is planned with the help of a visual modeling technique which allows creating, navigating and modifying a scheme of reusable fragments (common assets) and their relations. Adaptive reuse means that common assets can be configured independently for each usage context. DocLine features an XML-based language, DRL (Documentation Reuse Language), intended for designing and implementing reusable documentation. It also offers a model of the documentation development process and a toolset integrated into the Eclipse IDE. DocLine was presented in detail in [2]; therefore we focus here on the basic features of DRL which are critical to explaining the refactoring operations we propose. We also describe the process model for developing product line documentation provided by DocLine. To implement text markup we use the well-known DocBook [8] approach, which is a de facto standard in the Linux/Unix world. In fact, DRL extends DocBook with an adaptive reuse mechanism. DRL specifications are first translated by the DocLine toolset into plain DocBook format; then, the DocBook utilities are used to produce target documents in a variety of formats (PDF, HTML, etc.).
3.2 DRL Overview
The most important kind of common asset in DRL is an information element, which is defined as a context-independent reusable text fragment. In order to put a set of information elements together, DocLine provides what is called an information product, which is a template of a real document, like a user guide or a reference manual.
Every information element can be included in any other information element or information product. These inclusions can be optional or mutually dependent, and all such variation points must be resolved to derive a specific document from an information product.
DRL provides two mechanisms of adaptive reuse: customizable information elements and multi-view item catalogs.
Customizable information elements. Let us look at the documentation of a product line of phones with a CallerID function. It could have an information element named “Receiving incoming calls” containing text like this:
Once you receive an incoming call, the phone gets CallerID information and displays it on the screen. (1)
A phone may have additional ways of indicating a caller, e.g. a phone for the visually impaired could have a voice announcement instead of a visual presentation, and thus example (1) would look as follows:
Once you receive an incoming call, the phone gets CallerID information and reads it out. (2)
To facilitate the transformation of the sentence in (1) into the sentence in (2) using an adaptive reuse technique, the corresponding information element must be written in the following way (in the syntax of DRL):
Once you receive an incoming call, the phone gets CallerID information and displays it on the screen. (3)
In this example, we define an information element and an extension point inside it. When this information element is included in a particular context, any extension point can be removed, replaced or appended with custom content without having to modify the information element itself. If no customization is defined, the information element in (3) will produce the text seen in (1). The following customization transforms it into the text in (2):
and reads it out (4)
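The effect of such a customization can be mimicked in a few lines of Python. This is only an illustrative model, not DRL itself: the class names `InfoElement` and `Nest`, the function `render`, and the identifiers `incoming_calls` and `display_options` are assumptions introduced for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Nest:
    """Hypothetical stand-in for an extension point with default content."""
    nest_id: str
    default: str


@dataclass
class InfoElement:
    """Hypothetical stand-in for an information element.

    `parts` mixes plain strings with Nest objects.
    """
    elem_id: str
    parts: list


def render(element, replace=None, append=None, remove=()):
    """Produce the text of an element for one usage context.

    replace: nest id -> new content used instead of the default
    append:  nest id -> content added after the nest content
    remove:  nest ids whose content is dropped entirely
    The element itself is never modified.
    """
    replace = replace or {}
    append = append or {}
    out = []
    for part in element.parts:
        if isinstance(part, str):
            out.append(part)
        elif part.nest_id not in remove:
            out.append(replace.get(part.nest_id, part.default))
            out.append(append.get(part.nest_id, ""))
    return "".join(out)


incoming_calls = InfoElement(
    "incoming_calls",
    [
        "Once you receive an incoming call, the phone gets CallerID information ",
        Nest("display_options", "and displays it on the screen"),
        ".",
    ],
)

# No customization: the default content of the extension point is used.
text_1 = render(incoming_calls)
# Customization for the phone with a voice announcement.
text_2 = render(incoming_calls, replace={"display_options": "and reads it out"})
```

Under these assumptions `text_1` reproduces sentence (1) and `text_2` reproduces sentence (2), while the shared element stays untouched.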
The example (4) shows a reference to the information element defined in example (3) and the replacement of the extension point defined in this information element by new content.
Multi-view item catalogs. In the documentation of most software products one can find descriptions of typical items of the same kind, for example, GUI commands. In different documents and contexts they are accompanied by different sets of details. In toolbar documentation, for example, commands are described by an icon with a name and a description. In menu documentation you will see the name of a command, its description and its accelerator key sequence; online help will also contain a relevant
tooltip text that shows up when a user moves the mouse cursor over the command button. All these fragments share the attributes of the corresponding items. To allow the reuse of such attributes, DocLine introduces the concept of a catalog. A catalog contains a collection of items, each represented by a set of attributes, e.g. the GUI commands catalog may include a collection of GUI commands, each of them having a name, an icon, a description, an accelerator, a tooltip text, a list of side effects, usage rules, etc. Here is an example of such a collection of GUI commands for some software product represented as a catalog in DRL:
<entry name="Print"> Print Print.bmp This command …
<entry name="Save"> Save Save.bmp This command …
In addition to a collection of items, a catalog contains a set of representation templates that define how to combine item attributes to get a particular item representation. The template in the example below represents a short notation of GUI commands including only the icon and the name:
Images/
A representation template contains some text and references to item attributes. When a technical writer includes a catalog item in a particular context, he or she must indicate the corresponding representation template and the item identifier. Then, the content of the template is inserted into the target context and all the references to attributes are replaced by the corresponding attribute values.
3.3 Documentation Development Process Model
As we discussed above, there are two major models of product line evolution: top-down and bottom-up. Consequently, there are two models of documentation development. DocLine supports both models, but focuses mainly on the latter since it is more often used in practice. The scheme of the bottom-up model is shown in Fig. 1. In the bottom-up model, a technical writer starts from the documentation of a particular product and does not pay attention to reuse techniques.
Priority is given to achieving a specific business goal (e.g., a good documentation package for a concrete product), while further perspectives might not be clear. However, once a need for new
products documentation arises, a technical writer can benefit from a reuse technique by analyzing reuse options, extracting common assets, and proceeding with the development of documentation on the basis of the common assets. This is an appropriate moment to adopt the DocLine technique. It is a straightforward task if the documentation has been developed by means of DocBook, or it can be easily converted to DocBook. In other cases the existing documentation should be manually ported to DocBook and then marked up with DRL constructions.
[Figure labels: First product documentation development; First product documentation; Creation of common assets; Common assets; Second product documentation development; Second product documentation; Next products documentation development; Products documentation]
Fig. 1. Bottom-up documentation development process
4 Refactoring of Product Lines Documentation
4.1 Refactoring Process
Let us look at the bottom-up process model for product line documentation development in greater detail from the refactoring perspective:
1. DocLine is adopted when new product documentation is to be created.
2. To use DocLine, a technical writer analyzes the functionality of the new product and finds similarities, additions and modifications compared to the functionality of existing products.
3. Then, the technical writer finds all fragments in the existing documentation that should be preserved, modified, added or deleted.
4. Then, some formal transformations are applied to the documentation sources in order to turn them into a correct DRL specification (if the documentation was developed using plain DocBook; otherwise it must be ported to DocBook before this step) and to explicitly mark common assets.
Existing and newly created pieces of text which are not reused (or are completely identical) should be converted into a single large information element. The DRL specification is decomposed in depth if the information element varies from one product to another,
or if there are other compelling reasons for decomposition, e.g., simultaneous documentation editing by several technical writers. The same model remains relevant when a new product is added to a product line whose documentation has already been developed using DocLine. In this case, some common assets can be reused “as is” in the new product documentation, but typically there is a need to refine existing assets and to create new common assets.
4.2 Refactoring Operations
Let us discuss how exactly refactoring ideas are realized in text transformations. The first group consists of text transformations aimed at extracting common assets.
1. Extracting information element. A fragment of text is extracted to a stand-alone information element. Then, it is replaced by a reference to the newly created information element. If the extracted fragment contains extension points, new adapters are created for all final information products to ensure that all manipulations with extension points are preserved.
2. Splitting information element. An information element is split into two information elements. All references and adapters are updated.
3. Importing of DocBook documentation. The entire DocBook documentation for a product is imported into the DocLine toolset. All the necessary elements are created: the entire documentation is placed into an information element that is included in a newly created information product. Also, a final information product is created. As a result, we obtain the same target documentation as the documentation derived from the initial DocBook documentation.
4. Extracting information product. A selected document (or a fragment) is extracted to an information product. It is followed by the creation of a final information product that uses the newly created element to produce an exact copy of the original document. This operation is normally used when plain documentation in DocBook is imported into the DocLine toolset and a technical writer needs to extract several information products from it.
The following operations are designed to facilitate core assets tuning (extending their configurability).
5. Converting to extension point. A text fragment inside an information element is surrounded by an extension point.
6. Extracting to insert after / insert before. A trailing / heading text fragment inside an extension point is extracted and inserted into all existing references to the original information element as an Insert After / Insert Before construction. If the text is outside any extension point, but inside an information element, a new empty extension point is created before / after the fragment.
7. Making a reference to an information element optional. A particular mandatory reference to an information element (infelemref) is re-declared as optional. In all existing final information products containing the above-mentioned information element, an adapter is created (or updated) to force the inclusion of the optional information element.
8. Converting to conditional. A selected text fragment is marked as conditional text (the condition shall be provided by the technical writer). In all existing final information products containing the fragment, the condition is set to true to force the inclusion of the conditional text.
9. Switching default behavior on optional references to information element. Depending on a technical writer’s preferences, optional references may be treated as included by default or excluded by default. When the default behavior is switched, all adapters must be updated to ensure that the target documents stay unchanged.
The following operations are designed to facilitate the use of small-grained reuse constructions: dictionaries and multi-view item catalogs (directories).
10. Extracting to dictionary. A selected text fragment (typically a single word or a word combination) is extracted to a dictionary and its occurrence is replaced by a reference to the newly created dictionary item. Then, the documentation is scanned for other occurrences of the same item. The occurrences found can be replaced by references to the dictionary item.
11. Extracting to multi-view item catalog (directory). A new multi-view item catalog item is created based on a selected text fragment. Then, the fragment is replaced by a reference to the newly created catalog item with a selected (or newly created) item representation template.
12. Copying/moving dictionary/directory item to/from product documentation. A selected dictionary (or directory) item is copied (or moved) from the common assets to the context of particular product(s) documentation and vice versa. Thus the scope of the item is extended or narrowed.
The following operations facilitate renaming various structural elements of documentation.
13. Renaming. The following documentation items can be renamed: an information element, an information product, a dictionary, a directory, a dictionary item, a directory item, an extension point, a reference to an information element, a dictionary representation template. All references to renamed elements are updated.
4.3 Refactoring Operation Example
Let us consider an example of applying the refactoring operation of extracting information element. The purpose of this example is to show that refactoring operations require non-local source text modifications in order to keep target documents unchanged. Here is a fragment of a documentation source in DRL (note that all these constructions may be stored in physically different files):
You can connect your phone to
You can dial numbers using
urban telephone network or office exchange.
built-in numeric key-pad or phone memory.
This fragment contains an information product phone_manual that is the template of the user manual for a phone set. It contains a reference to the information element basic_functions describing basic phone functionality. Also, there is a specialization of the user manual: a final information product office_phone, which uses the phone_manual information product to produce a manual for an office phone. If we run the operation of extracting information element for an extension point (nest) dial_options, we get the following changes. The information element basic_functions will look as follows (changed text is typed in boldface):
You can connect your phone to
A new information element is created out of the extracted text fragment:
You can dial numbers using
Finally, let us look at the changes in the final information product office_phone (changed text is typed in boldface):
urban telephone network or office exchange.
built-in numeric key-pad or phone memory.
As you can see, the manipulations with the extracted extension point were moved from the existing adapter to a newly created one that defines adaptations of the new information element.
4.4 The Toolset
Most of the proposed refactoring operations are implemented as part of the DocLine toolset. The DocLine toolset is designed as a set of plug-ins for the Eclipse IDE [2]. The refactoring tool is embedded into the DRL text editor. In addition to the operations, the refactoring tool provides a framework for implementing new refactoring operations, and a support library that performs tasks typical of most refactoring operations, like DRL parsing (supporting multi-file documentation structure) and DRL generation.
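The extraction example of Section 4.3 can be replayed on a toy model to show why the operation is non-local. The sketch below is not the DocLine implementation: the data layout, the function names `render` and `extract_element`, the nest identifier `network_type` and the new element name `dial_info` are assumptions made for illustration, and adapters are keyed by element identifier on the simplifying assumption that each element is referenced only once.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nest:
    """Hypothetical extension point; its content comes from an adapter."""
    nest_id: str


@dataclass(frozen=True)
class Ref:
    """Hypothetical reference to another information element."""
    target: str


def render(elements, adapters, elem_id):
    """Expand an element, filling nests from the adapter for that element."""
    fills = adapters.get(elem_id, {})
    out = []
    for part in elements[elem_id]:
        if isinstance(part, Nest):
            out.append(fills.get(part.nest_id, ""))
        elif isinstance(part, Ref):
            out.append(render(elements, adapters, part.target))
        else:
            out.append(part)
    return "".join(out)


def extract_element(elements, adapters, source_id, start, end, new_id):
    """Move parts[start:end] of source_id into a new element.

    The moved fragment is replaced by a reference, and adapter entries for
    nests inside the fragment are relocated to an adapter for the new element.
    """
    parts = elements[source_id]
    moved = parts[start:end]
    new_elements = {
        **elements,
        source_id: parts[:start] + [Ref(new_id)] + parts[end:],
        new_id: moved,
    }
    moved_nests = {p.nest_id for p in moved if isinstance(p, Nest)}
    old_fills = adapters.get(source_id, {})
    new_adapters = {key: dict(value) for key, value in adapters.items()}
    new_adapters[source_id] = {
        n: t for n, t in old_fills.items() if n not in moved_nests
    }
    relocated = {n: t for n, t in old_fills.items() if n in moved_nests}
    if relocated:
        new_adapters[new_id] = relocated
    return new_elements, new_adapters


elements = {
    "basic_functions": [
        "You can connect your phone to ", Nest("network_type"), " ",
        "You can dial numbers using ", Nest("dial_options"),
    ],
}
# Adapters of the final information product office_phone.
office_phone = {
    "basic_functions": {
        "network_type": "urban telephone network or office exchange.",
        "dial_options": "built-in numeric key-pad or phone memory.",
    },
}

before = render(elements, office_phone, "basic_functions")
new_elements, new_office_phone = extract_element(
    elements, office_phone, "basic_functions", 3, 5, "dial_info"
)
after = render(new_elements, new_office_phone, "basic_functions")
```

In this model `before` and `after` are the same string, but only because the entry for `dial_options` has been moved to a new adapter for `dial_info`; editing `basic_functions` alone would silently drop the customized text.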
5 The Experiment
The approach to refactoring of SPL documentation presented in this paper was applied to the documentation of a telecommunication systems product line. This product line includes phone exchanges for various purposes: private branch exchanges, inter-city gateways, transit exchanges, etc. For our experiment we selected two product line members: an exchange for the public switched telephone network (hereinafter called PSTNX) and a special-purpose exchange (hereinafter called SPX). We decided to port the user manuals of these products to DocLine. During the analysis, we found that historically SPX was developed as a version of PSTNX with reduced functionality. In the course of evolution, some functions of SPX were changed, so its user manual was updated accordingly. First, we converted the documentation to DocBook (in our experiment this was done manually, although in some cases it could have been automated). Then, we started to introduce DRL markup. We discovered a series of common terms and word combinations in the documentation and created dictionaries and directories to guarantee their unified use across the documents. After that, we “mined” some text fragments which were similar in both documents and wrapped them with information element constructions to make them available for reuse in various contexts. Then, we “fine-tuned” these information elements to prepare them for use in both documents. We used the following refactoring operations in our experiment: importing of DocBook documentation, extracting information element, converting to extension point, extracting to insert after / insert before, making reference to information element optional, extracting to dictionary, extracting to directory. These operations helped us to build an efficient internal structure of the documentation and enable the reuse of text fragments across the two documents while keeping the target documents unchanged.
One of our findings is that joining two products to form a family differs significantly from deriving a new product from an existing one to form a family. Let us consider the extracting information element operation. How do we find what to extract? Likely candidates are similar but not identical text fragments, but finding them in two documents with a total of 300 pages proved to be a very difficult task. This suggests that there is a need for a specific tool which a technical writer could use alongside the refactoring tool to facilitate finding potential common assets.
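As a rough illustration of what such a helper tool might do, the sketch below ranks pairs of fragments from two documents by textual similarity using Python's standard `difflib` module. It is not part of DocLine: the function name `near_duplicates`, the threshold of 0.7 and the sample sentences are assumptions, and the pairwise comparison is quadratic, so a real tool for manuals of this size would need a more scalable technique.

```python
from difflib import SequenceMatcher
from itertools import product


def near_duplicates(doc_a, doc_b, min_ratio=0.7):
    """Return (ratio, fragment_a, fragment_b) for similar, non-identical pairs.

    doc_a and doc_b are lists of text fragments (for example paragraphs).
    Identical pairs are skipped because they need no extension points.
    Results are sorted with the most similar pair first.
    """
    found = []
    for a, b in product(doc_a, doc_b):
        if a == b:
            continue
        ratio = SequenceMatcher(None, a, b).ratio()
        if ratio >= min_ratio:
            found.append((ratio, a, b))
    return sorted(found, reverse=True)


# Hypothetical fragments standing in for two user manuals.
manual_a = [
    "Once you receive an incoming call, the phone gets CallerID information "
    "and displays it on the screen.",
    "Press the red button to end the call.",
]
manual_b = [
    "Once you receive an incoming call, the phone gets CallerID information "
    "and reads it out.",
    "Charge the battery for eight hours before first use.",
]

candidates = near_duplicates(manual_a, manual_b)
```

Each reported pair is a candidate for one information element, with the differing part turned into an extension point; deciding whether the extraction is worthwhile remains a task for the technical writer.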
6 Conclusions and Further Work
The product line documentation refactoring approach proposed in this paper is designed to facilitate moving from monolithic documentation for one or several products to reusable documentation of a software product line with explicitly defined common assets. It can also be used for developing documentation for newly created product line members. In our further research we plan to enable intelligent selection of candidates for refactoring: fragments to be extracted as information elements, frequently used words to be extracted to dictionaries, and common word combinations to form multi-view item catalogs. A promising approach to finding candidates for information element extraction is source code clone detection, since it could be enhanced so as to identify “polymorphic” clones. Techniques for product line variability management are also of interest to us because they could provide a technical writer with information on product variability that is more or less reflected in the documentation structure (e.g. common features in a product variability model correspond to information elements in the documentation). We also plan to introduce means for “big refactorings” (major changes of documentation composed of series of automated and manual transformations). Full automation of such operations is impossible, but we could offer some useful services, for example, automated checking of target documentation consistency. Another area for further research is the pragmatics of refactoring, and we would like to propose some ideas on how to guide documentation refactoring. For program code refactoring, there are coding conventions, rules for building an OO hierarchy, off-the-shelf recommendations on using various refactoring operations, etc. We plan to develop a set of similar recommendations for our case.
One more question of pragmatics is how to keep a balance between the configurability of documentation and the complexity of its structure. Finally, we intend to conduct a larger experiment whose main goal will be to test the scalability of the proposed approach.
References
1. Northrop, L., Clements, P.: A Framework for Software Product Line Practice, Version 5.0 (2008), http://www.sei.cmu.edu/productlines/framework.html
2. Koznov, D., Romanovsky, K.: DocLine: a Method for Software Product Line Documentation Development. In: Ivannikov, V.P. (ed.) Programming and Computer Software, vol. 34(4) (2008)
3. Trujillo, S., Batory, D., Díaz, O.: Feature Refactoring a Multi-Representation Program into a Product Line. In: Proc. of the 5th Int. Conf. on Generative Programming and Component Engineering (2006)
4. Calheiros, F., Borba, P., Soares, S., Nepomuceno, V., Vander A.: Product Line Variability Refactoring Tool. 1st Workshop on Refactoring Tools, Berlin (2007)
5. Liu, J., Batory, D., Lengauer, C.: Feature oriented refactoring of legacy applications. In: Proceedings of the 28th International Conference on Software Engineering, pp. 112–121. ACM Press, New York (2006)
6. Critchlow, M., Dodd, K., Chou, J., van der Hoek, A.: Refactoring product line architectures. In: IWR: Achievements, Challenges, and Effects, pp. 23–26 (2003)
7. Alves, V., Gheyi, R., Massoni, T., Kulesza, U., Borba, P., Lucena, C.: Refactoring Product Lines. In: Proceedings of the 5th International Conference on Generative Programming and Component Engineering, Portland, Oregon, USA, pp. 201–210 (2006)
8. Walsh, N., Muellner, L.: DocBook: The Definitive Guide. O’Reilly, Sebastopol (1999)
9. Day, D., Priestley, M., Schell, D.A.: Introduction to the Darwin Information Typing Architecture – Toward portable technical information, http://www106.ibm.com/developerworks/xml/library/x-dita1/
10. Parnas, D.: On the Design and Development of Program Families. IEEE Transactions on Software Engineering, 1–9 (March 1976)
11. Clements, P., Northrop, L.: Software Product Lines: Practices and Patterns. Addison-Wesley, Boston (2002)
12. Tracz, W.: Collected Overview Reports from the DSSA Project, Technical Report, Loral Federal Systems – Owego (1994)
13. Clements, P.: Being Proactive Pays Off. IEEE Software, 28–31 (July/August 2002)
14. Krueger, C.: New Methods in Software Product Line Practice. Communications of The ACM 49(12), 37–40 (2006)
15. Krueger, C.: Eliminating the Adoption Barrier. IEEE Software, 29–31 (July/August 2002)
16. Fowler, M., et al.: Refactoring: Improving the Design of Existing Code. Addison-Wesley, Reading (1999)
17. TeX user group, http://www.tug.org
18. Rockley, A., Kostur, P., Manning, S.: Managing Enterprise Content: A Unified Content Strategy. New Riders, Indianapolis (2002)
19. Clark, D.: Rhetoric of Present Single-Sourcing Methodologies. In: SIGDOC 2002, Toronto, Ontario, Canada (2002)
20. Albing, B.: Combining Human-Authored and Machine-Generated Software Product Documentation. In: Professional Communication Conference, pp. 6–11. IEEE Press, Los Alamitos (2003)
21. Companies using DITA, http://dita.xml.org/deployments
22. Companies using DocBook, http://wiki.docbook.org/topic/WhoUsesDocBook
Code Generation for a Bi-dimensional Composition Mechanism
Jacky Estublier (1), Anca Daniela Ionita (2), and Tam Nguyen (1)
(1) LIG-IMAG, 220, rue de la Chimie, BP 53, 38041 Grenoble Cedex 9, France, {Jacky.Estublier,Tam.Nguyen}@imag.fr
(2) Automatic Control and Computers Faculty, Univ. "Politehnica" of Bucharest, Spl. Independentei 313, 060042, Bucharest, Romania, [email protected]
Abstract. Composition mechanisms are intended to build a target system out of many independent units. The paper presents how aspect technology may leverage hierarchical composition by supporting two orthogonal mechanisms (vertical and horizontal) for composing completely autonomous parts. The vertical mechanism is in charge of coordinating heterogeneous components, tools or services at a high level of abstraction, hiding the technical details. The result of such a composition is called a “domain” and, in turn, it represents a high-granularity unit of reuse. The horizontal mechanism composes domains at the level of their abstract concepts, even if they have been independently designed and implemented. The paper discusses the formalization of the vertical and horizontal compositions, and the wizard we have developed for generating the needed code (using Aspect Oriented Programming) in order to build the modeled applications.
Keywords: Model Driven Engineering, code generation, AOP, model composition, Domain Engineering.
Z. Huzar et al. (Eds.): CEE-SET 2008, LNCS 4980, pp. 171–185, 2011. © IFIP International Federation for Information Processing 2011
1 Introduction
Creating software based on already available components is an obvious way to speed up the development process and to increase productivity. The “classical” composition approach, often referred to as CBSE (Component Based Software Engineering), deals with components especially designed to be composed and with a hidden internal structure. This approach works well under constraints related to context dependency and component homogeneity. These constraints involve a rigid composition mechanism, since components know each other, must have compatible interfaces and must comply with the constraints of the same component model, which reduces the likelihood of reuse and prevents obtaining a large variety of assemblies. The paper presents an alternative composition approach, which still sticks to the encapsulation principle (parts have a hidden internal structure) and reuses components without any change, but which relaxes the composition constraints found in CBSE. The aims of this approach can be summarized as follows:
A. The components or, generally speaking, the parts ignore each other and may have been designed and developed independently, i.e. they do not call each other;
B. Composed parts may be of any nature (ad hoc, legacy, commercial, COTS, local or distant);
C. Parts are heterogeneous, i.e. they do not need to follow a particular model (component model, service etc.);
D. Parts have to be reused without performing any change on their code.
To solve the heterogeneity issue (aims B and C above), one can imagine that the part to be composed is wrapped, directly or indirectly, into a “composable element” [1]. For composing parts that ignore each other and have been designed independently (aim A), there is a need to define a composition mechanism that is not based on the traditional method call. The publish-subscribe mechanism [2] is an interesting candidate, since the component that sends events ignores who (if anyone) is interested in them, but the receiver knows and must declare what it is interested in. If other events, on other topics, are sent, the receiver code has to be changed. Moreover, the approach works fine only if the sender is an active component. Aspect Oriented Software Development (AOSD) [3][4] satisfies some of the requirements above, since the sender (the main program) ignores and does not call the receiver (the aspects). Unfortunately, the aspect knows the internals of the main program, which defeats the encapsulation principle [5], and aspects are defined at a low level of abstraction (the code) [6][7]. The bi-dimensional composition mechanism presented here is intended to be a solution for such situations. The idea is that the elements to be composed are not traditional components, but much larger elements, called domains, which do not expose simple interfaces, but (domain) models (described in chapter 2). Composition is not performed by calling component interfaces, but by composing such (domain) models.
Model composition allows the definition of variability points, which make the mechanism more flexible than component composition [8]. In contrast with a method call, model composition does not require the models to stick to common interfaces or to know each other; it may even compose independent concepts. One of our main goals is also to reuse code without changing it (aim D) because:
- we have to compose tools for which we do not have access to the internal code, but only to an API; this is what we call vertical composition, and it results in a so-called domain (see chapter 2.1 about the concepts and chapter 3 about the code generation);
- we want to reuse domains (which may be quite large) without changing them, because any change would require new tests and validations. For this purpose, we apply the horizontal composition (described in chapters 2.2 and 4).
So, our method is non-invasive, using an implementation based on AOP; the composed domains and their models are totally unchanged and the new code is isolated with the help of aspects. However, since the AOP technique is at code level, performing domain composition has proved to be very difficult in practice; the conceptual complexity is increased, due to the necessity to deal with many technical details. The solution would be to specify composition at a high, conceptual level and to be able to generate the code based on aspects. The elevation of crosscutting modeling concerns to first-class constructs has been done in [9], by generating weavers from domain specific descriptions, using ECL, an
extension of OCL (Object Constraint Language). Another weaver constructed with domain modeling concepts is presented in [10], while [11] discusses mappings from a design-level language, Theme/UML, to an implementation-level language, AspectJ. To manage the complexity in a user-friendly manner, we propose a conceptual framework such that the AOP code is generated and the user defines the composition at a conceptual level, using wizards for selecting among pre-defined properties, instead of writing a specification in a textual language. Mélusine is the engineering environment that assists designers and programmers in developing such autonomous domains, composing them and creating applications based on them [1]. Recently, Mélusine has been leveraged by a new tool, which supports domain composition by generating Java and AspectJ code; it guides the domain expert in performing the composition at the conceptual level, as opposed to the programming level. Chapters 3 and 4 describe the metamodels that allow code generation for vertical and horizontal composition. Chapter 5 compares the approach with other related works and evaluates its usefulness when compared with the domain compositions we performed before the code generation facility became available.
2 A Bi-dimensional Composition Technique
A possible answer to the requirements presented above is to create units of reuse that are autonomous (eliminating dependencies on the context of use) and composable at an abstract level (eliminating dependencies on the implementation techniques and details). The solution presented here combines two techniques (see Fig. 1):
Fig. 1. Bi-dimensional composition mechanism
- Building autonomous domains using vertical composition, which is a coordination of heterogeneous and “low level” components or tools, in order to provide a homogeneous and autonomous functional unit, called a domain;
- Abstract composition of domains using horizontal composition, performed between the abstract concepts of independent domains, without modifying their code.
2.1 Developing Autonomous Domains: Vertical Composition
Developing a domain can be performed following a top-down or a bottom-up approach. From a top-down perspective, the required functionalities of the domain can be specified through a model, irrespective of its underlying technology; then, one identifies the software artifacts (available or not) that will be used to implement the expected functionality and makes them interoperate. From a bottom-up perspective, the designer already knows the software artifacts that may be used for the implementation and will have to interoperate; therefore, the designer has to identify the abstract concepts shared by these software artifacts and how they are supposed to be consistently managed; then, one defines how to coordinate the software artifacts, based on the behavior of the shared concepts. In both cases, the composition is called vertical, because the real software components, services or tools are driven based on a high level model of the application. The model elements are instances of the shared concepts, which are abstractions of the actual software artifacts. The synchronization between these software artifacts and the model means that the evolution of the model is transformed into actions performed by the software artifacts. The set of the shared concepts and their consistency constraints constitute a domain model, to which the application model must conform. In the Model Driven Engineering (MDE) vocabulary, the domain model is the metamodel of all the application models for that domain [6].
For instance, one of our domains, which has been intensely reused, is the Product domain, which will also be presented in the case study of this paper. It was developed as a basic versioning system for various products, characterised by a map of attributes, according to their type; the versions are stored in a tree, consisting of branches and revisions. The domain model of the Product domain contains the following concepts: Product, Branch, Revision, Attribute, ProductType, ProductAttribute, AttributeType (see [8] for a detailed presentation). The application models are interpreted by a virtual machine built according to the domain model, which then orchestrates the lower level services or tools (see Fig. 1). The domain interpreter is realized by Java classes that reify the shared concepts (the domain model) and whose methods implement the behavior of these concepts. In many cases, these methods are empty because most, if not all, of the behavior is actually delegated to other software artifacts, with the help of aspect technology. Thus, the domain interpreter, also called the domain virtual machine, separates the abstract, conceptual part from the implementation, creating an architecture with three layers [6] (see Fig. 1). The domains may be executed autonomously; they have no dependencies and they may be easily used for developing applications (details in Section 3). For the example of the Product domain, one of our application models is dedicated to the J2EE architecture, versioning typed software artifacts. A Servlet from this application model conforms to the concept ProductType from the Product domain model.
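The version tree of the Product domain can be pictured with a few plain Java classes. This is only a minimal sketch under stated assumptions: the class names ProductSketch, BranchSketch and RevisionSketch, their fields, and the commit and fork methods are hypothetical and are not taken from the actual domain implementation described in [8].

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the Product domain concepts; not the authors' code.
class RevisionSketch {
    final int number;                       // position of the revision in its branch
    final Map<String, String> attributes;   // attribute name -> value snapshot
    final List<BranchSketch> forks = new ArrayList<BranchSketch>();

    RevisionSketch(int number, Map<String, String> attributes) {
        this.number = number;
        // Copy the map so later changes by the caller do not alter this revision.
        this.attributes = new LinkedHashMap<String, String>(attributes);
    }
}

class BranchSketch {
    final String name;
    final List<RevisionSketch> revisions = new ArrayList<RevisionSketch>();

    BranchSketch(String name) {
        this.name = name;
    }

    // Appends a new revision that stores a snapshot of the given attributes.
    RevisionSketch commit(Map<String, String> attributes) {
        RevisionSketch revision = new RevisionSketch(revisions.size() + 1, attributes);
        revisions.add(revision);
        return revision;
    }

    // Returns the latest revision, or null if nothing was committed yet.
    RevisionSketch head() {
        return revisions.isEmpty() ? null : revisions.get(revisions.size() - 1);
    }
}

class ProductSketch {
    final String name;
    final String productType;               // e.g. "Servlet"
    final BranchSketch mainBranch = new BranchSketch("main");

    ProductSketch(String name, String productType) {
        this.name = name;
        this.productType = productType;
    }

    // Opens a new branch that starts from an existing revision.
    BranchSketch fork(RevisionSketch from, String branchName) {
        BranchSketch branch = new BranchSketch(branchName);
        from.forks.add(branch);
        return branch;
    }
}
```

Each revision keeps its own copy of the attribute map, so committing a second revision does not change the first one; this is one simple way to obtain the "tree of branches and revisions" mentioned above.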
Code Generation for a Bi-dimensional Composition Mechanism
Moreover, in order to ensure its persistence, the Product domain interpreter may use one of the domain tools, based either on SQL storage, or on the repository of another versioning system, like Subversion or CVS. They correspond to the third layer presented in Fig. 1. The tool is then chosen according to the client's preference.

2.2 Abstract Composition of Domains: Horizontal Composition

It may happen that the development of a new application requires the cooperation of two concepts, pertaining to two different domains, and realized through two or more software components, services or tools. In this case, the interoperation is performed through a horizontal composition between these abstract concepts, and also through the domain virtual machines, ignoring the low level components, services and tools used for the implementation. The mechanism consists in establishing relationships between concepts of the two domain models and implementing them using aspect technology, so as to keep the composed domains unchanged. A very strict definition of the horizontal relationship properties is necessary, so that most of the AOP code implementing them can be generated. This code belongs to the Composition Virtual Machine (Fig. 1) and is separated from the virtual machines of the composed domains. This composition is called horizontal, because it is performed between parts situated at the same level of abstraction. It can be seen as a grey box approach, taking into account that the only visible part of a domain is its domain model. It is a non-invasive composition technique, because the components and adapters are hidden and are reused as they are (details in Section 4). The composition result is a new domain model (Fig. 1) and therefore a new domain, with its virtual machine, so that the process may be iterated.
As the domains are executable and the composition is performed imperatively, its result is immediately executable, even if situated at a high level of abstraction. Fig. 2 is a real example of two domain models, used and reused in our industrial applications. On the left, there is the Activity domain, which supports workflow execution, while on the right there is the Product domain, meant to store typed products and their attributes. The light colored boxes represent the visible concepts (the abstract syntax) used for defining the models with appropriate editors; the dark grey ones show the hidden classes, introduced for implementing the interpreters (the virtual machines). For each domain, a model is made by instantiating the concepts from the light colored part. Fig. 3 shows an Activity model, conforming to the metamodel from Fig. 2; the boxes for Design, Programming and Test are instances of the ActivityDefinition concept; connector labels, like requirement, specification etc., are instances of DataVariable from Fig. 2. They correspond (conform) to the data types defined in this model (Requirement, Specification, Program etc.) shown in the bottom panel. Similarly, a Product model could contain product types, like JML Specification or JavaFile. These two models are related by the horizontal relationships; for example, there may be a link between the data type Specification from the Activity model and the product type JML Specification from the Product model. This link conforms to the relationship represented between the concepts DataType and ProductType in Fig. 2.
Fig. 2. Activity domain model vs. Product domain model
3 Generating Code for the Vertical Composition

The methods defined in a concept are introduced for providing some functionality. In most cases, only a part (if any) of the functionality is defined inside the method itself, because, most often, the behavior involves the execution of some tools. The concept of Feature has been defined to provide the code calling the services that actually implement the expected behavior of the method. Additionally, a feature can implement a concern attached to that method, like an optional behavior, as in product line approaches.
Fig. 3. An Activity model (fragment)
Fig. 4. Metamodel for the vertical composition
An abstract service is an abstraction for a set of functionalities defined in a Java interface that are ultimately executed by services / tools supporting the service (i.e. implementing its methods). For example, in Fig. 4, the method getProducts in class ProductType is empty, and it is its associated feature that delegates the call to a database in which the actual products are stored. More than one feature can be attached to the same method and each feature can address a different concern. The word feature is used in the product line approach to express a possible variability that may be attached to a concept. Our approach is a combination of the product line intention with the AOP implementation [12]. Moreover, the purpose is to aid software engineers as much as possible in the design and development of this kind of application.
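The delegation from an empty concept method to a service can be imitated in plain Java. The sketch below is an assumption-laden illustration, not the generated code: it replaces the AspectJ capture by a java.lang.reflect.Proxy, and the names ProductTypeConcept, EmptyProductType, ProductStoreService, InMemoryProductStore and GetProductsFeature are hypothetical. Only getProducts and ProductType come from the text.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Visible behavior of the concept (hypothetical interface name).
interface ProductTypeConcept {
    List<String> getProducts();
}

// Domain class whose method body is intentionally empty.
class EmptyProductType implements ProductTypeConcept {
    public List<String> getProducts() {
        return Collections.emptyList();
    }
}

// Abstract service: a Java interface implemented by some tool (hypothetical).
interface ProductStoreService {
    List<String> loadProducts(String typeName);
}

// Stand-in for a database-backed service.
class InMemoryProductStore implements ProductStoreService {
    public List<String> loadProducts(String typeName) {
        return Arrays.asList(typeName + "-A", typeName + "-B");
    }
}

// Feature: intercepts getProducts and delegates the call to the service.
// A dynamic proxy plays the role that an AspectJ capture plays in the paper.
class GetProductsFeature implements InvocationHandler {
    private final ProductTypeConcept target;
    private final String typeName;
    private final ProductStoreService service;

    private GetProductsFeature(ProductTypeConcept target, String typeName,
                               ProductStoreService service) {
        this.target = target;
        this.typeName = typeName;
        this.service = service;
    }

    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        if ("getProducts".equals(method.getName())) {
            return service.loadProducts(typeName);   // delegated behavior
        }
        return method.invoke(target, args);          // everything else is untouched
    }

    // Returns a view of the concept object with the feature attached.
    static ProductTypeConcept attach(ProductTypeConcept target, String typeName,
                                     ProductStoreService service) {
        return (ProductTypeConcept) Proxy.newProxyInstance(
                ProductTypeConcept.class.getClassLoader(),
                new Class<?>[] { ProductTypeConcept.class },
                new GetProductsFeature(target, typeName, service));
    }
}
```

Unlike real aspect weaving, the proxy only affects callers that use the wrapped reference, so this is an approximation of the idea (the domain class stays unchanged while its behavior is supplied from outside), not of the mechanism.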
Table 1. Mapping on Eclipse Artifacts for the Vertical Composition Metamodel

Metamodel element   Eclipse artifact   Elements generated inside the artifacts
Domain              Project            Interfaces for the domain management
Concept             Class              Skeleton for the methods
Behavior            Method             Empty body by default
Feature             AspectJ Project    The AspectJ aspect and a class for the behavior
Abstract service    Project            Java interface defining the service interface
Service             Project            An interface and an implementation skeleton
Interception        AspectJ Capture    The corresponding AspectJ code
Using the Codèle tool, which “knows” this metamodel, the software engineer simply creates instances of its concepts (Behavior, Interception, Feature, Service etc.) and the tool generates the corresponding code in the Eclipse framework. Like all Mélusine domain models, Codèle metamodels are implemented in Java, while AspectJ, its aspect-oriented extension, is used for delegating the implementation to different tools and/or components (instances of the concept Service). The Eclipse mappings currently used in the Mélusine environment are presented in Table 1. In particular, users never see, and are not even aware, that AspectJ code is generated; they simply create a feature associated with a concept behavior. A similar idea is presented in [13], where the Xtend and Xpand languages are used for specifying mappings from problem to solution space and code generation is considered to be less error-prone than manual coding.
4 Generating Code for the Horizontal Composition

In other similar approaches, as in model collaboration [14], AOP was mentioned as a possible solution for implementing the collaboration templates, alongside service oriented architectures (SOA), orchestration languages or coordination languages. As our approach is based on establishing relationships, it can be compared to [15], where the properties of AOP concepts are identified (e.g. behavioral and structural crosscutting advices, static and dynamic weaving). Our intention is to identify such properties at a more abstract level, because in our approach, aspects only constitute an implementation technique. The technique we use for generating horizontal composition between domains is similar to transforming UML associations into Java code [16], but it uses AOP, because we are not allowed to change the domain code.

4.1 Meta-metamodel for the Horizontal Composition

To provide effective support for domain composition, Mélusine requires a specific formal definition and semantics. Fig. 5 shows that domain composition relies on Horizontal Relationship, made of connections. A connection is established between a source concept, pertaining to the source domain, and a destination concept in the target domain. A connection intercepts a behavior (method) pertaining to the source concept (class), and performs some computation depending on its type: Synchronization, StaticInstantiation, DynamicInstantiation.
Synchronization connections are meant to synchronize the state of the destination object; they intercept all methods that change the state of the source concept, and perform the needed actions in order to change the destination concept object accordingly. Instantiation connections intercept the creation of an instance of Element conforming to the source concept and are in charge of creating a link toward an instance of Element conforming to the destination concept. This instantiation connection may be performed statically or dynamically. Statically means that the pair of model elements (source, destination) is known and created before execution; dynamically means that this pair is computed during the execution, either when the source object is created (eager) or when the link is needed for the first time (lazy). To implement horizontal relationships in AspectJ, each connection is transformed into AspectJ code that calls a method in a class generated by Codèle; users never “see” AspectJ code. In practice, the code for horizontal relationship semantics represents about 15% of the total code. The mappings towards Eclipse artifacts used by Mélusine are indicated in the table below.

Table 2. Mapping between Horizontal Composition concepts and Eclipse Artifacts
Metamodel element        Eclipse artifact            Elements generated inside the artifacts
Domain                   Project                     Predefined interfaces and classes
Concept                  Class                       None
Behavior                 Method                      None
HorizontalRelationship   AJ Class and Java classes   - an AspectJ file containing the code for all the interceptions
                                                     - a Java file for each instantiation connection
                                                     - a Java file for each synchronization connection
Interception             AspectJ Capture             Lines in the AspectJ file for the interception, and a Java file for the connection code
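The difference between eager and lazy dynamic instantiation, as described in Section 4.1, can be illustrated with a small Java class. This is a hedged sketch, not the code Codèle generates: DestinationResolver and DynamicLinkSketch are hypothetical names, and model elements are reduced to strings for brevity.

```java
// Hypothetical strategy that computes the destination element for a source element.
interface DestinationResolver {
    String resolve(String sourceName);
}

// A link created by a dynamic instantiation connection (illustrative only).
class DynamicLinkSketch {
    private final String sourceName;
    private final DestinationResolver resolver;
    private String destination;   // stays null until the link is resolved
    private boolean resolved;

    // eager == true: resolve when the source object is created;
    // eager == false: resolve when the link is needed for the first time.
    DynamicLinkSketch(String sourceName, DestinationResolver resolver, boolean eager) {
        this.sourceName = sourceName;
        this.resolver = resolver;
        if (eager) {
            resolveNow();
        }
    }

    boolean isResolved() {
        return resolved;
    }

    // Returns the destination, computing it on first use in the lazy case.
    String destination() {
        if (!resolved) {
            resolveNow();
        }
        return destination;
    }

    private void resolveNow() {
        destination = resolver.resolve(sourceName);
        resolved = true;
    }
}
```

In both modes the resolver runs at most once per link; the only difference is when it runs, which matters if computing the destination is expensive or depends on state that is not yet available at creation time.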
4.2 Relationships for Horizontal Composition at Metamodel Level

Composing two domains means establishing relationships between the concepts pertaining to these domains [17]. In our example, one can establish a relationship between the concept of DataType in the Activity domain, and the concept of ProductType defined in the Product domain (see Fig. 2). The screen shot in Fig. 6 shows how this relationship is defined using the Codèle tool. The horizontal relationship is defined as static because, in this specific case, the involved data types are defined in the models (Specification from the Activity model, Fig. 3, and UMLDocument from the Product model) and are therefore known before the execution. This relationship has a single connection, which intercepts the constructor of a ProcessDataType, with the type name as parameter, and declares that the relationship should be static and that its instantiation should be done automatically, by choosing “Static instantiation Automatic selection” (Fig. 6).
Fig. 5. Metamodel for the horizontal composition
It is important to mention that the system knows which concepts are visible in each domain; thus, the wizard does not allow horizontal relationships that are not valid. The interceptions are defined at the conceptual level; the developer of the composite domain does not know that AspectJ captures are generated. For the instantiation connections, the wizard proposes a set of predefined instantiation strategies, for which all the code is generated (as in our example); for the synchronization connection, the developer has to fill in a method whose parameters are the context of the interception and the connection destination object.

4.3 Relationships for Horizontal Composition at Model Level

At metamodel level, a horizontal relationship definition is established between two concepts, i.e. between the Java classes that implement these concepts. However, at execution, instances of these horizontal relationships must be created between instances of these classes. At model level, Codèle proposes an editor that allows the selection of two domains (i.e. two domain models and the horizontal relationships defined as shown in Fig. 6) and a pair of models pertaining to these domains. The top left panel lists the horizontal relationships (between the Activity and Product domains, in the example from Fig. 7). When a horizontal relationship is selected, the two top right panels show the names of the entities that are instances of the source and destination classes.
Fig. 6. Defining horizontal relationships at metamodel level
Fig. 7. Defining static links at model level
In our example, the DataType-ProductType horizontal relationship has been selected, for which one displays the corresponding instances, like Specification in the Activity domain, and Use Case Document, or JMLSpecification in the Product domain. As this horizontal relationship has been declared Static, the developer is asked to provide the pairs of model entities that must be linked, according to that Horizontal relationship. Otherwise, they would have been selected automatically, at run time. The bottom panel lists the pairs that have been defined. For example, the data type called Specification in the Activity domain is related to JML Specification in the Product domain. The system finds this information by introspecting the models and is in charge of creating these relationships at model level.
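The pairs entered in the bottom panel amount to a small lookup table between model entities. The following Java sketch shows one possible shape for such a table, under our own assumptions: StaticLinkTable and its methods are hypothetical, model entities are identified by name only, and the validity check merely mimics the fact that the editor offers only existing instances.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical table of static links for one horizontal relationship.
class StaticLinkTable {
    private final Set<String> sources;        // instances of the source concept
    private final Set<String> destinations;   // instances of the destination concept
    private final List<String[]> pairs = new ArrayList<String[]>();

    StaticLinkTable(Collection<String> sourceInstances,
                    Collection<String> destinationInstances) {
        this.sources = new LinkedHashSet<String>(sourceInstances);
        this.destinations = new LinkedHashSet<String>(destinationInstances);
    }

    // Records a (source, destination) pair; both must exist in their models.
    void link(String source, String destination) {
        if (!sources.contains(source) || !destinations.contains(destination)) {
            throw new IllegalArgumentException(
                    "unknown model element in pair: " + source + " -> " + destination);
        }
        pairs.add(new String[] { source, destination });
    }

    // Returns every destination linked to the given source, in insertion order.
    List<String> destinationsOf(String source) {
        List<String> result = new ArrayList<String>();
        for (String[] pair : pairs) {
            if (pair[0].equals(source)) {
                result.add(pair[1]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        StaticLinkTable table = new StaticLinkTable(
                Arrays.asList("Requirement", "Specification", "Program"),
                Arrays.asList("Use Case Document", "JML Specification"));
        table.link("Specification", "JML Specification");
        System.out.println(table.destinationsOf("Specification"));
    }
}
```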
5 Discussion

In order to make the domain composition task as simple as possible, the metamodels presented above take into account the specificities of Mélusine domains. Consequently, the composition we realized is specific to this situation, as opposed to other approaches that try to provide mechanisms for composing heterogeneous models in general contexts, generally without specifying precisely how to implement them. For this reason, many researchers have tried to find a generic approach that solves this problem, by proposing abstract composition operators, such as: match [18], relate [19], compare [20] for discovering correspondences between models; merge [18], compose [18], weaving [21] for integrating models; and sewing [21] for relating models without changing their structure. The elaboration of metamodels that support code generation in the Codèle tool was possible after years of performing Mélusine domain compositions. Through trial and error, we have found recurring patterns of code when defining vertical and horizontal relationships and we have been able to identify some of their functional and non functional characteristics. Codèle embodies and formalizes this knowledge through simple panels, such that users “only” need to write code for the non standard functionalities. Experience shows that, on average, more than half of the code is generated, and that this generated part is precisely the error-prone one when written by hand, since it manages the low level technical code, including AOP captures, aspect generation and so on. The code added by the user is fully independent of the generated code and of the existence of AOP; it describes the added functionality at the logical level. Experience with Codèle has shown a dramatic simplification in writing relationships, and the elimination of the most difficult bugs. In some cases, the generated code is sufficient, allowing application composition without any programming.
This experience also led to the definition of a methodology for developing horizontal relationships, described in [17]. However, many other non functional characteristics could be identified and generated in the same way, and Codèle can (should) be extended to support them. We have also discovered that some, if not most, non functional characteristics cannot be defined as a domain (security, performance, transaction etc.), and therefore these non functional properties cannot be added through horizontal relationships. For these properties, we have developed another technique, called model annotation, described in [22].
6 Conclusion

Designing and implementing large and complex artifacts always relies on two basic principles: dividing the artifacts into parts (reducing the size and complexity of each part) and abstracting (eliminating the irrelevant details). Our approach is an application of these general principles to the development of large software applications. The division of applications into parts is performed by reusing large functional areas, called domains. Domains are units of reuse, primary elements for dividing the problem into parts, and atoms on which our composition techniques are applied. To support the abstraction principle, the visible part of these domains is their (domain) model; conceptually as well as technically, our composition technique only relies on domain models. A domain is usually implemented by reusing existing parts, found on the market or inside the company, which are components or tools of various size and nature. We call vertical composition the technique which consists in relating the abstract elements found in the domain model with the existing components found in the company. Reuse requires that vertical relationships are implemented without changing the domain concepts or the existing components. In our approach, one develops independent and autonomous domains, which become the primary element for reuse. Domain composition is performed without any change in the composed domains, but only through so-called horizontal composition, by defining relationships between modeling elements pertaining to the composed domains. Domains can be defined and implemented independently of each other. They are large reuse units, whose interfaces are abstract models and whose composition is only based on the knowledge of these models.
The necessity to design and implement large applications in the presence of existing components or tools led us to develop Mélusine, a comprehensive environment for supporting the approach presented in this paper, based on:
• Formalizing the architectural concepts related to domains, based on modeling and metamodeling;
• Formalizing domain reuse and composition through horizontal relationships;
• Formalizing component and tool reuse and composition through vertical relationships;
• Generating aspects as a hidden implementation of these composition concepts.
An important goal of our approach is to raise the level of abstraction and the granularity level at which large applications are designed, decomposed and recomposed. Moreover, these large elements are highly reusable, because the composition only needs to “see” their abstract models, not their implementation. Finally, by relating domain concepts using wizards, most compositions can be performed by domain experts, not necessarily by highly trained technical experts, as would be the case if AOP techniques were used directly.
References

1. Le-Anh, T., Estublier, J., Villalobos, J.: Multi-Level Composition for Software Federations. In: SC 2003, Warsaw, Poland (April 2003)
2. Bass, L., Clements, P., Kazman, R.: Software Architecture in Practice. Addison-Wesley, Reading (2003)
3. Kiczales, G., Lamping, J., Mendhekar, A., Maeda, C., Lopes, C., Loingtier, J.-M., Irwin, J.: Aspect-oriented programming. In: Aksit, M., Auletta, V. (eds.) ECOOP 1997. LNCS, vol. 1241, pp. 220–242. Springer, Heidelberg (1997)
4. Filman, R.E., Elrad, T., Clarke, S., Aksit, M.: Aspect-Oriented Software Development. Addison-Wesley Professional, Reading (2004) ISBN10: 0321219767
5. Dave, T.: Reflective Software Engineering - From MOPS to AOSD. Journal of Object Technology 1(4) (September-October 2002)
6. Estublier, J., Vega, G., Ionita, A.D.: Composing Domain-Specific Languages for Wide-Scope Software Engineering Applications. In: Briand, L.C., Williams, C. (eds.) MoDELS 2005. LNCS, vol. 3713, pp. 69–83. Springer, Heidelberg (2005)
7. Monga, M.: Aspect-oriented programming as model driven evolution. In: Proceedings of the Linking Aspect Technology and Evolution Workshop (LATE), Chicago, IL, USA (2005)
8. Ionita, A.D., Estublier, J., Vega, G.: Variations in Model-Based Composition of Domains. In: Software and Service Variability Management Workshop, Helsinki, Finland (April 2007)
9. Gray, J., Bapty, T., Neema, S., Schmidt, D.C., Gokhale, A., Natarajan, B.: An approach for supporting aspect-oriented domain modeling. In: Pfenning, F., Macko, M. (eds.) GPCE 2003. LNCS, vol. 2830, pp. 151–168. Springer, Heidelberg (2003)
10. Ho, W., Jezequel, J.-M., Pennaneac’h, F., Plouzeau, N.: A Toolkit for Weaving Aspect-Oriented UML Designs. In: First International Conference on Aspect-Oriented Software Development, Enschede, The Netherlands, pp. 99–105 (April 2002)
11. Clarke, S., Walker, R.: Towards a Standard Design Language for AOSD. In: Proc. of the 1st Int. Conf. on Aspect Oriented Software Development, Enschede, Netherlands, pp. 113–119 (2002)
12. Estublier, J., Vega, G.: Reuse and Variability in Large Software Applications. In: Proceedings of the 10th European Software Engineering Conference, Lisbon, Portugal (September 2005)
13. Voelter, M., Groher, I.: Product Line Implementation Using Aspect-Oriented and Model-Driven Software Development. In: Proc. of the 11th International Software Product Line Conference (SPLC), Kyoto, Japan (2007)
14. Occello, A., Casile, O., Dery-Pinna, A., Riveill, M.: Making Domain-Specific Models Collaborate. In: Proc. of the 7th OOPSLA Workshop on Domain-Specific Modeling, Canada (2007)
15. Barra Zavaleta, E., Génova Fuster, G., Llorens Morillo, J.: An Approach to Aspect Modelling with UML 2.0. In: Baar, T., Strohmeier, A., Moreira, A., Mellor, S.J. (eds.) UML 2004. LNCS, vol. 3273. Springer, Heidelberg (2004)
16. Génova, G., Ruiz del Castillo, C., Lloréns, J.: Mapping UML Associations into Java Code. Journal of Object Technology 2(5), 135–162 (2003)
17. Estublier, J., Ionita, A.D., Vega, G.: Relationships for Domain Reuse and Composition. Journal of Research and Practice in Information Technology 38(4), 287–301 (2006)
18. Bernstein, P.A.: Applying model management to classical meta data problems. In: Proceedings of the Conference on Innovative Database Research (CIDR), Asilomar, CA, USA (January 2003)
19. Kurtev, I., Didonet Del Fabro, M.: A DSL for Definition of Model Composition Operators. In: Models and Aspects Workshop at ECOOP, Nantes, France (July 2006)
20. Kolovos, D.S., Paige, R.F., Polack, F.A.C.: Model Comparison: A Foundation for Model Composition and Model Transformation Testing. In: 1st International Workshop on Global Integrated Model Management, GaMMa 2006, Shanghai (2006)
21. Reiter, T., et al.: Model Integration Through Mega Operations. In: Workshop on Model-driven Web Engineering (MDWE), Sydney (2005)
22. Chollet, S., Lalanda, P., Bottaro, A.: Transparently adding security properties to service orchestration. In: 3rd International IEEE Workshop on Service Oriented Architectures in Converging Networked Environments (SOCNE 2008), Ginowan, Okinawa, Japan (March 2008)
Advanced Data Organization for Java-Powered Mobile Devices

Tomáš Tureček1 and Petr Šaloun2

1 VŠB TU Ostrava, 17. listopadu 15, Ostrava, 708 00, Czech Republic
[email protected]
2 Ostravská univerzita, Dvořákova 7, 701 03, Ostrava, Czech Republic
[email protected]
Abstract. The paper reports on current research motivated by the need for efficient data storage on the J2ME CLDC 1.0+ MIDP 1.0+ platform. The presented solution fulfils those needs by providing the Midletbase package, which implements a relational database with a user friendly and extensible interface.

Keywords: Java, J2ME, Database, Midlet, Midletbase, storage.
1 Introduction

Mobile devices are all around us. They have become so cheap and so useful that almost every one of us has a cellular phone. Today’s cellular phones are quite universal, as they can run various applications. The paper reports on the continuation of previous research related to remote access to information systems [2] and describes the advantages of using Java-powered portable devices and wireless connections. The paper presents research on the J2ME database framework we are working on.

1.1 Motivation

There are many ways of integrating a portable device with existing systems. One is to extend an existing IS with an interface and to implement a thick client application on the mobile device that communicates with the IS through this interface. We have applied this approach, as it allows more complex logic to be implemented in the thick client. That is an advantage in cases like offline access to IS data. During our work on synchronization of thick IS clients we noticed some weaknesses of the J2ME storage capabilities. Every time, we were forced to create custom wrappers around the RMS storage (Record Management System [5]). Those problems inspired us to create an extension to RMS in the form of a relational database.

1.2 Problem to Solve

A thick client application demands some kind of storage to persist the data it needs. The reasons for that can be found in [1]. J2ME offers just a simple storage called RMS (Record Management System [5]), which is in fact capable of storing byte arrays only. Developers are then forced to invent, again and again, their own RMS extension to persist primitive Java types. We have focused on this problem and have created an RMS extension simulating a relational database over the simple RMS storage.

Z. Huzar et al. (Eds.): CEE-SET 2008, LNCS 4980, pp. 186–191, 2011. © IFIP International Federation for Information Processing 2011
2 Solution

Few solutions existed at the time we started our own database implementation. The only ones available were the commercial PointBase Micro [4] and a series of implementations of remote access to remote databases, like Oracle Database Lite Edition [6]. These particular solutions did not satisfy our needs, because we needed a combination of all of them: a J2ME relational database which can be easily extended with new functionality, such as remote access to remote systems. The presented solution behaves as a usual relational database, but it uses RMS for storing data. We call the extension Midletbase (Midlet [5] database). The extension offers an easy-to-use API to the database engine, allowing tables to be created or dropped and table entries to be inserted, updated, deleted and searched. To J2ME developers, our database framework can look like another J2ME layer (see Figure 1).
Fig. 1. Midletbase represents additional (database) layer to J2ME
Midletbase adds the following features to J2ME:
• Data to store are organized into entries with defined data types.
• Entries are stored in tables.
• Tables are organized in databases.
• The database structure as well as the data is persisted in RMS, so that it remains accessible even after a device restart.
• Data is accessible through a simple API, in the same way as is usual in relational databases.
• The API allows creating and dropping tables.
• The API allows inserting, updating, deleting and searching over data in RMS.
• Midletbase is extensible, so it enables the developer to extend Midletbase functionality in a plug-in way and to implement his/her own behaviour for database operations.
Preliminary implementation results were already published in [1]. Inspiration for our solution was based on our experience with J2ME and also on existing solutions [4, 6].
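Because RMS stores byte arrays only, any typed entry has to be flattened into bytes before it is written and rebuilt after it is read. The sketch below shows one common way to do this with DataOutputStream and DataInputStream for an (integer, string) row. It is an assumption-based illustration, not the Midletbase encoding: RowCodecSketch and its methods are hypothetical, and the call that would hand the bytes to a record store is only mentioned in a comment.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical codec for a two-column row (INTEGER id, VARCHAR name).
class RowCodecSketch {

    // Flattens the row into a byte array that a record store could persist,
    // e.g. by passing it to a record-adding call of the storage API.
    static byte[] encode(int id, String name) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(id);      // 4 bytes, big-endian
            out.writeUTF(name);    // 2-byte length prefix + modified UTF-8 bytes
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            // Not expected for in-memory streams; rethrown unchecked for brevity.
            throw new IllegalStateException("encoding failed: " + e.getMessage());
        }
    }

    // Rebuilds the row; element 0 is the Integer id, element 1 the String name.
    static Object[] decode(byte[] record) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(record));
            int id = in.readInt();
            String name = in.readUTF();
            return new Object[] { Integer.valueOf(id), name };
        } catch (IOException e) {
            throw new IllegalStateException("decoding failed: " + e.getMessage());
        }
    }
}
```

The column order acts as an implicit schema here: the decoder must read the fields in exactly the order the encoder wrote them, which is why a database layer needs to persist the table structure alongside the data.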
T. Tureček and P. Šaloun
As the target platform we have selected CLDC 1.0 and MIDP 1.0, so that the Midletbase framework is usable on every cell phone with Java support.

2.1 API

One of the requirements for the Midletbase framework was usability. The interface lets developers use the storage in a simple way; the following piece of code shows how easy it is to initialize the storage, create a table and insert some entries into it.

Example of storage initialization and table interface usage:

    // initializing storage
    Storage storage = Storage.getInstance();

    // creating table
    ITable table = storage.createTable(
        "Dbname",                       // database name
        "tblname",                      // table name
        new Column[]{                   // table structure
            new Column(
                "id",                   // column name
                IOperator.INTEGER,      // column type
                false,                  // can be null?
                true),                  // is the key?
            new Column("name", IOperator.VARCHAR, false, false)
        });

    // adding entry to table
    table.addEntry(
        new String[]{"id", "name"},     // columns for values
        new Value[]{                    // values to insert
            new Value(10),              // Value class ensures
            new Value("some string")}); // type safety

    // searching in table
    IResultTable result = table.getEntry(
        new String[]{"id", "name"},     // result columns
        new String[]{"id"},             // search columns
        new Value[]{new Value(10)},     // search values
        new int[]{IOperator.EQUALS});   // search operators

    // printing result vector to console
    System.out.println(result.getAllEntries());
2.2 Extensibility

The Midletbase database engine implements the Interceptor design pattern [3]. The database engine is implemented as a chain of interceptors; a developer using Midletbase can create his/her own interceptors and plug them into the database engine, and in this way add functionality to his/her application (e.g. logging or triggers). The developer is even able to change the engine behaviour completely, for example not to store data in RMS but to create a connection to a server application and store the data there. All that is needed for the change is to implement the simple, well-described IStorageInterceptor interface and to pass the interceptor class names when initializing the Storage class.

Example of storage initialization using custom interceptors:

    // initializing storage
    Storage storage = Storage.getInstance(
        new String[]{
            "mypackage.MyInterceptor1",
            "mypackage.MyInterceptor2"});
Fig. 2. Simplified class diagram depicting interceptor design pattern involvement in architecture and how the custom interceptors can be passed from Storage initialization through Table to StorageInterceptingPoint class.
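How a chain of interceptors can add logging in front of the component that actually stores the data may be sketched as follows. This is a hypothetical illustration of the Interceptor pattern only: the paper does not list the methods of IStorageInterceptor, so the names StorageRequestSketch, InterceptorSketch, ChainSketch, LoggingInterceptorSketch and InMemoryStoreInterceptorSketch, and all their signatures, are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical description of one database operation travelling down the chain.
class StorageRequestSketch {
    final String operation;   // e.g. "addEntry"
    final String table;
    final String payload;

    StorageRequestSketch(String operation, String table, String payload) {
        this.operation = operation;
        this.table = table;
        this.payload = payload;
    }
}

// Hypothetical interceptor contract: handle the request or pass it on.
interface InterceptorSketch {
    String handle(StorageRequestSketch request, ChainSketch next);
}

// Immutable view of "the rest of the chain" starting at a given position.
class ChainSketch {
    private final List<InterceptorSketch> interceptors;
    private final int index;

    ChainSketch(List<InterceptorSketch> interceptors) {
        this(interceptors, 0);
    }

    private ChainSketch(List<InterceptorSketch> interceptors, int index) {
        this.interceptors = interceptors;
        this.index = index;
    }

    // Hands the request to the next interceptor in line.
    String proceed(StorageRequestSketch request) {
        if (index >= interceptors.size()) {
            throw new IllegalStateException("no interceptor handled " + request.operation);
        }
        return interceptors.get(index)
                .handle(request, new ChainSketch(interceptors, index + 1));
    }
}

// Cross-cutting interceptor: records the operation, then lets it continue.
class LoggingInterceptorSketch implements InterceptorSketch {
    final List<String> log = new ArrayList<String>();

    public String handle(StorageRequestSketch request, ChainSketch next) {
        log.add(request.operation + ":" + request.table);
        return next.proceed(request);
    }
}

// Terminal interceptor: keeps rows in memory instead of a record store.
class InMemoryStoreInterceptorSketch implements InterceptorSketch {
    final List<String> rows = new ArrayList<String>();

    public String handle(StorageRequestSketch request, ChainSketch next) {
        if ("addEntry".equals(request.operation)) {
            rows.add(request.payload);
            return "stored";
        }
        return next.proceed(request);
    }
}
```

Replacing the terminal interceptor with one that forwards requests to a server would change where the data ends up without touching the logging interceptor or the calling code, which is the kind of substitution described in Section 2.2.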
2.3 Measures

We have performed a set of measurements showing the characteristics of the algorithms used in the implementation. They can be used to estimate the effectiveness of Midletbase for particular purposes. This section shows the characteristics of two commonly used Midletbase use cases: creating a table and searching a table. The following charts show average results from 20 measurements. Unfortunately, the limited space in this paper prevents us from showing more characteristics of the framework.
T. Tureček and P. Šaloun
Fig. 3. Characteristics of the create table algorithm
Fig. 4. Characteristics of the search table algorithm
Figure 3 shows the characteristics of the create table algorithm. The dependency is not linear, which opens an opportunity for future work on optimizing the algorithms for better performance. The objective of the measurement is to obtain the trend of the chart, so the concrete time values are not relevant here; moreover, the measurements were done with an emulator.1 Figure 4 shows the time needed to search for entries (by key) in tables with different numbers of entries. The characteristic is nearly linear, with a small overhead on the first record store access. The chart in Fig. 4 represents search without cache.

1 Performed with Sun Microsystems Wireless Toolkit emulator 2.5.3 on a Dell Latitude D620 laptop (CPU Genuine Intel T2400, 2 GB RAM).
3 Future Work

We are now working on a JDBC driver for Midletbase so that developers can also use a standard DB interface. Midletbase does not contain functionality for changing the database structure: tables and databases can be created or dropped, but not altered. This functionality has not been included so far because it is not needed for the intended use of the framework, namely storage for a deployed application, which already has a stable DB structure. In the future we also want to focus on the algorithms used in Midletbase. As we can see from Sect. 2.3, there are opportunities to optimize them. The topic of this research is highly relevant and addresses the incompleteness of the standards used today. As evidence, [7] presents a solution similar to ours, developed independently; our solution is more focused on extensibility and usability.
4 Conclusion

The paper presents ongoing research responding to real needs reported by many developers of Java mobile applications: a sufficiently capable storage on the J2ME CLDC 1.0+ MIDP 1.0+ platform. The presented solution fulfils these needs by providing a package that implements a relational database with many features and a user-friendly interface. The work is still ongoing, but we plan to release the Midletbase package as soon as we have a stable, well-tested version. The current version is available only for pilot applications.
References

1. Tureček, T., Běhálek, M.: Data organization on Java powered mobile devices. In: Proceeding Datakon 2005, Brno (2005) ISBN 8021038136
2. Tureček, T., Beneš, M.: Remote Access to Information System via PDA. In: ISIM 2003, Brno, pp. 145–152 (2003) ISBN 80-85988-84-4
3. Interceptor design pattern on Daily development (July 2008), http://dailydevelopment.blogspot.com/2007/04/interceptordesign-pattern.html
4. PointBase Micro (July 2008), http://www.pointbase.com/
5. Keogh, J.: J2ME: The Complete Reference. McGraw-Hill/Osborne, Berkeley (2003)
6. Oracle Database Lite Edition (April 2008), http://www.oracle.com/technology/products/lite/index.html
7. Alier, M., Casado, P., Casany, M.J.: J2MEMicroDB an open source distributed database engine for mobile applications. In: PPPJ 2007 (2007)
Developing Applications with Aspect-Oriented Change Realization

Valentino Vranić1, Michal Bebjak1, Radoslav Menkyna1, and Peter Dolog2

1 Institute of Informatics and Software Engineering, Faculty of Informatics and Information Technologies, Slovak University of Technology, Ilkovičova 3, 84216 Bratislava 4, Slovakia
[email protected], [email protected], [email protected]
2 Department of Computer Science, Aalborg University, Selma Lagerlöfs Vej 300, DK-9220 Aalborg EAST, Denmark
[email protected]
Abstract. An approach to aspect-oriented change realization is proposed in this paper. With aspect-oriented programming, changes can be treated explicitly and directly at the programming language level. Aspect-oriented change realizations are mainly based on aspect-oriented design patterns or themselves constitute pattern-like forms in connection to which domain independent change types can be identified. However, it is more convenient to plan changes in a domain specific manner. Domain specific change types can be seen as subtypes of generally applicable change types. This relationship can be maintained in the form of a catalog. Further changes can actually affect the existing aspect-oriented change realizations, which can be solved by adapting the existing change implementation or by implementing an aspect-oriented change realization of the existing change without having to modify its source code. Separating out the changes this way can lead to a kind of aspect-oriented refactoring beneficial to the application as such. As demonstrated partially by the approach evaluation, the problem of change interaction may be avoided to a large extent by using appropriate aspect-oriented development tools, but for a large number of changes, dependencies between them have to be tracked, which could be supported by feature modeling.

Keywords: change, aspect-oriented programming, generally applicable changes, domain specific changes, change interaction.
1 Introduction

To quote a phrase, change is the only constant in software development. Change realization consumes enormous effort and time. Once implemented, changes get lost in the code. While individual code modifications are usually tracked by a version control tool, the logic of a change as a whole vanishes without proper support in the programming language itself.

Z. Huzar et al. (Eds.): CEE-SET 2008, LNCS 4980, pp. 192–206, 2011. © IFIP International Federation for Information Processing 2011
By its capability to separate crosscutting concerns, aspect-oriented programming makes it possible to deal with change explicitly and directly at the programming language level. Changes implemented this way are pluggable and, to a great extent, reapplicable to similar applications, such as applications from the same product line. Customization of web applications represents a prominent example of that kind. In customization, a general application is adapted to the client's needs by a series of changes. With each new version of the base application, all the changes have to be applied to it. On many occasions, the difference between the new and old application does not affect the structure of changes, so if changes have been implemented using aspect-oriented programming, they can simply be included in the new application build without any additional effort.

We have already briefly reported our initial efforts in change realization using aspect-oriented programming [1]. In this paper, we present our improved view of the approach to change realization and the change types we discovered. Section 2 presents our approach to aspect-oriented change realization. Section 3 introduces the change types we have discovered so far in the web application domain. Section 4 discusses how to deal with a change of a change. Section 5 describes the approach evaluation and identifies the possibilities of coping with change interaction with tool support. Section 6 discusses related work. Section 7 presents conclusions and directions of further work.
2 Changes as Crosscutting Requirements
A change is initiated by a change request made by a user or some other stakeholder. Change requests are specified in domain notions, as initial requirements are. A change request tends to be focused, but it often consists of several different, though usually interrelated, requirements that specify actual changes to be realized. By decomposing a change request into individual changes, and by abstracting the essence out of each such change while generalizing it at the same time, a change type applicable to a range of applications that belong to the same domain can be defined.

We will introduce our approach by a series of examples on a common scenario.1 Suppose a merchant who runs his online music shop purchases general affiliate marketing software [9] to advertise at third party web sites denoted as affiliates. In a simplified schema of affiliate marketing, a customer visits an affiliate's site, which refers him to the merchant's site. When he buys something from the merchant, a provision is given to the affiliate who referred the sale. General affiliate marketing software makes it possible to manage affiliates, track sales referred by these affiliates, and compute provisions for referred sales. It is also able to send notifications about new sales, signed up affiliates, etc.

The general affiliate marketing software has to be adapted (customized), which involves a series of changes. We will assume the affiliate marketing software is written in Java and use AspectJ, the most popular aspect-oriented language, which is based on Java, to implement some of these changes.

In the AspectJ style of aspect-oriented programming, the crosscutting concerns are captured in units called aspects. Aspects may contain fields and methods much the same way the usual Java classes do, but what makes it possible for them to affect other code are genuine aspect-oriented constructs, namely: pointcuts, which specify the places in the code to be affected; advices, which implement the additional behavior before, after, or instead of the captured join point (a well-defined place in the program execution), most often method calls or executions; and inter-type declarations, which enable introduction of new members into types, as well as introduction of compilation warnings and errors.

1 This is an adapted scenario published in our earlier work [1].

2.1 Domain Specific Changes
One of the changes of the affiliate marketing software would be adding a backup SMTP server to ensure delivery of the notifications to users. Each time the affiliate marketing software needs to send a notification, it creates an instance of the SMTPServer class which handles the connection to the SMTP server. An SMTP server is a kind of a resource that needs to be backed up, so in general, the type of the change we are talking about could be denoted as Introducing Resource Backup. This change type is still expressed in a domain specific way. We can clearly identify a crosscutting concern of maintaining a backup resource that has to be activated if the original one fails, and implement this change in a single aspect without modifying the original code:

    class AnotherSMTPServer extends SMTPServer {
        ...
    }

    public aspect BackupSMTPServer {
        public pointcut SMTPServerConstructor(URL url, String user, String password):
            call(SMTPServer.new(..)) && args(url, user, password);

        SMTPServer around(URL url, String user, String password):
            SMTPServerConstructor(url, user, password) {
            return getSMTPServerBackup(proceed(url, user, password));
        }

        SMTPServer getSMTPServerBackup(SMTPServer obj) {
            if (obj.isConnected()) {
                return obj;
            } else {
                return new AnotherSMTPServer(obj.getUrl(), obj.getUser(), obj.getPassword());
            }
        }
    }
The around() advice captures constructor calls of the SMTPServer class and their arguments. This kind of advice takes complete control over the captured join point and its return clause, which is used in this example to control the type of the SMTP server being returned. The policy is implemented in the getSMTPServerBackup() method: if the original SMTP server can't be connected to, a backup SMTP server class instance is created and returned.

2.2 Generally Applicable Changes
Looking at this code and leaving aside SMTP servers and resources altogether, we notice that it actually performs a class exchange. This idea can be generalized and domain details abstracted out of it, bringing us to the Class Exchange change type [1], which is based on the Cuckoo's Egg aspect-oriented design pattern [16]:

    public class AnotherClass extends MyClass {
        ...
    }

    public aspect MyClassSwapper {
        public pointcut myConstructors(): call(MyClass.new());

        Object around(): myConstructors() {
            return new AnotherClass();
        }
    }
2.3 Applying a Change Type
It would be beneficial if the developer could get a hint on using the Cuckoo’s Egg pattern based on the information that a resource backup had to be introduced. This could be achieved by maintaining a catalog of changes in which each domain specific change type would be defined as a specialization of one or more generally applicable changes. When determining a change type to be applied, a developer chooses a particular change request, identifies individual changes in it, and determines their type. Figure 1 shows an example situation. Domain specific changes of the D1 and D2 type have been identified in the Change Request 1. From the previously identified and cataloged relationships between change types, we would know their generally applicable change types are G1 and G2.
Fig. 1. Generally applicable and domain specific changes
A generally applicable change type can be a kind of an aspect-oriented design pattern (consider G2 and AO Pattern 2). A domain specific change realization can also be complemented by an aspect-oriented design pattern, which is expressed by an association between them (consider D1 and AO Pattern 1). Each generally applicable change has a known domain independent code scheme (G2's code scheme is omitted from the figure). This code scheme has to be adapted to the context of a particular domain specific change, which may be seen as a kind of refinement (consider D1 Code and D2 Code).
3 Catalog of Changes
To support the process of change selection, a catalog of changes is needed in which the generalization–specialization relationships between change types would be explicitly established. The following list sums up these relationships between change types we have identified in the web application domain (the domain specific change type is introduced first):

– One Way Integration: Performing Action After Event
– Two Way Integration: Performing Action After Event
– Adding Column to Grid: Performing Action After Event
– Removing Column from Grid: Method Substitution
– Altering Column Presentation in Grid: Method Substitution
– Adding Fields to Form: Enumeration Modification with Additional Return Value Checking/Modification
– Removing Fields from Form: Additional Return Value Checking/Modification
– Introducing Additional Constraint on Fields: Additional Parameter Checking or Performing Action After Event
– Introducing User Rights Management: Border Control with Method Substitution
– User Interface Restriction: Additional Return Value Checking/Modifications
– Introducing Resource Backup: Class Exchange
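As a rough illustration of how such a catalog could be queried, the following plain-Java sketch stores a few of the relationships listed above in a map. The ChangeCatalog class and its register() and lookup() methods are assumptions made for this example; the paper does not prescribe any particular representation of the catalog.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical catalog: domain specific change type -> generally applicable change types.
final class ChangeCatalog {
    private final Map<String, List<String>> generalTypes = new LinkedHashMap<>();

    void register(String domainSpecific, String... generallyApplicable) {
        generalTypes.put(domainSpecific, Arrays.asList(generallyApplicable));
    }

    // Returns the cataloged general types, or an empty list for an unknown change type.
    List<String> lookup(String domainSpecific) {
        List<String> found = generalTypes.get(domainSpecific);
        return found == null ? Collections.<String>emptyList() : found;
    }
}

final class ChangeCatalogDemo {
    // Registers three of the relationships from the list above.
    static ChangeCatalog webApplicationCatalog() {
        ChangeCatalog catalog = new ChangeCatalog();
        catalog.register("Introducing Resource Backup", "Class Exchange");
        catalog.register("One Way Integration", "Performing Action After Event");
        catalog.register("Adding Fields to Form",
                "Enumeration Modification",
                "Additional Return Value Checking/Modification");
        return catalog;
    }

    public static void main(String[] args) {
        System.out.println(webApplicationCatalog().lookup("Introducing Resource Backup"));
    }
}
```

A developer who has classified a change as Introducing Resource Backup would look it up and be pointed to Class Exchange, which is the hint the catalog is meant to provide.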
We have already described Introducing Resource Backup and the corresponding generally applicable change, Class Exchange. Here, we will briefly describe the rest of the domain specific change types we identified in the web application domain along with the corresponding generally applicable changes. The generally applicable change types are described where they are first mentioned to make the sequential reading of this section easier. A real catalog of changes would require each change type to be described separately.

3.1 Integration Changes
Web applications often have to be integrated with other systems. Suppose that in our example the merchant wants to integrate the affiliate marketing software with the third party newsletter which he uses. Every affiliate should be a member of the newsletter. When an affiliate signs up to the affiliate marketing software, he should be signed up to the newsletter, too. Upon deleting his account, the affiliate should be removed from the newsletter, too.

This is a typical example of the One Way Integration change type [1]. Its essence is the one way notification: the integrating application notifies the integrated application of relevant events. In our case, such events are the affiliate sign-up and affiliate account deletion. Such integration corresponds to the Performing Action After Event change type [1]. Since events are actually represented by methods, the desired action can be implemented in an after advice:

    public aspect PerformActionAfterEvent {
        pointcut methodCalls(TargetClass t, int a): . . .;

        after(/* captured arguments */): methodCalls(/* captured arguments */) {
            performAction(/* captured arguments */);
        }

        private void performAction(/* arguments */) {
            /* action logic */
        }
    }
The after advice executes after the captured method calls. The actual action is implemented as the performAction() method called by the advice. To implement the one way integration, in the after advice we will make a post to the newsletter sign-up/sign-out script and pass it the e-mail address and name of the newly signed-up or deleted affiliate. We can seamlessly combine multiple one way integrations to integrate with several systems.

The Two Way Integration change type can be seen as a double One Way Integration. A typical example of such a change is data synchronization (e.g., synchronization of user accounts) across multiple systems. When a user changes his profile in one of the systems, these changes should be visible in all of them. In our example, introducing a forum for affiliates with synchronized user accounts for affiliate convenience would represent a Two Way Integration.

3.2 Introducing User Rights Management
In our affiliate marketing application, the marketing is managed by several coworkers with different roles. Therefore, its database has to be updated from an administrator account with limited permissions. A limited administrator should not be able to decline or delete affiliates, nor modify the advertising campaigns and banners that have been integrated with the web sites of affiliates. This is an instance of the Introducing User Rights Management change type. Suppose all the methods for managing campaigns and banners are located in the campaigns and banners packages. The calls to these methods can be viewed as a region prohibited to the restricted administrator. The Border Control design pattern [16] makes it possible to partition an application into a series of regions implemented as pointcuts that can later be operated on by advices [1]:
    pointcut prohibitedRegion():
        (within(application.Proxy) && call(void *.*(..)))
        || (within(application.campaigns.+) && call(void *.*(..)))
        || within(application.banners.+)
        || call(void Affiliate.decline(..))
        || call(void Affiliate.delete(..));
What we actually need is to substitute the calls to the methods in the region with our own code that will let the original methods execute only if the current user has sufficient rights. This can be achieved by applying the Method Substitution change type, which is based on an around advice that makes it possible to change or completely disable the execution of methods. The following pointcut captures all calls of the method called method() belonging to the TargetClass class:

    pointcut allmethodCalls(TargetClass t, int a):
        call(ReturnType TargetClass.method(..)) && target(t) && args(a);
Note that we capture method calls, not executions, which gives us the flexibility in constraining the method substitution logic by the context of the method call. The pointcut call(ReturnType TargetClass.method(..)) captures all the calls of TargetClass.method(). The target() pointcut is used to capture the reference to the target object. The method arguments can be captured by an args() pointcut. In the example code above, we assume method() has one integer argument and capture it with this pointcut. The following example captures the method() calls made within the control flow of any of the CallingClass methods:

    pointcut specificmethodCalls(TargetClass t, int a):
        call(ReturnType TargetClass.method(..)) && target(t) && args(a)
        && cflow(call(* CallingClass.*(..)));
This embraces the calls made directly in these methods, but also any of the method() calls made further in the methods called directly or indirectly by the CallingClass methods. By making an around advice on the specified method call capturing pointcut, we can create a new logic of the method to be substituted:

    public aspect MethodSubstitution {
        pointcut methodCalls(TargetClass t, int a): . . .;

        ReturnType around(TargetClass t, int a): methodCalls(t, a) {
            if (. . .) {
                . . . // the new method logic
            } else {
                return proceed(t, a);
            }
        }
    }
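For readers without an AspectJ toolchain, the effect of substituting a method depending on user rights can be approximated in plain Java with a dynamic proxy. This is an analogue, not the aspect-oriented realization: unlike a pointcut, the proxy only intercepts calls made through the wrapped reference. CampaignService, RealCampaignService, and RightsGuard are hypothetical names introduced for this sketch.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical service standing in for the campaign management methods.
interface CampaignService {
    String deleteCampaign(String name);
}

final class RealCampaignService implements CampaignService {
    @Override
    public String deleteCampaign(String name) {
        return "deleted " + name;
    }
}

// Plays the role of the around advice: the original method runs only for a full administrator.
final class RightsGuard implements InvocationHandler {
    private final CampaignService target;
    private final boolean fullAdministrator;

    private RightsGuard(CampaignService target, boolean fullAdministrator) {
        this.target = target;
        this.fullAdministrator = fullAdministrator;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        boolean guarded = method.getDeclaringClass() == CampaignService.class;
        if (guarded && !fullAdministrator) {
            return "denied";                 // the substituted logic
        }
        return method.invoke(target, args);  // counterpart of proceed()
    }

    static CampaignService guard(CampaignService target, boolean fullAdministrator) {
        return (CampaignService) Proxy.newProxyInstance(
                CampaignService.class.getClassLoader(),
                new Class<?>[] { CampaignService.class },
                new RightsGuard(target, fullAdministrator));
    }
}

final class RightsGuardDemo {
    public static void main(String[] args) {
        CampaignService restricted = RightsGuard.guard(new RealCampaignService(), false);
        CampaignService full = RightsGuard.guard(new RealCampaignService(), true);
        System.out.println(restricted.deleteCampaign("spring"));
        System.out.println(full.deleteCampaign("spring"));
    }
}
```

The proxy has to be installed wherever the service is obtained, whereas the around advice in the text affects every captured call without touching the calling code. That difference is the main reason the paper realizes the change with an aspect.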
3.3 User Interface Restriction
It is quite annoying when a user sees, but can't access, some options due to user rights restrictions. This requires a User Interface Restriction change type to be applied. We have created a similar situation in our example by a previous change implementation that introduced the restricted administrator (see Sect. 3.2). Since the restricted administrator can't access advertising campaigns and banners, he shouldn't see them in the menu either.

Menu items are retrieved by a method, and all we have to do to remove the banners and campaigns items is to modify the return value of this method. This may be achieved by applying an Additional Return Value Checking/Modification change, which checks or modifies a method return value using an around advice:

    public aspect AdditionalReturnValueProcessing {
        pointcut methodCalls(TargetClass t, int a): . . .;

        private ReturnType retValue;

        ReturnType around(): methodCalls(/* captured arguments */) {
            retValue = proceed(/* captured arguments */);
            processOutput(/* captured arguments */);
            return retValue;
        }

        private void processOutput(/* arguments */) {
            // processing logic
        }
    }
In the around advice, we assign the original return value to the private attribute of the aspect. Afterwards, this value is processed by the processOutput() method and the result is returned by the around advice.

3.4 Grid Display Changes
It is often necessary to modify the way data are displayed or inserted. In web applications, data are often displayed in grids, and data input is usually realized via forms. Grids usually display the content of a database table or collation of data from multiple tables directly. Typical changes required on grid are adding columns, removing them, and modifying their presentation. A grid that is going to be modified must be implemented either as some kind of a reusable component or generated by row and cell processing methods. If the grid is hard coded for a specific view, it is difficult or even impossible to modify it using aspect-oriented techniques. If the grid is implemented as a data driven component, we just have to modify the data passed to the grid. This corresponds to the Additional Return Value Checking/Modification change (see Sect. 3.3). If the grid is not a data driven component, it has to be provided at least with the methods for processing rows and cells. Adding Column to Grid can be performed after an event of displaying the existing columns of the grid which brings us to the Performing Action After Event change type (see Sect. 3.1). Note that the database has to reflect the change, too. Removing Column from Grid requires a conditional execution of the method that displays cells, which may be realized as a Method Substitution change (see Sect. 3.2).
Alterations of a grid are often necessary due to software localization. For example, in Japan and Hungary, in contrast to most other countries, the surname is placed before the given names. The Altering Column Presentation in Grid change type requires preprocessing of all the data to be displayed in a grid before actually displaying them. This may be easily achieved by modifying the way the grid cells are rendered, which may be implemented again as a Method Substitution (see Sect. 3.2):

    public aspect ChangeUserNameDisplay {
        pointcut displayCellCalls(String name, String value):
            call(void UserTable.displayCell(..)) && args(name, value);

        void around(String name, String value): displayCellCalls(name, value) {
            if (name.equals("")) {
                . . . // display the modified column
            } else {
                proceed(name, value);
            }
        }
    }
3.5 Input Form Changes
Similarly to tables, forms are often subject to modifications. Users often want to add or remove fields from forms or perform additional checks of the form inputs, which constitute the Adding Fields to Form, Removing Fields from Form, and Introducing Additional Constraint on Fields change types, respectively. Note that for forms to be modifiable using aspect-oriented programming, they must not be hard coded in HTML, but generated by a method. Typically, they are generated from a list of fields implemented by an enumeration.

Going back to our example, assume that the merchant wants to know the genre of the music which is promoted by his affiliates. We need to add the genre field to the generic affiliate sign-up form and to the affiliate profile form to acquire the information about the genre to be promoted at different affiliate web sites. This is a change of the Adding Fields to Form type. To display the required information, we need to modify the affiliate table of the merchant panel to display genre in a new column. This can be realized by applying the Enumeration Modification change type to add the genre field, along with the already mentioned Additional Return Value Checking/Modification in order to modify the list of fields being returned (see Sect. 3.3).

The realization of the Enumeration Modification change type depends on the enumeration type implementation. Enumeration types are often represented as classes with a static field for each enumeration value. A single enumeration value type is represented as a class with a field that holds the actual (usually integer) value and its name. We add a new enumeration value by introducing the corresponding static field:
    public aspect NewEnumType {
        public static EnumValueType EnumType.NEWVALUE =
            new EnumValueType(10, "");
    }
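The class-based enumeration that the aspect above extends might look roughly as follows in plain Java. The class names EnumType and EnumValueType are taken from the aspect; the NAME and EMAIL values, their codes, and the values() accessor are assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// One enumeration value: an integer code plus a display name.
final class EnumValueType {
    final int value;
    final String name;

    EnumValueType(int value, String name) {
        this.value = value;
        this.name = name;
    }
}

// The enumeration itself: one static field per value, plus an accessor listing them.
final class EnumType {
    static final EnumValueType NAME = new EnumValueType(1, "name");   // hypothetical value
    static final EnumValueType EMAIL = new EnumValueType(2, "email"); // hypothetical value

    // A form generator would call this method to obtain the fields to render.
    static List<EnumValueType> values() {
        List<EnumValueType> all = new ArrayList<>();
        all.add(NAME);
        all.add(EMAIL);
        return all;
    }
}

final class EnumTypeDemo {
    public static void main(String[] args) {
        for (EnumValueType field : EnumType.values()) {
            System.out.println(field.value + " " + field.name);
        }
    }
}
```

Because values() is written by hand in this sketch, a static field introduced by an aspect would not show up in its result. This illustrates why the Enumeration Modification change is paired with a change to the returned list of values.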
The fields in a form are generated according to the enumeration values. The list of enumeration values is typically accessible via a method provided by the enumeration type. This method has to be addressed by an Additional Return Value Checking/Modification change.

An Additional Return Value Checking/Modification change is sufficient to remove a field from a form. The enumeration value would still be included in the enumeration, but this would not affect the form generation.

If we want to introduce additional validations on the form input data in a system without built-in validation, an Additional Parameter Checking change can be applied to the methods that process values submitted by the form. This change makes it possible to introduce an additional check or constraint on method arguments. For this, we have to specify a pointcut that captures all the calls of the affected methods along with their context, similarly as in Sect. 3.2. Their arguments are checked by the check() method, called from within an around advice, which throws WrongParamsException if they are not correct:

    public aspect AdditionalParameterChecking {
        pointcut methodCalls(TargetClass t, int a): . . .;

        ReturnType around(/* arguments */) throws WrongParamsException:
            methodCalls(/* arguments */) {
            check(/* arguments */);
            return proceed(/* arguments */);
        }

        void check(/* arguments */) throws WrongParamsException {
            if (arg1 != <desired value>)
                throw new WrongParamsException();
        }
    }
Adding a new validator to a system that already has built-in validation is realized by simply adding it to the list of validators. This can be done by implementing a Performing Action After Event change (see Sect. 3.1), which would add the validator to the list of validators after the list initialization.
4 Changing a Change
Sooner or later there will be a need for a change whose realization will affect some of the already applied changes. There are two ways to deal with this situation: a new change can be implemented separately using aspect-oriented programming, or the affected change source code can be modified directly. Either way, the changes remain separate from the rest of the application.

The possibility to implement a change of a change using aspect-oriented programming and without modifying the original change is given by the aspect-oriented programming language capabilities. Consider, for example, advices in AspectJ. They are unnamed, so they can't be referred to directly. The primitive pointcut adviceexecution(), which captures execution of all advices, can be restricted by the within() pointcut to a given aspect, but if an aspect contains several advices, the advices have to be annotated and accessed by the @annotation() pointcut, which was impossible in AspectJ versions that existed before Java was extended with annotations.

An interesting consequence of aspect-oriented change realization is the separation of crosscutting concerns in the application, which improves its modularity (and thus makes further changes easier) and may be seen as a kind of aspect-oriented refactoring. For example, in our affiliate marketing application, the integration with a newsletter, identified as a kind of One Way Integration, actually was a separation of integration connection, which may be seen as a concern of its own. Even if these once separated concerns are further maintained by direct source code modification, the important thing is that they remain separate from the rest of the application. Implementing a change of a change using aspect-oriented programming and without modifying the original change is interesting mainly if it leads to separation of another crosscutting concern.
5 Evaluation and Tool Support Outlooks
We have successfully applied the aspect-oriented approach to change realization to introduce changes into YonBan, a student project management system developed at Slovak University of Technology. It is based on J2EE, Spring, Hibernate, and Acegi frameworks. The YonBan architecture is based on the Inversion of Control principle and the Model-View-Controller pattern. We implemented the following changes in YonBan:

– Telephone number validator as Performing Action After Event
– Telephone number formatter as Additional Return Value Checking/Modification
– Project registration statistics as One Way Integration
– Project registration constraint as Additional Parameter Checking/Modification
– Exception logging as Performing Action After Event
– Name formatter as Method Substitution

No original code of the system had to be modified. Except in the case of project registration statistics and project registration constraint, which were well separated from the rest of the code, the changes would have required extensive code modifications if they had been implemented the conventional way.

We encountered one change interaction: between the telephone number formatter and validator. These two changes are interrelated (they would probably be part of one change request), so it comes as no surprise that they affect the same method. However, no intervention was needed. We managed to implement the changes easily even without a dedicated tool, but to cope with a large number of changes, such a tool may become crucial.
Even general aspect-oriented programming support tools, usually integrated with development environments, may be of some help in this. AJDT (AspectJ Development Tools) for Eclipse is a prominent example of such a tool. AJDT shows whether a particular piece of code is affected by advices, the list of join points affected by each advice, and the order of advice execution, all of which are important to track when multiple changes affect the same code. Advices that do not affect any join point are reported in compilation warnings, which may help detect pointcuts invalidated by direct modifications of the application base code such as identifier name changes or changes in method arguments.

A dedicated tool could provide much more sophisticated support. A change implementation can consist of several aspects, classes, and interfaces, commonly denoted as types. The tool should keep track of all the parts of a change. Some types may be shared among changes, so the tool should enable simple inclusion and exclusion of changes. This is related to change interaction, which is exhibited as dependencies between changes. A simplified view of change dependencies is that a change may require another change or that two changes may be mutually exclusive, but the dependencies between changes could be as complex as feature dependencies in feature modeling and accordingly represented by feature diagrams and additional constraints expressed as logical expressions [22] (which can be partly embedded into feature diagrams by allowing them to be directed acyclic graphs instead of just trees [8]).

Some dependencies between changes may be of a merely recommending character, i.e., they state whether changes are expected to be included together or not, but applying the changes remains meaningful either way. An example of this is features that belong to the same change request. Again, feature modeling can be used to model such dependencies with so-called default dependency rules that may also be represented by logical expressions [22].
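The simplified view of change dependencies described above (a change requires another change, two changes are mutually exclusive, or two changes are merely recommended together) can be sketched as a small consistency check. This is only an illustration under stated assumptions: the class `ChangeModel`, its fields, and the change identifiers are hypothetical names, not part of the approach or of any existing tool, and a full feature model with arbitrary logical expressions would need more than this.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeModel:
    """Hypothetical registry of dependencies between changes."""
    requires: dict = field(default_factory=dict)    # change -> set of changes it needs
    excludes: list = field(default_factory=list)    # (a, b) pairs that must not coexist
    recommends: dict = field(default_factory=dict)  # change -> changes usually applied with it

    def check(self, selected):
        """Return (errors, hints) for a selected set of change names."""
        selected = set(selected)
        errors, hints = [], []
        for change in sorted(selected):
            # Hard dependency: a required change is missing from the selection.
            for needed in sorted(self.requires.get(change, set()) - selected):
                errors.append(f"{change} requires {needed}")
            # Recommending dependency: the selection stays valid, so only hint.
            for buddy in sorted(self.recommends.get(change, set()) - selected):
                hints.append(f"{change} is usually applied with {buddy}")
        for a, b in self.excludes:
            if a in selected and b in selected:
                errors.append(f"{a} excludes {b}")
        return errors, hints


# The two telephone number changes belong together but do not strictly need each other.
model = ChangeModel(
    recommends={
        "TelephoneNumberFormatter": {"TelephoneNumberValidator"},
        "TelephoneNumberValidator": {"TelephoneNumberFormatter"},
    },
)
errors, hints = model.check({"TelephoneNumberFormatter"})
```

Selecting only the formatter yields no error but one hint, which mirrors the recommending character of dependencies between changes of the same change request.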
6 Related Work
The work presented in this paper is based on our initial efforts related to aspect-oriented change control [6], in which we related our approach to change-based approaches in version control. We identified that the problem with change-based approaches that could be solved by aspect-oriented programming is the lack of programming language awareness in change realizations. In our work on the evolution of web applications based on aspect-oriented design patterns and pattern-like forms [1], we reported the fundamentals of aspect-oriented change realizations based on the two-level model of domain specific and generally applicable change types, as well as four particular change types: Class Exchange, Performing Action After Event, and One/Two Way Integration. Applying feature modeling to maintain change dependencies (see Sect. 4) is similar to the constraints and preferences proposed in the SIO software configuration management system [4]. However, a version model for aspect dependency management [19] with an appropriate aspect model that makes it possible to control aspect recursion and stratification [2] would be needed as well.
We tend to regard changes as concerns, which is similar to the approach of facilitating configurability by separation of concerns in the source code [7]. This approach actually enables a kind of aspect-oriented programming on top of a versioning system. Parts of the code that belong to one concern need to be marked manually in the code. This makes it easy to plug concerns in or out. However, the major drawback, besides having to manually mark the parts of concerns, is that, unlike in aspect-oriented programming, concerns remain tangled in code. Others have explored several issues generally related to our work, but none of this work aims at capturing changes by aspects. These issues include database schema evolution with aspects [10] or aspect-oriented extensions of business processes and web services with crosscutting concerns of reliability, security, and transactions [3]. Also, an increased changeability of components implemented using aspect-oriented programming [13, 14, 18] and aspect-oriented programming with the frame technology [15], as well as enhanced reusability and evolvability of design patterns achieved by using generic aspect-oriented languages to implement them [20] have been reported. The impact of changes implemented by aspects has been studied using slicing in concern graphs [11]. While we do see potential in the configuration and reconfiguration of applications, our work does not aim at automatic adaptation in application evolution, such as event triggered evolutionary actions [17], evolution based on active rules [5], or adaptation of languages instead of software systems [12].
7 Conclusions and Further Work
In this paper, we have described our approach to change realization using aspect-oriented programming. We deal with changes at two levels, distinguishing between domain specific and generally applicable change types. We introduced change types specific to the web application domain along with corresponding generally applicable changes. We also discussed the consequences of having to implement a change of a change. Although the evaluation has shown that the approach can be applied even without dedicated tool support, we believe that tool support is important in dealing with change interactions, especially if their number is high. Our intent is to use feature modeling. With changes modeled as features, change dependencies could be tracked through feature dependencies. For further evaluation, it would be interesting to expand domain specific change types to other domains such as service-oriented architecture, for which we have a suitable application developed in Java available [21].

Acknowledgements. The work was supported by the Scientific Grant Agency of Slovak Republic (VEGA) grant No. VG 1/3102/06. We would like to thank Michael Grossniklaus for sharing his observations regarding our work with us.
References
[1] Bebjak, M., Vranić, V., Dolog, P.: Evolution of web applications with aspect-oriented design patterns. In: Brambilla, M., Mendes, E. (eds.) Proc. of ICWE 2007 Workshops, 2nd International Workshop on Adaptation and Evolution in Web Systems Engineering, AEWSE 2007, in conjunction with 7th International Conference on Web Engineering, ICWE 2007, Como, Italy, pp. 80–86 (July 2007)
[2] Bodden, E., Forster, F., Steimann, F.: Avoiding infinite recursion with stratified aspects. In: Hirschfeld, R., et al. (eds.) Proc. of NODe 2006. LNI P, vol. 88, pp. 49–64. GI, Erfurt (2006)
[3] Charfi, A., Schmeling, B., Heizenreder, A., Mezini, M.: Reliable, secure, and transacted web service compositions with AO4BPEL. In: 4th IEEE European Conf. on Web Services (ECOWS 2006), pp. 23–34. IEEE Computer Society, Switzerland (2006)
[4] Conradi, R., Westfechtel, B.: Version models for software configuration management. ACM Computing Surveys 30(2), 232–282 (1998)
[5] Daniel, F., Matera, M., Pozzi, G.: Combining conceptual modeling and active rules for the design of adaptive web applications. In: Workshop Proc. of 6th Int. Conf. on Web Engineering (ICWE 2006). ACM Press, New York (2006)
[6] Dolog, P., Vranić, V., Bieliková, M.: Representing change by aspect. ACM SIGPLAN Notices 36(12), 77–83 (2001)
[7] Fazekas, Z.: Facilitating configurability by separation of concerns in the source code. Journal of Computing and Information Technology (CIT) 13(3), 195–210 (2005)
[8] Filkorn, R., Návrat, P.: An approach for integrating analysis patterns and feature diagrams into model driven architecture. In: Vojtáš, P., Bieliková, M., Charron-Bost, B., Sýkora, O. (eds.) SOFSEM 2005. LNCS, vol. 3381, pp. 372–375. Springer, Heidelberg (2005)
[9] Goldschmidt, S., Junghagen, S., Harris, U.: Strategic Affiliate Marketing. Edward Elgar Publishing, London (2003)
[10] Green, R., Rashid, A.: An aspect-oriented framework for schema evolution in object-oriented databases. In: Proc. of the Workshop on Aspects, Components and Patterns for Infrastructure Software (in Conjunction with AOSD 2002), Enschede, Netherlands (April 2002)
[11] Khan, S., Rashid, A.: Analysing requirements dependencies and change impact using concern slicing. In: Proc. of Aspects, Dependencies, and Interactions Workshop (affiliated to ECOO 2008), Nantes, France (July 2006)
[12] Kollár, J., Porubän, J., Václavík, P., Bandáková, J., Forgáč, M.: Functional approach to the adaptation of languages instead of software systems. Computer Science and Information Systems Journal (ComSIS) 4(2) (December 2007)
[13] Kvale, A.A., Li, J., Conradi, R.: A case study on building COTS-based system using aspect-oriented programming. In: 2005 ACM Symposium on Applied Computing, pp. 1491–1497. ACM, Santa Fe (2005)
[14] Li, J., Kvale, A.A., Conradi, R.: A case study on improving changeability of COTS-based system using aspect-oriented programming. Journal of Information Science and Engineering 22(2), 375–390 (2006)
[15] Loughran, N., Rashid, A., Zhang, W., Jarzabek, S.: Supporting product line evolution with framed aspects. In: Workshop on Aspects, Components and Patterns for Infrastructure Software (held with AOSD 2004, International Conference on Aspect-Oriented Software Development), Lancaster, UK (March 2004)
[16] Miles, R.: AspectJ Cookbook. O’Reilly, Sebastopol (2004)
[17] Molina-Ortiz, F., Medina-Medina, N., García-Cabrera, L.: An author tool based on SEM-HP for the creation and evolution of adaptive hypermedia systems. In: Workshop Proc. of 6th Int. Conf. on Web Engineering (ICWE 2006). ACM Press, New York (2006)
[18] Papapetrou, O., Papadopoulos, G.A.: Aspect-oriented programming for a component based real life application: A case study. In: 2004 ACM Symposium on Applied Computing, pp. 1554–1558. ACM, Nicosia (2004)
[19] Pulvermüller, E., Speck, A., Coplien, J.O.: A version model for aspect dependency management. In: Dannenberg, R.B. (ed.) GCSE 2001. LNCS, vol. 2186, pp. 70–79. Springer, Heidelberg (2001)
[20] Rho, T., Kniesel, G.: Independent evolution of design patterns and application logic with generic aspects - a case study. Technical Report IAI-TR-2006-4, University of Bonn, Bonn, Germany (April 2006)
[21] Rozinajová, V., Braun, M., Návrat, P., Bieliková, M.: Bridging the gap between service-oriented and object-oriented approach in information systems development. In: Avison, D., Kasper, G.M., Pernici, B., Ramos, I., Roode, D. (eds.) Proc. of IFIP 20th World Computer Congress, TC 8, Information Systems, Milano, Italy. Springer, Boston (2008)
[22] Vranić, V.: Multi-paradigm design with feature modeling. Computer Science and Information Systems Journal (ComSIS) 2(1), 79–102 (2005)
Assessing the Quality of Quality Gate Reference Processes

Thomas Flohr

FG Software Engineering, Leibniz Universität Hannover
Welfengarten 1, 30167 Hannover, Germany
[email protected]
Abstract. Many software developing companies use Quality Gates to mitigate quality problems and to steer projects in time. The necessary structures, activities, methods, roles and documents can be encapsulated in a Quality Gate reference process, which can then be tailored to fulfill the needs of different projects. Each company has to implement a Quality Gate reference process individually because quality and business goals differ. In order to improve the quality of a Quality Gate reference process, a company has to assess the quality of the implemented Quality Gate reference process. This paper presents a concept for conducting such an assessment by assessing the concepts of a Quality Gate reference process separately. The concepts to be assessed were identified by an empirical study involving several companies and by analyzing current literature. The assessment concept was validated by assessing the quality of different Quality Gate reference processes from literature.

Keywords: Quality Gates, Process Assessment, Continuous Process Improvement.
1 Introduction
Quality Gates are significant milestones and decision points within a project [5,7]. At each Quality Gate certain project results are evaluated against predefined and quality focused criteria. Based on the fulfillment of these criteria, gatekeepers (who are usually part of the quality management) decide whether a project may proceed or not. Consequently, the quality situation of a project can be revealed to the management and actions can be taken in time. Quality Gates are often used in certain domains, e.g. in car development or in serial production of industrial goods [8]. In the domain of software development Quality Gates have been used increasingly in recent years [9]. Unfortunately, a theoretical foundation for Quality Gates and for the assessment of the process quality of Quality Gate reference processes is currently missing in the domain of software development. Assessments are necessary in order to identify potential shortcomings within an implemented Quality Gate reference process. A negative assessment can be used as a starting point of a continuous improvement process. A positive assessment can be used to attest to a project’s client the ability to control quality and to steer a project.

Z. Huzar et al. (Eds.): CEE-SET 2008, LNCS 4980, pp. 207–217, 2011. © IFIP International Federation for Information Processing 2011
A software company can use Quality Gates in two ways (we will refer to them as strategies):
– Quality Gates as a quality guideline: The same set of Quality Gates (and criteria) is applied to all projects, resulting in a comparable quality level and a common minimum quality level across all these projects.
– Quality Gates as a flexible quality strategy: A suitable Quality Gate process is applied to each project to exactly meet the project’s needs.
A Quality Gate reference process encapsulates special structures, activities, methods, roles and documents, which a software company can implement individually in order to satisfy its quality and business needs. This Quality Gate reference process can then be tailored to meet the needs of a given project. The result is a Quality Gate process, containing a set of criteria and a set of Quality Gates. Moreover, the intensity of the gate review and other methods and activities are determined. In the final step the Quality Gate process is instantiated by assigning persons to the roles and by assigning a fixed date to each determined Quality Gate. A gate management (which is usually part of the quality management) can continuously improve the implemented Quality Gate reference process. To achieve this task, the gate management needs to know possible shortcomings: the assessment concept described in this paper can provide strong assistance here. Figure 1 summarizes the different steps of tailoring and instantiation.

[Figure 1: the Quality Gate reference process (result), subject to Continuous Improvement (activity), is turned by Tailoring (activity) into a Quality Gate process (result), which is turned by Instantiation (activity) into an instantiated Quality Gate process (result).]

Fig. 1. Tailoring of a Quality Gate reference process
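The two steps summarized in Fig. 1 can be sketched in a few lines of code. This is a minimal sketch under stated assumptions, not a prescribed implementation: the names `QualityGateProcess`, `InstantiatedProcess`, `tailor`, `instantiate`, and the contents of `REFERENCE_PROCESS` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class QualityGateProcess:
    gates: tuple     # ordered Quality Gate names
    criteria: tuple  # criteria checked at the gates


@dataclass(frozen=True)
class InstantiatedProcess:
    process: QualityGateProcess
    roles: dict       # role name -> person
    gate_dates: dict  # gate name -> fixed date


# Hypothetical reference process: one entry per project type.
REFERENCE_PROCESS = {
    "small": {"gates": [], "criteria": []},
    "high-risk": {
        "gates": ["Requirements completed", "Design completed"],
        "criteria": ["review protocol exists"],
    },
}


def tailor(reference, project_type):
    """Tailoring: derive a Quality Gate process for one project type."""
    entry = reference[project_type]
    return QualityGateProcess(tuple(entry["gates"]), tuple(entry["criteria"]))


def instantiate(process, roles, gate_dates):
    """Instantiation: assign persons to roles and a fixed date to every gate."""
    missing = [gate for gate in process.gates if gate not in gate_dates]
    if missing:
        raise ValueError(f"no date assigned to gates: {missing}")
    return InstantiatedProcess(process, dict(roles), dict(gate_dates))
```

A call such as `instantiate(tailor(REFERENCE_PROCESS, "high-risk"), roles, dates)` fails early if a determined Quality Gate has no fixed date, which is the one constraint on instantiation that the text states explicitly.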
1.1 Outline
This paper is structured in four main sections. Section two shows the concepts a software company has to implement to gain a Quality Gate reference process. Section three presents our assessment concept as well as the possible impacts resulting from shortcomings in the implementation. It also describes how the assessment concept can be used as a starting point for a continuous improvement process. Section four shows the application of the concept to different Quality Gate reference processes from literature. Finally, section five contains a conclusion and an outlook.
2 Concepts of Quality Gate Reference Processes
In order to assess the process quality of a Quality Gate reference process we need to identify its concepts first. The concepts were identified through an empirical survey conducted among software companies. The survey lasted three months and was conducted in 2007. Overall, 11 questionnaires were returned and evaluated. Furthermore, Quality Gate reference processes from literature [3,7] and from the V-Model XT reference process [2] of the German federal administration were analyzed. To keep track of the identified concepts, the concepts are structured in different categories. The categories and their concepts are described in detail in the following sections.

2.1 Structural Concepts
The structural category contains only one concept: the gate network. A Quality Gate reference process can have an arbitrary number of gate networks. Each gate network holds information on a set of Quality Gates and the order in which these Quality Gates have to be passed. Each gate network is usually assigned to a certain project type. Smaller projects tend to have very few or even no Quality Gates, because the resource overhead is too high. However, important or high-risk projects usually have to pass more (or a maximum number of) Quality Gates. If a software company pursues the strategy Quality Gates as a quality guideline, the company’s Quality Gate reference process holds only one gate network, which is applied to all projects. The strategy Quality Gates as a flexible quality strategy allows more than one gate network. Figure 2 shows a classical waterfall process while Figure 3 shows a gate network which can be applied to the waterfall process.
[Figure 2: phases Requirements, Design, Implementation, Testing, Rollout]

Fig. 2. A classical waterfall process

[Figure 3: Quality Gates Requirements completed, Design completed, Test completed, Rollout completed]

Fig. 3. A possible gate network for a waterfall process
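A gate network, as described above, is essentially an ordered set of Quality Gates that have to be passed in sequence. The following sketch illustrates that idea for the gate network of Fig. 3; the class `GateNetwork` and its methods are hypothetical helpers introduced only for this illustration.

```python
class GateNetwork:
    """Ordered set of Quality Gates for one project type (hypothetical helper)."""

    def __init__(self, gates):
        self.gates = list(gates)  # order in which the gates have to be passed
        self.passed = []          # gates passed so far

    def next_gate(self):
        """Return the next gate to pass, or None when all gates are passed."""
        if len(self.passed) < len(self.gates):
            return self.gates[len(self.passed)]
        return None

    def pass_gate(self, gate):
        """Record a passed gate; gates must be passed in network order."""
        expected = self.next_gate()
        if gate != expected:
            raise ValueError(f"expected gate {expected!r}, got {gate!r}")
        self.passed.append(gate)


# Gate network from Fig. 3, applicable to the waterfall process of Fig. 2.
waterfall_gates = GateNetwork([
    "Requirements completed",
    "Design completed",
    "Test completed",
    "Rollout completed",
])
```

Under the strategy Quality Gates as a quality guideline a reference process would hold exactly one such object; under the flexible strategy it would hold several, one per project type.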
2.2 Criteria Concepts
Criteria concepts concern the creation of criteria. More precisely, they concern how and when criteria are created and which roles are responsible for creating them. Table 1 summarizes the criteria concepts.

Table 1. Overview of the identified criteria concepts

Concept | Description
Criteria Creation | The criteria creation defines exactly when in a project the creation of criteria takes place. The creation can take place at the project’s start, in the planning phase or in the conduction phase. Furthermore, (systematic) methods for criteria creation and the individuality of the criteria have to be defined. For example, the strategy Quality Gates as a quality guideline requires fixing the criteria in a catalogue, resulting in a low individuality of criteria.
Criteria Creator | A software company must define which roles are responsible for the creation of criteria. If a software company pursues the strategy Quality Gates as a quality guideline, criteria are created by the process management and are continuously improved by a dedicated gate management. Depending on the abstractness of the criteria, the creation also requires interpreting criteria in a project’s context to make them applicable. Usually, the interpretation is negotiated between the (internal) customers and (internal) contractors of a project.
Criteria | Quality Gate criteria are usually quality oriented. Nonetheless, it is possible to check other criteria (e.g. return on investment or market attractiveness) to some extent. It is important that a software company defines what types of criteria are allowed in its Quality Gate reference process.
2.3 Review Concepts
Review concepts concern the systematic process of checking a project’s results against predefined criteria. Table 2 summarizes the review concepts.

2.4 Steering Concepts
Steering concepts concern the decision making that has to be done as part of the gate review. Table 3 summarizes the steering concepts.

2.5 Tailoring Concepts
Tailoring concepts concern the tailoring and continuous improvement of an implemented Quality Gate reference process (compare Figure 1). Table 4 summarizes the tailoring concepts.
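Two of the tailoring concepts listed in Table 4, the project model and a systematic tailoring method, lend themselves to a short sketch. Everything concrete here is an assumption made for illustration: the attributes and values in `PROJECT_MODEL`, the gate networks in `GATE_NETWORKS`, and the rules in `tailor_gate_network` are hypothetical; only the general shape (attributes with value sets, a project situation assigning one value per attribute, and a mapping from situation to gate network) follows the text.

```python
# Project model: attribute -> allowed values (all values are hypothetical).
PROJECT_MODEL = {
    "size": {"small", "medium", "large"},
    "risk": {"low", "high"},
}

# Tailorable element: the reference process offers several gate networks.
GATE_NETWORKS = {
    "none": [],
    "minimal": ["Design completed", "Rollout completed"],
    "full": ["Requirements completed", "Design completed",
             "Test completed", "Rollout completed"],
}


def validate_situation(situation):
    """A project situation assigns one allowed value to each model attribute."""
    if set(situation) != set(PROJECT_MODEL):
        raise ValueError("situation must cover exactly the model's attributes")
    for attribute, value in situation.items():
        if value not in PROJECT_MODEL[attribute]:
            raise ValueError(f"{value!r} is not a valid {attribute}")


def tailor_gate_network(situation):
    """Systematic tailoring method: fixed rules map a situation to a gate network."""
    validate_situation(situation)
    if situation["risk"] == "high" or situation["size"] == "large":
        return GATE_NETWORKS["full"]     # important or high-risk projects pass more gates
    if situation["size"] == "small":
        return GATE_NETWORKS["none"]     # small projects may have no gates at all
    return GATE_NETWORKS["minimal"]
```

Because the rules are fixed, two projects with the same situation always receive the same gate network, which is what distinguishes a systematic tailoring method from an intuitive one.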
Table 2. Overview of the identified review concepts

Concept | Description
Gate Review | Within the gate review a project’s results are checked against the criteria defined in the criteria creation. Similar to technical reviews (e.g. inspections [4] or peer reviews [10]), different intensities of gate reviews exist. Depending on the intensity, a gate review requires more or fewer resources, but varies in reliability at the same time.
Gate Moderator | A gate moderator is responsible for a smooth and efficient conduction of the gate review. A software company has to map a role to the role of a gate moderator to ensure that all gate reviews run smoothly.
Reviewer | The main task of a reviewer is to assess the quality of the project results against the criteria. Ideally, each reviewer possesses the necessary technical abilities to conduct the assessment without problems.
Project Representative | A project representative answers questions and defends his project within the gate review. A software company should assign a project role here to ensure that checking failures are avoided (e.g. a project result being overlooked).
Protocol | The protocol captures different results of a gate review. Major results are: the decision, the degree of fulfillment of the criteria and the actions to be taken. A Quality Gate reference process should define a template as a guideline for the protocol.
Protocol Writer | The protocol writer captures the protocol of a gate review. A software company should assign a role here to ensure that the protocol is captured consistently.
3 The Assessment Concept
The main idea of our assessment concept is very close to the idea of process capability maturity models such as SPICE [6] and CMMI [1]: it does not matter how a software company implements a Quality Gate concept because the actual implementation depends on the company’s size and its domain. Rather, it is only relevant to rate the degree of implementation of a concept. Nonetheless, a faulty implementation of a concept can cause problems even if the concept is fully implemented. Our assessment concept differs in two ways from the well-known process capability maturity models:
– SPICE and CMMI do not directly advocate the usage of Quality Gates in order to evaluate a project’s results. Rather, quality checks might be performed by other activities too. Therefore SPICE and CMMI are not a proper starting point to assess Quality Gate reference processes in detail.
Table 3. Overview of the identified steering concepts

Concept | Description
Decisions | A software company has to define the actions which can be taken in a Quality Gate. Possible decisions are arbitrary combinations of the following decisions: go, conditional-go, repeat-gate, hold and kill. The allowed decisions are a subset of these decisions.
Gatekeeper | Gatekeepers are decision makers. A software company has to set a profile for a gatekeeper. Usually gatekeepers have a technical or quality management background. Nonetheless, if business criteria are checked within a Quality Gate, the profile of a gatekeeper has to be defined accordingly. Additionally, it has to be defined which types of gatekeepers can make which types of decision.
Decision Support | Decision support concerns methods to map the degree of fulfillment and the importance of criteria to a decision. Decision support can be implemented either systematically or intuitively.
– Our assessment concept does not include certain maturity levels a software company can develop in. Each concept can be improved individually. Nonetheless, it is possible the concepts in one category.
Based on this idea all concepts can be assessed on a three-valued ordinal scale. The following listing explains the values of the scale.
– A • denotes a fully implemented concept. This means that it is clear how the concept has to be mapped to a project in order to be applicable. A fully implemented concept must be fixed within a process description. For example, a role with a clear and fixed ability profile is a fully implemented concept.
– A  denotes a partly implemented concept. A partly implemented concept must be interpreted in order to be applicable. Partly implemented concepts are often fixed only as an abstract description, or a written description is missing but it is intuitively clear how the concept has to be applied. Sometimes it is necessary to leave a concept abstract because it must be applied to different business units of the software company. For example, the protocol concept is partly implemented if most people in a company know how to write the protocol but no fixed template exists.
– A ◦ denotes an unimplemented concept. Unimplemented concepts do not provide any hints on how to apply the concept. Reasons could be:
  • The process management forgot to implement the concept.
  • The concept was left unimplemented because the Quality Gate reference process must be used in different business units in the company and each business unit has to implement it individually.
  • The concept was intentionally left unimplemented because the process management regards it as unimportant.
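Because the scale is ordinal, an assessment can be recorded as a mapping from concept to degree of implementation and then sorted. The following is a minimal sketch, assuming hypothetical names (`Degree`, `shortcomings`) and a made-up example assessment; it is not an assessment of any real reference process.

```python
from enum import IntEnum


class Degree(IntEnum):
    """Three-valued ordinal scale for the degree of implementation."""
    UNIMPLEMENTED = 0
    PARTLY = 1
    FULLY = 2


def shortcomings(assessment):
    """Return the concepts that are not fully implemented, worst first."""
    open_items = [(degree, concept) for concept, degree in assessment.items()
                  if degree < Degree.FULLY]
    return [concept for degree, concept in sorted(open_items)]


# Made-up assessment: the protocol is known informally but has no fixed template.
example = {
    "Gate Network": Degree.FULLY,
    "Protocol": Degree.PARTLY,
    "Gate Management": Degree.UNIMPLEMENTED,
}
```

Using an integer-backed enumeration keeps the ordering of the scale explicit without implying that the distance between two levels is meaningful.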
Table 4. Overview of the identified tailoring concepts

Concept | Description
Gate Management | The gate management is responsible for the continuous improvement of an implemented Quality Gate reference process. Depending on the size of a company, the gate management can be implemented in various intensities ranging from non-implemented to a dedicated gate management.
Process Tailorer | A process tailorer’s task is to tailor a suitable Quality Gate process (compare Figure 1). A software company has to assign a role that is responsible for tailoring.
Tailoring Method | A tailoring method maps a project situation and the tailorable elements to a suitable Quality Gate process. A tailoring method can be either systematic or intuitive.
Tailorable Elements | Tailorable elements concern the aspects of a Quality Gate reference process which can be tailored to better match a given project situation. For example, if the gate network is a tailorable concept, the Quality Gate reference process has to provide various gate networks.
Project Model | A project model helps to formally describe various project situations. A project model contains a set of attributes (e.g. project size, domain and risk) and for each attribute a set of values. A project situation assigns a value to each attribute. In this way a project can be formally described. In order to tailor a Quality Gate reference process effectively and repeatably, a software company has to design a project model.

3.1 Impacts of Shortcomings
Depending on the degree of implementation of a certain concept, different impacts might exist. Table 5 shows an overview of possible impacts caused by shortcomings in the implementation of the concepts.

3.2 Continuous Improvement
An assessment can be used as a starting point of a continuous improvement process. A continuous improvement process includes the following steps (which ideally have to be repeated in cyclic order):
1. Either the process management or the dedicated gate management (or at best external assessors) conducts an assessment of the software company’s Quality Gate reference process.
2. Based on the shortcomings, possible impacts are identified (compare Table 5). The identified impacts are checked against existing problems, resulting in a set of concepts which have to be improved.
Table 5. Overview of possible impacts

Shortcoming | Description | Impacts
Undefined Gate Network | The set of Quality Gates or the order of Quality Gates is unclear. | Different Gate Networks might be used. Comparability between projects and quality level might be lower.
Undefined Role | Unqualified persons could be assigned to a role or a role stays unallocated. | Depending on the role, project results might be checked inadequately, wrong decisions are made, inadequate criteria might be created, the protocol is inadequate or activities (especially the gate review) become tenacious.
Undefined Activities | The gate review or the criteria definition might be unclear. | Inadequate criteria might be created or project results are checked inadequately.
Undefined Protocol | The contents of the protocol is unclear. | Decisions, criteria assessment or actions might be untraceable in the future.
Undefined Tailoring and Gate Management | It is unclear which concepts can be tailored in order to obtain a suitable Quality Gate process. A gate management is not implemented. | Inadequate Quality Gate processes might be applied to projects. Quality Gate processes might be used inconsistently despite similar projects. The Quality Gate reference process is not continuously improved.
Undefined Type of Criteria | Different types of criteria might be applied to (similar) projects. | Non-quality related criteria might be checked excessively or non-quality related criteria are not checked (despite it is necessary in a given project). Project results might be checked against inconsistent criteria.
Undefined Decisions and Decision support | It is unclear which decisions can be made within a Quality Gate and who is allowed to make certain decisions. Systematic methods to receive repeatable decisions beyond the scope of a project are not implemented. | Decisions are made inconsistently. Possible decisions are not made while impossible decisions might be taken.
3. The concepts which were identified in the last step are set as improvement goals. After the goals are achieved, proceed with step 1.

It is possible to improve the implementation of a concept by one level in each improvement cycle. For example, if a role is completely unimplemented, we could first implement an abstract role profile in the first cycle (leading to a partly implemented concept) and then (after enough experience was gathered) refine and fix the role profile in the second (or later) cycle.
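Steps 2 and 3 of the improvement cycle, together with the rule of improving a concept by one level per cycle, can be sketched as follows. The sketch assumes hypothetical names (`Degree`, `improvement_goals`, `improve`) and represents the possible impacts of Table 5 as plain sets of strings supplied by the caller; it does not encode Table 5 itself.

```python
from enum import IntEnum


class Degree(IntEnum):
    UNIMPLEMENTED = 0
    PARTLY = 1
    FULLY = 2


def improvement_goals(assessment, possible_impacts, observed_problems):
    """Step 2: keep the concepts whose possible impacts match existing problems."""
    goals = []
    for concept, degree in assessment.items():
        if degree == Degree.FULLY:
            continue  # fully implemented concepts are not shortcomings
        if possible_impacts.get(concept, set()) & observed_problems:
            goals.append(concept)
    return goals


def improve(assessment, goals):
    """Step 3: raise each goal concept by one level, never beyond FULLY."""
    improved = dict(assessment)
    for concept in goals:
        improved[concept] = Degree(min(improved[concept] + 1, Degree.FULLY))
    return improved
```

Calling `improve` once per cycle reproduces the example from the text: an unimplemented role first becomes partly implemented and reaches the fully implemented level only in a later cycle.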
4
Practical Application of the Assessment Concept
Our assessment concept was applied to different Quality Gate reference processes from literature. Table 6 summarizes the results of the assessment of these Quality Gate reference processes. The assessment shows different problems. Although Pfeifer’s Quality Gate reference process pursues the strategy Quality Gates as a flexible quality strategy, no tailoring concepts are implemented, leading to different possible impacts (see Table 5, row Undefined Tailoring and Gate Management). The Quality Gate reference process of the V-Model XT leaves the criteria concept and the decision support concept unimplemented. Furthermore, no gate management is implemented. Consequently, project results might be checked against different criteria and project results might be judged inconsistently from project to project.

Table 6. The assessment concept applied to Quality Gate reference processes

Category | Concepts | Pfeifer [7] | V-Model XT [2] | Stage-Gate [3]
Structural Concepts | Gate Network | • | • | •
Criteria Concepts | Criteria Creation, Criteria Creator, Criteria | • • | • ◦ | •
Review Concepts | Gate Review, Gate Moderator, Reviewer, Project Representative, Protocol, Protocol Writer | ◦ • ◦ | • • • • • • | • • ◦
Steering Concepts | Decision, Gatekeeper, Decision Support | • • | • • ◦ | • •
Tailoring Concepts | Gate Management, Process Tailorer, Tailoring Method, Tailorable Elements, Project Model | ◦ ◦ ◦ ◦ ◦ | • • • • • | ◦ •
The Quality Gate reference process of Cooper’s Stage-Gate concept leaves two concepts unimplemented: process tailorer and protocol writer. Therefore, it is unclear who is responsible for the tailoring. Decisions and actions might be untraceable, because a proper protocol might not be created.
5 Conclusion and Outlook
In this paper a concept to assess the process quality of a Quality Gate reference process was presented. In order to successfully establish Quality Gates, a software company has to implement certain concepts. These concepts were identified by conducting an empirical study involving several software companies and by analyzing literature. Depending on which concepts have been left unimplemented, certain impacts are possible. An assessment makes these impacts visible to the process management. The assessment is then a starting point for a continuous improvement process. Furthermore, it can be used to show clients that Quality Gates are properly implemented (in case the assessment was positive). The assessment concept was applied to different Quality Gate reference processes from literature. Thus several possible impacts could be identified. Our assessment concept has not been applied to real Quality Gate reference processes implemented in companies so far. Applications of our assessment concept in software companies are necessary and planned. These applications could possibly lead to a refined assessment scale and to the identification of more concepts.
References
1. CMMI for Development Version 1.2. Carnegie Mellon Software Engineering Institute, SEI (2006)
2. V-Modell XT (Version 1.2). Koordinierungs- und Beratungsstelle der Bundesregierung für Informationstechnik in der Bundesverwaltung (2006)
3. Cooper, R.G.: Winning At New Products: Accelerating the Process from Idea to Launch. Perseus Books Group, Cambridge (2001)
4. Fagan, M.E.: Design and Code Inspections to Reduce Errors in Program Development. IBM Systems Journal 15, 258–287 (1976)
5. Hawlitzky, N.: Integriertes Qualitätscontrolling von Unternehmensprozessen Gestaltung eines Quality Gate-Konzeptes. TCW Wissenschaft und Praxis. TCW Transfer-Centrum (2002)
6. Hörmann, K., Dittmann, L., Hindel, B., Müller, M.: SPICE in der Praxis. In: Interpretationshilfe für Anwender und Assessoren. dpunkt Verlag (2006)
7. Pfeifer, T., Schmidt, R.: Das Quality-Gate-Konzept: Entwicklungsprojekte softwareintensiver Systeme verlässlich planen. Industrie Management 19(5), 21–24 (2003)
8. Scharer, M.: Quality Gate-Ansatz mit integriertem Risikomanagement. PhD thesis, Institut für Werkzeugmaschinen und Betriebstechnik der Universität Karlsruhe (2001)
9. Schubert, P., Guiver, T., MacDonald, R., Yu, F.: Using Quality Measures to Manage Statistical Risks in Business Surveys. In: European Conference on Quality in Survey Statistics (2006)
10. Wiegers, K.E.: Peer Reviews in Software: A Practical Guide. Addison-Wesley Information Technology Series (2002)
Exploratory Comparison of Expert and Novice Pair Programmers

Andreas Höfer

Universität Karlsruhe (TH), IPD Institute, Am Fasanengarten 5, D-76131 Karlsruhe, Germany
Tel.: +49 721 608-7344
[email protected]
Abstract. We conducted a quasi-experiment comparing novice pair programmers to expert pair programmers. The expert pairs wrote tests with higher instruction, line, and method coverage but were slower than the novices. The pairs within both groups switched keyboard and mouse possession frequently. Furthermore, most pairs did not share the input devices equally but rather had one partner who was more active than the other.

Keywords: pair programming, experts and novices, quasi-experiment.
1 Introduction
Pair programming has been investigated in several studies in recent years. The experience of the subjects with pair programming in these studies varies widely: At one extreme are novices with little or no pair programming experience who have just been trained in agile programming techniques; at the other extreme are experts with several years of experience with agile software development in industry. It seems rather obvious that expertise has an effect on the pair programming process and therefore on the outcome of a study comparing pair programming to some other technique. Yet, the nature of the differences between experts and novices has not been investigated so far. Knowing more about these differences is nevertheless of interest for the training of agile techniques as well as for the assessment of studies on this topic. This study presents an exploratory analysis of the data of nine novice and seven expert pairs, exposing differences between the groups as well as identifying common attributes of their pair programming processes.
2 Related Work
When it comes to research on pair programming, a large part of the studies focus on the effectiveness of pair programming. Research on that topic has produced significant results, as summarized in a meta-study by Dybå et al. [1]. They analyzed the results of 15 studies comparing pair and solo programming and conclude that quality and duration favor pair programming while effort favors solo programming.

Z. Huzar et al. (Eds.): CEE-SET 2008, LNCS 4980, pp. 218–231, 2011. © IFIP International Federation for Information Processing 2011

Arisholm et al. [2] conducted a quasi-experiment with 295 professional Java consultants in which they examined the effect of programmer expertise and task complexity on the effectiveness of pair programming compared to solo programming. They measured the duration for task completion, the effort, and the correctness of the solutions. The participants had three different levels of expertise, namely junior, intermediate, and senior, and worked on maintenance tasks on two functionally equivalent Java applications with differing control styles. The authors conclude that pair programming is not beneficial in general because of the observed increase in effort. Nevertheless, the results indicate positive effects of pair programming for inexperienced programmers solving complex tasks: The junior consultants had a 149 percent increase in correctness when solving the maintenance tasks on the Java application with the more complex, delegated control style.

Other studies have taken an experimental approach to identify programmer characteristics critical to pair success: Domino et al. [3] examined the importance of cognitive ability and conflict handling style. In their study, 14 part-time students with industrial programming experience participated. Cognitive ability was measured with the Wonderlic Personnel Test (WPT), conflict handling style with the Rahim Organizational Conflict Inventory (ROCI-II). The performance of a pair was correlated neither with its cognitive ability nor with its conflict handling style. Chao et al. [4] first surveyed professional programmers to identify the personality traits perceived as important for pair programming. They then conducted an experiment with 58 undergraduate students to identify the crucial personality traits for pair success. The experiment yielded no statistically significant results. Katira et al.
[5] examined the compatibility of student pair programmers among 564 freshman, undergraduate, and graduate students. They found a positive correlation between the students’ perception of their partners’ skill level and the compatibility of the partners. Pairs in the freshman course were more compatible if the partners had different Myers-Briggs personality types. Sfetsos et al. [6] present the results of two experiments comparing the performance of 22 student pairs with different Keirsey temperaments to 20 student pairs with the same Keirsey temperament. The pairs with different temperaments performed better with respect to the total time needed for task completion and points earned for the tasks. The pairs with different temperaments also communicated more than the pairs with the same temperament.

Furthermore, there are several field studies reporting on data from professional programmers, some of them including video analysis of pair programming sessions. None of these studies were designed to produce statistically significant results, but the observations made are valuable, because they show how pair programmers behave in typical working environments. Bryant [7] presents data from fourteen pair programming sessions in an internet banking company, half of which were videotaped. Initial findings suggest that expert pair programmers interact less than pair programmers with less expertise. Additionally, partners in expert pairs showed consistent behavior no matter which role they played, whereas less experienced pair programmers showed no stable activity pattern and acted differently from one another. Bryant et al. [8] studied 36 pair programming sessions of professional programmers working in their familiar work environment. They classified programmers’ verbalizations according to sub-task (e.g. write code, test, debug, etc.). They conclude that pair programming is highly collaborative, although the level of collaboration depends on the sub-task. In a follow-up study, Bryant et al. [9] report on data of 24 pair programming sessions. The authors observe that the commonly assumed roles of the navigator acting as a reviewer and working on a higher level of abstraction do not occur. They propose an alternative model for pair interaction in which the roles are rather equal. Chong and Hurlbutt [10] are also skeptical about the existence of the driver and navigator roles. They observed two development teams in two companies for four months. They state that the observed behavior of the pair programmers is inconsistent with the common description of the roles driver and navigator. Both programmers in a pair were mostly at the same level of abstraction while discussing; different roles could not be observed.
3 Study
The following sections describe the study, which was motivated by the following research hypotheses:

RHtime. The expert pairs need less time to complete a task than the novice pairs. This assumption is based on the results from a quasi-experiment comparing the test-driven development processes of expert and novice solo programmers [11], in which the experts were significantly faster than the novices.

RHcov. The expert pairs achieve a higher test coverage than the novice pairs. Like the research hypothesis above, this one is based on the findings in [11].

RHconf. The partners in the expert pairs compete less for the input devices than the partners in the novice pairs. In our extreme programming lab course, we observed that the students were competing for the input devices. Hence, we thought this might be an indicator of an immature pair programming process.

3.1 Participants
The novice group consisted of 18 Computer Science students from an extreme programming lab course [12] in which they learned the techniques of extreme programming and applied them in a project week. They participated in the quasi-experiment in order to get their course credits. On average, they were in their seventh semester and had about five years of programming experience, including two years of programming experience in Java. Six members of the novice group reported prior experience with pair programming, three of them in an industrial project. Only one novice had used JUnit before the lab course; none had tried test-driven development before. For the assignment of the pairs, the experimenter asked each novice for three favorite partners and then assigned the pairs according to these preferences. Only pair N6 could not be matched based on their preferences.

The group of experts was made up of 14 professional software developers. All experts came from German IT companies, 13 from a company specialized in agile software development and consulting. One expert took part in his spare time and was remunerated by the experimenter; the others participated during normal working hours, so all experts were compensated. All experts have a diploma in Computer Science or in Business Informatics. On average, they had 7.5 years of programming experience in industrial projects, including on average five years of experience with pair programming, about three years of experience with test-driven development, five years of experience with JUnit, and seven years of experience with Java. The expert pairs were formed based on their preferences and time schedule.

3.2 Task
The pairs had to complete the control program of an elevator system written in Java. The system distinguishes between requests and jobs. A request is triggered if an up or down button outside the elevator is pressed. A job is assigned to the elevator after a passenger chooses the destination floor inside the elevator. The elevator system is driven by a discrete clock. For each cycle, the elevator control expects a list of requests and jobs and decides according to the elevator state which actions to perform next. The elevator control is driven by a finite automaton with four states: going-up, going-down, waiting, and open. The task description contained a state transition diagram explaining the conditions for switching from one state to another and the actions to be performed during a state switch. To keep the effort manageable, only the open-state of the elevator control had to be implemented.

The pairs received a program skeleton which contained the implementation of the other three states. This skeleton comprises ten application and seven test classes with 388 and 602 non-commented lines of code, respectively. The set of unit tests provided with the program skeleton uses mock objects [13, 14] to decouple the control of the elevator logic from the logic that administrates the incoming jobs and requests. However, the mock-object implementation in the skeleton does not provide enough functionality to develop the whole elevator control. Other functionality has to be added to the mock object to test all desired features of the elevator control. Thus, the number of lines of test code may be higher than the number of lines of application code. The mock object also contributes to the line count.

3.3 Realization
Implementation took place during a single programming session. All pairs worked at a workplace equipped with two cameras and a computer with screen capture software [15] installed. All novice pairs and one expert pair worked in an office within the Computer Science department. For the other expert pairs, an equivalent workplace was set up in a conference room at their company.
There was an implicit time limit due to the cameras’ recording capacity of seven hours. Additionally, the task description states that the task can be completed in approximately four to five hours. Each participant recorded interruptions such as going to the bathroom or lunch breaks. The time logs were compared to the video recordings to ensure consistency. Apart from pair programming, the participants were asked to use test-driven development to solve the programming task. The pairs had to work on the problem until they were convinced they had an error-free solution that would pass an automatic acceptance test, ideally at the first attempt. If the acceptance test failed, the pair was asked to correct the errors and to retry as soon as they were sure that the errors were fixed. One pair in the expert group and one pair in the novice group did not pass the acceptance test after more than six hours of work and gave up.
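To make the task from Sect. 3.2 more concrete, the clock-driven automaton with its four states can be sketched as follows. This is only an illustration in Python, not the Java skeleton handed to the pairs. The state names come from the task description, while the class `ElevatorControl`, the method `tick`, and all transition rules are assumptions, because the state transition diagram is not reproduced in the text. Requests and jobs are both simplified to plain target floors.

```python
from enum import Enum


class State(Enum):
    # The four states named in the task description.
    GOING_UP = "going-up"
    GOING_DOWN = "going-down"
    WAITING = "waiting"
    OPEN = "open"


class ElevatorControl:
    """Hypothetical elevator control; the transition rules are invented."""

    def __init__(self, floor=0):
        self.floor = floor
        self.state = State.WAITING
        self.targets = set()  # floors that still have to be served

    def tick(self, requests=(), jobs=()):
        """Advance the discrete clock by one cycle and return the new state.

        requests: floors where a button outside the elevator was pressed
        jobs: destination floors chosen inside the elevator
        """
        self.targets.update(requests)
        self.targets.update(jobs)
        # Action of the current state: move one floor per cycle.
        if self.state is State.GOING_UP:
            self.floor += 1
        elif self.state is State.GOING_DOWN:
            self.floor -= 1
        self.state = self._next_state()
        if self.state is State.OPEN:
            self.targets.discard(self.floor)  # this floor has been served
        return self.state

    def _next_state(self):
        # Assumed priority: serve the current floor, then go up, then down.
        if self.floor in self.targets:
            return State.OPEN
        if any(t > self.floor for t in self.targets):
            return State.GOING_UP
        if any(t < self.floor for t in self.targets):
            return State.GOING_DOWN
        return State.WAITING


control = ElevatorControl(floor=0)
history = [control.tick(requests=[2])] + [control.tick() for _ in range(3)]
print([state.value for state in history], "floor:", control.floor)
```

In the study itself only the open state had to be written by the pairs; the sketch covers all four states merely to show how one clock cycle maps inputs and the current state to the next state.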
4 Data Analysis and Results
As all research hypotheses tested in the following sections have an implicit direction and the samples are small, the one-tailed Wilcoxon rank-sum test [16, pp. 106] is used for evaluation. The power of the respective one-tailed t-test at a significance level of 5 percent, a medium effect size of 0.5 (as defined in [17, p. 26]), and a harmonic mean of the two group sizes (seven and nine pairs) of 7.88 is 0.242. The power of the Wilcoxon test is in the worst case 13.6 percent smaller than the power of the t-test [16, pp. 139]. Thus, the probability of detecting an effect is only 10.6 percent. This probability is fairly small compared to the suggested value of 80 percent [17, p. 531]. To sum up, if a difference on the 5 percent level can be shown, everything is fine. But the probability that an existing difference is not revealed is 89.4 percent for a medium effect size.

As mentioned before, two pairs did not develop an error-free solution. One could argue that the data points of these pairs should be excluded from the analysis because their programs are of inferior quality. Nevertheless, for the evaluations concerning input activity (see Sect. 4.3), the program quality is of minor importance. Accordingly, the two data points were not removed. Additional p-values, computed excluding the two data points (with two data points fewer, the power is only 8.4 percent), are reported wherever it makes a difference, and the two data points are highlighted in all boxplots and tables.

4.1 Time

First of all, we compared the time needed for implementation, defined as the time span from handing out the task description to the final acceptance test. The initial reading phase, breaks, and the time needed for acceptance tests were excluded afterwards. RHtime stated our initial assumption that the expert pairs need less time than the novice pairs, i.e. $\mathit{Time}_e < \mathit{Time}_n$. Figure 1 depicts the time needed for implementation as boxplots (grey) with the data points (black) as overlay; the empty squares mark the pairs which did not pass the acceptance test. The boxplots show that there is no support for the initial research hypothesis. Judging by the data, rather the opposite seems to be true. Consequently, not the initial research hypothesis but the re-formulated, opposite hypothesis $\mathit{Time}_e > \mathit{Time}_n$ (null hypothesis: $\mathit{Time}_e \le \mathit{Time}_n$) was tested. This revealed that the experts were significantly slower than the novices (p = 0.036). Omitting the data points from the pairs that did not pass the acceptance test results in an even smaller p-value of 0.015.

Fig. 1. Time Needed for Implementation (boxplots for the expert and novice groups; y-axis: time [h:mm])

4.2 Test Coverage
The test coverage was measured on the final versions of the pairs’ programs using EclEmma [18]. The evaluation of test coverage is motivated by RHcov, which expresses our assumption that the expert pairs write tests with a higher coverage than the novice pairs, i.e. $\mathit{Cov}_e > \mathit{Cov}_n$. The respective null hypothesis $\mathit{Cov}_e \le \mathit{Cov}_n$ was tested for instruction, line, block, and method coverage. For instruction, line, and method coverage, the null hypothesis can be rejected on the 5 percent level with p-values of 0.045, 0.022, and 0.025, respectively. For block coverage, the result is not statistically significant (p = 0.084). If we omit the pairs which did not successfully pass the acceptance test, we can still observe a trend in the same direction. However, none of the results is statistically significant anymore. The p-values for instruction, line, block, and method coverage are 0.135, 0.068, 0.238, and 0.077, respectively. Figure 2 shows the boxplots for the line and method coverage of the two groups. The dashed line indicates the test coverage of the program skeleton initially handed out to the pairs.

Looking at the test coverage, it seems that the experts sacrificed speed for quality. Yet, the costs for the extra quality are high: On average, the expert pairs worked more than one hour longer than the novice pairs to achieve a 2.6 percent higher line coverage. Perhaps they also took the acceptance test more seriously than the novices and tested longer before handing in their programs. But the number of acceptance tests needed by the expert pairs and novice pairs gives us no clue whether this assumption is true or false (see Fig. 3). The only way to answer the question will be further analysis of the recorded videos.
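The one-tailed rank-sum comparisons reported above can be illustrated with a small, self-contained sketch. It computes an exact p-value by enumerating all assignments of the pooled ranks, which is feasible for seven and nine pairs. Whether the study used exact or approximated p-values is not stated, so this is an assumption, as are the function names and the coverage values in the example, which are made up and are not the study's data. The last lines only re-trace the sample-size arithmetic from the beginning of Sect. 4.

```python
from itertools import combinations


def midranks(values):
    """Rank the values from 1..n; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sum_p_greater(x, y):
    """Exact one-tailed p-value for the alternative 'x tends to exceed y'."""
    ranks = midranks(list(x) + list(y))
    observed = sum(ranks[:len(x)])
    hits = total = 0
    # Under the null hypothesis every choice of len(x) ranks is equally likely.
    for subset in combinations(ranks, len(x)):
        total += 1
        if sum(subset) >= observed - 1e-9:
            hits += 1
    return hits / total


def harmonic_mean(n1, n2):
    """Harmonic mean of two group sizes, as used for the power estimate."""
    return 2 * n1 * n2 / (n1 + n2)


# Made-up line coverage values in percent (NOT the data of the study).
expert_cov = [93.1, 91.4, 94.0, 92.2, 90.8, 93.5, 91.9]
novice_cov = [89.5, 91.0, 88.7, 90.2, 92.0, 89.9, 90.5, 88.1, 91.2]
print("p =", rank_sum_p_greater(expert_cov, novice_cov))

# Sample-size arithmetic from Sect. 4: 7 expert and 9 novice pairs.
print(harmonic_mean(7, 9))      # reported as 7.88 in the text
print(round(0.242 - 0.136, 3))  # consistent with the reported 10.6 percent
```

Enumerating the 11,440 possible rank assignments avoids a normal approximation, which would be questionable for groups this small.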
Fig. 2. Test Coverage in percent for the expert and novice groups: (a) Lines, (b) Methods. The dashed line marks the initial coverage (Init. Cov.) of the program skeleton.

Fig. 3. Number of Acceptance Tests (frequency histograms): (a) Expert, (b) Novice
4.3 Measures of Input Activity
Books for extreme programming practitioners mention two different roles when describing the interaction of the two programming partners and their basic tasks in a pair programming session [19, 20, 21]. Williams and Kessler [22] provide the most commonly used names for these roles: driver and navigator. Even though the descriptions of the driver and navigator role in these textbooks differ marginally, all agree upon one basic feature of the driver role: The driver is responsible for implementing and therefore uses the keyboard and the mouse. Assuming that this is true, the use of mouse and keyboard by the two partners should make it possible to conclude how long one of the partners stays driver until the two partners switch roles.

Input Device Control and Conflict. We observe the time a programmer touches the keyboard and/or the mouse. Having control of the input devices does not necessarily mean the programmer is really using them to type or browse code. Yet, because the pairs worked on a machine with one keyboard and one mouse, possession of keyboard and/or mouse is a hindrance for the other programmer to use them and thus to become the driver. If one partner touches the keyboard while the other partner still has control of it, the time span where both partners have their hands on the keyboard is measured as conflict. Grabbing the mouse while the other partner has control of the keyboard is measured as conflict as well, assuming that the Eclipse IDE [23] (which was used for the task) requires keyboard and mouse for full control over all features. To obtain the measure of input device control, we transcribed the videos of the programming sessions with separate keyboard and mouse events for each programmer. We used a video transcription tool developed by one of our students especially for the purpose of pair programming video analysis [24]. RHconf phrases our initial assumption that the novice pairs spend more time in a conflict state than the expert pairs because they are less experienced in pair programming and do not have a protocol for changing the driver and navigator role. But this assumption could not be confirmed. Only three pairs spent more than one percent of their working time in a conflict state. One of them is in the expert group (this is the expert pair that did not pass the acceptance test) and two are in the novice group.

Pair Balance. Figure 4 depicts the results from the analysis of input device control. It shows that the majority of the observed pairs did not share keyboard and mouse equally. To make this phenomenon measurable, pair balance $b$ was computed from the input device control as follows:

\[ b = \frac{\min(t_1, t_2) + \frac{1}{2} t_c}{\max(t_1, t_2) + \frac{1}{2} t_c} \qquad (1) \]

The variables $t_1$ and $t_2$ are the times of input device control of the two partners, and $t_c$ is the time spent in a conflict state. The values for pair balance may range between zero and one, where one designates ideal balance. A pair balance of less than 0.5 means that the active partner controlled the input devices more than twice as long as the passive partner. Six out of nine novice pairs have a pair balance of less than 0.5; input device control is almost completely balanced in one pair only. In the expert group, only one pair has a pair balance of less than 0.5, but this pair is the most imbalanced of all. Table 1 shows the exact values for all pairs together with the percentage of conflicts.

To check how the participants perceived pair balance, they were asked to rate the statement “Our activity on the keyboard was equal.” in the post-test questionnaire on a Likert scale from 1 (totally disagree) to 5 (totally agree). Unfortunately, one expert pair had to leave before filling out the post-test questionnaire. Figure 5 displays histograms of the replies for both groups. The participants’ reactions to that statement are not correlated with the corresponding pairs’ balance values (tested with Kendall’s rank correlation test, τ = 0.142, p = 0.324). Their perception seems to differ from reality here.

Fig. 4. Input Device Control (percentage of working time per pair for the more active programmer, the less active programmer, and conflict)

Fig. 5. Replies to “The activity on the keyboard was equal”: (a) Experts’ Replies, (b) Novices’ Replies

Table 1. Balance and Conflict

Pair   Balance   Conflict [%]
N1     0.37       0.75
N2     0.15       0.10
N3     0.18       0.57
N4∗    0.22       0.33
N5     0.65      10.42
N6     0.26       0.94
N7     0.14       0.75
N8     0.95       0.54
N9     0.72       3.60
E1     0.78       0.33
E2     0.65       0.79
E3∗    0.78       4.36
E4     0.77       0.47
E5     0.12       0.78
E6     0.59       0.03
E7     0.57       0.71

∗ Did not pass the acceptance test.

Driving Times. Based on the assumption that one programmer remains driver until the other programmer takes control of the keyboard and/or mouse, driving times were computed from the keyboard and mouse transcripts. The driving time is the time span from the point a programmer gains exclusive control over the keyboard and/or the mouse to the point where the other programmer takes over. This time span includes time without activity on the input devices. In case of conflict, the time is added to the driving time of the programmer who had control before the conflict occurred. Further video analysis could help to identify the driver during those times. But since at least 90 percent of the working time is free of conflicts, the driving times should be precise enough. Figure 6 shows a boxplot of the mean driving times of all pairs. (Pair N3, represented by the outlier in the novices’ boxplot, had a phase of more than 100 minutes where one programmer showed absolutely no activity on the input devices. This biased the mean.) The average driving time of all participants is below four minutes. The pairs switched keyboard and mouse control frequently. At first, the high switching frequency seemed rather odd, but this finding is in line with observations made by Chong and Hurlbutt [10] on a single team of professional programmers working on machines with two keyboards and mice. They state that within this team programming partners switched keyboard control frequently and rapidly. In an exemplary excerpt from a pair programming session in [10], the partners switched three times within two and a half minutes.
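The measures of this section can be sketched in a few lines of code. The sketch assumes a simplified transcript format that is not taken from the paper: a time-ordered list of `(start, end, holder)` segments in seconds, where `holder` is `"A"`, `"B"`, or `"conflict"`. The function names and the sample transcript are hypothetical. Only the pair balance formula (1) and the rule that conflict time counts towards the programmer who had control before the conflict come from the text.

```python
def control_times(segments):
    """Total control time per holder from (start, end, holder) segments."""
    totals = {"A": 0, "B": 0, "conflict": 0}
    for start, end, holder in segments:
        totals[holder] += end - start
    return totals


def pair_balance(t1, t2, tc):
    """Pair balance b as in Eq. (1); 1.0 designates ideal balance.

    Assumes that at least one of the times is greater than zero.
    """
    return (min(t1, t2) + 0.5 * tc) / (max(t1, t2) + 0.5 * tc)


def driving_times(segments):
    """Return (driver, duration) spans in chronological order.

    A span runs from the moment a programmer gains exclusive control
    until the other programmer takes over, so idle gaps and conflict
    phases are counted for the programmer who had control before.
    """
    spans = []
    driver, span_start = None, None
    for start, end, holder in segments:
        if holder == "conflict" or holder == driver:
            continue  # the current driver keeps driving
        if driver is not None:
            spans.append((driver, start - span_start))
        driver, span_start = holder, start
    if driver is not None:
        spans.append((driver, segments[-1][1] - span_start))
    return spans


# Hypothetical transcript of one short session (times in seconds).
transcript = [
    (0, 120, "A"),
    (120, 130, "conflict"),  # B grabs the keyboard while A still holds it
    (130, 400, "B"),
    (430, 500, "B"),         # 30 s without input activity still count for B
    (500, 560, "A"),
]

totals = control_times(transcript)
balance = pair_balance(totals["A"], totals["B"], totals["conflict"])
spans = driving_times(transcript)
mean_driving_time = sum(duration for _, duration in spans) / len(spans)
print(totals, round(balance, 2), spans, round(mean_driving_time, 1))
```

Keeping control time and driving time as separate functions mirrors the distinction made in the text: control time only counts the moments a programmer actually holds an input device, whereas a driving span also covers the idle time until the partner takes over.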
5 Threats to Validity
Apart from the different expertise in pair programming of the expert and novice pairs, other possible explanations for the observed differences in the data set might exist. The novices also have less general programming experience and less experience with test-driven development than the experts. Another threat to validity results from the fact that this study is a quasi-experiment and almost all experts came from one company: Thus, the outcome may also be affected by selection bias.

Furthermore, the pairs might not have shown their usual working behavior because of the experimental setting and the cameras. The participants had to rate the statement “I felt disturbed and observed due to the cameras” on a Likert scale from 1 (totally disagree) to 5 (totally agree). Figure 7 displays histograms of the participants’ ratings. In general, the cameras were not perceived as disturbing, although it seems as if they were a bigger source of irritation for the novices than for the experts. Another reason for unusual working behavior might be that the participants were not accustomed to pair programming and therefore could not pair effectively. But we think that this is unlikely because the experts were used to pairing and the novices had been trained to pair in the project week of our extreme programming lab course shortly before the quasi-experiment.
Fig. 6. Mean Driving Time of the Pairs (boxplots for the expert and novice groups; y-axis: time [min])

Fig. 7. Replies to “I felt disturbed and observed by the cameras”: (a) Experts’ Replies, (b) Novices’ Replies

Fig. 8. Replies to “I enjoyed programming in the experiment”: (a) Experts’ Replies, (b) Novices’ Replies

Fig. 9. Replies to “I would work with my partner again”: (a) Experts’ Replies, (b) Novices’ Replies
Moreover, the fact that the experts were paid for their participation and the novices were not might have led to a bias in motivation. Figure 8 shows the frequency of replies to the statement “I enjoyed programming in the experiment”. The experts’ distribution of replies seems to be shifted to the right compared to the novices’, which might indicate a higher motivation of the experts. But as the data set is small, this difference is not statistically significant. The participants’ motivation might also be influenced by how well the partners got along with each other. Figure 9 summarizes the ratings of the experts and novices for the statement “I would work with my partner again”. As before, the experts’ distribution appears to be shifted to the right compared to the novices’. Yet again, this difference is not statistically significant, due to the small size of our data set.

Finally, the task was used in other studies before, so some participants might have known the task. Consequently, we asked the participants if they already knew the task before they started. All participants answered the question with no.
6 Conclusions and Future Work
This article presented an exploratory analysis of a data set of nine novice and seven expert pairs. The experts’ tests had a higher quality in terms of instruction, line and method coverage, but in return the expert pairs were significantly slower than the novice pairs. The most important implication of the observed differences is that generalization of studies with novices remains difficult. Also, the direction of the difference is not necessarily the one predicted under the common assumption “experts perform better than novices”. In order to determine the reason why the expert pairs were slower than the novice pairs two things have to be done next: First, further analysis of the recorded video could indicate where the experts lost time. Second, we need to check whether the experts adhered more rigidly to the test-driven development process than the novices, which might be time consuming. We will do this with the revised version of our framework for the evaluation of test-driven development initially presented in [11].
The analysis of input activity revealed no significant differences between the groups. Nevertheless, it revealed that the roles of driver and navigator change frequently and that a majority of the pairs has one partner dominating input device control. The question of what the less active partner did still needs to be answered. Analyzing the existing video material, focusing on the verbalizations of the programming partners, should help to answer this question.

Acknowledgments. The study and the author were sponsored by the German Research Foundation (DFG), project “Leicht” TI 264/8-3. The author would like to thank Sawsen Arfaoui for her help on the video transcription and the evaluation of the questionnaires.
References
1. Dybå, T., Arisholm, E., Sjøberg, D.I., Hannay, J.E., Shull, F.: Are Two Heads Better than One? On the Effectiveness of Pair Programming. IEEE Software 24(6), 12–15 (2007)
2. Arisholm, E., Gallis, H., Dybå, T., Sjøberg, D.I.K.: Evaluating Pair Programming with Respect to System Complexity and Programmer Expertise. IEEE Transactions on Software Engineering 33(2), 65–86 (2007)
3. Domino, M.A., Collins, R.W., Hevner, A.R., Cohen, C.F.: Conflict in collaborative software development. In: SIGMIS CPR 2003: Proceedings of the 2003 SIGMIS Conference on Computer Personnel Research, pp. 44–51. ACM, New York (2003)
4. Chao, J., Atli, G.: Critical personality traits in successful pair programming. In: Proceedings of Agile 2006 Conference, p. 5 (2006)
5. Katira, N., Williams, L., Wiebe, E., Miller, C., Balik, S., Gehringer, E.: On understanding compatibility of student pair programmers. SIGCSE Bull. 36(1), 7–11 (2004)
6. Sfetsos, P., Stamelos, I., Angelis, L., Deligiannis, I.: Investigating the Impact of Personality Types on Communication and Collaboration-Viability in Pair Programming – An Empirical Study. In: Abrahamsson, P., Marchesi, M., Succi, G. (eds.) XP 2006. LNCS, vol. 4044, pp. 43–52. Springer, Heidelberg (2006)
7. Bryant, S.: Double Trouble: Mixing Qualitative and Quantitative Methods in the Study of eXtreme Programmers. In: 2004 IEEE Symposium on Visual Languages and Human Centric Computing, pp. 55–61 (September 2004)
8. Bryant, S., Romero, P., du Boulay, B.: The Collaborative Nature of Pair Programming. In: Abrahamsson, P., Marchesi, M., Succi, G. (eds.) XP 2006. LNCS, vol. 4044, pp. 53–64. Springer, Heidelberg (2006)
9. Bryant, S., Romero, P., du Boulay, B.: Pair Programming and the Mysterious Role of the Navigator. International Journal of Human-Computer Studies (in Press)
10. Chong, J., Hurlbutt, T.: The Social Dynamics of Pair Programming. In: Proceedings of the International Conference on Software Engineering (2007)
11. Müller, M.M., Höfer, A.: The Effect of Experience on the Test-Driven Development Process. Empirical Software Engineering 12(6), 593–615 (2007)
12. Müller, M.M., Link, J., Sand, R., Malpohl, G.: Extreme Programming in Curriculum: Experiences from Academia and Industry. In: Eckstein, J., Baumeister, H. (eds.) XP 2004. LNCS, vol. 3092, pp. 294–302. Springer, Heidelberg (2004)
13. Mackinnon, T., Freeman, S., Craig, P.: Endo-testing: unit testing with mock objects. In: Extreme Programming Examined, pp. 287–301. Addison-Wesley Longman Publishing Co., Inc., Boston (2001)
14. Thomas, D., Hunt, A.: Mock Objects. IEEE Software 19(3), 22–24 (2002)
15. TechSmith: Camtasia Studio, http://de.techsmith.com/camtasia.asp
16. Hollander, M., Wolfe, D.A.: Nonparametric Statistical Methods, 2nd edn. Wiley Interscience, Hoboken (1999)
17. Cohen, J.: Statistical Power Analysis for the Behavioral Sciences, 2nd edn. Lawrence Erlbaum Associates, Mahwah (1988)
18. EclEmma.org: EclEmma, http://www.eclemma.org
19. Beck, K.: Extreme Programming Explained: Embrace Change, 1st edn. Addison-Wesley, Reading (2000)
20. Jeffries, R.E., Anderson, A., Hendrickson, C.: Extreme Programming Installed. Addison-Wesley, Reading (2001)
21. Wake, W.C.: Extreme Programming Explored, 1st edn. Addison-Wesley, Reading (2002)
22. Williams, L., Kessler, R.: Pair Programming Illuminated. Addison-Wesley Longman Publishing Co., Inc., Boston (2002)
23. Eclipse Foundation: Eclipse, http://www.eclipse.org
24. Höfer, A.: Video Analysis of Pair Programming. In: APSO 2008: Proceedings of the 2008 International Workshop on Scrutinizing Agile Practices or Shoot-out at the Agile Corral, pp. 37–41. ACM, New York (2008)
State of the Practice in Software Effort Estimation: A Survey and Literature Review

Adam Trendowicz1, Jürgen Münch1, and Ross Jeffery2,3

1 Fraunhofer IESE, Fraunhofer-Platz 1, 67663 Kaiserslautern, Germany
{trend,muench}@iese.fraunhofer.de
2 University of New South Wales, School of Computer Science and Engineering, Sydney 2052, Australia
3 National ICT Australia, Australian Technology Park, Bay 15 Locomotive Workshop, Eveleigh NSW 2015, Australia
[email protected]
Abstract. Effort estimation is a key factor for software project success, defined as delivering software of agreed quality and functionality within schedule and budget. Traditionally, effort estimation has been used for planning and tracking project resources. Effort estimation methods founded on those goals typically focus on providing exact estimates and usually do not support objectives that have recently become important within the software industry, such as systematic and reliable analysis of causal effort dependencies. This article presents the results of a study of software effort estimation from an industrial perspective. The study surveys industrial objectives, the abilities of software organizations to apply certain estimation methods, and actually applied practices of software effort estimation. Finally, requirements for effort estimation methods identified in the survey are compared against existing estimation methods.

Keywords: software, project management, effort estimation, survey, state of the practice, state of the art.
Z. Huzar et al. (Eds.): CEE-SET 2008, LNCS 4980, pp. 232–245, 2011. © IFIP International Federation for Information Processing 2011

1 Introduction

Rapid growth in the demand for high-quality software and increased investments in software projects show that software development is one of the key markets worldwide. The average company spends about 4 to 5 percent of its revenues on information technology, with those that are highly IT-dependent - such as financial and telecommunications companies - spending more than 10 percent [6]. A fast changing market demands software products with ever more functionality, higher reliability, and higher performance. In addition, in order to stay competitive, senior managers increasingly demand that IT departments deliver more functionality and quality with fewer resources, and do it faster than ever before [7]. Together with the increased complexity of software, the global trend towards shifting development from single contractors to distributed projects has led to software companies needing a reliable basis for making make-or-buy decisions. Finally, software development teams
must strive to achieve all these objectives by exploiting the impressive advances in continuously changing (and often immature) technologies. In this situation, software planning and management are essential and, at the same time, very difficult tasks. Dozens of well-known software disasters [6] are the most visible examples of problems in managing complex, distributed software systems. Independent of such experiences, many software organizations still propose unrealistic software costs, work within tight schedules, and finish their projects behind schedule and over budget (46%), or do not complete them at all (19%) [19]. Finally, even if completed within the target plan, overestimated projects typically expand to consume whatever resources were planned, while the functionality and quality of underestimated projects are cut to fit the plan.

To address these and many other issues, considerable research has been directed at building and evaluating software effort estimation methods and tools [5, 20]. For decades, estimation accuracy was the dominant criterion for accepting or declining a certain effort estimation method. Yet even a very accurate method does not guarantee project success. Software decision makers have recently had to face this problem and find an estimation method that, first, is applicable in the context of a certain organization and, second, contributes to the achievement of organization-specific objectives. Therefore, an estimation method should be accepted only if it meets two major criteria. The necessary acceptance criterion is that the software organization is able to fulfill the application prerequisites of the method, such as needed measurement data or required involvement of human experts. The sufficient acceptance criterion, on the other hand, refers to the method's contribution to organizational objectives.
As reported in this paper, besides accuracy, estimation methods are expected to support a variety of project management activities, such as risk analysis, benchmarking, or process improvement. In this article, we survey industrial objectives with respect to effort estimation, the abilities of software organizations to apply certain estimation methods, and effort estimation practices actually applied within the software industry.

The remainder of the paper is organized as follows: Section 2 defines common sources of effort estimation problems. Section 3 outlines the study design. Section 4 summarizes current industrial practices with respect to software effort estimation, followed in Section 5 by a comparative evaluation of existing effort estimation methods concerning their suitability to meet certain industrial requirements. Finally, Section 6 summarizes the results of the study and outlines directions for further research.
2 Sources of Deficits in Software Effort Estimation

As with any other process or technology, software effort estimation is expected to meet certain objectives and requires certain prerequisites to be applied. A problem occurs where the applied estimation method is not in line with the defined objectives and/or the available resources. Specifically, we identify two major sources of effort estimation problems in the software industry, which may be addressed by the following questions:
• What is provided by a method? The estimation method provides less than required for the achievement of organizational objectives, e.g., it provides only a point estimate and thus hardly any support for managing project risks if the estimate is lower than the available budget.
• What is required by a method? The estimation method requires more resources than currently available in the organization, e.g., it requires more measurement data than actually available in the organization.
3 Study Design

The study presented in this paper consisted of two elements: (1) a review of related literature and (2) a survey performed among several software companies.

3.1 Study Objectives

The objective of the study was the analysis of the current industrial situation regarding software effort estimation practices (state of the practice). In particular, the following aspects were focused on:
• Effort estimation objectives: industrial objectives (expectations) with respect to effort estimation methods;
• Effort estimation abilities: ability of software companies to apply a certain estimation approach, e.g., by providing necessary resources;
• Effort estimation methods: effort estimation methods actually applied at software companies.
In addition, existing estimation methods are analyzed concerning how they meet industrial needs and abilities regarding software effort estimation (state of the art).

3.2 Information Sources

The study presented in this paper is based on two sources of information:
• Literature Review: Literature published in the public domain such as books, journals, conference proceedings, reports, and dissertations.
• Industrial Surveys: Information gained during two series of industrial surveys (S1 and S2) performed at 10 software organizations. The survey results presented in this chapter include both surveys, unless explicitly stated otherwise.

3.3 Literature Review

The design of the review is based on the guidelines for systematic reviews in software engineering proposed, for instance, in [4, 8]. It was performed as follows (Fig. 1):
1. Identifying information sources: We identified the most relevant sources of information with respect to the study objectives. Initially, they included the most prominent software engineering journals and conference proceedings1.

1 For a detailed list of the information sources considered in the study, please refer to [20].
Fig. 1. Overview of the literature review process
2. Defining search criteria: We limited the review scope by two criteria: relevancy and recency. Relevancy defined the content of the reviewed publications. We focused on the titles that dealt with software effort estimation and related topics such as development productivity. Recency defined the time frame of the reviewed publications. Since software engineering is one of the most rapidly changing domains, we decided to limit our review to papers published after the year 1995. The analysis of industrial practices encompassed an even narrower scope, namely publications after the year 2000.
3. Automatic search: We performed an automatic search through relevant sources of information using defined search criteria (Table 1). This search included a search for specific keywords in the publications' title, abstract, and list of keywords (if provided). We used generic search engines such as INSPEC (http://www.iee.org/publish/inspec/) and specific engines associated with concrete publishers such as IEEE Xplore (http://ieeexplore.ieee.org/Xplore/dynhome.jsp).
4. Initial review: We initially reviewed the title, abstract, and keywords of the publications returned by the automatic search with respect to the defined criteria for inclusion in the final review.
5. Full review: A complete review of the publications accepted during the initial review was performed. This step was followed by a manual search and review (if accepted) of referenced relevant publications that had not been found in earlier steps of the review. In total, 380 publications were reviewed in the study.
6. Manual search: In accordance with the recommendations presented in [8], we complemented the results of the automatic search with relevant information sources by doing a manual search through references found in reviewed papers as well as using a generic web search engine (http://www.google.com).

Table 1. Query defined for the purpose of automatic search

((Software OR Project) AND (Effort OR Cost OR Estimation OR Prediction) IN (Title OR Abstract OR Keywords))
AND (Date >= 1995 AND Date <= 2008)
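The query in Table 1 can be read as a simple predicate over publication records. The following minimal sketch illustrates that reading in Python; it is not the search syntax of INSPEC or IEEE Xplore. The `Publication` record layout and the function name `matches_query` are hypothetical, and two interpretations are assumed: the terms may occur in any of the three searched fields, and matching is done by case-insensitive substring search.

```python
from dataclasses import dataclass
from typing import Tuple

SUBJECT_TERMS = ("software", "project")                      # (Software OR Project)
TOPIC_TERMS = ("effort", "cost", "estimation", "prediction")  # (Effort OR Cost OR ...)


@dataclass
class Publication:
    """Hypothetical record layout for one search hit."""
    title: str
    abstract: str = ""
    keywords: Tuple[str, ...] = ()
    year: int = 0


def matches_query(pub: Publication, first_year: int = 1995, last_year: int = 2008) -> bool:
    """Return True if the record satisfies the Table 1 query (as interpreted here)."""
    # Assumption: title, abstract, and keywords are searched together.
    searchable = " ".join([pub.title, pub.abstract, *pub.keywords]).lower()
    has_subject = any(term in searchable for term in SUBJECT_TERMS)
    has_topic = any(term in searchable for term in TOPIC_TERMS)
    in_range = first_year <= pub.year <= last_year
    return has_subject and has_topic and in_range


hit = Publication(title="Software Cost Estimation with COCOMO II", year=2000)
print(matches_query(hit))
```

Substring matching is deliberately crude (it would also match "cost" inside "costume"); a real search engine tokenizes the text, so the sketch only approximates the inclusion logic.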
3.4 Industrial Surveys

During the years 2005-2008, we performed a series of industrial surveys aimed at analyzing current industrial practices with respect to modeling (estimation and measurement) software development effort and productivity. In this paper, we present the aggregated results of the two most recent surveys and one industry workshop. The early survey (S1) focused on effort estimation practices as well as closely related productivity and size measurement practices. The survey encompassed 7 software organizations. The recent survey (S2) focused specifically on effort estimation practices. It included such particular issues as objectives of estimation, estimation process, as well as inputs and outputs of estimation. The survey was performed in 2 software companies, with one of them being represented by 7 different development groups. In addition, we include the results of an effort estimation workshop we performed in 2007 in the software business unit of a large international provider of software systems (embedded domain). During the workshop, we asked the 7 participants about their estimation objectives, the estimation methods currently applied, and organizational capabilities with respect to effort modeling.

Table 2. Overview of the industrial survey characteristics

Id    Size  Org. Type  Involved                        App. Type            Domain
S1.1  L     Supplier   DM, PM                          Finance              MIS
S1.2  L     Supplier   QM, QE, PM                      Automation Systems   MIS
S1.3  L     Supplier   PM, PM, QE                      Finance              MIS
S1.4  L     Supplier   PM, PM, QE                      Finance              MIS
S1.5  L     Supplier   PM                              Finance              MIS
S1.6  S     Supplier   DM, PM, PM                      Finance              MIS
S1.7  S     Supplier   PM, PM, QM                      Finance              MIS
S2.1  L     Supplier   7 x PM                          Medical, Automotive  EM
S2.2  S     IV&V       QE                              Space                EM
W1    S     Supplier   DM, 2 x SD, 2 x QE, 2 x PM      Automation Systems   EM
Table 2 presents an overview of the survey characteristics. For each survey, the following characteristics are included:
• The size of the organization considered in the survey in terms of number of employees. We distinguish Small (less than 50 employees), Medium (between 50 and 200 employees), and Large (more than 200 employees).
• The type of involved organizations was mainly software suppliers. In one case, the company provided independent verification and validation services (IV&V).
• The involved roles mainly included middle-level management and consultants: software developers (SD), project managers (PM), quality engineers (QE), quality managers (QM), and group/division managers (DM).
• The application type of typical products considered in the context of the involved organizations ranged from financial via medical, automotive, and industrial automation to safety-critical systems.
• The domain of the organization includes mainly information systems (MIS) and embedded software (EM). Information systems covered mainly financial systems, whereas embedded software covered such application areas as medical, automotive, telecommunication, and industrial automation.
3.5 Study Limitations

A major limitation of the study is the limited availability of information sources. The small sample of respondents in the industrial surveys might not be representative of the population of the software industry. Yet, since the results of the survey largely conform with the results of the literature review and our informal experiences gained during multiple industrial collaborations, we conclude that the results presented in this paper represent current trends in software effort estimation theory and practice.
4 State of the Industrial Software Effort Estimation Practice

The industrial practices surveyed in this paper include: (1) objectives of effort estimation, (2) capabilities to meet prerequisites for applying a certain estimation method, and (3) effort estimation methods currently applied in the software industry.

4.1 Objectives of Effort Estimation
Fig. 2 presents a summary of effort estimation objectives identified in the study. In the case of several objectives, there is a noticeable discrepancy between what is presented in the related literature and what we observed during industrial surveys. From the perspective of study limitations, the availability of information sources may have resulted in data samples that are not representative of the whole software industry. In practice, there are, however, several other sources of variance in the results presented in Fig. 2. Traditional effort estimation objectives, such as planning and tracking the software project, or reducing project management overhead, are so common (and thus considered so “obvious”) that they are typically not indicated if not asked for explicitly. Although the most recent literature focused most probably on the new objectives that have traditionally not been considered by the software industry until now, the survey also explicitly asked about traditional effort estimation objectives. Finally, we believe that the industrial surveys show some of the most recent trends that were not observed by the authors of the reviewed literature. Our personal experience confirms, for example, the increasing importance of process/productivity improvement objectives.
[Figure: horizontal bar chart. Vertical axis: objective of effort estimation (project planning & tracking, process improvement, minimize project mgmt overhead, negotiating project costs, risk management, productivity improvement, project benchmarking, change management). Horizontal axis: percentage of survey responses. Series: 10 industrial surveys, 17 related publications.]
Fig. 2. Summary of objectives regarding software effort estimation
Based on the review of related literature and the results of the industrial surveys, we identified the 8 most common objectives regarding effort estimation methods:
1. Project planning & tracking. The effort estimation method should support effective project planning by providing reliable and exact estimates at various levels of project granularity. There are two different ways of project planning, depending on the project type. In the context of fixed-price projects, effort estimation provides the basis for planning software functionality (how much functionality can be developed at a specified quality level and within a fixed budget). In the context of fixed-product projects, effort estimation provides a basis for planning software project schedule and cost (how much time and cost are required for developing software with a specified functionality and quality).
2. Process improvement. The effort estimation method should support the understanding and improvement of effort- and productivity-related development processes.
3. Project management overhead. The effort estimation method should minimize management overhead, i.e., the cost of applying and maintaining an estimation method (e.g., model building, application, and maintenance).
4. Negotiating project costs. The effort estimation method should support the communication and negotiation process (justifying development costs) in the context of software procurement between the stakeholders involved (project managers, management, customers, etc.).
5. Risk management. The effort estimation method should support management of project risks. This includes reciprocal integration of effort estimation and risk management, i.e., effort estimation uses the outputs of risk management (identified project risks having an impact on estimated effort) and provides input to risk management (cost-related risks). Moreover, the method should explicitly cover estimation uncertainty (i.e., accept uncertain/incomplete inputs and provide evaluation of the output's uncertainty).
6. Productivity improvement. The effort estimation method should support the identification of factors that have the greatest influence on development productivity. Achievement of objective 2 (process improvement) usually implies productivity improvement, at least from a long-term perspective; yet productivity might be improved without improving related processes, e.g., by assigning more skilled people to the project instead of improving training processes.
7. Project benchmarking. The effort estimation method should support benchmarking of software projects with respect to development effort and productivity. Comparing software projects regarding productivity and effort between different organizations is especially important nowadays in the context of rapidly growing global development (out-sourcing, off-shoring, etc.) in order to support make-or-buy decisions and select/manage software suppliers.
8. Change management. The effort estimation method should be easy to reapply along the software development life cycle in order to support managing changes in project scope, such as modified (non-)functional software requirements.

4.2 Effort Estimation Capabilities

The results of the industrial surveys and the literature review indicate that software organizations represent a very wide range of estimation capabilities.
[Figure: horizontal bar chart of the expert's role (position) against the percentage of experts involved in the estimation process. Module Manager 75.0%, Senior Developers 75.0%, Project Manager 50.0%, Test Manager 25.0%, Quality Manager 25.0%, Software Architect 12.5%, System Manager 12.5%, Competence Center Head 12.5%, Risk Manager 12.5%.]
Fig. 3. Roles involved in the estimation process (survey S2)
Estimation budget: Due to extensive project pressure, software organizations typically do not assign manpower for estimation that is adequate to the approach applied. Depending on the specific domain, software organizations spend between 2% and 15% (on average 6%) of the entire project budget on estimation, which is typically not sufficient for applying estimation methods based on the judgment of multiple human experts.

Expert involvement: At the same time, respondents highlighted insufficient resources for estimation as one of the major problems that reduce the applicability and effectiveness of project effort prediction. In practice, software organizations are very unwilling to spend the effort of human experts on estimation tasks when these experts are responsible for other activities in a software project (Fig. 3). In consequence, estimation is typically performed by people lacking the appropriate expertise and/or with tight resources (e.g., insufficient budget designated for estimation).

Quantitative project data: On the other hand, software organizations typically do not have enough reliable project data to employ data-driven estimation and relieve human estimators. We observed (Fig. 4) that most of the time, the historical data a software organization can base estimates on does not exceed 10 projects. Even when the required amount of data has been collected, the collection is often not driven by specific objectives, and the data often suffer from very low quality (inconsistency, incompleteness, etc.). The declared ratio of missing data ranges between 20% and 30%. Moreover, on average, 10-20% of the projects in the data repository can be described as exceptional projects that are not likely to be repeated (data outliers). One of the surveyed organizations, for example, collected measurement data from around 600 projects; yet, due to significant inconsistency and incompleteness, the data were practically useless for estimation purposes (although great effort had been invested to validate and preprocess them). This shows how important it is to define a proper, goal-oriented measurement program before data collection begins.

Another interesting observation is that the quality and quantity of available data do not seem to be related to the specific reference process models used within an organization. A CMMI-L3 or -L5 organization may, for instance, have problems with collecting the minimal amount of useful project data, whereas an ISO9000 organization may already have a number of business-relevant measures collected for dozens of projects in its data repository.
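To make the notion of a "ratio of missing data" concrete, the following minimal sketch computes it for a small project repository. The repository contents, the field names (`size_fp`, `effort_ph`, `team_size`), and the function name `missing_ratio` are hypothetical illustrations, not data or definitions from the surveys; the sketch assumes the ratio is counted over all (project, measure) cells.

```python
def missing_ratio(projects, fields):
    """Share of (project, field) cells whose value is absent or None."""
    total = len(projects) * len(fields)
    if total == 0:
        return 0.0
    missing = sum(1 for project in projects for name in fields
                  if project.get(name) is None)
    return missing / total


# Made-up repository: each dict is one finished project.
repository = [
    {"size_fp": 120, "effort_ph": 950, "team_size": 4},
    {"size_fp": 300, "effort_ph": None, "team_size": 7},   # effort not recorded
    {"size_fp": None, "effort_ph": 400, "team_size": 3},   # size not recorded
    {"size_fp": 80, "effort_ph": 610},                     # team size never collected
    {"size_fp": 210, "effort_ph": 1800, "team_size": 6},
]
FIELDS = ("size_fp", "effort_ph", "team_size")

ratio = missing_ratio(repository, FIELDS)
print(f"missing data: {ratio:.0%}")
```

Other definitions are possible, for example the share of projects with at least one missing measure; which one the surveyed organizations had in mind is not stated in the source.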
[Figure: horizontal bar chart of the amount of available project data against the percentage of survey responses. More than 50 projects 33.3%, 21-50 projects 0.0%, 11-20 projects 22.2%, up to 10 projects 44.4%.]
Fig. 4. Historical project data available in the software industry
One of the factors responsible for insufficient data availability [11, 13] is the rapid technological advancement in the software engineering domain. Software organizations focusing on large projects, such as those in the space domain, might never be able to collect a sufficient amount of up-to-date data.

4.3 Applied Effort Estimation Methods
Given the limited manpower available for estimation, it is quite surprising that a vast majority of software organizations employ effort estimation based on expert assessment (Fig. 5). The industrial surveys showed that 9 out of 10 surveyed companies employed estimation based on human judgment. This is in line with observations made for over a decade, according to which about 60% to 85% of software projects rely exclusively upon expert estimates [14, 16]. Ad-hoc estimates are, however, rare, and predictions are typically obtained through a group meeting where several experts follow a structured estimation process (such as that implemented in the Delphi method [1]) to come up with a final effort prediction. One of the major reasons is distrust of data-driven methods, which results from the lack of substantial evidence in favor of those methods [14]. Moreover, they are often perceived as complicated to use, and thus as requiring significant overhead. Therefore, if used at all, only simple data-driven methods such as linear regression based on the effort and size of already finished projects are applied - often in combination with expert judgment [15].
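As an illustration of the kind of simple data-driven method mentioned above, the following sketch fits an ordinary least-squares line, effort = intercept + slope × size, to finished projects. The function names and the project numbers are made up for illustration (the data points were chosen to lie exactly on a line so that the result is easy to check); the sketch is not taken from any of the surveyed organizations.

```python
def fit_effort_line(sizes, efforts):
    """Ordinary least squares for effort = intercept + slope * size."""
    n = len(sizes)
    if n < 2 or n != len(efforts):
        raise ValueError("need at least two (size, effort) pairs of equal length")
    mean_size = sum(sizes) / n
    mean_effort = sum(efforts) / n
    sxx = sum((s - mean_size) ** 2 for s in sizes)
    if sxx == 0:
        raise ValueError("all project sizes are identical; the slope is undefined")
    sxy = sum((s - mean_size) * (e - mean_effort) for s, e in zip(sizes, efforts))
    slope = sxy / sxx
    intercept = mean_effort - slope * mean_size
    return intercept, slope


def predict_effort(size, intercept, slope):
    """Point estimate of effort for a new project of the given size."""
    return intercept + slope * size


# Hypothetical history: size in function points, effort in person-hours.
sizes = [100, 200, 300, 400]
efforts = [900, 1700, 2500, 3300]

intercept, slope = fit_effort_line(sizes, efforts)
estimate = predict_effort(250, intercept, slope)
print(intercept, slope, estimate)
```

Note that such a model yields only a point estimate; it says nothing about uncertainty or about effort factors other than size, which is exactly the limitation discussed in Section 2.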
[Figure: horizontal bar chart of the effort estimation method against the percentage of survey responses. Multiple experts (e.g., Delphi) 80.0%, regression (based on size and effort) 70.0%, fixed model (e.g., COCOMO, KnowledgePlan) 20.0%, single expert 20.0%, analogy (e.g., Angel) 10.0%.]
Fig. 5. Effort estimation methods applied in the software industry
Experts estimate software size (usually through a work breakdown structure), for instance, and combine it with the average development productivity of already completed projects to come up with the effort prediction (Effort = Size / Average Productivity). A simple combination of the individual results of expert- and data-based estimates typically does not seem to provide significant improvement compared to expert estimates alone [15]. Finally, the use of more sophisticated methods, such as COCOMO [1] or decision trees [3], remains marginal.

4.4 Detailed Requirements Regarding Software Effort Estimation Methods

An investigation of industrial objectives and abilities regarding effort estimation as well as an analysis of detailed comments given by study respondents led us to define a detailed list of requirements that should be considered when selecting a particular estimation method:
1. Expert involvement: The method does not require extensive expert involvement, i.e., it requires a minimal number of experts, limited involvement (effort) per expert, and minimal expertise.
2. Required data: The method does not require large amounts of measurement data of a specific type (i.e., measurement scale) and distribution (e.g., normal).
3. Robustness: The method is robust to low-quality data inputs, i.e., incomplete (e.g., missing data), inconsistent (e.g., data outliers), redundant, and collinear data.
4. Flexibility: The method is free from a specific estimation model and provides context-specific outputs.
5. Complexity: The method has limited complexity, i.e., it does not employ many techniques, its underlying theory is easy to understand, and it does not require specifying many sophisticated parameters.
6. Support level: There is comprehensive support provided along with the method, i.e., complete and understandable documentation, and a useful software tool.
7. Handling uncertainty: The method supports handling the uncertainty of the estimation (i.e., inputs and outputs).
8. Comprehensiveness: The method can be applied for estimating different kinds of project activities (e.g., management, engineering) on various levels of granularity (e.g., project, phase, and task).
9. Availability: The method can be applied during all stages (phases) of the software development lifecycle.
10. Empirical evidence: There is comprehensive empirical evidence supporting the theoretical and practical validity of the method.
11. Informative power: The method provides complete and understandable information that supports the achievement of numerous estimation objectives (e.g., effective effort management). In particular, it provides context-specific information regarding relevant effort factors, their interactions, and their impact on effort.
12. Reliability: The method provides the output that reflects the true situation in a given context. In particular, it provides accurate, precise, and repeatable outputs.
13. Portability: The method provides estimation outputs that are either applicable in other contexts without any modification or are easily adaptable to other contexts.

These requirements may be further quantified, e.g., using a Likert scale with respect to their value and importance, and used within multi-criteria decision support [17] for selecting the most suitable effort estimation method within a specific project context.
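One simple way to quantify the requirements is an importance-weighted mean of Likert ratings. The sketch below illustrates this under stated assumptions: it uses a plain weighted sum, which is only one of many multi-criteria techniques and not necessarily the approach meant in [17]; the candidate names (`method_a`, `method_b`), the three requirements selected, and all ratings and importance values are hypothetical.

```python
def weighted_score(ratings, importance):
    """Importance-weighted mean of Likert ratings (1..5) over the requirements."""
    total_weight = sum(importance.values())
    if total_weight == 0:
        raise ValueError("at least one requirement needs a non-zero importance")
    return sum(importance[req] * ratings[req] for req in importance) / total_weight


def rank_methods(candidates, importance):
    """Return (method, score) pairs, best score first."""
    scored = {name: weighted_score(ratings, importance)
              for name, ratings in candidates.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Hypothetical importance of three of the 13 requirements (1 = low, 5 = high).
importance = {"expert_involvement": 5, "required_data": 4, "informative_power": 3}

# Hypothetical ratings of how well each candidate meets each requirement.
candidates = {
    "method_a": {"expert_involvement": 2, "required_data": 5, "informative_power": 3},
    "method_b": {"expert_involvement": 4, "required_data": 2, "informative_power": 4},
}

ranking = rank_methods(candidates, importance)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

A weighted sum lets a strong rating on one requirement compensate for a weak rating on another; if some requirements are hard constraints (e.g., no historical data are available at all), candidates that violate them would have to be filtered out before scoring.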
5 Overview of Existing Software Effort Estimation Methods

Existing effort estimation methods basically differ with respect to the type of inputs they require and the form of the estimation model they provide. With respect to input data, we differentiate between three major groups: data-intensive, expert-based, and hybrid methods (combining available data and expert knowledge in order to come up with estimates). An analysis of existing estimation methods with respect to industrial objectives and derived requirements indicates a few leading methods that meet most of the requirements, although no single method satisfies all requirements [20].

The major point against data-intensive methods is that they require large data sets. This is not the typical industrial situation (even in high-maturity, process-oriented organizations), which is rather characterized by sparse, incomplete, and inconsistent data. Moreover, these methods are often complicated to use, and have not actually proven to be superior to expert-based methods. Among the data-intensive methods, some require past project data for building customized models (define-your-own-model approaches), while others provide an already defined model, where factors and their relationships are fixed based on a set of multi-organizational project data (fixed-model approaches). The major advantage of fixed-model approaches is that they, theoretically, do not require any historical data to be applied. Those methods might be especially attractive in the IV&V context, where very sparse (if any) data are typically available. Yet, in practice, fixed models, such as COCOMO [1], are developed for a specific context and are, by definition, only suited for estimating the types of projects for which the fixed model was built. The applicability of such models for the IV&V context is, in practice, very limited.
In order to improve their performance, a significant amount of organization-specific project data would be required for calibrating the generic model. In that case, the usefulness of the fixed-model approaches for IV&V effort estimation would not differ much from the define-your-own-model approaches, which require a significant amount of reliable, context-specific data to build customized effort models. Application of the define-your-own-model methods is further limited by the additional requirements of specific methods. Parametric approaches, such as regression [18], for instance, make several assumptions about underlying project data (completeness, normal distribution, etc.) that are rarely met in the software domain. Non-parametric methods originating from the machine learning domain, such as artificial neural networks (ANN) [2] or decision trees [3], make practically no assumptions about the data but are quite sensitive to their parameter configuration, and there is usually little universal guidance regarding how to set those parameters. Thus, finding appropriate parameter values requires some preliminary experimentation.

In contrast to data-intensive methods, expert-based estimation does not require any project measurement data. However, it is widely criticized due to large overheads and the requirement for seasoned experts each time the estimation needs to be performed. Moreover, the reliability of the outputs it provides largely depends on the expertise and individual preferences of the human experts involved. In addition, since the rationale underlying the final estimates is not modeled explicitly, expert-based estimation, by itself, provides hardly any support for effective decision making (risk management, process improvement, negotiations, etc.). Even when experts identify the factors influencing development effort, we find that they tend to largely disagree on them and to omit relevant factors while selecting irrelevant ones.
State of the Practice in Software Effort Estimation: A Survey and Literature Review
Recently, a few hybrid methods have been proposed to cope with the deficits of data-intensive and expert-based estimation. One of the main objectives of hybrid estimation is to reduce the required amount of both measurement data and human expertise by combining those two sources of information. In consequence, more reliable estimates should be obtained with reduced overhead. Empirical applications [12, 21] report higher estimation accuracy and stability of hybrid methods compared to those based solely on data or experts. Moreover, methods that employ explicit causal effort modeling [12, 21] have proven to contribute greatly to the achievement of a variety of organizational objectives, such as risk management or process/productivity improvement. Yet, the causal effort model, from which an effort model is derived, is typically developed based solely on either experts or data. This still requires either significant involvement of experienced human experts or large amounts of high-quality data. Moreover, the reliability of a causal model based on homogeneous sources of information is typically limited. We found, for example, that domain experts, driven by their subjective preferences, tend to omit relevant causal effects (i.e., effort drivers and their causal interactions) while choosing irrelevant ones.
6 Summary and Further Work Directions
In this paper, we analyzed industrial trends with respect to software effort estimation. In particular, we were interested in what the objectives of effort estimation are, what the industrial capabilities for applying certain estimation strategies are, and finally, which of the existing methods are actually employed within the software industry. To summarize our findings, there is a growing need for supporting various project and process management activities, such as risk management, project negotiations, or process improvement. Effort estimation methods that grew out of traditional planning objectives usually focus on providing an accurate point estimate without giving much insight into organization-specific causal effort dependencies, in particular into the most relevant factors contributing to project effort. In consequence, even though accurate estimates can be provided, there is hardly any support for decision making in situations where the estimate exceeds the available budget. Moreover, software organizations, even high-maturity ones, do not have sufficient resources to apply existing estimation methods, which typically require either extensive involvement of seasoned domain experts or large amounts of high-quality measurement data. In consequence, estimation performed with insufficient resources (manpower) by multiple human experts is still a common industrial practice. Even where quantitative methods are applied, they are rather simple and often based on sparse and unreliable data. In this situation, hybrid methods that combine data analysis with expert judgment and provide a transparent, context-specific model of causal effort dependencies seem to offer a potential remedy.
Inclusion of subjective elements, such as an expert’s opinion, in analytical models potentially allows for a significant reduction in the number of irrelevant variables considered in a model, as well as accounting for factors that are difficult to measure. Yet, the few hybrid methods that have been proposed so far base the development of an explicit causal effort model either on measurement data or on human judgment. In consequence, they inherit the major weaknesses of data-driven and expert-based estimation.
A. Trendowicz, J. Münch, and R. Jeffery
The discrepancy between what is needed by the software industry and what is actually provided by the research community indicates that further research should, in general, focus on providing estimation methods that keep up with current industrial needs and abilities. In particular, methods that integrate analysis of sparse data with minimal involvement of human experts in order to come up with causal effort models should be investigated.
References
1. Boehm, B.W., Abts, C., Brown, A.W., Chulani, S., Clark, B.K., Horowitz, E., Madachy, R., Reifer, D., Steece, B.: Software Cost Estimation with COCOMO II. Prentice Hall, Englewood Cliffs (2000)
2. Boetticher, G.: An Assessment of Metric Contribution in the Construction of a Neural Network-Based Effort Estimator. In: International Workshop Soft Computing Applied to Software Engineering, pp. 59–65 (2001)
3. Breiman, L., Friedman, J., Olshen, R., Stone, C.: Classification and Regression Trees. Wadsworth & Brooks/Cole, Advanced Books & Software (1984)
4. Brereton, P., Kitchenham, B.A., Budgen, D., Turner, M., Khalil, M.: Lessons from Applying the Systematic Literature Review Process within the Software Engineering Domain. Journal of Systems and Software 80, 571–583 (2007)
5. Briand, L.C., Wieczorek, I.: Resource Modeling in Software Engineering. In: Marciniak, J.J. (ed.) Encyclopedia of Software Engineering, 2nd edn. Wiley, Chichester (2002)
6. Charette, R.N.: Why Software Fails (Software Failure). IEEE Spectrum 32(9), 42–49 (2005)
7. Jørgensen, M., Løvstad, N., Moen, L.: Combining Quantitative Software Development Cost Estimation Precision Data with Qualitative Data from Project Experience Reports at Ericsson Design Center in Norway. In: International Conference on Empirical Assessments of Software Engineering (2002)
8. Jørgensen, M., Shepperd, M.: A Systematic Review of Software Development Cost Estimation Studies. IEEE Transactions on Software Engineering 33(1), 33–53 (2007)
9. Kitchenham, B.: Procedures for Performing Systematic Reviews. Technical report TR/SE0401, Software Engineering Group, Keele University (2004)
10. Kläs, M., Trendowicz, A., Wickenkamp, A., Münch, J., Kikuchi, N., Ishigai, Y.: The Use of Simulation Techniques for Hybrid Software Cost Estimation and Risk Analysis. Advances in Computers 74, 115–174 (2008)
11. MacDonell, S.G., Shepperd, M.J.: Comparing Local and Global Software Effort Estimation Models – Reflections on a Systematic Review. In: International Symposium on Empirical Software Engineering & Measurement, pp. 401–409 (2007)
12. Mendes, E.: A Comparison of Techniques for Web Effort Estimation. In: International Symposium on Empirical Software Engineering and Measurement, pp. 334–343 (2007)
13. Mendes, E., Lokan, C.: Replicating Studies on Cross- vs Single-company Effort Models Using the ISBSG Database. Journal of Empirical Software Engineering 13(1), 3–37 (2008)
14. Moløkken-Østvold, K.J., Jørgensen, M.: Expert Estimation of the Effort of Web-Development Projects: Why Are Software Professionals in Technical Roles More Optimistic Than Those in Non-Technical Roles? Journal of Empirical Software Engineering 10(1), 7–29 (2005)
15. Moløkken-Østvold, K.J., Jørgensen, M., Tanilkan, S.S., Gallis, H., Lien, A.C., Hove, S.E.: A Survey on Software Estimation in the Norwegian Industry. In: International Symposium on Software Metrics, pp. 208–219 (2004)
16. Moløkken-Østvold, K.J., Jørgensen, M.: A Review of Surveys on Software Effort Estimation. In: International Symposium on Empirical Software Engineering, pp. 223–230 (2003)
17. Paschetta, E., Andolfi, M., Costamanga, M., Rosenga, G.: A Multicriteria-based Methodology for the Evaluation of Software Cost Estimation Models and Tools. In: International Conference on Software Measurement and Management (1995)
18. Sentas, P., Angelis, L., Stamelos, I., Bleris, G.L.: Software Productivity and Effort Prediction with Ordinal Regression. Journal of Information & Software Technology 47(1), 17–29 (2005)
19. The Standish Group: CHAOS Chronicles. Technical report, The Standish Group International, Inc. (2007)
20. Trendowicz, A.: Software Effort Estimation - Overview of Current Industrial Practices and Existing Methods. Technical report 06.08/E, Fraunhofer IESE, Kaiserslautern, Germany (2008)
21. Trendowicz, A., Heidrich, J., Münch, J., Ishigai, Y., Yokoyama, K., Kikuchi, N.: Development of a Hybrid Cost Estimation Model in an Iterative Manner. In: International Conference on Software Engineering, pp. 331–340 (2006)
Testing of Heuristic Methods: A Case Study of Greedy Algorithm
A.C. Barus, T.Y. Chen, D. Grant, F.-C. Kuo, and M.F. Lau
Faculty of Information and Communication Technologies, Swinburne University of Technology, John St., Hawthorn 3122, Australia
{abarus,tchen,dgrant,dkuo,elau}@ict.swin.edu.au
http://www.swin.edu.au/ict/
Abstract. Algorithms which seek global optima are computationally expensive. Alternatively, heuristic methods have been proposed to find approximate solutions. Because heuristic algorithms do not always deliver exact solutions, it is difficult to verify the computed solutions. Such a problem is known as the oracle problem. In this paper, we propose to apply Metamorphic Testing (MT) in such situations because MT is designed to alleviate the oracle problem and can be automated. We demonstrate the failure detection capability of MT on testing a heuristic method, called the Greedy Algorithm (GA), applied to solve the set covering problem (SCP). The experimental results show that MT is an effective method to test GA.
Keywords: heuristic method, greedy algorithm, metamorphic testing.
1 Introduction
Normally algorithms which deliver global optima are computationally expensive. Alternatively, heuristic methods have been proposed to provide approximate optima. These heuristic algorithms may be based on educated guesses, intuitive judgement, or simply common sense to seek answers which are hopefully close to the global optima. Such algorithms may deliver solutions that are global optima, close to global optima, local optima or close to local optima. In other words, there are uncertainties in the solutions delivered by these methods. Examples of heuristic methods include algorithms proposed by Johnson for combinatorial problems [1], by Bodorik et al. for distributed query processing [2], and by Cheng et al. for real-time data aggregation in wireless sensor networks [3]. When such algorithms are implemented as software, it is important to ensure the correctness of the implementation. The availability of a test oracle — a mechanism to verify the output of software — is necessary to determine whether the software passes the test undertaken. Heuristic methods may not give exact solutions for the computed problems. Therefore, it is difficult to verify outputs
Corresponding author.
Z. Huzar et al. (Eds.): CEE-SET 2008, LNCS 4980, pp. 246–260, 2011. © IFIP International Federation for Information Processing 2011
of the corresponding software, which is known as the test oracle problem (or simply the oracle problem) in software testing. Metamorphic testing (MT) was developed to deal with the oracle problem [4]. MT can be used to validate computed outputs automatically without the presence of a test oracle. It uses some properties of the computed problem, known as metamorphic relations (MRs), to help validate the correctness of the computed outputs. In this study, we propose to apply MT to software implementing the greedy algorithm (GA), which is a simple and straightforward heuristic method, to solve the set covering problem (SCP) [1]. In particular, we conduct a case study aimed at demonstrating the failure detection capability of MT. This paper is organized as follows: Section 2 presents the definition of MT, Section 3 explains GA on SCP, and Section 4 describes the metamorphic relations (MRs) identified in this study. Details of the experimental work, results and discussions are presented in Section 5, and Section 6 concludes the paper.
2 The Metamorphic Testing (MT)
A test oracle is a mechanism that testers can use to verify the correctness of the computed outputs of a program [5]. We encounter the test oracle problem when (i) there is no such oracle or (ii) applying such an oracle is too expensive. To alleviate this problem, Chen et al. [4] developed the metamorphic testing (MT) approach, which has been successfully applied in various application domains ([7], [8], [9], [10], [11], [12]). MT is a property-based testing method. We use the sine function to illustrate the idea of MT. Suppose P is a program that computes the sine function. We assume that we do not have an oracle for this problem, i.e., we do not know the exact value of the sine of an arbitrary input. Let 0.49 (radians) be a test input of P. After executing P with 0.49, the corresponding output is P(0.49). Due to the lack of an oracle, P(0.49) may be correct, but we have no way to verify it. The key idea of MT is to use a relationship called a test case relation to generate follow-up test cases whose behaviour is predictable from the original test case. If the predicted behaviour is not exhibited, this indicates an error in P. In the case of the sine function, we might choose 2π + 0.49 and 4π + 0.49 as follow-up test cases, as the sine function ought to yield the same value for these as for 0.49. The test case 0.49 is referred to as the source test case to distinguish it from the follow-up test cases. As noted, for the sine function, we expect that sin(0.49) = sin(2π + 0.49) = sin(4π + 0.49). This relationship is referred to as the test result relation. After executing P with 2π + 0.49 and 4π + 0.49, we can check whether the following equalities hold: P(0.49) = P(2π + 0.49) = P(4π + 0.49). If either equality does not hold, we know that P contains an error. In general, a test case relation may also involve the output of the source test case, although the sine example does not need it.
The success of MT relies on the existence of a metamorphic relation (MR), which comprises two interrelated relations: the test case relation and the test result relation. Once an MR is defined, the generation of follow-up test cases from the source test case and the verification of the test result relation can be automated.
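The sine example can be written down as a short executable check. The following is only a minimal sketch under stated assumptions: Python's `math.sin` stands in for the program P, `faulty_sine` is a hypothetical faulty implementation invented for illustration, and a small floating-point tolerance replaces the exact equality used in the text.

```python
import math

def check_sine_periodicity(program, source_input, tolerance=1e-9):
    """Check the relation P(x) = P(x + 2*pi) = P(x + 4*pi) for one source test case."""
    source_output = program(source_input)
    # Test case relation: derive the follow-up inputs from the source input.
    follow_up_inputs = [source_input + 2 * math.pi, source_input + 4 * math.pi]
    # Test result relation: every follow-up output must match the source output
    # (the tolerance is an assumption to absorb floating-point rounding).
    return all(abs(program(x) - source_output) <= tolerance for x in follow_up_inputs)

def faulty_sine(x):
    # Hypothetical faulty P: truncated Taylor series without range reduction.
    return x - x**3 / 6 + x**5 / 120

print(check_sine_periodicity(math.sin, 0.49))     # True: no failure revealed
print(check_sine_periodicity(faulty_sine, 0.49))  # False: a failure is revealed
```

Note that no expected value of sin(0.49) is needed anywhere in the check; only the relation between outputs is used.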
The following are formal definitions of MR and the procedure of MT [13]:
Definition MR. Suppose a function $f$ has inputs $I_1 = \{x_1, x_2, \ldots, x_i\}$ where $i \ge 1$, and let $O_1 = \{f(x_1), f(x_2), \ldots, f(x_i)\}$ be the corresponding outputs. Let $S = \{f(x_{s_1}), f(x_{s_2}), \ldots, f(x_{s_k})\}$ denote a subset of $O_1$, where $S$ may be empty. Let $I_2 = \{x_{i+1}, x_{i+2}, \ldots, x_j\}$ be other inputs to $f$ where $j \ge i + 1$, and let $O_2 = \{f(x_{i+1}), f(x_{i+2}), \ldots, f(x_j)\}$ be the corresponding outputs. Suppose there exists a relation $R_1$ among $I_1$, $S$ and $I_2$, and another relation $R_2$ among $I_1$, $I_2$, $O_1$ and $O_2$ such that $R_2$ must be satisfied whenever $R_1$ is satisfied. Then, a metamorphic relation MR can be defined as:
$$\mathrm{MR} = \{\, (x_1, x_2, \ldots, x_j, f(x_1), f(x_2), \ldots, f(x_j)) \mid R_1(x_1, x_2, \ldots, x_i, f(x_{s_1}), f(x_{s_2}), \ldots, f(x_{s_k}), x_{i+1}, x_{i+2}, \ldots, x_j) \rightarrow R_2(x_1, x_2, \ldots, x_j, f(x_1), f(x_2), \ldots, f(x_j)) \,\}$$
Elements of $I_1$ and $I_2$ are referred to as source test cases and follow-up test cases, respectively. Relations $R_1$ and $R_2$ are referred to as the test case relation and the test result relation, respectively.
Procedure MT. Suppose the function $f$ is implemented by a program $P$. The procedure of MT using the MR described in the above definition consists of the following steps:
1. Run $P$ using a series of test cases $I_1$ as source test cases and get the corresponding outputs $O_1$.
2. Use $R_1$, $I_1$, and $O_1$ to generate follow-up test cases $I_2$.
3. Run $P$ using $I_2$ as inputs to get the corresponding outputs $O_2$.
4. Check the relation $R_2$: if $R_2$ does not hold, then a failure is revealed.
As program failures may be sensitive to different MRs, it is recommended to identify more than one MR when applying MT. For our sine function example, another possible MR is as follows. For any inputs $x_1$ and $x_2$ where $\pi/2 < x_1 < x_2 < 3\pi/2$, $\sin(x_1)$ must be greater than $\sin(x_2)$. Formally speaking,
$$\mathrm{MR}_{\sin 2}: \quad \pi/2 < x_1 < x_2 < 3\pi/2 \;\rightarrow\; \sin(x_1) > \sin(x_2).$$
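The four steps of Procedure MT map naturally onto a small generic driver. The sketch below is an illustration only, not the authors' tooling: the names `metamorphic_test`, `follow_up_for_mr_sin2` and `relation_mr_sin2` are hypothetical, the follow-up input is chosen as the midpoint between $x_1$ and $3\pi/2$ (one of many valid choices satisfying $R_1$), and `abs_sine` is an invented faulty program used to show a violation.

```python
import math

def metamorphic_test(program, source_inputs, make_follow_ups, result_relation):
    """Apply Procedure MT and return the source inputs for which R2 is violated."""
    failures = []
    for source in source_inputs:
        source_output = program(source)                        # step 1
        follow_ups = make_follow_ups(source, source_output)    # step 2 (R1)
        follow_up_outputs = [program(x) for x in follow_ups]   # step 3
        if not result_relation(source_output, follow_up_outputs):  # step 4 (R2)
            failures.append(source)
    return failures

def follow_up_for_mr_sin2(x1, _source_output):
    # R1 for MR_sin2: pick x2 with pi/2 < x1 < x2 < 3*pi/2 (midpoint is an assumption).
    return [(x1 + 3 * math.pi / 2) / 2]

def relation_mr_sin2(source_output, follow_up_outputs):
    # R2 for MR_sin2: sin(x1) > sin(x2).
    return source_output > follow_up_outputs[0]

def abs_sine(x):
    # Hypothetical faulty P: returns |sin(x)|, which is not decreasing on the interval.
    return abs(math.sin(x))

sources = [2.0, 3.0, 4.0]  # all lie strictly between pi/2 and 3*pi/2
print(metamorphic_test(math.sin, sources, follow_up_for_mr_sin2, relation_mr_sin2))  # []
print(metamorphic_test(abs_sine, sources, follow_up_for_mr_sin2, relation_mr_sin2))  # [3.0, 4.0]
```

The faulty program passes for the source input 2.0, which illustrates why several source test cases, and preferably several MRs, are used.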
3 Greedy Algorithm on Set Covering Problem
The set covering problem (SCP) is one of the NP-complete problems [14] that has been well studied in computer science and complexity theory. Given a set of objects O and a set of requirements R that can be collectively satisfied by objects in O, SCP is to find the smallest subset of O that satisfies all requirements in R. For ease of discussion, in this paper, we use a key to represent an object in O and a lock to represent a requirement in R. SCP can be rephrased as a key-lock problem (KLP). Given a set K of keys that can collectively open a set L of locks, find a set of keys in K of smallest size that can open all locks in L. In this study, we focus on the greedy algorithm (GA) as one of many heuristic solutions to solve SCP. We refer to the expression of GA in [15] that can be translated to pseudocode presented in the Appendix. GA consists of a series of search steps. In each step, it looks for a local optimum which is a key that opens the largest number of locks that cannot be opened by previously selected keys. In general, the set of keys selected by GA may not be a global optimum.
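To make the key-lock formulation concrete, the following sketch finds a smallest key set by exhaustive search. It is only an illustration under stated assumptions: the function name `smallest_key_set` and the four-key instance are hypothetical, and the search tries every subset of keys, so its running time grows exponentially with the number of keys. That cost is the practical motivation for heuristics such as GA.

```python
from itertools import combinations

def smallest_key_set(opens, locks):
    """Exhaustively search for a smallest set of keys that opens every lock.

    opens maps each key identifier to the set of locks that key opens.
    """
    keys = list(opens)
    required = set(locks)
    for size in range(1, len(keys) + 1):          # try smaller sets first
        for candidate in combinations(keys, size):
            opened = set().union(*(opens[k] for k in candidate))
            if opened >= required:
                return list(candidate)
    return None  # the keys cannot collectively open all locks

# Hypothetical instance: four keys, four locks.
opens = {
    "k1": {"l1", "l2"},
    "k2": {"l2", "l3"},
    "k3": {"l3", "l4"},
    "k4": {"l1", "l4"},
}
print(smallest_key_set(opens, ["l1", "l2", "l3", "l4"]))  # ['k1', 'k3']
```

When several smallest sets exist, this sketch returns the first one found in enumeration order.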
Formally, suppose there are a set of keys $K = \{k_1, k_2, \ldots, k_x\}$ and a set of locks $L = \{l_1, l_2, \ldots, l_y\}$, where $x, y > 0$. For every pair $(k_m, l_n) \in K \times L$, we define $r(m, n)$ as a relationship between key $k_m$ and lock $l_n$ such that $r(m, n) = 1$ if $k_m$ opens lock $l_n$, and $r(m, n) = 0$ otherwise. Initially, the relationship $r(m, n)$ is stored in the $(m, n)$th element of matrix $M$, for all $m$, $1 \le m \le x$, and all $n$, $1 \le n \le y$. However, $M$ consists of $(x+1)$ rows and $(y+1)$ columns. The additional column contains the identifiers of the keys in $K$ and the additional row contains the identifiers of the locks in $L$. Each $M[m][n]$ corresponds to $r(m, n)$, the relationship between key $k_m$ and lock $l_n$, where the key identifier $k_m$ is stored in $M[m][y+1]$ for all $m$, $1 \le m \le x$, and the lock identifier $l_n$ is stored in $M[x+1][n]$ for all $n$, $1 \le n \le y$. Intuitively speaking, $M[m][\,]$ (the $m$th row), $1 \le m \le x$, corresponds to key $k_m$, and $M[\,][n]$ (the $n$th column), $1 \le n \le y$, corresponds to lock $l_n$. Note that after GA in the Appendix selects the first key, the corresponding row and the columns representing the locks opened by that key are removed. Hence, we need the extra row and column to identify the remaining keys and locks in $M$. Matrix $M$ can be presented as follows:
$$M = \begin{pmatrix} r(1,1) & r(1,2) & \cdots & r(1,y) & k_1 \\ r(2,1) & r(2,2) & \cdots & r(2,y) & k_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ r(x,1) & r(x,2) & \cdots & r(x,y) & k_x \\ l_1 & l_2 & \cdots & l_y & \end{pmatrix}$$
Given a set of keys $K$, a set of locks $L$, and their relationship stored in a matrix $M$, GA considers the number of locks in $L$ that can be opened by each key in $K$ (in other words, the total number of "1"s appearing in each row of $M$) in order to make a series of decisions regarding the local optimum. As all key identifiers of $K$ and all lock identifiers of $L$ are embedded in $M$, GA merely analyses $M$ to guide its search process. Basically, GA's search consists of iterations of the following steps.
1. For each row of $M$, count the number of locks opened by the key corresponding to the row. Note: hereafter, the number of locks opened by a key $k$ is referred to as numOpenL($k$).
2. Select a key (say $k_x$) in $M$ such that $k_x$ can open the most locks in $M$. In other words, numOpenL($k_x$) is the largest among numOpenL($k$) for all keys $k$ in $M$. In case of a tie, select the key with the smallest row index.
3. Append the selected key $k_x$ to $O$, an array storing the output elements of GA.
4. Remove the columns in $M$ corresponding to all locks that can be opened by $k_x$.
5. Remove the row in $M$ corresponding to $k_x$.
6. Repeat steps 1 to 5 until $M$ has one column left.
At the end of the search, $M$ only contains a column storing the identifiers of unselected keys. However, if all keys are selected into $O$, this column is empty. The output of GA is $O$, which contains all keys selected in the search process. A test oracle for GA can be obtained manually only when the size of $M$ is small. Even when $M$ is moderate in size (say, with 30 or more rows and columns),
there is an oracle problem for testing the implementation of GA. Therefore, in this study we propose to use Metamorphic Testing (MT) to verify the implementation of GA. As mentioned, GA uses and manipulates $M$ to determine the local optima based on the largest numOpenL($k_m$) among the keys $k_m$ in $M$. Details of the algorithm are presented in the Appendix. We illustrate GA on an instance of KLP, namely KL-example.
KL-example. Suppose there is a set of keys $\{k_1, k_2, \ldots, k_5\}$, a set of locks $\{l_1, l_2, \ldots, l_9\}$, and the associated input matrix $M_{KL}$ to GA is as follows:
$$M_{KL} = \begin{pmatrix} 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & k_1 \\ 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & k_2 \\ 0 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & k_3 \\ 1 & 0 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & k_4 \\ 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & k_5 \\ l_1 & l_2 & l_3 & l_4 & l_5 & l_6 & l_7 & l_8 & l_9 & \end{pmatrix}$$
Since $k_4$ opens the largest number of locks in $M_{KL}$ (namely four locks: $l_1$, $l_3$, $l_7$ and $l_8$), GA selects $k_4$ as a local optimum and appends $k_4$ to $O$ so that now $O = [k_4]$. The row corresponding to $k_4$ and the columns corresponding to $l_1$, $l_3$, $l_7$ and $l_8$ are deleted from the matrix. As a result, the matrix is updated as follows:
$$\begin{pmatrix} 0 & 0 & 0 & 0 & 0 & k_1 \\ 0 & 1 & 0 & 1 & 0 & k_2 \\ 1 & 0 & 0 & 0 & 0 & k_3 \\ 0 & 0 & 1 & 0 & 1 & k_5 \\ l_2 & l_4 & l_5 & l_6 & l_9 & \end{pmatrix}$$
In the second round of the search, both $k_2$ and $k_5$ open the largest number of remaining locks. However, because the row index of $k_2$ is smaller than the row index of $k_5$, GA selects $k_2$ so that now $O = [k_4, k_2]$. The row corresponding to $k_2$ and the columns corresponding to the locks opened by $k_2$ (that is, $l_4$ and $l_6$) are deleted from the matrix. As a result, the matrix is updated as follows:
$$\begin{pmatrix} 0 & 0 & 0 & k_1 \\ 1 & 0 & 0 & k_3 \\ 0 & 1 & 1 & k_5 \\ l_2 & l_5 & l_9 & \end{pmatrix}$$
Using the same procedure, in the third round, $k_5$ is picked as the local optimum so that now $O = [k_4, k_2, k_5]$. Then the matrix is updated as follows:
$$\begin{pmatrix} 0 & k_1 \\ 1 & k_3 \\ l_2 & \end{pmatrix}$$
Finally, GA selects $k_3$ so that now $O = [k_4, k_2, k_5, k_3]$ and deletes all columns and rows associated with $k_3$. Then, the matrix is updated as follows:
$$\begin{pmatrix} k_1 \end{pmatrix}$$
At this point, the matrix contains only one column. Hence, GA stops the search process and returns its output $O = [k_4, k_2, k_5, k_3]$. However, in this example, we can see that the global minimum solution is $[k_2, k_3, k_5]$. So how can we know whether the implementation of GA is correct? (We emphasise that we refer to our implementation of GA. We do not know what the output of GA should be. We do have the output of our program. How do we check that it is correct?)
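The search steps and the KL-example can be reproduced with a short program. This is a minimal sketch and not the pseudocode from the Appendix: the function name `greedy_key_selection` is hypothetical, the identifier row and column of $M$ are passed as separate lists rather than embedded in the matrix, and the early exit when no remaining key opens any remaining lock is an added assumption for inputs whose keys cannot collectively open all locks.

```python
def greedy_key_selection(rows, key_ids, lock_ids):
    """Greedy selection of keys; rows[m][n] == 1 iff key_ids[m] opens lock_ids[n]."""
    rows = [list(r) for r in rows]  # work on copies, leave the caller's data intact
    key_ids, lock_ids = list(key_ids), list(lock_ids)
    selected = []
    while lock_ids and key_ids:
        counts = [sum(r) for r in rows]        # step 1: numOpenL for every key
        best = counts.index(max(counts))       # step 2: ties go to the smallest row index
        if counts[best] == 0:
            break                              # assumption: stop if no lock can be opened
        selected.append(key_ids[best])         # step 3: append to the output O
        keep = [n for n, bit in enumerate(rows[best]) if bit == 0]
        lock_ids = [lock_ids[n] for n in keep]       # step 4: drop the opened locks
        del rows[best], key_ids[best]                # step 5: drop the selected key
        rows = [[r[n] for n in keep] for r in rows]
    return selected

# The matrix M_KL of the KL-example, without the identifier row and column.
M_KL = [
    [1, 0, 1, 0, 0, 0, 1, 0, 0],  # k1
    [1, 0, 0, 1, 0, 1, 0, 0, 0],  # k2
    [0, 1, 0, 0, 0, 0, 1, 1, 0],  # k3
    [1, 0, 1, 0, 0, 0, 1, 1, 0],  # k4
    [0, 0, 1, 0, 1, 0, 0, 0, 1],  # k5
]
keys = ["k1", "k2", "k3", "k4", "k5"]
locks = ["l1", "l2", "l3", "l4", "l5", "l6", "l7", "l8", "l9"]
print(greedy_key_selection(M_KL, keys, locks))  # ['k4', 'k2', 'k5', 'k3']
```

The printed output matches the sequence derived by hand above, and it contains four keys even though three keys suffice, which is exactly the situation in which no simple oracle is available.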
4 Metamorphic Relation (MR)
Identification of MRs is an essential step in conducting MT. In this study, we propose nine MRs, MR1 to MR9, for applying MT to GA for solving SCP. We use $M$ and $M'$ to denote a source test case and the follow-up test case, respectively, in describing the MRs. To illustrate those MRs further, we reuse the KL-example with $M_{KL}$ as the source test case, as discussed in the previous section. The follow-up test case generated using a particular MR, say MR-i, is denoted as $M'$(MR-i). In the following discussion, the highlighted cells of $M'$(MR-i) indicate cells modified after applying MR-i. We also use $O$ and $O'$ to denote the outputs of the source and follow-up test cases, respectively, and numO to denote the size (the number of elements) of $O$. Note that unlike the initial $M_{KL}$ specified in Section 3, the follow-up test cases $M'$(MR-i) in this section may have the $m$th row corresponding to a key other than $k_m$ in $K$ and the $n$th column corresponding to a lock other than $l_n$ in $L$. We now discuss the nine MRs.
1. MR1 (Interchanging columns related to the key-lock relationship). If we generate the follow-up test case $M'$ by interchanging two columns of the source test case $M$ which are related to the key-lock relationship, then $O' = O$. (This corresponds to re-labelling two locks and has no effect on the keys chosen by GA.) For example, if we apply MR1 by interchanging the columns related to $l_2$ and $l_4$ in $M_{KL}$,
$$M'(\mathrm{MR1}) = \begin{pmatrix} 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & k_1 \\ 1 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & k_2 \\ 0 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & k_3 \\ 1 & 0 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & k_4 \\ 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & k_5 \\ l_1 & l_4 & l_3 & l_2 & l_5 & l_6 & l_7 & l_8 & l_9 & \end{pmatrix}$$
then we have $O' = O = [k_4, k_2, k_5, k_3]$.
2. MR2 (Adding a useless key row). A key is said to be useless if it cannot open any locks. If $M'$ is obtained from $M$ by adding a row corresponding to a useless key, then $O' = O$. For example, if we apply MR2 to generate $M'$(MR2) by adding a row corresponding to the useless key $k_6$ in $M_{KL}$,
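$$M'(\mathrm{MR2}) = \begin{pmatrix} 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & k_1 \\ 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & k_2 \\ 0 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & k_3 \\ 1 & 0 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & k_4 \\ 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & k_5 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & k_6 \\ l_1 & l_2 & l_3 & l_4 & l_5 & l_6 & l_7 & l_8 & l_9 & \end{pmatrix}$$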
then we have $O' = O = [k_4, k_2, k_5, k_3]$.
3. MR3 (Adding an insecure lock column). A lock is insecure if it can be opened by any key. If $M'$ is obtained from $M$ by adding a column corresponding to an insecure lock, then $O' = O$. For example, if we apply MR3 to generate a follow-up test case $M'$(MR3) by adding a column corresponding to the insecure lock $l_{10}$ in $M_{KL}$,
$$M'(\mathrm{MR3}) = \begin{pmatrix} 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & k_1 \\ 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & k_2 \\ 0 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & k_3 \\ 1 & 0 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & k_4 \\ 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 1 & k_5 \\ l_1 & l_2 & l_3 & l_4 & l_5 & l_6 & l_7 & l_8 & l_9 & l_{10} & \end{pmatrix}$$
then we have $O' = O = [k_4, k_2, k_5, k_3]$.
4. MR4 (Rearranging rows corresponding to the selected keys on top while preserving their order). $M'$ is obtained from $M$ by rearranging the rows so that 1) the rows of the selected keys become the top rows of $M'$, in the order in which the keys were selected, and 2) the rows of the unselected keys are all placed below the rows of the selected keys, in any order. Then, we have $O' = O$. We want to preserve the ordering so that the checking of the test result relation ($O' = O$) can be implemented in linear time when $O$ is stored as a 1-dimensional array. For example, if we apply MR4 to generate a follow-up test case $M'$(MR4) from $M_{KL}$,
$$M'(\mathrm{MR4}) = \begin{pmatrix} 1 & 0 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & k_4 \\ 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & k_2 \\ 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & k_5 \\ 0 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & k_3 \\ 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & k_1 \\ l_1 & l_2 & l_3 & l_4 & l_5 & l_6 & l_7 & l_8 & l_9 & \end{pmatrix}$$
then we have $O' = O = [k_4, k_2, k_5, k_3]$.
5. MR5 (Adding a combined key of two consecutively selected keys). If two keys are combined together as one key, the resulting key can open all locks that can be opened by either individual key. The resulting key is referred to as a combined key of the two individual keys. In applying MR5, $M'$ is obtained from $M$ by appending a row corresponding to a key, say $k_x$, which combines two selected keys. To simplify the discussion, we restrict our discussion to the second and third selected keys, namely $k_{o_2}$ and $k_{o_3}$,
respectively. In fact, this can be generalized to any two consecutive selected keys. To generate the follow-up test case, there are two interrelated steps:
(a) (Adding a combined key) Append a row corresponding to $k_x$, the combined key of $k_{o_2}$ and $k_{o_3}$. In other words, the relationship between $k_x$ and each lock $l_n$ in $M$ is defined as follows: $r(x, n) = 1$ if either $r(o_2, n) = 1$ or $r(o_3, n) = 1$.
(b) (Preserving the ordering) Since the combined key $k_x$ may open more locks than the first selected key $k_{o_1}$, we need to add $N$ extra locks that can only be opened by $k_{o_1}$ so that $k_{o_1}$ is selected before $k_x$ in the new solution. By doing so, we preserve the ordering of the selected keys so that the test result relation of this MR is simple and easily verified. If numOpenL($k_x$) > numOpenL($k_{o_1}$), then the number of extra locks needed for $k_{o_1}$ is numOpenL($k_x$) - numOpenL($k_{o_1}$). Otherwise, no extra locks are needed. Note: a lock that can be opened by one and only one key $k_x$ is referred to as an exclusive lock for $k_x$.
Then, we have $k_x$ as the second selected key in $O'$ and $O' - k_x = (O - k_{o_2}) - k_{o_3}$. Note that the minus (-) operator here (and also in the rest of the paper) denotes that the right operand, in this case a key, is deleted from the left operand, in this case an array of keys. For example, if we apply MR5 to generate a follow-up test case $M'$(MR5) by appending $k_6$, a combined key of $k_2$ and $k_5$, to $M_{KL}$ and then appending two exclusive locks to $k_4$ ($l_{10}$ and $l_{11}$, because numOpenL($k_6$) - numOpenL($k_4$) = 6 - 4 = 2),
$$M'(\mathrm{MR5}) = \begin{pmatrix} 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & k_1 \\ 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & k_2 \\ 0 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & k_3 \\ 1 & 0 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & k_4 \\ 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & k_5 \\ 1 & 0 & 1 & 1 & 1 & 1 & 0 & 0 & 1 & 0 & 0 & k_6 \\ l_1 & l_2 & l_3 & l_4 & l_5 & l_6 & l_7 & l_8 & l_9 & l_{10} & l_{11} & \end{pmatrix}$$
then we have $k_6 = O'[2]$ and $O' - k_6 = (O - k_2) - k_5$. In this example, $O' = [k_4, k_6, k_3]$.
6. MR6 (Excluding a selected key other than the first selected key while preserving the order of the remaining selected keys). To exclude a selected key other than the first selected key, say $k_x$, so that it will not be selected in the next output, we need to reset the selected keys preceding $k_x$ in the original output so that they can open the locks that can be opened by the excluded key. In order to preserve the ordering of the remaining selected keys for ease of checking the two solutions, we need to take some precautions. First, we cannot simply reset the selected key preceding $k_x$ in $O$ to be able to open all locks opened by $k_x$. This is because the total number of locks opened by that key will increase and hence, it is possible that the key would then be selected earlier. Second, we cannot simply reset the first selected key $k_{o_1}$ to open all these locks either. This is because it
is possible that the selection order of the remaining keys might be upset, as shown in the following example.
$$M = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & k_1 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & k_2 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & k_3 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & k_4 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & k_5 \\ l_1 & l_2 & l_3 & l_4 & l_5 & l_6 & l_7 & l_8 & l_9 & l_{10} & l_{11} & l_{12} & l_{13} & l_{14} & \end{pmatrix}$$
In this case, we have $O = [k_1, k_2, k_3, k_4, k_5]$. Suppose we would like to exclude $k_5$ in the next output by resetting the first selected key $k_1$ as follows:
$$M' = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & k_1 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & k_2 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & k_3 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & k_4 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & k_5 \\ l_1 & l_2 & l_3 & l_4 & l_5 & l_6 & l_7 & l_8 & l_9 & l_{10} & l_{11} & l_{12} & l_{13} & l_{14} & \end{pmatrix}$$
Then $O' = [k_1, k_3, k_4, k_2]$ (after $k_1$ and $k_3$ are selected, $k_4$ still opens two remaining locks while $k_2$ opens only one) and, hence, the ordering of these previously selected keys is different from that in $O$. Therefore, in applying MR6, $M'$ is obtained from $M$ by resetting the first selected key $k_{o_1}$ so that it can additionally open only those locks that can be opened by $k_x$ (say $k_x$ is $O[i]$, $2 \le i \le$ numO) but not by any key $k_y$ where $k_y = O[j]$, $2 \le j < i$. In other words, when $r(x, n) = 1$ and $r(y, n) = 0$ for every such $k_y$, then $r(o_1, n)$ is updated to "1", for all $l_n$ in $M$. Then, we have $O' = O - k_x$. For example, suppose we want to apply MR6 to generate $M'$(MR6) by excluding $k_5$ of $M_{KL}$ in $O'$. The first selected key in $O$, $k_4$, is reset so that it can open $l_5$ and $l_9$, which are the locks that can be opened by $k_5$ but not by the other selected key preceding $k_5$, which is $k_2$. The follow-up test case is given by
$$M'(\mathrm{MR6}) = \begin{pmatrix} 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & k_1 \\ 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & k_2 \\ 0 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & k_3 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 & 1 & 1 & k_4 \\ 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & k_5 \\ l_1 & l_2 & l_3 & l_4 & l_5 & l_6 & l_7 & l_8 & l_9 & \end{pmatrix}$$
and we then have $O' = O - k_5$. In this example, $O' = [k_4, k_2, k_3]$.
7. MR7 (Deleting a selected key while preserving the order of the remaining selected keys). This MR is different from MR6 because the row corresponding to a selected key, say $k_x$, is deleted from $M$. All columns related to locks opened by $k_x$ are also deleted from $M$. However, in order to preserve the order of the remaining selected keys, we have to add extra columns corresponding to exclusive locks to each of the selected keys preceding $k_x$ in $O$. In more detail, $M'$ is obtained from $M$ in the following manner:
(a) The row corresponding to kx is deleted from M.
(b) All locks that can be opened by kx are deleted from M.
(c) For each ky, a selected key preceding kx in O, N locks exclusive to ky are appended to M, where N is the number of locks that can be opened by both kx and ky.
Note: a lock that can be opened by both kx and ky is referred to as a share lock of kx and ky. The share locks are checked according to the order of the selected keys preceding kx in O. Once a lock has been considered as a share lock of any pair of keys, it cannot be used as a share lock of other pairs. Then, we have O' = O - kx.

For example, if we apply MR7 to generate a follow-up test case M'(MR7) by deleting k3 in MKL, the row corresponding to k3 and all columns corresponding to locks opened by k3 are deleted. For the keys preceding k3 in O (that is, k4, k2, and k5), we find that k3 and k4 have two share locks; k3 and k2 have no share locks; and k3 and k5 also have no share locks. Accordingly, we need to add two locks, l10 and l11, that are exclusive to k4. Hence, the follow-up test case is:

\[
M'(\mathrm{MR7}) = \left(\begin{array}{cccccccccccc}
1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & k_1 \\
1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & k_2 \\
0 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & k_3 \\
1 & 0 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & k_4 \\
0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & k_5 \\
l_1 & l_2 & l_3 & l_4 & l_5 & l_6 & l_7 & l_8 & l_9 & l_{10} & l_{11} &
\end{array}\right)
\]

Note that the darker highlighted cells (the k3 row and the columns l2, l7, and l8) will be deleted. Then, we have O' = O - k3. In this example, O' = [k4, k2, k5].

8. MR8 (Adding an exclusive lock to an unselected key). If M' is obtained from M by adding an extra column that corresponds to an exclusive lock to a particular unselected key, then we expect that the unselected key will be in the new solution. Since the unselected key may open other existing locks, some previously selected key may be excluded from the solution. However, we cannot pre-determine which key will be excluded.

For example, if we apply MR8 to generate a follow-up test case M'(MR8) by adding an exclusive lock l10 to k1, a key that is unselected in O, then

\[
M'(\mathrm{MR8}) = \left(\begin{array}{ccccccccccc}
1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & k_1 \\
1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & k_2 \\
0 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & k_3 \\
1 & 0 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & k_4 \\
0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & k_5 \\
l_1 & l_2 & l_3 & l_4 & l_5 & l_6 & l_7 & l_8 & l_9 & l_{10} &
\end{array}\right)
\]

and we have k1 in O'. In this example, O' = [k1, k2, k3, k5]. Please note that the key k4 (originally in O) is not in O'.

9. MR9 (Adding an exclusive lock to an unselected key while preserving the order of the previously selected keys). This MR is different from MR8 because the order of the previously selected keys is preserved. In more detail, the follow-up test case M' of this MR is obtained as follows: add an extra column that corresponds to an exclusive lock to a particular unselected key and reset the unselected key so that it can open that exclusive lock and no other lock. Then, we are guaranteed that the order of the previously selected keys is the same in both solutions. In other words, we have O' - kx = O.

For example, if we apply MR9 to generate a follow-up test case M'(MR9) by adding an exclusive lock l10 to k1 and resetting k1 such that it can open l10 and no other lock, then

\[
M'(\mathrm{MR9}) = \left(\begin{array}{ccccccccccc}
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & k_1 \\
1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & k_2 \\
0 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & k_3 \\
1 & 0 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & k_4 \\
0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & k_5 \\
l_1 & l_2 & l_3 & l_4 & l_5 & l_6 & l_7 & l_8 & l_9 & l_{10} &
\end{array}\right)
\]

Then, we have O' - k1 = O. In this example, O' = [k4, k2, k1, k3, k5].
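The relations for MR8 and MR9 lend themselves to a small executable check. The following Python sketch is an illustration only: it assumes a dictionary representation of the key-lock matrix (key name mapped to the set of locks it opens) instead of the labelled matrix used in this paper, and the names greedy_keys, mr8_follow_up and mr9_follow_up are hypothetical. Ties are broken in favour of the first key, which reproduces O = [k4, k2, k5, k3] for MKL. For MR9 only the relation O' - kx = O is checked, since the position of the added key in O' depends on how ties are broken.

```python
from typing import Dict, List, Set


def greedy_keys(keys: Dict[str, Set[str]]) -> List[str]:
    """Repeatedly pick the key that opens the most still-closed locks.

    Assumption: ties go to the first key in insertion order (strict '>').
    """
    remaining = {k: set(locks) for k, locks in keys.items()}
    order: List[str] = []
    while True:
        best, best_count = None, 0
        for k, locks in remaining.items():
            if len(locks) > best_count:
                best, best_count = k, len(locks)
        if best is None:  # no remaining key opens a closed lock
            return order
        opened = remaining.pop(best)
        order.append(best)
        for locks in remaining.values():
            locks -= opened  # these locks are now open


def mr8_follow_up(keys, unselected, new_lock):
    """MR8: add a lock that only the unselected key can open."""
    follow_up = {k: set(v) for k, v in keys.items()}
    follow_up[unselected].add(new_lock)
    return follow_up


def mr9_follow_up(keys, unselected, new_lock):
    """MR9: as MR8, but the unselected key opens nothing except the new lock."""
    follow_up = {k: set(v) for k, v in keys.items()}
    follow_up[unselected] = {new_lock}
    return follow_up


# MKL as read from the follow-up matrices shown above.
M_KL = {
    "k1": {"l1", "l3", "l7"},
    "k2": {"l1", "l4", "l6"},
    "k3": {"l2", "l7", "l8"},
    "k4": {"l1", "l3", "l7", "l8"},
    "k5": {"l3", "l5", "l9"},
}

source_order = greedy_keys(M_KL)
mr8_order = greedy_keys(mr8_follow_up(M_KL, "k1", "l10"))
mr9_order = greedy_keys(mr9_follow_up(M_KL, "k1", "l10"))

print(source_order)  # ['k4', 'k2', 'k5', 'k3']
print(mr8_order)     # ['k1', 'k2', 'k3', 'k5']

# MR8: the previously unselected key must appear in the follow-up output.
mr8_holds = "k1" in mr8_order
# MR9: removing the added key from O' must give back O.
mr9_holds = [k for k in mr9_order if k != "k1"] == source_order
```

A violation of either flag on some source test case would indicate a failure without requiring knowledge of the optimal key set, which is the point of using MRs when no oracle is available.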
5 Results and Observation
We applied MT to test a program implementing the algorithm in the Appendix. It was written in Java and the testing was conducted in a Linux environment. We applied the fault seeding technique, which is widely used in software testing experiments [9, 16], to create five faulty versions of the program. We inserted one bug into each faulty version. The bug insertion process used mutant operators introduced by Agrawal et al. [17]. The mutant operators were chosen randomly and independently. The faults resulting from these mutant operators are detailed in Table 1. Source test cases were generated in matrix format with random numbers of rows and columns in the range between 1 and 100 (there were no empty matrix inputs). The content of the matrices (K, L and their relationship) was also generated randomly, subject to the constraint that every row had at least one “1” and every column had at least one “1” (in other words, every key can open at least one lock and every lock can be opened by at least one key). We generated a test suite of one hundred test cases. These test cases were used as source test cases for testing faulty programs V1 to V5 using the nine MRs introduced in Section 4. The results are presented in Table 2. From Table 2, we observe that most of the MRs revealed failures in at least one faulty version, except MR1, which did not reveal any failures. This is reasonable because we cannot guarantee that every MR will reveal a failure. Such a case simply exhibits the general limitation of software testing, that is, there is no guarantee that a certain fault will be exposed. We consider an entry in Table 2 as the execution of a faulty version using one hundred pairs of source test cases and follow-up test cases corresponding to a specific MR. Therefore, we have 9 × 5 = 45 entries in our experiments, of which 24 entries are non-zero. If all pairs of source test cases and follow-up
Table 1. Faults in faulty programs V1 to V5 based on pseudocode in the Appendix

Faulty Program   Line#   Original Statement       Faulty Statement
V1               8       i = 1                    i = 2
V2               11      M[i][j] = 1              M[i][j] ≠ 1
V3               14      nOLocks > maxOLocks      nOLocks ≥ maxOLocks
V4               15      maxOLocks := nOLocks     maxOLocks := i
V5               23      i = 1                    i = 2

Table 2. Percentage of tests revealing failures in the faulty versions by MRs

        V1    V2    V3    V4    V5
MR1      0     0     0     0     0
MR2      6   100     0    30    10
MR3      0     0     0    43    31
MR4     93     2   100   100    56
MR5      0   100    95    80    67
MR6      0   100     0     4    31
MR7      0    69     0    25    31
MR8     18     0     0     0     0
MR9     11     0     0     8     0
test cases are considered (a total of 4500 pairs), there are 1210 pairs (26.9%) which reveal failures. Given that there is no way to verify the correctness of the computed outputs, and that MT could be fully automated once the MRs have been defined, a failure detection effectiveness of 26.9% is in fact very encouraging. It should also be noted that to apply MT in generating test cases, the software test engineers only require minimal programming skills and the relevant problem domain knowledge. Different failures in our experiments can be detected by different MRs because of their different characteristics. For example, the faulty version V2 selects the key that opens the least number of locks instead of the maximum. Hence, MR2 can reveal this failure because it adds a useless key (a key that cannot open any lock) in the follow-up test case, a key that will be chosen by the faulty program but not by a correct program. In contrast, MR3 cannot reveal this failure because it adds an insecure lock (a lock that can be opened by every key) in the follow-up test case, and such a key will never be chosen in V2. As our experiments show that some MRs, say MR4, can reveal more failures than the others, the selection of good MRs is crucial. Obviously, all MRs could potentially be used in MT. However, due to resource limitations (in industrial examples, running test cases even on powerful computers may take many hours), it is essential to identify which MRs should be given higher priority. Chen et al. have conducted some case studies on the selection of useful MRs in MT [9]. One of their general conclusions is that the bigger the differences between program executions on the source and follow-up test cases, the better the MRs.
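The aggregate figures quoted above can be re-derived from Table 2. The sketch below is a hand transcription of the table into Python (rows MR1 to MR9, columns V1 to V5), so the values should be re-checked against the table; the variable names are illustrative. Because each entry is based on one hundred pairs, a percentage equals the number of failure-revealing pairs for that entry.

```python
# Percentages transcribed from Table 2 (rows: MR1..MR9, columns: V1..V5).
TABLE_2 = {
    "MR1": [0, 0, 0, 0, 0],
    "MR2": [6, 100, 0, 30, 10],
    "MR3": [0, 0, 0, 43, 31],
    "MR4": [93, 2, 100, 100, 56],
    "MR5": [0, 100, 95, 80, 67],
    "MR6": [0, 100, 0, 4, 31],
    "MR7": [0, 69, 0, 25, 31],
    "MR8": [18, 0, 0, 0, 0],
    "MR9": [11, 0, 0, 8, 0],
}
PAIRS_PER_ENTRY = 100  # source/follow-up pairs behind each table entry

entries = [value for row in TABLE_2.values() for value in row]
total_pairs = PAIRS_PER_ENTRY * len(entries)
failing_pairs = sum(value * PAIRS_PER_ENTRY // 100 for value in entries)
nonzero_entries = sum(1 for value in entries if value > 0)
overall_rate = 100 * failing_pairs / total_pairs

# Failure-revealing pairs per MR, summed over the five faulty versions.
per_mr = {mr: sum(row) for mr, row in TABLE_2.items()}
most_effective_mr = max(per_mr, key=per_mr.get)

# Faulty versions in which at least one MR revealed a failure.
versions_detected = [
    f"V{col + 1}"
    for col in range(5)
    if any(row[col] > 0 for row in TABLE_2.values())
]

print(total_pairs, failing_pairs, nonzero_entries)  # 4500 1210 24
print(round(overall_rate, 1))                       # 26.9
print(most_effective_mr)                            # MR4
```

Under this transcription every faulty version is detected by at least one MR, and MR4 accounts for the largest share of failure-revealing pairs.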
6 Conclusion
Heuristic methods do not deliver exact answers. Accordingly, software implementing these methods is subject to the oracle problem. We propose to apply metamorphic testing (MT), an automated property-based testing method, to test such software. In this study, we investigated the application of MT to the greedy algorithm (GA) applied to the set covering problem. We identified nine MRs to apply MT to GA and conducted testing on five faulty versions of a program implementing GA. Based on the experimental results, we found that MT reveals at least one failure in each of the faulty versions under investigation. This demonstrates the fault detection capability of MT in automatically testing heuristic programs such as GA. The experimental results show that some MRs reveal more failures than others. This is due to the different characteristics of the MRs. Hence, it is crucial to select good MRs, particularly when the resources for conducting testing are limited. Our study is limited by the following threats. (1) Threat to internal validity (e.g. experimental setup). We have carefully examined the processes to make sure that no such threat exists. (2) Threat to construct validity (e.g. evaluation method). The failure detection effectiveness of MT is only studied based on the percentage of pairs of source and follow-up test cases which revealed failures. It is worthwhile to examine the effectiveness of MT using other measurements as part of the future work. (3) Threat to external validity (e.g. number and size of subjects in the case study). We only applied MT to one simple program of GA in the case study. In future work, it would be interesting to investigate the effectiveness of MT using more and larger programs. In the future, we also propose to study a new approach for defining MRs.
In the new approach, prior to defining MRs, we will attempt to use knowledge of possible faults in the software under test (whereas our current approach is ad hoc, defining the MRs independently of the possible faults or, in this study, independently of the mutant generation). The faults can be identified at either a high level (specification-based) or a lower level (source code-based). Afterwards, we will attempt to define MRs for targeting such faults. We expect that this approach may yield a more effective set of MRs than our current approach.

Acknowledgment. We would like to acknowledge the support given to this project by an Australian Research Council Discovery Grant (ARC DP0771733).
References
1. Johnson, D.S.: Approximation algorithms for combinatorial problems. Journal of Computer and System Sciences 9(3), 256–278 (1974)
2. Bodorik, P., Riordon, J.S.: Heuristic algorithms for distributed query processing. In: Proceedings of the First International Symposium on Databases in Parallel and Distributed Systems, DPDS’88. IEEE Computer Society Press, Los Alamitos (2000)
3. Cheng, H., Liu, Q., Jia, X.: Heuristic algorithms for real-time data aggregation in wireless sensor networks. In: Proceedings of the 2006 International Conference on Wireless Communication and Mobile Computing, pp. 1123–1128 (2006)
4. Chen, T.Y., Cheung, S.C., Yiu, S.M.: Metamorphic testing: a new approach for generating next test cases. Technical Report HKUST-CS98-01, Department of Computer Science, Hong Kong University of Science and Technology, Hong Kong (1998)
5. Weyuker, E.J.: On testing non-testable programs. The Computer Journal 25(4), 465–470 (1982)
6. Chan, W.K., Cheung, S.C., Leung, K.R.P.H.: Towards a metamorphic testing methodology for service-oriented software applications. In: Proceedings of the 5th International Conference on Quality Software (QSIC 2005), pp. 470–476. IEEE Computer Society Press, Los Alamitos (2005)
7. Chan, W.K., Cheung, S.C., Leung, K.R.P.H.: A metamorphic testing approach for online testing of service-oriented software applications. A Special Issue on Service Engineering of International Journal of Web Services Research 4(2), 60–80 (2007)
8. Chen, T.Y., Feng, J., Tse, T.H.: Metamorphic testing of programs on partial differential equations: a case study. In: Proceedings of the 26th Annual International Computer Software and Applications Conference (COMPSAC), pp. 327–333. IEEE Computer Society Press, Los Alamitos (2002)
9. Chen, T.Y., Huang, D., Tse, T.H., Zhou, Z.Q.: Case studies on the selection of useful relations in metamorphic testing. In: Proceedings of the 4th Ibero-American Symposium on Software Engineering and Knowledge Engineering (JIISIC), pp. 569–583. Polytechnic University of Madrid (2004)
10. Chen, T.Y., Tse, T.H., Zhou, Z.Q.: Semi-proving: an integrated method based on global symbolic evaluation and metamorphic testing. In: Proceedings of the ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA), pp. 191–195. ACM Press, New York (2002)
11. Chen, T.Y., Tse, T.H., Zhou, Z.Q.: Fault-based testing without the need of oracles. Information and Software Technology 45(2), 1–9 (2003)
12. Gotlieb, A.: Exploiting symmetries to test programs. In: Proceedings of the 14th International Symposium on Software Reliability Engineering, ISSRE (2003)
13. Chan, W.K., Chen, T.Y., Lu, H., Tse, T.H., Yau, S.S.: Integration testing of context-sensitive middleware-based applications: a metamorphic approach. International Journal of Software Engineering and Knowledge Engineering 16(5), 677–703 (2006)
14. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman, New York (1979)
15. Cormen, T.H., Leiserson, C.E., Rivest, R.L.: Introduction to Algorithms. MIT Press, Cambridge (1990)
16. Do, H., Rothermel, G., Kinneer, A.: Prioritizing JUnit Test Cases: An Empirical Assessment and Cost-Benefits Analysis. An International Journal Empirical Software Engineering 11(1), 33–70 (2006)
17. Agrawal, H., DeMillo, R.A., Hathaway, R., Hsu, W., Hsu, W., Krauser, E.W., Martin, R.J., Mathur, A.P., Spafford, E.H.: Design of mutant operators for the C programming language. Technical Report SERC-TR-41-P, Software Engineering Research Center, Purdue University, West Lafayette, Indiana, USA (March 1989)
Appendix: Greedy Algorithm on the Key-Lock Problem

 1  INPUT M, where M is a matrix with (x + 1) rows and (y + 1) columns.
 2  O := [], an empty array to store the selected keys as GA’s output
 3  numO := 0, a variable to count the number of selected keys
 4  WHILE (x > 1), DO BEGIN
 5    maxOLocks := 0
 6    bestRowIndex := 0
 7    bestKeyID := 0
 8    FOR i = 1 to x, DO BEGIN
 9      nOLocks := 0
10      FOR j = 1 to y, DO BEGIN
11        IF M[i][j] = 1, THEN
12          nOLocks := nOLocks + 1
13      END
14      IF nOLocks > maxOLocks, THEN BEGIN
15        maxOLocks := nOLocks
16        bestRowIndex := i
17        bestKeyID := M[bestRowIndex][y + 1]
18      END
19    END
20    arrOpenedLocks := []
21    FOR j = 1 to y, DO BEGIN
22      IF M[bestRowIndex][j] = 1 THEN
23        FOR i := 1 to x + 1, DO BEGIN
24          append M[i][j] to arrOpenedLocks
25        END
26    END
27    Remove arrOpenedLocks from matrix M
28    Remove M[bestRowIndex][] from matrix M
29    x := x − 1
30    y := y − maxOLocks
31    O[numO] := bestKeyID
32    numO := numO + 1
33  END
34  OUTPUT O
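For readers who prefer executable code, the following is one possible Python rendering of the pseudocode above; it is not the Java program used in the experiments. It assumes that key labels are passed as a separate list rather than stored in column y + 1, and that the loop stops once no keys or no locks remain. It also adds a hypothetical `strict` flag that switches the comparison on line 14 from > to ≥, mimicking the seeded fault V3 of Table 1.

```python
from typing import List


def greedy_key_lock(matrix: List[List[int]], key_ids: List[str],
                    strict: bool = True) -> List[str]:
    """Return key IDs in greedy selection order.

    matrix[i][j] == 1 means key i opens lock j.
    strict=False replaces '>' by '>=' (an assumed rendering of fault V3).
    """
    rows = [list(r) for r in matrix]  # work on copies; inputs stay untouched
    ids = list(key_ids)
    selected: List[str] = []
    while rows and rows[0]:  # keys remain and locks remain
        max_o_locks, best_row = 0, -1
        for i, row in enumerate(rows):
            n_o_locks = sum(1 for cell in row if cell == 1)
            if strict:
                better = n_o_locks > max_o_locks
            else:
                better = n_o_locks >= max_o_locks
            if better:
                max_o_locks, best_row = n_o_locks, i
        if best_row < 0:
            break  # assumption: stop when no key opens a remaining lock
        opened = {j for j, cell in enumerate(rows[best_row]) if cell == 1}
        selected.append(ids[best_row])
        # Remove the selected key's row and every lock column it opens.
        del rows[best_row]
        del ids[best_row]
        rows = [[c for j, c in enumerate(r) if j not in opened] for r in rows]
    return selected


# MKL from Section 4 (rows k1..k5, columns l1..l9).
M_KL = [
    [1, 0, 1, 0, 0, 0, 1, 0, 0],  # k1
    [1, 0, 0, 1, 0, 1, 0, 0, 0],  # k2
    [0, 1, 0, 0, 0, 0, 1, 1, 0],  # k3
    [1, 0, 1, 0, 0, 0, 1, 1, 0],  # k4
    [0, 0, 1, 0, 1, 0, 0, 0, 1],  # k5
]
KEYS = ["k1", "k2", "k3", "k4", "k5"]

correct_order = greedy_key_lock(M_KL, KEYS)
mutant_order = greedy_key_lock(M_KL, KEYS, strict=False)
print(correct_order)  # ['k4', 'k2', 'k5', 'k3']
print(mutant_order)   # ['k4', 'k5', 'k2', 'k3']
```

On this input the ≥ variant selects the same set of keys but in a different order, which illustrates why relations that constrain the order of selected keys can expose faults that a check on the selected set alone would miss.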
A Framework for Defect Prediction in Specific Software Project Contexts

Dindin Wahyudin¹, Rudolf Ramler², and Stefan Biffl¹

¹ Institute for Software Technology and Interactive Systems, Vienna University of Technology, Favoritenstr. 9/188, A-1040 Vienna, Austria
{dindin,stefan.biffl}@ifs.tuwien.ac.at
² Software Competence Center Hagenberg, Softwarepark 21, A-4232 Hagenberg, Austria
[email protected]
Abstract. Software defect prediction has drawn the attention of many researchers in empirical software engineering and software maintenance due to its importance in providing quality estimates and in identifying the needs for improvement from a project management perspective. However, most defect prediction studies seem valid primarily in a particular context, and little attention is given to how to find out which prediction model is well suited for a given project context. In this paper we present a framework for conducting software defect prediction as an aid for the project manager in the context of a particular project or organization. The framework has been aligned with practitioners’ requirements and is supported by our findings from a systematic literature review on software defect prediction. We provide a guide to the body of existing studies on defect prediction by mapping the results of the systematic literature review to the framework.

Keywords: Software Defect Prediction, Systematic Literature Review, Metric-based Defect Prediction.
Z. Huzar et al. (Eds.): CEE-SET 2008, LNCS 4980, pp. 261–274, 2011. © IFIP International Federation for Information Processing 2011

1 Introduction

Software defect prediction has attracted considerable attention from researchers as well as practitioners due to the increasing importance of software products as the backbone for reliable industry systems. The rationale for identifying defective components of a software system prior to applying analytical quality assurance (QA) measures like inspection or testing has been summarized by Nagappan et al.: “During software production, software quality assurance consumes a considerable effort. To raise the effectiveness and efficiency of this effort, it is wise to direct it to those which need it most. We therefore need to identify those pieces of software which are the most likely to fail – and therefore require most of our attention.” [17]

A wide range of studies provide evidence about successful prediction of defects, and various scenarios on how to exploit defect prediction have been proposed, for example, focusing testing and QA activities, making informed release decisions, mitigating risks, allocating resources in maintenance planning, and supporting process improvement efforts.
These studies also provide valuable advice and share lessons learned important for those who want to adopt defect prediction in practice. Currently there are many approaches to perform defect prediction [9] and respective validation methods [4, 20]. However, Koru et al. [8] advise that in practice, the most appropriate prediction method has to be selected for the current project context and the type of defect pattern to be predicted. Thereby, a good defect prediction model has to be constructed using a set of predictor variables that represents the actual measures of the software product and process [14, 15, 26]. Furthermore, several measures to evaluate the quality of a prediction are recommended, e.g. [13], and calibrating the prediction model to align false alarm rates with prediction goals and business scenarios is recommended [12]. Despite the many findings and the comprehensive information provided by the existing studies, there still is a wide gap between published research results and their adoption in real-world projects. Studies sharing insights about the application of defect prediction in practice are rare. Li et al. [10] discuss experiences and results from initiating defect prediction at ABB Inc. for product test prioritization and maintenance resource planning. Ostrand et al. [21] describe automating algorithms for the identification of fault-prone files to support the application of defect prediction in a wide range of projects. These studies show that in many cases, research results on defect prediction cannot directly be translated to practice. Adaptation and interpretation in the context of a particular project or organization is required. Furthermore, many studies focus on specific research questions. While these studies provide a valuable contribution to defect prediction research, this contribution remains an isolated piece of a bigger picture without following the entire track of research. 
The objective of this paper is to provide a guide to the body of existing studies on defect prediction to facilitate the use of systematic defect prediction in the context of a particular project or organization. Thus, common requirements for defect prediction in practice are outlined in Section 2; from these requirements, a generic framework for conducting defect prediction in practice is derived in Section 3. Findings from 12 published studies on defect prediction have been distilled in a systematic literature review described in Section 4. The results are presented within the structure of the proposed framework in Section 5. Section 6 summarizes the paper and discusses questions for future work.
2 Requirements for Defect Prediction in a Software Project

A number of empirical studies provide evidence of successful prediction of defects using data from real-world projects conducted in an industrial or open-source context. However, practitioners are confronted with additional requirements when they try to replicate the success of these studies within the context of their specific projects and organizations. We compiled a list of typical requirements encountered when conducting defect prediction in practice from both the existing body of literature and our own experience on predicting defects, e.g., [26].

• Aligning defect prediction with project and business goals. Empirical studies tend to focus on prevalent research questions. Practitioners, however, have to align defect prediction with the goals of their specific project. Concentrating testing on defect-prone components or planning the effort for maintenance
activities are examples of such goals. Defining the goals first is therefore an important requirement, as an appropriate budget has to be allocated for defect prediction and, moreover, the investment has to be justified according to estimated savings and benefits.
• Creating a project-specific prediction model. Prediction models are constructed from a project’s historical data. A prediction model, thus, models the context of a particular project. As a consequence, predictors obtained from one project are usually not applicable to other projects. Nagappan et al. [18], for example, showed that predictors are accurate only when obtained from the same or similar projects and that there is no single set of metrics that is applicable to all projects. These findings were supported by Koru and Liu [8] when analyzing the PROMISE repository containing data about projects conducted at different sites. “Normally, defect prediction models will change from one development environment to another according to specific defect patterns.” [8]
• Evaluating the feasibility in the project or organizational context. Despite the success reported by many studies, the prediction of defects in a particular project may not be possible. Typical reasons are the poor quality of the available data [6] or the effort required to extract and collect the necessary data [23]. Most published studies report solely successful cases of defect prediction. Only a few studies point toward limitations; for example, Li et al. [10] comment on the poor accuracy in predicting field defects for one of the studied products. The feasibility of predicting defects has to be estimated early to confirm that the defined goals will be met.
• Striving for fast results. Even when the feasibility is positively evaluated, defect prediction is required to produce results fast.
Defect prediction is relatively new in the software development arena, and practitioners face a high level of uncertainty concerning the return on the investment in defect prediction. Thus, when results cannot be obtained within one or a few iterations, the chance that defect prediction will be applied in a real-world project is low. The general concerns of practitioners have also been described by Ostrand et al. [21]: “In our experience, practitioners won't even consider using a new technology without evidence that it has worked on a substantial number of real systems of varying types. It is very unlikely that practitioners will be convinced that a new tool is worth learning and evaluating merely on the basis of its demonstration on toy systems or on systems much smaller than the ones they normally develop and maintain.”
• Dealing with incomplete, insufficient data. Extraction and integration of data from corporate databases and repositories is a costly, time-consuming endeavor and, eventually, does not assure the data is of appropriate quality. Thus, Li et al. [10] observe that “dealing with missing/incomplete information is important to practitioners because information is often not available in real-world settings and conducting analysis without important categories of predictors (e.g. deployment and usage predictors for field defect predictions) jeopardizes the validity and accuracy of results”.
• Predicting under uncertainty. Fenton and Neil [2] remind that “Project managers make decisions about software quality using best guesses; it seems to us that will always be the case and the best that researchers can do is 1) recognize
this fact and 2) improve the ‘guessing’ process. We, therefore, need to model the subjectivity and uncertainty that is pervasive in software development.” Uncertainty exists besides limitations resulting from incomplete, insufficient data. It often arises about how the data has to be interpreted, which reflects the peculiarities of a project such as individual project regulations, discontinuities in workflows and processes, or specific use of tools. Practitioners therefore rely on expert judgment and have to make assumptions. These assumptions should be made explicit and, as a positive side-effect, the prediction model should provide information to verify these assumptions.
• Outsourcing of model creation. Ostrand et al. [21] found that “it is very time consuming to do the required data extraction and analysis needed to build the models, and few projects have the luxury of extra personnel to do these tasks or the extra time in their schedules that will be needed. In addition, statistical expertise was needed to actually build the models, and that is rare to find on most development projects”. As a consequence, it should be possible to organize data extraction and model creation separately so it can be outsourced or, if tool support permits, automated.
• Reusing and validating the existing model for upcoming releases. To optimize the return on the investment in model creation, the model has to be reused for upcoming releases with minimal additional effort. However, over time, the project’s context and the defect patterns can change. As a consequence, prediction results for a new release derived from a model created and verified with historical data have to be validated. Practitioners need a measure of reliability when they make decisions based on prediction results.
Furthermore, Koru and Liu [8] point out that “as new measurement and defect data become available, you can include them in the data sets and rebuild the prediction model.” As adjusting or rebuilding the model requires additional effort, the validation results should serve as an indicator when adjusting or rebuilding becomes necessary.
3 Software Defect Prediction Framework

In this section we describe a framework for software defect prediction which consists of three phases – (A) preparation, (B) model creation and (C) model usage – as well as seven steps (see Figure 1). This framework is in line with the requirements outlined in the previous section and has been derived from our experience and literature on software defect prediction and software estimation.

3.1 Phase A – Preparation

As the first phase in conducting defect prediction, one should start by preparing the necessary preconditions prior to model construction. The intention of the preparation phase is to create a clear focus on what results should be provided by the prediction, to appropriately design the prediction approach, and to quickly analyze whether such a design will accomplish the expected results within the project and organizational context. Following the Goal Question Metric (GQM) model proposed by Basili et al. [1], we structure the first phase with the following steps:
Fig. 1. Framework for Software Defect Prediction
A.1 Define defect prediction goal, which represents the objective of defect prediction with respect to a particular stakeholder perspective and the current project context.
A.2 Specify questions and hypotheses. Questions are derived from the defect prediction goals. They are used to identify relevant models of the objects of study and, then, to more precisely define the expected achievement of a specific goal. The questions can be reframed as hypotheses about the observed situation or defect pattern. We recommend specifying hypotheses that are easily measurable to enable the falsification or acceptance of the hypotheses for a sound assessment of the prediction results.
A.3 Quick feasibility study and variables specification. A quick feasibility study is essential to assess whether the initial goals of the prediction can be achieved using the available data from the observation objects. A negative assessment indicates the initial goals are not feasible and shows the need for adjusting the goals and questions. After conducting the feasibility study, the set of metrics that should be collected and estimated in the prediction model is specified. These metrics act as independent variables and dependent variables in the prediction model to be constructed in the next phase.

3.2 Phase B – Model Construction

Constructing the prediction model is the core phase in defect prediction. Here, based on the variables and the defect prediction method specified in the previous phase, data collection, model training, and model evaluation are performed.
B.1 Data collection for model training. As part of a close investigation of the available data sources, the period of observation and relevant project repositories and databases are specified. Based on the previously selected variables, the data is collected from the observation objects. Invalid and missing data is thereby filtered or refined. For making a sound prediction, potential threats to validity are recorded.
B.2 Prediction model training. Parameter selection is used to identify the parameters with a significant impact on the dependent variables. These parameters are used in training the model, usually applying standard statistical or machine learning tools.
B.3 Prediction model validation. The trained model needs to be validated for its performance, i.e., accuracy, recall and precision. Unsatisfactory results should trigger a feedback loop back to the data collection step, as it will not make sense to proceed with a low-performance model that, e.g., has a high number of false positives or errors.

3.3 Phase C – Model Usage

A major concern from a practitioner’s point of view is that many studies report a trained defect prediction model that shows good performance by means of cross
validation with historical data [2]. Only a few studies report the robustness of the model with different observations. This, however, is a necessity in practical usage for predicting the quality for a certain time period in the future.
C.1 Project defect prediction. In this step the model trained in the previous phase is actually used, i.e. the model is parameterized with observations from new releases to predict defects in these releases.
C.2 Analysis for prediction model robustness. Based on the results of step C.1, the robustness of the model is analyzed. Thereby, the reliability of the current prediction results is estimated to determine how to apply the prediction results in the project, e.g., to safely rely on them or to be careful. If the analysis indicates low reliability, a feedback loop back to re-creating or calibrating the model should be triggered, and suggestions for refinement of the prediction hypotheses should be provided.
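Steps B.3 and C.2 both hinge on measuring prediction performance and deciding whether to loop back. A minimal Python sketch of that check follows; the function names and the threshold values are assumptions made for illustration, not part of the framework, and a real project would derive its thresholds from the goals defined in step A.1.

```python
from typing import Dict, Sequence


def evaluate_predictions(actual: Sequence[bool],
                         predicted: Sequence[bool]) -> Dict[str, float]:
    """Compute accuracy, precision and recall for defect-proneness labels.

    actual[i] is True if module i really was defective,
    predicted[i] is True if the model flagged it as defect-prone.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)          # true positives
    fp = sum(1 for a, p in pairs if not a and p)      # false alarms
    fn = sum(1 for a, p in pairs if a and not p)      # missed defects
    tn = sum(1 for a, p in pairs if not a and not p)  # true negatives
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def needs_feedback_loop(metrics: Dict[str, float],
                        min_precision: float = 0.6,
                        min_recall: float = 0.6) -> bool:
    """True if the model should go back to data collection or recalibration.

    The default thresholds are placeholders, not recommended values.
    """
    return (metrics["precision"] < min_precision
            or metrics["recall"] < min_recall)


# Hypothetical outcome for eight modules of a new release.
actual = [True, True, False, False, True, False, False, False]
predicted = [True, False, True, False, True, False, False, False]

metrics = evaluate_predictions(actual, predicted)
print(metrics["accuracy"])                          # 0.75
print(needs_feedback_loop(metrics))                 # False
print(needs_feedback_loop(metrics, min_recall=0.8)) # True
```

The same check can be run once on held-out historical data (step B.3) and again after each release when actual defect data becomes available (step C.2), so that a drop in precision or recall signals when rebuilding the model is warranted.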
4 Review of the Body of Literature on Defect Prediction

Numerous empirical studies on software defect prediction have been published in journals and conference proceedings. In order to provide a systematic guide to the existing body of literature, relevant studies have been searched and selected following the approach for a systematic literature review proposed by Kitchenham et al. [5]. By following this approach we identified 12 studies on defect prediction providing findings applicable within the framework outlined above.
A systematic literature review is defined as “a form of secondary study that uses a well-defined methodology to identify, analyze and interpret all available evidence related to a specific research question in a way that is unbiased and (to a degree) repeatable” [5]. Staples and Niazi [24] summarize the characteristics of a systematic literature review: (a) a systematic review protocol defined in advance of conducting the review, (b) a documented search strategy, (c) explicit inclusion and exclusion criteria to select relevant studies from the search results, (d) quality assessment mechanisms to evaluate each study, (e) review and cross-checking processes to control researcher bias.
A key element of a systematic literature review is the review protocol, which documents all other elements constituting the systematic literature review. They include the research questions, the search process, the inclusion and exclusion criteria, and the quality assessment mechanisms.
• Research Questions. The research questions summarize the questions frequently addressed in empirical studies. These questions contribute essential findings from research to the application of defect prediction in practice and are mapped to the phases of the framework. According to the framework, we emphasize three research questions to guide the systematic literature review process:
RQ1. How do successful studies in defect prediction design the prediction process prior to model construction?
RQ2. How do successful studies in defect prediction construct the prediction model from collected data?
RQ3. How can external validation of the prediction model be provided for future predictions?
• Search Process. The search process describes how the list of candidate studies is identified. Following the search process advocated by Kitchenham et al. [7], it was organized into two separate phases. The initial search phase identified candidate primary studies based on searches of electronic digital libraries from IEEE, ACM, Elsevier, Springer, and Wiley. Search strings were composed from search terms such as defect, error, fault, bug, prediction, and estimation. The secondary search phase reviewed the references in each of the primary studies identified in the first phase, looking for more candidate primary sources; this was repeated until no further relevant papers could be found.
• Inclusion and Exclusion Criteria. The criteria for including a primary study comprised any study that compared software defect predictions and enabled metric-based approaches based on the analysis of project data. We excluded studies where data was collected from a small number of observations (fewer than 5 observations). We also excluded studies where models were constructed based only on historical defect data with no other metrics as predictor variables. As a third exclusion criterion, we only considered studies that performed internal validation and external validation of the constructed prediction model. Formal inclusion criteria are that papers have to be peer reviewed and document empirical research. Regarding the contents, inclusion requires that the study addresses at least one of the defined research questions.
• Quality Assessment Mechanism. This systematic literature review has been based on a documented and reviewed protocol established in advance of the review. Furthermore, two researchers were involved in conducting the systematic literature review and in cross-validating the results. For example, one researcher queried a digital library and extracted candidate studies while the second researcher verified the search terms, search results, and the list of identified candidate studies. Thereby we minimized researcher bias and assured the validity of the findings of the review.
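The two-phase search process described above can be sketched as follows. The pairing of search terms into quoted phrases, the helper names, and the toy citation graph are assumptions for illustration only; the review protocol does not state how the search strings were actually combined.

```python
from itertools import product

# Search terms named in the review protocol.
SUBJECT_TERMS = ["defect", "error", "fault", "bug"]
ACTIVITY_TERMS = ["prediction", "estimation"]


def compose_search_strings(subjects, activities):
    """Initial phase: pair every subject term with every activity term
    (an assumed way of composing the strings)."""
    return [f'"{s} {a}"' for s, a in product(subjects, activities)]


def snowball(initial_studies, references_of, is_relevant):
    """Secondary phase: follow references of selected studies until no
    further relevant study is found."""
    selected = set(initial_studies)
    frontier = list(initial_studies)
    while frontier:
        study = frontier.pop()
        for ref in references_of(study):
            if ref not in selected and is_relevant(ref):
                selected.add(ref)
                frontier.append(ref)
    return selected


search_strings = compose_search_strings(SUBJECT_TERMS, ACTIVITY_TERMS)

# Hypothetical citation graph: study identifier -> cited studies.
citations = {"A": ["B", "C"], "B": ["D"], "C": [], "D": ["A"]}
found = snowball(["A"], lambda s: citations.get(s, []), lambda s: s != "C")
```

The loop terminates because each study is added to the selected set at most once, which mirrors the stopping rule of repeating the reference review until no further relevant papers are found.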
5 Extraction of Findings and Discussion

This section maps the findings from the systematic literature review to the phases and tasks of the framework for defect prediction. The findings summarize the contributions extracted from the studies with respect to the research questions 1 to 3 used to drive our systematic literature review.
Table 1 lists information about how current research defines the goals of defect prediction studies, questions and hypotheses, as well as how variables are specified to describe each question. Note that Explicit means the study describes the respective term (goal, hypotheses, etc.) clearly as a separate part from the surrounding text and adheres to our term definitions in the framework. Implicit means we need to extract the information from the text to identify a term definition. N/A indicates that the study contains no information that defines the expected term.
Table 1. Study Related Factors – Preparation Phase

Study | A.1 Goal definition | A.2 Research questions | A.3 Variables specification
Moser et al. [14] | Goal is implicitly described | Questions proposed with respective null hypotheses | Implicit variables specifications to predict module defect proneness
Li et al. [9] | Goal is implicitly described | Explicit research question with no hypotheses | Explicit variables specification to predict defect intensity of a release
Zimmermann et al. [28] | Goal is implicitly described | Implicit research question with no hypotheses | Implicit variables specifications to predict module defect proneness
Koru et al. [8] | Implicit goal description | Implicit research question with no hypotheses | Implicit variables specifications to predict module defect proneness
Nagappan et al. [16] | Implicit goal description | Explicit research hypotheses | Explicit variables specification
Li et al. [11] | Goal is implicitly described | Explicit research question with no hypotheses | Explicit variables specification to predict defect intensity of a release
Weyuker et al. [27] | Explicit goal description in later section | Implicit research questions with hypotheses | Implicit variable specification to predict file defect proneness
Menzies et al. [13] | Implicit goal description | Implicit research question, hypotheses described later in the paper | Explicit variables specification for module defect proneness
Graves et al. [3] | Goal is implicitly described | Implicit research questions with no hypotheses | Explicit variables specification for module defect proneness
Sunghun et al. [25] | Implicit goal description | Explicit research hypotheses | Explicit variables specification of file defect proneness
Pai et al. [22] | Implicit goal description | Explicit research question with no hypotheses | Explicit variables specification for number of defects per class and class defect proneness
Olague et al. [19] | Explicit goal statement | Explicit research hypotheses to describe proposed goal | Implicit variable specification to predict class defect proneness
Most of the studies do not explicitly describe the goal of the study, and no study identifies the target stakeholders of the results together with their value expectations. 7 out of 12 studies explicitly stated the research questions and/or respective hypotheses, which provide guidance for the remaining empirical study process. Most of the studies specified the variables as part of the prediction model construction prior to data collection. Thus, we assert that the first phase in our framework, which consists of goal definition, research questions and hypotheses formulation, and variable specification, is a common practice in conducting defect prediction, with different levels of detail and presentation.
Table 2. Study Related Factors – Model Construction

Study | B.1 Variable selection | B.2 Prediction methods | B.3 Internal validation | B.3 Model performance measures
Moser et al. [14] | Product and process metrics with no prior selection | Naive Bayes, Logistic regression and J48 | 10 Fold cross validation | Number of false positives and recall
Li et al. [9] | Product and process metrics with prior variable selection | 16 modeling methods | N/A | Average relative error
Zimmermann et al. [28] | Product metrics with selection by Spearman bivariate correlation analysis | Naive Bayes, Logistic regression and J48 | 10 Fold cross validation | Accuracy, recall and precision
Koru et al. [8] | Product (design) metrics with no prior selection | J48 | 10 Fold cross validation | Recall, precision and F-measure
Nagappan et al. [16] | Process (code churn) metrics with selection by Spearman correlation | Multiple regression, step-wise regression and Principal Component Analysis (PCA) | Coefficient of determination analysis, F-test | Discriminant analysis
Li et al. [11] | Product and process metrics with no prior selection | 16 modeling methods | N/A | Average relative error
Weyuker et al. [27] | Product and process (developer) metrics | Negative binomial regression | N/A | Correctly identified files
Menzies et al. [13] | Product (static code) metrics | Naive Bayes with log transform, J48, OneR | 10 Fold cross validation | Accuracy, number of false positives, receiver operator curves
Graves et al. [3] | Product (changes code) metrics | General linear models | N/A | Error measure
Sunghun et al. [25] | Process (change) metrics with no variable selection | FixCache prediction method | Cross validation for all data set | Accuracy
Pai et al. [22] | Product metrics with variable selection by correlation analysis and backward linear regression | Multiple linear regression, Logistic regression, Bayesian network model | 10 Fold cross validation | False positive rate, precision, specificity, sensitivity
Olague et al. [19] | Product (object-oriented) metrics with selection by Spearman bivariate | Univariate and multivariate binary logistic regression | Hold out method | Percentage of correctly classified classes
Table 3. Study Related Factors – Model Usages

Study | C.1 External validation | C.2 Robustness analysis
Moser et al. [14] | Cross validation with different releases with low performance results | N/A
Li et al. [9] | Constructed models were used to predict a certain period of defect growth per release | Proposed framework was used for commercial context [11]
Zimmermann et al. [28] | Cross validation of trained prediction model in different releases and levels of observation | N/A
Koru et al. [8] | Cross validation of trained prediction model with different class of data | Depicts the need for model calibration or refinement
Nagappan et al. [16] | Stated briefly with no data | N/A
Li et al. [11] | Cross validation with different releases | N/A
Weyuker et al. [27] | N/A | N/A
Menzies et al. [13] | N/A | N/A
Graves et al. [3] | N/A | N/A
Sunghun et al. [25] | N/A | N/A
Pai et al. [22] | N/A | N/A
Olague et al. [19] | N/A | N/A
Table 2 outlines the collected data regarding common practices to construct prediction models. Most of these studies used variable selection prior to model construction. Methods such as Spearman bivariate correlation analysis and linear regression with selected methods (backward, stepwise, remove) are considered common methods for variable selection. The selection of prediction methods depends on the kind of defect pattern to be predicted, i.e., classification techniques such as logistic regression can be used to predict file defect-proneness but will perform poorly in predicting file defect rates. Similar to prediction method selection, one should also choose appropriate internal validation methods and model performance measures.
The first two phases in our framework have been identified as commonly performed by researchers in defect prediction. However, for the third phase, Model Usages (see Table 3), we found only two studies providing appropriate results of the two involved steps. This finding confirms the critique from Fenton and Neil [2] that most of the existing studies on defect prediction do not provide empirical proof whether the model can be generalized for different observations. There are several reasons why many studies did not report the external validation and robustness analysis of the constructed prediction model, such as the availability of new observation data [22] and external validation results that signify poor performance of the model [14], which many authors do not wish to report. However, from the practitioners’ perspective such conditions should be addressed properly by refining the data collection process and calibrating the model until the model can be proven useful for prediction in a particular context.
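Variable selection by Spearman correlation, mentioned above as a common practice, can be sketched with the standard library alone. The metric names, the example values, and the selection threshold of 0.5 are assumptions for illustration; the reviewed studies each applied their own criteria.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def select_variables(metrics, defects, threshold=0.5):
    """Keep metrics whose absolute rank correlation with the defect counts
    reaches the (assumed) threshold."""
    return [name for name, values in metrics.items()
            if abs(spearman(values, defects)) >= threshold]


# Illustrative values only: one entry per module.
metrics = {"loc": [10, 20, 30, 40, 50], "comments": [3, 1, 4, 1, 5]}
defects = [0, 1, 1, 3, 4]
selected = select_variables(metrics, defects)
```

Here only the size metric is retained, since its ranks follow the defect ranks closely while the comment counts do not. A study would additionally test the significance of each correlation before relying on it.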
6 Conclusion and Further Work

Whilst a large number of studies address defect prediction, little support is provided for practitioners who want to apply defect prediction. In this paper we proposed a framework for conducting software defect prediction as an aid for the practitioner establishing defect prediction in the context of a particular project or organization and as a guide to the body of existing studies on defect prediction. The framework has been aligned with practitioners’ requirements and supported by our findings from a systematic literature review on software defect prediction. The systematic literature review also served as an initial empirical evaluation of the proposed framework by showing the co-existence of the key elements of the framework in existing research on software defect prediction. The mapping of findings from empirical studies to the phases and steps of the framework shows that the existing literature can be easily classified using the framework and verifies that each of the steps is attainable.
Nevertheless, we also found several issues relevant for applying defect prediction in practice, which are currently not adequately addressed by existing research. Related future work is encouraged in order to make software defect prediction a commonly accepted and valuable aid in practice:
• Existing studies on defect prediction neglect the fact that information is often missing or incomplete in real world settings. Practitioners therefore require methods to deal with missing or incomplete information. Li et al. reported: “We find that by acknowledging incomplete information and collecting data that capture similar ideas as the missing information, we are able to produce more accurate and valid models and motivate better data collection.” [11]
• Defect prediction remains a risky endeavor for practitioners as long as upfront investments for data collection and model construction are high and a return on these investments has to be expected late or never. Most projects and organizations cannot afford this investment under such adverse conditions. Thus, means are required to conduct an early and quick estimation of the feasibility of predicting defects with acceptable performance in the context of a specific project or organization.
• If the vision is that practitioners base critical decisions in software engineering, such as what to test less, on defect prediction results, they have to be sure not to endanger product quality by missing critical defects. Thus, in addition to the results of a prediction, an additional measure has to indicate the reliability of the prediction, so that practitioners are informed to what extent they can rely on the results and know to what extent they are taking risks if they do so.
Acknowledgements. This paper has been partly supported by the Technology-Grant-South-East-Asia No. 1242/BAMO/2005, financed by ASIA-Uninet.
References

1. Basili, V., Caldiera, G., Rombach, H.D.: The Goal Question Metric Approach. 2, 6 (1994)
2. Fenton, N., Neil, M.: A Critique of Software Defect Prediction Models. IEEE Trans. Softw. Eng. 25, 15 (1999)
3. Graves, T.L., Karr, A.F., Marron, J.S., Siy, H.: Predicting fault incidence using software change history. IEEE Transactions on Software Engineering 26, 8 (2000)
4. Khoshgoftaar, T., Bhattacharyya, B., Richardson, G.: Predicting Software Errors, During Development, Using Nonlinear Regression Models: A Comparative Study. IEEE Transactions on Reliability 41, 5 (1992)
5. Kitchenham, B.: Guidelines for performing Systematic Literature Reviews in Software Engineering (2007)
6. Kitchenham, B., Kutay, C., Jeffery, R., Connaughton, C.: Lessons learnt from the analysis of large-scale corporate databases. In: Proceedings of the 28th International Conference on Software Engineering. ACM, Shanghai (2006)
7. Kitchenham, B.A., Mendes, E., Travassos, G.H.: Cross versus Within-Company Cost Estimation Studies: A Systematic Review. IEEE Transactions on Software Engineering 33, 316–329 (2007)
8. Koru, A.G., Hongfang, L.: Building Defect Prediction Models in Practice. IEEE Softw. 22, 23–29 (2005)
9. Li, P.L., Herbsleb, J., Shaw, M.: Forecasting Field Defect Rates Using a Combined Time-Based and Metrics-Based Approach: A Case Study of OpenBSD. In: The 16th IEEE International Symposium on Software Reliability Engineering. IEEE Computer Society, Los Alamitos (2005)
10. Li, P.L., Herbsleb, J., Shaw, M., Robinson, B.: Experiences and results from initiating field defect prediction and product test prioritization efforts at ABB Inc. In: Proceedings of the 28th International Conference on Software Engineering. ACM, Shanghai (2006)
11. Li, P.L., Herbsleb, J., Shaw, M., Robinson, B.: Experiences and results from initiating field defect prediction and product test prioritization efforts at ABB Inc. In: The 28th International Conference on Software Engineering. ACM, Shanghai (2006)
12. Menzies, T., Di Stefano, J., Ammar, K., McGill, K., Callis, P., Chapman, R., Davis, J.: When Can We Test Less? In: Proceedings of the 9th International Symposium on Software Metrics. IEEE Computer Society, Los Alamitos (2003)
13. Menzies, T., Greenwald, J., Frank, A.: Data Mining Static Code Attributes to Learn Defect Predictors. IEEE Transactions on Software Engineering 33, 2–13 (2007)
14. Moser, R., Pedrycz, W., Succi, G.: A Comparative Analysis of the Efficiency of Change Metrics and Static Code Attributes for Defect Prediction. In: The 30th International Conference on Software Engineering. ACM, Leipzig (2008)
15. Nachtsheim, C.J., Kutner, M.H.: Applied Linear Regression Models. McGraw-Hill Education, New York (2004)
16. Nagappan, N., Ball, T.: Use of relative code churn measures to predict system defect density. In: The 27th International Conference on Software Engineering. ACM, St. Louis (2005)
17. Nagappan, N., Ball, T., Zeller, A.: Mining metrics to predict component failures. In: The 28th International Conference on Software Engineering. ACM, Shanghai (2006)
18. Nagappan, N., Ball, T., Zeller, A.: Mining metrics to predict component failures. In: Proceedings of the 28th International Conference on Software Engineering, pp. 452–461. ACM, New York (2006)
19. Olague, H.M., Etzkorn, L.H., Gholston, S., Quattlebaum, S.: Empirical Validation of Three Software Metrics Suites to Predict Fault-Proneness of Object-Oriented Classes Developed Using Highly Iterative or Agile Software Development Processes. IEEE Transactions on Software Engineering 33, 402–419 (2007)
20. Ostrand, T.J., Weyuker, E.J.: How to measure success of fault prediction models. In: Fourth International Workshop on Software Quality Assurance: in Conjunction with the 6th ESEC/FSE Joint Meeting. ACM, Dubrovnik (2007)
21. Ostrand, T.J., Weyuker, E.J., Bell, R.M.: Automating algorithms for the identification of fault-prone files. In: Proceedings of the 2007 International Symposium on Software Testing and Analysis. ACM, London (2007)
22. Pai, G.J., Dugan, J.B.: Empirical Analysis of Software Fault Content and Fault Proneness Using Bayesian Methods. IEEE Transactions on Software Engineering 33, 675–686 (2007)
23. Ramler, R., Wolfmaier, K.: Issues and Effort in Integrating Data from Heterogeneous Software Repositories and Corporate Databases. In: 2nd International Symposium on Empirical Software Engineering and Measurement (ESEM 2008), Kaiserslautern, Germany (2008) (forthcoming)
24. Staples, M., Niazi, M.: Experiences using systematic review guidelines. Journal of Systems and Software 80, 1425–1437 (2007)
25. Sunghun, K., Thomas, Z., James, E., Whitehead, J., Andreas, Z.: Predicting Faults from Cached History. In: Proceedings of the 29th International Conference on Software Engineering. IEEE Computer Society, Los Alamitos (2007)
26. Wahyudin, D., Winkler, D., Schatten, A., Tjoa, A.M., Biffl, S.: Defect Prediction using Combined Product and Project Metrics A Case Study from the Open Source “Apache” MyFaces Project Family. In: 34th EUROMICRO Conference on Software Engineering and Advanced Applications SPPI Track, Parma, Italy (2008)
27. Weyuker, E.J., Ostrand, T.J., Bell, R.M.: Using Developer Information as a Factor for Fault Prediction. In: The Third International Workshop on Predictor Models in Software Engineering. IEEE Computer Society, Los Alamitos (2007)
28. Zimmermann, T., Premraj, R., Zeller, A.: Predicting Defects for Eclipse. In: International Workshop on Predictor Models in Software Engineering, PROMISE 2007: ICSE Workshops 2007, pp. 9–9 (2007)
Meeting Organisational Needs and Quality Assurance through Balancing Agile and Formal Usability Testing Results

Jeff Winter¹, Kari Rönkkö¹, Mårten Ahlberg², and Jo Hotchkiss³

¹ Blekinge Institute of Technology, SE 37050 Ronneby, Sweden, (jeff.winter,kari.ronkko)@bth.se
² UIQ Technology, Ronneby, Sweden, [email protected]
³ Sony Ericsson Mobile Communications, Warrington, England, [email protected]
Abstract. This paper deals with a case study of testing with a usability testing package (UTUM), which is also a tool for quality assurance, developed in cooperation between industry and research. It shows that within the studied company, there is a need to balance agility and formalism when producing and presenting results of usability testing to groups who we have called Designers and Product Owners. We have found that these groups have different needs, which can be placed on opposite sides of a scale, based on the agile manifesto. This becomes a Designer and a Product Owner Manifesto. The test package is seen as a successful hybrid method combining agility with formalism, satisfying organisational needs, and fulfilling the desire to create a closer relation between industry and research.

Keywords: agility, formalism, usability, product quality, methods & tools.
Z. Huzar et al. (Eds.): CEE-SET 2008, LNCS 4980, pp. 275–289, 2011. © IFIP International Federation for Information Processing 2011

1 Introduction

Osterweil et al. [1] state that product quality is becoming the dominant success criterion in the software industry, and believe that the challenge for research is to provide the industry with the means to deploy quality software, allowing companies to compete effectively. Quality is multi-dimensional, and impossible to show through one simple measure, and they state that research should focus on identifying various dimensions of quality and measures appropriate for it and that a more effective collaboration between practitioners and researchers would be of great value. Quality is also important owing to the criticality of software systems (a view supported by Harrold in her roadmap for testing [2]) and even to changes in legislation that make executives responsible for damages caused by faulty software. One traditional approach to quality has been to rely on complete, testable and consistent requirements, traceability to design, code and test cases, and heavyweight documentation. However, the demand for continuous and rapid results in a world of continuously changing business decisions often makes this approach impractical or impossible, pointing to a need for agility. At a keynote speech at the 5th Workshop on
Software Quality, held at ICSE 2007 [3], Boehm stated that both agility and quality are becoming more and more important. Many areas of technology exhibit a tremendous pace of change, due to changes in technology and related infrastructures, the dynamics of the marketplace and competition, and organisational change. This is particularly obvious in mobile phone development, where their pace of development and penetration into the market has exploded over the last 5 years. This kind of situation demands an agile approach [4]. This paper deals with a case study of a usability test package called UIQ Technology Usability Metrics (UTUM) [5], the result of a long research cooperation between the research group “Use-Oriented Design and Development” (U-ODD) [6] at Blekinge Institute of Technology (BTH), and UIQ Technology (UIQ) [7]. With the help of Martin et al.'s study [8] and our own case study, it presents an approach to achieving quality, related to an organizational need for agile and formal usability test results. We use concepts such as “agility understood as good organizational reasons” and “plan driven processes as the formal side in testing”, to identify and exemplify a practical solution to assuring quality through an agile approach. The original aim of the study at hand was to examine how a distributed usability test could be performed, and the effect that the geographical separation of the test leaders had on the collection, analysis and presentation of the data. As often happens in case studies, another research question arose during the execution of the study: How can we balance demands for agile results with demands for formal results when performing usability testing for quality assurance? 
Here, we use the term “formal” as a contrast to the term “agile”, not because we see agile processes as being informal or unstructured, but since “formal” in this case is more representative than “plan driven” to characterise the results of testing and how they are presented to certain stakeholders. We examine how the results of the UTUM test are suitable for use in an agile process. Even though eXtreme Programming (XP) is used as an illustrative example in this article, note that there is no strong connection to any particular agile methodology; rather, there is a philosophical connection between the test and the ideas behind the agile movement. We examine how the test satisfies requirements for formal statements of usability and quality. As a result of the investigation regarding the agile and the formal, we also identify parties interested in the different elements of the test data. Our study is in reference to Martin et al’s work [8]. Our work deals with quality and the necessary balance between agility and formality from the viewpoint of “day to day organizational needs”. Improving formal aspects is important, and software engineering research in general has successfully emphasized this focus. However, improving formal aspects may not help to design the testing that most efficiently satisfies organisational needs and minimises the testing effort. The main reason for not adopting “best practice” in testing is to orient testing to meet organisational needs, based on the dynamics of customer relationships, using limited effort in the most effective way, and the timing of software releases to the needs of customers as to which features to release (as is demonstrated in [8]). Both perspectives are needed! The structure of the article is as follows. An overview of two different testing paradigms is provided. A description of the test method comes next, followed by a presentation of the study method and an analysis of the material from the case study,
examining the balance between agility and formalism, the relationship between these and quality, and the need for research/industry cooperation. The article ends with a discussion of the work, and conclusions.
2 Testing – Prevailing Models vs. Agile Testing

This section presents a brief overview of testing as seen from the viewpoints of the software engineering community and the agile community. As quality becomes a dominant success factor for software, the practitioner’s use of processes to support software quality will become increasingly important. Testing is one such process, performed to support quality assurance, and provide confidence in the quality of software, and an emphasis on software quality requires improved testing methodologies that can be used by practitioners to test their software [2]. This section therefore briefly discusses the field of testing, in connection with the fact that the test framework can be seen as an agile testing methodology.
Within software engineering, there are many types of testing, in many process models (e.g. the Waterfall model [9], Boehm’s Spiral model [10]). Testing has been seen as phase based, and the typical stages of testing (see e.g. [11], [12]) when developing large systems are Unit testing, Integration testing, Function testing, Performance testing, Acceptance testing, and Installation testing. The stages from Function testing and onwards are characterised as System Testing, where the system is tested as a whole rather than as individual pieces [12]. Unit testing, which should be performed in a controlled environment, verifies that a component functions properly with the expected types of input. Integration testing ensures that system components work together as described in the specifications. After this testing, the system has been merged into a working system, and system testing can begin. System testing begins with Function testing, where the system is tested to ensure that it has the desired functionality, and evaluates whether the integrated system performs the functions described in the requirements specification.
A performance test compares the system with the rest of the software and hardware requirements, and after the performance test, the system is regarded as being a validated system. In an acceptance test, the system is tested together with the customer, in order to check it against the customer’s requirements description, to ensure that it works in accordance with customer expectations. When Acceptance testing is completed, the accepted system is installed in its proper environment, and in order to ensure that it functions as it should, an installation test is run [12]. Usability testing (otherwise named Human Factors Testing), which we are concerned with here, has been characterised as investigating requirements dealing with the user interface, and has been regarded as a part of Performance testing [12]. This is an example of the prevailing approach to testing, reliant on formal aspects and best practice. Agile software development radically changes how software development organisations work, especially regarding testing [13]. In agile development, exemplified here by XP [14], one of the key tenets is that testing is performed continuously by developers. In XP, the tests should be isolated, i.e. should not interact with the other tests that are written, and should preferably be automatic, although a recent study of testing practice in a small organisation has shown that not all
companies applying XP automate all tests [8]. Tests come from two sources, from programmers and customers, who both create tests that serve, through continuous testing, to increase their confidence in the operation of the program. Customers write, or specify, functional tests to show that the system works in the way they expect it to, and developers write unit tests to ensure that the programs work the way they think that they work. Unit and functional tests are the main testing methods in XP, but can be complemented by other types of tests when necessary. Some XP teams may have dedicated testers, who help customers translate their test needs into tests, who can help customers create tools to write run and maintain their own tests, and who translate the customer’s testing ideas into automatic, isolated tests [14]. The role of the tester is a matter of debate. In both of the above paradigms it is primarily developers who design and perform testing, albeit occasionally at the request of the customer. However, within industry, there are seen to be fundamental differences between the people who are “good” testers and those who are good developers. The role of the tester as described above assumes that the tester is also a developer, even when teams use dedicated testers. Within industry, however, it is common that the roles are clearly separated, and that testers are generalists with the kind of knowledge that users have, who complement the perspectives and skills of the testers. A good tester can have traits that are in direct contrast with the traits that good developers need (see e.g. Pettichord [15] for a discussion regarding this). Pettichord claims that good testers think empirically in terms of observed behaviour, and must be encouraged to understand customers’ needs. As can be seen in the above, although there are similarities, there are substantial differences in the testing paradigms, how they treat testing, and the role of the tester and test designer. 
In our testing, the test leaders are specialists in the area of usability and testing, and generalists in the area of the product and process as a whole. There is a body of knowledge concerning usability testing, much of it in the field of Human Computer Interaction, but we have chosen not to look more closely at this. In this paper we concentrate on the studied company’s organizational needs and the philosophical connection between the test and the ideas behind the agile movement.
3 The UTUM Test Package
UTUM is a usability test package for mass market mobile devices, and is a tool for quality assurance, measuring usability empirically on the basis of metrics for satisfaction, efficiency and effectiveness, complemented by a test leader’s observations. Its primary aim is to measure usability, based on the definition in ISO 9241-11, where usability is defined as “the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use” [16]. This is similar to the definition of quality in use defined in ISO 9126-1, where usability is instead defined as understandability, learnability and operability [17]. The intention of the test is also to measure “The User Experience” (UX), which is seen as more encompassing than the view of usability that is contained in e.g. the ISO standards [5], although it is still uncertain how UX differs from the traditional usability perspective [18] and exactly how UX should be defined (for some definitions, see e.g. [19-21]).
Meeting Organisational Needs and Quality Assurance
In UTUM testing, one or more test leaders carry out the test according to predefined requirements and procedure. The test itself takes place in a neutral environment rather than a lab, in order to put the test participant at ease. The test is led by a test leader, and it is performed together with one tester at a time. The test leader welcomes the tester, and the process begins with the collection of some data regarding the tester and his or her current phone and typical phone use. Whilst the test leader is then preparing the test, the tester has the opportunity to get acquainted with the device to be tested, and after a few minutes is asked to fill in a hardware evaluation, a questionnaire regarding attitudes to the look and feel of the device. The next step is to perform a number of use cases on the device, based on the tester’s normal phone use or organisational testing needs. Whilst this is taking place, the test leader observes what happens during the use case performance, and records these observations, the time taken to complete the use cases, and answers to follow-up questions that arise. After the use case is complete, the tester is asked to answer questions about how well the telephone lets the user accomplish the use case. The final step in the test, when all of the use cases are completed, is a questionnaire about the user’s subjective impressions of how easy the interface is to use. This is based on the System Usability Scale (SUS) [22], and it expresses the tester’s opinion of the phone as a whole. The tester is finally thanked for their participation in the test, and is usually given a small gift, such as a cinema ticket, to thank them for their help. After testing, the data obtained are transferred to spreadsheets. These contain both quantitative data, such as use case completion times and attitude assessments, and qualitative data, like comments made by testers and information about problems that arose. 
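The final questionnaire described above is based on SUS. As an illustration only, the following sketch shows the conventional SUS scoring rule (ten items answered on a 1 to 5 scale, giving a score from 0 to 100). It is an assumption that UTUM scores its adapted questionnaire in the same way; the function name `sus_score` is hypothetical and does not come from the test package.

```python
def sus_score(responses):
    """Return a SUS score (0-100) from ten responses on a 1-5 scale.

    Conventional SUS scoring, shown as an assumption about how the
    UTUM questionnaire could be scored, not as the package's own code.
    """
    if len(responses) != 10:
        raise ValueError("SUS needs exactly ten item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for index, response in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded: contribute (response - 1).
            total += response - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded: contribute (5 - response).
            total += 5 - response
    # The summed contributions (0-40) are scaled to the 0-100 range.
    return total * 2.5
```

A tester who answers 3 on every item, for example, receives a score of 50, the midpoint of the scale.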
The data is used to calculate metrics for performance, efficiency, effectiveness and satisfaction, and the relationships between them, leading to a view of usability for the device as a whole. The test leader is an important source of data and information in this process, as he or she has detailed knowledge of what happened during testing.
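The paper does not give the formulas behind these metrics. The sketch below is therefore only one plausible reading, in the spirit of ISO 9241-11: effectiveness as the share of use cases completed, efficiency as the mean completion time of the completed use cases, and satisfaction as the mean attitude rating. The record type `UseCaseResult`, its fields and the function `summarise` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class UseCaseResult:
    """One tester's attempt at one use case (hypothetical record layout)."""
    completed: bool   # did the tester finish the use case?
    seconds: float    # time taken, as recorded by the test leader
    rating: int       # attitude rating after the use case, 1 (poor) to 5 (good)


def summarise(results):
    """Compute simple usability metrics; assumed definitions, not UTUM's own."""
    if not results:
        raise ValueError("no results to summarise")
    done = [r for r in results if r.completed]
    return {
        # Effectiveness: fraction of attempts that were completed.
        "effectiveness": len(done) / len(results),
        # Efficiency: mean time for completed attempts (None if none completed).
        "mean_time_s": mean(r.seconds for r in done) if done else None,
        # Satisfaction: mean rating over all attempts.
        "satisfaction": mean(r.rating for r in results),
    }
```

Keeping the three measures separate, rather than collapsing them into one number, leaves room for the test leader's observations to explain why, for instance, a quick use case was still rated poorly.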
[Figure 1: diagram in which interested parties influence the UTUM test, and the UTUM test data lead to results in two forms: knowledge (test knowledge held by the test leader) and metrics/graphs (user satisfaction or dissatisfaction plotted against appraised efficiency or inefficiency).]
Fig. 1. Contents of the UTUM testing, a mix of metrics and mental data
Fig. 2. Qualitative results in spreadsheets (product information removed)
Figure 1 illustrates the flow of data and knowledge contained in the test and the test results, and how the test is related to different groups of stakeholders. Stakeholders, who can be within the organisation, or licensees, or customers in other organisations, can be seen at the top of the flow, as interested parties. Their requirements influence the design and contents of the test. The data collected is found both as knowledge stored in the mind of the test leader, and as metrics and qualitative data in spreadsheets. Figure 2 represents a spreadsheet where qualitative findings of the testing are stored, a Structured Data Summary, created and developed by Gary Denman, UIQ. It shows issues that have been found, on the basis of each tester and each device, for every use case. Comments made by the test participants and observations made by the test leader are stored as comments in the spreadsheet. The results of the testing are thereby a combination of metrics and knowledge, where the different types of data confirm one another. The metrics based material is presented in the form of diagrams, graphs and charts, showing comparisons, relations and tendencies. This can be corroborated by the knowledge possessed by the test leader, who has interacted with the testers and is the person who knows most about the process and context of the testing. Knowledge material is often presented verbally, but can if necessary be supported and confirmed by visual presentations of the data. UTUM has been found to be a customer driven tool that is quick and efficient, easily transferable to new environments, and that handles complexity [23]. For more detailed information on the contents and performance of the UTUM test and the principles behind it, see ([5], [23]). A brief video presentation of the whole test process (6 minutes) can be found on YouTube [24].
4 The Study Methodology and the Case
This work has been performed as part of a long-term cooperation between U-ODD and UIQ, which has centred on the development and evaluation of a usability test (for more information, see [23, 25]). The prime areas of interest have been creating and studying a test method for quality assurance, developing metrics to measure usability, and combining qualitative and quantitative results. The case study in this phase of the research cooperation concerned tests performed by UIQ in Ronneby, and by Sony Ericsson Mobile Development in Manchester. The process of research cooperation is action research according to the research and method development methodology called Cooperative Method Development (CMD); see [26], [27], [28] and ([29], chapter 8) for further details. Action research is “research that involves practical problem solving which has theoretical relevance” ([30] p. 12). It involves gaining an understanding of a problem, generating and spreading practical improvement ideas, applying the ideas in a real world situation and spreading the theoretical conclusions within academia [30]. Improvement and involvement are central to action research, and its purpose is to influence or change some aspect of whatever the research has as its focus ([31] p. 215). A central aspect of action research is collaboration between researchers and those who are the focus of the research. The terms participatory research and participatory action research are sometimes used as synonyms for action research ([31] p. 216). CMD is built upon a number of guidelines including the use of ethnomethodological and ethnographically inspired empirical research, combined with other methods if suitable. Ethnography is a research strategy taken from sociology that has its foundations in anthropology [32].
It is a method that relies upon the first-hand experience of a field worker who is directly involved in the setting that is under investigation [32]. CMD focuses on shop floor development practices, taking the practitioners’ perspective when evaluating the empirical research and deliberating improvements, and involving the practitioners in the improvements. This approach is inspired by a Participatory Design (PD) perspective. PD is an approach towards system design in which those who are expected to use the system are actively involved and play a critical role in designing it. It is a paradigm where stakeholders are included in the design process, and it demands shared responsibility, active participation, and a partnership between users and implementers [33].
This study has been performed as a case study, defined by Yin as “an empirical enquiry that investigates a contemporary phenomenon within its real-life context, especially when the boundaries between phenomenon and context are not clearly evident” ([34], p. 13). The data for the study has been obtained through observation, through a series of unstructured and semi-structured interviews [31], both face-to-face and via telephone, through participation in meetings between different stakeholders in the process, and from project documents and working material, such as a research protocol that ensures that the individual tests take place in a consistent manner, spreadsheets for storing and analysing qualitative and quantitative data, and material used for presenting results to different stakeholders. The interviews have been performed with test leaders, and with staff on management level within the two companies. Interviews have been audiotaped and transcribed, and all material has been collected in a research diary. The diary is also the case study database, which
collects all of the information in the study, allowing for traceability and transparency of the material, and reliability [34]. The mix of data and collection methods has given a triangulation of data that serves to validate the results that have been reached. The transcriptions of the interview material, and other case material in the research diary, have been analysed to find emerging themes, in an editing approach that is also consistent with Grounded Theory (see Robson [31] p. 458). The analysis process has affected the further rounds of questioning, narrowing down the focus, shifting the main area of interest, and making room for new respondents who shed light on new aspects of the study. During the case study, as often happens in case studies [34], the research question changed. The first focus of the study was the fact that testing was distributed, and the effect this had on the testing and the analysis of the results. Gradually, another area of interest emerged: the elements of agility in the test, and the balance between the formal and informal parts of the testing.
We have tried to counter threats to validity and reliability in the study. One of these is bias introduced by the researchers most closely involved in the study. This is addressed by cross checking results with participants in the study, and discussing the results of the case study with research colleagues. One more threat is that most of the data in the case study comes from UIQ. Due to close proximity to UIQ, the interaction there has been frequent and informal, and everyday contacts and discussions on many topics have influenced the interviews and their analysis. Interaction with Sony Ericsson has been limited to interviews and discussions, but data from Sony Ericsson confirms what was found at UIQ.
Another threat is that most of the data in the case study comes from informants who work within the usability/testing area, but once again, they come from two different organisations and corroborate one another, have been complemented by information from other stakeholders, and thus present a picture of industrial reality. In the case in question, test leaders from two organisations in two countries performed testing in parallel. Testing was performed in a situation with complex relationships between customers, clients, and end-users, and complexities of how and where results were used. Reasons for performing the tests were to validate the UTUM test itself as a tool for quality assurance, and to obtain a greater number of tests, creating a baseline for validation of products, to identify and measure differences or similarities between countries, and to identify issues with common use-cases. Normally, there is no need for a large number of testers or data points. However, although this can be seen as a large test from the point of view of the participating organisations, compared to their normal testing, with more than 10 000 data points, it was found to be an agile process, where results were produced quickly and efficiently. In the following, we present the results of the study, and discuss in which way the results are agile or plan-driven/formal, who is interested in the different types of results, and which of the organisational stakeholders needs agile or formal results.
5 Agile or Formal?
The case study was grounded in thoughts concerning the importance of quality and agility in software processes, as specified previously. We have always seen the importance of the framework as a tool for quality, and verifying this was one purpose
of the testing that this case study was based on. Given the need for agility mentioned above, the intention of this case study became to see how the test is related to agile processes and whether the items in the agile manifesto can be identified in the results from the test framework. The following is the result of having studied the material from the case study from the perspective of the spectrum of different items that are taken up in the agile manifesto. The agile movement is based on core values, described in the agile manifesto [35], and explicated in the agile principles [36]. The agile manifesto states that: “We are uncovering better ways of developing software by doing it and by helping others do it. Through this work we have come to value: Individuals and interactions over processes and tools, Working software over comprehensive documentation, Customer collaboration over contract negotiation, and Responding to change over following a plan. That is, while there is value in the items on the right, we value the items on the left more”. Cockburn [37] stresses that the intention is not to demolish the house of software development, represented here by the items on the right (e.g. working software over comprehensive documentation), but claims that those who embrace the items on the left rather than those on the right are more likely to succeed in the long run. Even in the agile community there is some disagreement about the choices, but it is accepted that discussions can lead to constructive criticism. Our analysis showed that all these elements could be identified in the test and its results. In our research we have always been conscious of a division of roles within the company, often expressed as “shop floor” and “management”, and working with a participatory design perspective we have worked very much from the shop floor point of view. During the study, this viewpoint of separate groups emerged and crystallised, and two disparate groups became apparent. 
We called these groups Designers, represented by e.g. interaction designers and system and interaction architects, representing the shop floor perspective, and Product Owners, including management, product planning, and marketing, representing the management perspective. When regarding this in light of the Agile manifesto, and its connection to our test, we began to see how different groups may have an interest in different factors of the framework and the results that it can produce, and it became a point of interest in the case study to see how these factors related to the manifesto and which of the groups, Designers (D) or Product Owners (PO), is mainly interested in each particular item in the manifesto. The case study material was analysed on the basis of these emerging thoughts. Where the groups were found to fit on the scale is marked at the end of the paragraphs that follow. We have changed one of the items from “Working software” to “Working information” as we see the information resulting from the testing process as a metaphor for the software that is produced in software development.
• Individuals and interactions – The testing process is dependent on the individuals who decide the format of the test, who lead the test, and who actually perform the tests on the devices. The central figure here is the test leader, who functions as a pivot point in the whole process, interacting with the testers, observing and registering the data, and presenting the results. This is obviously important in the long run from a PO perspective, but it is D who has the greatest and most immediate benefit from the interaction, which shows how users reacted to design decisions and is a central part of the testing. D
• Processes and tools – The test is based upon a well-defined process that can be repeated to collect similar data that can be compared over a period of time. This is of interest to the designers, but in the short term they are more concerned with the everyday activities of design and development that they are involved in. Therefore we see this as being of greatest interest to PO, who can get a long-term view of the product, its development, and e.g. comparisons with competitors, based on a stable and standardised method. PO
• Working information – The test produces working information quickly. Directly after the short period of testing that is the subject of this case study, the test leaders met and discussed and agreed upon their findings. This took place before the data was collated in the spreadsheets. They were able to present the most important qualitative findings to system and interaction architects within the two organisations 14 days after the testing began, and changes in the implementation were requested soon after that. An advantage of doing the testing in-house is having access to the test leaders, who can explain and clarify what has happened and the implications of it. This is obviously of primary interest to D
• Comprehensive documentation – The comprehensive documentation consists of spreadsheets containing metrics and qualitative data. The increased use of metrics, which is the formal element in the testing, is seen in both organizations in this study as a complement to the testing methods already in use. Metrics back up the qualitative findings that have always been the result of testing, and open up new ways to present test results in ways that are easy to understand without having to include contextual information. They make test results accessible for new groups.
The quantitative data gives statistical confirmation of the early qualitative findings, but is regarded as most useful for PO, who want figures for the findings that have been reached. There is less pressure of time to get these results compiled, as the most important work has been done, and the critical findings are already being implemented. In this case study, the metrics consisted of 10 000 data points collected from 48 users, a mixture of quantitative measurements and attitudinal metrics. The metrics can be subject to stringent analysis to show comparisons and correlations between different factors. In both organisations there is beginning to be a demand for Key Performance Indicators for usability, and although it is still unclear what these may consist of, it is an indication of a trend that comes from PO level. PO
• Customer collaboration – In the testing procedure it is important for the testers to have easy access to individuals, to gain information about customer needs, end user patterns, etc. The whole idea of the test is to collect the information that is needed at the current time regarding the product and its development. How this is done in practice is obviously of concern to PO in the long run, but in the immediate day to day operation it is primarily of interest to D
• Contract negotiation – On a high level it is up to PO to decide what sort of cooperation should take place between different organisations and customers, and this is not something that D is involved in, so this is seen as PO
• Respond to change – The test is easily adapted to changes, and is not particularly resource-intensive. If there is a need to change the format of a test, or a new test requirement turns up suddenly, it is easy to change the test without having expended extensive resources on the testing. It is also easy to do a “Light” version
of a test to check a particular feature that arises in the everyday work of design, and this has happened several times at UIQ. This is characteristic of the day to day work with interaction design, and is not of immediate concern for PO, so this is seen as D
• Following a plan – From a short-term perspective, this is important for D, but since they work in a rapidly changing situation, it is more important for them to be able to respond to change. Following a plan is, however, important for PO, who are responsible for well functioning strategies and long-term operations in the company. PO
[Figure 3: a scale running from Agile (left) to Plan Driven (right), with Designers (D) placed towards the agile items and Product Owners (PO) towards the plan-driven items, for four pairs: Individuals and interactions / Processes and tools; Working information / Comprehensive documentation; Customer collaboration / Contract negotiation; Respond to change / Following a plan.]
Fig. 3. Groups and their diverging interests
On opposite sides of the spectrum
In this analysis, we found that “Designers”, as in the agile manifesto, are interested in the items on the left, rather than the items on the right (see figure 3). We see this as being “A Designer’s Manifesto”. “Product Owners” are more interested in the items on the right. Boehm characterised the items on the right side as being “An Auditor Manifesto” [4]. We see it as being “A Product Owner’s Manifesto”. This is of course a sliding scale; some of the groups may be closer to the middle of the scale. Neither of the two groups is uninterested in what is happening at the opposite end of the spectrum, but as in the agile manifesto, while there is value in the items on one side, they value the items on the other side more. We are conscious of the fact that these two groups are very coarsely drawn, and that some groups and roles will lie between these extremes. We are unsure exactly which roles in the development process belong to which group, but are interested in looking at these extremes to see their information requirements in regard to the results of usability testing. On closer inspection it may be found that none of the groups is on the far side of the spectrum for all of the points in the manifesto; more work must be done to examine this distribution and division.
6 Discussion
Here, we discuss our results in relation to academic discourses, to answer the research question: How can we balance demands for agile results with demands for formal results when performing usability testing for quality assurance? We also comment upon two related discourses from the introductory chapter, i.e. the relation between quality and a need for cooperation between industry and research, and the relationship between quality and agility. Since we work in a mass-market situation, and the system that we are looking at is too large and complex for a single customer to specify, the testing process must be flexible enough to accommodate the needs of many different stakeholders. The product must appeal to the broadest possible group, so it is difficult for customers to operate in dedicated mode with the development team, with sufficient knowledge to span the whole range of the application, which is what an agile approach requires to work best [38]. In this case, test leaders work as proxies for the user in the mass market. We had a dedicated specialist test leader who brought in the knowledge that users have, in accordance with Pettichord [15]. Evidence suggests that drawing and learning from experience may be as important as taking a rational approach to testing [8]. The fact that the test leaders involved in the testing are usability experts working in the field in their everyday work activities means that they have considerable experience of their products and their field. They have specialist knowledge, gained over a period of time through interaction with end-users, customers, developers, and other parties that have an interest in the testing process and results. This is in line with the idea that agile methods get much of their agility from a reliance on tacit knowledge embodied in a team, rather than from knowledge written down in plans [38].
It would be difficult to gain acceptance of the test results within the whole organisation without the element of formalism. In sectors with large customer bases, companies require both rapid value and high assurance. This cannot be met by pure agility or plan-driven discipline; only a mix of these is sufficient, and organisations must evolve towards the mix that suits them best [38]. In our case this evolution has taken place during the whole period of the research cooperation, and has reached a phase where it has become apparent that this mix is desirable and even necessary. In relation to the above, Osterweil et al [1] state that there is a body of knowledge that could do much to improve quality, but that there is “a yawning chasm separating practice from research that blocks needed improvements in both communities”, thereby hindering quality. Practice is not as effective as it must be, and research suffers from a lack of validation of good ideas and redirection that result from serious use in the real world. This case study is part of a successful cooperation between research and industry, where the results enrich the work of both parts. Osterweil et al [1] also request the identification of dimensions of quality and measures appropriate for it. The particular understanding of agility discussed in our case study can be an answer to this request. The agility of the test process is in accordance with the “good organisational reasons” for “bad testing” that are argued by Martin et al [8]. These authors state that testing research has concentrated mainly on improving the formal aspects of testing, such as measuring test coverage and designing tools to support testing. However, despite advances in formal and automated fault discovery and their adoption in industry, the principal approach for validation and verification appears to
be demonstrating that the software is “good enough”. Hence, improving formal aspects does not necessarily help to design the testing that most efficiently satisfies organisational needs and minimises the effort needed to perform testing. In the results of the present paper, the main reason for not adopting “best practice” is precisely to orient testing to meet organisational needs. Our case is a confirmation of [8]. Here, it is based on the dynamics of customer relationships, using limited effort in the most effective way, and the timing of software releases to the needs of customers as to which features to release. The present paper illustrates how this happens in industry, since the agile type of testing studied here is not according to “best practice” but is a complement that meets organisational needs for a mass-market product in a rapidly changing marketplace, with many different customers and end-users.
7 Conclusion and Further Work
In the UTUM test package, we have managed to implement a sufficient balance between agility and plan driven formalism to satisfy practitioners in many roles. The industrial reality that has driven the development of this test package confirms the fact that quality and agility are vital for a company that is working in a rapidly changing environment, attempting to develop a product for a mass market. There is also an obvious need for formal data that can support the quick and agile results. Real-world complex situations are not either on or off. The UTUM test package demonstrates one way to balance demands for agile results with demands for formal results when performing usability testing for quality assurance. The test package conforms to both the Designer’s manifesto, and the Product Owner’s manifesto, and ensures that there is a mix of agility and formalism in the process. The case in the present paper confirms the argumentation emphasizing ‘good organizational reasons’, since this type of testing is not according to “best practice” but is a complement that meets organisational needs for a mass-market product in a rapidly changing marketplace, with many different customers and end-users. This is partly an illustration of the chasm between industry and research, and partly an illustration of how agile approaches are taken to adjust to industrial reality. In relation to the former this case study is a successful cooperation between research and industry. It has been ongoing since 2001, and the work has an impact in industry, and results enrich the work of both parties. The inclusion of Sony Ericsson in this case study gives an even greater possibility to spread the benefits of the cooperative research. More and more hybrid methods are emerging, where agile and plan driven methods are combined, and success stories are beginning to emerge. We see the results of this case study and the UTUM test as being one of these success stories.
How do we know that the test is successful? By seeing that it is in successful use in everyday practice in an industrial environment. We have found a successful balance between agility and formalism that works in industry and that exhibits qualities that can be of interest to both the agile and the software engineering community. As a follow up to this case study, work has been performed to collect more information regarding the attitudes of Product Owners and Designers towards the information they require from testing and their preferred presentation formats. This will help define the groups and their needs, and allow us to place them on the map of the manifesto, and tailor the testing and presentation methods to fulfil these needs, and thereby improve the test package even further.
Acknowledgements
This work was partly funded by The Knowledge Foundation in Sweden under a research grant for the software development project “Blekinge – Engineering Software Qualities”, www.bth.se/besq. Thanks to my colleagues in the U-ODD research group for their help in data analysis and structuring my writing. Thanks also to Gary Denman for permission to use the extract from the Structured Data Summary.
References
1. Osterweil, L.: Strategic directions in software quality. ACM Computing Surveys (CSUR) 28(4), 738–750 (1996)
2. Harrold, M.J.: Testing: A Roadmap. In: Proceedings of the Conference on The Future of Software Engineering. ACM Press, Limerick (2000)
3. WoSQ. Fifth Workshop on Software Quality, at ICSE 2007 (2007), http://attend.it.uts.edu.au/icse2007/ (cited June 13, 2008)
4. Boehm, B.W.: Keynote address. In: 5th Workshop on Software Quality, Minneapolis, MN (2007)
5. UIQ Technology. UIQ Technology Usability Metrics (2006), http://uiq.com/utum.html (cited June 13, 2008)
6. U-ODD. Use-Oriented Design and Development (2008), http://www.bth.se/tek/u-odd (cited June 09, 2008)
7. UIQ Technology. Company Information (2008), http://uiq.com/aboutus.html (cited June 12, 2008)
8. Martin, D., Rooksby, J., Rouncefield, M., Sommerville, I.: ‘Good’ Organisational Reasons for ‘Bad’ Software Testing: An Ethnographic Study of Testing in a Small Software Company. In: ICSE 2007. IEEE, Minneapolis (2007)
9. Royce, W.W.: Managing the development of large software systems: concepts and techniques. In: 9th International Conference on Software Engineering. IEEE Computer Society Press, Monterey (1987)
10. Boehm, B.W.: A spiral model of software development and enhancement. Computer 21(5), 61–72 (1988)
11. Sommerville, I.: Software Engineering, 8th edn., p. 840. Addison Wesley, Reading (2007)
12. Pfleeger, S.L., Atlee, J.M.: Software Engineering, 3rd edn. Prentice Hall, Upper Saddle River (2006)
13. Talby, D., Hazzan, O., Dubinsky, Y., Keren, A.: Agile Software Testing in a Large-Scale Project. IEEE Software 23(4), 30–37 (2006)
14. Beck, K.: Extreme Programming Explained. Addison Wesley, Reading (2000)
15. Pettichord, B.: Testers and Developers Think Differently. STQE Magazine (2000)
16. International Organization for Standardization, ISO 9241-11 (1998): Ergonomic Requirements for Office Work with Visual Display Terminals (VDTs) - Part 11: Guidance on Usability (1998)
17. International Organization for Standardization, ISO 9126-1 Software engineering - Product quality - Part 1: Quality model, p. 25 (2001)
18. UXEM. User eXperience Evaluation Methods in product development, UXEM (2008), http://www.cs.tut.fi/ihte/CHI08_workshop/slides/Poster_UXEM_CHI08_V1.1.pdf (cited June 10, 2008)
19. Hassenzahl, M., Lai-Chong Law, E., Hvannberg, E.T.: User Experience - Towards a unified view. In: UX WS NordiCHI 2006, Oslo, Norway, cost294.org (2006)
20. Hassenzahl, M., Tractinsky, N.: User experience - a research agenda. Behaviour & Information Technology 25(2), 91–97 (2006)
21. UXNet. UXNet: the User Experience network (2008), http://uxnet.org/ (cited June 09, 2008)
22. Brooke, J.: System Usability Scale (SUS): A Quick-and-Dirty Method of System Evaluation User Information. Digital Equipment Co. Ltd., Reading (1986)
23. Winter, J., Rönkkö, K., Ahlberg, M., Hinely, M., Hellman, M.: Developing Quality through Measuring Usability – The UTUM Test Package. In: 5th Workshop on Software Quality, at ICSE 2007. IEEE, Minneapolis (2007)
24. BTH. UIQ, Usability test (2008), http://www.youtube.com/watch?v=5IjIRlVwgeo (cited August 29, 2008)
25. UIQ Technology. UTUM website (2008), http://uiq.com/utum.html (cited June 14, 2008)
26. Dittrich, Y., Rönkkö, K., Eriksson, J., Hansson, C., Lindeberg, O.: Co-operative Method Development: Combining qualitative empirical research with method, technique and process improvement. Journal of Empirical Software Engineering (2007)
27. Dittrich, Y.: Doing Empirical Research in Software Engineering – finding a path between understanding, intervention and method development. In: Dittrich, Y., Floyd, C., Klischewski, R. (eds.) Social Thinking – Software Practice, pp. 243–262. MIT Press, Cambridge (2002)
28. Dittrich, Y., Rönkkö, K., Lindeberg, O., Eriksson, J., Hansson, C.: Co-Operative Method Development revisited. SIGSOFT Softw. Eng. Notes 30(4), 1–3 (2005)
29. Rönkkö, K.: Making Methods Work in Software Engineering: Method Deployment as a Social achievement. In: School of Engineering, Blekinge Institute of Technology, Ronneby (2005)
30. Mumford, E.: Advice for an action researcher. Information Technology and People 14(1), 12–27 (2001)
31. Robson, C.: Real World Research, 2nd edn. Blackwell Publishing, Oxford (2002)
32. Rönkkö, K.: Ethnography. In: Laplante, P. (ed.) Encyclopedia of Software Engineering (accepted, under review). Taylor and Francis Group, New York (2008)
33. Schuler, D., Namioka, A.: Participatory Design - Principles and Practices. In: Schuler, D., Namioka, A. (eds.), 1st edn., p. 319. Lawrence Erlbaum Associates, Hillsdale (1993)
34. Yin, R.K.: Case Study Research - Design and Methods. In: Robinson, S. (ed.) Applied Social Research Methods Series, 3rd edn., vol. 5, p. 181. SAGE publications, Thousand Oaks (2003)
35. The Agile Alliance, The Agile Manifesto (2001), http://agilemanifesto.org/ (cited June 04, 2008)
36. The Agile Alliance, Principles of Agile Software (2001), http://www.agilemanifesto.org/principles.html (cited June 12, 2008)
37. Cockburn, A.: Agile Software Development. In: Cockburn, A., Highsmith, J. (eds.) The Agile Software Development Series. Addison-Wesley, Boston (2002)
38. Boehm, B.: Get Ready for Agile Methods, with Care. Computer 35(1), 64–69 (2002)
Author Index
Ahlberg, Mårten 275
Alchimowicz, Bartosz 20
Barus, A.C. 246
Bebjak, Michal 192
Biffl, Stefan 261
Bollig, Benedikt 103
Chen, T.Y. 246
Dobiš, Michal 62
Dolog, Peter 192
Estublier, Jacky 171
Flohr, Thomas 207
Franců, Jan 34
Grant, D. 246
Gschwind, Thomas 1
Hanhela, Hanna 143
Hnětynka, Petr 34
Höfer, Andreas 218
Hotchkiss, Jo 275
Ionita, Anca Daniela 171
Jeffery, Ross 232
Jurkiewicz, Jakub 20
Katoen, Joost-Pieter 103
Kern, Carsten 103
Koehler, Jana 1
Koznov, Dmitry 158
Kuo, Fei-Ching 246
Küster, Jochen 1
Kuvaja, Pasi 143
Lau, M.F. 246
Leucker, Martin 103
Majtás, Ľubomír 62
Menkyna, Radoslav 192
Minchin, Leonid 158
Münch, Jürgen 232
Nawrocki, Jerzy 20, 48
Nguyen, Tam 171
Nikiforova, Oksana 118
Ochodek, Miroslaw 20, 48
Olek, Łukasz 48
Pavlova, Natalja 118
Ramler, Rudolf 261
Ratkowski, Andrzej 76
Romanovsky, Konstantin 158
Rönkkö, Kari 275
Šaloun, Petr 186
Samolej, Slawomir 131
Similä, Jouni 143
Szmuc, Tomasz 131
Trendowicz, Adam 232
Tureček, Tomáš 186
Völzer, Hagen 1
Vranić, Valentino 192
Wahyudin, Dindin 261
Weiss, Petr 91
Winter, Jeff 275
Zalewski, Andrzej 76
Zimmermann, Olaf 1