Lecture Notes in Computer Science Commenced Publication in 1973 Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board David Hutchison Lancaster University, UK Takeo Kanade Carnegie Mellon University, Pittsburgh, PA, USA Josef Kittler University of Surrey, Guildford, UK Jon M. Kleinberg Cornell University, Ithaca, NY, USA Alfred Kobsa University of California, Irvine, CA, USA Friedemann Mattern ETH Zurich, Switzerland John C. Mitchell Stanford University, CA, USA Moni Naor Weizmann Institute of Science, Rehovot, Israel Oscar Nierstrasz University of Bern, Switzerland C. Pandu Rangan Indian Institute of Technology, Madras, India Bernhard Steffen University of Dortmund, Germany Madhu Sudan Microsoft Research, Cambridge, MA, USA Demetri Terzopoulos University of California, Los Angeles, CA, USA Doug Tygar University of California, Berkeley, CA, USA Gerhard Weikum Max-Planck Institute of Computer Science, Saarbruecken, Germany
5745
Jörn Altmann Rajkumar Buyya Omer F. Rana (Eds.)
Grid Economics and Business Models 6th International Workshop, GECON 2009 Delft, The Netherlands, August 24, 2009 Proceedings
Volume Editors Jörn Altmann Seoul National University TEMEP, School of Industrial and Management Engineering College of Engineering 599 Gwanak-Ro, Gwanak-Gu, 151-742 Seoul, South-Korea E-mail:
[email protected] Rajkumar Buyya University of Melbourne Grid Computing and Distributed Systems (GRIDS) Laboratory Department of Computer Science and Software Engineering 111, Barry Street, Carlton, Melbourne, VIC 3053, Australia E-mail:
[email protected] Omer F. Rana Cardiff University School of Computer Science Queen’s Buildings, Newport Road Cardiff CF24 3AA, United Kingdom E-mail:
[email protected]
Library of Congress Control Number: 2009932574 CR Subject Classification (1998): J.1, K.4.4, H.3.5, H.3, C.2 LNCS Sublibrary: SL 5 – Computer Communication Networks and Telecommunications ISSN ISBN-10 ISBN-13
0302-9743 3-642-03863-8 Springer Berlin Heidelberg New York 978-3-642-03863-1 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. springer.com © Springer-Verlag Berlin Heidelberg 2009 Printed in Germany Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India Printed on acid-free paper SPIN: 12739295 06/3180 543210
Preface
GECON - Grid Economics and Business Models
Cloud computing is seen by many people as the natural evolution of Grid computing concepts. Both, for instance, rely on the use of service-based approaches for provisioning compute and data resources. The importance of understanding business models and the economics of distributed computing systems and services has generally remained unchanged in the move to Cloud computing. This understanding is necessary in order to build sustainable e-infrastructure and businesses around this paradigm of sharing Cloud services. Currently, only a handful of companies have created successful businesses around Cloud services. Among these, Amazon and Salesforce (with their offerings of Elastic Compute Cloud and force.com, among others) are the most prominent. Both companies understand how to charge for their services and how to enable commercial transactions on them. However, whether a widespread adoption of Cloud services will occur remains to be seen. One key enabler remains the ability to support suitable business models and charging schemes that appeal to users outsourcing (part of) their internal business functions. The topics that have been addressed by the authors of accepted papers reflect the above-described situation and the need for a better understanding of Grid economics. The topics range from market mechanisms for trading computing resources, capacity planning, tools for modeling economic aspects of service-oriented systems, and architectures for handling service level agreements, to models for economically efficient resource allocation. The 6th International Workshop on Grid Economics and Business Models, GECON 2009, presented innovative results on the above-mentioned research topics. The workshop attracted high-quality paper submissions. In total, we received 25 submissions from researchers worldwide. Of those submitted papers of great quality, eight papers were accepted, constituting an acceptance rate of 32%. 
Each paper was reviewed 3 times at least and on average 3.4 times. For the proceedings, the eight accepted papers of this workshop have been grouped into three categories: (1) Market Models and Mechanisms; (2) Business Support Tools; and (3) Business-Related Resource Allocation. The first category “Market Models and Mechanisms” comprises three papers. In the first paper “DEEP-SaM - Energy-Efficient Provisioning Policies for Computing Environments” by Christian Bodenstein, Tim Püschel, Markus Hedwig, and Dirk Neumann, a social welfare maximizing market mechanism for clearing a market for computing jobs has been introduced. The mechanism, which considers energy costs and customer satisfaction levels, is validated through simulation. The results suggest its suitability for use in modern data centers. The second contribution “Response Deadline Evaluation in Point-to-Point Negotiation on Grids,” submitted by Sébastien Noël, Pierre Manneback, and Gheorghe Cosmin Silaghi, analyzes a Grid market, in which each player (consumer or resource
owner) in the market can broker incoming service requests. Using a simulator, the authors investigate the impact of different policies on the service negotiation process and the response times of the request-offer messages. The third paper in this category “A Framework for Analyzing the Economics of a Market for Grid Services” by Robin Mason, Costas Courcoubetis, and Natalia Miliou, introduces an economic model for analyzing a broad range of issues that are inherent to markets for Grid services. The numerical results show the effect of risk aversion and the durability of resources on the system behavior (i.e., the clearing price and volume of trade). The second category, “Business Support Tools,” contains three papers. The first paper “The GridEcon Platform: A Business Scenario Testbed for Commercial Cloud Services” by Marcel Risch, Jörn Altmann, Li Guo, Alan Fleming, and Costas Courcoubetis, describes a testbed to enable the evaluation of services in a commercial setting. The authors explain the different components and services of the testbed as well as how they can be composed to form a Cloud computing business scenario. The second contribution, which is titled “VieSLAF Framework: Enabling Adaptive and Versatile SLA-Management” by Ivona Brandic, Dejan Music, Philipp Leitner, and Schahram Dustdar, provides a novel framework for specifying service level agreements and for managing SLA mappings. The suggested solution can handle nonmatching SLA templates of consumers and providers. The last contribution of this category, “Cost Optimization Model for Business Applications in Virtualized Grid Environments” by Jörg Strebel, presents a cost model. Using mixed integer programming-based optimization, the model can be used to minimize the IT expenditure of an enterprise and to support enterprises in their decisions to outsource business software applications. Within the third category “Business-Related Resource Allocation,” there are the two remaining papers. 
The first paper, “Increasing Capacity Exploitation in Food Supply Chains Using Grid Concepts,” by Eugen Volk, Marcus Müller, Ansger Jacob, Peter Racz, and Martin Waldburger, analyzes the supply chain of the food industry and suggests a market-based solution using virtual organization concepts. Market participants can execute product-specific negotiations for all parts of the supply chain. The second paper, “A QoS-based Selection Mechanism exploiting Business Relationships in Workflows,” by Dimosthenis Kyriazis, Konstantinos Tserpes, Ioannis Papagiannis, Kleopatra Konstanteli, and Theodora Varvarigou, introduces an algorithm for mapping workflows (and their sub-jobs) to service instances. The algorithm considers Quality of Service (QoS) constraints of end-users and the business relationships between service providers. In addition to these papers, we received many paper submissions on research-in-progress. Out of these submissions, we selected six papers. These papers provide an overview of ongoing research projects on Grid and Cloud economics, addressing economic-related research in Cloud computing and software services. We grouped these papers into three categories as well: (1) Economic and Legal Models; (2) Business Models; and (3) Economic-Aware Architectures. The papers of the first category present an analysis of participation levels in Volunteer Grids and a discussion of the jurisdiction of service level agreements. The papers of the second category deal with business models in the application area of medicine and health. The third category of
papers introduces new architectures for virtual organizations and for composing service level agreements. All of these papers provide an excellent basis for further discussions. To make this workshop a success, many people contributed to this event. We would like to express our gratitude to the organizers of the 2009 Euro-Par Conference, in particular, Hai Xiang Lin, for their support in hosting the GECON 2009 Workshop in Delft, The Netherlands. We would also like to thank Alfred Hofmann and Ursula Barth of Springer, who collaborated very effectively. Finally, our gratitude goes to Marcel Risch. He dedicated his time and effort to set up the website, to communicate with authors in preparing the manuscript of the workshop proceedings, and to coordinate the publication of the proceedings with Springer.
August 2009
Jörn Altmann Rajkumar Buyya Omer Rana
Table of Contents
Market Models and Mechanisms

DEEP-SaM - Energy-Efficient Provisioning Policies for Computing Environments
Christian Bodenstein, Tim Püschel, Markus Hedwig, and Dirk Neumann

Response Deadline Evaluation in Point-to-Point Negotiation on Grids
Sébastien Noël, Pierre Manneback, and Gheorghe Cosmin Silaghi

A Framework for Analyzing the Economics of a Market for Grid Services
Robin Mason, Costas Courcoubetis, and Natalia Miliou

Business Support Tools

The GridEcon Platform: A Business Scenario Testbed for Commercial Cloud Services
Marcel Risch, Jörn Altmann, Li Guo, Alan Fleming, and Costas Courcoubetis

VieSLAF Framework: Enabling Adaptive and Versatile SLA-Management
Ivona Brandic, Dejan Music, Philipp Leitner, and Schahram Dustdar

Cost Optimization Model for Business Applications in Virtualized Grid Environments
Jörg Strebel

Business-Related Resource Allocation

Increasing Capacity Exploitation in Food Supply Chains Using Grid Concepts
Eugen Volk, Marcus Müller, Ansger Jacob, Peter Racz, and Martin Waldburger

A QoS-Based Selection Mechanism Exploiting Business Relationships in Workflows
Dimosthenis Kyriazis, Konstantinos Tserpes, Ioannis Papagiannis, Kleopatra Konstanteli, and Theodora Varvarigou

Work-in-Progress on Economic and Legal Models

Determinants of Participation in Global Volunteer Grids: A Cross-Country Analysis
Junseok Hwang, Jörn Altmann, and Ashraf Bany Mohammed

The Determination of Jurisdiction in Grid and Cloud Service Level Agreements
Davide Maria Parrilli

Work-in-Progress on Business Models

Engineering of Services and Business Models for Grid Applications
Jürgen Falkner and Anette Weisbecker

Visualization in Health Grid Environments: A Novel Service and Business Approach
Frank Dickmann, Mathias Kaspar, Benjamin Löhnhardt, Nick Kepper, Fred Viezens, Frank Hertel, Michael Lesnussa, Yassene Mohammed, Andreas Thiel, Thomas Steinke, Johannes Bernarding, Dagmar Krefting, Tobias A. Knoch, and Ulrich Sax

Work-in-Progress on Economic-Aware Architectures

Message Protocols for Provisioning and Usage of Computing Services
Nikolay Borissov, Simon Caton, Omer Rana, and Aharon Levine

Business Collaborations in Grids: The BREIN Architectural Principals and VO Model
Steve Taylor, Mike Surridge, Giuseppe Laria, Pierluigi Ritrovato, and Lutz Schubert

Author Index
DEEP-SaM - Energy-Efficient Provisioning Policies for Computing Environments
Christian Bodenstein, Tim Püschel, Markus Hedwig, and Dirk Neumann
Albert-Ludwigs-Universität Freiburg im Breisgau, Chair of Information Systems Research, 79085 Freiburg, Germany
Abstract. The cost of electricity for datacenters is a substantial operational cost that can and should be managed, not only to save energy, but also because of the ecological commitment inherent to power consumption. Often, pursuing this goal results in chronic underutilization of resources, a luxury most resource providers do not have in light of their corporate commitments. This work proposes, formalizes and numerically evaluates DEEP-SaM, a mechanism for clearing provisioning markets based on the maximization of welfare, subject to utility-level dependent energy costs and customer satisfaction levels. We focus specifically on linear power models, and the implications of the inherent fixed costs related to the energy consumption of modern datacenters and cloud environments. We rigorously test the model by running multiple simulation scenarios and evaluate the results critically. We conclude with positive results and implications for long-term sustainable management of modern datacenters.
1 Introduction
Spawning a great controversy amongst researchers, Nicholas Carr argued in "IT Doesn’t Matter" [Ca1] that IT itself does not matter, but rather the way we use it. Reserving judgment on the controversy, this statement seems to become more and more relevant as Green-IT usage policies take hold in day-to-day IT business processes. Perhaps Carr foresaw what would today become the driving trend in the management of datacenters worldwide: not only greening the hardware, but also greening its usage. But sustainability is no new policy in business management. Since the industrialization during the 18th and 19th centuries, the role of the sustainable management movement has steadily evolved. Where technologies such as the steam engine simplified transportation, they inadvertently led us to the current threat of global warming. Accompanying the trend to globalization, the improved information and communication technologies (ICT) required by the fast-growing IT industry added to the toll. Alarmingly, most lower-level users in the industry are not even aware of their impact. Wiessner-Gross¹ claims that a single Google search can be traced to emit approximately 7 grams of CO2. Currently, global carbon emissions attributable to ICT are around 2% to 2.5% of the world total.
¹ http://www.timesonline.co.uk/tol/news/environment/article5488934.ece
J. Altmann, R. Buyya, and O.F. Rana (Eds.): GECON 2009, LNCS 5745, pp. 1–14, 2009. © Springer-Verlag Berlin Heidelberg 2009
Hence, the energy-efficient management of IT infrastructures is a key issue. To achieve this, the decision which server executes which operation at any given time (also known as the winner-determination problem) becomes crucial. Basic allocation mechanisms such as round-robin could result in high-cost operations that could have been prevented using more energy-aware provisioning policies. This provisioning problem becomes more problematic when looking at the immense fixed cost component inherent to computing. Numerous authors ([Br1];[Ra2];[Se1]), while not always in agreement on the exact magnitude of energy lost to inefficient hardware, would at least agree that the loss is significantly beyond 50% of the total energy consumed. However, limiting provisioning policies to strictly green motives could greatly diminish their usefulness and applicability. The benefits of both providers and customers are important factors that may not be neglected in such decision models. For providers, the benefit can usually be assumed to be profit. To determine customer benefit or customer satisfaction, a simple metric will be introduced later in this work. The research question posed by this work is thus: Can current market-based provisioning mechanisms be adapted to consider fixed energy costs and customer satisfaction in their optimization dynamically, without losing economic value while still being computationally tractable? In reply to this research question, this work proposes, formalizes and numerically evaluates DEEP-SaM (Discriminatory Energy-Efficient Provisioning including Satisfaction Model), a mechanism for clearing provisioning markets based on the greedy maximization of welfare and the minimization of rejected jobs. 
DEEP-SaM is novel as it:
– includes energy costs in provisioning, allowing market-based platforms to allocate scarce resources in a truly efficient manner,
– considers the fixed energy costs inherent to current computational devices,
– considers service level agreements in the form of customer satisfaction,
– is applicable in computing Clouds, Grids (to some extent) and datacenters.
This work is structured as follows. Section 2 presents related work in the field of provisioning both with respect to market goals, energy efficiency and customer satisfaction, followed by the formalization of the provisioning model, the mathematical representation of DEEP-SaM in section 3 and the description of the fast-clearing DEEP-SaM heuristic in section 4. Section 5 presents the numerical evaluation of the models, followed by a short comment on possible implications this mechanism may have on job provisioning in section 6, before concluding and pointing to further research planned in section 7.
2 Related Work
To date, a lot of research has also gone into hardware inefficiencies and thermal hotspots in computing clusters ([Se1] found that most energy flowing into
systems is given off as heat, so that only about 10% (the percentage varies with hardware complexity) of the energy is actually available for energy optimization). With ample means of reducing heat in IT infrastructures ([SS1];[CR1];[Ku1]), authors looked towards reducing the energy consumption of the equipment itself. ([CM1];[FL1]) proposed solutions for efficiency in server clusters, ranging from energy-saving power supplies and the powering down of unused hard drives to the use of different power states to conserve energy. Combining all these techniques, and forming part of the three cornerstone approaches used in this work, ([Ha1];[Ra1];[RS1]) propose energy-efficient mechanisms and designs for datacenters. The above work focused on the technical aspects of provisioning in power-scarce environments given technical restrictions, rather than on economic incentives. Another view on greening the use of IT is the direct reduction of operating costs. [LR1] reuse these techniques and provide promising contributions to energy management for servers. [RM1] presented an algorithm that maximizes the rewards under a limited energy budget without exceeding the deadlines or the available energy. These algorithms were used especially for scheduling on portable devices, where battery operation is crucial. More cost/price insight can be found in [HD1] and [CC1], who combine these solutions for server cluster environments. This is not dissimilar to [SS1], where the goal of the allocation is to minimize the average percentage of energy consumed by the application to execute across the machines in the ad hoc Grid, while meeting an application execution time constraint. Consolidated attempts are present in the work of ([MB1];[MG1];[PB1]), ranging from automated power management to forcing servers to power-nap in underutilized phases. The idea of including additional objectives in scheduling decisions has been covered in various related work inspecting different aspects of those selected in this work. 
[FN1] discuss the application of economic theories to resource management. [AG1] present an architecture for autonomic self-optimization based on business objectives. Customer satisfaction has been mentioned as a business objective and benchmark by [GC1]; however, they did not discuss how to include it in scheduling mechanisms. An architecture for admission control on e-commerce websites that prioritizes user sessions based on predictions about the user’s intentions to buy a product is proposed by [PM1]. In this work, we intend to contribute to the energy-efficient management of computing infrastructures, making use of utilization methods similar to ([Br1];[Ha1]), keeping the preferable revenue characteristics of [SN2], while implementing the consolidated methods proposed by ([MB1];[PB1]). This allows provisioning models to include energy costs in provisioning, allowing market-based platforms to allocate scarce resources in a truly efficient manner. Further, we intend to provide for the goal of maintaining customer satisfaction, allowing the optimization to dynamically weigh the goals as the provisioner sees fit. In its performance as a market mechanism, DEEP-SaM operates independently of the pricing mechanism invoked, and remains applicable even when faced with centralized computing environments.
3 The Provisioning Model
This section elaborates DEEP-SaM, a market-based model consisting of agents that submit time-sensitive computing jobs to computing infrastructures such as Grids, Clouds or datacenters, nodes that supply computing resources and specify the times at which their machines are available, and an automated operator that decides the allocation. Additionally, the computing environment may be required to handle additional jobs from other clusters or datacenters. Where some operators excel in storage, others may focus on processing, making the inter-center trade of tasks an integral part of the provisioning process.
Fig. 1. Market Mechanism
Figure 1 depicts the base of operation in common resource infrastructures. Instead of using two separate agents, one for coordinating the resources and the other for computing the provisioning policy, we propose the use of a market mechanism to fulfill the tasks of both provisioning and coordination of computing resources. These include provisioning policies and other priorities directly influenced by the objective function of the market model. This step does not hinder the expandability of the model, as external restrictions, such as differing power models, can be arbitrarily added to the market model. This is easily attainable using goal programming approaches, by adding further objectives as goals and weights within the overall objective. To construct such a model, we first introduce the setting, followed by a formal mathematical representation of the allocation problem. Depending on the valuation, the priorities are determined on the market, which employs a greedy priority to accept only the most profitable jobs. Prioritizing the machine offers and job requests, on the other hand, is more intricate and, since it forms the target of this model, is covered in detail in section 3.1. Secondly, in section 4, we elaborate a heuristic allocation algorithm that achieves a good approximation of its NP-hard counterpart, before proceeding to the simulation-based evaluation in section 5.
3.1 The Setting
Computing power is the central scarce resource to be traded in an offline provisioning environment. There are two participating parties in the market: requesters, who wish to obtain computing resources and thereby maximize their private utility functions, and providers, who supply the market with resources and maximize their profits. These providers can range from private users to corporate datacenters auctioning their idle resources. We assume that customers only request resources if their use is beneficial to them. Hence, the more requests are accepted the better. Furthermore, we assume that the valuations submitted with the requests relate to their benefit for the customer. Important jobs are given higher valuations to ensure that they are accepted. This leads us to the following simple customer satisfaction metric, the relative net satisfaction (RNS):

$$\mathrm{RNS} = \frac{\sum_{jt}^{JT} x_{jnt}\, c_j\, v_j}{\sum_{jt}^{JT} c_j\, v_j},$$

where $x_{jnt}$ is a binary decision variable determining whether job $j$ is allocated or not, $c_j$ is the capacity needed by job $j$, and $v_j$ is the valuation per capacity unit for job $j$. It is important to note that jobs are either allocated in the requested phases or not at all, meaning that no stretching or squashing of jobs is allowed. Consequently, waiting time causes no change in customer satisfaction. Although this may seem limiting, it is necessary to allow for an analysis of changes to schedules when faced with just these two simple, yet major goals, namely satisfaction in terms of overall acceptance rates and energy costs, and their importance to policymakers. The numerator of the equation therefore delivers the sum of the customer valuations of all accepted jobs. To be able to compare results for different settings, this term is normalized by a denominator that gives the maximum possible valuation, reached when all jobs are accepted. The market acts as a bulletin board for participants, meaning that it collects offers and requests for a short period of time before allocating the resources and clearing the board. Once this has been completed, the mechanism proceeds to collecting further posts. These periods will be referred to as phases. The allocation is primarily handled by a sealed-bid auction mechanism, where requesters and suppliers do not know the other users' requests and offers ([PK1];[SN2]). For the resource offers, let $N$ describe the set of heterogeneous resource offers ("nodes") made by the providers. When submitting such a resource offer $n \in N$, the providers post the bid $(r_n, e_n^f, c_n, m_n, f_n, l_n)$ to the market board. $r_n(e_n^f, \mu) \in \mathbb{R}_0^+$ specifies the node reservation price per computing power per phase, depicted in arbitrary monetary units (MU), where $e_n^f$ depicts the energy costs when machine $n$ runs idle.
$c_n \in \mathbb{N}$ and $m_n \in \mathbb{N}$ are the computing units (CPU) and memory available per time slot, where $f_n \in \mathbb{N}$ is the first time slot in which these resources are available and $l_n \in \mathbb{N}$ is the last time slot in which the resources can be accessed [SN2]. Each node can be seen as a perfectly divisible virtual machine capable of executing multiple jobs in parallel, as proposed by [BD1]. Providers are assumed to have complete information about their availability. When provisioning, the mechanism can determine an optimal allocation scheme with perfect foresight.
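The relative net satisfaction metric defined above can be illustrated with a minimal Python sketch. The `Job` class, its field names and the `relative_net_satisfaction` helper are hypothetical names introduced only for this illustration. The sketch assumes that every job is served either in all of its requested phases or in none, so the sum over phases reduces to the number of requested phases per job.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    capacity: int      # c_j: computing units requested per phase
    valuation: float   # v_j: valuation per capacity unit
    first: int         # first requested phase
    last: int          # last requested phase (inclusive)


def relative_net_satisfaction(jobs, accepted):
    """Valuation-weighted share of accepted jobs (RNS).

    jobs maps a job name to its Job record; accepted is the set of names
    of jobs that were allocated in all of their requested phases.
    """
    def weight(job):
        # Sum of c_j * v_j over the requested phases of one job.
        phases = job.last - job.first + 1
        return phases * job.capacity * job.valuation

    total = sum(weight(job) for job in jobs.values())
    if total == 0:
        return 0.0
    served = sum(weight(jobs[name]) for name in accepted)
    return served / total


jobs = {
    "a": Job(capacity=2, valuation=5.0, first=1, last=2),   # weight 20
    "b": Job(capacity=1, valuation=10.0, first=1, last=1),  # weight 10
    "c": Job(capacity=4, valuation=2.5, first=2, last=3),   # weight 20
}
rns = relative_net_satisfaction(jobs, accepted={"a", "b"})
```

In this toy setting, rejecting job "c" removes 20 of the 50 valuation units requested in total, which shows how the metric penalizes the rejection of large or highly valued jobs more strongly than that of small ones.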
On the demand side, let $j \in J$ represent requests to execute a job. Requesters report their bid $(v_j, c_j, m_j, f_j, l_j)$ to the system. $v_j \in \mathbb{R}_0^+$ represents the buyer's maximum willingness to pay for computing resources per unit time, depicted in monetary units (MU). $c_j \in \mathbb{N}$ and $m_j \in \mathbb{N}$ are the computing units (CPU) and memory required by the job per phase, where $f_j \in \mathbb{N}$ is the first phase in which these resources are required and $l_j \in \mathbb{N}$ is the last phase in which they are required. Each job is submitted and can only be executed in its entirety: it is either allocated completely to the resources available between its designated start and end times, or not at all. Jobs can be migrated from one node to another, but each job can only be executed on one node at a time. Current schedulers would now aim to maximize the resource allocation, or to balance the overall system load. In this model, the provisioner additionally serves as the instrument through which the energy consumption of the datacenter is optimized. By being able to control both the utilization levels (implicitly through job-stacking) and the allocation of jobs to nodes, the provisioning mechanism holds the largest potential for energy conservation without losing any revenue.
3.2 Mathematical Formalization
Allocation is determined by the binary decision variable $x_{jnt}$, where $x_{jnt} = 1$ if job $j$ is allocated to node $n$ in time slot $t$, and $x_{jnt} = 0$ if not. The decision variable $y_{nt}$ is the fixed cost initialization, where $y_{nt} = 1$ if machine $n$ is powered in $t$, and $y_{nt} = 0$ if not. The time horizon $T = \{t \in \mathbb{N} \mid f_j \le t \le l_j\} \cup \{t \in \mathbb{N} \mid f_n \le t \le l_n\}$ defines the set of all time slots for the underlying allocation problem. This ensures that no jobs or nodes are allocated in undefined time slots [SN2]. Hence, if $J$ is defined as the set of job requests (resource requests) and $N$ as the set of defined resource nodes (node suppliers), the winner determination problem can be mathematically formalized as follows:

$$\max_{x}\; \alpha\, \frac{\sum_{jnt}^{JNT} x_{jnt}\, c_j\, (v_j - r_n) - \sum_{nt}^{NT} y_{nt}\, e_n^f\, c_n}{\sum_{nt}^{NT} r_n\, c_n - \sum_{nt}^{NT} e_n^f\, c_n} + (1 - \alpha)\, \frac{\sum_{jt}^{JT} x_{jnt}\, c_j\, v_j}{\sum_{jt}^{JT} c_j\, v_j} \tag{1}$$

Subject to:

$$f_j \le t \le l_j, \quad f_n \le t \le l_n, \quad v_j \ge r_n, \qquad \forall\, j \in J,\ n \in N \tag{2}$$

$$\sum_{n}^{N} x_{jnt} \le 1, \quad x_{jnt} \in \{0, 1\}, \qquad \forall\, j \in J,\ t \in T \tag{3}$$

$$\sum_{j}^{J} x_{jnt}\, c_j \le c_n, \qquad \forall\, n \in N,\ t \in T \tag{4}$$

$$\sum_{j}^{J} x_{jnt}\, m_j \le m_n, \qquad \forall\, n \in N,\ t \in T \tag{5}$$

$$\sum_{j}^{J} x_{jnt}\, \frac{c_j}{c_n} \le y_{nt}, \qquad \forall\, n \in N,\ t \in T \tag{6}$$

$$\sum_{u=f_j}^{l_j} \sum_{n}^{N} x_{jnu} = (l_j - f_j + 1) \sum_{n}^{N} x_{jnt}, \qquad \forall\, j \in J,\ t \in T \tag{7}$$

The objective (1) of this integer program is to jointly maximize the overall utility of suppliers and requesters, net of the total energy costs, and the customer satisfaction, weighting the individual objectives using a goal programming approach. Constraint (2) ensures feasibility with respect to resource availability ($f_n \le t \le l_n$), job-time constraints ($f_j \le t \le l_j$) and profitability ($v_j \ge r_n$). Constraint (3) ensures singularity: a job can only be allocated to one node in one time slot. Constraints (4) and (5) ensure that the resource constraints are respected, (6) initializes the fixed costs of powered machines, and (7) enforces atomicity, the property of a task being executed either in its entirety or not at all; job migration is allowed. The allocation problem above is an instance of the multi-dimensional Generalized Assignment Problem ([MT1];[SN2]) and is thus NP-hard, so no polynomial-time algorithm for solving it to optimality is known. While smaller models may be solvable in seconds, larger-scale problems take exponentially longer. We must thus resort to heuristics, as heuristic algorithms have the desirable property of solving hard problems near optimality in short periods of time. In the following section we define this heuristic to solve DEEP-SaM.
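To make the role of the individual constraints more tangible, the following Python sketch checks a candidate allocation against constraints (2) to (7) and evaluates the weighted objective (1). It is an illustration under stated assumptions rather than the implementation used in this work: the `Job` and `Node` classes and all function names are hypothetical, an allocation is stored as a mapping from (job, phase) to node so that constraint (3) holds by construction, $y_{nt}$ is derived as 1 exactly for nodes that carry load in a phase, and the normalizing sums are assumed to run over each node's availability window and each job's requested window.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    v: float  # valuation per capacity unit and phase
    c: int    # CPU units required per phase
    m: int    # memory required per phase
    f: int    # first requested phase
    l: int    # last requested phase (inclusive)


@dataclass(frozen=True)
class Node:
    r: float   # reservation price per capacity unit and phase
    ef: float  # fixed (idle) energy cost per capacity unit and phase
    c: int     # CPU units offered per phase
    m: int     # memory offered per phase
    f: int     # first available phase
    l: int     # last available phase (inclusive)


def span(item):
    """Number of phases in the window of a job or node."""
    return item.l - item.f + 1


def loads(x, jobs):
    """CPU and memory load per (node, phase); the keys are the pairs with y_nt = 1."""
    cpu, mem = defaultdict(int), defaultdict(int)
    for (j, t), n in x.items():
        cpu[n, t] += jobs[j].c
        mem[n, t] += jobs[j].m
    return cpu, mem


def is_feasible(x, jobs, nodes):
    """Check constraints (2), (4), (5) and (7) for x: (job, phase) -> node."""
    for (j, t), n in x.items():
        job, node = jobs[j], nodes[n]
        in_windows = job.f <= t <= job.l and node.f <= t <= node.l
        if not in_windows or job.v < node.r:              # constraint (2)
            return False
    cpu, mem = loads(x, jobs)
    for (n, t), used in cpu.items():
        if used > nodes[n].c or mem[n, t] > nodes[n].m:   # constraints (4), (5)
            return False
    served = defaultdict(int)
    for (j, _t) in x:
        served[j] += 1
    # Constraint (7): a job is served in all requested phases or in none.
    return all(count == span(jobs[j]) for j, count in served.items())


def objective(x, jobs, nodes, alpha):
    """Weighted sum of the normalized welfare term and the satisfaction term."""
    cpu, _ = loads(x, jobs)
    welfare = sum(jobs[j].c * (jobs[j].v - nodes[n].r) for (j, _t), n in x.items())
    energy = sum(nodes[n].ef * nodes[n].c for (n, _t) in cpu)
    # Normalizers are assumed to be nonzero in this sketch.
    norm = sum(span(n) * (n.r - n.ef) * n.c for n in nodes.values())
    value = sum(jobs[j].c * jobs[j].v for (j, _t) in x)
    total = sum(span(j) * j.c * j.v for j in jobs.values())
    return alpha * (welfare - energy) / norm + (1 - alpha) * value / total


nodes = {"n1": Node(r=1.0, ef=0.5, c=4, m=8, f=1, l=2)}
jobs = {
    "a": Job(v=3.0, c=2, m=2, f=1, l=2),
    "b": Job(v=2.0, c=2, m=4, f=1, l=1),
    "c": Job(v=5.0, c=4, m=2, f=2, l=2),
}
x = {("a", 1): "n1", ("a", 2): "n1", ("b", 1): "n1"}
overloaded = dict(x)
overloaded[("c", 2)] = "n1"  # pushes the CPU load in phase 2 to 6 > 4
score = objective(x, jobs, nodes, alpha=0.5)
```

Storing the allocation sparsely keeps the check linear in the number of assigned job phases; a full integer program would instead enumerate all $x_{jnt}$ and hand them to a solver.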
4 Heuristic Allocation Scheme
The DEEP-SaM heuristic approximates the optimal solution of the NP-hard problem in polynomial time. The heuristic has been tailored to the special characteristics of the problem setting. The later comparison with the optimal solution (Section 5) shows that the heuristic achieves nearly the same outcome with a significantly lower run-time complexity. In the following, the term valuation denotes either the consumer or the provider valuation for one CPU unit. Furthermore, the heuristic includes the parameter α, which allows more weight to be given to customer satisfaction at the expense of the economic goals.
DEEP-SaM Heuristic:
    DEEP_SAM_HEURISTIC(Jobs j, Nodes n, Double α)
      j <- j.Sort(SortKeys = CPU.DEM * VAL; ASCENDING)
      n <- n.Sort(SortKeys = α * (CPU.CAP * VAL + (1-α) * EN); DESCENDING)
      FOR EACH j IN jobs
        FOR EACH n IN nodes
          IF (n.cpu.cap >= j.cpu.dem) and (n.mem.cap >= j.mem.dem)
            IF (j.val < (n.en / n.cpu.cap * n.val))
              n.assignJob(j)
C. Bodenstein et al.
The input parameters of the heuristic are the set of jobs j, the set of nodes n, and the satisfaction parameter α. Both lists are sorted according to resource demand and valuation. The jobs are then assigned to the nodes iteratively, starting with the job with the highest valuation. Each job is assigned to the most economic node that provides the required resources. Jobs that cannot be assigned are rejected.
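The greedy scheme can be sketched in Python as follows. This is one possible reading rather than the authors' implementation, and several points are assumptions: jobs are processed in descending order of demand times valuation (following the prose), the node sort key is copied from the listing as printed, and the acceptance rule (the job valuation must cover the node's reserve price plus its per-unit share of the fixed energy cost) is a hypothetical interpretation of the profitability test. The class and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    cpu_dem: float
    mem_dem: float
    val: float  # valuation per CPU unit


@dataclass
class Node:
    name: str
    cpu_cap: float
    mem_cap: float
    val: float  # reserve price per CPU unit
    en: float   # fixed energy cost of the whole node (assumption)


def deep_sam_heuristic(jobs, nodes, alpha):
    """Greedy allocation; returns (assignment, rejected job names)."""
    # Assumption: most valuable jobs first, as the prose describes.
    jobs = sorted(jobs, key=lambda j: j.cpu_dem * j.val, reverse=True)
    # Node key taken from the printed listing; its exact form is uncertain.
    nodes = sorted(
        nodes,
        key=lambda n: alpha * (n.cpu_cap * n.val + (1 - alpha) * n.en),
        reverse=True,
    )
    free_cpu = {n.name: n.cpu_cap for n in nodes}
    free_mem = {n.name: n.mem_cap for n in nodes}
    assignment, rejected = {}, []
    for j in jobs:
        for n in nodes:
            fits = free_cpu[n.name] >= j.cpu_dem and free_mem[n.name] >= j.mem_dem
            # Hypothetical rule: cover reserve price plus energy share per unit.
            pays = j.val >= n.val + n.en / n.cpu_cap
            if fits and pays:
                assignment[j.name] = n.name
                free_cpu[n.name] -= j.cpu_dem
                free_mem[n.name] -= j.mem_dem
                break  # a job runs on at most one node
        else:
            rejected.append(j.name)
    return assignment, rejected


jobs = [Job("a", 2, 2, 10.0), Job("b", 1, 1, 1.0), Job("c", 2, 2, 6.0)]
nodes = [Node("n1", 4, 4, 2.0, 4.0), Node("n2", 2, 2, 1.0, 8.0)]
assignment, rejected = deep_sam_heuristic(jobs, nodes, alpha=0.8)
print(assignment)  # {'a': 'n1', 'c': 'n1'}
print(rejected)    # ['b']
```

In this small example job b fits on both nodes but is rejected because its valuation does not cover the assumed cost threshold, which mirrors the underutilization effect discussed for high α. Both sorts dominate the running time, followed by the nested loop over jobs and nodes.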
5 Numerical Experiments
For the evaluation of the proposed mechanism, we ran a number of numerical simulations, following a quantitative approach.
5.1 Data Generation
In order to simulate this scenario, the resource requirements and characteristics were all drawn randomly using lognormal distributions, similar to the data generation recommended by [Fe1] for the creation of parameters to simulate the model. The parameter values are always positive, since there is no such thing as a negative runtime, and the distribution is assumed to be skewed, with a tight variance around the mean. To achieve this in the model, a lognormal distribution over $\mathbb{R}^+_0$ with parameters $(\mu, \sigma^2)$ is used. As in related work, since reported valuations of users are, to the best of our knowledge, uncommon in current traces, we chose to combine these workload-trace parameters with artificially created bids and valuations. The initial settings, a specific combination of parameters, were drawn from distributions based on real workload traces from the Parallel Workloads Archive. We chose to generate the data in order to perform multiple simulations with the same distributions, so that the individual traces are comparable. As suggested in [Fe1], the computing and memory requirements were subject to lognormal distribution functions. The valuations and fixed costs were subject to a uniform distribution. The two performance measures used to evaluate our model are the profit, calculated by
\[
\mathit{Profits} := \sum_{j}^{J} \sum_{n}^{N} x_{jn}\, v_j c_j - \left( \sum_{j}^{J} \sum_{n}^{N} x_{jn}\, r_n c_j + \sum_{n}^{N} y_n\, e^f_n c_n \right) \tag{8}
\]
and the relative net satisfaction (RNS),
\[
\mathit{RNS} := \frac{\sum_{j,t}^{J,T} x_{jnt}\, c_j v_j}{\sum_{j,t}^{J,T} c_j v_j} \tag{9}
\]
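As a rough illustration of the data generation and of the two performance measures, the sketch below draws job parameters and evaluates Profits and RNS for a single-phase allocation. All names, the tuple layout and the distribution parameters are placeholder assumptions, and the fixed energy cost is assumed to be subtracted from revenue, in line with objective (1).

```python
import random


def generate_jobs(count, mu=1.0, sigma=0.5, seed=0):
    """Draw (c_j, m_j, v_j) triples; all parameter values are placeholders."""
    rng = random.Random(seed)
    return [
        (
            rng.lognormvariate(mu, sigma),  # CPU demand c_j (lognormal)
            rng.lognormvariate(mu, sigma),  # memory demand m_j (lognormal)
            rng.uniform(1.0, 10.0),         # valuation v_j (uniform)
        )
        for _ in range(count)
    ]


def profits(alloc, jobs, nodes):
    """Eq. (8) for a single phase.

    alloc maps job index -> node index; jobs[j] = (c_j, m_j, v_j) and
    nodes[n] = (c_n, r_n, e_n), with e_n the fixed energy cost per CPU unit.
    """
    revenue = sum(jobs[j][2] * jobs[j][0] for j in alloc)
    reserve = sum(nodes[n][1] * jobs[j][0] for j, n in alloc.items())
    # Only nodes that host at least one job are powered (y_n = 1).
    energy = sum(nodes[n][2] * nodes[n][0] for n in set(alloc.values()))
    return revenue - (reserve + energy)


def rns(alloc, jobs):
    """Eq. (9): accepted share of the total requested value."""
    accepted = sum(jobs[j][0] * jobs[j][2] for j in alloc)
    total = sum(c * v for c, _, v in jobs)
    return accepted / total


example_jobs = [(2.0, 1.0, 5.0), (1.0, 1.0, 4.0)]
example_nodes = [(4.0, 1.0, 0.5)]
example_alloc = {0: 0}  # only the first job is accepted, on node 0
print(profits(example_alloc, example_jobs, example_nodes))  # 6.0
print(rns(example_alloc, example_jobs))
```

Here the accepted job yields a revenue of 10, a reserve cost of 2 and an energy cost of 2, and it accounts for 10 of the 14 units of requested value.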
All provisioners (DEEP-SaM, the heuristic, and the benchmark LaFi) are subjected to rigorous testing. To be certain that both models are subject to the same conditions, and to avoid too many discrepancies, the provisioning phase-line for both models has been reduced to a single-phase one, where provisioning takes place in discrete unit-size phases. In each trial, all three evaluated provisioners have been compared based on the same data set. Further, we assume nodes are powered on demand, implying that nodes which are not required remain shut down. Power-up costs are explicitly considered as a part of the fixed energy costs $e^f_n$. For multi-phase models, this is not sufficient: the fixed energy costs can no longer account for both operation and power-up costs, since power-up costs are only incurred in the first phase.
5.2 Data Analysis
In this subsection, we present the results of the experiments based on the simulations generated above. The simulations and optimizations for the exact solutions were run using GAMS/CPLEX. All results below are averages of 10 simulation runs. Further, no restrictions were imposed upon the CPLEX solver, allowing the solver to run until it reached the optimal solution. Figure 2 (left) shows the performance of the optimal algorithm in terms of monetary worth, given a changing α. The allocation changed as expected. By increasing α, the importance of the energy-efficient model increases. This also implies a decrease in job acceptance (right), even though higher profits may still be achieved. This can be explained by the effect of the customer satisfaction term: by increasing the importance of near-full utilization, the acceptance threshold in terms of profitability is lowered. For the evaluation, a commonly employed Largest Job First (LaFi) mechanism is used as benchmark. This mechanism sorts jobs by job size and allocates them in that order; large jobs are scheduled before smaller ones. This way it tries to maximize resource utilization. Figure 4 illustrates how the heuristic performs in comparison to the benchmark. The profit achieved using the DEEP-SaM heuristic is always at least 38% higher than with LaFi. This is due to the fact that it takes the valuation of the jobs as well as costs into account. Even against the optimal algorithm, the heuristic performs well, with an efficiency of nearly 94% on average in terms of profits. The scenario with α = 1 represents a scenario where customer satisfaction is not taken into account and only job valuations and energy cost of the nodes are considered. In such a scenario the heuristic tends to reject jobs that do not cover their relative share of the fixed energy costs, even if they would result in a small contribution to profit. This leads to a slight underutilization of the resources, which can be balanced by taking customer satisfaction into account. Maximum profit is achieved for α = 0.8. The relative net satisfaction is between 11% and 18% higher compared to the benchmark, and it increases the more the satisfaction term is weighted. Node usage is comparable to the benchmark for low values of α; it decreases for rising α, as unprofitable nodes are turned off.
Fig. 2. Optimal allocation evaluation for changing α
Fig. 3. DEEP-SaM evaluation for changing α
Fig. 4. Heuristic evaluation for changing α
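For comparison, the Largest Job First benchmark can be sketched in a few lines of Python. The text only states that jobs are sorted by size and allocated in that order; the first-fit node choice, the restriction to CPU capacity and the function name are assumptions made for this illustration.

```python
def largest_job_first(jobs, nodes):
    """jobs: {name: cpu_demand}; nodes: {name: cpu_capacity}.

    Returns {job name: node name} for the jobs that could be placed.
    Valuations and energy costs are deliberately ignored.
    """
    free = dict(nodes)  # remaining capacity per node
    assignment = {}
    ordered = sorted(jobs.items(), key=lambda kv: kv[1], reverse=True)
    for name, demand in ordered:
        for node, capacity in free.items():
            if capacity >= demand:  # first node with enough room (assumption)
                assignment[name] = node
                free[node] = capacity - demand
                break
    return assignment


print(largest_job_first({"a": 3, "b": 2, "c": 2}, {"n1": 4, "n2": 2}))
# {'a': 'n1', 'b': 'n2'}
```

Because the benchmark never looks at valuations or energy costs, it may power a node for a job that does not pay for it, which is consistent with the profit gap reported above.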
DEEP-SaM - Energy-Efficient Provisioning Policies
6 Implications for Job Scheduling
The significant results achieved above suggest promising implications for employing DEEP-SaM. As workload traces [Fe1] suggest, real computing settings are largely governed by markets where the job sizes, phases, and order numbers are far greater than those presented in this work. Even though this may pose a problem for the real-time use of the exact mechanism, the surprisingly high efficiency ratio for larger order numbers illustrates the usability of the heuristic. Besides, if GAMS/CPLEX resorts to heuristics in its optimization to find proper solutions in a timely manner, using the heuristic and then the optimal algorithm in succession may prove useful. To fully utilize the market potential, the decision of which pricing instrument to use becomes an important task. In this work, we resorted to proportional pricing methods to ensure some degree of truthfulness in reporting, and spent more time on the actual provisioning policy than on the pricing decision. In this regard, a major implication of this model is that when the pricing decision is determined pre-optimization, a practice often encountered among current data-center providers, no further complexity is added to the model. The provisioner will hence function even if pricing is determined in an iterative bidding process.
7 Conclusion and Outlook
Today, datacenters are faced with an increasing, yet still fluctuating demand for their resources (the real workload traces in [Fe1] exhibit a high volatility over time). This results in idle resources during off-peak phases, wasting energy and inadvertently polluting the environment. Grid computing offers feasible solutions to this problem, allowing jobs to be provisioned in a decentralized manner. Since Grids, however, do not directly maintain their own resources, the problem of energy efficiency has rarely been discussed. The current mechanisms presented in Section 2 are only partially able to cope with the given setting. In this work we presented DEEP-SaM, both an exact mechanism and a corresponding heuristic, to provide a meaningful alternative for this problem. As a market mechanism, it can be used as a fast and efficient approach to central provisioning management in datacenters, allowing the provisioner to directly take charge of the efficiency process. This, however, does not restrict it to such environments. Initial studies in ongoing research show that DEEP-SaM can just as well be used in datacenter, Cloud and utility computing environments, where the nodes can be seen as the hierarchical machine cluster addressed in provisioning or scheduling. To overcome the computational complexity of the exact mechanism, we resorted to the DEEP-SaM heuristic. While we did encounter a loss in efficiency by resorting to heuristics, as is generally the case, we feel the decrease in complexity and the ease of use fully justify this loss. We set out to improve current market-based provisioning mechanisms to consider the energy costs involved in computing, without ignoring the important task of customer satisfaction. We fine-tuned the model to consider energy cost functions with fixed-cost components in their optimization and were able to do so without losing economic value while still being computationally tractable. We achieved this by extending current market-based provisioning models, creating both DEEP-SaM and the DEEP-SaM heuristic, which include energy costs in provisioning and thus allow market-based platforms to allocate scarce resources efficiently by directly considering energy costs in the optimization. We specifically designed DEEP-SaM to cope with different pricing strategies, be it pre- or post-optimization. To allow for a broad range of applications, DEEP-SaM is designed to be agnostic of the source of jobs, making it applicable not only in datacenters but also, to some extent, in distributed environments such as computing clouds or Grids. Another important aspect of customer satisfaction models is client classification: different clients need to be handled differently, as they are satisfied in different ways. While some clients may enjoy VIP status and thus need to be prioritised, others need to be put on hold. This of course drastically changes the setup of the customer satisfaction model. In this work we deliberately kept the satisfaction model simple to provide a sound foundation to build on, permitting further research to include more interesting and realistic scenarios of client classification. Initial work has already been published in [PB2]. Since we consider moving away from the use of state-of-the-art heuristics, and given advancements in the field of quantum information processing and quantum computing, we intend to design and evaluate next-generation quantum provisioning algorithms paired with quantum bids to clear huge-scale provisioning environments. These will include multi-attribute combinatorial provisions, since not only CPUs will be traded, but also the memory requirements paired with the job. This also allows for entanglement between jobs, whereby one job can only be processed given certain conditions or states of other jobs in the queue, and provides a promising approach for next-generation job provisioning.
References
[AG1] Aiber, S., Gilat, D., Landau, A., Razinkow, N., Sela, A., Wasserkrug, S.: Autonomic Self-Optimization According to Business Objectives. In: ICAC 2004: Proceedings of the First International Conference on Autonomic Computing (ICAC 2004), pp. 206–213 (2004)
[BD1] Bapna, R., Das, S., Garfinkel, R., Stallaert, J.: A Market Design for Grid Computing. INFORMS Journal of Computing (2006)
[Br1] Bradbury, D.: Real Innovation - Building Canada’s largest green data centre. Backbone Magazine (2008)
[BR2] Burge, J., Ranganathan, P., Wiener, J.L.: (CASH’EM) Cost-Aware Scheduling for Heterogeneous Enterprise Machines. HP-Labs: http://www.hpl.hp.com/techreports/2007/HPL-2007-63.pdf (retrieved March 23, 2007)
[BA1] Buyya, R., Abramson, D., Venugopal, S.: The Grid Economy. In: Proceedings of the IEEE 93 Nr. 3 (2005)
[Ca1] Carr, N.: IT doesn’t matter. Harvard Business Review 81(5), 41–49 (2003)
[CM1] Chen, G., Malkowski, K., Kandemir, M., Raghavan, P.: Reducing power with performance constraints for parallel sparse applications. In: Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (2005)
[CC1] Chun, B.N., Culler, D.E.: User-Centric Performance Analysis of Market-Based Cluster Batch Schedulers. In: Proceedings of the 2nd IEEE/ACM International Symposium on Cluster Computing and the Grid. IEEE Computer Society, Los Alamitos (2002)
[CR1] Coskun, A.K., Rosing, T.S., Whisnant, K.: Temperature aware task scheduling in MPSoCs. In: Proceedings of the conference on Design, automation and test in Europe, EDA Consortium, Nice, France (2007)
[GC1] Ganek, A.G., Corbi, T.A.: The dawning of the autonomic computing era. IBM Systems Journal 42(1) (2003)
[Ha1] Hamann, H.F.: A Measurement-Based Method for Improving Data Center Energy Efficiency. In: Proceedings of the 2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing (SUTC 2008), vol. 00. IEEE Computer Society, Los Alamitos (2008)
[HD1] Heath, T., Diniz, B., Carrera, E.V., Wagner Jr., M., Bianchini, R.: Energy conservation in heterogeneous server clusters. In: Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming. ACM, Chicago (2005)
[Fe1] Feitelson, D.G.: Workload Modeling for Performance Evaluation. In: Calzarossa, M.C., Tucci, S. (eds.) Performance 2002. LNCS, vol. 2459, pp. 114–141. Springer, Heidelberg (2002)
[FL1] Freeh, V.W., Lowenthal, D.K., Pan, F., Kappiah, N., Springer, R., Rountree, B.L.: Analyzing the Energy-Time Trade-Off in High-Performance Computing Applications. IEEE Trans. Parallel Distrib. Syst. 18(6), 835–848 (2007)
[FN1] Ferguson, D.F., Nikolaou, C., Sairamesh, J., Yemini, Y.: Economic models for allocating resources in computer systems. In: Market-based control: a paradigm for distributed resource allocation, pp. 156–183 (1996)
[Ku1] Kurp, P.: Green computing. Communications of the ACM 51(10), 11–13 (2008)
[LR1] Lefurgy, C., Rajamani, K., Rawson, F., Felter, W., Kistler, M., Keller, T.W.: Energy Management for Commercial Servers. Computer 36(12), 39–48 (2003)
[MT1] Martello, S., Toth, P.: Knapsack problems. John Wiley, Chichester (1990)
[MB1] Mastroleon, L., Bambos, N., Kozyrakis, C., Economou, D.: Autonomic Power Management Schemes for Internet Servers and Data Centers. In: IEEE Global Telecommunications Conference, GLOBECOMM (2005)
[MG1] Meisner, D., Gold, B.T., Wenisch, T.F.: PowerNap: eliminating server idle power. In: Proceeding of the 14th international conference on Architectural support for programming languages and operating systems. ACM, Washington (2009)
[MW1] Mutz, A., Wolski, R., Brevik, J.: Eliciting honest value information in a batch-queue environment. In: 2007 8th IEEE/ACM International Conference on Grid Computing, pp. 291–297 (2007)
Nielsen, L.S., Niessen, C.: Low-power operation using self-timed circuits and adaptive scaling of the supply voltage. IEEE Trans. Very Large Scale Integr. Syst. 2(4), 391–397 (1994)
[PK1] Parkes, D.C., Kalagnanam, J., Eso, M.: Achieving Budget-Balance with Vickrey-Based Payment Schemes in Combinatorial Exchanges. In: IBM Research Report RC 22218 W0110-065 (2001)
[PB1] Pinheiro, E., Bianchini, R., Carrera, E.V., Heath, T.: Load Balancing and Unbalancing for Power and Performance in Cluster-Based Systems. In: Workshop on Compilers and Operating Systems for Low Power (COLP) (2001)
[PM1] Poggi, N., Moreno, T., Berral, J.L., Gavaldà, R., Torres, J.: Web customer modeling for automated session prioritization on high traffic sites. In: Conati, C., McCoy, K., Paliouras, G. (eds.) UM 2007. LNCS, vol. 4511, pp. 450–454. Springer, Heidelberg (2007)
[PB2] Püschel, T., Borissov, N., Macías, M., Neumann, D., Guitart, J., Torres, J.: Economically enhanced resource management for internet service utilities. In: Benatallah, B., Casati, F., Georgakopoulos, D., Bartolini, C., Sadiq, W., Godart, C. (eds.) WISE 2007. LNCS, vol. 4831, pp. 335–348. Springer, Heidelberg (2007)
[Ra1] Raghavendra, P.: Optimal algorithms and inapproximability results for every CSP? In: Proceedings of the 40th annual ACM symposium on Theory of computing. ACM, Canada (2008)
[Ra2] Rasmussen, N.: Implementing Energy Efficient Data Centers. In: APC White Paper # 114 (2006)
[RS1] Rivoire, S., Shah, M.A., Ranganathan, P., Kozyrakis, C.: JouleSort: a balanced energy-efficiency benchmark. In: Proceedings of the 2007 ACM SIGMOD international conference on Management of data. ACM, Beijing (2007)
[RM1] Rusu, C., Melhem, R., Mosse, D.: Maximizing Rewards for Real-Time Applications with Energy Constraints. ACM Transactions on Embedded Computer Systems 2, 537–559 (2003)
[SN1] Schnizler, B., Neumann, D., Veit, D., Weinhardt, C.: Trading Grid Services a Multi Attribute Combinatorial approach. European Journal of Operational Research 187(3), 943–961 (2008)
[Se1] See, S.: Is there a pathway to a Green Grid, http://www.ibergrid.eu/2008/presentations/Dia%2013/4.pdf (retrieved March 23, 2007)
[SS1] Sharma, R.K., Shih, R., Bash, C., Patel, C., Varghese, P., Mekanapurath, M., Velayudhan, S., Manu Kumar, V.: On building next generation data centers: energy flow in the information technology stack. In: Proceedings of the 1st Bangalore annual Compute conference. ACM, Bangalore (2008)
[SS2] Shivle, S., Siegel, H.J., Maciejewski, A.A., Sugavanam, P., Banka, T., Castain, R., Chindam, K., Dussinger, S., Pichumani, P., Satyasekara, P., Saylor, W., Sendek, D., Sousa, J., Sridharan, J., Velazco, J.: Static Allocation of Resources to Communicating Subtasks in a Heterogeneous ad-hoc Grid Environment. Journal of Parallel and Distributed Computing 66(4) (2006)
[SN2] Stoesser, J., Neumann, D., Weinhardt, C.: GreedEx - A Scalable Clearing Mechanism for Utility Computing. Electronic Commerce Research 8(4) (2008)
[WH1] Waldsburger, C., Hogg, T., Huberman, B.A., Kephart, J.O., Stornetta, W.S.: Spawn: A Distributed Computational Economy. IEEE Transactions on Software Engineering 18(2), 103–117 (1992)
[WP1] Wolski, R., Plank, J.S., Brevik, J., Bryan, T.: Analyzing Market-Based Resource Allocation Strategies for the Computational Grid. International Journal of High Performance Computing Applications (2001)
Response Deadline Evaluation in Point-to-Point Negotiation on Grids
Sébastien Noël¹, Pierre Manneback¹, and Gheorghe Cosmin Silaghi²
¹ University of Mons, Faculty of Engineering, Belgium
{Sebastien.Noel,Pierre.Manneback}@umons.ac.be
² Babeş-Bolyai University of Cluj-Napoca, Romania
[email protected]
Abstract. In this paper we focus on access negotiation to services in decentralized commercial Grids. Besides being a resource provider, each site can play a brokering role and thus generate profit on the incoming service requests by subcontracting. In our configuration, negotiation is a point-to-point process consisting of pairs of request-offer messages. We present here an event-based simulator allowing the specification of various policies that can affect the service negotiation process, and the visualization of the information flow in the network. Further, we exploit the simulator to manage the response deadlines in the request-offer messages, so that sites accomplish their profit-maximization objectives. Simulations help to estimate an appropriate configuration of the policies, depending on different Grid parameters, that maximizes the local profit.
Keywords: Service access, Negotiation, Delegation.
1 Introduction
Nowadays, the Grid is moving from a centralized architecture with coordinating entities toward a scalable, fully decentralized infrastructure [4]. In a commercial context where sites try to make profit from their participation in the Grid, centralized management roles should be avoided because they introduce partiality issues. Moreover, central roles such as broker and resource discovery services reduce architectural flexibility and raise scalability issues. Grid sites could play these roles by themselves and try to make profit [5]. In our study we adopt this decentralized approach of a Grid with sites playing simultaneously the consumer, producer and broker roles. We introduce a bilateral point-to-point dialog mechanism to enable the brokering role at each Grid site. Two sites communicate by exchanging pairs of request-offer messages with time deadlines. A site that receives a request message can either deliver the requested service by itself or propagate the message further in the network, trying to subcontract and make a profit. While the bilateral negotiation can follow the discounting pattern described in [7], we study the configuration of negotiation and management policies of a given site trying to optimize its behavior. To the best of our knowledge, such commercial Grids, where multiple independent sites interact in a point-to-point manner, do not exist yet and offer new ways of doing business in a service-oriented environment. In this context, many management decisions have to be made at the site during the negotiation process so that it fully exploits the local resources on the one hand and the subcontractor network on the other. The site should be configured to attract consumers by proposing valuable offers and to maximize the local profit at the same time. Our key goal is to study a point-to-point negotiation environment and, based on the observations made, to propose adapted management policies. In this paper, our major contribution is to present an event-based simulator and to report the empirical observations we obtained. The simulator allows the specification of various policies that can affect the negotiation process; it helps the Grid designer to visualize the information flow in the network and to estimate performance indicators for various network topologies and node profiles. As an example, we study how policies for response deadlines in the communicated messages can influence the profits. For various global configurations, including the network topology density and the management policies of the other sites in the environment, we deduce the best deadline values depending on the context. These values can be used by administrators as a basis for the configuration of sites. The paper is organized as follows. Section 2 describes the negotiation pattern and the internal working of the negotiation process. Section 3 shows how the simulator works and helps the designer to tune the response deadlines in order to maximize the Grid sites' profit. Section 4 briefly reviews related work and Section 5 concludes the paper.
J. Altmann, R. Buyya, and O.F. Rana (Eds.): GECON 2009, LNCS 5745, pp. 15–27, 2009. © Springer-Verlag Berlin Heidelberg 2009
2 The Negotiation Model
This section briefly describes the negotiation pattern and the negotiation process conducted within each Grid site. For more details about the negotiation model, we refer the reader to [11].
2.1 The Negotiation Pattern
The Grid consists of a collection of networked sites $i$, each offering a set of resources/services $R_i$. Each site $i$ is acquainted with a locally defined list of other sites $A_i$ considered as potential service providers. Therefore, besides being a producer, each site can also play a brokering role by intermediating the service requests and trying to subcontract them in its acquaintance network $A_i$. Users $U_i$ of a site are the local consumers. The sets $R_i$, $A_i$ and $U_i$ might be empty, leading to the particular cases of single-role sites: broker-only, consumer-only or provider-only. We envision that each node also plays the brokering role and tries to generate income by subcontracting. The general setup is similar to the contract net protocol [12]. The negotiation starts when a user of a site $i$ needs some service $Ser$ and the site forwards this request $SR$ in its acquaintance network $A_i$. The service request is defined as in eq. (1):
\[
SR = [Ser, Rew, Pen, t_r] \tag{1}
\]
– $Ser$ is the service request itself. It includes the quality of service parameters such as resource type and quantity (e.g. 10 processors AMD Opteron 252)
– $Rew$ is the reward that the consumer is willing to pay
– $Pen$ is the penalty fee requested from the provider in case of failure in the service delivery
– $t_r$ is the deadline up to which the message initiator waits for a response to the request
The two parameters $Rew$ and $Pen$ are highly important and should be estimated precisely. Compared to the market mean price, which can be estimated by any Grid participant, a high reward $Rew$ attracts a higher interest from the provider. In this case, the consumer can therefore expect a higher response priority. A high penalty $Pen$ is generally proposed by the consumer when he needs high guarantees on the resource availability. A provider engaging in a high-penalty contract generally expects the reward to be high as well. Even though these two parameters are fundamental in a real contract negotiation, they are not discussed in depth in this paper since we focus on the deadline parameters. A real message exchange can be performed using an adapted specification such as WS-Agreement [14], but we adopt here a more compact representation while exploiting WS-Agreement terminology. Moreover, we use an arbitrary money unit/time scale and we suppose that all sites are synchronized with a common universal clock. We suppose that some site $j$ receives a service request $SR$. Depending on the acquaintance network $A_j$ of the site $j$, the request can be forwarded to other sites in $A_j$ after being possibly adapted (for instance, the reward $Rew$ and the deadline $t_r$ may be decreased). Site $j$ can forward the request to multiple sites, leading to concurrent negotiations as presented in [10]. When forwarding the requests, site $j$ needs to carefully select the deadline parameter $t_r$ for the subsequent concurrent negotiations. Optimising the selection of the deadline parameter $t_r$ is the main scope of this paper.
At any time, but most probably before $t_r$, the site $j$ can send a service offer $SO$ in response to the service request $SR$. Eq. (2) defines the service offer:
\[
SO = [Ser^{\sim}, Rew^{\sim}, Pen^{\sim}, t_v] \tag{2}
\]
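The request and offer tuples of eqs. (1) and (2) map naturally onto small record types. The Python sketch below is a hypothetical illustration, not the message format used by the authors: the class names, the `forward` helper and its `margin` and `slack` parameters are assumptions. It shows how a broker might lower the reward and tighten the response deadline before subcontracting.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ServiceRequest:
    ser: str    # requested service and its quality-of-service parameters
    rew: float  # reward the consumer is willing to pay
    pen: float  # penalty due by the provider if delivery fails
    t_r: float  # deadline up to which the initiator waits for a response


@dataclass(frozen=True)
class ServiceOffer:
    ser: str    # offered service, possibly close to but not equal to the request
    rew: float
    pen: float
    t_v: float  # time until which the offer remains valid


def forward(request, margin, slack):
    """Adapt a request before subcontracting it (hypothetical rule):
    keep a margin on the reward and shorten the response deadline so that
    the broker still has time to answer its own requester."""
    return replace(request, rew=request.rew - margin, t_r=request.t_r - slack)


sr = ServiceRequest("10 processors AMD Opteron 252", rew=100.0, pen=20.0, t_r=50.0)
sub = forward(sr, margin=10.0, slack=5.0)
print(sub.rew, sub.t_r)  # 90.0 45.0
```

How large the slack should be is exactly the deadline question studied in the simulations: too small and the broker may miss its own deadline, too large and subcontractors have little time to answer.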
An offer has the same structure as a request: it contains the reward, the penalty and a deadline $t_v$ defining the validity of the offer. This value indicates that the request initiator $i$ has to accept the offer before time $t_v$. Offer parameters can be slightly different from the request when the site cannot (or does not want to) provide exactly what the initiator asked for. The symbol ∼ means that the proposed value is somewhere near the requested value. The negotiation between a requester site $i$ and a partner site $j$ involves several rounds of request-offer messages; at each round a site makes some concession in the negotiated parameters. In our protocol, only the consumer side finally concludes the contract by sending an acceptance notification. As the consumer simultaneously asks several sites to provide offers, he may have many options to choose from. In the bilateral negotiation rounds, the smaller the concessions in the negotiated issues are, the lower the probability that the consumer accepts the offers. The parameter $t_r$, which we focus on in this paper, has a large influence on the potential offers the initiator $i$ will collect. The delegation protocol described above is close to the one described in [1], but with a fundamental difference: in our study, the sites at the two ends of a delegation chain are never put in relation, in the sense that they do not know each other. Each site considers the next site in the chain as the real provider of resources and has no information on whether the offer is subcontracted or not.
2.2 Management Policies
We define the internal working of a Grid site with a set of management policies responsible for decision making with regard to one or more aspects of the negotiation. Each policy is defined by some parameters that influence the way the negotiation is conducted. In this study, we structured the management policies as presented in figure 1.
Fig. 1. Management policies structure
The left part of figure 1 corresponds to the dialog with the potential consumers and the right part with the potential providers. The upper part of the figure represents the processing of an incoming service request arriving at the considered site while the lower part represents the path followed by offers. Figure 1 should be read starting from its upper left arrow, which symbolizes an incoming request. The first management policy to be executed on this request is the Delegation Policy. The Delegation Policy is responsible for deciding whether the incoming request should be delegated or not (or at least, should the site try to delegate). The output of this policy is a boolean result indicating the next policy that will be applied. If the answer is true, the site runs the Delegate Choice Policy. In
Response Deadline Evaluation in Point-to-Point Negotiation on Grids
19
the other case, the site will not delegate but can possibly create an offer based on local resources by running the Offer Formulation Policy. The Delegate Choice Policy represents the algorithm to choose among the sites in the acquaintance network Ai which will be considered as potential subcontractors. This policy is useful when the acquaintance network is large and the site does not possess enough resources to carry on many simultaneous negotiations. The site i might decide to filter the list Ai based on different factor such as reputation, specialization, etc. Next, site i invokes the Request Formulation Policy for the list of potential delegates selected by the Delegate Choice Policy. The Request Formulation Policy is responsible for building subcontracting requests for the sites selected by the previous applied policy. This policy is very important because it will define how aggressive and opportunistic will be the negotiation behavior. Together with the Offer Consideration Policy, the Request Formulation Policy represents the heart of the negotiation process with the subcontractors. The output of this policy is a list of requests that are forwarded to the potential subcontractors. The Offer Consideration Policy is executed every time an offer is received from a potential subcontractor. This offer is received in response to a previously forwarded request and arrives generally right before the time tr . The Offer Consideration Policy is the algorithm employed to estimate the value of the offer and choose whether to iterate and send a new request or to accept the received offer and send an acceptance notification. The Offer Formulation Policy is responsible for constructing offers based either on the reserved local resources or on the received offers from subcontractors. Again, the negotiation behavior might differ with various left-side partners. 
Related work has addressed the design of the policies responsible for the creation of new offers (Offer Formulation Policy) and the analysis of the received offers (Offer Consideration Policy); see, for instance, [9].
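The chain of policies described above can be sketched as follows. This is only an illustrative sketch and not the code of the simulator: the class names `Request` and `Offer`, the function `handle_request`, and every rule inside the policies (round-trip test, unfiltered delegate list, lowest price wins) are assumptions chosen to keep the example short.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Request:
    reward: float  # reward proposed by the requester
    t_r: float     # response deadline of the request


@dataclass
class Offer:
    provider: str
    price: float


def delegation_policy(request: Request, now: float, delta: float) -> bool:
    # Hypothetical rule: delegate only if one request/offer round trip fits.
    return request.t_r - now >= 2 * delta


def delegate_choice_policy(acquaintances: List[str]) -> List[str]:
    # Hypothetical filter: keep every site of the acquaintance network.
    return list(acquaintances)


def request_formulation_policy(request: Request, delegates: List[str]) -> Dict[str, Request]:
    # Hypothetical behavior: forward the request unchanged to each delegate.
    return {site: request for site in delegates}


def offer_consideration_policy(offers: List[Offer]) -> Optional[Offer]:
    # Hypothetical utility: the lowest price wins.
    return min(offers, key=lambda o: o.price) if offers else None


def offer_formulation_policy(local: Optional[Offer], remote: Optional[Offer]) -> Optional[Offer]:
    # Build the outgoing offer from local resources or from a subcontractor offer.
    candidates = [o for o in (local, remote) if o is not None]
    return min(candidates, key=lambda o: o.price) if candidates else None


def handle_request(request, now, delta, acquaintances, local, received):
    if delegation_policy(request, now, delta):
        delegates = delegate_choice_policy(acquaintances)
        forwarded = request_formulation_policy(request, delegates)
        # In a simulator the offers arrive asynchronously; here they are given.
        answers = [o for o in received if o.provider in forwarded]
        best = offer_consideration_policy(answers)
        return offer_formulation_policy(local, best)
    return offer_formulation_policy(local, None)


local = Offer("A", 7.0)
received = [Offer("B", 8.0), Offer("C", 5.0)]
# Enough time for a round trip: the request is delegated.
delegated = handle_request(Request(10.0, 10.0), 0.0, 1.0, ["B", "C"], local, received)
# Deadline too close: only local resources are considered.
local_only = handle_request(Request(10.0, 1.0), 0.0, 1.0, ["B", "C"], local, received)
```

With these assumed rules, the first call returns the offer of site C and the second call returns the local offer of site A.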
3 Simulations: Managing the Response Deadlines

This section describes the simulator and the performed simulations: the hypotheses, the results and their interpretation.

3.1 Simulator
We performed different simulations using a Java event-based simulator developed to test the negotiation model described in section 2. The objective of this tool is not to simulate the exchange of packets as in a real TCP/IP network, but to visualize at a higher level the transmission and the reception of messages between nodes. The simulator also abstracts the services/resources: it assumes that each site is endowed with some resources, which can be reserved for a given time frame.
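To make the level of abstraction concrete, the following minimal sketch shows an event-based message loop and a site whose resources can be reserved for a time frame. It is written in Python for brevity and is not the Java simulator itself; the names `EventSimulator`, `Site`, `send`, `run` and `reserve`, as well as the fixed transmission delay, are assumptions made for illustration.

```python
import heapq
import itertools


class EventSimulator:
    """Delivers messages between named nodes after a fixed delay."""

    def __init__(self, delta):
        self.now = 0.0
        self.delta = delta            # assumed constant transmission time
        self._queue = []              # heap of (arrival, seq, sender, receiver, kind)
        self._seq = itertools.count() # tie-breaker for simultaneous events
        self.log = []

    def send(self, sender, receiver, kind):
        arrival = self.now + self.delta
        heapq.heappush(self._queue, (arrival, next(self._seq), sender, receiver, kind))

    def run(self, handlers, until):
        # Process events in time order until the queue is empty or time is up.
        while self._queue and self._queue[0][0] <= until:
            self.now, _, sender, receiver, kind = heapq.heappop(self._queue)
            self.log.append((self.now, sender, receiver, kind))
            handler = handlers.get(receiver)
            if handler is not None:
                handler(self, sender, receiver, kind)


class Site:
    """A site with a resource capacity that can be reserved for a time frame."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.reservations = []  # list of (amount, start, end)

    def reserve(self, amount, start, end):
        # Conservative check: sum every reservation overlapping [start, end).
        used = sum(a for a, s, e in self.reservations if s < end and start < e)
        if used + amount > self.capacity:
            return False
        self.reservations.append((amount, start, end))
        return True


def provider(sim, sender, receiver, kind):
    # Hypothetical behavior: answer every request with an offer.
    if kind == "request":
        sim.send(receiver, sender, "offer")


sim = EventSimulator(delta=1.0)
sim.send("consumer", "site", "request")
sim.run({"site": provider}, until=10.0)
```

In this run the log contains the request received by the site at time 1.0 and the offer received by the consumer at time 2.0.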
S. Noël, P. Manneback, and G.C. Silaghi
Fig. 2. Graphical user interface of the Java negotiation simulator
The simulator allows the designer to:
– evaluate different policy configurations and observe their influence on the negotiation process;
– measure indicators at a global scale in the whole simulated Grid environment (e.g. global user satisfaction, resource utilization);
– analyze some local values at a given site (e.g. number of contracts, local profit).

In comparison to other existing simulators such as GridSim [2] and SimGrid [3], our simulator has some advantages. It offers a visual representation of the network during the execution of the simulation. As presented in figure 2, it is possible to observe and follow the message exchanges such as requests, offers and acceptance notifications. While visualizing the Grid environment, it is possible to interrupt the simulation and to go backward and forward in order to determine why a given event occurred. Moreover, one can adapt the policy parameters dynamically at runtime and immediately observe the result in the visualization window and in the reports. To allow the execution of simulations with different sets of parameters on a cluster or on a Grid, the graphical user interface can be disabled so that the simulator runs as a batch job. This simulator was used to test different negotiation policies and, more precisely, the response deadlines that have to be set when delegating requests.
3.2 Simulation Objective
For all simulations, we focus on a given site and observe its profit at the end of the simulation time frame, in various configurations of the Grid environment. We discuss the value of the response deadline that is set when subcontracting requests to other sites considered as potential providers.
Fig. 3. Delegation of a request arriving at a considered site
Figure 3 presents an incoming service request from a consumer. This request can be forwarded to one or more provider sites. In what follows, $t_r$ denotes the response deadline of the incoming request, and the response deadline set in the forwarded request is written $t'_r$ to distinguish it from $t_r$. We want to find the best value for $t'_r$ in order to optimize the local profit of the considered site. This value has to be carefully selected because it influences the potential offers and the number of sites that may be contacted. The Request Formulation Policy is responsible for formulating the requests SR that will be forwarded to the potential subcontractors in the acquaintance network $A_i$. This policy has to set a value for $t'_r$ before forwarding the request: this value specifies the deadline for the provider to send an offer. In the policy used for these simulations, we introduce a configuration parameter named $t_r^{\mathrm{Policy}}$, which influences the way $t'_r$ is set. $t_r^{\mathrm{Policy}}$ is bounded as follows:

$$0 \le t_r^{\mathrm{Policy}} \le 1$$

Based on a defined value of $t_r^{\mathrm{Policy}}$, $t'_r$ is calculated as in eq. (3):

$$t'_r = t_a + \delta + t_r^{\mathrm{Policy}} \, (t_r - t_a - \delta) \qquad (3)$$

where $t_a$ is the moment of arrival of the request (assumed equal to the time at which the Request Formulation Policy is executed) and $\delta$ is the time needed to transmit a message in the network.

When $t_r^{\mathrm{Policy}} \approx 0$, $t'_r$ is close to the current time $t_a$: the considered site does not give providers much time to send an offer. Thus, the request SR has no time to go “deep” into the Grid: it is received only by sites included in the acquaintance network $A_i$. In this case, the site adopts a strategy that keeps a long time for negotiation with the direct subcontractors.
Table 1. Subcontracting characteristics for various levels of $t_r^{\mathrm{Policy}}$

$t_r^{\mathrm{Policy}}$    deadline of the forwarded request $t'_r$    propagation        time for negotiation
$\approx 0$                $(t'_r)_{\min} \approx t_a$                 sites $\in A_i$    long
$\approx 1$                $(t'_r)_{\max} \approx t_r$                 far                short
When $t_r^{\mathrm{Policy}} \approx 1$, the deadline $t'_r$ set in the forwarded request is as late as possible, that is, close to the deadline $t_r$ of the incoming request. The considered site decides to leave much time for the direct providers in $A_i$ to subcontract and negotiate with their own providers. Only a short time is kept for negotiation with the providers in $A_i$.
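The deadline computation of eq. (3) and its two limit cases can be checked with a few lines of code. This is a minimal sketch; the function name `forwarded_deadline` and the numerical values are assumptions chosen only for illustration.

```python
def forwarded_deadline(t_a, t_r, delta, policy):
    """Deadline set in the forwarded request, following eq. (3).

    t_a    -- arrival time of the incoming request
    t_r    -- response deadline of the incoming request
    delta  -- time needed to transmit one message
    policy -- the policy parameter, between 0 and 1
    """
    if not 0.0 <= policy <= 1.0:
        raise ValueError("the policy parameter must lie in [0, 1]")
    return t_a + delta + policy * (t_r - t_a - delta)


# Illustrative values (assumed): request arrives at 100, must be answered by 200.
earliest = forwarded_deadline(100.0, 200.0, 2.0, 0.0)  # 102.0: long local negotiation
middle = forwarded_deadline(100.0, 200.0, 2.0, 0.5)    # 151.0
latest = forwarded_deadline(100.0, 200.0, 2.0, 1.0)    # 200.0: far propagation
```

The time left for negotiating with the direct subcontractors is the difference between the incoming deadline and the returned value, which shrinks from 98 to 0 time units as the policy parameter goes from 0 to 1 in this example.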
3.3 Simulation Configuration
In this part of the paper, we describe the experimental configuration, including the simulation environment and the internal parameters of the various sites, which define how they behave during the negotiation process.

First, we present the configuration of the global environment. Each measure is obtained from one experiment; for the same configuration, we run each experiment 30 times with a different random seed to get an average value of each indicator. The duration of a simulation experiment is 1000 time units; we interrupt the generation of new requests at time 800 so that sites are stable at the end of the simulation time frame.

The Grid is composed of 10 sites; they represent big centers or companies offering a set of services on the Internet. Considering the French Grid5000 infrastructure, which is composed of 9 different sites, we believe that 10 sites is a realistic value for our simulations. The network topology is randomly generated, each grid site being randomly connected with the rest of the sites in the network. Each site is endowed with 0 to 50 units of a resource. There are 5 request sources (that is, user groups), randomly distributed among sites, that initiate service requests. The global load factor (the total amount of requested services relative to the total resource quantity in the network) is established randomly a priori. All randomly generated numbers are uniformly distributed.

We assume that no actor will successfully acknowledge a request before time $t_r$; there is no interest for the providers to answer before $t_r$. Providers try to get the best possible resource allocation; therefore, they will postpone the decision until the deadline. When formulating an offer, the reward estimation is made using the average value between the rewards in the last request and in the last offer of the negotiation session. More sophisticated negotiation strategies could also be defined, such as the one presented in [8]. In the offer, the generating site does not change the service levels Ser, and each site has full information about the market price of a given resource.
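The configuration above can be summarized in a short sketch. The function names `estimate_reward`, `make_environment` and `average_indicator` are hypothetical, and two details are assumptions rather than statements of the paper: resource endowments are drawn as integers, and several request sources may fall on the same site.

```python
import random
import statistics


def estimate_reward(last_request_reward, last_offer_reward):
    # Average of the rewards in the last request and in the last offer.
    return (last_request_reward + last_offer_reward) / 2.0


def make_environment(seed, n_sites=10, max_resource=50, n_sources=5):
    rng = random.Random(seed)
    # Each site receives between 0 and 50 resource units (uniform, integer assumed).
    resources = [rng.randint(0, max_resource) for _ in range(n_sites)]
    # Each request source is attached to a uniformly chosen site (assumption).
    sources = [rng.randrange(n_sites) for _ in range(n_sources)]
    return resources, sources


def average_indicator(run_experiment, n_runs=30):
    # Run the same configuration with 30 different seeds and average the result.
    return statistics.mean(run_experiment(seed) for seed in range(n_runs))


resources, sources = make_environment(seed=1)
# A placeholder experiment that returns the total resource quantity of the Grid.
mean_total = average_indicator(lambda seed: float(sum(make_environment(seed)[0])))
```

Here `run_experiment` stands for one full simulation run of 1000 time units; the lambda used above is only a placeholder that makes the sketch executable.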
Fig. 4. Evolution of the profit for different values of the acquaintance network density
The various policies presented in figure 1 are configured as follows:
– The Delegation Policy only decides whether there is sufficient time to send and receive at least one request/offer pair. If the answer is yes, the request is forwarded to other sites in the acquaintance network.
– The Delegate Choice Policy selects all sites in the acquaintance network as potential subcontractors.
– The Offer Consideration Policy only considers the price to evaluate the utility of the offer. Each site will select the best price among the received offers.

3.4 Simulation Results
The objective of the first simulation is to observe the evolution of the profit for different values of the density of the acquaintance network $A_i$. The maximum density is reached when each site is connected with all the remaining sites in the grid. We generate a different number of links between sites to change the density of $A_i$: the curve denoted $i-j$ means that we randomly generate $n$ links between sites, where $i \le n \le j$. Figure 4 presents the set of results obtained for this experiment. Using this graph and having an estimate of the density of the acquaintance network, a site administrator can identify the best range of $t_r^{\mathrm{Policy}}$ values leading to the highest local profit.

This graph reveals some interesting results. We can observe that for a low network density, the considered site obtains a higher profit using a high value of $t_r^{\mathrm{Policy}}$. For a low density (1-2 or 2-3), the considered site is linked to a low number of other sites. Therefore, by increasing $t_r^{\mathrm{Policy}}$, it expects that the request will propagate far into the Grid and reach many other sites. More potential offers are generated, leading to a higher profit.
Fig. 5. Evolution of the profit for different values of $t_r^{\mathrm{Policy}}$ in sites $\in A_i$
For a high network density, the highest profit is obtained with a low value of $t_r^{\mathrm{Policy}}$. In this configuration, the considered site is linked to many other sites. There is no interest in letting the request go “far” into the Grid; time should be saved for direct negotiation with the providers in $A_i$.

We observed a local minimum in the curves for high network densities, obtained with $t_r^{\mathrm{Policy}}$ equal to 0.7. This particular value depends on the $t_r^{\mathrm{Policy}}$ used by other sites in the simulated network. In this case, the other sites in the neighborhood of the considered site send requests with a lower response deadline, leading them to win most of the contracts.

In the second simulation, we observe the evolution of the profit when the surrounding sites use different values of their $t_r^{\mathrm{Policy}}$ policy parameter. We analyze the best $t_r^{\mathrm{Policy}}$ values leading to the highest local profit. The results are presented in figure 5. The x-axis shows the $t_r^{\mathrm{Policy}}$ value of the considered site, while each line is drawn for a given average of the $t_r^{\mathrm{Policy}}$ value of the remaining sites in the grid.

The graph corroborates two intuitive ideas. First, when the sites in $A_i$ use a low $t_r^{\mathrm{Policy}}$ value (for instance 0.3 or 0.4), the considered site should use a low $t_r^{\mathrm{Policy}}$ as well to get a higher profit. This can be explained easily. A high value of this parameter means that the considered site attempts to contact more sites in the Grid. If, at the same time, direct subcontractors in $A_i$ are using a low $t_r^{\mathrm{Policy}}$, they block the forwarding of the request and only a small number of sites are finally reached. In that case, the considered site would have obtained higher profits by using a low $t_r^{\mathrm{Policy}}$ and prioritizing negotiation over propagation.

Second, in the opposite case, when the sites in $A_i$ are using a high $t_r^{\mathrm{Policy}}$ value (for instance 0.8 or 0.9), the considered site should use a high $t_r^{\mathrm{Policy}}$ as well.
The number of sites potentially reached using this strategy has to be estimated by the site administrator, taking into account the observations made in the first simulation.
We also observe that when the sites in $A_i$ are using a $t_r^{\mathrm{Policy}} < 0.3$, the considered site is not able to make any profit, whatever its own $t_r^{\mathrm{Policy}}$ value is.
4 Related Work
Grid research has seen many attempts to build simulators for various sorts of experiments. SimGrid [3] and GridSim [2] represent the most important attempts. GridSim focuses on a commercial environment with priced services and allows the researcher to study the economics of the grid. SimGrid, on the other hand, is a more general message-exchange simulator that allows one to generate and simulate a broader range of grids and distributed environments. In our context, we needed the ability to visualize the Grid during the negotiation and to go back and forth in the simulated time in order to observe and understand the decisions taken by the sites. These features, which are now offered by our simulation tool, were not offered by GridSim and SimGrid.

Subcontracting is an old conceptual problem in distributed computing. The Contract Net Protocol [12], which defines a message-exchange protocol to distribute tasks on multiple execution nodes, represents the basis for many studies concerning delegation. With respect to the grid, we can cite the work of Bustos-Jimenez et al. [1], which studies delegation in a peer-to-peer grid. However, in comparison to our study, this work does not tackle the commercial aspects and the negotiation process. Other studies concern resource discovery in a peer-to-peer grid [5,6,13]. Opportunistic strategies regarding bilateral negotiation in the grid are presented in [7,8] and are of high interest for defining the site behaviour in our study.
5 Conclusion
In this paper we focused on a negotiation model adapted for delegating services in a commercial Grid. This model is based on a point-to-point message exchange between the different peers of the Grid and delivers more flexibility and business opportunities for the Grid participants. We developed a Java event-based simulator for our delegation model. This simulator is a useful tool for the site administrator to test different behavior policies before their implementation in the real environment. We used the simulator to evaluate how the response deadlines influence the local profits. The response deadline of the request/offer messages is one important parameter that influences the overall negotiation.

The simulations have been conducted in various Grid configurations: for different density levels of the network and different values of the $t_r^{\mathrm{Policy}}$ in the setup of the neighboring sites. We concluded that the considered site should use a high $t_r^{\mathrm{Policy}}$ value for a low-density network, and conversely a low value for a high-density network. Moreover, we showed that this value should be adapted according to the $t_r^{\mathrm{Policy}}$ of other participants in the Grid.
Further, we plan to use the simulator to highlight other important parameters in the negotiation, such as the rewards and the penalties, resource aggregation and bundling, various algorithms for decision making at local sites, and the interaction between these parameters. Moreover, we have to observe the evolution of negotiation policies when increasing the number of participants in the Grid. A method and tools to estimate the network density and the $t_r^{\mathrm{Policy}}$ of the other sites in the Grid environment have to be defined and developed and will be the object of future work. Finally, the influence of the network density and the configuration of surrounding sites has been studied separately; as a future step, it would be interesting to consider the interaction of these parameters with each other.

Acknowledgements. G.C. Silaghi acknowledges support from the Romanian Authority for Scientific Research under project IDEI 2452.
References

1. Bustos-Jimenez, J., Varas, C., Piquer, J.: Sub-contracts: delegating contracts for resource discovery. In: Talia, D., Yahyapour, R., Ziegler, W. (eds.) Grid Middleware and Services. Springer, Heidelberg (2008)
2. Buyya, R., Murshed, M.: GridSim: A Toolkit for the Modeling and Simulation of Distributed Resource Management and Scheduling for Grid Computing. In: Concurrency and Computation: Practice and Experience, vol. 14(13-15), pp. 1175–1220. Wiley Press, USA (2002)
3. Casanova, H., Legrand, A., Quinson, M.: SimGrid: a Generic Framework for Large-Scale Distributed Experimentations. In: The 10th IEEE Intl. Conf. on Computer Modelling and Simulation (UKSIM/EUROSIM 2008) (2008)
4. Foster, I., Iamnitchi, A.: On Death, Taxes, and the Convergence of Peer-to-Peer and Grid Computing. In: Kaashoek, M.F., Stoica, I. (eds.) IPTPS 2003. LNCS, vol. 2735, pp. 118–128. Springer, Heidelberg (2003)
5. Iamnitchi, A., Foster, I.: A peer-to-peer approach to resource location in Grid environments. In: The 11th IEEE Intl. Symposium on High Performance Distributed Computing (2002)
6. Iamnitchi, A., Foster, I., Nurmi, D.C.: A peer-to-peer approach to resource discovery in Grid environments. In: High Performance Distributed Computing (2002)
7. Li, J., Sim, K.M., Yahyapour, R.: Negotiation Strategies Considering Opportunity Functions for Grid Scheduling. In: Kermarrec, A.-M., Bougé, L., Priol, T. (eds.) Euro-Par 2007. LNCS, vol. 4641, pp. 447–456. Springer, Heidelberg (2007)
8. Li, J., Yahyapour, R.: A strategic negotiation model for Grid scheduling. ITSSA 1(4), 411–420 (2006)
9. Lin, R., Kraus, S., Wilkenfeld, J., Barry, J.: Negotiating with bounded rational agents in environments with incomplete information using an automated agent. Artificial Intelligence 172(6-7), 823–851 (2008)
10. Nguyen, T.D., Jennings, N.R.: A heuristic model for concurrent bi-lateral negotiations in incomplete information settings. In: Intl. Joint Conf. on Artificial Intelligence, Mexico, August 9-15 (2003)
11. Noël, S., Manneback, P., Silaghi, G.C., Wäldrich, O.: Delegation in service access negotiation. In: Proc. of the CoreGRID Integration Workshop, Greece (April 2008)
12. Smith, R.G.: The Contract Net Protocol: High-Level Communication and Control in a Distributed Problem Solver. IEEE Trans. Comput. C-29, 1104–1113 (1980)
13. Trunfio, P., Talia, D., Papadakis, H., Fragopoulou, P., Mordacchini, M., Pennanen, M., Popov, K., Vlassov, V., Haridi, S.: Peer-to-Peer resource discovery in Grids: Models and systems. FGCS 23(7), 864–878 (2007)
14. Andrieux, A., Czajkowski, K., Dan, A., Keahey, K., Ludwig, H., Nakata, T., Pruyne, J., Rofrano, J., Tuecke, S., Xu, M.: Web Service Agreement Specification (WS-Agreement), March 14 (2007)
A Framework for Analyzing the Economics of a Market for Grid Services

Robin Mason¹, Costas Courcoubetis², and Natalia Miliou²

¹ University of Southampton, Highfield, Southampton, United Kingdom
² Athens University of Economics and Business, Athens, Greece
Abstract. This paper provides a single broad model for the analysis of a range of issues underlying a market for Grid services. The demand and the supply sides of such a market are treated separately and the relation between the two sides is studied. We provide numerical results in order to derive conclusions about the viability of a market for Grid services. Underlying our model are parameters such as the cost technologies, the random processes driving demand and supply, and the size of the market. We study the effect of the model’s parameters, such as risk aversion or the durability of resources, on the system’s behavior, e.g. on the clearing price or the volume of trade.

Keywords: Grid economics, Grid services, Grid market, spot market, market failure, competitive equilibrium.
1 Introduction
The remarkable evolution of computational power, as well as the growth of the Internet, has changed the approach of research on computational resources. Namely, computational resources have become less expensive and storage capabilities have increased dramatically. Clustering geographically remote computational resources and using them as one unified resource has become not only feasible but also economically beneficial. In this work we treat the question of how computational resources should be traded. We demonstrate the benefits of trade to both sellers and buyers of Grid services, and we examine how these benefits arise in a particular trading mechanism. We analyse a competitive market model, with no market power on either the buyer or provider side. We opt for this approach for two main reasons: it employs a mechanism that is simple yet effective, with desirable economic properties, and it is an extension of the standard stock market mechanism for storable commodities. We view this as the limit of a double auction in which the number of traders becomes large.¹ The properties of double auctions in static settings have been
Work partially supported by the EU-funded research project GRIDECON, FP6 2005 - IST5 - 033634.
¹ This double auction model has been adopted in the market mechanism developed in the project GRIDECON.
J. Altmann, R. Buyya, and O.F. Rana (Eds.): GECON 2009, LNCS 5745, pp. 28–45, 2009. © Springer-Verlag Berlin Heidelberg 2009
investigated both theoretically and empirically. The theory of double auctions (see e.g. [2]) shows that every non-trivial equilibrium of a double auction converges to the competitive market outcome as the number of auction participants grows. Moreover, the rate of convergence is fast, namely of the order of $1/N^{2-\alpha}$ for any $\alpha > 0$, where $N$ is the number of auction participants. Satterthwaite and Williams [6] have shown that the worst-case speed of convergence of the double auction is as high as for any other type of auction. This work strengthens earlier results of e.g. [5]. The experimental evidence backs up these theoretical results. Even from early work, such as [1] and [10], experiments have found that double auctions yield outcomes that are close to fully efficient, even with a small number of human participants. In [3] it was shown that the same is true with “zero intelligence” traders. As explained in [4], the rules of the double auction are such that only a small number of participants is needed to get (very close) to the competitive outcome. Work by Cliff [7] has challenged this conclusion. For example, he shows that, under some circumstances, agents who are less than fully rational can fail to get close to equilibrium in a double auction. But note that this does not necessarily imply that the efficiency of a double auction is low (since a high degree of efficiency just requires that high value buyers are generally matched with low value sellers). Moreover, recent work (by e.g. Preist and van Tol in [8]) extending Cliff’s analysis suggests that the rate at which actual play approaches the competitive outcome is quite high. See Parsons et al. in [9] for a very good summary of the literature. Some complementary work is in [11]. There, the authors analyze the use of real options, traded in an additional contract market, to efficiently manage economic issues arising from the realization of a flexible resource reservation scheme. Necessary conditions are derived under which even risk neutral agents have incentives to participate in such a market, as it increases their expected utility.

In this work, we develop a framework for analysing the economic properties of the competitive equilibrium of a market for Grid services. The objectives of the framework are mainly to provide a single broad model, instead of a set of separate models, within which a range of issues can be assessed, and to look separately at the demand and supply sides of the market, while recognising the relation between the two sides. This framework also permits validation of the model, by using numerical and simulation methods to assess its broad features, and the derivation of a set of conclusions about the viability of a market for Grid services (both spot and forward), based on the underlying parameters of the analysis: for example, the cost technologies, the random processes driving demand and supply, and the size of the market (number of participants).

In section 2 the competitive market model and our basic assumptions are discussed. The definition of our model, as well as the parameters studied in this paper, is given in section 3. A general solution for the competitive equilibrium with a Grid spot market is presented in section 4. The simulation environment in which our experiments are
conducted is described in section 5. Finally, the situation in which the market fails to satisfy the demand for computational resources is analyzed in section 6.
2 Our Approach
The starting point of economic analysis is usually a competitive market model, with no market power on either the buyer or provider side. In this framework, the price and quantity of Grid services are determined by competitive equilibrium in which demand and supply are equal. In order to analyse this equilibrium outcome, we must first model explicitly the demand and supply sides of the market for Grid services. We shall do so by considering the dynamics of resource allocation for both users and providers of Grid services. On the demand side, each user is subject to random arrival of jobs that need to be completed. These jobs might also be of different sizes or durations. Each user has to decide whether to use the Grid or buy its own resources. Economies of scale, through e.g., multiplexing, mean that the Grid may have lower costs; but uncertainty over total demand and supply means that the user cannot be certain of getting adequate resources at a given price from the Grid. The user’s decision optimally trades off these factors. The aggregate of different users’ decisions determines at any time the total demand for Grid resources. The supply side is modelled similarly. Each provider of Grid resources has its own capacity for computational tasks; and faces random usage patterns for that capacity. It releases capacity to the Grid whenever its own demand for resources is less than its own capacity. The Grid then offers a way in which a provider of computational resources can recover the costs of capacity investment, in the event that its own demand falls short of capacity. The aggregate of different providers’ decisions determines at any time the total supply of Grid resources. Hence, the demand and supply sides of the model are related in a two-stage procedure. In the first stage, the economic agents choose the amount of computational resources to buy for themselves. In the second stage, each agent observes its own idiosyncratic demand for computational resources. 
Those with too little capacity buy resources from the Grid, forming the demand side of the model. Those with too much capacity offer resources to the Grid: the supply side. Differences between the agents (primarily heterogeneity in the size and duration of the jobs) mean that some agents buy very little capacity and are systematically on the demand side of the market. Under this approach, it is assumed that there is a single homogeneous good that is to be sold through the market. Hence, no differentiation among resources is feasible. Such a commodity market assumes a single market price for a unit of the good sold. There are clear indications that current virtualization technologies justify such a commoditization assumption; for instance, the Amazon EC2 cloud sells virtual machines of similar specifications. It is further assumed that all market actors, both consumers and producers, have a small fraction of the total market share. This way, they cannot affect the market price by means of
a strategy; thus the market is perfectly competitive and all agents are price-takers. It is clear that this assumption does not hold in the existing oligopolistic commercial Grid environment, where a few large providers dominate the market. The first benchmark to analyse is the sequence of competitive spot markets, where demand and supply are equalised in each period. The spot market is analysed under a number of different scenarios concerning the stochastic processes determining the demand and supply sides of the market.

In order to analyse the use of forward and options markets, we shall measure the extent to which Grid market participants are willing to pay to avoid the risk attached to an uncertain Grid market price. This is one key source of uncertainty facing Grid participants. (The other is their own demand for computational services; we do not consider insurance against this sort of risk. In a sense, the Grid market performs this role, since it allows users to buy and sell resources in the event that their demand exceeds or falls short of their own supply of computational resources.) Risk-averse users, faced with uncertainty about demand and supply conditions, will want to buy insurance against that uncertainty. With a very large number of users (in the perfectly competitive limit, a continuum), there is no aggregate uncertainty. But for each user and provider, of course, there would be uncertainty in each time period. The most direct analysis is to calculate the risk premium that users are willing to pay to insure themselves entirely against market fluctuations. The aggregate risk premium gives a measure of the potential for forward and options markets for Grid services to develop. This risk premium can be calculated numerically under a number of different scenarios, using simulations of the competitive market outcome over time.
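As a rough illustration of such a numerical calculation, the risk premium of one user can be obtained as the difference between the expected outcome and its certainty equivalent. The sketch below assumes a constant relative risk aversion (CRRA) utility function and a small set of equally likely outcomes; neither the utility form nor the function names `crra_utility`, `crra_inverse` and `risk_premium` come from the model itself.

```python
import math


def crra_utility(w, gamma):
    # Assumed CRRA utility; gamma is the coefficient of relative risk aversion.
    if gamma == 1.0:
        return math.log(w)
    return (w ** (1.0 - gamma) - 1.0) / (1.0 - gamma)


def crra_inverse(u, gamma):
    # Inverse of crra_utility, used to recover the certainty equivalent.
    if gamma == 1.0:
        return math.exp(u)
    return (u * (1.0 - gamma) + 1.0) ** (1.0 / (1.0 - gamma))


def risk_premium(outcomes, gamma):
    """Expected outcome minus certainty equivalent over equally likely outcomes."""
    mean_outcome = sum(outcomes) / len(outcomes)
    expected_utility = sum(crra_utility(w, gamma) for w in outcomes) / len(outcomes)
    return mean_outcome - crra_inverse(expected_utility, gamma)


# Hypothetical net surpluses of a user under a low and a high Grid price.
outcomes = [50.0, 150.0]
neutral = risk_premium(outcomes, 0.0)   # a risk-neutral user pays no premium
averse = risk_premium(outcomes, 2.0)    # a risk-averse user pays a positive premium
```

In a simulation of the market, the list of outcomes would be replaced by the surpluses realized over the simulated periods, and the premiums of all users would be summed to obtain the aggregate risk premium.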
3 The Model
There are $N$ agents, indexed by $i$. There are an infinite number of discrete time periods, indexed by $t$. Each time period $t$ is divided into (up to) three stages. During the first stage the amount of resources to be bought is determined: agent $i$ chooses to buy an amount $x_i(t) \in \mathbb{R}_+$ of computational resource. The cost of buying an amount $x$ of this resource is $c(x)$. We make the following general assumptions about the cost function: buying nothing costs nothing, $c(0) = 0$; the variable cost of acquiring the resource is increasing and convex in the resource amount, $c'(\cdot) > 0$, $c''(\cdot) > 0$ for $x > 0$. Taken together, these features mean that (a) agents can rationally choose not to buy any resource; (b) demand for resources is bounded (due to the convexity of variable costs). $c(\cdot)$ is a key (functional) parameter of the model. There are a number of modelling options for how resources evolve over time. One extreme is that a resource is non-durable, so that each agent starts each period with a zero stock of the resource. The opposite extreme is that the resource is infinitely durable; each agent therefore accumulates the resource over time, with no loss or degradation of the resource. In this case, the resource available to agent $i$ in period $t$ is $X_i(t) = X_i(t-1) + x_i(t)$. In between these two extremes, the resource may be durable but suffer from depreciation
(e.g., technical obsolescence). We model this by using the depreciation coefficient 1 − β, where 0 ≤ β ≤ 1. Hence, the resource available to agent i in period t becomes Xi (t) = βXi (t − 1) + xi (t). The second stage is initiated when the demand is being realized. Agent i’s demand parameter yi (t) ≥ 0 is a random variable at the time that it chooses its resource level. The value of the parameter is realized after the resource level is chosen, but before the Grid market operates. A key modelling question is the process that yi follows. One representation is yi (t) = αi yi (t − 1) + it where yi (0) is a parameter; and it is drawn from a distribution with zero mean (e.g., a normal), subject to yi ≥ 0. In this first-order autoregressive process, αi is the degree of persistence in the process for {yi (t)}. If αi = 0, then {yi (t)} is a random walk. The third stage is the stage of the actual operation of the Grid market. Agents participate in the Grid, according to their resource levels and demand realizations. Potentially, an agent’s maximization problem is very complex: choices in a given period can depend on choices made in the previous period; the current period’s environment (e.g., realisations of demands); and the evolution of the system in the future. It would be extremely difficult to solve this system fully. Instead, we make the simplifying assumption that agents are ‘myopic’, solving in each period a static maximization problem subject only to the constraints generated by previous choices. In particular, the effect of current choices on future maximization problems is ignored. In this case, agent i’s maximization problem in period t is (1) max yi (t)u(z) − q(t)(z − (βXi (t − 1) + xi (t))) . z≥0
u(·) is a utility function that is common to all agents and z models the aggregate need in computational resources; all heterogeneity is therefore contained within the realizations of {yi }. u(·) is assumed to be (weakly) increasing: for all agents, a greater use of resources cannot decrease utility. q(t) is the unit resource price that prevails in the Grid market at time t. An important modelling question is the functional form of the utility function. Since we shall be interested in considering forward markets, which effectively provide insurance for agents, risk aversion will be a key component of the model. For this, we need concavity of the utility function.
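The two state equations of the model, the depreciating stock $X_i(t) = \beta X_i(t-1) + x_i(t)$ and the autoregressive demand parameter, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the shock is taken to be normal with a hypothetical standard deviation `sigma`, the constraint $y_i \ge 0$ is imposed by clipping, and the purchase is held constant rather than chosen optimally. The function names are illustrative and do not come from the paper.

```python
import random


def next_stock(prev_stock, purchase, beta):
    """Resource available this period: X_i(t) = beta * X_i(t-1) + x_i(t)."""
    return beta * prev_stock + purchase


def next_demand(prev_demand, alpha, sigma, rng):
    """One AR(1) step y_i(t) = alpha * y_i(t-1) + eps, clipped at zero.

    Assumption: eps ~ Normal(0, sigma); clipping enforces y_i >= 0.
    """
    return max(0.0, alpha * prev_demand + rng.gauss(0.0, sigma))


def simulate(periods, beta, alpha, purchase, y0, sigma, seed=0):
    """Return a list of (stock, demand) pairs for a constant purchase."""
    rng = random.Random(seed)
    stock, demand = 0.0, y0
    path = []
    for _ in range(periods):
        stock = next_stock(stock, purchase, beta)
        demand = next_demand(demand, alpha, sigma, rng)
        path.append((stock, demand))
    return path
```

With `beta = 0` the stock equals the current purchase (the non-durable case), with `beta = 1` it accumulates without loss, and for intermediate values a constant purchase `x` drives the stock towards `x / (1 - beta)`.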
4 Solving for Competitive Equilibrium with a Grid Spot Market
In a competitive Grid spot market, a market-clearing condition must hold:
$$\sum_i z_i^* = \sum_i X_i(t), \tag{2}$$
where $z_i^*$ is agent $i$'s optimal choice of $z$. Note that this market-clearing condition will indeed hold with equality. There cannot be more (instantaneous) demand
A Framework for Analyzing the Economics of a Market for Grid Services
than there is supply: that is, it is not possible for $\sum_i z_i^* > \sum_i X_i(t)$. If there were excess supply, i.e. if $\sum_i z_i^* < \sum_i X_i(t)$, then this would imply that the price of Grid resources, $q(t)$, would be zero. Then there would be no cost to increasing use of the resource (see equation (1)); consequently, each agent would raise its $z_i$ until the market-clearing condition (2) holds with equality. Each agent $i$ perceives that the Grid price $q(t)$ is unaffected by its choice of resource $x_i(t)$ and resource consumption $z_i$. (This follows from the competitive market environment that we consider.) In this case, the first-order condition characterizing the optimal consumption of resources is
$$z_i^*(q(t); y_i(t)) = \begin{cases} u'^{-1}\!\left(\dfrac{q(t)}{y_i(t)}\right) & y_i(t) > 0, \\ 0 & y_i(t) = 0. \end{cases} \tag{3}$$
Note, therefore, that agent $i$'s optimal consumption of the resource is independent of its own resource level $x_i$. $q(t)$ satisfies
$$\sum_i z_i^*(q(t); y_i(t)) = \sum_i X_i(t), \tag{4}$$
and hence is a function of the vector of demand realizations $y(t)$ and the total resource level $X(t) \equiv \sum_i X_i(t)$. Denote the market-clearing Grid price as $q(y(t), X(t))$ to emphasize this dependence. Suppose that agent $i$ solves the following optimization problem when choosing its resource level:
$$\max_x \; E_y\Bigl[ y_i(t)\,u\bigl(z_i^*(q(y(t),X(t)))\bigr) - q(y(t),X(t))\bigl(z_i^*(q(y(t),X(t))) - (\beta X_i(t-1) + x)\bigr) \Bigr] - c(x). \tag{5}$$
Consistent with the competitive market setting, agent $i$ does not consider the effect of its choice of $x_i$ on $q$, since $X(t)$ is much larger than $x$. Therefore, the optimization problem that is being solved in (5) is
$$\max_x \; E_y[q(y(t),X(t))]\,x - c(x). \tag{6}$$
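The reduction of problem (5) to problem (6) can be spelled out as a short worked step. This is a sketch under the price-taking assumption stated in the text, writing $q$ for $q(y(t), X(t))$ and $z_i^*$ for $z_i^*(q; y_i(t))$, and using the fact from (3) that $z_i^*$ does not depend on $x$:

```latex
\begin{align*}
&E_y\!\left[\, y_i(t)\,u(z_i^*) - q\,\bigl(z_i^* - (\beta X_i(t-1) + x)\bigr) \right] - c(x) \\
&\quad = \underbrace{E_y\!\left[\, y_i(t)\,u(z_i^*) - q\,z_i^* \right]
        + \beta X_i(t-1)\,E_y[q]}_{\text{independent of } x}
        \;+\; E_y[q]\,x - c(x).
\end{align*}
```

Only the last two terms depend on $x$, so the maximizer of (5) coincides with the maximizer of $E_y[q]\,x - c(x)$.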
Essentially, $x_i$ maximizes the net benefit of the user from the sale to the Grid at price $q$. Hence the first-order condition for its resource choice is
$$x_i^* = c'^{-1}\bigl(E_y[q(y(t),X(t))]\bigr) \tag{7}$$
if $x_i^* > 0$ is optimal. However, note that this is not the exact optimization problem that an agent faces when choosing its resource level. The agent has to take into consideration the fact that resources bought in time slot $t$ in general last for multiple time slots, so the net present value of selling them to the Grid should be used in optimization problem (5) (instead of the revenue at time $t$ only). If $r$ is the rate of return that could be earned on an investment with similar risk, the net present value (NPV) of $x$ (depreciated by $1-\beta$ in each period) is $\sum_{s=t}^{\infty} x\,q(s)\,\frac{\beta^{s-t}}{(1+r)^{s-t}}$. This problem is in general difficult to solve since the future price values are not known. Instead, we assume for the purpose of an approximation of the NPV that the price remains constant and equal to the current price. In this case the NPV simplifies to $\frac{r+1}{r+1-\beta}\,x\,q(t)$ and subproblem (6) is modified to
$$\max_x \; \frac{r+1}{r+1-\beta}\,E_y[q(y(t),X(t))]\,x - c(x). \tag{8}$$
As a consequence, condition (7) is modified to
$$x_i^* = c'^{-1}\!\left(\frac{r+1}{r+1-\beta}\,E_y[q(y(t),X(t))]\right). \tag{9}$$
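The constant-price NPV factor is a geometric series with ratio $\beta/(1+r)$, which the following Python sketch checks numerically. The quadratic cost $c(x) = \gamma x^2$ used in `optimal_purchase_quadratic` is an assumption borrowed from the example later in the paper, and the function names are illustrative.

```python
def npv_factor(r, beta, horizon=None):
    """Sum of (beta / (1 + r))**k for k = 0, 1, 2, ...

    With horizon=None the closed form (r + 1) / (r + 1 - beta) is returned;
    otherwise the series is truncated after `horizon` terms.
    """
    if horizon is None:
        return (1.0 + r) / (1.0 + r - beta)
    ratio = beta / (1.0 + r)
    return sum(ratio ** k for k in range(horizon))


def optimal_purchase_quadratic(expected_price, gamma, r, beta):
    """Equation (9) under the assumed cost c(x) = gamma * x**2.

    Here c'(x) = 2 * gamma * x, so the inverse of c' is v / (2 * gamma).
    """
    return npv_factor(r, beta) * expected_price / (2.0 * gamma)
```

For a non-durable resource (`beta = 0`) the factor is 1 and (9) reduces to (7); for `r = 0.2` and `beta = 0.8` the factor is 3, so the agent values a unit of resource at three times the current expected price.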
Equation (9), together with equations (3) and (4), defines the market-clearing price $q(t; y, X(t-1))$. Hence the sequence of events to solve the model in any given period $t$ is:
– Compute the optimal resource demands $x^*(t)$ (which depend on expectations of the realizations of the agents' types $y(t)$ and the existing stock of resources $x^*(t-1)$).
– Take the instantaneous realization of the agents' types $y(t)$.
– For that realization and the agents' resources, compute the market-clearing price $q(t)$.
– Given the market-clearing price $q(t)$ and the realization of agents' types $y(t)$, compute the optimal demand $z^*(t)$.
– Compute metrics of the performance of the Grid market: the aggregate indirect utility of participants,
$$U(t) \equiv \sum_i E_y\Bigl[ y_i(t)\,u(z_i^*(t)) - q(t)\bigl(z_i^*(t) - (\beta X_i(t-1) + x_i^*(t))\bigr) - c(x_i^*(t)) \Bigr],$$
and the realized volume of trade,
$$V(t) \equiv \sum_i \bigl| z_i^*(t) - (\beta X_i(t-1) + x_i^*(t)) \bigr|. \tag{10}$$
4.1 The Autarkic Benchmark
The outcome with a Grid spot market can be contrasted to the situation in which there is no trading of resources subsequent to demand realizations. Each agent's maximization problem for the demand stage is then
$$\max_z \{ y_i\,u(z) \} \quad \text{subject to } z \le x_i. \tag{11}$$
The solution to this is simply $z_i^\dagger = x_i$: each agent consumes its own entire resource. For the optimal choice of resource under autarky, the agent faces
$$\max_x \{ E_y[y_i]\,u(x) - c(x) \}, \tag{12}$$
which has a solution $x_i^\dagger$ given by
$$E_y[y_i]\,u'(x_i^\dagger) = c'(x_i^\dagger). \tag{13}$$
5 A Simulation Environment
In this section, we illustrate the main properties of a competitive Grid spot market by presenting a simple example and then establishing a general simulation environment.
5.1 An Example
To illustrate the model, this subsection looks at a very simple example. For the functions in the model, take the following forms:
$$c(x) = \gamma x^2, \quad \gamma > 0; \tag{14}$$
$$u(z) = \ln(z). \tag{15}$$
Suppose that the resource is entirely non-durable, so that the agents start each period with a zero resource stock. Suppose that each agent's type $y_i$ is drawn from the uniform distribution on [0, 1] independently each period; the types of any two agents are independent draws from the same distribution. Given the realization of its type, agent $i$'s optimal demand is given by
$$z_i^* = \frac{y_i}{q}. \tag{16}$$
Hence total demand is
$$\sum_i z_i^* = \frac{\sum_i y_i}{q} \equiv \frac{Y}{q}. \tag{17}$$
Market clearing requires that $\sum_i z_i^* = \sum_i x_i^*$, so that
$$\frac{Y}{q} = \sum_i x_i^* \equiv X, \tag{18}$$
or $q = Y/X$. Agent $i$'s optimal resource level is given by
$$x_i^* = \frac{1}{2\gamma} E_y[q] = \frac{1}{2\gamma X} E_y[Y] = \frac{N}{4\gamma X}. \tag{19}$$
Since $x_i^* = X/N$, this means that
$$\frac{X}{N} = \frac{N}{4\gamma X}, \tag{20}$$
so that
$$X = \frac{N}{2\sqrt{\gamma}}; \qquad x = \frac{1}{2\sqrt{\gamma}}. \tag{21}$$
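The closed-form equilibrium of this example can be checked by simulation. The sketch below assumes the example's setup (non-durable resource, $y_i$ uniform on [0, 1], $c(x) = \gamma x^2$, $u = \ln$) and estimates $E[q]$ by Monte Carlo; the function names, the number of draws and the seed are illustrative choices.

```python
import math
import random


def equilibrium_stock(n_agents, gamma):
    """Closed form (21): X = N / (2 * sqrt(gamma))."""
    return n_agents / (2.0 * math.sqrt(gamma))


def mean_clearing_price(n_agents, gamma, draws=20000, seed=1):
    """Monte Carlo estimate of E[q], where q = Y / X and y_i ~ U[0, 1]."""
    rng = random.Random(seed)
    total_stock = equilibrium_stock(n_agents, gamma)
    acc = 0.0
    for _ in range(draws):
        y_sum = sum(rng.random() for _ in range(n_agents))
        acc += y_sum / total_stock
    return acc / draws
```

Since $E[Y] = N/2$, the expected price at the equilibrium stock is $\sqrt{\gamma}$, and the first-order condition $x = E[q]/(2\gamma)$ then returns $1/(2\sqrt{\gamma})$, which is the fixed point in (21).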
Elementary calculations show that $Y$ is a random variable with a mean of $N/2$ and density function
$$f_Y(y) = \frac{1}{2(N-1)!} \sum_{k=0}^{N} (-1)^k \frac{N!}{k!(N-k)!}\,(y-k)^{N-1}\,\mathrm{sgn}(y-k). \tag{22}$$
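The density of a sum of $N$ independent uniform draws in (22) is straightforward to evaluate directly. The following Python sketch implements that sum with the sign function applied to $y - k$; the function name is illustrative.

```python
import math


def sum_uniform_density(y, n):
    """Density at y of the sum of n independent U[0, 1] draws, as in (22)."""

    def sgn(v):
        return (v > 0) - (v < 0)

    total = 0.0
    for k in range(n + 1):
        total += ((-1) ** k * math.comb(n, k)
                  * (y - k) ** (n - 1) * sgn(y - k))
    return total / (2.0 * math.factorial(n - 1))
```

For $N = 2$ the formula reduces to the triangular density, equal to $y$ on [0, 1] and $2 - y$ on [1, 2], which gives a quick sanity check of the signs and the binomial coefficients.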
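These facts can be used to compute the (expected) volume of trade
$$V = E\left[ \sum_i \left| \frac{N y_i}{2\sqrt{\gamma}\,Y} - \frac{1}{2\sqrt{\gamma}} \right| \right] \tag{23}$$
and the expected utility of each agent
$$U^* = E_y\left[ y_i \ln(y_i) - y_i \ln(q) - y_i + \frac{q}{2\sqrt{\gamma}} \right] - \frac{1}{4}. \tag{24}$$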
Note that there are two sources of uncertainty for the agent: its own demand, determined by $y_i$; and the aggregate market conditions, summarised by the market-clearing price $q$. Because of risk aversion, both sources of risk decrease the agent's utility; that is
$$E_y[y_i \ln(y_i)] < E_y[y_i] \ln(E_y[y_i]), \quad \text{and} \quad E_y[y_i \ln(q)] < E_y[y_i] \ln(E_y[q]). \tag{25}$$
Since we are interested in the role of forward contracts, we calculate the size of the 'risk premium' attached to the market price, which is
$$RP \equiv E_y[y_i] \bigl( \ln(E_y[q]) - E_y[\ln(q)] \bigr). \tag{26}$$
The risk premium is the cash amount that an agent would be willing to pay in each period in order to avoid the uncertainty attached to the market-clearing price. It is, therefore, the fee that a broker could charge to write a forward contract that insures the agent against fluctuations in the Grid price. These variables can be contrasted with the outcome from autarky:
$$x_i^\dagger = \frac{1}{\sqrt{4\gamma}}, \qquad z_i^\dagger = x_i^\dagger; \tag{27}$$
the expected utility from autarky is²
$$U^\dagger = -\frac{1}{4} \ln(4\gamma) - \frac{1}{4}. \tag{28}$$
² Note that the expected utility under autarky may well be negative. This is not a problem: it is simply a question of normalization. The utility functions used in this model are, technically, von Neumann-Morgenstern utility functions, and hence are unique up to affine transformations. Put more simply, any constant can be added to the agents' utilities without changing the outcome of the model. Hence negative utilities can be made positive by adding a large enough positive number. We take the simpler course of using $U^\dagger$ as the benchmark utility level, and comparing it to $U^*$, the utility level with a Grid market.
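The quantities of this example (expected utility with a spot market, volume of trade, risk premium and the autarky benchmark) can be estimated together by Monte Carlo. The sketch below assumes the example's setup with $y_i$ uniform on [0, 1], a non-durable resource and $E_y[y_i] = 1/2$; utilities are computed per agent as $y_i \ln(z_i) - q(z_i - x) - \gamma x^2$. Function names, draw counts and seeds are illustrative.

```python
import math
import random


def autarky_utility(gamma):
    """Closed form (28): -(1/4) * ln(4 * gamma) - 1/4."""
    return -0.25 * math.log(4.0 * gamma) - 0.25


def spot_market_metrics(n_agents, gamma, draws=20000, seed=2):
    """Monte Carlo estimates of (U*, V, RP) for the log-utility example."""
    rng = random.Random(seed)
    x_star = 1.0 / (2.0 * math.sqrt(gamma))      # equation (21)
    total_stock = n_agents * x_star
    util = volume = price_sum = log_price_sum = 0.0
    for _ in range(draws):
        # 1 - random() lies in (0, 1], which avoids log(0).
        y = [1.0 - rng.random() for _ in range(n_agents)]
        q = sum(y) / total_stock                  # q = Y / X
        z = [yi / q for yi in y]                  # equation (16)
        volume += sum(abs(zi - x_star) for zi in z)
        gross = sum(yi * math.log(zi) - q * (zi - x_star)
                    for yi, zi in zip(y, z))
        util += gross / n_agents - gamma * x_star ** 2
        price_sum += q
        log_price_sum += math.log(q)
    mean_q = price_sum / draws
    # Risk premium (26) with E[y_i] = 1/2 for the uniform distribution.
    risk_premium = 0.5 * (math.log(mean_q) - log_price_sum / draws)
    return util / draws, volume / draws, risk_premium
```

Comparing the first returned value with `autarky_utility(gamma)` gives the utility gain from trading; the estimated risk premium is non-negative by Jensen's inequality applied to the sampled prices.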
5.2 A More General Simulation Environment
In this subsection a more general simulation environment is presented. Before describing the environment we make three important remarks. First, we introduce demand parameters $\{y_i\}$ with general distributions. In the simulation experiment, we suppose that an agent's demand parameter in period $t$ is drawn from a Gamma distribution, with a scale parameter of 1 and a shape parameter $k_i$. An agent's shape parameter is determined initially (before the market opens) as a random draw from a uniform distribution on [0,1]. Hence agents are now ex ante asymmetric: agent $i$ has a mean demand $E[y_i] = k_i$, so that a high $k_i$ corresponds to high average demand. Draws between periods, and across agents, are conditionally independent (i.e., conditional on the realizations of $\{k_i\}$). But note that the realization of $k_i$ induces intertemporal dependence in the realizations of agent $i$'s demand levels $\{y_i(t)\}_{t \ge 0}$. Agents do not know in any period the initial realizations of $\{k_i\}$, nor the per-period realizations $\{y_i(t)\}$. It is common knowledge how the $\{k_i\}$ and $\{y_i(t)\}$ are generated. This is an extremely flexible way of specifying demand. The motivation for choosing the Gamma distribution lies in properties such as its divisibility and its relationship to the Gaussian and exponential distributions. Secondly, we allow agents to accumulate the resource over time either completely or partially, as has already been mentioned. Thirdly, we suppose that the utility function $u(\cdot)$ is given by
$$u(z) = \begin{cases} \dfrac{z^{1-\rho}}{1-\rho} & \rho > 0,\ \rho \ne 1, \\ \ln(z) & \rho = 1. \end{cases} \tag{29}$$
In general,
$$u'(z) = z^{-\rho} > 0; \qquad u''(z) = -\rho z^{-(1+\rho)} < 0; \qquad \frac{-z\,u''(z)}{u'(z)} = \rho. \tag{30}$$
These three properties mean that:
1. Utility is increasing in the amount consumed.
2. Utility is strictly concave, so that agents are risk averse.
3. The Arrow-Pratt coefficient of relative risk aversion, defined as $-z u''/u'$, is constant. Hence agents' attitudes towards risk are parameterized completely by the constant $\rho$.
We assume that the cost of acquiring a resource level $x$ is $\gamma x^2$, where $\gamma > 0$. With this set-up, agent $i$'s optimal demand in period $t$ is
$$z_i^*(t) = \left(\frac{y_i(t)}{q(t)}\right)^{1/\rho}. \tag{31}$$
Market-clearing requires that
$$\sum_i \left(\frac{y_i(t)}{q(t)}\right)^{1/\rho} = X(t) = \sum_i X_i(t), \tag{32}$$
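which means that
$$q(t) = \frac{\left(\sum_i y_i(t)^{1/\rho}\right)^{\rho}}{X(t)^{\rho}}. \tag{33}$$
Hence
$$E[q(t)] = \frac{E\left[\left(\sum_i y_i(t)^{1/\rho}\right)^{\rho}\right]}{X(t)^{\rho}}. \tag{34}$$
Agent $i$'s optimal choice of resource in period $t$ is
$$x_i^*(t) = \frac{1}{2\gamma}\,\frac{r+1}{r+1-\beta}\,E[q(t)], \tag{35}$$
so that
$$x^*(t) \equiv \frac{N}{2\gamma}\,\frac{r+1}{r+1-\beta}\,E[q(t)]. \tag{36}$$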
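The demand, price and purchase rules for the constant-relative-risk-aversion case translate directly into code. The following Python sketch assumes the utility function (29) and the quadratic cost $\gamma x^2$; the function names are illustrative.

```python
def crra_demand(y, q, rho):
    """Equation (31): z_i* = (y_i / q) ** (1 / rho)."""
    return (y / q) ** (1.0 / rho)


def clearing_price(demands, total_stock, rho):
    """Equation (33): q = (sum_i y_i ** (1 / rho)) ** rho / X ** rho."""
    s = sum(y ** (1.0 / rho) for y in demands)
    return (s / total_stock) ** rho


def optimal_purchase(expected_price, gamma, r, beta):
    """Equation (35) for the cost function c(x) = gamma * x ** 2."""
    return (r + 1.0) / (r + 1.0 - beta) * expected_price / (2.0 * gamma)
```

By construction, the demands evaluated at `clearing_price` sum to the total stock, which is the market-clearing condition (32). For `rho = 1` the price reduces to $Y/X$, as in the log-utility example.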
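Since $X(t) = \beta X(t-1) + x^*(t)$, this gives a non-linear equation in $X(t)$:
$$\bigl(X(t) - \beta X(t-1)\bigr)\,X(t)^{\rho} = \frac{N}{2\gamma}\,\frac{r+1}{r+1-\beta}\,E\left[\left(\sum_i y_i(t)^{1/\rho}\right)^{\rho}\right], \tag{37}$$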
which can be solved numerically. With the resulting expression for $X(t)$, the rest of the system can be computed. The outcome with trading in a competitive Grid spot market can be contrasted with autarky:
$$x_i^\dagger = \left(\frac{k_i}{2\gamma}\right)^{\frac{1}{1+\rho}}, \qquad z_i^\dagger = x_i^\dagger. \tag{38}$$
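One way to solve the non-linear equation (37) numerically is bisection, since its left-hand side is increasing in $X(t)$ for $X(t) > \beta X(t-1)$. The Python sketch below is one possible approach, not necessarily the method used by the authors: it estimates the expectation on the right-hand side by Monte Carlo with Gamma($k_i$, 1) draws and then brackets the root. The function names, tolerance and draw count are illustrative assumptions.

```python
import random


def expected_aggregate(shapes, rho, draws=2000, seed=3):
    """Monte Carlo estimate of E[(sum_i y_i**(1/rho))**rho], y_i ~ Gamma(k_i, 1)."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(draws):
        s = sum(rng.gammavariate(k, 1.0) ** (1.0 / rho) for k in shapes)
        acc += s ** rho
    return acc / draws


def solve_total_stock(prev_stock, rhs, beta, rho, tol=1e-10):
    """Bisection for (X - beta * X_prev) * X**rho = rhs, where rhs > 0."""

    def residual(x):
        return (x - beta * prev_stock) * x ** rho - rhs

    lo = beta * prev_stock            # residual(lo) = -rhs < 0
    hi = max(1.0, 2.0 * lo)
    while residual(hi) < 0.0:         # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def autarky_purchase(k, gamma, rho):
    """Equation (38): (k_i / (2 * gamma)) ** (1 / (1 + rho))."""
    return (k / (2.0 * gamma)) ** (1.0 / (1.0 + rho))
```

Here `rhs` stands for the whole right-hand side of (37), that is, $\frac{N}{2\gamma}\frac{r+1}{r+1-\beta}$ times the value returned by `expected_aggregate`.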
The simulation experiment proceeds as follows:
1. We fix the number of agents.
2. Each agent starts with zero resources: $X_i(0) = 0$ for all $i$.
3. A vector of demand parameters $k$ is drawn, where each $k_i$ is an independent draw from the uniform distribution on [0,1].
4. Given its $k_i$ parameter, and the consequent expectations about its demand $y_i(0)$ and the market-clearing price $q(0)$, agent $i$ chooses its resource level $x_i^*(0)$ (as described above).
5. Agent $i$'s demand level $y_i(0)$ is realised, as a draw from the Gamma distribution $\Gamma(k_i, 1)$. Given this demand, and the prevailing market-clearing price, agent $i$ chooses its optimal demand $z_i^*(0)$.
6. The market-clearing price ensures that aggregate demand is equal to aggregate supply.
Fig. 1. The market clearing price as a function of time (curves: ρ=2, β=1; ρ=3, β=1; ρ=2, β=0.8; ρ=3, β=0.8). We observe that as resources have a higher depreciation rate, equilibrium prices increase, since fewer resources are available in the Grid market for trading.
7. The experiment proceeds to the next period $t = 1$, and the steps are repeated.
8. We use multiple (independent) random draws of $\{y_i(t)\}$ in each period $t$ to calculate numerical averages and expectations (i.e., we use a Monte Carlo method).
The output of the experiment is then the average expected utility of the agents (where the average is taken over the demand parameters $\{k_i\}$) under Grid trading and autarky, in each period; the average volume of trade and the risk premium, in each period; and the total level of resources $X(t)$ over time. Such results are shown in figures 1–4, in which there are 20 agents; the risk aversion parameter is either $\rho = 2$ or $\rho = 3$; and the cost parameter is $\gamma = 1$. Experiments showed that accounting for the net present value of the resource $x$, instead of myopically considering the revenue at time $t$ only, does not result in substantially different behavior of the system. Therefore we conduct our experiments, for computational convenience, with a fixed rate of return of 0.2.
The figures show some common patterns. The activity levels of the Grid market increase over time, in terms of the average expected utility of the market participants and the average volume of trade. Of course, in any particular period, the volume of trade (figure 2) may decrease, as a result of low realizations of demand. But on average, over a number of periods of trading, trade volumes increase. As agents accumulate resources, the aggregate supply in the market increases: see figure 3. As a result, the market-clearing price is generally decreasing: figure 1. (Again, in any given period, the market-clearing price may increase, as a result of high realizations of demand.) As the resource stock increases over time, it acts as a buffer to smooth out fluctuations that arise due to random demand realizations. Hence the variability of the market-clearing price declines over time; consequently, the average risk premium also falls: see figure 4. Changing the
Fig. 2. The average volume of trade as a function of time (curves: ρ=2, β=1; ρ=3, β=1; ρ=2, β=0.8; ρ=3, β=0.8). We observe that as resources have a higher depreciation rate, the volume of trade decreases. This may be due to the higher equilibrium prices, see Fig. 1.
Fig. 3. The average resource level per agent as a function of time (curves: ρ=2, β=1; ρ=3, β=1)
depreciation coefficient $1-\beta$ also changes the market-clearing price and the average volume of trade, as can be seen in figures 1 and 2 respectively. More specifically, when the resources are less durable, i.e. when the parameter $\beta$ decreases, the market-clearing price increases, since fewer resources are available in the Grid market (see figure 1). The increase in equilibrium prices results in a lower volume of trade, as can be seen in figure 2.
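The simulation experiment of this subsection can be sketched end to end in Python. This is a simplified illustration, not the authors' code: it tracks only the aggregate stock and the realized market-clearing price, estimates the expectation in (37) once by Monte Carlo (the demand distribution is stationary given the shape parameters), and solves (37) by bisection. Defaults follow the reported settings (20 agents, $\gamma = 1$, rate of return 0.2); all names, the seed and the number of Monte Carlo draws are assumptions.

```python
import random


def simulate_market(n_agents=20, periods=50, rho=2.0, beta=1.0,
                    gamma=1.0, r=0.2, mc_draws=500, seed=4):
    """Return (prices, stocks): realized q(t) and total stock X(t) per period."""
    rng = random.Random(seed)
    # Step 3: shape parameters k_i; 1 - random() lies in (0, 1], so k_i > 0.
    shapes = [1.0 - rng.random() for _ in range(n_agents)]

    def aggregate(ys):
        return sum(y ** (1.0 / rho) for y in ys) ** rho

    def draw_demands():
        return [rng.gammavariate(k, 1.0) for k in shapes]

    # Step 8: Monte Carlo estimate of E[(sum_i y_i**(1/rho))**rho].
    expected_agg = sum(aggregate(draw_demands())
                       for _ in range(mc_draws)) / mc_draws
    rhs = (n_agents / (2.0 * gamma)
           * (r + 1.0) / (r + 1.0 - beta) * expected_agg)

    def solve_stock(prev):
        # Bisection on (X - beta * prev) * X**rho = rhs, equation (37).
        lo, hi = beta * prev, max(1.0, 2.0 * beta * prev)
        while (hi - beta * prev) * hi ** rho < rhs:
            hi *= 2.0
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if (mid - beta * prev) * mid ** rho < rhs:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    stock, prices, stocks = 0.0, [], []    # Step 2: zero initial resources
    for _ in range(periods):
        stock = solve_stock(stock)                          # Step 4
        demands = draw_demands()                            # Step 5
        prices.append(aggregate(demands) / stock ** rho)    # Step 6, eq. (33)
        stocks.append(stock)
    return prices, stocks
```

Running the function with `beta=1.0` and `beta=0.8` under the same seed uses identical demand draws, so the two price paths can be compared period by period; the lower stock under faster depreciation translates into a higher price in every period, in line with the pattern described for figure 1.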
Fig. 4. The average risk premium as a function of time (vertical axis in units of 10⁻³; curves: ρ=2, β=1; ρ=3, β=1)
6 Market Failure
The analysis so far has demonstrated that the operation of a Grid spot market can improve the expected utility of market participants, relative to autarky (no Grid trading). Further, since there is a positive risk premium that risk-averse agents attach to variability in the market-clearing price, there is scope in the market for the use of forward contracts. The risk premium decreases with the size of the market, and as the aggregate supply in the market increases. Nevertheless, it is always positive. Furthermore, we have assumed that all agents have the same degree of risk aversion. In practice, agents will differ in their risk preferences. Forward contracts will be particularly attractive to more risk-averse agents; and will be written by less risk-averse agents. The fact that Grid trading increases agents' utilities is not surprising, given the general results from economics about the efficiency of competitive markets. Note, however, that the solution that we have analysed does not achieve full efficiency, since we have assumed that agents behave myopically in each period, ignoring the consequences in future periods of current choices. The main set of assumptions underlying the analysis are those that make up a competitive market:
1. Each agent ignores the effects of its own choices on the market-clearing price (even when there is a very small number of agents in the market).
2. There is no asymmetric information in the market: for example, all agents are equally well-informed about the characteristics of Grid resources.
3. There are no external effects: an agent's choices of resource level and demand have no direct effects on other agents (other than those that operate through the market-clearing price).
Of these three assumptions, assumption 2 is likely to be the most critical for the existence and operation of a Grid market. Market power (the ability of agents to influence the market price) will affect the volume of trade and utility levels associated with the market. But it is very unlikely to cause a complete collapse of the market. Similarly, externalities may limit the scale of the market, but are unlikely to cause collapse. But if agents are poorly informed about the quality of resources traded in the market, then it may be that the market fails to operate at all. In order to assess the scope for this market failure, we study the simplified case with $\beta = 0$ and $\rho = 1$, where $c(x) = \gamma x^2$, $\gamma > 0$, and $u(z) = \ln(z)$; the resource is entirely non-durable; and each agent's type $y_i$ is drawn from the uniform distribution on [0, 1] independently each period. Now suppose that an agent knows for certain that its own resource is reliable and can be used with probability 1 to service its demand. If an agent has to buy resources in the Grid, then it perceives that there is some probability $\phi$ that the Grid resources fail to satisfy its demand. In the event that this happens, the agent has to pay the cost of buying the resource in the market, and bears a utility cost of $\lambda$. The agent's expected utility is then
$$u_i = \begin{cases} y_i \ln(z) - q(z - x_i) & z \le x_i, \\ y_i\bigl((1-\phi)\ln(z) + \phi\lambda\bigr) - q(z - x_i) & z > x_i. \end{cases} \tag{39}$$
Agent $i$'s optimal demand in this case is
$$z_i^* = \begin{cases} \dfrac{y_i}{q} & y_i \le q x_i, \\ x_i & q x_i < y_i \le \dfrac{q x_i}{1-\phi}, \\ (1-\phi)\dfrac{y_i}{q} & \dfrac{q x_i}{1-\phi} < y_i. \end{cases} \tag{40}$$
The expected market-clearing price is
$$E[q] = \sqrt{\frac{2\gamma\,(1-\phi)\bigl(1-\sqrt{1-\phi}\bigr)}{\phi}}. \tag{41}$$
The dependence of the expected market-clearing price on the failure parameter $\phi$ is shown in figure 5. As the figure shows, the market-clearing price is higher when the failure probability is lower. In the limit, when any Grid resource fails with probability 1, the market-clearing price is 0, and the market collapses for sure. Consequently, each agent buys
$$x_i^* = \sqrt{\frac{(1-\phi)\bigl(1-\sqrt{1-\phi}\bigr)}{2\gamma\phi}} \tag{42}$$
of the resource.
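The expressions of this section can be evaluated with a short Python sketch. It assumes the square-root reading of (41) and (42), which reproduces the limit $E[q] \to \sqrt{\gamma}$ as $\phi \to 0$ from the example in Section 5.1 and satisfies $x_i^* = E[q]/(2\gamma)$; the piecewise demand (40) is coded for $\phi < 1$. Function names are illustrative.

```python
import math


def expected_price(phi, gamma):
    """Expected clearing price (41); the limit as phi -> 0 is sqrt(gamma)."""
    if phi == 0.0:
        return math.sqrt(gamma)
    inner = (1.0 - phi) * (1.0 - math.sqrt(1.0 - phi)) / phi
    return math.sqrt(2.0 * gamma * inner)


def purchase(phi, gamma):
    """Resource purchase (42), equal to E[q] / (2 * gamma)."""
    return expected_price(phi, gamma) / (2.0 * gamma)


def demand(y, q, x, phi):
    """Piecewise optimal demand (40); assumes 0 <= phi < 1 and q > 0."""
    if y <= q * x:
        return y / q                  # own resource covers demand
    if y <= q * x / (1.0 - phi):
        return x                      # consume exactly the own resource
    return (1.0 - phi) * y / q        # buy (possibly unreliable) Grid resource
```

Evaluating `expected_price` on a grid of `phi` values traces a curve that falls from $\sqrt{\gamma}$ at $\phi = 0$ to 0 at $\phi = 1$, which is the shape described for figure 5.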
Fig. 5. The market-clearing price against the failure parameter φ
Fig. 6. Expected utilities against the failure parameter φ (curves labelled U+ and U*)
These facts can then be used, along with the realizations of the demand parameters $\{y_i\}$, to determine numerically the equilibrium outcome as a function of the failure parameter $\phi$. Figure 6 shows that, if $\phi$ is sufficiently low, agents will trade in the Grid market. Once Grid resources become sufficiently unreliable, autarky is preferred.
Obviously, a key parameter determining the 'crossing point' (the critical value of $\phi$ at which the Grid market collapses) is the utility cost $\lambda$. We have modelled this in a very simple way here: as a constant that is the same for all agents. If this is an important feature of an emerging Grid market, then this aspect can be modelled further. For example, the utility cost could be allowed to be different for different agents. It could also be related to agents' realized demand types $y$.
7 Conclusions
In this paper we analyzed the economics of a competitive market for Grid services. We provided an analytical model, as well as simulation results regarding the critical parameters of such a model. Accounting for different levels of risk aversion, we studied the behavior of the market-clearing price, the volume of trade, the resource level per agent and the average risk premium. In all those cases we assumed that the resources do not necessarily vanish at the end of each period, but last longer, so that each agent accumulates a stock of such resources. We observed that in general the market-clearing price decreases and the aggregate supply in the market increases as agents accumulate resources. We also analyzed the situation in which Grid resources are not reliable and the market fails to satisfy the demand, and observed that, unless Grid resources are highly unreliable, the agents still prefer trading in the Grid market to autarky.
References
1. Chamberlin, E.H.: An Experimental Imperfect Market. Journal of Political Economy 56(2), 95–108 (1948)
2. Cripps, M.W., Swinkels, J.M.: Efficiency of Large Double Auctions. Econometrica 74(1), 47–92 (2006)
3. Gode, D.K., Sunder, S.: Allocative Efficiency of Markets with Zero Intelligence (ZI) Traders: Market as a Partial Substitute for Individual Rationality. Journal of Political Economy CI, 119–137 (1993)
4. Gode, D.K., Sunder, S.: What Makes Markets Allocationally Efficient? Quarterly Journal of Economics 112(2), 603–630 (1997)
5. Rustichini, A., Satterthwaite, M.A., Williams, S.R.: Convergence to Efficiency in a Simple Market with Incomplete Information. Econometrica 62(1), 1041–1063 (1994)
6. Satterthwaite, M.A., Williams, S.R.: The optimality of a simple market mechanism. Econometrica 70(5), 1841–1863 (2002)
7. Cliff, D.: Minimal-intelligence agents for bargaining behaviours in market-based environments, Technical Report HP-97-91, Hewlett-Packard Research Laboratories, Bristol, England (1997)
8. Preist, C., van Tol, M.: Adaptative agents in a persistent shout double auction. In: Proceedings of the 1st International Conference on the Internet, Computing and Economics, pp. 11–18. ACM Press, New York (1998)
9. Parsons, S., Marcinkiewicz, M., Niu, J., Phelps, S.: Everything you wanted to know about double auctions but were afraid to (bid or) ask, Technical Report, Department of Computer & Information Science, Brooklyn College, City University of New York (2005)
10. Smith, V.L.: An Experimental Study of Competitive Market Behavior. Journal of Political Economy LXX, 111–137 (1962)
11. Meinl, T., Neumann, D.: A Real Options Model for Risk Hedging in Grid Computing Scenarios. In: Proceedings of the 42nd Hawaii International Conference on System Sciences, pp. 1–10 (2009)
The GridEcon Platform: A Business Scenario Testbed for Commercial Cloud Services
Marcel Risch¹, Jörn Altmann¹, Li Guo², Alan Fleming³, and Costas Courcoubetis⁴
¹ TEMEP, School of Industrial and Management Engineering, College of Engineering, Seoul National University, 599 Gwanak-Ro, Gwanak-Gu, Seoul 151-742, South-Korea
² Department of Computing, Imperial College London, 180 Queen's Gate, London SW7 2BZ, UK
³ Real Time Limited, Lochside House, 3 Lochside Way, Edinburgh Park, Edinburgh EH12 9DT, UK
⁴ Department of Computer Science, Athens University of Economics and Business, 47A Evelpidon Str., Athens 11362, Greece
Abstract. In this paper, we present the GridEcon Platform, a testbed for designing and evaluating economics-aware services in a commercial Cloud computing setting. The Platform is based on the idea that the exact working of such services is difficult to predict in the context of a market and, therefore, an environment for evaluating their behavior in an emulated market is needed. To identify the components of the GridEcon Platform, a number of economics-aware services and their interactions have been envisioned. The two most important components of the platform are the Marketplace and the Workflow Engine. The Workflow Engine allows the simple composition of a market environment by describing the service interactions between economics-aware services. The Marketplace allows trading goods using different market mechanisms. The capabilities of these components of the GridEcon Platform in conjunction with the economics-aware services are described in this paper in detail. The validation of an implemented market mechanism and a capacity planning service using the GridEcon Platform also demonstrated the usefulness of the GridEcon Platform.
Keywords: Cloud computing, Grid computing, marketplace for computing resources, market-based resource allocation, service-oriented architectures, utility computing, market emulation tool, market mechanism.
J. Altmann, R. Buyya, and O.F. Rana (Eds.): GECON 2009, LNCS 5745, pp. 46–59, 2009. © Springer-Verlag Berlin Heidelberg 2009
1 Introduction
In the past, many Grid research projects implemented models of markets for computing resources. Some examples include the Popcorn Market [1], the Spawn system [2], the Grid Architecture for Computational Economy (GRACE) [3], and Tycoon [4]. All
of these projects focused very much on providing economically efficient resource allocations or on using various market mechanisms to sell computing resources. However, all these implementations remained sample prototypes. They could not be re-used by other projects because of their static implementations. For example, the market mechanism developed by Popcorn is not interoperable with the market mechanism developed by Tycoon. Furthermore, none of these market implementations considered the market environment, i.e. the value-added services around the market mechanism. To model a market environment comprehensively, we have developed the GridEcon Platform within the GridEcon Project [5] [6]. In particular, the GridEcon Platform allows evaluating economics-aware services in a commercial Cloud computing setting. The Platform is based on the idea that the exact working of such a service is difficult to predict in the context of a market and, therefore, an environment for evaluating its behavior and testing its interfaces in an emulated market is needed. The two most important components of the GridEcon Platform are the Marketplace and the Workflow Engine. The Workflow Engine component allows the simple composition of a market environment from different economics-aware services by describing the interaction between these services. The Marketplace allows the execution of different kinds of market mechanisms for trading computing resources. We demonstrate the usefulness of the GridEcon Platform by implementing a market mechanism (continuous double auction) and a capacity planning service and integrating them into the GridEcon Platform. The easy-to-perform integration enabled us to conduct performance measurements on the market mechanism and the capacity planning component.
The remainder of the paper is structured as follows: In section 2, we give an overview of the state-of-the-art in Grid market research and economics-aware services for Cloud computing markets. In section 3, we describe the GridEcon Platform, in particular the Workflow Engine and the Marketplace component. Besides those, we introduce some value-added Cloud services that may be of use in a commercial computing resource environment. The implementation of the GridEcon Platform and a validation of the implementation are given in section 4. We conclude the paper in section 5.
2 State of the Art of Grid Market Research
This section gives an overview of previous Grid market projects and their approach to developing market environments. It demonstrates that previous Grid market implementations did not take into account the need for platforms on which businesses can test and evaluate their services in a realistic market setting. The Popcorn Market [1] [7] has identified three actors: buyers, sellers, and the market, which implements the market mechanism. Looking at this list of entities, it is already obvious that no support services have been considered during the implementation phase. While the market is flexible enough to support three different market mechanisms, the problem of implementing business services on top of the market has not been addressed. Therefore, this market is of limited use when it comes to testing different business models of Cloud computing services, as the effort for integrating additional services is quite high.
M. Risch et al.
The Spawn system [2] presents the idea of value-added services but without considering the full economic potential and the requirements of these services. Since the implementation of these services and their integration into the Spawn environment will be challenging, this environment is of limited use when it comes to modeling the behavior of additional market services within Spawn.
The Grid Architecture for Computational Economy (GRACE) [3] has identified one central component: “The resource broker acts as a mediator between the consumer and Grid resources using middleware services. It is responsible for resource discovery, resource selection, binding of software, data, and hardware resources, initiating computations, adapting to the changes in Grid resources and presenting the Grid to the consumer as a single, unified resource” (p. 4, [3]). It can be clearly seen from the task description that the Grid Resource Broker is a component with many functionalities, which do not have to be provided by the same entity. This would be the ideal starting point for looking into services that can be provided by various businesses competing for customers. However, since the Grid Resource Broker represents a monopolistic service provider, the GRACE implementation cannot be used to determine how the existence of different providers with different business models will affect the market. Furthermore, extending this architecture to provide different services would require significant effort, which makes GRACE an unattractive choice for simulating economics-aware Cloud services.
The Tycoon market is a highly distributed market [4], in which agents take over a number of jobs for the consumer. These jobs include: (1) interpreting the consumer’s preferences, (2) examining the state of the system, (3) making bids, and (4) verifying that the acquired resources are available. The first job is very important to consumers who have little or no Grid expertise and need help to determine the correct resource purchase. It is also a very challenging job, since consumers may not know how to express their preferences due to their lack of IT expertise. In such a case, the agent has to perform tasks such as capacity planning, portfolio management, and workflow analysis. All these tasks are highly complex, which means that the agents have to be very powerful to be able to perform them. The main weakness of this model is that, while agents could be provided by different providers, the idea of an agent market, in which consumers can buy agent services, has not been developed.
Beyond the Grid markets described so far, some additional research comes from the gridbus [8] project. This has been the only project that has introduced a number of services for the user, namely a Workflow Broker [9], the Nimrod-G Broker [10], and the Gridbus Data Broker [11]. Furthermore, gridbus has developed the GridSim [12] and CloudSim [13] simulation environments. However, these environments are aimed at analyzing the effects of different resource allocation algorithms as well as the effects of resource failures. The effect of additional services cannot be easily modeled in these environments. Neumann et al. described the technological and economic challenges that must be overcome when developing a Grid market in [19].
In conclusion, existing Grid market research has not considered the idea of allowing an evaluation of new services added to an existing market or an evaluation of a new design of a market. In particular, a component that helps compose these services has also not been proposed.
The GridEcon Platform: A Business Scenario Testbed for Commercial Cloud Services
3 The GridEcon Platform
The GridEcon Platform, the architecture of which is shown in Figure 1, consists of three tiers: the service provider tier, the marketplace tier, and the economics-aware services tier. Each of these tiers is built using a service-oriented architecture (SOA). The interaction between the tiers and between the different value-added services is defined through the Workflow Engine. To achieve that, the Workflow Engine provides common interfaces to services. The marketplace tier encompasses the main trading location of this environment. It is flexible enough to allow trading of computing resources to be based on any kind of market mechanism, such as auctions, bargaining, or posted prices. The service provider tier contains all providers of services that are traded in the marketplace. No distinction is made between types of services. This means that the GridEcon Platform could be used to sell access to data, software (Software-as-a-Service), and computing resources. However, for simplicity, we focus exclusively on computing resources for the remainder of the paper.
Fig. 1. The GridEcon Marketplace Environment
The economics-aware services tier is where different providers can offer their value-added services. Although the GridEcon project has considered a number of services that could be of use in such an environment, a production-level market may create demand for additional services. To illustrate the workings of the GridEcon Platform in detail, the following subsections focus on the Workflow Engine, the Marketplace component, and the Capacity Planning Service as an example of a value-added service.
M. Risch et al.
3.1 Workflow Engine Component
The Workflow Engine is a component designed to facilitate the coordination and communication between the different components and services of the GridEcon Platform. With respect to the GridEcon architecture, the Workflow Engine sits between all other GridEcon Platform components and value-added services. It hides the underlying system complexity and the interactions within the system from end-users and, thus, makes the GridEcon Platform easier to work with. The entire system architecture for the Workflow Engine is shown in Figure 2. The Workflow Engine is a specification-driven system, i.e. the composition of the components and value-added services is defined by feeding a specification (i.e. a workflow) into the Workflow Engine. This idea was borrowed from the multi-agent domain. This sort of system design significantly reduces the effort required for developing a market environment by shortening the development life cycle. A different market environment can be developed simply by feeding a new workflow into the Workflow Engine. Because of this design, the Workflow Engine can also manage different market mechanisms. In other words, the GridEcon Platform is independent of any specific market mechanism. The Workflow Engine consists of a group of agents, where each agent conceptually represents a GridEcon component or value-added service (e.g. Marketplace component, Capacity Planning Service). Each GridEcon component is made public through Web services. The agents of the Workflow Engine communicate with each other to handle the interaction between the components. The interaction between agents follows a predefined interaction protocol, namely the current workflow. The language that is used to describe a workflow is called Lightweight Coordination Calculus (LCC) [14]. In detail, the Workflow Engine offers interfaces for users to submit asks and bids. It also offers various callback interfaces, which can be used by the economics-aware services to communicate with each other. These callback features of the Workflow Engine ensure that all communication can be performed asynchronously.
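The specification-driven design and the callback interfaces described above can be illustrated with a minimal sketch. It is not the GridEcon implementation: the class `WorkflowEngine`, its methods, the agent names, and the message fields are all hypothetical, and the workflow is simplified to an ordered list of agent names rather than an LCC specification.

```python
from collections import deque
from typing import Any, Callable, Deque, Dict, List, Tuple

Message = Dict[str, Any]
Agent = Callable[[Message], Message]
Callback = Callable[[Message], None]


class WorkflowEngine:
    """Toy specification-driven coordinator (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._agents: Dict[str, Agent] = {}
        self._callbacks: Dict[str, List[Callback]] = {}
        self._pending: Deque[Tuple[List[str], Message]] = deque()

    def register_agent(self, name: str, agent: Agent) -> None:
        # Each agent stands in for one component or value-added service.
        self._agents[name] = agent

    def on_complete(self, name: str, callback: Callback) -> None:
        # Callbacks let other services react without blocking the submitter.
        self._callbacks.setdefault(name, []).append(callback)

    def submit(self, workflow: List[str], message: Message) -> None:
        # Submission only queues the request and returns immediately.
        self._pending.append((list(workflow), dict(message)))

    def run_pending(self) -> None:
        # Process queued requests, following the order given by the workflow.
        while self._pending:
            steps, message = self._pending.popleft()
            for step in steps:
                message = self._agents[step](message)
                for callback in self._callbacks.get(step, []):
                    callback(message)


def plan_capacity(message: Message) -> Message:
    # Assumed rule: request only the shortfall between demand and in-house supply.
    shortfall = max(0, message["demand"] - message["in_house"])
    return {**message, "quantity": shortfall}


def place_bid(message: Message) -> Message:
    status = "bid submitted" if message["quantity"] > 0 else "no bid"
    return {**message, "status": status}


engine = WorkflowEngine()
engine.register_agent("capacity_planning", plan_capacity)
engine.register_agent("marketplace", place_bid)

results: List[Message] = []
engine.on_complete("marketplace", results.append)
engine.submit(["capacity_planning", "marketplace"], {"demand": 35, "in_house": 30})
engine.run_pending()
print(results)
```

Changing the market environment in this sketch only requires submitting a different list of agent names, which mirrors the claim that a new workflow is enough to define a new environment.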
Fig. 2. Architecture of the Workflow Engine
3.2 Marketplace Component
The Marketplace component is responsible for managing the trading of resources. Since computing resources can be traded based on different market mechanisms, the Marketplace can be implemented with a number of different market mechanisms. For validating the Marketplace implementation, a continuous double auction (CDA) has been implemented as the market mechanism [6]. This market mechanism has been chosen, since it is similar to market mechanisms for other non-storable goods, such as electricity [15]. With respect to the Marketplace Component itself, this market mechanism has been chosen arbitrarily. The underlying principle for this market mechanism is that of a standard futures market: All parties announce the maximum price they are willing to buy for (i.e. bid price, for the purchasers) and the minimum price they are willing to sell for (i.e. ask price, for the sellers) in a specified time interval. These prices are recorded and put in a directory. Bids are, in fact, standing (or limit) orders and will not be immediately fulfilled unless there is a previously posted compatible reciprocal ask. If the bid price of a buy order exceeds the ask price of a sell order, the orders are immediately executed. If no compatible reciprocal order is available, the bids remain in the directory until they are matched in the future or expire. The market mechanism takes into account both vertical and horizontal atomicity of bids and asks. Vertical atomicity in bids means that the resources have to be provided by a single provider; however, the bidder is willing to switch resources during the usage period. Horizontal atomicity means that the resources are available for the entire usage duration from one provider, thereby minimizing switching costs. However, not all resources have to be offered by the same provider. For asks, vertical atomicity requires that all resources have to be sold to one user at a time. However, the resources can be sold to different users at different times. The horizontal atomicity requirement for asks means that the resources have to be sold for the entire availability time. This differentiation of atomicity gives buyers and sellers of computing resources the opportunity to specify whether they require all resources in one place, the resources for the entire duration, or both. The architecture of the Marketplace Component is shown in Figure 3. Since the CDA mechanism was implemented in the GridEcon Platform for testing purposes, this market mechanism has been highlighted. However, other market mechanisms, such as those described in [20], could also be implemented.
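The matching principle of the continuous double auction can be sketched as follows. This is a simplified illustration under several assumptions that are not taken from the paper: each order is for a single resource unit, atomicity constraints are ignored, an order matches when the bid price is at least the ask price, the trade executes at the price of the order already resting in the directory, and expiry is expressed as a hypothetical `ttl` in abstract time units.

```python
import heapq
from itertools import count
from typing import List, Optional, Tuple

Trade = Tuple[str, str, float]  # (buyer, seller, execution price)


class OrderBook:
    """Minimal continuous double auction directory (illustrative sketch)."""

    def __init__(self) -> None:
        self._bids: List[tuple] = []  # (-price, sequence, expires_at, trader)
        self._asks: List[tuple] = []  # (price, sequence, expires_at, trader)
        self._seq = count()           # tie-breaker: earlier orders come first

    @staticmethod
    def _drop_expired(heap: List[tuple], now: int) -> None:
        # Lazily discard expired orders once they reach the top of the heap.
        while heap and heap[0][2] <= now:
            heapq.heappop(heap)

    def submit_bid(self, trader: str, price: float, now: int, ttl: int = 1) -> Optional[Trade]:
        self._drop_expired(self._asks, now)
        if self._asks and self._asks[0][0] <= price:
            ask_price, _, _, seller = heapq.heappop(self._asks)
            return (trader, seller, ask_price)
        # No compatible ask: the bid becomes a standing order.
        heapq.heappush(self._bids, (-price, next(self._seq), now + ttl, trader))
        return None

    def submit_ask(self, trader: str, price: float, now: int, ttl: int = 1) -> Optional[Trade]:
        self._drop_expired(self._bids, now)
        if self._bids and -self._bids[0][0] >= price:
            neg_price, _, _, buyer = heapq.heappop(self._bids)
            return (buyer, trader, -neg_price)
        # No compatible bid: the ask becomes a standing order.
        heapq.heappush(self._asks, (price, next(self._seq), now + ttl, trader))
        return None


book = OrderBook()
book.submit_ask("seller-1", 10.0, now=0)         # rests in the directory
book.submit_bid("buyer-1", 9.0, now=0)           # too low, rests as well
print(book.submit_bid("buyer-2", 12.0, now=0))   # matches seller-1's standing ask
```

Supporting vertical and horizontal atomicity would require orders that carry quantities and time intervals, which this sketch deliberately leaves out.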
Fig. 3. The Marketplace Component Architecture
As Figure 3 indicates, the architecture of the Marketplace Component also allows trading of different computing resource types (e.g. Res Type 1, Res Type 2). However, the interface of the Marketplace component remains unchanged, independent of the market mechanism and the resource type.
3.3 Value-Added Services
The GridEcon project has identified a number of services that may be of interest to resource consumers. All of these services have been examined so that the GridEcon Platform could potentially integrate any implementation of them. The first service is the Capacity Planning Service, which helps users to plan their computing resource capacity. Capacity planning is a difficult task in a commercial Cloud computing setting. The service has to analyze the existing infrastructure, the deployed applications, and the resource requirements before suggesting a course of action. An action can be the purchase of Cloud resources, the purchase of in-house resources, or a combination of both. Furthermore, the Capacity Planning Service can suggest that some resources be sold on the Marketplace if they are idle [16]. More detailed information about the Capacity Planning Service can be found in [17] [18]. The Fixed-Price Quotation Broker allows selling fixed-price contracts even if the underlying market price for the resource is variable. This allows consumers to plan the Cloud computing costs with certainty. At the same time, this price will usually be higher than the actual price, since the Fixed-Price Quotation Broker is expected to generate profits for the provider. At the time when the contract is due, the Fixed-Price Quotation Broker places bids on the marketplace to buy the resources in order to fulfill its contractual obligations. The Portfolio Broker addresses the fear of Cloud users that their Cloud resources might fail. Since it is very difficult for Cloud users to determine which resources have a high risk of failure, they can use the Portfolio Broker to determine the failure risk of each resource in their resource portfolio. Based on this analysis and the user’s risk tolerance level, the Portfolio Broker can then suggest a method for reaching the risk tolerance level of the user. This method may require the diversification of resources to different vendors or the purchase of spare resources. The Insurance Broker provides insurance against a number of events that consumers fear. The GridEcon project has identified the following events which would be harmful to the Cloud user: (1) non-delivery of the resources, (2) delivery of incorrect resources, or (3) interruption of the resource availability. Using the Insurance Broker, consumers can protect themselves against the financial effects of these events. In return, the Insurance Broker will charge a premium which is proportional to the cost of the adverse event and the probability that this event will occur. Since all these services can be used with a number of market mechanisms, we have analyzed which services are useful with which market mechanism. The result of this analysis is shown in the following table: a √√ shows that the component is very useful, a single √ shows that the component is moderately useful, and an × shows that the component is not useful with respect to the market mechanism.
Table 1. Service Comparison with Respect to the Market Mechanism
Service | Posted Price | Bargaining | Dutch Auction | Double Auction
Capacity Planning Service | √√ | √√ | √√ | √√
Fixed-Price Quotation Broker | × | × | √√ | √√
Portfolio Broker | √√ | √√ | √ | √
Insurance Broker | √√ | √√ | √ | √
The Capacity Planning Service is very useful regardless of the market mechanism, since its task is to help the consumer to decide which resources should be purchased and where to purchase them. Since this problem has to be solved regardless of the market mechanism, this component is useful in any utility computing market. The Fixed-Price Quotation Broker requires demand-based pricing, which is given when Dutch and double auctions are used. While posted prices can also be adjusted to reflect the demand, this occurs less frequently and, therefore, the Fixed-Price Quotation Broker will be less useful in such a market environment. For the same reason, it is also less useful in a bargaining market. The Portfolio Broker is useful in a non-anonymous market mechanism, such as the posted-price and bargaining markets. This is because, in such a market, it can calculate the resource reliability not only for each resource type but also for each resource type of each provider. In an anonymized market, it can only calculate the resource reliability for each resource type across all providers. Like the Portfolio Broker, the Insurance Broker is also more useful if the market mechanism is not anonymous, i.e. if providers can be uniquely identified. In this case, the Insurance Broker can determine more closely which provider is trustworthy.
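The premium rule stated for the Insurance Broker (a premium proportional to the cost of the adverse event and to its probability) can be written as a short sketch. The function name and the `loading` factor, which stands for the broker's proportionality constant, are assumptions made for illustration; the paper does not specify a value.

```python
def insurance_premium(event_cost: float, event_probability: float, loading: float = 1.2) -> float:
    """Premium proportional to expected loss; `loading` is a hypothetical constant."""
    if not 0.0 <= event_probability <= 1.0:
        raise ValueError("event_probability must lie in [0, 1]")
    if event_cost < 0 or loading <= 0:
        raise ValueError("event_cost must be non-negative and loading positive")
    return loading * event_probability * event_cost


# In a non-anonymous market the broker can use a provider-specific probability;
# in an anonymized market only one estimate across all providers is available.
per_provider = insurance_premium(event_cost=1000.0, event_probability=0.01)
pooled = insurance_premium(event_cost=1000.0, event_probability=0.03)
print(per_provider, pooled)
```

The two calls illustrate why provider identification matters: with a provider-specific estimate, a reliable provider's customers are not charged for the failure risk of less reliable providers.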
4 Implementation and Validation
All GridEcon Platform components and value-added services described in the previous sections can be orchestrated by the user through the Workflow Engine. Therefore, a typical GridEcon-Platform-based market environment always relies heavily on this central component. Figure 4 shows a sequence diagram illustrating the interaction of the Marketplace, the Capacity Planning Service, and the Workflow Engine. Figure 4 shows that the user only interacts with the Workflow Engine, even if different services are to be used. This approach hides the complexity of using the Cloud by giving the user a single point of entry into the Cloud. The following section describes the implementation of the Workflow Engine, validating the workings of the GridEcon Platform. By describing the simple integration of the market mechanism and the capacity planning service into a GridEcon-Platform-based emulated market environment, we validate the usefulness of the GridEcon Platform.
Fig. 4. Interaction of Services and Components Orchestrated by the Workflow Engine
4.1 Implementation of the Workflow Engine
The Workflow Engine is deployed as a Java Web Service using JAX-WS and is made available to end-users through the GridEcon Web Interface. The workflow specifications are created in XML and, therefore, can be edited with any XML editor. The system was evaluated with respect to three parameters: system performance, system reliability, and accessibility. To test these criteria, we simultaneously submitted 5000 workflow requests for the same task 1000 times. The tests were carried out on a Linux server with two 3.2 GHz CPUs and 2 GB of memory. We used an Apache Tomcat 6.0 server running Java JRE 1.6.0_02 to deploy the Workflow Engine. The results of the performance tests are shown in Figure 5.
Fig. 5. System Performance of the Workflow Engine
The best response time that we obtained on the above-mentioned configuration was 360.0864 ms; the worst response time obtained was 473.3914 ms. The average time for the 1000 tests was 417.1616 ms. Since the GridEcon Platform did not fail during these tests, we concluded that the Workflow Engine and the GridEcon Platform as a whole are reliable and accessible.
4.2 Implementation of the Marketplace
The Marketplace was implemented using Java Web Services technologies and has been deployed in the GridEcon Platform implementation as an Axis2 AAR file. A major goal in the design of the Marketplace component was to provide a flexible, pluggable architecture, allowing easy configuration and expediting future development. In order to provide this extensibility and adaptability, the Spring Framework was utilized. The Spring Framework allows the central matching mechanism to be configured via dependency injection without requiring any further compilation or deployment steps. Dependency injection allows a single market mechanism to be re-deployed with differing configurations and allows different market mechanisms to be developed and configured into the existing Marketplace component. Thus far, the only market mechanism implemented and tested is the continuous double auction mechanism as described in Section 3.2 [5]. As one core responsibility of the Marketplace component is to initiate the start and stop of the resulting allocations (i.e. matched offers), a scheduler mechanism (through OpenSymphony Quartz) was implemented. When an allocation is produced through the matching process, a scheduled event is created in Quartz that will initiate the startup of the resource at the scheduled time. The tests of the Marketplace component took the matching accuracy of the implemented CDA into account, as well as the time taken to find a match for a given bid and ask. The matching accuracy reached 100%, which was the design goal, since the matching algorithm of the marketplace is vital to the functioning of the market. Once this goal had been achieved, the performance of the matching algorithm with respect to the matching time was tested. An excerpt of the results can be seen in Table 2. The table shows the time for handling a single bid submission and the average times for servicing 50 requests under nominal load and heavy load. Nominal load corresponds to the expected average system load. The load was generated by having 30 providers submit 300 asks and 10 bidders submit 100 bids over a time period of 1000 trading days. The system’s performance was measured on a laptop with an AMD Turion TL50 processor and 1 GB of DDR2 memory running Windows XP. The heavy load case represents a situation in which the system has to handle increased load. In this case, 75 providers submitted 750 asks and 75 bidders submitted 750 bids over a time period of 1000 trading days.
Table 2. Marketplace Performance Test Results
Task | Single Transaction | Nominal Load | Heavy Load
Submit Bid - No Match | 45 ms | 46 ms | 88 ms
Submit Bid - Match | 43 ms | 45 ms | 76 ms
Table 2 shows that the matching process is quite fast, even under high system load. Besides validating the overall workings of the GridEcon Platform, the results also show that the Marketplace Component and, therefore, the GridEcon Platform are useful for validating market mechanisms. A market mechanism is simply a module within the Marketplace component and can be plugged into it.
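The pluggable design described in this section can be sketched with plain constructor injection. The sketch below is not the Spring-based GridEcon code: it substitutes a small Python registry for the Spring configuration, and the names `MarketMechanism`, `DoubleAuction`, `PostedPrice`, `Marketplace`, and `build_marketplace`, as well as the simplified clearing rules, are hypothetical.

```python
from typing import Dict, List, Protocol, Tuple

Match = Tuple[float, float]  # (bid price, ask price)


class MarketMechanism(Protocol):
    def clear(self, bids: List[float], asks: List[float]) -> List[Match]: ...


class DoubleAuction:
    """Pairs the highest bids with the lowest asks while the bid covers the ask."""

    def clear(self, bids: List[float], asks: List[float]) -> List[Match]:
        matches: List[Match] = []
        for bid, ask in zip(sorted(bids, reverse=True), sorted(asks)):
            if bid < ask:
                break
            matches.append((bid, ask))
        return matches


class PostedPrice:
    """Buyers, in arrival order, take the cheapest remaining posted offer they can afford."""

    def clear(self, bids: List[float], asks: List[float]) -> List[Match]:
        remaining = sorted(asks)
        matches: List[Match] = []
        for bid in bids:
            if remaining and remaining[0] <= bid:
                matches.append((bid, remaining.pop(0)))
        return matches


class Marketplace:
    # The mechanism is injected, so the Marketplace interface never changes.
    def __init__(self, mechanism: MarketMechanism) -> None:
        self._mechanism = mechanism

    def trade(self, bids: List[float], asks: List[float]) -> List[Match]:
        return self._mechanism.clear(bids, asks)


# Stand-in for an external configuration file: selects the mechanism by name.
MECHANISMS = {"double_auction": DoubleAuction, "posted_price": PostedPrice}


def build_marketplace(config: Dict[str, str]) -> Marketplace:
    return Marketplace(MECHANISMS[config["mechanism"]]())


market = build_marketplace({"mechanism": "double_auction"})
print(market.trade(bids=[9.0, 12.0], asks=[8.0, 10.0]))
```

Switching to another mechanism only changes the configuration value, while the callers of `Marketplace.trade` remain untouched, which is the property the section attributes to dependency injection.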
4.3 Implementation of the Capacity Planning Service
To test the viability of a Capacity Planning Service and to show the usefulness of the GridEcon Platform, we developed a market simulation with the help of the GridEcon Platform. This simulation was aimed at showing that even basic prediction algorithms are valuable for capacity planning in Cloud computing markets. The Capacity Planning Service is expected to function within the previously described continuous double auction (Section 3.2). The matching algorithm of the market applied horizontal and vertical bid atomicity, meaning that the buyer of a Cloud resource obtained all resources from a single provider. At the end of each day, each trader determines its expected demand for the next day. If the expected demand is larger than the number of in-house resources, the trader bids for resources for the next day on the Cloud market. If the number of required resources is lower than the number of in-house resources, the trader attempts to sell the excess resources for the next day on the Cloud market. The traded resources are made available in the morning of the following day. Therefore, each market participant can change its role between being a buyer and being a seller. The GridEcon-Platform-based simulation consisted of 10000 traders, which traded resources for 500 trading days. A detailed overview of the simulation parameters is given in Table 3.
Table 3. Simulation Parameters Overview
Parameter | Value
Number of traders | 10,000
Number of in-house resources (per trader) | 20-40
Market mechanism | CDA
Number of simulated trading days | 500
Offer expiration time | 1 day
Demand distribution | Normal
Demand distribution mean | 30
Demand distribution variance | 30
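The end-of-day rule that each simulated trader follows can be expressed as a small function. This is a sketch of the rule as described in the text; the function name and the tuple return format are assumptions made for illustration.

```python
from typing import Tuple


def daily_order(expected_demand: int, in_house: int) -> Tuple[str, int]:
    """Decide the next-day market action from expected demand and in-house supply."""
    gap = expected_demand - in_house
    if gap > 0:
        return ("bid", gap)    # buy the shortfall on the Cloud market
    if gap < 0:
        return ("ask", -gap)   # offer the excess resources for sale
    return ("none", 0)         # supply already equals expected demand


print(daily_order(expected_demand=35, in_house=30))
print(daily_order(expected_demand=22, in_house=30))
```

Because the sign of the gap can change from day to day, the same trader acts as a buyer on some days and as a seller on others.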
A trader’s resource supply is calculated as the sum of existing in-house resources and purchased Cloud resources. Three of the 10000 traders were given identical demand for resources. In order to work with a realistic demand pattern, we used the results of network traffic measurements to the Web servers of the International University in Germany during a period from October 2007 to December 2008. The demand over this time period is shown in Figure 6 below. Each of these three traders used a specific prediction algorithm. The first trader used a basic prediction algorithm: It used its current demand level to determine the resource amount to be purchased or sold for the next day. The second trader used the past thirty days as a basis for a linear regression to predict demand. The third trader performed a Fourier analysis. The outcomes of the simulations are illustrated in Figure 7. A negative value shows that the trader has fewer resources than required, while a positive value indicates that the trader has more resources than it requires.
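Two of the three prediction approaches described above can be sketched in a few lines. The function names are hypothetical, the regression is an ordinary least-squares fit over the most recent 30 observations extrapolated one day ahead, and the Fourier-based predictor is omitted because the paper gives no details on it.

```python
from typing import Sequence


def predict_naive(history: Sequence[float]) -> float:
    """Basic predictor: tomorrow's demand equals today's demand."""
    if not history:
        raise ValueError("history must not be empty")
    return float(history[-1])


def predict_linear(history: Sequence[float], window: int = 30) -> float:
    """Least-squares line through the last `window` days, extrapolated by one day."""
    ys = list(history[-window:])
    n = len(ys)
    if n == 0:
        raise ValueError("history must not be empty")
    if n == 1:
        return float(ys[0])
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * n


def resource_balance(supply: float, demand: float) -> float:
    """Negative: fewer resources than required; positive: more than required."""
    return supply - demand


demand_history = [10, 12, 14, 16]
print(predict_naive(demand_history), predict_linear(demand_history))
```

Both predictors depend only on past observations, so neither can anticipate a sudden change in the demand pattern.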
[Plot "Resource Demand": y-axis Resource Quantity (0 to 70), x-axis Time (ticks 1 to 469)]
Fig. 6. Network Traffic Demand from Web Server Measurements
Fig. 7. Resource Demand Prediction Results
Figure 7 shows that, initially, all traders had sufficient resources at their disposal. However, after about 300 days, all were unprepared for a sudden demand spike. This result is expected, since all prediction methods are based on past measurements. Another factor that contributed to this result is that not all resource purchases are necessarily matched. As in any market, if there is no matching supplier, then the resource demand cannot be met. This is caused by a difference in prices or by the horizontal and vertical bid atomicity that we applied. If this requirement were removed, the market would have much greater liquidity and the traders would have more resources at their disposal. With the help of the Capacity Planning Service, we also demonstrated how a value-added service can easily be integrated into a GridEcon-Platform-based market environment. The GridEcon Platform allowed integrating a prototype of the capacity planning service and conducting performance measurements on the value-added Capacity Planning Service. This highlights the usefulness of the GridEcon Platform for validating new economics-aware, value-added services.
5 Conclusion and Future Work
In this paper, we have analyzed the existing approaches to utility computing markets, mainly the markets of Popcorn, Spawn, GRACE, and Tycoon. However, due to their lack of flexibility, these prototypes cannot be used to evaluate new market designs or the performance of economics-aware services in a market environment. This limitation has been recognized by the GridEcon project, which created the GridEcon Platform, a testbed for evaluating new designs of computing resource markets and for evaluating value-added services in a commercial Cloud Computing setting.
As part of the GridEcon Platform, two central components have been created: the Workflow Engine, which enables the composition of a market environment in a simple way, and the Marketplace component, which allows the execution of different market mechanisms. The Workflow Engine can be used by users of the GridEcon Platform to orchestrate different market scenarios needed for their analysis. The Marketplace component allows testing and validating different market mechanisms. The GridEcon Platform, in particular, the Workflow Engine, has been tested. It performed its tasks as expected and proved to be vital for developing additional value-added services. The usefulness of the GridEcon Platform has been demonstrated by integrating a market mechanism (continuous double auction) into the GridEcon Marketplace Component and by integrating a capacity planning service into a GridEcon-Platform-based emulated market environment. The GridEcon-Platform-based market environment allowed validating the market mechanism and the capacity planning service by conducting performance measurements.
Acknowledgement
This work has been funded by the European Commission within the framework of the GridEcon project. GridEcon (contract no. 033634) is a STREP project under the Framework Program FP6 in the area of IST. The authors would also like to thank the members of the GridEcon consortium for their contributions. Special acknowledgement is due to Kostas Giannakakis, Alon Lahav, and Sergios Soursos.
References
1. Regev, O., Nisan, N.: The POPCORN market - An Online Market for Computational Resources. In: Proceedings of the First International Conference on Information and Computation Economies. ICE 1998. ACM, New York (1998)
2. Waldspurger, C.A., Hogg, T., Huberman, B.A., Kephart, J.O., Stornetta, W.S.: Spawn: A Distributed Computational Economy. IEEE Transactions on Software Engineering 18(2), 103–117 (1992)
3. Buyya, R., Abramson, D., Giddy, J.: An Economy Grid Architecture for Service-Oriented Grid Computing. In: 10th IEEE International Heterogeneous Computing Workshop. HCW 2001. IEEE Computer Society Press, San Francisco (April 2001)
4. Lai, K., Rasmusson, L., Adar, E., Zhang, L., Huberman, B.A.: Tycoon: An Implementation of a Distributed, Market-Based Resource Allocation System. Multiagent and Grid Systems 1(3), 169–182 (2005)
5. Altmann, J., Courcoubetis, C., Stamoulis, G.D., Dramitinos, M., Rayna, T., Risch, M., Bannink, C.: GridEcon: A market place for computing resources. In: Altmann, J., Neumann, D., Fahringer, T. (eds.) GECON 2008. LNCS, vol. 5206, pp. 185–196. Springer, Heidelberg (2008)
6. Courcoubetis, C., Dramitinos, M., Rayna, T., Soursos, S., Stamoulis, G.D.: Market mechanisms for trading grid resources. In: Altmann, J., Neumann, D., Fahringer, T. (eds.) GECON 2008. LNCS, vol. 5206, pp. 58–72. Springer, Heidelberg (2008)
7. The Popcorn Project (2008), http://www.cs.huji.ac.il/~popcorn/
8. The gridbus project (2008), http://www.gridbus.org/
9. Rahman, M., Buyya, R.: An Autonomic Workflow Management System for Global Grids. In: Proceedings of the Eighth IEEE International Symposium on Cluster Computing and the Grid. CCGrid 2008, May 2008, pp. 578–583. IEEE Computer Society, Washington (2008)
10. Buyya, R., Abramson, D., Giddy, J.: Nimrod-G: An Architecture for a Resource Management and Scheduling System in a Global Computational Grid. In: Proceedings of the 4th International Conference on High Performance Computing in Asia-Pacific Region. HPC Asia 2000, May 2000. IEEE Computer Society Press, Beijing (2000)
11. Venugopal, S., Buyya, R., Winton, L.: A Grid Service Broker for Scheduling Distributed Data-Oriented Applications on Global Grids. In: Proceedings of the 2nd Workshop on Middleware For Grid Computing. MGC 2004, October 2004, vol. 76, pp. 75–80. ACM, Toronto (2004)
12. Buyya, R., Murshed, M.: GridSim: A Toolkit for the Modeling and Simulation of Distributed Resource Management and Scheduling for Grid Computing. Journal of Concurrency and Computation: Practice and Experience 14(13-15), 1175–1220 (2002)
13. Calheiros, R.N., Ranjan, R., De Rose, C.A.F., Buyya, R.: CloudSim: A Novel Framework for Modeling and Simulation of Cloud Computing Infrastructures and Services (to be published in ICPP 2009)
14. Robertson, D.: A Lightweight Method for Coordination of Agent Oriented Web Services. In: Proceedings of the AAAI Spring Symposium on Semantic Web Services (2004)
15. European Energy Exchange (2009), http://www.eex.com/en/
16. Risch, M., Altmann, J.: Cost Analysis of Current Grids and Its Implications for Future Grid Markets. In: Altmann, J., Neumann, D., Fahringer, T. (eds.) GECON 2008. LNCS, vol. 5206, pp. 13–27. Springer, Heidelberg (2008)
17. Risch, M., Altmann, J., Makrypoulias, Y., Soursos, S.: Economics-Aware Capacity Planning for Commercial Grids. In: Collaborations and the Knowledge Economy, vol. 5, pp. 1197–1205. IOS Press, Amsterdam (2008)
18. Risch, M., Altmann, J.: Capacity Planning in Economic Grid Markets. In: Proceedings of the 4th International Conference on Grid and Pervasive Computing 2009. GPC 2009, Geneve, Switzerland. LNCS, pp. 1–12. Springer, Heidelberg (2009)
19. Neumann, D., Stößer, J., Weinhardt, C., Nimis, J.: A Framework for Commercial Grids - Economic and Technical Challenges. Journal of Grid Computing 6(3), 325–347 (2008)
20. Schnizler, B., Neumann, D., Veit, D., Weinhardt, C.: Trading Grid Services - A Multi-Attribute Combinatorial Approach. European Journal of Operational Research 187(3), 943–961 (2008)
VieSLAF Framework: Enabling Adaptive and Versatile SLA-Management
Ivona Brandic, Dejan Music, Philipp Leitner, and Schahram Dustdar
Distributed Systems Group, Institute of Information Systems, Vienna University of Technology, Vienna, Austria
{ivona,dejan,leitner,dustdar}@infosys.tuwien.ac.at
Abstract. Novel computing paradigms like Grid and Cloud computing demand guarantees on non-functional requirements such as application execution time or price. Such requirements are usually negotiated following a specific Quality of Service (QoS) model and are expressed using Service Level Agreements (SLAs). Currently available QoS models either assume that service provider and consumer have matching SLA templates and a common understanding of the negotiated terms, or provide public templates, which can be downloaded and utilized by the end users. On the one hand, matching SLA templates represent an unrealistic assumption in systems where service consumer and provider meet dynamically and on demand. On the other hand, handling of public templates is a rather challenging issue, especially if the templates do not reflect users’ needs. In this paper we present VieSLAF, a novel framework for the specification and management of SLA mappings. Using VieSLAF, users may specify, manage, and apply SLA mappings, bridging the gap between non-matching SLA templates. Moreover, based on the predefined learning functions and considering accumulated SLA mappings, domain-specific public SLA templates can be derived that reflect users’ needs.
1 Introduction
Nowadays, well established and traditional resource sharing models are shifted towards novel market-oriented models revolutionizing existing Grid and High Performance Computing (HPC) concepts [8]. In market-oriented resource sharing models users discover resources on demand and pay for the usage of the specific resources. In turn they expect that besides requested functional requirements, non-functional requirements of the application execution are also fulfilled [1,19,12]. Non-functional requirements comprise application execution time, reliability, availability, and similar issues. Non-functional requirements are termed as Quality of Service (QoS) and are expressed and negotiated by means of Service Level Agreements (SLAs). SLA templates represent empty SLA documents i.e., SLA documents, with all required elements like parties, SLA parameters, metrics, and objectives, but without QoS values [9]. A large body of work deals with SLA based QoS negotiation and integration of QoS concepts into Grid management tools [10,17]. However, most of the J. Altmann, R. Buyya, and O.F. Rana (Eds.): GECON 2009, LNCS 5745, pp. 60–73, 2009. c Springer-Verlag Berlin Heidelberg 2009
VieSLAF Framework: Enabling Adaptive and Versatile SLA-Management
61
existing work relies either on inflexible QoS models assuming that the communication partners have matching SLA templates or provide public SLA templates, which can be downloaded and utilized on the users’ system. On the one hand, matching SLA templates limit QoS negotiation only between partners where QoS relationship is already established off-line, or to partners who belong to a particular Grid portal [1]. On the other hand, publicly available SLA templates usually do not reflect users’ needs. Thus, in order to increase QoS versatility, flexible QoS models are necessary where negotiation is possible even between services which do not have matching SLA templates. The problems with nonmatching templates can be exemplified on a very simple example with differing terms of contract on both sides. The term price may be defined as usage price or service price, etc., leading to inconsistencies during the negotiation process. Another example of not matching templates are SLA templates which slightly differ in their structure. In this paper we approach the gap between existing QoS methods and novel computing paradigms like Cloud Computing by proposing VieSLAF, a framework for the management of SLA mappings. Thereby, mappings are defined by XSLT1 documents where inconsistent parts of one document are mapped to another document e.g, from the consumer’s template to the provider’s template. Moreover, based on SLA mappings and deployed taxonomies we eliminate semantic inconsistencies between consumer’s and provider’s templates. The purpose of the submitted SLA mappings is twofold: (1) using VieSLAF users may discover services on demand, define mappings to available templates, if necessary and finally start the negotiation with selected services. 
Therefore, the negotiation is not limited to services that belong to a particular portal or with which a relationship has already been established off-line; (2) based on VieSLAF's predefined learning functions and the accumulated SLA mappings we facilitate the user-driven definition of public SLA templates. As a case study, the presented SLA mapping architecture has been successfully used to manage SLA mappings in the context of a Grid workflow management tool [4] and adaptable Cloud services [6]. In addition to [4,6], where we presented the general approach for SLA mappings, in this paper we present (1) the VieSLAF architecture in detail, with modules for the measurement of SLA parameters; (2) implementation details of the VieSLAF framework; and (3) first experimental results. The main contributions of this paper are: (1) a description of the scenarios for the definition of SLA mapping documents; (2) the definition of the VieSLAF architecture used for the semi-automatic management of SLA mappings; (3) a demonstration of learning functions, which can be used to obtain realistic public templates; and (4) an evaluation of the VieSLAF architecture using an experimental testbed.
The rest of this paper is organized as follows: Section 2 presents the related work. In Section 3 we present our SLA mapping approach. In particular, we discuss the management of SLA mappings, SLA transformations, and an example SLA mapping document. Section 4 presents the VieSLAF architecture, including the semantic model used, the methods for SLA mappings and transformations, the registries used, and the features for SLA monitoring and the adaptation of SLA templates. In Section 5 we discuss our first experimental results. Section 6 concludes this paper and describes future work.
Footnote 1: XSL Transformations (XSLT) Version 1.0, http://www.w3.org/TR/xslt.html
2 Related Work
Currently, a large body of work exists in the area of Grid service negotiation and SLA-based QoS. Most of the related work can be classified into the following three categories: (1) adaptive SLA mechanisms based on OWL, DAML-S, and other semantic technologies [17,10,23]; (2) SLA-based QoS systems, which consider varying service requirements but do not consider non-matching SLA templates [1,20]; and (3) systems relying on the principles of autonomic computing [3,14,15].
The work presented in [18] discusses the incorporation of SLA-based resource brokering into existing Grid systems. Oldham et al. describe a framework for semantic matching of SLAs based on WSDL-S and OWL [17]. Dobson et al. present a unified quality of service (QoS) ontology applicable to the main scenarios identified, such as QoS-based Web service selection, QoS monitoring, and QoS adaptation [10]. Zhou et al. survey the current research on QoS and service discovery, including ontologies such as OWL-S and DAML-S. Thereafter, they propose an ontology, DAML-QoS, which provides detailed QoS information in a DAML format [23]. Hung et al. propose an independent declarative XML language called WS-Negotiation for Web service providers and requestors. WS-Negotiation contains three parts: the negotiation message, which describes the format of the messages exchanged among negotiation parties; the negotiation protocol, which describes the mechanism and rules that negotiation parties should follow; and negotiation decision making, which is an internal and private decision process based on a cost-benefit model or other strategies [13]. The work presented in [1] extends the service abstraction in the Open Grid Services Architecture (OGSA) for QoS properties, focusing on the application layer. Thereby, a given service may indicate the QoS properties it can offer, or it may search for other services based on specified QoS properties. Quan et al. discuss the process of mapping a light communication workflow within an SLA context with different kinds of sub-jobs and resources [16]. Dan et al. present a framework for providing customers of Web services with differentiated levels of service through the use of automated management and SLAs [9]. Ardagna et al. present an autonomic grid architecture with mechanisms to dynamically re-configure service center infrastructures, which is basically exploited to fulfill varying QoS requirements [3]. Koller et al. discuss autonomous QoS management using a proxy-like approach. The implementation is based on WS-Agreement [21]. Thereby, SLAs can be exploited to define certain QoS parameters that a service has to maintain during its interaction with a specific customer [14]. König et al. investigate the trust issue in electronic negotiations,
dealing with how to trust a potential transaction partner and how to choose such partners based on their past behavior [15].
However, to the best of our knowledge, none of the discussed approaches deals with the user-driven and semi-automatic definition of SLA mappings that enables negotiations between inconsistent SLA templates. Moreover, none of the presented approaches facilitates the user-driven definition of publicly available SLA templates.
3 The SLA Mapping Approach
In the presented approach each SLA template has to be published in a registry where the negotiation partners, i.e., provider and consumer, can find each other. The management of SLA mappings and published services is presented in Section 3.1. The transformations between remote and local SLA templates are discussed in Section 3.2. Finally, an example SLA mapping document is presented in Section 3.3.
3.1 Management of SLA Mappings
Figure 1(a) depicts the architecture for the management of SLA mappings and the participating parties. The registry comprises different SLA templates, each of which represents a specific application domain, e.g., SLA templates for the medical, telco, or life science domain. Each service provider may assign his/her service to a particular template (see step 1 in Figure 1(a)) and afterwards assign SLA mappings, if necessary (see step 2). Each template a may have n services assigned. Available templates can be browsed using an appropriate GUI. Service consumers may search for services using meta-data and search terms (step 3). After finding appropriate services, each service consumer may define mappings to the associated template (step 4). Thereafter, the negotiation between service consumer and service provider may start, as described in the next section. SLA mappings should be defined in a dynamic way. Thus, SLA templates can be updated frequently, based on predefined adaptation rules, to reflect the actual SLAs used by service providers and consumers (step 5). This adaptation functionality facilitates the generation of user-driven public SLA templates.
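Steps 1 to 4 of this workflow can be sketched with a small in-memory registry. This is only an illustrative sketch: the class names, method names, and sample data (Template, Registry, assign_service, "ImageReconstruction", and so on) are assumptions made for the example and are not part of VieSLAF, and step 5 (template adaptation) is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Template:
    """A public SLA template for one application domain (hypothetical model)."""
    domain: str
    parameters: set
    services: dict = field(default_factory=dict)           # service -> provider mappings
    consumer_mappings: dict = field(default_factory=dict)  # consumer -> mappings


class Registry:
    """In-memory stand-in for the service registry of Figure 1(a)."""

    def __init__(self):
        self.templates = {}

    def add_template(self, template):
        self.templates[template.domain] = template

    def assign_service(self, domain, service, mappings=None):
        # Steps 1 and 2: a provider assigns a service to a template and,
        # if necessary, attaches SLA mappings (local term -> template term).
        self.templates[domain].services[service] = dict(mappings or {})

    def search_services(self, term):
        # Step 3: a consumer searches by a simple case-insensitive term.
        term = term.lower()
        return sorted(
            (template.domain, service)
            for template in self.templates.values()
            for service in template.services
            if term in service.lower() or term in template.domain.lower()
        )

    def define_consumer_mapping(self, domain, consumer, mappings):
        # Step 4: the consumer defines mappings to the associated template.
        self.templates[domain].consumer_mappings[consumer] = dict(mappings)


registry = Registry()
registry.add_template(Template("medical", {"Price", "Availability"}))
registry.assign_service("medical", "ImageReconstruction", {"ServicePrice": "Price"})
registry.define_consumer_mapping("medical", "consumer-1", {"UsagePrice": "Price"})
hits = registry.search_services("image")
```

Here the provider and the consumer use different local terms (ServicePrice and UsagePrice), and both map them to the template term Price, which is what makes negotiation possible without matching local templates.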
[Figure 1: panel (a) shows a service consumer and a service provider interacting with a service registry that holds templates with assigned services (Service 1, ..., Service n) through numbered steps 1 to 5 (step 3: search services). Panel (b) shows a local WSLA template combined with an XSL transformation rule from local to remote, and a remote WSLA template combined with an XSL transformation rule from remote to local.]
Fig. 1. Management of SLA-Mappings (a) QoS basic scenario (b)
Currently, SLA mappings are defined at the XML level, where users define XSL transformations. A UML-based GUI for the management of SLA mappings is under development [4].
3.2 SLA-Mappings Transformations
Figure 1(b) depicts a scenario for defining XSL transformations. As the SLA specification language we use Web Service Level Agreement (WSLA) [22]. We have also developed first bootstrapping strategies for communication across different SLA specification languages [5]. WSLA templates are publicly available and published in a searchable registry. Each participant may download already published WSLA templates and compare them with the local template in a semi-automated or automated way. If any inconsistencies are discovered, the service consumer may write rules (XSL transformations) from his/her local WSLA template to the remote template. The rules can also be written using appropriate visualization tools, for example a GUI as depicted in Figure 3. Thereafter, the rules are stored in the database and can be applied to the remote WSLA template at runtime. Since transformations are done in two directions during the negotiation process, transformations from the remote WSLA template to the local WSLA template are necessary as well.
As depicted in Figure 1(b), a service consumer generates a WSLA. The locally generated WSLA plus the rules defining transformations from the local to the remote WSLA deliver a WSLA that is compliant with the remote WSLA. In the second case, the remote WSLA template has to be translated into the local one: the remote WSLA plus the rules defining transformations from the remote to the local WSLA deliver a WSLA that is compliant with the local WSLA. Thus, negotiation may be done between non-matching WSLAs in both directions: from service consumer to service provider and vice versa.
The service provider can define rules for XSL transformations from the publicly published WSLA templates to the local WSLA templates in the same way as depicted in Figure 1(b). Thus, both parties, provider and consumer, may match on a publicly available WSLA template.
3.3 SLA-Mappings Document (SMD)
Figure 2 shows a sample rule for XSL transformations where a price defined in Euros is transformed to an equivalent price in US Dollars. Please note that, for the sake of simplicity, we use a relatively simple example. More complicated mappings can also be defined using XSLT; their explanation is out of the scope of this paper. As shown in Figure 2, the Euro metric is mapped to the Dollar metric. In this example we define the mapping rule returning Dollars by using the Times function of the WSLA specification (see line 5). The Times function multiplies two operands: the first operand is the Euro amount copied in line 12, the second operand is the Dollar/Euro quote (1.27559) as specified in line 17. The Dollar/Euro quote can be retrieved by a Web service and is usually not hard coded. With similar mapping rules users can map simple syntax values (values of some attributes etc.), but they can even define complex semantic mappings with considerable logic. Thus, even syntactically and semantically different SLA templates can be translated into each other.
1.  ...
2.  <xsl:template ...>
3.    <xsl:element name="Function" ...>
4.      <xsl:attribute name="type">
5.        <xsl:text>Times</xsl:text>
6.      </xsl:attribute>
7.      <xsl:attribute name="resultType">
8.        <xsl:text>double</xsl:text>
9.      </xsl:attribute>
10.     <xsl:element name="Operand" ...>
11.       <xsl:copy>
12.         <xsl:copy-of select="@*|node()"/>
13.       </xsl:copy>
14.     </xsl:element>
15.     <xsl:element name="Operand" ...>
16.       <xsl:element name="FloatScalar" ...>
17.         <xsl:text>1.27559</xsl:text>
18.       </xsl:element>
19.     </xsl:element>
20.   </xsl:element>
21. </xsl:template>
22. ...
Fig. 2. Example XSL Transformation
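The effect of the rule in Figure 2 can be mimicked with the Python standard library. The sketch below is a stand-in, not an XSLT processor: it builds the same Function/Operand/FloatScalar structure with xml.etree.ElementTree and then evaluates it. The element name Metric, the metric name PriceInEuro, and both helper functions are assumptions introduced for illustration.

```python
import copy
import xml.etree.ElementTree as ET

EUR_TO_USD = 1.27559  # quote from Figure 2; in practice retrieved from a Web service


def wrap_in_times(metric, factor):
    """Build a Function element of type Times with two operands:
    a copy of the original metric and a FloatScalar holding the factor."""
    function = ET.Element("Function", {"type": "Times", "resultType": "double"})
    first = ET.SubElement(function, "Operand")
    first.append(copy.deepcopy(metric))      # corresponds to xsl:copy-of
    second = ET.SubElement(function, "Operand")
    scalar = ET.SubElement(second, "FloatScalar")
    scalar.text = str(factor)
    return function


def evaluate_times(function, metric_values):
    """Multiply all operands; metric operands are looked up by name."""
    result = 1.0
    for operand in function.findall("Operand"):
        child = operand[0]
        if child.tag == "FloatScalar":
            result *= float(child.text)
        else:
            result *= metric_values[child.text]
    return result


# Hypothetical source metric expressed in Euros.
euro_metric = ET.Element("Metric")
euro_metric.text = "PriceInEuro"

rule_output = wrap_in_times(euro_metric, EUR_TO_USD)
price_usd = evaluate_times(rule_output, {"PriceInEuro": 100.0})
```

With a Euro price of 100.0, the evaluated Dollar price is 100.0 × 1.27559, i.e., roughly 127.56. The original metric is left untouched, because the operand holds a copy of it.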
4 VieSLAF Framework
In this section we present the architecture used for the semi-automated management of SLA mappings and the generation of public SLA templates. We discuss a sample architectural case study exemplifying the usage of VieSLAF. Thereafter, we describe each of VieSLAF's core components in detail.
4.1 VieSLAF Architecture
The VieSLAF framework enables application developers to efficiently develop adaptable service-oriented applications by simplifying the handling of numerous Web service specifications. The framework facilitates the management of QoS models, for example the management of meta-negotiations and SLA mappings [7]. Based on the VieSLAF framework, service providers may easily manage QoS models and SLA templates and frequently check whether selected services satisfy the developer's needs, e.g., the QoS parameters specified in SLAs. Furthermore, we discuss basic ideas about the adaptation of SLA templates needed for the generation of realistic public SLA templates.
[Figure 3: architecture diagram with interactions numbered (1) to (9). Labels recoverable from the diagram: knowledge base with registry, databases (DB), data model, remote SLA templates, and meta-negotiation documents; adaptation, with adaptation rules for SLA templates; monitoring, with a cloud of measurement services and one thread per monitored parameter (Thread1_param1, ..., Threadn_paramn); sample provider and sample consumer, each with a local SLA template, SLA mapping middleware (transformation rules: XSLT, XPath), and meta-negotiation middleware (MNM); client/consumer-specific middleware; service-provider-specific middleware; WSDL API; Service 1 and Service 2.]
Fig. 3. VieSLAF Architecture
We describe the VieSLAF components based on Figure 3. As shown in step (1) in Figure 3, users may access the registry using a GUI and browse through existing templates using the SLA mapping middleware. In the next step (2), service providers specify SLA mappings using the SLA mapping middleware and submit them to the registry. Thereafter, in step (3), service consumers may define their own SLA mappings to the remote templates and submit them to the registry. The SLA mapping middleware on both sides (provider's and consumer's) facilitates the management of SLA mappings. Submitted SLA mappings are parsed and mapped to a predefined data model (step 4). Thereafter, service negotiation may start (step 5). During the negotiation, SLA mappings and XSLT transformations are applied (step 6). After the negotiation, the invocation of the service methods may start (step 7). SLA parameters are monitored using the monitoring service (step 8). Based on the submitted SLA mappings, publicly available SLA templates are adapted to reflect the majority of local SLA templates (step 9).
4.2 VieSLAF Components
As shown in Figure 3, the major VieSLAF components are the knowledge base, the components for monitoring and adaptation, and the SLA mapping middleware used by service provider and consumer.
Knowledge Base. As shown in Figure 3, the knowledge base is responsible for storing SLA templates and SLA mapping documents. For the storage of SLA template documents we implemented registries representing searchable repositories. Currently we have implemented an MS-SQL 2008 database with a Web
service front end that provides the interface for the management of SLA mappings. To handle scalability issues we intend to host the registries using a cloud of databases hosted on a service provider such as Google App Engine [11] or Amazon EC2 [2]. SLA templates are stored in a canonical form, enabling the comparison of XML documents. The registry methods are implemented as Windows Communication Foundation (WCF) services and can be accessed only with appropriate access rights. The database is manipulated based on a role model. We define three roles: service consumer, service provider, and registry administrator. Service consumers are able to search for suitable services in the selected service categories, e.g., by using the method findServices. Service consumers may also create SLA mappings using the method createAttributeMapping. Service providers may publish their services and bind them to a specific template category using the method createService.
Sample Provider and Sample Consumer. A sample provider and a sample consumer are shown in the lower part of Figure 3. Basically, a service consumer/provider consists of a client/service-based middleware, the SLA mapping middleware facilitating access to the registries, and a GUI used for browsing remote templates.
SLA Mapping Middleware. As already mentioned, the SLA mapping middleware is based on different WCF services. For the sake of brevity, we discuss just a few of them in the following. The RegistryAdministrationService provides methods for the manipulation of the database where administrator rights are required, e.g., the creation of template categories. Another example is the WSLAMappingService, which is used for the management of SLA mappings by service consumer and service provider. The WSLAQueryingService is used to query the SLA mapping database. The database can be queried based on template categories, SLA attributes, and similar attributes.
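The role model and the registry methods named above can be sketched as a simple permission check. This is not the WCF implementation: the method names findServices, createAttributeMapping, and createService come from the text, whereas the class RegistryFacade, the method name createTemplateCategory, the signatures, and the exact permission table are assumptions made for this example.

```python
from enum import Enum


class Role(Enum):
    CONSUMER = "service consumer"
    PROVIDER = "service provider"
    ADMINISTRATOR = "registry administrator"


# Assumed permission table. The text assigns createAttributeMapping to
# consumers; allowing providers as well is an assumption based on the
# remark that both parties manage SLA mappings.
PERMISSIONS = {
    "findServices": {Role.CONSUMER},
    "createAttributeMapping": {Role.CONSUMER, Role.PROVIDER},
    "createService": {Role.PROVIDER},
    "createTemplateCategory": {Role.ADMINISTRATOR},  # hypothetical method name
}


class RegistryFacade:
    """Hypothetical facade that enforces the role model before each call."""

    def __init__(self):
        self.categories = {}          # category -> list of service names
        self.attribute_mappings = []  # (owner, category, local, remote)

    def _check(self, role, method):
        if role not in PERMISSIONS[method]:
            raise PermissionError(f"{role.value} may not call {method}")

    def createTemplateCategory(self, role, category):
        self._check(role, "createTemplateCategory")
        self.categories.setdefault(category, [])

    def createService(self, role, category, service):
        self._check(role, "createService")
        self.categories[category].append(service)

    def findServices(self, role, category):
        self._check(role, "findServices")
        return list(self.categories.get(category, []))

    def createAttributeMapping(self, role, owner, category, local, remote):
        self._check(role, "createAttributeMapping")
        self.attribute_mappings.append((owner, category, local, remote))


facade = RegistryFacade()
facade.createTemplateCategory(Role.ADMINISTRATOR, "calculator")
facade.createService(Role.PROVIDER, "calculator", "S1")
facade.createAttributeMapping(Role.CONSUMER, "consumer-1", "calculator",
                              "UsagePrice", "Price")
found = facade.findServices(Role.CONSUMER, "calculator")

try:
    facade.createService(Role.CONSUMER, "calculator", "S2")
    denied = False
except PermissionError:
    denied = True
```

Keeping the permission table separate from the methods makes it easy to see at a glance which role may trigger which database manipulation.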
Other implemented WCF services are, for example, services for SLA parsing, XSL transformations, and SLA validation. Service consumers may search for appropriate services through the WSLAQueryingService and define appropriate SLA mappings by using the method createAttributeMapping. For each query request it is checked at runtime whether the service consumer has also specified any SLA mappings for the SLAElements and SLAAttributes specified in the category's SLA-Template. SLA transformations are applied before the requests of the service consumers can be completely checked. The rules necessary for the transformations of attributes and elements can be found in the database and can be applied using the consumer's WSLA-Template. Thereafter, the consumer's template is completely translated into the category's WSLA-Template. Transformations are done by the WSLATransformator, implemented with .NET 3.5 technology and using LINQ (Language Integrated Query).
Monitoring Service. As depicted in Figure 3, we implemented a lightweight concept for the monitoring of SLA parameters for all services published in a specific template category. The aim of the monitoring service is to frequently check the status of the SLA parameters of an SLA and to deliver this information to the service consumer and/or provider. Furthermore, the monitoring service monitors the values of the SLA parameters as specified in the SLA-Template of the published services. Monitoring starts after a service is published in a category and is provided throughout the whole lifetime of the service. The monitoring service is implemented as an internal registry service, similar to the services for parsing, transformation, and validation explained above. In the following we describe how the monitoring process is started, i.e., all the steps necessary to set up monitoring. After the service and the SLA mappings have been published, the SLAs are parsed and it is identified which SLA parameters have to be monitored and how. We distinguish between periodically measured SLA parameters and parameters that are measured on request. The values of the periodically measured parameters are stored in the so-called parameter pool. The monitoring service provides two methods: a knock-in method for starting the monitoring and a method for receiving the measured SLA parameters from the parameter pool. Whenever a user requests monitoring information for a particular SLA, (i) the SLA parameters are requested from the parameter pool in the case of periodically measured parameters, or (ii) the SLA parameters are measured immediately, as defined in the parsed and validated SLAs, in the case of on-request parameters.
Adaptation Service. Remote SLA templates should not be defined in a static way; they should reflect the providers' and consumers' needs. Thus, we implemented a first prototype of an adaptation service, internal to the registry, which can be used by consumers and providers as shown in Figure 3 in order to derive realistic public SLA templates.
Users can specify SLAParameters that should be added to an SLA-Template or choose SLAParameters that they do not need and want to delete. Each ParameterWish (add/delete) is saved as an XML chunk that contains all SLAParameters, with their metrics, that should be added to or deleted from a specific SLA-Template. Registry administrators have to configure a learning capability property for each template category. The property defines how many requests for a specific ParameterWish have to be submitted before the wish is applied, i.e., before the corresponding parameters are added to or deleted from the SLA-Template. Whenever a new ParameterWish is accepted, a new revision of the category's SLA template is generated. All services and consumers who voted for that specific wish are automatically re-published to the new revision. All SLA mappings are also automatically assigned to the new template revision. Old SLA mappings of the consumers and services are deleted, and all old background threads used for calculations for the old SLA template are aborted. The newly generated SLA template is thereafter parsed, and new background monitoring threads are created and started for each service. Thus, based on the presented adaptation approach, public templates can be derived in a user-driven way, reflecting the majority of local templates.
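The threshold-based acceptance of a ParameterWish can be sketched as follows. This is a minimal illustration under stated assumptions: the class TemplateCategory, the method submit_wish, and the parameter learning_threshold (standing in for the learning capability property) are hypothetical names, each wish covers a single parameter, and re-publishing of services, mappings, and monitoring threads is omitted.

```python
from collections import defaultdict


class TemplateCategory:
    """Hypothetical template category that keeps a list of template revisions."""

    def __init__(self, parameters, learning_threshold):
        self.revisions = [frozenset(parameters)]
        self.learning_threshold = learning_threshold
        self._votes = defaultdict(set)  # (action, parameter) -> set of voters

    @property
    def current(self):
        return self.revisions[-1]

    def submit_wish(self, voter, action, parameter):
        """Record an add/delete wish; return True if a new revision was created."""
        if action not in ("add", "delete"):
            raise ValueError("action must be 'add' or 'delete'")
        # Ignore wishes that would not change the current template.
        if action == "add" and parameter in self.current:
            return False
        if action == "delete" and parameter not in self.current:
            return False
        voters = self._votes[(action, parameter)]
        voters.add(voter)  # a set, so repeated votes by one voter count once
        if len(voters) < self.learning_threshold:
            return False
        if action == "add":
            new_revision = self.current | {parameter}
        else:
            new_revision = self.current - {parameter}
        self.revisions.append(new_revision)
        del self._votes[(action, parameter)]
        return True


category = TemplateCategory({"Price", "Availability"}, learning_threshold=2)
first = category.submit_wish("consumer-1", "add", "ResponseTime")
repeat = category.submit_wish("consumer-1", "add", "ResponseTime")
second = category.submit_wish("provider-1", "add", "ResponseTime")
```

In this run the first two calls come from the same voter and do not reach the threshold of two distinct requests; the third call does, so a second revision containing ResponseTime is appended while the original revision is kept for reference.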
5 Evaluation
In this section we evaluate the VieSLAF framework. In Section 5.1 we measure the overhead produced by SLA mappings compared to Web service invocation without mappings. We describe the experimental testbed and the setup used. Thereafter, we discuss the experimental results. In Section 5.2 we discuss stress tests with a varying number of concurrently invoked SLA mappings. In Section 5.3 we present results with a varying number of SLA mappings per single Web service invocation.
[Figure 4: testbed diagram. Labels recoverable from the diagram: services S1 to S10; VieSLAF SLA mapping middleware; registry database; administrator role; Windows Server 2008 SP 1; registry administration; service invocation; VieSLAF client; mapping, parsing.]
Fig. 4. VieSLAF Testbed
5.1 Overhead Test
In order to test the VieSLAF framework we developed a testbed as shown in Figure 4. As a client machine we used an Acer Aspire laptop with an Intel Core 2 Duo T5600 at 1.83 GHz, 2 MB L2 cache, and 1 GB RAM. For hosting the 10 sample services, calculator services with 5 methods, we used a single-core Xeon 3.2 GHz Sun Blade machine with 2 MB L1 cache and 4 GB RAM. We use the same machine to host VieSLAF's WCF services. The aim of our evaluation is to measure the overhead produced by using VieSLAF's WSLAQueryingService for the search and the SLA mappings of the appropriate services. We created 10 services (S1, ..., S10) and 10 accounts for service providers. We also created the registry administrator's role, which manages the creation of template categories with the corresponding SLA templates. The SLA template represents a remote calculator service with five methods: Add, Subtract, Multiply, Divide, and Max. Both the provider and the consumers define five SLAMappings, which have to be used at runtime. We specify three simple, syntactic mappings where we only change the name of an element or attribute. The other two are semantic mappings, where we map between structurally different SLA templates.
Table 1 shows the experimental results. The measured values represent the arithmetic mean of 20 service invocations. The measured overhead includes the time needed for the validation of WSLA documents (column Validation in Table 1), the time necessary to perform SLA mappings from the consumers' local templates to the remote SLA templates (column Consumer Mapping), and the time necessary to transform the remote SLA templates to the providers' local templates (column Provider Mapping). Furthermore, we measured the remaining time necessary to perform a search. The remaining time includes the round-trip time for a search, including data transfer between the client and the service and vice versa. As shown in Table 1, the time necessary to handle SLA mappings (Validation + Consumer Mapping + Provider Mapping) amounts to 0.046 + 0.183 + 0.151 = 0.380 seconds, or 27.36% of the overall search time of 1.389 seconds. Please note that the intention of the presented experimental results is a proof of concept of the SLA mapping approach. We did not test scalability, since we intend to employ computing Clouds like Google App Engine [11] or Amazon EC2 [2] in order to cope with scalability issues.
Table 1. SLA Mappings Overhead Compared to Simple Web Service Invocation (Without SLA Mappings)
Service Search Time
              | SLA-Mapping                                      | Remaining Time | Total
              | Validation | Consumer Mapping | Provider Mapping |                |
Time in sec   | 0.046      | 0.183            | 0.151            | 1.009          | 1.389
Time [%]      | 3.32       | 13.17            | 10.87            | 72.64          | 100.00
5.2 Stress Tests
In this section we describe tests on how the VieSLAF middleware copes with multiple SLA mappings of differing complexity executed concurrently. The evaluation is done on an Acer Aspire laptop with an Intel Core 2 Duo T5600 at 1.83 GHz, 2 MB L2 cache, and 1 GB RAM. For the evaluation we used two different kinds of SLA mappings:
– Simple: invocation of simple SLA mappings; an example is the translation of one attribute to another attribute, e.g., usage price to price.
– Complex: invocation of complex SLA mappings, for example semantic mappings considering two structurally different SLA templates.
We tested VieSLAF with different versions of XSLT transformers, namely with XSLTCompiledTransform, .Net version 3.0, and with the obsolete XSLTTransform Class from .Net 1.1. Figure 5(a) shows the measurements with the XSLTCompiledTransform Transformer and with the XSLTTransform Class. The x axis depicts the number of SLA mappings performed concurrently, i.e., the number of runs. The y axis depicts the measured time for the execution of the SLA mappings in seconds. From the measurement results we can observe that the XSLTTransform Class is faster than the XSLTCompiledTransform Transformer from the newer .Net version. Complex mappings executed with the XSLTTransform Class almost overlap with the simple mappings executed with the XSLTCompiledTransform. We can observe that in both cases, simple and complex mappings, the performance starts to decrease significantly once the number of SLA mappings exceeds 100. If the number of mappings is below 100, the execution time is about 1 second or less.
[Figure 5: two plots. Recoverable from the plot residue: x-axis tick labels 5x, 10x, 15x, 20x, 25x, 50x, 100x, 500x, 1000x; axis ticks 0.01, 0.1, 1, 10; series Simple XSLTCompiledTransform, Complex XSLTCompiledTransform, Simple XSLTTransform, Complex XSLTTransform.]
Fig. 5. Stress Tests with XSLTCompiledTransform Transformer and XSLTTransform Class (a) Measurements with varying number of SLA mappings per Web Service Invocation (b)
5.3 Multiple SLA Mapping Tests
In this section we discuss performance results measured during a Web service call with varying numbers of SLA mappings per service. We measured 5, 10, 15, and 20 SLA mappings per Web service call. In order to create a realistic testbed we used SLA mappings that depend on each other: e.g., attribute A is transformed to attribute B, B is transformed to C, C to D, and so on. Thus, we simulate the worst case, where SLA mappings cannot be performed concurrently but have to be performed sequentially. The evaluation is done on an Acer Aspire laptop with an Intel Core 2 Duo T5600 at 1.83 GHz, 2 MB L2 cache, and 1 GB RAM. Figure 5(b) shows the measured results. The x axis depicts the number of SLA mappings, performed concurrently or, where attribute dependencies exist, sequentially. The y axis depicts the measured time for the execution of the SLA mappings in milliseconds. We executed the SLA mappings between the remote template and the provider's template (i.e., the provider mappings as described in Table 1) before runtime, because these mappings are known before consumer requests arrive. Thus, only the mappings between the consumer's template and the remote template are done at runtime, as indicated by the SLA Mapping line. The line SLA Mapping + Client invocation comprises the time for the invocation of a Web service method including the SLA mapping time. It does not comprise the round-trip time; it comprises only the request time. We can conclude that even with an increasing number of SLA mappings, and considering the worst-case scenario with sequentially performed mappings, the SLA mapping time represents about 20% of the overall execution time.
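Why dependent mappings force sequential execution can be illustrated with plain attribute renaming. The sketch below is not the VieSLAF transformer; the helper names apply_chain and make_chain and the sample attribute value are assumptions for the example. It applies a chain A to B, B to C, and so on in order, and then shows that the same rules applied in a different order give a different result.

```python
def apply_chain(attributes, mappings):
    """Apply rename mappings one after another, in the given order.

    Each mapping is a (source, target) pair; a mapping whose source is not
    present at that point has no effect."""
    result = dict(attributes)
    for source, target in mappings:
        if source in result:
            result[target] = result.pop(source)
    return result


def make_chain(length):
    """Build a dependent chain A -> B, B -> C, ... with the given number of mappings."""
    names = [chr(ord("A") + i) for i in range(length + 1)]
    return list(zip(names, names[1:]))


chain = make_chain(5)                        # A->B, B->C, C->D, D->E, E->F
final = apply_chain({"A": 42}, chain)        # every step sees the previous result
out_of_order = apply_chain({"A": 42}, list(reversed(chain)))
```

Applied in order, the value ends up under attribute F. Applied in reverse order, only the A to B rule finds its source attribute, so the value stops at B. Because each rule consumes the output of its predecessor, the rules in such a chain cannot be evaluated independently of one another.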
6 Conclusion and Future Work
In this paper we presented the VieSLAF framework for the management of SLA mappings. SLA mappings are necessary in service-oriented Grids and computational Clouds, where service consumer and provider usually do not have matching SLA templates. Based on SLA mappings, even partners with slightly different templates may negotiate with each other, which increases the number of potential negotiation partners. We have demonstrated how Grid service users (provider and consumer) may search for appropriate services, define SLA mappings, if necessary, and finally start service negotiation and execution. Using VieSLAF, users can even monitor SLA parameters during the execution of the service calls. Thereafter, we presented how the SLA mappings and the predefined learning functions can be used to adapt SLA templates. Adaptability functions facilitate the generation of user-driven public SLA templates. Finally, we discussed our first proof of concept based on the experimental results. In the future we plan to extend our work on adaptable Cloud services and test our approach with real-life applications.
Acknowledgments
The work described in this paper was partially supported by the European Community's Seventh Framework Programme [FP7/2007-2013] under grant agreement 215483 (S-Cube) and by the Vienna Science and Technology Fund (WWTF) under grant agreement ICT08-018 Foundations of Self-governing ICT Infrastructures (FoSII).
References
1. Al-Ali, R.J., Rana, O.F., Walker, D.W., Jha, S., Sohail, S.: G-qosm: Grid service discovery using qos properties. Computing and Informatics 21, 363–382 (2002)
2. Amazon Elastic Compute Cloud (Amazon EC2), http://aws.amazon.com/ec2/
3. Ardagna, D., Giunta, G., Ingraffia, N., Mirandola, R., Pernici, B.: QoS-driven web services selection in autonomic grid environments. In: Meersman, R., Tari, Z. (eds.) OTM 2006. LNCS, vol. 4276, pp. 1273–1289. Springer, Heidelberg (2006)
4. Brandic, I., Music, D., Dustdar, S., Venugopal, S., Buyya, R.: Advanced QoS Methods for Grid Workflows Based on Meta-Negotiations and SLA-Mappings. In: The 3rd Workshop on Workflows in Support of Large-Scale Science. In conjunction with Supercomputing 2008, Austin, TX, USA, November 17 (2008)
5. Brandic, I., Music, D., Dustdar, S.: Service Mediation and Negotiation Bootstrapping as First Achievements Towards Self-adaptable Grid and Cloud Services. In: Grids meet Autonomic Computing Workshop 2009 - GMAC 2009. In conjunction with the 6th International Conference on Autonomic Computing and Communications, Barcelona, Spain, June 15-19 (2009)
6. Brandic, I.: Towards Self-manageable Cloud Services. In: The Second IEEE International Workshop on Real-Time Service-Oriented Architecture and Applications (RTSOAA 2009). In conjunction with the 33rd Annual IEEE International Computer Software and Applications Conference, Seattle, Washington, USA, July 20-24 (2009)
7. Brandic, I., Venugopal, S., Mattess, M., Buyya, R.: Towards a Meta-Negotiation Architecture for SLA-Aware Grid Services. In: Workshop on Service-Oriented Engineering and Optimizations 2008. In conjunction with International Conference on High Performance Computing 2008 (HiPC 2008), Bangalore, India, December 17-20 (2008)
8. Buyya, R., Yeo, C.S., Venugopal, S., Broberg, J., Brandic, I.: Cloud Computing and Emerging IT Platforms: Vision, Hype, and Reality for Delivering Computing as the 5th Utility. Future Generation Computer Systems 25(6), 599–616 (2009)
9. Dan, A., Davis, D., Kearney, R., Keller, A., King, R., Kuebler, D., Ludwig, H., Polan, M., Spreitzer, M., Youssef, A.: Web services on demand: WSLA-driven automated management. IBM Systems Journal 43(1) (2004)
10. Dobson, G., Sanchez-Macian, A.: Towards Unified QoS/SLA Ontologies. In: Proceedings of the 2006 IEEE Services Computing Workshops (SCW 2006), Chicago, Illinois, USA, September 18-22 (2006)
11. Google App Engine, http://code.google.com/appengine
12. Foundations of Self-Governing ICT Infrastructures (FoSII) Project, http://www.wwtf.at/projects/research_projects/details/index.php?PKEY=972_DE_O
13. Hung, P.C.K., Haifei, L., Jun-Jang, J.: WS-Negotiation: an overview of research issues. In: Proceedings of the 37th Annual Hawaii International Conference on System Sciences, Big Island, Hawaii, January 5-8 (2004)
14. Koller, B., Schubert, L.: Towards autonomous SLA management using a proxy-like approach. Multiagent Grid Syst. 3(3) (2007)
15. König, S., Hudert, S., Eymann, T., Paolucci, M.: Towards reputation enhanced electronic negotiations for service oriented computing. In: Falcone, R., Barber, S.K., Sabater-Mir, J., Singh, M.P. (eds.) Trust 2008. LNCS, vol. 5396, pp. 273–291. Springer, Heidelberg (2008)
16. Quan, D.M., Altmann, J.: Resource allocation algorithm for light communication grid-based workflows within an SLA context. International Journal of Parallel, Emergent and Distributed Systems (IJPEDS) 24(1), 31–48 (2009)
17. Oldham, N., Verma, K., Sheth, A.P., Hakimpour, F.: Semantic WS-agreement partner selection. In: Proceedings of the 15th international conference on World Wide Web, WWW 2006, Edinburgh, Scotland, UK, May 23-26 (2006)
18. Ouelhadj, D., Garibaldi, J.M., MacLaren, J., Sakellariou, R., Krishnakumar, K.T.: A multi-agent infrastructure and a service level agreement negotiation protocol for robust scheduling in grid computing. In: Sloot, P.M.A., Hoekstra, A.G., Priol, T., Reinefeld, A., Bubak, M. (eds.) EGC 2005. LNCS, vol. 3470, pp. 651–660. Springer, Heidelberg (2005)
19. Venugopal, S., Buyya, R., Winton, L.: A Grid Service Broker for Scheduling e-Science Applications on Global Data Grids. In: Concurrency and Computation: Practice and Experience, vol. 18(6), pp. 685–699. Wiley Press, New York (2006)
20. Walker, D.W., Huang, L., Rana, O.F., Huang, Y.: Dynamic service selection in workflows using performance data. Scientific Programming 15(4), 235–247 (2007)
21. Web Services Agreement Specification (WS-Agreement), http://www.ogf.org/documents/GFD.107.pdf
22. Web Service Level Agreement (WSLA), http://www.research.ibm.com/wsla/WSLASpecV1-20030128.pdf
23. Zhou, C., Chia, L.T., Lee, B.S.: Semantics in service discovery and QoS measurement. IT Professional 7(2), 29–34 (2005)
Cost Optimization Model for Business Applications in Virtualized Grid Environments
Jörg Strebel
Universität Karlsruhe (TU), 76131 Karlsruhe, Germany
[email protected]
Abstract. The advent of Grid computing gives enterprises an ever-increasing choice of computing options, yet research has so far hardly addressed the problem of mixing the different computing options in a cost-minimal fashion. This paper presents a comprehensive cost model and a mixed-integer optimization model which can be used to minimize the IT expenditures of an enterprise and to support the decision of when to outsource certain business software applications. A sample scenario is analyzed and promising cost savings are demonstrated. Possible applications of the model to future research questions are outlined.
1 Introduction
1.1 Motivation
Today's corporate IT departments face a multitude of challenges. Among them is the increasing complexity of IT infrastructure, caused by a large number of dedicated, heterogeneous IT resources. As a result, the total operating expenses of complex software landscapes rise continuously. The "Grid computing" model, which allows computing resources to be shared, promises to remedy this complexity by enabling more efficient infrastructure utilization. But this promise has not yet caught on in the corporate world: while Grid-based solutions enjoy great popularity in the scientific community (e.g. Grid-based data processing for the LHC at CERN), they have so far found little acceptance with corporate users and are thus hardly found in the business IT landscape, if at all (cf. [1, p. 4]). In the Business in the Grid (BIG) project [1], the authors conducted numerous interviews with relevant findings. "All companies see Grid computing as a way to reduce costs in certain areas..." [1, p. 5], but a good cost model is still missing, so companies currently cannot quantify the potential Grid-related savings. "In our opinion, a weakness of Grids for business exist at the moment, because Grids are not profitable" [1, p. 8]. This observation raises doubts whether the benefits promised by Grid computing can actually be realized. A solid analysis of the business cases of Grid solutions is therefore required. This paper presents a cost-optimization model that helps to answer the question of what the cost-saving potential of Grid computing is and how a company can realize this potential.
J. Altmann, R. Buyya, and O.F. Rana (Eds.): GECON 2009, LNCS 5745, pp. 74–87, 2009. © Springer-Verlag Berlin Heidelberg 2009
The paper was motivated by the research questions arising from the current Biz2Grid research project (footnote 1). Its goal is to clarify under which conditions business applications can be moved to the Grid; the BMW Group acts as an industry partner to the project.
1.2 Related Work
The research question mentioned above has already attracted a fair body of research. Three perspectives in particular offer insights: research on Grid computing cost, research on decision-support tools, and research on resource management.
Regarding Grid computing cost, Opitz et al. [2] try to quantify the total cost of ownership (TCO) for Grid computing resource providers in absolute cost figures from real-world scenarios and arrive at an estimate for the total cost of a CPU-hour offered by a commercial resource provider. Their model does not include any storage costs, which are of major importance in business settings. Moreover, the cost calculation for an enterprise that consumes computing resources has to differ from a model built for resource providers. Risch et al. [3] analyzed a number of Grid computing scenarios using a cost-based approach; they showed that Grid computing is beneficial in scenarios where short and infrequent peaks have to be covered, where data backups have to be conducted, or where lightly used resources have to be replaced. However, they recommend that each company perform its own cost analysis, as the benefits depend on the cost level of the in-house resources. Gray [4] specifically deals with the decision of when to outsource, given the price ratios between the different computing resources. In his view, the business model behind Grid computing remains case-specific; he maintains that business benefits are only realized for very CPU-intensive software applications. Gray's model fails to include the often costly commercial software licenses required by corporate users and completely omits cost factors like transaction costs. It has long been known from research on IT outsourcing that those factors play an important role in the outsourcing decision (e.g. Smith et al. [5]).
In conclusion, the cost models for Grid computing found in the current research literature are rather incomplete, scenario-specific and not focused on the corporate decision-maker.
The decision-support perspective is the second important perspective on the potential of Grid computing. Kenyon and Cheliotis [6] addressed the area of Grid resource commercialization, which is frequently called utility computing. They conceive of Grid resources as commodities and apply financial instruments for conventional commodities like gas or electricity to those resources. Within the scope of their analysis, they identified the need for decision support when Grid users buy or sell Grid resources on a Grid marketplace. However, the need for such elaborate decision-support models will only arise if a working Grid resource market similar to the existing markets for conventional commodities comes into existence, which is currently not the case, despite research initiatives such as SORMA [7] and GridEcon [8]. Even without Grid marketplaces, Grid resource consumers still need decision support today when planning their Grid usage. The research literature has only limited insights to offer for today's scenarios.
In the area of resource management, Rolia et al. (e.g. [9]) suggest a resource-management framework for automatic software application placement in the data center using Grid computing principles like resource allocation and scheduling. Their main focus lies on the optimization of in-house data-center resources; they do not address the question of under which conditions to use external resources. Their optimization approach minimizes the number of CPUs and does not consider actual cost factors from an enterprise IT environment. Bichler et al. [10], Wimmer et al. [11] and Almeida et al. [12] pursue the same goal and suffer from the same drawbacks. Bagchi [13] uses simulation to analyze financial metrics (e.g. ROI) for Enterprise Grid systems, but does not feature optimization.
The decision-support problem can be found in the outsourcing research literature as well (Grid computing can be seen as on-demand outsourcing of IT infrastructure). However, the literature reviews of Gonzalez et al. [14] and Dibbern et al. [15] show that the question of what to outsource has so far mostly been analyzed conceptually or in a positivist fashion, but not through mathematical modeling, even though cost is universally recognized as an important reason for outsourcing. The use of linear programming in outsourcing decisions was suggested by Knolmayer [16], who also presented a model for deciding which IT service tasks to outsource. However, he never actually implemented or evaluated his model in real life, so his work remains rather conceptual.
Footnote 1: http://www.biz2grid.de
It seems the current research literature on Grid computing does offer some help when optimizing the enterprise data center, but it fails to help with the question of when to utilize external resource pools in an economically beneficial way. This paper suggests a novel model for optimizing the overall cost of internal and external IT resource usage for enterprises. In the following section, the optimization model and the cost model are developed and their underlying assumptions are stated. In Sec. 3, the model is instantiated and solved for a typical enterprise scenario; the solution is then discussed and future research directions are given in the last section.
2 Methods
This section describes a novel model for optimizing internal and external resource usage. First, the assumed IT architecture on which the cost model is based is defined; second, the actual optimization model that operates on this IT architecture is explained.
2.1 Enterprise IT Architecture
Strong [17] gives examples of what a typical enterprise IT architecture looks like. Fig. 1 is based on this architecture; as an extension, a second resource pool (e.g. a Grid service provider) and the Grid middleware (for resource management) are added. The economic metrics needed for the cost evaluation are also displayed. A utility computing model is assumed for all external resources. A number of different business software applications can potentially be run on both internal and external resources. A typical business application consists at least of the application itself (featuring the business logic), a database system for persistent data storage and the Grid middleware.
[Fig. 1. Enterprise Grid Architecture. Labels recoverable from the figure: Internal Pool (Internal Server, Storage; virtual machines each with OS, DB and App. 1 to App. 3), LAN, Grid Middleware, Internet, External Pool (External Server, Grid Storage; OS (Grid), DB, App. x). Internal cost: HW lease, licenses, storage, ... Utility model: EUR/CPU-h, EUR/GB-month, ...]
The following definitions correspond to the entities in Fig. 1; they are used throughout the rest of the paper. Let $T$ be the time interval under analysis in the optimization model, divided into periods of equal length, $T = \{1, 2, \ldots, l\}$ with $t \in T$. Let $J$ be a set of software applications ($J = \{App_1, App_2, \ldots, App_N\}$, $j \in J$); each application has processor requirements $p_{jt} \in \mathbb{R}_0^+$ measured in number of CPU cores, inbound networking requirements $n^I_{jt} \in \mathbb{R}_0^+$ measured in GB (gigabytes), outbound networking requirements $n^O_{jt} \in \mathbb{R}_0^+$ and storage requirements $sto_{jt} \in \mathbb{R}_0^+$ measured in GB. Each application runs in its own virtualized environment on a separate software stack (OS, database). The virtualization overhead is factored into the load data $p_{jt}$. The CPU, networking and storage requirements can be estimated from historical real-world system traces coming from performance monitoring systems (cf. Rolia et al. [9]). Let $I$ be a set of internal servers ($I = \{InternalServer_1, \ldots, InternalServer_M\}$, $i \in I$), with each internal server having a capacity of $s_i \in \mathbb{R}_0^+$ measured in number of CPU cores. Let $E$ be a set of external servers ($E = \{ExternalServer_1, \ldots, ExternalServer_R\}$, $e \in E$), with each external server having a capacity of $s_e \in \mathbb{R}_0^+$ measured in number of CPU cores. There is only one class of service for all servers; this class of service corresponds to the resource provider's class of service.
2.2 Optimization Model
The task of the optimization model is to assign each software application in each period to a resource from either the internal or the external pool. The model is essentially a mixed-integer programming (MIP) problem whose cost function is composed of cost factors typically found in business software applications. The following list shows the decision variables of the optimization model.
– binary variable $x_{ijt}$ for assigning application $j$ to internal server $i$ in period $t$
– binary variable $x_{ejt}$ for assigning application $j$ to external server $e$ in period $t$
– binary variable $z_i$ for recording the use of internal server $i$ in $T$
– binary variable $y_{et}$ for recording the use of external server $e$ in period $t$
– binary variable $w_j$ for one-time migration set-up activities of application $j$
– binary variable $o$ for the one-time overhead of using the external resource pool
– binary variable $a_t$ for the periodic overhead of using the external resource pool in period $t$
– integer variable $v$ for the required number of software licenses
– real-valued variable $u$ for the required size of the internal storage infrastructure
$z_i$ records whether internal server $i$ was used at all in $T$ (e.g. $z_1 = 1$ means that $InternalServer_1$ was used in at least one period $t$). This information is required to calculate the hardware costs accurately. $y_{et}$ records whether external server $e$ was used in period $t$; if so, the external compute fees for that server in that period are factored into the total cost. $w_j$ is a binary decision variable which is set to 1 if application $j$ is moved from the internal resource pool to the external resource pool. This migration requires a number of set-up activities (interface implementation, network configuration, possible reconfigurations of existing systems etc.) whose one-time cost $c^{impl}_j$ (see Table 1) is added to the total cost if the migration takes place.
Those set-up activities only occur once in the lifetime of an application; subsequent migrations do not result in additional expenses. The decision variable $o$ is set to 1 if any resources from the external pool are used at all. In that case, the overhead cost factor $c^o$ comes into play. This cost factor represents the one-time transaction costs incurred when choosing an external provider and setting up the contractual relationship. The effort required for vendor information retrieval, vendor selection, contract negotiation etc. determines the level of the one-time transaction costs. The decision variable $a_t$ is set if any resources from the external resource pool are used in period $t$. In that case, the overhead factor $c^a$ is added to the total cost. $c^a$ models periodic transaction costs resulting from activities like vendor management, service level monitoring or contract changes. Decision variable $v$ comprises the maximum number of software licenses required; each application requires one database license and one Grid middleware license per core used. Licenses have to be purchased no matter whether an application resides on internal or external resources. For simplicity's sake, it is assumed that the different software applications only use one type of database system. Table 1 describes the cost factors used throughout the model.
Table 1. Cost factors
| Time focus | Cost type | Cost factor | Variant | Unit | Cost coeff. |
|---|---|---|---|---|---|
| One-time expenses | Implementation | Interfaces, network configuration set-up | | EUR/App | $c^{impl}_j$ |
| | Software | Licenses (database, middleware) | | EUR/core | $c^v$ |
| | Overhead | Transaction cost | | EUR | $c^o$ |
| Running expenses | Internal infrastructure | Internal HW cost (server, racks) | 4core, 8core, 16core, 24core | EUR/period | $c^{hw}_{it}$ |
| | | Internal storage cost | | EUR/(GB∗period) | $c^u_t$ |
| | | WAN usage | | EUR/GB | $c^{wan}$ |
| | Software | Software maintenance | | EUR/(license∗period) | $c^{vm}_t$ |
| | External infrastructure | Data transfer | transfer from the external pool to the enterprise | EUR/GB | $c^{dout}$ |
| | | Data transfer | transfer from the enterprise to the external pool | EUR/GB | $c^{din}$ |
| | | External compute fees | 1core, 4core, 8core | EUR/period | $c^f_{et}$ |
| | | External storage fees | per size | EUR/(GB∗period) | $c^{estors}_t$ |
| | | External storage fees | per I/O requests | EUR/(request∗period) | $c^{estorr}_t$ |
| | Periodic overhead | Transaction cost, support cost | | EUR/period | $c^a$ |
The cost factor $c^{hw}_{it}$ represents expenditures for internal server hardware. The regular payments per period for hardware could either be the server rent or server depreciation
costs. The server expenditures do not depend on the actual server utilization; if the server is used within $T$, its expenditures have to be paid for the complete time interval $T$. Potential hardware replacements and other repairs are included in the hardware cost. (The hardware cost also includes cost factors like electrical power, server operations and data center facilities.) The cost model for software licenses is simple: $c^v$ is the price for one database license and one Grid middleware license (licenses are priced per core). The software license maintenance fee $c^{vm}_t$ is a fixed amount per period per license (usually a percentage of the initial license cost). It is assumed that the external infrastructure provider only provides OS (operating system) licenses for the external servers; the rest of the software licenses have to be purchased by the Grid user. LAN costs are neglected, as LAN transports typically cost 10000 times less than WAN transports [4]. Moreover, they do not help to distinguish the cost of internal and external resources, as all data has to pass through the LAN, no matter whether it is destined for internal or external resources. WAN connections are assumed to be sized large enough so that the internal deployment of all applications does not create any bandwidth bottlenecks. The cost model structure and the cost factors were taken from Sekatzek's work on the metrics of corporate SAP systems [18, p. 135]. Table 2 combines the cost factors and the decision variables of the model and shows the cost function components. The complete cost function is the sum of all cost function components listed in Table 2.
Table 2. Cost function components
| Cost factor | Cost coeffic. | Cost component |
|---|---|---|
| Interfaces, network configuration set-up | $c^{impl}_j$ | $\sum_{j=1}^{N} c^{impl}_j w_j$ |
| Software license purchases | $c^v$ | $c^v \cdot v$ |
| One-time overhead | $c^o$ | $c^o \cdot o$ |
| HW cost (server, racks) | $c^{hw}_{it}$ | $\sum_{i=1}^{M} z_i \sum_{t=1}^{l} c^{hw}_{it}$ |
| Internal storage | $c^u_t$ | $\sum_{t=1}^{l} c^u_t \cdot u$ |
| WAN usage | $c^{wan}$ | $c^{wan} \sum_{t=1}^{l} \sum_{j=1}^{N} (n^I_{jt} + n^O_{jt}) \sum_{e=1}^{R} x_{ejt}$ |
| Software maintenance | $c^{vm}_t$ | $\sum_{t=1}^{l} c^{vm}_t \cdot v$ |
| Data transfer to enterprise | $c^{dout}$ | $c^{dout} \sum_{t=1}^{l} \sum_{j=1}^{N} n^O_{jt} \sum_{e=1}^{R} x_{ejt}$ |
| Data transfer to the external pool | $c^{din}$ | $c^{din} \sum_{t=1}^{l} \sum_{j=1}^{N} n^I_{jt} \sum_{e=1}^{R} x_{ejt}$ |
| External compute fees | $c^f_{et}$ | $\sum_{t=1}^{l} \sum_{e=1}^{R} c^f_{et} y_{et}$ |
| External storage fees (size) | $c^{estors}_t$ | $\sum_{t=1}^{l} c^{estors}_t \sum_{j=1}^{N} sto_{jt} \sum_{e=1}^{R} x_{ejt}$ |
| External storage fees (requests) | $c^{estorr}_t$ | $\sum_{t=1}^{l} c^{estorr}_t \sum_{j=1}^{N} \sum_{e=1}^{R} x_{ejt}$ |
| Periodic overhead | $c^a$ | $c^a \sum_{t=1}^{l} a_t$ |
The cost function is subject to the following constraints. Equation (1) mandates that one application is assigned to exactly one server per period. It is
neither possible to run one application on several servers nor to run several instances of an application during the same period. Inequality (2) ensures that the load placed on each external server per period is at most the capacity of that server. Inequality (3) applies analogously to internal servers. Inequality (4) makes sure that $z_i$ is set as soon as server $i$ is used at least once in $T$. Inequality (5) sets the overhead decision variable whenever at least one external server is used. Inequality (6) sets the decision variable for outsourcing set-up activities whenever application $j$ uses an external server at least once. Inequality (7) sets the periodic transaction cost decision variable if external servers are used. Inequality (8) makes sure that the minimal necessary number of software licenses is purchased; (9) models the internal storage requirements and sets $u$ to the minimal amount of internal storage required across all periods. Constraint (10) limits the decision variables to binary values.
$$\sum_{e=1}^{R} x_{ejt} + \sum_{i=1}^{M} x_{ijt} = 1 \qquad \forall j \in J, \forall t \in T \tag{1}$$
$$\sum_{j=1}^{N} p_{jt} x_{ejt} - s_e y_{et} \le 0 \qquad \forall e \in E, \forall t \in T \tag{2}$$
$$\sum_{j=1}^{N} p_{jt} x_{ijt} - s_i \le 0 \qquad \forall i \in I, \forall t \in T \tag{3}$$
$$x_{ijt} - z_i \le 0 \qquad \forall t \in T, \forall i \in I, \forall j \in J \tag{4}$$
$$x_{ejt} - o \le 0 \qquad \forall e \in E, \forall t \in T, \forall j \in J \tag{5}$$
$$x_{ejt} - w_j \le 0 \qquad \forall e \in E, \forall t \in T, \forall j \in J \tag{6}$$
$$y_{et} - a_t \le 0 \qquad \forall e \in E, \forall t \in T \tag{7}$$
$$-v + \sum_{j=1}^{N} p_{jt} \Big( \sum_{i=1}^{M} x_{ijt} + \sum_{e=1}^{R} x_{ejt} \Big) \le 0 \qquad \forall t \in T \tag{8}$$
$$-u + \sum_{i=1}^{M} \sum_{j=1}^{N} sto_{jt} x_{ijt} \le 0 \qquad \forall t \in T \tag{9}$$
$$x_{ejt}, x_{ijt}, y_{et}, z_i, o, w_j, a_t \in \{0, 1\} \qquad \forall e \in E, \forall t \in T, \forall j \in J, \forall i \in I \tag{10}$$
$$v \in \mathbb{N}_0^+ \tag{11}$$
$$u \in \mathbb{R}_0^+ \tag{12}$$
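The role of the linking constraints can be illustrated with a small sketch. It is not the author's implementation: the function name `derive_auxiliary`, the dictionary layout and the two-period example data are assumptions made for illustration. A placement maps every (application, period) pair to exactly one server, so constraint (1) holds by construction; the function checks the capacity constraints (2) and (3) and then returns the smallest auxiliary values that (4) to (9) still allow, which is what a cost-minimizing solver would pick.

```python
import math

def derive_auxiliary(place, p, sto, cap_int, cap_ext):
    """place[(j, t)] = ("int", i) or ("ext", e): one server per application and
    period, so constraint (1) holds by construction.
    Returns the tightest values allowed by (4)-(9), or None if a capacity
    constraint (2) or (3) is violated."""
    load = {}                                        # cores per (server, period)
    for (j, t), server in place.items():
        load[(server, t)] = load.get((server, t), 0) + p[(j, t)]
    for ((pool, idx), t), cores in load.items():
        cap = cap_int[idx] if pool == "int" else cap_ext[idx]
        if cores > cap:                              # violates (2) or (3)
            return None
    periods = {t for (_, t) in place}
    z = {idx for ((pool, idx), _) in load if pool == "int"}               # (4)
    y = {(idx, t) for ((pool, idx), t), cores in load.items()
         if pool == "ext" and cores > 0}                                   # (2)
    w = {j for (j, _), (pool, _idx) in place.items() if pool == "ext"}     # (6)
    o = int(any(pool == "ext" for (pool, _idx) in place.values()))         # (5)
    a = {t for (_, t) in y}                                                # (7)
    v = math.ceil(max(sum(p[(j, tt)] for (j, tt) in place if tt == t)
                      for t in periods))                                   # (8)
    u = max(sum(sto[(j, tt)] for (j, tt), (pool, _idx) in place.items()
                if tt == t and pool == "int") for t in periods)            # (9)
    return dict(z=z, y=y, w=w, o=o, a=a, v=v, u=u)

# Hypothetical two-period example (cores and GB per application and period).
p = {("app1", 1): 2, ("app1", 2): 6, ("app2", 1): 2, ("app2", 2): 2}
sto = {("app1", 1): 600, ("app1", 2): 600, ("app2", 1): 0, ("app2", 2): 0}
cap_int = {1: 4, 2: 4, 3: 8, 4: 16}
cap_ext = {1: 1, 2: 4, 3: 8}

consolidated = {key: ("int", 3) for key in p}        # both apps on the 8-core server
result = derive_auxiliary(consolidated, p, sto, cap_int, cap_ext)
too_small = derive_auxiliary({key: ("int", 1) for key in p}, p, sto, cap_int, cap_ext)
print(result, too_small)
```

Because every auxiliary variable only ever adds cost, deriving it as the smallest feasible value from the assignment is equivalent to letting the solver choose it.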
3 Results
The following section shows an exemplary instantiation of the optimization model; first, sample cost figures from the literature are presented; then the optimization model is applied to a sample scenario using these cost figures.
3.1 Cost Figures
All cost calculations are based on actual cash flows; no depreciation rules are used, as the cost accounting of Grid computing expenses is not the focus of this paper. As a general rule, 200 working days per year are assumed for a full-time employee; an employer has to reckon with around 100000 $ (ca. 75000 EUR) per year in salaries and indirect labor costs for a full-time employee [19]. The cost $c^{impl}_j$ for setting up the interface between the enterprise and the external resource pool for application $j$ depends on the interface implementation effort (measured in working days). It is assumed that an interface of medium complexity requires 5 FTE (here counted in working days). This estimate is based on software industry best practices for EAI interfaces. The one-time transaction cost $c^o$ incurred when choosing an external resource provider is extremely hard to estimate. Here, the effort required by the vendor selection process is used as a proxy for the transaction cost. In industry, an RFP (request for proposal) is launched whenever there is outsourcing work to be done and there are several potential vendors. An activity-based costing approach is chosen to assess the tasks involved and to estimate the effort in FTE (Full-Time Equivalents). The total effort in FTE required by the vendor selection process can be captured in the following equation:
$$c^o = 3 + 0.6 \cdot vendors + 0.125 \cdot vendors \cdot evaluators \tag{13}$$
The data for (13) has been collected through an interview with a BMW IT project manager with several years of professional experience. For the cost model in Table 3, it is assumed that there are 3 potential vendors and 3 employees acting as evaluators in the enterprise, which results in an effort of ca. 6 FTE (= 2250 EUR). The periodic transaction cost level $c^a$ is as hard to estimate as the one-time transaction cost level. The enterprise requires personnel to control the outsourcing provider. The coordination effort depends mainly on the number of different contracts and not so much on the size of each contract. The literature states that a single employee can handle contract sizes of 5-10 million EUR per year [20, p. 135]. The contract sizes for ad-hoc utility computing should range considerably below this level; it is assumed that 0.1 FTE per year (= 7500 EUR) can handle the vendor management required by the usage of external resources. The hardware costs $c^{hw}_{it}$ have been estimated with an IBM System x3850 M2 server in mind, using the IBM online configurator (footnote 2). Cost factors like server operations, data center facilities and operating system licenses add ca. 350 EUR to the actual price of the hardware (based on McKinsey's data [19]). Table 3 summarizes realistic cost figures for the optimization model (the cost figures have no relationship to BMW IT cost figures).
Footnote 2: http://www-03.ibm.com/systems/de/x/hardware/enterprise/x3850m2/index.html
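The labor-based figures above can be retraced with a few lines of arithmetic. The sketch below is only a cross-check under one interpretive assumption: the "FTE" efforts for the interface set-up and for vendor selection are counted in working days, which is the reading under which the EUR amounts quoted in the text come out. All variable names are illustrative.

```python
ANNUAL_COST_EUR = 75000      # salaries plus indirect labor cost per employee and year [19]
WORKING_DAYS = 200           # working days per year

day_rate = ANNUAL_COST_EUR / WORKING_DAYS             # EUR per working day

def vendor_selection_effort(vendors, evaluators):
    """Effort of the vendor selection process according to Eq. (13)."""
    return 3 + 0.6 * vendors + 0.125 * vendors * evaluators

c_impl = 5 * day_rate                                  # interface set-up, 5 days
effort = vendor_selection_effort(vendors=3, evaluators=3)
c_o = round(effort) * day_rate                         # effort rounded to whole days
c_a_month = 0.1 * ANNUAL_COST_EUR / 12                 # vendor management per month

print(day_rate, c_impl, effort, c_o, c_a_month)
```

With these inputs the day rate is 375 EUR, the set-up cost 1875 EUR, the selection effort about 5.9 days (2250 EUR after rounding to 6 days), and vendor management 625 EUR per month. The higher monthly value of $c^a$ in Table 3 additionally covers AWS support, according to the table's comment column.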
Table 3. Cost figures
| Cost coeffic. | Variant | Value | Comment |
|---|---|---|---|
| $c^{impl}_j$ | | ca. 1875 EUR | interface set-up |
| $c^v$ | | ca. 6600 EUR | Oracle license (a) |
| $c^o$ | | ca. 2250 EUR | see (13) |
| $c^{hw}_{it}$ | 4core | ca. 522 EUR/month | based on [19], IBM (b) |
| | 8core | ca. 553 EUR/month | 8 GB RAM |
| | 16core | ca. 604 EUR/month | 16 GB RAM |
| | 24core | ca. 833 EUR/month | 24 GB RAM |
| $c^u_t$ | | ca. 1.60 EUR/(GB∗month) | based on [21] |
| $c^{wan}$ | | ca. 0.28 EUR/GB | based on [22] |
| $c^{vm}_t$ | x86 | ca. 123 EUR/(core∗month) | Oracle Software Update License and Support, Sun Grid engine subscription (4core CPU) (c) |
| $c^{din}$ | | ca. 0.07 EUR/GB | based on Amazon Web Services (AWS) (d) |
| $c^{dout}$ | | ca. 0.13 EUR/GB | based on Amazon Web Services |
| $c^f_{et}$ | 1core | ca. 60 EUR/month | based on AWS on-demand Linux instances |
| | 4core | ca. 240 EUR/month | based on AWS on-demand Linux instances |
| | 8core | ca. 480 EUR/month | based on AWS on-demand Linux instances |
| $c^{estors}_t$ | | ca. 0.42 EUR/(GB∗month) | Amazon EBS and S3 snapshot |
| $c^{estorr}_t$ | | ca. 21 EUR/month | Amazon EBS, 100 I/O requests per second |
| $c^a$ | | ca. 926 EUR/month | vendor and contract management, AWS support |
(a) http://www.oracle.com/corporate/pricing/technology-price-list.pdf
(b) http://www-03.ibm.com/financing/de/itfinancing/tools/ezrate/de500.html
(c) http://globalspecials.sun.com/DRHM/servlet/ControllerServlet?Action=DisplayProductDetailsPage&SiteID=sunstor&Locale=en_US&productID=107684700
(d) Pricing for a European company, http://aws.amazon.com/ec2/#pricing
3.2 Optimization Results
In the scenario under study, two software applications are analyzed over the course of 6 months (12 periods), with each period lasting 2 weeks. The first application is a business application which requires 2 CPU cores in the first half of the month and 6 CPU cores in the second half of the month for billing runs and other batch processing. This application needs 600 GB of storage in each period; it receives 0.1 GB and sends 1 GB of data over the network in the first two weeks of the month, and receives 0.5 GB and sends 5 GB of data in the second two weeks of the month. The second application is a simulation software that constantly requires 2 CPU cores and has only minimal storage and network requirements. The internal resource pool features 4 servers with 4, 4, 8 and 16 cores; the external resource pool features 3 servers with 1, 4 and 8 cores. Three resource management policies are tested: Policy 1 puts each application on a separate internal server and uses a maximum sizing approach for hardware and software (i.e. the first application runs on an 8-core server using 8 software licenses, the second application runs on a 4-core server using 4 software licenses). This approach is commonly used in industry today. Policy 2 tries to consolidate the applications on internal and external servers of optimal size. Policy 3 tries to consolidate the applications on external resources of optimal size. Table 4 shows the results of the scenario; the savings are calculated in comparison to policy 1.
Table 4. Results of the scenario
| Policy | Total cost | Savings |
|---|---|---|
| separate scheduling on two internal machines | 100275 EUR | n.a. |
| combined scheduling on internal and external resources | 67791 EUR | 32% |
| combined scheduling on external resources | 74202 EUR | 26% |
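As a rough cross-check of Table 4, the cost components of Table 2 can be evaluated by hand for the three policies with the prices of Table 3. The sketch below is not the author's solver run; it fixes the placements instead of optimizing them, and it rests on two assumptions that the text does not state: a 2-week period is billed as half a month, and the "minimal" storage and network needs of the second application are set to zero. All function and constant names are illustrative.

```python
# Prices from Table 3 (EUR); monthly prices are halved per 2-week period (assumption).
MONTHS, PERIODS, HALF = 6, 12, 0.5
C_IMPL, C_V, C_O, C_A = 1875, 6600, 2250, 926
C_HW = {4: 522, 8: 553, 16: 604, 24: 833}   # internal server, per month
C_F = {1: 60, 4: 240, 8: 480}               # external server, per month
C_U, C_VM = 1.60, 123                       # per GB*month, per core*month
C_WAN, C_DIN, C_DOUT = 0.28, 0.07, 0.13     # per GB
C_ESTORS, C_ESTORR = 0.42, 21               # per GB*month, per month

def demand(t):
    """(cores, storage GB, inbound GB, outbound GB) per application in period t."""
    first_half = t % 2 == 1
    app1 = (2, 600, 0.1, 1.0) if first_half else (6, 600, 0.5, 5.0)
    app2 = (2, 0.0, 0.0, 0.0)               # "minimal" needs set to zero (assumption)
    return [app1, app2]

def internal_cost(server_cores, licenses):
    """Hardware, licenses, maintenance and storage for a purely internal placement."""
    hardware = sum(C_HW[c] for c in server_cores) * MONTHS
    software = licenses * (C_V + C_VM * MONTHS)
    storage = max(sum(a[1] for a in demand(t)) for t in range(1, PERIODS + 1))
    return hardware + software + C_U * storage * MONTHS

def external_cost():
    """Both applications consolidated on the smallest fitting external server."""
    apps = len(demand(1))
    total = C_O + C_IMPL * apps + C_A * MONTHS          # overheads and set-up
    peak = 0
    for t in range(1, PERIODS + 1):
        d = demand(t)
        cores = sum(a[0] for a in d)
        peak = max(peak, cores)
        server = min(s for s in C_F if s >= cores)
        inbound, outbound = sum(a[2] for a in d), sum(a[3] for a in d)
        total += HALF * (C_F[server] + C_ESTORS * sum(a[1] for a in d) + C_ESTORR * apps)
        total += C_WAN * (inbound + outbound) + C_DIN * inbound + C_DOUT * outbound
    return total + peak * (C_V + C_VM * MONTHS)         # licenses for peak cores

policy1 = internal_cost([8, 4], licenses=12)   # maximum sizing on two servers
policy2 = internal_cost([8], licenses=8)       # consolidated on one internal server
policy3 = external_cost()                      # consolidated on external servers
for name, cost in [("policy 1", policy1), ("policy 2", policy2), ("policy 3", policy3)]:
    print(f"{name}: {cost:.0f} EUR")
```

Under these assumptions the three totals come out at about 100266, 67782 and 74200 EUR, each within 10 EUR of Table 4. The small remainder is consistent with the unquantified storage and traffic of the second application.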
4 Discussion
The results in Table 4 lead to the following conclusions:
– Policy 2 leads to consolidation of the two applications on one internal server, which is the cost-optimal solution for this scenario. Even though external resources were available, the optimization model picked the more cost-efficient internal servers. Hence, an Enterprise Grid, where the Grid middleware optimizes the management of internal resources, would be the best architectural decision in this scenario. Further research must show whether a longer time interval under study (> 6 months) might lead to a different result, as the overhead costs of using external resources would then be distributed over more periods.
– A forced usage of external resources is slightly more costly (ca. 9%) than pure internal operations in this scenario.
– Consolidation leads to a much improved server utilization and to a much improved software license utilization. A major part of the savings (81%) of policy 2 over policy 1 comes from the lower number of required software licenses (12 for policy 1 compared to 8 for policy 2).
As a next step, the model will be evaluated using load traces (processor loads, network throughput) from a number of BMW SAP systems from different functional areas (production, finance, engineering, human resources) over a period of 6 months (the corresponding load data is being collected at the moment).
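The percentages quoted in these conclusions follow directly from Table 4. The short check below uses only figures from Tables 3 and 4; reading the 81% as the purchase price of the four saved licenses (without their maintenance fees) is an assumption, chosen because it is the reading under which the number comes out.

```python
# Totals from Table 4 (EUR) and the license price per core from Table 3.
policy1, policy2, policy3 = 100275, 67791, 74202
license_price = 6600

savings2 = 1 - policy2 / policy1            # about 0.324
savings3 = 1 - policy3 / policy1            # about 0.260
extra_external = policy3 / policy2 - 1      # about 0.095

# Share of policy 2's savings due to buying 8 instead of 12 licenses
# (purchase price only; maintenance fees left out, see assumption above).
licenses_saved = 12 - 8
license_share = licenses_saved * license_price / (policy1 - policy2)   # about 0.813

print(f"{savings2:.1%} {savings3:.1%} {extra_external:.1%} {license_share:.1%}")
```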
From a scientific perspective, this novel method enables the treatment of an array of relevant research questions. Using external resources to offload peak resource requirements is often cited as one of the most promising applications of Grid computing. The optimization model will be helpful in analyzing which resource demand patterns $p_{jt}$, $sto_{jt}$, $n^I_{jt}$, $n^O_{jt}$ can economically be shifted to the Grid. Another tractable research question concerns the effects of dynamic resource pricing. If the prices of external resources are no longer static over time, as they are now, how will dynamic resource pricing affect the outsourcing business case? A sensitivity analysis of cost factors like $c^{estors}_t$, $c^{estorr}_t$ or $c^f_{et}$ in the model will help answer this question. (Note that those cost factors are time-dependent; therefore, the model can already accommodate changing external resource prices.)
4.1 Limitations on the Research Design and Material
Solving MIP models like the one suggested in this paper is NP-hard in general, and solving times usually grow exponentially with the model size; certain large scenarios cannot be solved to optimality. This limitation, however, is a minor one: first, the MIP solver gives an estimate of how close the current solution is to the optimal one, so the quality of a non-optimal solution can be assessed; second, even non-optimal model solutions can give valuable insights into the research questions mentioned above.
The transaction cost model needs better scientific support, i.e. a better understanding of the processes involved in searching for an outsourcing partner is required. Activity-based costing approaches and outsourcing process analyses can be helpful here. If the existence of Grid computing markets is assumed, another way to estimate transaction costs would be to use transaction cost figures from existing commodity markets such as electricity.
The interactions between applications are not modeled: if two applications both running on external resources exchange data, the data exchange cost factor will differ from the WAN cost factor. It is hard to determine what part of the network traffic goes to other applications running on external resources and what part goes to the enterprise. So the network costs might be slightly overestimated by the current model.
The current model does not include quality measures for computing resources. Grading compute resources according to benchmarks like the SAP Application Performance Standard (SAPS) or SPECint would give a more accurate price vs. performance picture.
The cost model does not include any cost factors for application licenses, application maintenance or application operations. It is assumed that those cost factors are comparable no matter where the application runs. Software-as-a-Service scenarios are not the focus of this paper; however, they might be a future model extension.
4.2 Conclusion
This paper suggests a novel model for optimizing the cost of IT resource usage for enterprises. The resulting model is helpful for optimizing both the internal and the external deployment of an application; it can be set up using data that is readily available in the enterprise (system traces, internal cost figures); and it can be solved on standard PC hardware. It therefore facilitates the exploration of research questions relevant to enterprises pondering the use of Grid computing. A sample scenario demonstrates the usefulness of the model. However, as the scenario shows, using external resources is not beneficial in every situation; a careful analysis of a larger number of business scenarios has to be conducted with the optimization model to reveal where the promises of Grid computing hold true.
References
1. Schikuta, E., Donno, F., Stockinger, H., Vinek, E., Wanek, H., Weishäupl, T., Witzany, C.: Business in the grid: Project results (2005), http://www.pri.univie.ac.at/Publications/2005/Schikuta_austriangrid_bigresults.pdf (accessed on October 05, 2008)
2. Opitz, A., König, H., Szamlewska, S.: What does grid computing cost? Journal of Grid Computing (2008) n.a.
3. Risch, M., Altmann, J.: Cost analysis of current grids and its implications for future grid markets. In: Altmann, J., Neumann, D., Fahringer, T. (eds.) GECON 2008. LNCS, vol. 5206, pp. 13–27. Springer, Heidelberg (2008)
4. Gray, J.: Distributed computing economics (2003), http://research.microsoft.com/research/pubs/view.aspx?tr_id=655 (accessed on October 05, 2008)
5. Smith, A.D., Rupp, W.T.: Application service providers: an application of the transaction cost model. Information Management and Computer Security 11, 11–18 (2003)
6. Kenyon, C., Cheliotis, G.: Grid resource commercialization: Economic engineering and delivery scenarios (chapter 28). In: Nabrzyski, J., Schopf, J.M., Weglarz, J. (eds.) Grid Resource Management: State of the Art and Future Trends, 1st edn. International Series in Operations Research & Management Science, pp. 465–478. Kluwer Academic Publishers, Dordrecht (2004)
7. Neumann, D., Stoesser, J., Anandasivam, A., Borissov, N.: SORMA – Building an Open Grid Market for Grid Resource Allocation. In: Veit, D.J., Altmann, J. (eds.) GECON 2007. LNCS, vol. 4685, pp. 194–200. Springer, Heidelberg (2007)
8. Altmann, J., Courcoubetis, C., Darlington, J., Cohen, J.: GridEcon – The Economic-Enhanced Next-Generation Internet. In: Veit, D.J., Altmann, J. (eds.) GECON 2007. LNCS, vol. 4685, pp. 188–193. Springer, Heidelberg (2007)
9. Rolia, J., Andrzejak, A., Arlitt, M.: Automating enterprise application placement in resource utilities. In: Brunner, M., Keller, A. (eds.) DSOM 2003. LNCS, vol. 2867, pp. 118–129. Springer, Heidelberg (2003)
10. Bichler, M., Setzer, T., Speitkamp, B.: Capacity planning for virtualized servers. In: Workshop on Information Technologies and Systems (WITS), Milwaukee, Wisconsin, USA (2006)
11. Wimmer, M., Nicolescu, V., Gmach, D., Mohr, M., Kemper, A., Krcmar, H.: Evaluation of Adaptive Computing Concepts for Classical ERP Systems and Enterprise Services. In: The 8th IEEE International Conference on and Enterprise Computing, E-Commerce, and E-Services, The 3rd IEEE International Conference on E-Commerce Technology, 2006, San Francisco, CA, USA, pp. 48–51. IEEE Computer Society, San Francisco (2006)
12. Almeida, J., Almeida, V., Ardagna, D., Francalanci, C., Trubian, M.: Resource management in the autonomic service-oriented architecture. In: IEEE International Conference on Autonomic Computing, 2006. ICAC 2006, pp. 84–92 (2006)
13. Bagchi, S., Hung, E., Iyengar, A., Vogl, N., Wadia, N.: Capacity planning tools for web and grid environments. In: valuetools 2006: Proceedings of the 1st international conference on Performance evaluation methodolgies and tools, p. 25. ACM Press, New York (2006)
14. Gonzalez, R., Gasco, J., Llopis, J.: Information systems outsourcing: A literature analysis. Information & Management 43, 821–834 (2006)
15. Dibbern, J., Goles, T., Hirschheim, R., Bandula, J.: Information systems outsourcing: A survey and analysis of the literature. Database for Advances in Information Systems 35(4), 6–102 (2004)
16. Knolmayer, G.: Die Auslagerung von Servicefunktionen als Strategie des IS-Managements (The outsourcing of service functions as an IS management strategy). In: Heinrich, L., Pomberger, G., Schauer, R. (eds.) Die Informationswirtschaft im Unternehmen (Information management in the enterprise). Universitaetsverlag Rudolf Trauner, Linz (1991)
17. Strong, P.: Enterprise grid computing. ACM Queue 3(6), 50–59 (2005)
18. Sekatzek, E.P.: Einsatzentscheidung und -steuerung von SAP Standard Business Software in der deutschen Automobilindustrie. Wiku-Wissenschaftsverlag Dr. Stein, Duisburg und Köln (2008)
19. McKinsey: Clearing the air on cloud computing (2009), http://uptimeinstitute.org/content/view/353/319/ (last accessed April 27, 2009)
20. Küchler, P.: Technische und wirtschaftliche Grundlagen. In: IT-Outsourcing, pp. 51–159. Erich Schmidt Verlag, Berlin (2004)
21. Hamilton, J.: Internet-scale service efficiency. In: Large-Scale Distributed Systems and Middleware, LADIS 2008 (2008)
22. Armbrust, M., Fox, A., Joseph, A.D., Katz, R.H., Konwinski, A., Lee, G., Patterson, D.A., Rabkin, A., Stoica, I., Zaharia, M.: Above the Clouds: A Berkeley View of Cloud Computing. Technical Report UCB/EECS-2009-28, University of California at Berkeley (2009)
Increasing Capacity Exploitation in Food Supply Chains Using Grid Concepts
Eugen Volk1, Marcus Müller2, Ansger Jacob2, Peter Racz3, and Martin Waldburger3
1 High Performance Computing Center Stuttgart, Germany
2 Universitaet Hohenheim, Germany
3 Department of Informatics IFI, University of Zurich, Switzerland
Abstract. Food supply chains today are characterized by fixed trade relations with long-term contracts established between heterogeneous supply chain companies. Production and logistics capacities of these companies are often utilized in an economically inefficient manner. In addition, increased consumer awareness of food safety issues renders supply chain management even more challenging, since integrated tracking and tracing along the whole food supply chain is needed. Facing these issues of supply chain management complexity and completely documented product quality, this paper proposes a full lifecycle solution for dynamic capacity markets based on concepts used in the field of Grid [1], such as the management of Virtual Organizations (VOs) combined with Service Level Agreements (SLAs). The solution enables the cost-efficient utilization of real-world capacities (e.g., production capacities or logistics facilities) by means of a simple, browser-based portal. Users are able to enter into product-specific negotiations with buyers and suppliers of a food supply chain, and to obtain real-time access to product information including SLA evaluation reports. Thus, business opportunities in wider market access, process innovation, and trustworthy food products are offered to participating supply chain companies.
1 Introduction
Today’s global food industry has a market size of 3,500 billion US$ [2]. At the same time, a tremendous amount of goods is thrown away because of quality issues or overproduction. In 1997, Kantor et al. calculated that 27% of all edibles available in the USA are thrown away instead of being eaten [3]. Recent studies of the University of Arizona show that 50% of all perishables produced in the US are never consumed [4]. The combination of this huge market size and the potential to reduce the waste of perishable goods shows the enormous economic impact of this situation.
The above problem could be addressed by the economically efficient exploitation of capacities, such as production resources, storage and transport facilities, and vendor capacities, not only at the level of a single company but at the level of a whole supply chain. In order to realize economic potential in terms of cost savings and increased earnings, some gaps need to be bridged. Food supply chains usually consist of companies of heterogeneous nature: many small or medium-sized farmers (see Figure 1) located around the world, local consolidators, logistics providers, and some very large food retailers like Carrefour or WalMart. All of them run different Enterprise Resource Planning (ERP) software or even paper-based solutions to manage their production or transport facilities. Due to this heterogeneity, supply chain partners are not able to coordinate their capacities towards new dynamic production networks by using the advantages of modern ICT to lower transaction costs. Also, trust-building and commercialization support mechanisms are not available today, or at least not in an integrated solution.
J. Altmann, R. Buyya, and O.F. Rana (Eds.): GECON 2009, LNCS 5745, pp. 88–101, 2009. © Springer-Verlag Berlin Heidelberg 2009
Fig. 1. Distribution of European agricultural holdings by economic size. 1 ESU (European Size Unit) roughly corresponds to either 1.3 hectares of cereals, 1 dairy cow, or 25 ewes. All farmers with less than 16 ESU are categorized as small principal and part-time farmers [5].
This leads to four main challenges that agrifood supply chains need to face: (1) supporting capacity markets for inter-company capacity allocation, (2) restructuring food chain management with new, dynamic coordination mechanisms, (3) addressing demand trust and food safety issues in an integrated manner, and (4) implementing a full lifecycle solution covering the management of capacities, SLAs, VOs, and tracking/tracing (T&T) data. The collaborative use of distributed resources and the corresponding problems, like SLA management, VO management, trust-building mechanisms, and security issues, are among the core functionalities of modern Grid solutions. Thus, using Grid concepts is a proper way to address the challenges of agrifood supply chains.
The use of Grid technology concepts helps to break up fixed boundaries in food supply chains and enables dynamic supply chain composition, which increases the exploitation of production capacities and ensures efficient as well as gap-free quality monitoring. At the same time, Grid secures data ownership by using distributed data storage facilities at the level of a single supply chain member. Thus, companies keep ownership of their data, which is a crucial factor for market success. Grid concepts improve the scalability of the overall solution and already offer suitable solutions for the composition and monitoring of supply chains.
In our first paper [6] we described the developed solution, called AgroGrid, from the technical point of view, focusing on the composition and monitoring of dynamic supply chains. In this paper we address the economic use of Grid concepts to solve a business problem in the area of the food industry. The AgroGrid solution is introduced, and the technical as well as the business details are presented.
The remainder of the paper is organized as follows. Section 2 presents scientific background and related work in the field of Grid Economics. Section 3 introduces the AgroGrid solution and discusses architectural details, while Section 4 presents business aspects of AgroGrid. Finally, Section 5 concludes this paper and presents possible future work.
2 Scientific Background and Related Work
Grid has reached an age in which technical problems have been satisfactorily solved and first Grid solutions have become economically successful. Since the Grid grew out of the laboratories of scientifically oriented high performance computing centers, business aspects and concrete business applications have been underrepresented in the literature. Earlier projects in the field of Grid, like the Globus Toolkit project, SweGrid, EGEE, GridBus, and Akogrimo, mainly developed basic Grid components and architectural concepts [7]. Projects such as GRASP, Akogrimo, GRIA, TrustCom, and NextGrid developed new Grid components and architectural solutions addressing business-related aspects like risk, trust, service-level agreements, and basic business models [7]. Commercialization issues were not elaborated in detail in these projects. Instead, they focused on the development of basic Grid functionalities or on the solution of emerging technical problems like virtual organization lifecycle management, information/service discovery, distributed data storage and processing, Grid resource allocation and management, trust, security, and contract management. AgroGrid uses a subset of these features to implement a business solution. The focus of AgroGrid does not lie on further technical development but rather on the combination, customization, and exploitation of existing components, in order to show that currently offered open source Grid components are sufficient for building new business solutions. Grid provides a means of abstraction for resources that are managed, shared, or coordinated across members of a Virtual Organization. In contrast to the above-mentioned projects, whose components or frameworks dealt with the management, sharing, or coordination of computer-related resources like computing power, storage capacity,
bandwidth, etc., AgroGrid deals with logistics capacities and with material resources (trade goods) which are delivered across the supply chain. AgroGrid provides a means of abstraction of material resources. Parties participating in a supply chain share the information about specific trade goods delivered across the supply chain. A first work towards the commercialization of Grid resources was done by Kenyon and Cheliotis in 2006. The authors pointed out some factors for a successful implementation of a Grid market and mentioned that customers’ requirements and demands need to be met accurately to make a Grid market a true success story [8]. Like Kenyon and Cheliotis, [9] and [8] define success factors for Grid commercialization. However, all of these authors remain rather generic instead of providing a concrete solution for business needs [10]. Besides those papers concerning the commercialization of Grid markets, a range of work related to business models and value networks can be found in the literature. In general, these works identify two major business models: (1) selling Grid technology and (2) providing Grid-enabled applications [11], [12]. Furthermore, many market and business studies dealing with the economic exploitation of Grid technology and Grid-enabled solutions are available [13] [14] [15]. A first broad summary of the state of the art in market-oriented Grids and Utility Computing was given by Broberg, Venugopal and Buyya in 2008 [16]. The authors provide an overview of market-based approaches, technologies, price setting and negotiation mechanisms, as well as trust models and accounting. Numerous contributions in the field of Grid economics were provided by the Grid Computing and Distributed Systems laboratory (GRIDS). The GRIDS lab was involved in the Economy Grid project, in which some main challenges for the business use of the Grid emerged [17] [10].
For instance, the Nimrod-G Grid Resource Broker and Economic Scheduling Algorithms were developed by the GRIDS lab. The Gridbus project, another project in which the GRIDS lab is involved, focuses on the implementation of Grid middleware which is used for building e-business applications [18]. The largest research project investigating the business use of Grid is the Business Experiments in Grid (BEinGRID) project, funded by the European Commission within the 6th Framework Programme [19]. The project is structured in two waves of business experiments (BEs), all instructed to use Grid-enabled solutions to build workable concepts for start-up companies. Altogether, 25 business experiments were set up. The second-wave BEs are asked to use the findings and technical solutions developed by the BEs of the first wave. In particular, these solutions are used to deal with general security, license management, data management, Virtual Organization management, portals, and Service Level Agreements. AgroGrid is one of the business experiments of the second wave in the BEinGRID project.
3 AgroGrid – Technical Aspects
The motivation behind the business experiment AgroGrid is to introduce Grid technology in the agricultural sector by offering a full lifecycle solution for
dynamic capacity markets which integrates VO and SLA management with the market leading solution in global distributed tracking and tracing, GTNet® [20]. A more detailed description of the technical solution can be found in [6].
3.1 AgroGrid Components
AgroGrid provides a Grid-enabled market place for capacities which allows companies in the agricultural food sector to offer and search for capacities, negotiate SLAs, and create dynamic supply chains. Additionally, based on the distributed tracking and tracing capability of the GTNet® platform, AgroGrid provides means to monitor and evaluate the quality and safety of food delivered across supply chains. The solution provided by AgroGrid consists of the following components (see Figure 2): the Portal, the VO-Management, the SLA-Negotiator, the SLA-Monitoring&Evaluator, and the Track&Trace component.
Fig. 2. AgroGrid Architecture
The Portal provides AgroGrid users with a common, web-browser based, user-friendly, and secure interface to AgroGrid services. The secure and personalized access to the AgroGrid portal, and thus to AgroGrid services, is authenticated by user login and password and can be combined with users’ certificates. The AgroGrid portlets and services hosted on the portal enable AgroGrid users to publish their own capacities, to search for capacities, to negotiate SLAs, and to build supply chains in a guided and user-friendly way. The portal also offers access to the SLA-Monitoring&Evaluator component, which generates SLA evaluation reports as a result of monitoring the quality and environmental conditions of food trade units during their production, storage, transportation, or delivery.
The VO-Management component is used for the setup and administration of partner memberships in the supply chains. A party wanting to create a supply chain, called the supply chain manager (SC-Manager), uses VO-Management to manage the parties participating in the supply chain, called supply chain members (SC-Members).
The SLA-Negotiator component allows the negotiation of SLAs between a capacity requester and a capacity provider. The negotiation of SLAs includes the negotiation of price, quantity, quality parameters of the food to be delivered, environmental conditions during transport or storage, and the compensation in case of an SLA violation. The SLA-Negotiator is connected to the SLA-Template-Repository, where SLA-Templates are stored. An SLA-Template reflects a capacity offered by a company on the market.
The Track&Trace component consists of an Enterprise Resource Planning (ERP) system, a Traceability Information Exchange (TIX) database [20], and a Notification Proxy. The ERP system, located on the company site, serves as a source of tracking and tracing information, as well as a source of monitoring information about the quality and environmental conditions of food trade units during their production, storage, transportation, or delivery. The tracking and tracing information, as well as the monitoring information provided by the ERP system, is stored in the local TIX database. The local storage of data is important for not violating the company’s data ownership, a crucial criterion for companies deciding whether or not to join the AgroGrid market place. The local TIX also provides interfaces for querying traceability and monitoring information stored in the TIXs of the supply chain members.
The access to TIXs of the supply chain members is mediated by a local TIX, and is secured by mechanisms provided by GTNet-Hub (Global Traceability Network-Hub) which connects all TIXs. The Notification Proxy offers interfaces for submission of messages and manages their subscription, publication and notification. The Notification Proxy notifies supply chain members and the local SLA-Monitoring&Evaluator service about occurred delivery events. The SLA-Monitoring&Evaluator component is responsible for the monitoring and evaluation of SLAs which were successfully negotiated. In order to obtain monitoring information, the component queries the monitoring data which is located not only in the local TIX database, but also in the TIXs of the supply chain members. The queried information is transformed into SLA metrics defined by the corresponding SLAs. The monitored SLA metrics are compared and evaluated in the SLA-Evaluator service against the evaluation criteria defined by the SLA. In case of SLA violation detection, the SLA-Evaluation service notifies affected supply chain members about the incident. The result of the SLA evaluation is stored in the evaluation report database and is accessible via the AgroGrid portal to supply chain members. The evaluation report serves for checking of successful SLA fulfilment, and in case of a detected SLA violation it serves for the determination of penalty and compensation.
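The evaluation step described above, in which monitored values are turned into SLA metrics and compared against the criteria agreed in the SLA, can be sketched as follows. This is a minimal illustration, not the AgroGrid implementation: the class name, the metric names, the range-based criteria, and the readings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricCriterion:
    """Hypothetical SLA evaluation criterion: an allowed range for one metric."""
    name: str
    minimum: float
    maximum: float


def evaluate_sla(criteria, readings):
    """Compare monitored readings with the SLA criteria.

    `readings` maps a metric name to the list of values collected for it.
    Returns a list of violation descriptions; an empty list means fulfilment.
    """
    violations = []
    for criterion in criteria:
        values = readings.get(criterion.name)
        if not values:
            violations.append(f"{criterion.name}: no monitoring data")
            continue
        for value in values:
            if not (criterion.minimum <= value <= criterion.maximum):
                violations.append(
                    f"{criterion.name}: {value} outside "
                    f"[{criterion.minimum}, {criterion.maximum}]"
                )
    return violations


# Hypothetical SLA terms for a chilled transport and the collected readings.
criteria = [
    MetricCriterion("transport_temperature_celsius", 2.0, 8.0),
    MetricCriterion("relative_humidity_percent", 80.0, 95.0),
]
readings = {
    "transport_temperature_celsius": [4.5, 6.0, 9.5],
    "relative_humidity_percent": [85.0, 90.0],
}
report = evaluate_sla(criteria, readings)
print(report)
```

In this sketch a non-empty report would correspond to a detected SLA violation, which in AgroGrid triggers a notification of the affected supply chain members and feeds the determination of penalty and compensation.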
3.2 Building Dynamic Supply Chains
A supply chain in AgroGrid is represented by a Virtual Organization formed by those parties who participate in the sale, delivery, and production of a particular product or food trade unit. The composition of supply chains is based on market mechanisms, i.e., the law of supply and demand. Using the AgroGrid system, a company wanting to offer its capacities publishes them in the AgroGrid capacity registry, providing details on product quality, quantity, and (optionally) pricing. Other companies are then able to query and discover capacities stored in the AgroGrid capacity registry and to retrieve the associated SLA-Template from the SLA-Template repository of the capacity provider. In addition to the capacity data, the SLA-Template contains pricing, environmental conditions during transport and storage, possible penalties in case of an SLA violation, the delivery date, and, in particular, evaluation metrics. The building of supply chains in AgroGrid is supported by a supply chain template, which defines the roles (producer, consolidator, logistics, retailer, etc.) required for the supply chain. A party wanting to create a supply chain, called the supply chain manager (SC-Manager), selects an appropriate supply chain template from the list of available templates (e.g., for building a supply chain for apricots). If no such template is defined, the AgroGrid portal offers the possibility to define templates. After the selection of a specific template, the system provides an overview of the required roles and, after successful negotiation, also an overview of the contracted parties with their specific roles in the supply chain. In the next step, the SC-Manager queries the capacity registry for a specific capacity (e.g., for apricots provided by an apricot producer) needed for building the supply chain.
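The interplay of capacity registry and supply chain template described above can be sketched with two small data structures. This is only an illustrative model under stated assumptions: the class names, fields, and the in-memory registry are hypothetical and do not reflect the actual AgroGrid interfaces.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Capacity:
    """Hypothetical capacity offer published by a company."""
    provider: str
    role: str            # e.g. "producer", "logistics", "retailer"
    product: str
    quantity_tonnes: float


@dataclass
class CapacityRegistry:
    """In-memory stand-in for the capacity registry."""
    offers: list = field(default_factory=list)

    def publish(self, capacity: Capacity) -> None:
        self.offers.append(capacity)

    def query(self, role: str, product: str, min_quantity: float = 0.0):
        """Return all offers matching role, product, and minimum quantity."""
        return [
            c for c in self.offers
            if c.role == role
            and c.product == product
            and c.quantity_tonnes >= min_quantity
        ]


@dataclass
class SupplyChain:
    """Supply chain instantiated from a template that lists required roles."""
    required_roles: tuple
    members: dict = field(default_factory=dict)  # role -> contracted provider

    def contract(self, role: str, provider: str) -> None:
        if role not in self.required_roles:
            raise ValueError(f"role {role!r} is not part of the template")
        self.members[role] = provider

    def open_roles(self):
        """Roles of the template that are not yet filled by a contracted party."""
        return [r for r in self.required_roles if r not in self.members]


registry = CapacityRegistry()
registry.publish(Capacity("Farm A", "producer", "apricots", 20.0))
registry.publish(Capacity("Farm B", "producer", "apricots", 5.0))
registry.publish(Capacity("Trucks C", "logistics", "apricots", 40.0))

chain = SupplyChain(required_roles=("producer", "logistics", "retailer"))
matches = registry.query("producer", "apricots", min_quantity=10.0)
chain.contract("producer", matches[0].provider)
print(chain.open_roles())
```

The list of open roles corresponds to the overview of required roles that the portal shows to the SC-Manager while the supply chain is being built.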
After selecting the required capacity from the capacity registry, the SC-Manager (who acts as a capacity requester) initiates the negotiation process by retrieving the SLA-Template from the capacity provider. In the next negotiation step, the SC-Manager sends an SLA-offer to the capacity provider. The offer contains the SLA-Template with modified or unchanged SLA-terms. The provider may accept or reject the offer by sending an acceptance or rejection notification to the requester, together with (a link to) the accepted SLA in the case of acceptance. After an acceptance notification with the accepted SLA has been received from the capacity provider, the system automatically sets up the SLA-Monitoring&Evaluator service, which is responsible for the continuous monitoring of established SLAs. As a result of the continuous monitoring and evaluation process, the SLA-Evaluator service creates an SLA-evaluation report, which reflects the fulfilment of the SLA and, in case of a detected SLA violation, serves for the determination of penalty and compensation. The AgroGrid portal provides an overview of monitored SLAs and allows users to retrieve a specific SLA-evaluation report in order to verify the fulfilment of contracted SLAs. After an SC-Manager has established an SLA with one party of the required role (e.g., an apricot provider), he might select further capacities and initiate the negotiation process with more capacity providers (e.g., a provider of logistics capacity) in order to complete the building of the supply chain. This procedure allows a chaining of parties, based on bipartite SLAs, in order to create dynamic
supply chains of parties who participate in the sale, delivery, and production of a particular product or food trade unit. Dynamic supply chains in AgroGrid refer to the timely extension of the supply chain by new partners, and to the possibility of removing a supply chain member who has repeatedly violated SLAs or replacing that member with new supply chain members.
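The negotiation sequence described in this section (retrieve the SLA-Template, send an offer with modified or unchanged terms, receive acceptance or rejection, start monitoring on acceptance) can be summarized as a small state machine. The sketch below is an assumption-laden illustration: the class, state names, and term names are hypothetical and are not taken from the AgroGrid code base.

```python
from enum import Enum, auto


class State(Enum):
    TEMPLATE_RETRIEVED = auto()
    OFFER_SENT = auto()
    ACCEPTED = auto()
    REJECTED = auto()


class Negotiation:
    """Bipartite SLA negotiation between a requester and a provider."""

    def __init__(self, template: dict):
        # The requester starts from a copy of the provider's SLA-Template.
        self.terms = dict(template)
        self.state = State.TEMPLATE_RETRIEVED
        self.monitoring_active = False

    def send_offer(self, **changed_terms) -> None:
        """Requester sends the template with modified or unchanged terms."""
        if self.state is not State.TEMPLATE_RETRIEVED:
            raise RuntimeError("an offer can only follow template retrieval")
        self.terms.update(changed_terms)
        self.state = State.OFFER_SENT

    def provider_decides(self, accept: bool) -> None:
        """Provider accepts or rejects; acceptance starts SLA monitoring."""
        if self.state is not State.OFFER_SENT:
            raise RuntimeError("there is no pending offer")
        self.state = State.ACCEPTED if accept else State.REJECTED
        self.monitoring_active = accept


# Hypothetical template terms for an apricot capacity.
negotiation = Negotiation({"price_eur_per_tonne": 900, "quantity_tonnes": 20})
negotiation.send_offer(price_eur_per_tonne=850)
negotiation.provider_decides(accept=True)
print(negotiation.state, negotiation.terms, negotiation.monitoring_active)
```

Repeating this bipartite negotiation for each required role yields the chaining of parties from which a dynamic supply chain is composed.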
4 AgroGrid – Business Aspects
AgroGrid addresses the demand for more supply flexibility and increased exploitation of capacities. Food supply chain companies can offer and query capacities in the AgroGrid market place. The 2007 report on the competitiveness of the European food industry provides a detailed insight into the nature of the market as such, identified trends, the potential for competitiveness improvements, and key characteristics of the market players [2]. Producers, consolidators, logistics providers, and retailers are highly heterogeneous in terms of company size. Market concentration figures given in the report support this statement with respect to the dominant position of a few large retailers: “In 2003 consumers spend 1,028 billion Euro at the retailers and foodservices: the market share of retailers is 66%. The concentration is high and still increasing: the top-5 supermarkets have a market share of around 70% in most EU countries. The top-25 global supermarkets, of which 60% with a European headquarter, are active in several countries and even at several continents” [2]. The study elaborates on interrelations between SMEs (typically food producers) and large food industry businesses (typically retailers). It was found that SMEs face strong challenges in implementing ICT for electronic interchange with other supply chain partners. Most SMEs were found to adopt systems for electronic interchange “due to external pressure (e.g. from retailers) rather than to gain a competitive advantage” [2]. Another important finding characterizing the industry players and the industry as a whole is the change in consumer demand observed over the last decades (see Table 1). Consumers have shifted their focus from price, quality, variety, and delivery time to product innovation and individualized food, and consumers today are highly sensitive to food safety issues.
Accordingly, the top-priority concerns of management at all supply chain steps are related to innovation and to how to differentiate from competitors by means of innovative products. The organizational focus for achieving a high level of product innovation is consequently set on optimized sourcing and the restructuring of chain networks. AgroGrid (and GTNet®) addresses these issues, challenges, and topics of innovation at their core. AgroGrid’s Grid-enabled market place for capacities brings together market players of very different backgrounds and sizes, imposes only a minimum level of entry barriers, facilitates electronic interchange across business boundaries, improves food chain management, enables innovative products, and achieves, in combination with GTNet®, an integration with tracking and tracing along the full supply chain.
Table 1. Change of consumers’ preferences [2]
Period | Consumer demand | Management concern | Performance agri-business | Organizational focus
1960s | price | Efficiency | efficiency | firm
1970s | quality | Quality | quality | firm
1980s | variety | Quality | flexibility | bi-lateral
1990s | delivery time | Flexibility | velocity | chain
2000s | uniqueness | Innovation | innovation power | chain network
Management technique (in the order listed in the table): just in time; material requirements planning; supply chain management; efficient consumer response
As shown in Figure 3, the relevant actors in AgroGrid comprise BEinGRID and the EU, acting as a Grid research funding party (funding in return for Grid business and technology cases), as well as the four AgroGrid partner entities. These comprise, on the one hand, three partners with an academic profile (High Performance Computing Center Stuttgart, Universitaet Hohenheim, University of Zurich) providing Grid technology support to AgroGrid and, on the other hand, an industrial partner (TraceTracker, TRTR) with a strong commercial background in tracking and tracing by means of GTNet®.
Fig. 3. Value Network of AgroGrid
AgroGrid service provider, infrastructure user, and customer constitute the key business roles in the AgroGrid value network that are relevant to the exploitation of the business model. Supply chain (SC) companies, such as producers, logistics providers, consolidators, and retailers in the case of a food production supply chain, are perceived as the primary AgroGrid users. The AgroGrid service provider enables supply chain companies to enter a Grid-enabled market place for production and logistics capacities. With this market place, the AgroGrid service provider establishes a complementary channel for AgroGrid users to offer, negotiate, and book available capacities. The AgroGrid service breaks open traditional business boundaries by the dynamic creation of Virtual Organizations and thereby rationalizes and reduces the overall costs (in particular, transaction costs) of providing services for various market levels. The AgroGrid service provider combines the strengths of such dynamic capacity networks with the world’s leading solution for tracking, tracing and collaboration in food markets, the Global Traceability Network (GTNet®). This unique combination of a newly introduced Grid-enabled market place for capacities and the highly successful tracking and tracing solution GTNet® will enhance current IT capabilities in order to enable companies to deliver better, cheaper and faster services for the customers of the agro IT industry. This will lead to a positive impact on consumer confidence, faster product recalls and, finally, a higher level of public welfare. Agro food IT industry players may access AgroGrid’s market place as Software-as-a-Service (SaaS) through AgroGrid’s portal infrastructure. This is a core aspect in light of the high level of heterogeneity in the industry.
The AgroGrid portal encapsulates various service functionality in well-defined containers, called portlets, such as a portlet to visualize the current state of an instantiated VO (see Figure 4). AgroGrid users are enabled to consume the services offered by the service provider. These services include, for example, the publishing and booking of production or transport capacities in relation to a food SC. Such capacities are transformed into agreements by means of an SLA negotiation phase. As soon as actual trade units have been exchanged according to the negotiated agreements, tracking and tracing information on these trade units is made available to authorized entities, i.e., the respective user’s customers, by means of GTNet®. For using those services, in principle, two pricing strategies can be envisaged:
– Pricing in proportion to the transactional value
– Subscription-based pricing
A subscription-based pricing scheme has the advantages that (a) it is easy to understand, (b) it is easy to implement with respect to the underlying accounting and charging infrastructure required, (c) it makes planning ahead simple, and (d) it allows customer segment-based pricing in a simple way. A transactional pricing scheme, on the other hand, would lower entry barriers for potential users, since it charges only in direct relation to the value created. The long-term market experience of TraceTracker has led to a good understanding of the targeted enterprises in the food sector and of their willingness to pay.
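The trade-off between the two pricing strategies can be made concrete with a short calculation of the yearly traded value at which both schemes charge the same amount. The functions and the example values (an annual fee of 5,000 Euro and a 2% commission) are illustrative assumptions, not a pricing model stated by the authors.

```python
def subscription_charge(annual_fee: float) -> float:
    """Yearly charge under subscription-based pricing (independent of trade)."""
    return annual_fee


def transactional_charge(traded_value: float, commission_rate: float) -> float:
    """Yearly charge under pricing in proportion to the transactional value."""
    return traded_value * commission_rate


def break_even_traded_value(annual_fee: float, commission_rate: float) -> float:
    """Yearly traded value at which both schemes charge the same amount."""
    if commission_rate <= 0:
        raise ValueError("commission_rate must be positive")
    return annual_fee / commission_rate


# Illustrative assumptions: 5,000 Euro/year subscription, 2% commission.
fee, rate = 5_000.0, 0.02
threshold = break_even_traded_value(fee, rate)
small_user = transactional_charge(50_000.0, rate)    # trades little per year
large_user = transactional_charge(1_000_000.0, rate)  # trades a lot per year
print(f"break-even traded value: about {threshold:,.0f} Euro/year")
```

Under these assumed values, a user trading less than the break-even value pays less with transactional pricing, which is the sense in which that scheme lowers entry barriers, while a user trading more is better off with a subscription.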
E. Volk et al.
Fig. 4. Screenshot VO-Visualization
Initial market contacts show that customers are willing to pay up to 5,000 Euro/year under a subscription-based pricing scheme. Existing trading platforms (like [21] or [22]) traditionally work on a commission basis, charging a certain percentage of the value of the traded goods for their services. Depending on the value of the traded products, the commission is in the range of 1 to 3%. However, this normally does not include any traceability or other services. Setting up a new company acting as the AgroGrid technology and service provider will lead to costs of about 320,000 Euros per year. This includes all required hardware and software, human resources, office furniture, rental fees for office accommodation, and so on. A carefully worked out three-year business plan, based on a detailed cost estimation and on the previously described sales revenues multiplied by an estimated number of clients, demonstrates the possibility of a positive net profit before tax in the second year after incorporation (see Table 2).
Table 2. AgroGrid Financial Figures [estimated, in Euro]
                                  Year 1      Year 2      Year 3
Estimated number of clients           10         100         800
Revenue                           50,000     500,000   4,000,000
Total costs of production        270,000     350,000     750,000
Net Profit / Loss before Tax    -220,000     150,000   3,250,000
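The arithmetic behind Table 2 can be re-traced with a short sketch. It assumes (this is an assumption, although it is consistent with the tabulated revenues) that every client pays the full 5,000 Euro/year subscription quoted in the text. The helper names are hypothetical and only serve the illustration; the last helper relates the subscription fee to the 1 to 3% commission charged by existing platforms.

```python
# Figures copied from Table 2 (estimated, in Euro).
TABLE_2 = {
    "Year 1": {"clients": 10, "costs": 270_000},
    "Year 2": {"clients": 100, "costs": 350_000},
    "Year 3": {"clients": 800, "costs": 750_000},
}

# Assumption: every client pays the full 5,000 Euro/year subscription.
SUBSCRIPTION_FEE = 5_000


def revenue(clients: int, fee: int = SUBSCRIPTION_FEE) -> int:
    """Yearly subscription revenue in Euro."""
    return clients * fee


def net_profit(clients: int, costs: int, fee: int = SUBSCRIPTION_FEE) -> int:
    """Net profit (positive) or loss (negative) before tax in Euro."""
    return revenue(clients, fee) - costs


def break_even_clients(costs: int, fee: int = SUBSCRIPTION_FEE) -> int:
    """Smallest number of subscribers whose fees cover the yearly costs."""
    return -(-costs // fee)  # ceiling division on integers


def commission_break_even(fee: int, rate_percent: float) -> float:
    """Yearly traded value at which a commission equals the subscription fee."""
    return fee * 100 / rate_percent


if __name__ == "__main__":
    for year, row in TABLE_2.items():
        print(year,
              revenue(row["clients"]),
              net_profit(row["clients"], row["costs"]),
              break_even_clients(row["costs"]))
```

Under these assumptions, the Year 1 cost level would require 54 subscribers to break even, and a 1% commission matches the subscription fee at a yearly traded value of 500,000 Euro per client.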
Increasing Capacity Exploitation in Food Supply Chains
Table 3. Critical success factors and risks
Success factors
Risks
Gaining fast and embracing market access
Reach sufficient market acceptance enabling a sustainable business in the long run (critical mass)
Development of a fine-grained, precise financial model covering revenue potentials and long-term full costs
Mobilizing the critical mass of users from varying value chains
Leveraging the benefit for the users as predicted in order to turn benefits into revenue
Table 3 shows the key set of critical success factors and risks which were identified for the AgroGrid business model.
5 Conclusion and Future Work
Issues of supply chain management complexity and the need for fully documented product quality have led to the design and implementation of AgroGrid. AgroGrid enables the cost-efficient utilization of capacities through the AgroGrid portal. The promising business case shows that existing open source Grid concepts and components are able to support the collaborative use of distributed production and logistics capacities, and not only traditional Grid-related resources like CPU power or data storage facilities. Although adopting Grid concepts for production and logistics capacities leads to useful applications, not all problems were solved during the development of the AgroGrid business case and the implementation of the corresponding software solution. Not all security-related issues have been considered completely so far, even though the state-of-the-art VO Management component in use supports all required security concepts, which makes the development of such an add-on very easy [23]. At the moment, a fully automated discovery of possible market partners, including autonomous negotiation and contracting, is not implemented. In addition, there are legal restrictions to be kept in mind: the current state of the law does not cover the requirements for such an automatic conclusion of contracts. Thus, AgroGrid users are asked to conclude their contracts manually after the fully automated negotiation. Beyond these technical and legal aspects, AgroGrid is a step towards the use of Grid concepts to manage and share capacities along whole supply chains. It was shown that these concepts can successfully be adopted to solve problems identified in food supply chains, and not only problems related to classical high performance computing, on which Grid systems have traditionally focused.
Acknowledgements. The results presented in this paper are partially funded by the European Commission through the BEinGRID project.
References
1. Foster, I., Kesselman, C.: The Grid 2: Blueprint for a New Computing Infrastructure. Morgan Kaufmann, San Francisco (2003)
2. Wijnands, J., van der Meulen, B., Poppe, K.J.: Competitiveness of the European food industry - an economic and legal assessment. Technical report, European Commission (2006)
3. Kantor, L.S., Lipton, K., Manchester, A., Oliveira, V.: Estimating and addressing America's food losses. FoodReview 20(1), 2–12 (1997)
4. Ilic, A., Staake, T., Fleisch, E.: Using sensor information to reduce the carbon footprint of perishable goods. IEEE Pervasive Computing 7(1), 22–29 (2009)
5. Eurostat: Agricultural statistics, main results 2006-2007. Technical report, European Union (2008)
6. Volk, E., Jacob, A., Müller, M., Racz, P., Waldburger, M., Bjerke, J.: AgroGrid - composition and monitoring of dynamic supply chains. In: Proceedings of the Cracow Grid Workshop 2008 (CGW 2008), Krakow, March 2009, pp. 373–381 (2009)
7. Altmann, J., Routzounis, S.: Economic modelling for grid services. In: Proceedings of the e-Challenges (2006)
8. Kenyon, C., Cheliotis, G.: International Series in Operations Research and Management Science 64, 465–478 (2006)
9. Cheliotis, G., Miller, S., Woodward, J., OH, D.: Questions for getting smarter on creating a grid market hub, grid marketplace roundtable. In: Proceedings of the GECON 2006, Singapore (May 2006)
10. Thanos, G.A., Courcoubetis, C., Stamoulis, G.D.: Adopting the grid for business purposes: The main objectives and the associated economic issues. In: Veit, D.J., Altmann, J. (eds.) GECON 2007. LNCS, vol. 4685, pp. 1–15. Springer, Heidelberg (2007)
11. Joseph, J., Ernest, M., Fellenstein, C.: Evolution of grid computing architecture and grid adoption models. IBM Systems Journal 43(4), 624–644 (2004)
12. Sawhny, R., Dietrich, A.J., Bauer, M.T.: Towards business models for mobile grid infrastructures - an approach for individualized goods. In: Proceedings of the Workshops Access to Knowledge through Grid in a Mobile World, 5th International Conference on Practical Aspects of Knowledge Management, Vienna, Austria (2004)
13. Quocirca: Business grid computing - the evolution of the infrastructure (September 13th 2006)
14. The Insight Research Corporation: Grid computing: A vertical market perspective 2006-2011. Technical report, The Insight Research Corporation (2005)
15. The 451 Group: Grid computing - where is the value? Technical report, 451 Grid Adoption Research Service (Report 1) (August 2004)
16. Broberg, J., Venugopal, S., Buyya, R.: Market-oriented grids and utility computing: The state-of-the-art and future directions. Journal of Grid Computing 6(3), 255–276 (2008)
17. Buyya, R., Abramson, D., Venugopal, S.: The grid economy. In: Special Issue of the Proceedings of the IEEE on Grid Computing. IEEE Press, Los Alamitos (2005)
18. The Grid Computing and Distributed Systems (GRIDS) Laboratory, University of Melbourne, http://www.gridbus.org/
19. Business Experiments in Grid, BEinGRID, http://www.beingrid.eu
20. TraceTracker Innovation ASA, http://www.tracetracker.com/
21. Nungesser GmbH, http://www.nungesser.com/
22. Alibaba.com, http://www.alibaba.com
23. Gaeta, A.: Main results of the VO Management thematic area - a BEinGRID whitepaper, http://www.gridipedia.eu/virtualorganizationmanagement.html
A QoS-Based Selection Mechanism Exploiting Business Relationships in Workflows
Dimosthenis Kyriazis, Konstantinos Tserpes, Ioannis Papagiannis, Kleopatra Konstanteli, and Theodora Varvarigou
Dept. of Electrical and Computer Engineering, National Technical University of Athens, 9, Heroon Polytechniou Str, 15773 Athens, Greece
{dkyr,tserpes}@telecom.ntua.gr, [email protected], {kkonst,dora}@telecom.ntua.gr
Abstract. This paper deals with the problem of selecting service instances to execute workflow processes by taking into consideration not only the Quality of Service (QoS) constraints posed by the end-users, but also the business relationships between different service providers. The business / strategic relationships that providers maintain may affect the parameters of their service instances with regard to the offered QoS level; these relationships therefore need to be modeled and expressed with specific metrics. In this paper we present an innovative algorithm that maps workflow processes to service instances by taking the aforementioned metrics into account during the selection process. We also demonstrate the operation of the implemented algorithm and evaluate its effectiveness using a scenario based on a 3D image rendering application.
Keywords: Quality of Service, Business Relationships, Workflows.
J. Altmann, R. Buyya, and O.F. Rana (Eds.): GECON 2009, LNCS 5745, pp. 102–115, 2009. © Springer-Verlag Berlin Heidelberg 2009

1 Introduction
Although initially designed to cover the computational needs of high performance applications [1], [2], Grid technology nowadays aims at providing the infrastructure for the general business domain. Advanced infrastructure requirements, combined with the innate business goal of lower costs, have driven key business sectors such as multimedia, engineering, gaming, and environmental science, among others, towards adopting Grid solutions. Furthermore, complex application workflows are emerging along with specification languages used to enable workflow description and execution in Grid environments. The final success of this business orientation of Grid technology, however, will primarily depend on its real adopters: the end users, whose main demand refers to the offered level of quality. Since workflow is a wide concept in technology, we first define the workflow terminology used in the remainder of this paper. The Workflow Management Coalition (WfMC) provides the following general definition [3]: “Workflow is the automation of a business process, in whole or part, during which documents, information or tasks are passed from one participant to another for action, according to a set of procedural rules”. A Workflow Model / Specification is used to define a workflow at both the task and the structure level. There are two
types of workflows, namely Abstract and Concrete [4], [5]; concrete workflows are also referred to as executable workflows in some literature [6]. In an abstract model, the tasks are described in an abstract form without referring to specific Grid resources for task execution, which allows users to define workflows in a flexible way, isolating execution details. Furthermore, an abstract model provides only service semantic information on how the workflow has been composed, and therefore the sharing of workflow descriptions between Grid users is feasible, which is of major importance for the participants of Virtual Organizations (VOs) [1]. Abstract models can be composed with systems like the one presented in [7]. In the concrete model, the tasks of the workflow are bound to specific resources, and therefore this model provides service semantic and execution information on how the workflow has been composed, both for the service instances and for the overall composition (e.g. dataflow bindings, control flow structures). The shift from science Grids to business Grids, in parallel with the replacement of simple job executions by complex workflow management [3] and enactment in Grids, has resulted in advanced requirements in the field of workflow mapping with regard to QoS metrics / resources' special attributes (e.g. performance profile). Given that each workflow contains processes that can be executed by a set of service providers / instances (candidates), which are annotated with QoS information, workflow mapping refers to the mapping of these workflow processes to Grid-provided services, taking the QoS metrics into account in order to provide a selection of candidates that guarantees end-to-end QoS for the submitted workflow. In the literature, this is referred to as Workflow QoS Constraints, and it remains one of the key factors in a Grid Workflow Management System and, more specifically, in the Workflow Design element [8].
As presented in the Related Work section of this paper, there are many approaches that address the QoS issue in Grid environments, and in one of our previous works [9] we have presented in detail a QoS-aware workflow mapping mechanism. However, the business relationships between the service providers are not taken into consideration during the selection process. In greater detail, the service providers may have business relationships that can be Cooperating, non-Cooperating or even Antagonistic, Cheating, or Malicious. These relationships affect the workflow mapping, since the QoS metrics of a service provider may change based on the selection of another provider: on many occasions, a service provider may alter the QoS values of its offered services depending on its business relationship with another selected provider. The modeling of strategic relationships, and the derivation of metrics that define a service provider's “friendliness” based on the relationships it has with other providers, were presented in our previous work [10]. This paper completes that work by describing a QoS-based selection mechanism that takes business relationships into account during the selection process and meets the user's QoS requirements. We do not describe how the business relationships are captured in the Service Level Agreements (SLAs), since this is out of the scope of this paper, which mainly looks into the selection process. The remainder of the paper is structured as follows. Section 2 presents related work in the field of QoS-based workflow management in Grids. Section 3 briefly introduces the concept of Business Relationships in workflows and links the work
presented in this paper with the modeling results presented in [10], while Section 4 describes the selection algorithm. Initial results evaluating the mechanism are presented in Section 5, and the paper concludes with a discussion on future research and potentials for the current study.
2 Related Work
There are various approaches for QoS-based workflow management in Grid environments. In some cases, the selection process is based on the SLA negotiation process, as discussed in [12], [13] and [14], while the GridEcon and AssessGrid projects have both considered the issue of QoS-based workflow management, as presented in [29], [30] and [31]. The end-user's constraints and preferences are passed to several service providers through the functionality offered by a broker (usually the SLA Management Service) for allocating the appropriate service providers. The Globus Architecture for Reservation and Allocation (GARA) [15] addresses QoS at the level of facilitating and providing basic mechanisms for QoS support, namely resource configuration, discovery, selection, and allocation. Outcomes of the research on QoS-based selection for workflows are also presented in [16], [17] and [18]. The first one proposes an algorithm that minimizes cost within the time constraint, while the second presents a system that is able to meet pre-defined QoS requirements during the workflow mapping process. The authors of [18] discuss a system that, based on event-condition-action rules, maps workflow processes to Grid resources taking QoS information into account. A workflow QoS specification and methods to predict, analyze and monitor QoS are presented in [19] and [20]. That work focuses on the creation of QoS estimates and on the QoS computation for specific metrics (time, cost, fidelity and reliability) with the use of two methods: analysis and simulation. In this case, the parameters are handled one by one, similar to [16] and [21], and not in a combined way, while the overall estimation emerges from the individual tasks. The authors of [23] present the ASKALON tool, which comprises four components along with a service repository to support performance-oriented development of parallel and distributed (Grid) applications.
References [24], [25] and [26] discuss the ICENI environment, in which a graph-based language is used to annotate component behaviors and perform optimizations based on the estimated execution times and resource sharing. The gathered performance information is taken into account during the resource selection, while the mapping of work onto resources through a workflow-enabled scheduler (which is able to make use of performance information) is also supported. Moreover, a three-layered negotiation protocol for advance reservation of Grid resources and a mechanism that optimizes resource utilization and QoS constraints for agreement enforcement are presented in [27]. The difference between the systems presented in this section and our proposed scheme lies in the fact that, while the former yield very good results for QoS-based selection, the QoS parameters they consider during the selection process are either the ones published by the service providers (via SLAs) or the ones obtained from monitoring tools over the resources. They do not tackle an issue that may affect the selection process, namely changes in the QoS values due to business relationships. This kind of information cannot be obtained with monitoring tools
since these tools operate during the execution of a process, while algorithms and methods for QoS-based selection with a priori knowledge of the effects of service providers' business relationships have not been published.
3 Business Relationships Overview
This section briefly presents the outcomes of our previous work [10] regarding the modeling of a strategic relationship and the characterization of the “friendliness” of a service provider, since these metrics are prerequisites for the selection algorithm presented in Section 4. The study was performed for the initial indicative parameters Availability, Cost and Time. For a detailed description of the modeling, please refer to [10]. The definitions that follow are needed by the selection algorithm presented thereafter, and for that reason we briefly cite them in this section. A strategic relationship can be interpreted as the way in which the selection of a provider affects other providers. To model a relationship, let us consider it as a directed edge on the problem's graph from a service provider A to a service provider B. The source of the edge is the provider that stimulates the relationship, and the destination is the provider that alters its service parameters in response to the selection of the source.
Fig. 1. Strategic Relationships in a workflow: A-B: Immediate Influence, A-E: Future Influence
In the example presented in Fig. 1, service provider A triggers a change to provider B's QoS parameters and thus we have an edge from node A to node B. If the selection of provider B also changes the parameters of provider A, a second edge with the opposite direction is required. In order to design a function that characterizes a service instance's influence, we need a way to express the influence of a specific strategic relationship. In our study we concluded that it can be expressed by the following function (Strategic Relationship Influence, SRI):
[Equation (1): SRI(trigger, affected, levels); see [10] for the detailed modeling.]
where: trigger refers to the service provider that triggers a strategic relationship, affected refers to the service provider that alters its QoS parameters, and levels refers to the set of processes that we measure the influence of the relationship on.
The Immediate Influence (II) of a specific strategic relationship is given by the following equations:
[Equations (2) and (3): definition of II; see [10] for the detailed modeling.]
where Value is the corresponding QoS parameter's value of the affected service provider, the Minimum and Maximum values refer to the corresponding QoS parameter inside a service process, and Slope is a user-defined parameter that describes how important one parameter is compared to another and takes values in [0,1]. The Future Influence (FI) of an affected node towards the set of process levels on which it is measured is given by:
[Equations (4) to (7): definition of FI (a summation, ∑) and of its balancing factors; see [10] for the detailed modeling.]
where remLevels is the set (levels − level(affected)) that represents the remaining process levels on which the FI is calculated, Num[] returns the current number of service providers in a given set of processes, and adj(affected, remLevels) is the set of nodes adjacent to the affected node inside a set of processes. Based on the above, the Service Provider Friendliness (SPF) is given by:
[Equations (8) to (11): definition of SPF (a summation, ∑) and of its balancing factors; see [10] for the detailed modeling.]
where: provider is the service provider we are calculating the SPF for, levels is the set of process levels that we are calculating the SPF on, curLevel is the current process level from the given set of levels, Affected Nodes is the number of adjacent to provider nodes in the current level, Mean SRI is the arithmetic mean of SRIs for the strategic relationships that are triggered from provider and affect other service providers only in the current level, and the Balancing Factors are calculated as in F.I. and they count SRI types per level.
4 QoS-Based Selection Mechanism
In the following, we describe the algorithm used within the workflow mapping mechanism in order to select the service instances per process based on the QoS parameters and the application workflow, while taking advantage of any Strategic Relationships that appear. The main goal of the algorithm is to arrive at an optimum selection with regard to the QoS metrics requested by the user and offered by the service providers. The algorithm's strategy is initially to map workflow processes to service instances in a way that the constraints set by the user are met (select instances that meet the requested availability level without violating the budget constraint). Afterwards, the instances that offer a higher level of QoS (in terms of availability and execution time) for the corresponding cost are identified and replacements on the initial mapping take place. Within the algorithm, the user's preferences are expressed as slope values for the availability, cost and time parameters. These values express how important each parameter is considered to be by the user, and each one takes values 0 ≤ Slope ≤ 1. The higher the slope, the more important the parameter is for the user. Decimal slope values allow a detailed and exact input of user preferences. Consequently:
AvailabilitySlope + CostSlope + TimeSlope ≤ 3    (12)
These slope parameters have already been used in the calculation of a Strategic Relationship's Immediate Influence, where they express the user's preferences towards specific QoS parameters by strengthening a relationship's Immediate Influence on them. In the following, we describe in detail the major steps of the algorithm along with their sub-steps.
Step 1: Calculation of auxiliary values (pilots for the algorithm completion) in order to define metrics that “characterise” the level of QoS provided by the service instances as well as their Strategic potential.
I. Calculation of the minimum and maximum values of availability, cost and time for each service type based on their service instances (candidates).
II. Computation of the pilot values for the availability, cost and time parameters based on their minimum and maximum values with the use of the following functions:
[Equations (13), (14) and (15): exponential (exp) pilot functions for availability, cost and time, respectively.]
In the above functions, x is the value of availability, cost and time for each service instance, and MinAvailabilityValue, MaxAvailabilityValue, MinCostValue, MaxCostValue, MinTimeValue and MaxTimeValue are the minimum and maximum values of the parameters (as described in the previous sub-step). These functions, which resulted from our study of various experiments and simulations, are non-linear with positive slope, and they express the influence of any change in the availability, cost and time values on the user-defined parameters and preferences.
III. Calculation of the Strategic Relationship Influence for all strategic relationships in the problem space and subsequent calculation of the Service Provider Friendliness for each service provider (for availability, cost and time).
IV. Calculation of the minimum and maximum values of SPF (availability, cost and time) for each service type based on their service instances (candidates).
V. Computation of the converted values for the availability, cost and time Service Provider Friendliness based on their minimum and maximum values with the use of the following functions:
[Equations (16) to (19): converted SPF functions for availability, cost and time, exponential with base 1.08, together with the helper function H().]
In the above functions, x is the value of SPFavailability, SPFcost and SPFtime for each service instance, and Min / Max are the minimum and maximum values of the parameters (as described in the previous sub-step). Function H() is a helper function that transforms absolute values into relative ones inside a service type and thus allows a better integration of the Friendliness ratings with the actual QoS parameters. It is needed here, unlike in sub-step II, because SPF can take negative values, which should be reflected in the final results as a reduction of the original QoS values. The choice of the exponential base 1.08 is clarified in sub-step VII.
VI. Calculation of the new parameters' values that will be used further on, based on the aforementioned functions:
[Equations (20), (21) and (22): new availability, cost and time values (e.g. NewCostValue), computed from InitAvValue, InitialCostValue and InitialTimeValue.]
In the above equations, InitAvValue, InitialCostValue and InitialTimeValue are the values of the parameters initially obtained from the service instances as their offers for the job execution.
VII. Calculation of the following ConvertedIndex, which is used in the sequel in order to proceed with the selections:
[Equation (23): ConvertedIndex.]
This index is the major criterion during the selection process, since it shows for each service instance the offered level of availability with regard to the corresponding cost and execution time. Moreover, it takes account of possible strategic relationships and gives better results for more influential providers. The lower the index, the higher the level of quality provided by the candidate, since for a job execution it demands a lower budget and offers a lower execution time for a higher availability. At this point, it is necessary to clarify the use of 1.08 as the base in the exponential function of sub-step V. By reviewing how the ConvSPFs are used while calculating the ConvertedIndex, it is clear that the friendliest service provider in each service type will have a maximum ConvSPF equal to $1.08^{Slope}$. If this service provider is the absolute best for every QoS parameter, then its ConvertedIndex will be multiplied by
$1/(1.08^{cSlope} \cdot 1.08^{tSlope} \cdot 1.08^{aSlope}) = 1/1.08^{cSlope+tSlope+aSlope}$    (24)
Taking into account that
AvailabilitySlope + CostSlope + TimeSlope ≤ 3    (25)
we observe that the best ConvertedIndex is multiplied by a factor no smaller than $1.08^{-3} \approx 0.79$, i.e. it is reduced by at most about 21% (equivalently, it is divided by at most $1.08^{3} \approx 1.26$). The base 1.08 thus grants a significant reduction to the friendliest provider while scaling the reductions of all other providers below that bound. As a result, if the index of one provider is more than about 26% higher than that of another, the former can never be selected on the grounds of friendliness alone. This mechanism limits the effect that strategic relationships have on the selection algorithm, while providing a way to fine-tune that effect. The specific value 1.08 is the result of various experiments with different application workflows.
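The bound implied by equations (24) and (25) can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and the validation enforces the sum constraint and non-negativity only, which is a simplification of the per-slope range stated earlier.

```python
BASE = 1.08  # exponential base used for the converted SPF values


def check_slopes(availability: float, cost: float, time: float) -> float:
    """Return the slope sum after checking the constraint on the sum."""
    slopes = (availability, cost, time)
    if any(s < 0 for s in slopes):
        raise ValueError("slopes must be non-negative")
    total = sum(slopes)
    if total > 3:
        raise ValueError("slope sum must not exceed 3")
    return total


def best_case_factor(availability: float, cost: float, time: float,
                     base: float = BASE) -> float:
    """Multiplier for the ConvertedIndex of a provider that is the
    friendliest one for every QoS parameter, following equation (24)."""
    return base ** -check_slopes(availability, cost, time)


if __name__ == "__main__":
    factor = best_case_factor(1, 1, 1)
    print(round(factor, 3))      # multiplier for the largest allowed slope sum
    print(round(1 / factor, 3))  # ratio a competing index must stay below
```

For a slope sum of 3 the multiplier evaluates to roughly 0.794, so the index of the friendliest provider shrinks by about a fifth, and a competing provider stays out of reach only if its own index is lower by a factor of more than about 1.26.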
Step 2: Initial workflow mapping with service instances that meet the user's requirement for availability without violating the cost constraint.
I. For each service type, the service providers are sorted by ascending NewCostValue. All service providers that do not meet the user's availability constraint are excluded from this initial step. This exclusion is temporary, as these service providers may alter their cost offerings as a result of previous selections.
II. Selection of the service provider with the absolute minimum NewCostValue amongst all service types.
III. If the selected provider triggers any Strategic Relationships, the QoS parameters of all affected providers are altered according to the relationship.
IV. The selection for that service type is considered finalized and NewCostValue is calculated again for every remaining provider. With one service type fewer, the SPF changes for each remaining provider and the recalculation is necessary. This sub-step is the actual recalculation of the heuristic function after an iteration of the algorithm.
V. Sub-steps I to IV are repeated until selections are made for all service types.
VI. The overall workflow cost is calculated for the instances selected in the previous sub-steps. If it exceeds the user's cost constraint, the workflow cannot be mapped to service instances based on the user's requests and the algorithm ends. Otherwise it continues with the following step.
Step 3: Definition of a service instance (candidate) for each service type. The purpose of this step is to discover the candidates that provide a higher level of QoS or improved Strategic potential for each service type within a workflow.
I. For each service type, the service instances with a lower ConvertedIndex value than the one selected in Step 2 of the algorithm are identified. If no such instances exist, the service type is excluded from the rest of the algorithm execution since no optimization can be performed.
The selection for this specific service type is then considered finalized and SPF / ConvertedIndex are calculated again for all the remaining services. If this applies to all service types of the workflow, the algorithm ends and the initial workflow mapping is considered to be the final one. Otherwise, it continues with the following sub-step.
II. Selection of the service instance (candidate) with the lowest value of the ConvertedIndex for each service type.
III. Calculation of the differences in the values of availability, cost, time, SPFavailability, SPFcost and SPFtime between the initial service instance selection (from Step 2) and the replacement one (from the previous sub-step).
Step 4: Creation of a list with the “best candidates” for each service type in order to find possible replacement(s).
I. For each difference that has been calculated in Step 3, the ConvertedIndex is re-calculated. Basically, Step 1 of the algorithm is re-executed considering the aforementioned differences as the initial values for the service instances, and the replacements are made based on these differences (e.g. it is preferable to spend 6 cost units in order to increase availability by 4% than to spend 10 cost units in order to increase it by 5%; the same goes for SPF increases).
II. The service instances with the lowest new ConvertedIndex are selected.
III. Based on the new selection of service instances, the overall workflow cost and time values are re-calculated. In order to check whether the best service instance can replace the original choice, we must calculate the minimum possible cost in case the replacement provider is accepted. This calculation is done by repeating Step 2 while considering this choice (and all previously finalized choices) finalized. If Step 2 is able to find a mapping that does not violate the user's cost constraint, the new selection is considered finalized, the corresponding service process is excluded from future SPF / SRI calculations and the algorithm moves to the next sub-step. Otherwise, the replacement is excluded permanently from the available service instances, no selection is considered finalized for the specific service type and the algorithm moves ahead.
IV. The algorithm loops and continues from Step 3 for all still active service types and providers. When selections have been made for all workflow processes, the algorithm stops.
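The greedy structure of Step 2 can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it ranks candidates by their plain cost instead of NewCostValue (so no SPF recalculation takes place), a relationship simply replaces the affected provider's offer, and all names and example figures (Offer, initial_mapping, EXAMPLE_OFFERS, EXAMPLE_RELATIONSHIPS) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Offer:
    cost: float
    time: float
    availability: float


Offers = Dict[str, Dict[str, Offer]]         # service type -> provider -> offer
Relationships = Dict[str, Dict[str, Offer]]  # trigger -> affected -> new offer


def initial_mapping(offers: Offers, relationships: Relationships,
                    min_availability: float, max_cost: float
                    ) -> Optional[Dict[str, Tuple[str, Offer]]]:
    """Greedy sketch of Step 2; returns None if no feasible mapping is found."""
    current = {stype: dict(cands) for stype, cands in offers.items()}
    selection: Dict[str, Tuple[str, Offer]] = {}
    while len(selection) < len(current):
        best = None
        for stype, cands in current.items():
            if stype in selection:
                continue  # this service type is already finalized
            for provider, offer in cands.items():
                if offer.availability < min_availability:
                    continue  # temporarily excluded (sub-step I)
                if best is None or offer.cost < best[2].cost:
                    best = (stype, provider, offer)
        if best is None:
            return None  # a remaining service type has no admissible candidate
        stype, provider, offer = best
        selection[stype] = (provider, offer)  # sub-steps II and IV
        # Sub-step III: the selection may change the offers of other providers.
        for affected, new_offer in relationships.get(provider, {}).items():
            for other, cands in current.items():
                if other not in selection and affected in cands:
                    cands[affected] = new_offer
    total = sum(offer.cost for _, offer in selection.values())
    return selection if total <= max_cost else None  # sub-step VI


# Hypothetical data, not taken from the paper's experiment.
EXAMPLE_OFFERS = {
    "shaders": {"S1": Offer(40, 2.0, 95), "S2": Offer(60, 1.5, 90)},
    "render": {"R1": Offer(120, 5.0, 92), "R2": Offer(150, 4.0, 96)},
}
EXAMPLE_RELATIONSHIPS = {"S1": {"R2": Offer(100, 4.0, 96)}}
```

With the example data, a budget of 150 units and a minimum availability of 90, the sketch first selects S1, which lowers R2's cost to 100 and makes the mapping (S1, R2) feasible at 140 units. Without the relationship, the cheapest mapping costs 160 units and the function returns None.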
5 Evaluation
For the purposes of our experiments, we deployed a 3D image rendering application in a GRIA middleware environment [28]. The workflow consists of a 3D Rendering Service and of the Shaders Compilation Service and the Textures Compilation Service, which are prerequisites for the rendering service, as depicted in the following figure:
[Diagram nodes: Input Data; Shaders Compilation Service; Textures Compilation Service; Rendering Service; Output Data]
Fig. 2. 3D Image Rendering Scenario
The above scenario has three distinct service types that require execution, and in our experiments we used eight different service instances providing different QoS parameters. Apart from the aforementioned QoS parameters, service providers are also involved in various strategic relationships. Figure 3 depicts the three different workflow processes (3 rows) with the corresponding service instances to which the processes can be mapped (the service instances in each row, i.e. for each workflow process, are marked with the letters A to H). For each service instance, Figure 3 also depicts the QoS parameter values published by the providers (“c” stands for Cost, “t” for Time and “a” for Availability) as well as the providers’ relationships (lines connecting the service instances). These relationships and their influence on the QoS parameters are summarized in Table 1.
Fig. 3. Visualized Strategic Relationships and QoS Parameters

Table 1. Strategic relationships between Service Providers
Triggering Instance   Affected Instance   New Cost   New Time   New Availability
G1                    G2                  70.00      3.00       94.60
G1                    E2                  50.00      3.00       94.60
C1                    E2                  63.00      8.00       69.00
E2                    B3                  75.00      5.00       85.00
E2                    E3                  80.00      5.00       91.00
H1                    H2                  50.00      2.20       94.00
H2                    H3                  78.00      4.00       95.00
A0                    A1                  74.00      2.30       97.00
A0                    D1                  75.00      2.40       96.50
D1                    A2                  80.00      2.50       99.00
D1                    D2                  90.00      3.50       97.00
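One possible way to represent the relationships of Table 1 in code is a lookup keyed by the (triggering, affected) pair. The sketch below uses a few rows of Table 1; the function name `effective_qos`, the rule that the first triggered relationship wins, and the published values for E2 are assumptions made only for illustration.

```python
# (triggering instance, affected instance) -> QoS values that replace the
# published ones; rows copied from Table 1.
RELATIONSHIPS = {
    ("G1", "E2"): {"cost": 50.00, "time": 3.00, "availability": 94.60},
    ("C1", "E2"): {"cost": 63.00, "time": 8.00, "availability": 69.00},
    ("H1", "H2"): {"cost": 50.00, "time": 2.20, "availability": 94.00},
    ("H2", "H3"): {"cost": 78.00, "time": 4.00, "availability": 95.00},
}


def effective_qos(instance, published, selected):
    """QoS of `instance` given the instances already selected elsewhere.

    Falls back to the published values when no relationship is triggered.
    Assumption: if several relationships apply, the first one found wins.
    """
    for trigger in selected:
        override = RELATIONSHIPS.get((trigger, instance))
        if override is not None:
            return override
    return published


# Hypothetical published values for E2 (the real ones appear only in Figure 3).
published_e2 = {"cost": 60.00, "time": 5.00, "availability": 80.00}

with_g1 = effective_qos("E2", published_e2, ["G1"])   # cost drops to 50.00
with_c1 = effective_qos("E2", published_e2, ["C1"])   # cost rises to 63.00
standalone = effective_qos("E2", published_e2, [])    # published values
```

A selection algorithm that evaluates candidates through such a lookup sees different costs for E2 depending on which provider was chosen for the first service type.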
In our experiment, let us consider that the user defines the following hard constraints: (i) a maximum cost of 300 units and (ii) a minimum availability of 50%. Moreover, the user wants to get the best for his money and, as a result, he sets his preferences as follows: (i) Cost Slope = 0.00, (ii) Availability Slope = 1.5, (iii) Time Slope = 1.5. Thus, he is equally interested in improving execution time and availability, while he is not interested in cutting costs below 300 units or in making the best value-for-money selections. Before letting the algorithm run, we can make an interesting observation: the requested workflow cannot be executed at all within 300 cost units without taking advantage of strategic relationships. As we can see from Figure 3, the absolute cheapest service providers per service type are C1, E2 and B3, resulting in an overall cost of 306.79. More interestingly, provider E2 actually increases its cost when provider C1 is selected, resulting in an even more expensive mapping. The outcome is that this workflow, with the above constraints, cannot be executed by any mapping algorithm that ignores strategic relationships.
The original solution that the mapping algorithm finds and then tries to improve is G1, E2 and B3. This solution has an initial cost of 293.24 units, and we can verify that it is actually the cheapest possible one, taking advantage of one strategic relationship (G1 to E2) to reduce E2’s cost. We notice that G1 is not actually the cheapest provider in service type 1 but, nevertheless, it is selected as it reduces the overall workflow cost. From this point, the algorithm tries to improve that original solution with regard to Availability and Execution Time. In order to do so, it tries to find replacement nodes with a better ConvertedIndex that will allow a more efficient workflow execution. The following table gives an overview of these steps (algorithm iterations):

Table 2. Experiment 1, Algorithm iterations for improving initial provider mapping
Iteration   Initial Service Provider   Possible Replacement Provider   Cost Difference   Availability Difference   Time Difference   Swap Commit
1           G1                         A1                              -12.079           7.0                       70.69             No
2           G1                         D1                              -10.659           4.29                      49.16             No
3           G1                         E1                              -9.50             2.90                      50.29             No
4           G1                         F1                              -8.85             0.70                      40.81             No
5           E2                         H2                              -12.71            20.44                     5.25              No
6           E2                         G2                              -14.25            24.87                     4.41              No
7           E2                         A2                              -10.15            20.53                     4.26              No
8           E2                         D2                              -22.08            29.68                     1.68              No
9           E2                         F2                              -9.65             18.22                     1.54              No
10          G1                         B1                              -2.57             0.89                      10.00             No
11          G1                         H1                              -2.44             0.59                      2.44              Yes
12          E2                         H2                              10.18             23.84                     5.33              Yes
13          B3                         H3                              2.93              13.09                     1.88              Yes

The resulting mapping is H1, H2 and H3, with a total cost of 298.69 units, 90.1% availability and 374.69 time units. In the above table, a negative sign in the cost column means that the proposed service provider requires more money than the initial choice. On the other hand, a negative sign in the Availability and Time differences means that the proposed provider actually has worse QoS parameters.
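The figures reported above can be checked with a short script. It is a minimal sketch that only replays the committed swaps of Table 2 on the initial mapping and compares the two reported workflow costs with the budget; the helper name `replay` is an assumption, and the ConvertedIndex computation itself is not reproduced.

```python
BUDGET = 300.0
COST_WITHOUT_RELATIONSHIPS = 306.79  # C1, E2, B3 (cheapest per service type)
COST_WITH_RELATIONSHIPS = 293.24     # G1, E2, B3 (uses the G1-to-E2 relationship)

feasible_without = COST_WITHOUT_RELATIONSHIPS <= BUDGET
feasible_with = COST_WITH_RELATIONSHIPS <= BUDGET

# (iteration, initial provider, possible replacement, swap committed), from Table 2
ITERATIONS = [
    (1, "G1", "A1", False), (2, "G1", "D1", False), (3, "G1", "E1", False),
    (4, "G1", "F1", False), (5, "E2", "H2", False), (6, "E2", "G2", False),
    (7, "E2", "A2", False), (8, "E2", "D2", False), (9, "E2", "F2", False),
    (10, "G1", "B1", False), (11, "G1", "H1", True), (12, "E2", "H2", True),
    (13, "B3", "H3", True),
]


def replay(mapping, iterations):
    """Apply every committed swap, in order, to a copy of the mapping."""
    mapping = list(mapping)
    for _, old, new, committed in iterations:
        if committed and old in mapping:
            mapping[mapping.index(old)] = new
    return mapping


final_mapping = replay(["G1", "E2", "B3"], ITERATIONS)
print(feasible_without, feasible_with, final_mapping)
```

Only iterations 11 to 13 change the mapping, which yields the H1, H2, H3 selection reported in the text.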
6 Conclusions
In this paper, we have presented an innovative QoS-based selection algorithm for the definition of concrete workflows that takes strategic relationships into account during the selection process. It promotes the most positively influential service providers and puts aside those with a negative influence during a QoS-based selection process. The latter is of major importance since it enables the definition of a concrete workflow that meets the user’s cost constraint, which might not be feasible without considering the business relationships (as also shown in the presented experiment). Moreover, the resulting concrete workflow may offer a higher level of end-to-end QoS, since the cost difference obtained through the business relationships of the service providers may be used to select service instances with better values for other QoS parameters (e.g. lower execution time). In conclusion, service-oriented infrastructures have not yet adopted an effective scheme that facilitates end-to-end QoS provisioning while taking into consideration possible business relationships between the service providers. In that context, we have shown the importance of a selection mechanism, the use of which is expected to significantly strengthen efforts to address business relationships in Grid workflows in a dynamic way.
References
1. Foster, I., Kesselman, C., Tuecke, S.: The Anatomy of the Grid: Enabling Scalable Virtual Organizations. International Journal Supercomputer Applications 15(3) (2001)
2. Leinberger, W., Kumar, V.: Information Power Grid: The new frontier in parallel computing? IEEE Concur. 7(4), 75–84 (1999)
3. Workflow Management Coalition, Terminology & Glossary, Document Number WFMCTC-1011, Issues 3.0 (1999)
4. Deelman, E., Blythe, J., Gil, Y., Kesselman, C.: Workflow Management in GriPhyN. In: The Grid Resource Management. Kluwer, Netherlands (2003)
5. Deelman, E., Blythe, J., Gil, Y., Kesselman, C., Mehta, G., Patil, S., Su, M.H., Vahi, K., Livny, M.: Pegasus: Mapping Scientific Workflow onto the Grid. In: Across Grids Conference 2004, Nicosia, Cyprus (2004)
6. Ludäscher, B., Altintas, I., Gupta, A.: Compiling Abstract Scientific Workflows into Web Service Workflows. In: 15th International Conference on Scientific and Statistical Database Management, Cambridge, Massachusetts, USA, pp. 241–244. IEEE CS Press, Los Alamitos (2003)
7. Bubak, M., Gubała, T., Kapałka, M., Malawski, M., Rycerz, K.: Workflow composer and service registry for grid applications. Future Generation Computer Systems 21(1), 79–86 (2005)
8. Yu, J., Buyya, R.: A Taxonomy of Workflow Management Systems for Grid Computing. Journal of Grid Computing 3(3-4), 171–200 (2005)
9. Kyriazis, D., Tserpes, K., Menychtas, A., Litke, A., Varvarigou, T.: An innovative Workflow Mapping Mechanism for Grids in the frame of Quality of Service. Future Generation Computer Systems (2007)
10. Papagiannis, I., Kyriazis, D., Kardara, M., Andronikou, V., Varvarigou, T.: Business relationships in grid workflows. In: Altmann, J., Neumann, D., Fahringer, T. (eds.) GECON 2008. LNCS, vol. 5206, pp. 28–40. Springer, Heidelberg (2008)
11. Spooner, D.P., Cao, J., Jarvis, S.A., He, L., Nudd, G.R.: Performance-aware Workflow Management for Grid Computing. The Computer Journal (2004)
12. Bochmann, G., Hafid, A.: Some Principles for Quality of Service Management, Technical report, Universite de Montreal (1996)
13. Al-Ali, R.J., Amin, K., von Laszewski, G., Rana, O.F., Walker, D.W., Hategan, M., Zaluzec, N.J.: Analysis and Provision of QoS for Distributed Grid Applications. Journal of Grid Computing, 163–182 (2004)
14. Padgett, J., Djemame, K., Dew, P.: Grid-based SLA Management. LNCS, pp. 1282–1291 (2005)
15. Foster, I., Kesselman, C., Lee, C., Lindell, B., Nahrstedt, K., Roy, A.: A Distributed Resource Management Architecture that Supports Advance Reservation and Co-Allocation. In: Proceedings of the International Workshop on QoS, pp. 27–36 (1999)
16. Yu, J., Buyya, R., Tham, C.K.: QoS-based Scheduling of Workflow Applications on Service Grids, Technical Report, GRIDS-TR-2005-8, Grid Computing and Distributed Systems Laboratory. University of Melbourne, Australia (2005)
17. Guo, L., McGough, A.S., Akram, A., Colling, D., Martyniak, J., Krznaric, M.: QoS for Service Based Workflow on Grid. In: Proceedings of UK e-Science 2007 All Hands Meeting, Nottingham, UK (2007)
18. Khanli, L.M., Analoui, M.: QoS-based Scheduling of Workflow Applications on Grids. In: International Conference on Advances in Computer Science and Technology, Phuket, Thailand (2007)
19. Cardoso, J., Sheth, A., Miller, J.: Workflow Quality of Service. In: Proceedings of the International Conference on Enterprise Integration and Modeling Technology and International Enterprise Modeling Conference (ICEIMT/IEMC 2002). Kluwer Publishers, Dordrecht (2002)
20. Cardoso, J., Miller, J., Sheth, A., Arnold, J.: Modeling Quality of Service for Workflows and Web Service Processes, Technical Report, LSDIS Lab, Department of Computer Science University of Georgia (2002)
21. Buyya, R., Abramson, D., Venugopal, S.: The Grid Economy. Proceedings of the IEEE 93(3), 698–714 (2005)
22. Buyya, R., Murshed, M., Abramson, D.: A Deadline and Budget Constrained Cost-Time Optimization Algorithm for Scheduling Task Farming Applications on Global Grids. In: Proceedings of the 2002 International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA 2002) (2002)
23. Fahringer, T., Jugravu, A., Pllana, S., Prodan, R., Seragiotto Jr, C., Truong, H.L.: ASKALON: a tool set for cluster and Grid computing. Concurrency and Computation: Practice and Experience 17(2-4), 143–169 (2005)
24. Mayer, A., McGough, S., Furmento, N., Lee, W., Newhouse, S., Darlington, J.: ICENI Dataflow and Workflow: Composition and Scheduling in Space and Time. In: UK eScience All Hands Meeting, Nottingham, UK, pp. 894–900. IOP Publishing Ltd., Bristol (2003)
25. McGough, S., Young, L., Afzal, A., Newhouse, S., Darlington, J.: Performance Architecture within ICENI. In: UK e-Science All Hands Meeting, Nottingham, UK, pp. 906–911. IOP Publishing Ltd., Bristol (2004)
26. McGough, S., Young, L., Afzal, A., Newhouse, S., Darlington, J.: Workflow Enactment in ICENI. In: UK e-Science All Hands Meeting, Nottingham, UK, pp. 894–900. IOP Publishing Ltd., Bristol (2004)
27. Siddiqui, M., Villazon, A., Fahringer, T.: Grid capacity planning with negotiation-based advance reservation for optimized QoS. In: Proceedings of the ACM/IEEE Conference on SuperComputing, SC 2006 (2006)
28. Surridge, M., Taylor, S., De Roure, D., Zaluska, E.: Experiences with GRIA-Industrial Applications on a Web Services Grid. In: Proceedings of the First International Conference on e-Science and Grid Computing, pp. 98–105. IEEE Press, Los Alamitos (2005)
29. Quan, D.M., Altmann, J.: Grid Business Models for Brokers Executing SLA-Based Workflows. In: Buyya, R., Bubendorfer, K. (eds.) Market-Oriented Grid and Utility Computing. Wiley, New York (2009)
30. Quan, D.M., Kao, O., Altmann, J.: Concepts and Algorithms of Mapping Grid-Based Workflow to Resources within an SLA Context. In: Business Process Management: Concepts, Technologies and Applications (2009)
31. Padgett, J., Djemame, K., Gourlay, I.: Economically Enhanced Risk-aware Grid SLA Management. In: Proceedings of eChallenges 2008 Conference, Stockholm, Sweden (2008)
Determinants of Participation in Global Volunteer Grids: A Cross-Country Analysis
Junseok Hwang, Jörn Altmann, and Ashraf Bany Mohammed
International IT Policy Program, Technology Management, Economics and Policy Program, School of Management and Industrial Engineering, College of Engineering, Seoul National University, 599 Gwanak-no, Gwanak-Gu, 151-744 Seoul, South Korea
Abstract. Volunteer Grids, in which users share computing resources altruistically, play a critical role in fostering research. Sharing and collaboration in Volunteer Grids is determined by many factors. These determinants define the participation in Grids and the amount of contribution to such Grids. Whereas previous studies focused on explaining researchers’ and countries’ willingness to share resources in Volunteer Grids based on social sharing theory, this research argues that, without the appropriate technological capabilities, countries or researchers cannot act on their willingness. Based on a literature review, this paper defines the influential determinants for participating in global Volunteer Grids. In addition, this research employs a multiple regression analysis of these determinants, using a total of 130 observations collected from international data repositories. Our results show that R&D and Internet connection type (broadband or dial-up) are significant determinants for participating in Volunteer Grids. This result explains why developed countries are active and enjoy the benefits from Volunteer Grids, while developing countries still lag behind. Therefore, an increased participation in Grids cannot be achieved solely by interconnecting with developing countries through high-speed Internet backbones.
Index Terms: Determinants, Grid computing, volunteer computing, adoption, Grid economics, regression analysis, infrastructure funding, developing countries.
J. Altmann, R. Buyya, and O.F. Rana (Eds.): GECON 2009, LNCS 5745, pp. 116–127, 2009. © Springer-Verlag Berlin Heidelberg 2009

1 Introduction
Grids are virtual supercomputers created through the aggregation of distributed computing resources, generating computational power to execute demanding scientific problems in a cost-effective way. With the maturing of the concept, the middleware, and its applications, many organizations launched Grid computing infrastructure projects based on volunteer contributions from individuals and organizations globally. These projects offer computational infrastructures for investigating complex research problems that range from exploring new HIV/AIDS treatments (fightAIDS@home), finding new prime numbers (primegrid), searching for gravitational waves (Einstein@Home), and predicting protein structure (TANPAKU), to searching for extraterrestrial intelligence (SETI@home). However, because of the imbalance of computational power between countries (Figure 1), it is not surprising to see weak participation from developing countries in these global Volunteer Grids. Yet, since the success of Volunteer Grids depends on attracting sufficient participation [1], the role of developing countries should not be ignored. Therefore, all factors that impact the amount of contributions from countries have to be understood. While much of the effort in exploring and explaining the determinants of participation in Volunteer Grid projects has concentrated on social sharing theory, the infrastructure drivers and technological needs have been overlooked.
[Figure 1 data, share of processing power per country: United States 58.00%, United Kingdom 9.20%, France 5.20%, Germany 5.00%, Japan 3.40%, China 3.00%, Italy 2.20%, Sweden 1.60%, Russia 1.60%, India 1.60%, Spain 1.20%, Poland 1.20%, Switzerland 0.80%, New Zealand 0.80%, Netherlands 0.60%, Denmark 0.60%, others (less than 0.60%) 4.00%]
Fig. 1. Processing power distribution between countries (based on data from Top500.com)
In this paper, we identify influential factors for a country’s participation in Volunteer Grids and analyze 130 observations using a multiple regression analysis. Data for this analysis was collected from diverse international data repositories. This work applies the theory of ICT adoption and ICT diffusion to explain Volunteer Grid participation determinants. This study contributes to the literature on Volunteer Grids by highlighting prerequisites for participation, revealing limitations that developing countries face, and assisting ICT policy makers in redefining their priorities and creating the proper environment for more active participation in global computing Grids. The remainder of this paper is organized as follows: the next section presents the literature review on Grid computing. Section 3 provides our theoretical model. Section 4 discusses the methodology and the empirical analysis, whereas section 5 draws findings and policy recommendations. Section 7 concludes this work.
2 Literature Review: Volunteer Grids
In the last few years, different types of Grids were developed: Enterprise Grids, Research Grids, Desktop Grids, and National Grids. The increase in computing power of PCs and in Internet penetration, along with the advancement of Grid technologies, encouraged numerous communities to develop Volunteer Grids over the Internet. These infrastructures provided a simple but efficient technique for accumulating huge computational power to carry out diverse data processing projects. However, the factors that define the success of these projects have not been completely analyzed. These factors are often more than just the individual’s interest. They rather rely on many other factors that characterize technology usage and adoption [2]. They are of technical, social, and legal origin [3], characterized by the environment and the organization [4]. However, previous empirical studies on the determinants for adopting and participating in global Volunteer Grids are very limited. The lack of empirical data and the relative novelty of the research field contributed to this shortage in the literature. Most of the existing studies used surveys as their main instrument and concentrated mainly on exploring user-side determinants. Taylor examined determinants affecting participation in Grid computing for supporting health science activities (e.g. drug evaluation) [1]. Using confirmatory factor analysis (CFA) in a laboratory experiment on 249 individuals, Taylor theorized a multiple perspectives approach, in which he accounted for individual differences in technology adoption and voluntary participation in Volunteer Grids. Wijngaert and Harry studied the circumstances under which people are willing to share resources in wireless Grids [5]. They explored factors that explain the use of this technology. Their findings showed that trust in communication partners can explain the willingness to use wireless Grids. It was the original work of Engelbrecht [6,7] that pioneered the research on explaining cross-country determinants for participating in global Volunteer Grids. Engelbrecht studied social sharing determinants of participation in SETI@home using the United Nations HDI (Human Development Index) and a country group variable (indicating developed country or developing country) as his independent variables, and “SETI participants per capita” and “SETI results per capita” as dependent variables.
Employing multiple regression analysis on 172 observations for both developed and developing countries, Engelbrecht found that participation and its intensity can be explained largely by the degree of Information Communication Technology (ICT) access and Gross Domestic Product (GDP) per capita. Although the contribution of Engelbrecht’s work in setting the foundation for this research field cannot be neglected, his work has some drawbacks. Seeking to accommodate more countries, Engelbrecht used composite indexes, which he himself criticized as controversial. In fact, the use of indexes creates difficulties in capturing the exact underlying weights of each of the factors that compose the index. Some of the direct consequences of this approach are the difficulties in identifying significant differences between sub-variables, the over-inclusion of many sub-variables, and the difficulties in deriving definitive policy implications. For instance, the Digital Access Index (DAI) includes sub-variables such as mobile subscribers per 100 inhabitants, fixed telephone subscribers per 100 inhabitants, and adult literacy. This indicator can easily be criticized. Fixed or mobile phone subscribers can rarely (if at all) contribute to Volunteer Grids due to poor bandwidth capacity. It can also be noticed that literacy is a precondition for participation but not a sufficient condition. Finally, although Engelbrecht found that SETI@home participation and its intensity are not ‘idiosyncratic’ but rather largely explained by DAI and GDP, he stressed the importance of a “detailed analysis of the socio-institutional conditions affecting the further development of the underlying network infrastructures and incentives for participants”. The framework for such research was defined by Barjak et al. [8]. The framework centers on four groups of essential factors that describe participation in Volunteer Grids: 1) technological frames and user requirements; 2) scientific shaping of technology; 3) economic factors; and 4) political influences. Based on this literature and to fill some of the research gaps in this field, this paper defines more precise and detailed determinants of participation in Volunteer Grids.
3 Theoretical Framework: Volunteer Grids Determinants
The theoretical foundation for identifying the Volunteer Grid determinants comes from two main branches of literature: literature on Volunteer Grids and literature on technology adoption [6], [7], [2] and [8]. The literature defines four categories as core determinants for participating in and building Volunteer Grids, namely ICT infrastructure access (through high-quality network connections), science and education intensity (through expenditure on R&D), economic capabilities, and globalization and country groups. Based on this background, we refined the indicators. Figure 2 illustrates these indicators within our theoretical research model on Volunteer Grid determinants.

3.1 Independent Variables on ICT Infrastructure
Traditionally, ICT infrastructure (especially PC and Internet penetration) has always been used as one of the key determinants in the technology adoption and technology diffusion literature [9] [10]. Grid scholars have emphasized the importance of the underlying ICT infrastructure as a core prerequisite for successful participation in Grids. For instance, Engelbrecht found empirically that ICT access is a significant determinant of participation in global Volunteer Grids [6] [7]. However, a notable weakness of this study was that it did not define the type and quality of ICT that is required and sufficient for participation. Seeking to shed light on this aspect, this study uses more specific ICT infrastructure indicators to explore their effect on participation in Volunteer Grids. In detail, the following indicators are used: computers per inhabitant (PC), broadband subscribers per inhabitant (BS), dial-up connections per inhabitant (DC), the level of competition in the ISP sector (CI), and the Internet bandwidth interconnection capacity (IB). Based on this, we will investigate the hypothesis H1: ICT infrastructure indicators are positively correlated and significant determinants of participation in Volunteer Grids.

3.2 Independent Variables on Economic Capabilities
A considerable amount of literature also emphasizes economic capabilities as key determinants of technology adoption and usage [11] [12]. Gross domestic product (GDP, the value of all final goods and services of a nation in a given year) per capita, i.e. the GDP divided by the size of the population, has always been a key indicator of economic capability in ICT adoption and diffusion studies. However, GDP per capita is a general indicator that provides only a broad signal of people’s economic capabilities. Therefore, GDP per capita alone cannot reflect the intensity of investments in research and education, which are essential for e-infrastructure development.
[Figure 2 content]
ICT Infrastructure: 1. Personal computers (+); 2. Broadband Internet subscribers (+); 3. Dial-up connection (-); 4. Competition in the ISP sector (+); 5. International broadband bandwidth (+)
Economic Capability: 1. GDP/capita (+); 2. Expenditure on education (+); 3. Expenditure on research and development (+)
Science and Education Intensity: 1. Researchers (full-time) (+); 2. Availability of scientists and engineers (+); 3. Quality of scientific research institutions (+)
Globalization (+), Country Group (-)
Grid Participation Capacity
Fig. 2. Theoretical research model on Volunteer Grid determinants
Hence, in our study, the expenditure on education (EE) and the expenditure on R&D (ER) are used to reflect a country’s economic support for science and technology. We will analyze the hypothesis H2: Economic capability factors are positively correlated and significant determinants of participation in Volunteer Grids.

3.3 Independent Variables on Science and Education Intensity
Throughout the literature, science and education is a significant indicator for measuring ICT adoption and usage [13]. While most studies emphasize illiteracy as the main determinant of ICT use and ICT diffusion [14] [15], in the context of Grid computing, not only an adequate level of education but also advanced knowledge capacity is needed. Consequently, we argue that the number of researchers (NR), the availability of scientists and engineers (AS), and the quality of science and education (QR) are far more realistic indicators to reflect the differences between countries in participating in Volunteer Grids. During our study, we will examine hypothesis H3: Science and education intensity factors are in general positively correlated and significant determinants of participation in Volunteer Grids.

3.4 Independent Variables on Globalization and Country Group
Given that globalization is an important aspect of e-infrastructures [16] [12], we argue that globalization (GL) is one determinant of participation in Volunteer Grids. It indicates that citizens of countries with high internationalization are more interested in participating in global Grid e-infrastructures than citizens of countries with low internationalization. For our study, the KOF Index of Globalization was used to capture the internationalization and the openness of a country. The KOF index was selected due to its flexibility. It allows eliminating factors that may create multi-collinearity and, therefore, guarantees the integrity of the data.
As can be seen in Figure 1, a gap exists between developing and advanced countries in terms of processing power. Consequently, in a first step, we used a binary variable to account for the country’s status in the same way as in [6]. Yet, our initial analysis results showed that the variable could not capture the huge gap between advanced and developing countries. The data showed high skewness (long tail). Therefore, we grouped the data into three categories: (1) highly developed; (2) averagely developed and highly developing (DV); and (3) averagely developing and least developing (LD). Two binary variables were used to capture the latter two categories, whereas the first category was used as the reference category. In detail, we will analyze hypothesis H4: Globalization and country group indicators are in general positively correlated and significant determinants of participation in Volunteer Grids.

3.5 Dependent Variable: Grid Participation Capacity
To create a data set that is as comprehensive and representative as possible, this study employs data collected by the Berkeley Open Infrastructure for Network Computing (BOINC) project [17]. BOINC implements a public, volunteer desktop computing Grid. It incorporates and supports more than 40 global Volunteer Grids from diverse fields (e.g. mathematics, strategy games, biology, and medicine). Participants can register their computing resources with one or multiple projects, and can control the amount of resources contributed to the different projects (e.g., 60% to studying global warming, and 40% to SETI) [18]. BOINC collects this data and builds statistics about usage per user, team, project, and country. For constructing the dependent variable, we used the BOINC data collected for all projects on a per-country basis, averaged over a period of three months (from October 2008 to December 2008). The large diversity of projects to which users can contribute their resources neutralizes personal preferences, motives, and interests. Results per capita are used as the dependent variable, following the argument of Engelbrecht [6] [7].
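The two constructions described above, the pair of binary country-group variables and the per-capita dependent variable, can be sketched as follows. The function names, the group labels and the sample figures are assumptions for illustration only; in particular, the sketch assumes that the three-month average is taken over monthly result totals.

```python
def country_group_dummies(group):
    """Encode the three country categories as the two binary variables
    (DV, LD); the highly developed category is the reference (0, 0)."""
    codes = {
        "highly_developed": (0, 0),  # reference category
        "DV": (1, 0),                # averagely developed / highly developing
        "LD": (0, 1),                # averagely developing / least developing
    }
    return codes[group]


def results_per_capita(monthly_results, population):
    """Average the monthly result totals and divide by the population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return sum(monthly_results) / len(monthly_results) / population


# Hypothetical country: results for October, November and December 2008.
dv, ld = country_group_dummies("DV")
y = results_per_capita([300.0, 600.0, 900.0], population=1000)
print(dv, ld, y)
```

Using the first category as the reference means that the coefficients of DV and LD are read as differences relative to highly developed countries.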
4 Data and Methodology
4.1 Data
Data was collected from diverse international data repositories. ICT indicators were collected mainly from the World Telecommunication Indicators Database of the International Telecommunication Union (ITU). Economic indicators were obtained from the International Monetary Fund (IMF) and the United Nations Human Development Index (HDI), whereas the science and education intensity indicators were provided by the HDI and the World Economic Forum (WEF) competitiveness report. The data represents the most recent data available, from the years 2006 and 2007. We assume that a time lag of not more than one year between some of the variables cannot create a bias, since a country cannot realize significant changes in these variables within one year. Yet, out of 248 countries, only 130 countries were analyzed due to data availability.
4.2 Methodology
Based on the previously established theoretical research model explaining the determinants of participation in Volunteer Grids, the following multiple regression model (unrestricted model) has been formulated:

$$\gamma_i = b_0 + b_1(PC) + b_2(BS) + b_3(DC) + b_4(CI) + b_5(IB) + b_6(GDP) + b_7(EE) + b_8(ER) + b_9(NR) + b_{10}(AS) + b_{11}(QR) + b_{12}(GL) + b_{13}(DV) + b_{14}(LD) + \varepsilon_i \qquad (1)$$
For the purpose of the analysis and to find the best-fitting model, several models (level-level model, level-log model, log-level model, and double-log model) were estimated. All regression parameters were estimated using Ordinary Least Squares. Data management was performed using SPSS software 16.0.1 (2007).
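The comparison of functional forms can be illustrated with a small, self-contained sketch. It is not the SPSS procedure used in the study: it fits a single regressor by closed-form OLS on synthetic data and compares R-squared across the four forms. The naming convention (the first term refers to the dependent variable, the second to the regressor) and all function names are assumptions.

```python
import math


def ols_fit(x, y):
    """Closed-form OLS for y = b0 + b1 * x; returns (b0, b1, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return b0, b1, 1.0 - ss_res / ss_tot


def fit_all_forms(x, y):
    """Fit the four functional forms; x and y must be strictly positive."""
    log_x = [math.log(v) for v in x]
    log_y = [math.log(v) for v in y]
    return {
        "level-level": ols_fit(x, y),
        "log-level": ols_fit(x, log_y),    # log(y) on x
        "level-log": ols_fit(log_x, y),    # y on log(x)
        "log-log": ols_fit(log_x, log_y),  # log(y) on log(x)
    }


# Synthetic data generated from an exact double-log relation:
# log(y) = 1.0 + 0.5 * log(x)
x = [1.0, 2.0, 4.0, 8.0, 16.0]
y = [math.exp(1.0 + 0.5 * math.log(v)) for v in x]

fits = fit_all_forms(x, y)
best = max(fits, key=lambda name: fits[name][2])
print(best, fits[best])
```

Because the synthetic data follow a double-log relation, that form attains the highest R-squared here. Note that R-squared values of models with different dependent variables (level versus log) are not strictly comparable, so such a ranking is only indicative.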
5 Empirical Results and Analysis
5.1 Estimation
Before estimating the model, the correlation matrix was calculated using the Pearson bivariate correlation coefficient, in order to understand and identify cross-influences between variables. The values of the correlation matrix suggest the existence of multi-collinearity. To resolve this multi-collinearity problem, we eliminated the variables that cause it and produced a reduced model (Table 2). However, since the theory does not suggest such an elimination, we performed a one-to-one estimation of all independent variables against the dependent variable. The results of this analysis showed that all selected explanatory variables are statistically significant and that each of them has large explanatory power with respect to the dependent variable. Consequently, we chose to regress both models, the unrestricted model and the reduced model. Then, we used the R-square and F-statistics to choose the best-fitting model. Table 1 and Table 2 show the regression estimate results for both models.

Table 1. Unrestricted Model Regression Results
Variable                                          Level-Level         Log-Level           Level-Log           Log-Log
(Constant)                                        93.769 (8.308) a    -0.500 (-1.01)      119.621 (2.81) a    0.884 (0.84)
Personal computers (PC)                           49.209 (3.04) a     0.584 (0.82)        4.563 (0.99)        0.354 (3.13) a
Dial-up Internet (DC)                             5.5E-7 (2.02) b     -2.3E-9 (-0.19)     1.761 (1.08)        -0.006 (-0.15)
Broadband (BS)                                    -0.508 (-0.01)      1.867 (1.01)        -4.641 (-1.52)      0.329 (4.36) a
International Internet bandwidth (IB)             0.001 (2.48) a      -3.8E-5 (-2.24) a   2.222 (0.74)        -0.012 (-0.16)
Competition in ISP (CI)                           -1.909 (-0.95)      0.138 (1.58)        -4.877 (-0.23)      0.493 (0.97)
GDP/capita                                        0.001 (-1.82) c     2.1E-5 (3.50) a     -1.290 (-0.19)      0.449 (2.74) a
Public expenditure on education (EE)              3.667 (0.55)        0.285 (1.74) c      0.413 (0.57)        0.088 (2.83) a
R&D expenditure                                   -4.618 (-1.43)      -0.126 (-0.89)      0.322 (0.11)        0.092 (1.357) c
Researchers in R&D (ER)                           4832.3 (2.88) a     -52.766 (-0.72)     5.288 (1.92)        -0.046 (-0.67)
Determinants of Participation in Global Volunteer Grids: A Cross-Country Analysis
Table 1. (continued)

Variable                                          Level-Level         Log-Level           Level-Log           Log-Log
Scientists & engineers (NR)                       -0.956 (-0.33)      -0.137 (-1.10)      2.610 (0.90)        0.007 (0.05) a
Quality of scientific research institutions (QR)  -46.074 (-1.39)     -2.482 (-3.04) a    61.629 (2.16) b     0.528 (0.75)
Globalization (GL)                                0.135 (1.43)        0.016 (3.95)        3.842 (0.31)        0.176 (0.59)
Developing (DV)                                   -75.3 (-11.74) a    -0.258 (-0.92)      -78.7 (-10.59) a    -0.368 (-2.01) a
Least developed (LD)                              -99.99 (-2.690) a   -1.2 (-3.77) a      -16.3 (-13.92) a    -1.2 (-6.09) a
R²                                                0.887               0.811               0.832               0.911
Adjusted R²                                       0.873               0.788               0.812               0.900
F-value                                           64.218              35.188              40.809              83.584
(..)a denotes statistical significance at the 1% level for the two-sided t-test.
(..)b denotes statistical significance at the 5% level for the two-sided t-test.
(..)c denotes statistical significance at the 10% level for the two-sided t-test.
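The marker convention in the table notes can be sketched as a small helper that maps a t-statistic to "a", "b", or "c". This is an illustration only: it uses the large-sample normal approximation for the two-sided p-value, which is an assumption made here for simplicity (the paper's markers presumably rest on the t distribution with the regression's degrees of freedom), and the function names are hypothetical.

```python
from statistics import NormalDist

# Marker -> significance level, following the table notes (a = 1%, b = 5%, c = 10%).
LEVELS = (("a", 0.01), ("b", 0.05), ("c", 0.10))


def two_sided_p_value(t_stat):
    """Two-sided p-value under the normal approximation (assumption, see lead-in)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))


def significance_marker(t_stat):
    """Return 'a', 'b', 'c', or '' for a given t-statistic."""
    p = two_sided_p_value(t_stat)
    for marker, level in LEVELS:
        if p < level:
            return marker
    return ""


# A few t-statistics taken from Table 1.
for t in (8.308, 2.02, 1.74, 0.82):
    print(t, significance_marker(t) or "-")
```

With 115 residual degrees of freedom the t and normal critical values are close, so the approximation mainly matters for t-statistics that lie right at a threshold.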
Table 2. Reduced Model Regression Results (Constant) Personal computers (PC) Dial-up Internet (DC)
Level-Level 91.110 (9.42) a
Log-Level -0.567 (-1.33)
Level-Log 113.3 (2.73) a
Log-Log 0.175 (0.166)
51.284 (3.31) a
0.744 (1.09)
4.429 (0.96)
0.37 (3.17) a
5.7E-7 (2.15) b
-1.3E-9 (-0.11)
2.696 (1.70) c
-0.014 (-0.34)
Broadband (BS) -8.759 (-0.22) 2.496 (1.41) -4.587 (-1.51) International Internet b b -3.8E-5 (-2.24) 2.295 (0.76) 0.001 (2.55) bandwidth (IB) c a 1.9E-5 (3.22) 0.013 (0.01) 0.001 (-1.69) GDP/capita Public expenditure on 0.612 (0.89) 0.085 (2.80) b 7.042 (1.10) education (EE) -4.545 (-1.47) -0.148 (-1.09) 0.847 (0.31) R&D expenditure Researchers in R&D -61.742 (-0.84) 5.482 (1.98) b 4883.56 (2.94) a (ER) 3.801 (0.36) 0.124 (1.45) 0.017 (4.42) a Globalization (GL) -79.38 (-10.62) a -75.013 (-11.78) a -0.281 (-1.01) Developing (DV) -117.1 (-13.92) a -99.679 (-12.83) a -1.260 (-3.69) a Least developed (LD) 0.885 0.806 0.825 R2 0.874 0.788 0.809 Adjusted R2 82.704 44.538 50.592 F-value (..)a denotes statistical significance at the 1% level for the two-sided t-test. (..)b denotes statistical significance at the 5% level for the two-sided t-test. (..)c denotes statistical significance at the 10% level for the two-sided t-test.
0.309 (4.01) a 0.001 (0.01) 0.390 (2.33) b 0.356 (2.19) b 0.036 (0.52) -0.046 (-0.66) 0.109 (0.37) -0.36 (-1.89) c -1.23 (-5.75) a 0.901 0.892 97.663
To test whether variables can be excluded when moving from the unrestricted to the reduced model, we use the F statistic:

\[
F = \frac{(SSR_{r} - SSR_{un})/q}{SSR_{un}/(n-k-1)} = \frac{(R^2_{ur} - R^2_{r})/q}{(1 - R^2_{ur})/(n-k-1)} \qquad (2)
\]

where $SSR_{r}$ is the residual sum of squares of the reduced model, $SSR_{un}$ is the residual sum of squares of the unrestricted model, $R^2_{ur}$ is the R-square of the unrestricted model, and $R^2_{r}$ is the R-square of the reduced model. q is the numerator degrees of freedom, which equals the reduced model degree of freedom minus the unrestricted
model degree of freedom. Finally, (n-k-1) is the denominator degrees of freedom of the unrestricted model, where n is the sample size and k denotes the number of predictor variables. Assuming that the Classical Linear Model (CLM) assumptions hold, F follows an F distribution with q and n-k-1 degrees of freedom. For q = 3 numerator degrees of freedom and (n-k-1) = 115 denominator degrees of freedom, the F-distribution table gives a 5% critical value between 2.68 and 2.71. Table 3 shows the calculated F statistics (equation 2) and, based on those, the decision between the unrestricted and the reduced model.

Table 3. F statistics test

Model         F Statistics Result   Decision Between the Unrestricted and the Reduced Model at 5% Critical Value
Level-Level   0.6784                reduced model
Log-Level     1.0140                reduced model
Level-Log     1.5972                reduced model
Log-Log       4.3071                unrestricted model
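The R-square form of equation (2) can be checked with a short sketch. The R² pairs below are read from the R² rows of Table 1 and Table 2; n = 130 countries, k = 14 regressors of equation (1), and q = 3 follow the text. The single critical value 2.68 is an assumption picked from the range quoted above, and the function names are hypothetical.

```python
# (R^2 unrestricted, R^2 reduced) per functional form, as reported in Tables 1 and 2.
R_SQUARED = {
    "Level-Level": (0.887, 0.885),
    "Log-Level": (0.811, 0.806),
    "Level-Log": (0.832, 0.825),
    "Log-Log": (0.911, 0.901),
}

N, K, Q = 130, 14, 3          # sample size, regressors in eq. (1), excluded variables
DF_DENOM = N - K - 1          # 115 denominator degrees of freedom
CRITICAL_5PCT = 2.68          # assumed value within the 2.68-2.71 range given in the text


def f_statistic(r2_ur, r2_r, q, df_denom):
    """F statistic for q exclusion restrictions, R-square form of equation (2)."""
    return ((r2_ur - r2_r) / q) / ((1.0 - r2_ur) / df_denom)


def compare_models():
    """Return {model: (F, preferred model)} at the assumed 5% critical value."""
    decisions = {}
    for model, (r2_ur, r2_r) in R_SQUARED.items():
        f = f_statistic(r2_ur, r2_r, Q, DF_DENOM)
        decisions[model] = (f, "unrestricted" if f > CRITICAL_5PCT else "reduced")
    return decisions


for model, (f, decision) in compare_models().items():
    print(f"{model:12s} F = {f:.4f} -> {decision} model")
```

The computed statistics agree with Table 3 to about three decimal places, and the decisions are the same for any critical value in the quoted range.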
As can be seen in Table 3, only for the log-log model does the F test reject the hypothesis that the coefficients of the excluded variables are equal to zero, so the unrestricted model is retained there. In all other cases, the reduced model is preferred.

5.2 Results Discussion

The results in Table 1 and Table 2 show that computer and broadband penetration are positive and statistically significant factors in the ICT infrastructure category of determinants. In contrast, the factor “dial-up Internet connection” in the same category does not provide consistent and statistically significant estimates. The same is true for the factor “competition in the ISP sector”. The factor “Internet interconnection capacity”, however, generated statistically significant estimates in many of the models, although its sign changes between positive and negative. Overall, the ICT infrastructure category reports mostly expected and interesting results in many of the estimates. Therefore, hypothesis H1 can be accepted.

In the “Science and Education Intensity” category of determinants, the “researchers in R&D” factor indicates a strong and positive effect on a country’s participation in Volunteer Grids. Yet, even though the availability of scientists and engineers as well as the quality of scientific research institutions are believed to be two essential factors for participation, these two factors did not show consistently statistically significant results. Their estimates vary from positive to negative and from significant to insignificant. This outcome could be a result of multicollinearity and/or of the way in which the data was collected. Yet, most of the estimates indicate that hypothesis H2 can also be accepted.

The results within the “Economic capabilities” category of determinants are as expected and consistent. The factor “GDP per capita” always reported strong, positive, and statistically significant estimates. The factor “Expenditure on education and on R&D” shows mostly statistically significant and positive estimates. Only in some cases does the factor result in statistically insignificant and/or negative coefficient estimates. The same is true for the factor “researchers in R&D” (which is not surprising, given the very high correlation between the two factors). In general, these results suggest the acceptance of hypothesis H3.

Finally, while the results in the category “globalization” showed positive coefficients in most cases, which were, however, mostly not statistically significant, the “country group” binary variables showed in most cases strong, negative, and statistically significant estimates. Consequently, we can accept hypothesis H4 as well.
6 Discussion and Policy Implications

Volunteer Grids are one of the most promising new computing paradigms. The vast potential they hold is changing the way computational scientific research is carried out in all fields. While advanced countries are active and already enjoy the benefits of these supercomputers, the OECD (Organization for Economic Co-operation and Development) has acknowledged the potential benefits of global Volunteer Grids for developing countries as well. It states that: “The potential benefits to developing countries are considerable, since scientists would be able to join international global-scale collaborations with only a modest investment in a local infrastructure as minimal as an Internet-linked and a high-performance workstation” [21]. However, as our results suggest, participation in Volunteer Grids is not determined solely by the availability of “Internet-linked and high-performance” workstations. Instead, other quantitative and qualitative determinants, which are more important, have to be fulfilled. For instance, the factor “dial-up”, which represents a link to the Internet, did not produce a statistically significant coefficient in any of the models. In contrast, the “broadband penetration” indicator shows positive and statistically significant estimates in all models, identifying it as a key element for participating in Volunteer Grids. Moreover, the results show that, although funding high-speed interconnections to developed countries is crucial, it is not sufficient. To sustain participation in Grid infrastructures, developing countries have to stimulate many other determinants (e.g. “R&D” and “education”) first. In fact, without competence in R&D, participation in Grid e-infrastructures does not benefit developing countries. Consequently, policies for supporting science and technology in developing countries should be developed so that these countries attain adequate gains from Grid participation. Increasing the number of R&D centers, facilitating personnel mobility, encouraging partnerships between universities and industry, continuously upgrading and training personnel, and increasing international cooperation and international joint research programs are just some of the policies that could help here.

Our results also suggest a strong linkage between the “economic capability” estimates (“GDP per capita”, “expenditure on education”, and “expenditure on R&D”) and the level of participation in global Volunteer Grids. This represents one of the most challenging concerns of developing countries. Promoting public-private partnerships in science and technology research, offering tax incentives for private-sector spending on R&D, and facilitating human resource mobility between public and private R&D sectors could help governments to tackle this set of determinants.

In general, our results indicate not only a difference between developing and advanced countries but also a difference between advanced countries and countries in between these two groups. A future study could investigate a finer grouping of countries than just the three groups considered here. Lastly, we have to mention that, although our model could identify a large number of factors that can explain participation in Volunteer Grids, there will always be an unexplained fraction that needs further analysis. In addition, any policies that try to bridge the identified gaps need to be undertaken not only nationally but also internationally.
7 Conclusion

Based on a literature analysis and a multiple regression analysis, this study proposed a model that identifies and estimates the determinants of participation in global Volunteer Grids. Four key determinant categories were investigated: ICT infrastructure, science and education intensity, economic capabilities, and country group. The analysis of the results showed that many factors affect the intensity of participation in, and utilization of, Volunteer Grids. In particular, an Internet connection and a workstation are not the only stimulating factors for participating in Volunteer Grids; many other factors matter as well. The results showed the significance of factors such as broadband Internet connections, R&D capacity, and the differences between developed and developing countries. Consequently, in addition to funding Internet interconnections to advanced countries, proper policies for increasing science and technology capacity and for growing international cooperation in R&D will support developing countries’ efforts to participate in global Volunteer Grids.
References

1. Taylor, N.J.: Public grid computing participation: An exploratory study of determinants. Information & Management 44, 12–21 (2007)
2. Voss, A., Mascord, M., Fraser, M., Jirotka, M., Procter, R., Halfpenny, P., et al.: e-Research Infrastructure Development and Community Engagement. Paper presented at the UK e-Science 2007 All Hands Meeting (retrieved 2007), http://www.allhands.org.uk/2007/proceedings/papers/866.pdf
3. David, P.: Towards a cyberinfrastructure for enhanced scientific collaboration: Providing its ‘soft’ foundations may be the hardest. Research Report No. 4, Oxford Internet Institute (August 2004), http://www.oii.ox.ac.uk/resources/publications/OIIRR4_200408.pdf
4. Maqueira, J.M., Bruque, S.: Towards an Adoption Model of Grid Information Technology in the Organisational Arena. In: Proceedings of 7th IEEE/ACM International Conference on Grid Computing, Barcelona, September 28-29 (2006)
5. Wijngaert, L., Bouwman, H.: Would you share? Predicting the potential use of a new technology. Telematics and Informatics 26, 85–102 (2009)
6. Engelbrecht, H.-J.: ‘Social Sharing’ by Means of Distributed Computing: Some Results from a Study of SETI@Home. In: Proceedings of the International Telecommunication Society Africa-Asia-Australasia Regional Conference: ICT Networks: Building Blocks for Economic Development. Curtin University of Technology, Perth (August 28-30, 2005)
7. Engelbrecht, H.-J.: Internet-based ‘social sharing’ as a new form of global production: The case of SETI@home. Telematics and Informatics 25(3), 156–168 (2008)
8. Barjak, F., Wiegand, G., Lane, L., Kertcher, Z., Poschen, M., Procter, R., Robinson, S.: Accelerating Transition to Virtual Research Organization in Social Science (AVROSS): First Results from a Survey of e-Infrastructure Adopters. In: Third International Conference on e-Social Science, October 7-9. Ann Arbor, Michigan (2007), http://ess.si.umich.edu/papers/paper141.pdf
9. Chinn, M., Fairlie, R.: The determinants of the global digital divide: a cross-country analysis of computer and internet penetration. Oxford Economic Papers, pp. 16–44. Oxford University Press (2006), doi:10.1093/oep/gpl024
10. Lee, S., Brown, J.: Examining broadband adoption factors: An empirical analysis between countries. Info 10(1), 25–39 (2008)
11. Bagchi, K., Udo, G.: Empirically testing factors that drive ICT adoption in Africa and OECD set of nations. Issues in Information Systems VIII(1-2), 45–53 (2007)
12. Schroeder, R.: e-Research Infrastructures and Open Science: Towards a New System of Knowledge Production. Prometheus 25(1), 1–18 (2007)
13. Lucchetti, R., Sterlacchini, A.: Factors affecting the adoption of ICTs among SMEs: Evidence from an Italian survey. In: Quaderni di ricerca n. 155. Università degli Studi di Ancona (2001)
14. Quibria, M.G., Ahmed, S.N., Tschang, T., Reyes-Macasaquit, M.: Digital divide: determinants and policies with special reference to Asia. Journal of Asian Economics 13, 811–825 (2003)
15. Dwivedi, Y., Lal, B.: Socio-economic determinants of broadband adoption. Industrial Management and Data Systems 107(5), 654–671 (2007)
16. Schroeder, R.: e-Research Infrastructures and Open Science: Towards a New System of Knowledge Production. Prometheus 25(1), 1–18 (2007)
17. Berkeley Open Infrastructure for Network Computing (BOINC), http://boinc.berkeley.edu/ (accessed October 2008)
18. Anderson, D.P.: Public computing: reconnecting people to science. Conference on Shared Knowledge and the Web, Residencia de Estudiantes, Madrid, Spain (November 2003)
19. Andronico, G., Barbera, R., Koumantaros, K., Ruggieri, F., Tanlongo, F., Vella, K.: Grid Infrastructures as catalysts for development on e-science: experiences in the Mediterranean. Bio-Algorithms and Med-Systems (Jagiellonian University Medical College) 3(5), 23–25 (2007)
The Determination of Jurisdiction in Grid and Cloud Service Level Agreements

Davide Maria Parrilli

Interdisciplinary Centre for Law and Technology (ICRI), K.U. Leuven, IBBT, Sint-Michielsstraat 6, 3000 Leuven, Belgium
Abstract. Service Level Agreements in Grid and Cloud scenarios can be a source of disputes, particularly in case of breach of the obligations arising under them. It is therefore important to determine where parties can litigate in relation to such agreements. The paper deals with this question in the specific context of the European Union, and thus takes into consideration Regulation 44/2001. According to the rules on jurisdiction provided by the Regulation, two general distinctions are drawn in order to determine which (European) courts are competent to adjudicate disputes arising out of a Service Level Agreement. The former is between B2B and B2C transactions, and the latter is between contracts that contain a jurisdiction clause and contracts that do not.

Keywords: SLA, Jurisdiction, Legal, Grid, Cloud.
1 Introduction¹

The commercial success of Grid and Cloud computing technologies relies to a great extent on the Service Level Agreements (SLAs) entered into by the parties involved in a Grid or Cloud value chain. In other words, SLAs are contractual instruments that allow the parties to construct a business relation that may be fruitful and satisfying for them or, conversely, may lead to litigation and to the termination of the relationship. Given this consideration, SLAs deserve great attention from both researchers and practitioners and should be drafted in a consistent and viable way. SLAs are, in fact, contracts and, like any other contract that is binding on the parties, should be efficient and effective: the parties should be aware of their respective obligations, and the agreement should state clearly which sanctions (damages, service credits, etc.) are linked to failures to comply with such obligations. As regards the former aspect, the core of the SLA, i.e. the definition of the mutual obligations, consists of the description and definition of the quality of service (QoS) promised by the supplier, the level of availability of the service, the indication of the security measures adopted by the provider, the fees to be paid by the customer, etc. Nevertheless, the efficiency of the SLA is not limited to these aspects, but involves other issues that are often neglected by Grid and Cloud providers, suppliers of Grid and Cloud-based services, and customers. We refer, in particular, to some clauses that should be included in all SLAs, such as the indication of the law governing the contract and the choice of the competent court in case of disputes arising from the interpretation and/or the execution of the contract. This latter issue is particularly interesting, and our aim is to provide an assessment of the topic in a pioneering and innovative way.

¹ I wish to thank Dr. Luca Penasa of the School of Law of the University of Padova (Italy) for his precious collaboration and support in writing this paper.

J. Altmann, R. Buyya, and O.F. Rana (Eds.): GECON 2009, LNCS 5745, pp. 128–139, 2009. © Springer-Verlag Berlin Heidelberg 2009

This paper will focus precisely on the problems linked to the identification of the court that is competent to judge the disputes arising from an SLA entered into by the parties in a Grid or Cloud computing environment. First of all, our approach will be European, i.e. we will take into consideration the sources enacted by the European lawmaker in the field, with special attention to Regulation 44/2001², which is the most important legal source in the field of jurisdiction and recognition of foreign decisions in the European Union (EU). Furthermore, our study will be based on the experience arising from the consultancy provided to real pilot cases of implementation and adoption of Grid technologies by business operators in the framework of the FP6 European-funded project BEinGRID.³ This research project is centered on a certain number (25) of cases (so-called Business Experiments) of commercial exploitation of Grid and Grid-based services by European companies and universities. In order to render this paper as general as possible, and thus interesting for a wide audience, we base our analysis on the SLAs adopted in simplified scenarios, i.e. we will take into consideration the provision of Grid and Cloud computing and storage resources and the supply of Grid and Cloud-based services, for instance Software as a Service (SaaS). [1] In the former situation, there will be an SLA between the technology provider and the client.
If the customer is (basically) a company or an undertaking, the SLA will be qualified as a business to business (B2B) contract, while if the client is an individual acting outside his professional activity, the SLA is a business to consumer (B2C) agreement. As will be shown below, such a distinction has great practical relevance, and often the legal solutions in the two cases will differ radically. In case of supply of Grid or Cloud-based services, for instance SaaS, we will typically have two different SLAs, one between a technology provider and the SaaS provider (basically regulating the provision of Grid or Cloud resources) and one between the SaaS provider and the end user. The former SLA is a B2B contract, while the latter can be, according to the concrete circumstances of the case, a B2B or a B2C agreement. The considerations expressed in this paper are applicable to both SLAs.

In the above business cases it is, of course, possible that the customer is not satisfied with the provision of the service, or that the client does not pay for the services, or that the data supplied by the customer get lost or damaged because of a security failure, etc. In any such case there will be a dispute, which can be solved amicably by the parties (e.g. through an out-of-court dispute resolution system) or judicially by a judge. If the parties want the dispute to be heard in a court and decided by a judge, the problem that arises is the identification of the competent court. This issue arises if the case has an international dimension, i.e. if the parties are located in different countries: if, of course, they are all established in the same country and the SLA is signed in that country, the competent court will be that of the same State (the only problem then will be the identification of the competent judge inside the country, according to the national rules of civil procedure).

In order to provide solutions to the above issue, we will first take into consideration the situation in which the parties do not state in the SLA which court will be competent. In this case, we will assess (i) how it is possible to find the competent court in both B2B and B2C scenarios. The parties may, alternatively, designate the competent court in a specific clause of the SLA, but in this case it is necessary to analyse (ii) in which form such a clause must be drafted and, (iii) in case of a B2C SLA, to what extent such a clause is valid pursuant to the European legal framework.

² Council Regulation (EC) No 44/2001 of 22 December 2000 on Jurisdiction and the Recognition and Enforcement of Judgments in Civil and Commercial Matters [OJ L12, 16.1.2001, pp. 1-23].
³ For further information, please visit http://www.beingrid.eu
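The way the introduction partitions the cases (B2B or B2C, purely domestic or not, with or without a choice of court clause) can be summarized in a small sketch. It only mirrors the outline given above and is not a statement of law: the class and function names are hypothetical, and treating every case that is not purely domestic as one with a possible international dimension is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class SlaScenario:
    customer_is_individual: bool          # natural person rather than a company
    outside_professional_activity: bool   # acting outside his professional activity
    party_countries: tuple                # countries where the parties are established
    signing_country: str                  # country where the SLA is signed
    has_choice_of_court_clause: bool


def contract_type(s):
    """B2C only for an individual acting outside his professional activity."""
    if s.customer_is_individual and s.outside_professional_activity:
        return "B2C"
    return "B2B"


def is_purely_domestic(s):
    """All parties established in one country and the SLA signed in that country."""
    countries = set(s.party_countries)
    return len(countries) == 1 and s.signing_country in countries


def questions_to_assess(s):
    """Which of the questions (i)-(iii) from the introduction a scenario raises."""
    if is_purely_domestic(s):
        return ["competent judge under the national rules of civil procedure"]
    if not s.has_choice_of_court_clause:
        return [f"(i) competent court in the absence of choice ({contract_type(s)})"]
    questions = ["(ii) form in which the choice of court clause must be drafted"]
    if contract_type(s) == "B2C":
        questions.append("(iii) validity of the clause under the European legal framework")
    return questions


consumer_case = SlaScenario(True, True, ("DE", "NL"), "NL", True)
print(contract_type(consumer_case), questions_to_assess(consumer_case))
```

A professional buying resources for his practice would be modelled with `outside_professional_activity=False` and therefore falls under the B2B branch, in line with the B2B/B2C distinction drawn above.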
2 The Determination of the Competent Court in the Absence of Choice by the Parties

In B2B SLAs…

When the parties do not include a choice of court clause in their contract (SLA), in the Member States of the EU the first question to be asked is whether Regulation 44/2001 is applicable, and the second is whether the defendant is domiciled within the European territory. In a B2B contract both parties are generally companies or other legal persons, so, according to Art. 60 of the same Regulation, they are domiciled where they have, alternatively, their statutory seat, their central administration, or their principal place of business. It is to be pointed out that it is rather difficult to locate at any specific geographical point the principal place of business of, typically, a technology provider and a supplier of Grid or Cloud-based services. [9] Moreover, in case of Grid or Cloud providers, we would exclude that the place where the Grid or Cloud components (i.e. servers, routers, nodes, etc.) are located identifies the principal place of business of the company, and for these purposes the reference to the top level domain name is also unhelpful.⁴ In order to assess when the Regulation applies, we have to refer basically to the general rule set out in Art. 2 of the Regulation, which provides for the jurisdiction of the courts of the State where the defendant (i.e. the person sued) is domiciled⁵, if he is domiciled in a Member State. In contractual disputes, however, Art. 5(1) of the Regulation also confers jurisdiction on the courts of a Member State different from that of the domicile of the defendant. Art. 5 thus states that “[A] person domiciled in a Member State may, in another Member State, be sued:

1. (a) in matters relating to a contract, in the courts for the place of performance of the obligation in question;
(b) for the purpose of this provision and unless otherwise agreed, the place of performance of the obligation in question shall be:
- in the case of the sale of goods, the place in a Member State where, under the contract, the goods were delivered or should have been delivered,
- in the case of the provision of services, the place in a Member State where, under the contract, the services were provided or should have been provided,
(c) if subparagraph (b) does not apply then subparagraph (a) applies”.

It is then necessary to consider whether an SLA can be considered a “contract for the provision of services” within the meaning of Art. 5(1)(b). If the answer is positive, the courts competent to adjudicate all the disputes concerning the SLA will be those of the country where the service is provided or shall be provided; otherwise the competent courts will be determined taking into consideration (i) the specific obligation upon which the claimant founds his action and (ii) the place of performance of such an obligation. We can fortunately answer the question with the authoritative guidance of the European Court of Justice (ECJ). In the recent judgement Falco Privatstiftung⁶ it was held that a licence agreement cannot be included in the notion of “contract for the provision of services” employed by Art. 5(1)(b) for lack of some activity or active conduct⁷. The solution provided by the ECJ is similar, in particular, to one of the opinions expressed in the literature [2]. Moreover, it is to be pointed out that this interpretation appears in line with the objective of the 2000 reform of Art. 5(1), that is, to provide a “pragmatic determination of the place of performance” [7]: when the service supplier’s obligation does not imply any positive conduct, there is no place where the performance can be “pragmatically” located, and Art. 5(1)(b) must then be held inapplicable [3].

⁴ The technology providers and the Grid or Cloud-based service suppliers, in fact, very often have domain names ending with “.com”, so it is not possible to connect them with any specific State.
⁵ When the Regulation does not apply, the court seized will determine whether it is competent under the national rules on jurisdiction of the forum (see Art. 4 of the Regulation).
It is then possible to argue that SLAs are not contracts for the provision of services for the purposes of the Regulation, since no active conduct is required of the technology provider or of the Grid or Cloud-based service supplier, either in case of provision of Grid and Cloud storage capacity, or in case of supply of Grid and Cloud-based services, for instance SaaS, which basically does not require any specific activity vis-à-vis each client, since the supplier only uploads the software on his servers. In all these cases, Art. 5(1)(a), instead of Art. 5(1)(b), will then apply. In conclusion, and in light of the case law of the ECJ,⁸ as far as SLA disputes are concerned, Regulation 44/2001 confers special jurisdiction on the court of the place where the obligation upon which the claimant founds his action has to be performed, under the law governing the contract.⁹ Such a law will be determined in each Member State by the rules of the Rome Convention of 1980 on the law applicable to contractual obligations¹⁰. So, when the SLA contains a choice of law clause, the law governing the contract will be that chosen by the parties, according to Art. 3 of the Rome Convention. In the absence of such a clause, Art. 4 provides that the applicable law is that of the country with which the contract is most closely connected, i.e. the State where the party, “who is to effect the performance which is characteristic of the contract [i.e. the performance of the non pecuniary obligation, the provision of ICT services], has, at the time of conclusion of the contract, his habitual residence, or, in the case of a body corporate or unincorporate, its central administration”. Then, the place of performance of the obligation (and as a consequence the competent court) will generally be determined under the law of the country where the technology provider or the service supplier has its central administration, unless the parties agreed otherwise in the SLA. It is noteworthy that, as far as contracts entered into after 17 December 2009 are concerned, the Rome Convention will be replaced by the Rome I Regulation¹¹, but this fact will not have any particular effect on the identification of the law governing SLAs.

⁶ ECJ, Falco Privatstiftung, Thomas Rabitsch v Gisela Weller-Lindhorst, C-533/07.
⁷ See in the same sense the Opinion of Advocate General Trstenjak in the Falco case, delivered on 27 January 2009.
⁸ See the decisions in the cases De Bloos (ECJ, C-14/76, De Bloos Sprl v Bouyer SA [ECR 1976, I-1497]) and Tessili (ECJ, C-12/76, Industrie Tessili Italiana Como v Dunlop [ECR 1976, I-1473]).
⁹ National laws will generally provide “default rules” establishing the place of performance even if it is not possible to identify it on a factual basis (e.g., Art. 1182 of the Italian Civil Code and Art. 1247 of the Belgian Civil Code).

…and in B2C SLAs

Regulation 44/2001 provides for specific rules applicable to consumer contracts, i.e. agreements concluded with a person “for a purpose which can be regarded as being outside his trade or profession”, as specified by Art. 15(1). The first problem to analyse regards the notion of ‘consumer’. The definition provided by the above legal rule is apparently very clear, so that a person is a consumer every time he buys a good or a service for purposes not linked to his activity, profession, or business. Therefore, for instance, a lawyer who requires Grid or Cloud storage resources to host his professional files is not a consumer and, as a consequence, the special rules set forth by the Regulation (see below) do not apply.
If, on the other hand, the same lawyer buys storage capacity for his personal files, e.g. music, videos or pictures, he is deemed to be a consumer. Part of the literature proposes a different approach, and some authors wonder whether the notion of consumer could also include small businesses or professionals that are not very familiar (they are “profane”) with the products or services they buy. [4] It is possible to agree with this approach, but only to a limited extent. To be more precise, we agree that the definition of consumer is not satisfactory, since in many cases it fails to protect the weak party. In other words, a lawyer or a small company that buys computing capacity or services from big multinational companies like, for instance, Amazon or Sun is not a consumer, based on the literal interpretation of the above definition, but both of them are certainly the weak party in the contractual relationship. Therefore, it would be extremely useful if the European lawmaker modified the definition of consumer so as to include small businesses and professionals that, in practice, are like ‘traditional’ consumers (this need is even more urgent if we consider that a professional can use Grid or Cloud computing resources to store professional and private files at the same time,
10 1980 Rome Convention on the law applicable to contractual obligations [OJ L266, 9.10.1980, pp. 1-19].
11 Regulation (EC) No 593/2008 of the European Parliament and the Council of 17 June 2008 on the law applicable to contractual obligations (Rome I) [OJ L177, 4.7.2008, pp. 6-16].
The Determination of Jurisdiction in Grid and Cloud Service Level Agreements
and in these cases it is cumbersome to establish which regime will be applicable).12 Having said that, given the current definition of consumer, it is extremely difficult to enlarge it, and therefore small businesses and professionals, when they buy Grid or Cloud resources (or Grid or Cloud-based services), cannot be considered to be consumers.13 The special rule for the determination of the competent court can be found in Art. 16 of the Regulation. This provision is applicable if the criterion set forth by Art. 15(1)(c) is met, i.e. if and when “the contract has been concluded with a person who pursues commercial or professional activities in the Member State of the consumer’s domicile or, by any means, directs such activities to that Member State or to several Member States including that Member State, and the contract falls within the scope of such activities.” It may be problematic, in an e-commerce scenario14, to assess whether or not this criterion is met. The expression ‘by any means’ refers, as pointed out by the European Commission [7] and in the literature [5] [6], to business activities carried on through the Internet, and an example will clarify the point. Let us assume that a Grid or Cloud provider, established in the Netherlands, has a website through which customers located all around the world, including in the Member States of the EU, can buy services. The question that arises is whether or not such a website can constitute a way to direct activities to those Member States: to answer this question, some criteria must be taken into account. First of all, we could assume that the language of the website is a key factor. [8] Can we say that a website that is, for instance, only in English is directed exclusively to English-speaking countries, so that if a German consumer buys services there the special protection set forth by Regulation 44/2001 does not apply?
We would say that this solution is not satisfactory and may lead to unfair treatment: given that the majority of the websites of ICT vendors are in English, only customers domiciled in the United Kingdom or in Ireland would be protected, although these websites are also accessible from other countries and sell services to customers located all around Europe. On the other hand, we can assume that a website translated into many languages is a clear indicator of the will of the company to address some specific national markets. From a different perspective, it is necessary to assess the nature and the characteristics of the website in order to establish whether or not it is directed to sell products or services in one or more Member States and therefore whether or not the special rules set forth by the Regulation to protect consumers apply. The literature [4]15 and the European Commission itself [7] rely on the distinction between passive and active websites: the former is a website that contains only advertising material and provides mere information, while the latter allows the client to enter into an
12 The problem, then, is the definition of small business: it may be rather difficult to draw clear borders around the parties to which the special protections for consumers apply.
13 The ECJ, in case C-269/95, Francesco Benincasa v. Dentalkit S.r.l. [ECR 1997, I-3767], decided that consumer contracts concern only agreements whose aim is to satisfy the private consumption needs of an individual, supposed to be the weakest party.
14 We assume, in fact, that the provision of Grid or Cloud computing resources and of Grid and Cloud-based services is an e-commerce activity.
15 These authors point out, then, that “a consumer’s knowledge of the existence of an interactive website in a particular Member State will not give rise to jurisdiction…unless the consumer actually has concluded his contract through actual use of the website’s interactive capability.”
D.M. Parrilli
agreement with the supplier and therefore buy Grid or Cloud resources or Grid or Cloud-based services. It is necessary to perform a case-by-case analysis in order to assess whether a website is passive or active: in principle, if the technology or service provider has a website with information in several European languages and customers can buy the offered services there (i.e. by ‘signing’ a contract, providing a credit card number, etc.), there are few doubts that the provider wanted to target those national markets, and therefore customers domiciled there will enjoy the protection offered by Regulation 44/2001.16 This holds unless the provider expressly states on the website that the offered products and services are not intended for a certain market [9], or payment is expressed only in one currency and no other currencies are accepted (this criterion is not very effective after the adoption of the Euro by many European States), or it is possible to pay for the services only with credit cards issued in one or more specific countries (if, for instance, only credit cards issued by German banks are accepted, it is pretty clear that the services are not intended to be sold in Belgium). Having said that, Art. 16 of the Regulation specifies which courts will be competent if the above requisites are met. Pursuant to paragraph 1, a consumer, when acting as plaintiff, i.e. when bringing an action against a technology or service provider, can choose the court of the place where he is domiciled or the court of the place where the provider is domiciled. Art. 16(2) provides a solution for the opposite case, when the provider brings an action against the consumer (who is thus the defendant): “proceedings may be brought against a consumer by the other party to the contract only in the courts of the Member State in which the consumer is domiciled.” The protection for the consumer is here evident.
[10] The above provisions, especially paragraph 1, also make clear that these special rules are applicable only if the defendant is domiciled in a Member State. If this is not the case, then, pursuant to Art. 4 of the Regulation, the national jurisdictional rules will apply. In other words, if a consumer domiciled in France wants to sue a Grid provider domiciled in the United States, jurisdiction will be assessed in light of the national rules of civil procedure of France. If, on the other hand, the Grid provider is domiciled in a Member State, no problems arise and the provisions of the Regulation will be applicable. It is, then, necessary to establish when a person is domiciled in a Member State.17 As regards “a company or other legal person or association of natural or legal persons” (i.e. Cloud or Grid providers and suppliers
16 The European Commission and the Council “stress that the mere fact that an Internet site is accessible is not sufficient for Article 15 to be applicable, although a factor will be that this Internet site solicits the conclusion of distance contracts and that a contract has actually been concluded at a distance, by whatever means. In this respect, the language or currency which a website uses does not constitute a relevant factor.” [11]
17 Pursuant to Art. 59(1), the court seised of a dispute will decide whether or not the party is domiciled in that State applying its national law. In this regard there is the risk that, for instance, a court of State A will decide that the party is domiciled in country B, while the judges of that State will conclude, applying their national law, that the party is domiciled in State C or… in country A. It would therefore be desirable for the Regulation to set clear criteria that are valid in all Member States.
of Grid-based services), the abovementioned Art. 60 sets the criteria for determining the domicile of the company. It should be pointed out that paragraph 2 of Art. 15 of the Regulation states that “where a consumer enters into a contract with a party who is not domiciled in the Member State but has a branch, agency or other establishment in one of the Member States, that party shall, in disputes arising out of the operations of the branch, agency or establishment, be deemed to be domiciled in that State”. Given the different possible interpretations of this provision, we believe that there is no branch, agency or other establishment in a country if the technology or service provider merely has a website registered in that State or some Grid or Cloud components located there. This means that the above provision is not likely to be applied in a Grid or Cloud scenario unless the provider has an establishment with a real organization of people and means together with administrative and contractual autonomy. This would be the case if, for instance, an American technology provider had a branch in a European country that deals with clients, enters into contracts, etc. A further issue related to the domicile of the consumer is the consideration that must be given to the place from which the consumer buys the services. In other words, does it matter if a consumer who lives (i.e. is domiciled) in Germany buys online services while on a business or leisure trip to Spain? Does this exclude the competence of the German courts, as indicated by the abovementioned Art. 16? The literature has not reached a unanimous position on this point, but the majority of the authors say, in the above example, that German courts will be competent as long as the website also targets Germany.18 If, for instance, the German consumer buys Grid or Cloud-based services while he is in Spain from a website that allows sales only to users located in Spain (e.g. thanks to the adoption of GeoIP-based filtering systems), the consumer could not sue the company in Germany.19 [12] [9]
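The indicators discussed in this section for deciding whether a provider's website is directed to a given Member State can be summarised in a rough sketch. It only illustrates the reasoning and is not a statement of the law: the `Website` type, its fields and the `is_directed_to` function are hypothetical names introduced here, and a real assessment remains a case-by-case analysis. Language and currency are deliberately left out, since the text treats them as weak or contested indicators.

```python
from dataclasses import dataclass, field


@dataclass
class Website:
    """Hypothetical description of a provider's website (illustrative only)."""
    allows_contracting: bool                                  # "active" site: services can be bought online
    excluded_states: set = field(default_factory=set)         # markets expressly disclaimed
    accepted_card_states: set = field(default_factory=set)    # empty set = cards from anywhere
    geo_allowed_states: set = field(default_factory=set)      # empty set = no GeoIP filtering


def is_directed_to(site: Website, state: str) -> bool:
    """Rough indicator of whether the site targets consumers domiciled in `state`."""
    if not site.allows_contracting:
        return False   # passive site: advertising and information only
    if state in site.excluded_states:
        return False   # provider expressly states the offer is not for this market
    if site.accepted_card_states and state not in site.accepted_card_states:
        return False   # only credit cards issued in other countries are accepted
    if site.geo_allowed_states and state not in site.geo_allowed_states:
        return False   # sales are restricted by GeoIP-based filtering
    return True


# A site selling only to users located in Spain (GeoIP filter) does not target Germany.
spanish_only = Website(allows_contracting=True, geo_allowed_states={"ES"})
print(is_directed_to(spanish_only, "ES"))   # True
print(is_directed_to(spanish_only, "DE"))   # False
```

The checks are ordered from the most decisive indicator (whether a contract can be concluded online at all) to the restrictions a provider may add; any of them can be dropped or reweighted without changing the overall structure.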
18 “One could reasonably expect that, as long as a consumer has his or her permanent domicile on the territory of a Contracting state, the e-commerce contract can be concluded not only from this domicile in one of the Union states, but also while the person in question is on a business trip to, let us presume, Japan. Of course, in such a case one condition is that the web-site where the goods or material are being advertised would be available in the Contracting State where the consumer has his or her domicile.” [9]
19 Contra, inter alia, [13]. Further problems may arise if the provider uses banners on websites to attract customers. We can imagine the case of a technology provider that advertises its services on the international website of an ICT magazine with a banner specifically aimed at customers domiciled in one specific country, thanks to a GeoIP tracking system. The customer sees the banner in his language and then clicks it, with the result of being redirected to the website of the technology provider. If this website is also in the customer’s language (or other elements clearly indicate that it is directed to one Member State), no particular problems arise. But what if the website does not allow the targeted market to be determined? Are all the Member States deemed to be targeted even if the banner was only directed to potential clients domiciled in one specific country? Let us imagine that a German customer goes on a business trip to Spain, clicks on a banner aimed at Spanish customers and then buys the services from the ‘international’ website of the technology provider. Are German courts still competent? We would say no, but the issue is surely open to discussion.
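The allocation of jurisdiction under Art. 16, as described in this section, can be expressed as a small decision function. This is a simplified sketch under stated assumptions: the function name and its parameters are hypothetical, domicile is reduced to a country code, the deemed domicile of Art. 15(2) is ignored, and the fallback to national rules is only signalled, not resolved.

```python
from typing import List, Set, Tuple


def competent_courts(plaintiff: str, consumer_state: str, provider_state: str,
                     member_states: Set[str], directed: bool) -> Tuple[str, List[str]]:
    """Return (regime, states whose courts are competent) for a B2C dispute.

    `plaintiff` is "consumer" or "provider"; `directed` says whether the
    Art. 15(1)(c) criterion (activities directed to the consumer's State) is met.
    """
    if not directed:
        return ("general rules", [])      # special consumer rules do not apply
    defendant_state = provider_state if plaintiff == "consumer" else consumer_state
    if defendant_state not in member_states:
        return ("national rules", [])     # Art. 4: defendant not domiciled in a Member State
    if plaintiff == "consumer":
        # Art. 16(1): the consumer's own domicile or the provider's domicile
        return ("Art. 16(1)", sorted({consumer_state, provider_state}))
    # Art. 16(2): only the courts of the consumer's domicile
    return ("Art. 16(2)", [consumer_state])


EU = {"DE", "FR", "NL", "ES"}   # illustrative subset, not the full list of Member States
print(competent_courts("consumer", "DE", "NL", EU, directed=True))
print(competent_courts("provider", "DE", "NL", EU, directed=True))
print(competent_courts("consumer", "FR", "US", EU, directed=True))
```

The third call mirrors the example of the French consumer suing a provider domiciled in the United States: the function does not pick a court but reports that national rules decide.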
3 Choice of Court by the Parties
3.1 The Written Form in B2B Contractual Choice of Court
In the previous paragraphs we assessed how the judicial competence can be determined when the parties did not state, in the SLA, which court will be competent. It is definitely advisable that the parties always state in any agreement which court will have jurisdiction in case disputes arise, in order to reduce (costly and time-consuming) issues. We first consider agreements in B2B scenarios. At European level, Regulation 44/2001 sets forth that the parties can decide which European court (for instance, more generally, ‘German courts’ or, more specifically, ‘the court of Munich’) will be competent and, unless they agree otherwise, this court or these courts will have exclusive jurisdiction, so that no other court can be competent.20 The choice of court must respect some formal requirements: pursuant to Art. 23(1) second part of the Regulation, it must be either: “(a) in writing or evidenced in writing; or (b) in a form which accords with practices which the parties have established between themselves; or (c) in international trade or commerce, in a form which accords with a usage of which the parties are or ought to have been aware and which in such trade or commerce is widely known to, and regularly observed by, parties to contracts of the type involved in the particular trade or commerce concerned.” As regards this last requisite, it is necessary to perform a case-by-case analysis in every business and commercial sector in order to identify the most common practices. Nevertheless, in the field of electronic contracts, like SLAs, the lawmaker introduced a useful specification in the Regulation: according to Art. 23(2), “any communication by electronic means which provides a durable record of the agreement shall be equivalent to ‘writing’”. [9] The key factor is thus the durable record of the electronic agreement, i.e. of the provision in the SLA that contains the choice of court. One could wonder whether a webpage containing the SLA with an ‘I agree’ button at the end meets this requisite. In this case the electronic means can provide a durable record if the customer saves the page to his hard disk, or copies it into a Word document and then saves it. This issue is highly controversial and it is difficult to provide a solution that is beyond criticism. As a matter of precaution, we would suggest that Grid or Cloud providers and suppliers of Grid or Cloud-based services deliver to their clients a copy of the agreement in PDF format, either to be downloaded from the website or sent by email. The adoption of an electronic signature by both parties, then, further reduces the uncertainties linked to compliance with the above legal requirement. Furthermore, we believe that, even if we assume that the ‘click and wrap’ agreement does not meet the requisite of the written form, it can be considered as a practice widely
20 Art. 23(1) first part states, in fact, that “if the parties, one or more of whom is domiciled in a Member State, have agreed that a court or the courts of a Member State are to have jurisdiction to settle any disputes which have arisen or which may arise in connection with a particular legal relationship, that court or those courts shall have jurisdiction. Such jurisdiction shall be exclusive unless the parties have agreed otherwise.”
known in international trade. In practice this means that it should always be an acceptable choice of court clause or agreement. Finally, it has to be observed that Art. 23 of the Regulation applies only to jurisdiction agreements in favour of a European court, while, when the parties confer jurisdiction on a court of a third State, the formal and substantive validity of this agreement has to be assessed according to the national rules of the court seised.
3.2 Intra-EU and Extra-EU Derogation of the Special Rules ex Article 15 and 16 in B2C Scenarios
In case of B2C provision of services, the abovementioned special rules provided for by Art. 15 and 16 of the Regulation are, in principle, of strict application. The only exceptions are listed in Art. 17 of the Regulation, so that a Grid or Cloud provider or a supplier of Grid or Cloud-based services and customers/consumers can agree on a competent court different from those provided ex lege only if such agreement:
1. “is entered into after the dispute has arisen”; or
2. “allows the consumer to bring proceedings in courts other than those indicated” in Art. 16; or
3. “is entered into by the consumer and the other party to the contract, both of whom are at the time of the conclusion of the contract domiciled or habitually resident in the same Member State, and which confers jurisdiction on the courts of that Member State, provided that such an agreement is not contrary to the law of that Member State.”
This means that, apart from those situations, “any forum selection agreement between a business and a consumer providing for a forum other than the courts of the home country of the consumer is null and void”. [6] A European Grid or Cloud technology or service provider, thus, cannot state in the SLA for the supply of its services that the court of the place where the company is established will have exclusive competence to settle disputes with the consumer: this provision will be void and in effect replaced by the legal rules provided by the Regulation. It should be underlined that the literature generally holds that these limits also apply to jurisdiction agreements in favour of the courts of a third State [14] [15], on the basis of two different arguments. The first is literal, being based on Art. 17 of the Regulation, which does not make any distinction between the choice of a European court and the choice of a non-European forum, but only requires that the agreement derogates from the jurisdiction conferred under Art. 15 and 16 of the Regulation. The other argument takes into consideration the objective of Art. 17, that is, to protect the consumer against a choice of court substantially imposed by the other party. Such an objective leads to a broad interpretation of the scope of application of Art. 17, which also covers jurisdiction agreements in favour of a third State court. So, in a B2C contract, the parties cannot confer exclusive jurisdiction on the courts of, for instance, New York. The only valid jurisdiction agreements are those providing for a further (extra-European) forum where the consumer can sue the other party, or those entered into after the dispute has arisen.
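The three exceptions of Art. 17 lend themselves to a simple validity check for a choice of court clause in a B2C SLA. The following is a minimal sketch, assuming a hypothetical `ForumClause` record whose fields are named here for illustration only; it ignores the proviso that the agreement must not be contrary to the law of the Member State concerned.

```python
from dataclasses import dataclass
from typing import Set


@dataclass
class ForumClause:
    """Hypothetical summary of a choice of court clause in a B2C contract."""
    agreed_after_dispute: bool           # concluded after the dispute has arisen
    only_adds_forum_for_consumer: bool   # gives the consumer an extra forum, removes none
    chosen_state: str                    # State whose courts are chosen
    consumer_state: str                  # domicile/habitual residence at conclusion
    provider_state: str                  # domicile/habitual residence at conclusion


def clause_is_valid(clause: ForumClause, member_states: Set[str]) -> bool:
    """Apply the three alternatives of Art. 17 (simplified)."""
    if clause.agreed_after_dispute:
        return True    # first exception
    if clause.only_adds_forum_for_consumer:
        return True    # second exception
    # third exception: both parties in the same Member State, whose courts are chosen
    return (clause.consumer_state == clause.provider_state
            and clause.consumer_state in member_states
            and clause.chosen_state == clause.consumer_state)


EU = {"DE", "FR", "NL", "ES"}   # illustrative subset, not the full list of Member States
exclusive_new_york = ForumClause(False, False, "US", "DE", "NL")
extra_new_york = ForumClause(False, True, "US", "DE", "NL")
both_in_germany = ForumClause(False, False, "DE", "DE", "DE")
print(clause_is_valid(exclusive_new_york, EU))   # False
print(clause_is_valid(extra_new_york, EU))       # True
print(clause_is_valid(both_in_germany, EU))      # True
```

The first two examples correspond to the New York scenario in the text: an exclusive clause is void, whereas a clause that merely adds New York as a further forum for the consumer is acceptable.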
4 Conclusions
One could say that issues linked to the competent court are relatively marginal in the field of SLAs. This would be a big mistake, for the very reason that jurisdiction comes into play when problems arise in the relationship between the parties. And if this happens, the relationship and the trust, from the point of view of the client, in the specific provider and, more generally, in ICT providers can be restored only if justice is done. If the customer is left alone and cannot obtain an enforceable decision that restores his rights and allows him to recover the damages suffered, it is difficult to build trust in the products and services concerned (in this case Grid and Cloud services and Grid/Cloud-based services). In this respect, from the European perspective, businesses and consumers are in very different positions. The latter are, to a certain extent, protected by the applicable legal sources, while the former do not receive adequate safeguards, unless they are in a position to negotiate more favourable clauses with the supplier. This is a matter of fact, as basically no laws force the ICT provider to draft fair contracts. In the above pages we provided the reader with a (hopefully clear) picture of the situation according to the applicable European framework: we assessed when a client (in B2B and B2C SLAs) can sue or can be sued by the provider. Two other issues to take into consideration are the enforcement of the decisions issued by the competent court and the position of SMEs. As regards the former aspect, obtaining a decision from a judge does not guarantee that it will be enforced: enforcement is the rule, but in practice exceptions may exist. This is true especially if the judgment must be enforced in another country. Inside the EU there are no particular problems, given that “a judgment given in a Member State shall be recognized in the other Member States without any special procedure being required”, pursuant to Art. 33(1) of Regulation 44/2001. Things are different if the decision must be enforced in non-European countries, and in this regard we would urge the competent national and international authorities to enter into agreements aimed at facilitating the mutual recognition and enforceability of judgments. From a different perspective, we would invite the European lawmaker to reconsider the position of SMEs when dealing with ICT (and other) providers. The fact that, for instance, a micro-enterprise is listed in the commercial registers and has a VAT number does not mean that this entity is able to negotiate on equal terms with big operators. It simply has to accept the SLAs unilaterally drafted by the supplier and, very often, if problems arise, it will not enforce its rights in another country or continent. This paper does not aim to assess whether such a lack of protection is fair or not. We would instead like to promote debate about the issue, in order to reconsider the position of entities that, legally speaking, are not consumers but, from a practical point of view, are typically as weak as consumers. Solutions to this problem must be found at the political level.
References
1. Parrilli, D.M., Stanoevska, K., Thanos, G.: Software as a Service (SaaS) Through a Grid Network: Business and Legal Implications and Challenges. In: Cunningham, P., Cunningham, M. (eds.) E-Challenges 2008. Collaboration and the Knowledge Economy, pp. 633–640. IOS Press, Amsterdam (2008)
2. Franzina, P.: La giurisdizione in materia contrattuale, L’art. 5 n. 1 del regolamento n. 44/2001/CE nella prospettiva della armonia delle decisioni. CEDAM, Padova (2006)
3. De Cristofaro, M.: La tutela giurisdizionale ordinaria nella virtualità del cyberspazio: competenza interna e giurisdizione. In: Bessone, M. (ed.) Trattato di diritto privato, XXXII, Commercio elettronico. UTET, Torino (2007)
4. Wild, C., Weinstein, S., Riefa, C.: Council Regulation (EC) 44/2001 and Internet Consumer Contracts: Some Thoughts on Article 15 and the Futility of Applying “In the Box” Conflict of Law Rules to the “Out of Box” Borderless World. Int’l Rev. L. Computers & Tech. 19(1), 13–21 (2005)
5. Rowe, H.: E-Commerce: Jurisdiction Over On-Line Contracts and Non-Contractual Cross-Border Disputes, Part 1. J. Int’l. Bank. Fin. L. 19(2), 51–55 (2004)
6. Berliri, M.: Jurisdiction and the Internet, and European Regulation 44 of 2001. In: Woodley, S. (ed.) E-Commerce: Law and Jurisdiction. CLYIB, pp. 1–12. Kluwer Law International, The Hague (2008)
7. European Commission: Proposal for a Council Regulation (EC) on jurisdiction and the recognition and enforcement of judgments in civil and commercial matters (presented by the Commission). COM 1999, 348 final 99/0154, CNS (1999)
8. Graham, J.A.: The Cybersecurities’ Notion of “Targeting” in General Private International Law. Cyberbanking & Law (2003)
9. Rosner, N.: International Jurisdiction in European Union E-Commerce Contracts, http://www.llrx.com/features/eu_ecom.htm (2009) (last retrieved 21.4.2009)
10. Stone, P.: The Treatment of Electronic Contracts and Torts in Private International Law Under European Community Legislation. Inf. Comm. Tech. L. 11(2), 121–139 (2002)
11. European Commission, European Council: Joint Council and Commission statements on Council Regulation (EC) N (...) on jurisdiction and the recognition and enforcement of judgments in civil and commercial matters. Doc. 13742/00 JUSTCIV 131 (2000)
12. Riefa, C.: Article 5 of the Rome Convention on the Law Applicable to Contractual Obligations of 19 June 1980 and Consumer E-contracts: The Need for Reform. Inf. Comm. Tech. L. 13(1), 59–73 (2004)
13. Foss, M., Bygrave, L.: International Consumer Purchases Through the Internet: Jurisdictional Issues Pursuant to European Law. Int’l J. L. IT 8(99), 99–138 (2000)
14. Gaudemet-Tallon, H.: Compétence et exécution des jugements en Europe: règlement n. 44/2001, Conventions de Bruxelles et de Lugano. L.G.D.J., Paris (2002)
15. Mari, L.: Il diritto processuale della Convenzione di Bruxelles, I, Il sistema della competenza. CEDAM, Padova (1999)
Engineering of Services and Business Models for Grid Applications
Jürgen Falkner and Anette Weisbecker
Fraunhofer-Institut für Arbeitswirtschaft und Organisation, Nobelstr. 12, 70569 Stuttgart, Germany
{Juergen.Falkner,Anette.Weisbecker}@iao.fraunhofer.de
http://www.swm.iao.fraunhofer.de
Abstract. In the context of using grid applications in medicine and bioinformatics, a combination of classic biomedical services, like the analysis of biomaterial, with grid services is quite common. Services for customers in those fields need to comprise both and offer an easy way to use these combined services. Within the German project Services@MediGRID1, methods for the systematic development of complex customer services including the use of grid applications are developed. In coordination with this service engineering approach for grid applications, commercial business models are derived for a set of biomedical grid services.
Keywords: Software-as-a-Service models, Grid value chain, Business modeling and analysis.
1 Introduction
The field of biomedical applications contains a lot of specific application areas, ranging from medical image processing and complex data processing in the context of clinical studies to bioinformatics applications like genome sequence analyses. Each of those applications demands huge amounts of computing power and often also vast storage capacities. Unlike classic early adopters of grid technologies, the users of these applications are usually not IT experts, and they are typically not the same person as the application developer. For this reason the usability requirements for grid application interfaces are far higher than in early grid infrastructures. Also, the tasks of using, maintaining and deploying grid applications are carried out by different people with different roles. Since the usual high security requirements [1] of clinical and biomedical research environments also apply, a new kind of service offer is required. The basic change compared with classic grid usage is the introduction of mediators between the biomedical customers on one side and the providers of grid applications, grid infrastructure services and grid resources on the other side. These mediators
1 The Services@MediGRID project is funded by the German Federal Ministry of Education and Research under the registration mark 01IG07015F.
J. Altmann, R. Buyya, and O.F. Rana (Eds.): GECON 2009, LNCS 5745, pp. 140–149, 2009. © Springer-Verlag Berlin Heidelberg 2009
provide customer services which may also combine the offer of resource-consuming IT applications with classic services like the sequencing of biomaterial. In order to design such new services systematically and to quickly reach a good quality of service, especially for the customers, the German project Services@MediGRID has chosen a service engineering approach [2]. The first step of this approach, described in section 2, was to identify the added value of a series of biomedical grid applications like the analysis of haplotypes and transcriptomes or the optimization of pharmaceutical agents. In a second step, described in section 3, the complete processes in which the applications are used were modeled as service blueprints [3] and iterated in exchange with the application developers. Starting from these blueprints and a systematic understanding of the processes, and therefore also of the service chains in a grid context, the different cost factors became visible, as well as the interactions between the participants in the value chain. By combining all the information systematically gathered at this point, the business models for the different biomedical services involving grid applications could be derived, as described in section 4.
2 Analysis of Grid Applications
2.1 Business Requirements in Biomedical Use Cases
The most common starting point on the way towards developing a service in connection with a grid or cloud application is a hugely resource-consuming application for which a suitable execution environment is needed. This can be a grid or cloud environment in which the software application is provided as a service. Justifying the effort and cost of deployment is in most cases only possible if a significant number of users are willing to make use of it. At this point the specifics of the application and its potential clientele come into play. As already hinted at in the introduction, most users in the area of health care and bioinformatics tend to prefer easy-to-use applications that do not require them to personally scan monitoring web sites for suitable resources or to write job scripts to distribute applications to grid resources and data to storage systems [4]. It is also clear that those users do not want to take care of the deployment of applications to grid or cloud resources or of application maintenance. All this should be done by someone else. The problem is to find this person or group of persons. On the other hand, this can be seen as an opportunity to provide new kinds of services: for application deployment and maintenance, for the operation of resources and for the operation of value-added services such as resource brokering, metascheduling, accounting and billing, just to name a few. The problem remaining for the end user is how to access the grid application in a secure and still easy-to-use fashion. In the Services@MediGRID project an approach has been developed in which a grid service provider (GSP) takes care of the application usage and can therefore bear the complexities of the grid or cloud security infrastructures, so that the end user and customer is relieved of security matters.
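The mediation role of the grid service provider described above can be pictured with a small sketch. All names here (`GridServiceProvider`, `GridBackend`, `submit`, the credential string and the customer and application labels) are hypothetical and chosen for illustration; they do not correspond to Services@MediGRID software or to any grid middleware API. The point is only that the customer hands over inputs while the provider holds the grid credential.

```python
class GridBackend:
    """Stand-in for a grid or cloud infrastructure (hypothetical, no real middleware)."""

    def run(self, application, inputs, credential):
        if credential is None:
            raise PermissionError("grid access requires a credential")
        # A real backend would schedule a job; this stub only counts the inputs.
        return {"application": application, "processed": len(inputs)}


class GridServiceProvider:
    """Mediator that holds the grid credential on behalf of its customers."""

    def __init__(self, backend, credential):
        self._backend = backend
        self._credential = credential   # never handed out to the customer

    def submit(self, customer, application, inputs):
        # The customer supplies only inputs; security handling stays with the GSP.
        result = self._backend.run(application, inputs, self._credential)
        return {"customer": customer, "result": result}


gsp = GridServiceProvider(GridBackend(), credential="gsp-certificate")
order = gsp.submit("clinic-a", "haplotype-analysis", ["sample1", "sample2"])
print(order["result"]["processed"])   # 2
```

Keeping the credential inside the provider object is the design choice that matters here: the customer-facing `submit` call has no security parameters at all.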
What this grid service provider offers to the customer is a full service that includes feeding the grid application with the customer’s inputs, executing whatever grid or cloud job the customer wants to perform, taking care of the necessary deployment of the application on suitable grid or cloud resources and returning the requested result to the customer. All security requirements for grid users are therefore met by the grid service provider. This also offers the opportunity to combine the grid service with additional services, e.g. the sequencing of biomaterials, which is quite commonly necessary anyway in order to generate the inputs for bioinformatics applications. It also offers an obvious solution to the problem of who should offer the grid services.
2.2 Analyzing Applications and Use Cases
The analysis of grid or cloud applications for the purpose of creating end user services around them requires a systematic approach to establishing the procedural and technical components, not only but particularly in the areas of health care and life sciences. The goal to keep in mind is a) to provide a service that customers embrace and b) to formulate a suitable business model. In order to do so, it is also necessary to keep in mind the possible business models for all of the services involved in the final provision of a grid or cloud application, from the definition of the task until the return of the desired results and the allocation of costs. A look at the basic roles involved in such a service provision, as shown in Fig. 1, gives a hint of the interactions and interdependencies. The experience of the Services@MediGRID project shows that, in the beginning, scientists usually develop biomedical applications that solve specific scientific tasks or problems.
Usually the application developers do not have much expertise in grid environments, in the requirements of application or service deployment, in the maintenance of grid applications and services, or in the interaction with grid information services, monitoring or accounting. Besides these technical obstacles, they usually lack the capacity to provide a grid application service on a mid- or long-term basis, in terms of both IT resources and human resources. In addition, they lack the commercial point of view that is necessary for providing a sustainable service. Interviews and surveys with application developers therefore need to concentrate on the technical specifics of the applications as well as on the surroundings and possible service providers that might take care of commercial deployment and operation. Even for scientific business models, a commercial perspective in terms of costs and benefits is key to convincing the funding agencies. It also turned out that scientific applications often have a large potential to benefit the clinical treatment or prevention of diseases, potentially saving huge costs in public health care and increasing the willingness to fund them. At the same time the providers of these applications tend to focus too much on the specifics of their solution and lose sight of the bigger picture. Therefore, professional analysis and consulting combined with a systematic engineering of services and business cases is highly recommended.
[Figure: layered roles from the customer down to the hardware: Customer / Customer Organisation; Provider of Customer Services (mediator between providers and customers, contractor for SLAs); Application Services Provider (SaaS provider, operator of SOA or grid services); Platform Services Provider (e.g. accounting, billing, brokering, SLA management, application container); Middleware Provider (virtualisation and middleware, e.g. grid, SOA, OS virtualisation); Resource Provider (hardware operation, maintenance, dynamic allocation). SaaS = Software as a Service, PaaS = Platform as a Service, IaaS = Infrastructure as a Service.]
Fig. 1. Roles in the grid value chain
3 Engineering of Services Using Grid Applications
The research discipline of service engineering deals with the systematic development of business services using standardized methods, as is common in the development of physical goods in mechanical engineering, e.g. automobile parts [2]. The major benefit of service engineering is that it provides a clearly structured, methodical approach to the development of new services. It comprises the investigation of customer preferences, of the interactions between service components and of user interfaces. The different views on service processes, product flows, resources and actors help to find the way from a useful application to a commercially successful service that also takes operating effort and costs into account. Service engineering thus lays the foundation for business models and business plans. Systematic service engineering methods also provide a valuable toolset for identifying obstacles such as those described in section 2 in advance and for devising solutions before problems arise. A first step is the identification of all process steps necessary for the use of the application that is to be provided. This includes everything from obtaining a PKI certificate and providing or installing user interfaces to the deployment of applications, the communication with grid services or resources and the final billing of resource usage. Service blueprints [3] are a special form of process model that clearly visualizes the roles involved in these processes as well as the depth of their interaction. This is particularly important for the interaction between the customer and service providers. Identifying not only the line of interaction but also the line of visibility, i.e. the boundary of those process steps of which the user is aware, defines the basis for optimizing the service processes.
As a result of the analysis of the grid applications in connection with the survey of application developers, as described in section 2, service blueprints can be derived. Fig. 2 gives an example service blueprint that shows the different types of symbols used for process steps, sub-processes and decisions, and the connectors between these symbols. The different participants in the processes are represented by their roles in the process. For each role at least one horizontal swim lane is reserved. From top to bottom, the visibility of process steps decreases for the role in the uppermost swim lane. The lines of interaction and visibility show the respective event horizons for the top role, which is in most cases the customer. If one role performs both actions that are visible and actions that are invisible to the customer, there may be several swim lanes for that role. The lines of interaction and visibility are particularly important because the main goal of service engineering should be to reduce the complexity for the customer as much as possible in order to lower the barriers to adopting the new service. The complexity of grids or clouds is not in itself negative, as long as the end user is not involved in it.
[Figure: example blueprint with three swim lanes: Role: Customer; Role: Service Provider (visible for customer); Role: Service Provider (not visible for customer).]
Fig. 2. Legend for the understanding of service blueprints
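The swim-lane structure described above can be made concrete with a small data model. The following is only a minimal sketch under stated assumptions: the class names, the three-lane split and the step names are hypothetical and are not taken from the Services@MediGRID blueprints. It counts how many steps lie above the line of interaction and how many lie above the line of visibility, which are the two quantities the optimization tries to keep small.

```python
from dataclasses import dataclass
from enum import Enum


class Lane(Enum):
    """Swim lanes, ordered from the customer downwards (assumed three-lane layout)."""
    CUSTOMER = 0            # actions performed by the customer
    PROVIDER_VISIBLE = 1    # provider actions the customer can see
    PROVIDER_HIDDEN = 2     # provider actions below the line of visibility


@dataclass(frozen=True)
class Step:
    name: str
    lane: Lane


def customer_complexity(steps):
    """Return (steps above the line of interaction, steps above the line of visibility)."""
    interacting = sum(1 for s in steps if s.lane is Lane.CUSTOMER)
    visible = sum(1 for s in steps if s.lane is not Lane.PROVIDER_HIDDEN)
    return interacting, visible


# Hypothetical blueprint fragment; the step names are illustrative only.
blueprint = [
    Step("Place order on web site", Lane.CUSTOMER),
    Step("Confirm order", Lane.PROVIDER_VISIBLE),
    Step("Deploy application on grid resource", Lane.PROVIDER_HIDDEN),
    Step("Run grid job", Lane.PROVIDER_HIDDEN),
    Step("Download results", Lane.CUSTOMER),
]

interacting, visible = customer_complexity(blueprint)
print(f"customer steps: {interacting}, visible steps: {visible}, total: {len(blueprint)}")
```

Comparing these counts before and after an optimization round gives a simple indicator of whether complexity has actually been moved away from the customer.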
When the processes have been modeled as service blueprints, they need to be discussed with the application developers. Potential candidates for all the necessary roles, and the resources behind them, need to be identified. The processes need to be viewed from the end user perspective as well as from the provider perspective and need to be simplified as much as possible, particularly, but not only, for the customer. Service blueprinting is therefore itself an iterative process in which the services are optimized continuously. During the optimization the ingredients required for possible business models become visible. This in turn lays the groundwork for the business models described in section 4. As there are typically many dependencies between different services provided by different players with different roles, establishing business models in grid or cloud environments requires detailed knowledge of all process steps after their optimization. In addition to the process view, which is only one part of the service engineering toolset, the views on a) the results produced by a service offer and b) the resources needed to provide a certain service have to be taken into account [5]. The resources include the potentials, abilities and willingness of the different participants as well as the technical prerequisites in terms of infrastructure and the skills needed for the different roles. These additional views can, and need to, be added as further dimensions to the process view described by the service blueprints. As the processes, and therefore also the service blueprints, for the services in Services@MediGRID are highly complex, only small fragments can be displayed here for lack of space, and the visualizations concentrate on the process view.
[Figure: service blueprint "Application / Service Deployment and Maintenance" with swim lanes for the Application Developer, the Grid Information System (service registry) and the Grid Resource Provider (GRP), separated by a line of interaction and a line of visibility. Developer-side steps shown include: application development; use of the web interface for registration of applications / services; remote login on a grid resource as VO-SWAdmin; copy of the application to the remote resource; remote compilation, test, configuration and start of the application; a loop on error / update / upgrade; and, at the end of the lifecycle, deletion from the grid information systems and deinstallation of the application. Provider-side steps shown include: provision of the web interface for registration of applications / services; import of role information from SAMLVOMS; creation of software installation directories for VOs; application of access rights for users of role SW-Admin on the VO software directories; registration with the service registry / grid information system; provision of resource information for broker and workflow services; deletion from the service registry / grid information system.]
Fig. 3. Deployment and Maintenance of grid applications and services
In the
following subsections, two common sub-processes that are necessary for all use cases are described: a) the deployment and maintenance of grid applications and b) the accounting and billing process.
3.1 The Case for Grid Service Providers
A look at the process of deployment and maintenance of grid applications, as displayed in Fig. 3, makes it obvious why end users are usually overwhelmed by taking care of the applications themselves. In all application areas where end users are not automatically IT experts, process steps such as compiling an application for a remote system, configuring an application on a remote system and inserting application or service information into remote grid information and monitoring systems far exceed their abilities. Even for application developers this is a real challenge, as the coaching workshops for application developers in the predecessor project MediGRID² have shown. Therefore service providers are needed between the user and the grid. They should bring the required technical expertise and make these processes transparent for the end user and customer. The early-adopter use case, in which user, application developer and application deployer are the same person, is a special case in which several roles are united in one person. This can be useful in application areas where scientists write their own program code, modify it constantly and need to deploy it ad hoc on a regular basis. However, it places great demands on the skills and abilities of these users. For the end user clientele in the Services@MediGRID project this special case was not applicable.
² The MediGRID project was funded by the German Federal Ministry of Education and Research under the registration mark 01AK803B.
3.2 Accounting and Billing
A central question in the engineering of any service and business model is how the usage of resources and the working time of service providers can be measured, accounted for and finally billed. In the general service blueprint derived for the accounting and billing process, the number of process steps in the customer’s swim lane could be reduced to just four, so the customer’s direct involvement in the process is reduced to a minimum. The customer only needs to register with the provider on a web site, place an order via a web site and download the results, again via a web site. Finally the customer receives the bill for the order (by post, via e-mail or in the web portal account) and can pay it using conventional payment methods such as credit card payment, bank transfer or direct debit. The visibility is likewise reduced to the minimum of information needed for these direct interactions. By far the larger part of the process is hidden from the final customer. This part deals with the interactions between grid services, such as technical accounting at the resource level [6] and functional accounting on an aggregated dataset [7], from which correlations with certain grid jobs are made, as well as with the issuing of invoices. The accounting and billing process resulting from the service blueprint optimization allows the providers in the MediGRID infrastructure to run grid jobs under their own identity and therefore to receive the invoices from all grid resources and services. As they can send job parameters such as a customer pseudonym along with their grid jobs, the accounting daemons are able to detect and record these.
The usage records for the provider thus contain not only the information on the provider’s resource usage but also on the customers for whom these resources were consumed. The functional accounting process interprets the usage records, divides them into subsets for each customer and correlates them with the contracts and service level agreements between the service provider and its customers. This allows a detailed view of the actual customer-related grid usage and therefore customer-specific billing of grid jobs. As only pseudonyms are attached to the jobs as job parameters, the privacy of the customers is always preserved. The conditions of payment, be it pay-per-use, flat rates, volume models or anything else, are not fixed by this accounting and billing process. A service provider who chooses to make fixed-price offers to customers can still do so: the detailed information derived from this kind of accounting makes it possible to calculate such fixed prices from the history of usage records and from experience.
3.3 Service and Provider Interactions in Grid Jobs
A very complex service blueprint, which cannot be displayed here due to its size, is the detailed process view of the execution of a grid job by a grid service provider on behalf of a customer. It shows all the different interactions between resource brokers, workflow systems, schedulers and meta-schedulers [8], monitoring services [9], grid information services [10], resource providers and content providers. The latter are an important aspect in biomedical use cases, as large databases with value-added content are of great significance here. The service blueprint also contains the accounting and billing sub-processes. This service blueprint makes it obvious that, due to the large number of different service and resource providers, the payment process needs to be redefined. The users of these services currently need service level agreements (including pricing and payment regulations) with a huge number of providers. In the German national grid infrastructure “D-Grid”, resource providers are beginning to join forces in an alliance of resource providers. The same needs to be done for the providers of infrastructure services such as the ones named above. Suitable billing schemes can be found in the telecom market, where call-by-call providers do not send their invoices directly to the customer [11]. Instead, one of the providers aggregates the invoices so that the customer receives only one bill and needs to pay only once a month. The service blueprint illustrates this problem clearly. By systematically analyzing applications, infrastructure and service environments and correlating this analysis with the views on service results and resources, service engineering can be of great value in defining and designing use cases and services in connection with grid applications and infrastructures.
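The functional accounting step of section 3.2 (splitting the provider's usage records by customer pseudonym and correlating them with contracts) can be illustrated with a short sketch. This is not the D-Grid accounting implementation: the record fields, the pseudonyms, the two contract types and all prices are assumptions made purely for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    resource: str        # resource that reported the usage
    pseudonym: str       # customer pseudonym sent along as a job parameter
    cpu_hours: float


def usage_per_customer(records):
    """Split the provider's usage records into per-pseudonym CPU-hour totals."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec.pseudonym] += rec.cpu_hours
    return dict(totals)


def bill(totals, contracts):
    """Apply each customer's contract; only pay-per-use and flat rate are modelled."""
    invoices = {}
    for pseudonym, cpu_hours in totals.items():
        model, price = contracts[pseudonym]
        invoices[pseudonym] = price if model == "flatrate" else price * cpu_hours
    return invoices


# Hypothetical records and prices, chosen only for illustration.
records = [
    UsageRecord("cluster-a", "cust-017", 4.0),
    UsageRecord("cluster-b", "cust-017", 6.0),
    UsageRecord("cluster-a", "cust-042", 2.5),
]
contracts = {"cust-017": ("pay_per_use", 0.5), "cust-042": ("flatrate", 100.0)}

totals = usage_per_customer(records)
invoices = bill(totals, contracts)
print(totals)
print(invoices)
```

Because only pseudonyms appear in the records, the mapping from pseudonym to real customer stays with the service provider, which mirrors the privacy argument made in the text.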
4 Deriving Grid Business Models
The systematic engineering of services as described in section 3, together with the preceding analysis of grid applications and the surveys of application providers, already supplies much of the information needed for a suitable business model. Modeling the business relations and monetary flows between the different roles in distributed IT service environments, as shown in Fig. 4, gives a first overview of cost factors and income sources in connection with a grid service offer. Together with the application developers, a large part of the ingredients of a business model [12] can be derived. The definition of the value proposition and the creation of a competitive advantage are the central issues and results of the iterative refinement of services during the service engineering process. Market opportunities and the competitive environment are also relatively easy to derive together with the application developers. In terms of market strategy, the first and most important factor is identifying a suitable and willing service provider who is able to deal with the technicalities of grid environments while at the same time speaking the language of the customers. In biomedical use cases a good approach is to combine the grid service with an already established biomedical service, such as the previously mentioned sequencing service for the analysis of bio materials. Once such a provider has been identified, it is easy to concentrate on the necessities of organizational development. Again, the service engineering process provides valuable input for the business models, as the view on the resources, including the potentials, abilities and willingness of the different participants, already contains the necessary information. It also provides input on the required experience and background of the management team and the people involved in delivering the final services.
[Figure: roles and the pricing labels attached to them, from top to bottom: Customer / Customer Organisation; Provider of Customer Services (bundle: application plus calculation in the grid; fixed prices / flat rates / pay per use …); Application Provider (SaaS / application licenses; flat rates / pay per use(r) …); Platform Services Provider (flat rates / pay per use(r) …); Resource Provider / Infrastructure as a Service (on-demand resource usage; pay per use: CPU/h + GB/month / per screening / per sequencing …).]
Fig. 4. Business Relations and monetary flows between the different roles in distributed IT service environments
The difficulties, however, start with calculating the costs of the services further down the value chain, on which the revenue model needs to be based. Cost factors such as the provider’s own infrastructure and labor costs are only one part of the overall costs, as the service provider needs to acquire other infrastructure and platform services. In current grid environments the providers of resources and of basic services such as monitoring or accounting are largely dependent on government funding linked to grid projects, and they currently show only faint signs of developing pricing models that would allow a clear calculation of costs further up the stack of grid providers. Unfortunately, the service providers described in this paper are at the top of the grid stack, with software services, platform services and infrastructure services below them. As they depend on the prices and preceding cost calculations of the lower layers of this stack, the market can only develop from the bottom up. Cloud computing is itself a good example of this development, as it started out by taking the provision of infrastructure as a service out of the grid concept and building successful business models on it. As soon as Amazon’s Elastic Compute Cloud (EC2) and Simple Storage Service (S3) had emerged, other providers of this infrastructure as a service (IaaS) model appeared on the market. Soon afterwards, platform as a service (PaaS) offers emerged, providing not only the infrastructure but also an environment for the easy provision of services. These platforms can in turn be used for software as a service (SaaS) offers, which finally fill the gap described above between the grid service providers and the lower layers of the grid stack. Although these platform services are commercially available, there is still a lack of successful grid business models and applications. The reasons for this are twofold.
First, a look at the resource view shows that the commercially available platform services do not yet meet the technical requirements, while the scientific platforms that would meet them do not provide cost calculations. Second, the skills on the provider side still need to be established, as this is a complex field with as yet too few experts. Both will, however, evolve in the near future, providing new models for distributed IT services, possibly combined with classic services. As described in section 3, the systematic engineering of services can provide a powerful means of approaching this goal.
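The dependency of the top-level service price on the cost calculations of the lower layers can be shown with a few lines of arithmetic. The sketch below assumes a pay-per-use infrastructure tariff of the CPU/h + GB/month kind named in Fig. 4 and a simple cost-plus margin; the tariff structure beyond that, the function names and all numbers are hypothetical.

```python
def pay_per_use_cost(cpu_hours, gb_months, cpu_rate, storage_rate):
    """Infrastructure cost under a CPU/h + GB/month tariff."""
    return cpu_hours * cpu_rate + gb_months * storage_rate


def service_price(infrastructure, platform_fee, own_cost, margin):
    """Price at the top of the stack: lower-layer costs plus own cost, marked up."""
    return (infrastructure + platform_fee + own_cost) * (1 + margin)


# All figures below are hypothetical and serve only to show the arithmetic.
infra = pay_per_use_cost(cpu_hours=200, gb_months=50, cpu_rate=0.10, storage_rate=0.20)
price = service_price(infra, platform_fee=15.0, own_cost=40.0, margin=0.25)
print(f"infrastructure: {infra:.2f}, service price: {price:.2f}")
```

If any of the lower-layer inputs is unknown, for example because a publicly funded resource provider publishes no tariff, the top-level price cannot be computed, which is the bottom-up dependency described above.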
References
1. Sax, U., Mohammed, Y., Viezens, F., Rienhoff, O.: Grid-Computing in der biomedizinischen Forschung, Datenschutz und Sicherheit. Urban & Vogel, München (2006)
2. Bullinger, H.-J., Scheer, A.-W.: Service Engineering: Entwicklung und Gestaltung innovativer Dienstleistungen. Springer, Berlin (2003)
3. Shostak, L.: How to Design a Service. In: Donelly, J.H., George, R.W. (eds.) Marketing of Services, Chicago, pp. 221–229 (1981)
4. Falkner, J., Weisbecker, A.: Integration of Applications in MediGRID. In: Bubak, M., Turala, M., Wiatr, K. (eds.) Proceedings of the Cracow Grid Workshop 2006. CYFRONET AGH, Krakow (2007)
5. Meiren, T., Barth, T.: Service Engineering in Unternehmen umsetzen – Leitfaden für die Entwicklung von Dienstleistungen. Fraunhofer IRB Verlag, Stuttgart (2002)
6. Wiebelitz, J., Brenner, M.: Konzept für das Accounting im D-Grid, Version 1.0, D-Grid, DGI-2 Fachgebiet 5.2 (2008)
7. Brenner, M., Wiebelitz, J.: Accounting von vermittelten Grid-Jobs, Version 1.0, D-Grid, DGI-2 Fachgebiet 5.2 (2008)
8. Wieczorek, M., Hoheisel, A., Prodan, R.: Towards a General Model of the Multi-criteria Workflow Scheduling on the Grid. In: Future Generation Computer Systems. Elsevier, Amsterdam (2008)
9. Baur, T., Bel Haj Saad, S.: Virtualizing Resources: Customer Oriented Cross-Domain Monitoring for Service Grids. In: Moving from Bits to Business Value: Proceedings of the 2007 Integrated Management Symposium. IFIP/IEEE (2007)
10. Wolf, A.: D-GRDL – Beschreiben und Auffinden von Ressourcen in IT-Umgebungen, http://www.enterprisegrids.fraunhofer.de/Images/Flyer-FraunhoferFIRST-D-GRDL_de_tcm267-80518.pdf
11. Rothfuss, T.: Umsetzungskonzept für ein Framework zur Leistungsverrechnung in verteilten Systemen auf der Basis der Anforderungen in D-Grid, pp. 39–45. Thesis (2007)
12. Laudon, K.C., Traver, C.G.: E-commerce: business. technology. society, 2nd edn. Prentice Hall, Boston (2004)
Visualization in Health Grid Environments: A Novel Service and Business Approach
Frank Dickmann1, Mathias Kaspar1, Benjamin Löhnhardt1, Nick Kepper2,3, Fred Viezens4, Frank Hertel4, Michael Lesnussa2, Yassene Mohammed5, Andreas Thiel6, Thomas Steinke7, Johannes Bernarding4, Dagmar Krefting8, Tobias A. Knoch2,3, and Ulrich Sax9
1 Department of Medical Informatics, University of Göttingen, Robert-Koch-Straße 40, 37075 Göttingen, Germany
2 Biophysical Genomics, Dept. Cell Biology & Genetics, Erasmus MC, Dr. Molewaterplein 50, 3015 GE Rotterdam, The Netherlands
3 Biophysical Genomics, Genome Organization & Function, BioQuant Center/German Cancer Research Center, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
4 Otto-von-Guericke University, Institute for Biometrics and Medical Computer Science, Leipziger Str. 44, 39120 Magdeburg, Germany
5 RRZN – Regional Compute Centre for Lower Saxony, Leibniz Universität Hannover, Schloßwender Straße 5, 30159 Hannover, Germany
6 OFFIS / R&D Division Health, Escherweg 2, 26121 Oldenburg, Germany
7 Zuse Institute Berlin (ZIB), Takustrasse 7, 14195 Berlin-Dahlem, Germany
8 Department of Medical Informatics, Charite - University Medicine, Campus Benjamin Franklin (CBF), Hindenburgdamm 30, 12200 Berlin, Germany
9 Department of Information Technology, University Medicine Göttingen, Robert-Koch-Straße 40, 37075 Göttingen, Germany
{fdickmann,mathias.kaspar,benjamin.loehnhardt}@med.uni-goettingen.de, [email protected], {fred.viezens,frank.hertel,johannes.bernarding}@med.ovgu.de, [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Abstract. Advanced visualization technologies are gaining major importance to allow presentation and manipulation of high dimensional data. Since new health technologies are constantly increasing in complexity, adequate information processing is required for diagnostics and treatment. Therefore, the German D-Grid initiative started to build visualization centers in 2008, which have recently been embedded into the existing compute and storage infrastructure. This paper describes an analysis of this infrastructure and the interplay with life science applications for 3D and 4D visualization and manipulation. Furthermore, the performance and business aspects regarding accounting, pricing and billing are investigated. The results show the viability and the opportunities for further optimization of this novel service approach and the possibilities for a sustainable business scenario.
Keywords: MediGRID, distributed visualization, accounting and billing, telemedicine, service business model.
J. Altmann, R. Buyya, and O.F. Rana (Eds.): GECON 2009, LNCS 5745, pp. 150–159, 2009. © Springer-Verlag Berlin Heidelberg 2009
1 Introduction
The MediGRID community has represented biomedical life science users as a part of the D-Grid¹ initiative in different projects since 2005. A grid computing infrastructure has been built according to healthcare requirements as well as biomedical research applications ranging from medical imaging to genome analysis [1]. Besides the initial MediGRID project, the Services@MediGRID, MedInfoGRID, PneumoGRID, Gap-SLC and WissGrid projects and the Medical Grid Forum are working on further grid aspects within the life sciences². While grid infrastructures are capable of processing biomedical data of increasing complexity, new and adequate approaches to presenting the related data are necessary because existing grid services do not tackle this issue. The presentation approach concerns the visualization of high dimensional data. Here, multiple dimensions are presented in 2D or 3D, including temporal changes, in order to reduce the level of complexity [2]. This takes advantage of the fact that human visual perception can handle complex structures, coherencies and dependencies more easily [3]. With the new generations of high performance graphics chips, data presentation for professional visualization purposes has become affordable, e.g. in Cave Automatic Virtual Environments (CAVE™), but is not yet widely available. For professional multidimensional output in a workplace environment, 3D autostereoscopic screens offer perceivable depth by using parallax barrier technology [4]. In addition, present medical IT products do not support interfaces to the grid, and expensive hardware is required, e.g. Picture Archiving and Communication Systems (PACS) for displaying medical imaging data. While grid computing can offer the required storage and compute power, high quality graphics output is still rare.
Nevertheless, visualization nodes with high-end graphics capabilities integrated within the grid computing infrastructure can provide high quality graphics output to a broader range of scientists. This approach follows Foster and Kesselman’s original idea of offering visualization devices through the grid, as in the case of radio telescopes [5]. Thus, the goal of the solution presented in this paper is to supply scientists and researchers with an on-demand high performance graphics and compute platform. The German life science grid community is tackling this issue by establishing a grid based infrastructure for visualization resources. These resources are high performance grid nodes equipped with special high-end graphics output devices. Since the beginning of 2009 the community projects MediGRID, Services@MediGRID and MedInfoGRID have set up grid clusters enhanced by high-end graphics output devices in Berlin, Göttingen and Heidelberg in Germany. The sites in Berlin and Göttingen are also equipped with 3D autostereoscopic screens. Here, we describe and analyze the current infrastructure with respect to the interaction with life science applications for visualization and manipulation, in terms of the technical feasibility and performance of this concept. The sustainability of such a complex service depends on its translation into adequate business models regarding accounting, pricing and billing. The results show further opportunities for optimization and that the complexity can be handled and used for a sustainable business scenario in the life sciences with their complex needs.
¹ D-Grid initiative: http://www.d-grid.de
² MediGRID: http://www.medigrid.de, Services@MediGRID: http://services.medigrid.de, MedInfoGRID: http://www.medinfogrid.de, Grid forum: http://www.tmf-ev.de. The other projects have been initiated recently and their websites are not available yet.
2 Visualization in the Grid
In the following we describe (a) the technical specifications of the present installations and (b) a performance analysis that shows the advantages of this infrastructure by using life science applications.
2.1 Hardware, OS and Software Specifications
The visualization resources in Berlin, Göttingen and Heidelberg are based on HP Scalable Visualization Array (SVA) / XC cluster technology [6]. Each node of the cluster is configured with two NVIDIA Quadro FX5600 cards. To serve different visualization workloads, each cluster is built from two types of nodes: a) 8x HP xw8600 workstations for default and MPI-parallelized visualization applications (e.g. based on MPI and VTK), and b) one or two HP ProLiant DL785 32-core servers equipped with 128 or 256 GB RAM. 3D autostereoscopic screens are installed in research labs at the Charité and ZIB (Berlin) and at the Department of Medical Informatics (Göttingen). The screens, with diagonals of 42 inches (Full HD resolution of 1920x1080 pixels) and 27 inches (WUXGA resolution of 1920x1200 pixels), support up to five viewers simultaneously. The 3D autostereoscopic screens are connected to the cluster resources through KVM switches (EVS-4 by Thinklogical) over optical fiber and are therefore stationary. The Remote Graphics Software (RGS) within the HP SVA implementations consists of a sender component and a receiver component. Its functionality is similar to that of well-known client/server remote desktop products such as Virtual Network Computing (VNC). It can transmit full HD resolution even through a low bandwidth network with minimal loss of performance and quality. RGS achieves this by advanced image compression methods combined with an additional user option to adjust the picture quality level. In theory, an RGS session can be viewed by an unlimited number of guests. To obtain 3D autostereoscopic output from OpenGL based applications, TechViz XL was chosen.
TechViz transforms the OpenGL stream into an autostereoscopic output and sends it directly to the graphics card driver.
2.2 Integration into the D-Grid Environment
To integrate the visualization resources into the D-Grid computing environment, the required MediGRID grid middleware stack had to be installed. This includes, for example, the installation of the Globus Toolkit on the head node of the visualization clusters. The batch system preconfigured on XC-SVA for visualization jobs is based on SLURM/LSF and was made interoperable with Globus. Application workflows of the MediGRID Virtual Organization (VO) are handled by the Grid Workflow Execution Service (GWES). Visualization workflows impose additional constraints regarding usage time (usually daily working hours). To handle these constraints, GWES still needs further optimization.
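The usage-time constraint mentioned for visualization workflows can be pictured as a simple admission check. The sketch below is not part of GWES or of any D-Grid component; the function name, the 08:00 to 18:00 window and the weekday rule are assumptions, since the text only speaks of "usually daily working hours".

```python
from datetime import datetime, time, timedelta

# Assumed working hours; the source only says "usually daily working hours".
WORK_START = time(8, 0)
WORK_END = time(18, 0)


def fits_working_hours(start: datetime, duration: timedelta) -> bool:
    """True if the session lies on one weekday inside the working-hours window."""
    end = start + duration
    return (
        start.weekday() < 5                 # Monday to Friday
        and start.date() == end.date()      # does not run past midnight
        and start.time() >= WORK_START
        and end.time() <= WORK_END
    )


# 24 August 2009 was a Monday.
print(fits_working_hours(datetime(2009, 8, 24, 9, 0), timedelta(hours=2)))   # True
print(fits_working_hours(datetime(2009, 8, 24, 17, 0), timedelta(hours=2)))  # False
```

A workflow engine could apply a check of this kind before reserving a visualization node, whereas ordinary compute jobs would not need it.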
Visualization in Health Grid Environments: A Novel Service and Business Approach
3 Performance Analysis

To test the feasibility of this new grid approach, a test bed was defined using three life science applications: 3D-Slicer [7], the GLOBE 3D Genome Browser [8], and Visual Molecular Dynamics (VMD) [9]. The analysis includes local and distant collaboration tests and a performance test.

3.1 Test Bed Applications

3D-Slicer is a free, open source visualization and image processing tool designed for medical imaging [10]. Its BSD license allows unrestricted commercial use. 3D-Slicer can represent multi-dimensional medical data, time series of volume data, multi-modal images and simulated data. It also allows the export of visualization scenes in standard formats and is extensible by external stand-alone applications.

The GLOBE 3D Genome Browser is a grid-based virtual "paper tool" developed for the analysis, manipulation and understanding of multi-dimensional genomic data in a 3D environment. The Genome Browser is designed for research on genomic information in a holistic manner.

VMD [11] is a tool designed to visualize, model and analyze biological systems such as proteins, nucleic acids and lipid bilayer assemblies. VMD can be used to animate and analyze trajectories of molecular dynamics (MD) simulations. It can also act as a graphical front end for an external MD program by displaying and animating a molecule undergoing simulation on a remote computer.

3.2 Test Bed Parameters for Collaboration and Performance Tests

The test bed focuses on the rendering speed and on the usability of the applications in terms of latency. Therefore, the frames per second (FPS) were measured in the different tests. The influence of the 3D autostereoscopic rendering by TechViz and of the RGS quality setup was considered. The RGS performance was compared to that of a Lenovo ThinkPad T60 laptop (1.83 GHz CPU; ATI Mobility Radeon X1300 with 64 MB dedicated memory; 1 GB RAM), chosen as a very common computer model.
For the 3D tests the autostereoscopic monitors were used at full resolution; RGS and the T60 were tested at an SXGA resolution of 1280x1024 pixels. TechViz was used for the 3D output. Because 3D-Slicer lacks an FPS indicator, it was tested by judging the level of latency. 3D-Slicer is also not compatible with TechViz. All three applications used only one CPU core.

3.3 Collaboration and Performance Tests

The perception of depth on the 3D displays was successfully tested for up to five viewers with VMD and the Genome Browser. The models of the Genome Browser appear to be more realistic. Distant collaboration via RGS was successfully tested over a 1 GBit/s network, a 54 MBit/s wireless LAN and an asynchronous private internet access network. In the wireless and high-bandwidth cable environments, transmission at 100% quality is possible, although this depends on the network load and the screen resolution. Sharing the graphics output with two or more guests does not affect the RGS performance significantly.
F. Dickmann et al.
The performance of 3D-Slicer was tested with the SPL Abdominal Atlas model [12] in three test runs: i) conventional layout, ii) 3D-only layout, and iii) volume visualization. The cluster offered good overall performance without latency, except for iii) when displayed with RGS. The laptop showed no latency for i), showed latency but remained usable for ii), and showed latency too high to be usable for iii). The use of RGS reduces the perceived performance.

For the Genome Browser, three test runs were defined: i) 1 chromosome, ii) 3 chromosomes, and iii) 20 chromosomes. As with 3D-Slicer, the use of RGS reduces the performance: the measured FPS indicated a reduction of 41 percent on average at SXGA. While there was no significant performance difference between the two test resolutions, a reduction of 95 percent on average was measured when using WUXGA with TechViz. On average, the cluster was 3442 percent faster than the laptop at SXGA when using RGS.

The VMD tests were performed using the X-ray diffraction structure of xylose (glucose) isomerase from Actinoplanes missouriensis (PDB-ID 9XIM). Three test runs were applied to this structure: i) no zoom, ii) continuous rotation, and iii) 5x zoom with rotation. The measured FPS indicated a performance decrease of 33 percent on average between WUXGA and SXGA. Furthermore, the FPS decreased by 54 percent on average with RGS enabled at SXGA. Applying TechViz at WUXGA reduced the FPS by 87 percent on average. The cluster was 91 percent faster on average than the laptop at SXGA when using RGS.
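To make the reported percentages easier to compare, the following minimal Python sketch converts them into multiplicative factors. It assumes that "X percent faster" means (cluster FPS - laptop FPS) / laptop FPS and that "reduction of X percent" is relative to the unreduced frame rate; the text does not state these definitions explicitly, and the function names are hypothetical.

```python
def remaining_fraction(reduction_percent: float) -> float:
    """Fraction of the original frame rate left after a percentage reduction."""
    return 1.0 - reduction_percent / 100.0


def speedup_factor(faster_percent: float) -> float:
    """Multiplicative speed-up implied by 'X percent faster' (assumed definition)."""
    return 1.0 + faster_percent / 100.0


# Percentages as reported in the performance tests.
genome_browser_with_rgs = remaining_fraction(41)      # RGS at SXGA
genome_browser_with_techviz = remaining_fraction(95)  # TechViz at WUXGA
genome_browser_cluster_vs_laptop = speedup_factor(3442)
vmd_cluster_vs_laptop = speedup_factor(91)
```

Under these assumptions, the Genome Browser retains roughly 59 percent of its frame rate with RGS and about 5 percent with TechViz at WUXGA, while the cluster is roughly 35 times as fast as the laptop for the Genome Browser and roughly 1.9 times as fast for VMD.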
4 Accounting, Pricing and Billing Approaches

According to our performance study, visualization services can be offered via resources that are integrated into the grid. Accounting determines the pricing and billing of the services and is to be defined by possible use cases. The infrastructure costs are reflected in the service prices. Thus, the service supplier structure needs to be analyzed. For billing, prices have to be combined with appropriate metrics that measure the actual usage of each service. Given the distributed and complex character of the grid, accounting, pricing, and billing of visualization services need to be integrated into the technical and organizational structure of D-Grid.

4.1 Implementation of Accounting in D-Grid

Accounting in D-Grid is based on the Distributed Grid Accounting System (DGAS) [13] and encompasses participating D-Grid resource providers, infrastructure resources and services. Resource usage is collected locally and transmitted to the central DGAS service at the RRZN. Accounting information is accessible at the user, VO, site and D-Grid supervisor (ROC-Manager) levels via the HLRmon website and the DGAS command line tools. Each level aggregates the information, which can be filtered down to a per-job view. HLRmon and the DGAS command line tools are web-based and use certificate-based authentication and authorization. Currently, the implementation focuses on CPU-related accounting metrics. Metrics for storage accounting are planned [14].
To establish legally liable relationships among resource providers, general service providers in D-Grid, and professional service providers, who will be represented by a Virtual Organization, service level agreements (SLAs) are required [15]. SLAs will offer reliability for the professional service providers as well as for the users. Because a VO is not legally liable, a responsible institution is required to act as the partner of SLA contracts on behalf of the VO. Professional service providers can then distribute their grid-based IT services and fulfill the requirements of their customers, the users. SLAs between the VO MediGRID and the resource providers, as well as with the general service provider RRZN for accounting, are in progress.

4.2 Use Cases for Visualization Services

According to the technical specifications and the designated workflow, there are eight visualization use cases in the grid: local and distant remote usage, each with and without additional compute grid tasks, and each with and without sharing the graphics output for collaboration purposes.
Fig. 1. Use cases as a value chain of visualization in the grid (stages shown: Compute & Storage, Local / Remote Visualization, Compute & Storage)
The use case of local usage of visualization resources includes the stationary 3D autostereoscopic display of applications and local usage via RGS. For remote visualization only RGS is applicable (see Section 3.2). Further computational grid tasks can be performed before and after the visualization itself in order to prepare data sets for visualization and to analyze interactively modified data from a visualization session. These use cases are sequential and can be described using a value chain [16] (Fig. 1).

4.3 Metrics Definition for Accounting of Visualization Services

Two parameters are usually used for the accounting of consumed computing power: the CPU time and the wall clock time [17]. The wall clock time measures the runtime of a job on a node regardless of whether the CPU is fully occupied at all times. The real-time character of visualization jobs requires accounting for the complete time a node is occupied, because the HP SVA allows only one session per node and other visualization jobs therefore cannot be processed on the same node. Thus, the complete session time of a visualization job needs to be measured and represents the accounting parameter for interactive visualization jobs processed by grid resources. Additionally, the pre-login time after a successful reservation is to be assigned to the requesting grid user. This is necessary because the reservation blocks resources and therefore incurs opportunity costs. Thus, the overall visualization time is similar to the wall clock time (Fig. 2). Because parallelized visualization applications using distributed graphics rendering power are not yet available, the accounting here measures the use of exactly one node. For a future use of distributed graphics rendering power, the accumulated session times will have to be accounted.
Given the use cases defined in Section 4.2, accounting of compute and storage resources within the grid is also required. The batch-processed compute jobs, the occupied storage space and the real-time session usage for visualization should be combined in a holistic accounting approach. In other words, the more diverse the use cases are, the more complex the accounting of visualization jobs becomes, because an increasing number of different individual metrics is involved.
Fig. 2. Accounted visualization session time. Legend: t = time, a = accounted session time, Rt = reservation, Lt = login, Et = session end. The timeline is labeled with a pre-login phase and a session phase.
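The accounting metric described in Section 4.3 can be sketched as follows. This is a minimal illustration, assuming that Rt marks the start of the reserved slot and that the accounted time a therefore runs from Rt to Et, so that the pre-login time is charged to the user; the function names are hypothetical and not part of DGAS.

```python
from datetime import datetime, timedelta


def pre_login_time(reservation: datetime, login: datetime) -> timedelta:
    """Time between the start of the reservation (Rt) and the login (Lt)."""
    if login < reservation:
        raise ValueError("login must not precede the reservation")
    return login - reservation


def accounted_session_time(
    reservation: datetime, login: datetime, session_end: datetime
) -> timedelta:
    """Accounted time a = Et - Rt, i.e. pre-login time plus session time."""
    if session_end < login:
        raise ValueError("session end must not precede the login")
    return pre_login_time(reservation, login) + (session_end - login)


# Example: reservation at 09:00, login at 09:10, session end at 10:30.
rt = datetime(2009, 8, 24, 9, 0)
lt = datetime(2009, 8, 24, 9, 10)
et = datetime(2009, 8, 24, 10, 30)
accounted = accounted_session_time(rt, lt, et)
```

In this example the user is charged for 1 hour 30 minutes on one node, of which 10 minutes are pre-login time. For distributed rendering across several nodes, the per-node values would have to be summed.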
4.4 Cost Factors for Pricing

The cost factors of visualization services consist of hardware, software, service and support, which together form a whole IT service [18]. Hardware is offered by resource providers, as is basic software such as operating systems. General services, mostly the middleware, are offered by the D-Grid Integration Project and include accounting and monitoring services. The discipline-related tools and applications are offered by professional service providers. Each provider also offers support to users and other providers. The end-user is required to pay for all of these services [19].

The costs of resource providers comprise hardware investments, software licenses and subsequent annual expenditures for energy, maintenance and administration. The prices of resource providers are usually based on a total cost of ownership calculation. Since the local visualization scenarios require 3D screens, the facility costs should also be included. Since customers like to receive full service as well as the invoice from a single point of contact, professional service providers should aggregate all relevant costs, including their own total cost of ownership.

Considering visualization as a grid service, the present grid economy definition [20] needs to be extended to address real-time services. Since the graphics performance of the currently available resources is very similar, it does not yet affect the choice of resources. Further development will increase diversity due to different choices of visualization implementations. Thus, visualization resources will become more diverse, and this will have a direct impact on the choice of appropriate resources. Due to the real-time character of visualization services and the limited number of resources, prices should also be adjusted to varying demand during the day: prices during peak hours should be higher than during non-peak hours.
This will generate price competition between visualization resource providers in different time zones. Additionally, as on-demand services for collaborative use cases, visualization services are restricted by the fact that resources have to be available at an appointed time. Thus, lower prices can be charged for advance reservations because they help resource providers optimize the utilization of the visualization resources. Nonetheless, visualization resources can also act as common compute nodes, and their utilization can be optimized by compute jobs. The price for a visualization session can be charged on a usage basis or as a flat rate. A flat rate reduces complexity for the end-user and the professional service provider, but it also carries a risk of excessive usage by end-users. Thus, constant price monitoring will be vital for visualization grid business models.

4.5 Distribution and Billing

The defined accounting and pricing concepts for visualization services in the grid are relevant for distribution in a grid service market. Furthermore, the additional value of the visualization services has to be emphasized and communicated to the customers. According to the use cases, the integration into an on-demand high performance compute infrastructure and the availability of worldwide collaboration are the predominant customer values. Beyond that, customers need easy access to visualization services. Thus, brokers are necessary in order to bring together customers and professional service providers [16]. For visualization services, the broker also needs to support resource and/or room reservation and the estimated session duration, and ideally it estimates the costs based on the reservation parameters. Customers will be billed based on the accounting and pricing information according to their contracts with professional service providers. The resulting accounting information is to be aggregated on a per-service basis for each customer and provides the invoice data.
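The pricing and billing considerations of Sections 4.4 and 4.5 can be illustrated with a short Python sketch. All rates, the peak window and the reservation discount are hypothetical values chosen for illustration, and the rate is selected by the session start hour only, which is a simplification; the paper does not prescribe a concrete tariff.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple

# Hypothetical tariff parameters (not taken from the paper).
PEAK_HOURS = range(9, 17)          # 09:00 to 16:59 local time
PEAK_RATE = 12.0                   # price per accounted node-hour
OFF_PEAK_RATE = 8.0
ADVANCE_RESERVATION_DISCOUNT = 0.25


@dataclass(frozen=True)
class UsageRecord:
    customer: str
    service: str
    start_hour: int                # hour of day at which the session started
    accounted_hours: float         # accounted session time in hours
    reserved_in_advance: bool


def session_price(record: UsageRecord) -> float:
    """Usage-based price: peak/off-peak rate, reduced for advance reservations."""
    rate = PEAK_RATE if record.start_hour in PEAK_HOURS else OFF_PEAK_RATE
    price = rate * record.accounted_hours
    if record.reserved_in_advance:
        price *= 1.0 - ADVANCE_RESERVATION_DISCOUNT
    return price


def invoice_data(records: Iterable[UsageRecord]) -> Dict[Tuple[str, str], float]:
    """Aggregate prices per (customer, service), as needed for invoicing."""
    totals: Dict[Tuple[str, str], float] = defaultdict(float)
    for record in records:
        totals[(record.customer, record.service)] += session_price(record)
    return dict(totals)
```

A flat rate would replace `session_price` with a constant fee per billing period, which is simpler for the customer but no longer reflects actual usage.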
5 Conclusion and Outlook

Advanced visualization technologies for the presentation and manipulation of multi-dimensional data sets would profit considerably from the usage of grid visualization infrastructures. The feasibility test of this infrastructure showed that the envisioned goal can be reached: visualization resources can be made accessible to multiple viewers. In an institutionally and internationally distributed research environment such as the life sciences, with a growing demand for visualization applications, the grid will be able to support collaboration and therefore scientific progress. Nevertheless, adequate visualization tools for a distributed environment are still missing. The performance of the tested life science applications can be significantly increased by using more than one CPU and/or GPU core. Therefore, the development of visualization applications for the life sciences has to take parallelization into account, especially with respect to future market sustainability. This also applies to the parallelization of the graphics rendering process, which would result in an enormous increase in the speed of 3D applications and thus in a highly marketable competitive advantage. Based on this technological advancement, accounting, pricing and distribution aspects were analyzed. Accounting requires additional metrics as well as integration
between the resources, general services and professional services within D-Grid. The integration of an accounting mechanism will be crucial for further commercial viability. Since the total costs of the infrastructure determine the visualization service price, every grid resource provider and service provider involved needs to audit its own costs. Distribution will require customer relationship management tailored to the requirements of life science customers in order to achieve commercial success. Transparent integration of billing into enterprise resource planning systems can increase acceptance by customers and professional service providers. Consequently, visualization in (health) grid environments could be a major opportunity for novel service and business approaches. These approaches are more complex because various resources are integrated into a virtual "high-performance" desktop environment, but they offer great possibilities for sustainable valorization.
Acknowledgements

This publication was supported by the alliance projects Services@MediGRID (FKZ 01IG07015A-G), MedInfoGrid (FKZ 01G07016A) and the D-Grid Integration Project (FKZ 01IG07014), funded by the German Federal Ministry of Education and Research (BMBF).
References

1. Krefting, D., Bart, J., Beronov, K., Dzhimova, O., Falkner, J., Hartung, M., Hoheisel, A., Knoch, T.A., Lingner, T., et al.: MediGRID: Towards a user friendly secured grid infrastructure. Future Generation Computer Systems 25(3), 326–336 (2009)
2. Holden, B.J., Pinney, J.W., Lovell, S.C., Amoutzias, G.D., Robertson, D.L.: An exploration of alternative visualisations of the basic helix-loop-helix protein interaction network. BMC Bioinformatics 8, 289 (2007)
3. Noirhomme-Fraiture, M., Nahimana, A.: Visualization. In: Diday, E., Noirhomme-Fraiture, M. (eds.) Symbolic Data Analysis and the SODAS Software, pp. 109–120. Wiley-Interscience, New York (2008)
4. Dodgson, N.A., Moore, J.R., Lang, S.R.: Multi-View Autostereoscopic 3D Display. In: The International Broadcasting Convention (IBC 1999), pp. 497–502 (1999), http://www.cl.cam.ac.uk/~nad10/pubs/IBC99-Dodgson.pdf
5. Foster, I., Kesselman, C.: The Grid 2: Blueprint for a New Computing Infrastructure. Morgan Kaufmann, San Francisco (2003)
6. HP Scalable Visualization Array, http://h20311.www2.hp.com/HPC/cache/281455-0-0-0-121.html
7. 3D Slicer, http://www.slicer.org
8. Mangan, M.E., Williams, J.M., Lathe, S.M., Karolchik, D., Lathe, W.C. 3rd: UCSC genome browser: deep support for molecular biomedical research. Biotechnol. Annu. Rev. 14, 63–108 (2008)
9. VMD - Visual Molecular Dynamics, http://www.ks.uiuc.edu/Research/vmd
10. Pieper, S., Halle, M., Kikinis, R.: 3D SLICER. In: Proceedings of the 1st IEEE International Symposium on Biomedical Imaging: From Nano to Macro 2004, vol. 1, pp. 632–635 (2004)
11. Kepper, N., Foethke, D., Stehr, R., Wedemann, G., Rippe, K.: Nucleosome geometry and internucleosomal interactions control the chromatin fiber conformation. Biophysical Journal 95(8), 3692–3705 (2008)
12. Talos, I., Jakab, M., Kikinis, R.: SPL Abdominal Atlas. In: Surgical Planning Laboratory, Department of Radiology, Brigham and Women's Hospital. Harvard Medical School, Boston (2008), http://www.na-mic.org/publications/item/view/1266
13. The Distributed Grid Accounting System (DGAS), http://www.to.infn.it/grid/accounting/main.html
14. Wiebelitz, J., Brenner, M.: Konzept für das Accounting im D-Grid 1.0. Regionales Rechenzentrum für Niedersachsen (RRZN), Hannover (2008), http://www.rrzn.uni-hannover.de/fileadmin/ful/mitarbeiter/wiebel/Accounting/DGI-2_FG-5-2_Acct-Konzept_08.pdf (accessed 2008.11.02)
15. McKee, P., Taylor, S., Surridge, M., Lowe, R., Ragusa, C.: Strategies for the Service Market Place. In: Veit, D.J., Altmann, J. (eds.) GECON 2007. LNCS, vol. 4685, pp. 58–70. Springer, Heidelberg (2007)
16. Stanoevska-Slabeva, K., Talamanca, C., Thanos, G., Zsigri, C.: Development of a Generic Value Chain for the Grid Industry. In: Veit, D.J., Altmann, J. (eds.) GECON 2007. LNCS, vol. 4685, pp. 44–57. Springer, Heidelberg (2007)
17. Thigpen, W., Hacker, T.J., Athey, B.D., McGinnis, L.F.: Distributed Accounting on the Grid. In: Caulfield, H.J., Chen, S.-H., Cheng, H.-D., Duro, R.J., Honavar, V., Kerre, E.E., Lu, M., Romay, M.G., Shih, T.K., et al. (eds.) Proceedings of the 6th Joint Conference on Information Science, pp. 1147–1150. JCIS / Association for Intelligent Machinery, Inc., Research Triangle Park (2002)
18. Zarnekow, R., Brenner, W.: A product-based information management approach. In: Proceedings of the 11th European Conference on Information Systems, ECIS 2003, Naples, Italy (2003), http://is2.lse.ac.uk/asp/aspecis/20030183.pdf
19. Altmann, J., Ion, M., Bany Mohammed, A.: Taxonomy of Grid Business Models. In: Veit, D.J., Altmann, J. (eds.) GECON 2007. LNCS, vol. 4685, pp. 29–43. Springer, Heidelberg (2007)
20. Buyya, R., Abramson, D., Venugopal, S.: The Grid Economy. Proceedings of the IEEE 93(3), 698–714 (2005)
Message Protocols for Provisioning and Usage of Computing Services

Nikolay Borissov¹, Simon Caton², Omer Rana², and Aharon Levine³

¹ Institute für Informationswirtschaft und Management (IISM), Universität Karlsruhe, [email protected]
² School of Computer Science, Cardiff University, UK, {s.j.caton,o.f.rana}@cs.cf.ac.uk
³ Correlation Systems, Israel, [email protected]
Abstract. The commercial availability of computational resources enables consumers to scale their applications on demand. However, both consumers and providers of computational resources need to express their technical and economic preferences using common language protocols. Ultimately, this requires clear, flexible and pragmatic communication protocols and policies for the expression of bids and the resulting generation of service level agreements (SLAs). Further standardization efforts in such description languages will foster the specification of common interfaces and matching rules for establishing SLAs. Current Grid middleware is not compatible with market-oriented resource provisioning. We aim to reduce this gap by defining extensions to a standardized specification such as JSDL. Furthermore, we present a methodology for matchmaking consumer bids and provider offers and for mapping the additional economic attributes into an SLA. We demonstrate the usage of the message protocols in an application scenario.

Keywords: Message protocols, Expressing Economic Preferences, SLA.
1 Introduction
Allocation of jobs in current Grid systems utilizes schedulers based on properties of provider resources and consumer technical preferences, but not on economic metrics. Current SLA specifications also do not have adequate scope for the definition of economic attributes [1]. Although there has been some work on supporting negotiation scenarios, much of it is of limited benefit for supporting market mechanisms in general. In a market scenario, computing services can be acquired by a consumer willing to pay for the provided resource within a target market. Here, allocations and prices are determined by a market mechanism and are intrinsically linked to the supply of and demand for the specified resource(s). To participate in a market, autonomous bidding agents may act on behalf of market consumers and providers. They transparently generate bids and interact with the market mechanisms by applying well-defined message protocols that enable the automation of the market-based allocation processes. In this paper we specify a term language for expressing economic preferences, which extends available specifications that express only technical preferences. In a market-based scenario, the result of the matchmaking process will be an SLA between a provider and a consumer, which contains well-defined service level objectives.

The structure of this paper is as follows. In Section 2 we present a scenario of market-based message exchange and derive requirements for the message protocols and service level objectives. In Section 3 we discuss related work, outlining existing description languages and their limitations. In Section 4 we present a term language for specifying economic preferences. Section 5 contains an outline of the generation process of an SLA from a match. In Section 6 we analyse our approach and its use in an implemented system for market-based scheduling.

J. Altmann, R. Buyya, and O.F. Rana (Eds.): GECON 2009, LNCS 5745, pp. 160–170, 2009. © Springer-Verlag Berlin Heidelberg 2009
2 Scenario and Challenges
The general motivation for this work stems from the emergence of industrial and academic work in the area of Cloud Computing. Here, (commercial) providers like Amazon, Google, IBM and Sun already offer computational services such as storage, processing capability and hosting environments. Currently, consumers can compare the offered provider services, their prices and their quality of service levels against their own technical and economic preferences, and then make a binding decision. Because the decision is binding, the consumer has to prepare their application for, and use only, the services of the selected provider. Market mechanisms for allocating computing services require demand and supply (in terms of technical and economic preferences) to be expressed using standardized communication protocols and service interfaces. In such a market-oriented allocation scenario, providers are incentivized to offer computing services with service qualities that reflect the demand of the consumers. Moreover, the result of the market matchmaking process, an SLA, will establish and enforce the conditions of the provisioning and usage processes, thus reducing risks for consumers and providers.

2.1 Basic Scenario
In the context of the SORMA¹ project, we aim to provide methodologies and tools for market-based provisioning and usage of computing services. Here, consumers and providers need to report their technical and economic preferences in the form of bids to the market, which is implemented through a Trading Manager component. The Trading Manager performs technical and economic matchmaking, resulting in SLAs between consumers and providers that set the conditions under which a consumer utilizes the resources of an external provider. The general SORMA information flow consists of the processes shown in Fig. 1:

(1) Providers and consumers describe and submit their technical and economic preferences to the BidGenerator framework [2]. Here, preferences can be estimated by using evolutionary techniques such as Conjoint Analysis and Analytical Hierarchy Process [3].
(2) Based on the specified preferences, bids (see Section 4) are generated and submitted to the target market mechanism (e.g. Continuous Double Auction), running at the Trading Management (TM) component [4].
(3) TM performs technical and economic matchmaking of all submitted bids. When a match is identified, it submits a message (see Section 4) to Contract Management.
(4) Based on the received message, Contract Management generates and returns a binding SLA, which is used to establish and enforce the agreed conditions of the match.
(5) The consumer submits their job to the matched provider.
(6) & (7) The SLA is enforced for violation-aware billing.

In this paper we aim to define message protocols for exchanging economic and technical preferences and for establishing SLAs, i.e. steps (1) to (4).

Fig. 1. Components and message protocols for market-based scheduling

¹ Self-Organizing ICT Resource Management, www.sorma-project.eu

2.2 Research Challenges
In an auction-based scenario, SLAs are established based on the submission of bids and offers (signaling technical and economic preferences) to the target auction mechanism, provided that a suitable match exists. The implication of this approach is that if a match is identified, both parties are obliged to enter into an SLA, which may be legally binding, and a monetary transaction will ensue. This is comparable to the use of auctions in on-line e-commerce systems such as eBay. Hence, the establishment of SLAs between market participants requires consideration of the following challenges:

• R1: Expression of the economic and technical preferences of both consumers and providers [5]. However, it is often the case that neither consumers nor providers know their exact preferences, although they may have beliefs or baseline estimations, which can be iteratively improved using selected methodologies of preference elicitation [3].
• R2: Definition of interaction protocols for exchanging private, public and market-based data, where the latter is utilized in the generation of an SLA.
• R3: Incentives for participants to use and offer high-quality computing services that reflect supply and demand in terms of service level guarantees and penalty functions [6,7].

In this paper we primarily address challenges R1 and R2, in Sections 4 and 5. R3 is only partially addressed here and constitutes part of our further research.
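The auction-based matching referred to in this section can be illustrated with a minimal Python sketch of a continuous double auction order book. It is not the SORMA Trading Manager: it covers only the economic side of matchmaking, ignores technical requirements and bid expiry, and assumes a midpoint clearing price, which is one common convention but is not stated in the text. The class and method names are hypothetical.

```python
import heapq
import itertools
from typing import List, Optional, Tuple

Match = Tuple[str, str, float]  # (consumer, provider, clearing price)


class OrderBook:
    """Toy continuous double auction: match as soon as best bid >= best offer."""

    def __init__(self) -> None:
        self._bids: List[Tuple[float, int, str]] = []    # (-price, arrival, consumer)
        self._offers: List[Tuple[float, int, str]] = []  # (price, arrival, provider)
        self._arrival = itertools.count()                # earlier orders win ties

    def submit_bid(self, consumer: str, price: float) -> Optional[Match]:
        heapq.heappush(self._bids, (-price, next(self._arrival), consumer))
        return self._try_match()

    def submit_offer(self, provider: str, price: float) -> Optional[Match]:
        heapq.heappush(self._offers, (price, next(self._arrival), provider))
        return self._try_match()

    def _try_match(self) -> Optional[Match]:
        if not self._bids or not self._offers:
            return None
        best_bid = -self._bids[0][0]
        best_offer = self._offers[0][0]
        if best_bid < best_offer:
            return None
        _, _, consumer = heapq.heappop(self._bids)
        _, _, provider = heapq.heappop(self._offers)
        # Assumed pricing rule: split the difference between bid and offer.
        return consumer, provider, (best_bid + best_offer) / 2.0
```

Each returned match corresponds to the point at which, in the scenario above, both parties become obliged to enter into an SLA.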
3 Related Work
A market-oriented allocation scenario demands common communication protocols that enable the expression of technical and economic preferences by consumers and providers. Current communication protocols lack techniques for specifying economic preferences suitable for interactions in a market-oriented scheduling scenario. The Job Submission Description Language (JSDL), the Resource Specification Language and the Job Definition Language are applied in various Grid middleware as quasi-standard languages for technical resource and job description [8,9,10]. The Common Information Model (CIM) [11] has a global and detailed coverage of a computing system's attributes, but is more suitable as a frame of reference than as an exchange format. There are also many approaches for the definition of SLAs. The Contract Net Protocol [12] is one of the first protocols for negotiating electronic service contracts. Additional languages for specifying SLAs include WSLA, SLAng, WS-Agreement and RBSLA [13,14]. WS-Agreement is the most widely utilized and has been applied in various contexts and EU projects. However, these approaches are not sufficient in a market-oriented scheduling scenario and lack the ability to express economic preferences [1]. It is also necessary for each term used to be associated with a concept definition [15] encoded within an ontology. Such a link between an economic term and an ontology concept definition can be established with the SAWSDL² protocol [16]. In the next Section we specify and present economic extensions for general market-based scheduling, which extend widely used specifications such as JSDL and WS-Agreement.
4 Economic Extensions for Market-Based Scheduling
Our term language consists of three types of messages: one for private data (PrivateBidData), one for (public) bid data (BidData), and one for describing a match (MarketData), which is subsequently submitted to ContractManagement (see Section 5). Furthermore, a matchmaking strategy is used to compare consumer and provider bids.

² www.w3.org/TR/sawsdl
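As an illustration only, the three message types can be sketched as Python data classes, using the attribute names listed in Sections 4.1 to 4.3. The paper defines these messages as extensions to JSDL and WS-Agreement rather than as Python objects, so the field types, the string encoding of the penalty function and the helper `public_view` are assumptions made for this sketch.

```python
from dataclasses import dataclass, fields
from typing import Dict


@dataclass
class BidData:
    """Public bid data sent from BidGenerator to the market mechanism."""
    bid_id: str
    request_type: str        # "bid", "offer" or "match"
    participant: str
    bid_price: float         # in base units of `currency`
    time_to_live_ms: int
    duration_ms: int
    service_type: str        # "batch" or "webservice"
    payment_type: str        # before or after job execution (labels assumed)
    currency: str
    penalty: str             # penalty function; its encoding is an assumption
    endpoint_reference: str


@dataclass
class PrivateBidData(BidData):
    """Private data exchanged between a participant and BidGenerator."""
    valuation: float
    strategy: str            # name of the preferred strategy implementation


@dataclass
class MarketData:
    """Description of a match, submitted to ContractManagement."""
    clearing_price: float
    contract_id: str
    participants: Dict[str, str]   # participant id -> bid id
    penalty: str
    endpoint_reference: str


def public_view(private: PrivateBidData) -> BidData:
    """Drop the private attributes before a bid is sent to the market."""
    return BidData(**{f.name: getattr(private, f.name) for f in fields(BidData)})
```

Modelling PrivateBidData as a subclass mirrors the statement that it carries the BidData attributes plus the private ones; `public_view` makes explicit that valuation and strategy never leave the participant's side.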
4.1 Economic Extensions for Bids
BidData is the economic part of a consumer or provider bid document that contains public data submitted from BidGenerator to the target market mechanism (step 2 in Fig. 1):

• bidId - The unique identifier of the bid.
• requestType - The type of the document, which can be bid, offer or match.
• participant - The unique identifier of a consumer or provider.
• bidPrice - The generated bid (in base units), based on the selected bidding strategy.
• timeToLive - The length of time (in milliseconds) for which a bid is valid in the target auction's order book.
• duration - A measurable quantity that relates a temporal object to the length of time, in milliseconds, during which it existed, has taken place, or has been obtained. In our use case, the duration is an upper-bound estimate for a job duration or the duration for which a resource is offered.
• serviceType - The type of the requested service: a command line batch job (batch) or a Web application (webservice).
• paymentType - Indicates whether payment occurs before or after job execution.
• currency - The currency to be used in transactions, e.g. EUR, USD, or some arbitrary currency unit.
• end point reference - The location of the provider's portal for the submission of jobs and services.
• penalty - The penalty function, which will determine the final price of the match, based upon adherence to the terms specified.
• endpointReference - The location of the provider's web service interface for the deployment and execution of applications (jobs).

4.2 Economic Extensions for Private Data
PrivateBidData is the economic part of a consumer or provider document, which contains private economic data exchanged between a participant (consumer or provider) and BidGenerator (step 1 in Fig. 1). In addition to the information in BidData, it transfers the following economic attributes:
• valuation - The monetary utility return of a good or a service. It is a subjective term: something that has value to one party may have no value to another. The valuation is an estimate of the amount that would be paid for a certain artifact: either (a) a reservation price, if it were sold, or (b) the maximum likely price, if a service is to be bought.
• strategy - A bidding strategy is a complete plan of actions for whatever situation may arise; this fully determines an agent’s behavior. A bidding strategy will determine the action the agent will take at any stage of the bid generation and market-based scheduling process, considering the available history and market information. Bidding strategies are implemented using well-defined interfaces. The value of this attribute is the name of the preferred strategy implementation (class name).
4.3
Economic Extensions for Market Data
MarketData comprises economic data from the market mechanism, submitted to related parties, e.g. ContractManagement:
• clearingPrice - The monetary value assigned to an artifact on market clearing, i.e. match identification. In an auction-based scenario, the price is determined by the auction, based on the bids and offers of the respective consumers and providers interested in trading that artifact.
• contractId - The unique identifier of the established contract; a contract is a collection of agreements. Each instance is a legal document in which two or more agreeing agents promise to do (or not do) something. There are legal consequences to breaking the promises made in a contract.
• participants - The unique identifiers of the consumer and provider and the IDs of their bids.
• penalty & endpointReference - See definitions in Section 4.1.
4.4
Matchmaking of Bids
In the case of market-based scheduling of jobs, the matchmaking of consumer bids to provider offers is executed by the market (auction) mechanism in two steps: technical and economic (price-based) matchmaking. In step one, the technical matchmaking assures that a provider offer is “technically compatible” with the consumer bid. In step two, the economic matchmaking is executed by the auction mechanism (e.g. Continuous Double Auction). The result of the matchmaking process is the generation of a market data document (Section 4.3) with well-defined attribute values (ranges). The process by which a consumer bid and a provider offer are compared is a precursory stage to the definition of a SLA (Section 5). The market-based matchmaking process, together with the subsequent construction of the SLA, occurs as follows:
• Step 1 – Technical Matchmaking: As a consumer bid or provider offer arrives in the auction’s order book, the technical description (e.g. JSDL as part of the bid message) is passed to the SORMA technical matchmaking component, which identifies the resulting set of “technically compatible” offers and bids. A bid $b_i$ is compatible with an offer if there exists a non-empty intersection over the specified consumer and provider technical attributes: $\forall a_{bid} \in A_{bid},\ a_{offer} \in A_{offer}:\ a_{bid} \cap a_{offer} \neq \emptyset$. For example, if the technical attribute is the CPU frequency rating, $a = CPUSpeed$, the consumer’s technical preference is $a_{bid} = \{2\,GHz, 3\,GHz\}$, and the provider offers resources with $a_{offer} = \{1\,GHz, 2\,GHz\}$, then the intersection is $\{2\,GHz\} \neq \emptyset$.
• Step 2 – Economic Matchmaking: The economic matching is performed by the auction mechanism itself, using a price-based matchmaking of bids and offers. The result of the economic matching is the clearing price (Section 4.3). The calculated clearing price is based on the pricing scheme used by the market mechanism, e.g. k-pricing [17].
• Step 3 – Constructing the Service-Level-Agreement: The SLA document (Section 5), a result of the match, contains technical attributes with values taken from step 1 and economic data as defined in Section 4.3, taken from step 2.
The following section describes the generation process of a SLA and the mapping of the economic extensions as a term language within WS-Agreement.
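The two matchmaking steps can be illustrated with a short Python sketch. It is an interpretation under stated assumptions rather than the SORMA implementation: technical attributes are modelled as named sets of acceptable values, compatibility is checked per attribute name, and the clearing price uses the common formulation of k-pricing, $p = k \cdot bid + (1 - k) \cdot offer$ with $k \in [0, 1]$. The function names are hypothetical.

```python
def technically_compatible(bid_attrs, offer_attrs):
    """Step 1: every attribute requested in the bid must overlap the offer.

    Both arguments map an attribute name (e.g. "CPUSpeed") to a set of
    acceptable values. An attribute missing from the offer counts as
    incompatible (an assumption of this sketch).
    """
    return all(
        name in offer_attrs and bool(bid_attrs[name] & offer_attrs[name])
        for name in bid_attrs
    )


def k_price(bid_price, offer_price, k=0.5):
    """Step 2: clearing price under k-pricing, or None if there is no trade.

    A trade is only possible when the consumer bids at least what the
    provider asks; k splits the surplus between the two parties.
    """
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must lie in [0, 1]")
    if bid_price < offer_price:
        return None
    return k * bid_price + (1.0 - k) * offer_price


# The CPU example from the text: {2 GHz, 3 GHz} and {1 GHz, 2 GHz} overlap.
bid = {"CPUSpeed": {2.0, 3.0}}
offer = {"CPUSpeed": {1.0, 2.0}}
match_possible = technically_compatible(bid, offer)
```

With k = 0.5 the surplus between bid and offer is shared equally; k = 1 would let the provider receive the full bid price, and k = 0 would let the consumer pay only the asking price.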
5
Generation of the SLA
In our system, a single SLA is generated, where the agreement scope covers both the consumer and the provider of a service in one document. The generation process inserts market data into WS-Agreement, using the economic attributes defined in previous sections. Fig. 2 illustrates the mapping of one or more market data documents to a SLA, which is encoded using WS-Agreement. There are four basic steps involved in this process, which are described below.
Fig. 2. The dissemination of one or more EJSDLMarket (see Section 6) matches in order to map the technical and economic data into a SLA specified with WS-Agreement
First, a globally unique SLA ID is generated. Second, the SLA context, identifying roles (consumer/provider) and validity period of the SLA, is composed. Here, the consumer and provider identifiers and their associated bid/offer identifiers are used to create the agreement initiator and responder respectively. If the bid/offer contains reservation data, this is also used to construct the validity period of the SLA. Third, the SLA service description (what is being provided) is composed, and an identifier unique to the SLA is created for each service. The main body of the JSDL description, without the JSDL Resources definition, is extracted to create the service description. The endpoint reference of the provider’s submission portal is also added as a service reference for use by the consumer and SLAEnforcement to interact with the provider. Fourth, the Guarantee Terms are generated using the JSDL Resource definition as a Key Performance Indicator (KPI), and are currently assigned an equal weight (denoted
as importance) of 1. Each term relates to one resource element, to either quantify the quality of service offered, or define other requirements specified by the consumer. Approaches such as SWAPS [18] can provide a foundational basis for assigning different weightings to Guarantee Terms. Disseminated market data is also added here as entries in the business values of the Guarantee Term, namely paymentType, currency, penalty, and a portion of the clearingPrice. The service properties are then defined as a summary of the Guarantee Terms, listing the metrics needed to enforce the SLA. There are some inconsistencies in the mapping of a market document to WS-Agreement, due to differences in the conceptual models of the two specifications. The bidding framework considers only a single reward (payment) and penalty for all terms in the match. WS-Agreement, however, enables all Guarantee Terms to specify these stipulations at an individual level. This difference translates to repeated information in the SLA document. However, specifying the Guarantee Terms in this way enables a more intuitive representation of adherence during an enforcement stage. For instance, if only one resource attribute is violated, this can be captured at the individual term level rather than the global level, which is more representative of the violation.
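The generation steps can be sketched in Python as follows. This is only an illustration: the dictionary layout is a simplification and not the WS-Agreement XML schema, the function name build_sla and all keys are hypothetical, and the equal split of the clearing price across Guarantee Terms is an assumption (the text only says each term carries "a portion" of it).

```python
import uuid


def build_sla(market, consumer_id, provider_id, jsdl_resources, job_description):
    """Map one market data document onto a simplified SLA structure.

    `market` holds the attributes of Section 4.3 plus paymentType and
    currency; `jsdl_resources` maps a resource name to its agreed value.
    """
    if not jsdl_resources:
        raise ValueError("at least one resource element is required")

    # Assumption: the clearing price is divided equally among the terms.
    share = market["clearingPrice"] / len(jsdl_resources)

    guarantee_terms = []
    for name, value in jsdl_resources.items():
        guarantee_terms.append({
            "name": "guarantee-" + name,
            "kpi": {name: value},          # JSDL resource used as the KPI
            "importance": 1,               # equal weight for every term
            "businessValues": {            # repeated in each term
                "paymentType": market["paymentType"],
                "currency": market["currency"],
                "penalty": market["penalty"],
                "price": share,
            },
        })

    return {
        "slaId": str(uuid.uuid4()),                      # step 1
        "context": {"initiator": consumer_id,            # step 2
                    "responder": provider_id},
        "serviceDescription": job_description,           # step 3
        "serviceReference": market["endpointReference"],
        "guaranteeTerms": guarantee_terms,               # step 4
        "serviceProperties": sorted(jsdl_resources),     # metrics to enforce
    }
```

The repetition of the business values in every term reflects the mismatch described above: the market produces one payment and one penalty per match, while the SLA structure records them per Guarantee Term.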
6
Technical Analysis and Application
The proposed economic language terms (Section 4) are implemented as extensions to JSDL, referred to as EJSDL (footnote 3). Here, JSDL forms a core part of each message protocol and is extended with message formats for PrivateData, BidData and MarketData, i.e. EJSDLPrivate, EJSDLBid and EJSDLMarket respectively. The example (footnote 4) shows the application of EJSDL to the messages in steps 1, 2 and 3 of Fig. 1 for a provider request. Moreover, the introduced economic extensions are linked using SAWSDL [16] and defined within a market ontology (footnote 5). The definitions of some of the concepts are linked to OpenCyc [19] in order to bind our proposed ontology concepts to existing ones, based on a well-defined and widely maintained upper ontology.
The specified message protocols have been used in the ViSAGE (Video Stream Analysis in a Grid Environment) application, a collection of motion detection and object recognition services, implemented in Java and deployable on common Java web servers. Fig. 3 shows the final step (application deployment and job execution) of the market-based allocation processes using SORMA. Here, the ViSAGE client invokes the deployed ViSAGE service (on the target provider machine) by submitting video stream data for analysis. As a result, the ViSAGE client receives images of detected motion, objects and related metadata. To do this, ViSAGE clients invoke the BidGenerator by submitting technical and economic preferences utilizing the provided message protocols. The technical preferences of the ViSAGE application are specified as JSDL templates. The economic preferences, such as a bidding strategy and valuation (which results from the Budgeting Strategy), as well as bid and clearing prices of the jobs (video data streams), are displayed on the ViSAGE client as a result of the bidding and market allocation processes in SORMA. Upon receiving an allocation, in the form of a WS-Agreement document, the ViSAGE client can start with the invocation and streaming of the video data. Using the SORMA market, a consumer can utilize provider machines on demand, and thus scale automatically by adjusting the quality of video streams.
Fig. 3. ViSAGE client invoking the deployed ViSAGE service, whose endpoint reference is part of the WS-Agreement document returned as a result of the negotiation process
3 www.rz.uni-karlsruhe.de/~ri116/sorma/marketProtocol/ejsdl.xsd
4 www.rz.uni-karlsruhe.de/~ri116/sorma/marketProtocol/EJSDLXMLExamples.pdf
5 www.rz.uni-karlsruhe.de/~ri116/sorma/marketProtocol/owl/marketProtocol.owl
7
Conclusions
We described a general term language, extending an existing specification such as JSDL, for expressing economic preferences of both consumers and providers (R1). The economic attributes are structured into message protocols that support a market-based allocation process by exchanging private, public and market data (R2). Furthermore, we described the generation process of a SLA and the mapping of technical and economic attributes into the WS-Agreement specification. Finally, we demonstrated the use of the defined message protocols in a real application context. We believe that further research has to be performed on the description of service level objectives and their related utility and penalty functions, since these elements can be a powerful instrument for providing incentives to consumers to use such a service, and for providers to offer high quality services by reducing consumer risks. This stems from the observation that consumers and providers will indirectly represent their utility and penalty functions through the specification and weighting of their technical and economic preferences in the bid.
References
1. Karänke, P., Kirn, S.: Service level agreements: An evaluation from a business application perspective. In: Proceedings of eChallenges (2007)
2. Borissov, N., Wirström, N.: Q-Strategy: A bidding strategy for market-based allocation of Grid services. In: OTM Conferences, vol. (1), pp. 744–761 (2008)
3. Stoesser, J., Neumann, D.: A model of preference elicitation for distributed market-based resource allocation. In: 17th European Conference on Information Systems, ECIS 2009 (2009)
4. Nimis, J., et al.: D2.2a: Final specification and design documentation of the sorma components – revised version. Technical report (2009), http://www.im.uni-karlsruhe.de/sorma/fileadmin/SORMA_Deliverables/D2.2a_final.pdf
5. Chevaleyre, Y., Dunne, P., Endriss, U., Lang, J., Lemaître, M., Maudet, N., Padget, J., Phelps, S., Rodriguez-Aguilar, J., Sousa, P.: Issues in multiagent resource allocation (2005)
6. Becker, M., Borissov, N., Deora, V., Rana, O., Neumann, D.: Using k-Pricing for Penalty Calculation in Grid Market, p. 97 (2008)
7. Wilkes, J.: Utility functions, prices, and negotiation. Technical report, HP Laboratories (2008)
8. Anjomshoaa, et al.: Job Submission Description Language (JSDL) Specification, Version 1.0 (2005)
9. Laure, E., Fisher, S., Frohner, A., Grandi, C., Kunszt, P., Krenek, A., Mulmo, O., Pacini, F., Prelz, F., White, J., et al.: Programming the Grid with gLite. Computational Methods in Science and Technology 12(1), 33–45 (2006)
10. Feller, M., Foster, I., Martin, S.: GT4 GRAM: A functionality and performance study. In: TeraGrid Conference (2007)
11. DMTF: Common Information Model (CIM) v2.19.1. Distributed Management Task Force, DMTF (2008)
12. Paurobally, S., Tamma, V., Wooldridge, M.: A framework for web service negotiation. ACM Transactions on Autonomous and Adaptive Systems 2(4) (2007)
13. Paschke, A., Dietrich, J., Kuhla, K.: A logic based SLA management framework. In: Semantic Web and Policy Workshop (SWPW) at ISWC 2005 (2005)
14. Andrieux, A., et al.: Web services agreement specification, WS-Agreement (2007)
15. Tamma, V., Phelps, S., Dickinson, I., Wooldridge, M.: Ontologies for supporting negotiation in e-commerce. Engineering Applications of Artificial Intelligence 18(2), 223–236 (2005)
16. Kopecky, J., Vitvar, T., Bournez, C., Farrell, J.: SAWSDL: Semantic annotations for WSDL and XML schema. IEEE Internet Computing 11(6), 60–67 (2007)
17. Satterthwaite, M., Williams, S.: Bilateral trade with the sealed bid k-double auction: Existence and efficiency. Journal of Economic Theory 48(1), 107–133 (1989)
18. Oldham, N., Verma, K., Sheth, A., Hakimpour, F.: Semantic WS-Agreement partner selection. In: 15th International Conference on World Wide Web (2006)
19. Mascardi, V., Cordì, V., Rosso, P.: A comparison of upper ontologies. In: Agenti e industria: Applicazioni tecnologiche degli agenti software, WOA 2007, pp. 24–25 (2007)
Business Collaborations in Grids: The BREIN Architectural Principals and VO Model
Steve Taylor¹, Mike Surridge¹, Giuseppe Laria², Pierluigi Ritrovato², and Lutz Schubert³
¹ University of Southampton IT Innovation Centre, 2 Venture Road, Chilworth, Southampton SO16 7NP, UK, {sjt,ms}@it-innovation.soton.ac.uk
² Centro di Ricerca in Matematica Pure e Applicata (CRMPA) c/o DIIMA, Università degli Studi di Salerno, via Ponte Don Melillo, 84084 Fisciano (Italy), {laria,ritrovato}@crmpa.unisa.it
³ High Performance Computing Center Stuttgart (HLRS), Nobelstr. 19, D - 70569 Stuttgart, Germany, [email protected]
Abstract. We describe the business-oriented architectural principles of the EC FP7 project “BREIN” for service-based computing. The architecture is founded on principles of how real businesses interact to mutual benefit, and we show how these can be applied to SOA and Grid computing. We present building blocks that can be composed in many ways to produce different value systems and supply chains for the provision of computing services over the Internet. We also introduce the complementary BREIN VO concept, which is centric to, and managed by, a main contractor who bears the responsibility for the whole VO. The BREIN VO has an execution lifecycle for the creation and operation of the VO, and we have related this to an application-focused workflow involving steps that provide real end-user value. We show how this can be applied to an engineering simulation application and how the workflow can be adapted should the need arise.
Keywords: Service Oriented Architecture, Grid Computing, Supply Chain, Value Network, Virtual Organisation, SLA, Workflow.
1 Introduction
This paper discusses the business-focused architectural principles and VO model approach of the BREIN project. The architecture is founded on the principles of how real businesses interact to mutual benefit, and we show how these can be applied to SOA and Grid computing by way of examples driven by the project’s end-user scenarios. BREIN has two real-world end-user scenarios that have determined requirements and validation for the project’s innovations. These are in the two areas of airport ground handling and virtual engineering design simulations (so-called because the simulations enable virtual engineering designs to be evaluated without the need for real-world mock-ups). While these may seem very different, they pose very similar problems regarding business situations and resource and service management. In both cases the resources required can be treated as services, and therefore can be traded: in the airport scenario, a bus carrying passengers from an aircraft is treated as a service supplied by the bus company to the airport, and in the virtual engineering scenario computational fluid dynamics simulation software running on a cluster can be similarly offered as a service.
This paper is in two broad sections, each illustrated with examples from BREIN’s end-user real-world scenarios. Section 2 describes the BREIN architectural principles that support business interactions. Section 3 describes the BREIN VO concept and how it supports the BREIN architectural principles.
J. Altmann, R. Buyya, and O.F. Rana (Eds.): GECON 2009, LNCS 5745, pp. 171–181, 2009. © Springer-Verlag Berlin Heidelberg 2009
2 Architectural Principles
In a service-oriented Grid or Service-Oriented Architecture (SOA), users buy services from suppliers, but the suppliers may themselves use other services as raw materials, thus creating a supply chain: an organisation buys in component goods or services from its suppliers, adds value to them, and delivers the result to its customers. Bearing this in mind, we advocate a key principle: an organisation can act as a provider of services for its customers and at the same time be a customer of services from other providers. The BREIN architecture is founded on this principle, which emerged from analysis of supply and value chains (see [1], [2]). This principle also reflects real business interactions more accurately than traditional “Grid” approaches, in which participants donated resources to a centrally managed pool in return for resources of another kind. Supporting service provision and consumption requires an approach that reflects a generic customer-provider interaction pattern; this approach is based on three fundamental principles, discussed next.
2.1 Bipartite Relationships
A fundamental principle is an approach completely grounded in bipartite (1-1) business relationships with two parties, typically with the roles of customer and provider. These were introduced in the NextGRID project [4], [5], and are expanded upon here. The primary reason for bipartite relationships is that they represent how established business patterns operate; the customer buys goods and services from a provider. Relationships are formed, and trust is built up (or destroyed!) based on experience of past dealings. We advocate that the approach to business relationships is to take the view that each organisation is the centre of their own world, as they see their suppliers and customers and utilise products and services from their suppliers to deliver value to their customers.
Thus the same organisation can be a provider and a customer in different situations. There are special (more limited) cases of this, for example where there is an end-consumer who does not have any customers themselves, or a provider that runs all their own application services in-house, but the general and most common pattern is that an organisation buys in component goods or services, adds
value to them to produce their goods or services and sells these to their customers. This is the placement of the organisation into the supply chain, and the value chain inside the organisation determines how value is added. A single organisation is unlikely to be aware of the entire supply chain: in the real world there is also no omniscient viewpoint of organisations’ relationships and there is certainly no central control. Compared to a traditional VO, our approach is a value network, and the participation of the actors towards a goal of the VO is governed by the actors' individual economic aims, rather than an altruistic aim of true collaboration. If it is in the interest of an actor to participate, they will. There is most likely to be a contract governing the relationship between a customer and provider. Once this is established, it provides a mutually-agreed set of points against which measurement of performance can take place. Often, this relationship will be in the form of a Service Level Agreement (SLA), which, in addition to that discussed above, describes the service on offer, and the obligations of the parties involved for delivery of that service. The provider is responsible to the customer for delivery of the work they are contracted for. As long as the provider keeps to the terms of their agreement with the customer (which may indeed provide some restrictions on how the provider does the work), the provider can provision resources for the work any way they see fit. Unless the contract with their customer forbids it, a provider can outsource all or part of their service (e.g., the provider may subcontract out part of their service offering). Thus the bipartite relationship is a recursive pattern: a provider in one relationship may be a customer in another. 
2.2 Virtualisation and Encapsulation
Over recent years, the concepts of “Virtualisation” and “Encapsulation” [4], [6], [7], [8], [9] have been increasingly adopted to hide infrastructure complexity behind simple services, which can be easily integrated. This hiding of resources is related to the concept of hiding details and separation of concerns in practice (at programmatic level), in use since the 1970s (for example see [3] and also [10]), and is completely consistent with our argument that the service provider should be free to provision a service any way they see fit, provided they comply with the terms of the agreement with their customer. Outsourcing may also be encapsulated and virtualised using a principle derived from work in the NextGRID project [5], described in [4]. A customer may purchase a service from a provider, which itself may be made up from services themselves purchased by the provider. This is shown in Figure 1.
Fig. 1. Encapsulation of Outsourcing
SP2 is said to encapsulate its suppliers (here exemplified by SP3, but there may be many suppliers) in order to create Service X for its customer C1. Service Y is a component part of Service X and is part of the sub-contract arrangement between SP2 and SP3, with SP3 being the subcontracted party. SP2 must take responsibility for SP3's delivery of Service Y for the production of Service X. In other words, it is SP2’s problem if one of their suppliers fails to deliver. The customer cannot be responsible for this! This is termed “aggregation of risk”, as SP2 is managing their suppliers and the risks of them not delivering.
2.3 Orchestration
Orchestration is used where the customer of more than one service provider wants the providers to interact. The customer manages this interaction, and thus orchestrates the providers (see [11]). An example is shown in Figure 2.
Fig. 2. Orchestration (diagram: Customer C1, SP2 providing Service X and SP3 providing Service Y, connected by SLAs; for Customer C1, the output of Service X is the input to Service Y)
Here the customer C1 orchestrates SP2 and SP3. C1 tells SP2 to deliver the output of Service X to be the input of Service Y hosted at SP3, and SP3 is instructed to expect it and use it as input for Service Y.
2.4 An Example Business Structure
The principles presented above are intended as building blocks from which complex business structures and workflows can be built if required, where one step may be decomposed into multiple steps or outsourced as a service. Figure 3 shows an example.
Fig. 3. Example Business Structure
Here the Customer has business relationships with SP1 and SP7, and instructs the output of SP1 to be sent to SP7, thus orchestrating SP1 and SP7. However, SP1 encapsulates a series of outsourced services to create its output. These are SP2, SP3 and SP4, all orchestrated by SP1. Data is transmitted from SP1 to SP2 (transmission 2), processed by SP2, and sent on to SP3 (transmission 3) under the control of SP1. The bubbles in Figure 3 illustrate certain actors' fields of view and influence: a field of view is defined by the partners that the actor can see and interact with, and the field of influence is where the actor can have some degree of control (see footnote 1). SP1’s field of view and influence for this example are shown. SP1’s field of view comprises the actors they are aware of, and the field of influence is where they are giving SP2, SP3 and SP4 an economic vested interest to provide services to them, as SP1 is a customer of (and therefore will pay) SP2, SP3 and SP4. SP1 has no knowledge of SP5 and SP6, as they are encapsulated by SP4. Similarly, C1 has no knowledge of any provider apart from SP1 and SP7. All the others are managed by the respective provider encapsulating and orchestrating them. We assert that the fields of influence are actually business Virtual Organisations (VOs), as they represent collaboration from the point of view of one main actor. The other partners in the VO respond to SLA-based goals from the main actor. The BREIN VO concept is described next.
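The recursive customer-provider pattern and the resulting fields of influence can be sketched in Python. The supplier relationships below are read from the description of Figure 3 (C1 buys from SP1 and SP7, SP1 from SP2, SP3 and SP4, and SP4 from SP5 and SP6); the class and function names are hypothetical, and the model deliberately ignores SLAs and data flows.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Organisation:
    """An actor that may be a provider in one relationship and a customer in another."""
    name: str
    suppliers: List["Organisation"] = field(default_factory=list)

    def field_of_influence(self) -> Set[str]:
        """Direct suppliers only: the partners this actor pays and orchestrates."""
        return {s.name for s in self.suppliers}


def hidden_from(org: Organisation) -> Set[str]:
    """Providers encapsulated further down the chain, invisible to `org`."""
    hidden: Set[str] = set()
    stack = [sub for s in org.suppliers for sub in s.suppliers]
    while stack:
        current = stack.pop()
        if current.name not in hidden:
            hidden.add(current.name)
            stack.extend(current.suppliers)
    return hidden - org.field_of_influence()


# Structure taken from the description of Figure 3.
sp5, sp6 = Organisation("SP5"), Organisation("SP6")
sp2, sp3 = Organisation("SP2"), Organisation("SP3")
sp4 = Organisation("SP4", [sp5, sp6])
sp1 = Organisation("SP1", [sp2, sp3, sp4])
sp7 = Organisation("SP7")
c1 = Organisation("C1", [sp1, sp7])
```

Note that hidden_from takes a global view that no real participant has; it is included only to check the encapsulation claims in the text, for example that SP5 and SP6 are hidden from SP1.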
1 Not all possible fields of view and influence in the figure are shown for reasons of clarity. The real situation is likely to contain more partners in the fields of view and influence, as each actor will have many interactions from multiple unrelated SLAs, and each actor will have to manage the demands of each of these SLAs.
3 The BREIN VO Concept
Our real-world economic focus also influences our concept of the BREIN Virtual Organisation (VO). Classical grid-based VO approaches generally assume that providers expose their resources for sharing with others in the network. Whilst this is applicable in a shared-resources academic approach, it does not hold true in an economic world, where organisations sell services made from resources rather than the resources themselves. Thus, the traditional Grid VO approach is not compatible with the supply and value chains of actual businesses, and hence there was a need to adopt an approach that reflects a generic customer-provider interaction pattern, as this is the fundamental pattern for interaction in real businesses. The next sections describe the BREIN VO using examples from the BREIN airport scenario concerning ground handling services.
As with the field of influence, a BREIN VO is centric to a main contractor who defines a complex goal to be realised via a (virtual) collaboration and orchestrates subcontracted parties to achieve that goal via bipartite agreements. Such a goal definition contains all the collaboration details, i.e. goals, contractual scope, capabilities, requirements and limitations. With participants hiding their infrastructure and with VOs allowing for simple integration of outsourced resources into the local business processes, Service Providers may create their own VO within another VO, i.e. they may use an encapsulated subcontract to provide the capabilities they expose to external customers. Figure 4 shows an example of this using BREIN’s airport user-partner scenario.
Fig. 4. Three nested VOs in an Airport Context (diagram: Passengers; Airline as VO1.MC; Airport as VO1.C and VO2.MC; Ground Handling as VO1.SC, VO2.C and VO3.MC; Buses and Gangway as VO3.C; the parties in each VO are linked by contractual bindings. Legend: MC – Main Contractor, C – Contractor, SC – Sub Contractor (invisible))
This figure includes several parties involved in the supply chain related to airport management. These parties can form different VOs (e.g. VO1, VO2, and VO3) that meet specific goals, but there is an implicit relationship between the contracts of the linked VOs, and this is determined by the supply chain formed by the VOs that has the overall goal of servicing the airline. The subcontracted parties may not know the overall goal, but will know the goals of their SLA with their customer. This means e.g. that the airport will negotiate a contract with a ground handling service provider in order to fulfil the contract with the airline – whilst the airline is unconcerned with respect to how this requirement was fulfilled. Though the VO2 contract may actually base itself on the VO1 contract, it is more likely that a completely new contract is generated to ensure additional business aspects, but the critical point is that the VO2
contract must not prevent the Airport fulfilling its contract with the Airline in VO1, and the Airport must ensure this in its contract negotiations with the Ground Handling service.
3.1 Workflow and VO Lifecycle
A typical BREIN application from either end-user scenario can be seen as a sequence of steps to be executed: a workflow. We assert that within the application workflow, there is an orthogonal business workflow that is executed by the main contractor enacting the application workflow. For each application step (or group of steps), a business process that follows the VO lifecycle is required to execute the application step in a service-based environment. The BREIN VO lifecycle follows the typical VO lifecycle that is widely used (for example see [12]) and contains the phases: Identification (find providers), Formation (make agreements with providers), Operation (use the providers’ services) and Dissolution (dissolve the VO). The relationship between an end-user's application focus and the VO lifecycle is given next by way of an example from the BREIN Virtual Engineering (VE) scenarios. In one case study, the effects of placing an electricity-generating wind turbine in a particular location are to be determined. The simulation tools in the VE scenario are able to provide insights into the wind effects on the turbine and wake interactions between it and neighbouring turbines at that location. A terrain model of where the turbines are to be located is supplied by the user, and the simulation is required to analyse the effects of the wind from multiple different compass-point directions and at different speeds. For example, if there are 12 compass-point directions and five chosen wind speeds per compass-point, the actual task comprises 60 separate simulations: one per compass-point and wind speed combination. These simulations are independent, and therefore they can be run in parallel to provide a fast answer to the user.
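The size and independence of this parameter sweep can be illustrated with a small Python sketch. The concrete direction and speed values, the run_simulation stand-in and the use of a local thread pool are all assumptions for illustration; in the scenario described here each case would be dispatched to a contracted service provider rather than to a local worker.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical values: 12 compass-point directions and 5 wind speeds.
DIRECTIONS_DEG = [i * 30 for i in range(12)]
WIND_SPEEDS_MS = [5, 10, 15, 20, 25]


def run_simulation(case):
    """Stand-in for one independent simulation run (placeholder result)."""
    direction_deg, speed_ms = case
    return {"direction_deg": direction_deg, "speed_ms": speed_ms, "status": "done"}


def run_sweep(max_workers=8):
    """Enumerate every direction/speed combination and run them concurrently."""
    cases = list(product(DIRECTIONS_DEG, WIND_SPEEDS_MS))  # 12 x 5 = 60 cases
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves the order of `cases` in the returned results.
        return list(pool.map(run_simulation, cases))


results = run_sweep()
```

Because no case depends on the output of another, the cases can be distributed over as many workers or providers as are available, which is what makes the step a natural candidate for outsourcing to several providers at once.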
The workflow corresponding to this simulation comprises three major steps: creation of an input "mesh" that represents the wind turbines located in the terrain in a format the simulation software can understand, the 60 independent simulations themselves, and the aggregation and post-processing of the numerical simulation outcomes into a visual form easily understandable by humans. The workflow for this application and its relationship to the business process required to make each step executable is shown in Figure 5. The figure shows that the application-focused workflow (the vertical path) is the solution to the customer goal and is enacted by the main contractor, but the main contractor will not execute it completely in house. Instead, they will sub-contract all the processing (forming VOs) and manage the interactions with the subcontracted service providers (in accordance with the encapsulation principle). Each application step to be performed by the main contractor thus requires the creation of a VO that can perform that application step and follows the VO lifecycle (i.e. find, agree, execute). This is represented by the horizontal paths in Figure 5. The identification phase corresponds to the task of locating and selecting external suppliers that can perform the application step. In a marketplace of services, there may be many providers that are able to provide the functionality for one step, and therefore selection is required.
[Figure 5: the abstract (application-focused) workflow runs vertically through three application steps: (1) create simulation mesh, (2) run simulation, (3) post-process output for visualisation. Each step has a horizontal business process (VO workflow): Find (Identification), Agree SLA (Formation), Execute (Operation & Evolution). Step 1 uses provider 1; step 2 agrees SLAs with, and runs the simulation at, providers 2, 3 and 4; step 3 uses provider 5.]
Fig. 5. Application-Focused Workflow and VO-Workflow
The formation phase comprises the signing of the bipartite SLAs resulting from the identification phase; once they are signed, the partners are committed to the VO for the purposes of providing the service quoted in the SLA.
The operation and evolution phase of the VO lifecycle involves actually invoking the services and orchestrating them together. The main contractor must orchestrate the providers and their services to deliver the overall application workflow. For this example, orchestration requires uploading input data to the service providers, executing the processing at each provider, monitoring the executing processes at each provider to determine whether they have finished or whether there is a problem that requires action, collating and transferring any output data to another provider that requires it as input, and downloading the final output.
The evolution phase of the VO lifecycle may involve more complex handling. Evolution means that the VO is changed in some way during its life. An example of this is shown in Figure 6 and is related to monitoring the execution: if the main contractor detects a problem with one of the service providers running the simulation, then adaptation will need to occur. The strategy for adaptation will depend on the circumstances of the problem and on the strategy of the main contractor. For example, if a provider is running too slowly, a different SLA promising higher performance may be selected, or if there is a total failure, a replacement service provider may be sought.
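One monitoring pass and the resulting adaptation decisions can be sketched as below. The text stresses that the adaptation strategy depends on the circumstances and on the main contractor, so the fixed mapping here is only one assumed policy; the `Status` and `Action` names are hypothetical and not taken from BREIN.

```python
from enum import Enum


class Status(Enum):
    RUNNING = "running"
    FINISHED = "finished"
    SLOW = "slow"
    FAILED = "failed"


class Action(Enum):
    WAIT = "wait"
    COLLECT_OUTPUT = "collect output"
    UPGRADE_SLA = "select higher-performance SLA"
    REPLACE_PROVIDER = "find replacement provider"


def choose_action(status):
    """Map a monitored provider status to the main contractor's next action."""
    if status is Status.FINISHED:
        return Action.COLLECT_OUTPUT
    if status is Status.SLOW:
        return Action.UPGRADE_SLA        # evolution: renegotiate performance
    if status is Status.FAILED:
        return Action.REPLACE_PROVIDER   # evolution: start a new VO workflow
    return Action.WAIT


def monitor(statuses):
    """Return the action chosen for each provider in one monitoring pass."""
    return {provider: choose_action(status) for provider, status in statuses.items()}
```

A real orchestrator would repeat such a pass periodically and feed `REPLACE_PROVIDER` decisions back into a fresh find, agree, execute sequence for the affected task.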
Fig. 6. VO Adaptation - Find New Provider
This second example requires a new VO lifecycle for the replacement simulation task, as illustrated in Figure 6. Here provider 4 has failed, and the main contractor has decided to locate a new provider to perform the simulation tasks in its place. A new VO workflow is dedicated to locating a provider, signing an SLA and running the simulation in place of provider 4. The main contractor must orchestrate this new provider and VO workflow into the existing workflow to ensure continuous operation.
As with the nested VOs described previously, the overall responsibility for the whole workflow belongs to the main contractor: they have to manage the subcontracted providers to achieve the customer's request. In this way, the entire application workflow may be encapsulated for its user, who need never be aware of the complexity behind what appears to be an atomic service. The user benefits in that a complex-to-manage service is simple to use and can be offered with SLA-guaranteed terms. This is the reason for the main contractor's business offering: they hide the complexity and take responsibility for managing the complicated, composite service.
4 Conclusions
In this paper, we have presented the BREIN project's architectural approach to business interactions via its architectural principles and VO concept. We have related these elements to an application-focused workflow, which provides actual benefit for real-world users. We have advocated the principle of a 1-1 business relationship, typically between a customer and a provider, as it is a fundamental building block used throughout the real business world. We have also discussed the encapsulation of resources into services, together with the encapsulation of outsourcing, where subcontracted service providers are liable to their main contractor. This is the essence of the BREIN VO concept, which is centred on, and managed by, the main contractor, who bears the responsibility for the VO to deliver value to its customers. The BREIN VO has an execution lifecycle for the creation and operation of the VO, and we have related this to an application-focused workflow involving steps that provide real value to end-users. We have shown how this can be applied to an engineering simulation application, and that the workflow can be adapted should the need arise.
Acknowledgments. This work has been supported by the BREIN project (http://www.gridsforbusiness.eu) and has been partly funded by the European Commission's IST activity of the 6th Framework Programme under contract number 034556. This paper expresses the opinions of the authors and not necessarily those of the European Commission. The European Commission is not liable for any use that may be made of the information contained in this paper.
References
1. Porter, M.: Competitive Advantage. Free Press, New York (2004) ISBN-10: 0743260872, ISBN-13: 978-0743260879
2. Supply-Chain Council (eds.): Supply Chain Operations Reference Model (SCOR®) Overview (2007), http://www.supply-chain.org
3. Parnas, D.: On the Criteria to Be Used in Decomposing Systems into Modules. Communications of the ACM (December 1972)
4. Herenger, H., Heek, R., Kubert, R., Surridge, M.: Operating Virtual Organizations Using Bipartite Service Level Agreements, Grid Middleware and Services: Challenges and Solutions. In: Talia, D., Yahyapour, R., Ziegler, W. (eds.) Association with the 8th IEEE International Conference on Grid Computing (Grid 2007). Springer, Heidelberg (2008)
5. NextGrid – Architecture for Next Generation Grids, http://www.nextgrid.org/
6. Dimitrakos, T., Gaeta, M., Serhan, B., et al.: An emerging architecture enabling Grid based application service provision. In: Proceedings of the 7th International Conference on Enterprise Distributed Object Computing,
7. Foster, I., Kesselman, C., Nick, J.M., Tuecke, S.: The Physiology of the Grid
8. The IBM Group: Why Virtualization Matters to the Enterprise Today (2007), ftp://ftp.software.ibm.com/common/ssi/rep_wh/n/OIW03008USEN/OIW03008USEN.PDF
9. Nash, A.: Service Virtualization – Key to Managing Change in SOA (2006), http://www.bitpipe.com/detail/RES/1130171201_512.html
10. Haller, J., Schubert, L., Wesner, S.: Private Business Infrastructures in a VO Environment. In: Cunningham, P., Cunningham, M. (eds.) Exploiting the Knowledge Economy - Issues, Applications, Case Studies, pp. 1064–1071 (2006)
11. Terracina, A., Kirkham, T., et al.: Orchestration and Workflow in a mobile Grid environment. In: Proceedings of the Fifth International Conference on Grid and Co-operative Computing Workshops, GCCW 2006 (2006)
12. Saabeel, W., Verduijn, T., Hagdorn, L., Kumar, K.: A Model for Virtual Organisation: A Structure and Process Perspective. Electronic Journal of Organizational Virtualness (2002)
Author Index
Altmann, Jörn 46, 116
Bernarding, Johannes 150
Bodenstein, Christian 1
Borissov, Nikolay 160
Brandic, Ivona 60
Caton, Simon 160
Courcoubetis, Costas 28, 46
Dickmann, Frank 150
Dustdar, Schahram 60
Falkner, Jürgen 140
Fleming, Alan 46
Guo, Li 46
Hedwig, Markus 1
Hertel, Frank 150
Hwang, Junseok 116
Jacob, Ansger 88
Kaspar, Mathias 150
Kepper, Nick 150
Knoch, Tobias A. 150
Konstanteli, Kleopatra 102
Krefting, Dagmar 150
Kyriazis, Dimosthenis 102
Laria, Giuseppe 171
Leitner, Philipp 60
Lesnussa, Michael 150
Levine, Aharon 160
Löhnhardt, Benjamin 150
Manneback, Pierre 15
Mason, Robin 28
Miliou, Natalia 28
Mohammed, Ashraf Bany 116
Mohammed, Yassene 150
Müller, Marcus 88
Music, Dejan 60
Neumann, Dirk 1
Noël, Sébastien 15
Papagiannis, Ioannis 102
Parrilli, Davide Maria 128
Püschel, Tim 1
Racz, Peter 88
Rana, Omer 160
Risch, Marcel 46
Ritrovato, Pierluigi 171
Sax, Ulrich 150
Schubert, Lutz 171
Silaghi, Gheorghe Cosmin 15
Steinke, Thomas 150
Strebel, Jörg 74
Surridge, Mike 171
Taylor, Steve 171
Thiel, Andreas 150
Tserpes, Konstantinos 102
Varvarigou, Theodora 102
Viezens, Fred 150
Volk, Eugen 88
Waldburger, Martin 88
Weisbecker, Anette 140