Lecture Notes in Computer Science Commenced Publication in 1973 Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board David Hutchison Lancaster University, UK Takeo Kanade Carnegie Mellon University, Pittsburgh, PA, USA Josef Kittler University of Surrey, Guildford, UK Jon M. Kleinberg Cornell University, Ithaca, NY, USA Alfred Kobsa University of California, Irvine, CA, USA Friedemann Mattern ETH Zurich, Switzerland John C. Mitchell Stanford University, CA, USA Moni Naor Weizmann Institute of Science, Rehovot, Israel Oscar Nierstrasz University of Bern, Switzerland C. Pandu Rangan Indian Institute of Technology, Madras, India Bernhard Steffen TU Dortmund University, Germany Madhu Sudan Microsoft Research, Cambridge, MA, USA Demetri Terzopoulos University of California, Los Angeles, CA, USA Doug Tygar University of California, Berkeley, CA, USA Gerhard Weikum Max Planck Institute for Informatics, Saarbruecken, Germany
7201
Jens B. Schmitt (Ed.)
Measurement, Modelling, and Evaluation of Computing Systems and Dependability and Fault Tolerance 16th International GI/ITG Conference MMB & DFT 2012 Kaiserslautern, Germany, March 19-21, 2012 Proceedings
Volume Editor Jens B. Schmitt University of Kaiserslautern disco - Distributed Computer Systems Lab Computer Science Department Building 36, P.O. Box 3049 67663 Kaiserslautern, Germany E-mail:
[email protected]
ISSN 0302-9743 e-ISSN 1611-3349 ISBN 978-3-642-28539-4 e-ISBN 978-3-642-28540-0 DOI 10.1007/978-3-642-28540-0 Springer Heidelberg Dordrecht London New York Library of Congress Control Number: 2012932064 CR Subject Classification (1998): C.2, C.4, C.1, D.2.8, D.2, D.4.8, D.4 LNCS Sublibrary: SL 2 – Programming and Software Engineering © Springer-Verlag Berlin Heidelberg 2012 This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India Printed on acid-free paper Springer is part of Springer Science+Business Media (www.springer.com)
Preface
This volume contains a selection of the papers presented at the 16th International GI/ITG Conference on Measurement, Modelling and Evaluation of Computing Systems and Dependability and Fault Tolerance (MMB & DFT 2012), held during March 19–21, 2012 in Kaiserslautern and hosted by the University of Kaiserslautern. MMB & DFT 2012 covered diverse aspects of the performance and dependability evaluation of systems, including networks, computer architectures, distributed systems, software, and fault-tolerant and secure systems. This biennial conference has a long tradition, starting as early as 1981.
Besides its main scientific program, MMB & DFT 2012 comprised two keynotes, two tutorials from academic and industrial experts, several tool presentations, as well as three workshops. Specifically, we were very happy to have the keynote talks by Anja Feldmann (TU Berlin / Deutsche Telekom Laboratories) on "Internet Architecture Trends" and by Lothar Thiele (ETH Zürich) on "Modeling and Evaluation of Thermal System Properties." In this edition of MMB & DFT, we had the specialty of integrated workshops featuring particular topics (with their own calls for papers):
– Workshop on Network Calculus (WoNeCa), organized by Anne Bouillard (ENS, France), Markus Fidler (Leibniz University of Hannover), and Florin Ciucu (TU Berlin / Deutsche Telekom Laboratories)
– Workshop on Modeling and Analysis of Complex Reaction Networks (MACoRN), organized by Werner Sandmann (TU Clausthal) and Verena Wolf (Saarland University)
– Workshop on Physically Augmented Security for Wireless Networks (PILATES), organized by Matthias Hollick (TU Darmstadt), Ivan Martinovic (University of Oxford), and Dirk Westhoff (HAW Hamburg)
Overall we received 54 submissions, 36 to the main conference (including 6 tool papers) and 18 to the workshops, by authors from 17 different countries. Each submission was reviewed by at least 3, and on average 3.9, Program Committee members. In a physical TPC meeting with further technical discussions, 26 of these submissions were selected for inclusion in this volume.
On behalf of the TPC, we would like to thank all authors who submitted their work to MMB & DFT 2012. We hope that all authors appreciate the hard work of the TPC members and found their feedback and suggestions valuable. We would like to express our gratitude to all members of the TPC and to the external reviewers for being so responsive and for their timely and valuable reviews.
We are grateful to everyone involved in the organization of the MMB & DFT 2012 conference, as well as to the speakers and the attendees of the conference. We also appreciate the excellent support of EasyChair in managing the processes of submission, reviewing, and preparing the final version of the proceedings. January 2012
Jens B. Schmitt
Organization
MMB & DFT 2012 was organized by the Distributed Computer Systems Lab, University of Kaiserslautern, Germany.
Organizing Committee
General and Program Chair: Jens Schmitt
Local Organization Chairs: Steffen Bondorf, Steffen Reithermann, Carolin Reffert-Schmitt
Tools Chair: Hao Wang
Submission Chair: Matthias Wilhelm
Publication Chair: Michael Beck
Web Chair: Adam Bachorek
Publicity Chair: Wint Yi Poe
Program Committee
Lothar Breuer, University of Kent, UK
Peter Buchholz, TU Dortmund, Germany
Joachim Charzinski, Hochschule der Medien Stuttgart, Germany
Hans Daduna, Universität Hamburg, Germany
Klaus Echtle, Universität Duisburg-Essen, Germany
Bernhard Fechner, Universität Augsburg, Germany
Markus Fidler, Leibniz Universität Hannover, Germany
Reinhard German, Universität Erlangen-Nürnberg, Germany
Boudewijn Haverkort, University of Twente, The Netherlands
Gerhard Haßlinger, Deutsche Telekom, Germany
Holger Hermanns, Universität des Saarlandes, Germany
Joost-Pieter Katoen, RWTH Aachen, Germany
Jörg Keller, FernUniversität in Hagen, Germany
Peter Kemper, The College of William and Mary, USA
Udo Krieger, Otto-Friedrich-Universität Bamberg, Germany
Wolfram Lautenschläger, Alcatel-Lucent, USA
Axel Lehmann, Universität der Bundeswehr München, Germany
Ralf Lehnert, TU Dresden, Germany
Erik Maehle, Universität zu Lübeck, Germany
Michael Menth, Universität Tübingen, Germany
Bruno Müller-Clostermann, Universität Duisburg-Essen, Germany
Peter Reichl, Forschungszentrum Telekommunikation Wien, Austria
Anne Remke, University of Twente, The Netherlands
Johannes Riedl, Siemens AG, Germany
Francesca Saglietti, Universität Erlangen-Nürnberg, Germany
Werner Sandmann, TU Clausthal, Germany
Jens Schmitt, TU Kaiserslautern, Germany
Markus Siegle, Universität der Bundeswehr München, Germany
Helena Szczerbicka, Leibniz Universität Hannover, Germany
Aad Van Moorsel, Newcastle University, UK
Oliver Waldhorst, Karlsruher Institut für Technologie, Germany
Max Walter, TU München, Germany
Verena Wolf, Universität des Saarlandes, Germany
Bernd Wolfinger, Universität Hamburg, Germany
Katinka Wolter, FU Berlin, Germany
Armin Zimmermann, TU Ilmenau, Germany
Additional Reviewers
Hernán Baró Graf, Universität des Saarlandes, Germany
Matthias Becker, Leibniz Universität Hannover, Germany
Martin Drozda, Leibniz Universität Hannover, Germany
Christian Eisentraut, Universität des Saarlandes, Germany
Philipp Eittenberger, Otto-Friedrich-Universität Bamberg, Germany
Luis María Ferrer Fioriti, Universität des Saarlandes, Germany
Klaus-Dieter Heidtmann, Universität Hamburg, Germany
Michael Hoefling, Universität Tübingen, Germany
Oliver Hohlfeld, TU Berlin, Germany
Andrey Kolesnikov, Universität Hamburg, Germany
Minh Lê, TU München, Germany
Alfons Martin, Universität Tübingen, Germany
Linar Mikeev, Universität des Saarlandes, Germany
Jorge Perez-Hidalgo, TU Dresden, Germany
Martin Riedl, Universität der Bundeswehr München, Germany
Johann Schuster, Universität der Bundeswehr München, Germany
Falak Sher, RWTH Aachen, Germany
David Spieler, Universität des Saarlandes, Germany
Mark Timmer, University of Twente, The Netherlands
Sebastian Vastag, TU Dortmund, Germany
Hannes Weisgrab, Forschungszentrum Telekommunikation Wien, Austria
Keynote Talks at MMB & DFT 2012
Modeling and Evaluation of Thermal System Properties Lothar Thiele, ETH Zurich
[email protected]
Power density has been continuously increasing in modern processors, leading to high on-chip temperatures. A system could fail if the operating temperature exceeds a certain threshold, leading to low reliability and even chip burnout. There have been many results in recent years on thermal management, including (1) thermal-constrained scheduling to maximize performance or determine the schedulability of real-time systems under given temperature constraints, (2) peak temperature reduction to meet performance constraints, and (3) thermal control by applying control theory for system adaptation. The presentation will cover challenges, problems, and approaches to real-time scheduling under temperature constraints for single- as well as multi-processors.
Internet Architecture Trends Anja Feldmann, TU Berlin / T-Labs
[email protected]
The ever-growing demand for information of Internet users is putting a significant burden on the current Internet infrastructure, whose architecture has remained more or less unchanged over the last 30 years. Indeed, rather than adjusting the architecture, small fixes, e.g., MPLS, have been deployed within the core network. Today, new technical abilities enable us to rethink the Internet architecture. In this talk we first highlight how Internet usage has changed in the area of user-generated content. Then we explore two technology trends: cloud networks and open hardware/software interfaces. Virtualization, a main motor for innovation, decouples services from the underlying infrastructure and allows for resource sharing while ensuring performance guarantees. Server virtualization is widely used, e.g., in clouds. However, cloud virtualization alone is meaningless without taking into account the network needed to access the cloud resources and data: cloud networks. Current infrastructures are limited to the tools provided by the hardware vendors, as there are hardly any open software stacks available for network devices in the core. This hurts innovation. However, novel programming interfaces for network devices, e.g., OpenFlow, provide open hardware/software interfaces and may enable us to build a network OS with novel features. We outline initial work in this area.
Table of Contents
Full Papers Availability in Large Networks: Global Characteristics from Local Unreliability Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hans Daduna and Lars Peter Saul
1
Stochastic Analysis of a Finite Source Retrial Queue with Spares and Orbit Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Feng Zhang and Jinting Wang
16
Bounds for Two-Terminal Network Reliability with Dependent Basic Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Minh Lê and Max Walter
31
Software Reliability Testing Covering Subsystem Interactions . . . . . . . . . . Matthias Meitner and Francesca Saglietti
46
Failure-Dependent Timing Analysis - A New Methodology for Probabilistic Worst-Case Execution Time Analysis . . . . . . . . . . . . . . . . . . . Kai Höfig
61
A Calculus for SLA Delay Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Sebastian Vastag
76
Verifying Worst Case Delays in Controller Area Network . . . . . . . . . . . . . . Nikola Ivkovic, Dario Kresic, Kai-Steffen Hielscher, and Reinhard German
91
Lifetime Improvement by Battery Scheduling . . . . . . . . . . . . . . . . . . . . . . . . Marijn R. Jongerden and Boudewijn R. Haverkort
106
Weighted Probabilistic Equivalence Preserves ω-Regular Properties . . . . . Arpit Sharma
121
Probabilistic CSP: Preserving the Laws via Restricted Schedulers . . . . . . Sonja Georgievska and Suzana Andova
136
Heuristics for Probabilistic Timed Automata with Abstraction Refinement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Luis María Ferrer Fioriti and Holger Hermanns
151
Simulative and Analytical Evaluation for ASD-Based Embedded Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ramin Sadre, Anne Remke, Sjors Hettinga, and Boudewijn Haverkort
166
Reducing Channel Zapping Delay in WiMAX-Based IPTV Systems . . . . Alireza Abdollahpouri and Bernd E. Wolfinger
182
Performance Evaluation of 10GE NICs with SR-IOV Support: I/O Virtualization and Network Stack Optimizations . . . . . . . . . . . . . . . . . Shu Huang and Ilia Baldine
197
Business Driven BCM SLA Translation for Service Oriented Systems . . . Ulrich Winkler, Wasif Gilani, and Alan Marshall
206
Boosting Design Space Explorations with Existing or Automatically Learned Knowledge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ralf Jahr, Horia Calborean, Lucian Vintan, and Theo Ungerer
221
Tool Papers IBPM: An Open-Source-Based Framework for InfiniBand Performance Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Michael Hoefling, Michael Menth, Christian Kniep, and Marcus Camen
236
A Workbench for Internet Traffic Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . Philipp M. Eittenberger and Udo R. Krieger
240
A Modelling and Analysis Environment for LARES . . . . . . . . . . . . . . . . . . Alexander Gouberman, Martin Riedl, Johann Schuster, and Markus Siegle
244
Simulation and Statistical Model Checking for Modestly Nondeterministic Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jonathan Bogdoll, Arnd Hartmanns, and Holger Hermanns
249
UniLoG: A Unified Load Generation Tool . . . . . . . . . . . . . . . . . . . . . . . . . . . Andrey Kolesnikov
253
Selected Workshop Papers Non Preemptive Static Priority with Network Calculus: Enhancement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . William Mangoua Sofack and Marc Boyer
258
A Demand-Response Calculus with Perfect Batteries . . . . . . . . . . . . . . . . . Jean-Yves Le Boudec and Dan-Cristian Tomozei
273
A Formal Definition and a New Security Mechanism of Physical Unclonable Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Rainer Plaga and Frank Koob
288
Modeling and Analysis of a P2P-VoD System Based on Stochastic Network Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kai Wang, Yuming Jiang, and Chuang Lin
302
Using NFC Phones for Proving Credentials . . . . . . . . . . . . . . . . . . . . . . . . . . Gergely Alpár, Lejla Batina, and Roel Verdult
317
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
331
Availability in Large Networks: Global Characteristics from Local Unreliability Properties Hans Daduna and Lars Peter Saul University of Hamburg, Department of Mathematics, Mathematical Statistics and Stochastic Processes, Bundesstrasse 55, 20146 Hamburg, Germany
Abstract. We apply mean-field analysis to compute global availability in large networks of generalized SIS and voter models. The main results provide comparison and bounding techniques for the global availability depending on the local degree structure of the networks. Keywords: Reliability, SIS model, voter models, mean field analysis, stochastic ordering, convex order, bounding global availability.
1 Introduction
The research described in this paper is on reliability theory for large networks of interacting components. We are interested in the characterization of the availability of network resources, which will be described as the availability of a typical node under a global averaging process. The technical tool we shall use to quantify global availability is mean field analysis. This averaging principle is a standard technique in statistical mechanics and has recently found interest in other fields where large systems of interacting simple entities and their quantitative behavior are described. This growing interest in mean field analysis seems to parallel the emergence of network science in different fields of application. An example is the theory of social networks; for a review see [Lam09]. Our starting point is the class of epidemic models used by Jackson and Rogers [JR07], which builds on the mathematical theory of a generalized SIS model describing epidemics spreading in populations (SIS ≡ Susceptible-Infected-Susceptible). Similar SIS models have recently been used to describe and investigate peer-to-peer networks, the propagation of computer viruses in the Internet, the diffusion of innovation in economic communities, and networks of mobile computers in which the interconnection of the computers is organized by (push- or pull-like) protocols which determine a regime for the continuous exchange of local information about the status of the network's members ("gossiping"). Mean field analysis has been applied to gossip-based models for the diffusion of information in large networks, e.g., in [BCFH11], [BCFH09], and furthermore in [BGFv09] and [BMM07].
J.B. Schmitt (Ed.): MMB & DFT 2012, LNCS 7201, pp. 1–15, 2012.
© Springer-Verlag Berlin Heidelberg 2012
We set available ≡ susceptible and unavailable ≡ infected in our availability analysis, and our aim is to determine the portion of available (functioning) nodes over a large network. The formulas obtained allow a parametric analysis of global network availability depending on the local characteristics of the nodes. A similar investigation was undertaken by Jackson and Rogers [JR07] with respect to the diffusion of innovation in economic communities. Some of our research directions are influenced by that paper, although the application we have in mind requires more complicated behavior of the individual nodes than the SIS models provide. Fundamental for SIS models is that susceptible individuals become infected at a rate that is strongly dependent on the number of infected neighbors, while the recovery of infected individuals follows a rule which is independent of the status of their neighbors, see (1) and (3) below. The latter assumption is unrealistic in the context of unreliable networks of queues. A more appropriate assumption is that the breakdown structure and the repair structure should be of a comparable level of complexity. This is the case, e.g., in the well-established principle of reduced workload or capacity: in a queueing network with unreliable nodes, the individual availability of each node (the portion of time the node is up) is computed, and its service capacity is then reduced to that portion. For more details and more elaborate versions of this principle see [CM96]. Voter models [Lig85] can be considered as a symmetrization of the classical SIS model, where changing state (e.g., political preferences) in both directions follows the same mechanism, defined by the infection process. In a similar spirit we will symmetrize the transition mechanisms of Jackson and Rogers [JR07] and thereby arrive at a network model where the individual nodes' behavior follows the rules of a generalized voter model and a generalized SIS model.
1.1 Connections to Network Science and Epidemic Models
Queueing networks and their structure are closely related to the networks of the recently emerging network science, but are usually of a higher order of complexity with respect to the individual nodes' behavior. In network science, in almost any case we can think of the network as consisting of individuals and the interconnections between them; the individuals are represented as vertices and the connections between individuals as edges of a graph. Therefore our models are from the realm of graph theory and, because of the emergence of random effects and influences, of random graph theory. The emergence of "network science" over the last decade relies on different predecessor fields where large-scale structures, described by graphs, constitute an important aspect of real-world phenomena, like social, biological, physical, and informational networks. The rapid development of the still very diverse field has resulted in several recent books and surveys, e.g., [DM03], [DM10], [Jac08], [BBV08]. The diffusion mechanisms which describe, e.g., the propagation of information in networks are borrowed from models for the spreading of epidemics. Modeling of epidemics is described and surveyed in, e.g., the classical book [Bai75] and in [AB00], [DG01]. Epidemics are described as contact processes on regular lattices
in the theory of Markovian interacting particle systems [Lig85][Chapter VI]. The models in the present paper generalize the models from [JR07], [LP08]. The research presented here is part of our ongoing investigation of queueing networks of the Jackson or Gordon-Newell type under the condition that the nodes (servers, stations) are unreliable and can break down. Working periods and subsequent repair phases of the servers are random. For integrated models which encompass performance analysis and availability in a closed model (performability analysis) see [SD03], [Sau06], [HMRT01]. Another related field is the modeling and investigation of disruptive (delay-tolerant) wireless (possibly mobile) sensor networks, see [WDW07].
2 Availability Modeling
2.1 Finite Networks
In this section we describe the behavior of the networks on the micro-level in finite systems. This will make the mean field model better understandable. To describe the availability of interacting nodes we use a finite undirected graph G = (V, E) with vertices V = {1, 2, . . . , J} and edges E ⊆ V² \ diag(V²) without multiple edges between vertices. The vertices represent stations which are either up ≡ functioning ≡ susceptible (= state 0 for the node) or down ≡ under repair ≡ infected (= state 1 for the node). Nodes interact with one another if they are neighbors, i.e., if they are connected by an edge. Ni := {j ∈ V : (i, j) ∈ E} is the neighborhood of node i and di := |Ni| is the degree of node i. An important characteristic of the network is the degree distribution p = (p(d) : d ∈ IN), which is defined as a discrete probability with finite support
p(d) = |{i ∈ V : di = d}| / J,    d ∈ IN.
Although it is a deterministic quantity for a given network, it will be considered in the following as a statistical descriptor of the network. Unless otherwise specified, we will always require that p(0) < 1 holds. The states of the network are vectors n = (n1, . . . , nJ) ∈ {0, 1}^J =: S which describe the states of all nodes (up or down). We assume that the development of the system can be described by a continuous-time (time-homogeneous) Markov process X = (X(t) : t ≥ 0) with state space {0, 1}^J. This requires prescribing the non-zero transition intensities (rates) Q = (q(n, m) : n, m ∈ S) for X. These rates are characterized by parameters ν, δ > 0 and are proportional to the respective status of the node's neighborhood and a constant drift x, y ≥ 0.
Definition 1 (Neighborhood dependent local breakdown rates and repair rates). Assume that the state of the network at time t ≥ 0 is n ∈ S.
If node i ∈ {1, . . . , J} is up, i.e. ni = 0, its breakdown rate is
q((n1, . . . , ni−1, 0, ni+1, . . . , nJ), (n1, . . . , ni−1, 1, ni+1, . . . , nJ)) = ν( Σ_{j∈Ni} nj + x ),    (1)
where ν > 0 is the spreading rate, i.e., a parameter describing the amount of breakdown rate similar to an infection transmission, and x ≥ 0 is a constant rate with which a functioning node breaks down independently of the status of its neighborhood. If node i ∈ {1, . . . , J} is down, i.e. ni = 1, then its repair rate is
q((n1, . . . , ni−1, 1, ni+1, . . . , nJ), (n1, . . . , ni−1, 0, ni+1, . . . , nJ)) = δ( Σ_{j∈Ni} (1 − nj) + y ),    (2)
where δ > 0 is, in the epidemic context, the spreading rate of recovering, and y ≥ 0 is a constant rate with which an infected individual becomes susceptible, not depending on the states of its neighborhood.
A repair rate which depends on the behavior of the nodes in the neighborhood in a similar way as the breakdown rate is a reasonable property in modeling unreliable queueing networks. The breakdown-repair models arising then are variants of the well-known voter model [BBV08][Section 10.3], in the context of Markov interacting particle systems, see e.g. [Lig85][Chapter 5]. The local "repair rates" in the model from [JR07] are: If at time t ≥ 0 in state n ∈ S node i ∈ {1, . . . , J} is down, i.e. ni = 1, then its repair rate is
q((n1, . . . , ni−1, 1, ni+1, . . . , nJ), (n1, . . . , ni−1, 0, ni+1, . . . , nJ)) = δ.    (3)
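The three local rates can be sketched directly. The following minimal Python illustration is our own; the graph encoding, function names, and parameter values are illustrative choices, not part of the model specification:

```python
# Local rates of Definition 1: breakdown rate (1), neighborhood-dependent
# repair rate (2) of the voter-type model, and the neighborhood-independent
# repair rate (3) of the generalized SIS model.

def breakdown_rate(n, i, neighbors, nu, x):
    """Rate (1): node i (currently up, n[i] == 0) breaks down."""
    assert n[i] == 0
    return nu * (sum(n[j] for j in neighbors[i]) + x)

def repair_rate_voter(n, i, neighbors, delta, y):
    """Rate (2): node i (currently down, n[i] == 1) is repaired,
    driven by its functioning neighbors plus the constant drift y."""
    assert n[i] == 1
    return delta * (sum(1 - n[j] for j in neighbors[i]) + y)

def repair_rate_sis(delta):
    """Rate (3): neighborhood-independent repair (generalized SIS model)."""
    return delta

# A 3-node line graph 0 - 1 - 2; node 1 is down.
neighbors = {0: [1], 1: [0, 2], 2: [1]}
n = [0, 1, 0]
print(breakdown_rate(n, 0, neighbors, nu=2.0, x=0.5))        # 2.0 * (1 + 0.5) = 3.0
print(repair_rate_voter(n, 1, neighbors, delta=1.0, y=0.0))  # 1.0 * (2 + 0) = 2.0
```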
The process with transition mechanism (1), (3) is known as a generalized SIS model in the epidemics literature. The effective spreading rate in the network with neighborhood-dependent breakdown and repair rates is λ := ν/δ > 0, which is assumed throughout to be strictly positive. The process with transition mechanism (1), (2) is known as a (generalized) voter model. The distinction to the work in [LP06], [JY07] is that the rates do depend on the actual status (opinion) of the voter himself. As Lopez-Pintado [LP08][p. 576] remarked, an explicit analysis of X is extremely complicated. The way out of this is to consider approximate models with averaging over a large population, i.e., mean field analysis [JR07], [LP08].
Example 1. Assume that the network graph is complete, i.e. for any i ∈ V we have Ni = V \ {i}: any two nodes are connected by an edge. In this situation, under the above assumptions, the total number of infected individuals Z(t) at time t ≥ 0 is a Markov process Z = (Z(t) : t ≥ 0) with discrete state space {0, 1, 2, . . . , J}. In fact, it is a finite birth-death process with transition rates
q(n, n + 1) = (J − n) ν (n + x)  for n < J,
q(n, n − 1) = n δ (J − n + y)  for n > 0.
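Since Z in Example 1 is a finite birth-death process, its stationary distribution follows from the standard product formula π(n) ∝ Π_{k=1}^{n} q(k−1, k)/q(k, k−1). A minimal numeric sketch (parameters are illustrative; x, y > 0 keeps the chain irreducible):

```python
# Stationary distribution of the birth-death process Z from Example 1
# (complete graph) via the product formula for birth-death chains.

def stationary(J, nu, delta, x, y):
    birth = lambda n: (J - n) * nu * (n + x)   # q(n, n+1), n < J
    death = lambda n: n * delta * (J - n + y)  # q(n, n-1), n > 0
    w = [1.0]
    for n in range(1, J + 1):
        w.append(w[-1] * birth(n - 1) / death(n))
    total = sum(w)
    return [wi / total for wi in w]

pi = stationary(J=20, nu=0.3, delta=1.0, x=0.2, y=0.2)
mean_down = sum(n * pn for n, pn in enumerate(pi))
print(f"P(all up) = {pi[0]:.4f}, mean number of down nodes = {mean_down:.2f}")
```

For x = 0 (or y = 0) the state 0 (or J) becomes absorbing, so the product formula is only meaningful with both drifts positive.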
2.2 Infinite Networks: Mean Field Description for Neighborhood Dependent Breakdown and Repair Rates
We now assume that the graph of the network has an infinite number of vertices V := IN and is locally finite, i.e., any node i ∈ V has only a finite number di := |Ni| < ∞ of neighbors. The set of degree numbers di, i ∈ IN, need not be bounded. We assume that the degree distribution of the network is a well-defined discrete probability density p = (p(d) : d ∈ IN), and consider d : IN → IN as a random variable with distribution p. We interpret p as the degree distribution of a typical node in the network, or p(d) as the probability that a randomly chosen node has degree d. We always require that the average degree Ep(d) := Σ_{d∈IN} p(d) d < ∞ is finite. Unless stated otherwise (some remarks on totally disconnected networks will be given) we always require p(0) < 1. Note that under p(0) < 1 we have Ep(d) > 0.
Definition 2. The probability that a randomly selected link originates from a node with degree n (see [JR07][p. 3]) is q = (q(n) : n ∈ IN), computed as
q(n) := p(n) n / Σ_{d∈IN} p(d) d,    n ∈ IN.
The interplay of the following time-dependent quantities determines the analysis; for more details and a convenient interpretation see [JR07][p. 3] for rates (1), (3); a similar interpretation is appropriate in case of rates (1), (2):
– ρt(d) ∈ [0, 1]: average down rate among nodes with degree d at time t, defined as the portion of broken down nodes at time t among nodes with degree d,
– ρt = Σ_{d∈IN} p(d) ρt(d) ∈ [0, 1]: average down rate in the network at time t, defined as the portion of broken down nodes at time t in the network,
– θt = ( Σ_{d∈IN} p(d) d ρt(d) ) ( Σ_{d∈IN} p(d) d )⁻¹: the average neighbor down rate, defined as the portion of broken down nodes at time t in the neighborhood of a randomly selected node. θt can be interpreted as the probability that at time t some node at the end of a randomly selected link is broken down.
A mean-field approximation for the development of the system is determined by a system of differential equations which have an appealing intuitive interpretation. For all d ∈ IN
∂ρt(d)/∂t = (1 − ρt(d)) ν (θt d + x) − ρt(d) δ ((1 − θt) d + y),    t ≥ 0.    (4)
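For a concrete picture of the dynamics (4), a minimal forward-Euler integration can be sketched as follows; the degree distribution, the rates ν, δ, x, y, the initial condition, and the step size are illustrative choices of ours:

```python
# Forward-Euler integration of the mean-field equations (4).
# rho[d] tracks the average down rate among nodes with degree d;
# theta is recomputed from rho in every step.

p = {1: 0.3, 2: 0.4, 5: 0.3}          # illustrative degree distribution p(d)
nu, delta, x, y = 0.8, 1.0, 0.1, 0.1  # illustrative rates of Definition 1
Ed = sum(pd * d for d, pd in p.items())

rho = {d: 0.5 for d in p}             # initial down rates rho_0(d)
dt = 0.01
for _ in range(20000):                # integrate up to t = 200
    theta = sum(p[d] * d * rho[d] for d in p) / Ed   # average neighbor down rate
    for d in p:
        drho = (1 - rho[d]) * nu * (theta * d + x) \
               - rho[d] * delta * ((1 - theta) * d + y)
        rho[d] += dt * drho

rho_bar = sum(p[d] * rho[d] for d in p)  # average down rate in the network
print({d: round(r, 4) for d, r in rho.items()}, round(rho_bar, 4))
```

After a long integration time the right-hand side of (4) is numerically zero for every degree, i.e., the trajectory has settled into a stationary state of the mean-field equations.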
The most important analysis of the model is the search for steady states for ρt(d) and θt. This requires the left-hand side of (4) to be constantly zero and consequently the right-hand side to be independent of t. This makes the steady-state quantities ρ(d) and θ amenable to an equilibrium analysis, which yields
ρ(d) = λ(θd + x) / ( λ(θd + x) + (1 − θ)d + y )    (5)
and
θ = Σ_{d∈IN} ( p(d) d / Σ_{d∈IN} p(d) d ) · λ(θd + x) / ( λ(θd + x) + (1 − θ)d + y ).    (6)
Solving (6) for θ is equivalent to solving a fixed-point problem for the function
Hp(θ) : [0, 1] → IR,    θ ↦ Hp(θ) = Σ_{d∈IN} ( p(d) d / Σ_{d∈IN} p(d) d ) · λ(θd + x) / ( λ(θd + x) + (1 − θ)d + y ),
and the fixed points of Hp are by definition the stationary points (or stationary states) of the mean-field equations.
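Numerically, the fixed-point equation (6) can be solved by simple iteration θ ← Hp(θ), since Hp is a monotone map of [0, 1] into itself. A sketch under an illustrative degree distribution and parameter set of our own choosing:

```python
# Fixed-point iteration for theta = Hp(theta), equation (6), followed by
# the stationary down rates rho(d) from (5) and their network average.

def H(theta, p, lam, x, y):
    Ed = sum(pd * d for d, pd in p.items())
    return sum(
        pd * d / Ed * lam * (theta * d + x)
        / (lam * (theta * d + x) + (1 - theta) * d + y)
        for d, pd in p.items()
    )

def solve_theta(p, lam, x, y, theta0=0.5, tol=1e-12):
    theta = theta0
    for _ in range(100000):
        nxt = H(theta, p, lam, x, y)
        if abs(nxt - theta) < tol:
            return nxt
        theta = nxt
    return theta

p = {1: 0.3, 2: 0.4, 5: 0.3}
lam, x, y = 0.8, 0.1, 0.1
theta_bar = solve_theta(p, lam, x, y)
rho = {d: lam * (theta_bar * d + x)
          / (lam * (theta_bar * d + x) + (1 - theta_bar) * d + y)
       for d in p}
rho_bar = sum(p[d] * rho[d] for d in p)
print(round(theta_bar, 6), round(rho_bar, 6))
```

Since Hp is monotone increasing, the iterates form a monotone bounded sequence and converge to a fixed point; with x, y > 0 and λ < 1 the fixed point is the unique interior stationary state.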
3 Infinite Networks: Fixed Point Analysis
For the case of neighborhood-independent local repair rates, the fixed points of the function θ ↦ Hp(θ) were determined and classified in [JR07]. We classify the fixed points associated with (6) similarly, but the stability pattern is more complicated. On the other hand, we can prove that the patterns now obey some nice symmetry structures. We consider degree distributions p = (p(d) : d ∈ IN) with p(0) < 1. Recall Ep(d) := Σ_{d∈IN} p(d) d, and the effective spreading rate λ := ν/δ > 0.
Lemma 1. The function Hp on [0, 1] has the following properties:
(1) 0 ≤ Hp(θ) ≤ 1.
(2) Hp(0) = 0 ⇔ x = 0.
(3) Hp(1) = 1 ⇔ y = 0.
(4) Hp is continuously differentiable.
(5) Hp is strictly increasing on [0, 1].
Furthermore
Hp is strictly concave ⇔ λ > 1 ⇔ ν > δ,
Hp is strictly convex ⇔ λ < 1 ⇔ ν < δ,
Hp is linear ⇔ λ = 1 ⇔ ν = δ.
The proof of the lemma is by direct computation. A surprising consequence is
Theorem 1. For a degree distribution p with p(0) < 1, λ > 0, x, y ≥ 0, and λ* := λ⁻¹, it holds that Hp(θ) = 1 − H*p(1 − θ) with
H*p(θ) = Σ_{d∈IN} ( p(d) d / Ep(d) ) · λ*(θd + y) / ( λ*(θd + y) + (1 − θ)d + x ).
The proof of the theorem is by direct computation. An important consequence of the theorem is that for a fixed point θ̄ of Hp, additionally θ̄* = 1 − θ̄ is a stationary point of H*p as well, because θ̄ = Hp(θ̄) = 1 − H*p(1 − θ̄) ⇔ H*p(1 − θ̄) = 1 − θ̄. We shall prove that under certain parameter settings there may exist more than one fixed point, which indicates that the limiting distributions will depend on initial conditions. We will not investigate this in detail here and consider only the possible limiting pictures.
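The symmetry of Theorem 1 is easy to verify numerically: H*p is Hp with λ replaced by λ* = 1/λ and the roles of x and y interchanged. A sanity-check sketch with an illustrative distribution and parameters of our own choosing:

```python
# Numerical check of the symmetry Hp(theta) = 1 - H*p(1 - theta) from
# Theorem 1; H*p is obtained by passing 1/lam and swapping x and y.

def H(theta, p, lam, x, y):
    Ed = sum(pd * d for d, pd in p.items())
    return sum(
        pd * d / Ed * lam * (theta * d + x)
        / (lam * (theta * d + x) + (1 - theta) * d + y)
        for d, pd in p.items()
    )

p = {1: 0.2, 3: 0.5, 7: 0.3}
lam, x, y = 2.5, 0.4, 0.7
for k in range(11):
    theta = k / 10
    lhs = H(theta, p, lam, x, y)
    rhs = 1 - H(1 - theta, p, 1 / lam, y, x)   # H*_p swaps x and y
    assert abs(lhs - rhs) < 1e-9
print("symmetry verified on a grid of 11 points")
```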
3.1 Infinite Networks: Stationary States for Mean Field Models
Existence of solutions of the fixed point equation (6) depends on the parameter setting of the model. Whenever such a stationary point exists, we denote by θ̄ the stationary average neighbor down rate, by ρ̄(d) the stationary average down rate among nodes with degree d, and by ρ̄ the stationary average down rate. From (5) follows

ρ̄(d) = λ(θ̄d + x) / (λ(θ̄d + x) + (1 − θ̄)d + y),   (7)

and in case of stationarity we can compute

ρ̄ = Σ_{d∈IN} p(d)ρ̄(d).   (8)
We remark en passant that from (7) and (8) it follows that for a totally disconnected network with x > 0 or y > 0 holds

ρ̄ = λx/(λx + y).   (9)
In the following p(0) = 1 will be excluded.

Theorem 2. For p(0) < 1 there exists in any case a stationary state. Denote

a := Ep(d) · (Σ_{d∈IN} p(d) d²/(d + y))⁻¹ ≥ 1   and   b := (Σ_{d∈IN} p(d) d²/(d + x)) · (Ep(d))⁻¹ ≤ 1.

For x = y = 0, θ̄1 = 0 and θ̄2 = 1 are stationary. We can characterize the existence of additional stationary states as a function of the effective spreading rate λ.

Table 1. Stationary states as function of λ

            λ > a                   λ = 1              λ < b
x = y = 0   θ̄ ∈ {0, 1}             all θ̄ ∈ [0, 1]    θ̄ ∈ {0, 1}
y > x = 0   θ̄1 = 0, θ̄2 ∈ (0, 1)   θ̄1 = 0            θ̄1 = 0
x > y = 0   θ̄1 = 1                 θ̄1 = 1            θ̄1 = 1, θ̄2 ∈ (0, 1)
x, y > 0    θ̄1 ∈ (0, 1)            θ̄1 ∈ (0, 1)       θ̄1 ∈ (0, 1)
Proof. Recall that we have a degree distribution p with p(0) < 1, and therefore Ep(d) > 0. We distinguish the cases λ > 1, λ < 1 and λ = 1.

For λ > 1 holds: Hp is strictly concave (Lemma 1). If x > 0, then Hp(0) > 0 and therefore only one stationary point can exist. For y = 0 this is θ̄ = 1 by Lemma 1; for y > 0 we have θ̄ ∈ (0, 1).
H. Daduna and L.P. Saul

Table 2. Stationary states as function of x and y

        y = 0                           y > 0
x = 0   λ = 1: all θ̄ ∈ [0, 1];         θ̄1 = 0;
        λ ≠ 1: θ̄ ∈ {0, 1}              λ > a: θ̄1 = 0, θ̄2 ∈ (0, 1)
x > 0   θ̄1 = 1;                        θ̄1 ∈ (0, 1)
        λ < b: θ̄1 = 1, θ̄2 ∈ (0, 1)
If x = 0, then θ̄1 = 0 is a stationary state (Lemma 1). If ∂Hp(0)/∂θ > 1 holds, Hp has another fixed point, because Hp is strictly concave. We have

∂Hp(0)/∂θ = Σ_{d∈IN} (p(d)d / Ep(d)) · λd/(d + y).

So Hp has a second fixed point θ̄2 if and only if λ > a. For y = 0 this is θ̄2 = 1 by Lemma 1; for y > 0 the second fixed point lies in θ̄2 ∈ (0, 1).

For λ < 1 holds: Hp is strictly convex (Lemma 1). If y > 0, then Hp(1) < 1 and therefore only one stationary point can exist. If x = 0 this is θ̄1 = 0, otherwise if x > 0 it lies in θ̄1 ∈ (0, 1).

If y = 0, then θ̄1 = 1 is a fixed point. If ∂Hp(1)/∂θ > 1 holds, a second fixed point θ̄2 exists, because Hp is strictly convex. We have

∂Hp(1)/∂θ = Σ_{d∈IN} (p(d)d / Ep(d)) · d/(λ(d + x)).

So a second fixed point exists if λ < b. For x = 0 this is θ̄2 = 0 (Lemma 1), while for x > 0 it lies in θ̄2 ∈ (0, 1). (Note that this result follows also from Theorem 1.)

For λ = 1 holds: Hp is linear (Lemma 1), and for all θ ∈ [0, 1]

∂Hp(θ)/∂θ = Σ_{d∈IN} (p(d)d / Ep(d)) · d/(d + x + y) ≤(*) Σ_{d∈IN} p(d)d / Ep(d) = 1.

If x = y = 0, equality holds in ≤(*) and we have Hp(θ) = θ, so all θ̄ ∈ [0, 1] are fixed points. In all other cases only one fixed point exists. For x = 0 and y > 0 it is θ̄1 = 0, for x > 0 and y = 0 it is θ̄1 = 1, and for x, y > 0 it lies in θ̄1 ∈ (0, 1).

From Tables 1 and 2 we see all cases when θ̄ ∈ {0, 1} can (and will) occur; the proof is in the Appendix. For θ̄ ∈ (0, 1) we can provide more information.

Theorem 3. Let θ̄ ∈ (0, 1) be a stationary state of a network with degree distribution p with p(0) ≠ 1. Then, if x > 0 or y > 0,
(i) λ = 1 ⇔ θ̄ = x/(x + y),
(ii) λ > 1 ⇔ θ̄ > x/(x + y),
(iii) λ < 1 ⇔ θ̄ < x/(x + y).
Proof. We can assume that x > 0 or y > 0 holds: otherwise x = y = 0, and from Table 1 we see that for λ ≠ 1 no stationary state θ̄ ∈ (0, 1) exists, while for λ = 1 we have x, y > 0 by assumption. So x/(x + y) is well defined, and Hp(x/(x + y)) = λx/(λx + y).

(i) For x, y > 0, λ = 1 is obviously equivalent to θ̄ = x/(x + y).

(ii) For λ > 1 we see from Table 1 that for y = 0 only θ̄1 = 0 and θ̄2 = 1 can be stationary. So we have y > 0. If θ̄ ≤ x/(x + y) would hold, then Hp(x/(x + y)) ≤ x/(x + y), because θ̄ < 1 is the maximal value with Hp(θ) = θ and Hp is strictly concave. But Hp(x/(x + y)) = λx/(λx + y) > x/(x + y), which is a contradiction.

If θ̄ ∈ (x/(x + y), 1), we must have y > 0 (otherwise x/(x + y) = 1 or x/(x + y) is not well defined). It follows Hp(x/(x + y)) > x/(x + y). If λ ≤ 1, then Hp(x/(x + y)) = λx/(λx + y) ≤ x/(x + y), which is a contradiction.

(iii) For λ < 1 we see from Table 1 that for x = 0 only θ̄1 = 0 and θ̄2 = 1 can be stationary. So we have x > 0. We consider Hp* from Theorem 1, which is strictly increasing by (1), and obtain for λ* = λ⁻¹ > 1 by (ii)

Hp(x/(x + y)) = 1 − Hp*(y/(x + y)) > 1 − Hp*(θ̄*) = 1 − θ̄* = Hp(1 − θ̄*) = Hp(θ̄).

Because Hp is strictly increasing in θ, it follows θ̄ < x/(x + y).

If θ̄ ∈ (0, x/(x + y)), then from parts (i) and (ii) we conclude λ < 1.
So for λ = 1, i.e., if breakdown and repair rates equalize, we see that the average neighbor down rate is uniquely determined by the state-independent breakdown and repair rates. In the unbalanced situation, x/(x + y) provides bounds for the proportion of down nodes in the neighborhood of a typical node, and y/(x + y) provides bounds for the global availability; the proof is in the Appendix.
4 Infinite Networks: Stochastic Orderings

In this section we provide a parametric analysis of the networks' global stationary availability states under variation of the degree distributions. Our procedure is: we compare two networks which are identical in all of their defining fundamental characteristics other than the degree distributions, which we denote by pi = (pi(n) : n ∈ IN) for i = 1, 2. To be more precise: given a first network with degree distribution p1, replace p1 by p2 to obtain the second network.

4.1 Stochastic Order for Balanced Breakdown and Repair Rates
In this case we have λ = 1 and obtain by direct computations from (7) and (8), using (i) from Theorem 3:

Proposition 1. The stationary average neighbor down rate under λ = 1 is independent of the degree distribution, and for x = y = 0 all θ̄ ∈ [0, 1] are stationary. In all other cases the solution sets are discrete, see Table 3. In any case holds θ̄ = ρ̄.
Table 3. Values of θ̄ and ρ̄ under λ = 1

            θ̄                ρ̄
x = y = 0   all θ ∈ [0, 1]   θ
y > x = 0   0                0
x > y = 0   1                1
x, y > 0    x/(x + y)        x/(x + y)
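The degree-distribution independence claimed for the balanced case can be illustrated numerically. The sketch below is ours (the three degree distributions and the rates x, y are assumed example values): with λ = 1 and x, y > 0, iterating Hp lands on x/(x + y) for every distribution tried.

```python
# Illustration (ours, assumed example distributions): for balanced rates
# lam = nu/delta = 1 and x, y > 0, the fixed point of H_p is x/(x+y)
# regardless of the degree distribution, as Table 3 states.

def Hp(theta, p, x, y, lam=1.0):
    Ed = sum(d * w for d, w in p.items())
    return sum((w * d / Ed) * lam * (theta * d + x)
               / (lam * (theta * d + x) + (1 - theta) * d + y)
               for d, w in p.items())

def fixed_point(p, x, y, theta=0.5, iters=5000):
    for _ in range(iters):
        theta = Hp(theta, p, x, y)
    return theta

x, y = 0.4, 0.6
for p in ({1: 1.0}, {2: 0.5, 8: 0.5}, {1: 0.3, 3: 0.4, 10: 0.3}):
    print(round(fixed_point(p, x, y), 6))   # 0.4 = x/(x+y) each time
```

For λ = 1, Hp is linear with slope Σ q(d) d/(d + x + y) < 1, so the iteration is a contraction and converges to the unique fixed point.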
Corollary 1. Consider two networks with different degree distributions pi = (pi(n) : n ∈ IN), i = 1, 2, being identical otherwise. Let θ̄1, resp. θ̄2, denote the largest average neighbor breakdown rates and ρ̄1, resp. ρ̄2, the largest steady-state overall breakdown rates, and suppose θ̄i ∈ (0, 1). If λ = ν/δ = 1 then θ̄2 = θ̄1 = ρ̄2 = ρ̄1.

Compared with the result of Theorem 4 below the result of the corollary is somewhat surprising. The interpretation is: in case λ = ν/δ = 1 the effects of the neighborhood-induced breakdowns and repairs compensate perfectly under any degree distribution.

4.2 Stochastic Order for Unbalanced Breakdown and Repair Rates
For λ ≠ 1 we can prove a more detailed analysis of degree distributions.

Definition 3. p1 is stochastically greater than p2 (write p1 ≥st p2 or p2 ≤st p1) if

Σ_{d=ℓ}^∞ p1(d) ≥ Σ_{d=ℓ}^∞ p2(d)   ∀ ℓ ∈ IN.   (10)

p1 is greater than p2 in the convex (stochastic) order (write p1 ≥cx p2 or p2 ≤cx p1) if for all convex functions f : IN → IR holds

Σ_{d=0}^∞ p1(d)f(d) ≥ Σ_{d=0}^∞ p2(d)f(d)   if both sums exist.   (11)
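For distributions with finite integer support both orders can be checked directly. The sketch below is our illustration (the example distributions are assumptions, not from the paper); for the convex order on integer-valued distributions it suffices to compare means and the "stop-loss" expectations E[max(d − a, 0)] over integer thresholds a.

```python
# Sketch: direct checks of the orders in Definition 3 for distributions
# with finite integer support (example distributions are assumptions).

def st_geq(p1, p2):
    """p1 >=_st p2: tail sums of p1 dominate those of p2 (Eq. (10))."""
    levels = sorted(set(p1) | set(p2))
    return all(sum(w for d, w in p1.items() if d >= l)
               >= sum(w for d, w in p2.items() if d >= l) - 1e-12
               for l in levels)

def cx_geq(p1, p2):
    """p1 >=_cx p2 (Eq. (11)): equal means plus dominating stop-loss sums
    E[max(d - a, 0)], checked at integer thresholds a."""
    mean = lambda p: sum(d * w for d, w in p.items())
    if abs(mean(p1) - mean(p2)) > 1e-12:   # convex order forces equal means
        return False
    amax = max(max(p1), max(p2))
    stop = lambda p, a: sum(max(d - a, 0) * w for d, w in p.items())
    return all(stop(p1, a) >= stop(p2, a) - 1e-12 for a in range(amax + 1))

p_spread = {0: 0.5, 10: 0.5}    # mean 5, maximal spread
p_point = {5: 1.0}              # mean 5, one-point distribution
print(cx_geq(p_spread, p_point))    # True: more variable in convex order
print(st_geq(p_spread, p_point))    # False: not stochastically larger
```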
Recall from Definition 2 qi(n), the probability that a randomly selected link originates from a node with degree n. We generalize Theorem 1 of [JR07].

Theorem 4. Consider two networks with degree distributions pi = (pi(n) : n ∈ IN), i = 1, 2, being identical otherwise. Let θ̄1, resp. θ̄2, denote the largest average neighbor breakdown rates and ρ̄1, resp. ρ̄2, the largest steady-state overall breakdown rates, and suppose θ̄i ∈ (0, 1). If λ = ν/δ ≠ 1 and if p1 ≥st p2 and q1 ≥st q2, then

(i) λ > 1 =⇒ θ̄1 ≥ θ̄2 and ρ̄1 ≥ ρ̄2,
(ii) λ < 1 =⇒ θ̄1 ≤ θ̄2 and ρ̄1 ≤ ρ̄2.
Proof. (i) We can assume θ̄1 ≠ θ̄2. From Table 2 we conclude for θ̄1, θ̄2 ∈ (0, 1) that Hp1(1) < 1 holds for

θ → Hp1(θ) = (1/Ep1(d)) Σ_{d∈IN} p1(d)d · λ(θd + x) / (λ(θd + x) + (1 − θ)d + y).

θ̄1 is by assumption the largest θ ∈ (0, 1) with Hp1(θ) = θ, so Hp1(θ) < θ for all θ ∈ (θ̄1, 1]. So, if θ̄1 < θ̄2 would hold, we would have Hp1(θ̄2) < θ̄2.

For any θ ∈ (0, 1) we can read (5) as the definition of the function ρ : IN → IR. We can formally extend this function to g : IR₀⁺ → IR with g(d) := λ(θd + x)/(λ(θd + x) + (1 − θ)d + y), which is differentiable with

∂g(d)/∂d = λ(θ(x + y) − x) / (λ(θd + x) + (1 − θ)d + y)² > 0   ∀ θ > x/(x + y).

So, for all θ ∈ (x/(x + y), 1), ρ(d) is strictly increasing in d. From q1 ≥st q2 follows Eq1(f(d)) ≥ Eq2(f(d)) for all increasing functions f, and we conclude

Hp1(θ) = Σ_{d∈IN} q1(d)ρ(d) ≥ Σ_{d∈IN} q2(d)ρ(d) = Hp2(θ)   ∀ θ ∈ (x/(x + y), 1],

which implies θ̄2 > Hp1(θ̄2) ≥ Hp2(θ̄2). This contradicts the fact that θ̄2 is a stationary average neighbor down rate. We must have θ̄1 > θ̄2.

It remains to prove the statement on the average down rates. The rate ρ(d), for fixed d > 0, is strictly increasing in θ, because from (4) in Lemma 1,

∂ρ(d)/∂θ = λd(d + x + y) / (λ(dθ + x) + (1 − θ)d + y)² > 0   ∀ θ ∈ [0, 1].

For θ̄1 = θ̄2 we have ρ̄1 = ρ̄2. Assume θ̄1 > θ̄2; then for all d > 0 the stationary average down rates of nodes with degree d satisfy ρ̄1(d) > ρ̄2(d), which is used in (1) below. Using p2 ≤st p1 in (2) yields

ρ̄1 = Σ_{d∈IN} p1(d)ρ̄1(d) >(1) Σ_{d∈IN} p1(d)ρ̄2(d) ≥(2) Σ_{d∈IN} p2(d)ρ̄2(d) = ρ̄2.
(ii) For the case λ < 1, Theorem 3 says θ̄i ∈ [0, x/(x + y)). Analogously to (i) the function −g(d) is strictly increasing in d, because of

∂g(d)/∂d = λ(θ(x + y) − x) / (λ(θd + x) + (1 − θ)d + y)² < 0   ∀ θ ∈ [0, x/(x + y)).

So for these θ the function −ρ(d) is strictly increasing in d, and from q1 ≥st q2:

Hp1(θ) = −Σ_{d∈IN} q1(d)(−ρ(d)) ≤ −Σ_{d∈IN} q2(d)(−ρ(d)) = Hp2(θ).
If θ¯1 ≥ θ¯2 holds, the proof is completed. Assume θ¯1 > θ¯2 : Because θ¯2 is the greatest θ ∈ (0, 1) with Hp2 (θ) = θ, it follows that Hp2 (θ) < θ for all θ ∈ (θ¯2 , 1]. Summarizing, we have θ¯1 > Hp2 (θ¯1 ) ≥ Hp1 (θ¯1 ). This contradicts the fact that θ¯1 is stationary average neighbor down rate. So ¯ θ1 < θ¯2 . Analogously to (i) we have ρ¯1 (d) < ρ¯2 (d) for all d > 0, because ρ(d) is increasing in θ and θ¯1 < θ¯2 . Utilizing p1 ≥st p2 for , we finally obtain ρ¯1 =
p1 (d)¯ ρ1 (d) <
d∈IN
p1 (d)¯ ρ2 (d) = −
d∈IN
p1 (−ρ¯2 (d)) ≤ −
d∈IN
p2 (−ρ¯2 (d)) = ρ¯2 .
d∈IN
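A numerical illustration of Theorem 4 (ii) is easy to set up; the sketch below is ours, with assumed example parameters. The two distributions satisfy p1 ≥st p2, and their size-biased versions satisfy q1 ≥st q2 as well, so the hypotheses of the theorem hold.

```python
# Numerical illustration of Theorem 4 (ii) with assumed parameters:
# a stochastically larger degree distribution (p1 >=_st p2 and q1 >=_st q2)
# yields a smaller stationary average neighbor down rate when lam < 1,
# i.e. higher availability.

def theta_bar(p, lam, x, y, theta=0.5, iters=20000):
    Ed = sum(d * w for d, w in p.items())
    for _ in range(iters):
        theta = sum((w * d / Ed) * lam * (theta * d + x)
                    / (lam * (theta * d + x) + (1 - theta) * d + y)
                    for d, w in p.items())
    return theta

lam, x, y = 0.6, 0.2, 0.8          # lam < 1, x, y > 0
p1 = {4: 0.5, 6: 0.5}              # stochastically larger degrees
p2 = {2: 0.5, 4: 0.5}
t1, t2 = theta_bar(p1, lam, x, y), theta_bar(p2, lam, x, y)
print(t1 <= t2, t2 < x / (x + y))  # Theorem 4 (ii) and Theorem 3 (iii)
```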
While Proposition 1 and Theorem 4 demonstrate reactions of the network to variation of the degree distributions according to strong stochastically monotone changes (which imply increasing mean values), we now describe what happens if the mean is fixed and the variability of the degree distribution is increased.

Theorem 5. Consider two networks with degree distributions pi, i = 1, 2, which are structurally identical otherwise. Let θ̄1, resp. θ̄2, denote the largest average neighbor breakdown rates and suppose θ̄i ∈ (0, 1), and λ = ν/δ ≠ 1. If p2 ≤cx p1 then

(i) λ > 1 =⇒ θ̄1 ≥ θ̄2,   and   (ii) λ < 1 =⇒ θ̄1 ≤ θ̄2.
θ¯2 . Define for degree distribution p with p(0) < 1 Proof. We can assume θ¯1 = f : IR+ 0 → IR,
d → f (d) :=
λ(θd + x) d · Ep (d) λ(θd + x) + (1 − θ)d + y
f (d) is well defined, because Ep (d) = 0 and x and y do not vanish concurrently, because otherwise no stationary θ ∈ (0, 1) would exist. f (d) is two times differentiable in d. x , and (i) If λ > 1, from Theorem 3 for the stationary values holds θ¯i > x+y 2
x for all θ > x+y we have ∂ ∂f2(d) d > 0, so f (d) is strictly convex in this case. From p1 ≥cx p2 we therefore conclude Hp1 (θ) = p1 (d)f (d) ≥ p2 (d)f (d) = Hp2 (θ). d∈IN
d∈IN
Assume now, that θ¯1 < θ¯2 holds. Then θ¯2 > Hp1 (θ¯2 ), because θ¯1 is the greatest θ ∈ (0, 1) with Hp1 (θ) = θ and because Hp1 (θ) is concave. So θ¯2 > Hp1 (θ¯2 ) ≥ Hp2 (θ¯2 ), which contradicts that θ¯2 is stationary state for p2 . It follows θ¯1 > θ¯2 . 2
x and ∂ ∂f2(d) < 0. So in this case −f (d) (ii) For λ < 1 holds θ¯i < x+y d strictly convex and therefore Hp1 (θ) = p1 (d)f (d) = − p1 (d)(−f (d)) < d∈IN d∈IN − p2 (d) − f (d)) = p2 (d)f (d) = Hp2 (d)(θ). d∈IN
d∈IN
Assume now θ¯1 > θ¯2 : Then θ¯1 > Hp2 (θ¯1 ) ≥ Hp1 (θ¯1 ), which contradicts the fact that θ¯1 is stationary state for p1 . It follows θ¯1 < θ¯2 .
5 Comparison of Global Availability and Bounds

Theorems 4 and 5 describe the consequences for the availability when the degree distributions increase or become more variable. Note that for simpler notation we denoted by θt the portion of broken-down nodes at time t in the neighborhood of a randomly selected node (≡ average neighbor down rate). So Avt := 1 − θt is the portion of available nodes at time t in the neighborhood of a randomly selected node; denote Av = 1 − θ̄, which can be interpreted as "global availability". The interesting case in applications to reliability of service networks is λ < 1, i.e., the breakdown rate parameter ν is less than the repair rate parameter δ. We consider two networks with degree distributions pi, i = 1, 2, which are structurally identical otherwise.

(1) If p1 ≥st p2 and q1 ≥st q2 holds, from Theorem 4 (ii) we conclude for the respective largest average neighbor down rates θ̄i: Av1 ≥ Av2, i.e., in network 1 the global availability is greater than in network 2, which under λ < 1 is intuitive: more connectivity in the network increases the force of the neighborhood-dependent characteristics, which enforces the impact of repair against breakdown because of ν < δ.

(2) If p1 ≥cx p2 holds, it follows that the average node degrees are equal, Epi(d) := m, i = 1, 2, see [MS02, Theorem 1.5.3]. From Theorem 5 (ii) we conclude for the respective largest average neighbor down rates θ̄i: Av1 ≥ Av2, i.e., in network 1 the global availability is greater than in network 2, which under λ < 1 seems to be less intuitive than the conclusion under (1): more variability generates more availability.

We conclude furthermore: for all degree distributions p with fixed mean Ep(d) := m ∈ IN, we have a guaranteed (minimal) global availability Avmin, which is given by the network with fixed degree m: Avmin = 1 − x/(x + y) = y/(x + y). This is the bound which we obtained already in Theorem 3 (iii).
The reason behind this observation is that in the class of all distributions on IR with fixed mean m, the one-point distribution in m is the minimum under the convex order ≥cx, see [MS02, Example 1.10.5].
6 Discussion
We did not prove rigorously the validity of the mean field approximation in Section 2.2. This would be a difficult task. On the other hand, the approach is acknowledged as natural in many fields, e.g. for physical systems or chemical reactions. An example in the field of communication systems is [PM84], where performance metrics on the macro level are obtained by mean field analysis for an "Interconnection Network" of completely reliable nodes. Nevertheless, there are fundamental differences between finite state space models, see e.g. Example 1, and an associated mean field model. These are well
known in epidemic models, where a deterministic description of an epidemic usually reveals at least one steady state, while the associated micro level models are transient with a unique absorbing state 0. In Example 1 we can classify this completely: if x = y = 0 then (even under complete connectivity) the states 0 and J are both absorbing, while under x = 0 and y > 0 only 0 is absorbing, and under y = 0 and x > 0 only J is absorbing. Only for y > 0 and x > 0 is the process Z ergodic with a unique proper steady state.

The averaging principle behind the definition of, e.g., the portion of broken-down nodes over the network resembles flow approximations (Law of Large Numbers limits) in queueing networks. Another connection is developed in [DPR08], where an increasing sequence of finite cycles of exponential queues is investigated. The point of interest there is, for unboundedly growing networks (number of nodes and number of customers), the average throughput, taken as the network's throughput per node. The main result provides conditions on the asymptotic profile of the network sequence which guarantee the existence of a proper limit of the sequence of average throughputs.

As indicated in the Introduction, the results in this paper are motivated by the investigation of large networks of unreliable services. Our aim is to combine the approach of [DPR08] with the development presented here and to attack problems beyond the limiting average throughput. The investigation and classification of performability in the mean field limit seems to be an open problem.

Acknowledgement. We thank the referees for careful reading of the manuscript and for their helpful suggestions.
References

[AB00] Andersson, H., Britton, T.: Stochastic Epidemic Models and Their Statistical Analysis. Lecture Notes in Statistics. Springer, New York (2000)
[Bai75] Bailey, N.T.J.: The Mathematical Theory of Infectious Diseases and Its Applications. Hafner Press, New York (1975)
[BBV08] Barrat, A., Barthelemy, M., Vespignani, A.: Dynamical Processes on Complex Networks. Cambridge University Press, Cambridge (2008)
[BCFH09] Bakhshi, R., Cloth, L., Fokkink, W., Haverkort, B.R.: Mean-field analysis for the evaluation of gossip protocols. In: Proceedings of the Sixth International Conference on Quantitative Evaluation of Systems, pp. 247–256. IEEE Computer Society (2009)
[BCFH11] Bakhshi, R., Cloth, L., Fokkink, W., Haverkort, B.R.: Mean-field framework for performance evaluation of push-pull gossip protocols. Performance Evaluation 68, 157–179 (2011)
[BGFv09] Bakhshi, R., Gavidia, D., Fokkink, W., van Steen, M.: An analytical model of information dissemination for a gossip-based protocol. Computer Networks 53, 2288–2303 (2009)
[BMM07] Le Boudec, J.-Y., McDonald, D., Mundinger, J.: A generic mean field convergence result for systems of interacting objects. In: Proceedings of the Fourth International Conference on Quantitative Evaluation of Systems, pp. 3–15. IEEE Computer Society (2007)
[CM96] Chakka, R., Mitrani, I.: Approximate solutions for open networks with breakdowns and repairs. In: Kelly, F.P., Zachary, S., Ziedins, I. (eds.) Stochastic Networks, Theory and Applications. Royal Statistical Society Lecture Notes Series, vol. 4, ch. 16, pp. 267–280. Clarendon Press, Oxford (1996)
[DG01] Daley, D.J., Gani, J.: Epidemic Modelling: An Introduction. Cambridge University Press, Cambridge (2001)
[DM03] Dorogovtsev, S.N., Mendes, J.F.F.: Evolution of Networks. Oxford University Press, Oxford (2003) (reprint 2004)
[DM10] Draief, M., Massoulie, L.: Epidemics and Rumours in Complex Networks. London Mathematical Society Lecture Note Series, vol. 369. Cambridge University Press, Cambridge (2010)
[DPR08] Daduna, H., Pestien, V., Ramakrishnan, S.: Throughput limits from the asymptotic profile of cyclic networks with state-dependent service rates. Queueing Systems 58, 191–219 (2008)
[HMRT01] Haverkort, B.R., Marie, R., Rubino, G., Trivedi, K.: Performability Modelling: Techniques and Tools. Wiley, New York (2001)
[Jac08] Jackson, M.O.: Social and Economic Networks. Princeton University Press, Princeton (2008)
[JR07] Jackson, M.O., Rogers, B.W.: Relating network structure to diffusion properties through stochastic dominance. The B.E. Journal of Theoretical Economics 7(1), 1–13 (2007)
[JY07] Jackson, M.O., Yariv, L.: Diffusion of behavior and equilibrium properties in network games. American Economic Review (Papers and Proceedings) 97, 92–98 (2007)
[Lam09] Lamberson, P.J.: Linking network structure and diffusion through stochastic dominance. In: Complex Adaptive Systems and the Threshold Effects: Views from the Natural and Social Sciences, pp. 76–82. Association for the Advancement of Artificial Intelligence (2009); papers from the AAAI Fall Symposium, FS-09-03
[Lig85] Liggett, T.M.: Interacting Particle Systems. Grundlehren der mathematischen Wissenschaften, vol. 276. Springer, Berlin (1985)
[LP06] Lopez-Pintado, D.: Contagion and coordination in random networks. International Journal of Game Theory 34, 371–381 (2006)
[LP08] Lopez-Pintado, D.: Diffusion in complex social networks. Games and Economic Behavior 62, 573–590 (2008)
[MS02] Müller, A., Stoyan, D.: Comparison Methods for Stochastic Models and Risks. Wiley, Chichester (2002)
[PM84] Pinsky, E., Yemini, Y.: A statistical mechanics of some interconnection networks. In: Gelenbe, E. (ed.) Performance 1984, pp. 147–158. North-Holland, Amsterdam (1984)
[Sau06] Sauer, C.: Stochastic product form networks with unreliable nodes: Analysis of performance and availability. PhD thesis, University of Hamburg, Department of Mathematics (2006)
[SD03] Sauer, C., Daduna, H.: Availability formulas and performance measures for separable degradable networks. Economic Quality Control 18, 165–194 (2003)
[WDW07] Wang, Y., Dang, H., Wu, H.: A survey on analytic studies of Delay-Tolerant Mobile Sensor Networks. Wireless Communications and Mobile Computing 7, 1197–1208 (2007)
Stochastic Analysis of a Finite Source Retrial Queue with Spares and Orbit Search

Feng Zhang and Jinting Wang

Department of Mathematics, Beijing Jiaotong University, Beijing, 100044, China
{zhangfeng,jtwang}@bjtu.edu.cn
Abstract. This paper presents an analytic approach for investigating a single-server finite-source retrial queue with spares and constant retrial rate. We assume that there is a single repair facility (server) and K independent parts (customers) in the system. The customers' life times are assumed to be exponentially distributed random variables. Once a customer breaks down, it is sent for repair immediately. If the server is idle upon the failed customer's arrival, the customer receives repair immediately. A failed customer that finds the server busy upon arrival enters the retrial orbit. Upon completion of a repair, the server searches for a customer from the orbit, if any. However, when a new primary customer arrives during the seeking process, the server interrupts the seeking process and serves the new customer. There are some spares for substitution of failed machines, and the system is maintained by replacing failed parts by spares and by repairing failed parts so that they may become spares. We carry out the steady-state analysis of the model and obtain various steady-state performance measures.

Keywords: Quasi-random input, orbital search, spares, busy period, waiting time.
1 Introduction
Retrial queues have been widely used to model many problems arising in telephone switching systems, telecommunication networks, computer networks, etc. The main characteristic of a retrial queue is that a customer who finds the service facility busy upon arrival is obliged to leave the service area, but some time later comes back to re-initiate his demand. Between trials a customer is said to be in "orbit". The literature on retrial queueing systems is very extensive. For a recent account, readers may refer to the books of Falin and Templeton [8] and Artalejo and Gómez-Corral [5], which summarize the main models and methods. Most studies in the literature assume the source size of primary customers to be infinite, so that the flow of primary calls can be modeled by a Poisson process. However, when the customer population is of moderate size, it seems more
Corresponding author.
J.B. Schmitt (Ed.): MMB & DFT 2012, LNCS 7201, pp. 16–30, 2012. © Springer-Verlag Berlin Heidelberg 2012
appropriate that retrial queueing systems be studied as systems with a finite source of customers. In the queueing literature this is called the machine interference problem (MIP) or machine repairman problem, which can be used to model a wide variety of real systems, see [10]. In these situations, it is important to take into account the fact that the rate of generation of new primary calls decreases as the number of customers in the system increases. Such a finite source queue is also known as a queue with quasi-random input. Retrial queues with quasi-random input are of recent interest in modeling many practical systems such as magnetic disk memory systems, cellular mobile networks, computer networks, and local-area networks, see [4, 9] for detailed descriptions. Since [11], there has been a rapid growth in the literature on this topic [1–3, 6, 7, 15].

Recently, retrial queueing systems with a finite source of customers where the server searches for customers after service have been investigated in several papers [16–18]. The authors used the software MOSEL (Modeling, Specification and Evaluation Language) to formulate and solve their problems. Such retrial models are typically used in the study of computer networks, and they differ from the majority of articles in the retrial queueing literature, where each blocked customer joins the retrial orbit and becomes a source of repeated requests for service at rate ν, independently of the other customers. Under this classical retrial policy the total retrial rate when there are j customers in the orbit is jν. In contrast to this, for some applications in computer and communication networks one is interested in designing finite-source retrial queues with search for orbital customers immediately after a service completion, where the time between two successive repeated attempts is controlled by the server. Consequently, the total retrial rate is a constant θ, independent of the number j of customers in orbit.
This constant retrial policy is also used to model systems in which blocked customers leave their contact details when they find the server busy. Then the server seeks a customer at a constant rate θ after a service completion among those who have left their contact details. Some related studies on the constant retrial policy can be found in [3, 6, 13], among others.

In this paper we study a single-server finite-source retrial queue with spares and orbit search. Such a model arises in the maintenance of various practical systems. We assume that there exists a single repair facility (server) and K independent machines (customers) in the system. The customers have exponential life times, and once a customer breaks down, it is sent to the server for repair immediately. If the server is idle upon arrival, the customer receives repair immediately. Otherwise, the failed customer enters a pool of retrial customers, called the orbit, for later repair. There are some spares for substitution of failed machines, and the system is maintained by replacing failed parts by spares and by repairing failed parts so that they may become spares. An examination of the literature shows that there is no work on an analytic approach for investigating a finite source retrial queue taking into account a constant retrial rate and spares. This motivates us to investigate such queueing systems in this work.
The organization of this paper is as follows. The model under consideration is described and several main performance characteristics are obtained in Section 2. Section 3 investigates the busy period of the server. In Section 4, the waiting time distribution is discussed. In Section 5 we show some numerical examples to illustrate the impact of the parameters on the system performance. Finally, some conclusions are given in Section 6.
2 Model Description
We consider a single-server retrial queueing system with no waiting space in which the primary calls are generated by K, 1 < K < ∞, homogeneous sources. The server can be in two states: idle and busy. If the server is idle, it can serve the calls of the sources. The service times of the customers are assumed to be exponentially distributed random variables at rate μ. If a source is free at time t it can generate a primary call during interval (t, t + dt) with probability αdt. If the server is free at the time of arrival of a call then the call starts to be served immediately, the source moves into the under service state and the server moves into busy state. The customers that find the server busy upon arrival abandon the system but leave their contact details; hence, we can think that they join a “virtual” retrial orbit or that they are registered in a server’s waiting list, i.e., the service order in the retrial orbit is first-come first-served. After finishing service, a customer leaves the system and the server seeks to serve a customer from the retrial orbit. The time required to find a customer from the retrial orbit is assumed exponentially distributed with rate θ. However, when a new primary call arrives during the seeking process, the server interrupts the seeking process and serves the new call. We assume that the input stream of primary sources, service times, seeking times are mutually independent. The finite-source model can be generalized to include the use of spares. We assume now that there are K machines in operation plus an additional M spares. When a machine in operation fails, a spare is immediately substituted for it (if available). Once a machine is repaired, it becomes a spare, unless the system is short, in which case the repaired machine goes immediately into service. 
At any given time, there are at most K machines in operation, so the rate of failures is at most Kα (i.e., spares that are not in operation do not contribute to the failure rate). From the description of the model, we represent the state of the system at time t by a pair (C(t), N (t)), where C(t) denotes the state of the server (0: idle, 1: busy) and N (t) records the number of customers in the orbit, i.e., a (virtual) queue of retrial customers with FIFO scheduling is maintained to record the state of the orbit. It should be noted that the situation C(t) = 0, N (t) = K + M is impossible and thus the state space of the process (C(t), N (t)) is the set {0, 1} × {0, 1, . . . , K + M − 1}. We define the probabilities as follows: pij (t) = P {C(t) = i, N (t) = j},
i = 0, 1,
0 ≤ j ≤ K + M − 1.
The state space of the process (C(t), N(t)) is finite, and the process is irreducible for all values of the generation rate of primary calls α, i.e., all states are pairwise reachable from each other via a finite number of transitions. Hence, the CTMC is positive recurrent, which implies ergodicity. From now on, the system will be assumed to be in the steady state. Then we let pij be the limiting probabilities as t → ∞, and the balance equations for the stationary distribution are

Kα p00 = μ p10,   (1)
(Kα + θ) p0j = μ p1j,   j = 1, 2, . . . , M − 1,   (2)
((K + M − j)α + θ) p0j = μ p1j,   j = M, M + 1, . . . , K + M − 1,   (3)
(Kα + μ) p1j = Kα p0j + θ p0,j+1 + Kα p1,j−1,   j = 0, 1, . . . , M − 1,   (4)
((K + M − 1 − j)α + μ) p1j = (K + M − j)α p0j + θ p0,j+1 + (K + M − j)α p1,j−1,   j = M, M + 1, . . . , K + M − 1,   (5)

where p0,−1 = p0,K+M = 0. From (1) and (4), we have

Kα p10 = θ p01,   (6)

and together with (2) and (4) we obtain

Kα p1j = θ p0,j+1,   j = 0, 1, . . . , M − 1.   (7)

Putting j = M − 1 in (7) and substituting (3) into (5), we get

(K + M − 1 − j)α p1j = θ p0,j+1,   j = M, M + 1, . . . , K + M − 2.   (8)
From (1) and (6), we have

p01 = K²α² p00 / (μθ).   (9)

Substituting (2) into (7) and (3) into (8) yields

p0,j+1 = Kα(Kα + θ)/(μθ) · p0j,   j = 1, 2, . . . , M − 1,   (10)
p0,j+1 = (K + M − 1 − j)((K + M − j)α + θ)α/(μθ) · p0j,   j = M, M + 1, . . . , K + M − 2.   (11)

With the help of Eqs. (5), (7)-(11), all probabilities pij can be expressed in terms of p00:

p0j = (Kα)^{j+1} (Kα + θ)^{j−1} / (μθ)^j · p00,   j = 1, 2, . . . , M,   (12)
p0j = ∏_{n=M+1}^{j} (K + M − n) · ∏_{n=M}^{j−1} ((K + M − n)α + θ) · α^{j+1} K^{M+1} (Kα + θ)^{M−1} / (μθ)^j · p00,
   j = M + 1, M + 2, . . . , K + M − 1,   (13)

p1j = (Kα/μ) · (Kα(Kα + θ)/(μθ))^j · p00,   j = 0, 1, . . . , M,   (14)

p1j = α^{j+1} K^{M+1} (Kα + θ)^M · ∏_{n=M+1}^{j} (K + M − n)((K + M − n)α + θ) / (μ(μθ)^j) · p00,
   j = M + 1, M + 2, . . . , K + M − 1,   (15)
where p00 is determined by the normalizing equation Σ_{j=0}^{K+M−1} (p0j + p1j) = 1. Writing

1 = p00 + Σ_{j=0}^{K+M−2} (p0,j+1 + p1j) + p1,K+M−1

and inserting (12)-(15) yields

p00 = [ Σ_{j=M}^{K+M−1} α^{j+1} (K(Kα + θ))^M ∏_{n=M}^{j} (K + M − n)((K + M − 1 − n)α + θ) / (μθ)^{j+1} + Σ_{j=0}^{M−1} (Kα(Kα + θ)/(μθ))^{j+1} + 1 ]^{−1}.   (16)
In the following, we give the main performance characteristics of the system, expressed in terms of the probabilities pij:

1. The probability that the server is idle:

p0 = Σ_{j=0}^{K+M−1} p0j.   (17)

2. The probability that the server is busy:

p1 = Σ_{j=0}^{K+M−1} p1j = 1 − p0.   (18)

3. The mean number of customers in the orbit:

E[N] = Σ_{j=0}^{K+M−1} j(p0j + p1j).   (19)
A Finite Source Retrial Queue with Spares and Orbit Search

4. The mean rate of generation of primary calls:

λ = Kα ∑_{j=0}^{M−1} (p0j + p1j) + α ∑_{j=M}^{K+M−1} ((K + M − j)p0j + (K + M − 1 − j)p1j).    (20)

5. The mean waiting time in the orbit, easily obtained by using Little's formula:

E[W] = (λ)^{−1} E[N].    (21)
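The measures (17)-(21) can also be computed without the closed forms, by solving the balance equations (1)-(5) directly as a linear system. The following is a sketch of ours, not the authors' code; the function name and the dict interface are assumptions:

```python
import numpy as np

def performance_measures(K, M, alpha, mu, theta):
    """Solve the balance equations (1)-(5) numerically and return the
    measures (17)-(21). States are (i, j): i = server state, j = orbit size."""
    N = K + M
    A = np.zeros((2 * N + 1, 2 * N))            # balance rows + normalisation
    rhs = np.zeros(2 * N + 1)
    P0 = lambda j: j                            # column index of p0j
    P1 = lambda j: N + j                        # column index of p1j
    # Eqs. (1)-(3): balance of the idle-server states (0, j).
    A[0, P0(0)], A[0, P1(0)] = K * alpha, -mu
    for j in range(1, N):
        out = (K*alpha + theta) if j <= M else ((K + M - j)*alpha + theta)
        A[j, P0(j)], A[j, P1(j)] = out, -mu
    # Eqs. (4)-(5): balance of the busy-server states (1, j).
    for j in range(N):
        arr_busy = K*alpha if j <= M - 1 else (K + M - 1 - j)*alpha
        arr_idle = K*alpha if j <= M - 1 else (K + M - j)*alpha
        r = N + j
        A[r, P1(j)] = arr_busy + mu
        A[r, P0(j)] -= arr_idle
        if j + 1 < N:
            A[r, P0(j + 1)] -= theta            # p0,K+M = 0
        if j >= 1:
            A[r, P1(j - 1)] -= arr_idle         # p1,-1 = 0
    A[2 * N, :] = 1.0                           # normalisation
    rhs[2 * N] = 1.0
    p = np.linalg.lstsq(A, rhs, rcond=None)[0]
    p0, p1 = p[:N], p[N:]
    lam = K*alpha*sum(p0[j] + p1[j] for j in range(M)) + alpha*sum(
        (K + M - j)*p0[j] + (K + M - 1 - j)*p1[j] for j in range(M, N))  # (20)
    EN = sum(j * (p0[j] + p1[j]) for j in range(N))                      # (19)
    return {"p_idle": p0.sum(), "p_busy": p1.sum(),                      # (17)-(18)
            "E[N]": EN, "lambda": lam, "E[W]": EN / lam}                 # (21)
```

With M = 0 and a very large retrial rate θ, this reproduces the classical machine-interference results of Remark 1 below.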
Remark 1. When θ → ∞ and M = 0, our model becomes the finite source queue without retrial orbit or spares. In this case, Eqs. (12)-(16) reduce to

pn = (α/μ)^n K!/(K − n)! p0,    n = 1, 2, ..., K,    (22)

p0 = [ 1 + ∑_{n=1}^{K} (α/μ)^n K!/(K − n)! ]^{−1},    (23)

where pn is defined as the probability that there are n customers in the system (including the one being served). We can see that Eqs. (22) and (23) agree with equations (17) and (19) in [12] for r = 1.

Remark 2. If M is very large, we essentially have an infinite calling population with mean arrival rate Kα, i.e., our model becomes the constant retrial queue with an infinite population in which customers arrive according to a Poisson process at rate Kα. In this case, we let M → ∞ in Eqs. (12), (14) and (16) and obtain

p0j = Kα/(Kα + θ(1 − δ_{j0})) (1 − ρ)ρ^j,    j = 0, 1, 2, ...,    (24)

p1j = (Kα/μ)(1 − ρ)ρ^j,    j = 0, 1, 2, ...,    (25)

where δ_{ji} is Kronecker's delta, being 1 if j = i and 0 otherwise, and ρ = Kα(Kα + θ)/(μθ). Eqs. (24) and (25) agree with equations (3.10) and (3.11) in [6] for r = 1 and λ = Kα.
3   The Busy Period
Assume that all sources are free at time t0 = 0, i.e., C(0) = 0, N(0) = 0, and one of them just generates a request for service which initiates a busy period. The busy period ends at the first service completion epoch at which (C(t), N(t)) returns to the state (0, 0). The busy period consists of service periods and seeking periods during which the server is free and there are sources of repeated calls in the system. The length of the busy period will be denoted by L, its distribution function P(L ≤ t) by Π(t) and its Laplace-Stieltjes transform by π(s) (see [9]). Let a busy period start at time t0 = 0. Define:

P0j(t) = P{L > t, C(t) = 0, N(t) = j},    1 ≤ j ≤ K + M − 1,    (26)

P1j(t) = P{L > t, C(t) = 1, N(t) = j},    0 ≤ j ≤ K + M − 1.    (27)
The Kolmogorov differential equations that govern the dynamics of these taboo probabilities are given by:

P′0j(t) = −(Kα + θ)P0j(t) + μP1j(t),    1 ≤ j ≤ M,    (28)

P′0j(t) = −((K + M − j)α + θ)P0j(t) + μP1j(t),    M + 1 ≤ j ≤ K + M − 1,    (29)

Π′(t) = μP10(t),    (30)

P′1j(t) = −(Kα + μ)P1j(t) + KαP1,j−1(t) + KαP0j(t) + θP0,j+1(t),    0 ≤ j ≤ M − 1,    (31)

P′1j(t) = −((K + M − 1 − j)α + μ)P1j(t) + (K + M − j)αP1,j−1(t) + (K + M − j)αP0j(t) + θP0,j+1(t),    M ≤ j ≤ K + M − 1,    (32)

where P00(t) = P0,K+M(t) = P1,−1(t) = 0. In addition, the initial conditions are P0j(0) = 0 and P1j(0) = δ_{j0}, where δ_{j0} is Kronecker's delta. Define:

φ0j(s) = ∫_0^∞ e^{−st} P0j(t) dt,    φ1j(s) = ∫_0^∞ e^{−st} P1j(t) dt.
So we get

(s + Kα + θ)φ0j(s) = μφ1j(s),    1 ≤ j ≤ M,    (33)

(s + (K + M − j)α + θ)φ0j(s) = μφ1j(s),    M + 1 ≤ j ≤ K + M − 1,    (34)

π(s) = μφ10(s),    (35)

(s + Kα + μ)φ1j(s) = Kαφ1,j−1(s) + Kαφ0j(s) + θφ0,j+1(s) + δ_{j0},    0 ≤ j ≤ M − 1,    (36)

(s + (K + M − 1 − j)α + μ)φ1j(s) = (K + M − j)αφ1,j−1(s) + (K + M − j)αφ0j(s) + θφ0,j+1(s),    M ≤ j ≤ K + M − 1.    (37)

We observe that φ00(s), φ0,K+M(s) and φ1,−1(s) are equal to 0.
Eliminating φ0j(s) in (36) and (37) with the help of (33) and (34), respectively, we have

θμ/(s + Kα + θ) φ1,j+1(s) + (Kαμ/(s + Kα + θ) − (s + Kα + μ))φ1j(s) + Kαφ1,j−1(s) = 0,    1 ≤ j ≤ M − 1,    (38)

θμ/(s + (K + M − 1 − j)α + θ) φ1,j+1(s) + ((K + M − j)αμ/(s + (K + M − j)α + θ) − (s + (K + M − 1 − j)α + μ))φ1j(s) + (K + M − j)αφ1,j−1(s) = 0,    M ≤ j ≤ K + M − 1,    (39)

where φ1,K+M(s) = 0. From (33), (35) and (36), we obtain

φ10(s) = π(s)/μ,    (40)

φ11(s) = (s + Kα + μ)(s + Kα + θ)/(θμ²) π(s) − (s + Kα + θ)/(θμ).    (41)

Similarly to [9], we can express all functions φ1j(s), 0 ≤ j ≤ K + M − 1, with the help of (38)-(41), in terms of π(s) as follows:

φ1j(s) = Aj(s)π(s) + Bj(s),    0 ≤ j ≤ K + M − 1.    (42)

The coefficients Aj(s) and Bj(s) can be found with the help of the following recursive relations:

A0(s) = 1/μ,    B0(s) = 0,
A1(s) = (s + Kα + μ)(s + Kα + θ)/(θμ²),    B1(s) = −(s + Kα + θ)/(θμ),

θμ/(s + Kα + θ) Aj+1(s) + (Kαμ/(s + Kα + θ) − (s + Kα + μ))Aj(s) + KαAj−1(s) = 0,    1 ≤ j ≤ M − 1,

θμ/(s + Kα + θ) Bj+1(s) + (Kαμ/(s + Kα + θ) − (s + Kα + μ))Bj(s) + KαBj−1(s) = 0,    1 ≤ j ≤ M − 1,    (43)

θμ/(s + (K + M − 1 − j)α + θ) Aj+1(s) + ((K + M − j)αμ/(s + (K + M − j)α + θ) − (s + (K + M − 1 − j)α + μ))Aj(s) + (K + M − j)αAj−1(s) = 0,    M ≤ j ≤ K + M − 2,

θμ/(s + (K + M − 1 − j)α + θ) Bj+1(s) + ((K + M − j)αμ/(s + (K + M − j)α + θ) − (s + (K + M − 1 − j)α + μ))Bj(s) + (K + M − j)αBj−1(s) = 0,    M ≤ j ≤ K + M − 2.
Letting j = K + M − 1 in (39), it follows that

(αμ/(s + α + θ) − (s + μ))(A_{K+M−1}(s)π(s) + B_{K+M−1}(s)) + α(A_{K+M−2}(s)π(s) + B_{K+M−2}(s)) = 0.    (44)

Therefore, we can calculate the Laplace-Stieltjes transform of the length of the busy period as follows:

π(s) = −[(αμ/(s + α + θ) − (s + μ))B_{K+M−1}(s) + αB_{K+M−2}(s)] / [(αμ/(s + α + θ) − (s + μ))A_{K+M−1}(s) + αA_{K+M−2}(s)].    (45)

Upon suitable differentiation we obtain the mean length of the busy period:

E[L] = −π′(0)
= [ (θμ/(α + θ))² (A_{K+M−1}(0)B′_{K+M−1}(0) − A′_{K+M−1}(0)B_{K+M−1}(0))
+ (αθμ/(α + θ)) (A′_{K+M−1}(0)B_{K+M−2}(0) + A′_{K+M−2}(0)B_{K+M−1}(0) − A_{K+M−1}(0)B′_{K+M−2}(0) − A_{K+M−2}(0)B′_{K+M−1}(0))
+ α(1 + αμ/(α + θ)²)(A_{K+M−1}(0)B_{K+M−2}(0) − A_{K+M−2}(0)B_{K+M−1}(0))
+ α²(A_{K+M−2}(0)B′_{K+M−2}(0) − A′_{K+M−2}(0)B_{K+M−2}(0)) ]
× ( αA_{K+M−2}(0) − (θμ/(α + θ))A_{K+M−1}(0) )^{−2}.    (46)
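Numerically, π(s) is easiest to obtain directly from the recursions (43) together with formula (45); E[L] then follows by differentiating at s = 0, here via complex-step differentiation rather than the symbolic expression (46). The sketch below is our own illustration (not the authors' code) and assumes M ≥ 1, since the seeds (40)-(41) rely on it:

```python
def busy_period_lst(s, K, M, alpha, mu, theta):
    """pi(s), the Laplace-Stieltjes transform of the busy period length,
    via the recursions (38)-(43) and formula (45). Assumes M >= 1 and
    K + M >= 2; s may be real or complex."""
    N = K + M
    A = [0.0] * N                         # A_j(s), B_j(s) of Eq. (42)
    B = [0.0] * N
    A[0] = 1.0 / mu
    A[1] = (s + K*alpha + mu) * (s + K*alpha + theta) / (theta * mu**2)
    B[1] = -(s + K*alpha + theta) / (theta * mu)
    for j in range(1, N - 1):
        if j <= M - 1:                    # recursion (38)
            c1 = theta * mu / (s + K*alpha + theta)
            c2 = K*alpha*mu / (s + K*alpha + theta) - (s + K*alpha + mu)
            c3 = K * alpha
        else:                             # recursion (39)
            c1 = theta * mu / (s + (K + M - 1 - j)*alpha + theta)
            c2 = ((K + M - j)*alpha*mu / (s + (K + M - j)*alpha + theta)
                  - (s + (K + M - 1 - j)*alpha + mu))
            c3 = (K + M - j) * alpha
        A[j + 1] = -(c2 * A[j] + c3 * A[j - 1]) / c1
        B[j + 1] = -(c2 * B[j] + c3 * B[j - 1]) / c1
    c = alpha * mu / (s + alpha + theta) - (s + mu)   # j = K+M-1 in (39)
    return -(c * B[N - 1] + alpha * B[N - 2]) / (c * A[N - 1] + alpha * A[N - 2])

def mean_busy_period(K, M, alpha, mu, theta, h=1e-30):
    """E[L] = -pi'(0) by complex-step differentiation."""
    return -busy_period_lst(complex(0.0, h), K, M, alpha, mu, theta).imag / h
```

A useful sanity check: by a standard regenerative argument a cycle consists of an exp(Kα) idle period at (0, 0) followed by a busy period, so p00 = 1/(1 + Kα E[L]), which the recursions reproduce.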
4   Waiting Time
The analysis of the waiting time process for retrial queues is usually far more difficult than the analysis of the number in the system. To study the waiting time, we first need to obtain the arriving customer's distribution of the server state and the queue length, denoted by qij, where qij is the probability that a primary arrival finds the system in the state (i, j), i.e., the server is in state i and there are j customers in the orbit. In a finite source system we have qij ≠ pij in general. Therefore, we have to relate the stationary probability pij to the probability qij that a primary arrival finds the system in the state (i, j). Following Theorem 2.10.6 in Walrand [14], we obtain qij as follows:

q0j = (λ)^{−1} Kα p0j,    0 ≤ j ≤ M,    (47)

q0j = (λ)^{−1} (K + M − j)α p0j,    M + 1 ≤ j ≤ K + M − 1,    (48)

q1j = (λ)^{−1} Kα p1j,    0 ≤ j ≤ M − 1,    (49)

q1j = (λ)^{−1} (K + M − 1 − j)α p1j,    M ≤ j ≤ K + M − 2.    (50)
Assume that at time t = 0 there are j sources in the orbit and i customers in service, 1 ≤ j ≤ K + M − 1, i = 0, 1. We mark the kth customer in the queue in the orbit, 1 ≤ k ≤ j, and denote by fijk(t) the probability that by time t this customer has not yet been served, i.e., that the residual waiting time of the tagged customer, τijk, is greater than t:

fijk(t) = P{τijk > t}.

In terms of these probabilities the complementary waiting time distribution function F(t) of a new primary call can be expressed as follows:

F(t) = ∑_{j=1}^{K+M−1} q1,j−1 f1jj(t).    (51)

Using (49)-(50) and (7)-(8), we can rewrite (51) as

F(t) = (λ)^{−1} α [ ∑_{j=1}^{M} K p1,j−1 f1jj(t) + ∑_{j=M+1}^{K+M−1} (K + M − j) p1,j−1 f1jj(t) ]
     = (λ)^{−1} θ ∑_{j=1}^{K+M−1} p0j f1jj(t).    (52)
We introduce an auxiliary Markov process ζ(t) with the state space {(i, j, k) | i = 0, 1; j = 1, 2, ..., K + M − 1; 1 ≤ k ≤ j} ∪ {(1, j, 0) | j = 0, 1, ..., K + M − 2}. State (i, j, k) can be thought of as the presence in the system of i customers in service, j customers in orbit, with the tagged customer at the kth position in the queue in the orbit. The special states {(1, j, 0) | j = 0, 1, ..., K + M − 2} are absorbing states, and a transition into one of these states means that the tagged customer starts to be served. Thus the residual waiting time of the tagged customer, τijk, is simply the time until absorption. From the Kolmogorov backward equations for the Markov chain ζ(t) we get:

f′0jk(t) = −(Kα + θ)f0jk(t) + θf1,j−1,k−1(t) + Kαf1jk(t),    1 ≤ j ≤ M − 1, 1 ≤ k ≤ j,    (53)

f′0jk(t) = −((K + M − j)α + θ)f0jk(t) + θf1,j−1,k−1(t) + (K + M − j)αf1jk(t),    M ≤ j ≤ K + M − 1, 1 ≤ k ≤ j,    (54)

f′1jk(t) = −(Kα + μ)f1jk(t) + μf0jk(t) + Kαf1,j+1,k(t),    1 ≤ j ≤ M − 1, 1 ≤ k ≤ j,    (55)

f′1jk(t) = −((K + M − 1 − j)α + μ)f1jk(t) + μf0jk(t) + (K + M − 1 − j)αf1,j+1,k(t),    M ≤ j ≤ K + M − 1, 1 ≤ k ≤ j.    (56)

For the Laplace transforms φijk(s) = ∫_0^∞ e^{−st} fijk(t) dt, we introduce the Laplace-Stieltjes transform of the waiting time W,

W(s) = 1 − s ∫_0^∞ e^{−st} F(t) dt,    (57)
and the Laplace-Stieltjes transforms of the conditional waiting times τijk,

τijk(s) = 1 − sφijk(s).    (58)

Combining (52) and (57)-(58), we get

W(s) = 1 − (λ)^{−1} θ ∑_{j=1}^{K+M−1} p0j (1 − τ1jj(s)).    (59)

Differentiating this relation with respect to s at the point s = 0, we get the following formula for the nth moment of the waiting time W:

E[W^n] = (λ)^{−1} θ ∑_{j=1}^{K+M−1} p0j E[τ^n_{1jj}],    n ≥ 1.    (60)
Thus, to calculate the nth moment of W we need to know the nth moments of the conditional waiting times τ1jj, 1 ≤ j ≤ K + M − 1. Next, we show how to compute the moments of τijk recursively.

First, we denote E[τ^n_{0jk}] by a^{(n)}_{jk} and E[τ^n_{1jk}] by b^{(n)}_{jk}. Multiplying Eqs. (53)-(56) by t^{n−1} and integrating with respect to t from t = 0 to t = ∞, we get the following set of equations for n ≥ 1:

−(Kα + θ)a^{(n)}_{jk} + θb^{(n)}_{j−1,k−1} + Kαb^{(n)}_{jk} = −n a^{(n−1)}_{jk},    1 ≤ j ≤ M − 1, 1 ≤ k ≤ j,    (61)

−((K + M − j)α + θ)a^{(n)}_{jk} + θb^{(n)}_{j−1,k−1} + (K + M − j)αb^{(n)}_{jk} = −n a^{(n−1)}_{jk},    M ≤ j ≤ K + M − 1, 1 ≤ k ≤ j,    (62)

−(Kα + μ)b^{(n)}_{jk} + μa^{(n)}_{jk} + Kαb^{(n)}_{j+1,k} = −n b^{(n−1)}_{jk},    1 ≤ j ≤ M − 1, 1 ≤ k ≤ j,    (63)

−((K + M − 1 − j)α + μ)b^{(n)}_{jk} + μa^{(n)}_{jk} + (K + M − 1 − j)αb^{(n)}_{j+1,k} = −n b^{(n−1)}_{jk},    M ≤ j ≤ K + M − 1, 1 ≤ k ≤ j.    (64)

Eliminating a^{(n)}_{jk} from these relations we find that:

Kα(Kα + θ)b^{(n)}_{j+1,k} + (Kαμ − (Kα + μ)(Kα + θ))b^{(n)}_{jk} + θμb^{(n)}_{j−1,k−1} = −n(μa^{(n−1)}_{jk} + (Kα + θ)b^{(n−1)}_{jk}),    1 ≤ j ≤ M − 1, 1 ≤ k ≤ j,    (65)

(K + M − 1 − j)((K + M − j)α + θ)αb^{(n)}_{j+1,k} + ((K + M − j)αμ − ((K + M − 1 − j)α + μ)((K + M − j)α + θ))b^{(n)}_{jk} + θμb^{(n)}_{j−1,k−1} = −n(μa^{(n−1)}_{jk} + ((K + M − j)α + θ)b^{(n−1)}_{jk}),    M ≤ j ≤ K + M − 1, 1 ≤ k ≤ j.    (66)

It is easy to see that b^{(n)}_{j−1,0} = b^{(n)}_{K+M,k} = 0 for n ≥ 1, and a^{(0)}_{jk} = b^{(0)}_{jk} = 1 for 1 ≤ j ≤ K + M − 1 and 1 ≤ k ≤ j.
This set of equations can be solved with the help of the following algorithm:

Step 1: Put n = 1 in (61)-(66).
Step 2: Putting j = K + M − 1, K + M − 2, ..., M in (66) and j = M − 1, M − 2, ..., 1 in (65) with k = 1, we obtain the values of b^{(n)}_{j1}, 1 ≤ j ≤ K + M − 1.
Step 3: Repeat Step 2 for k = 2, 3, ..., j sequentially, 2 ≤ j ≤ K + M − 1. Now all b^{(n)}_{jk}, 1 ≤ j ≤ K + M − 1, 1 ≤ k ≤ j, have been calculated.
Step 4: Substituting all b^{(n)}_{jk} into (62)-(63), we obtain a^{(n)}_{jk} for 1 ≤ j ≤ K + M − 1, 1 ≤ k ≤ j.
Step 5: Repeat Steps 2-4 for n = 2, 3, .... In this way all moments E[τ^n_{ijk}] of the conditional waiting times can be calculated.
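For n = 1 the algorithm reduces to the descending-j sweeps below. The code is our own illustration, not the authors'; it also recomputes p0j, p1j and λ from (12)-(20) so that E[W] obtained from (60) can be cross-checked against Little's formula (21):

```python
def waiting_time_mean(K, M, alpha, mu, theta):
    """E[W] from Eq. (60), with the conditional means b_{jk} = E[tau_{1jk}]
    obtained by Steps 1-3 applied to Eqs. (65)-(66) for n = 1
    (a^{(0)} = b^{(0)} = 1). Returns (E[W], E[N], lambda)."""
    N = K + M
    # stationary probabilities p0j, p1j from Eqs. (12)-(16)
    p0, p1 = [1.0], [K * alpha / mu]
    for j in range(1, N):
        if j <= M:
            p0.append((K*alpha)**(j+1) * (K*alpha + theta)**(j-1) / (mu*theta)**j)
        else:
            prod = 1.0
            for n in range(M + 1, j + 1):
                prod *= K + M - n
            for n in range(M, j):
                prod *= (K + M - n) * alpha + theta
            p0.append(prod * alpha**(j+1) * K**(M+1)
                      * (K*alpha + theta)**(M-1) / (mu*theta)**j)
        rate = (K*alpha + theta) if j <= M else ((K + M - j)*alpha + theta)
        p1.append(rate * p0[j] / mu)
    Z = sum(p0) + sum(p1)
    p0 = [x / Z for x in p0]
    p1 = [x / Z for x in p1]
    lam = K*alpha*sum(p0[j] + p1[j] for j in range(M)) + alpha*sum(
        (K + M - j)*p0[j] + (K + M - 1 - j)*p1[j] for j in range(M, N))  # (20)
    # b[j][k]; boundary values b[.][0] = 0 and b[K+M][.] = 0
    b = [[0.0] * (N + 1) for _ in range(N + 1)]
    for k in range(1, N):                     # Step 3: k = 1, 2, ...
        for j in range(N - 1, k - 1, -1):     # Step 2: descending j
            if j >= M:                        # Eq. (66)
                c1 = (K + M - 1 - j) * ((K + M - j)*alpha + theta) * alpha
                c2 = ((K + M - j)*alpha*mu
                      - ((K + M - 1 - j)*alpha + mu) * ((K + M - j)*alpha + theta))
                rhs = -(mu + (K + M - j)*alpha + theta)
            else:                             # Eq. (65)
                c1 = K*alpha * (K*alpha + theta)
                c2 = K*alpha*mu - (K*alpha + mu) * (K*alpha + theta)
                rhs = -(mu + K*alpha + theta)
            b[j][k] = (rhs - c1 * b[j + 1][k] - theta*mu * b[j - 1][k - 1]) / c2
    EW = (theta / lam) * sum(p0[j] * b[j][j] for j in range(1, N))  # (60), n = 1
    EN = sum(j * (p0[j] + p1[j]) for j in range(N))                 # (19)
    return EW, EN, lam
```

Since every primary call that finds the server idle has zero wait, Little's law applied to the orbit gives E[N] = λ E[W], which the two independent routes reproduce exactly.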
5   Numerical Examples
In this section we investigate the effect of the parameters on the main performance characteristics of the system. To this end, three curves corresponding to M = 1, 5, 10 are presented in Figs. 1-3, which depict the rate of generation of primary calls α versus the mean number of customers in the orbit E[N], the mean waiting time in the orbit E[W], and the mean length of the busy period E[L]. The model is considered with K = 10 sources, service rate μ = 1 and retrial rate θ = 0.1.

[Fig. 1. Mean queue length vs. α: E[N] plotted against the source arrival rate α for M = 1, 5, 10.]

From Fig. 1 we can draw some conclusions. It is evident that E[N] is a monotonically increasing function of both α and M. This is due to the fact that more primary calls arrive at the system for greater values of α
[Fig. 2. Mean waiting time vs. α: E[W] plotted against the source arrival rate α for M = 1, 5, 10.]

[Fig. 3. Mean length of the busy period vs. α: E[L] plotted against the source arrival rate α for M = 1, 5, 10.]
and M . When the server is busy, the more primary calls arrive, the more the number of sources in the orbit will be. Fig. 2 describes the the influence of the parameters α and M on the mean waiting time in orbit E[W ]. We observe that E[W ] increases with increasing value of α form 0 to some value, i.e., it has a maximum, but then becomes a decreasing function of α. On the other hand, with the increasing of the value M , there are more primary calls having to move into the orbit when they find the server is busy upon their arrival. Thus, E[W ] is an monotonically increasing function of M . Fig. 3 depicts the behavior of the mean value of the busy period E[L] against α and M . As intuition tells us, with the increase of the arrival rate α, there are
A Finite Source Retrial Queue with Spares and Orbit Search
29
more sources going to the retrial orbit so that the length of the busy period is to be increased. Meanwhile, more spares that are considered may increase the number of customers waiting in the orbit, which leads to a longer length of the busy period. Therefore, as M increases, E[L] also increases.
6   Conclusions

In this paper we present an exhaustive study of the queueing measures of a finite source retrial queueing system with orbit search, in which the system is maintained by replacing failed parts with spares. We model our system as a Markov chain and derive several important queueing measures in steady state. This research extends finite source retrial queueing theory, and the analysis of the model provides a useful performance evaluation tool for more general situations arising in practical applications, such as production systems, flexible manufacturing systems, computer and communication systems, and many other related systems.

Acknowledgments. This work was sponsored by the National Natural Science Foundation of China (Grant No. 11171019) and the Fundamental Research Funds for the Central Universities (Nos. 2011JBZ012 and 2011YJS281).
References

1. Alfa, A.S., Sapna, I.K.P.: An M/PH/k retrial queue with finite number of sources. Computers & Operations Research 31, 1455–1464 (2004)
2. Almási, B., Bolch, G., Sztrik, J.: Heterogeneous finite-source retrial queues. Journal of Mathematical Sciences 121, 2590–2596 (2004)
3. Artalejo, J.R., Gómez-Corral, A.: Steady state solution of a single-server queue with linear repeated requests. Journal of Applied Probability 34, 223–233 (1997)
4. Artalejo, J.R.: Retrial queues with a finite number of sources. Journal of the Korean Mathematical Society 35, 503–525 (1998)
5. Artalejo, J.R., Gómez-Corral, A.: Retrial Queueing Systems: A Computational Approach. Springer, Heidelberg (2008)
6. Economou, A., Kanta, S.: Equilibrium customer strategies and social-profit maximization in the single-server constant retrial queue. Naval Research Logistics 58, 107–122 (2011)
7. Efrosinin, D., Sztrik, J.: Stochastic analysis of a controlled queue with heterogeneous servers and constant retrial rate. Information Processes 11, 114–139 (2011)
8. Falin, G.I., Templeton, J.G.C.: Retrial Queues. Chapman & Hall, London (1997)
9. Falin, G.I., Artalejo, J.R.: A finite source retrial queue. European Journal of Operational Research 108, 409–424 (1998)
10. Haque, L., Armstrong, M.J.: A survey of the machine interference problem. European Journal of Operational Research 179, 469–482 (2007)
11. Kornyshev, Y.N.: Design of a fully accessible switching system with repeated calls. Telecommunications 23, 46–52 (1969)
12. Naor, P.: On machine interference. Journal of the Royal Statistical Society, Series B (Methodological) 18, 280–287 (1956)
13. Neuts, M.F., Ramalhoto, M.F.: A service model in which the server is required to search for customers. Journal of Applied Probability 21, 157–166 (1984)
14. Walrand, J.: An Introduction to Queueing Networks. Prentice Hall, Englewood Cliffs (1988)
15. Wang, J., Zhao, L., Zhang, F.: Analysis of the finite source retrial queues with server breakdowns and repairs. Journal of Industrial and Management Optimization 7, 655–676 (2011)
16. Wüchner, P., Sztrik, J., de Meer, H.: Structured Markov chains arising from homogeneous finite-source retrial queues with orbit search. In: Dagstuhl Seminar Proceedings 07461, Numerical Methods for Structured Markov Chains, Dagstuhl, Germany (2008)
17. Wüchner, P., Sztrik, J., de Meer, H.: Homogeneous finite-source retrial queues with search of customers from the orbit. In: Proceedings of the 14th GI/ITG Conference MMB - Measurements, Modelling and Evaluation of Computer and Communication Systems, Dortmund, Germany, pp. 109–123 (2008)
18. Wüchner, P., Sztrik, J., de Meer, H.: Finite-source M/M/s retrial queue with search for balking and impatient customers from the orbit. Computer Networks 53, 1264–1273 (2009)
Bounds for Two-Terminal Network Reliability with Dependent Basic Events

Minh Lê¹ and Max Walter²

¹ Lehrstuhl für Rechnertechnik und Rechnerorganisation, Technische Universität München, Munich, Germany
² Siemens AG, Nürnberg, Germany
Abstract. The two-terminal reliability problem has been widely studied since the 1970s, and many efficient algorithms have been proposed. Nevertheless, all these algorithms assume that the system components are independent. For today's applications it is often not sufficient to assume independent component failures, because in fault-tolerant systems components may fail due to common cause failures or fault propagation. We therefore propose an algorithm which handles such dependencies. In addition, lower and upper bounds can be obtained in case the algorithm cannot be run to completion. The performance and accuracy of the algorithm are demonstrated on a network with a recursive structure for which the exact result is given by a polynomial.
1   Introduction

For determining the reliability or availability of a fault tolerant system, the system redundancy structure can be modelled by a Reliability Block Diagram (RBD) [20]. Therein the edges represent the system components, each with a binary state. Two nodes are specified to be the terminal nodes. Under the assumption of independent components, one seeks the probability that there exists at least one path of working edges connecting the terminal nodes (Fig. 1). This problem is known to be NP-complete, and many different algorithms have been conceived to solve it efficiently. In the literature one can find the following methods: state enumeration and sum of disjoint products (SDP [23]), the factoring theorem with series and parallel reductions [1], and Edge Expansion Diagrams (EED) using an Ordered Binary Decision Diagram (OBDD) based method [4]. The methods using SDP and state enumeration require that minimal paths or cuts be enumerated in advance. However, the vital drawback of those methods is that the computational effort for disjointing the minimal path or cut sets grows rapidly with the network size. Instead, it is preferable to apply factoring or EEDs using OBDDs [4]. The efficiency of the BDD based methods depends largely on the BDD variable ordering, which itself is known to be NP-hard [4]. Furthermore, both mentioned methods lack the ability to provide bounds in case

J.B. Schmitt (Ed.): MMB & DFT 2012, LNCS 7201, pp. 31–45, 2012.
© Springer-Verlag Berlin Heidelberg 2012
of non-termination. Thus, another method was proposed by Gobien and Dotson which is based on a set-theoretical partition of the sample space into disjoint sets and can yield at least lower and upper bounds in case the computation cannot be run to completion [22]. This method has been extended with series and parallel reductions to increase efficiency [17]. As we already stressed, accounting for dependencies among component failures has become an integral part of the appropriate reliability assessment of, e.g., communication, water supply or power networks, because the simplifying assumption of independent failures would lead to overoptimistic results. For instance, certain close-by components in a large water supply network may fail dependently due to natural influences like local earthquakes, and servers may fail dependently due to power spikes. Hence, those interdependencies must be taken into account. This can be done by introducing disjoint sets of interdependent components (SICs). More precisely, interdependent components are enclosed in one SIC, and a system can contain several different SICs. Furthermore, any component of an arbitrary SIC is independent of all components beyond this SIC. Compared with dynamic fault trees (DFTs) [9][8], where dependencies can also be considered, our approach can handle an arbitrary arrangement of SICs in the system structure, whereas a DFT can only be decomposed into independent subtrees if the leaves which have a common AND/OR-gate are in one SIC. So the exploitation of independent subtrees is only possible for certain configurations of SICs. The same issue holds when applying GSPNs (generalized stochastic Petri nets [20]) to the SICs of a DFT. For each SIC a stochastic based model (SBM) is generated, from which the probability of dependent failure combinations can be obtained. An SBM can be represented by Petri nets [11], copulas [16][19], stochastic process algebras [7][12] or stochastic simulation models [6].
In our previous works [14] and [13] we already considered the arbitrary arrangement of SICs: the first work [14] is based on EEDs using OBDDs and uses the Shannon expansion for each SIC, which becomes quite complex for growing SIC sizes. The latter work [13] therefore proposes a method based on factoring and reductions where only the relevant dependent basic events are considered instead of all possible combinations. Nevertheless, the efficiency of the method of [13] strongly depends on a good variable ordering. Because the limits of those two methods can soon be reached for larger graph sizes, it is important to obtain at least bounds in case a computation cannot be finished. On these grounds we extend the Gobien-Dotson (GD) algorithm with series and parallel reductions in order to handle component interdependencies. In view of the LARES framework [15], this algorithm shall be integrated into the LARES toolchain using CASPA [12] as a solver for the SBM. After giving the formal statement of the described problem and a brief introduction to the GD algorithm in section 2, we present the dependent version of the GD algorithm in section 3.1 and apply it to the example network of Fig. 1. To show that the algorithm can deal with large networks without any reducible structures, we demonstrate in section 4 the performance of the algorithm on a recursive network structure named the K4-ladder. Finally, an outlook is given in section 5.
Fig. 1. Initial graph
2   Preliminaries

2.1   Formal Description
We have a multigraph G := (V, E) with no loops, where V stands for a set of vertices (nodes) and E for a multiset of unordered pairs of vertices, called edges. Given the redundancy structure of a system modelled by a network graph G := (V, E), we specify two nodes s and t which characterize the terminal nodes (in Fig. 1 those nodes are colored black). We define two not necessarily injective maps f and g, where

f : E → C

assigns the edges to the system components and

g : E → V²

assigns each edge to a pair of nodes. The finite set of system components is defined by C = S ∪ T, where T is the set of independent components (each component in T can be regarded as a SIC on its own) and S = SIC1 ∪ SIC2 ∪ ... ∪ SICn, n ∈ N, represents the disjoint union of SICs. The mapping of several edges to one component c implies that c is a multiple component, where c can be from any SIC or from the set T. The dependency relation between two components implies that they must belong to the same SIC. Because the dependency property is transitive, a SIC can be regarded as a transitive closure in which each element depends on the others, whether directly or indirectly. So two components which are dependent must belong to the same SIC, and if they are from different SICs they are independent. In other words, for all i, j with i ≠ j it holds that SICi ∩ SICj = ∅. Moreover, |SICk| ≥ 2 for all k ∈ {1, ..., n}. So if there are two components x1, x2 from different SICs, their joint probability of failure is the product of their respective failure probabilities. If they were in the same SIC, we would have to establish a separate SBM for this SIC to obtain the joint probability. Because each SIC can be regarded as a random variable X over the 2^{|SIC|} state combinations of its components, the effort for generating the state space probabilities in terms of the SBM (the respective probability density function of X) grows exponentially with the number of components in one SIC. In our example graph of Fig. 1 there are two SICs: SIC1 contains three components and SIC2 two. All remaining independent components are assigned to T. Because f is not necessarily injective, we can assign several edges to one component. This does not imply that f is surjective, because there might be components which are not represented by any edge. For example, two components x, y may fail due to a common cause failure coming from a component z, but z plays no role in the system's redundancy structure. This means that we allow the multiple
occurrence of one component in any system's redundancy structure. This fact is expressed by saying that there exists a multiple component in the redundancy structure. The example network from Fig. 1 contains three multiple components 1, 2 and 3, which represent a two-out-of-three redundancy substructure. After introducing the notation we can now formulate the problem as follows:

Statement of the problem. Given a network graph G := (V, E), its terminal nodes s, t, a set of system components C = S ∪ T and two not necessarily injective maps f : E → C and g : E → V². Each component c ∈ C represents a random variable with two states: failed or working. The reliability for each c ∈ C is given by pc. For each SIC ⊆ S there exists a corresponding SBM. The system's terminal pair reliability R is the probability that the two specified terminal nodes can be connected by at least one path consisting only of edges associated with working components.

Rules for series and parallel reductions. To speed up the calculation, reduction techniques should be applied to the graph whenever possible. With negligible expense, certain substructures can be identified and simplified so that the graph size decreases while the reliability of the original graph is preserved. Typical well-known reduction methods are series and parallel reductions. Taking into account dependencies among certain system components, we sum up the rules and heuristics for series and parallel reductions proposed in our previous work [13]:

– Reductions with multiple components are not allowed, unless a multiple component becomes unique in the course of the algorithm.
– Reductions can only be performed among independent components or among components from the same SIC, i.e., e1, e2 ∈ T or e1, e2 ∈ SICi, i ∈ N.

For the algorithm in section 3 we use a map named EdgeProbMap (epm) in which the failure probability for each edge is stored.
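For independent edges these reduction rules amount to the familiar probability updates, while a dependent reduction is purely symbolic. A minimal sketch of ours (not the paper's implementation; note that the series/parallel formulas given in the text apply to edge working probabilities, which is the convention used here, and in [13] the bookkeeping is done via the epm and label maps):

```python
def series_reduce(p_e1, p_e2):
    """Independent series reduction: both edges must work."""
    return p_e1 * p_e2

def parallel_reduce(p_e1, p_e2):
    """Independent parallel reduction: at least one edge must work."""
    return p_e1 + p_e2 - p_e1 * p_e2

def dependent_reduce(label_e1, label_e2, kind):
    """Dependent reduction inside one SIC: no numeric update is possible,
    so the surviving edge gets a concatenated boolean label R_j whose
    probability is later evaluated by the SIC's stochastic model (SBM)."""
    op = " and " if kind == "series" else " or "
    return "(" + label_e1 + op + label_e2 + ")"
```

A reduction thus either updates the epm entry of the surviving edge (independent case) or replaces its label by the combined expression (dependent case).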
In addition, each edge is initially associated with a component in order to distinguish edges which are multiple, dependent or independent. By doing so one can keep track of reductions, and hence this map can only change in case a reduction has taken place. In case of an independent reduction, the probability of the edge resulting from the reduction is re-adjusted according to the rules for a series or parallel reduction. So for edges e1, e2 ∈ T the probability for the new edge, labeled ri, would be p_{ri} = p_{e1} p_{e2} for a series and p_{ri} = p_{e1} + p_{e2} − p_{e1} p_{e2} for a parallel reduction, where i stands for the i-th independent reduction. In case of a dependent reduction, we relabel one of the two affected edges with a capital letter Rj, where j stands for the j-th dependent reduction. W.l.o.g. we label e1 with Rj and delete e2. Rj comprises the concatenated expression of the two affected edges. Here we introduce the labeling function l : E → B, where B stands for a boolean expression in disjunctive normal form; initially l equals f, where all components are literals set to true. To be more precise, for edges e1, e2 ∈
SICk, k ∈ N, Rj = l(e1) ∧ l(e2) for a series reduction and Rj = l(e1) ∨ l(e2) for a parallel reduction. Thereafter e2 is deleted and l(e1) = Rj. For details of the algorithmic approach we refer to [13]. There are also other established reduction methods, such as the polygon-to-chain [10] or triangle reduction [21], which have not yet been considered due to their high complexity.

2.2   Basics of the Independent GD Algorithm
Before starting with the dependent extension of the Gobien-Dotson algorithm, we want to recapitulate the independent version. According to [3], for a network consisting of n links, an elementary event E is a binary specification of the n links in an n-dimensional sample space. E can be represented by a vector where each entry bears a link label which can be negated or not. A full event is recursively defined as either an elementary event or the union of two events differing in only one entry of the event vector. For example, if n = 2 then the full event [1] is the union of [1, 2] and [1, ¬2]. A path is a sequence of links l1, l2, ..., lk such that the terminal node of li coincides with the initial node of li+1, 1 ≤ i ≤ k − 1. A path is the full event that is the union of all elementary events that include the links in the path. A success event is defined as a full event such that each of its elementary events contains an s-t-path, where s and t are the terminal nodes. For a failure event the same holds, except that each of its elementary events contains an s-t-cut.

Assume there exists a path P = [1, 2, ..., r] in our network graph G. Following [17], the reliability expression of G can be obtained by recursively applying the factoring theorem to each of the r edges in path P. Starting with the first edge e1 gives us:

Rel(G) = p1 · Rel(G ∗ e1) + q1 · Rel(G − e1),

where ∗/− stands for contraction/deletion of the edge and qi = 1 − pi, 1 ≤ i ≤ r, is the probability of edge ei's failure. The last term Rel(G − e1) is then again expanded by factoring on edge e2. Overall it follows:

Rel(G) = q1 · Rel(G − e1)
+ p1 q2 · Rel(G ∗ e1 − e2)
+ ...
+ p1 p2 · · · pr−1 qr · Rel(G ∗ e1 ∗ e2 ∗ ... ∗ er−1 − er)
+ ∏_{k=1}^{r} pk.

So we have r subproblems (subgraphs) emanating from path P. This equation can be applied recursively to each subproblem. Thus, for each subproblem we look for the topologically shortest path to keep the number of subproblems low. In each subgraph, series and parallel reductions can be performed where possible. Suppose S is a disjoint exhaustive collection of success events Si, 1 ≤ i ≤ N, in G. Then after [22] the terminal pair reliability of G is represented by

Rel(G) = ∑_{i=1}^{N} P(Si),
where in our example P(Si) = ∏_{k=1}^{r} pk if Si = P. Analogously, for the exhaustive failure collection F := {Fi, 1 ≤ i ≤ M} it holds that

Rel(G) = 1 − ∑_{i=1}^{M} P(Fi).

For u < |S|, v < |F| and u, v ∈ N, the lower and upper bounds for the reliability are

∑_{i=1}^{u} P(Si) ≤ Rel(G) ≤ 1 − ∑_{i=1}^{v} P(Fi).
The cut and path terms contributing to the bounds will be exemplified by an example in section 3.2.
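The single-edge factoring step that this expansion iterates can be sketched as follows for the purely independent case. This is an illustrative implementation of the classical factoring recursion only, not the dependent GD algorithm of this paper; names are ours:

```python
def two_terminal_reliability(edges, s, t):
    """Two-terminal reliability by recursively applying the factoring theorem
    Rel(G) = p_e * Rel(G * e) + q_e * Rel(G - e)  (independent edges only).
    `edges` is a list of (u, v, p) triples, p = probability that the edge
    works; contraction is realised by renaming the contracted node."""
    # t unreachable from s even with all edges up -> a cut has occurred
    adj = {}
    for u, v, _ in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    seen, stack = {s}, [s]
    while stack:
        for w in adj.get(stack.pop(), ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    if t not in seen:
        return 0.0
    # factor on an edge incident to s (one always exists here)
    i = next(i for i, (u, v, _) in enumerate(edges) if s in (u, v))
    u, v, p = edges[i]
    rest = edges[:i] + edges[i + 1:]
    other = v if u == s else u
    if other == t:                      # contracting e merges the terminals
        succ = 1.0
    else:                               # contract: rename `other` to s, drop loops
        contracted = [(s if x == other else x, s if y == other else y, q)
                      for (x, y, q) in rest
                      if not (x in (s, other) and y in (s, other))]
        succ = two_terminal_reliability(contracted, s, t)
    return p * succ + (1.0 - p) * two_terminal_reliability(rest, s, t)
```

For instance, on the five-edge bridge network with all edge reliabilities 0.9, the recursion returns the well-known value 0.97848. The GD partition builds on exactly these success (contraction) and failure (deletion) terms to accumulate the bounds above.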
3 The Dependent GD with Reductions
This part of our work gives a brief overview of the whole dependent GD algorithm. First we explain the procedures listed below; then, in Section 3.2, we demonstrate the algorithm by means of an example network.

3.1 The Algorithm
Starting with Procedure 1, the relevant structure of the input network is required in the form of an RBD. All edge probabilities of the RBD are contained in the map epm. The map componentSICIndex assigns each component to its appropriate SIC index, and componentEdges maps it to the respective edges. From these we can deduce the map MultipleEdges, which contains all multiple edges of the current network graph. Afterwards we initialize the global lists which store the maps of the accumulated graph operations (Mreconstr), the accumulated path information (Macc2) and the current edge probabilities (Mepm) to be processed. We start the computation by calling Procedure 2. The recursive algorithm generates a call tree whose edges are labeled with the edges to be contracted or deleted and whose nodes contain the subproblems derived from their parent nodes (e.g. Fig. 3). As can be seen, the level parameters are set in such a way that the nodes of the recursion tree are processed in breadth-first-search (BFS) order. Proceeding this way causes high memory consumption, as the breadth of the tree can grow exponentially in relation to its depth; on the other hand, the bounds converge quickly, because the most probable paths and cuts are found in the upper levels of the tree, so every completed depth level tightens the bounds considerably. We could also process the nodes in depth-first-search (DFS) order. On the one hand this would consume less memory, but on the other hand the computation of bounds would become pointless due to the extremely slow convergence. In order to retain good bound convergence and at the same time keep the memory consumption as low as possible, we have to put up with longer runtimes. This can be done
by only storing the changes made to the graph in the map Mreconstr. Instead of storing the subgraphs themselves, we reconstruct them using the function reconstructGraph() and the map Mreconstr (Procedure 4, line 7). In Procedure 2 we check whether the current graph is connected; otherwise there is a cut, which is processed by the function classifyCutTerms: all edges belonging to the cut are classified according to their SICs. For each SIC the extracted Boolean term is stored in the map cutDep, whereas the probabilities of the independent edges are multiplied and the result is stored in cutIndep. Both entries obtain the same index i for the i-th cut, so that later on, when the probabilities for the dependent Boolean terms are returned from the SBM, the whole probability of the cut can be reassembled. In line 9 of Procedure 2 the graph is reduced if possible. In case a reduction has taken place, epm changes accordingly; additionally, Mreconstr must be updated with the changes from the reduction. If the graph is connected, we seek the topologically shortest path (sp) by applying BFS. Then the sp is classified together with the accumulated path terms in the function classifyPathTerms, as just described. In Procedure 3, line 6, multiple edges are treated for the case that at least two edges of our sp are assigned to the same component. Hereafter, all edges of the sp to be contracted are collected in the temporary list Lcollect and finally delivered to the list Lcontract. All edge contractions and deletions are stored in the map Macc (lines 10-23) in order to accumulate them into the map Macc2 at the end (line 24). Finally, all relevant maps are passed to the next level for processing (lines 28-29). After finishing Procedure 3, Procedure 4 is called in line 10 of Procedure 1. There the nodes of the respective levels are first reconstructed by reconstructGraph() and then processed, so that after finalizing all nodes of each level one obtains the current values of the lower (lb) and upper (ub) bounds of the unreliability (C and 1 − R).
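The memory-saving idea, storing only per-node operation maps and rebuilding subgraphs on demand, can be sketched as follows. Names such as reconstruct_graph and the example edge set are illustrative assumptions, not taken from the actual implementation:

```python
def reconstruct_graph(init_edges, ops):
    """Rebuild a subproblem from the initial graph plus an accumulated map
    of operations on edge ids (True = contract, False = delete).
    init_edges: dict edge_id -> (u, v)."""
    alias = {}                                   # node -> representative node
    def find(n):
        while n in alias:
            n = alias[n]
        return n
    edges = dict(init_edges)
    for eid, is_contraction in ops.items():
        u, v = (find(n) for n in edges.pop(eid))
        if is_contraction and u != v:
            alias[v] = u                         # merge v into u
    # apply the accumulated node merges, dropping self-loops
    return {eid: (find(u), find(v))
            for eid, (u, v) in edges.items()
            if find(u) != find(v)}

G0 = {1: ('s', 'a'), 2: ('a', 't'), 3: ('s', 'b'), 4: ('b', 't')}
# subproblem: contract edge 1, then delete edge 2
print(reconstruct_graph(G0, {1: True, 2: False}))
# -> {3: ('s', 'b'), 4: ('b', 't')}
```

Only the small operation map per recursion-tree node has to be kept in memory; the subgraph itself is recomputed from the initial graph whenever the node is processed.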
Procedure 1. GobienDotsonDependent
Input: RBD InitGraph, EdgeProbMap epm
1: //Initialize mappings
2: Map componentSICIndex, componentEdges, MultipleEdges ⇐ InitMappings();
3: //Initialize global lists
4: List lastLevel = new List;
5: List nextLevel = new List;
6: //Start computation
7: computeRel(InitGraph, new Map<Edge,Bool>, new Map<Edge,Bool>, epm);
8: lastLevel = nextLevel;
9: nextLevel = new List;
10: bfsLevel();
Procedure 2. ComputeRel
Input: RBD Graph, Map Macc, Map Mreconstracc, EdgeProbMap epm
1: bool b = Graph.findPath()
2: if b == false then
3: //Cut found, i.e. Graph is not connected.
4: classifyCutTerms(Macc, epm);
5: return
6: end if
7: //Preprocessing: reduce graph.
8: SPRed red = new SPRed(Graph, epm, IsMultiple, CompSICIndex);
9: Graph = red.Reduce();
10: //Update edge probabilities and accumulate changes due to reductions
11: epm = red.getEdgeProbMap();
12: Mreconstracc.add(red.getreconstr());
13: //Find shortest path by breadth-first search
14: List sp = BFS.shortestPath(Graph);
15: //Group terms according to their SICs and add them to pathIndep and pathDep
16: classifyPathTerms(Macc, sp, epm);
17: processShortestPath(sp, Macc, Mreconstracc, epm);
3.2 A Case Study
We now describe the workings of the algorithm on the example in Fig. 1. The example graph G0 comprises two SICs: the first SIC contains three components, the second two. All other components, which do not belong to any SIC, are regarded as independent. There is a 2-out-of-3 edge modeled by multiple edges assigned to components 1, 2 and 3. The two terminal nodes s and t are marked in black. Because no reductions are possible, we start by looking for a shortest path. The algorithm delivers, for example, path 1 ∧ 2 (Fig. 2a). In the next step we obtain two subgraphs: G1 by deleting the edges labeled with component 1, and G2 by contracting the edges assigned to component 1 and deleting the edges assigned to component 2. Again, we try to reduce each of those subgraphs. We notice that a series reduction can be made between the edges labeled with 2 and 3, because 2 and 3 are in the same SIC. One of the edges is relabeled as R1 and the other deleted. We store the reduction made in our edge-probability map for graph G1, which is a complete graph on four nodes, in the literature also known as a K4 graph. Now we continue to search for the shortest path, which is obviously R1. After deleting R1 (Fig. 2b), we obtain G3. Normally the algorithm would proceed with subgraph G2, which has the same structure as G1; hence we omit the sketch of processing G2, which can nevertheless be reconstructed with the help of Fig. 3. Continuing with G3, we look for a shortest path, since no parallel or series reductions are possible. Partitioning the graph on the basis of the shortest path 4 ∧ 8, we arrive at G5 and G6. Although G5 is a series structure, we are not allowed to reduce, because components 7 and 6 are not from the same SIC. Proceeding on the basis of the shortest path 7 ∧ 6, we obtain two cuts, since the terminal nodes are
Procedure 3. processShortestPath
Input: List sp, Map Macc, Map Mreconstracc, EdgeProbMap epm
1: //Initialization of auxiliary maps and lists
2: Map Mreconstr, Macc2, Mepm, Mnewacc ⇐ InitEmptyMaps();
3: List Lcollect, Lcontract = new List<Edge>;
4: int i = 0;
5: for each edge e ∈ sp do
6: avoid a redundant deletion/contraction of multiple edges;
7: i = i + 1;
8: Lcollect.add(e);
9: if i == 1 then
10: Mreconstr.put(i, Mreconstracc);
11: Mreconstr.get(i).put(e, false);
12: Mnewacc.put(e, false);
13: else
14: Mreconstr.put(i, Mreconstracc);
15: add all edges ee ∈ Lcollect to Lcontract, ee ≠ e;
16: Mreconstr.get(i).putAll(Lcontract, true);
17: Mreconstr.get(i).put(e, false);
18: Mnewacc.putAll(Lcontract, true);
19: Mnewacc.put(e, false);
20: Lcontract.clear;
21: end if
22: //Setting parameters for lower recursive levels.
23: Macc.putAll(Mnewacc);
24: Macc2.put(i, Macc);
25: Mepm.put(i, epm);
26: end for
27: List level = Level(Mreconstr, Macc2, Mepm, i)
28: nextLevel.add(level)
Procedure 4. bfsLevel
1: if lastLevel.size() == 0 then
2: return;
3: end if
4: for each level ∈ lastLevel do
5: Map Mreconstr, Macc2, Mepm ⇐ level.getMaps;
6: for i = 1 to level.Length() do
7: RBD rbd = reconstructGraph(Mreconstr.get(i));
8: computeRel(rbd, Macc2.get(i), Mreconstr.get(i), Mepm.get(i));
9: end for
10: end for
11: lastLevel = nextLevel;
12: nextLevel = new List;
13: print "upper bound for Unreliability = 1 − computeRelbyPaths()";
14: print "lower bound for Unreliability = computeRelbyCuts()";
15: bfsLevel();
disconnected. The cuts are highlighted in the dotted boxes of Fig. 3. We climb up the recursion tree to go on with G6. There we can perform a dependent parallel reduction, since components 5 and 7 are from the same SIC. The reduction is captured in a separate map as R2 = 5 ∨ 7. Again, we end up with two cuts on the basis of the shortest path R2 ∧ 6. The recursion tree in Fig. 3 illustrates all possible paths and cuts obtained in each depth level of the recursion. Just before a path or cut is added to the path/cut list, it is analyzed by the procedure classifyPathTerms or classifyCutTerms, respectively. Therein the expressions within the Boolean path/cut term are rearranged and grouped according to their SIC affiliation. The grouped terms are stored separately in the lists pathDep/cutDep, to be handed over to the SBM later on. The probabilities of the independent expressions are simply multiplied and then added to the lists pathIndep/cutIndep. After the probabilities of the dependent terms have been returned from the SBM, the whole probability of any path/cut term can be reassembled by multiplication, since the SICs are mutually independent. For instance, the relevant probabilities of the first cut term !1 ∧ R1 ∧ !4 ∧ !7 would be classified as follows: the value of p!4 would be added to the list cutIndep at the first index, whereas !1 ∧ R1 (belonging to SIC 1) and !7 (belonging to SIC 2) would be added to the first position of the list cutDep. Analogously, the second cut would be stored at the second position of the respective lists. When all values for the dependent basic events are returned from the stochastic model, the probability of the first cut is computed as p!4 · p!1∧R1 · p!7.
Fig. 2. Algorithm: (a) first level; (b) second and third level
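The classification step just described can be sketched as follows. The component-to-SIC assignment and all probability values are made-up illustration values, and classify_cut_term is a hypothetical stand-in for the paper's classifyCutTerms:

```python
def classify_cut_term(cut, sic_of, p_indep):
    """Split a Boolean cut term into SIC groups (handed to the SBM later)
    and the product of the independent literals' probabilities."""
    dep, indep_prob = {}, 1.0
    for lit in cut:
        comp = lit.lstrip('!')               # '!4' negates component 4
        if comp in sic_of:
            dep.setdefault(sic_of[comp], []).append(lit)
        else:
            indep_prob *= p_indep[lit]
    return indep_prob, dep

sic_of = {'1': 1, 'R1': 1, '7': 2}           # dependent components -> SIC index
p_indep = {'!4': 0.1}                        # P(independent component 4 fails)

# first cut of the example: !1 ∧ R1 ∧ !4 ∧ !7
indep_prob, dep = classify_cut_term(['!1', 'R1', '!4', '!7'], sic_of, p_indep)
print(indep_prob, dep)                       # 0.1 {1: ['!1', 'R1'], 2: ['!7']}

# once the SBM returns the SIC probabilities, the cut probability is the product
p_sbm = {1: 0.02, 2: 0.05}                   # made-up values for p_!1∧R1, p_!7
print(indep_prob * p_sbm[1] * p_sbm[2])
```

Because the SICs are mutually independent, multiplying the per-SIC probabilities with the independent factor reassembles the full cut probability, exactly as in p!4 · p!1∧R1 · p!7 above.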
Fig. 3. Recursion Tree
4 A Recursive Example
In this section we provide some experimental results of the algorithm applied to a recursive structure, the K4 ladder (Fig. 4). Since the exact result is given by a polynomial for each ladder dimension (refer to [2]), we can on the one hand validate the correctness of the algorithm for the independent case and on the other hand assess the accuracy of the bounds obtained for large network graphs. We assume that each edge is assigned to a different component. All components have the same failure probability of 0.1. The experiments were conducted on a 2.1 GHz machine with 2 GB RAM. Table 1 shows that the algorithm terminates up to a ladder size of 9. As we can see, the computation time increases exponentially with the number of components. All result values exactly match those obtained from the polynomial in [2]. For ladder size 10, corresponding to 51 components, we obtain lower and upper bounds for the unreliability (3.1557 · 10−3 and 3.1560 · 10−3) after 18913 milliseconds (ms) of runtime. The tightness of those bounds is confirmed by the known exact result of 3.1558 · 10−3. For the dependent case we have set up two SIC configurations: one with two SICs, each containing the three components emerging from the terminal nodes, and one where the number of SICs (each having two components) equals the ladder size (Fig. 4). We impose that each component in a SIC is correlated to all others of this SIC with correlation factor ρ > 0 (see [16]). ρ has an
Fig. 4. K4-ladders with 2 SICs (upper) and N SICs (lower)
increasing effect on the probabilities of failure combinations between SIC components: those probabilities become larger with rising ρ. In our example ρ takes three values: 0.01, 0.05 and 0.1. For the case of two SICs we manage to determine the unreliability up to ladder size 7, whereas for the case of N SICs the algorithm copes up to ladder size N = 6. This is explained by the fact that the effort grows with the number of SICs, which corresponds to the number of SBMs to be evaluated. The runtimes for the dependent cases are about the same for size N = 2; from size three on (16 components), more computation time is needed for the N SICs case. The unreliability values for the two cases and their respective ladder sizes are depicted in Fig. 5 and Fig. 6. It can be seen that the unreliability increases with ρ. Furthermore, the position of the SICs plays a significant role: the reliabilities for the case of two SICs are lower than for the N SICs case. This is due to the fact that a dependent failure of the three components at the terminal nodes directly leads to a system failure, i.e. a disconnection of the terminal nodes, whereas dependent failures of two components in the N SICs case can be tolerated by the redundancy structure. For each of the two cases the bounds are computed as accurately as possible for the next higher ladder size (Table 2); the tightness of the bounds is indicated by δ. To show that the algorithm still delivers acceptable bounds for large graph sizes, we have computed bounds for the independent case with 71 components and for the N SICs case with 66 components. As expected, the tightness of the bounds deteriorates as the ladder size rises. Part of the results is listed in Table 3, which corresponds to Fig. 7 showing the fast convergence of the bounds towards the exact value. We have also estimated the effect of series and parallel reductions by omitting them in the algorithm. The runtimes show that the overhead for the reductions pays off for all ladder sizes, in the independent as well as in the dependent case. For instance, it would take 30 ms instead of 19 ms for ladder size 2, and even 2615 ms instead of 415 ms for ladder size 5, to compute the unreliabilities (independent case). For ladder sizes larger than 5 the computation would not be possible at all due to the enormous number of subproblems. Similar observations can be made for the dependent cases. For lack of space we omit the runtime tables without reductions.

Table 1. K4-ladder, independent case

Size         2          3          4          5          6          7          8          9
Unrel·10−3   2.2013687  2.3206358  2.4399907  2.5593314  2.6786578  2.7979700  2.9172679  3.0365515
Time (ms)    19         71         216        415        647        1644       7946       47109
#Comp.       11         16         21         26         31         36         41         46
Table 2. Bounds for N & 2 SICs, (lb/ub/δ)·10−3, time in ms

            2 SICs (size 8)                       N SICs (size 7)
Case        lb         ub         δ        time   lb         ub         δ        time
Corr 0.01   3.0751421  3.0771007  0.00195  14911  2.8156037  2.8209134  0.00530  11433
Corr 0.05   3.8989072  3.9018633  0.00295  14643  2.8853496  2.8865489  0.00119  10345
Corr 0.10   4.9122819  4.9157460  0.00346  15066  2.9775923  2.9808147  0.00322  12009
Table 3. Bounds for high ladder sizes, independent & N SICs case, (lb/ub)·10−3

indep. (size 14)                  N SICs (size 13, Corr 0.1)
time (ms)   lb       ub           time (ms)   lb       ub
891         1.3734   71.3803      3395        0        188.2954
1515        2.5433   25.0081      4822        1.2460   73.1824
2905        3.1343   9.4892       7030        2.5394   26.7525
7703        3.3891   4.9957       11038       3.2936   10.6765
23857       3.4781   3.8455       28099       3.6580   5.7223
Fig. 5. K4-ladder with 2 SICs at terminal nodes
Fig. 6. K4-ladder with N different SICs
Fig. 7. Convergence of bounds for independent case (size 14) & N SICs case (size 13)
5 Conclusion
With this work we have shown that the dependent GD algorithm with reductions is an effective method to assess the reliability of systems incorporating dependencies. The algorithm can cope with large network sizes, depending on their redundancy structure: the memory-saving technique allows us to proceed further in the recursion tree and hence obtain more accurate bounds. We managed to determine good bounds for the K4 ladder example with 71 components in the independent case and 66 components in the dependent case (Table 3). The correctness of the algorithm has been reaffirmed for the independent case by comparison with exact results available in advance [2]. For the SIC settings considered, the results in Fig. 5 and Fig. 6 indicate that the algorithm delivers reasonable unreliability values. Furthermore, the measurements show that the number of subproblems can be reduced dramatically with the help of reductions. In future work, we want to extend our algorithm to deal with the K-terminal problem [5][10]. Node failures shall also be considered, because in reality nodes representing servers or routers might fail as well [18][3]. As this work is part of the main framework LARES [15], the conceived algorithms will prospectively be integrated into the LARES tool-chain, allowing the assessment of the availability of systems modelled with LARES.

Acknowledgements. We would like to thank M. Siegle, A. Gouberman, M. Riedl and J. Schuster from Universität der Bundeswehr for their insightful discussions. Special thanks go to C. Tanguy from Orange FTGroup for his cordial support. We also thank the four anonymous reviewers for their helpful comments. This work is partly funded by the Deutsche Forschungsgemeinschaft within the project BO 818/8-1: "Modellbasierte Analyse der Verlässlichkeit komplexer fehlertoleranter Systeme".
References

1. Satyanarayana, A., Chang, M.K.: Network reliability and the factoring theorem. Networks 13(1), 107–120 (1983)
2. Tanguy, C.: Asymptotic mean time to failure and higher moments for large, recursive networks. CoRR (2008)
3. Torrieri, D.: Calculation of node-pair reliability in large networks with unreliable nodes. IEEE Trans. Reliability 43(3), 375–377 (1994)
4. Yeh, F.M., Lu, S.K., Kuo, S.Y.: Determining terminal-pair reliability based on edge expansion diagrams using OBDD. IEEE Trans. Reliability 48(3), 234–246 (1999)
5. Hardy, G., Lucet, C., Limnios, N.: K-terminal network reliability measures with binary decision diagrams. IEEE Trans. Reliability 56(3), 506–515 (2007)
6. Gillespie, D.T.: Exact stochastic simulation of coupled chemical reactions. Journal of Physical Chemistry 81(25), 2340–2361 (1977)
7. Hermanns, H., Herzog, U., Katoen, J.P.: Process algebra for performance evaluation. Theoretical Computer Science 274(1-2), 43–87 (2002)
8. Dugan, J.B., Venkataraman, B., Gulati, R.: DIFtree: a software package for the analysis of dynamic fault tree models. In: RAMS (1997)
9. Sullivan, K.J., Coppit, D.: Galileo: a tool built from mass-market applications. In: ICSE (2000)
10. Wood, K.: A factoring algorithm using polygon-to-chain reductions for computing k-terminal network reliability. Networks 15(2), 173–190 (1985)
11. Marsan, M.A., Balbo, G., Conte, G., Donatelli, S., Franceschinis, G.: Modelling with Generalized Stochastic Petri Nets. John Wiley & Sons (1995)
12. Kuntz, M., Siegle, M., Werner, E.: CASPA - a tool for symbolic performance and dependability evaluation. In: EPEW (FORTE Co-located Workshop), pp. 293–307 (2004)
13. Lê, M., Walter, M.: Considering dependent components in the terminal pair reliability problem. In: DYADEM-FTS 2011, pp. 415–422 (2011)
14. Pock, M., Walter, M.: Efficient extraction of the structure formula from reliability block diagrams with dependent basic events. Journal of Risk and Reliability 222(3), 393–402 (2008)
15. Walter, M., Gouberman, A., Riedl, M., Schuster, J., Siegle, M.: LARES - a novel approach for describing system reconfigurability in dependability models of fault-tolerant systems. In: ESREL (2009)
16. Walter, M., Esch, S., Limbourg, P.: A copula-based approach for dependability analyses of fault-tolerant systems with interdependent basic events. In: ESREL, pp. 1705–1714 (2008)
17. Deo, N., Medidi, M.: Parallel algorithms for terminal pair reliability. IEEE Trans. Reliability 41(2), 201–209 (1992)
18. Theologou, O.R., Carlier, J.G.: Factoring and reductions for networks with imperfect vertices. IEEE Trans. Reliability 40(2), 210–217 (1991)
19. Nelsen, R.B.: An Introduction to Copulas. Springer, Heidelberg (1999)
20. Sahner, R., Trivedi, K., Puliafito, A.: Performance and Reliability Analysis of Computer Systems. Kluwer Academic Publishers (1996)
21. Hsu, S.J., Yuang, M.C.: Efficient computation of terminal-pair reliability using triangle reduction in network management. In: IEEE ICC, vol. 1, pp. 281–285 (1998)
22. Dotson, W.P., Gobien, J.: A new analysis technique for probabilistic graphs. IEEE Trans. Circuits and Systems 26(10), 855–865 (1979)
23. Chen, Y.G., Yuang, M.C.: A cut-based method for terminal-pair reliability. IEEE Trans. Reliability 45(3), 413–416 (1996)
Software Reliability Testing Covering Subsystem Interactions

Matthias Meitner and Francesca Saglietti
Chair of Software Engineering, University of Erlangen-Nuremberg, 91058 Erlangen, Germany
{matthias.meitner,saglietti}@informatik.uni-erlangen.de
Abstract. This article proposes a novel approach to quantitative software reliability assessment ensuring high interplay coverage for software components and decentralized (sub-)systems. The generation of adequate test cases is based on the measurement of their operational representativeness, stochastic independence and interaction coverage. The underlying multi-objective optimization problem is solved by genetic algorithms. The resulting automatic test case generation supports the derivation of conservative reliability measures as well as high interaction coverage. The practicability of the approach developed is finally demonstrated in the light of an interaction-intensive example. Keywords: Software reliability, interaction coverage, component-based system, system of systems, emergent behavior, statistical sampling theory, testing profile, multi-objective optimization, genetic algorithm.
1 Introduction
The systematic re-use of tested and proven-in-use software components evidently contributes to a significant reduction in software development effort. By the principles of abstraction and partition, the component-based paradigm supports the transparency of complex logic both from a constructive and from an analytical point of view. Nonetheless, a number of spectacular incidents [5] proved that various risks may still be hidden behind inappropriate component interaction, even in case of inherently correct components. For such reasons, novel approaches were recently developed, aimed at systematic, measurable and reproducible integration testing for component-based software (e.g. [1, 4, 13, 14]). Meanwhile, this issue is becoming particularly crucial in case of decentralized, autonomous systems interacting for coordination purposes, so-called systems-of-systems: in fact, while classical software engineering has so far mainly been concerned with the implementation of well-structured, monolithic or component-based code designed in the context of a common project, modern applications increasingly involve the independent development of autonomous software systems, merely communicating with each other for the purpose of a super-ordinate cooperation.

J.B. Schmitt (Ed.): MMB & DFT 2012, LNCS 7201, pp. 46–60, 2012. © Springer-Verlag Berlin Heidelberg 2012

Due to the inherent autonomy of such systems, often enough the multiplicity of their potential interactions cannot be systematically tested during integration. As the functional scope of single subsystems may evolve over time at a high degree of autonomy, the multiplicity of their interplay may increase at a rapid pace, possibly resulting in unforeseeable interplay effects, generally known as emergent behavior [8]. Therefore, especially when dealing with safety-critical applications, a preliminary software reliability assessment must take potential emergent behavior into account by accurately identifying the variety of potential scenarios involving the interaction of autonomous parts and by assessing their adequacy via operationally representative test cases. In other words, a rigorous reliability evaluation must be based on behavioral observations

• reflecting operative conditions,
• at the same time capturing high amounts of potential interactions between components or subsystems.

Concerning the first requirement, a well-known and technically sound approach to quantitative software reliability estimation is provided by statistical sampling theory (introduced in Section 2) on the basis of operationally representative observations. Admittedly, in general this technique may not be easily practicable; nonetheless, it could be successfully applied to a real-world software-based gearbox controller for trucks within an industrial cooperation [15, 16]. Though successful in evaluating operational experience, statistical sampling does not address interplay coverage, which may be measured according to different criteria (see Section 3 and [1, 4, 9, 13, 14, 17]).
An integration test exclusively targeted at the detection of emergent behavior, on the other hand, is not necessarily representative of the application-specific operational profile and thus does not support the sound derivation of probabilistic software reliability estimates. The novelty of the approach presented in this article consists in combining the above-mentioned, diverse perspectives into a common procedure capable of generating test cases that support both sound reliability estimation and high interaction coverage for highly reliable and interaction-intensive software. The article is organized as follows:

• Section 2 provides a brief introduction to software reliability evaluation by statistical sampling theory;
• Section 3 presents a number of metrics addressing coverage of component or (sub-)system interaction;
• Section 4 illustrates the potential shortcomings of a statistical sample exclusively based on the operational profile;
• Section 5 proposes a novel approach targeting the combined optimization of three different objectives (namely operational representativeness, stochastic independence and interaction coverage);
• Section 6 reports on the application of the approach to a highly interactive component-based software system;
• finally, Section 7 summarizes the investigations reported and the conclusions drawn.
2 Reliability Assessment by Statistical Sampling Theory
Statistical sampling theory is a well-established approach for deriving a reliability estimate for a software system [3, 7, 10, 11]. It allows one to derive

• at any given confidence level β and
• for a sufficiently high number n of correctly performing test cases (n > 100)
• an upper bound p̃ of the unknown failure probability p,

i.e. with

P(p ≤ p̃) = β                                                    (1)
The theory requires the fulfillment of a number of conditions; some of them concern the testing process and must be ensured by an appropriate test bed and quality assurance measures:

• Test run independence: the execution of a test case must not influence the execution of other test cases. If required, this may be enforced by resetting mechanisms.
• Failure identification: in order to exclude optimistic reliability estimates, failure occurrence must be properly identified by dependable test oracles, typically plausibility checks based on domain-specific expert judgment.
• Correct test behavior: no failure occurrence is observed during testing. In principle, the theory allows for a low number of failure observations, at the cost of correspondingly lower derivable reliability estimates.
Other conditions concern the selection of test data, which is central to this article:

• Test data independence: as the application of statistical sampling theory is based on a number of independent experiments, the selection of one test case must not influence the selection of the other test cases.
• Operationally representative test profile: as reliability measures must refer to a given operational profile, test cases must be selected with the same probability of occurrence.
If all the conditions mentioned above are met, the following quantitative relation between the number n of test cases, the failure probability upper bound p̃ and the confidence level β can be derived [3, 18]:

p̃ = 1 − (1 − β)^(1/n)                                           (2)

Table 1 shows some examples for this relation.
Table 1. Examples for the relation between n, p̃ and β

n         p̃       β
4 603     10^-3    0.99
46 050    10^-4    0.99
69 074    10^-4    0.999
690 773   10^-5    0.999
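The entries of Table 1 follow directly from Eq. (2). A quick sketch (function names are our own; rounding at the boundary explains why published tables may differ by one test case):

```python
import math

def p_upper(n, beta):
    """Failure probability upper bound p~ after n correct runs (Eq. 2)."""
    return 1 - (1 - beta) ** (1 / n)

def n_required(p, beta):
    """Correct test runs needed to claim p~ <= p at confidence level beta."""
    return math.ceil(math.log(1 - beta) / math.log(1 - p))

print(p_upper(4603, 0.99))        # ~1e-3
print(n_required(1e-4, 0.99))     # ~46 050
```

Note the practical consequence: each additional order of magnitude of reliability to be demonstrated requires roughly ten times as many failure-free test runs.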
Because the costs of applying this approach during a preliminary testing phase may be considerable, posterior evidence collected during operation may also be exploited to lower the costs significantly. This was successfully carried out for a software-based gearbox controller within an industrial research cooperation [15, 16]. In order to derive conservative reliability estimates at pre-defined confidence levels, operational data collected during road testing was analyzed. In this particular case, test validation simply consisted of checking that the commanded gear shifts were actually carried out within a pre-defined time frame.
3 Measures of Interaction Coverage
Several measures of interaction coverage were introduced in the past, among others in [1, 4, 13, 14, 17]; some of them are based on models arising during the early or late design phases (like state diagrams and sequence diagrams), while others directly relate to component or system invocations captured at code level. As the perspective taken in this article focuses on the assessment of highly reliable software by statistical testing, the amount of interactions covered by test cases is also measured in terms of executed code instructions. Inspired by classical data flow coverage [12], interaction testing criteria transfer structural concepts from code to interfaces. Among them, coupling-based testing [4] addresses the following coupling categories:

• parameter coupling, where one method calls another method and passes parameters to it;
• shared data coupling, where two methods use the same global variable;
• external device coupling, where two methods use the same external device (e.g. a database or a file).
Coupling-based testing examines the interactions between components or systems, where one method (the caller) calls another method (the callee). The node in the control flow graph containing the invocation is called a call site. A node containing the definition of a variable that can reach a use in another component on some execution path is called a coupling-def. [4] distinguishes three types of coupling-defs:
• last-def-before-call: last definition of a formal parameter before a call;
• last-def-before-return: last definition of a formal parameter before a return statement;
• shared-data-def: definition of a global variable.
A coupling-use is a node containing the use of a variable that has been defined in another component and that can be reached on some execution path. There are three different kinds of coupling-uses [4]:

• first-use-after-call: first use of a formal parameter in the caller after the return statement;
• first-use-in-callee: first use of a formal parameter in the callee;
• shared-data-use: use of a global variable.
A path is called a def-clear path with respect to a certain variable if there is no definition of that variable along that path. A coupling path between two components is a def-clear path from a coupling-def of a variable to a coupling-use of the same variable in another component. Figure 1 illustrates the concepts for coupling-based testing by means of an example.
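These coupling notions can be made concrete in code. The following sketch (all function and variable names are hypothetical, chosen only for illustration) marks each coupling point in a comment:

```python
# Illustrative sketch of coupling-based testing notions (hypothetical names).

GLOBAL_LIMIT = 100  # shared-data-def: definition of a global variable


def callee(amount):
    scaled = amount * 2       # first-use-in-callee: first use of the formal parameter
    result = scaled + 1       # last-def-before-return: last definition before return
    return result


def caller(x):
    value = x + 5             # last-def-before-call: last definition before the call
    returned = callee(value)  # call site: node containing the invocation
    total = returned - 1      # first-use-after-call: first use after the call returns
    used_limit = GLOBAL_LIMIT # shared-data-use: use of the global variable
    return min(total, used_limit)
```

The path from the definition of `value` in `caller` to the first use of `amount` in `callee` contains no redefinition of that variable, i.e. it is a def-clear path and hence a coupling path.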
Fig. 1. Coupling-based testing
In [4] the following coupling-based coverage criteria were defined:
• call coupling: all call sites must be covered;
• all-coupling-defs: for each variable, at least one coupling path from each definition to at least one of its coupling-uses must be covered;
• all-coupling-uses: for each variable, at least one coupling path from each definition to each reachable coupling-use must be covered;
• all-coupling-paths: for each variable, all coupling paths from each definition to all reachable coupling-uses must be covered.
As this last definition would require an unbounded number of test cases in the presence of loops, the criterion was weakened such that each loop body has to be both skipped and executed at least once.
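A coverage measure for the all-coupling-uses criterion can be computed as the fraction of feasible (coupling-def, coupling-use) pairs exercised by a test run; a minimal sketch (pair identifiers are illustrative):

```python
def all_coupling_uses_coverage(covered_pairs, feasible_pairs):
    """Ratio of covered (coupling-def, coupling-use) pairs
    to all feasible pairs of the system under test."""
    feasible = set(feasible_pairs)
    if not feasible:
        return 1.0
    return len(set(covered_pairs) & feasible) / len(feasible)

# Example in the spirit of the application of Section 4.2:
# 132 of 155 feasible def-use pairs covered by a test sample.
feasible = {("def%d" % i, "use%d" % i) for i in range(155)}
covered = {("def%d" % i, "use%d" % i) for i in range(132)}
ratio = all_coupling_uses_coverage(covered, feasible)
```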
Software Reliability Testing Covering Subsystem Interactions
51
In the particular case of object-oriented programming, which supports reusability and maintainability by inheritance and polymorphism, interaction coverage has to be strengthened to take account of potential side effects. For example, faults may arise from incorrect dynamic binding when a property is fulfilled in some inheritance contexts but violated in others. In order to increase the chances of detecting such anomalies, the above-mentioned coupling-based coverage concepts were extended to the object-oriented paradigm [1] by additionally requiring context coverage.
4 Potential Bias of Operationally Representative Samples
4.1 Shortcoming of Reliability Testing
Though well-founded, reliability testing by statistical sampling theory has a fundamental shortcoming: it depends on one single random experiment. This experiment consists of generating a number of test cases according to a given distribution intended to reflect the expected operational profile. As this random generation is carried out only once, it cannot guarantee coverage of all required testing scenarios. In fact, even assuming accurate knowledge of the operational profile, the resulting randomly generated test case sample may still deviate from statistical expectation. In particular, a considerable number of relevant scenarios involving crucial interactions may be fully neglected, as will be illustrated by the following example. For this reason, especially in the case of safety-critical software-based applications, reliability testing should be enhanced by extending the sample demands to include – beyond operational representativeness and stochastic independence – also interaction coverage. The novel approach developed for this purpose will be introduced in section 5.
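The risk that one single random sample fully neglects a relevant scenario can be quantified: a scenario occurring with probability p per test case is missed by all n independently generated test cases with probability (1 − p)^n. The concrete numbers below (p = 10⁻⁴, n = 10 000) are illustrative, not taken from the article:

```python
def miss_probability(p, n):
    """Probability that n independently generated random test cases
    never trigger a scenario occurring with probability p per test case."""
    return (1.0 - p) ** n

# A scenario hit by 1 in 10 000 inputs is still missed by a
# 10 000-case random sample with probability close to 1/e (about 0.37):
missed = miss_probability(1e-4, 10_000)
```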
4.2 Application Example
This section introduces an example of a software-based system involving a high degree of component interaction. It consists of 4 parts:
• one of them represents a central controller (the so-called Control Component),
• while the remaining 3 components represent cooperating tasks (Service 1, Service 2 and Service 3) to be invoked and parameterized by the Control Component.
The application processes the following 4 input parameters:
• parameter 1 of type Integer;
• parameter 2 of type Integer;
• parameter 3 of type Double;
• parameter 4 of type Double.
Depending on these inputs, the Control Component invokes one or more of the Service Components, providing each of them with 8 parameters:
• 4 of them (the so-called data parameters) are common to all components,
• while the remaining 4 control parameters are component-specific control elements.
Figure 2 offers a graphical representation of the invocation hierarchy.
Fig. 2. Interacting components of software application
In terms of the all-coupling-uses criterion, the system includes 155 def-use pairs. The operational profile of the application is provided by defining for each of the 4 independent inputs its corresponding probability density function, as shown in Table 2.

4.3 Evaluation
For each n ∈ {1 000, 3 000, 10 000, 20 000, 50 000}, 10 experiments were carried out, each consisting of generating n test cases according to the given operational profile. The number of def-use pairs covered by each of the 50 resulting experiments was successively determined, as shown in Table 3.

Table 2. Parameters with corresponding probability density functions

software input   distribution           probability density function                          distribution parameters
parameter1       uniform distribution   f(x) = 1/(b − a)                                      a = −10 000, b = 20 000
parameter2       uniform distribution   f(x) = 1/(b − a)                                      a = 0, b = 1 000 000
parameter3       Weibull distribution   f(x) = (a/b)·((x − c)/b)^(a−1)·e^(−((x − c)/b)^a)     a = 2, b = c = 10 000
parameter4       normal distribution    f(x) = (1/√(2πσ²))·e^(−(x − μ)²/(2σ²))                μ = 0, σ = 1 000
For example, it can be noticed that the 10 experiments devoted to the random generation of 10 000 test cases only covered between 124 and 132 def-use pairs; in other words, they failed to generate test cases triggering between 23 and 31 component interactions. This means that potential faults affecting the uncovered interactions (about 15%–20% of all interactions) would remain undetected by any of these test samples. Although these effects tend to diminish with increasing test size, the concern remains valid even for the highest test size. In order to address potential emergent behavior during reliability testing, the original statistical sampling approach was therefore extended to also capture interaction coverage among components, subsystems or systems.

Table 3. Minimum and maximum numbers of def-use pairs covered for each test size n

test size n     1 000   3 000   10 000   20 000   50 000
min coverage       76      97      124      136      144
max coverage       89     109      132      141      146
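The figures in Table 3 translate directly into the missed-interaction percentages quoted above; a small check:

```python
TOTAL_PAIRS = 155  # feasible def-use pairs of the example application

def missed_fraction(covered):
    """Fraction of def-use pairs not covered by a test sample."""
    return (TOTAL_PAIRS - covered) / TOTAL_PAIRS

# For n = 10 000 the experiments covered between 124 and 132 pairs,
# i.e. they missed between 23 and 31 interactions:
best_case = missed_fraction(132)   # about 15 %
worst_case = missed_fraction(124)  # exactly 20 %
```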
5 Multi-criteria Test Case Optimization
Since the approach presented in this article focuses on the generation of optimal test case sets, only the conditions concerning the selection of test cases are considered. Further conditions mentioned in section 2 and concerning quality assurance of product and test bed (like resetting mechanisms, test oracles, restart of the whole process after fault removal) are outside the scope of this article. Statistical sampling theory requires independently selected test cases; in other words, the input values must not be correlated. On the other hand, input parameters may be functionally or semantically dependent due to the nature of the application under test, e.g.
• by physical laws, like wavelength, speed and frequency, or
• by logical patterns, like the coefficient values of an invertible matrix.
Evidently, correlations due to physical laws or logical patterns cannot be removed; these application-inherent dependencies must be captured by the operational profile. Further correlations arising from the instantiation of functionally independent parameters (e.g. numerical dependencies), however, have to be avoided or removed by filters.

5.1 Objective 1: Operational Representativeness of Test Cases
The operational profile of the system under test is assumed to be available on the basis of a preliminary estimation concerning the frequency of occurrence of the input
parameters. While functionally independent parameters can be randomly generated according to this operational profile, the functionally correlated parameters must be defined so as to fulfill their application-specific dependencies. The degree of operational representativeness of test cases can be measured by different goodness-of-fit tests, like the χ² test [6], the Kolmogorov-Smirnov test [6] or the Anderson-Darling test [6]. They quantify the confidence in the validity of a null hypothesis, in our case H0: "the observed distribution is consistent with the pre-specified distribution"
in the light of the data observed. Depending on the goodness-of-fit test, a so-called test statistic S1 is first determined; for example, for the χ² test, the statistic is defined as follows:

χ² = Σi=1..k (Oi − Ei)² / Ei    (3)
where k denotes the number of bins that contain the data, Oi the observed and Ei the expected frequency of bin i (1 ≤ i ≤ k). The validity of the null hypothesis is successively verified by a significance test determining a critical threshold T1. If the test statistic S1 is higher than the critical value T1, the null hypothesis is rejected; otherwise it is accepted, and the test data can be taken as sufficiently representative of the distribution specified. Figure 3 shows an exemplary goodness-of-fit test for a specified Gamma distribution.
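A minimal computation of the χ² statistic of equation (3); the bin frequencies are illustrative, and the critical value T1 is the standard table value for k − 1 = 3 degrees of freedom at significance level 0.1 (in practice T1 would be looked up for the chosen significance level):

```python
def chi_square_statistic(observed, expected):
    """Test statistic S1 of equation (3): sum over k bins of (Oi - Ei)^2 / Ei."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [18, 25, 30, 27]   # Oi: observed bin frequencies (illustrative)
expected = [25, 25, 25, 25]   # Ei: expected bin frequencies under H0
s1 = chi_square_statistic(observed, expected)

T1 = 6.251  # critical chi-square value, 3 degrees of freedom, significance 0.1
accept_h0 = s1 <= T1  # S1 below the threshold: sample sufficiently representative
```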
Fig. 3. Distribution fitting
5.2 Objective 2: Test Case Independence
Statistical sampling theory further requires the independent selection of test cases; in other words, the values of parameters previously identified as independent must not be correlated. For this purpose, both auto- and cross-correlation measures are considered, each yielding a statistical correlation coefficient S2. Auto-correlation describes the dependence of a specific parameter instance within a test case to other instances of the same parameter in further test cases. Hereby, the
auto-correlation metrics consider the so-called lag between test cases, i.e. the distance between two test cases with respect to the sequence order of their generation; in particular, the lag between test cases generated in direct succession is 1. Cross-correlation, on the other hand, describes the dependencies between different parameters within the same test case. There are several metrics to measure cross-correlation, mainly differing in terms of computational complexity and the dependency type addressed. Among the metrics selected for the approach illustrated in this article are Pearson's product moment correlation coefficient [2], Spearman's rank correlation coefficient and Cramer's V [19]. For example, for two different random variables X and Y, Pearson's product moment correlation coefficient rxy is defined as follows:
rxy = [(1/n) · Σi=1..n (xi − x̄)(yi − ȳ)] / [√((1/n) · Σi=1..n (xi − x̄)²) · √((1/n) · Σi=1..n (yi − ȳ)²)]    (4)
where n denotes the number of test cases and xi and yi (1 ≤ i ≤ n) denote the values of X and Y with corresponding average values x̄ and ȳ. To determine whether the parameters are correlated or not, the correlation coefficient S2 is compared with a maximum threshold value T2. Similarly to the goodness-of-fit tests, parameters with a correlation S2 higher than T2 cannot be taken as sufficiently independent.
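Equation (4) translates directly into code:

```python
def pearson_r(xs, ys):
    """Pearson's product moment correlation coefficient per equation (4)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
    var_x = sum((x - x_bar) ** 2 for x in xs) / n
    var_y = sum((y - y_bar) ** 2 for y in ys) / n
    return cov / (var_x ** 0.5 * var_y ** 0.5)
```

Perfectly linearly dependent parameter values yield rxy = ±1; a value of |rxy| above the threshold T2 marks the test case set as insufficiently independent.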
5.3 Objective 3: Interaction Coverage
The coupling-based testing criteria presented in section 3 can be arranged in the subsumption hierarchy shown in Figure 4 [4].
Fig. 4. Subsumption hierarchy of coupling-based testing
While call coupling only considers the invocation of methods and is therefore too weak to measure interaction coverage, the other criteria are appropriate for use in the optimization procedure. The example presented in section 6 is based on the all-coupling-uses criterion, whose coverage measure will be denoted by S3.
5.4 Combination of Objectives 1, 2 and 3
The new approach presented in this article combines the three above-mentioned criteria. Since the main objective is the generation of a dependable software reliability estimate, objectives 1 and 2 must be fulfilled in order to enable the application of statistical sampling theory. In other words, both of these criteria are knock-out criteria dominating objective 3, which should be maximized without violating them. The high complexity of this problem makes the application of systematic, analytical techniques inadequate. Therefore, a heuristic approach to this multi-objective optimization problem is applied, making use of genetic algorithms. In general, these proceed iteratively by evaluating the fitness of single individuals and by generating new populations based on the best individuals found so far and on genetic manipulations of past populations. In this specific case, single individuals are sets of test cases, where each single test case consists of values to be assigned to the input variables. Cross-over operations may be used at two different levels: at the higher level, test case sets exchange test cases, while at the lower level only values of single input variables are swapped. Test case sets are mutated by deleting individual test cases and generating an identical number of new ones, or by random mutations of individual input parameters. In addition to the genetic manipulations described, the elitism operator is applied to keep the best test case sets generated so far unaltered. In order to determine the fitness of a candidate test case set with respect to its fulfillment of objectives 1 and 2, the values Si (i ∈ {1, 2}) determined by goodness-of-fit tests (as introduced in section 5.1) and by auto- and
cross-correlation metrics (as introduced in section 5.2) are first normalized so as to lie within the interval [0; 1] by the following normalization procedure N:
• for Si ∈ [0; Ti] let N(Si) = 1;
• for Si ∈ ]Ti; maxi] let N(Si) be defined as shown in Figure 5, where maxi denotes the highest value taken by Si within a test case population.
Fig. 5. Normalization function for objectives 1 and 2
Interaction coverage measures S3 do not require any normalization, as they already lie within the interval [0; 1] by definition. For each test case set, the fitness function is defined as the following weighted sum of its three normalized measures:

fitness value (test case set) = 1.0 · N(S1) + 1.0 · N(S2) + 0.1 · S3
The coefficient 0.1 is chosen so as to maximize the weight of S3 while preventing full interaction coverage from outweighing a violation of either of the two knock-out criteria, because:
• a test case set that violates a knock-out criterion has a fitness value < 2,
• a test case set that fulfills both knock-out criteria has a fitness value ≥ 2.
The genetic algorithm successively selects individuals with higher fitness values at higher probability, such that the result of the optimization is a test case set that does not violate any knock-out criterion.
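A sketch of the fitness computation. The exact shape of N over ]Ti; maxi] is given by Figure 5; the linear decay below, capped at 0.85, is only one plausible shape chosen so that the knock-out property stated above holds even at full interaction coverage:

```python
def normalize(s, t, s_max):
    """Normalization N for objectives 1 and 2 (assumed shape: 1 on [0; t],
    then a linear decay capped below 0.9 so that any violation keeps the
    total fitness under 2 even when S3 = 1)."""
    if s <= t:
        return 1.0
    return 0.85 * max(0.0, 1.0 - (s - t) / (s_max - t))


def fitness(s1, t1, max1, s2, t2, max2, s3):
    """fitness = 1.0 * N(S1) + 1.0 * N(S2) + 0.1 * S3, with S3 in [0; 1]."""
    return normalize(s1, t1, max1) + normalize(s2, t2, max2) + 0.1 * s3
```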
6 Example
The new approach was applied to the software system presented in section 4.2. The interaction coverage is measured with respect to the all-coupling-uses criterion introduced in section 3. Goodness-of-fit tests are carried out at a significance level of 0.1. The genetic algorithm proceeds as follows:
• initially, it generates 10 random test case sets according to the operational profile,
• successively, it evaluates their fitness and
• starts the optimization process.
This process involves 10 optimization runs where the genetic operators (selection, cross-over and mutation) are applied to test case sets consisting of a fixed number of test cases. This predefined number is chosen in view of the reliability estimate to be derived after optimization; therefore, it does not change during the optimization procedure. Figure 6 shows the evolution of the coverage achieved for test set sizes of 10 000 and 50 000 test cases. As already mentioned in section 4.2, the application contains 155 feasible def-use pairs in terms of the all-coupling-uses criterion. The best of the initial 10 test case sets consisting of 10 000 test cases managed to cover 132 def-use pairs. After genetic optimization, the resulting test set improved to 152 covered def-use pairs, or nearly 98% of all feasible pairs. The multi-objective approach applied to sets containing 50 000 test cases managed to reach 100% coverage after optimization. After validation of the test results, such a test allows a conservative software reliability estimate p̃ < 9.21 · 10⁻⁵ to be derived at confidence level β = 0.99. Figure 7 shows the output of such an optimization run.
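The quoted estimate is consistent with classical statistical sampling theory for failure-free testing: with n independent, operationally representative test cases all passing, the failure probability is bounded by p̃ = −ln(1 − β)/n at confidence level β. The article states only the resulting number; the formula below is the standard one reproduced here for illustration:

```python
import math

def reliability_bound(n, beta):
    """Conservative upper bound on the failure probability after n
    failure-free, independent, operationally representative test cases,
    at confidence level beta (classical statistical sampling theory)."""
    return -math.log(1.0 - beta) / n

p_tilde = reliability_bound(50_000, 0.99)  # about 9.21e-5
```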
Fig. 6. Covered def-use pairs for test case sets with n = 10 000 and n = 50 000
Fig. 7. Optimization result
7 Conclusion
In this article, a new approach to software reliability assessment combined with high interaction coverage was presented. Optimal test case sets are generated by means of genetic algorithms. An adequate fitness function considers the objectives of operational representativeness, test case selection independence and interaction coverage. The approach was applied to a software system, showing that interaction coverage can be significantly increased while guaranteeing the conditions required for statistical testing, such that a well-founded, conservative reliability estimate can be derived.
Acknowledgment. The authors gratefully acknowledge that the work presented was partly funded by Siemens Corporate Technology.
References
1. Alexander, R.T., Offutt, A.J.: Coupling-based Testing of O-O Programs. Journal of Universal Computer Science 10(4) (2004)
2. Hartung, J.: Statistik. Oldenbourg (1995)
3. Ehrenberger, W.: Software-Verifikation. Hanser Verlag (2002)
4. Jin, Z., Offutt, A.J.: Coupling-based Criteria for Integration Testing. Software Testing, Verification & Reliability 8(3), 133–154 (1998)
5. Jung, M., Saglietti, F.: Supporting Component and Architectural Re-usage by Detection and Tolerance of Integration Faults. In: 9th IEEE International Symposium on High Assurance Systems Engineering (HASE 2005). IEEE Computer Society (2005)
6. Law, A.M., Kelton, W.D.: Simulation, Modeling and Analysis. McGraw-Hill (2000)
7. Littlewood, B., Wright, D.: Stopping Rules for Operational Testing of Safety Critical Software. In: 25th International Symposium on Fault-Tolerant Computing (FTCS-25) (1995)
8. Maier, M.W.: Architecting Principles for Systems-of-Systems. Systems Engineering 1(4), 267–284 (1998)
9. Oster, N., Saglietti, F.: Automatic Test Data Generation by Multi-objective Optimisation. In: Górski, J. (ed.) SAFECOMP 2006. LNCS, vol. 4166, pp. 426–438. Springer, Heidelberg (2006)
10. Parnas, D., van Schouwen, J., Kwan, S.: Evaluation of Safety-critical Software. Communications of the ACM 33(6) (1990)
11. Quirk, W.J. (ed.): Verification and Validation of Real-time Software. Springer, Heidelberg (1985)
12. Rapps, S., Weyuker, E.J.: Data Flow Analysis Techniques for Test Data Selection. In: 6th International Conference on Software Engineering (ICSE 1982) (1982)
13. Rehman, M., Jabeen, F., Bertolino, A., Polini, A.: Software Component Integration Testing: A Survey. Journal of Software Testing, Verification, and Reliability (STVR) (2006)
14. Saglietti, F., Oster, N., Pinte, F.: Interface Coverage Criteria Supporting Model-Based Integration Testing. In: Workshop Proceedings of the 20th International Conference on Architecture of Computing Systems (ARCS 2007). VDE (2007)
15. Söhnlein, S., Saglietti, F., Bitzer, F., Meitner, M., Baryschew, S.: Software Reliability Assessment based on the Evaluation of Operational Experience. In: Müller-Clostermann, B., Echtle, K., Rathgeb, E.P. (eds.) MMB & DFT 2010. LNCS, vol. 5987, pp. 24–38. Springer, Heidelberg (2010)
16. Söhnlein, S., Saglietti, F., Meitner, M., Bitzer, F.: Bewertung der Zuverlässigkeit von Software. Automatisierungstechnische Praxis 52(6), 32–39. Oldenbourg Industrieverlag (2010)
17. Spillner, A.: Test Criteria and Coverage Measures for Software Integration Testing. Software Quality Journal 4, 275–286 (1995)
18. Störmer, H.: Mathematische Theorie der Zuverlässigkeit. R. Oldenbourg (1970)
19. Storm, R.: Wahrscheinlichkeitsrechnung, mathematische Statistik und Qualitätskontrolle. Hanser Verlag (2007)
Failure-Dependent Timing Analysis
A New Methodology for Probabilistic Worst-Case Execution Time Analysis
Kai Höfig
AG Software Engineering: Dependability, University of Kaiserslautern, Kaiserslautern, Germany
[email protected]
http://agde.cs.uni-kl.de/
Abstract. Embedded real-time systems are growing in complexity, which goes far beyond simplistic closed-loop functionality. Current approaches for worst-case execution time (WCET) analysis are used to verify the deadlines of such systems. These approaches calculate or measure the WCET as a single value that is expected to be an upper bound for a system's execution time. Overestimations are taken into account to make this upper bound a safe bound, but modern processor architectures push those overestimations into unrealistic areas. Therefore, we present in this paper how safety analysis model probabilities can be combined with elements of system development models to calculate a probabilistic WCET. This approach can be applied to systems that use mechanisms belonging to the area of fault tolerance, since such mechanisms are usually quantified using safety analyses to certify the system as being highly reliable or safe. A tool prototype implementing this approach is also presented, which provides reliable safe upper bounds by performing a static WCET analysis and which overcomes the frequently encountered problem of dependence structures by using a fault injection approach.
Keywords: fault tolerance, software safety, static analysis, tool, WCET, fault tree.
1 Introduction
J.B. Schmitt (Ed.): MMB & DFT 2012, LNCS 7201, pp. 61–75, 2012. © Springer-Verlag Berlin Heidelberg 2012

Embedded real-time systems are growing in complexity, which goes far beyond simplistic closed-loop functionality. Modern systems also execute on complex input data types or implement rich communication protocols. The underlying hardware has to provide more and more resources and is therefore extended by caches or multi-core processors, for instance. To assure the quality of such systems, e.g., their reliability or safety, analyses have to cope with this extended complexity. We present in this paper a new approach for timing analysis that
reduces overestimations, which are often based on the system's increased complexity. Since many embedded systems are real-time systems, which are frequently safety-critical, the probability of a timing failure can violate reliability requirements. Current approaches for worst-case execution time (WCET) analysis are used to verify the execution time of a system under worst-case conditions. These approaches calculate or measure the WCET as a single value that is expected to be an upper bound for a system's execution time. Overestimations are taken into account to make this upper bound a safe bound that guarantees termination within a given deadline. Modern processor architectures with caches, multi-threading, and instruction pipelines often expand those overestimations for safe upper bounds into unrealistic areas, making them useless in an industrial context [1]. The former assumption that a missed deadline is, in the worst case, equivalent to always missing the deadline is too stringent for systems that require only a probabilistic guarantee that a task's deadline miss ratio is below a given threshold [2]. Some approaches try to solve this problem by calculating multiple upper bounds and argue that each single upper bound will hold for a certain probability (probabilistic worst-case execution time). As summarized in Section 2 of this paper, many of those approaches require either probabilities as input or make assumptions that have to be verified for each system in order to apply statistical methods. In contrast to these approaches, we show in this paper how safety analysis model probabilities can be combined with elements of system development models to calculate a probabilistic worst-case execution time. Safety analysis models are used here as a source of probabilities.
Since safety analysis models typically reflect the occurrence of failures and their propagation through the system, our approach aims at mechanisms in systems that are executed in addition to a failure. Such mechanisms usually belong to the area of fault tolerance and detect or process an error. Examples of safety analysis models that reflect fault tolerance mechanisms can, e.g., be found in [3–9]. The remainder of the paper is organized as follows: Section 2 discusses related approaches. In Section 3, the methodology of failure-dependent timing analysis is formalized. This analysis is applied in an example in Section 4 using a tool prototype. The results of the tool provide multiple worst-case execution times under certain failure conditions. Section 5 concludes this paper and provides a perspective for future work.
2 Related Work
This section describes related work regarding worst-case execution time analysis with particular attention being given to probabilistic approaches. As far as we know, there exists no approach that uses safety analysis models as probabilistic input for WCET analysis. Current approaches in WCET analysis can be divided into deterministic and probabilistic analysis approaches. The difference between typical deterministic WCET approaches and probabilistic approaches is that deterministic WCET
approaches calculate a single execution time for a program, whereas probabilistic WCET approaches calculate multiple execution times for a program, each valid with a certain probability. The approach presented here can be assigned to the category of probabilistic WCET approaches. Approaches in both groups can be further classified into measurement-based approaches and static analysis approaches. In static timing analysis, the execution times of individual static blocks are computed for a given program or a part of it [10–12]. These approaches are able to provide safe upper bounds for the WCET by making pessimistic assumptions at the expense of overestimating the WCET in order to guarantee deadlines for the analyzed program [13]. Advanced (deterministic) approaches, such as those presented in [14, 15], encompass precise hardware models to reduce overestimation of the WCET as much as possible. On the other hand, measurement-based approaches do not need to perform any complex analysis [16–18]. They measure the execution time of a program on real hardware or processor simulators. These approaches are generally unable to provide a safe upper bound for the WCET, since neither an initial state nor a given input sequence can be proven to be the one that produces the WCET [13]. Static analysis approaches thus have important benefits over measurement-based approaches when safe upper bound guarantees are required. For this reason, the approach presented here also uses static analysis. Since deterministic WCET approaches do not distinguish between different execution times for certain probabilities, even a tight upper bound with minor overestimation can be so improbable that it has no real significance. Some probabilistic WCET approaches have emerged in recent years that combine execution times with probabilities in order to focus on significant execution times by incorporating probabilities for different execution times of a program. 
In [19], the authors use measured execution time samples of a task and apply the central limit theorem to approximate the probability that a task finishes its execution within a given threshold. The authors assume that the inputs for each sample are equally distributed in the real-world application, but this has to be proven for each specific application. For example, when the probability of one input leading to high execution time is more likely than other inputs, the approach cannot be applied. This approach is extended to Extreme Value Theory in [20]. Measurements from random sample data are used to approximate a Gumbel distribution of the execution times. This methodology requires the stochastic independence and identical distribution (i.i.d.) of the input data. This is generally a problematic assumption as mentioned before, especially since stochastic independence is not given for programs that change the world. This problem is considered in [21], where the authors propose resetting the system after every execution or proving that the i.i.d. assumption can be applied for a specific program. To derive safe probabilities (not safe execution time bounds) without proof or reset, they shift the calculated distribution into safe bounds. Furthermore, since failures that influence systems are typically not distributed equally and since failures can also be self-energizing, this approach cannot be
applied to solve the problem of analyzing different execution times under certain failure conditions. A different approach presented in [22] measures the execution times of atomic units of execution, so-called basic blocks, to obtain a probabilistic distribution. These distributions for basic blocks are then combined by applying different rules for each control structure in the syntax tree in a bottom-up process. This results in a distribution for different execution times of an entire program. In an ongoing work, the authors present such a set of rules for the sequential, conditional, and iterative execution of basic blocks [23]. Starting with a simple timing schema, where A and B are basic blocks of a program, e.g., leaves of a syntax tree, they formulate the problem that a probability distribution Z of the worst-case execution times of the sequential execution W CET (A, B) is hard to determine because of the dependencies of the probability distributions of A and B. In [1], these dependency structures are estimated by upper and lower bounds using copulas. However, this approach does not provide input for the calculation of distribution functions of basic blocks, and deriving the dependence structure for every control structure of a program is quite complex. In contrast to the measurement-based approaches discussed above, the approach presented in this paper uses safety analysis models as inputs and provides a way to combine them with model elements of embedded systems to perform probabilistic worst-case execution time analysis. Furthermore, the dependency structures of the failure probabilities are handled here by using widely accepted and proven-in-use fault trees with Boolean logic (see Section 4). Complex manual analysis of dependency structures in the code is also not necessary, since automated fault injection can overcome this problem, which is also described in Section 4. 
Similar to the aforementioned measurement-based approaches, the authors present in [24] an annotation scheme that allows enriching a program’s source code with conditional statements. For the top-down decomposition of a source code into a syntax tree, the composition of statements is used bottom-up to calculate different execution times along with their probability of occurrence. The timing schema presented in [24] is quite simplistic and thus not suitable for modern processor architectures. Furthermore, in order to analyze an entire program, the probability for each conditional branch has to be known as input for this methodology. Besides these approaches, which can be clearly assigned to probabilistic WCET analysis, there exist also approaches that deal with probabilistic response times of tasks. In [25], the authors use a so-called Null-Code Service, which immediately terminates when executed, to measure task execution times from response times. Distributions are measured for both the Null-Code Service and the task to be analyzed. The different response times are then subtracted to obtain execution times. Another approach for response times is described in [26]. The authors simulate the execution of a task in a typical embedded environment with state variables and scheduling. Extreme Value Theory is applied to the measurements of task response times to estimate the upper bound of the task’s response time for a given probabilistic threshold. Since in this approach,
Failure-Dependent Timing Analysis
65
only response times that are close to a worst-case response time are relevant, the authors use a method called block maxima to eliminate less important measurements and to optimize the number of measuring points. Applicability to entire systems is, as the authors conclude, questionable and part of their future work. Also, some approaches dealing with probabilistic execution times or response times can be found that clearly aim at scheduling, e.g., [2], where the backlog of different scheduling strategies is modeled using Markov chains. The previously discussed approach presented in [20] also aims at scheduling by using a probabilistic WCET. It can be concluded that current probabilistic approaches in WCET analysis mainly derive probabilistically distributed execution times from measurements. As already stated in the introduction to this section, measured execution times can be problematic when upper bound guarantees are required. In general, it cannot be assured that a measured execution time is the WCET or close to it. In contrast to that, the approach presented in this paper is based on static analysis and can thereby provide safe upper bound guarantees. Probabilities and execution times are not obtained via measurements that require certain statistical assumptions to be true; rather, probabilities are extracted from safety analysis models and execution times are proven safe upper bounds taken from static analysis. Dependency structures are handled here using proven-in-use static analysis. Dependencies in failure probabilities can be modeled using Boolean fault tree logic, which is also well established in industry. Since we use failure probabilities as probabilistic input, the approach presented here is limited to systems that act differently in the case of failures. For other systems, the approach presented in this paper is at least as good as a static non-probabilistic analysis. In the next section, such a system is initially described as an example.
After that, the Failure-Dependent Timing Analysis approach is presented.
3 Failure-Dependent Timing Analysis
In this section, an example system is first described in paragraph 3.1. This system is used to illustrate the Failure-Dependent Timing Analysis (FDTA) approach and is later analyzed in Section 4 using a tool chain that implements this approach. Paragraph 3.2 describes the combination of development model elements and safety analysis model elements for automated Failure-Dependent Timing Analysis.

3.1 Example System
The example system in this paper is the fault-tolerant subsystem of a Fault-Tolerant Fuel Control System [27]. The Simulink model of this system is depicted in figure 1. The subsystem is fed with four sensor values: throttle delivers the throttle bias, speed delivers the engine speed, map is short for manifold pressure and delivers the intake air pressure, and ego is short for exhaust gas oxygen and
66 K. Höfig

Fig. 1. Simulink model of the subsystem SensorFaultCorrection
delivers the measured amount of oxygen in the exhaust gas. These sensor values are used to calculate the gasoline intake for the lowest engine emissions. The sensors throttle, speed, and map can be estimated if they are not measurable due to failures. To estimate one of them by doing a table lookup, the other two corresponding sensor values are required. The subsystems Throttle Estimate, Speed Estimate, and EstimateMAP are technically equivalent subsystems that estimate the values for detected sensor failures. The main system detects sensor failures by performing a range check, where all incoming sensor values are checked against a given range. Sensor values that are outside of this range are assumed to be incorrect and the value of this sensor is estimated instead. If one or more of the sensor readings is erroneous, the system switches from low emission to rich mixture mode. In this case, the engine operates with a non-optimal mixture. Since the estimation of such a sensor value requires an additional calculation compared to simply routing the signal through the subsystem SensorFaultCorrection, the execution time of this subsystem depends on the occurrence of failures in the sensors (failure-dependent execution time). The probabilities of such sensor failures are typically part of safety analysis models. Since these models themselves are not new, and since presenting a safety analysis model showing the safety behavior of an entire system would clearly exceed the limitations of this paper, only a simplistic fault tree model for the MAP sensor is provided in figure 2. The fault tree models for the other two sensors are equivalent. The failure mode detected failure (leftmost rectangle on the top) occurs when either the sensor value is outside a given range or the range check erroneously judges the sensor data to be out of range. 
The failure mode erroneous sensor data propagated occurs if the sensor data is out of range and the range check erroneously judges the data to be valid. In this case, erroneous sensor data remains undetected and is used erroneously for further processing. Both failure modes are typical for a safety measure, such as the range check applied here, since they model the effectiveness of the measure. The failure mode no_TLookup
Fig. 2. Component Fault Tree for SensorFaultCorrection
additionally models that no table lookup is performed. This is the case if either the range check erroneously judges a result as valid or if no failure is detected. This failure must also be modeled for the methodology presented here in order to reflect the required behavior for the failure-dependent execution path. In paragraph 3.2, we describe the combination of development model elements and safety analysis model elements for automated Failure-Dependent Timing Analysis. To illustrate the methodology, the above example is used.

3.2 Analysis
The above description of the system’s functionality shows that sensor failures cause additional execution time for a table lookup compared to a failure-free execution. This relation between safety analysis model and system development model can be combined to calculate different execution times for a system along with their probabilities of occurrence. Some of the connections depicted in figure 1 are signals that can be related to failures. They are labeled as throttle-, speed-, and pressure sensor failure and carry specific data indicating failures. They are therefore called failure-dependent. Let all such connections be in a set C, with C = {c1, .., cn}. In the example, each sensor failure signal is either carrying a 1 to indicate that the corresponding sensor signal is fault-free, or carrying a 0 to indicate that a
sensor signal is erroneous. Since such signals may also carry data types other than 1 and 0, e.g., true and false or more complex data types, we call the set of possible configurations for a failure-dependent connection ci execution scenarios S(ci), with S(ci) = {s1, .., sm}. In the example, S(ci) = {0, 1} for all ci ∈ C. Each execution scenario si of a failure-dependent connection cj has an associated failure mode fmi of a fault tree as depicted in figure 2, with FM(si) = fmi. For example, FM(1) = no_TLookup and FM(0) = detected_failure for S(pressure sensor failure) = {0, 1}. Since all combinations of failures are possible in general, e.g., in the example system all three sensor values may be erroneous or none may be so, all possible combinations of execution scenarios have to be considered. Such a combination is called a mode here and all possible combinations for a system are modeled by the set M, with M = {{s1, .., sn} | s1 ∈ S(c1), .., sn ∈ S(cn), C = {c1, .., cn}, n ∈ N}. Since every execution scenario si has an associated failure mode fmi of a fault tree, the overall probability for each mode m ∈ M can be extracted from Boolean logic by combining them using the Boolean and (in the formula represented by ∧), with
FM(m) = FM(s1) ∧ .. ∧ FM(sn), where m = {s1, .., sn}, m ∈ M, n ∈ N.

Each mode m ∈ M has a specific (worst-case) execution time tm for the corresponding execution scenarios si ∈ m of its connections ci. In combination with its failure modes FM(m), a set of measure points Ω can be obtained that allows a probabilistic (worst-case) execution time analysis: Ω = {(tm, FM(m)) | m ∈ M}. In the next section, we describe how the execution times of each mode can be determined automatically using an analytical WCET approach. The entire process of system modeling, safety analysis, and Failure-Dependent Timing Analysis is demonstrated in the next section using a tool chain. The results of the FDTA are a set of (worst-case) execution time upper bound guarantees, each valid for a certain probability.
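To make the construction of M and Ω concrete, the enumeration can be sketched as follows. This is purely an illustration: the scenario and failure mode names mirror the example system, and wcet_of is a hypothetical lookup of static-analysis results, not part of the paper's tool chain.

```python
from itertools import product

# Execution scenarios S(c_i) for each failure-dependent connection c_i;
# in the example system every connection is binary: 0 = failure, 1 = fault-free.
scenarios = {
    "speed sensor failure": [0, 1],
    "map sensor failure": [0, 1],
    "throttle sensor failure": [0, 1],
}

# FM(s_i): failure mode associated with each execution scenario of a connection.
fm = {
    ("speed sensor failure", 0): "Speed: detected_failure",
    ("speed sensor failure", 1): "Speed: no_TLookup",
    ("map sensor failure", 0): "MAP: detected_failure",
    ("map sensor failure", 1): "MAP: no_TLookup",
    ("throttle sensor failure", 0): "Throttle: detected_failure",
    ("throttle sensor failure", 1): "Throttle: no_TLookup",
}

connections = list(scenarios)

# M: all combinations of execution scenarios (the modes).
modes = [dict(zip(connections, values))
         for values in product(*(scenarios[c] for c in connections))]

def failure_modes(mode):
    """FM(m): the set of failure modes whose conjunction quantifies mode m."""
    return frozenset(fm[(c, s)] for c, s in mode.items())

# Omega would pair each mode's WCET t_m (from static analysis) with FM(m);
# wcet_of is a hypothetical lookup of imported WCET results:
# omega = {(wcet_of(m), failure_modes(m)) for m in modes}
```

For the three binary connections of the example this yields |M| = 2^3 = 8 modes, matching the eight analyzed system variants in Section 4.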
4 FDTA Tool Chain
In this section, we present a tool chain for Failure-Dependent Timing Analysis (FDTA). First, we present in Paragraph 4.1 the tool chain’s architecture. After that, we describe in Paragraph 4.2 how the previously introduced example system is analyzed. A Failure-Dependent Timing Analysis is performed by associating the failure modes of a safety analysis model with the connections of a system development model. The methodology of modes presented in Section 3 is implemented in this tool chain to automate the process of obtaining failure-dependent execution times.

4.1 Architecture
The architecture of the Failure-Dependent Timing Analysis tool chain is depicted in figure 3. It automates the entire analysis process and is based on Enterprise Architect (EA) (see figure 3:2) [28]. Systems are modeled in Simulink and then imported into EA as SysML models (see figure 3:1 and 3:3) [29]. The process of safety engineering is performed in EA using so-called Component Fault Trees (CFTs) (see figure 3:4). CFTs are a special kind of the widely accepted Fault Trees. They allow modeling the behavior of the system under failure conditions and are used to model quantitative as well as qualitative dependability-related statements for assessing the safety of a system [30]. Failure modes of the CFTs can be associated with connections of the imported Simulink model in a new diagram, the Failure-Dependent Timing Analysis Diagram. The diagram is defined using UML profile mechanisms [31]. In this diagram, the execution scenarios as described in Section 3 can also be set for every connection. The diagram is then evaluated by the tool to calculate all modes as presented in Section 3. For each particular mode, the connections of the Simulink model are then changed to the value provided by the execution scenarios. This results in multiple versions of a system (one for each mode), each with different failure conditions injected (see figure 3:5). Those remodeled systems are used to generate C code, which is compiled for the ARM7 Processor Family [32] using the YAGARTO tool chain [33] (see figures 3:6 and 3:7). The different compiled versions of the former Simulink models are afterwards analyzed regarding their worst-case execution time (WCET) using aiT (see figure 3:8) [34]. The results of the WCET analysis are then related to the previously extracted modes of the FDTA Diagram. For each mode, the probability is calculated using the associated failure modes. To obtain the probabilities, the widely accepted fault tree analysis tool FaultTree+ is used [35]. 
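The quantification step of such fault trees can be sketched with a minimal evaluator. This is a simplified stand-in for a fault tree analysis tool, not FaultTree+ itself; basic events are assumed stochastically independent, the gate structure follows figure 2, and the event probabilities are placeholders (with the false positive/negative rates set to zero, as assumed later in the example analysis).

```python
# Probabilities of basic events (placeholder values, assumed independent).
p = {
    "MAP out of Range": 1e-4,
    "Range Check false negative": 0.0,  # range check assumed 100% reliable
    "Range Check false positive": 0.0,
}

def p_or(*probs):
    # P(A or B or ...) for independent events: 1 - prod(1 - p_i)
    result = 1.0
    for q in probs:
        result *= 1.0 - q
    return 1.0 - result

def p_and(*probs):
    # P(A and B and ...) for independent events: prod(p_i)
    result = 1.0
    for q in probs:
        result *= q
    return result

# Gates of the CFT for the MAP sensor (cf. figure 2): a detected failure occurs
# if the value is out of range or the check erroneously flags valid data;
# erroneous data propagates if an out-of-range value is erroneously accepted.
detected_failure = p_or(p["MAP out of Range"], p["Range Check false positive"])
erroneous_data_propagated = p_and(p["MAP out of Range"],
                                  p["Range Check false negative"])
```

With the false positive/negative probabilities set to zero, the detected-failure probability reduces to the out-of-range probability and no erroneous data propagates, which matches the assumption made for the example quantification.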
The resulting execution times are then presented together with their probabilities of occurrence extracted from the safety analysis model. Paragraph 4.2 demonstrates the Failure-Dependent Timing Analysis of the example system.
Fig. 3. Architecture of the Failure-Dependent Timing Analysis (FDTA) Tool Chain
4.2 Analysis Example
To perform a Failure-Dependent Timing Analysis of the system described in Section 3, the system is imported from Simulink into EA and the fault trees for the sensors are modeled as depicted in figure 2 for the MAP sensor. After that, the failure-dependent connections of the imported model are identified and associated with values that reflect a certain execution scenario. Each scenario is then related to a certain failure mode of the fault tree. This is modeled in an additional view, the Failure-Dependent Timing Analysis Diagram, which is depicted for the example system in figure 4. On the leftmost side of this figure, the failure-dependent connections are depicted as double-arrows. These are the connections from the failure demux to the blocks that perform a table lookup as depicted in figure 1. Each connection can either be 1 for the scenario of a
correct sensor value or 0 for the scenario of an erroneous sensor value. The circles in the middle of this figure represent these execution scenarios. Each execution scenario has an associated failure mode of the fault tree, e.g., the execution scenario with the value 0 of the connection between failure demux and MAP table lookup is connected to the failure mode detected_failure, since this failure mode corresponds to this value.
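The associations of the diagram can be captured as plain data, from which the failure modes of any mode follow by lookup. The connection names below are taken from figure 4; the code itself is an illustrative sketch, not part of the tool chain.

```python
# Failure-dependent connections and the failure mode tied to each value,
# as modeled in the FDTA Diagram (figure 4).
associations = {
    "Failure_Demux/1->Corrected_Throttle/2": {1: "Throttle: no_TLookup",
                                              0: "Throttle: detected_failure"},
    "Failure_Demux/2->Corrected_Speed/2":    {1: "Speed: no_TLookup",
                                              0: "Speed: detected_failure"},
    "Failure_Demux/3->Corrected_MAP/2":      {1: "MAP: no_TLookup",
                                              0: "MAP: detected_failure"},
}

def mode_failure_modes(mode):
    """FM(m): failure modes associated with the given connection values."""
    return [associations[c][v] for c, v in mode.items()]

# Mode in which only the MAP sensor value is out of range:
map_failed = {
    "Failure_Demux/1->Corrected_Throttle/2": 1,
    "Failure_Demux/2->Corrected_Speed/2": 1,
    "Failure_Demux/3->Corrected_MAP/2": 0,
}
fms = mode_failure_modes(map_failed)
# -> ["Throttle: no_TLookup", "Speed: no_TLookup", "MAP: detected_failure"]
```

The conjunction of these failure modes is what the tool chain hands to the fault tree analysis for quantification.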
Fig. 4. Failure-Dependent Timing Analysis Diagram for the system SensorFaultCorrection
From this diagram, all possible modes can be derived as described in Section 3. For each mode, a new Simulink model is generated, with the corresponding values for all connections injected. These different generated models are then used to generate code, which is compiled and analyzed regarding its WCET using the external static timing analysis tool aiT. After that, the associated failure modes are quantified as described in Section 3. For probabilistic quantification, we assumed the range check to be 100% reliable and set the failure probability for all false positive and false negative failure modes, like range check false negative and range check false positive as depicted in figure 2, to zero (as it is sometimes done for testable software routines, e.g. in [36]). For the failure probabilities of the out-of-range failure modes of the sensors, we decided to use quantifications from standards, since the system analyzed here is an academic example that demonstrates the methodology of FDTA. We set the mean time to failure (MTTF) to two million hours for the MAP sensor, according to the 20PC SMT Honeywell Pressure Sensor; the MTTF for the throttle sensor to 3767 years and the MTTF
for the speed sensor to 114155 hours, both according to DIN EN ISO 13849-1. The results of the FDT Analysis are depicted in table 1.

Table 1. Analysis results. The first data row indicates the execution time without manipulations.

Speed  MAP  Throttle  Exec. Time (µs)  Probability
  -     -      -        593.633333         -
  0     0      0        600.133333      0.00006
  1     1      1        032.033333      0.44316
  0     1      1        221.400000      0.51145
  1     0      1        221.400000      0.01984
  1     1      0        221.400000      0.00118
  0     0      1        410.766667      0.02290
  0     1      0        410.766667      0.00136
  1     0      0        410.766667      0.00005

The first row of table 1 shows the WCET for the original system in addition to the results from the FDT Analysis. The other rows show the WCETs for the systems that have values injected, e.g., the system represented by the second row has all failure-dependent connections set to 0 (all sensor data range checks indicate errors) and a WCET of approx. 600 µs with a probability of 6·10^-5. The difference in WCET between the first row and the second row results from the injection of values into the Simulink model. Additional code constructs are required that slightly extend the WCET estimation, e.g., to set the connection from failure demux to MAP to 1. In our experiments we measured a small overestimation of about 3 µs for every value that is injected into the model. These experiments are not part of this paper. The results show a strong dependency between execution times and the occurrence of failures. Execution times vary greatly for the different modes of the system. The probability for the upper bound guarantee of about 600 µs is comparatively low. With a probability of 0.99994, the system will execute within an upper bound of about 410 µs, nearly a 33% reduction compared to the overall WCET value.
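The probabilistic upper bound guarantees quoted above can be recomputed directly from the (execution time, probability) pairs of table 1. The numbers below are copied from the table; the helper function is a straightforward sketch.

```python
# (WCET in microseconds, probability) pairs for the eight modes of table 1.
omega = [
    (600.133333, 0.00006),
    (32.033333, 0.44316),
    (221.400000, 0.51145),
    (221.400000, 0.01984),
    (221.400000, 0.00118),
    (410.766667, 0.02290),
    (410.766667, 0.00136),
    (410.766667, 0.00005),
]

def bound_for(threshold):
    """Smallest WCET t such that P(execution time <= t) >= threshold."""
    cumulative = 0.0
    for t, prob in sorted(omega):
        cumulative += prob
        # small epsilon guards against floating-point rounding in the sum
        if cumulative >= threshold - 1e-12:
            return t
    return max(t for t, _ in omega)

# bound_for(0.99994) -> 410.766667: with probability 0.99994 the system stays
# within ~410 us; only the full guarantee needs the ~600 us bound.
```

This reproduces the observation in the text: the ~600 µs bound is only needed for the residual probability of 0.00006.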
5 Conclusion and Future Work
In this paper, the methodology of Failure-Dependent Timing Analysis was presented. The methodology allows analyzing a system’s (worst-case) execution times under certain failure conditions. The timely termination of a system can be analyzed to provide a probabilistic guarantee that a task’s deadline miss ratio is below a given threshold.
A tool chain for FDTA was presented that uses elements of safety analysis models and elements of system development models to derive failure-dependent execution times. The tool allows importing Simulink models into Enterprise Architect as SysML models. Using the UML profiling mechanism, connections carrying failure-dependent data can be identified, as can be elements of Component Fault Trees that are related to those failure-indicating data. Both types of elements are related in an additional view, the Failure-Dependent Timing Analysis Diagram. Supported by this relation, the tool can automatically evaluate the different execution times and relate them to failure probabilities. Since the worst-case execution times are calculated using static analysis, they represent safe upper bound guarantees for different failure scenarios. Injecting faults into system development models implies small overestimations compared to unmodified systems, but the example shows that these are negligible. The use of Fault Trees as a source of probabilities supports arguing for a reliable probabilistic behavior, since such safety analysis models are already accepted by authorities for quantifying quality attributes like reliability and safety. In future work, the methodology and the tool will be evaluated further in an industrial environment with special attention being paid to the use of the results for certification purposes. Additionally, the analysis results will be processed to allow deeper-level analyses and to support better graphical evaluation of the results. Since the UML stereotype mechanisms can be applied to many model elements, different safety analysis models, such as Markov Chains or Petri Nets, can be easily included in the methodology. This is beneficial for analyzing fault tolerance mechanisms with more complex failure behavior.
References

1. Bernat, G., Burns, A., Newby, M.: Probabilistic timing analysis: An approach using copulas. J. Embedded Comput. 1, 179–194 (2005)
2. Diaz, J.L., Garcia, D.F., Kim, K., Lee, C.-G., Lo Bello, L., Lopez, J.M., Min, S.L., Mirabella, O.: Stochastic analysis of periodic real-time systems. In: 23rd IEEE Real-Time Systems Symposium, RTSS 2002, pp. 289–300 (2002)
3. Laprie, J.-C., Arlat, J., Beounes, C., Kanoun, K.: Definition and analysis of hardware- and software-fault-tolerant architectures. Computer 23(7), 39–51 (1990)
4. Arlat, J., Kanoun, K., Laprie, J.-C.: Dependability modeling and evaluation of software fault-tolerant systems. IEEE Transactions on Computers 39(4), 504–513 (1990)
5. Belli, F., Jedrzejowicz, P.: Fault-tolerant programs and their reliability. IEEE Transactions on Reliability 39(2), 184–192 (1990)
6. Pucci, G.: A new approach to the modeling of recovery block structures. IEEE Transactions on Software Engineering 18(2), 159–167 (1992)
7. Dugan, J.B., Doyle, S.A., Patterson-Hine, F.A.: Simple models of hardware and software fault tolerance. In: Proceedings of the Annual Reliability and Maintainability Symposium, January 24-27, pp. 124–129 (1994)
8. Doyle, S.A., Mackey, J.L.: Comparative analysis of two architectural alternatives for the n-version programming (NVP) system. In: Proceedings of the Annual Reliability and Maintainability Symposium, pp. 275–282 (January 1995)
9. Tyrrell, A.M.: Recovery blocks and algorithm-based fault tolerance. In: Proceedings of the 22nd EUROMICRO Conference, EUROMICRO 1996. Beyond 2000: Hardware and Software Design Strategies, pp. 292–299, 2-5 (1996)
10. Mok, A., Amerasinghe, P., Chen, M., Tantisirivat, K.: Evaluating tight execution time bounds of programs by annotations. IEEE Real-Time Syst. Newsl. 5(2-3), 81–86 (1989)
11. Lindgren, M., Hansson, H., Thane, H.: Using measurements to derive the worst-case execution time. In: Proceedings of the Seventh International Conference on Real-Time Computing Systems and Applications, pp. 15–22 (2000)
12. Gustafsson, J., Ermedahl, A., Lisper, B.: Towards a flow analysis for embedded system C programs. In: 10th IEEE International Workshop on Object-Oriented Real-Time Dependable Systems, WORDS 2005, pp. 287–297, 2-4 (2005)
13. Wilhelm, R., Engblom, J., Ermedahl, A., Holsti, N., Thesing, S., Whalley, D., Bernat, G., Ferdinand, C., Heckmann, R., Mitra, T., Mueller, F., Puaut, I., Puschner, P., Staschulat, J., Stenström, P.: The worst-case execution-time problem—overview of methods and survey of tools. ACM Trans. Embed. Comput. Syst. 7(3), 1–53 (2008)
14. Ferdinand, C.: Worst case execution time prediction by static program analysis. In: Proceedings of the 18th International Parallel and Distributed Processing Symposium, p. 125 (April 2004)
15. Ferdinand, C., Heckmann, R.: aiT: Worst-Case Execution Time Prediction by Static Program Analysis. Building the Information Society 156, 377–383 (2004)
16. Puschner, P., Nossal, R.: Testing the results of static worst-case execution-time analysis. In: Proceedings of the 19th IEEE Real-Time Systems Symposium, pp. 134–143, 2-4 (1998)
17. Wolf, F., Staschulat, J., Ernst, R.: Hybrid cache analysis in running time verification of embedded software. Design Automation for Embedded Systems 7(3), 271–295 (2002)
18. Li, X., Mitra, T., Roychoudhury, A.: Modeling control speculation for timing analysis. Real-Time Syst. 29(1), 27–58 (2005)
19. Burns, A., Edgar, S.: Predicting computation time for advanced processor architectures. In: 12th Euromicro Conference on Real-Time Systems, Euromicro RTS 2000, pp. 89–96 (2000)
20. Burns, A., Edgar, S.: Statistical analysis of WCET for scheduling. In: Proceedings of the 22nd IEEE Real-Time Systems Symposium, pp. 215–224 (December 2001)
21. Griffin, D., Burns, A.: Realism in Statistical Analysis of Worst Case Execution Times. In: Lisper, B. (ed.) 10th International Workshop on Worst-Case Execution Time Analysis (WCET 2010). OpenAccess Series in Informatics (OASIcs), vol. 15, pp. 44–53. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany (2010); The printed version of the WCET 2010 proceedings is published by OCG (www.ocg.at), ISBN 978-3-85403-268-7
22. Bernat, G., Colin, A., Petters, S.M.: WCET Analysis of Probabilistic Hard Real-Time Systems. In: Proceedings of the 23rd Real-Time Systems Symposium, RTSS 2002, pp. 279–288 (2002)
23. Bernat, G., Colin, A., Petters, S.: pWCET: A tool for probabilistic worst-case execution time analysis of real-time systems. Technical report, University of York, England, UK (2003)
24. David, L., Puaut, I.: Static determination of probabilistic execution times. In: Proceedings of the 16th Euromicro Conference on Real-Time Systems, ECRTS 2004, June-2 July, pp. 223–230 (2004)
25. Perrone, R., Macedo, R., Lima, G., Lima, V.: An approach for estimating execution time probability distributions of component-based real-time systems. Journal of Universal Computer Science 15(11), 2142–2165 (2009), http://www.jucs.org/jucs_15_11/an_approach_for_estimating
26. Lu, Y., Nolte, T., Kraft, J., Norstrom, C.: Statistical-based response-time analysis of systems with execution dependencies between tasks. In: 15th IEEE International Conference on Engineering of Complex Computer Systems (ICECCS), pp. 169–179 (March 2010)
27. Simulink, © 1994-2011 The MathWorks Inc., 3 Apple Hill Drive, Natick, MA 01760-2098, United States of America, http://www.mathworks.de/products/simulink
28. Enterprise Architect, © 2000-2011 Sparx Systems Pty Ltd., Creswick, Victoria, 3363, Australia, http://www.sparxsystems.com.au
29. OMG Systems Modeling Language, © 1997-2011 Object Management Group Inc., 140 Kendrick Street, Building A, Suite 300, Needham, MA 02494, United States of America, http://www.omgsysml.org
30. Kaiser, B., Liggesmeyer, P., Mäckel, O.: A new component concept for fault trees. In: SCS 2003: Proceedings of the 8th Australian Workshop on Safety Critical Systems and Software, pp. 37–46. Australian Computer Society, Inc., Darlinghurst (2003)
31. OMG: A UML Profile for MARTE: Modeling and Analysis of Real-Time Embedded systems, Beta 2, 2008. Object Management Group (July 2009), http://omgmarte.org, OMG Document Number: ptc/2008-06-09
32. ARM7, © 2011 ARM Ltd., Equiniti Aspect House, Spencer Road, Lancing BN99 6DA, United Kingdom, http://www.arm.com/products/processors/classic/arm7
33. YAGARTO, Yet another GNU ARM toolchain, Michael Fischer, Faustmuehlenweg 11, 34253 Lohfelden, Germany, http://www.yagarto.de/imprint.html
34. aiT Worst-Case Execution Time Analyzers, © 1998-2011 AbsInt Angewandte Informatik GmbH, Science Park 1, 66123 Saarbruecken, Germany, http://www.absint.com/ait
35. FaultTree+, © 1986-2011 Isograph Ltd., 2020 Main Street, Suite 1180, Irvine, CA 92614, United States of America, http://www.isograph-software.com/ftpover.htm
36. DO-178B: Software Considerations in Airborne Systems and Equipment Certification Standard, Radio Technical Commission for Aeronautics (1991)
A Calculus for SLA Delay Properties

Sebastian Vastag

Technische Universität Dortmund, Informatik IV, D-44221 Dortmund, Germany
[email protected]
Abstract. Service providers in Service-Oriented Architectures (SOA) often specify system performance values with the help of Service Level Agreements (SLAs) that do not specify details of how the system realizes services. Analytic modeling of SOA to estimate performance values is thus made difficult by the lack of knowledge of service rates. Service components are characterized only by the quantitative requirements in their SLAs, which are not supported by most modeling methods. We propose a calculus to model and evaluate SOA with quantitative properties described in SLAs. Instead of defining a system by its service capacity, we use flexible constraints on delays as found in SLAs. From these delay constraints, approximate service rates sufficient to fulfill them are derived.

Keywords: Network Calculus, SOA, SLA.
1 Introduction
Service-Oriented Architectures (SOAs) are based on the idea that processing functions of software systems can be offered as services accessible over the net. Services can be composed of other services and may form hierarchies [1]. A common implementation of SOA are Web services [1]. With cloud computing as an emerging system structure, users are no longer able to distinguish between local, remote, and composed services [2]. Service performance is unknown to users and not measurable unless one is a contracted customer. This can result in situations where users of a service do not obtain the required performance and availability of service components. To avoid system shortages, quantitative requirements for measures like reliability and response times are laid down in Service Level Agreements (SLAs) [3, 4]. SLAs are a contract between user and service provider: when the former agrees to limit its workload to the system, the latter is able to guarantee a certain level of service. SLAs can be issued by service providers as an offer or by customers as a requirement. A challenge for system modeling with SLAs is that the performance of the service, or more specifically, the processing rate for requests, is unknown. The only available model parameters are upper bounds for the workload and response times as defined in the SLAs. For analytical models, which are often used for capacity planning and validation, the missing service rate leaves a gap.
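Classical Network Calculus works in the opposite direction of this gap: it derives a delay bound from a known service curve. For orientation, the standard result for token-bucket arrivals α(t) = b + r·t and a rate-latency service curve β(t) = R·(t − T)⁺ with r ≤ R is the delay bound T + b/R. A minimal sketch, with symbols as just defined:

```python
def delay_bound(b, r, R, T):
    """Network Calculus delay bound for a token-bucket arrival curve
    (burst b, sustained rate r) served by a rate-latency service curve
    (rate R, latency T). Valid only under the stability condition r <= R."""
    if r > R:
        raise ValueError("unstable: arrival rate exceeds service rate")
    return T + b / R

# Example: burst of 2 requests, arrival rate 1/s, service rate 4/s,
# latency 0.5 s -> every request is answered within 0.5 + 2/4 = 1.0 second.
```

The SLA Calculus introduced below inverts this direction: the delay bound is the given SLA parameter, and a sufficient service curve (here, R and T) is what must be derived.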
This research was supported by the Deutsche Forschungsgemeinschaft (DFG).
J.B. Schmitt (Ed.): MMB & DFT 2012, LNCS 7201, pp. 76–90, 2012.
© Springer-Verlag Berlin Heidelberg 2012
Example 1. A university plans to offer a central literature database using a Web service. The service can answer queries by author or year so researchers can include a list of their own papers on their homepage. The service shall be hosted by an external provider. The university formulates an SLA. Service description, query and output format as well as contract duration and pricing are functional SLA properties. Of course, fast service response times are desirable. The SLA requires the Web service to respond within 2 seconds; this is a quantitative (nonfunctional) SLA property.

In addition to unknown processing rates, modeling and validation of SLAs in SOAs has several characteristics differing from other modeling domains. Ideally, SLAs define boundaries for performance numbers that should not be undercut or exceeded. However, a system communicating over a network does not rely on a completely controlled environment and can be influenced by external network traffic and other factors outside the control of service user and provider. Hence quantitative requirements in SLAs often have to include tolerances. Especially for delays, a hard deadline, as found in embedded systems, would often be violated. Nondeterministic methods like queueing theory allow tolerances in models by choosing appropriate distributions and can be used to compute mean performance values in steady state [5]. In our opinion, the use of mean values to validate SLAs in SOA introduces several problems: mean values do not indicate whether SLA limits are violated. Furthermore, there is no option to take into account startup phases or short service request bursts.

Example 2. All literature database providers offer their services over the Internet. Even if the maximum response time of 2 seconds could be delivered, no provider agrees on a hard deadline. Reliable transmission rates over the public network cannot be guaranteed, and so no one will include a fixed number in a contract.
Therefore the university has to relax its nonfunctional requirements on delay times. As a consequence, modeling for SOA with SLAs has to consider, on the one hand, limits for quantitative requirements that a system should meet and, on the other hand, has to allow flexibility to transform hard deadlines into soft ones. Modeling nonfunctional properties in SLAs raises several open questions: How to build analytical models for systems when only SLAs for components are known? How to model quantitative requirements in SLAs? How to determine performance bounds of a SOA composed of services that feature SLAs? And, from the viewpoint of service providers: Can an SLA be fulfilled? In this paper we extend our approach [6] to model requirements on delays in SLAs by introducing the recently developed SLA Calculus.

– SLA Calculus is a deterministic calculus for quantitative requirements in SLAs under worst-case assumptions. It is a subset of (min,+) system theory [7–10] used to form delay properties for services. This reflects the situation in SOA modeling: service performance is often unknown; only SLAs specifying bounds on request arrival rates and delays are available for modeling.
S. Vastag
– Service demand to a system with arrival rates and short-term bursts is described by arrival curves as common in Network Calculus [9, 10]. The delay occurring at a service provider is also captured by curves limiting a long-term delay rate while allowing short phases of lower system performance and longer delay. – Arrivals, delay and service in systems are related. Network Calculus is used unidirectionally to derive delays from arrivals and service. We provide a method to derive the required service rates to fulfill an SLA containing bounds on demand and delay. The output of our method are service curves as found in Network Calculus, closing the circle. – Our model includes elements for abstract service providers and quantitative requirements in SLAs. The service provider model contains the structure of a SOA and the arising interdependency of SLAs. This allows us to reason about performance values of composed systems. To define delay curves we will use the time difference between input and output of a system. These values can be derived by measurement or, for worst-case analysis, with Network Calculus. This paper extends previous results published in [6] in several directions. In our previous approach [6] the derivation of a service curve was based on an optimization problem. By giving an analytic method based on (min,+) system theory, optimization algorithms become redundant and faster implementations are possible. Further contributions of this work are an improved model for SOA with quantitative requirements in SLAs. SLA Delay Properties are formulated and their concatenation in workflows is discussed. 1.1
Related Work
For analyzing models of Service-Oriented Architectures, simulations can be used. In [11] a simulation framework for SOA was proposed. Model analysis gives performance numbers but does not include the description of SLAs. Based on a process chain modeling language, SOA models are analyzed in [4, 12]. Timeouts are included as hard deadlines for service calls and quantitative requirements are considered. Serious disadvantages of simulation are the high effort for model generation and parametrization and the computational effort of simulation runs. The classical approach to analytical system analysis uses (extended) queueing theory. It has been shown to be suitable for efficient analysis of computer systems [5] or SOA [13]. [14] also includes functional properties of SLAs for system modeling with queueing systems. SLAs include limits on system load and on the system performance to deliver. The way boundaries are chosen in SLAs is similar to descriptions used in computer networks or realtime systems. A preferred tool here is Network Calculus [9, 10] to obtain deterministic bounds in queueing systems. It uses (min,+)-algebra [9, 10, 15] to set up a filtering theory for flows in networks and to derive worst-case performance bounds. Network Calculus was successfully used for computer network analysis [10, 16] and software tools implementing the approach are available [17, 18]. Many extensions [19, 20] as well as different
A Calculus for SLA Delay Properties
fields of application apart from data networks have been developed [21–23]. One of the most advanced derivatives is the Real-Time Calculus [24, 25], bringing the ideas of Network Calculus to the analysis of realtime systems. Although queueing theory gives mean values and Network Calculus works with worst-case scenarios, both can describe the same systems [26] from different viewpoints. Not much is known about how to combine both techniques. The Stochastic Network Calculus [27], for example, combines distributions for rate modeling with (min,+) algebra. Although SLAs can be represented with Network Calculus, it is rarely used to analyze SOA. Attempts were made in [28] and in our own work [6]. 1.2
Outline
The next section gives a short introduction to (min,+)-algebra. We extend the calculus with delay curves in Section 3. A model for systems with SLAs featuring quantitative requirements is presented in Section 4. It allows us to derive new SLAs when system nodes are combined into more complex systems. We consider serial concatenation of services in this paper. The inverse application of delay curves to derive a service curve is discussed in Section 5. Finally, an outlook on future work is given in Section 6.
2
(min,+) Calculus
This work is based on Minplus or (min,+) algebra. (min,+) uses the minimum function as addition and replaces multiplication with addition; in this notation, (+, ·) becomes (min, +). The operators min() and + form a dioid [10] with ∞ as neutral element of addition and 0 as neutral element of multiplication. An extensive overview of the theory can be found in the book by Baccelli et al. [15]. Key elements for the linear time invariant filtering theory in Network Calculus are the (min,+) equivalents of convolution and deconvolution of functions (cf. [10]). Definition 1 (Wide-sense increasing functions). A function is wide-sense increasing if and only if f(a) ≤ f(b) for all a ≤ b. F is the set of wide-sense increasing functions with f(t) ≥ 0 ∀t, f ∈ F. F0 is the subset of F with functions that are zero for t < 0. Definition 2 ((min,+) (de-)convolution). Let f and g be two functions or sequences in F0. The (min,+) convolution of f and g (notation f ⊗ g) is the function (f ⊗ g)(t) = inf_{0≤s≤t} {f(t − s) + g(s)}
If t < 0: (f ⊗ g)(t) = 0. The dual operation to ⊗ in (min,+) is deconvolution (notation f ⊘ g): (f ⊘ g)(t) = sup_{s≥0} {f(t + s) − g(s)}
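Definition 2 can be sketched numerically for sampled functions. This is a toy sketch with helper names of our own; the finite sample horizon truncates the infimum and supremum:

```python
# Discrete sketch of Definition 2: f[t] and g[t] sample f(t), g(t) at t = 0, 1, ...
def minplus_conv(f, g):
    """(f (x) g)(t) = inf over 0 <= s <= t of f(t - s) + g(s)."""
    T = min(len(f), len(g))
    return [min(f[t - s] + g[s] for s in range(t + 1)) for t in range(T)]

def minplus_deconv(f, g):
    """(f (/) g)(t) = sup over s >= 0 of f(t + s) - g(s), truncated to the horizon."""
    T = len(f)
    return [max(f[t + s] - g[s] for s in range(T - t)) for t in range(T)]

f = [2 * t for t in range(6)]   # f(t) = 2t
g = [0, 0, 1, 2, 3, 4]          # g(t) = max(0, t - 1)
print(minplus_conv(f, g))
```

On these samples the convolution equals g itself: since f starts at 0 and grows faster, the slower curve dominates the (min,+) combination.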
Fig. 1. Arrival flow r(t), arrival function R(t) and arrival curve α(t)
Fig. 2. Horizontal and vertical deviation of R(t) and R∗ (t)
(min,+)-convolution is associative, commutative, distributive with respect to min() and closed in F0. Again, (F, min, ⊗) is a dioid [9, 19]. For additional properties we refer to Chapter 3 in [10]. 2.1
Arrival and Service Curves
In Network Calculus there are two main sets of functions to represent arrivals to a system over time and the service available to process them. Arrivals to a system are measured in bits, packets or any other quantitative unit used to describe data transmissions. Let r(t) be the number of arrivals of an arrival process at time t. Definition 3 (Arrival Function). Arrival function R(t) is the cumulative sum of arrivals in the time interval [0, t], thus R(t) = ∫₀ᵗ r(x) dx. R(t) is continuous, wide-sense increasing and R(t) = 0 for t ≤ 0, thus R(t) ∈ F0. To characterize arrival flows and to set bounds on arrival functions, Network Calculus abstracts arrival functions with functions called arrival curves conforming to the arrival curve property [24]. Definition 4 (Arrival Curve). A function αU ∈ F is an upper arrival curve for arrival function R(t) iff R ≤ R ⊗ αU. A lower curve αL is given by R ≥ R ⊗ αL. Arrival curve αU limits R from above while αL is a lower boundary. When an arrival flow R is processed in a system it will leave as outgoing flow R∗. In general R ≥ R∗ holds and R∗ fulfills the arrival curve property, hence it is often referred to as outgoing arrival flow. Figure 2 shows both flows. R∗ can be obtained by measurements or derived by a second system property: analogous to arrivals, the processing resources a system can offer to its arrival flow at time t are given by function b(t). It should be noted that b(t) is the service the system is able to offer, although r(t) < b(t) will usually hold. Definition 5 (Service Function). Service function C(t) is the cumulative sum of service a system can deliver in the time interval [0, t], thus C(t) = ∫₀ᵗ b(x) dx.
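As an illustrative sketch (helper names and toy data are ours), note that for sampled data the arrival curve property of Definition 4 is equivalent to R(t) − R(s) ≤ αU(t − s) for all 0 ≤ s ≤ t, which can be tested directly:

```python
# Sketch: cumulative arrival function R from per-slot arrivals r, plus a direct
# test of the upper arrival curve property for a leaky-bucket curve.
def cumulative(r):
    R, total = [], 0
    for x in r:
        total += x
        R.append(total)
    return R

def leaky_bucket(rate, burst):
    # Standard leaky-bucket curve: 0 for t <= 0, rate*t + burst otherwise.
    return lambda t: 0 if t <= 0 else rate * t + burst

def is_upper_arrival_curve(R, alpha):
    """R <= R (x) alpha holds iff R(t) - R(s) <= alpha(t - s) for all 0 <= s <= t."""
    n = len(R)
    return all(R[t] - R[s] <= alpha(t - s) for s in range(n) for t in range(s, n))

r = [1, 3, 0, 2, 1, 0, 4]          # requests arriving per time slot
R = cumulative(r)
print(is_upper_arrival_curve(R, leaky_bucket(2, 3)))   # True: rate 2, burst 3 suffice
```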
Abstraction from arbitrary service functions is done with minimum and maximum functions satisfying the service curve property: Definition 6 (Service Curve). An upper service curve β U or a lower service curve β L for a service function C(t) is given by the relation: R∗ ≥ R ⊗ β L and R∗ ≤ R ⊗ β U
(1)
Definition 7 (Horizontal Deviation). Let f, g ∈ F be two functions. The horizontal distance between both at time t is δ(f, g)(t) = inf {τ ≥ 0 : f(t) ≤ g(t + τ)}
(2)
Network Calculus uses the horizontal deviation (Fig. 2) to find the maximum system latency [9, 10]. A flow with arrival curve α processed in a system offering service curve β has a maximum delay of h = sup_{t≥0} {δ(α, β)(t)}. The vertical deviation α(t) − β(t) gives the backlog (buffer content) of the flow (Fig. 2).
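The delay bound can be sketched numerically (helper names are ours; we scan for the smallest integer τ satisfying Definition 7 and take the supremum over t):

```python
# Numeric sketch of Definition 7 and the maximum-delay bound.
def horizontal_deviation(f, g, t, t_max):
    """Smallest integer tau >= 0 with f(t) <= g(t + tau), scanning up to t_max."""
    for tau in range(t_max - t + 1):
        if f(t) <= g(t + tau):
            return tau
    return float("inf")

alpha = lambda t: 0 if t <= 0 else 1.5 * t + 10   # arrival curve (leaky bucket)
beta = lambda t: max(0, 2.0 * (t - 4))            # rate-latency service curve

# Maximum delay: h = sup over t of delta(alpha, beta)(t).
h = max(horizontal_deviation(alpha, beta, t, 200) for t in range(100))
print(h)   # worst-case delay in time slots
```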
3
Delay Curves
Next to arrival and service curves, a third curve variant was introduced in [6]. Delay curves limit the delay of a flow traversing a system. In the same way as arrival functions are based on the cumulated number of arrivals, delay functions express the delay accumulated in a time interval. Definition 8 (System Delay). The delay between input flow R and output flow R∗ at time t is the horizontal deviation (Def. 7) between both functions: d(R, R∗)(t) = δ(R, R∗)(t) = inf {τ ≥ 0 : R(t) ≤ R∗(t + τ)}
(3)
The unit of delay is a unit of time. Delays in a time interval can be seen as a delay flow similar to arrival flows. To describe delay flows in a time interval we use delay functions. Definition 9 (Delay Function). Let d(R, R∗)(t) be the delay between an arrival and a departure flow. Delay function D(t) is the cumulative sum of delays in the time interval [0, t]: D(t) = ∫₀ᵗ d(R, R∗)(x) dx (4)
D(t) ∈ F0 since D(t) = 0 for t ≤ 0 and D(t) is wide-sense increasing. Thus D(t) features the same properties as arrival functions and can be described with similar algebraic methods. Example 3. Time passes between the arrival of a request at the exemplary literature Web service and the delivery of the reply. The time each request waits is measured and written to the system log. The log file can be seen as a flow of delay, analogous to traces of arrivals and departures. When we sum up the delay flow in a time interval we obtain a bound on the cumulated waiting times in this interval.
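Example 3 can be sketched as follows (toy log values and helper names are ours): per-slot waiting times from the log form the delay flow, and their cumulative sum is the delay function of Definition 9:

```python
# Accumulate measured per-slot waiting times d into the delay function D.
def delay_function(d):
    D, total = [0], 0.0
    for x in d:
        total += x
        D.append(total)
    return D                                     # D[t] = cumulated delay in [0, t]

log = [0.4, 1.9, 2.1, 0.8, 0.5, 3.0, 0.7]        # measured waiting times (seconds)
D = delay_function(log)
print(D[-1])                                     # total delay over the whole trace
```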
Fig. 3. Service provider with SLA Delay Property
Fig. 4. Concatenation of service providers with resulting SLA Delay Property
The last step is to define delay curves with a delay curve property. Definition 10 (Delay Curve). A lower delay curve Ψ L or upper delay curve Ψ U for a delay function D(t) satisfies the relations D ≥ D ⊗ Ψ L and D ≤ D ⊗ Ψ U
(5)
This is equivalent to Ψ L(t − s) ≤ D(t) − D(s) ≤ Ψ U(t − s) ∀ 0 ≤ s ≤ t. Ψ L(t) and Ψ U(t) are lower and upper bounds on the delays occurring in the interval [0, t].
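The incremental form of the delay curve property can be checked directly on sampled data (a sketch with our own names and toy values):

```python
# Sketch of the incremental delay-curve check.
def conforms(D, psi_l, psi_u):
    """Psi_L(t-s) <= D(t) - D(s) <= Psi_U(t-s) for all 0 <= s <= t."""
    n = len(D)
    return all(
        psi_l(t - s) <= D[t] - D[s] <= psi_u(t - s)
        for s in range(n) for t in range(s, n)
    )

D = [0, 1, 3, 4, 4, 6, 7]                      # sampled delay function D(t)
psi_u = lambda t: 0 if t <= 0 else 2 * t + 1   # upper delay curve
psi_l = lambda t: 0                            # trivial lower curve
print(conforms(D, psi_l, psi_u))
```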
4
SLA Calculus System Model
This section introduces an abstract system model for SOA systems with quantitative requirements specified in SLAs. It includes descriptions for task arrivals to service providers and for constraints on delays that still allow flexibility. SLA Delay Properties providing the scope for valid agreements will be defined and we will discuss combinations of several delay properties. 4.1
Requests to Systems
The basic system model in Network Calculus and Real-Time Calculus is similar to queueing systems. Workload arrives at a system and awaits service; after processing it leaves (figure 3). Workload can be computational tasks, customers, data packets or anything else the model specifies. For SLA Calculus we will also use a similar interpretation of workload: jobs or tasks sent to a service, as they occur in SOA, are pooled under the term requests. A service request is the invocation of a service including transmission of input data, processing and sending the result; triggering a Web service, an implementation of the SOA paradigm, is an example. Requests are discrete arrivals to systems weighted by their request size or processing complexity [4]. We will use a fluid model of requests with continuous time domain and abstract discrete jobs to a request flow. A request flow is the workload a customer requests from a service. Function R(t) is the cumulated number of service requests that arrived in the interval [0, t].
A Calculus for SLA Delay Properties
83
Definition 11 (Arrival Curve for Request Flow). A request flow to a system is limited by function αU(t) if αU(t) is an upper arrival curve for request function R(t) (see Def. 4). Arrival curves for request flows are a limitation on the usage of services. They provide a bound from above on the workload that is sent to service providers. From the perspective of SLAs it is the customer's part of the contract. 4.2
SLA Delay Properties
SLAs for SOA can contain various aspects in different definition formats. In this work nonfunctional properties are considered, especially delay and timing. Description languages and system management [29] are out of the scope of this paper, so we focus on delay properties as a fraction of an SLA description. Delay curves limit the cumulated delay within a time interval. Such a curve has no expressiveness without a limit on arrivals: on the one hand, a node might easily fulfill a delay curve Ψ U if there is only one arrival; on the other hand, no node can fulfill a delay curve if the arrival rate exceeds the processing rate for a long time. In consequence, SLA Delay Properties include boundaries for the customer side as well as the provider part. They are valid under the condition that customers use the service according to the agreement. As SLA Delay Properties only virtually regulate request flows, service users are free to produce workloads above the limits, but in those cases they cannot claim any guarantees on delays. Definition 12 (SLA Delay Property). An SLA Delay Property is a set SLA = {αL, αU, Ψ L, Ψ U} with αL, αU ∈ F0 and Ψ U, Ψ L ∈ F0, αL ≤ αU, Ψ L ≤ Ψ U and αU(t), Ψ U(t) > 0 ∀t > 0. The definition implies αL = 0 as default value when lower envelopes of arrival flows are not known or unnecessary. The same holds for Ψ L = 0, since the majority of service demands do not require a minimum processing time. However, knowledge of a lower envelope for processing times can avoid excessive reservation of resources in system capacity planning. For the remainder of the paper we use {0, αU, 0, Ψ U} = {αU, Ψ U}. To instantiate SLA Delay Properties in SLA Calculus, request arrival and delay curves can use the same function set f ∈ F0; they only differ in their quantitative parameters. Piecewise linear functions are most convenient for the description of upper bounds within a time interval.
Affine functions γr,b(t) = max(0, rt + b) match leaky buckets with rate r and burst size b [10]. In Network Calculus, T-SPEC traffic specification curves for computer networks are frequently used. They combine two leaky buckets, T-SPEC(p, M, r, b)(t) = min(γp,M(t), γr,b(t)), with M as maximum packet size and p as peak rate. For delay curves in SLA Calculus we use a different interpretation with delay time instead of packets. Due to the fluid model we set M = 0. r is the maximum long-term delay rate, b adds flexibility for variations in request processing rate and p is a higher delay rate when the service provider slows down for a short time.
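As a sketch, the leaky-bucket and T-SPEC patterns can be written down directly (function names are ours; the parameter values are those used in Example 4 below):

```python
# Sketch of the curve patterns above.
def gamma(r, b):
    """Leaky bucket gamma_{r,b}: 0 for t <= 0, r*t + b otherwise."""
    return lambda t: 0 if t <= 0 else r * t + b

def t_spec(p, M, r, b):
    """T-SPEC(p, M, r, b) = min(gamma_{p,M}, gamma_{r,b})."""
    gp, gr = gamma(p, M), gamma(r, b)
    return lambda t: min(gp(t), gr(t))

# Values from Example 4: request arrival curve and delay curve (M = 0, fluid model).
alpha_u = t_spec(4.0, 0, 1.5, 10)
psi_u = t_spec(5.0, 0, 2, 40)

# Short intervals are dominated by the peak rate p, long ones by the rate r.
print(alpha_u(2), alpha_u(60))
print(psi_u(10), psi_u(60))
```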
Example 4. We are going to relate an SLA Delay Property to the literature Web service. The service should be able to process requests at an average rate of 1.5 requests per second. We also know that sometimes user demand increases for a short time to over 4 requests per second. To include this information on rate and safety margin we formulate a request arrival curve using the T-SPEC pattern: αU = T-SPEC(4.0, 0, 1.5, 10). For the delay we use the delay rate of 2 seconds per request. However, not expecting realtime performance from a Web service, we also grant a burst in the delay flow to relax our performance requirements. During a burst of 40 seconds in the delay flow a request may take up to 5.0 seconds. We also formulate this information as a delay curve: Ψ U = T-SPEC(5.0, 0, 2, 40). Figure 6 shows both curves. 4.3
Service Provider
Request flows are served by service providers. Definition 13 (Service Provider). A service provider S accepts a request flow R(t), processes the requests and produces an outgoing request flow R∗(t). In general R(t) ≥ R∗(t) ∀t. While R traverses S, a delay flow of rate d(t) = δ(R, R∗)(t) is generated by S. Figure 3 contains the basic system model. Arrivals (R(t)) enter the system from the left and leave to the right after they have been served (R∗(t)). The emitted delay flow is represented by delay function D(t). Definition 14 (Service Provider with SLA Delay Property). Let {αU, Ψ U} be an SLA Delay Property. A service provider conforms to {αU, Ψ U} if D ≤ D ⊗ Ψ U under the condition R ≤ R ⊗ αU. Ψ U is a shaping curve for D. The delay flow is bounded similarly to a maximal f-source in [9], but without any input traffic or delay. The precondition R ≤ R ⊗ αU is the user part of an SLA contract; it limits the request arrivals. An assignment of an SLA Delay Property {αU, Ψ U} to a service is depicted in figure 3. 4.4
Service Provider Structure
The system structure in SOA is configured by the ordering of tasks that have to be processed. In Web services, a common implementation of SOA, the ordering of tasks is called a workflow. A workflow controller invokes Web services to execute workflows, a process which is called orchestration [1]. There has also been extensive work on the optimal selection of Web services for workflows [14, 28]. As service orchestration and selection are out of the scope of this paper, we consider workflows as predetermined; therefore the workflow controller is omitted in our model. The sequence of service requests in a workflow is mapped to a feed-forward network [9] of service providers. A workflow requires the invocation of service Si+1 after service Si. Then, in the SLA Calculus system model, the outgoing arrival flow Ri∗ of service Si is fed into service Si+1 as an arrival flow. Figure 4 shows a construction blueprint. A similar model is proposed in [28].
A Calculus for SLA Delay Properties
4.5
Combination of SLA Delay Properties
In the following we will reason about SLA Delay Properties for workflows composed of more than one service request. SOA systems can be designed in service hierarchies with workflows presenting themselves as single services to other services. When SLA Delay Properties for services are known, requested properties for higher hierarchy levels can be formulated. Lemma 1 (Delay Function Concatenation). Workflow W is composed sequentially from services Si, i = 1 . . . n. Each service emits a cumulated delay flow Di. The delay function of workflow W is given by

D(t) = Σ_{i=1}^n Di(t) (6)
A request arrival flow traversing the system defined by a workflow passes each service. When a service processes a request it emits a delay flow as defined in the system model for service providers (Def. 13). Per definition a delay flow is a kind of arrival flow of equal time units, thus blind multiplexing can be applied. The output of an ideal multiplexer with inputs Ai, i = 1 . . . n satisfies A(t) = Σ_{i=1}^n Ai(t) [9]. Figure 4 illustrates the principle. With the help of Lemma 1 we can formulate the concatenation of SLA Delay Properties. Theorem 1 (SLA Delay Property Concatenation). Workflow W is composed from services Si, i = 1 . . . n with associated SLA Delay Properties pi = {αiU, ΨiU}. The composed SLA Delay Property p̄ for workflow W has a request arrival curve and an upper delay curve given by

αU(t) = (α1U ⊗ · · · ⊗ αnU)(t) and Ψ U(t) = Σ_{i=1}^n ΨiU(t) (7)
Proof. R is the request flow processed by workflow W. Each request arrival curve in an SLA Delay Property acts as a constraint on R (Def. 11). In [9] this constraint is known as a maximal f-regulator with f as limiting function. It limits R traversing the regulator to B1 = R ⊗ f1. When B1 is fed into a second f2-regulator the arrival flow is limited to B = B1 ⊗ f2 = (R ⊗ f1) ⊗ f2 = R ⊗ (f1 ⊗ f2) using associativity of ⊗. Repeating the concatenation for f1 . . . fn leads to Bn = R ⊗ (f1 ⊗ · · · ⊗ fn). Setting αiU = fi completes the first part of the proof. Delay curves are an upper bound for the delay flow emitted by each service (Def. 10). The second part of the theorem follows from Di ≤ Di ⊗ ΨiU and Lemma 1.
This brings up the following: Corollary 1. Let W be a workflow with serial requests to services S1 . . . Sn. Workflow W is delay constrained by ΨWU if

DW ≤ DW ⊗ ΨWU with DW = Σ_{i=1}^n Di (8)
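On sampled curves, the concatenation can be sketched as follows (helper names are ours; per the proof of Theorem 1 we combine the arrival constraints by (min,+) convolution and the delay curves by pointwise summation, as in Lemma 1):

```python
# Sketch of SLA Delay Property concatenation for two serial services.
def minplus_conv(f, g):
    T = min(len(f), len(g))
    return [min(f[t - s] + g[s] for s in range(t + 1)) for t in range(T)]

def sum_curves(f, g):
    return [a + b for a, b in zip(f, g)]

T = 10
alpha1 = [0] + [1.5 * t + 10 for t in range(1, T)]   # arrival curve of service 1
alpha2 = [0] + [2.0 * t + 4 for t in range(1, T)]    # arrival curve of service 2
psi1 = [0] + [2 * t + 40 for t in range(1, T)]       # delay curve of service 1
psi2 = [0] + [1 * t + 5 for t in range(1, T)]        # delay curve of service 2

alpha_w = minplus_conv(alpha1, alpha2)   # composed request constraint
psi_w = sum_curves(psi1, psi2)           # composed delay curve
print(psi_w[5])
```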
5
Lower Service Curves for Delay Properties
When an SLA Delay Property for a single service is known or has been derived for composed workflows, a natural step is to determine a system that fulfills the property. This is a common task in capacity planning [14] or for validating whether a workflow is executable in time with an existing system. In terms of Network Calculus this equals the derivation of a service curve β L from the SLA property {αU, Ψ U}. Let (R, R∗) be a pair of arrival and departure functions of a system. Then the problem of service curve estimation [30, 31] is to find a maximal solution for a lower service curve β L with R∗(t) ≥ (R ⊗ β L)(t) ∀t ≥ 0. The lower service curve β L describes the lower boundary of the processing speed a system has to deliver to observe the SLA Delay Property. Service curves were derived from given αU and Ψ U in [6]. The method supports T-SPEC type arrival and delay curves and derives a rate-latency service curve. A flaw of the method proposed in [6] was the use of optimization to match the service rate. In the following we are going to generalize the derivation of a service curve by dropping the limitation to T-SPEC curves. We will use an analytic method and remove the need for an optimizer. Our method is based on recent results for bandwidth estimation in networks. An approach based on (min,+) deconvolution was used in [30, 31] to estimate the bandwidth in the form of service curves for network links based on measurements. Deconvolution (Def. 2) is the dual operation to (min,+) convolution. In general, deconvolution does not invert convolution [10, 31]: f ≠ (f ⊗ g) ⊘ g. For system curves with f = β L and g = αU one cannot completely reconstruct service curves. However, Liebeherr et al. [31] showed that deconvolution indeed gives sufficient estimates for lower service curves when applied to input/output functions.
Their estimation goal was to deduce an unknown lower service curve C(t) from measured cumulated arrival and departure flows R and R∗ with R∗ ≥ (R ⊗ C)(t) for all pairs of R, R∗. Deconvolution gives C̃ = R∗ ⊘ R with C̃ ≤ C and R∗(t) = (R ⊗ C̃)(t). So C̃ is a service curve that reconstructs the departure curve and can replace the unknown curve C. The bandwidth estimation in [31] is based on traces of real systems and has to assume linear and time invariant systems. As most systems have a nonlinear input/output behavior, deconvolution still has limited use and is replaced by other estimation schemes. However, service curve estimation for SLA Delay Properties is a special case where deconvolution can be applied, since the available information on systems are the SLA Delay Properties {αU, Ψ U} they conform to. They have a simpler structure than the time varying measurements used in [31]. We use αU as worst-case assumption for the input and also a worst-case assumption on the output based on αU and Ψ U. To find a service curve that fulfills an SLA Delay Property, two steps have to be performed. 1. For an SLA Delay Property {αU, Ψ U} find a maximum output function R∗ with Ψ U(t) ≥ D(αU, R∗)(t). 2. Use (min,+)-deconvolution to derive a lower service curve: β L = R∗ ⊘ αU.
When defining Ψ U with piecewise linear functions like T-SPEC functions, again some difficulties arise. A delay function is a cumulative sum and thus based on integration (see Def. 9). To get the actual delay d(t), Ψ U(t) has to be differentiated, but the intersection points of two linear pieces are not differentiable and problems arise in the derivation. We use the notation d/dt f(t) = rt(f, t) for the derivative. The following theorem introduces a valid output flow for an SLA Delay Property.

Theorem 2. Let s = {αU, Ψ U} be an SLA Delay Property where αU and Ψ U are concave. A valid output flow function B(t) with Ψ U ≥ D(αU, B) is given by

B(t) = αU(t − rt(Ψ U, t)) (9)

Proof. To prove that B(t) is an outgoing arrival flow that complies with SLA Delay Property s = {αU, Ψ U} we have to show that D(t) = ∫₀ᵗ d(αU, B)(x) dx ≤ Ψ U(t) for all t ≥ 0. We use rt(D, t) = d(t).

rt(Ψ U, t) ≥ d(t) = inf {τ ≥ 0 : αU(t) ≤ B(t + τ)} (Def. 7)
= inf {τ + t ≥ 0 : αU(t) ≤ B(t + τ)} − t
= inf {Δ ≥ 0 : αU(t) ≤ B(Δ)} − t (t + τ = Δ) (10)

This enables Δ as control variable and gives a simplified notation. Arrival function αU is wide-sense increasing, so Δ ≥ t holds. Delay curve Ψ U is concave, so the delay rate between t and Δ is constant or decreasing. A constant rate is found in piecewise-linear functions often used for approximation, too. First we consider the case of a constant delay rate rt(Ψ U, t) = rt(Ψ U, Δ):

rt(Ψ U, t) ≥ d(t) = inf {Δ ≥ 0 : αU(t) ≤ αU(Δ − rt(Ψ U, Δ))} − t
= inf {Δ ≥ 0 : αU(t) ≤ αU(Δ − rt(Ψ U, t))} − t
= t + rt(Ψ U, t) − t = rt(Ψ U, t)

Now the case when the rate decreases, i.e. rt(Ψ U, Δ) < rt(Ψ U, t). Let r∗ be the value of rt(Ψ U, Δ) with r∗ = rt(Ψ U, Δ) < rt(Ψ U, t):

rt(Ψ U, t) ≥ d(t) = inf {Δ ≥ 0 : αU(t) ≤ αU(Δ − rt(Ψ U, Δ))} − t
= inf {Δ ≥ 0 : αU(t) ≤ αU(Δ − r∗)} − t
= t + r∗ − t = r∗ < rt(Ψ U, t)
The estimated bound for the output flow B(t) is not sharp. A limitation of function B(t) is that at time t only arrival and delay rate are known. Better bounds can be computed if B(t) is estimated with the knowledge of earlier delay rates. Corresponding methods will be subject to future research.
Fig. 5. Cases 1 and 2 for theorem 2
Example 5. A service provider was asked by the university to host the literature Web service according to the SLA Delay Property. It receives the SLA Delay Property and computes the necessary service capacity by estimating an output function and applying deconvolution. The resulting lower service curve is the dashed line in figure 6.
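The two derivation steps, as exercised in Example 5, can be sketched numerically (names are ours; we assume sampled integer time, a one-sided difference quotient approximating rt(·, ·), and a finite-horizon deconvolution, so the result is an approximation, not the paper's exact curve):

```python
# Numeric sketch: step 1 builds the worst-case output B, step 2 deconvolves.
def t_spec(p, M, r, b):
    return lambda t: 0 if t <= 0 else min(p * t + M, r * t + b)

def rate(f, t, eps=1e-6):
    """Approximate rt(f, t), the (right-hand) derivative of f at t."""
    return (f(t + eps) - f(t)) / eps

def minplus_deconv(F, G):
    T = len(F)
    return [max(F[t + s] - G[s] for s in range(T - t)) for t in range(T)]

alpha_u = t_spec(4.0, 0, 1.5, 10)   # Example 4 arrival curve
psi_u = t_spec(5.0, 0, 2, 40)       # Example 4 delay curve

T = 120
B = [alpha_u(t - rate(psi_u, t)) for t in range(T)]   # step 1: worst-case output
A = [alpha_u(t) for t in range(T)]
beta_l = minplus_deconv(B, A)       # step 2: lower service curve = B deconv alpha_U
print(beta_l[0], beta_l[20])
```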
Fig. 6. Arrival curve, delay curve and derived service curve for Web service example
6
Conclusions and Future Work
In this paper we formulate a calculus based on (min,+) filtering theory to reason about quantitative delay properties in SLAs. SLA Calculus can be used for SOA
performance modeling when only SLAs for services instead of service rates are known. The concept of delay flows allows one to limit service delay with delay curves and to model delay requirements in SLAs including a quantifiable amount of flexibility. Delay curves are paired with arrival curves to form SLA Delay Properties. For building SOA models, SLA Calculus can be used to reason about SLA Delay Properties in composed workflows of multiple services. Results for serial compositions are shown. Additionally, we complete the triplet of curves by deriving service curves from SLA Delay Properties. Future research will consider parallel compositions of services and their combined SLA Delay Properties. As each parallel service is forced by delay curves to deliver some service, we expect better worst-case estimates than in queueing theory. Another interesting question is how delay curves of systems with serial composition are subject to the “pay burst only once” principle of service curves.
References 1. Peltz, C.: Web services orchestration and choreography. Computer, 46–52 (2003) 2. Hayes, B.: Cloud computing. Communications of the ACM 51, 9–11 (2008) 3. Trienekens, J.J.M., Bouman, J.J., van der Zwan, M.: Specification of Service Level Agreements: Problems, Principles and Practices. Software Quality Journal 12, 43–57 (2004) 4. Bause, F., Buchholz, P., Kriege, J., Vastag, S.: Simulation Based Validation of Quantitative Requirements in Service Oriented Architectures. In: Rossetti, M.D., Hill, R.R., Johansson, B., Dunkin, A., Ingalls, R.G. (eds.) Proceedings of the 2009 Winter Simulation Conference, pp. 1015–1026. IEEE (2009) 5. Menasce, D., Almeida, V., Dowdy, L.: Performance by design: computer capacity planning by example. Prentice Hall (2004) 6. Vastag, S.: Modeling quantitative requirements in SLAs with Network Calculus. In: Proceedings of the 5th International ICST Conference on Performance Evaluation Methodologies and Tools (ValueTools), ENS, Cachan, France, ICST (2011) 7. Cruz, R.: A calculus for network delay, part I: Network elements in isolation. IEEE Transactions on Information Theory 37, 114–131 (1991) 8. Cruz, R.: A calculus for network delay, part II: Network analysis. IEEE Transactions on Information Theory 37, 132–141 (1991) 9. Chang, C.: Performance guarantees in communication networks. European Transactions on Telecommunications 12, 357–358 (2001) 10. Le Boudec, J.Y., Thiran, P.: Network Calculus - A Theory of Deterministic Queuing Systems for the Internet. LNCS, vol. 2050. Springer, Heidelberg (2004) 11. Sarjoughian, H., Kim, S., Ramaswamy, M., Yau, S.: A simulation framework for service-oriented computing systems. In: Mason, S.J., Hill, R.R., Mönch, L., Rose, O., Jefferson, T., Fowler, J.W. (eds.) Proceedings of the 2008 Winter Simulation Conference, pp. 845–853. IEEE (2008) 12. Vastag, S.: ProC/B for Networks: Integrated INET Models. In: Müller-Clostermann, B., Echtle, K., Rathgeb, E.P. (eds.) MMB&DFT 2010. LNCS, vol.
5987, pp. 315–318. Springer, Heidelberg (2010) 13. Urgaonkar, B., Pacifici, G., Shenoy, P., Spreitzer, M., Tantawi, A.: An analytical model for multi-tier internet services and its applications. ACM SIGMETRICS Performance Evaluation Review 33, 291–302 (2005)
90
S. Vastag
14. Eckert, J., Schulte, S., Repp, N., Berbner, R., Steinmetz, R.: Queuing-based capacity planning approach for Web service workflows using optimization algorithms. In: Digital Ecosystems and Technologies, DEST 2008, pp. 313–318. IEEE (2008) 15. Baccelli, F., Cohen, G., Olsder, G., Quadrat, J.: Synchronization and Linearity. Wiley, New York (1992) 16. Altman, E., Avrachenkov, K., Barakat, C.: TCP network calculus: The case of large delay-bandwidth product. In: INFOCOM 2002, vol. 1, pp. 417–426. IEEE (2002) 17. Schmitt, J., Zdarsky, F.: The DISCO network calculator: a toolbox for worst case analysis. In: 1st International Conference on Performance Evaluation Methodologies and Tools, p. 8. ACM (2006) 18. Undheim, A., Jiang, Y., Emstad, P.: Network Calculus approach to router modeling with external measurements. In: Communications and Networking in China, CHINACOM 2007, pp. 276–280. IEEE (2007) 19. Fidler, M., Recker, S.: Conjugate Network Calculus: A dual approach applying the Legendre transform. Computer Networks 50, 1026–1039 (2006) 20. Xie, J., Jiang, Y.: A Temporal Network Calculus Approach to Service Guarantee Analysis of Stochastic Networks. In: Proceedings of the 5th International ICST Conference on Performance Evaluation Methodologies and Tools (ValueTools), ENS, Cachan, France (2011) 21. Schmitt, J.B., Roedig, U.: Sensor Network Calculus – A Framework for Worst Case Analysis. In: Prasanna, V.K., Iyengar, S.S., Spirakis, P.G., Welsh, M. (eds.) DCOSS 2005. LNCS, vol. 3560, pp. 141–154. Springer, Heidelberg (2005) 22. Touseau, L., Donsez, D., Rudametkin, W.: Towards a SLA-based approach to handle service disruptions. In: Services Computing, SCC 2008, vol. 1, pp. 415–422. IEEE (2008) 23. Bouillard, A., Gaujal, B., Lagrange, S., Thierry, É.: Optimal routing for end-to-end guarantees using Network Calculus. Performance Evaluation 65, 883–906 (2008) 24. Thiele, L., Chakraborty, S., Naedele, M.: Real-time calculus for scheduling hard real-time systems.
In: ISCAS 2000, vol. 4 (2000) 25. Chakraborty, S., Künzli, S., Thiele, L.: A general framework for analysing system properties in platform-based embedded system designs. In: Proc. 6th Design, Automation and Test in Europe (DATE), pp. 190–195 (2003) 26. Jiang, Y.: Network Calculus and Queueing Theory: Two sides of one coin. ICST ValueTools (2009) 27. Jiang, Y., Liu, Y.: Stochastic Network Calculus. Springer-Verlag New York Inc. (2008) 28. Eckert, J., Pandit, K., Repp, N., Berbner, R., Steinmetz, R.: Worst-case performance analysis of Web service workflows. In: Proceedings of the 9th International Conference on Information Integration and Web-based Applications & Services (2007) 29. Molina-Jiménez, C., Pruyne, J., van Moorsel, A.: The Role of Agreements in IT Management Software. In: de Lemos, R., Gacek, C., Romanovsky, A. (eds.) Architecting Dependable Systems III. LNCS, vol. 3549, pp. 36–58. Springer, Heidelberg (2005) 30. Liebeherr, J., Fidler, M., Valaee, S.: A min-plus system interpretation of bandwidth estimation. In: 26th IEEE International Conference on Computer Communications, INFOCOM 2007, pp. 1127–1135. IEEE (2007) 31. Liebeherr, J., Fidler, M., Valaee, S.: A system-theoretic approach to bandwidth estimation. IEEE/ACM Transactions on Networking 18, 1040–1053 (2010)
Verifying Worst Case Delays in Controller Area Network*

Nikola Ivkovic¹, Dario Kresic¹, Kai-Steffen Hielscher², and Reinhard German²

¹ Department of Information Technologies and Computing, Faculty of Organization and Informatics, University of Zagreb, Pavlinska 2, HR-42000 Varaždin, Croatia
{nikola.ivkovic,dario.kresic}@foi.hr
² Department of Computer Networks and Communication Systems, University of Erlangen-Nuremberg, Martensstraße 3, D-91058 Erlangen, Germany
{ksjh,german}@informatik.uni-erlangen.de
Abstract. The Controller Area Network (CAN) protocol was developed to fulfill the high availability and timing demands of modern cars, but today it is also used in many other mission-critical applications with hard real-time requirements. We present a compact model of the CAN bus specified as a timed automaton and prove its applicability for estimating the worst case delays which are crucial for hard real-time systems. Using our model we detected flaws in previous approaches to determining the worst case delays in CAN systems.

Keywords: Controller area network, CAN, real-time system, medium access, model checking, timed automata, worst case delay, latency.
1   Introduction
Modern cars are equipped with many collaborating Electronic Control Units (ECUs) which are used for various tasks such as the braking system, infotainment, occupant protection, etc. The number of electronic subsystems in current vehicles keeps increasing, as most innovation in the automotive industry is achieved by adding new electronic functions and devices. Innovation happens in various fields of application such as entertainment, wireless connectivity, or active and passive occupant protection. Most of these ECUs need data from sensors connected to other ECUs. Therefore, communication plays a significant role in modern vehicles. For this purpose, a number of automotive communication buses are utilized, ranging from LIN (Local Interconnect Network) [1] buses for connecting simple sensors and actuators, over MOST (Media Oriented Systems Transport) [2] for infotainment applications with
* Supported by the German Research Council as part of the project “Verification of Real-Time Warranties in CAN”.
J.B. Schmitt (Ed.): MMB & DFT 2012, LNCS 7201, pp. 91–105, 2012. © Springer-Verlag Berlin Heidelberg 2012
audio and video, to FlexRay [3] with deterministic TDMA. With the properties described in detail in Section 3, CAN (Controller Area Network) [4] offers sustainable performance to fulfill common communication demands in the automotive industry, mainly regarding the available data rate and collision avoidance at media access. The majority of car manufacturers employ CAN-based data buses that often coexist with other bus systems. Each of them can offer specific advantages in costs, data rate, media access and multiplexing schemes, or other, mostly hardware-related peculiarities. Analyzing data traffic is necessary in order to support the design of a reliable and robust communication system. The objective of this paper is to show the applicability of timed automata and model checking techniques for a quantitative evaluation of automotive communication systems. In such systems, mean values of performance measures, as obtained from stochastic modeling approaches like queuing theory, are of minor interest, as they are not sufficient to predict whether hard real-time deadlines are met in every case. Reliable operation and the avoidance of system malfunctions with fatal effects generally depend on the worst case performance of the communication infrastructure. Model checking of timed systems yields upper bounds on data transmission delays, which are indispensable for assessing the reliability of the system in all possible scenarios of operation, since timeliness itself is an essential aspect of hard real-time systems. This paper is organized as follows. Section 2 presents previous approaches to analyzing the timing aspects of CAN communication; Section 3 briefly presents the main features of CAN. In Section 4 we introduce a CAN model based on timed automata which allows automatic verification of hard timing bounds. In Section 5, worst case delays are found, verified and analyzed for the studied systems. Section 6 summarizes the results and gives some final conclusions.
2   Related Work
Timing aspects of CAN have been the object of several studies concerning reliable data transfer in automotive applications. Response Time Analysis was used in [5] to estimate whether the deadlines of tasks are met by a given schedule. In [6] and [7], this technique was extended with an improved priority scheduling policy and automatic assignment of task and message periods. Paper [8] investigates CAN using a system representation as a composition of timed automata. In comparison to [8], our model is very compact and uses fewer states, since it is situated at a higher level of abstraction. It is also easier to understand and to analyze. In [9] a tool for the exploration and optimization of CAN bus designs is presented. For an optimized assignment of tasks as well as message cycles, [10] applies the Earliest Deadline First algorithm in conjunction with the CAN protocol. The drawback of all these proposals is that a static a priori schedule for all tasks and messages is required to analyze real-time properties and to optimize traffic with respect to timing demands. In fact, due to the asynchronous wake-up of the controllers along the CAN bus, such a global schedule can never be anticipated. To overcome this, the verification procedure that we propose requires
only statically assigned CAN identifiers and known cycle times at which each message will be sent. We do not need to know any global bus-wide schedule of traffic in order to obtain upper bound delays for each priority class. Network calculus was used in [11] to obtain delay bounds in CAN. An advantage of our approach is that it follows the CAN protocol more closely and allows insight into the internal behavior of CAN, in contrast to black-box methods.
3   A Brief Description of CAN
The increasing number of interconnected electronic devices used in modern cars motivated Robert Bosch GmbH in the 1980s to develop a protocol that suits the specific needs of automotive applications. The result was the CAN bus, which later spread outside the automobile industry because of its characteristics suitable for safety-critical and real-time systems. The CAN bus is now standardized as ISO 11898 [4]. It uses a differential serial line architecture with dominant and recessive bits. A CAN bus system consists of a number of nodes, also called ECUs. Every node has a CAN controller connected through a transceiver to the CAN bus, as shown in Fig. 1. The CAN controller is responsible for sending messages, generated in the upper layers by a microcontroller, over the CAN bus to the other CAN controllers, and for delivering received messages to its own microcontroller.
Fig. 1. ECU nodes interconnected with the CAN bus
A recessive bit represents a logical “1” and a dominant bit a logical “0”. When the bus is idle it has the recessive level. If one controller on the bus sends a dominant bit while another controller sends a recessive bit, the resulting value on the bus will be dominant. This feature implies a bit-wise arbitration scheme for media access, often called CSMA/BA (Carrier Sense Multiple Access with Collision Avoidance by Bitwise Arbitration). While sending data, each station listens to the bus, i.e. it measures its voltage level. If a collision occurs in the arbitration process, where one station tries to send a recessive bit but receives a dominant bit, that station notices that another station is sending simultaneously and stops its own transmission immediately. This makes the arbitration non-destructive, since the controller sending the dominant bit continues its transmission without any negative effects on the bus, while
the station sending the recessive bit remains silent from the moment the collision occurred. A retransmission of the interrupted frame is triggered automatically once the bus is idle and thus recessive again. Before sending, each controller stays in the carrier sense phase and listens to the bus. A controller can start sending if the bus is free (recessive), but only after at least 3 bit times (a bit time, also denoted b.t., is the time needed to transmit 1 bit on the bus) of the interframe space have passed since the last frame. This means that a pending controller that lost a previous arbitration will start sending its frame as soon as the interframe space expires. The frame starts with the start bit (SOF), which is always dominant and is used by all controllers for hard synchronization. Immediately afterwards follows the message identifier, transmitted from the most significant to the least significant bit. The structure of a standard CAN frame is shown in Fig. 2. CAN controllers do not have explicit addresses; instead, unique message identifiers are used to describe the content of the respective CAN frame. The frames are broadcast on the bus, and each station can decide whether the message content is relevant by examining the message identifier of the received frame. These identifiers are allocated statically during the system design phase to avoid ambiguity in the interpretation of the frame content. This requires that a system designer has a global view of the complete system during development. Two variants of CAN frames exist: standard frames with 11 bit message identifiers and extended frames with 29 bit identifiers. Both can coexist on the same bus. The payload can be of variable size, from 0 up to 64 bits.
Fig. 2. Standard CAN data frame
The media access scheme described above creates an implicit hierarchy of priorities based on the message identifier values. If several senders start simultaneously, the sender transmitting the higher message identifier is, due to its binary encoding, the first to send a recessive 1 bit while transmitting the identifier. This bit is overwritten by a dominant 0 bit. A controller listens to the bus at the same time as it sends its bits, and if a collision is detected (a dominant bus while it is sending a recessive bit) the controller stops sending. This arbitration process continues until only one sender remains after the arbitration fields have been sent. This is always the one with the lowest message identifier and thus with the highest priority [12]. NRZ (Non-Return-to-Zero) line encoding is used for the bits, and the sender inserts a complementary stuff bit when no change in the logic level has occurred over five successive bits (including stuff bits). These stuff bits are removed automatically by the receiver of the message.
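The bitwise arbitration described above can be illustrated with a short sketch: the bus behaves as a wired-AND of all transmitted bits, and a station backs off as soon as it reads back a dominant bit while sending a recessive one. The identifiers and the helper name below are illustrative, not taken from any implementation.

```python
def arbitrate(identifiers):
    """Return the winner of CSMA/BA arbitration among standard 11-bit
    message identifiers: the lowest identifier (highest priority) wins."""
    contenders = list(identifiers)
    for bit in range(10, -1, -1):            # identifier is sent MSB first
        # Wired-AND: the bus is dominant (0) if any contender sends a 0
        bus = 0 if any((i >> bit) & 1 == 0 for i in contenders) else 1
        # Stations whose recessive 1 was overwritten by a dominant 0 back off
        contenders = [i for i in contenders if (i >> bit) & 1 == bus]
    assert len(contenders) == 1              # identifiers are unique, so one remains
    return contenders[0]

print(hex(arbitrate([0x123, 0x122, 0x300])))   # 0x122, the lowest identifier
```

Because the losing stations merely fall silent, the winner's frame is transmitted undisturbed, which is exactly the non-destructive property exploited by the priority scheme.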
4   Timed Automata and Verification
Timed automata, as introduced in [13], enable modeling the real-time requirements of systems. In general, a timed automaton consists of a set of locations, which denote states of a system, and a set of edges between locations, which denote transitions. Locations as well as transitions may be tagged with (real-valued) constraints on clocks; such constraints associated with locations are called invariants – time can elapse in a location only as long as its invariant is valid. Constraints associated with transitions are called guards, or preconditions. Furthermore, transitions can also be labeled with postconditions, which are automatically executed when a transition is taken. A transition can be taken only if the current values of the clocks satisfy its guard. A state of a timed automaton comprises a location and an assignment of values to the clock variables. Timed automata can capture important aspects of control and real-time systems such as (real-time) liveness and fairness (both qualitative features) as well as timing delays, bounded response, etc. (examples of quantitative features). In order to perform a formal analysis of a timed automata based model, automatic verification (also known as model checking) can be used. Basically, the verification problem for state-based, infinite systems can be described as the language inclusion problem

L(A) ⊆ L(B),                                    (1)

where L(A) is (the language of) the system model and L(B) is (the language of) the specification, i.e. the property to be proven. This inclusion is often transformed into the so-called emptiness problem, i.e.

L(A) ∩ L(B)^C = ∅,                              (2)

where L(B)^C denotes the complement of the language L(B). If this intersection is empty, there is no word which is accepted by A but not accepted by B (if the intersection is not empty, then every word from the intersection set is a counterexample). Constructing the intersection of the automata A and B^C (where the latter automaton often represents a logic formula specifying the given property) is generally denoted as model checking.

4.1   CAN Arbitration Model
In this section, we present a timed automata based model of the CAN protocol. This model was used to determine upper delay bounds for message delivery with model checking techniques. Our CAN model represents a system of nodes attached to transceivers which are interconnected with the CAN bus. This is modeled as an array of timed automata, i.e. with several instances of the automaton. Using one compact model per node has an advantage over a composition of separate automata, as it shows more clearly what happens when state transitions are performed. Moreover, a more compact automaton promises more efficient verification, since it has fewer states and variables. Our model, as shown in Fig. 3, was implemented with
UPPAAL [14], which allows modeling, simulation and verification of real-time systems specified as (networks of) timed automata, possibly extended with data types (bounded integers, arrays, etc.). Different instances of this timed automaton (guards are shown in green, variable resets in blue, synchronization labels in light blue, and invariants in violet) can communicate with each other through the global variables R (representing the highest adapter priority) and bus (representing the CAN bus state – idle or busy), and by exchanging the synchronization label free over a broadcast communication channel. An instance of the automaton represents one node: it models the cyclic message generation done by the application process, but most of the model behavior is concerned with the CAN protocol and the CAN controller.
Fig. 3. The Controller Area Network model implemented with UPPAAL
Every instance of the timed automaton has two local clocks. The clock g is used to determine the time events at which a new message is generated and given to the controller for sending. Moreover, the clock g is used to measure the time delay between a request to send the message and the moment when the message is received by the other transceivers. The clock t is used to model the bus access control of the CAN controller. Our model assumes that a node waits for some undetermined amount of time before it starts to operate normally, i.e. before it generates and sends messages in predefined cycles. This aspect is modeled with the off state, from which the timed automaton is allowed to make a transition to the idle state without any time constraints. On executing this transition, the clock g is set to 0 in order to start the time measurement of the node's own cycle.
The automaton stays in the idle state waiting for its cycle time to expire; then it changes to the wait or to the start_arbit state, depending on the state of the bus variable. The bus variable represents the CAN bus state, which can be free for sending (encoded by “1”) or busy (encoded by “0”). The automaton remains in the wait state until it receives the free label, sent by another instance after that instance completes its frame transmission and the minimum interframe space T_ifs elapses. As soon as the automaton enters the start_arbit state it starts the arbitration process. If its priority number p is smaller than the number currently stored in the global variable R, the automaton writes its priority number into R, sets the bus variable to 0 and goes to the finish_arbit state. Otherwise, the automaton goes to the wait state. Initially the variable R stores a number that is bigger than the one associated with the lowest priority in the system. On the transition from the start_arbit to the finish_arbit state, the global variable bus is set to 0. This is done to prevent the other instances, which might miss the start of the arbitration, from subsequently engaging in the arbitration process. (This could also be done on the transition to the wait state, but it is redundant, since at least one instance will do the same on the transition to the finish_arbit state.) When the arbitration time T_arb expires, the arbitration is finished and the priority of the automaton with the lowest value is stored in the R variable. Automata whose priority differs from the value currently stored in the R variable go back to the wait state. The automaton which won the arbitration fulfills the condition that its priority is the one stored in the R variable, so it makes a transition to the sending state. By making this transition, the automaton resets the R variable to its initial value, i.e.
lower than the lowest priority used in the system. Thus, the variable R is prepared for a future arbitration process. The automaton remains in the sending state until the value of the clock t satisfies the guard T_min ≤ t ≤ T_max, i.e. until a complete frame is sent and the automaton goes to the finish state. On the transition to the finish state the clock t is reset to 0 so that the interframe space time can be measured. When the interframe space time T_ifs expires, the automaton goes from the finish to the idle state. On the transition to the idle state the variable bus is set to 1, meaning that the bus is now free, and the label free is sent to all automata in the wait state, which causes them to change to the start_arbit state. It is easy to notice that on the transition between the sending and finish states an automaton only waits for the clock t to satisfy T_min ≤ t ≤ T_max, and afterwards, from the finish to the idle state, it waits an additional T_ifs. An equivalent automaton can be produced if the finish state is omitted and the transition to the idle state is executed when T_min + T_ifs ≤ t ≤ T_max + T_ifs. That is, instead of sending a data frame, an automaton can send an “augmented virtual frame”, i.e. an ordinary data frame extended by the following minimal interframe space. The transition to the idle state sets the bus variable to 1 and the free label is broadcast. For the equivalent automaton with the omitted finish state the verification procedure can be done more efficiently, but for clarity reasons the finish state is included in the model.
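The shared-variable arbitration of the model can be paraphrased in a few lines. This is only a sketch mirroring the role of R in the automaton; the constant N and the set of competing nodes are illustrative.

```python
N = 5              # nodes with priorities 0..N-1; 0 is the highest priority
R = N              # initial value of R: beyond the lowest priority in use

competing = [1, 3, 4]        # nodes taking the start_arbit -> finish_arbit edge
for p in competing:
    if p < R:                # guard of the transition to finish_arbit
        R = p                # the node overwrites R with its own priority

# When T_arb expires, exactly the node whose priority equals R has won
winner = next(p for p in competing if p == R)
print(winner)      # 1, the highest priority among the competitors
R = N              # the winner resets R on its transition to sending
```

Note that the result does not depend on the order in which the competing nodes take their transitions, which is why the model needs no global schedule.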
4.2   Qualitative Properties
Based on the described CAN model we identified several properties which have to be fulfilled. These properties are specified in the UPPAAL version of Computation Tree Logic (CTL). Apart from the usual deadlock-freedom proof (UPPAAL expression: A[] not deadlock) we also automatically proved the following properties (for more details see [15]):

1) “At any point in time, only one node (after winning the arbitration!) may send a frame”. This safety property can be specified in UPPAAL as follows:

A[] forall (i:id_t) forall (j:id_t) P(i).sending && P(j).sending imply i == j
2) “When a node is sending a frame, the bus has to be busy (i.e. bus=0)”. This is another safety property, which can be expressed as:

A[] forall (i:id_t) P(i).sending imply bus==0
3) “It is possible that several nodes compete for the bus access right”. This reachability property is specified as follows:

E<> exists (i:id_t) exists (j:id_t) exists (k:id_t) P(i).start_arbit && P(j).start_arbit && P(k).start_arbit && i != j && j != k && i != k
(for clarity reasons we assumed here that three nodes are competing, but this CTL formula can easily be extended to any number of nodes)

4) “The highest-priority node will always win the arbitration and start sending”. This is also a liveness property, which can be specified as follows:

P(0).finish_arbit --> P(0).sending
If we set “1” (or any other non-zero value) instead of “0”, this property will, as expected, not always be valid (because any node other than the highest-priority one can lose its arbitration process).

4.3   Time Constants
As the time unit in our CAN model we use the bit time, since it does not depend on the transmission rate and makes it easy to calculate the times needed for frame transmissions. Every automaton i has its own cycle[i] time that is statically defined during the system development phase. The time needed to transmit the minimum sized frame is easy to determine, as it occurs when no data is sent in the data field and the bit pattern is one that doesn't
need the bit stuffing technique. The minimum sized frame with the standard 11 bit identifier is 47 bits, therefore the respective time is T_min = 47 b.t. A maximum sized frame is one that carries the maximum (8 byte) data field and whose bit pattern causes the insertion of the maximum number of stuff bits. If five consecutive bits with the same value (including stuff bits) are encountered, then the opposite bit is inserted into the frame [4]. Bit stuffing is used only for the first part of the frame; it is not used in the remaining part of the frame after the CRC field. An unstuffed frame with an 11 bit identifier may have up to 98 bits in the stuffable part of the frame. The upper bound on the frame length is achieved if the first five bits are equal and are then followed by alternating four-bit patterns of ones and zeroes, as shown in Fig. 4. This way 1 + ⌊(98 − 5) / 4⌋ = 24 additional stuff bits are inserted. With the remaining bits, the upper bound for a CAN frame is 98 + 24 + 10 = 132 bits, therefore T_max = 132 b.t. The actual time needed to finish the arbitration process depends on the length of the arbitration field (i.e. the message identifier) of all messages competing for bus access, and on whether the winning identifier has a bit pattern that is stuffed or not. Together with the SOF bit, the arbitration field can have between 12 and 14 bits. In our implementation T_arb is set to 14 b.t. Since the minimum and maximum arbitration fields are already included in T_min and T_max, setting T_arb to 12, 13 or 14 b.t. does not affect the delay time of the automaton.

Unstuffed frame (first 98 bits):
00000111100001111000011110000111100001111000011110000111100001
Stuffed frame (first 98+24=122 bits) 00000i1111o0000i1111o0000i1111o0000i1111o0000i1111o0000i1111o 0000i1111o0000i1111o0000i1111o0000i1111o0000i1111o0000i1111o1
Legend: o – stuffed zero, i – stuffed one

Fig. 4. Example of a bit pattern that gives an upper bound for the frame size
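The stuff-bit arithmetic above can be reproduced with a small sketch of the stuffing rule; the function name is illustrative, and stuff bits are counted toward subsequent runs, as the standard requires.

```python
def stuff_bits(bits: str) -> str:
    """Insert a complementary stuff bit after every five equal consecutive
    bits; inserted stuff bits count toward the following runs."""
    out = []
    for b in bits:
        out.append(b)
        if len(out) >= 5 and len(set(out[-5:])) == 1:
            out.append('1' if out[-1] == '0' else '0')
    return ''.join(out)

# Worst-case stuffable part of a standard frame (98 bits): five equal bits,
# then alternating four-bit blocks, as in Fig. 4.
pattern = '00000'
while len(pattern) < 98:
    pattern += '1111' if pattern[-1] == '0' else '0000'
pattern = pattern[:98]

stuffed = stuff_bits(pattern)
print(len(stuffed) - len(pattern))   # 24 stuff bits: 1 + (98 - 5) // 4
print(len(stuffed) + 10)             # 132 = T_max, with the 10 unstuffable bits
```

Each inserted stuff bit completes, together with the next four-bit block, a new run of five, which is why one stuff bit is generated per block after the initial run.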
5   Verification of Worst Case Delays
Based on our CAN model presented in Section 4, we were able to find exact upper bounds for message delivery delays, regardless of the concrete priority of the given message. For this purpose UPPAAL provides a scheme for computing upper time bounds: A[](b ==> z<=t).
Here b is a Boolean variable (initially set to false), z is a local clock, and t is the upper time bound to be proved. When the state property of interest is fulfilled, b is set to
true and the clock z is reset. This method requires one additional local clock per adapter dedicated to this purpose. We used a slightly different approach, exploiting the local clock that is already used for message generation (the clock g in our model); this reduces the number of required local clocks. The upper bound is then determined by verifying properties expressed as CTL formulas. On leaving the idle state, an automaton sets the clock g to 0 and starts the delay measurement, which should stop before the beginning of the minimal interframe space in the finish state. This means that in all states except the off, idle and finish states, the clock g should be less than or equal to the worst case delay. For example,

A[] not (P(i).off || P(i).idle || P(i).finish) imply P(i).g<=Time_constant

verifies that the automaton with priority i will always deliver its message within Time_constant b.t. or less. For example, verifying the properties:

A[] not (P(0).off || P(0).idle || P(0).finish) imply P(0).g<=267
A[] not (P(1).off || P(1).idle || P(1).finish) imply P(1).g<=402
A[] not (P(2).off || P(2).idle || P(2).finish) imply P(2).g<=537
A[] not (P(3).off || P(3).idle || P(3).finish) imply P(3).g<=672
A[] not (P(4).off || P(4).idle || P(4).finish) imply P(4).g<=672

shows that they are satisfied, while verifying the properties:

A[] not (P(0).off || P(0).idle || P(0).finish) imply P(0).g<=266
A[] not (P(1).off || P(1).idle || P(1).finish) imply P(1).g<=401
A[] not (P(2).off || P(2).idle || P(2).finish) imply P(2).g<=536
A[] not (P(3).off || P(3).idle || P(3).finish) imply P(3).g<=671
A[] not (P(4).off || P(4).idle || P(4).finish) imply P(4).g<=671

shows that they are not satisfied, so we determine the tight bounds (found by bisection) for the worst case delays as 267, 402, 537, 672 and 672 b.t. for the priorities 0, 1, 2, 3 and 4, respectively.

5.1   Results
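The bisection used to tighten the bounds can be sketched as follows; `verify_bound` is a hypothetical stand-in for invoking the UPPAAL verifier with the query above, and the search assumes the natural monotonicity of the bound.

```python
def tightest_bound(verify_bound, lo=0, hi=10**6):
    """Smallest t for which the delay query holds, assuming monotonicity:
    if bound t is satisfied, every larger bound is satisfied too."""
    while lo < hi:
        mid = (lo + hi) // 2
        if verify_bound(mid):
            hi = mid            # t works; try smaller bounds
        else:
            lo = mid + 1        # t fails; the tight bound is larger
    return lo

# Stand-in oracle for priority 0 of the 5-node system (true bound: 267 b.t.)
print(tightest_bound(lambda t: t >= 267))   # 267
```

With roughly 20 verifier runs this pinpoints any bound below one million bit times, which is why checking both t and t-1, as done above, certifies tightness.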
We determined and verified worst case delays for the studied CAN systems and analyzed the acquired results. In order to allow a comparison of the results, the cycle times were taken from [11] for the system with 5 nodes. With the same cycle times, the determination of upper bounds for delays was also conducted for systems with a smaller number of nodes. Important parameters of the investigated systems and the verified
quantitative properties are summarized in Table 1. All bounds given in Table 1 are tight, meaning, for example, that for a system with 4 sending nodes and priority 1 the verification results imply that a message is guaranteed to be sent within 402 b.t., but in the worst case it cannot be sent in less than 402 b.t. Besides bit times, Table 1 also lists times in milliseconds (assuming a bus with a 500 kbps data rate), as was done in [11], to make the results easier to compare. In the case when only one node in the system sends messages, but many other nodes can be attached to the CAN bus, the maximum delay is 132 b.t., which is equal to T_max. This result is expected, since the only sending node can send as soon as it is ready. For a system with two sending nodes, both nodes, with priorities 0 and 1, have the same worst case delay, which is equal to 2·T_max + T_ifs. Although the higher priority will win all access contentions with the lower priority, a lower-priority node that is already sending cannot be preempted. For systems with 3, 4 or 5 sending nodes the maximum delay generally grows with the priority number, but the lowest priority always has the same maximum delay as the preceding one. This is, of course, true for the provided cycles. In systems with the original cycles (as given in the third column of Table 1) the network utilization is rather low. The maximal utilization (for maximal sized frames) is 0.5%, 3.2%, 3.4%, 4.8% and 5.6% for systems with 1, 2, 3, 4 and 5 sending nodes, respectively. This could mislead to the conclusion that worst case delays can be guaranteed only for very low network utilizations. To investigate what happens when the network utilization is higher, we have determined (by bisection) the minimum cycles for which the same worst case delays are guaranteed.

Table 1. Verified quantitative CAN properties with cycles for systems with 1-5 nodes

Number    Priority  Cycle    Cycle  Maximum       Maximum     Min. cycle  Min. cycle
of nodes            [b.t.]   [ms]   delay [b.t.]  delay [ms]  [b.t.]      [ms]
1         0         25000    50     132           0.264       135         0.270
2         0         25000    50     267           0.534       270         0.540
          1          5000    10     267           0.534       270         0.540
3         0         25000    50     267           0.534       405         0.810
          1          5000    10     402           0.804       405         0.810
          2         50000   100     402           0.804       405         0.810
4         0         25000    50     267           0.534       540         1.080
          1          5000    10     402           0.804       540         1.080
          2         50000   100     537           1.074       540         1.080
          3         10000    20     537           1.074       540         1.080
5         0         25000    50     267           0.534       675         1.350
          1          5000    10     402           0.804       675         1.350
          2         50000   100     537           1.074       675         1.350
          3         10000    20     672           1.344       675         1.350
          4         15000    30     672           1.344       675         1.350
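The rightmost columns of Table 1 follow a simple pattern that can be cross-checked in a few lines; the delay values below are copied from Table 1.

```python
T_max, T_ifs = 132, 3                      # bit times, from Section 4.3

# Largest worst case delay per system size, from Table 1
worst_delay = {1: 132, 2: 267, 3: 402, 4: 537, 5: 672}

# Minimum cycle preserving those delays: worst case delay plus T_ifs
min_cycle = {n: d + T_ifs for n, d in worst_delay.items()}
print(min_cycle)                           # {1: 135, 2: 270, 3: 405, 4: 540, 5: 675}

# Theoretical maximal utilization: maximal frames back to back,
# separated only by the minimal interframe space
print(round(T_max / (T_max + T_ifs), 4))   # 0.9778
```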
These verified minimum cycles are listed in the two rightmost columns of Table 1. If any node has a smaller cycle than the minimum cycle given in the rightmost columns, then the given maximum delays can no longer be guaranteed. In all cases, the minimum cycle for which the verified worst case delays of the original systems can still be guaranteed is equal to the maximum delay in the system increased by exactly T_ifs. It is interesting to note that the maximal network utilization for all systems (1-5 nodes) that use the minimal cycles is 97.78%. This is also the theoretical maximal utilization of the network, since maximum sized frames are transmitted on the bus all the time, separated only by the minimal interframe space (132 b.t. / 135 b.t. = 0.9778) required by the CAN standard. When the minimum cycle times from Table 1 are not respected, the worst case delays can grow and more complex transmission sequences can occur. One system with such cycle values was studied, and the results are presented in Table 2. The network utilization for this system is 95.0%. In this case, for the two highest priority messages the worst case delays remain the same as in the 4-node system of Table 1, but for the two lower priorities the worst case delays are larger. One example of a transmission sequence that causes the worst case delay for the node with priority 2 is the following. (Although this is a rather complex example, it is the shortest possible sequence of time events that causes the worst case delay; it was generated by UPPAAL.) The node 3 starts sending a frame and enters the arbitration phase alone. Therefore, it wins the arbitration and can continue to send the frame. The nodes 1 and 2 were ready to start sending just a moment after the node 3, and have to wait for the bus to become free again. While the node 3 is sending its frame, the node 0 also becomes ready to send and waits for the bus to become free.
When the node 3 is done and the bus is free, the nodes 0, 1 and 2 start a new arbitration. Of course, the node 0 wins, and the nodes 1 and 2 wait for the node 0 to finish transmitting before starting the next arbitration. In the next arbitration, the nodes 1 and 2 compete, so the node 1 wins and the node 2 is the only node waiting for the next arbitration. But since the node 0 has a very small cycle time, before the bus is free again the node 0 is also waiting for the next arbitration. So, in the next arbitration the nodes 0 and 2 compete. Consequently, the node 0 wins and the node 2 waits for the next arbitration. Before the bus is free again, the node 1, with its very small cycle time, is again waiting for the next arbitration. As soon as the node 0 finishes and the bus is free, the nodes 1 and 2 compete for the bus again. The node 1 wins and the node 2 waits for the next arbitration. In the next arbitration the node 0 is ready again, so it wins the arbitration over the node 2. Finally, when the node 0 finishes, the bus is free and the node 2 is the only node waiting for the bus, so it starts sending its frame. To summarize, before the node 2 could finally send its frame, the following sequence of transmissions occurred: 3, 0, 1, 0, 1 and 0. This sequence of transmissions caused the delay of 6·(T_max + T_ifs) + T_max = 942 b.t.
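The delay of this sequence can be checked arithmetically with the constants from Section 4.3:

```python
T_max, T_ifs = 132, 3        # bit times, from Section 4.3

# Node 2 waits out six complete transmissions (by nodes 3, 0, 1, 0, 1, 0),
# each followed by the minimal interframe space, then sends its own frame.
blocking_senders = [3, 0, 1, 0, 1, 0]
delay = len(blocking_senders) * (T_max + T_ifs) + T_max
print(delay)                 # 942 b.t., as verified with UPPAAL
```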
Table 2. A system with very low cycle times

Priority   Cycle [b.t.]   Cycle [ms]   Maximum delay [b.t.]   Maximum delay [ms]
0          300            0.6          267                    0.534
1          500            1            402                    0.804
2          5000           10           942                    1.884
3          1000           2            942                    1.884

5.2   Discussion
Because of concurrency, CAN systems can be hard to analyze, so the use of formal methods is indispensable to minimize the possibility of mistakes and errors. This is especially important for mission-critical and safety-critical systems. The papers published by Tindell, Burns and Wellings [16, 17, 18] have shown that the timing analysis of CAN systems is hard: these papers contain several flaws that were detected only years later [19] and subsequently corrected in [6]. A comparison of our results with the results obtained by Krakora and Hanzalek [8] is not directly possible, since their model assumes specific, fixed frame lengths which are smaller than the maximum-sized frames. This is in contrast with our model, which takes into account all possible sizes of data frames. It seems that their work did not properly address the issue of the minimal interframe space, and their paper does not give enough information for full reproducibility of the results. The minimal interframe space is not mentioned in their paper, so it is either discarded or implicitly included in the "augmented virtual frame". If it is discarded, then their model should be updated; if it is included in the "augmented virtual frame", then they also forgot to subtract it from the final worst case delays. Also, they use arbitrarily given message lengths without mentioning the influence of the bit stuffing rule, so it is not clear whether or not the stuffed bits are accounted for in their analysis. The comparison of our results with the results acquired by Klehmet et al. [11] shows that their approach is sound in principle, but has some technical errors that we detected. One problem is that the minimal interframe space following the frame is included in the calculated worst case delay. Since the inclusion of the minimal interframe space in the "augmented virtual frame" makes the calculation easier, the method should be corrected as follows.
The calculated time should be decreased by exactly one minimal interframe space (the one that elapses after the frame in question has already been delivered). Also, the maximum frame size used for their analysis suffers from the same error (regarding bit stuffing) as the papers [16, 17, 18] published by Tindell and Burns. Furthermore, the waiting for 6 bit times of silence in the carrier sense phase before a node can start sending a frame is not referenced, and cannot be found in the CAN standard. The worst case delays of the two lowest priorities in the systems, acquired by this method, are different. It is not clear whether it is assumed that there are other nodes in the system for which the worst case delays are not calculated. If this is not the case, then the method should be reviewed to take special care of the
N. Ivkovic et al.
lowest priority in the system. If the detected flaws are corrected, network calculus remains a very appealing method for CAN, because it can be applied to systems with many more nodes than the timed-automata model, whose verification is resource-demanding. Also, it would be interesting to see whether a network calculus model could be applied to a system in which some nodes have very small cycle times, like the one in Table 2.
6 Conclusion
Our paper shows that timed automata and model checking can be successfully applied to the deterministic performance evaluation of state-of-the-art automotive communication networks. The delay bounds we obtained support the decision whether, for CAN frames of arbitrary priority and size, hard real-time transmission properties can be fulfilled. For the studied systems we found and verified worst case delays and the minimal cycle times for which the given guarantees are still valid. Using the presented model we detected numerous flaws in previous attempts to determine worst case delays in CAN systems, which should now be corrected. Important qualitative properties of CAN have been verified as well. Our approach is easily applicable to actual CAN traffic, providing a rigorous and sound method to prove delay bounds for any CAN priority class. The advantage of our model is that it is not stochastic and that it takes into account all possible scenarios to verify worst case delays. It is also compact and easy to understand, and it provides a suitable tool for analyzing the internal behavior of CAN. The main drawback of our approach is that it requires substantial computing resources for systems with a large number of nodes, but this could be mitigated by complementing model checking with methods like network calculus that can handle systems with arbitrarily many nodes. In the future we want to extend our model to take into account possible error occurrences and their influence on the worst case delays.
References

1. von der Wense, H.-C.: LIN Specification Package. LIN Consortium (2003)
2. MOST Cooperation: MOST Media Oriented Systems Transport. Rev. 2.4 (2005)
3. FlexRay Consortium: FlexRay Communications System Protocol Specification. Ver. 2.1 (2005)
4. ISO 11898-1:2003: Road vehicles – Controller area network (CAN) – Part 1: Data link layer and physical signalling. International Organization for Standardization (2003)
5. Tindell, K., Burns, A.: Guaranteed Message Latencies for Distributed Safety Critical Hard Real-Time Networks. Technical Report YCS 229, Dept. of Computer Science, University of York (1994)
6. Davis, R.I., Burns, A., Bril, R.J., Lukkien, J.J.: Controller Area Network (CAN) schedulability analysis: Refuted, revisited and revised. Real-Time Systems 35, 239–272 (2007)
7. Davare, A., DiNatale, M., Zhu, Q.: Period Optimization for Hard Real-time Distributed Automotive Systems. In: Proceedings of the 44th IEEE/ACM Design Automation Conference (2007)
8. Krakora, J., Hanzalek, Z.: Verifying Real-Time Properties of CAN bus by Timed Automata. In: World Automotive Congress, FISITA 2004, Barcelona (2004)
9. Hamann, A., Racu, R., Ernst, R.: Formal Methods for Automotive Platform Analysis and Optimization. In: Proc. Future Trends in Automotive Electronics and Tool Integration Workshop (DATE Conference), Munich (2006)
10. Richardson, P., Sieh, L., Elkateeb, A., Haniak, P.: Real-time Controller Area Networks (CAN) managing transient surges. Integr. Comput.-Aided Eng. 9 (2002)
11. Klehmet, U., Herpel, T., Hielscher, K.-S.J., German, R.: Delay Bounds for CAN Communication in Automotive Applications. In: Bause, F., Buchholz, P. (eds.) Proceedings 14th GI/ITG Conference on Measurement, Modelling and Evaluation of Computer and Communication Systems (MMB 2008), Dortmund, Germany, March 31-April 2, pp. 157–172. VDE Verlag (2008)
12. Lawrenz, W.: CAN Controller Area Network, 4th edn. Hüthig Verlag (2000)
13. Alur, R., Dill, D.L.: A theory of timed automata. Theoretical Computer Science 126, 183–235 (1994)
14. UPPAAL, http://www.uppaal.org/
15. Kresic, D., Hielscher, K.-S., German, R.: Specification and Implementation of CAN Arbitration in UPPAAL. Technical Report ISSN 2191-5008, Technische Fakultät, University of Erlangen-Nuremberg (2010)
16. Tindell, K.W., Burns, A.: Guaranteeing message latencies on Controller Area Network (CAN). In: Proceedings of 1st International CAN Conference, pp. 1–11 (1994)
17. Tindell, K.W., Burns, A., Wellings, A.J.: Calculating Controller Area Network (CAN) message response times. Control Engineering Practice 3(8), 1163–1169 (1995)
18. Tindell, K.W., Hansson, H., Wellings, A.J.: Analysing real-time communications: Controller Area Network (CAN). In: Proceedings 15th Real-Time Systems Symposium (RTSS 1994), pp. 259–263. IEEE Computer Society Press (1994)
19. Bril, R.J., Lukkien, J.J., Davis, R.I., Burns, A.: Message response time analysis for ideal controller area network (CAN) refuted. CS-Report 06-19, Technische Universiteit Eindhoven (TU/e), The Netherlands (2006)
Lifetime Improvement by Battery Scheduling

Marijn R. Jongerden (1) and Boudewijn R. Haverkort (1,2)

(1) University of Twente, Centre for Telematics and Information Technology, Design and Analysis of Communication Systems (DACS), Enschede, The Netherlands
{jongerdenmr,brh}@ewi.utwente.nl
(2) Embedded Systems Institute, Eindhoven, The Netherlands
[email protected]
Abstract. The use of mobile devices is often limited by the lifetime of their batteries. For devices that have multiple batteries, or that have the option to connect an extra battery, battery scheduling, which exploits the recovery properties of the batteries, can help to extend the system lifetime. Due to its complexity, work on battery scheduling in the literature is limited to either small batteries or very simple loads. In this paper, we present an approach using the Kinetic Battery Model that combines real-size batteries with realistic random loads. The results show that battery scheduling indeed yields lifetime improvements compared to the sequential usage of the batteries. The improvements mainly depend on the ratio between the average discharge current and the battery capacity. Our results show that for realistic loads one can achieve up to 20% improvement in system lifetime by applying battery scheduling.
1 Introduction
Many autonomous devices rely on batteries for their power supply. The capacity of the batteries is finite, and the time during which one can use the device is limited by the battery lifetime. Lifetime, here, is the time of one discharge period of the battery, from full to empty. Although the battery lifetime depends mostly on its capacity and the level of the load applied to it, another important influence is how the battery is used, i.e., its usage pattern [3]. When a battery is continuously discharged, a high current will cause it to provide less energy until the end of its lifetime than a lower current. This effect is termed the rate-capacity effect. On the other hand, during periods of low or no discharge current, the battery can recover to a certain extent. This is termed the recovery effect. One approach to improve system lifetime is to connect one or more extra batteries. In this case, the batteries are mostly used in sequential order: the next one is used when the previous one has reached the end of its lifetime. Although this clearly prolongs the device lifetime, it is not the most efficient way.

J.B. Schmitt (Ed.): MMB & DFT 2012, LNCS 7201, pp. 106–120, 2012.
© Springer-Verlag Berlin Heidelberg 2012

By using
the batteries one after the other, one does not exploit the rate-capacity and recovery effects. Indeed, by switching regularly between the batteries one gives the batteries time to recover from the applied load. This leads to longer system lifetimes, as we show in this paper. Some research has already been done on battery scheduling. However, the approaches that use realistic random loads are limited to very small batteries, cf. [2, 17], and the approaches that do use real-size batteries, such as [1], use only a limited number of test loads, which are mostly very regular. The former leads to an overestimation of the improvement obtained by battery scheduling. For the latter, the question remains how battery scheduling will perform under realistic loads. In this paper, which has been presented earlier at the UK-PEW 2011 [7], we study the impact of battery scheduling when using real-size batteries with a variety of realistic (random) loads. Various battery schedulers are modeled, and using the Kinetic Battery Model (KiBaM) [8–10] the overall system lifetime is computed for randomly generated loads. Our results show that for realistic loads one can achieve up to 20% improvement in system lifetime by applying battery scheduling. The rest of the paper is structured as follows. In Section 2 an overview is given of the related work. Section 3 describes the Kinetic Battery Model used, and gives an expression for the maximum possible lifetime gain according to this model. The results of the simulations are given in Section 4. Finally, we conclude in Section 5.
2 Related Work
The scheduling of batteries has attracted quite some attention in the literature. Over the years, various kinds of battery models have been developed; an overview of the main modeling approaches is given in [6]. Some of these models have been used to study the problem of battery scheduling. We consider here the main approaches. The most important scheduling schemes that have been studied are:

– Sequential scheduling: a next battery is only picked when the current one is empty.
– Round robin scheduling: at fixed moments in time another battery is used; the batteries are used in a fixed order.
– Pick-best scheduling: at fixed moments in time the status of all batteries is checked and the best battery is used. Which battery is best can be determined in several ways, for example the battery with the highest voltage, or the battery that has been used for the shortest period of time.

Benini et al. [1] use an electrical-circuit model to describe the batteries. They consider sequential scheduling, round robin scheduling and various types of pick-best scheduling, where either the output voltage or the time that a battery has not been used determines which battery is to be scheduled. The different scheduling schemes are applied to several battery configurations containing up to four
batteries. The loads that have been used are simple continuous and intermittent loads and two real-life example load profiles. Which scheduler performs best depends on the applied load. Chiasserini and Rao [2] use a discrete-time Markov battery model to compare three different scheduling schemes in a multiple-battery system. In their model, the recovery of the battery is considered as a random process, and the workload is stochastic as well. Next to the commonly used round robin and pick-best schedulers, a random scheduler is also considered. The schedulers are compared for different job arrival rates. The results show that the pick-best scheduler outperforms the other two. However, the complexity of the models used limits the analysis to cases with only small batteries. In all this work, battery scheduling is limited to simple scheduling schemes. All studies show that battery scheduling gives a longer system lifetime than using the batteries sequentially. However, they do not indicate whether longer lifetimes would be possible with even smarter scheduling. Sarkar and Adamou [17] propose an algorithm for computing an optimal scheduling scheme based on the stochastic battery model of Chiasserini and Rao. To do this, they translate the problem into a stochastic shortest path problem. The optimal solution can only be computed for very small batteries. However, they do show that pick-best scheduling performs close to optimally. Another optimization approach is taken in [4, 5], in which the batteries are modeled using priced timed automata. With model checking techniques the schedule that gives the maximum lifetime is computed. The result is compared to the simple sequential, round robin and pick-best schedulers. Although the results show that the round robin and pick-best schedulers are sometimes far from optimal, these schedulers are much better than the sequential scheduler. The model actually shows that sequential scheduling results in the shortest lifetime possible.
In [5] a first step towards random loads was taken. However, the priced-timed automata model was limited to very small battery capacities. In this paper we combine the random loads with realistic battery capacities, in order to obtain a better prediction of the potential gain of battery scheduling. All these studies show that by applying battery scheduling, the system lifetime will be extended. However, the improvement varies a lot between the different modeling approaches. Where Benini et al. [1] predict an average improvement of approximately 11% for a two battery system, Chiasserini and Rao [2] show improvements of more than 100%.
3 Kinetic Battery Model

3.1 Introduction
The battery model we use is the Kinetic Battery Model (KiBaM) of Manwell and McGowan [8–10]. This model is very intuitive, and it is the simplest model that includes the two important non-linear battery properties, the rate-capacity effect and the recovery effect [6]. The rate-capacity effect is the effect that less charge can be drawn from the battery when the discharge current is increased. However,
some of the charge left behind in the battery after a period with a high discharge current will be available for usage after a period with no or low current. This is the recovery effect. In the model, the battery charge is distributed over two wells: the available-charge well and the bound-charge well (cf. Figure 1).

Fig. 1. Two-well model of the Kinetic Battery Model (the available-charge well y1, of width c, supplies the load i(t); the bound-charge well y2, of width 1 − c, feeds the available-charge well through a valve with conductance k)

For the full battery, a
fraction c of the total capacity is put in the available-charge well, and a fraction 1 − c in the bound-charge well. The available-charge well supplies electrons directly to the load (i(t)), whereas the bound-charge well supplies electrons only to the available-charge well. The charge flows from the bound-charge well to the available-charge well through a "valve" with fixed conductance, k. Next to this parameter, the rate at which charge flows between the wells depends on the height difference between the two wells. The heights of the two wells are given by h1 = y1/c and h2 = y2/(1 − c). The change of the charge in both wells is given by the following system of differential equations:

  dy1/dt = −i(t) + k·(h2 − h1),
  dy2/dt = −k·(h2 − h1),                                              (1)

with initial conditions y1(0) = c·C and y2(0) = (1 − c)·C, where C is the total battery capacity. The battery is considered empty when there is no charge left in the available-charge well, y1 = 0. One can solve the differential equations using Laplace transformations when the load is constant (i(t) = I). In this case the evolution of the charge in the two charge wells is given by:

  y1(t) = c·C − c·I·t − (I·(1 − c)/k′)·(1 − e^(−k′·t)),
  y2(t) = (1 − c)·C − (1 − c)·I·t + (I·(1 − c)/k′)·(1 − e^(−k′·t)),   (2)

where k′ is defined as k′ = k/(c·(1 − c)).
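Equations (1) and (2) can be cross-checked numerically. The following sketch is our own construction (forward-Euler integration is our choice, and the parameter values are the Itsy-battery values used later in the paper); it integrates (1) and compares the result against the closed-form solution (2) for a constant current:

```python
import math

# Battery parameters used later in the paper (Itsy battery):
C, c, k = 2400.0, 0.166, 2.815e-4   # capacity [As], well fraction, conductance [1/s]
kp = k / (c * (1.0 - c))            # k' in the text

def simulate(i_of_t, dt=0.01, t_end=600.0):
    """Forward-Euler integration of equations (1); returns (y1, y2) at t_end."""
    y1, y2 = c * C, (1.0 - c) * C   # initial conditions: full battery
    t = 0.0
    while t < t_end and y1 > 0.0:
        flow = k * (y2 / (1.0 - c) - y1 / c)   # k * (h2 - h1)
        y1 += (-i_of_t(t) + flow) * dt
        y2 += -flow * dt
        t += dt
    return y1, y2

def closed_form(I, t):
    """Equations (2), valid for a constant discharge current I."""
    r = (I * (1.0 - c) / kp) * (1.0 - math.exp(-kp * t))
    return (c * C - c * I * t - r,
            (1.0 - c) * C - (1.0 - c) * I * t + r)

y1_sim, _ = simulate(lambda t: 0.25)       # 250 mA for 10 minutes
y1_cf, y2_cf = closed_form(0.25, 600.0)
print(abs(y1_sim - y1_cf))                 # small: Euler tracks (2) closely
```

Note also that (2) conserves total charge: y1(t) + y2(t) = C − I·t, as must be the case.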
Fig. 2. The maximum lifetime gain for a system with two batteries as a function of the constant discharge current. Note that the current is plotted on a logarithmic scale. The top x-axis gives the current normalized to the capacity of one battery (I/C, in h−1).
From the equation for the available charge one can obtain the battery lifetime (ts) by setting y1 = 0:

  ts = C/I − (1/k′)·[(1 − c)/c − W(((1 − c)/c)·e^((1 − c)/c − C·k′/I))],   (3)

where W denotes the Lambert W function, the inverse function of f(W) = W·e^W [14].

3.2 Maximum Possible Lifetime Gain
In [5] it is shown that, according to the KiBaM, in theory the best way to discharge the batteries in a multiple-battery system is by using them in parallel. For a system with N identical batteries discharged with a continuous current I the lifetime is given by:

  tp,N = N·C/I − (1/k′)·[(1 − c)/c − W(((1 − c)/c)·e^((1 − c)/c − N·C·k′/I))].   (4)

This equation is the same as (3) with I/N substituted for I. Using Equations (3) and (4) we can compute the maximum possible gain one can obtain by applying battery scheduling in the case of a constant discharge current. The system lifetime when using N batteries sequentially will be N·ts, hence the maximum possible gain with N batteries, GN, is given by:

  GN = tp,N / (N·ts).   (5)
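Equations (3)-(5) are straightforward to evaluate numerically. The sketch below is our own construction: it implements the principal branch of the Lambert W function with Newton iteration (so that only the standard library is needed) and evaluates the gain for the battery parameters used in this section:

```python
import math

# Battery parameters used in this section (Itsy battery):
C, c, k = 2400.0, 0.166, 2.815e-4
kp = k / (c * (1.0 - c))  # k'

def lambert_w(x, tol=1e-12):
    """Principal branch of the Lambert W function for x > 0, via Newton."""
    w = math.log1p(x)  # starting guess
    for _ in range(100):
        e = math.exp(w)
        w_new = w - (w * e - x) / (e * (w + 1.0))
        if abs(w_new - w) < tol:
            break
        w = w_new
    return w

def lifetime(I, n=1):
    """Equation (3) for n = 1, equation (4) for n batteries in parallel [s]."""
    b = (1.0 - c) / c
    return n * C / I - (b - lambert_w(b * math.exp(b - n * C * kp / I))) / kp

def gain(I, n=2):
    """Equation (5): maximum possible gain G_N at constant current I."""
    return lifetime(I, n) / (n * lifetime(I))

print(round(gain(0.85), 2))  # peak region: a gain above 1.9
print(round(gain(0.10), 2))  # low current: around 1.05
```

The two printed values match the behavior of the G2 curve discussed next: a peak above 1.9 near 0.85 A and roughly 1.05 at 0.1 A.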
In Figure 2, the gain for a system with 2 batteries (G2) is given as a function of the discharge current. The batteries that have been used in this computation are
similar to those used in [5], i.e., c = 0.166 and k = 2.815 · 10−4 s−1. However, here the capacity is increased to a realistic value, C = 2400 As, instead of the much smaller capacity of 330 As used in [5]. This type of battery is used in the Itsy pocket computer, which was also simulated by Rakhmatov et al. in [15, 16]. The discharge current has been varied between 0.1 A and 10 A. For this system of batteries the highest gain is obtained at a discharge current of approximately 0.85 A, where the gain is more than 1.9. The peak can be explained as follows. When the discharge current gets too high, the available-charge well will be depleted too fast, and the slow recovery process will hardly increase the usable capacity, even when scheduling is applied. At low discharge currents the loss of capacity due to the rate-capacity effect is low, i.e., the flow of charge from the bound to the available-charge well can keep up with the demand, and little charge will be left behind in the bound-charge well. Therefore, the gain of allowing batteries to recover through scheduling is limited. However, at a discharge current of 0.1 A the gain is still approximately 1.05, and a 5% lifetime extension is still a considerable improvement. When we look at Equations (3) and (4) we see that the discharge current I always appears in direct relation with the battery capacity C, in the form I/C. This implies that when the battery capacity is halved, and the other battery parameters stay the same, the discharge current needs to be halved as well to obtain the same lifetime gain. Using the top x-axis, Figure 2 shows how the maximum lifetime gain depends on the current normalized to the capacity of one battery (I/C). Of course, the shape of the curve depicted in Figure 2 and the position of the maximum highly depend on the battery parameters c and k, as well as on the number of batteries N. The plots in Figure 3 show how the curve changes when one of the parameters is varied.
In all three subfigures the curve of Figure 2 is given as a reference, drawn with a solid line. In Figure 3(a) the number of batteries (N) is varied. The increase of the number of batteries leads to an increase of the gain (GN). This can be understood as follows. When more batteries are used, the discharge current per battery will drop. The flow of charge from the bound-charge well to the available-charge well can now keep up better with the discharge current, and more charge will be available for the load. Figure 3(b) shows that when the fraction of available charge (c) is increased, the gain will be lower. As more charge is directly available, the lifetime under sequential discharge will increase, and dividing the load will be less beneficial. In the extreme case that c = 1, the batteries behave as ideal batteries, and all charge is always available for the load. In this case, there is no difference in lifetime between sequential and parallel discharge, and GN = 1 for all currents. Finally, Figure 3(c) shows that an increase of k leads to an increase of the current at which the gain is maximal. The increase of k causes the flow of charge from the bound-charge well to the available-charge well to be faster.
In this way, the flow will be able to keep up with higher discharge currents, and the gain will be largest at a higher discharge current. The results shown above indicate that a system with two identical batteries used in parallel will behave as one with double the capacity. However, the possibility of connecting batteries directly in parallel is under debate. Whereas [12] claims lithium batteries are well suited to be connected in parallel, [13] says one should not do this. One problem of connecting batteries in parallel is that even for two batteries of the same type a difference in potential can occur. When this happens, a current will flow between the batteries, resulting in a loss of capacity and, even worse, possibly damage to the batteries. Using batteries in parallel also requires extra electronic circuitry, which consumes some power and decreases efficiency. Also, in some situations, like the routing problem described in Section 1, parallel usage is simply impossible. Using a simple scheduling scheme, like round robin scheduling, one can circumvent the problems of parallel usage and still obtain an improvement in system lifetime.
4 Battery Scheduling Results

4.1 Simulation Set-Up
As in the previous section, the batteries we model are the lithium-ion batteries used in the Itsy pocket computer. The modeled battery has a capacity of 2400 As, with c = 0.166 and k = 2.815 · 10−4 s−1. The load currents are within the range of the Itsy pocket computer, up to 600 mA [15]. In [6], we have shown that the KiBaM yields good results in lifetime computations for this type of battery and loads. We generate 10000 random load traces, which are subsequently input to the system lifetime computations using the KiBaM equations (1). We introduce randomness into the loads in three steps. First, we introduce random on-times, and keep the discharge current constant. Second, we introduce random discharge currents and keep the periods of discharge constant. Finally, we model fully random loads based on a Markov model that mimics a typical usage pattern of a mobile device. In the analysis we use four basic scheduling schemes:

– sequential: a next battery is chosen when the current one is empty.
– load-round-robin: the batteries are chosen in a fixed order; a switch between batteries takes place at the moment the discharge current is changed to another positive current.
– best-of-two: at the moment the load changes, the battery with the most charge in the available-charge well is chosen.
– time-round-robin: the batteries are chosen in a fixed order; a switch between batteries takes place after a fixed amount of time has passed.

Three of these schedulers were also used in [5] in the setting of priced timed automata: sequential, load-round-robin and best-of-two. These schedulers are used here to see what the effect of the bigger battery capacity is on the lifetime gain.
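The simulation loop itself is straightforward. The sketch below is our own minimal reconstruction, not the authors' code: two KiBaM batteries, a random on-off load of the kind described above (250 mA on-periods of random length, 1-minute off-periods), and the sequential and best-of-two schedulers from the list. A battery is treated as spent once its available-charge well is drained under load.

```python
import random

C, c, k = 2400.0, 0.166, 2.815e-4  # Itsy-battery parameters

class KiBaM:
    def __init__(self):
        self.y1, self.y2 = c * C, (1.0 - c) * C
        self.dead = False
    def step(self, current, dt):
        flow = k * (self.y2 / (1.0 - c) - self.y1 / c)
        self.y1 += (-current + flow) * dt
        self.y2 += -flow * dt
        if self.y1 <= 0.0:
            self.dead = True  # drained under load; treated as spent from here on

def sequential(bats):          # next battery only when the current one is empty
    return 1 if bats[0].dead else 0

def best_of_two(bats):         # battery with the most available charge
    alive = [i for i, b in enumerate(bats) if not b.dead]
    return max(alive, key=lambda i: bats[i].y1) if alive else 0

def system_lifetime(pick, seed=1, dt=0.5):
    """On-off load (on: 250 mA, uniform 30-90 s; off: 60 s); `pick` chooses
    the battery at every load change. Returns system lifetime in seconds."""
    rng = random.Random(seed)
    bats = [KiBaM(), KiBaM()]
    t = 0.0
    while True:
        for dur, cur in ((rng.uniform(30.0, 90.0), 0.25), (60.0, 0.0)):
            active = pick(bats)
            for _ in range(int(dur / dt)):
                if cur > 0.0 and bats[active].dead:
                    active = 1 - active
                    if bats[active].dead:
                        return t          # no battery can supply the load
                bats[active].step(cur, dt)
                bats[1 - active].step(0.0, dt)
                t += dt

t_seq = system_lifetime(sequential)
t_bo2 = system_lifetime(best_of_two)
print(t_bo2 / t_seq)   # > 1: best-of-two outlives sequential usage
```

The spent-battery convention and the exact load parameters are assumptions for illustration; the qualitative outcome (best-of-two beats sequential by a few percent) matches the results reported below.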
Fig. 3. Maximum possible gain (GN) for various battery parameters:
(a) varying number of batteries (N = 2, 3, 4), with c = 0.166, k = 2.815 · 10−4 s−1 and C = 2400 As;
(b) varying fraction of available charge (c = 0.166, 0.2, 0.3), with N = 2, k = 2.815 · 10−4 s−1 and C = 2400 As;
(c) varying conductance (k = 2.815 · 10−4, 5 · 10−4, 1 · 10−3 s−1), with N = 2, c = 0.166 and C = 2400 As.
Fig. 4. Gain of using a round robin scheduler compared to sequential usage as a function of the switching frequency
The time-round-robin scheduler is used to approach the maximum lifetime. As discussed in Section 3.2, parallel discharge, which leads to the maximum lifetime, may not be practically possible. However, the lifetime of parallel discharge can easily be approached by using a fast-switching round robin scheduler, as will be shown in the next section. Of course, the switching between the batteries in all the schedulers used will cost some extra energy, especially for the time-round-robin scheduler, which switches between the batteries most often. However, the energy needed to switch between batteries is negligible compared to the actual load. In [11], Matsuura presents a low-power pulse generator which operates with a discharge current of 0.15 µA at a voltage of 1.5 V. This current is at least a factor of 1000 smaller than the discharge current the device operates at, which is in the order of mA. Therefore, the cost of switching using the time-round-robin scheduler can be neglected without introducing any significant error into the computed system lifetime.

4.2 Round Robin Frequency Dependence
In order to find which switching frequency is sufficient to approach the maximum lifetime, we investigate how the gain in lifetime depends on the switching frequency in the case of round robin scheduling. The system of two batteries is discharged with a constant current of 1 A. We compare the system lifetime obtained with the round robin scheduler to that of sequential battery usage. In Figure 4, we show the ratio of the system lifetime using round robin scheduling to the lifetime with sequential scheduling as a function of the round robin switching frequency. We see that the gain in lifetime of using the scheduler grows to a level of 1.89 when the switching frequency is increased. This level is the gain
one would get with parallel discharge, which can be seen as switching with infinite frequency. The figure shows that already for a switching frequency of 1 Hz the gain is close to optimal, so switching at higher frequencies is not necessary. Therefore, we use a switching frequency of 1 Hz for the time-round-robin scheduler in the simulations in the following sections. On the side of the low frequencies, smaller than 0.1 Hz, the graph fluctuates with a clear downward tendency; that is, a small increase of the switching frequency may result in a considerable change in lifetime. This can be explained as follows. At very low frequencies, lower than 0.008 Hz, the batteries are emptied within one period and the round robin scheduler results in sequential usage. When the frequency is increased, the point will be reached where the first battery is no longer emptied completely before the switch takes place. While the second battery is used, the first can recover. Due to this recovery time the battery can be used longer, and the system lifetime is increased. This results in the first jump in the graph. Every time the batteries can recover for one more period, a next jump in the graph occurs. The size of the jumps decreases as the frequency increases, since the extra recovery time is shorter at higher frequencies. Between the jumps the system lifetime decreases, since the extra recovery time decreases as the switching frequency is increased. Thus, the ratio between on and off time decreases until the next jump occurs.

4.3 Random Times
As first random load, we take an on-off load with 250 mA on-current. The off periods last 1 minute, and the on periods are uniformly distributed over the interval [1/2, 3/2] minute. This load has also been used in [5], but there the modelled batteries had a capacity that was approximately 8 times smaller than the real capacity, which is used here. We compute the system lifetime for 10000 randomly generated loads using the four schedulers mentioned in Section 4.1. In Figure 5 the empirical lifetime distributions, expressed as the frequency count of the lifetimes, are provided for the different schedulers. The bin size for these distributions has been set to 1 minute. For clarity the histograms are plotted using lines. In Table 1 the mean and variance of the computed lifetimes are given for the different schedulers. As can be observed, a clear system lifetime improvement is obtained when battery scheduling is applied. On average the load-round-robin and best-of-two schedulers outperform sequential usage by 6% and 6.6%, respectively. Also, the two schedulers perform only slightly worse than the time-round-robin scheduler. When we compare these results with those in [5], in which a gain of 65% was observed, we see that the relative gain in lifetime obtained by battery scheduling is much smaller than for smaller batteries. This is related to the result in Section 3.2, where the maximum possible gain is given as a function of the discharge current, as follows. The mean discharge current of the used loads is 125 mA. This gives a ratio between the load and the battery capacity of 0.125 A / 0.666 Ah = 0.1875 h−1 for the real-size battery. For the smaller battery used in [5] the ratio is 0.125 A / 0.0916 Ah = 1.36 h−1. Using the top
M.R. Jongerden and B.R. Haverkort

Fig. 5. Empirical lifetime distributions generated with 10000 on-off loads with random on-times (histograms of lifetime counts; lifetimes range from about 530 to 620 min)

Table 1. Mean and variance of the lifetimes obtained with the different schedulers for the loads with random on-times

scheduler         mean lifetime (min)  variance (min²)
sequential        552.87               39.36
load-round-robin  585.90               50.39
best-of-two       589.33               37.44
time-round-robin  596.01               33.38
x-axis in Figure 2, one sees that the ratio of 0.1875 h⁻¹ allows for a gain of less than 10%, whereas the ratio of 1.36 h⁻¹ is close to the peak value of the maximum possible gain of 90%.
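The random on-off load described above can be sketched as a small generator (a hypothetical helper, not the authors' code); its 125 mA mean current follows from the 1-minute expected on-time against the 1-minute off-time:

```python
import random

def on_off_load(rng, cycles=350):
    """On-off load of Sec. 4.3: 250 mA on-periods drawn uniformly from
    [1/2, 3/2] minutes, alternating with fixed 1-minute off-periods."""
    pieces = []
    for _ in range(cycles):
        pieces.append((rng.uniform(0.5, 1.5), 0.250))  # (duration_min, current_A)
        pieces.append((1.0, 0.0))
    return pieces

load = on_off_load(random.Random(42))
# mean discharge current should be close to the 125 mA quoted in the text
mean_i = sum(d * i for d, i in load) / sum(d for d, i in load)
```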
4.4 Random Currents
The second set of random loads is also used in [5]. In this set, every minute we uniformly choose the discharge current from the set {0, 100, 200, 300, 400, 500} mA. The current stays constant for one minute, until the next current is picked. We use the same schedulers as in the previous section. The load-round-robin and best-of-two schedulers now make a scheduling decision every minute, when the new current is picked. Again, 10000 loads were generated. The lifetime distributions for these loads are given in Figure 6, and the mean and variance of the simulations are given in Table 2. The trend is similar to the previous random load. The best-of-two scheduler performs slightly better than the load-round-robin scheduler, and both perform close to the time-round-robin scheduler. The
Lifetime Improvement by Battery Scheduling
Fig. 6. Empirical lifetime distributions generated with 10000 loads with random discharge current (histograms of lifetime counts; lifetimes range from about 160 to 340 min)

Table 2. Mean and variance of the lifetimes obtained with the different schedulers for the loads with random discharge currents

scheduler         mean lifetime (min)  variance (min²)
sequential        229.55               237.98
load-round-robin  266.12               206.73
best-of-two       270.10               195.44
time-round-robin  274.84               197.20
average improvements relative to the sequential scheduler are 16% and 18% for the load-round-robin and best-of-two schedulers, respectively. This is much better than for the loads with random on-times, due to the higher average discharge current. For the loads with random currents the average discharge current is 250 mA. In Figure 2 we can see that when the system is discharged with a continuous current of 250 mA the maximum lifetime gain is just under 20%. On the other hand, the maximum gain for the random on-times, which have an average discharge current of 125 mA, is approximately 10%. Due to the higher variance in discharge current, the variance in lifetime is larger for this load, as visible through the "wider" graphs and the variance numbers in Tables 1 and 2. When we compare the results with those of [5], we see that the difference in lifetime gain is not as large as with the previous set of random loads. The timed-automata approach in [5] resulted in a system lifetime that was 26% longer than under the sequential schedule. For the smaller batteries used in [5], the ratio between the discharge current and the battery capacity is 0.250 A / 0.0916 Ah = 2.73 h⁻¹. This ratio leads, according to Figure 2, to a maximum possible gain of approximately 26%, which is indeed the obtained gain.
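The per-minute random-current load used here can likewise be sketched as a generator (an illustrative helper, not taken from the paper):

```python
import random

def random_current_load(rng, minutes=200):
    """Random-current load of Sec. 4.4: each minute a current is drawn
    uniformly from {0, 100, ..., 500} mA and held constant for that minute."""
    levels = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]  # currents in A
    return [rng.choice(levels) for _ in range(minutes)]

load = random_current_load(random.Random(7))
mean_i = sum(load) / len(load)  # should be near the 250 mA average
```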
Fig. 7. State transition diagram of the workload model

Table 3. Transition rates of the Markov model

transition  rate (min⁻¹)
λ           1/5
σ           2
μ1          1/14
ν1          1/14
μ2          4/25
ν2          1/25
τ           1/2

4.5 Full Random Load
The final step in introducing randomness into the loads is having both random discharge times and random currents. This is done by using a Markov model that represents a simple workload of a device. The state transition diagram of this Markov model is given in Figure 7. The device has 5 different states: sleep, start-up, on-1, on-2 and idle. In the sleep state the device draws a 2 mA current from the battery. From the sleep state the device first has to start up before it can go to the on-1 state. The start-up takes 30 seconds on average, and during start-up the discharge rate is 300 mA. From the on-1 state a transition is made either to the idle state or to the on-2 state, both with probability 1/2. In the on-1 and on-2 states the discharge current is 400 and 600 mA, respectively. The average residence time in the on-1 and on-2 states is 7 and 6 minutes, respectively. From the on-2 state, the device goes back to on-1 with probability 4/5, and to idle with probability 1/5. In the idle state the current is 20 mA, and the average time it takes to go back to sleep is 2 minutes. The discharge currents used are based on the average discharge currents for different modes of the Itsy pocket computer [16]. An overview of the transition rates is given in Table 3. Again, we use the sequential, load-round-robin, best-of-two and time-round-robin schedulers to compute the system lifetime. The Markov model is used to generate 10000 random loads. For the load-round-robin and best-of-two schedulers the scheduling choices are made at the state changes. In Figure 8 the empirical lifetime distributions for the Markov workload model are given. In Table 4 the mean and variance of the lifetimes are given. Again, we see the same order in performance of the four schedulers. However, the difference between the time-round-robin scheduler on the one hand, and the best-of-two and load-round-robin schedulers on the other, is a lot larger. Even though the average improvement over sequential discharge of the load-round-robin and best-of-two schedulers is 10.3% and 12.6%, respectively, the time-round-robin scheduler leads to an even longer lifetime. The time-round-robin scheduler outperforms sequential
Fig. 8. Empirical lifetime distributions generated with 10000 Markov model loads (histograms of lifetime counts; lifetimes range from about 50 to 350 min)

Table 4. Mean and variance of the lifetimes obtained with the different schedulers for the Markov model loads

scheduler         mean lifetime (min)  variance (min²)
sequential        133.72               794.12
load-round-robin  145.99               803.83
best-of-two       149.11               892.97
time-round-robin  183.61               775.06
scheduling by 40.6% for this workload. The large difference between time-round-robin scheduling and the load-round-robin and best-of-two schedulers is caused by the longer average time between scheduling moments.
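The Markov workload model can be sketched as a simple continuous-time simulation; the rates follow our reading of Table 3 and the currents follow the text, but the code itself is an illustrative sketch, not the authors' simulator:

```python
import random

# state -> (discharge current in A, {successor: rate in 1/min});
# rate names in comments match Fig. 7 / Table 3
MODEL = {
    'sleep':    (0.002, {'start-up': 1 / 5}),               # lambda
    'start-up': (0.300, {'on-1': 2.0}),                     # sigma
    'on-1':     (0.400, {'on-2': 1 / 14, 'idle': 1 / 14}),  # mu1, nu1
    'on-2':     (0.600, {'on-1': 4 / 25, 'idle': 1 / 25}),  # mu2, nu2
    'idle':     (0.020, {'sleep': 1 / 2}),                  # tau
}

def workload(rng, steps):
    """Yield (state, sojourn_min, current_A) pieces from the CTMC of Fig. 7."""
    state = 'sleep'
    for _ in range(steps):
        current, succ = MODEL[state]
        total = sum(succ.values())
        yield state, rng.expovariate(total), current  # exponential residence
        u = rng.random() * total  # next state chosen with prob. rate/total
        for nxt, rate in succ.items():
            u -= rate
            if u <= 0:
                break
        state = nxt

pieces = list(workload(random.Random(1), 4000))
```

A quick sanity check against the text: the mean sojourn in the start-up state should come out near 30 seconds (rate σ = 2 min⁻¹).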
5 Conclusions
In this paper we have looked at the impact of battery scheduling for real-size batteries using several random loads. Our results show that the use of battery scheduling can improve the system lifetime. The analysis of the kinetic battery model shows that parallel discharge leads to the largest lifetime improvement. The actual lifetime gain highly depends on the load-capacity ratio. Parallel discharge is not always possible, as it may damage the batteries. However, we show that using a high-frequency round-robin scheduler one can approach the lifetime obtained with parallel discharge. The simulation results show that also for more complex random loads, battery scheduling helps to improve the system lifetime considerably. The gain in lifetime compared to sequential discharge of the batteries for the different schedulers varies with the type of
load. The obtained average lifetime gain can be predicted well by computing the maximum possible lifetime gain for a continuous discharge current equal to the average current of the random load.
References

1. Benini, L., Castelli, G., Macii, A., Macii, E., Poncino, M., Scarsi, R.: Extending lifetime of portable systems by battery scheduling. In: Design, Automation and Test in Europe (DATE), pp. 197–203. IEEE Computer Society, Los Alamitos (2001)
2. Chiasserini, C., Rao, R.: Energy efficient battery management. IEEE Journal on Selected Areas in Communications 19(7), 1235–1245 (2001)
3. Cloth, L., Haverkort, B.R., Jongerden, M.R.: Computing battery lifetime distributions. In: Proc. IEEE DSN 2007, pp. 780–789. IEEE Computer Society Press (2007)
4. Jongerden, M., Haverkort, B., Bohnenkamp, H., Katoen, J.P.: Maximizing system lifetime by battery scheduling. In: Proc. IEEE DSN 2009, pp. 63–72. IEEE Computer Society Press (2009)
5. Jongerden, M., Mereacre, A., Bohnenkamp, H., Haverkort, B., Katoen, J.P.: Computing optimal schedules for battery usage in embedded systems. IEEE Transactions on Industrial Informatics 6(3), 276–286 (2010)
6. Jongerden, M.R., Haverkort, B.R.: Which battery model to use? IET Software 3(6), 445–457 (2009)
7. Jongerden, M., Haverkort, B.: Lifetime improvement by battery scheduling. In: UK Performance Engineering Workshop, Bradford, UK, July 7–8 (2010)
8. Manwell, J., McGowan, J.: Lead acid battery storage model for hybrid energy systems. Solar Energy 50, 399–405 (1993)
9. Manwell, J., McGowan, J.: Extension of the kinetic battery model for wind/hybrid power systems. In: Proc. of the 5th European Wind Energy Association Conference (EWEC 1994), pp. 284–289 (1994)
10. Manwell, J., McGowan, J., Baring-Gould, E., Stein, W., Leotta, A.: Evaluation of battery models for wind/hybrid power system simulation. In: Proc. of the 5th European Wind Energy Association Conference (EWEC 1994), pp. 1182–1187 (1994)
11. Matsuura, Y.: Low-power consumption reference pulse generator. United States Patent 4,618,837 (1986)
12. BatteryUniversity.com (April 2010), http://www.batteryuniversity.com/partone-24.htm
13. LowCostBatteries.com (April 2010), http://www.lowcostbatteries.com/articles.asp?id=107
14. Wolfram MathWorld: Lambert W-Function (April 2010), http://mathworld.wolfram.com/lambertw-function.html
15. Rakhmatov, D., Vrudhula, S., Wallach, D.A.: Battery lifetime predictions for energy-aware computing. In: Proc. of the 2002 International Symposium on Low Power Electronics and Design (ISLPED 2002), pp. 154–159 (2002)
16. Rakhmatov, D., Vrudhula, S., Wallach, D.A.: A model for battery lifetime analysis for organizing applications on a pocket computer. IEEE Transactions on VLSI Systems 11(6), 1019–1030 (2003)
17. Sarkar, S., Adamou, M.: A framework for optimal battery management for wireless nodes. In: Proc. IEEE INFOCOM 2002, pp. 179–188 (2002)
Weighted Probabilistic Equivalence Preserves ω-Regular Properties

Arpit Sharma

RWTH Aachen University, Software Modeling and Verification Group, Germany
[email protected]
Abstract. Equivalence relations can be used to reduce the state space of a system model, thereby permitting more efficient analysis. This paper extends the notion of weighted lumpability (WL) defined on continuous-time Markov chains (CTMCs) to the discrete-time setting, i.e., discrete-time Markov chains (DTMCs). We provide a structural definition of weighted probabilistic equivalence (WPE), define the quotient under WPE and prove some elementary properties. We show that ω-regular properties are preserved when reducing the state space of a DTMC using WPE. Finally, we show that WPE is compositional w.r.t. synchronous parallel composition. Keywords: discrete-time Markov chain, weighted probabilistic equivalence, bisimulation, ω-regular property, deterministic Rabin automata, synchronous parallel composition.
1 Introduction

Discrete-time Markov chains (DTMCs) are widely used for the evaluation of performance and dependability of information processing systems. Equivalence relations are used to reduce the state space of DTMCs by combining equivalent states into a single state. The reduced state space obtained under an equivalence relation, called a quotient, can then be used for analysis, provided it preserves a rich class of properties of interest. Various branching-time relations on DTMCs have been defined, such as weak and strong variants of bisimulation equivalence and simulation pre-orders [13,20,15,18,9,14,6]. Their compatibility with (fragments of) probabilistic variants of Computation Tree Logic (CTL) has been thoroughly investigated [4]. Probabilistic model checking tools such as the Probabilistic Symbolic Model Checker (PRISM) [19] and the Markov Reward Model Checker (MRMC) [16] have been used to model check Probabilistic Computation Tree Logic (PCTL) [11] properties on DTMCs. In the linear-time setting, probabilistic trace equivalences [12,22] have been defined for discrete-time or time-abstract probabilistic models. For the continuous case, Markovian testing equivalence has been proposed in [5]. In [27] the Markovian variants of several linear-time equivalences have been extensively investigated. In this paper our focus is on weighted probabilistic equivalence (WPE), which allows for a more aggressive state space aggregation than bisimulation. WPE is an extension of weighted lumpability [23] defined on CTMCs. Two states s, s′ are weighted lumpable if they are equally labeled and have identical exit rates and weighted rates. In [23] it has been

J.B. Schmitt (Ed.): MMB & DFT 2012, LNCS 7201, pp. 121–135, 2012.
© Springer-Verlag Berlin Heidelberg 2012
shown that WL preserves linear real-time objectives. Whereas bisimulation compares states on the basis of their direct successors (the cumulative probability to directly move to any equivalence class must be equal), WPE takes a two-step perspective. Two states s and s′ are weighted probabilistic equivalent if for each pair of their direct predecessors the weighted probability to directly move to any equivalence class via the equivalence class [s] = [s′] coincides. The main principle is captured in Fig. 1, where p̄C1 = p̄1·p1,1, p̄C2 = p̄1·p1,2 + p̄2·p2,1, and p̄C3 = p̄2·p2,2, with p̄1 = p1/(p1+p2) and p̄2 = p2/(p1+p2). Here states s1 and s2 are weighted probabilistic equivalent, as for each pair of direct predecessors of the equivalence class [s1] = [s2] (in this case only s0) the weighted probability to move to all the states in the equivalence class Ci (for i = 1, 2, 3) via all the states in [s1] is equal. This allows combining states s1 and s2 into a single state s̄1, cf. the right DTMC in Fig. 1.

Fig. 1. State space reduction under weighted probabilistic equivalence

We provide a structural definition of WPE on DTMCs. We define the quotient under WPE, show that any DTMC is equivalent to its quotient under WPE, and prove that WPE is (strictly) coarser than bisimulation. The main contributions of this paper are as follows:

– We show that ω-regular properties are preserved under WPE quotienting.
– We show that WPE is compositional w.r.t. synchronous parallel composition.

We first show that the probability of satisfying a deterministic Rabin automaton (DRA) [24, pp. 3-21], [3, pp. 801-805] specification for any DTMC coincides with the probability for its quotient. Since the class of languages accepted by DRAs agrees with the class of ω-regular properties, this implies that WPE preserves ω-regular properties. We note that this also implies the preservation of Linear Temporal Logic (LTL) [3, pp. 229-270] formulas and transient-state probabilities. It is important to note that there are certain interesting ω-regular properties, e.g., every even position should always be occupied by a, that cannot be expressed using PCTL, PCTL* (an extension of PCTL) or LTL. Model checking of a DTMC against a DRA specification can be done by solving a system of linear equations obtained on the product of the DTMC and the DRA [3]. Finally, we show that WPE is compositional w.r.t. synchronous parallel composition [25,26]. This is helpful, as instead of analyzing a large DTMC, which may be very costly, we can analyze the smaller, weighted probabilistic equivalent DTMC.
Organisation of the paper. Section 2 briefly recalls the basic concepts of DTMCs. Section 3 defines weighted probabilistic equivalence and treats some basic properties. Section 4 discusses the preservation of ω-regular properties. Section 5 shows that WPE is compositional w.r.t. synchronous parallel composition. Finally, Section 6 concludes the paper. All the proofs are contained in the appendix.
2 Discrete-Time Markov Chains

This section recalls the basic concepts of discrete-time Markov chains with finite state space. The presentation is focused on the concepts needed for the understanding of the rest of this paper.

Definition 1 (DTMC). A (labeled) discrete-time Markov chain (DTMC) is a tuple D = (S, P, AP, L, s0) where:
– S is a non-empty finite set of states,
– P : S × S → [0, 1] is a probability matrix such that Σ_{s′∈S} P(s, s′) = 1 for all s ∈ S,
– AP is a finite set of atomic propositions,
– L : S → 2^AP is a labeling function,
– s0 ∈ S is the initial state.

Intuitively, P(s, s′) specifies the probability to move from state s to s′ in one step, i.e., by a single transition. State s of DTMC D is called absorbing if and only if P(s, s) = 1 and P(s, s′) = 0 for all s′ ∈ S s.t. s′ ≠ s.

Definition 2 (DTMC paths). Let D = (S, P, AP, L, s0) be a DTMC. An infinite path π in D is an infinite state sequence s0 → s1 → s2 … ∈ S^ω, si ∈ S, s.t. P(si, si+1) > 0 for all i ≥ 0. A finite path π is a finite state sequence s0 → s1 → s2 … sn−1 → sn s.t. P(si, si+1) > 0 for all i < n.

For a path π in D, inf(π) denotes the set of states that are visited infinitely often in π. For finite DTMCs, inf(π) is nonempty for all infinite paths π. Let Paths^D = Paths^D_fin ∪ Paths^D_ω denote the set of all paths in D, where Paths^D_fin = ∪_{n∈N} Paths^D_n is the set of all finite paths in D and Paths^D_ω is the set of all infinite paths in D. For an infinite path π and any i ∈ N, let π[i] = si, the (i+1)-st state of π. For a finite path π, which is a finite prefix of length n of an infinite path, π[i] is only defined for i ≤ n and defined as in the case of infinite paths. Let Paths(s0) denote the set of all paths that start in s0. Let π[i…] denote the suffix of path π starting in the (i+1)-st state.

Example 1. Consider the DTMC D in Fig. 2 (left), where S = {s0, s1, s2, s3, s4, s5, s6, s7}, AP = {a, b, c}, L(s0) = {c}, L(s7) = ∅, L(s1) = L(s2) = L(s3) = {b}, L(s4) = L(s5) = L(s6) = {a}, and s0 is the initial state. The transition probabilities are attached to the transitions. An example finite path π is s0 → s1 → s5 → s7. Here π[2] = s5.
Definition 3 (Cylinder set). Let s0, …, sk ∈ S with P(si, si+1) > 0 for 0 ≤ i < k. Cyl(s0, …, sk) denotes the cylinder set consisting of all paths π ∈ Paths(s0) s.t. π[i] = si for i ≤ k.

Intuitively, the cylinder set spanned by the finite path π consists of all infinite paths that start with π. The definition of a Borel space on paths of a DTMC follows [8,1]. Let F(Paths(s0)) be the smallest σ-algebra on Paths(s0) which contains all sets Cyl(s0, …, sk) s.t. s0, …, sk is a state sequence with P(si, si+1) > 0 (0 ≤ i < k).

Definition 4. The probability measure Pr_{s0} on F(Paths(s0)) is the unique measure defined by induction on k in the following way. Let Pr_{s0}(Cyl(s0)) = 1, and for k > 0:

Pr_{s0}(Cyl(s0, …, sk, s′)) = Pr_{s0}(Cyl(s0, …, sk)) · P(sk, s′)

For T ⊆ S and s ∈ S, let P(s, T) = Σ_{s′∈T} P(s, s′) be the cumulative probability to directly move from state s to some state in T ⊆ S.

Definition 5 (SCC). A subset T of S is called strongly connected if for each pair (s, t) of states in T there exists a path fragment s0 → s1 … sn such that si ∈ T for 0 ≤ i ≤ n, s0 = s and sn = t. A strongly connected component (SCC) of D denotes a strongly connected set of states T such that no proper superset of T is strongly connected.

Definition 6 (BSCC). A bottom strongly connected component (BSCC, for short) of D is an SCC T from which no state outside T is reachable, i.e., for each state t ∈ T it holds that P(t, T) = 1. Let BSCC(D) denote the set of all BSCCs of D.

Theorem 1. [3, pp. 775-776] For each state s of a finite DTMC D:

Pr_s{π ∈ Paths(s) | inf(π) ∈ BSCC(D)} = 1.

In simple words, this theorem states that almost surely any finite discrete-time Markov chain eventually reaches a BSCC and visits all states of that BSCC infinitely often.

Example 2. Consider the DTMC D in Fig. 2 (left); the only BSCC in D is {s7}. According to the previous theorem, any infinite path will almost surely lead to this BSCC.

Assumptions. Throughout this paper we assume that every state of DTMC D has at least one predecessor, i.e., pred(s) = {s′ ∈ S | P(s′, s) > 0} ≠ ∅ for any s ∈ S. This is not a restriction, as any DTMC (S, P, AP, L, s0) can be transformed into an equivalent DTMC (S′, P′, AP′, L′, s′0) which fulfills this condition. This is done by adding a new state ŝ to S, equipped with a self-loop and with a transition to each state in S without predecessors. The transition probabilities for ŝ are set to some arbitrary value, e.g., P′(ŝ, ŝ) = 1/2 and P′(ŝ, s) = 1/2 where pred(s) = ∅, s.t. Σ_{s∈S′} P′(ŝ, s) = 1. To distinguish this state from the others we set L′(ŝ) = ⊥ with ⊥ ∉ AP. (All other labels, states and transitions remain unaffected.) Let s′0 = s0. It follows that all states in S′ = S ∪ {ŝ} have at least one predecessor. Moreover, the reachable state space of both DTMCs coincides. We also assume that the initial state s0 of a DTMC is distinguished
from all other states by a unique label, say $. This assumption implies that for any equivalence that groups equally labeled states, {s0} constitutes a separate equivalence class. Both assumptions do not affect the basic properties of the DTMC, such as transient or steady-state distributions. For convenience, we show neither the state ŝ nor the label $ in figures.
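The BSCC notions of Definitions 5 and 6 are easy to compute on a dictionary representation of a DTMC (only positive-probability transitions stored; the helper names and the toy chain are ours, not the paper's):

```python
def reachable(P, s):
    """States reachable from s (including s itself) in the underlying digraph."""
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for v, p in P[u].items():
            if p > 0 and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def bsccs(P):
    """Bottom SCCs: components that no positive-probability transition leaves.
    (Trivial singleton 'SCCs' without a self-loop are harmless here: they can
    never satisfy the bottom condition.)"""
    reach = {s: reachable(P, s) for s in P}
    sccs = {frozenset(t for t in reach[s] if s in reach[t]) for s in P}
    return [C for C in sccs
            if all(v in C for u in C for v in P[u] if P[u][v] > 0)]

# toy chain ending in an absorbing state: the only BSCC is {'s7'}
P = {'s0': {'s1': 1.0}, 's1': {'s0': 0.5, 's7': 0.5}, 's7': {'s7': 1.0}}
```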
3 Weighted Probabilistic Equivalence

This section presents the basic concepts related to weighted probability, followed by the formal definition of weighted probabilistic equivalence. We also define the quotient DTMC under WPE and explore its relationship with bisimulation.

Definition 7. For s, s′ ∈ S and C ⊆ S, the function P : S × S × 2^S → R≥0 is defined by:

P(s, s′, C) = P(s, s′)/P(s, C) if s′ ∈ C and P(s, C) > 0, and 0 otherwise.

Intuitively, P(s, s′, C) is the probability to move from state s to s′ under the condition that s moves to some state in C.

Definition 8 (Weighted probability). For s ∈ S and C, D ⊆ S, the function wp : S × 2^S × 2^S → R≥0 is defined by:

wp(s, C, D) = Σ_{s′∈C} P(s, s′, C) · P(s′, D).

Definition 9 (WPE). Equivalence R on S is a weighted probabilistic equivalence (WPE) if we have:
1. ∀(s1, s2) ∈ R it holds: L(s1) = L(s2), and
2. ∀C, D ∈ S/R and ∀s′, s″ ∈ pred(C) it holds: wp(s′, C, D) = wp(s″, C, D),
where for C ⊆ S, pred(C) = ∪_{s∈C} pred(s). States s1, s2 are weighted probabilistic equivalent, denoted by s1 ≈ s2, if (s1, s2) ∈ R for some WPE R.

These conditions require that any two weighted probabilistic equivalent states are equally labeled and that for any two equivalence classes C, D ∈ S/R, where S/R denotes the set consisting of all R-equivalence classes, the weighted probability of going from any two predecessors of C to D via any state in C must be equal. Note that, by definition, any WPE is an equivalence relation.

Example 3. Consider the DTMC D in Fig. 2 (left). Let C = {s4, s5, s6} and D = {s7}. Then wp(s1, C, D) = P(s1, s4, C)·P(s4, D) + P(s1, s5, C)·P(s5, D) = 3/4. Similarly, wp(s2, C, D) = P(s2, s4, C)·P(s4, D) + P(s2, s5, C)·P(s5, D) + P(s2, s6, C)·P(s6, D) = 3/4.

Definition 10 (Quotient DTMC). For a WPE relation R on D, the quotient DTMC D/R is defined by D/R = (S/R, P′, AP, L′, s′0) where:
– S/R is the set of all equivalence classes under R,
– P′(C, D) = wp(s′, C, D), where C, D ∈ S/R and s′ ∈ pred(C),
– L′(C) = L(s), where s ∈ C, and
– s′0 = C, where s0 ∈ C.

Fig. 2. DTMC D (left) and its quotient under a WPE D/R (right)
Note that P′(C, D) is well-defined, as for any predecessors s′, s″ of C it follows that wp(s′, C, D) = wp(s″, C, D). Similarly, L′ is well-defined, as states in any equivalence class C are equally labeled.

Example 4. For the DTMC D in Fig. 2 (left), the quotient D/R under WPE R with partition {{s0}, {s1, s2, s3}, {s4, s5, s6}, {s7}} is shown in Fig. 2 (right).

Definition 11. Any DTMC D and its quotient D/R under WPE relation R are equivalent, denoted by D ≈ D/R, if and only if there exists a WPE relation R′ defined on the disjoint union of the state spaces S ∪ S/R such that ∀C ∈ S/R, s ∈ C it holds: (s, C) ∈ R′.

Remark 1. Note that the new probability matrix, say P″, defined on S ∪ S/R only allows the following transitions: P″(s, s′) = P(s, s′) if s, s′ ∈ S, P″(s, s′) = P′(s, s′) if s, s′ ∈ S/R, and 0 otherwise.

Next, we show that any DTMC D and its quotient under a WPE relation are ≈-equivalent.

Theorem 2. Let D be a DTMC and R be a WPE on D. Then D ≈ D/R.

Note that WPEs are not unique, i.e., there can be more than one equivalence relation that is a WPE for any given DTMC. Intuitively this means that the original DTMC D can be minimized in different ways. Next, we investigate the relationship of WPE to bisimulation.

Definition 12 (Bisimulation [4,20]). Equivalence R on S is a bisimulation on D if for any (s1, s2) ∈ R we have: L(s1) = L(s2), and P(s1, C) = P(s2, C) for all C in S/R. States s1 and s2 are bisimilar, denoted s1 ∼ s2, if (s1, s2) ∈ R for some bisimulation R.

These conditions require that any two bisimilar states are equally labeled and have identical cumulative probabilities to move to any equivalence class C ∈ S/R.
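Definitions 7 and 8 translate directly into code. The DTMC below is a hypothetical three-layer chain shaped like Fig. 1 (with made-up probabilities), not the paper's Fig. 2:

```python
def prob(P, s, T):
    """P(s, T): cumulative probability to move from s into T."""
    return sum(p for t, p in P[s].items() if t in T)

def cond_prob(P, s, sp, C):
    """P(s, s', C) of Def. 7: prob. of s -> s' conditioned on entering C."""
    pC = prob(P, s, C)
    return P[s].get(sp, 0.0) / pC if sp in C and pC > 0 else 0.0

def wp(P, s, C, D):
    """wp(s, C, D) of Def. 8: weighted probability from s to D via C."""
    return sum(cond_prob(P, s, sp, C) * prob(P, sp, D) for sp in C)

# hypothetical Fig. 1-style DTMC: s0 branches to s1, s2; these feed c1..c3
P = {'s0': {'s1': 0.4, 's2': 0.6},
     's1': {'c1': 0.5, 'c2': 0.5},
     's2': {'c2': 1 / 3, 'c3': 2 / 3},
     'c1': {'c1': 1.0}, 'c2': {'c2': 1.0}, 'c3': {'c3': 1.0}}

via = {'s1', 's2'}
# wp(s0, {s1,s2}, {c2}) = 0.4*0.5 + 0.6*(1/3) = 0.4
```

Here s0 is the only predecessor of the class {s1, s2}, so these wp values directly give the quotient probabilities P′ of Definition 10.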
Lemma 1. ∼ is strictly finer than ≈.

Remark 2. From Fig. 2 it can be observed that states s5 and s6 cannot be merged under ∼, as s6 can move to s0 but there is no direct successor s of s5 with s ∼ s0 (note that L(s0) ≠ L(s7)). Similarly, s2 and s3 cannot be merged under ∼. This shows that s ≈ s′ does not imply s ∼ s′.
4 Preservation of ω-Regular Properties

In this section we investigate the linear-time properties for DTMCs that are preserved by WPE. We study a more general class of linear-time properties, i.e., ω-regular properties, which are defined over infinite sequences of symbols A0 A1 A2 … ∈ (2^AP)^ω, where (2^AP)^ω denotes the set of all infinite words over 2^AP. These include, e.g., properties of the form: every time the process tries to send a message, it eventually succeeds in sending it. Note that the preservation of ω-regular properties implies the preservation of LTL formulas and transient-state probabilities. These preservation results can be exploited for model checking by reducing the DTMC models under consideration prior to carrying out the verification. This may speed up the verification, as (mostly) a smaller model needs to be checked. A model checking algorithm that verifies a DTMC against an ω-regular specification already exists [3, pp. 803-805]; it computes the probability of all the DTMC paths that are accepted by the ω-regular specification. This is done by solving a system of linear equations. In the context of branching-time equivalence relations, probabilistic bisimulation coincides with logical equivalence of the branching-time logics PCTL and PCTL* [7,2,11]. PCTL* and ω-regular properties have incomparable expressiveness [3].

Definition 13 (ω-Regular Language). A language L ⊆ (2^AP)^ω is called ω-regular if L = L_ω(G) for some ω-regular expression G over 2^AP.

For instance, the language consisting of all infinite words over {a, b} that contain only finitely many a's is ω-regular, since it is given by the ω-regular expression (a + b)*b^ω. ω-regular languages possess several closure properties: they are closed under union, intersection and complementation [24, pp. 61-77], [3, pp. 172-198].

Definition 14 (ω-Regular Property). A linear-time property P over AP is called ω-regular if P is an ω-regular language over the alphabet 2^AP.

Theorem 3. [24, pp. 53-59] The class of languages accepted by DRAs agrees with the class of ω-regular languages.

Intuitively, this theorem says that any property P that can be expressed using an ω-regular language L is also expressible using some DRA A, and vice versa. Next we show that the probability of satisfying a deterministic Rabin automaton specification for any DTMC coincides with the probability for its quotient under WPE.
4.1 Deterministic Rabin Automata

Definition 15 (DRA). A deterministic Rabin automaton (DRA) is a tuple A = (Q, Σ, δ, q0, Acc) where:
– Q is a nonempty finite set of locations,
– Σ is a finite alphabet,
– δ : Q × Σ → Q is the transition function,
– q0 is the initial location,
– Acc ⊆ 2^Q × 2^Q, where the run q0 q1 q2 … is accepting if there exists a pair (Li, Ki) ∈ Acc such that (∃n ≥ 0. ∀m ≥ n. qm ∉ Li) ∧ (∃^∞ n ≥ 0. qn ∈ Ki).

Intuitively, a DRA is a finite-state automaton with the same components as a non-deterministic Büchi automaton (NBA) [24, pp. 3-7], [3, pp. 173-178], except for the acceptance condition. The acceptance condition of a Rabin automaton is given by a set of pairs of states: {(Li, Ki) | 0 < i ≤ k} with Li, Ki ⊆ Q. A run of a Rabin automaton is accepting if for some pair (Li, Ki) the states in Li are visited finitely often and the states in Ki infinitely often. A DRA is deterministic since it has a single initial state and the successor location is uniquely determined. The edge q −a→ q′ asserts that the DRA A moves from location q to q′ when the input symbol is a. An infinite path of DRA A has the form ρ = q0 −a0→ q1 −a1→ ….
Fig. 3. Deterministic Rabin automaton (q0 loops on ¬a and moves to q1 on a; q1 loops on a and moves back to q0 on ¬a)
Example 5. Consider the DRA A in Fig. 3. Let AP = {a}, Σ = {{a}, ∅}, Q = {q0, q1}, Acc = {({q0}, {q1})}, and let q0 be the initial state. The runs accepted by A are those which eventually stay forever in state q1. The ω-regular property expressed by this DRA is: eventually forever a (◊□a).

Before defining the probability of paths in DTMC D that are accepted by DRA A, we first define two auxiliary concepts, namely the product Markov chain (D ⊗ A) and accepting BSCCs in D ⊗ A.

Definition 16 (Product Markov chain [3, pp. 802-803]). Let D = (S, P, AP, L, s0) be a DTMC and A = (Q, 2^AP, δ, q0, Acc) be a DRA. The product D ⊗ A is the Markov chain:

D ⊗ A = (S × Q, P′, ⟨s0, q⟩, AP′, L′)
where L1, …, Lk, K1, …, Kk serve as atomic propositions in D ⊗ A if the acceptance condition of A is Acc = {(L1, K1), …, (Lk, Kk)}. The set of these atomic propositions is AP′. The labeling function L′ in D ⊗ A is: if H ∈ {L1, …, Lk, K1, …, Kk}, then H ∈ L′(⟨s, q⟩) if and only if q ∈ H. The initial state of D ⊗ A, i.e., ⟨s0, q⟩, is s.t. q = δ(q0, L(s0)). The transition probabilities in D ⊗ A are given by:

P′(⟨s, q⟩, ⟨s′, q′⟩) = P(s, s′) if q′ = δ(q, L(s′)), and 0 otherwise.

The product Markov chain is intuitively the synchronous product of DTMC D and DRA A, s.t. transition s → s′ in D is matched with edge q −L(s′)→ q′.

Definition 17 (Accepting BSCCs in D ⊗ A). A BSCC T in D ⊗ A is accepting if and only if there exists some index i ∈ {1, …, k} such that T ∩ (S × Li) = ∅ and T ∩ (S × Ki) ≠ ∅.

Let us formally define the paths of DTMC D that are accepted by DRA A.

Definition 18 (DTMC paths accepted by a DRA). Let DTMC D = (S, P, AP, L, s0) and DRA A = (Q, 2^AP, δ, q0, Acc). The DTMC path π = s0 → s1 → s2 … is accepted by A if there exists a corresponding DRA path

q0 −L(s0)→ q1 −L(s1)→ q2 …

such that for the path ⟨s0, q1⟩ ⟨s1, q2⟩ … in D ⊗ A, ⟨si, qi+1⟩ ∈ T for some i ≥ 0, where T is an accepting BSCC in D ⊗ A.

Since the product Markov chain is also a DTMC, it will eventually reach a BSCC and visit all its states infinitely often. Let Paths^D(A) = {π ∈ Paths^D | π is accepted by A}. Note that for any DTMC D and DRA A, the set Paths^D(A) is measurable [3, pp. 804-805].

Definition 19. For DTMC D and DRA A, let Pr(D ⊨ A) = Pr(Paths^D(A)).

Stated in words, Pr(D ⊨ A) denotes the probability of all the paths in DTMC D that are accepted by DRA A.

Theorem 4 (Preservation of DRA specifications). For any DTMC D, a WPE R on D and a DRA A: Pr(D ⊨ A) = Pr(D/R ⊨ A).

Intuitively, this theorem says that the probability of all the paths in Markov chain D satisfying DRA A equals the probability of all the paths in D/R that satisfy A.

Corollary 1. WPE preserves transient state probabilities.
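The product construction of Definition 16 can be sketched directly. The DRA below encodes Fig. 3's ◊□a automaton; the two-state DTMC is a made-up example, not one from the paper:

```python
def product(P, L, s0, delta, q0):
    """Product D ⊗ A of Def. 16: <s,q> steps to <s',q'> with probability
    P(s,s'), where q' = delta(q, L(s'))."""
    locs = {q for q, _ in delta}
    Pp = {(s, q): {(sp, delta[(q, L[sp])]): p for sp, p in P[s].items()}
          for s in P for q in locs}
    init = (s0, delta[(q0, L[s0])])  # initial location reads L(s0)
    return Pp, init

# DRA of Fig. 3 for "eventually forever a": stay in q1 iff reading a
A = frozenset({'a'})
delta = {('q0', A): 'q1', ('q0', frozenset()): 'q0',
         ('q1', A): 'q1', ('q1', frozenset()): 'q0'}

# made-up DTMC: from s0, move to the absorbing a-labeled state s1 w.p. 1/2
P = {'s0': {'s0': 0.5, 's1': 0.5}, 's1': {'s1': 1.0}}
L = {'s0': frozenset(), 's1': A}

Pp, init = product(P, L, 's0', delta, 'q0')
```

In this product, the BSCC {⟨s1, q1⟩} avoids S × {q0} and meets S × {q1}, so it is accepting for Acc = {({q0}, {q1})}.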
130
A. Sharma
5 Synchronous Parallel Composition for DTMCs

In this section we show that WPE is compositional w.r.t. synchronous parallel composition of DTMCs. This result is useful for analyzing synchronous distributed algorithms and synchronous hardware circuits, where processes progress in a lock-step fashion. For example, say we want to compose a large DTMC D1 with another DTMC D2, and these DTMCs have n and m states, respectively. The resulting DTMC D1 ⊗ D2 will have m · n states, so it is worthwhile to compute this composition using a smaller DTMC D that is weighted probabilistic equivalent to D1. An interesting and practically relevant case study of the failure behavior of Negated AND (NAND) multiplexing has been investigated in [21]. There, modeling the system involves constructing a PRISM module for each of the N NAND gates in the stage, and then combining these modules through synchronous parallel composition. The synchronous parallel composition operator (⊗) for DTMCs is formally defined as follows:

Definition 20 (Synchronous parallel composition [25,26]). Let D1 = (S1, P1, AP1, L1, s01) and D2 = (S2, P2, AP2, L2, s02) be two DTMCs. We write s −pi→i s' if pi = Pi(s, s') > 0, for i = 1, 2. The synchronous parallel composition of the two DTMCs is D1 ⊗ D2 = (S1 × S2, P, AP1 ∪ AP2, L, (s01, s02)), where (s01, s02) is the initial state, L((s1, s2)) = L1(s1) ∪ L2(s2), and P is given as follows: if s1 −p1→1 s1' and s2 −p2→2 s2', then (s1, s2) −p1·p2→ (s1', s2').

Intuitively, both Markov chains proceed in a lock-step fashion, and the resulting transition probability is thus the product of the individual transition probabilities.

Theorem 5. Let D be a DTMC and R be a WPE on D. Then, for any DTMC D1, (D ⊗ D1) is weighted probabilistic equivalent to (D/R ⊗ D1).
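Definition 20 amounts to a state-wise product of the two transition matrices. A minimal sketch (the dictionary encoding of transition functions is our assumption):

```python
def sync_compose(P1, P2):
    """Synchronous product of two DTMCs given as
    dicts state -> {successor: probability}.
    P((s1,s2),(t1,t2)) = P1(s1,t1) * P2(s2,t2)."""
    P = {}
    for s1, row1 in P1.items():
        for s2, row2 in P2.items():
            P[(s1, s2)] = {(t1, t2): p1 * p2
                           for t1, p1 in row1.items()
                           for t2, p2 in row2.items()}
    return P
```

Since the product has |S1| · |S2| states, first quotienting one component to a WPE-equivalent smaller chain, as Theorem 5 permits, directly reduces this blow-up.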
6 Conclusions

This paper defines an equivalence relation on DTMCs, which we refer to as weighted probabilistic equivalence (WPE). WPE is an extension of weighted lumpability (WL) [23], defined on CTMCs. The main contributions of this paper are as follows:
– We show that ω-regular properties specified on DTMCs are preserved under WPE quotienting.
– We show that WPE is compositional w.r.t. synchronous parallel composition.
Implementing an efficient quotienting algorithm for WPE, and investigating a weaker variant of WPE, are left for future work.

Acknowledgements. The author would like to thank Joost-Pieter Katoen for his valuable feedback and comments. This work was supported by the European Commission under the India4EU project.
Weighted Probabilistic Equivalence Preserves ω-Regular Properties
131
References
1. Ash, R.B., Doleans-Dade, C.A.: Probability and Measure Theory. Academic Press (2000)
2. Aziz, A., Singhal, V., Balarin, F.: It Usually Works: The Temporal Logic of Stochastic Systems. In: Wolper, P. (ed.) CAV 1995. LNCS, vol. 939, pp. 155–165. Springer, Heidelberg (1995)
3. Baier, C., Katoen, J.-P.: Principles of Model Checking. MIT Press (2008)
4. Baier, C., Katoen, J.-P., Hermanns, H., Wolf, V.: Comparative branching-time semantics for Markov chains. Inf. Comput. 200(2), 149–214 (2005)
5. Bernardo, M.: Non-bisimulation-based Markovian behavioral equivalences. J. Log. Algebr. Program. 72(1), 3–49 (2007)
6. Buchholz, P.: Exact and ordinary lumpability in finite Markov chains. J. of Appl. Prob., 59–75 (1994)
7. Desharnais, J., Edalat, A., Panangaden, P.: A logical characterization of bisimulation for labeled Markov processes. In: LICS, pp. 478–487 (1998)
8. Feller, W.: An Introduction to Probability Theory and Its Applications, vol. 1. John Wiley and Sons (1968)
9. Glabbeek, R.J.v., Smolka, S.A., Steffen, B.: Reactive, generative and stratified models of probabilistic processes. Information and Computation 121, 130–141 (1990)
10. Han, T., Katoen, J.-P., Mereacre, A.: Compositional Modeling and Minimization of Time-Inhomogeneous Markov Chains. In: Egerstedt, M., Mishra, B. (eds.) HSCC 2008. LNCS, vol. 4981, pp. 244–258. Springer, Heidelberg (2008)
11. Hansson, H., Jonsson, B.: A logic for reasoning about time and reliability. Formal Asp. Comput. 6(5), 512–535 (1994)
12. Huynh, D.T., Tian, L.: On some equivalence relations for probabilistic processes. Fundam. Inform. 17(3), 211–234 (1992)
13. Jones, C., Plotkin, G.: A probabilistic powerdomain of evaluations. In: LICS, pp. 186–195. IEEE Press, Piscataway (1989)
14. Jonsson, B., Larsen, K.G.: Specification and refinement of probabilistic processes. In: LICS, pp. 266–277 (1991)
15. Jou, C.-C., Smolka, S.A.: Equivalences, Congruences, and Complete Axiomatizations for Probabilistic Processes. In: Baeten, J.C.M., Klop, J.W. (eds.) CONCUR 1990. LNCS, vol. 458, pp. 367–383. Springer, Heidelberg (1990)
16. Katoen, J.-P., Khattri, M., Zapreev, I.S.: A Markov reward model checker. In: QEST, pp. 243–244. IEEE CS Press (2005)
17. Katoen, J.-P., Mereacre, A.: Model Checking HML on Piecewise-Constant Inhomogeneous Markov Chains. In: Cassez, F., Jard, C. (eds.) FORMATS 2008. LNCS, vol. 5215, pp. 203–217. Springer, Heidelberg (2008)
18. Kemeny, J., Snell, J.: Finite Markov Chains. Van Nostrand (1969)
19. Kwiatkowska, M., Norman, G., Parker, D.: PRISM 2.0: A tool for probabilistic model checking. In: QEST, pp. 322–323. IEEE Computer Society Press (2004)
20. Larsen, K.G., Skou, A.: Bisimulation through probabilistic testing. In: POPL, pp. 344–352 (1989)
21. Norman, G., Parker, D., Kwiatkowska, M., Shukla, S.: Evaluating the reliability of NAND multiplexing with PRISM. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 24(10), 1629–1637 (2005)
22. Segala, R.: Modelling and Verification of Randomized Distributed Real-Time Systems. PhD thesis, MIT (1995)
23. Sharma, A., Katoen, J.-P.: Weighted Lumpability on Markov Chains. In: Voronkov, A. (ed.) PSI 2011. LNCS, vol. 7162, pp. 322–339. Springer, Heidelberg (2012)
24. Thomas, W., Wilke, T.: Automata, Logics, and Infinite Games: A Guide to Current Research. Springer, Heidelberg (2002)
25. Tofts, C.M.N.: Processes with probabilities, priority and time. Formal Asp. Comput. 6(5), 536–564 (1994)
26. Tofts, C.M.N.: Compositional Performance Analysis. In: Brinksma, E. (ed.) TACAS 1997. LNCS, vol. 1217, pp. 290–305. Springer, Heidelberg (1997)
27. Wolf, V., Baier, C., Majster-Cederbaum, M.E.: Trace machines for observing continuous-time Markov chains. ENTCS 153(2), 259–277 (2006)
Appendix

Proof of Theorem 2

Proof. In order to prove this theorem we follow an approach similar to [10,17], where the original system and its quotient under bisimulation are shown to be equivalent. Let D = (S, P, AP, L, s0) be a DTMC and D/R = (S/R, P', AP, L', s0') be its quotient under WPE. Since we have defined the WPE relation on a single state space, to prove this theorem we take the disjoint union S ∪ S/R. Let us define a relation R' ⊆ (S ∪ S/R) × (S ∪ S/R) such that R' = {(s, C) | s ∈ C, C ∈ S/R}. Let R* be the reflexive, symmetric and transitive closure of R'.

Now we prove that R* is a WPE relation. This is done by checking both conditions of Def. 9. Let (s, C) ∈ R*. The proofs for pairs (s, s'), (C, s), and (C, C') are similar and omitted.
1. L'(C) = L(s), by definition of D/R.
2. Next we prove that for all E, F ∈ (S ∪ S/R)/R* and all x0, x0' ∈ pred(E) it holds that wp(x0, E, F) = wp(x0', E, F). Let x0, x0' ∈ pred(E). Consider the following three cases, based on which state space the successors of x0, x0' in E belong to.
a) The successors of both x0 and x0' belong to S. Since we know that R is a WPE, it follows that wp(x0, E, F) = wp(x0', E, F).
b) The successors of both x0 and x0' belong to S/R. In this case, wp(x0, E, F) = wp(x0, E1, F1), where E1 ∈ E ∩ S/R and F1 ∈ F ∩ S/R, which equals

Σ_{x∈E1} (P'(x0, x)/P'(x0, E1)) · P'(x, F1) = P'(E1, F1).

Similarly, wp(x0', E, F) = wp(x0', E1, F1) = P'(E1, F1).
c) The successors of x0 and x0' belong to S and S/R, respectively. In this case we get wp(x0', E, F) = wp(x0', E1, F1) = P'(E1, F1). From Def. 10 we know that P'(E1, F1) = wp(x0, E1, F1) = wp(x0, E, F).

Since all the conditions of Def. 9 are satisfied by the relation R*, it is a WPE relation.

Proof of Lemma 1

Proof. Let s1 ∼ s2. We prove that both conditions of Def. 9 are satisfied for the pair (s1, s2).
– L(s1) = L(s2) follows directly from s1 ∼ s2.
– Let C, D ∈ S/∼ and s0, s0' ∈ pred(C). Since P(s1, D) = P(s2, D) for all s1, s2 ∈ C, we have for all s* ∈ C:

wp(s0, C, D) = Σ_{s∈C} (P(s0, s)/P(s0, C)) · P(s, D)
             = P(s*, D) · Σ_{s∈C} P(s0, s)/P(s0, C)
             = P(s*, D)
             = P(s*, D) · Σ_{s∈C} P(s0', s)/P(s0', C)
             = Σ_{s∈C} (P(s0', s)/P(s0', C)) · P(s, D)
             = wp(s0', C, D).

Thus s1 and s2 are weighted probabilistic equivalent.

Consider the equivalence class C = {s4, s5, s6} under WPE R in Fig. 2 (left). Here s4 ≁ s5, since s4 can reach a c-state while s5 cannot. Thus we can conclude that ∼ is strictly finer than WPE.

Proof of Theorem 4

Proof. In order to prove this theorem it is sufficient to show that for each accepting cylinder set Cyl in D/R, there is a corresponding set of cylinder sets in the DTMC D that are accepted by the DRA A and that jointly have the same probability as Cyl. Consider the set Π of cylinder sets in D and D/R that are accepted by DRA A, such that for all Cyl = (s0, s1, ..., sn) and Cyl' = (s0', s1', ..., sn') with (si, si') related for all 0 ≤ i ≤ n, Cyl ∈ Π implies Cyl' ∈ Π. Then we have to prove:

Σ_{s1∈D} P(s0, s1, D) · Pr(Π_n^{s1}) = Pr(Π_n^{D}).   (1)

We prove Eq. (1) by induction over the length of the cylinder sets.
– Base Case: In this case, n = 0 and

Σ_{s1∈D} P(s0, s1, D) · Pr(Π_0^{s1}) = 1 = Pr(Π_0^{D})

if s0 ∈ D and Π_0 ≠ ∅, and 0 otherwise.
– Induction Hypothesis: Assume that for cylinder sets of length n ∈ N it holds that:

Σ_{s1∈D} P(s0, s1, D) · Pr(Π_n^{s1}) = Pr(Π_n^{D}).
– Induction Step: Consider the case n + 1:

Σ_{s1∈D} P(s0, s1, D) · Pr(Π_{n+1}^{s1})
  = Σ_{s1∈D} P(s0, s1, D) · Σ_{s2∈S} P(s1, s2) · Pr(Π_n^{s2})
  = Σ_{s1∈D} P(s0, s1, D) · Σ_{C∈S/R} Σ_{s2∈C} P(s1, s2) · Pr(Π_n^{s2}).

Multiplying the above expression by P(s1, C)/P(s1, C) we get:

  = Σ_{C∈S/R} Σ_{s1∈D} P(s0, s1, D) · (P(s1, C)/P(s1, C)) · Σ_{s2∈C} P(s1, s2) · Pr(Π_n^{s2})
  = Σ_{C∈S/R} Σ_{s1∈D} P(s0, s1, D) · P(s1, C) · Σ_{s2∈C} (P(s1, s2)/P(s1, C)) · Pr(Π_n^{s2})
  = Σ_{C∈S/R} Σ_{s1∈D} P(s0, s1, D) · P(s1, C) · Σ_{s2∈C} P(s1, s2, C) · Pr(Π_n^{s2}).

From the induction hypothesis we have:

Σ_{s2∈C} P(s1, s2, C) · Pr(Π_n^{s2}) = Pr(Π_n^{C}).

Also, from Def. 8 and Def. 10 we know that:

Σ_{s1∈D} P(s0, s1, D) · P(s1, C) = P'(D, C),

since Σ_{s1∈D} P(s0, s1, D) · P(s1, C) = wp(s0, D, C) = P'(D, C). Therefore we get:

Σ_{C∈S/R} P'(D, C) · Pr(Π_n^{C}) = Pr(Π_{n+1}^{D}).
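The quantity wp(s0, C, D) manipulated throughout these proofs can be computed directly from its defining sum. A sketch under an assumed dictionary encoding of P (the example chain in the usage below is ours, constructed so that the weighted probability does not depend on the predecessor):

```python
def wp(P, s0, C, D):
    """wp(s0, C, D): probability of moving from class C to class D,
    weighting the states of C by how s0 enters them.
    P: dict, state -> {successor: probability}."""
    pC = sum(P[s0].get(s, 0.0) for s in C)   # P(s0, C)
    if pC == 0.0:
        return None                           # s0 is not a predecessor of C
    return sum(P[s0].get(s, 0.0) / pC * sum(P[s].get(t, 0.0) for t in D)
               for s in C)
```

For a WPE relation, condition 2 of Def. 9 demands exactly that this value agrees for every predecessor s0 of C; checking that agreement over all class pairs is the core of a quotienting algorithm.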
Proof of Theorem 5

Proof. Let D = (S, P, AP, L, s0) be a DTMC and D/R = (S/R, P', AP, L', s0') be its quotient under WPE R. Let D1 = (S1, P1, AP1, L1, sp) be a DTMC composed in parallel with D and D/R, i.e., D ⊗ D1 and D/R ⊗ D1. Since we have defined the WPE relation on a single state space, to prove this theorem we take the disjoint union (S × S1) ∪ (S/R × S1). Let us define a relation R' ⊆ ((S × S1) ∪ (S/R × S1)) × ((S × S1) ∪ (S/R × S1)) such that R' = {((s, t), (C, t)) | s ∈ C, t ∈ S1, C ∈ S/R}. Let R* be the reflexive, symmetric and transitive closure of R'.

Now we prove that R* is a WPE relation. This is done by checking both conditions of Def. 9. Let ((s, t), (C, t)) ∈ R*.
1. From the definition of D/R we know that L'(C) = L(s). Now we have to show that L(s, t) = L(C, t). From Def. 20 we know that L(s, t) = L(s) ∪ L(t) = L'(C) ∪ L(t) = L(C, t).
2. Next we prove that for all E, F ∈ ((S × S1) ∪ (S/R × S1))/R* and all x0, x0' ∈ pred(E) it holds that wp(x0, E, F) = wp(x0', E, F). Let x0, x0' ∈ pred(E). Consider the following three cases, based on which state space the successors of x0, x0' in E belong to.
a) The successors of both x0 and x0' belong to S × S1. Since C ∈ S/R, we know that wp(s0, C, D) = wp(s0', C, D), where s0, s0' ∈ pred(C). Parallel composition is the synchronous product, where the two DTMCs move in a lock-step fashion, i.e., for all s1, s2 ∈ C and t ∈ S1: if s1 −p1→ s3, s2 −p2→ s4 and t −p3→ t', then (s1, t) −p1·p3→ (s3, t') and (s2, t) −p2·p3→ (s4, t'). That is, all the transitions from an equivalence class, say from C ∈ S/R to D ∈ S/R, get multiplied by the same factor in the product DTMC, and therefore for all (s1, t), (s2, t) ∈ S × S1 with s1, s2 ∈ C it holds that wp(x0, E, F) = wp(x0', E, F).
b) The successors of both x0 and x0' belong to S/R × S1. We have wp(x0, E, F) = wp(x0, E1, F1), where E1 ∈ E ∩ (S/R × S1) and F1 ∈ F ∩ (S/R × S1), which equals

Σ_{x∈E1} (Ps(x0, x)/Ps(x0, E1)) · Ps(x, F1) = Ps(E1, F1),

where Ps is the probability matrix for D/R ⊗ D1, defined on S/R × S1. Similarly, wp(x0', E, F) = wp(x0', E1, F1) = Ps(E1, F1).
c) The successors of x0 and x0' belong to S × S1 and S/R × S1, respectively. We already know that wp(x0', E, F) = wp(x0', E1, F1) = Ps(E1, F1). From Def. 10 we know that wp(s0, C, D) = P'(C, D), where s0 ∈ pred(C) and C, D ∈ S/R. Therefore, when taking the parallel composition, corresponding transitions of both DTMCs D and D/R get multiplied by the same factor, i.e., for all s ∈ C and t ∈ S1: if t −p3→ t', s −p1→ s' and C −p2→ D, then (s, t) −p1·p3→ (s', t') and (C, t) −p2·p3→ (D, t'), so as to obtain (D ⊗ D1) and (D/R ⊗ D1). Thus Ps(E1, F1) = wp(x0, E1, F1) = wp(x0, E, F).
Probabilistic CSP: Preserving the Laws via Restricted Schedulers Sonja Georgievska and Suzana Andova Department of Mathematics and Computer Science, Eindhoven University of Technology, The Netherlands {s.georgievska,s.andova}@tue.nl
Abstract. Extending Communicating Sequential Processes (CSP) by preserving the distributivity laws for internal choice, in the presence of probabilistic choice, has been an open problem so far. The problem stems from a well known disagreement between probabilistic choice and nondeterministic choice, that raises congruence issues for parallel composition. Recently, it has been argued that the congruence issue can be resolved only by restricting the power of the schedulers that resolve the nondeterminism. In our previous work, we have restricted the schedulers by suitably labeling the nondeterministic transitions. We have defined a ready-trace equivalence and a parallel composition with hiding for which the equivalence is a congruence. In this paper, we generalize our model and give a CSP-style axiomatic characterization of the ready-trace equivalence. From the axiomatization it follows that all distributivity axioms for internal choice from CSP are preserved, and no new axioms are added.
1 Introduction
Finding a satisfactory extension of the CSP process algebra [3, 18, 27] in the presence of internal probabilistic choice has been a well-studied open problem in the last two decades [8,17,19,23,25,29]. The main challenge has been to preserve the axioms (laws) from standard CSP, without adding additional axioms for the CSP operators [17, 23, 25, 29]. One of the hallmarks of the philosophy behind CSP is that the exact moment at which an internal nondeterministic choice occurs is unobservable. This property is desirable for internal probabilistic choice, too [18]. In terms of the axiomatic representation, unobservability of the internal choice means allowing distributivity of the operators over internal (probabilistic or nondeterministic) choice. It turned out, however, that this distributivity in the probabilistic case comes at the expense of losing other valuable properties, such as congruence for parallel composition [19, 23], idempotence for nondeterministic choice [25], or the ability to preserve the standard nondeterministic choice in the process language [17, 29]. Some approaches overcome these problems (e.g. [8]) by giving up on the unobservability of the internal choice and, thus, on any distributivity axioms. J.B. Schmitt (Ed.): MMB & DFT 2012, LNCS 7201, pp. 136–150, 2012. © Springer-Verlag Berlin Heidelberg 2012
Probabilistic CSP: Preserving the Laws via Restricted Schedulers
137
In this paper we show that CSP can be extended with probabilistic choice such that the original laws, viz. distributivity and idempotence for internal choice, are preserved, together with the congruence properties of the underlying semantics for parallel composition. To achieve that, we rely on a recent concept of restricting the power of the schedulers that resolve the nondeterminism in probabilistic concurrent processes [5–7, 12, 14]. These techniques aim at obtaining compositionality for linear-time semantical equivalences (viz., equivalences where internal choice is unobservable), but also at obtaining better estimates for the probabilistic behaviour of composed processes.

The compositionality problem and restricted schedulers. The problem with compositionality of trace-like, or linear-time, process equivalences in the presence of probabilistic choice can be explained with the following game example. Two players, x and y, are playing a game. Player x tosses a fair coin, waits a bit, then announces publicly that he is going to reveal the result of tossing (head or tail), and then reveals it. Player y waits a bit, then makes a guess about the result of coin-tossing by player x, then announces to reveal the result, and finally reveals it. The two players are modeled by the processes in Fig. 1.

[Fig. 1. Synchronized processes: the process graphs of x, y, x‖y, and x̄.]

Obviously, the probability that y makes the correct guess is 1/2. However, the graph of the synchronization x‖y of processes x and y in Fig. 1 does not suggest this probability. In order to obtain the probabilities with which action ω (correct guess) is reported, usually almighty schedulers are used to resolve the nondeterminism [8, 23, 28, 30]. There are four possible schedulers for the process x‖y, yielding the set {0, 1/2, 1} of values of probabilities to observe action ω. On the other hand, when process x̄ (Fig. 1) is synchronized with y, the set of probabilities to observe action ω, yielded by the schedulers of x̄‖y, is {1/2}. Since the two sets of probabilities are different, processes x‖y and x̄‖y cannot be equated – they exhibit different (probabilistic) behaviour. This implies that processes x and x̄ are also not equivalent by [8, 28, 30], since equating them would break the compositionality for parallel composition. Consequently, usage of the above described schedulers rules out trace-based equivalence relations for probabilistic processes. In addition, an unrealistic overestimation of the probabilistic behaviour of the system x‖y is derived. (This example was first pointed out in [23, 25, 28].)

It has been noted that this problem occurs due to cloning of an internal choice of a component in the parallel composition [5–7,12]. Namely, the local choice that
player y makes is copied twice in the parallel composition x‖y. In each future of x‖y after the probabilistic choice, the internal choice that y makes can be resolved in a different way. To overcome this artefact of the parallel composition, different solutions have been proposed (see e.g. [5–7, 12]) to restrict the power of the schedulers that resolve the nondeterminism. In [12] we propose a method to identify different appearances of the same internal nondeterministic choice, so that they are always resolved in the same manner. This is achieved by assigning labels to each transition of an internal nondeterministic choice; the set of all transition labels of a given internal nondeterministic choice uniquely identifies this choice. In the example in Fig. 1, this means that the two transitions of the internal choice of y are labeled differently, e.g. by τ1 and τ2, and this combination of τ1 and τ2 uniquely identifies this choice. The labels are propagated to the parallel composition x‖y, denoting that the two nondeterministic choices in x‖y are, in fact, two appearances of the same choice, and as such they have to be resolved in the same way. As a result, the probability with which action ω is reported in x‖y is always 1/2, yielding now equivalence between x‖y and x̄‖y, and allowing us to relate x and x̄. In [12] we also define how to label the nondeterminism that arises when actions are hidden after synchronization of processes, and define a probabilistic ready-trace [2, 26] equivalence. In [11] we define a general parallel composition for which this equivalence is a congruence. By the parallel composition of [11], processes synchronize on a set of actions and interleave on the rest of the actions, as in CSP [27]; in addition, the synchronized actions are hidden afterwards, as in CCS [24].

The main idea behind labeling the nondeterministic transitions that arise from parallelism is to integrate into the labels the exact information on which the resolution of the nondeterministic choice is based. Thus, two parallel appearances of the same nondeterministic choice are always resolved in the same way.

Contributions. In this paper we show that our approach leads to a probabilistic extension of CSP in which the original distributivity laws and the idempotent law for internal choice are preserved, together with the congruence properties of the underlying semantics for parallel composition. Technically, we first generalize the model introduced in [12], such that the nondeterministic transitions can be labeled with rational expressions over labels in the label set, rather than only with labels from the label set. This generalization is needed for the derivation of normal forms of finite processes, which themselves are essential to prove completeness of the axiomatization; the general model is interesting in itself. Then, the complete axiomatization shows that all distributivity laws, as well as the idempotent law for nondeterministic choice, are preserved from CSP, and no new laws for the choice operators of CSP are added. Thus, we obtain a probabilistic extension of CSP that includes nondeterministic choice, external choice, probabilistic choice, and parallel composition with hiding (the latter treated only in [13]), and that respects the original distributivity axioms of CSP.

Structure of the paper. The rest of the paper is structured as follows. In Sec. 2 we recall the model of process graphs and the probabilistic ready-trace
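The effect of restricting the schedulers in the coin-guess example can be checked by brute force. The following sketch (our encoding, not the formalism of [12]) enumerates the deterministic resolutions of y's guess: resolving the two copies of the choice independently yields the probability set {0, 1/2, 1}, while forcing both copies to agree, as the labeling does, yields {1/2}.

```python
from itertools import product

def correct_guess_prob(guess_after_heads, guess_after_tails):
    # x tosses heads/tails with probability 1/2 each; the scheduler may
    # pick a different guess for y in each branch of the coin toss
    return 0.5 * (guess_after_heads == 'h') + 0.5 * (guess_after_tails == 't')

# unrestricted scheduler: the two copies of y's choice are independent
unrestricted = {correct_guess_prob(gh, gt) for gh, gt in product('ht', repeat=2)}

# label-restricted scheduler: both copies carry the same labels,
# hence must be resolved identically
restricted = {correct_guess_prob(g, g) for g in 'ht'}
```

The restricted set matches the intuitive answer of 1/2, which is what makes the ready-trace equivalence between x‖y and x̄‖y possible.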
equivalence from [12]. In Sec. 3 we recall the choice operators for our processes as defined in [11]. The contribution of the present paper is presented in Sec. 4 (the generalization of the model and normal forms derivation) and Sec. 5 (the complete axiomatization). Section 6 ends with a discussion on related work and with concluding remarks. The proofs of the results can be found in the report [13].
2 Model and Probabilistic Ready-Trace Preorder
In this section we recall our model of process graphs together with the ready-trace equivalence, as defined in [12]. Three kinds of choices can be modeled by the process graphs: choice between several different actions – external choice; nondeterministic choice between several internal transitions – internal choice; and probabilistic choice.

Given a directed graph r, by s −l→ t we denote the existence of a transition (an edge) in r originating from state (node) s and ending in state t, labeled with l; we may omit s, t, or l from the notation to denote that they are arbitrary. For a finite index set I, by [{s −li→ si}i∈I] we denote that there exist transitions {s −li→ si}i∈I and that s has no other outgoing transitions. We presuppose a finite action set A and a countable set of internal labels L such that A ∩ L = ∅.

Definition 1 (Process graph). A process graph r, or simply process r, is a directed, finite-state and finite-transition graph with root r, such that all states are reachable from r and
– there exist three types of states: action, nondeterministic, and probabilistic; from an action, resp. nondeterministic, resp. probabilistic state there can originate only action, resp. internal, resp. probabilistic transitions;
– the action transitions are labeled with actions from A, such that no two action transitions with the same state of origin are labeled the same;
– the internal transitions are labeled with labels from L, such that (i) no two internal transitions with the same state of origin are labeled the same, and (ii) if s −τ1→ and t −τ1→ for some τ1 ∈ L, then s −τi→ iff t −τi→ for all τi ∈ L, i.e. if two states share a label on their outgoing internal transitions, then they have the same sets of labels on all of their outgoing transitions;
– the probabilistic transitions are labeled with scalars from (0, 1], such that (i) given two states, there is at most one probabilistic transition connecting them, and (ii) for each probabilistic state s, if [{s −πi→ si}i∈I] then Σi∈I πi = 1.

The set of all process graphs is denoted by G. We say a process is deterministic if it is without internal transitions. A state without outgoing transitions, i.e. a deadlock state, is considered an action state. The deadlock process is denoted by the constant 0. Given an action state s, by sa we denote the state (if it exists) for which s −a→ sa; by I(s) we denote the set {ai | s −ai→}. I(s) is called the menu of s. It is the set of actions that process s can perform initially. Note
that, as we discussed in the introduction, the internal transitions in our model have been assigned labels, local to the process to which they belong. The restrictions imposed on the labels of the internal transitions do not cause loss of generality, because the idea behind labeling is to assign a separate set of labels to each instance of internal choice, for the purpose of identifying the choice. In general, the labels are meant to contain information for the schedulers that resolve the internal nondeterminism, and after an appropriate unfolding and relabeling function is applied, they represent unknown probability distributions. Our model can also be seen as an orthogonal combination of reactive probabilistic processes [21] and parametric discrete-time Markov chains [9], where the transition probabilities are parameters, without a time component.

In [12] we defined how (possibly recursive) processes are unfolded, up to a certain length, to finite trees, by relabeling the internal choices. Intuitively, if in the original process graph one internal choice happens in the future of another, then the two choices are different, and even if they have the same sets of labels, they are relabeled with different sets of labels when unfolding. On the contrary, if choices with the same set of labels are placed "in parallel" in the original graph (e.g. the choices player y makes in the synchronization in Fig. 1), then they represent the same choice, and stay labeled with the same set of labels. In this paper, to avoid introducing too many preliminaries, we assume that the process graphs are already unfolded. We call them process trees. Thus, a process tree is a finite process graph (i.e. with only finite paths), given in the form of a tree. (The rest of the restrictions implied by the definition of unfolding [12] are irrelevant here and thus omitted.) Considering only finite processes is indeed plausible, because by restricting the power of the schedulers in concurrent probabilistic processes, the problem of verifying infinite processes becomes undecidable [16]. Moreover, for many systems, e.g. for security protocols (see [5]), considering only finite processes suffices.

Each process tree s defines a set of equations, called constraints for s. Similarly as in parametric Markov chains [9] or Constraint Markov Chains [4], they restrict the values of the labels assigned to an internal choice, such that they form a probability distribution. The constraints are defined next. We also define the deterministic process that results when the internal choices in a process tree are resolved according to the constraints.

Definition 2 (Constraints). Let s be a process tree. The set of constraints for s, C(s), is a set of linear equations over labels in L, such that an equation is in C(s) if and only if it has the form Σi∈I τi = 1 with [{t −τi→ ti}i∈I] for some nondeterministic state t in s. A resolution of C(s) is a function assigning values from [0, 1] to the variables in C(s), respecting the constraints C(s). Given a resolution λ of C(s), the corresponding resolution of the process tree s is the deterministic process that is obtained when every internal transition t −l→ t' in the process tree s is replaced by the probabilistic transition t −λ(l)→ t' if λ(l) > 0, or erased otherwise.
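Definition 2 can be sketched concretely. In the fragment below (the names and the triple encoding are our assumptions), internal transitions are triples (state, label, target) and a valuation plays the role of the resolution λ. Note that because a valuation is a function of the label alone, two appearances of the same labeled choice are automatically resolved identically.

```python
from collections import defaultdict

def check_constraints(internal_edges, valuation, tol=1e-9):
    """Check that the valuation satisfies every constraint
    sum of labels of one internal choice = 1."""
    per_state = defaultdict(float)
    for (s, label, _) in internal_edges:
        per_state[s] += valuation[label]
    return all(abs(total - 1.0) <= tol for total in per_state.values())

def resolve(internal_edges, valuation):
    """Turn labeled internal transitions into probabilistic ones,
    dropping transitions whose label is valued 0 (Definition 2)."""
    return [(s, valuation[label], t) for (s, label, t) in internal_edges
            if valuation[label] > 0.0]
```

A deterministic scheduler corresponds to a 0/1 valuation; the "randomized" schedulers used below are exactly the valuations with fractional values.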
Thus, we utilize "randomized" schedulers [7, 28], which have more power than the deterministic schedulers that assign trivial distributions to the transitions. We proceed by defining the probabilistic ready-trace preorder relation on process trees [11, 12]. The relation is based on the ability of a process to mimic the probabilistic behaviour of another process under an arbitrary resolution of the internal nondeterminism of the latter. By mimicking the probabilistic behaviour, we mean matching the probability of an arbitrary ready-trace [2, 26].

Definition 3 (Ready-trace). A ready-trace of length n is a sequence (M1, a1, M2, a2, ..., Mn−1, an−1, Mn), where Mi ⊆ A for all i ∈ {1, 2, ..., n} and ai ∈ Mi for all i ∈ {1, 2, ..., n−1}.

We assume that an observer is able to see the actions that the process performs, together with the menus out of which the actions are chosen. A ready-trace (M1, a1, M2, a2, ..., Mn−1, an−1, Mn) can be observed if the initial menu is M1, then action a1 ∈ M1 is performed, then the next menu is M2, then action a2 ∈ M2 is performed, and so on, until the observation ends when the current menu is Mn.

Next, given a deterministic process tree s, we define the process s(M,a), which is the process that s becomes by performing action a selected from menu M. For a state s, we write shortly s −π→ sn, with sn being an action state, rather than s −π1→ s1 −π2→ s2 ... −πn→ sn, for π = π1 · π2 · · · πn.

Definition 4. Let s be a deterministic process tree. Let M ⊆ A, a ∈ M. The process graph s(M,a) is obtained from s in the following way:
– if I(s) = M then s(M,a) ≡ sa;
– if {si}i∈I ≠ ∅ are all process graphs si such that I(si) = M and s −πi→ si for i ∈ I, then s(M,a) ≡ ⊕i∈I (πi/π) sia, for π = Σi∈I πi, where sia for each i ∈ I is such that si −a→ sia;
– in any other case, s(M,a) is undefined.

For example, for process x in Fig. 1, x({w},w) is the process that performs with probability 1/2 the sequence rh and with probability 1/2 the sequence rt.

As the probability to observe a ready-trace is conditioned upon the actions that are actually performed, we employ conditional probabilities on ready-traces. For these reasons, we exploit the Bayesian definition of probability [22] (see also [13]), in which the probability is naturally conditioned, rather than the usual measure-theoretic definition. Next, for a finite deterministic process s and a ready-trace (M1, a1, ..., Mn−1, an−1, Mn), we define the conditional probability to observe menu Mn in s, given that previously the sequence M1, a1, ..., Mn−1, an−1 was observed.

Definition 5. Let (M1, a1, ..., Mn−1, an−1, Mn) be a ready-trace of length n and let s be a finite deterministic process. The partial functions P^1_s(M) and P^n_s(Mn | M1, a1, ..., Mn−1, an−1) (for n > 1) are defined in the following way:

P^1_s(M) = Σ_{i∈I} πi · P^1_{si}(M)  if [{s −πi→ si}i∈I],
P^1_s(M) = 1                         if I(s) = M,
P^1_s(M) = 0                         otherwise.
Ps2 (M2 |M1 , a1 ) =
1 ,a1 )
Psn (Mn |M1 ,a1 ,...,Mn−1 ,an−1 )=
Ps1(M
(M2 )
undefined
if Ps1 (M1 ) > 0, otherwise.
1 Psn− (Mn |M2 ,a2 ,...,an−1 ) (M1 ,a1 )
if Ps1 (M1 )>0,
undefined, otherwise. Let the sample space consist of all the subsets of A and let s be a deterministic process tree. Function Ps1 (M ) can be interpreted as the probability that menu M is observed when process s starts executing. Let the sample space consist of all the ready-traces of length n. Function Psn (Mn |M1 , a1 , . . . Mn−1 , an−1 ) can be interpreted as the probability of the event {(M1 , a1 , . . . , Mn−1 , an−1 , Mn )}, given the event {(M1 , a1 , . . . Mn−1 , an−1 , X) | X ⊆ A}, when observing ready-traces of process s. It can be shown that these probabilities are well defined (see [22]). Example. For process x in Fig. 1 we obtain Px1 ({w})=1, Px2 ({r}|{w}, w)=1, Px3 ({h}|{w}, w, {r}, r)= 12 . Definition 6. Let s¯ and t¯ be two deterministic process trees of the same length m. s¯ and t¯ are ready-trace equivalent, denoted by s¯ ≈RT t¯, iff for all 1 < k ≤ m and for all ready-traces (M1 , a1 , . . . , Mk ), (i) Ps¯1 (M1 ) = Pt¯1 (M1 ) and (ii) Ps¯k (Mk |M1 , a1 , . . . Mk−1 , ak−1 ) is defined if and only if Pt¯k (Mk |M1 , a1 , . . . , Mk−1 , ak−1 ) is defined, and, in case they are both defined, they are equal. Definition 7 (Ready-trace preorder and equivalence). Let s and t be process trees. We say s implements t w.r.t. ready-traces (notation s RT t) if for every resolution s¯ of s, there exists a resolution t¯ of t such that s¯ ≈RT t¯. s and t are ready-trace-equivalent, denoted by s ≈RT t, iff s RT t and t RT s. Informally, a process s implements a process t if and only if for every resolution s¯ of the nondeterminism in s, there is a resolution t¯ of the nondeterminism in t, such that for every ready-trace (M1 , a1 , . . . , Mk ), the probability to observe Mk , given that previously the sequence M1 , a1 , . . . Mk−1 , ak−1 was observed, is defined at the same time for both s¯ and t¯, and, moreover, in case both probabilities are defined, they coincide. 
In general, process s implements process t iff s contains "less" internal nondeterminism than process t.

Example. Processes x and x̄ in Fig. 1 are ready-trace equivalent.
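To make the definitions above concrete, here is an illustrative sketch (our own construction, not the paper's formal apparatus) of how P^1 and the conditional probability P^2 can be computed on a finite deterministic process tree. The `Act`/`Prob` classes and all function names are hypothetical; `None` plays the role of "undefined".

```python
class Act:
    """Action node of a deterministic process tree; its menu is branches.keys()."""
    def __init__(self, branches): self.branches = branches   # action -> subtree

class Prob:
    """Probabilistic node: list of (probability, subtree) pairs."""
    def __init__(self, branches): self.branches = branches

def menu_dist(s):
    """Distribution over the Act nodes reached after resolving the
    probabilistic choices at the top of s, keyed by (menu, node id)."""
    if isinstance(s, Act):
        return {(frozenset(s.branches), id(s)): (1.0, s)}
    out = {}
    for p, t in s.branches:
        for key, (q, node) in menu_dist(t).items():
            acc, _ = out.get(key, (0.0, node))
            out[key] = (acc + p * q, node)
    return out

def P1(s, M):
    """Probability that menu M is observed when s starts executing."""
    return sum(q for (menu, _), (q, _) in menu_dist(s).items()
               if menu == frozenset(M))

def P2(s, M1, a, M2):
    """Conditional probability of observing menu M2 after the ready-trace
    prefix (M1, a); None models 'undefined'."""
    nodes = [(q, n) for (menu, _), (q, n) in menu_dist(s).items()
             if menu == frozenset(M1)]
    total = sum(q for q, _ in nodes)
    if total == 0:
        return None
    return sum(q * P1(n.branches[a], M2) for q, n in nodes) / total
```

For instance, a process that offers w and then flips a fair coin between h and t yields P1 = 1 for the initial menu {w} and P2 = 1/2 for observing {h} after (w).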
3 The Language of Probabilistic CSP
In this section we present the CSP-based probabilistic process language SP_p, already defined in [11]. Recall that one of the objectives of this paper is to show the preservation of the distributivity laws for the choice operators from CSP in the probabilistic extension proposed here. For this purpose, the main attention in the SP_p language is given to the choice operators. Note, however, that the parallel composition operator has also been defined in [11], and axiomatized in [13]. Sequential composition can be included straightforwardly following the
lines of [18,27]. Also, the priority operator [2], which is not part of the CSP syntax but is characteristic for the ready-trace semantics, has been defined in [10]. In the rest of this paper, to ease the notation, we assume that every index set I used is finite and non-empty, and we agree that A = {a_1, a_2, ..., a_n} every time an action set {a_i}_{i∈I} ⊆ A is used. The SP_p process terms are generated by the following grammar:

x ::= 0 | □_{i∈I} a_i x_i | x □ x | ⊓_{i∈I} τ_i x_i | ⊕_{i∈I} π_i x_i
where 0 ∉ A is a new symbol, {a_i}_{i∈I} ⊆ A, {π_i}_{i∈I} ⊂ (0, 1] such that Σ_{i∈I} π_i = 1, and {τ_i}_{i∈I} ⊂ L such that the labels {τ_i}_{i∈I} do not appear in the terms {x_i}_{i∈I}. We let p, q, ... range over SP_p process terms.

The constant 0 stands for the deadlock process. The external action choice □_{i∈I} a_i p_i stands for a choice among the actions in {a_i}_{i∈I}; it proceeds as process p_j if action a_j is chosen and executed. We write ap (prefix) rather than □_{i∈{1}} a_i p_i, and simply a rather than a0. The internal choice ⊓_{i∈I} τ_i p_i stands for a labeled internal choice between the processes {p_i}_{i∈I}. The probabilistic choice ⊕_{i∈I} π_i p_i is process p_i with probability π_i, for i ∈ I. The operator p □ q stands for a general external choice between p and q.

Table 1. Operational semantics for the choice operators
(R1) k ∈ I ⊢ □_{i∈I} a_i p_i −a_k→ p_k
(R2) k ∈ I ⊢ ⊓_{i∈I} τ_i p_i ⇝^{τ_k} p_k
(R3) k ∈ I ⊢ ⊕_{i∈I} π_i p_i ⇝^{π_k} p_k
(R4) p −a→ p′, q −a→ q′ ⊢ p □ q −a→ τ_1^{new} p′ ⊓ τ_2^{new} q′
(R5) p −a→ p′, and q can perform neither a nor any internal/probabilistic transition ⊢ p □ q −a→ p′ (and symmetrically)
(R6) p ⇝^π p′, and neither p nor q can perform an internal transition ⊢ p □ q ⇝^π p′ □ q
(R7) q ⇝^π q′, p can perform neither an internal nor a probabilistic transition, and q can perform no internal transition ⊢ p □ q ⇝^π p □ q′
(R8) p ⇝^τ p′ ⊢ p □ q ⇝^τ p′ □ q
(R9) q ⇝^τ q′, and p can perform no internal transition ⊢ p □ q ⇝^τ p □ q′
Table 1 gives the operational semantics of SP_p process terms. Rules R1 and R3 are standard. Rule R2 states that, when several processes are composed via a labeled internal choice, labels are assigned to the new internal transitions accordingly. Rule R4, as in CSP [27], states that if two processes can initially perform action a, then the external choice between them can also perform action a; however, the choice of whether the first or the second process is executed afterwards is nondeterministic, i.e. internal. Note that the transitions of the internal choice are suitably labeled with newly introduced labels. Rule R5 shows the priority of internal/probabilistic transitions over action transitions when bound in a general external choice. This is because the environment (the external choice) is unable to prevent the internal transitions from occurring [27].
Rules R6–R9 demonstrate the priority of an internal transition over a probabilistic transition in a general external choice. This is an arbitrary technical choice; we could equally well give priority to the probabilistic transitions. This freedom stems from the fact that the internal transitions are labeled with all the information they need in order to be resolved. In fact, since probabilistic choice is a special case of internal choice, we argue that any ordering leads to the same axiom set (which will become evident later). Rules R8 and R9 (resp. R6 and R7) express that, if both processes can perform internal (resp. probabilistic) transitions initially, then, in a general external choice, the internal (resp. probabilistic) transitions of the left process happen first. This is also an arbitrary technical choice, and the freedom is justified by the fact that in [27] the internal choices, one per component bound in an external choice, are all combined into one internal choice, e.g. (a ⊓ b) □ (c ⊓ d) has the same graph representation as (a □ c) ⊓ (a □ d) ⊓ (b □ c) ⊓ (b □ d).
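The cited identity can be sanity-checked by enumerating how the internal choices resolve. The following is a toy sketch in a term language of our own (prefix-only processes, tree semantics rather than the transition-graph semantics of [27]); all names are hypothetical:

```python
from itertools import product

def resolutions(t):
    """All ways to resolve internal choices in a toy term language:
    ('act', a)        -- action prefix (then stop)
    ('ext', t1, t2)   -- external choice
    ('int', [t1,...]) -- internal choice"""
    kind = t[0]
    if kind == 'act':
        return {t}
    if kind == 'int':
        # an internal choice resolves to a resolution of one of its branches
        return {r for s in t[1] for r in resolutions(s)}
    # external choice: resolve the internal choices inside both operands
    return {('ext', l, r)
            for l, r in product(resolutions(t[1]), resolutions(t[2]))}

a, b, c, d = (('act', x) for x in 'abcd')
lhs = ('ext', ('int', [a, b]), ('int', [c, d]))                 # (a ⊓ b) □ (c ⊓ d)
rhs = ('int', [('ext', x, y) for x, y in product([a, b], [c, d])])
```

Both sides resolve to the same four deterministic external choices, which is exactly the combination of the per-component internal choices into one.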
4 Generalization of the Model and Normal Forms
In this section we generalize the model of process trees so that the labels on the internal transitions can be rational expressions over the label variables in the set L. This generalization is needed in order to derive normal forms of the process trees, which play an essential role in the proof of the completeness of the axiomatization presented in the following section. The need to generalize the model comes from the observation that processes such as τ_1 a ⊓ τ_2 a ⊓ τ_3 b and τ_4 a ⊓ τ_5 b are ready-trace equivalent and should have comparable normal forms. We obtain that the normal form of the former process is (τ_1+τ_2)a ⊓ τ_3 b, while the normal form of the latter is τ_4 a ⊓ τ_5 b. We then define two normal forms to be equivalent if they yield the same sets of deterministic processes when the internal nondeterminism is resolved.

In order to generalize the process trees, we shall need a subset Q of the rational expressions over the elements of L, generated by the following grammar:

ϕ ::= α | l | ϕ_1 + ϕ_2 | ϕ_1 · ϕ_2 | ϕ_1/ϕ_2

where α ∈ (0, 1], l ∈ L, and "+", "·", and "/" are ordinary algebraic addition, multiplication and fraction, respectively. A general-process tree is defined by relaxing the conditions for a process tree: the internal transitions are labeled with expressions in Q, rather than only with labels in L, and there are no restrictions on the expressions labeling the internal transitions.

Next, we define transformations on general-process trees that shall lead to normal forms. We call an internal transition s ⇝^ϕ t trivial if s has no other outgoing transitions.

Definition 8 (General-process tree transformations). Let p be a general-process tree. A transformation of p is called
(i) substitution if, for a state s in p, every probabilistic transition s ⇝^π s′ is replaced by an internal transition s ⇝^π s′ labeled with the constant π;
(ii) erasing if, given a trivial transition s ⇝^ϕ s′, the states s and s′ are identified;
(iii) compressing if, for a state s′ such that s ⇝^ϕ s′ and s′ ⇝^{ϕ_i} s_i for i ∈ I, the transitions {s′ ⇝^{ϕ_i} s_i}_{i∈I} are erased, and new transitions {s ⇝^{ϕ·ϕ_i} s_i}_{i∈I} are created;
(iv) flipping if, given a state s such that s ⇝^ϕ s_1, s ⇝^η s_2, s_1 −a_i→ s_{1i} for i ∈ I, and s_2 −a_i→ s_{2i} for i ∈ I, the transitions s ⇝^ϕ s_1 and s ⇝^η s_2 are erased, and new states s′ and {s_i′}_{i∈I} are created, together with the transitions s ⇝^{ϕ+η} s′, s′ −a_i→ s_i′, s_i′ ⇝^{ϕ/(ϕ+η)} s_{1i}, and s_i′ ⇝^{η/(ϕ+η)} s_{2i};
(v) deadlocks-joining if, given a state s such that s ⇝^ϕ s_1, s ⇝^η s_2, and s_1, s_2 are both deadlock states, the transitions s ⇝^ϕ s_1 and s ⇝^η s_2 are erased, and a new transition s ⇝^{ϕ+η} s_1 is created.

Next, we narrow down the set of general trees of interest to the set of those that are obtained from process trees by transformations.

Definition 9 (CG process tree). A coherent general-process tree, CG process tree for short, is a general-process tree that can be obtained from a process tree when transformed zero or more times by Def. 8.

Note that the transformations from Def. 8 do not essentially increase the set of constraints of a general-process tree, i.e. no new labels are added by a transformation step and no new restrictions on the old labels are imposed. We formalize this statement in the following proposition. For convenience, we assume that a rational expression ϕ/ϕ in Q that is used for labeling a transition in a CG process tree always has value 1, even when ϕ evaluates to 0. Below, we justify this assumption.

Proposition 1. Let p be a CG process tree and p′ be a CG process tree obtained from p via a transformation step. Then the systems of equations C(p) and C(p) ∪ C(p′) are equivalent.

Since a transformation does not increase the set of constraints defined by the original process tree, the CG process tree inherits the (linear) constraints from its original process tree. We introduce some notation to formalize this. Let p be a process tree and p′ be a CG process tree obtained from p via zero or more transformation steps. By C̃(p′) we denote C(p). Recall that a resolution of a set of constraints is a function that assigns values in [0, 1] to the variables in the constraints, respecting the constraints. A resolution of p′ is the process tree p̄ obtained when, for an arbitrary resolution λ of C̃(p′), every transition t ⇝^ϕ t′ in the tree p′ is replaced by t ⇝^{λ(ϕ)} t′, if λ(ϕ) ≠ 0, or erased otherwise, where λ(ϕ) for ϕ ∈ Q \ L is defined in the usual way, i.e. λ(α) = α for α ∈ (0, 1], λ(ϕ_1+ϕ_2) = λ(ϕ_1)+λ(ϕ_2), λ(ϕ_1·ϕ_2) = λ(ϕ_1)·λ(ϕ_2), and λ(ϕ_1/ϕ_2) = λ(ϕ_1)/λ(ϕ_2). From now on, unless
stated otherwise, we assume that the process tree from which a CG process tree originates is implicitly given, and we shall omit it. Note that the assumption that ϕ/ϕ always evaluates to 1 can now be justified by the fact that a label ϕ/η on an internal transition in a CG process tree originates from "flipping". Namely, in this case there is a transition labeled with η preceding the transition labeled with ϕ/η (see Def. 8). When λ(η) = 0, the transition labeled with η does not appear in the resolution of the CG process tree, and thus the transition labeled with ϕ/η does not appear either. Because of this, our assumption that ϕ/ϕ always evaluates to 1 in Prop. 1 has no influence on the resolution of a CG process tree p, which is the only context in which the constraints C(p) are used.

Having defined resolutions of CG process trees, the definition of ready-trace equivalence for process trees extends easily to CG process trees.

Definition 10 (Ready-trace equivalence). Let s and t be CG process trees. We say s implements t w.r.t. ready-traces, denoted by s ⊑RT t, iff for every resolution s̄ of s there exists a resolution t̄ of t such that s̄ ≈RT t̄. The CG process trees s and t are ready-trace equivalent, denoted by s ≈RT t, iff s ⊑RT t and t ⊑RT s.

Proposition 2 (Soundness of the transformations). The transformations in Definition 8 are sound w.r.t. ≈RT when applied to CG process trees, i.e. if p′ is obtained from p via a transformation, then p′ ≈RT p.

Next, we define normal forms of CG process trees and show that ready-trace equivalent CG process trees reduce to equivalent normal forms, i.e. to normal forms that yield the same sets of deterministic processes when the internal nondeterminism is resolved.

Definition 11 (Normal form). A CG process tree is in ready-trace normal form (RT-normal form) if none of the transformations of Def. 8 can be applied.

Proposition 3. Every general-process tree transformation sequence eventually ends in an RT-normal form.
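The evaluation λ(ϕ) of rational labels described above can be sketched as follows. The tuple-based expression representation and the function names are our own; the special case for ϕ/ϕ implements the convention discussed in the text.

```python
from fractions import Fraction

def eval_label(phi, lam):
    """Evaluate a rational label expression phi in Q under a resolution
    lam (mapping label variables in L to values).  phi is a nested tuple:
    ('const', c), ('var', l), ('+', e1, e2), ('*', e1, e2), or ('/', e1, e2)."""
    op = phi[0]
    if op == 'const':
        return Fraction(phi[1])
    if op == 'var':
        return Fraction(lam[phi[1]])
    if op == '/' and phi[1] == phi[2]:
        # convention from the text: phi/phi evaluates to 1,
        # even when phi itself evaluates to 0
        return Fraction(1)
    a, b = eval_label(phi[1], lam), eval_label(phi[2], lam)
    return a + b if op == '+' else a * b if op == '*' else a / b
```

For a "flipping" label such as ϕ/(ϕ+η) this yields the expected renormalized weight once the resolution fixes the values of the label variables.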
Definition 12. Two CG process trees p and q are almost-equal (p ≃ q) iff for every resolution p̄ of p there is a resolution q̄ of q isomorphic to p̄, and vice versa, for every resolution q̄ of q there is a resolution p̄ of p isomorphic to q̄.

Proposition 4. Two CG process trees p and q reduce to almost-equal RT-normal forms iff p and q are ready-trace equivalent.
5 Axiomatic Characterization of ≈RT
In this section we give an axiomatic characterization of probabilistic ready-trace equivalence ≈RT defined on CG process trees for the operators in SPp . The congruence result for ready-trace equivalence for the choice operators and the parallel composition operator of [11] can be found in [11] and [13]. The axioms
(A1) x □ 0 = x
(A2) 0 □ x = x
(A3) (□_{i∈I} a_i x_i) □ (□_{j∈J} a_j y_j) = (□_{k∈I∩J} a_k (τ_{k1} x_k ⊓ τ_{k2} y_k)) □ (□_{i∈I\J} a_i x_i) □ (□_{j∈J\I} a_j y_j), if I ∩ J ≠ ∅
(A4) (□_{i∈I} a_i x_i) □ (□_{i∈J} a_i x_i) = □_{i∈I∪J} a_i x_i, if I ∩ J = ∅
(A5) ⊕_{i∈I} π_i x_i = ⊓_{i∈I} π_i x_i
(A6) ⊓_{i∈I} ϕ_i x = x
(A7) □_{i∈I} a_i (⊓_{j∈J} ϕ_j x_{ij}) = ⊓_{j∈J} ϕ_j (□_{i∈I} a_i x_{ij})
(A8) (⊓_{i∈I} ϕ_i x_i) □ y = ⊓_{i∈I} ϕ_i (x_i □ y)
(A9) y □ (⊓_{i∈I} ϕ_i x_i) = ⊓_{i∈I} ϕ_i (y □ x_i)
(A10) ⊓_{i∈J} ϕ_i (⊓_{k∈K} ρ_{ik} x_{ik}) = ⊓_{i∈J, k∈K} (ϕ_i ρ_{ik}) x_{ik}

Fig. 2. Axioms for choice operators
for parallel composition and the appropriate elimination theorem can be found in the long version of the present paper [13]. The set of axioms of the theory SP_p is given in Fig. 2, where x, y, ... are arbitrary SP_p terms, and we assume that A = {a_i}_{i∈N} for some index set N. For the purpose of axiomatization, we allow a general internal choice ⊓_{i∈I} ϕ_i x_i, with ϕ_i ∈ Q, in SP_p. For axiom A3 it is assumed that the labels of the internal choices, τ_{k1} and τ_{k2}, for every k, are new with respect to the labels that appear in the rest of the terms in the axiom. Note how axiom A5 reflects the view that probabilistic choice is a special type of internal choice. Note also the existence of axiom A7 (distributivity of action choice over internal choice), which is typical for non-probabilistic CSP.

We call an SP_p term a basic term if only external action choice, internal choice and probabilistic choice appear in the term (i.e. the general external choice is not present). The next proposition says that every term can be rewritten to a basic term by using the axioms of SP_p.

Proposition 5 (Elimination). For an SP_p process term x there exists a basic term y such that SP_p ⊢ x = y.

Observe that each CG process tree can be mapped to a basic term; thus, we can use the name p for a process term that represents the CG process tree p.

Proposition 6 (Soundness). Let p and q be CG process trees. If SP_p ⊢ p = q, then p ≈RT q.

The following theorem states that two ready-trace equivalent CG process trees can be reduced to almost-equal process trees using the axioms of the theory SP_p.

Theorem 1 (Completeness). Let p and q be CG process trees such that p ≈RT q. There exist CG process trees p′ and q′ such that SP_p ⊢ p = p′, SP_p ⊢ q = q′, and p′ ≃ q′.
Proof. By Proposition 4 and Proposition 5, it is enough to show that each of the general-process tree transformation steps given in Definition 8 can be mimicked by the axioms A1–A10 (or, more concretely, by axioms A5, A6, A7 and A10). Step (i) can be mimicked by axiom A5, step (ii) by axiom A6, step (iii) by axioms A6 and A10, step (iv) by using first axiom A10 in a right-to-left direction and then A7, while step (v) can be performed by using first axiom A10 in a right-to-left direction and then A6.

Note that an infinite set of axioms is required in order to reduce two ready-trace equivalent CG process trees to isomorphic CG process trees, which is why we restrict ourselves to deriving almost-equal process trees. However, the purpose of the present text is to obtain an algebra in which the CSP laws are preserved in the presence of probabilistic choice, and in which no new laws regarding the interplay between the different choice operators are added. We leave the decidability problem for future work.
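The flattening effect of axioms A5 and A10 used in the proof can be illustrated with a toy rewriter on nested choice terms. This is a sketch of our own (hypothetical term representation), not the paper's rewriting machinery:

```python
from fractions import Fraction

# term: ('act', a) leaf, ('ichoice', [(label_value, term), ...]),
#       ('pchoice', [(prob, term), ...])
def normalize(t):
    """Apply A5 (probabilistic choice becomes internal choice) and
    A10 (flatten nested internal choices, multiplying labels) bottom-up."""
    if t[0] == 'act':
        return t
    kind, branches = t
    branches = [(Fraction(w), normalize(s)) for w, s in branches]
    if kind == 'pchoice':        # A5: treat the probabilities as labels
        kind = 'ichoice'
    flat = []
    for w, s in branches:
        if s[0] == 'ichoice':    # A10: combine labels as (phi_i)(rho_ik)
            flat += [(w * w2, s2) for w2, s2 in s[1]]
        else:
            flat.append((w, s))
    return (kind, flat)
```

For example, ⊕{½(⊓{⅓a, ⅔b}), ½c} flattens to the single internal choice ⊓{⅙a, ⅓b, ½c}, mirroring how step (iii) compresses chained internal transitions.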
6 Related Work and Conclusion
Related work. As mentioned in Section 1, the problem of extending CSP with probabilistic choice has been addressed many times before [8,17,19,23,25,29]. The main goal of these approaches, except for [8], is to allow distributivity of action prefix over internal probabilistic choice, in the spirit of the original CSP [3], in order to preserve the verification power of the CSP axioms.

The report [23] defines trace-style semantic equivalences for probabilistic CSP processes, resolving the internal nondeterminism with the almighty (randomized) schedulers discussed in the introduction. Not surprisingly, the equivalences are not congruences: they do not equate the processes x □ y and x̄ □ y in Fig. 1, although the processes x and x̄ are equated. Interestingly, reference [23] is the earliest one that notes the problem related to the processes in Fig. 1.

Reference [29] considers processes without internal nondeterminism and equates two processes only if they cannot be distinguished by the environment, which is represented by a sequence of actions. As a result, although [29] allows distributivity of external action choice over probabilistic choice, it also makes undesirable identifications from the point of view of process theory, e.g. by equating the processes ½ca ⊕ ½cb and ½c(a □ b) ⊕ ½c.

In [25] a process is a probability distribution over standard CSP processes; two processes are equivalent if the probability distributions are the same. In other words, all the probabilistic choices are resolved before the execution of the process. By "lifting" the probabilistic choices to the root, the nondeterministic choices are "pushed" downwards and replicated. As discussed in the introduction, this replication leads to a loss of probability information. With this approach the idempotence of the internal choice is also lost [25], viz.
the law x ⊓ x = x does not hold, and action choice does not distribute over probabilistic choice (this distributivity is important for the congruence property of interleaving, see [10]). The loss of idempotence for internal choice was overcome
by the button-pushing testing equivalence defined in [19]. However, congruence for parallel composition could still not be achieved [20]. Reference [17] defines a probabilistic CSP based on ready-trace equivalence for processes given in denotational semantics, but nondeterministic choice could not be included. Finally, [8] defines a probabilistic version of CSP with a rich set of operators, including parallel composition. But, as the almighty schedulers discussed in the introduction are used, both the probability information and the distributivity axioms are sacrificed.

Conclusion. We have extended CSP with a probabilistic choice operator, preserving the distributivity laws from CSP and adding no new laws for the CSP operators. Defining such an extension was a long-standing open problem, because allowing the distributivity laws caused congruence issues for parallel composition. To achieve the congruence property, but also to obtain better probabilistic estimates in a parallel composition [10,12], our approach employs restricted schedulers for the nondeterminism in a parallel composition, contrary to the standardly used almighty schedulers [8,28]. Restricted schedulers have recently been used for deriving compositionality for trace-style semantics for probabilistic systems [6,7,12], in the context of security protocols [1,5], and for improved verification of probabilistic distributed systems [5,12,14,15].

As future work, it is interesting to explore whether, and to what extent, recursive processes can be treated in the present setting. Considering recursive processes under restricted schedulers has proven difficult [16]. Extending the probabilistic CSP language with a hiding operator is yet another challenge. In the non-probabilistic case the hiding operator is useful to abstract away unimportant information.
However, it has by now become clear that adding a separate hiding operator in the presence of probabilistic choice does not fit well with our goal of retaining the probabilities: it turns external choice into internal choice and thus gives even more (unrealistic) power to the schedulers. Hence, properly combining the probabilistic choice and hiding operators remains an open problem.
References

1. Andrés, M.E., Palamidessi, C., van Rossum, P., Sokolova, A.: Information hiding in probabilistic concurrent systems. Theor. Comp. Sc. 412(28), 3072–3089 (2011)
2. Baeten, J.C.M., Bergstra, J.A., Klop, J.W.: Ready-trace semantics for concrete process algebra with the priority operator. The Comp. Journal 30(6), 498–506 (1987)
3. Brookes, S.D., Hoare, C.A.R., Roscoe, A.W.: A theory of communicating sequential processes. Journal of ACM 31(3), 560–599 (1984)
4. Caillaud, B., Delahaye, B., Larsen, K., Legay, A., Pedersen, M., Wasowski, A.: Constraint Markov chains. Theor. Comp. Sc. 412(34), 4373–4404 (2011)
5. Chatzikokolakis, K., Palamidessi, C.: Making random choices invisible to the scheduler. Information and Computation 208(6), 694–715 (2010)
6. Cheung, L., Lynch, N., Segala, R., Vaandrager, F.: Switched PIOA: Parallel composition via distributed scheduling. Theor. Comp. Sc. 365(1-2), 83–108 (2006)
7. de Alfaro, L., Henzinger, T., Jhala, R.: Compositional Methods for Probabilistic Systems. In: Larsen, K.G., Nielsen, M. (eds.) CONCUR 2001. LNCS, vol. 2154, pp. 351–365. Springer, Heidelberg (2001)
8. Deng, Y., van Glabbeek, R.J., Hennessy, M., Morgan, C.: Characterising testing preorders for finite probabilistic processes. Logical Methods in Comp. Sc. 4(4:4), 1–33 (2008)
9. Doob, J.L.: Stochastic Processes. John Wiley and Sons, New York (1953)
10. Georgievska, S.: Probability and Hiding in Concurrent Processes. PhD thesis, Eindhoven University of Technology (2011)
11. Georgievska, S., Andova, S.: Composing Systems While Preserving Probabilities. In: Aldini, A., Bernardo, M., Bononi, L., Cortellessa, V. (eds.) EPEW 2010. LNCS, vol. 6342, pp. 268–283. Springer, Heidelberg (2010)
12. Georgievska, S., Andova, S.: Retaining the Probabilities in Probabilistic Testing Theory. In: Ong, L. (ed.) FOSSACS 2010. LNCS, vol. 6014, pp. 79–93. Springer, Heidelberg (2010)
13. Georgievska, S., Andova, S.: Probabilistic CSP: Preserving the laws via restricted schedulers. Technical Report (2011), http://www.win.tue.nl/~sgeorgie/axioms2011_long.pdf
14. Giro, S., D'Argenio, P.: On the expressive power of schedulers in distributed probabilistic systems. In: QAPL 2009. ENTCS, vol. 253(3), pp. 45–71 (2009)
15. Giro, S., D'Argenio, P., Ferrer Fioriti, L.M.: Partial Order Reduction for Probabilistic Systems: A Revision for Distributed Schedulers. In: Bravetti, M., Zavattaro, G. (eds.) CONCUR 2009. LNCS, vol. 5710, pp. 338–353. Springer, Heidelberg (2009)
16. Giro, S., D'Argenio, P.R.: Quantitative Model Checking Revisited: Neither Decidable Nor Approximable. In: Raskin, J.-F., Thiagarajan, P.S. (eds.) FORMATS 2007. LNCS, vol. 4763, pp. 179–194. Springer, Heidelberg (2007)
17. Gomez, F.C., De Frutos Escrig, D., Ruiz, V.V.: A Sound and Complete Proof System for Probabilistic Processes. In: Rus, T., Bertrán, M. (eds.) AMAST-ARTS 1997, ARTS 1997, and AMAST-WS 1997. LNCS, vol. 1231, pp. 340–352. Springer, Heidelberg (1997)
18. Hoare, C.A.R.: Communicating Sequential Processes. Prentice Hall (1985)
19. Kwiatkowska, M., Norman, G.: A testing equivalence for reactive probabilistic processes. In: EXPRESS 1998. ENTCS, vol. 16(2), pp. 1–19 (1998)
20. Kwiatkowska, M.Z., Norman, G.J.: A fully abstract metric-space denotational semantics for reactive probabilistic processes. In: COMPROX 1998. ENTCS, vol. 13, pp. 1–33 (1998)
21. Larsen, K.G., Skou, A.: Bisimulation through probabilistic testing. Information and Computation 94, 1–28 (1991)
22. Lindley, D.V.: Introduction to Probability and Statistics from a Bayesian Viewpoint. Cambridge University Press (1980)
23. Lowe, G.: Representing nondeterministic and probabilistic behaviour in reactive processes. Technical Report PRG-TR-11-93, Oxford University Computing Labs (1993)
24. Milner, R.: A Calculus of Communicating Systems. Springer, Heidelberg (1980)
25. Morgan, C., McIver, A., Seidel, K., Sanders, J.W.: Refinement-oriented probability for CSP. Formal Aspects of Computing 8(6), 617–647 (1996)
26. Pnueli, A.: Linear and Branching Structures in the Semantics and Logics of Reactive Systems. In: Brauer, W. (ed.) ICALP 1985. LNCS, vol. 194, pp. 15–32. Springer, Heidelberg (1985)
27. Roscoe, A.W.: The Theory and Practice of Concurrency. Prentice Hall (1998)
28. Segala, R.: Modeling and Verification of Randomized Distributed Real-time Systems. PhD thesis, MIT (1995)
29. Seidel, K.: Probabilistic communicating processes. Theor. Comp. Sc. 152, 219–249 (1995)
30. Wang, Y., Larsen, K.G.: Testing probabilistic and nondeterministic processes. In: Proceedings of the IFIP TC6/WG6.1 Twelfth International Symposium on Protocol Specification, Testing and Verification XII, pp. 47–61 (1992)
Heuristics for Probabilistic Timed Automata with Abstraction Refinement

Luis María Ferrer Fioriti and Holger Hermanns

Saarland University – Computer Science, Saarbrücken, Germany
Abstract. Probabilistic Timed Automata provide a theory to model and verify real-time systems with non-deterministic and probabilistic behaviors. The main approach to model checking Probabilistic Timed Automata is based on encoding the time behavior either with abstractions based on a region graph or with digitalization of clocks. In this paper we present a sound method that combines digitalization to encode time behavior and predicate abstraction to reduce the state space, allowing the analysis of models with possibly infinite numbers of locations. Our method is compatible with abstraction refinement techniques previously used for Probabilistic Automata. Based on experimental results, we show that the underlying digital semantics of clocks is prone to produce an overhead in the abstraction process that can sometimes make the model checking infeasible. To cope with this problem we present some heuristics to handle clocks and show their impact on the verification. Keywords: probabilistic timed automata, model checking, abstraction refinement, probabilistic games.
1 Introduction
Probabilistic Timed Automata (PTA) [24] arise as an orthogonal combination of Timed Automata (TA) [1] and Probabilistic Automata (PA) [26]. PTA support non-determinism, probabilistic behavior and time. This enables the modelling of real-time systems operating in an uncertain environment or using randomization as part of their functionality, e.g. to break symmetries or to securely encode messages. Non-determinism is especially useful to model systems that are composed of several components running concurrently. In addition, non-determinism can be used to abstract from implementation details. PTA can be enriched with variables and composed to form networks of variable-decorated PTA, much like networks of variable-decorated TA as they appear in real-time model checkers such as UPPAAL.

Model checking algorithms for PTA can be classified into two broad categories: symbolic techniques, based on either forwards [21,8] or backwards reachability [22], and the digital clocks approach [20]. In recent years, several model checkers have been developed that are able to analyse PTA, for example mcpta [12] from the Modest toolset, which encodes time using digital clocks, UPPAAL PRO [32], which uses time-abstract bisimulations, and PRISM, which supports digital clocks and abstraction refinement based on zones [23].

J.B. Schmitt (Ed.): MMB & DFT 2012, LNCS 7201, pp. 151–165, 2012.
© Springer-Verlag Berlin Heidelberg 2012
One of the main obstacles faced by model checkers (of any kind) is the state-space explosion problem, which is worsened by the presence of clocks. Several techniques have been developed to alleviate this problem in the PA, TA, and PTA settings. In the PA setting, prominent techniques are based on partial order reduction [2], symmetry reduction [9], confluence reduction [28], as well as abstraction-refinement techniques based on predicate abstraction and CEGAR [11], game-based abstraction [17] and menu-based abstraction [29]. Indeed, the game-based abstraction framework has recently been extended to PTA [18,19]. In [18], abstraction is applied only to the zone-based time part of the state representation, while variable valuations are expanded in full. In [19] variable valuations are abstracted using predicates, and clocks are abstracted either using zones or predicates. However, no implementation or case studies are given in [19] to show the impact of those approaches. In this paper, we revisit the abstraction-refinement framework as implemented in the PASS model checker [11]. This model checker uses predicate abstraction and refinement for variable-decorated networks of PA.
2 Preliminaries
We first introduce some preliminary definitions and notation for the technical discussion that follows.

Probabilities. Given a set X, a probability distribution is a function μ : X → [0, 1] such that Σ_{x∈X} μ(x) = 1. The support of μ, denoted by support(μ), is the set {x ∈ X | μ(x) > 0}. Distr(X) is the set of all probability distributions over X. Given an element x ∈ X, by D_x : X → [0, 1] we denote the Dirac distribution, i.e. D_x(x) = 1. We find it convenient to work with distributions that are labelled by updates [29]. Given a finite alphabet U of updates and a set X, an update-labelled distribution μ is a distribution over U × X such that (u, x), (u, x′) ∈ support(μ) implies x = x′.

Expressions and evaluations. Let V be a set of variables such that every variable x ∈ V has a specific domain D_x. In our framework we restrict the domains of the variables to booleans, integers or bounded integers. A valuation of V is a function v : V → ∪_{x∈V} D_x such that v(x) ∈ D_x. We denote by Val(V) the set of valuations of V. To avoid a cumbersome description, we assume that we can construct expressions whose free variables are in V by using the usual arithmetic operations and relations. We denote by Expr_V (BExpr_V) the set of all (boolean) expressions over V. Given an expression e ∈ Expr_V and a valuation v of V, we denote by e_v the evaluation of the expression where every occurrence of a variable x is replaced by the value v(x) and the term is evaluated subsequently. We denote by e[x_1 → e_1, ..., x_n → e_n] the expression that results from replacing simultaneously every occurrence of x_i by e_i. Given b ∈ BExpr_V and v ∈ Val(V), we say that v satisfies b, denoted by v |= b, if b_v = 1. The set of all valuations that satisfy a boolean expression is denoted
by ⟦b⟧. An assignment is a function η : V → Expr_V such that for every valuation v and variable x ∈ V we have η(x)_v ∈ D_x. The set of all assignments is denoted by Assgn(V). We denote by e[η] the expression e[x_1 → η(x_1), ..., x_n → η(x_n)], and v[η] is the valuation such that v[η](x) = η(x)_v.

Clock Constraints. In order to model real-time behavior, a special kind of variable is required; we call them clock variables. Given a finite set of clock variables X, a clock constraint ζ is an expression in BExpr_X generated by the following grammar rules: ζ ::= true | false | x ≤ c | x = c | x ≥ c | ζ ∧ ζ, where x ∈ X and c ∈ N. We denote by CC(X) the set of all clock constraints over X.

Timed Guarded Commands. Given a finite set of updates U, a set of actions Σ, a finite set of discrete variables V and a finite set of clock variables X, a guarded command is a tuple c = (a, g, g_t, μ, X) where a ∈ Σ is the label, g ∈ BExpr_V is the guard, g_t ∈ CC(X) is the guard over clock variables, and μ : U → Q × Assgn(V) is the probabilistic transition relation such that Σ_{u∈U, μ(u)=(p,η)} p = 1,
and X ⊆ X are the clocks that are reset by the command.

Definition 1 (Variable-decorated Probabilistic Timed Automata). A Variable-decorated Probabilistic Timed Automaton, or simply Probabilistic Timed Automaton (PTA), M is a tuple (V, X, I, Σ, U, Inv, TC) where:

– V is a finite set of data variables.
– X is a finite set of clock variables.
– I ∈ BExpr_V is the initial condition for the data variables.
– Σ is a finite set of action labels.
– U is a finite update alphabet.
– Inv ⊆ BExpr_{V∪X} is a finite set of invariants such that if inv ∈ Inv then inv is of the form p ⟹ q, where p ∈ BExpr_V and q ∈ CC(X).
– TC is a finite set of timed commands.
Intuitively, a PTA, just like a TA, is a system whose states are pairs of a data valuation and a clock valuation, the latter satisfying the invariants. At each step the system can either choose a command that is enabled in the current state and whose execution does not violate the invariant in any of the possible target states; in this case it performs the action and carries out a probabilistic experiment, which determines the successor location and the clock resets. Or the system can let time pass, in which case none of the intermediate states may violate the invariant of the location. By the above definition of clock constraints, our PTA are
L.M. Ferrer Fioriti and H. Hermanns
closed and diagonal-free by construction: strict comparisons are disallowed and clocks can only be compared against constants. A strategy is a function that in each state resolves the non-determinism present in the model. As usual we restrict our analysis to non-zeno, timelock-free models and consider only divergent strategies (i.e., infinite traces take infinite time). A divergent strategy and a reachability objective together define a unique probability measure over the states [24]. A Probabilistic Automaton (PA) can be thought of as a PTA without timed behavior; it can thus be defined along the lines of Definition 1 by removing the clock variables and the time action. A timed model can be translated into a discrete, non-timed model by means of a transformation called digitalization [14,20]. This technique roughly consists of introducing a new integer variable for each clock present in the original model and a new action that synchronously increments all clocks by one. Henzinger, Manna and Pnueli [14] showed in the non-probabilistic setting, and Kwiatkowska et al. [20] later for PTA, that for closed and diagonal-free models digitalization is equivalent to the dense semantics for reachability properties.
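The digitalization step described above can be sketched as follows; a minimal illustration, assuming clocks are saturated at c_max + 1, beyond which closed, diagonal-free guards cannot distinguish values:

```python
# Digitalization sketch: one integer variable per clock and a synchronous
# "tick" action.  Clocks saturate at c_max + 1 (an assumption justified for
# closed, diagonal-free constraints, which cannot distinguish larger values).
def tick(valuation, c_max):
    """One digital time step: every clock advances by one, saturating."""
    return {x: min(v + 1, c_max[x] + 1) for x, v in valuation.items()}

c_max = {"x": 2, "y": 3}
v = {"x": 0, "y": 3}
v = tick(v, c_max)    # {'x': 1, 'y': 4}  (y has reached c_max + 1)
v = tick(v, c_max)    # {'x': 2, 'y': 4}
v = tick(v, c_max)    # {'x': 3, 'y': 4}  (both clocks now saturated)
print(v)
```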
3 Abstraction and Refinement
In this section we show how a probabilistic automaton can be abstracted using a stochastic game semantics, and we then instantiate this idea for a particular abstraction based on predicates.
3.1 Abstraction Based on Stochastic Games
Abstraction based on state partitioning can be obtained by over-approximating the PA [7,31]. Such abstractions can only provide safe (usually upper) bounds, but we cannot assess the precision of the abstraction, i.e. minimal and maximal adversaries cannot be calculated. A better approach, pioneered by Kwiatkowska et al. [17], uses stochastic games [27,5] as abstractions and exploits a separation between the non-determinism of the original model and the non-determinism introduced by the abstraction. As a consequence, we can calculate upper and lower bounds of both minimal and maximal probabilities.

Definition 2 (Stochastic Game [5]). A Stochastic Game is a tuple G = ((V, E), Vinit, V1, V2, Vp, U, δ) where:
– (V, E) is a finite directed graph with vertices V and edges E ⊆ (V × V).
– Vinit ⊆ V is the set of initial vertices.
– V1 ⊆ V are the player 1 vertices.
– V2 ⊆ V are the player 2 vertices.
– Vp ⊆ V are the probabilistic vertices.
– U is a finite alphabet.
– δ : Vp → Distr(U × V1)
Heuristics for Probabilistic Timed Automata with Abstraction Refinement
such that (V1, V2, Vp) is a partition of V, E ⊆ V1 × V2 ∪ V2 × Vp ∪ Vp × V1, and δ(v)(u, v′) > 0 implies (v, v′) ∈ E. We denote by E(v) = {w | (v, w) ∈ E} the set of successors of v. As with PA and PTA, non-determinism is resolved by means of strategies. In the case of Stochastic Games, a strategy consists of two independent strategies, one for the player 1 choices and another for player 2. In our framework we abstract PA into stochastic games [17,30].

Definition 3 (Menu-based¹ Abstraction [30]). Given a PA M = (V, I, Σ, U, C), a reachability objective F ∈ BExprV, and a partition Q of Val(V) such that (i) for all B ∈ Q, if v, v′ ∈ B then for all c ∈ C we have v |= gc if and only if v′ |= gc, and (ii) there exist blocks B1, . . . , Bn, B′1, . . . , B′m such that I = ⋃i Bi and F = ⋃i B′i, the Menu-based Abstraction (MBA) of M with respect to Q is the stochastic game GM,Q = ((V, E), Vinit, V1, V2, Vp, U, δ) where:
– Vinit = I/Q.
– V1 = Q.
– V2 = {(B, c) | B ∩ gc ≠ ∅}.
– Vp = {μv | (a, g, μ) ∈ C ∧ v |= g}.
The distribution function δ is the identity and the edges in E are defined by:

E = {(B, (B, c)) | B ∈ V1, B ∩ gc ≠ ∅}
  ∪ {((B, c), μv) | B ∈ V1, v ∈ B, c = (a, g, μ) ∈ C, v |= g}
  ∪ {(μv, B) | μv ∈ Vp, B ∈ support(μv)},

and μv ∈ Distr(U × Q) is defined by:

μv(u, B) = p if B = v[η]/Q, and μv(u, B) = 0 otherwise,

where μ(u) = (p, η).

The basic idea of the abstraction is that the first player resolves the non-determinism present in the original PA while the second player resolves the non-determinism introduced by the abstraction. The abstraction used here is, for presentation reasons, slightly simplified compared to the one presented in [30], since we assume that in each abstract player 1 state the same commands are enabled.

Theorem 1 (Soundness [30]). Given a PA M = (V, I, Σ, U, C), a partition Q of Val(V) that satisfies the conditions of Definition 3, and the menu-based abstraction GM,Q, then:

inf_{σ1,σ2} p_s^{σ1,σ2} ≤ p_s^min ≤ inf_{σ1} sup_{σ2} p_s^{σ1,σ2}

sup_{σ1} inf_{σ2} p_s^{σ1,σ2} ≤ p_s^max ≤ sup_{σ1,σ2} p_s^{σ1,σ2}
¹ This abstraction was originally called Parallel Abstraction [30], but we use the more recent terminology [29].
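As an illustration of the four quantities in Theorem 1, the following sketch enumerates all memoryless strategies on a tiny acyclic stochastic game; the game itself, its vertex names and its probabilities, are invented for illustration:

```python
# The four quantities of Theorem 1, enumerated on a tiny acyclic stochastic
# game.  Vertices, edges and probabilities are invented for illustration.
V1 = {"s": ["c1", "c2"]}                     # player 1 vertex -> player 2 successors
V2 = {"c1": ["p1", "p2"], "c2": ["p3"]}      # player 2 vertex -> probabilistic succ.
VP = {"p1": {"goal": 1.0},                   # probabilistic vertex -> distribution
      "p2": {"goal": 0.3, "sink": 0.7},
      "p3": {"goal": 0.6, "sink": 0.4}}

def value(v, s1, s2):
    """Probability of reaching 'goal' under memoryless strategies s1, s2."""
    if v in ("goal", "sink"):
        return 1.0 if v == "goal" else 0.0
    if v in V1:
        return value(s1[v], s1, s2)
    if v in V2:
        return value(s2[v], s1, s2)
    return sum(p * value(w, s1, s2) for w, p in VP[v].items())

per_s1 = {}                                  # player 2 best/worst per player 1 choice
for a in V1["s"]:
    outs = [value("s", {"s": a}, {"c1": b1, "c2": b2})
            for b1 in V2["c1"] for b2 in V2["c2"]]
    per_s1[a] = (min(outs), max(outs))

lb_min = min(lo for lo, hi in per_s1.values())   # inf_{s1,s2}        -> 0.3
ub_min = min(hi for lo, hi in per_s1.values())   # inf_{s1} sup_{s2}  -> 0.6
lb_max = max(lo for lo, hi in per_s1.values())   # sup_{s1} inf_{s2}  -> 0.6
ub_max = max(hi for lo, hi in per_s1.values())   # sup_{s1,s2}        -> 1.0
print(lb_min, ub_min, lb_max, ub_max)
```

On this toy game the abstraction thus certifies p_s^min ∈ [0.3, 0.6] and p_s^max ∈ [0.6, 1.0] without touching a concrete model.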
Menu-based abstraction has some advantages over game-based abstraction [17] (GBA), a technique where the roles of the players are basically reversed, with the first player resolving the non-determinism due to the abstraction and the second player resolving the model-inherent non-determinism: MBAs are generally more compact than GBAs, and they are easier to compute [29], since the need to analyse the effect of multiple actions at a time [15] can be avoided. This is contrasted by the foundational observation that GBA has the best-transformer property in the sense of abstract interpretation [6], while MBA is actually suboptimal for non-cooperative strategies [29].
3.2 Predicate Abstraction
In order to build an abstract model, one of the most widely used techniques is Predicate Abstraction, which was pioneered by Graf and Saïdi [10]. The partition of the state space induced by a set of predicates can be constructed efficiently from a syntactic representation of the model using modern SMT solvers; this avoids building the full semantics of the original model. In theory, the size of the state space generated by Predicate Abstraction may be exponential in the number of predicates used. In practice, however, the size is considerably lower: first, not all abstract states are reachable from the initial abstract configuration; secondly, the predicates are not always disjoint, yielding abstract states that are logically equivalent to false. For probabilistic models, predicate abstraction is more complex due to the fact that the post relation of an action in the original model is not deterministic [31]. The requirements of Definition 3 can be fulfilled by adding all predicates occurring in guards in the case of PA, and additionally all predicates from timed guards and invariants in the case of PTA.
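The predicate-based partition can be sketched as follows. Real tools consult an SMT solver to decide which predicate combinations are consistent; this illustration simply enumerates a small finite domain, and the predicates and ranges are invented:

```python
# Predicate abstraction sketch.  PASS asks an SMT solver which predicate
# combinations are satisfiable; here we enumerate a small finite domain
# instead.  Predicates and variable ranges are invented.
from itertools import product

predicates = [lambda s: s["x"] == 0,
              lambda s: s["x"] >= 2,
              lambda s: s["y"] <= 1]

def abstract(state):
    """Map a concrete state to its tuple of predicate truth values (its block)."""
    return tuple(p(state) for p in predicates)

concrete = [{"x": x, "y": y} for x, y in product(range(4), range(3))]
blocks = {abstract(s) for s in concrete}
print(len(blocks))   # 6: of the 2**3 = 8 combinations, e.g. x==0 and x>=2 is inconsistent
```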
3.3 Backward Refinement
Given a PA or a digitalization of a PTA and a partition, we are in a position to construct a stochastic game that is a sound abstraction of the original model. However, the resulting abstraction might be too coarse, in which case the upper and lower bounds for the reachability property under analysis may differ considerably. To solve this problem the abstraction has to be refined; in our framework this amounts to obtaining a new set of predicates. Backwards Refinement (BR) is a technique introduced by Kattenbelt et al. [16] to verify probabilistic software. It was later adapted to networks of PA using MBA in [30]. The algorithm is based on the notion of pivot blocks. A block is a pivot if the selections made by player 2 differ between the strategies obtaining the upper and the lower bound on the reachability property. Although two different strategies can yield the same probability of reaching a set of target states, it can be assumed that the decisions of player 2 in the upper and lower strategies differ only if that induces a change in the probabilities [3]. To ensure this constraint, PASS employs a modification of the value iteration algorithm [29]. The new predicates are obtained by taking the weakest preconditions of the different choices made
Table 1. Experimental results of models hand-translated into digital clocks. Columns: model (BRP with parameters N/MAX/TD, CSMA with back-off size, Zeroconf), property (P1–P4), number of refinements (ref), predicates (pred), predicates over clock variables (t pred), time spent in abstraction (abst), model checking (mc) and counterexample analysis (cex), and the computed probability (prob). [table data omitted]
by player 2 on the pivot block. It can be proved that these predicates produce a finer abstraction [30]. The BR approach is usually improved by checking the realizability of the paths introduced by the strategies that reach the pivot block [16,30]. In case of spurious paths, new predicates can be introduced by means of Craig interpolants [25]. Since the abstraction-refinement loop can be infinite, BR is stopped once the relative difference between the upper and lower bound is below a predefined threshold.
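The weakest-precondition step used to derive new predicates can be sketched as follows; expressions are modeled as Python callables purely for illustration, whereas PASS manipulates syntactic formulas with an SMT solver:

```python
# Weakest-precondition sketch: refinement predicates are wp's of the
# player 2 choices at a pivot block.  Expressions are Python callables here,
# purely for illustration; PASS works on syntactic formulas.
def wp(update, predicate):
    """wp(x := e, P) = P[x -> e]: P evaluated after applying the update."""
    def pre(state):
        succ = dict(state)
        for var, expr in update.items():
            succ[var] = expr(state)   # simultaneous assignment semantics
        return predicate(succ)
    return pre

# For the update x := x + 1 and predicate x >= 2, the wp is x >= 1:
pre = wp({"x": lambda s: s["x"] + 1}, lambda s: s["x"] >= 2)
print(pre({"x": 1}), pre({"x": 0}))   # True False
```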
4 Case Studies and Heuristics
In this section, we experiment with a broad selection of PTA case studies and then discuss a variety of heuristics to speed up the model checking. Zeroconf is a protocol for dynamic configuration of IPv4 link-local addresses [4]. It is used to automatically assign an IP address to a host in a local or ad hoc network without a DHCP server or a fixed configuration. The protocol works as follows: when a new host connects to the network it chooses an IP address at random, sends a packet announcing its new address and then waits for responses. If the address is used by another host in the network, this host sends a packet indicating the collision. Then, the host which is trying to obtain an IP address starts the process again. If four packets are sent without a response, the host assumes that its address is free and therefore uses it. The model used in our experiments is based on the model presented in [20]. The properties that we analyse are the maximal probability that the protocol finishes (P1) and the maximal probability that before a certain amount of time a host assumes that a used IP address is a fresh one (P2, P3 and P4). The model comprises two clock variables of size² 20
² The size of a clock is the maximal constant to which it is compared.
and 5. The time bounds of the properties are enforced by introducing an additional clock of size 100, 150 or 200. IEEE 802.3 CSMA/CD is a protocol used in Ethernet networks to avoid collisions when several hosts send packets concurrently over a shared wire. Each of the hosts is capable of detecting when a collision has occurred. Whenever a host sends a packet and detects that its packet collides with another packet, it waits a random amount of time determined by a back-off mechanism, and tries to send its message again. If the number of failed attempts reaches a threshold the transmission is aborted. The model under analysis is based on the model presented in [22]. The only parameter that we consider is the size of the back-off, which in our models is either 1 or 4. The analysed properties are the maximal probability to abort (P1), and the minimal probability that both stations successfully transmit their packets before a time limit (P2, P3 and P4). The model has three clocks; the first one has size 13, the other two clocks are symmetric and have a size of 404 or 416 depending on the size of the back-off. The clock used in the time-bounded properties can take values up to 500, 1000 and 1500. The Bounded Retransmission Protocol [13] is a protocol designed by Philips for the transmission of files over unreliable channels. It is based on the alternating bit protocol but with a bounded number of retransmissions per frame. The model used in our experiments is an adaptation of the flat model of [12]. It has three parameters: N is the number of chunks per file, MAX is the maximum number of retransmissions per frame and TD is the maximum transmission delay of the channel.
The properties we analyse are the maximum probability that the sender does not report a successful transmission (P1), the maximum probability that the sender reports uncertainty about the success of the transmission (P2), the maximum probability that the sender reports an unsuccessful transmission after a fixed number of chunks have been successfully sent (P3) and the maximum probability that the receiver does not receive any chunk (P4). Our experiments were run on an AMD Athlon II X4 620 2.60GHz with 4GB RAM and Linux 2.6.32. Due to the large number of experiments, we automatically terminated experiments that exceeded a runtime of one hour. The first experiment we ran was checking the properties mentioned previously using PASS without any modification; the models were hand-constructed. Results can be found in Table 1. The column pred indicates the total number of predicates generated during the abstraction process, t pred the number of predicates that contain references to clock variables; the remaining columns give the time spent in each phase of a run. Clearly, in almost every case most of the time was spent in the abstraction procedure. Some properties are also model checking intensive, i.e. a considerable portion of the time was spent in the value iteration algorithm, as for P1 and P3 of BRP or P4 of Zeroconf. In the case of CSMA, for the model with the largest clocks, only one property was checkable; for the remainder PASS aborted due to memory exhaustion. An important observation is that in general the majority of the predicates needed corresponded
Table 2. Experimental results of models hand-translated into digital clocks, stopping MC when a pivot can be obtained. Columns as in Table 1. [table data omitted]
to predicates that contain clock variables. This is especially noticeable in the CSMA and Zeroconf models. For the large models of BRP, some properties did not terminate within one hour; this is due to the huge amount of time spent in the value iteration algorithm. In the next section we introduce a heuristic which is able to cope with this problem.
4.1 Speeding Up Strategy Synthesis
As previously described, for some models and properties PASS spends considerable time in the model checking stage. The value iteration algorithm computes a monotonically increasing sequence of lower bounds for the reachability property in order to compute the fixed point of the value function [3]. One well-known problem of value iteration is non-convergence in finite time when cyclic dependencies are present among the states. To overcome this problem the iterations are usually stopped once the difference in all states is lower than a predefined small bound. To save computational effort, and to be able to compute a pivot block in case of a too coarse abstraction, the lower bound of the reachability property is computed first; then the upper bound is computed reusing the result from the lower bound [29]. When searching for a pivot block, the probabilities obtained from the value iteration are not used directly, only the difference between the computed bounds. As an improvement we propose stopping the value iteration algorithm as soon as the strategies have not changed for more than n iterations and the difference between the lower and upper bounds for the initial states exceeds the threshold used to detect pivot blocks. The generation of fresh predicates is ensured, since the presence of more than one player 2 selection guarantees that the abstract state has more than one concrete state. A special case of our heuristic is when n is infinite. In such
a case the algorithm behaves exactly as the original value iteration used in PASS. The main objective of this heuristic is to find the same pivot blocks, but earlier. However, it could be that after more iterations the strategy changes, yielding different pivot blocks and therefore different predicates. It is a subject of future research to analyze whether the predicates synthesized using this heuristic are indeed suboptimal and, if so, which value of n gives the best results. Table 2 shows our experimental results when the value iteration for the upper bound is stopped after the existence of a pivot block is ensured (i.e. when n is 0). We observe a clear time improvement in essentially all cases. For example, in BRP some properties that without the heuristic could not be checked within an hour can now be checked in less than 10 minutes, an improvement of at least a factor of 6. In BRP we also see that fewer refinement steps are needed to check some properties; for P1 in the first model we need less than half the number of refinement steps. In contrast, for the second property of Zeroconf our heuristic needs three times more refinement steps. It seems important to remark that this heuristic can also be applied in the un-timed setting; a more exhaustive analysis of its impact there is left as future work.
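The early-stopping rule of the heuristic can be sketched as follows; this uses a plain MDP maximization rather than a full stochastic game, and states, commands and the threshold are invented:

```python
# Sketch (not PASS's actual code) of the proposed heuristic with n = 0: stop
# the value iteration once the maximizing choices have been stable for n
# iterations and the gap to the lower bound at the initial state exceeds the
# pivot threshold.
def value_iteration_early_stop(succ, init, lower, threshold, n=0, max_iter=100000):
    """succ: state -> list of distributions (dicts target -> prob); 'goal' absorbs."""
    v = {s: 0.0 for s in succ}
    v["goal"] = 1.0
    prev_choice, stable = None, 0
    for _ in range(max_iter):
        choice, new = {}, {}
        for s in succ:
            vals = [sum(p * v.get(t, 0.0) for t, p in d.items()) for d in succ[s]]
            best = max(range(len(vals)), key=vals.__getitem__)
            choice[s], new[s] = best, vals[best]
        stable = stable + 1 if choice == prev_choice else 0
        prev_choice = choice
        v.update(new)
        if stable >= n and v[init] - lower[init] > threshold:
            break          # enough evidence: a pivot block can be extracted
    return v, prev_choice

# Tiny example: from s0, the only command reaches goal with probability 1/2 per step.
succ = {"s0": [{"goal": 0.5, "s0": 0.5}]}
v, strat = value_iteration_early_stop(succ, "s0", {"s0": 0.0}, 0.3)
print(v["s0"])   # 0.5: stopped early, far below the fixed point 1.0
```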
4.2 Self-loops and Execution Invariants
When doing predicate abstraction there are some problems that can have a negative impact on the reduction. The first problem is the introduction of spurious self-loops in the abstraction. A self-loop in an abstraction is spurious if some of its paths are not realizable. These loops are usually generated when some concrete states, but not all, are reachable in one step by a command. If an action that constitutes a spurious self-loop is selected by player 1, then player 2 of the strategy that aims at the lower bound would select the action that constitutes the loop as well, while player 2 of the strategy that aims at the upper bound would choose a transition that leaves the block³. Spurious self-loops in digitalized PTA are generated easily when the predicates that involve clock variables are clock constraints. When a block includes two consecutive valuations, the deterministic time action is enabled and one of its player 2 actions goes to the same block. In the case of minimal reachability, all the states that can reach at least one target block and are visited by the strategy cannot have two consecutive valuations; in the case of maximal reachability the condition only has to hold on the blocks in which player 1 chooses the time action. Spurious self-loops can be avoided by simply expanding one of the clock variables. Another possible solution is to remove all spurious self-loops, since the resulting automaton weakly simulates the original one; notice that if the latter is implemented, realizability checks must be disabled. Another problem, present even in the absence of spurious self-loops, is the violation of execution invariants⁴. An execution invariant is a property that

³ Unless all outgoing transitions of the command lead to a state with probability 0.
⁴ We use this terminology to differentiate it from the invariants of PTA.
module A
  l : {try, fail, succ} init try;
  x : clock;
  y : clock;
  invariant
    l=try => x <= 2 &
    l=try => y <= 5
  endinvariant;
  [a] l=try & x=2 -> 0.5: (l'=succ) + 0.5: (x'=0);
  [b] l=try & y=5 -> (l'=fail);
endmodule

[Game graph omitted: player 1 blocks {x = 0, y ≤ 4}, {x = 1, y ≤ 4}, {x = 2, y ≤ 4}, {x = 0, y = 5}, {x = 1, y = 5}, {x = 2, y = 5}, and the target locations succ and fail.]

Fig. 1. Example of a model whose abstraction violates execution invariants
is satisfied in all intermediate states of a path during an execution of the concrete model. While clock variables are tightly related in the underlying PTA semantics, clock constraints are not capable of expressing most of those relationships. Consider the PTA of Fig. 1, which is described using PRISM syntax for PTA [23]. It is a simple model that consists of two clocks x and y. Every two time units the system performs a probabilistic experiment in order to reach a target location; in case of failure the system repeats the process. After five units of time the system gives up. Clearly, an execution invariant is ∃k ∈ N. y = x + 2k. Consider the abstraction generated from the predicates {x = 0, x = 1, x = 2, y ≤ 4, y = 5}⁵. The graphical representation appears in Fig. 1. The big circles represent the player 1 vertices, the small boxes represent player 2 vertices and the small circles represent the probabilistic player vertices. Suppose we want to compute the maximal probability to reach the succ state. Player 2 of the upper strategy would choose the transition that goes from {x = 0, y ≤ 4} to {x = 1, y ≤ 4}, while the lower strategy would select the other one. This means that the lower strategy increments the speed of the clock y, while the upper strategy stops it. These behaviors are unrealistic and thus not present in the original model. The predicate generated by taking the weakest preconditions is y = 4. Again the same problem occurs in the block {x = 1, y ≤ 3}. When counterexample analysis is used to remove dead blocks, predicates like x = y or y = x + 2 can be generated; however, in our experimental results predicates of this kind are hardly ever generated. Generally, the obtained interpolants are witnesses of violations of execution invariants in the abstraction. Our heuristic to avoid the violation of execution invariants and spurious self-loops consists in unrolling all clock valuations.
That is, for each clock variable x ∈ X we generate the predicates x = 1, . . . , x = cmax. But introducing all

⁵ Notice that the initial state is not isolated; we did that in order to keep the example small.
Table 3. Experimental results expanding all clocks without introducing predicates and stopping MC when a pivot block is obtained. Columns as in Table 1; t pred is 0 throughout, since no predicates over clock variables are introduced. [table data omitted]
possible values for the clocks into the abstraction process could be quite inefficient and sometimes impossible, because the SMT solver would have to interpret and manage possibly hundreds of formulas. Moreover, the transitions related to time are relatively simple to generate without the assistance of a solver, and the time behavior is orthogonal to the discrete behavior: time and data interact only through the invariants. Therefore, we do not introduce those predicates explicitly, but implicitly. First, we construct the abstraction Md of the original model, removing all invariants and clock variables, in the same way as done by Wachter [29]. Secondly, we build Mt, the original model without data variable references. Finally, we merge Md and Mt together and remove all states and transitions that violate an invariant. PASS implements the abstract transition relation as MTBDD matrices; therefore, merging and restricting the abstraction is implemented using basic MTBDD operations such as conjunction and multiplication. We show experimental results in Table 3. It can be seen that, except in one case, every property is computed within the time and memory limits. In comparison with the previous approach, less time is needed; this is a direct consequence of avoiding the refinement steps that introduce only time constraints. The expansion of clocks obviously introduces a larger number of blocks; consequently, the proportion of time spent on model checking in each iteration is higher. However, the total time is lower because of the reduction in the number of refinement steps.
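The merge of Md and Mt can be sketched with plain sets standing in for the MTBDDs PASS uses; the blocks and the invariant are borrowed from the Fig. 1 example, the rest is illustrative:

```python
# Merging the data abstraction Md with the explicit time part Mt: abstract
# states are (data block, clock valuation) pairs, pruned by the invariants.
# Plain Python sets stand in for MTBDDs; ranges are saturated digital clocks.
from itertools import product

data_blocks = ["try", "succ", "fail"]              # Md: illustrative player 1 blocks
clock_vals = list(product(range(4), range(7)))     # Mt: digital (x, y), saturated

def invariant(block, xv, yv):
    """Fig. 1 invariant: l = try implies x <= 2 and y <= 5."""
    return block != "try" or (xv <= 2 and yv <= 5)

states = [(b, x, y) for b in data_blocks for (x, y) in clock_vals
          if invariant(b, x, y)]
print(len(states))   # 74 = 3*6 (try) + 2*4*7 (succ, fail)
```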
Table 4. Comparison between our approach and the two PTA engines of PRISM. For each model (CSMA with back-off size 1 or 4, Zeroconf) and property (P1–P4), the table reports the number of states and the runtime of PRISM DC (digital clocks), PRISM GBA (game-based abstraction) and PASS-PTA. [table data omitted]
5 Comparison with Other Approaches
In this section we present a comparison of our approach with both PTA engines of PRISM [23]: digital clocks and games. The syntax used by PASS in PTA mode is similar to, and inspired by, the PRISM syntax for PTA. The PRISM language, however, does not allow the use of shared variables with PTA; therefore we cannot compare the results obtained for the BRP case. In Table 4 we present comparative results for the remaining cases. The numbers of states reported correspond to the number of reachable blocks in the final iterations. Our implementation of digital clocks outperforms the digital clocks engine of PRISM, especially on the CSMA model. Regarding state space generation, PASS-PTA can generate models with half or a third of the original state space in the case of CSMA. An interesting case is P1, where PASS generated more abstract states than concrete states; this effect is caused by the inclusion of unreachable blocks. With respect to the game-based abstraction of PRISM, we observe that this technique clearly obtains models with a smaller number of abstract states, by several orders of magnitude. However, the memory usage in PRISM is higher. This might be a consequence of the inherent (exponential) complexity of GBA transitions and the use of more complex data structures such as Difference Bound Matrices to represent zones.
6 Conclusion and Further Work
In this paper we empirically demonstrated that the underlying semantics of digital clocks can make the abstraction refinement process practically infeasible when using predicate abstraction. We proposed, and implemented in PASS, a technique that does not abstract the time behavior; it exploits the orthogonality of data and time in order to reduce the number of SMT solver calls. We showed that the proposed approach not only outperforms the naïve implementation of BR and digitalization, but is also comparable with another abstraction refinement based on zones, proposed in [18]. A major drawback in the PTA setting induced by PRISM-like languages is that control-flow variables are mixed with ordinary data variables. This makes it harder to apply static analysis techniques, for example to identify locations where some clocks are unused, as the mcpta checker for Modest does [12]. Additionally,
the values taken by control variables do not have a specific meaning. Changing this would make it possible, as for clock variables, to exclude them from the abstraction process to some extent. Additional effort will be needed to remove spurious paths and unreachable states; to do so, we would like to incorporate approaches currently in use in the control-flow analysis of non-probabilistic systems.

Acknowledgements. The authors are grateful to Moritz Hahn (Saarland University) for discussions and careful proofreading of an earlier version of this paper. This work has been supported by the EU FP7 under grant number ICT-214755 (Quasimodo), by the German Research Council (DFG) as part of the Transregional Collaborative Research Center "Automatic Verification and Analysis of Complex Systems" (SFB/TR 14 AVACS), by the DFG/NWO Bilateral Research Programme ROCKS and by the European Union Seventh Framework Programme under grant agreement number 295261 as part of the MEALS project.
References

1. Alur, R., Dill, D.L.: A theory of timed automata. Theor. Comput. Sci. 126(2), 183–235 (1994)
2. Baier, C., D'Argenio, P.R., Größer, M.: Partial order reduction for probabilistic branching time. ENTCS 153(2), 97–116 (2006)
3. Chatterjee, K., de Alfaro, L., Henzinger, T.A.: Strategy improvement for concurrent reachability games. In: QEST, pp. 291–300. IEEE Computer Society (2006)
4. Cheshire, S., Aboba, B., Guttman, E.: RFC 3927: Dynamic configuration of IPv4 link-local addresses (May 2005), http://files.zeroconf.org/rfc3927.txt
5. Condon, A.: The complexity of stochastic games. Inf. Comput. 96(2), 203–224 (1992)
6. Cousot, P., Cousot, R.: Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In: POPL, pp. 238–252 (1977)
7. D'Argenio, P.R., Jeannet, B., Jensen, H.E., Larsen, K.G.: Reachability Analysis of Probabilistic Systems by Successive Refinements. In: de Luca, L., Gilmore, S. (eds.) PAPM-PROBMIV 2001. LNCS, vol. 2165, pp. 39–56. Springer, Heidelberg (2001)
8. Daws, C., Kwiatkowska, M., Norman, G.: Automatic verification of the IEEE 1394 root contention protocol with KRONOS and PRISM. International Journal on Software Tools for Technology Transfer (STTT) 5(2-3), 221–236 (2004)
9. Donaldson, A.F., Miller, A.: Symmetry Reduction for Probabilistic Model Checking Using Generic Representatives. In: Graf, S., Zhang, W. (eds.) ATVA 2006. LNCS, vol. 4218, pp. 9–23. Springer, Heidelberg (2006)
10. Graf, S., Saïdi, H.: Construction of Abstract State Graphs with PVS. In: Grumberg, O. (ed.) CAV 1997. LNCS, vol. 1254, pp. 72–83. Springer, Heidelberg (1997)
11. Hahn, E.M., Hermanns, H., Wachter, B., Zhang, L.: PASS: Abstraction Refinement for Infinite Probabilistic Models. In: Esparza, J., Majumdar, R. (eds.) TACAS 2010. LNCS, vol. 6015, pp. 353–357. Springer, Heidelberg (2010)
12. Hartmanns, A., Hermanns, H.: A Modest approach to checking probabilistic timed automata. In: QEST. IEEE Computer Society (September 2009)
13. Helmink, L., Sellink, M.P.A., Vaandrager, F.W.: Proof-checking a Data Link Protocol. In: Barendregt, H., Nipkow, T. (eds.) TYPES 1993. LNCS, vol. 806, pp. 127–165. Springer, Heidelberg (1994)
Heuristics for Probabilistic Timed Automata with Abstraction Refinement
165
14. Henzinger, T.A., Manna, Z., Pnueli, A.: What Good are Digital Clocks? In: Kuich, W. (ed.) ICALP 1992. LNCS, vol. 623, pp. 545–558. Springer, Heidelberg (1992) 15. Kattenbelt, M., Kwiatkowska, M., Norman, G., Parker, D.: Game-based probabilistic predicate abstraction in PRISM. In: Proc. 6th Workshop on Quantitative Aspects of Programming Languages, QAPL 2008 (2008) 16. Kattenbelt, M., Kwiatkowska, M., Norman, G., Parker, D.: Abstraction Refinement for Probabilistic Software. In: Jones, N.D., M¨ uller-Olm, M. (eds.) VMCAI 2009. LNCS, vol. 5403, pp. 182–197. Springer, Heidelberg (2009) 17. Kwiatkowska, M., Norman, G., Parker, D.: Game-based abstraction for Markov decision processes. In: Proc. 3rd International Conference on Quantitative Evaluation of Systems (QEST 2006), pp. 157–166. IEEE CS Press (2006) 18. Kwiatkowska, M., Norman, G., Parker, D.: Stochastic Games for Verification of Probabilistic Timed Automata. In: Ouaknine, J., Vaandrager, F.W. (eds.) FORMATS 2009. LNCS, vol. 5813, pp. 212–227. Springer, Heidelberg (2009) 19. Kwiatkowska, M., Norman, G., Parker, D.: A Framework for Verification of Software with Time and Probabilities. In: Chatterjee, K., Henzinger, T.A. (eds.) FORMATS 2010. LNCS, vol. 6246, pp. 25–45. Springer, Heidelberg (2010) 20. Kwiatkowska, M., Norman, G., Parker, D., Sproston, J.: Performance analysis of probabilistic timed automata using digital clocks. Formal Methods in System Design 29, 33–78 (2006) 21. Kwiatkowska, M., Norman, G., Segala, R., Sproston, J.: Automatic verification of real-time systems with discrete probability distributions. Theoretical Computer Science 282, 101–150 (2002) 22. Kwiatkowska, M., Norman, G., Sproston, J., Wang, F.: Symbolic model checking for probabilistic timed automata. Information and Computation 205(7), 1027–1077 (2007) 23. Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: Verification of Probabilistic Real-Time Systems. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 
585–591. Springer, Heidelberg (2011) 24. Kwiatkowska, M., Norman, G., Segala, R., Sproston, J.: Verifying Quantitative Properties of Continuous Probabilistic Timed Automata. In: Palamidessi, C. (ed.) CONCUR 2000. LNCS, vol. 1877, pp. 123–137. Springer, Heidelberg (2000) 25. McMillan, K.L.: Applications of Craig Interpolants in Model Checking. In: Halbwachs, N., Zuck, L.D. (eds.) TACAS 2005. LNCS, vol. 3440, pp. 1–12. Springer, Heidelberg (2005) 26. Segala, R.: Modeling and verification of randomized distributed real-time systems. Ph.D. thesis, Massachusetts Institute of Technology, Cambridge, MA, USA (1995) 27. Shapley, L.S.: Stochastic games. Proceedings of the National Academy of Sciences of the United States of America 39, 1095–1100 (1953) 28. Timmer, M., Stoelinga, M., van de Pol, J.: Confluence Reduction for Probabilistic Systems. In: Abdulla, P.A., Leino, K.R.M. (eds.) TACAS 2011. LNCS, vol. 6605, pp. 311–325. Springer, Heidelberg (2011) 29. Wachter, B.: Refined Probabilistic Abstraction. Ph.D. thesis, Universit¨ a des Saarlandes (2010) 30. Wachter, B., Zhang, L.: Best Probabilistic Transformers. In: Barthe, G., Hermenegildo, M. (eds.) VMCAI 2010. LNCS, vol. 5944, pp. 362–379. Springer, Heidelberg (2010) 31. Wachter, B., Zhang, L., Hermanns, H.: Probabilistic model checking modulo theories. In: Fourth International Conference on the Quantitative Evaluation of Systems (2007) 32. UPPAAL Pro., http://www.cs.aau.dk/~ arild/uppaal-probabilistic/
Simulative and Analytical Evaluation for ASD-Based Embedded Software Ramin Sadre1 , Anne Remke1 , Sjors Hettinga2 , and Boudewijn Haverkort1,3 1
Design and Analysis of Communication Systems, University of Twente, The Netherlands {r.sadre,a.k.i.remke,b.r.h.m.haverkort}@utwente.nl 2 Embedded Systems Institute, Eindhoven, The Netherlands [email protected]
Abstract. The Analytical Software Design (ASD) method of the company Verum has been designed to reduce the number of errors in embedded software. However, it does not take performance issues into account, which can also have a major impact on the duration of software development. This paper presents a discrete-event simulator for the performance evaluation of ASD-structured software as well as a compositional numerical analysis method using fixed-point iteration and phase-type distribution fitting. Whereas the numerical analysis is highly accurate for non-interfering tasks, its accuracy degrades when tasks run in opposite directions through interdependent software blocks and the utilization increases. A thorough validation identifies the underlying problems when analyzing the performance of embedded software.
1 Introduction Due to the increasing complexity of embedded software, it becomes more and more difficult to find and fix all errors made in early development phases. The company Verum [19] developed a structured design method with built-in model checking [2] that, they claim, reduces the number of errors made by programmers. Several companies, amongst others Philips Healthcare (PHC), are investigating the use of Verum's Analytical Software Design (ASD) method. However, the ASD method does not take performance into account, even though performance evaluation is necessary in an early stage of a project to predict, e.g., the expected response time of tasks in the system. The somewhat simplified ASD architecture, as discussed in this paper, organizes a software system into a tree of blocks, allowing for top-down synchronous calls and for asynchronous calls running in the opposite direction from block to block (details will follow). Asynchronous calls have non-preemptive priority over synchronous calls, and synchronous calls that issue further requests cause blocking at the current block until these requests are returned to the issuing block. Related work, like Layered Queueing Networks [5] and the Method of Layers [15], however, only covers systems where calls run in just one direction. Also Modular Performance Analysis [3,20] cannot deal with cyclic dependencies due to synchronous and asynchronous calls running in opposite directions. Note that, even though this work directly resulted from a cooperation with PHC and Verum, its applicability is not limited to ASD structures, since similar problems arise in all areas of software analysis where calls in opposite directions are allowed. J.B. Schmitt (Ed.): MMB & DFT 2012, LNCS 7201, pp. 166–181, 2012. © Springer-Verlag Berlin Heidelberg 2012
This paper’s main contribution is a discrete-event steady-state simulator for the performance evaluation of arbitrary ASD structures and the discussion of the intrinsic difficulties of an analytical solution. The simulator computes several measures of interest, e.g., the mean response time of the system for each task, the utilization of the blocks, and the mean waiting times of the calls at each block. Due to the cyclic dependencies and the fact that an ASD block is a one-server priority system with an open and a closed queue, the ASD tree structure cannot be represented as a standard queueing system. This makes analytical performance evaluation a challenging task and also, to the best of our knowledge, there is no simulator available off the shelf for the targeted tree structure. Following [9], this paper presents a first step towards a compositional numerical analysis method for a restricted class of tasks. A single ASD block is analyzed by solving the underlying CTMC of a queuing station model. This approach can, however, not easily be extended to multiple blocks, since it results in a global (non-compositional) CTMC that is potentially infinite in multiple dimensions. Therefore, we propose a decomposition approach, based on the single block analysis, using fixed-point iteration. We provide a comparison of simulation and analysis results and discuss possible sources of inaccuracies. As can be expected, the simulator takes considerably longer to compute results than the very quick numerical analysis. The analysis is still highly accurate for non-interfering tasks that are spread over several blocks. However, its accuracy degrades when tasks interfere with each other, especially for higher utilizations. This paper helps to identify the intrinsic difficulties when analyzing the performance of embedded software, so that future work can directly address the identified weaknesses. 
The remainder of this paper is organized as follows: Section 2 explains the simplified version of the ASD structure. Section 3 presents the discrete-event simulator, which captures ASD structures with several blocks. Section 4 discusses the analysis of a single ASD block and presents a compositional algorithm for the analysis of ASD structures with multiple blocks. Section 5 discusses analysis and simulation results for several test cases. Finally, Section 6 presents conclusions and pointers to future work.
2 The ASD Architecture The ASD suite can check for deadlocks and livelocks in the code. The developer splits the state diagram of the complete software system into smaller parts, each implemented as a single ASD block. Blocks have a clearly defined interface; other blocks only see the interface and consider the block itself as a black box. The work in this paper is based on a simplified version of the ASD state diagram that is necessary for formal verification by Verum's ASD suite. The communicating ASD blocks are organized in a tree structure, which determines that every master block (parent) can have multiple slave blocks (children), but every slave has only one master. Communication between blocks is done either via synchronous (S) or via asynchronous (AS) calls. S-calls can only go top down in the structure, while AS-calls only go bottom up. S-calls return to their caller when they have finished; AS-calls do not send returns. A call that is processed by an ASD block can issue new S-calls or AS-calls as part of the response. When an S-call is sent to a slave block, the caller remains blocked until it has received a return on the call, just like a function call in a program. This effect is
Fig. 1. ASD architecture
Fig. 2. Timeline of nested synchronous calls
illustrated in Figure 2. In this example a master receives an S-call, does some processing (P) and then issues an S-call to one of its slaves, which does the same. While a slave is processing, the master stays blocked (B). The blocking is removed by the synchronous return. Issuing an AS-call, however, is non-blocking. The caller does not have to wait until the call has finished, but simply continues with other work. Within an ASD block, only one call can be handled at a time and calls in progress are not preempted. Incoming AS-calls have priority over S-calls. They are queued upon arrival and served according to First In First Out (FIFO). The tree structure, together with the blocking, ensures that there can only be one S-call per block, because there is only one master node that can issue an S-call to its slave. The ASD tree structure forms a complete program that has to respond to one or multiple tasks. A task is a sequence of calls that flow through the tree (not necessarily through all levels) according to a predefined path. For example, a task could start with an S-call sent to the root block of the tree, followed by several S-calls to slave blocks, and finally closing with an AS-call. Such a task could be the reaction to a button pressed by the user. As seen in this example, tasks can mix S-calls and AS-calls. However, tasks starting with an S-call always enter the tree at its root, and tasks starting with an AS-call can only enter the tree at one of its leaves.
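As a toy illustration of these semantics (our own sketch, not part of ASD), the occupancy of a caller can be computed recursively: a chain of nested S-calls keeps every caller occupied until the deepest slave returns, while an AS-call only occupies its caller for the local processing.

```python
def s_call_response(chain):
    """Response time of nested S-calls (cf. Figure 2): each block does its
    local processing (P), then stays blocked (B) on its slave's S-call
    until the synchronous return arrives."""
    if not chain:
        return 0.0
    local, *nested = chain
    # the caller is busy for its own processing plus the whole nested call
    return local + s_call_response(nested)

def as_call_caller_time(local):
    """An AS-call is non-blocking: the caller only spends its own local
    processing time and then continues with other work."""
    return local

# master processes for 3 time units, its slave for 2: the master is
# occupied for 5 units in total, 2 of which it spends blocked
print(s_call_response([3.0, 2.0]))   # 5.0
```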
3 Simulation We have implemented a discrete-event steady-state simulator for the performance evaluation of the ASD architecture. The simulator has been written in Java and allows the user to study ASD trees of arbitrary size. In addition to the tree structure, the user has to define tasks and to provide performance-oriented parameters. The supported model class is detailed below, followed by a brief description of the measures computed by the simulator and of its performance. A model definition for the simulator essentially consists of two parts. First, the user has to provide the ASD tree structure, i.e., a list of blocks and their master/slave relationships. Second, the user describes the tasks to execute. A task is defined as a sequence of calls that form a path through the tree. For each call in a task, the user has to specify (i) whether it is an S-call or an AS-call; (ii) to which block the call is sent; and (iii) the mean service time of the call in the block. The simulator checks whether the constraints given by the tree structure are met. Service times are assumed to be negative-exponentially distributed. All tasks are repeatedly executed. For tasks starting with an AS-call, the user has to provide the mean time between the generation of two instances of the task. These inter-arrival times are negative-exponentially distributed. However, if a task begins with an S-call, the simulator has to wait until the S-call has returned before it can create a new instance of that task, due to the blocking nature of the S-call. In that case, the user defines the mean think time, i.e., the time between the end of one instance of the task and the beginning of its next instance. Again, think times are negative-exponentially distributed. Note that the requirement that the involved distributions are negative-exponential is a design decision rather than a technical limitation. Since the simulator is discrete-event based, it can easily be extended to other distributions.
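Such a model definition could be sketched as follows (a hypothetical format; the paper does not show the simulator's actual input syntax, so all names and numbers here are illustrative). It also shows how the negative-exponentially distributed times would be sampled.

```python
import random

# Hypothetical description of a two-block tree: block B is the slave of
# block A; each call is a (type, block, mean service time) triple.
blocks = {"A": {"master": None}, "B": {"master": "A"}}

tasks = [
    {   # starts with an S-call -> closed task, mean think time applies
        "calls": [("S", "A", 3.0), ("S", "B", 2.0)],
        "mean_think_time": 10.0,
    },
    {   # starts with an AS-call -> open task, mean inter-arrival time applies
        "calls": [("AS", "B", 2.0)],
        "mean_interarrival_time": 5.0,
    },
]

def sample_time(mean, rng):
    """Service, think and inter-arrival times are all negative-
    exponentially distributed in the supported model class."""
    return rng.expovariate(1.0 / mean)

rng = random.Random(42)
samples = [sample_time(3.0, rng) for _ in range(200_000)]
```

With the seeded generator, the sample mean of the 200,000 draws is close to the specified mean of 3.0.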
The behavior inside a block, i.e., how calls are processed, is simulated according to the description given in Section 2. If a call currently processed by a block X sends a new call to a master or slave block Y, the simulator assumes that the new call is sent to Y only after the local processing in X has finished, as illustrated in Figure 2. The real ASD-generated software will run on hardware with a limited number of processing units. In the simulator, the user can choose between (a) a fully parallelized model where each block has its own processing unit or (b) a model where all calls share one CPU. For the latter case, the simulator offers round-robin (RR) scheduling with adjustable time slice length and processor-sharing (PS) scheduling, which is equivalent to RR scheduling with infinitely small time slices. The simulator computes the means of measures, together with confidence intervals, by running independent replications. Measures of interest include the mean response time of the system for each task, the utilization of the blocks, and the mean waiting times of the calls at each block. The current implementation of the simulator focuses more on functionality than on efficiency. The length of the simulation depends, of course, on the model size and on the desired width of the confidence intervals. The results shown for the rather simple example in Section 5.2 have been obtained on low-end hardware (dual-core notebook @ 2.0 GHz) after 40 seconds. The simulator has generated around 0.46 · 10^6 tasks per second. A more efficient implementation, recently initiated and also written in Java, simulates around 1.1 · 10^6 tasks per second on the same hardware.
Fig. 3. Queueing model of a single ASD block
4 Numerical Analysis In this section, we present our first steps toward an analysis method for ASD tree models. Our method is based on the decomposition of the tree on block level and targets, in its current stage, a restricted class of ASD models. Compared to the model class presented in Section 3, it requires that (i) each block has its own processing unit, (ii) a task either sends only S-calls or only AS-calls and (iii) each block only receives calls from at most two tasks: One task sends S-calls to the block, the other task sends AS-calls. Among other things, this implies that there can be only one task with S-calls. We first introduce a queueing station model for single ASD blocks in Section 4.1. The computation of the measures of interest for that model is presented in Section 4.2. Then, we explain how the single block analysis can be used to perform a decomposition-based analysis of ASD trees in Section 4.3. 4.1 Single ASD Block Analysis In the restricted model class introduced above, a single ASD block is completely described by the following four rates: the arrival rate λas and the service rate μas of the AS-calls and the think rate zs and the service rate μs of the S-calls. Please note that in the following we have left out the discussion of the trivial case where the block only receives calls from one task, i.e., either only AS-calls or only S-calls. We model the behavior of the block by a special queueing system with two queues and one service station, as shown in Figure 3. In this system, both types of calls, synchronous and asynchronous, have their own queue. AS-calls arrive at the top queue, that is modeled to be infinite, while S-calls arrive at the lower queue. As explained in Section 3, AS-calls have priority over S-calls and are served according to FIFO. Calls in progress are not preempted. The fact that only one S-call can be present in the block is modeled by a closed loop. 
An S-call is either (i) waiting for service in the corresponding queue, (ii) in service in the service station, or (iii) experiencing a think time outside the block. The think time represents the time between the moment an S-call returns from the block and the moment a new S-call enters the block. The presented model of a single block is a quasi-birth-death process (QBD). Figure 4 shows the resulting underlying CTMC. Apart from state 0, 0, E representing the empty system, the states are organized in two groups. The system is in state 0, a, S if an S-call is being served and a AS-calls are queued. Alternatively, the system can be in state s, a, AS, indicating that an AS-call is
Fig. 4. Underlying CTMC of the single-block QBD process
being served and that s synchronous and a asynchronous calls are queued. Remember that s can only be either 0 or 1. By solving the above CTMC, we can compute the desired measures of interest for a single block, such as the mean waiting time of the calls. However, we will see in Section 4.3 that the assumption of Poisson arrival, think and service processes is too restrictive. Fortunately, our model can easily be generalized to phase-type (PH) distributions. We only keep the requirement that the inter-arrival times of AS-calls are negative-exponentially distributed. We follow the common notation for PH-distributions [13] and write (a, T) for the PH-distribution with initial probabilities a, rate matrix T of the transient states, and rate vector T⁰ to the absorbing state. Let (aμas, Tμas) be the distribution of the service time Sas of AS-calls, (azs, Tzs) be the distribution of the think time Zs of S-calls, and (aμs, Tμs) be the distribution of the service time Ss of S-calls. For reasons of conformity, we also write (aλas, Tλas) for the distribution of the inter-arrival time Aas of AS-calls, although it will always be a negative-exponential distribution. The block-banded generator matrix Q of the resulting CTMC follows directly from the above descriptions; the derivation can be found in [9]. The steady-state probability vector p with pQ = 0 and Σᵢ pᵢ = 1 can be written as p = [z0 z1 z2 . . .] with the sub-vectors z0 and zi containing the state probabilities of the states at level 0 and level i ≥ 1 of the QBD, respectively. The sub-vectors can be computed by a matrix-geometric method, for example the LR method [12], which yields z0 and z1 as well as the matrix R with zi = z1 R^(i−1) for i ≥ 1. 4.2 Measures of Interest for Single Blocks An important measure of the system performance is the waiting time of the calls.
Typically, one is interested in the mean waiting time, but we will see in Section 4.3 that it is also useful to know higher moments of the S-call waiting time distribution. We begin with the computation of the S-call waiting time. As in the previous section, we first explain our approach for the simple case where all involved processes are
Fig. 5. CTMC describing the waiting time of an S-call
Poisson. An incoming S-call finds the block either empty, with probability e0, or it finds i + 1 AS-calls, with probability ai, i ≥ 0. In the first case, the waiting time is zero; in the second case, the system first has to process the i + 1 AS-calls and any other incoming AS-call. Figure 5 shows the absorbing CTMC representing the situation of a non-zero waiting time. The waiting time is the time that elapses between entering the state 1, i, AS (with probability ai) and reaching the absorbing state 0, 0, S. We introduce the matrix T, containing the transition rates for the transient states 1, i, AS:

        ⎡ −λas−μas     λas         0       ⋯ ⎤
    T = ⎢    μas    −λas−μas      λas      ⋯ ⎥        (1)
        ⎢     0        μas     −λas−μas    ⋯ ⎥
        ⎣     ⋮          ⋱          ⋱      ⋱ ⎦

The kth moment of the waiting time distribution is given by E[Ws^k] = (−1)^k k! a T^(−k) 1 with a = [a0 a1 . . .]. In the more general case of PH-distributions, the entries in matrix T are replaced by block matrices based on the distributions (aλas, Tλas) and (aμas, Tμas). Analogously, the scalar probabilities ai are replaced by vectors of the form ai = (1/pS0) z1 R^i c, where pS0 is the probability of having no S-call in the block and c is a matrix. The definitions of pS0 and c can be derived from the CTMC, as explained in [9]. In order to obtain finite expressions, we approximate the waiting time moments by truncating the absorbing CTMC at level t. In most of our experiments, truncation levels t ≥ 50 have not provided any further improvement of the results. The exact level depends on the system load and is determined iteratively in our implementation. The resulting expressions for the waiting time follow directly from the structure of the state space. Note that, in the special case of negative-exponentially distributed AS-call inter-arrival and service times, the times to absorption can be directly derived from the first-passage-time analysis of birth-death processes [10]. Knowing the waiting time, we can calculate the mean inter-arrival time E[As] of S-calls. Since the inter-arrival time of S-calls is the sum of the think time, the waiting time and the service time, it holds that E[As] = E[Zs] + E[Ws] + E[Ss]. The restriction to Poisson AS-call arrivals allows us to obtain a simple expression for their mean waiting time E[Was]. Using the well-known results for non-preemptive priority scheduling with Poisson arrivals [4], we obtain:

    E[Was] = ( E[Sas^2]/(2 E[Aas]) + E[Ss^2]/(2 E[As]) ) / ( 1 − E[Sas]/E[Aas] ).        (2)
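As a numeric sanity check (our own illustration; all rates are made up, using E[S²] = 2E[S]² for exponential service times), the following sketch evaluates equation (2) and computes a waiting time moment via the truncated tridiagonal matrix T of equation (1) for the purely exponential case:

```python
import math
import numpy as np

def mean_as_waiting_time(e_sas, e_sas2, e_ss2, e_aas, e_as):
    """Equation (2): mean AS-call waiting time under non-preemptive
    priority with Poisson AS-call arrivals."""
    residual = e_sas2 / (2.0 * e_aas) + e_ss2 / (2.0 * e_as)
    return residual / (1.0 - e_sas / e_aas)

def s_waiting_moment(k, a, lam, mu):
    """E[Ws^k] = (-1)^k k! a T^(-k) 1 for the purely exponential case,
    with T from Eq. (1) truncated at len(a) levels."""
    n = len(a)
    T = np.diag([-(lam + mu)] * n)
    T += np.diag([lam] * (n - 1), 1)   # AS-call arrivals
    T += np.diag([mu] * (n - 1), -1)   # AS-call service completions
    x = np.ones(n)
    for _ in range(k):
        x = np.linalg.solve(T, x)      # x := T^(-1) x
    return (-1) ** k * math.factorial(k) * (a @ x)

# illustrative rates: exponential services, so E[Sas^2] = 8, E[Ss^2] = 18
e_was = mean_as_waiting_time(e_sas=2.0, e_sas2=8.0, e_ss2=18.0,
                             e_aas=1.0 / 0.3, e_as=15.0)

# starting in state <1,0,AS>, the mean time to absorption is the mean
# M/M/1 busy period 1/(mu - lam); with lam=0.2, mu=1 that is 1.25
a = np.zeros(200)
a[0] = 1.0
m1 = s_waiting_moment(1, a, lam=0.2, mu=1.0)
```

With these numbers, equation (2) gives E[Was] = (1.2 + 0.6)/(1 − 0.6) = 4.5, and the truncation error at 200 levels is negligible for this load.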
4.3 Analysis of Multiple Blocks In the previous section, we have analyzed a single ASD block by solving the underlying CTMC of a queueing station model. This approach cannot be directly extended to general tree structures consisting of multiple blocks: the result would be a CTMC that is potentially infinite in multiple dimensions because of the infinite AS queues in each block. In the following, we propose a decomposition approach based on the single-block analysis. It relies on the following two observations: (i) When an S-call processed by block X sends an S-call to block Y, X stays blocked while the second S-call is processed by block Y. As a result, the call at block X experiences an effective service time that is the sum of its specified service time at X, the S-call waiting time at Y and the effective S-call service time at Y. (ii) From the viewpoint of block Y, the S-calls sent by X arrive with a perceived think time that is the sum of the perceived think time at X, the S-call waiting time at X and the original S-call service time at X. Once we know the effective S-call service time distribution and the perceived think time distribution of a block, we can analyze it by the method presented in Section 4.1. However, the effective service time and perceived think time are recursively defined since blocks depend on each other. We propose the iterative algorithm shown in Figure 6 to resolve the dependencies. The algorithm first initializes the distributions of each block with the distributions provided by the task specifications. Then, it employs a fixed-point iteration where in each iteration it analyzes the single blocks and updates the estimates of their effective service time, perceived think time and waiting time until the desired accuracy is reached. We assume that AS-call arrivals are always Poisson distributed.
An important operation in estimating the effective service time distributions and perceived think time distributions (steps 1 and 2 of the iteration loop) is the sequential composition function (seqcomp). For two PH-distributions (a1, T1) and (a2, T2), the sequential composition (a, T) is given by

    a = [ a1   (1 − a1·1) a2 ],      T = ⎡ T1   (−T1·1) a2 ⎤        (3)
                                         ⎣  0       T2     ⎦

However, a finite PH representation of the S-call waiting time distribution is not directly available from the block analysis due to the underlying dependencies. In Section 4.2, we have only computed the moments E[Ws^k]. Hence, we have to fit a PH-distribution to those moments (step 4 of the iteration loop). Our experiments have shown that two-moment fitting, as employed in [7,17], and three-moment fitting from [14] yield equivalent results. Finally, it should be noted that we have not proved that the algorithm reaches the desired fixed point or, at least, terminates for all possible ASD models. However, in the experiments performed in Section 5, the algorithm terminated after fewer than 20 iterations, even when a low relative threshold of 0.1% was chosen.
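A direct numpy transcription of equation (3) might look as follows (our own sketch, not the authors' implementation; composing three distributions amounts to nesting two calls). The sanity check uses the fact that the sequential composition of two PH distributions represents their sum, so the means add:

```python
import numpy as np

def seqcomp(a1, T1, a2, T2):
    """Sequential composition of two PH distributions, Eq. (3):
    a = [a1, (1 - a1*1) a2],  T = [[T1, (-T1*1) a2], [0, T2]]."""
    n1, n2 = len(a1), len(a2)
    a = np.concatenate([a1, (1.0 - a1.sum()) * a2])
    T = np.zeros((n1 + n2, n1 + n2))
    T[:n1, :n1] = T1
    # the exit-rate vector T1^0 = -T1*1 of the first part feeds the second
    T[:n1, n1:] = np.outer(-T1 @ np.ones(n1), a2)
    T[n1:, n1:] = T2
    return a, T

def ph_mean(a, T):
    """Mean of a PH distribution: E[X] = a (-T)^(-1) 1."""
    return a @ np.linalg.solve(-T, np.ones(len(a)))

# two exponential phases with means 0.5 and 0.25; their sequential
# composition represents the sum, so its mean is 0.75
a, T = seqcomp(np.array([1.0]), np.array([[-2.0]]),
               np.array([1.0]), np.array([[-4.0]]))
print(ph_mean(a, T))   # 0.75
```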
5 Validation In this section, we study the performance of the analysis and the simulation. We begin with a simple test case with two blocks and three tasks in Section 5.1. In the second test case in Section 5.2, we evaluate two blocks executing two interfering tasks. For those
Initialization. For each block B executing a task with S-calls do:
  1. Effective S-call service time distribution: (a^B_μs,eff, T^B_μs,eff) := (a^B_μs, T^B_μs).
  2. Perceived think time distribution: (a^B_zs,pcv, T^B_zs,pcv) := think time distribution of the task.
  3. S-call waiting time distribution: (a^B_Ws, T^B_Ws) := distribution with mean 0.
Repeat. For each block B do:
  1. If the block sends S-calls to block Y:
     (a^B_μs,eff, T^B_μs,eff) := seqcomp((a^B_μs, T^B_μs), (a^Y_Ws, T^Y_Ws), (a^Y_μs,eff, T^Y_μs,eff)).
  2. If the block receives S-calls from block X:
     (a^B_zs,pcv, T^B_zs,pcv) := seqcomp((a^X_zs,pcv, T^X_zs,pcv), (a^X_Ws, T^X_Ws), (a^X_μs, T^X_μs)).
  3. Perform the single-block analysis for this block with the specified λ^B_as and (a^B_μas, T^B_μas) and the estimated (a^B_zs,pcv, T^B_zs,pcv) and (a^B_μs,eff, T^B_μs,eff). Compute new estimates of E^B[Was] and of E^B[Ws^k].
  4. (a^B_Ws, T^B_Ws) := fit(E^B[Ws^k]).
  5. Calculate the changes in mean waiting times relative to the previous iteration.
Until all changes in mean waiting times are smaller than a given threshold.
Finally, compute the mean response time for each task by summing the expected waiting times and service times of the blocks visited by the task.
Fig. 6. Iteration procedure for multi-block analysis
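The skeleton of this loop can be sketched as follows (heavily simplified and hypothetical: waiting times are reduced to scalar means, and the single-block CTMC analysis with its distribution bookkeeping and PH fitting is replaced by a caller-supplied stub):

```python
def fixed_point(blocks, analyze_block, rel_tol=1e-3, max_iter=100):
    """Structure of the Fig. 6 iteration, reduced to mean waiting times:
    re-analyze every block against the current estimates for the other
    blocks until the relative change drops below the threshold.
    `analyze_block(name, waits)` stands in for the single-block analysis
    of Section 4.1."""
    waits = {b: 0.0 for b in blocks}
    for _ in range(max_iter):
        new = {b: analyze_block(b, waits) for b in blocks}
        change = max(abs(new[b] - waits[b]) / max(abs(new[b]), 1e-12)
                     for b in blocks)
        waits = new
        if change < rel_tol:
            return waits
    raise RuntimeError("fixed point not reached within max_iter iterations")

# toy mutual dependency between two blocks (a contraction, so the
# iteration converges); the coefficients are arbitrary
def toy_analysis(name, waits):
    if name == "A":
        return 1.0 + 0.5 * waits["B"]
    return 2.0 + 0.25 * waits["A"]

w = fixed_point(["A", "B"], toy_analysis)
```

For this toy dependency, the fixed point is w_A = 16/7 ≈ 2.286 and w_B = 18/7 ≈ 2.571, which the loop approaches geometrically.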
two test cases, we provide the results as obtained by the analysis and by the simulator. In the third test case (Section 5.3), we discuss an ASD structure that can currently only be evaluated by means of the simulator. 5.1 Test Case 1: Two Blocks In our first test case, we consider a system consisting of two ASD blocks A and B, as depicted in Figure 7a. The blocks execute three tasks. Task 1 sends S-calls to block A which in turn generate S-calls to block B. The mean think time of the task is 10. The mean service time is 3 at block A and 2 at block B. Task 2 sends AS-calls to block B with arrival rate λas,2 and mean service time 2. Task 3 sends AS-calls to block A with arrival rate 0.3 and mean service time 1. For this example we have omitted the right slave of the root node, assuming an infinitely high service rate. This results in a degenerate tree, which does not affect the applicability of the method. In Figure 8, we show the results for the mean response times for various arrival rates λas,2, as computed by the numerical analysis and the simulation using the fully parallelized model (together with the 95% confidence intervals). In addition, the figure also shows the analytically calculated utilization of block A for the different arrival rates. It is the effective utilization of the block, i.e., it also includes the blocking times when A is waiting for B. We observe that the analytical results almost overlap with those obtained by simulation. Only for task 1, a small difference is visible at λas,2 = 0.4, which corresponds to a relatively high utilization of 89%.
Fig. 7. ASD structures of the test cases: (a) test case 1, (b) test case 2, (c) test case 3
[Figure: mean response time (analysis vs. simulation for tasks 1–3, left axis) and utilization of block A (right axis) plotted over the arrival rate of task 2]
Fig. 8. Test case 1: Mean response times and utilization as function of λas,2
To better understand the source of the error, we give the mean waiting times of the two call types for λas,2 = 0.4 in Table 1. The respective relative errors (RE) between analysis and simulation are shown in the third row. The analysis obviously underestimates the waiting time of the S-calls at block A (first column). A reason for this can be found by inspecting the auto-correlation of the S-call waiting times at block B. Figure 9a gives the corresponding lag-k auto-correlation coefficients computed by the simulation for λas,2 = 0.4. Remarkably, the waiting times show a positive auto-correlation that does not fade at larger lags. It is known that such long-range dependencies can increase the response times of queueing stations [6]. Since our decomposition analysis fits a PH-distribution to the waiting time distribution of B, it cannot account for that effect
[Figure: lag-k auto-correlation coefficients of the S-call waiting times for lags 5–50; panel (a): test case 1 (block B), panel (b): test case 2 (blocks A and B)]
Fig. 9. Auto-correlation coefficients (AC) of the S-call waiting times for λas,2 = 0.4
[Figure: mean response time of tasks 1–3 plotted over the number of iterations of the analysis algorithm]
Fig. 10. Test case 1: Mean response times as function of the number of iterations for λas,2 = 0.4

Table 1. Test case 1: Mean waiting times for λas,2 = 0.4

             E^A[Ws]   E^B[Ws]   E^A[Was]   E^B[Was]
Simulation     7.28     20.17      50.71       8.49
Analysis       7.00     19.86      50.26       8.49
RE            -3.8%     -1.5%      -0.9%       0.0%
when building the effective service time process of the S-calls at A. Finally, Figure 10 shows the estimated mean response times as a function of the number of iterations of the analysis algorithm. We observe that the estimated values quickly converge to the final results after a few iterations. In total, the analysis results are computed in less than five
[Figure: mean response time (analysis vs. simulation for tasks 1 and 2, left axis) and utilization of block A (right axis) plotted over the arrival rate of task 2]
Fig. 11. Test case 2: Mean response times and utilization as function of λas,2
seconds on low-end hardware (a dual-core notebook @ 2.0 GHz) with a relative threshold of less than 0.1% for the iteration loop. The computation of the simulation results takes substantially longer, between 30 seconds and two minutes with our original implementation, depending on the desired width of the confidence intervals.

5.2 Test Case 2: Interfering Tasks

For the second test case, we again consider two blocks A and B. In contrast to the previous test case, the blocks have to execute two interfering tasks, as shown in Figure 7b. Again, task 1 sends S-calls to block A which in turn generate S-calls to block B. The mean think time is 10. The mean service time is 3 at block A and 2 at block B. The AS-calls generated by task 2 run in the opposite direction: every AS-call sent to B also generates an AS-call to A. The arrival rate of AS-calls at B is λas,2. The mean service time is 2 at block B and 1 at block A. Figure 11 shows the mean response times for various arrival rates λas,2, as computed by the numerical analysis and by the simulation using the fully parallelized model (together with the 95% confidence intervals), as well as the analytically calculated utilization of block A. This time we see a much clearer difference between simulation and analysis for utilizations higher than 80%, i.e., for λas,2 ≥ 0.35. We give the mean waiting times at each block for λas,2 = 0.4 in Table 2, together with the relative errors (RE) between analysis and simulation (third row). We notice a significant error in the waiting time analysis at block A (columns 1 and 3). Interestingly, the waiting times of the S-calls at blocks A and B do not exhibit any significant auto-correlation in this case. Figure 9b shows the corresponding auto-correlation coefficients for both blocks, as obtained by simulation for λas,2 = 0.4. The coefficients are always close to zero, even for a lag of 1. Of course, this does not exclude other kinds of dependencies (see below).
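The lag-k auto-correlation coefficients discussed above can be estimated directly from a simulated waiting-time series. A minimal sketch (the function name and the synthetic i.i.d. series are ours, not part of the paper's simulator):

```python
import random

def lag_autocorrelation(xs, k):
    """Estimate the lag-k auto-correlation coefficient of the series xs."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    cov = sum((xs[i] - mean) * (xs[i + k] - mean) for i in range(n - k)) / n
    return cov / var

# An i.i.d. series shows coefficients close to zero at all lags, whereas
# long-range dependent waiting times (as in Fig. 9a) stay positive even
# at large lags.
random.seed(1)
iid = [random.expovariate(1.0) for _ in range(20000)]
print(round(lag_autocorrelation(iid, 10), 3))
```

Applying this estimator at lags 1 to 50 to the simulated waiting times reproduces curves of the kind shown in Figure 9.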
We have also checked two other possible sources of
Table 2. Test case 2: Mean waiting times for λas,2 = 0.4

             EA[Ws]   EB[Ws]   EA[Was]   EB[Was]
Simulation    2.61     5.78     11.78      3.42
Analysis      2.27     5.69      9.63      3.44
RE          -13.0%    -1.6%    -18.3%      0.6%
errors. First, there is the assumption in our algorithm that AS-call arrivals are Poisson. We have inspected the simulated inter-arrival times of the AS-calls at block A and, indeed, they are independent and the arrival process is nearly Poisson. Second, we have checked whether the error could be caused by the three-moment fitting procedure itself. For this purpose, we have also computed the fourth moment of the S-call waiting time distribution. The relative error of the fourth moment of the fitted distribution with respect to the unfitted waiting time distribution (see Section 4.2) is -7.0% at block A and -10.8% at block B. Although these values look large, other effects have, in our experience, a much higher impact on the system performance than the fourth, and often even the third, moment [16]. We believe that the large analysis errors at high utilizations are caused by the fact that in this test case the two blocks are made dependent on each other in "two directions": by the S-calls from A to B and by the AS-calls from B to A. When an S-call is queued for service at block A, no S-call is being served by block B. This means that B can serve AS-calls while the S-call is waiting. Since each AS-call served by B sends a new AS-call to A, the waiting S-call at A has to wait even longer (remember that AS-calls have higher priority). A similar effect increases the waiting times of AS-calls when A is processing an S-call. This behavior of inter-dependent queueing stations is not modeled by our decomposition algorithm.

5.3 Test Case 3: Complex Structure

For our last experiment, we consider an ASD tree structure consisting of 5 blocks, as shown in Figure 7c. We define three tasks. Task 1 sends S-calls to A with a mean think time of 5, followed by S-calls to B. Task 2 sends AS-calls to D with an arrival rate of 0.2, followed by AS-calls to B. Task 3 sends calls in the following order: AS-calls to E with arrival rate λas,3, AS-calls to B, AS-calls to A, and S-calls to C.
All mean service times are 1.0. Figure 12 shows the resulting mean response times for the three tasks for different λas,3, as obtained by simulation using the fully parallelized model. Relative confidence intervals are smaller than 2% for all results (not shown). We observe that the response times of tasks 1 and 3 increase quickly with increasing arrival rate since both tasks compete for the resources of block A. Note that A is often blocked because of S-calls to B and C. The response time of task 1 increases even faster than that of task 3 because the AS-calls of the latter have priority over task 1's S-calls. Figure 12 also shows the mean response times obtained by simulation when using the processor sharing (PS) model. As expected, the response times are much higher than in the parallelized model, and the system already reaches its maximum utilization when λas,3 approaches 0.15. In addition, the increasing arrival rate affects all tasks in the PS model, while task 2 is largely independent of tasks 1 and 3 in the parallelized model.
Fig. 12. Test case 3: Mean response times as function of λas,3 (tasks 1, 2 and 3, parallelized model vs. processor sharing)
6 Conclusions

This paper presented a performance evaluation approach for embedded software generated along the lines of the ASD suite of Verum. The simulator presented in this paper derives accurate results for general models, but is time-consuming. The numerical analysis is very accurate for non-interfering tasks, but suffers from correlation effects when tasks interfere and the utilization increases. Several assumptions have been made about the ASD-generated software to make modeling and analysis possible. The structure of the ASD-generated software as described in this paper does not include all constructions allowed by the ASD suite. Even though deviations from the presented structure are possible in ASD from a modeling point of view, it is often not possible to formally verify them with ASD. Furthermore, service time, inter-arrival time and think time distributions were assumed to be negative exponential. To analyze real systems with different distributions, these need to be approximated by phase-type distributions. This results in a larger state space, but does not pose any principal problem for the presented numerical analysis. Clearly, the simulator can easily be adapted to deal with different distributions. Due to the increasing relative error for interfering tasks, the analysis as presented in this paper cannot directly be used for larger embedded software designs. However, we consider it an important first step towards more precise methods. Research will be continued in the recently started COMMIT project Allegio [1] in collaboration with Philips Healthcare. Future work will include a comparison with empirical data and simulations with two and more processing units. Furthermore, we will take the correlation of the waiting time processes at higher lags into account and better explore the iteration behavior of the algorithm, as done in [18].
However, note that not taking into account certain dependencies between blocks is a general weakness of compositional algorithms
[21,17,8]. Hence, we also plan to use abstraction techniques on the complete underlying state space, which is infinite in as many dimensions as there are ASD blocks, as proposed in [11].

Acknowledgements. Anne Remke is funded through 3TU.CeDiCT and an NWO Veni grant.
References

1. Allegio (2011), http://www.esi.nl/research/applied-research/current-projects/allegio/
2. Broadfoot, G.H., Broadfoot, P.J.: Academia and industry meet: Some experiences of formal methods in practice. In: 10th Asia-Pacific Software Engineering Conference (APSEC 2003), pp. 49–58 (2003)
3. Chakraborty, S., Künzli, S., Thiele, L.: A general framework for analysing system properties in platform-based embedded system designs. In: DATE (2003)
4. Cobham, A.: Priority assignment in waiting line problems. Operations Research 2(1), 70–76 (1954)
5. Franks, G., Al-Omari, T., Woodside, M., Das, O., Derisavi, S.: Enhanced Modeling and Solution of Layered Queueing Networks. Transactions on Software Engineering 35(2), 148–161 (2009)
6. Grossglauser, M., Bolot, J.C.: On the relevance of long-range dependence in network traffic. IEEE/ACM Transactions on Networking 7(5), 629–640 (1999)
7. Haverkort, B.: Approximate Analysis of Networks of PH|PH|1|K Queues: Theory & Tool Support. In: Beilner, H., Bause, F. (eds.) MMB 1995 and TOOLS 1995. LNCS, vol. 977, pp. 239–253. Springer, Heidelberg (1995)
8. Heindl, A.: Decomposition of general queueing networks with MMPP inputs and customer losses. Performance Evaluation 51(2-4), 117–136 (2003)
9. Hettinga, S.: Performance Analysis for Embedded Software Design. Master's thesis, University of Twente (2010)
10. Jouini, O., Dallery, Y.: Moments of first passage times in general birth-death processes. Mathematical Methods of Operations Research 68, 49–76 (2008)
11. Klink, D., Remke, A., Haverkort, B., Katoen, J.P.: Time-bounded reachability in tree-structured QBDs by abstraction. Performance Evaluation 68, 105–125 (2011)
12. Latouche, G., Ramaswami, V.: A logarithmic reduction algorithm for quasi birth and death processes. Journal of Applied Probability 30, 650–674 (1993)
13. Neuts, M.: Matrix-Geometric Solutions in Stochastic Models — An Algorithmic Approach. Dover Publications, Inc. (1981)
14. Osogami, T., Harchol-Balter, M.: A Closed-Form Solution for Mapping General Distributions to Minimal PH Distributions. In: Kemper, P., Sanders, W.H. (eds.) TOOLS 2003. LNCS, vol. 2794, pp. 200–217. Springer, Heidelberg (2003)
15. Rolia, J., Sevcik, K.: The Method of Layers. Transactions on Software Engineering 21(8), 689–700 (1995)
16. Sadre, R.: Decomposition-Based Analysis of Queueing Networks. Ph.D. thesis, University of Twente (2006)
17. Sadre, R., Haverkort, B.: FiFiQueues: Fixed-Point Analysis of Queueing Networks with Finite-Buffer Stations. In: Haverkort, B.R., Bohnenkamp, H.C., Smith, C.U. (eds.) TOOLS 2000. LNCS, vol. 1786, pp. 324–327. Springer, Heidelberg (2000)
18. Sadre, R., Haverkort, B.: Decomposition-Based Queueing Network Analysis with FiFiQueues. In: Queueing Networks: A Fundamental Approach. International Series in Operations Research & Management Science, vol. 154, pp. 643–699. Springer, Heidelberg (2011)
19. Verum (2010), http://www.verum.com
20. Wandeler, E., Thiele, L., Verhoef, M., Lieverse, P.: System architecture evaluation using modular performance analysis: a case study. International Journal on Software Tools for Technology Transfer 8(6), 649–667 (2006)
21. Whitt, W.: The Queueing Network Analyzer. The Bell System Technical Journal 62(9), 2779–2815 (1983)
Reducing Channel Zapping Delay in WiMAX-Based IPTV Systems

Alireza Abdollahpouri 1,2 and Bernd E. Wolfinger 1

1 Department of Computer Science - TKRN, University of Hamburg, Germany
2 University of Kurdistan, Sanandaj, Iran
[email protected]
Abstract. Due to the enormous improvement of networking technologies and the advances in media encoding and compression techniques, IPTV has become one of the fastest growing services in the Internet. When offered via wireless technologies (e.g., WiMAX), IPTV can pave the way for quad-play in next generation networks and ubiquitous delivery. IPTV subscribers expect the same or even better Quality of Experience (QoE) compared with the services offered by traditional operators (e.g., cable or satellite). An important QoE element is the channel switching delay, also known as zapping delay. In this paper we propose a prediction-based prejoin mechanism that joins one or two TV channels (those likely to be selected next) in advance in order to shorten the channel switching time in WiMAX-based IPTV systems. A trace-driven simulation is conducted to evaluate the proposed method. The simulation results confirm a reduction of about 30% in the average zapping delay.

Keywords: IPTV, Zapping delay, WiMAX, Performance evaluation.
1 Introduction
Traditional one-way broadcasting of TV programs no longer satisfies the new generation of TV users who have grown up with the Internet and interactive gaming. Internet Protocol TV (IPTV) describes a mechanism for transporting TV streams encapsulated in IP packets using networking protocols, and tries to offer more interactivity and more control over the content. To provide ubiquitous delivery, IPTV service providers have to pay special attention to wireless broadband technologies as their access networks. Worldwide Interoperability for Microwave Access (WiMAX) technology, which is based on the IEEE 802.16 air-interface standard, has salient features such as high data rate, multicast support, guaranteed quality of service, and scalability. Therefore, it is a good candidate to deliver IPTV services to fixed and mobile subscribers. The time between pushing the channel change button and the first video frame being displayed on the TV is called zapping delay. Besides acceptable audiovisual quality, channel zapping delay is a fundamental requirement for the quality of the user's experience (QoE). Although it seems to be a natural requirement from a subscriber's perspective, providing this functionality can be problematic for network operators.

J.B. Schmitt (Ed.): MMB & DFT 2012, LNCS 7201, pp. 182–196, 2012. © Springer-Verlag Berlin Heidelberg 2012
Recently, many research efforts have been devoted to reducing channel zapping delay [1]-[18]. Cisco proposed the VQE (Visual Quality of Experience) appliance for xDSL-based IPTV systems, which uses I-frame caching and unicast bursts to accelerate channel switching [1]. J. Lin et al. in [2] proposed another unicast-based method which distributes an additional media stream starting with a key frame (to reduce the I-frame acquisition delay). However, because of the correlation in channel switching events (e.g., during commercial advertisements), unicast-based schemes lead to spikes in the network load. In order to prevent such an impulsive load, [3] and [4] use multicast-based approaches (an additional multicast stream instead of an additional unicast burst). This additional stream can be a time-shifted copy of the original stream [4] or a secondary lower-quality (I-frame only) stream [3]. In [5], H. Joo et al. proposed a method to insert extra I-frames into the channels based on the user's channel preference information. However, this method reduces the compression efficiency. H. Uzunalioglu in [6] tried to adjust the GOP (Group of Pictures) duration to decrease the channel change delay. All of the abovementioned techniques try to reduce the I-frame acquisition delay. Prejoining or predictive tuning is another technique in which one or more channels (those likely to be selected next) are prejoined and pre-buffered in addition to the currently watched channel in order to reduce the channel zapping time [7]-[11]. In [7], U. Oh et al. presented various hybrid channel prefetching and reordering schemes and showed that the adjacent channel prefetching scheme performs better than the popular channel prefetching scheme no matter what reordering scheme is used.
A rating server is proposed in [8], which gathers information about channel change events from Set-Top Boxes (STBs) and maintains statistics for each STB (which, of course, could lead to privacy problems). Based on those statistics, a list of channels is predicted and, therefore, the user experiences a low zapping delay when selecting those channels. A survey of prediction-based methods to reduce channel zapping delay is presented in [9]. In [10] and [11], the authors try to predict the channels based on the surfing behavior of IPTV subscribers. Scalable Video Coding (SVC) schemes can also be applied for rapid channel switching. In [12], Y. Lee et al. allocated the base layer and enhancement layer of each channel to two separate multicast groups. The base layers of a collection of candidate channels are stored in the buffer. Channel switching in preview mode occurs immediately since the users access the base layers without delay. In watching mode, both the base and enhancement layers of the selected channel are used to achieve full quality. A combination of prejoining and SVC to reduce zapping delay is used in [10]. Some researchers try to influence network factors such as latency in the access network by enhancing IGMP features (e.g., reducing the number of IGMP membership queries, join before leave, and snooping). In [13], the authors proposed sending an IGMP Join message for the requested channel before leaving the currently watched channel with an IGMP Leave message. E. Lee et al. in [14] proposed a new extended IGMP for mobile WiMAX access networks. Fig. 1 summarizes some of the most important methods used for reducing channel switching delay.
In this article, we try to reduce the average switching (zapping) delay by combining two mechanisms: a combined multicast/unicast transmission of TV channels and prediction-based prejoining. Taking advantage of the multicast support of WiMAX and considering the minimum slot requirement, one or two channels that are likely to be selected next will be prejoined. Prejoining is only applied for unicast channels and during surfing periods. The main differences between this work and other prediction-based methods are twofold: we focus on WiMAX networks while other works target DSL-based access networks, and we combine prediction-based prejoining with multicasting the most popular channels to reduce zapping delay. Note that, in our paper, prejoining is meant not in the sense of a multicast join but in the sense of pre-requesting. The rest of this paper is organized as follows. In Section 2, some background information about the Multicast Broadcast Service (MBS) in WiMAX and channel switching delay is given. Section 3 presents our prediction-based prejoin proposal. We then present our simulation-based performance evaluation of the proposed method in Section 4. Finally, we conclude the paper in Section 5.
Fig. 1. Categorization of zapping delay reduction schemes
2 Background

2.1 WiMAX Multicast Broadcast Service (MBS)
Unicast transmission may not be an efficient approach in terms of bandwidth requirement, because the resource requirement increases proportionally with the number of users. Building on the features of similar technologies (namely MBMS, DVB-H and MediaFLO), the IEEE 802.16e standard introduced the Multicast Broadcast Service (MBS). MBS provides an efficient method for the concurrent transmission of commonly demanded data (e.g., a TV channel) to a group of users, using a common multicast connection identifier (MCID).
Using MBS, the bandwidth requirement is reduced from one burst per viewer to one burst per TV channel. MBS is offered in the downlink only. To manage the overall operation of MBS, an MBS controller (server) is needed in the system, as shown in Fig. 2. It is worth noting that MBS cannot take advantage of Adaptive Modulation and Coding (AMC) because it must fulfill the requirements of all clients in the multicast group (even those at the border of the cell with low SNR). Likewise, ARQ cannot be used in multicast sessions.
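The bandwidth effect of MBS can be illustrated by counting downlink data bursts. A small sketch (the function name, the threshold of two viewers per multicast channel, and the seven-viewer example are our illustrative assumptions):

```python
from collections import Counter

def count_bursts(viewer_channels, multicast_min_viewers=2):
    """Return (bursts with unicast only, bursts when popular channels use MBS)."""
    demand = Counter(viewer_channels)       # viewers per channel
    unicast_only = len(viewer_channels)     # one burst per viewer
    mixed = sum(                            # one burst per MBS channel,
        1 if n >= multicast_min_viewers else n
        for n in demand.values()            # one burst per remaining unicast viewer
    )
    return unicast_only, mixed

# Seven viewers watching channels 1, 1, 1, 2, 2, 3, 4: pure unicast would
# need 7 bursts, multicasting channels 1 and 2 brings it down to 4.
print(count_bursts([1, 1, 1, 2, 2, 3, 4]))
```

The same counting logic underlies the example cell discussed with Fig. 3 below.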
Fig. 2. WiMAX access network with MBS server (ASN-GW: Access Services Network Gateway; BS: Base Station; MS: Mobile Station; SS: Subscriber Station; STB: Set-Top Box)
The BS sends an MBS MAP (media access protocol) message to specify the location and size of MBS data bursts in the MBS region of the downlink subframe. MBS MAP is located at the first subchannel and the first OFDMA symbol of the MBS region as illustrated in Fig. 3. Similar to unicast services in IEEE 802.16, the MBS service flows are managed through a DSx (Dynamic Service Addition/Deletion/Change) messaging procedure used to create, change, and delete a service flow for each MS.
Fig. 3. Multicast and unicast in a WiMAX cell
A WiMAX cell with seven IPTV users is depicted in Fig. 3. The numbers beside the stations indicate the TV channel each user is watching (at an instant of time). Three users are watching channel 1 and two others are watching channel 2. TV channels 3 and 4 are each watched by only one user. Channels 1 and 2 are transmitted by means of multicast bursts; unicast bursts are used for TV channels 3 and 4. The number of data bursts (unicast and multicast) is thus four. Note that because channel 3 is watched by a user who is close to the BS (good signal condition), its resource requirement (number of OFDMA slots) is lower.

2.2 Channel Switching Delay in IPTV Systems
In the analog cable TV network, all channels are available simultaneously on the link at different frequencies. Therefore, a channel change is almost instantaneous, since it only involves the TV receiver tuning to a new carrier frequency, demodulating the analog signal, and displaying the video on the screen. With the introduction of digital transmission technologies and video compression techniques such as MPEG, a channel change is no longer immediate, because of factors like I-frame acquisition delay, de-jitter buffering delay, and MPEG buffering delay. In other words, the dependency on previous frames in compressed video streams prevents random access and prolongs the switching delay. Changing a channel in an IPTV system is even more complicated due to additional delay components such as multicasting delay (IGMP join and leave and route establishment) and buffering delays in the intermediate nodes of the network. The steps involved in channel switching depend on the networking infrastructure and the location of the requested channel. For instance, if the channel is available at the BS, the delay is shorter than when the channel must be requested from the MBS server. In an IPTV system, due to the limited capacity of the last mile, only a rather limited number of TV channels can be transmitted. Therefore, one may experience a switching delay of a few seconds. Switching to an MBS channel involves a shorter delay than switching to a unicast channel, because in the former case the channel switching time only consists of the delay of waiting for the next burst on the target MBS stream and buffering it in the MS to avoid underflow and remove the jitter. In the latter case, however, the MS must first obtain bandwidth to transmit the channel change request (at the MAC and IP levels). This is a random backoff-based contention mechanism and happens during the contention period in the uplink subframe.
Once the channel change request is accepted by the BS and the requested channel has been scheduled, the BS advertises the position of the unicast bursts via the DL-MAP. Thereafter, the MS has to decode the MAP and find the location of the desired burst. Finally, after a buffering delay, the first picture of the new channel is displayed. Note that the optimization techniques that have been explored in the wired IPTV domain (e.g., a separate tune-in stream or prediction-based prejoining) can also be adapted for WiMAX MBS. To summarize, zapping time can be influenced by many factors such as:

• Multicast latency for "leaving" the old channel and "joining" the new channel;
• Program Clock Reference (PCR) and sequence header information;
• Random access point (e.g., I-frame) acquisition delay (example: if GOP=15 and fps=30, an I-frame is produced every 500 ms);
• Network buffer delays, including delays caused by error-mitigation techniques;
• MPEG decoder buffer delay.

Note that the ITU-T FG IPTV group recommends that the time taken by the channel switching process should not exceed 2.5 s.
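The I-frame acquisition delay from the first bullet above is easy to quantify: the I-frame period is GOP/fps, and with a uniformly random tune-in instant, the mean wait for the next I-frame is half of that period. A small sketch of the arithmetic (the helper names are ours):

```python
def iframe_period_ms(gop, fps):
    """Time between successive I-frames in milliseconds."""
    return 1000.0 * gop / fps

def mean_iframe_wait_ms(gop, fps):
    """Mean wait until the next I-frame for a uniformly random tune-in instant."""
    return iframe_period_ms(gop, fps) / 2.0

print(iframe_period_ms(15, 30))      # 500.0 ms, as in the example above
print(mean_iframe_wait_ms(15, 30))   # 250.0 ms on average
```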
3 Workload Generation and Prediction-Based Prejoin Method
Before turning to our proposed method, we need some knowledge about the workload, and therefore we model the switching behavior of a typical IPTV user.

3.1 Modeling the Behavior of a Single IPTV User during an ON Session
The behavior of an IPTV user differs from that of users of other IP-based applications. Fig. 4 illustrates the typical behavior of an IPTV user during an ON period (active session). In this figure, the switching events performed by the user during the peak hour (here, 9 PM to 10 PM) are depicted. A switching event occurs when a user selects a new TV channel. Several switching events with a short inter-arrival time (e.g., less than 10 s) show that the user is zapping TV channels to find something of interest. The sequence of channel switches prior to viewing is called a zapping block. For example, three zapping blocks with sizes of 4, 2 and 6 can be seen in the figure during the one-hour time interval. Note that the channels actually watched during a long time period are not included in the zapping blocks. The user experiences a sequence of zapping periods followed by viewing periods. The user is in the watching state whenever there is no switching event during a relatively large amount of time. Modeling and analyzing this type of workload can help IPTV service providers during the design process, or after implementation to evaluate several "what-if" scenarios.
Fig. 4. Channel zapping (z) and viewing (v)
Depending on the next channel chosen, channel switching is considered to be sequential or targeted. In sequential channel switching, the user chooses the next channel using the UP and DOWN buttons on the remote control. Targeted switching represents the cases in which the user chooses the desired channel directly by pressing the channel number or using the Electronic Program Guide (EPG). During a commercial advertisement or, e.g., between the half-times of a football match, most users change the channel to find a more favorite program. In other words, the channel switching behavior of the viewers may be correlated. Several problems have to be considered when modeling the user behavior. In general, the following three main questions should be answered.

Q1. How many channels does a user surf on average before viewing? (Size of the zapping block)
Q2. Which channels is a user watching or surfing? (Access pattern - channel number)
Q3. When do channel change events happen? (Channel dwell time in viewing or zapping mode)

In [19] we introduced our model covering both channel popularity and user activity in an IPTV system for a single typical IPTV user. The proposed user behavior automaton (TV-UBA) modeling the zapping and viewing periods of Fig. 4 is depicted in Fig. 5. According to our model, after turning the STB on (state Si), the subscriber starts zapping channels with probability pz or watches a channel with probability 1-pz. In zapping mode, the user may surf one or more channels before viewing (zapping block). Each state indicates the number of successive channel switching events. For example, state Z3 means surfing three channels before watching. The user returns to viewing mode after each zapping block. In viewing mode, after watching a specific channel, the viewer may terminate watching with probability pt, view another channel, or start surfing another set of channels.
Interested readers can refer to [19] for more detailed information about view and zapping states.
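The alternation of zapping blocks and viewing periods can be sketched as a small stochastic simulation. The sketch below is a deliberately simplified reading of TV-UBA: the geometric zap-block length, the session-termination rule, and the parameter values are our assumptions, not the full automaton of [19]:

```python
import random

def simulate_session(pz, pt, rng, max_events=10_000):
    """Generate the list of zapping-block sizes in one ON session.

    pz: probability of starting/continuing to zap (simplifying assumption:
        block length is geometric with this continuation probability),
    pt: probability that a viewing period terminates the session.
    """
    blocks = []
    for _ in range(max_events):
        if rng.random() < pz:
            size = 1                       # at least one quick switch
            while rng.random() < pz:       # keep surfing with probability pz
                size += 1
            blocks.append(size)
        if rng.random() < pt:              # session ends after this viewing period
            break
    return blocks

rng = random.Random(42)
blocks = simulate_session(pz=0.7, pt=0.1, rng=rng)
print(len(blocks), blocks[:5])
```

Feeding such block-size sequences into a trace generator yields the kind of synthetic ON-session traces shown in Fig. 6.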
Fig. 5. User behavior automaton for IPTV user (TV-UBA), cf. [19]
We used the LoadSpec tool [20][21] to formally describe and thereafter simulate our model. A sample output of our model is shown in Fig. 6. Zapping blocks and viewing time are clearly distinguishable. In this figure, the user first surfs three channels (zap block with size three) and then starts watching the fourth channel. After termination of the corresponding viewing period, the user browses five channels and finds the sixth channel of interest.
Fig. 6. Output of the TV-UBA model in LoadSpec tool [19]
3.2 Our Proposed Prediction-Based Prejoin Method
M. Cha et al. in [22] reported that about 60% of channel changes are sequential. In other words, most users prefer to switch channels using the UP and DOWN buttons on the remote control. Therefore, in sequential switching, the next requested channel is an adjacent channel. The remaining switching events are called targeted switching, in which the choice of the next channel only depends on the watching probability of the destination channel. Based on this information, instead of just prejoining neighboring channels, we propose a more intelligent prejoin method (prejoining one or two channels) which considers the channel switching behavior of a typical IPTV user, as follows (the currently requested channel is Ci):
where:

• p(Ci): watching probability of TV channel Ci;
• pta: probability of targeted switching;
• pseq: probability of sequential switching (about 0.6 [22]), with pseq = 1 - pta;
• pu: probability of sequential UP switching (about 70% of sequential switching events).

Here, Prejoin1 is a method which predicts the next channel based on statistics obtained from the switching behavior of users in a real IPTV system. The probability state diagram for Prejoin1 is given in Fig. 7. In the figure, Cmax denotes the maximum number of offered TV channels. Whenever Ci+1 exceeds Cmax, Ci+1 is evidently replaced by C1 (we omit this in the text for better readability). Prejoin2 always predicts the next upper channel for prejoining. Therefore, one or two channels will be prejoined, depending on whether the prejoined channels are the same or not. Also, with probability pta⋅p(Ci) the method Prejoin1 will not recommend any channel to prejoin; in this case, too, only channel Ci+1 will be prejoined (according to method Prejoin2). Note that prejoining is performed only for unicast channels.
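The two predictors can be sketched in code. Since the paper's exact formula is not reproduced above, `prejoin1` below is our own plausible reading (it compares the sequential UP/DOWN transition probabilities with popularity-weighted targeted switches and returns the most likely next channel); `prejoin2` simply returns the next upper channel. Treat this as an interpretation, not the authors' definition:

```python
def prejoin1(i, popularity, p_seq=0.6, p_up=0.7, c_max=50):
    """Predict the most likely channel after channel i (our reading of Prejoin1).

    popularity maps channel number -> watching probability p(C_j).
    """
    p_ta = 1.0 - p_seq
    up = i % c_max + 1                  # wrap C_max -> C_1
    down = c_max if i == 1 else i - 1
    # sequential transitions
    candidates = {up: p_seq * p_up, down: p_seq * (1.0 - p_up)}
    # targeted transitions, weighted by channel popularity
    for ch, p in popularity.items():
        if ch != i:                     # "staying" on C_i yields no prejoin
            candidates[ch] = candidates.get(ch, 0.0) + p_ta * p
    return max(candidates, key=candidates.get)

def prejoin2(i, c_max=50):
    """Always predict the next upper channel."""
    return i % c_max + 1

# With a flat popularity profile the sequential UP transition dominates:
flat = {c: 1.0 / 50 for c in range(1, 51)}
print(prejoin1(10, flat), prejoin2(10))  # both predict channel 11
```

With a strongly skewed popularity profile, `prejoin1` can instead recommend a popular non-adjacent channel, in which case two different channels are prejoined.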
Fig. 7. Prejoin1 (Ci) probability state diagram; Cmax=50
Let us take a simple example to explain the prediction-based prejoin method. In Fig. 8, a part of the trace of an IPTV user is shown, consisting of two zapping blocks and one viewing period. The numbers above the arrows indicate the requested channels. For the sake of simplicity, assume the channels are sorted in descending order of popularity (channel one is the most popular channel). If, for example, the five most popular channels are transmitted by means of multicast, the user experiences a shorter switching delay for these channels. For the scenario illustrated by Fig. 8, Table 1 indicates the prejoined channels as well as the channel change delay for each switching event ("T" represents a long delay and "t" indicates a short delay). If the requested channel is correctly predicted and prejoined, the switching delay is virtually zero (at the time of switching, the new channel is already buffered and ready to decode). Note that the first channel switch (if unicast) suffers a long delay since no prediction has been made yet. For the third channel switch (channel 22), the switching delay is zero because this channel was predicted and prejoined previously (by Prejoin1). The same happens at the 8th switching event, when both prejoin methods predict the next channel correctly (in this case only one channel is prejoined, namely channel 11).
Reducing Channel Zapping Delay in WiMAX-Based IPTV Systems
191
To save valuable bandwidth, prejoined channels are released in viewing mode (when the channel dwell time exceeds 30 seconds). Therefore, at the 7th event the switching delay is long, even though channel 10 was predicted and prejoined correctly: the user had stayed in viewing mode on the previous channel, so the prejoined channel had already been released.
Fig. 8. Channel switching events in a period of time
For this example the average zapping delay is equal to:
Average_Zap_delay = (6T + 2t)/10

Table 1. An example of zapping delay reduction with prejoining
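With the delay values T = 2.5 s and t = 0.5 s assumed later in Section 4, this evaluates to (6·2.5 + 2·0.5)/10 = 1.6 s; a quick check (our sketch, using the ten events of the example):

```python
def average_zap_delay(delays):
    """Mean delay over all switching events (zero for successful prejoins)."""
    return sum(delays) / len(delays)

T, t = 2.5, 0.5                          # unicast (long) / multicast (short)
events = [T] * 6 + [t] * 2 + [0.0] * 2   # six long, two short, two prejoined
print(average_zap_delay(events))         # (6T + 2t)/10 = 1.6
```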
4
Performance Evaluation
A trace-driven simulation is conducted to evaluate the performance of the proposed prejoin method. For this purpose, a dedicated simulation program was implemented by us using the C++ programming language. The watching probability of TV channels can be modeled quite realistically using a Zipf-like distribution with the following formula:

p(Ci) = Ω / i^α,

where

Ω = ( ∑_{j=1}^{C} 1/j^α )^(−1),   0 < α ≤ 1.
Here, α is the shaping parameter, Ω is the scaling parameter (determined by α and C), and C is the number of TV channels. We assume 50 TV channels and α = 1 [5]; then Ω = 0.2222622. The simulation scenario is similar to Fig. 2 and is composed of a WiMAX cell and 30 (respectively 60) IPTV users. The MBS server has access to all channels and handles the TV requests of IPTV users on behalf of the IPTV head-end. We elaborated the effects of overhead slots in WiMAX-based IPTV systems in [23] (using both analytical and simulation methods) and showed that with a correct
192
A. Abdollahpouri and B.E. Wolfinger
combination of multicast and unicast TV channels and a proper scheduling policy, the overhead can be significantly reduced. The total number of required slots (data bursts plus overhead slots) for the cases of 30 and 60 users is shown in Fig. 9. When 30 IPTV users exist in the cell, multicasting the 4 most popular channels leads to the minimum slot requirement. When 60 users are served, the 8 most popular channels should be transmitted by means of multicast to expend the minimum number of slots when providing the IPTV service. Usually one MBS burst contains several GOPs and therefore there will be more than one I-frame in the MBS burst. Hence, for WiMAX MBS we do not need to account for an I-frame acquisition delay. We perform the simulation for the cases where the first N channels are transmitted by means of MBS (varying N = 1, …, 10). We also assume a target switching delay of 2.5 sec for unicast (T = 2.5) and 0.5 sec for multicast (t = 0.5). For each user, a 10-hour trace is obtained from an existing TV-UBA model. Note that we use the term “trace” in its general sense, i.e., not only referring to measured data but also to data obtained from our TV-UBA model. Each entry of the trace includes the following information:
⋅ Timestamp
⋅ Type of event (i.e., Request for channel, Start watching, Terminate watching)
⋅ Number of the requested channel (when the event is Request for channel)
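The Zipf-like popularity model above can be checked numerically; the following sketch (ours, with C = 50 channels and α = 1 as assumed in the text) reproduces the quoted scaling parameter:

```python
def zipf_probs(C=50, alpha=1.0):
    """Return (Omega, [p(C1)..p(CC)]) for p(Ci) = Omega / i**alpha,
    with Omega chosen so the probabilities over all C channels sum to one."""
    omega = 1.0 / sum(1.0 / j ** alpha for j in range(1, C + 1))
    return omega, [omega / i ** alpha for i in range(1, C + 1)]

omega, p = zipf_probs()
print(round(omega, 5))   # ~0.22226, agreeing with Omega = 0.2222622 quoted above
```

For α = 1 the normalizer is simply the reciprocal of the 50th harmonic number.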
Fig. 9. Required slots when multicasting the N most popular channels and unicasting the rest of the requested channels (N = 1, …, 50) [23]
The flowchart of the simulation is depicted in Fig. 10. The average zapping delay with and without prediction is calculated for each user. The results are shown in Fig. 11. The improvement in average zapping delay lies between 33% and 28% when multicasting the one to ten most popular channels, as shown in Fig. 11. Considering the minimum overhead and slot requirement (which means that, here, multicasting the
top 4 channels is the optimal decision), an improvement of about 30% in average channel zapping delay is obtained. We repeated the simulation for different user behaviors (different traces) and obtained similar results. The total delay reduction is a combination of the following two factors (Fig. 11): (a) the reduction obtained from multicasting the N most popular channels (efficiency of multicasting) and (b) the reduction obtained from the prejoining mechanism (efficiency of prejoining). As can be seen, the efficiency of multicasting increases with the number of multicast channels, which clearly makes sense. For the efficiency of prejoining, however, a gradual decrease can be observed. To calculate the percentage of successful prejoins, we analyzed the switching behavior of a typical user and the channels predicted by Prejoin1 and Prejoin2. The results are as follows:
⋅ Total number of switching events (in a 10-hour trace): 341
⋅ Number of prejoined channels (both Prejoin1 and Prejoin2): 126
⋅ Number of successful prejoins: 58
Therefore, the percentage of successful predictions is about 46%. We repeated the analysis for some other user behavior patterns and the value was almost the same.
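The quoted success rate follows directly from these counts; a quick arithmetic check:

```python
prejoined_channels = 126    # channels prejoined by Prejoin1 and Prejoin2
successful_prejoins = 58    # requested channel was among the prejoined ones
print(round(100 * successful_prejoins / prejoined_channels))  # 46 (percent)
```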
Fig. 10. Simulation flowchart
Fig. 11. Average zapping delay with and without prejoining (95% confidence intervals)
5
Conclusion
One of the most important challenges in IPTV systems is reducing the channel change time, or zapping delay. In this paper, based on our previous work on modeling the switching behavior of IPTV users and the slot requirements of different multicast/unicast combinations, we proposed a prediction-based prejoin mechanism to shorten channel zapping delay during surfing periods. Taking advantage of the Zipf-like distribution of the watching probability of TV channels and of the multicast support in WiMAX networks, two mechanisms were successfully used to reduce channel zapping delay: combined multicast/unicast transmission of TV channels and prediction-based prejoining. Simulation results confirm that, under the constraint of minimum slot requirement, quite significant improvements of about 30% in zapping time can be obtained. A combination of the other methods described in Section 1 with the method proposed in this paper can be used to shorten the zapping delay even further. For example, WiMAX can take advantage of scalable video coding, in which the base layer is transmitted with the most robust MCS and enhancement layers are transmitted with less robust MCSs (but with higher data rates). In surfing mode, only the base layer would be prejoined, and after dwelling on a specific channel for, e.g., 30 seconds, full quality can be achieved. A dynamic scenario in which the arrival and departure rates of subscribers are taken into account could be investigated as future work. Furthermore, measurements obtained from a real IPTV system could be used instead of the output of our TV-UBA model. Also, it would be of interest to implement our prejoin method in an existing IPTV system in order to quantify the resources which have to be spent (e.g., network bandwidth and STB CPU capacity) to achieve the performance improvements. Note that although we have investigated WiMAX-based
access networks in this article, the method can be applied in a straightforward manner to other OFDMA-based wireless environments which support multicasting (e.g., 3GPP LTE).
References

1. Cisco White Paper: Delivering Video Quality in Your IPTV Deployment
2. Lin, J., Lei, W., Bai, S., Li, L.: The Implementation of Fast Channel Switching in IPTV. In: Second International Conference on Intelligent Computation Technology and Automation, ICICTA, Hunan (2009)
3. Banodkar, D., Ramakrishnan, K., Kalyanaraman, S., Gerber, A., Spatscheck, O.: Multicast Instant Channel Change in IPTV Systems. In: Int. Conf. Communication Systems Software and Middleware (COMSWARE), Bangalore (2008)
4. Sasaki, C., Tagami, A., Hasegawa, T., Ano, S.: Rapid Channel Zapping for IPTV Broadcasting with Additional Multicast Stream. In: Proc. IEEE International Conference on Communications (ICC 2008), Beijing (2008)
5. Joo, H., Song, H., Lee, D.B., Lee, I.: An Effective IPTV Channel Control Algorithm Considering Channel Zapping Time and Network Utilization. IEEE Trans. Broadcast. 54(2) (2008)
6. Uzunalioglu, H.: Channel Change Delay in IPTV Systems. In: IEEE Consum. Comm. Net. Conf., Las Vegas (2009)
7. Oh, U., Lim, S., Bahn, H.: Channel Reordering and Prefetching Schemes for Efficient IPTV Channel Navigation. IEEE Trans. Consumer Electronics 56(2) (2010)
8. Lee, J., Lee, G., Seok, S., Chung, B.: Advanced Scheme to Reduce IPTV Channel Zapping Time. In: Ata, S., Hong, C.S. (eds.) APNOMS 2007. LNCS, vol. 4773, pp. 235–243. Springer, Heidelberg (2007)
9. Ahmad, M.Z., Qadir, J., Rehman, N.U., Baig, A., Majeed, H.: Prediction-based Channel Zapping Latency Reduction Techniques for IPTV Systems - a Survey. In: Proc. IEEE International Conference on Emerging Technologies (ICET 2009), Islamabad (2009)
10. Lee, C.Y., Hong, C.K., Lee, K.Y.: Reducing Channel Zapping Time in IPTV Based on User’s Channel Selection Behaviors. IEEE Transactions on Broadcasting 56(3) (2010)
11. Kim, Y., Park, J.K., Choi, H.J., Lee, S., Park, H., Kim, J., Lee, Z., Ko, K.: Reducing IPTV Channel Zapping Time Based on Viewer’s Surfing Behavior and Preference.
In: IEEE International Symposium on Broadband Multimedia Systems and Broadcasting, Las Vegas (2008)
12. Lee, Y., Lee, J., Kim, I., Shin, H.: Reducing IPTV Channel Switching Time Using H.264 Scalable Video Coding. IEEE Transactions on Consumer Electronics 54 (2008)
13. Sarni, M., Hilt, B., Lorenz, P.: A Novel Channel Switching Scenario in Multicast IPTV Networks. In: The Fifth International Conference on Networking and Services (ICNS 2009), Valencia (2009)
14. Lee, E., Park, S., Lee, J., Lau, P.Y.: An Extended IGMP Protocol for Mobile IPTV Services in Mobile WiMAX. In: Cai, Y., Magedanz, T., Li, M., Xia, J., Giannelli, C. (eds.) Mobilware 2010. LNICST, vol. 48, pp. 413–426. Springer, Heidelberg (2010)
15. Siebert, P., Van Caenegem, T.N.M., Wagner, M.: Analysis and Improvements of Zapping Times in IPTV Systems. IEEE Transactions on Broadcasting 55(2) (2009)
16. Lee, D.-B., Kim, W., Song, H.: An Effective Mobile IPTV Channel Control Algorithm over WiMAX Network. In: IEEE CCNC, Las Vegas (2010)
17. Lin, K., Sun, W.: Switch Delay Analysis of a Multi-channel Delivery Method for IPTV. In: 4th IEEE International Conference on Circuits and Systems for Communications, Shanghai (2008)
18. Gopal, T.: WiMAX MBS Power Management, Channel Receiving and Switching Delay Analysis. In: Vehicular Technology Conference, Barcelona (2009)
19. Abdollahpouri, A., Wolfinger, B.E., Lai, J., Vinti, C.: Elaboration and Formal Description of IPTV User Models and Their Application to IPTV System Analysis. In: MMBnet 2011, Hamburg (2011)
20. Cong, J., Wolfinger, B.E.: A Unified Load Generator Based on Formal Load Specification and Load Transformation. In: First Intern. IEEE Conf. on Performance Evaluation Methodologies and Tools, ValueTools 2006, Pisa (2006)
21. Kolesnikov, A., Heckmüller, S.: LoadSpec - ein E-Learning Werkzeug zur Lastspezifikation im Bereich der Telematik. In: E-Learning Baltics, eLBa 2008, Rostock (2008)
22. Cha, M., Rodriguez, P., Crowcroft, J., Moon, S., Amatriain, X.: Watching Television over an IP Network. In: Proc. ACM IMC, Vouliagmeni (2008)
23. Abdollahpouri, A., Wolfinger, B.E.: Overhead Analysis in WiMAX-based IPTV Systems. In: International Congress on Ultra Modern Telecommunications and Control Systems, ICUMT 2011, Budapest (2011)
Performance Evaluation of 10GE NICs with SR-IOV Support: I/O Virtualization and Network Stack Optimizations

Shu Huang and Ilia Baldine

Renaissance Computing Institute, Chapel Hill, NC, USA
{shuang,ibaldin}@renci.org
Abstract. SR-IOV has been proposed to improve the performance and scalability of I/O in virtual machines, and some 10GE NICs supporting this functionality have already appeared on the market. In addition to SR-IOV support, these NICs all provide optimizations for various network layers within the OS kernel. In this paper we try to present a comprehensive view of the performance gains offered by SR-IOV. This study is conducted by evaluating the performance of 10GE NICs with SR-IOV support at different layers in various virtualized environments.

Keywords: SR-IOV, performance, NIC.
1
Introduction
Virtualization is an enabling technology for cloud computing. A virtualized compute system allocates physical resources such as processor cores, memory, storage and I/O capacity such that each set of resources can operate independently, using separate system images. Among all the physical resources, I/O units are critical components, and increased I/O utilization by virtual machines places significant strain on the physical I/O capacity. Moreover, the performance overhead associated with virtualization, a major obstacle to the wider deployment of virtual systems, has gained a lot of interest in both research and practice. In this study we evaluate the performance of some existing Network Interface Cards (NICs) that support Single Root I/O Virtualization (SR-IOV), a hardware I/O virtualization approach proposed by the PCI-SIG [2].
2
Background

2.1 I/O Virtualization
There are two types of virtualization: full virtualization and para-virtualization. In a fully virtualized system, the guest OS is not aware that it is being virtualized and does not require any modifications. In contrast, para-virtualization requires modifications to the guest OS. For example, for I/O drivers, the guest OS requires a back-end driver that works with the front-end driver in the J.B. Schmitt (Ed.): MMB & DFT 2012, LNCS 7201, pp. 197–205, 2012. © Springer-Verlag Berlin Heidelberg 2012
198
S. Huang and I. Baldine
host to provide I/O service to applications running in the guest OS. With the introduction of hardware virtualization technologies, the line between full and para-virtualization has become somewhat blurred. A notable improvement is introduced by I/O MMU virtualization, which allows PCI devices to be passed through to the guest OS. SR-IOV goes a step further than PCI ‘passthrough’ by defining extensions to PCI Express (PCIe) that enable multiple virtual machines to share PCI hardware resources. Specifically, a fully functional PCI Physical Function (PF) can support multiple light-weight Virtual Functions (VFs). A VF is associated with a PF and shares physical resources with the other VFs associated with that PF. Based on the IOMMU technology, VFs can be assigned to Virtual Machines (VMs) directly. However, although OS modifications are not required, the guest OS does require specific drivers for the VFs, and some products (such as the Intel X520-SR2) require the PF within the host to be turned on at all times. Another type of virtualization is referred to as container-based; examples are Linux VServer [3], OpenVZ [6] and network namespaces [1]. Container-based virtualization provides sharing of physical resources through name-space isolation for security purposes and limited performance isolation between instances. Container-based virtualization has been widely deployed in PlanetLab and other testbeds.

2.2 Optimizations
Diagnosing the network throughput observed by an application, compared to the stated bandwidth of the NIC and intermediate links, remains a challenge. One reason is that the bandwidth may be limited by end-host resources other than the NIC itself, such as memory bandwidth and CPU speed. Various optimizations that aim at improving the processing speed to keep up with the NIC’s capacity have been proposed and implemented to improve network performance and reduce CPU usage. For example, TCP-Segmentation-Offload (TSO) offloads the task of segmenting large TCP packets (to meet the MTU requirement of the medium) to the NIC. Generic-Segmentation-Offload (GSO) [9], a software version of TSO, generalizes it to support other transport protocols. Large-Receive-Offload (LRO) reduces the per-packet processing overhead by aggregating smaller packets into larger ones before passing them up the network stack. Generic-Receive-Offload (GRO) provides a generalized software version of LRO, just as GSO does for TSO. These optimizations have been extensively evaluated on physical machines. In this study we test these optimizations together with virtualization technologies supporting SR-IOV.

2.3
Prior Work
In [8], the authors propose a container-based virtualization solution called Trellis and evaluate its forwarding performance. Since the introduction of SR-IOV, some work on evaluating its performance has appeared in the literature. In [7],
Performance Evaluation of 10GE NICs with SR-IOV Support
199
the author provides a detailed analysis of the performance of SR-IOV, including not only network performance metrics such as delay and bandwidth but also system performance metrics such as CPU usage, memory access, VM exits, interrupts, etc. However, when we conducted similar experiments, we found the results obtained from TCP/UDP performance tools such as Iperf to be inconsistent, and the various optimizations discussed in Section 2.2, which are enabled in Linux by default, complicated the results even further. Therefore, in our study, we chose to use Pktgen (detailed in Section 3.1) to avoid the interference of upper-layer protocol processing, and when TCP performance is studied, we focus on how the various optimizations affect SR-IOV performance in terms of bandwidth efficiency.
3
Experimental Setup
The testbed we use to evaluate the performance consists of two DELL R710 servers with 12 2.66 GHz Xeon physical cores (6 cores/socket × 2 sockets) and 24 GB of DDR3 memory each. Each server has three NICs: one is connected to the control network; the other two, a 10G Chelsio T4-CR and a 10G Intel X520-SR2, are connected back-to-back for testing. The Intel NIC is used for comparison and verification purposes; unless explicitly pointed out, the Chelsio card is used to conduct the tests. For the Chelsio T4, we installed the latest driver available (1.0.2.26). For the Intel NIC, we used the default drivers provided by CentOS 6 (ixgbe and ixgbevf). Unless explicitly specified, the default driver parameters are used. For the Chelsio NIC, this means each physical function and virtual function is equipped with 16 and 2 queue sets, respectively. Within each queue set, NAPI and LRO are enabled, and the TX and RX ring sizes are set to 1024 and 64, respectively. Note that the kernel parameter ‘intel_iommu=on’ is passed only when virtualization is needed; our experiments show that the IOMMU, while improving security, does decrease the network throughput. The virtual machines under test have 4 virtual CPUs, 5 GB memory and 10 GB disk space and run CentOS 6.0.

3.1
Tools
In this study we evaluate the performance in terms of bandwidth and CPU usage. For bandwidth measurements, we choose Pktgen and Iperf. Pktgen is a kernel module that measures the kernel packet forwarding speed. Pktgen avoids the overhead caused by upper-layer protocol processing in the network stack by invoking the driver’s transmit function directly. Creating a packet is resource-consuming because it may involve freeing and creating new data structures (i.e., sk_buff) in kernel space. To avoid this overhead, Pktgen provides a threshold parameter (clone_skb) so that a packet is cloned and sent repeatedly; a new packet is created only when the number of packets sent exceeds the threshold. Interrupt affinity also plays an important role in the performance and has a significant impact on the repeatability of the measurements. Pktgen creates one thread per processor, i.e., thread kpktgend_X for processor X. If this thread is used to conduct the test, then to improve the outbound throughput it is
important that the processor is not interrupted by other events. Several measures are taken:
1) Do not set the ‘dst’ and ‘dst_mac’ parameters in the Pktgen script. Pktgen sends UDP packets to port 9, which are expected to be discarded silently. However, machines that do not support the Discard service may return ‘port unreachable’ ICMP packets to the source. Our experiments show that if the processor needs to handle these packets, the sender speed slows down significantly. We also bind the interrupt associated with the NIC’s receiving queue to a different processor.
2) Increase the packet count. Pktgen sends the number of packets specified by count and then becomes idle until the skb is freed (skb->users hits 0). The idle time is not a function of the packet count, so the larger the packet count, the smaller the impact of the idle time.
3) Use the default ring size (1024). According to our experiments, a smaller ring size results in a slower forwarding speed. Specifically, when two SR-IOV VFs are transmitting simultaneously, the one with the larger ring size gets more opportunities to transmit.
4) Disable unnecessary services such as irqbalance (to avoid automatic interrupt distribution) and iptables (to avoid Netfilter examinations).
To evaluate the performance of the entire network stack, we use Iperf. By default, the Iperf client creates one thread for reporting and one thread for sending packets. Running the Iperf client on a computer with more than one processor can significantly improve the forwarding speed. Iperf also supports running multiple client threads in parallel. Experiments conducted on a multi-core server show that the throughput increases as the number of threads increases, until the NIC’s hardware transmit limit is reached. In this study, since the bandwidth limit is explored by Pktgen (Iperf is primarily used to study CPU efficiency), unless explicitly pointed out, we simply use a single client thread.
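For illustration, the precautions above can be bundled into a small helper that emits the commands one would write to a pktgen device file under /proc/net/pktgen; the parameter names are pktgen's documented ones, but the helper itself is our sketch, not the authors' script:

```python
def pktgen_device_commands(count=10_000_000, pkt_size=60, clone_skb=1_000):
    """Commands for a pktgen device file, following the measures above:
    a large 'count' (measure 2) to amortize the idle time, 'clone_skb' to
    avoid per-packet sk_buff allocation, and crucially no 'dst'/'dst_mac'
    lines (measure 1), so no ICMP port-unreachable replies can interrupt
    the sending processor."""
    return [
        f"count {count}",
        f"clone_skb {clone_skb}",
        f"pkt_size {pkt_size}",
    ]

for cmd in pktgen_device_commands():
    print(cmd)
```

Ring sizes (measure 3) are set outside pktgen, via the driver, and measure 4 amounts to stopping the irqbalance and iptables services before the run.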
We also use Perf, a performance analysis tool included with recent Linux kernels, to analyze the software performance.

3.2
Data Analysis
For delay tests, the measurements are usually consistent (with very small confidence intervals). However, for throughput and CPU usage tests, the measurements may vary significantly from run to run, especially after rebooting the machine. This is because the measurements are affected by other factors, such as background processes, that cannot easily be factored out. For these measurements, however, we focus on the comparison of test results, and we try our best to keep the environment the same for the test runs that are to be compared. For example, although the throughput measurements obtained may have a large confidence interval because they may change significantly after the host machine is rebooted, we can still safely conclude from our measurements that an SR-IOV enabled Ethernet interface has a larger throughput than a software-emulated one.
4
Performance Evaluation

4.1 End-to-End Traffic Delay
Fig. 1 shows the average Round Trip Time (RTT) between the machines. One end is always a Physical Function; on the other end, we employ VMs using different I/O virtualization techniques. The result shows clearly that the delay between two PFs is the smallest of all. OpenVZ employs a virtual Ethernet device (veth) for layer-2 virtual networking within a container. The OpenVZ veth within the guest namespace is bridged to the physical NIC in the host (resource isolation is done via separate network stacks), and it provides better delay performance than the other, more heavy-weight virtual machines. The SR-IOV enabled KVM VM takes a longer time to receive the ping ICMP packets due to the cost of IRQ virtualization. The KVM VM using bridged networking produces the longest delay because software emulation and network bridging are both expensive. The numbers plotted are averages over 50 runs. We omit the confidence intervals because they are all very small (< 0.01 ms) at the 95% confidence level.

Fig. 1. Round trip time comparison

4.2
Forwarding Throughput
Fig. 2 shows a comparison of forwarding and receiving performance using Iperf. The results show that VMs with SR-IOV support have a much better forwarding performance than the KVM VM with a bridged network interface. OpenVZ uses network namespaces to provide isolation of the network stack for users (we were unable to use Pktgen to test the forwarding performance because container users have no access to kernel modules). It turns out that SR-IOV outperforms OpenVZ. Although we observed expensive VM EXIT events (primarily for IRQ processing) in the SR-IOV tests, OpenVZ experiences a bigger per-packet cost; we verified this explanation by sending a big file using different MTUs and found that OpenVZ’s performance deteriorates much more quickly as the MTU decreases. Lastly, the physical NIC without virtualization has the best performance.

Fig. 2. Throughput comparison
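The per-packet-cost explanation can be illustrated with a toy model (our sketch with illustrative numbers, not the measured data): throughput in bits/s equals pps × packet size × 8, so a fixed per-packet processing ceiling hurts small MTUs disproportionately while large packets run into the line rate instead.

```python
LINE_RATE_BPS = 10e9        # 10GE link capacity
CPU_PPS_LIMIT = 1.5e6       # assumed per-packet processing ceiling

def forwarding_rates(pkt_size_bytes):
    """(pps, bps) for a given packet size: small packets are capped by the
    per-packet CPU limit, large packets by the NIC's line rate."""
    pps = min(CPU_PPS_LIMIT, LINE_RATE_BPS / (pkt_size_bytes * 8))
    return pps, pps * pkt_size_bytes * 8

for size in (64, 512, 1500):
    print(size, forwarding_rates(size))
```

Under these assumed numbers, 64-byte packets reach only a fraction of the line rate even at maximum pps, matching the qualitative MTU sensitivity reported above.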
4.3 Throughput vs. Packet Size
Throughput is usually a function of the packet size. Fig. 3 shows that as the packet size increases, the forwarding throughput measured by Pktgen in bits per second also increases. However, the number of packets per second (pps) decreases. This is expected because pps emphasizes the per-packet processing capability, which is constrained by the CPU processing capability when the packet size is small. When the packet size gets larger, the forwarding performance reaches the NIC’s capacity. It is worth noting that when the packet size increases from 100 to 128, the throughput in pps slightly increases as well. The reason is an internal optimization of the driver: when the packet size is less than 128, the packet is directly inlined (copied) into the TX descriptor instead of being referenced in kernel buffers.

Fig. 3. Forwarding throughput vs. packet size

4.4
VM Performance Isolation
Fig. 4 shows that the output throughput increases as the number of VMs increases. In the previous tests, we used the default ring parameter (i.e., a TX ring size of 1024); in Fig. 5 we set the ring sizes for VM1, VM2 and VM3 to 64, 512 and 4096, respectively. The result shows that the throughput is an increasing (although not linear) function of the ring size. This suggests that VM users can adjust their driver parameters (if they are given root privileges) to tune their network performance and, accordingly, affect the network performance of other VM users. Thus SR-IOV may not provide sufficient performance isolation between VMs. We performed additional tests by changing the MTUs of the 3 VMs to 150, 900 and 4800, respectively, and observed that each VM still receives the same share of the entire 10G bandwidth (shown in the last column of Fig. 5). This suggests that although the user of a VM can tune the MTU to suit their purposes (e.g., to lower the delay for VoIP), it seems to pose no harm to other VMs.

4.5
TCP-Segmentation-Offload and Generic-Segmentation-Offload
To understand how network stack optimizations affect VMs’ network performance, we first evaluated the host’s TCP performance and CPU usage, using Iperf and Perf respectively. We also calculate the efficiency (bandwidth/CPU usage), which describes how many bits can be sent per unit of CPU. TSO and GSO offload the TCP segmentation function to hardware (for GSO, as deeply into the kernel
Fig. 4. Throughput vs. number of VMs
Fig. 5. Throughput vs. Packet size
as possible) so that the per-packet processing overhead can be reduced. Since per-packet processing is expensive, we expect to see lower CPU usage with TSO and GSO enabled. This is verified by Fig. 6. However, it is worth pointing out
Fig. 6. Network forwarding performance optimizations for both PM (Physical machine) and VM
that TSO and GSO do not necessarily increase the throughput, as shown in the figure. Instead, we observe that TSO and GSO may result in a lower throughput (though with even lower CPU usage and better efficiency). On a physical machine, our results show clearly that TSO can significantly reduce the CPU usage: once TSO is disabled, the network performance decreases and the CPU usage increases dramatically. Although GSO can also improve the network performance and reduce the CPU load, our experiments show clearly that TSO plays a more important role than GSO in improving the bandwidth efficiency. However, this may not hold in virtual machines. The network performance in virtual machines is harder to measure than in physical machines because of various issues such as timekeeping [4] and the lack of proper profiling tools [5]. As a result,
our experimental results using Iperf are less repeatable. Nevertheless, the results indicate that GSO becomes more effective in terms of improving the transmission efficiency. The exact reasons remain to be investigated.

4.6
Large-Receive-Offload and Generic-Receive-Offload
For network receive performance, we evaluated how LRO and GRO affect the throughput and CPU usage. Note that the TCP throughput is usually limited by the sender’s and receiver’s buffer sizes rather than reflecting the real physical bandwidth. However, in this test we focus on evaluating the impact of LRO and GRO. Therefore, instead of optimizing TCP in Linux to reach the maximum throughput, we just make sure all settings other than the ones under study (LRO and GRO) remain unchanged between tests. We also found that the Chelsio NIC’s LRO option is controlled via the ethtool gro option: LRO is on if and only if GRO is on. Because LRO and GRO cannot be evaluated separately using the Chelsio NIC, we chose the Intel NIC, which provides an ethtool interface to turn on LRO and GRO separately. In Fig. 7, our experimental results should present no surprise: both GRO and LRO can effectively improve the bandwidth efficiency, and the hardware solution outperforms the more generalized version implemented in software. However, the results also show that GRO and LRO can reduce the throughput because of the delay caused by buffering small packets. We repeat the same experiment for SR-IOV enabled virtual machines. Again, we do not aim at comparing the results with those achieved in the previous experiment because the TCP settings in physical machines differ from those in virtual machines. It turns out that LRO is not supported by the virtual function driver (ixgbevf), so we can only evaluate the impact of GRO. In Fig. 8, the results show similar trends as in Fig. 7.

Fig. 7. Network receiving performance optimizations in physical machine

Fig. 8. Network receiving performance optimization in virtual machines
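The trade-off that LRO/GRO-style coalescing makes can be illustrated with a toy aggregator (our sketch, not the driver's implementation): consecutive same-flow packets are merged so the network stack is invoked fewer times per byte, at the cost of holding packets back briefly.

```python
def aggregate(packets, max_merged_size=65536):
    """Toy LRO/GRO-style coalescing: merge consecutive packets of the same
    flow into larger ones, so each merged packet costs one stack traversal
    instead of many. packets is a list of (flow_id, size_in_bytes)."""
    merged = []
    for flow, size in packets:
        if merged and merged[-1][0] == flow and merged[-1][1] + size <= max_merged_size:
            # coalesce into the previous packet (this buffering adds latency)
            merged[-1] = (flow, merged[-1][1] + size)
        else:
            merged.append((flow, size))
    return merged

# 40 small same-flow TCP segments collapse into a single stack pass
pkts = [("flow-a", 1460)] * 40
print(len(aggregate(pkts)))  # 1
```

The 64 KB cap mirrors the usual maximum size of a coalesced packet; the observed throughput dips correspond to the latency this buffering introduces.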
5
Conclusion
In this study we evaluated SR-IOV at different layers. First, the network performance in terms of delay and throughput was tested and compared with other virtualization technologies. Interestingly, we observed a longer round-trip delay in SR-IOV VMs than in light-weight VMs (containers). Using the Perf tool, we found that this is primarily due to the cost of KVM vm entry and vm exit events. Then we studied the impact of various optimization techniques such as TSO/GSO and LRO/GRO on SR-IOV. Our results generally show that these optimizations can significantly reduce the CPU usage. However, they may also reduce the throughput because of the buffering delay.
References

1. Network namespaces, http://lwn.net/Articles/219794/
2. Single Root I/O Virtualization and Sharing 1.1 specification, http://www.pcisig.com/specifications/iov/
3. Soltesz, S., Potzl, H., Fiuczynski, M., Bavier, A., Peterson, L.: Container-based operating system virtualization: a scalable, high-performance alternative to hypervisors. In: Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems, EuroSys (2007)
4. Timekeeping in VMware Virtual Machines, http://www.vmware.com/files/pdf/Timekeeping-In-VirtualMachines.pdf
5. Du, J., Sehrawat, N., Zwaenepoel, W.: Performance profiling of virtual machines. In: Proceedings of the 7th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, VEE (2011)
6. Kolyshkin, K.: Virtualization in Linux (2006), http://download.openvz.org/doc/openvz-intro.pdf
7. Liu, J.: Evaluating standard-based self-virtualizing devices: A performance study on 10 GbE NICs with SR-IOV support. In: IEEE International Symposium on Parallel & Distributed Processing, IPDPS (2010)
8. Bhatia, S., Motiwala, M., Muhlbauer, W., Mundada, Y., Valancius, V., Bavier, A., Feamster, N., Peterson, L., Rexford, J.: Trellis: a platform for building flexible, fast virtual networks on commodity hardware. In: Proceedings of the ACM CoNEXT Conference (2008)
9. Xu, H.: GSO: Generic Segmentation Offload (2006), http://lwn.net/Articles/188489/
Business Driven BCM SLA Translation for Service Oriented Systems

Ulrich Winkler1, Wasif Gilani1, and Alan Marshall2

1 SAP Research, SAP AG, The Concourse, Belfast, United Kingdom {ulrich.winkler,wasif.gilani}@sap.com
2 Queen's University Belfast, Belfast, United Kingdom [email protected]
Abstract. A Business Continuity Management (BCM) Impact Analysis derives business-level BCM SLAs which need to be translated to IT-level, infrastructure-level and facility-level services. However, the translation of SLAs across a service oriented system is not an easy task. In this paper we present a new Petri-Net based approach to define and to translate BCM SLAs for service oriented systems. As a result of our approach we are able to introduce a BCM SLA classification schema. We describe our approach in the context of a use-case.
1
Introduction
Business process disruption can lead to financial and legal losses as well as damage to reputation [12]. Business Continuity Management (BCM) aims (1) to identify critical business processes, systems and services, (2) to identify potential threats to services, systems and critical business processes, and (3) to assess and evaluate potential damages or losses that may be caused by a threat to critical business processes. Business Continuity Experts refer to these three activities as Business Impact Analysis (BIA). One outcome of a BIA is a set of specific time-frames, such as the Maximum Tolerable Outage Time (MTO), in which a normal level of services and operations has to be restored, such that the organisation can continue to deliver products and services. In a Service Oriented Architecture services often consume other services and therefore depend on each other. Non-functional properties of services, such as MTO, are encoded and negotiated between service provider and service customer via SLAs. These SLAs need to be translated to depending services as well. The relationship among services can be described using a service dependency graph [7,10]. Starting from the top-level services, which are directly consumed by business functions, the Business Continuity Expert needs to translate
The research leading to these results is partially supported by the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 216556.
J.B. Schmitt (Ed.): MMB & DFT 2012, LNCS 7201, pp. 206–220, 2012. c Springer-Verlag Berlin Heidelberg 2012
the MTO objective down to lower-level services in the dependency graph. To translate SLAs along a service dependency graph is not an easy task [8]. In this paper we extend our previous work [15,13,14] and present our systematic, Petri-Net based approach to derive BCM metrics from a Business Impact Analysis and to translate these BCM metrics along a service dependency graph to IT- and facility-level SLAs. We describe our approach in the context of a use-case which is introduced in Section 2. A Business Impact Analysis is carried out in three phases and our approach supports the Business Continuity Expert in all these phases:

– First, the business continuity expert has to understand the business, the business processes and the impact of business disruptions. He has to take into account financial indicators and non-financial indicators, such as legal requirements or external reputation. Moreover, large and global organisations deploy hundreds of cross-functional business processes in a number of regional variants. We explain in Section 3 how our approach helps the business continuity expert to conduct a sound business-level impact analysis.
– Second, the expert has to determine various Business Continuity Metrics for every business process and business function. In Section 4 we elaborate how the business continuity expert uses our methodology to determine the MTO of business functions and first level services. A first level service is a service which is directly consumed by a business function.
– Third, the service dependency graph is used to determine and translate service level requirements, as explained in Section 5. We introduce a novel classification schema of BCM SLAs which helps the expert to classify various SLA offers in Section 5.3. We discuss how our methodology supports the expert in selecting satisfactory SLA offers.

We discuss related work in Section 6 and conclude this paper with an outlook on future work in Section 7.
2
Use Case
Figure 1 depicts three business processes and a service landscape we will use throughout this paper as an accompanying example. Circles denote processes and rectangles represent services. The processes are: Accounts Collectable A, Treasury B and Procurement C. The outcome of process Accounts Collectable A is input to the Treasury process B, whereas process Procurement C has no relationship with A or B. The service landscape in our example comprises five services: Enterprise Resource Planning (ERP) S1, a primary database service S3, a hot-standby secondary database service S2, an Uninterruptible Power Supply S4 and electricity S5. S1 is a top-level service. A top-level service is a service which is directly consumed by business process activities. In our example processes A and C directly consume the ERP service S1.
The ERP service S1 depends on a highly available database service. The primary database service S3 is backed up by a secondary, externally hosted database service S2. The secondary database service would be used if the primary database service is not available. S1 is considered unavailable if both database services, S2 and S3, are not available. S1 is available if at least one of the database services S2 or S3 is available. The primary database server itself depends on electricity, which is provided by an Uninterruptible Power Supply (UPS) S4. The UPS acts as a buffer between the actual electricity provider and the database server and becomes unavailable if electricity S5 is disrupted for a certain amount of time.
Fig. 1. Example business processes and service dependency graph
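The availability semantics just described can be sketched as a few boolean predicates. This is our own simplification: recovery times are ignored, and the 15-hour UPS buffer is an assumed value (Section 5 models it as a three-point estimate).

```python
# Boolean sketch of the availability semantics of the example landscape.
# Our own simplification: recovery times are ignored and the UPS buffer
# is fixed at 15 hours (an assumed value for illustration).

def s1_available(s2_up: bool, s3_up: bool) -> bool:
    """ERP service S1 needs at least one of the database services S2, S3."""
    return s2_up or s3_up

def s3_available(s4_up: bool) -> bool:
    """The primary database S3 needs power from the UPS S4."""
    return s4_up

def s4_available(s5_up: bool, outage_h: float, buffer_h: float = 15.0) -> bool:
    """The UPS S4 bridges an electricity (S5) outage up to its buffer time."""
    return s5_up or outage_h <= buffer_h

# A 10 h power outage is bridged by the UPS, so S3 and S1 stay available:
print(s1_available(s2_up=False, s3_up=s3_available(s4_available(False, 10.0))))  # True
```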
3
Business Impact
The business continuity expert (expert for short) has to quantify the business impact if a business process is disrupted. The expert has to consider various dimensions, for example legal consequences, financial impact or external damages. Although some of these values are easily quantifiable, some of them can only be expressed in a qualitative form. Financial consequences are, for example, an easily quantifiable dimension. Legal consequences or the damage to external reputation are not so easy to express as numeric values and depend on human judgement. To quantify these values we provide the expert with a business impact table, which is similar to a risk matrix. Such an impact table is depicted in Figure 2. The impact table comprises a set of quantitative or qualitative dimensions and
a time dimension. In our example we use three dimensions: the financial impact (quantitative), legal (qualitative) and external (qualitative) consequences. The business impact table is configurable: the expert can add or remove quantitative/qualitative dimensions according to business needs.

  Down Time (h)   Financial   Legal    External   Severity
1 0 - 4              0.00     LOW      LOW            2
2 4 - 8           1000.00     LOW      LOW            6
3 8 - inf         1000.00     MEDIUM   HIGH          21
Fig. 2. Business Impact Table for process A
A column in a business impact table is called a dimension. A row is called a time-frame. In the example the first dimension is the time dimension. The second dimension, the financial dimension, is a quantitative dimension. The third and fourth dimensions are both qualitative dimensions. An entry in a qualitative dimension can have five impact values: NONE, LOW, MEDIUM, HIGH or MEGA. A row in such a table describes the business impact over a given time frame. For example, the second row in Figure 2 describes that, if the business function is offline from 4 to 8 hours, the business stands to lose 1'000 USD per hour and we can expect LOW legal consequences and a LOW impact on external reputation. For each row in the business impact table we compute the row severity value. The row severity value provides a numeric representation of the expected business impact for a certain time frame. The row severity value is the sum of all the severity values in a row, excluding the time-frame dimension. To compute the severity value of a quantitative dimension the expert has to map value ranges of that dimension to impact values. Depending on the business value of an activity the expert maps the financial impact of $0-$99.99 to NONE, $100-$999.99 to LOW and so on. The severity value of a qualitative dimension entry is computed using the following rules. If the assumption is made that there are x quantitative and qualitative dimensions in our impact table, then

– NONE = 0,
– LOW = 1,
– MEDIUM = x * LOW + 1,
– HIGH = x * MEDIUM + 1, and
– MEGA = x * HIGH + 1.
This ensures that, for example, a row with a single HIGH entry always has a higher severity value than a row where all entries are MEDIUM. This is an important property of risk matrices, as discussed in [4]. As our impact table is configurable and can hold a variable number of quantitative and qualitative dimensions, it is not acceptable to simply assign constant values to all five impact values.
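The scaling rules above can be sketched as follows. The mapping of financial losses above the LOW band to MEDIUM is our own illustrative extension of the paper's example mapping.

```python
# Severity scale and row severity for a business impact table.
# The financial mapping above the LOW band is an illustrative assumption.

def impact_scale(x: int) -> dict:
    """Impact values for a table with x (non-time) dimensions."""
    low = 1
    medium = x * low + 1
    high = x * medium + 1
    mega = x * high + 1
    return {"NONE": 0, "LOW": low, "MEDIUM": medium, "HIGH": high, "MEGA": mega}

def financial_impact(loss_per_hour: float) -> str:
    """$0-$99.99 -> NONE, $100-$999.99 -> LOW (from the paper), then MEDIUM."""
    if loss_per_hour < 100:
        return "NONE"
    if loss_per_hour < 1000:
        return "LOW"
    return "MEDIUM"

def row_severity(financial: float, qualitative: list, x: int) -> int:
    """Sum of the severity values of all dimensions, excluding the time-frame."""
    scale = impact_scale(x)
    return scale[financial_impact(financial)] + sum(scale[q] for q in qualitative)

# Row 2 of Figure 2: $1'000 per hour (MEDIUM), legal LOW, external LOW:
print(row_severity(1000.00, ["LOW", "LOW"], x=3))  # 4 + 1 + 1 = 6
```

With x = 3 this reproduces the scale used in the running example (NONE = 0, LOW = 1, MEDIUM = 4, HIGH = 13, MEGA = 40).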
In our example x = 3 as we have three dimensions (finance, legal, external). Therefore NONE = 0, LOW = 1, MEDIUM = 4, HIGH = 13, and MEGA = 40. The financial impact of $1'000 is mapped by the expert to MEDIUM. Therefore the severity value of row two in Figure 2 evaluates to MEDIUM + LOW + LOW = 6.

3.1
Business Level to Business Level Translation
Business processes may depend on each other; the output of a process may be the input of another process. Once the expert has developed the impact tables he must identify dependencies between business processes.
Fig. 3. Process to process dependencies
Let us assume that the output of business process P1 is the input for process P2, as depicted in Figure 3 (b). If P1 is disrupted, P2 may be able to proceed for a certain time, e.g. one hour. We call this time interval the offset o. An offset is never negative. The impact table for business process P1 is T1 and the impact table of P2 is T2. We assign T1 and T2 to P1 and P2 with an offset of 0, denoted as (T1, 0) and (T2, 0). In the second step we mark the input arc between P1 and P2 with an offset of 1. Now, since P2 depends on P1, we assign the impact table of P2 to P1 with an offset of 1, denoted as the tuple (T2, 1). P1 now has two impact tables in its table set {(T1, 0), (T2, 1)}. Transitive input relationships are handled accordingly. We assign all impact tables of an input arc to the business process and add the additional offset to the existing offset. In Figure 3(c) we assign the P3 impact table to P2's table set with an offset of 3 and to P1's impact table set with an offset of 4. If we have to assign an impact table with an offset o1 that has already been assigned to the business process' table set with an offset o2, we choose the minimum offset min(o1, o2). This solves loops and conflict situations as depicted in
Figure 3(c) and (d). As you can see in Figure 3(e), we add either (T3, 7) or (T3, 2) to the impact table set of P1. We choose (T3, 2). In a loop situation we would assign the impact table set {(T1, 1), (T2, 0)} of P2 to P1 with an offset of 7. But since P1 already has the tuple (T1, 0) in its table set, the minimum offset 0 is chosen. Continuing our example from Section 2, business process A is the input to process B, as shown in Figure 4. The impact table of A is depicted in Figure 2 and the impact table of B is shown in Figure 5. The offset between A and B is 72 hours (or 3 days). The impact table set of A now contains two impact tables, the impact table of A and the impact table of B with an offset of 72 hours.
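The assignment of impact tables with minimal offsets can be sketched as a fixpoint computation over the dependency arcs. The function and data names below are ours.

```python
# Propagating impact tables backwards along process dependencies with the
# minimum-offset rule; iterated to a fixpoint so chains and loops are handled.

def propagate(own_tables, arcs):
    """own_tables: process -> table id; arcs: (producer, consumer, offset_h).
    Returns process -> {table id: minimal offset}."""
    sets = {p: {t: 0} for p, t in own_tables.items()}
    changed = True
    while changed:
        changed = False
        for producer, consumer, off in arcs:
            for table, o in list(sets[consumer].items()):
                new = o + off
                if new < sets[producer].get(table, float("inf")):
                    sets[producer][table] = new   # keep the minimum offset
                    changed = True
    return sets

# Example from Section 2: A feeds B with a 72 h offset:
tables = propagate({"A": "TA", "B": "TB", "C": "TC"}, [("A", "B", 72)])
print(tables["A"])  # {'TA': 0, 'TB': 72}
```

On the transitive example of Figure 3(c) and (e), the same function assigns T3 to P2 with offset 3 and to P1 with the minimum offset 2 (rather than 4 via P2).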
Fig. 4. Top level service S1
  Down Time (h)   Financial   Legal    External   Severity
  24 - inf           0.00     MEGA     MEGA          80
Fig. 5. Business impact table for process B
3.2
Merging Impact Tables
Once a complete set of impact tables for each business process is available, we merge all the tables into a single impact table, the merged impact table T. The merged impact table contains the rows from all impact tables with the offset added to the time-span. If tables contain rows with identical time-frames, the quantitative entries (e.g. financial entries) are added together and qualitative entries (e.g. legal entries) are replaced with the higher value. If the time-frames of table rows overlap, the rows are split into rows with new time-frames such that some rows do not overlap at all and some rows become rows with identical time-frames. We then apply the rules to merge rows with identical time-frames. In our example we add the single 24 hour row from the impact table of B (Figure 5) with an offset of 72 hours. In a first merging step we add the single row of B with an offset of 96 hours to TA as shown in Figure 6.
    Down Time (h)   Financial   Legal    External   Severity
...      ...           ...      ...      ...         ...
3   8 - inf         1000.00     MEDIUM   HIGH         21
4   96 - inf           0.00     MEGA     MEGA         80
Fig. 6. Step 1: Adding the single row from TB to TA with an offset of 72 hours (row 4)
Rows 3 and 4 are overlapping rows. In step two we split row number 3 into 3′ and 3′′ such that 3′′ and 4 have the same time-frame, as shown in Figure 7.
    Down Time (h)   Financial   Legal    External   Severity
...      ...           ...      ...      ...         ...
3′  8 - 96          1000.00     MEDIUM   HIGH         21
3′′ 96 - inf        1000.00     MEDIUM   HIGH         21
4   96 - inf           0.00     MEGA     MEGA         80
Fig. 7. Step 2: Splitting row 3 into 3′ and 3′′ such that rows 3′′ and 4 have the same time-frame and row 3′ does not overlap with row 3′′ or 4
In a last step we merge row 3′′ and row 4 into a new row 4′. Following the rules of merging rows outlined above, we add the financial values of 3′′ and 4, and the new financial value of 4′ becomes $1'000.00. As row 4 has higher legal and external values than row 3′′, we assign the row 4 values for legal and external to 4′. We remove rows 3′′ and 4 from TA. The final and complete merged impact table for A is shown in Figure 8.
    Down Time (h)   Financial   Legal    External   Severity
1   0 - 4              0.00     LOW      LOW           2
2   4 - 8           1000.00     LOW      LOW           6
3′  8 - 96          1000.00     MEDIUM   HIGH         21
4′  96 - inf        1000.00     MEGA     MEGA         82
Fig. 8. Merged business impact table TA for process A
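The merging procedure can be sketched as follows: time-frames are shifted by their offsets, split at all row boundaries, and rows covering the same frame are combined by adding quantitative values and taking the higher qualitative value. The data representation (tuples of start, end, financial, legal, external) is our own.

```python
# Merging impact-table rows: shift by offset, split at all boundaries,
# combine identical frames (sum quantitative, max qualitative).
INF = float("inf")
ORDER = ["NONE", "LOW", "MEDIUM", "HIGH", "MEGA"]

def merge(tables):
    """tables: list of (offset, rows); a row is (start, end, fin, legal, ext)."""
    shifted = [(s + off, e + off, f, l, x)
               for off, rows in tables for (s, e, f, l, x) in rows]
    # all finite time boundaries, plus infinity as the final cut
    cuts = sorted({t for s, e, *_ in shifted for t in (s, e) if t != INF}) + [INF]
    merged = []
    for lo, hi in zip(cuts, cuts[1:]):
        hits = [r for r in shifted if r[0] <= lo and hi <= r[1]]
        if hits:
            merged.append((lo, hi,
                           sum(r[2] for r in hits),
                           max((r[3] for r in hits), key=ORDER.index),
                           max((r[4] for r in hits), key=ORDER.index)))
    return merged

TA = [(0, 4, 0.0, "LOW", "LOW"), (4, 8, 1000.0, "LOW", "LOW"),
      (8, INF, 1000.0, "MEDIUM", "HIGH")]
TB = [(24, INF, 0.0, "MEGA", "MEGA")]
print(merge([(0, TA), (72, TB)]))  # four rows, matching Figure 8
```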
4
Defining the Maximum Tolerable Outage Times
The Maximum Tolerable Outage time (MTO) for a business process represents the maximum amount of time for which an organisation can afford a business process not to be executed. Defining the MTO for a process provides the deadline by which this process must be executable again after a disruption. Using impact tables the continuity expert is able to define MTOs for processes and top-level services.
Together with the business process owner, the business continuity expert defines the risk appetite. The risk appetite is a numeric value and expresses the risk level an organisation or business is willing to accept. In our example the risk appetite is set to HIGH. In our example table (see Figure 8) we have three dimensions (finance, legal, external), and therefore HIGH is equal to 13, as explained in Section 3. The expert selects all rows in a merged impact table that have a severity value equal to or greater than the risk appetite of 13. Within this set the expert determines the minimum time-frame. This time-frame becomes the MTO. In our example the expert would select the third and the fourth row of the merged impact table (Figure 8) as both entries have a severity value of more than 13. From those two entries the expert selects the first row as this row has the lowest time-frame value (the lowest downtime), which is 8 hours. Therefore the MTOA for business process A would be 8 hours. In the next step we assign impact tables to first level services. Consider the situation depicted in Figure 4. Business processes A and C directly depend on the service S1. Business processes A and C have a set of business impact tables TA and TC assigned to them. Process A also has TB in its table set. In a first translation step we simply assign all unmerged impact tables to the first level service S1 in a similar way as for merged impact tables, but without applying any offsets. The table set of S1 is {(TA, 0), (TB, 72), (TC, 0)}. The impact table TC is shown in Figure 9.
  Down Time (h)   Financial   Legal    External   Severity
  6 - inf            0.00     MEDIUM   HIGH         17
Fig. 9. Business impact table for process C
In the second step we merge the impact table set for service S1. The same merging rules apply. We call the merged impact table TS1. Table TS1 is shown in Figure 10.
  Down Time (h)   Financial   Legal    External   Severity
1 0 - 4              0.00     LOW      LOW           2
2 4 - 6           1000.00     LOW      LOW           6
4 6 - 96          1000.00     MEDIUM   HIGH         21
5 96 - inf        1000.00     MEGA     MEGA         84
Fig. 10. Merged business impact table TS1 for top level service S1
Similarly to business processes, the expert is now able to compute the Maximum Tolerable Outage Time MTOS1 for S1. In our example MTOS1 is 6 hours.
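The MTO selection rule can be sketched compactly: keep the rows whose severity reaches the risk appetite and return the smallest downtime at which such a row starts. The row representation (start, end, severity) is ours.

```python
# MTO from a merged impact table: rows at or above the risk appetite,
# minimum start of their time-frames.

def mto(merged_rows, risk_appetite):
    """merged_rows: list of (start_h, end_h, severity)."""
    critical = [start for start, _end, sev in merged_rows if sev >= risk_appetite]
    return min(critical) if critical else None

# Merged table of top-level service S1 (Figure 10), risk appetite HIGH = 13:
ts1 = [(0, 4, 2), (4, 6, 6), (6, 96, 21), (96, float("inf"), 84)]
print(mto(ts1, 13))  # 6 -> the MTO of S1 is 6 hours
```

Applied to the merged table of process A (Figure 8), the same rule yields the MTO of 8 hours derived above.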
5
SLA Translation
A BCM dependency graph is a service dependency graph with additional features, such as dependencies with recovery times. In our use-case, the ERP service is using more than a single database service. If one database server fails, the administrator not only has to replay the backup, but also needs to make sure that the various databases are synchronised. This is called a logical error in SAP terms. Even if the database server is up and running again, the service may still be unavailable until the administrator resolves the logical errors.
Fig. 11. Simplified BCM dependency graph
Likewise, resources in such a BCM dependency graph may act as buffers. A good example is the Uninterruptible Power Supply (UPS) in our use-case. A UPS acts as a buffer between servers and the electricity provider and is capable of providing power for a certain amount of time when the electricity is disrupted. Figure 11 shows a simplified version of a BCM dependency graph. S1 is the top-level service which we use in our example. S1 depends on either S2 or S3. The dependency between S3 and S4 is a dependency with a recovery time of Δr = e(5, 7, 10). e(a, m, b) denotes a three-point estimate. Three-point estimation is a commonly used technique in management to fit a beta distribution or triangle distribution if little information is available or the estimate is based on human judgement. The dependency link between S4 and S5 is a dependency with a buffer time Δb = e(10, 15, 20). Besides recovery times the expert also annotates a list of risks R to services and the potential failure rate λ each risk imposes on a service. For simplicity all lambda values are expressed as failures per year. The expert also estimates the probability that a service provider is able to recover the service in time. Take electricity for example. Electricity may fail if a fuse blows, which is not a rare
event. It is very likely that the service provider is able to replace the broken fuse within the time constraints. However, if electricity fails due to a heavy flood or an earthquake - which is a rare event - the service provider may not be able to restore electricity within the time constraints. The failure list R5 = {(λ1 = (0.5, 1, 2), p1 = 0.9)} for S5 denotes that the probability that S5 fails rises from very low within half a year, through high within one year, to almost certainty within two years. If that happens, the service provider is able to recover the service within the time constraint in 90% of all cases.

5.1
SLA Offers
Let us assume that we have to select an SLA for service S5. An example set of potential SLAs for service S5 is shown in Figure 12.

  Provider   Time (min)   Time (max)   Price/a   Price incentive   Price per incident
1    P1          0.0          4.0      2000.00         0.00                0.00
2    P1          4.0          6.0      1000.00         0.00                0.00
3    P2          0.0         12.0       500.00       250.00                0.00
4    P3          0.0          4.0       100.00         0.00              500.00
Fig. 12. SLAs for Service S5
The recovery time constraints are provided by the values Time (min) and Time (max). Price per year (Price/a) denotes the cost to subscribe to that SLA per year, regardless of whether the service fails or not. Price incentive is an additional bonus which is paid to the provider if the service does not fail once within that year. Price per incident is the cost the service user has to pay to the provider if a service needs to be recovered. Penalty costs are the costs the service provider pays the service customer every time a service fails. In our analysis model we do not explicitly distinguish between external and internal services, as the continuity expert has to develop Service Level Agreements for both kinds of services, external and internal alike. The major difference between external and internal services is the price tag assigned to SLAs. External services and service recovery operations are likely to be charged, while internal services are internally charged. Therefore it is difficult to quantify the additional costs. Without that price tag any selection algorithm would prefer services which are "free of charge" if costs are to be considered. To compensate, the expert is advised to add "hidden costs", for example for an administrator acting as a service provider for internal services.

5.2
SLA Validation
The main objective of the SLA validation analysis is to ensure that the selected SLA combination is capable of providing sufficient recovery support and that no MTO objective of top-level services or business processes is violated.
Let us assume that the expert selects the SLA that promises a recovery within 4 to 6 hours if service S5 becomes unavailable. In a first step, our SLA validation tool transforms a BCM dependency graph, risks, impact tables and the selected SLA into a Behaviour Analysis Model (BEAM). A BEAM model is a stochastic Petri-Net with time and supports read, inhibitor and reset arcs. BEAM also supports grouping of transitions and places, and we call such a group a pattern. Patterns are used to define a model-to-BEAM transformation. Our transformation tool transforms each source model object of the BCM dependency graph (e.g. a service, a dependency between services, a risk or an impact table) into a BEAM pattern according to the pre-defined transformation patterns. In a second step the SLA validation tool executes a series of Monte-Carlo simulations in order to find any SLA violations. In a last step the SLA validation tool classifies the selected SLA according to a schema we introduce in Section 5.3.
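For the Monte-Carlo runs, the three-point estimates e(a, m, b) have to be turned into random samples. A sketch using Python's built-in triangular distribution follows; choosing the triangular rather than the beta fit is our simplification.

```python
import random

# Sampling a three-point estimate e(a, m, b); the paper mentions fitting a
# beta or triangle distribution, we use the triangular one for simplicity.

def sample_e(a: float, m: float, b: float, rng=random) -> float:
    """Draw from a triangular distribution with minimum a, mode m, maximum b."""
    return rng.triangular(a, b, m)  # random.triangular takes (low, high, mode)

# Recovery time of the S3 dependency, e(5, 7, 10):
samples = [sample_e(5, 7, 10) for _ in range(10_000)]
print(min(samples) >= 5 and max(samples) <= 10)  # True
```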
Fig. 13. Petri-Net
Figure 13 depicts the transformed BEAM net for S1, S4, S5 and the dependency between S4 and S5. Please note that we omitted unnecessary details and do not depict the whole Petri-Net for our example. Figure 13 also depicts the pattern for the impact table of service S1. The impact table pattern monitors the S1 pattern. For each entry in the business impact table we have a dedicated place that counts how many times the S1 service became unavailable. For example, the place B represents the second row in the impact table for S1 and counts how many times the time constraints were violated for that particular row. Place A counts how many violations occurred for the fourth row in the merged S1 impact table. This is the row with a severity value greater than the risk appetite, and therefore place A should never contain any tokens. We call such places SLA violation places, and if there is a marking with a token in a violation place we call that marking an SLA violation marking. The patterns S4 and S5 model the behaviour of the internal service state. Essentially, a service can be either available or unavailable. The pattern R5 models a particular risk of service S5 failing. Let us do a simulation: the timed transition in R5 fires during a given time interval and a token is created in place D. This triggers the S5 unavailable transition and a token is put into the S5 unavailable place E. S5 becomes unavailable. A token in the S5 unavailable place enables two transitions: the delay transition of the S4-S5 dependency pattern and the SLA recovery trigger transition. The recovery trigger transition places a token in place F. The two transitions X and Y compete for that token. Let us assume that the transition X with p = 0.9 wins and enables the timed transition that fires during the time interval [4, 6] and recovers the service S5. The timed transition that would have made the dependency pattern unavailable is now disabled.
The recovery was within the time constraints, so the failure was not propagated to the top-level S1 service (no violation marking). We call the selected SLA capable.

5.3
SLA Classification
We distinguish four classes of SLAs:

– incapable SLAs,
– capable SLAs,
– satisfactory SLAs, and
– optimal SLAs.
An incapable SLA is not able to recover a service within a given time constraint; that is, there always exists at least one top-level service for which the validation tool finds a violation marking. A capable SLA is able to achieve a recovery within the given time constraint and prevents a failure propagation to a top-level service. In some simulation runs the validation tool detects no violation marking. However, a capable SLA does not necessarily guarantee recovery or failure prevention in all cases; that is, in some simulation runs the validation tool might find a top-level service violation marking. In our use-case, there exists a 10% chance that the recovery takes more
than 6 hours, which is the case if the transition Y wins the competition with transition X. In that case the recovery may take longer than the UPS is able to provide electricity, and we may violate the MTOS1 objective. We call the risk that the recovery may fail the residual risk of the SLA. A satisfactory SLA is a capable SLA with a residual risk equal to or lower than a predefined residual risk threshold. For example, the expert may state that a recovery must be successful in 90.0% of all cases, that is, the residual risk threshold is set to 10.0%. The validation tool classifies an SLA as satisfactory if it finds violation markings in less than 10% of all simulation runs. The selected SLA of our example is capable and satisfactory. If there exists more than one satisfactory SLA for a service S, we can determine one or more optimal SLAs. An optimal SLA is a satisfactory SLA with minimum costs.
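The classification itself can be approximated outside the Petri-net machinery by a direct Monte-Carlo sketch of the use case: S5 fails, the SLA recovers it within [4, 6] hours with probability 0.9, and a violation occurs when the recovery outlasts the UPS buffer e(10, 15, 20). The single-failure-per-run model and the overrun distribution are our own simplifications of the BEAM simulation.

```python
import random

# Monte-Carlo sketch of the SLA classification (simplified from BEAM).
# A violation: a failure of S5 is not recovered before the UPS buffer of
# S4 is exhausted, so the failure would propagate to the top-level S1.

def classify(p_in_time, rec_lo, rec_hi, buffer_e, threshold=0.10,
             runs=100_000, rng=random.Random(42)):
    violations = 0
    for _ in range(runs):
        a, m, b = buffer_e
        buffer_h = rng.triangular(a, b, m)            # (low, high, mode)
        if rng.random() < p_in_time:
            recovery_h = rng.uniform(rec_lo, rec_hi)      # SLA honoured
        else:
            recovery_h = rng.uniform(rec_hi, 3 * rec_hi)  # assumed overrun model
        if recovery_h > buffer_h:
            violations += 1                           # failure reaches S1
    residual_risk = violations / runs
    if residual_risk == 1.0:
        return "incapable", residual_risk
    return ("satisfactory" if residual_risk <= threshold else "capable",
            residual_risk)

label, risk = classify(p_in_time=0.9, rec_lo=4.0, rec_hi=6.0,
                       buffer_e=(10, 15, 20))
print(label)  # satisfactory -- the residual risk stays well below the 10% threshold
```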
6
Related Work
A business impact analysis is related to reliability engineering. Fault-Tree Analysis (FTA) and Failure Mode, Effects, and Criticality Analysis (FMECA) are two methodologies deployed to carry out an impact analysis [2]. Failure Mode, Effects, and Criticality Analysis (FMECA) [1] is a standard to determine the effects on function, mission success, personnel safety and performance that could arise from failures in systems. However, FMECA provides no means to analyse the temporal failure behaviour of resources, which is required in a BCM analysis [13]. Fault Tree Analysis (FTA) [5] is used in reliability engineering to determine combinations of failures in a system that could lead to undesired events at the top level. The logical relationship between faults is modelled by means of logical gates, such as AND, OR, XOR and NOT. However, it is not possible to model the dynamic behaviour of systems in an FTA, since Boolean gates are not able to capture the order in which events occur, nor is it possible to model time constraints, such as deadlines. This limits the application of FTA in BCM to very simple analyses. SLA translation is a well-researched area with various aspects. A profound report on the status and challenges of SLA translation and SLA optimisation is given in [8]. However, most of the approaches examined in [9] focus on lower-level IT metrics, such as response time and throughput. Our approach focuses on business-level metrics, such as MTO. A very good overview of related work with regard to risk modelling and business impact analysis is provided by [6] and [11]. The authors of [11] discuss a formal approach enabling risk-aware business process modelling and simulation. Their model is based on a profound mathematical definition of business activities, resources and threats. However, the model does not provide means to model arbitrary business values, such as legal aspects or the impact on external reputation.
As the authors point out in [11], their approach currently "does not consider service level management aspects (e.g., planning service level agreements from a requesters view or analysing impacts of resource disruption regarding agreement breaches from a providers view)"; this is considered future work.
The authors of [3] discuss survivability properties of communication systems and propose a model checking approach on stochastic Petri-Nets to decide whether the system under discussion is survivable or not. However, they do not offer automated SLA translation and optimisation mechanisms.
7
Conclusion and Future Work
In this paper we presented an approach to translate business-level BCM requirements into SLAs. We introduced a novel methodology to develop business impact tables and to translate these impact tables to top-level services. Based on these tables the continuity expert is able to select a set of SLAs for a service dependency graph. We transform the dependency graphs into BEAM nets and simulate the behaviour of service systems. As future work we plan to investigate how we can extend our approach to provide support if the expert has to select SLAs for multiple services. In this case the selection problem becomes an optimisation problem. We aim to automate the selection process by investigating an optimisation approach which determines a set of optimal solutions.
References
1. MIL-STD-1629A: Procedures for performing a failure mode, effect and criticality analysis (1980)
2. Brall, A.: Reliability analysis - a tool set for improving business processes. In: Proceedings - Annual Reliability and Maintainability Symposium (RAMS), pp. 1–5 (2010)
3. Cloth, L., Haverkort, B.: Model checking for survivability? In: Second International Conference on the Quantitative Evaluation of Systems (QEST 2005), pp. 145–154 (2005)
4. Cox, L.A.: What's wrong with risk matrices? Risk Analysis: An Official Publication of the Society for Risk Analysis (2), 497–512 (April 2008)
5. FTA: IEC 61025 ed2.0 - Fault tree analysis (FTA) (2006)
6. Jakoubi, S., Tjoa, S., Goluch, G., Quirchmayr, G.: A Survey of Scientific Approaches Considering the Integration of Security and Risk Aspects into Business Process Management. In: 20th International Workshop on Database and Expert Systems Application, pp. 127–132 (2009)
7. Kotsokalis, C., Winkler, U.: Translation of Service Level Agreements: A Generic Problem Definition. In: The 3rd Workshop on Non-Functional Properties and SLA Management in Service-Oriented Computing, NFPSLAM-ICSOC 2009, Stockholm, Sweden (2009)
8. Li, H., Theilmann, W., Happe, J.: SLA Translation in Multi-Layered Service Oriented Architectures: Status and Challenges. Technical Report, Karlsruhe Institute of Technology (2009)
9. Li, J., Chinneck, J., Woodside, M., Litoiu, M., Iszlai, G.: Performance model driven QoS guarantees and optimization in clouds. In: Proceedings of the 2009 ICSE Workshop on Software Engineering Challenges of Cloud Computing, pp. 15–22 (2009)
Boosting Design Space Explorations with Existing or Automatically Learned Knowledge

Ralf Jahr¹, Horia Calborean², Lucian Vintan², and Theo Ungerer¹

¹ Institute of Computer Science, University of Augsburg, 86135 Augsburg, Germany
² "Lucian Blaga" University of Sibiu, Computer Science & Engineering Department, E. Cioran Str., No. 4, Sibiu - 550025, Romania
Abstract. During development, processor architectures can be tuned and configured via many different parameters. For benchmarking, automatic design space explorations (DSEs) with heuristic algorithms are a helpful approach for finding the best settings for these parameters according to multiple objectives, e.g. performance, energy consumption, or real-time constraints. But if the setup is slightly changed and a new DSE has to be performed, it starts from scratch, resulting in very long evaluation times. To reduce the evaluation times, we extend the NSGA-II algorithm in this article such that automatic DSEs can be supported with a set of transformation rules defined in a highly readable format, the Fuzzy Control Language (FCL). Rules can be specified by an engineer, thereby representing existing knowledge. Beyond this, a decision tree classifying high-quality configurations can be constructed automatically and translated into transformation rules. These can also be seen as a very valuable result of a DSE because they allow conclusions to be drawn about the influence of parameters and describe regions of the design space with a high density of good configurations. Our evaluations show that automatically generated decision trees can classify near-optimal configurations for the hardware parameters of the Grid ALU Processor (GAP) and M-Sim 2. Further evaluations show that automatically constructed transformation rules can reduce the number of evaluations required to reach the same quality of results as without rules by 43%, leading to a significant time saving of about 25%. In the demonstrated example, using rules also leads to better results.
1 Introduction Processor architectures can be influenced by many parameters as long as tape-out has not been completed (i.e. architectural design is not final). For benchmarking and during development of a processor it is important to find the best settings for these parameters in a given situation or environment. Due to the size of the design space and the long time necessary for a single simulation, an exhaustive search in the design space is impossible. Instead, heuristic algorithms like NSGA-II [8], SPEA2 [30] or SMPSO [19] can be applied to explore the design space carefully and lead to near-optimal configurations in reasonable time. The genetic algorithm NSGA-II runs with populations of
Horia Calborean was supported by POSDRU financing contract POSDRU 7706.
J.B. Schmitt (Ed.): MMB & DFT 2012, LNCS 7201, pp. 221–235, 2012. c Springer-Verlag Berlin Heidelberg 2012
222
R. Jahr et al.
many individuals (e.g. 50 or 100) and tries to improve their quality over time. Each individual represents a configuration. Usually, random individuals are selected for the first generation. The main result of a DSE is basically a set of the best individuals found during the exploration. They approximate the best possible configurations and hence show values for the objective functions very close to the optimum. Nevertheless, they do not allow conclusions to be drawn concerning the parameter values that are likely to lead to high-quality configurations. But this information would be very helpful for understanding the influence of parameters and their values. To tackle this, an approach is presented to automatically calculate a decision tree with machine learning techniques and translate the tree into understandable rules describing subsets of the total design space with a high density of configurations of high quality. Multiple design space explorations (DSEs) are often necessary to find the best configurations, e.g. for specific application domains or in setups with small changes. Common algorithms always start from the ground up, so they cannot profit from the intuition of an engineer or from conclusions acquired in previous DSEs. To overcome this, we present an extended version of the Framework for Automatic Design Space Explorations (FADSE [5, 6, 13]), which can incorporate domain knowledge, represented as fuzzy transformation rules, in the DSE process [3]. These rules specify how to transform individuals to move them, probably, closer to the Pareto front consisting of optimal individuals. They can be specified by the engineer performing the exploration to describe his knowledge or intuition about the explored system, hence hopefully accelerating the DSE by avoiding the evaluation of configurations of obviously low quality.
As an alternative, we show how rules derived from a decision tree, calculated automatically from the results of a prior DSE, can be translated into transformation rules, hence creating a feedback loop to profit from a prior DSE. The main advantages of the presented approach are that (a) the rules calculated from the results of a DSE allow conclusions to be drawn about parameter values with a high probability of resulting in high-quality configurations, (b) DSEs can be accelerated with rules calculated from the results of a prior similar DSE, (c) DSEs can be sped up with rules representing the knowledge of the person running the DSE, and (d) all types of rules, whether calculated automatically or specified by an engineer, are highly readable and can thus be understood and specified easily. Section 2 introduces related approaches using machine learning techniques and knowledge in DSEs. Modeling, acquiring, and using knowledge in FADSE is described in Section 3. The potential of the approach is evaluated in Section 4. The paper is concluded in Section 5.
2 Related Work

In contrast to many publications on design space exploration and its acceleration, as introduced in this section, our approach makes use of (a) existing knowledge in a (b) readable notation and (c) NSGA-II as algorithm, which is known to work very well in most situations because of a good trade-off between simulation time and quality
of the results [4]. The aim is to choose individuals of the design space such that evaluating them leads, with a higher probability, to an improvement of the approximated Pareto front. In comparison to not using existing knowledge, fewer individuals have to be evaluated to get an approximation of the Pareto front of the same quality. Related to the work presented in this article are all approaches using knowledge, often represented as a meta-model, to speed up DSEs. To achieve this, machine learning techniques can be used to estimate the location of individuals (in the design space) that will be close to the Pareto front (in the objective space). Based on NSGA-II, Mariani et al. [16, 18] use an Artificial Neural Network (ANN) to predict the quality of individuals, which is then used to decide whether they should be simulated or not. ANNs are also integrated into a multi-level model [17]. The link between the parameters of ANNs and understandable facts is typically hard to establish, hence they are not useful as an interface for engineers to specify facts. Predictive models in a more general manner are also used by Ozisikyilmaz et al. [20] to accelerate DSEs. Mariani et al. [15] apply statistical methods to select promising candidates for the next evaluations on the fly by predicting "the expected improvement of unknown configurations". Although not naming it, Ozisikyilmaz and Mariani try to approximate a model of the response surface, which is also done by Cook and Skadron [7]. Other work introducing alternative algorithms for design space explorations has been presented by e.g. Sengupta et al. [24, 29], Beltrame et al. [2], Ascia et al. [1], and Palermo et al. [21]. These algorithms do not incorporate existing knowledge either; it should rather be possible to extend them with the ideas of this article.
To our knowledge, there is no approach that offers a possibility to specify existing knowledge, or even to describe the set of Pareto-optimal individuals in a readable form.
3 Description and Integration of Knowledge in a Design Space Exploration

After describing the basic concepts of design space explorations (Section 3.1), it is explained how to model (Section 3.2) and integrate knowledge into the DSE process with the algorithm NSGA-II (Section 3.3). Finally, a way to automatically acquire knowledge is introduced (Section 3.4).

3.1 Basic Concepts

The aim of a DSE is to find the best points (equivalent to configurations) in the so-called parameter space P ⊂ Z^(n+1), i.e. configurations consisting of values for all parameters p = (p_0, ..., p_n). They are evaluated with one or more objective functions, resulting in a point o = (o_0, ..., o_m) in the objective space O ⊂ R^(m+1). Evaluating a point p in the parameter space can be described as a projection f: P → O into the objective space: (p_0, ..., p_n) ↦ (o_0, ..., o_m).
The dominance relationship defines an order for configurations: a configuration i ∈ P dominates another configuration j ∈ P if all objective values of i are equal to or better than those of j, and at least one is strictly better. The true Pareto front is defined as the optimal set consisting of all non-dominated individuals. It is approximated during the DSE by the set of known non-dominated individuals, which is called the approximated Pareto front.

3.2 Modeling Knowledge

Engineers running a design space exploration (DSE) typically have a rough idea of the area in the parameter space where they expect to find high-quality configurations, for example: "If the width of the processor front-end, the number of execution units, and the number of write-back units are medium, then the processor should have a medium number of commit units." In more general words, a single parameter shall be set to a value from a defined subset of its domain if other parameters have values from specific regions of their domains. Such conditional statements are typically described as fuzzy rules, which make use of so-called linguistic expressions like small, medium, etc. Fuzzy rules can be written in the Fuzzy Control Language (FCL) [12], a domain-specific language. Due to the clear structure and high readability of this language, as well as the major impact on reducing execution time, it can, from our point of view, be expected from an engineer to specify his knowledge in this form. An FCL file specifies inputs and outputs, which are the parameters used in the design space exploration. For them, all used intervals, e.g. low, medium, and high, are described. They can then be used to formulate rules like the following:

IF frontend_width IS medium AND exec_units IS medium AND wbk_units IS medium THEN commit_units IS medium
In the DSE process the rule is evaluated by an engine for fuzzy rules, e.g. the library jFuzzyLogic. As a first step, the actual values of the three parameters in the above example are set as inputs. Next they are fuzzified; this means that their class is calculated. For this task a membership function is used because, as we are in fuzzy logic, it can happen that a value is not fully attributed to one class but e.g. 40% to one class and 60% to another when it lies on the sloped areas of the trapezoid describing the classes (example in Fig. 1). After evaluating all rules, the values of the membership functions for the output values are calculated, and a defuzzification method determines distinct values.
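The fuzzification step can be illustrated with a small, self-contained sketch (in Python; the trapezoid corner points `a, b, c, d` and the example class boundaries below are hypothetical, not taken from the paper's FCL files):

```python
def trapezoid(x, a, b, c, d):
    # Trapezoidal membership function: 0 outside [a, d], 1 on the
    # plateau [b, c], and linear on the two slopes.
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

# A value on a slope belongs partially to two overlapping classes,
# e.g. 50% "medium" and 50% "high" for x = 7 with these made-up classes:
medium = trapezoid(7, 2, 4, 6, 8)    # 0.5
high = trapezoid(7, 6, 8, 10, 12)    # 0.5
```

This is exactly the situation sketched in Fig. 1: the membership degrees of the overlapping classes, not a single crisp class, are fed into the rule evaluation.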
Fig. 1. Trapezoid areas describing exemplary fuzzy classes
Fig. 2. Sketch of an extended version of NSGA-II with additional mutation operator (initial population → mutation → current population → crossover and mutation → offspring → combination & selection for the next generation)
3.3 Using Knowledge during the DSE Process

We are running FADSE with NSGA-II as the DSE algorithm. Its coarse structure, with an additional mutation step for the initial population which will be explained later¹, is displayed in Figure 2. To avoid generating individuals that we know from our experience are likely to have low quality, knowledge can be helpful in two situations: (a) to generate the initial population and (b) when mutating individuals. If the initial population already contains individuals that are supposed to be closer to the Pareto front, then the algorithm converges more quickly. Whenever a parameter of an individual is mutated, which happens with a defined probability β, a new value is chosen from anywhere in the parameter's domain. Here knowledge can help to select a probably better value. Knowledge can be expressed in FADSE as transformation rules modeled as fuzzy rules. They are integrated into the mutation operator. Instead of using an initial population generated at random as starting point for the DSE, a random generation is first processed by the mutation operator (see Fig. 2). This increases the number of individuals supposed to be close to the Pareto front. When integrating knowledge in the mutation process, one has to keep in mind that the algorithm should not be restricted but supported by the existing knowledge. Hence the influence of the knowledge should be higher in the first generations than at the end of the design space exploration. Therefore in Listing 1.1, which describes a possible knowledge-supported mutation operator for NSGA-II, the knowledge-supported new value for a parameter is used only with a probability α_generation dependent on the generation count. The mutation operator is also influenced by the standard mutation probability β, typically set to 1/#parameters, e.g. 1/6 ≈ 0.17 for six parameters. To calculate
¹ Nevertheless, randomly changing random values does not change anything; the initial population stays random.
Fig. 3. Example of a Gaussian distribution modeling the influence of the rules in the DSE process

foreach (Parameter p)
    if (rand(0.0, 1.0) < α_generation)      # Knowledge?
        // If a value is available
        if (knowledge_value(p))
            p.value = knowledge_value(p)
    else if (rand(0.0, 1.0) < β)            # Random?
        p.value = random_value(p)

Listing 1.1. Mutation operator for NSGA-II able to handle existing knowledge
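Listing 1.1 can be sketched as runnable Python, together with a Gaussian-shaped schedule for α as described in the text; the exact shape parameters `start` and `width` are assumptions, chosen so that α drops from 0.8 towards β within roughly 500 evaluated individuals:

```python
import math
import random

def alpha(evaluated, beta, start=0.8, width=200.0):
    # Gaussian-shaped decay: close to `start` for the first individuals,
    # converging towards the base mutation probability beta.
    return beta + (start - beta) * math.exp(-(evaluated / width) ** 2)

def mutate(individual, domains, knowledge_value, beta, a):
    # individual: dict parameter name -> value
    # domains: dict parameter name -> list of allowed values
    # knowledge_value(p, individual): rule-suggested value for p, or None
    for p in individual:
        if random.random() < a:                  # Knowledge?
            v = knowledge_value(p, individual)
            if v is not None:                    # only if a rule matched
                individual[p] = v
        elif random.random() < beta:             # Random?
            individual[p] = random.choice(domains[p])
    return individual
```

As in the listing, a parameter keeps its value when the knowledge branch is taken but no rule produces a value for it.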
α_generation, a Gaussian distribution starting at 0.8 for the first individual and converging towards β is used. The value β is reached after about 500 individuals or, with 50 individuals per generation, after 10 generations (see Figure 3 for an example). When fuzzy rules are matched, the results are typically intervals, or more precisely a trapezoid with intervals as width and the membership function as height (Figure 1), and a specific method calculates a single value for further use. This process is called defuzzification. One of the more popular defuzzifiers is Center of Gravity (COG), but many more methods exist [23]. Beyond this, there is a difference between the way we use fuzzy rules and the way they are typically used. Normally all input values are set, the rules are triggered, and all output parameters are defuzzified and written back. In our mutation operator we work iteratively (respecting the bit-flip mutation operator implementation): for each parameter we set all input values available so far, trigger the rules, but write back only a single parameter, then move on to the next one. It is vital to sustain diversity of the individuals; the algorithm shall not be limited by generating very similar individuals over and over. An increased risk of getting stuck in a local minimum would be the consequence. This issue is taken into account (a) by using a Gaussian distribution for the probability of applying the transformation rules, hence reducing the influence of knowledge over time, and (b) by the fact that only a fraction of all possible individuals can be matched and thus transformed by fuzzy rules. This also has to be kept in mind when designing them.

3.4 Acquiring Knowledge through Data Mining Techniques

In this section, it is assumed that knowledge is not available explicitly but through a previous similar DSE.
Actually, these results contain information about where to find high-quality individuals in the design space, but this knowledge is only represented
by data points and not as facts or rules, so it is only implicitly available. Data mining techniques can be applied for this transformation. The goal is to find rules describing how to change individuals to improve their quality. An evident approach is to measure the quality of an individual by its distance to the true Pareto front. Because the true Pareto front is unknown, its approximation calculated from the available results is used instead. In areas with very small or very large slope, the minimum distance between two of these points can be relatively high. Hence, additional imaginary points are calculated with e.g. linear interpolation and added to the approximation of the Pareto curve to close gaps. The quality of an individual i is then defined by the minimum weighted distance in the objective space to an individual j on the approximated Pareto curve; the weighted distance d(i, j) is defined for i and j with objective values f(i) = a, f(j) = b and s ∈ {0, ..., m} as follows:

d(i, j) = √( Σ_{s ∈ {0,...,m}} ((a_s − b_s) / Δ_s)² )   where i, j ∈ P; a, b ∈ O; f(i) = a; f(j) = b

Δ_s is the difference between the maximal and minimal value that points of the approximated Pareto front show for objective s; dividing by Δ_s normalizes the objectives. Individuals within a maximal distance ε to the approximated Pareto front are called perfect; they are the candidates to be described by rules. This creates two classes of individuals: perfect ones and all the others, named good. In our evaluations, we selected ε so that about one third of the total number of individuals is rated as perfect. A model is needed to decide a priori, i.e. before calculating the real values of its objectives, whether an individual will be perfect or good. This is a classification task, a quite common situation in which machine learning techniques can be applied. The following paragraphs describe how rules are generated from an automatically constructed decision tree.
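The weighted distance and the perfect/good labeling can be sketched in a few lines (the function and variable names are ours; `deltas[s]` corresponds to Δ_s):

```python
import math

def weighted_distance(a, b, deltas):
    # Normalized Euclidean distance between two points of the objective
    # space; deltas[s] is the span (max - min) of objective s over the
    # approximated Pareto front.
    return math.sqrt(sum(((a[s] - b[s]) / deltas[s]) ** 2
                         for s in range(len(a))))

def label(objectives, pareto_front, deltas, eps):
    # "perfect" if within eps of the closest point of the approximated
    # (and possibly interpolated) Pareto front, otherwise "good".
    d = min(weighted_distance(objectives, p, deltas) for p in pareto_front)
    return "perfect" if d <= eps else "good"
```

In the paper, eps (ε) is chosen so that roughly one third of all evaluated individuals is labeled perfect.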
Other approaches replacing decision trees for classification should be possible, too. For our evaluations we use the data mining tool WEKA [11]. First, a reduction of the complexity of the parameter space is done by selecting only the most influential parameters with the CFS attribute subset evaluator [10]. Second, a decision tree is built with the algorithm C4.5 by Quinlan [22]. A decision tree is a binary tree with nodes and edges. Nodes without outgoing edges are called leaves and are labeled with either perfect or good. A hierarchical set of decisions is mapped to all other nodes, which have two outgoing edges. A decision D_j ∈ D is of the form p_i ≤ a or p_i > b, where p_i ∈ {p_0, ..., p_n} is a parameter of an individual and a and b are numeric values. The path W_k from the root of the tree to a leaf k, which is labeled perfect or good, can be described as W_k = (D_a, D_b, D_c, ..., D_n) with {D_a, D_b, D_c, ..., D_n} ⊂ D. Decision trees are typically complete and conflict-free, so every individual can be classified, and the sets of individuals classified as perfect and good are disjoint. To calculate the class of an individual, the tree is traversed from top to bottom, evaluating the decisions on the nodes and following the edges corresponding to the results of the decisions. Finally a leaf is reached and the individual is classified as an element of the corresponding class. If an individual can be evaluated with the leaf k, then all decisions
on the path W_k can be interpreted as conditions restricting the parameter space. So the conjunction of all conditions on W_k, i.e. D_a ∧ D_b ∧ D_c ∧ ... ∧ D_n, must be true. Conditions in transformation rules can have a lower and an upper bound, so conditions on a path in the decision tree can often be compacted (e.g. p_i ≤ 9 ∧ p_i ≤ 12 ∧ p_i > 4 ⇒ 4 < p_i ≤ 9). The resulting decision tree has a height smaller than or equal to the number of parameters. The basic idea for gaining a transformation rule R from conditions is to pick one of the conditions C_i from the path W and use it as the consequence of the transformation rule R:

R := C_a ∧ C_b ∧ C_c ∧ ... ∧ C_m → C_i   where {C_a, C_b, C_c, ..., C_m} = W \ {C_i}

This means that if all conditions of the path W from the root of the decision tree to the leaf except C_i are true, then the parameters of the individual are set in a way that C_i is true, too. Such a rule can be generated for every C_i ∈ W. As on each path there are at most as many decisions as parameters in the individual (i.e. n), at most n transformation rules are created for each perfect leaf of the decision tree. Although these rules have sharp conditions, because the common relations ≤ and < are used, they can easily be written as fuzzy rules. However, they do not use fuzzy interval borders, in contrast to the rules specified manually by an engineer, so the membership function for their conditions is always 0 or 1. If the same fuzzy rule matches for several individuals, all of them will, if COG or a similar method is used, end up with the same value set for one or more parameters, because the membership function has only very high (close to 1) or very low (close to 0) values. This contradicts the purpose of mutation. Hence we developed a new random defuzzifier, which picks a random value from regions with high membership values in order to sustain diversity.
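The random defuzzifier can be sketched as follows; this is a minimal version for discrete parameter domains, and the membership threshold of 0.5 is our assumption:

```python
import random

def random_defuzzify(candidates, membership, threshold=0.5):
    # Instead of a crisp value such as the Center of Gravity, pick a
    # random candidate from the region(s) where the membership function
    # is high; this sustains diversity during mutation.
    high = [x for x in candidates if membership(x) >= threshold]
    return random.choice(high) if high else random.choice(candidates)
```

Repeated calls thus spread the mutated individuals over the whole high-membership region instead of collapsing them onto a single value.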
When evaluating an individual, it will match one of the transformation rules only if at most one of the individual's parameters is not in the intervals described by the rule. To increase the probability of matching a transformation rule, additional rules can be introduced that change a parameter value lying in an interval not covered by any rule to a value from the complementary interval. As an example, imagine the domain ]0; 8] of a parameter being covered by rules only in ]0; 4]. To deal with individuals with p_a ∈ ]4; 8], a rule is introduced to move the value of p_a to the covered region ]0; 4]. This increases, as we were able to show in some experiments, the probability of matching a rule and leads to a higher number of perfect individuals generated with the rule set (in our case about 10% more perfect unique individuals, see Section 4.1). To summarize, we use an automatically constructed decision tree based on the data of a previous DSE to decide for an individual, without evaluating it, whether it will be close to the approximated Pareto front or not. Based on these decision rules we construct a set of transformation rules, described by fuzzy rules, for further use in the DSE process.
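The two steps, collecting the conditions on a root-to-leaf path and turning one path into up to n transformation rules, can be sketched as follows (the nested-dict tree encoding is ours, not the WEKA/C4.5 format):

```python
def leaf_paths(node, target_class, conds=()):
    # Collect, for every leaf labeled target_class, the conjunction of
    # decisions on the path from the root to that leaf.
    if "label" in node:  # leaf node
        return [list(conds)] if node["label"] == target_class else []
    p, thr = node["param"], node["threshold"]
    return (leaf_paths(node["le"], target_class, conds + (f"{p} <= {thr}",))
            + leaf_paths(node["gt"], target_class, conds + (f"{p} > {thr}",)))

def to_rules(path):
    # R := Ca ^ ... ^ Cm -> Ci: each condition on the path becomes the
    # consequence of one rule, with all other conditions as antecedent.
    return [(path[:i] + path[i + 1:], path[i]) for i in range(len(path))]

# Tiny example tree: an individual is perfect iff p1 <= 20 and p2 > 3.
tree = {"param": "p1", "threshold": 20,
        "le": {"param": "p2", "threshold": 3,
               "le": {"label": "good"},
               "gt": {"label": "perfect"}},
        "gt": {"label": "good"}}
```

For the example tree, the single perfect path yields two transformation rules, each using one condition as consequence and the other as antecedent.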
4 Evaluation

To show the effectiveness of adopting existing knowledge during a DSE process, we focus on the Grid ALU Processor (GAP), a novel reconfigurable processor
Table 1. Parameter space for GAP's array (left) and instruction cache (right)

  Parameter   Description     Domain
  p0          Rows            {4, 5, 6, ..., 32}
  p1          Columns         {4, 5, 6, ..., 31}
  p2          Layers          {1, 2, 4, ..., 64}
  p3          Line size       {4, 8, 16}
  p4          Sets            {32, 64, 128, ..., 8192}
  p5          Lines per set   {1, 2, 4, ..., 128}
architecture for the execution of sequential instruction streams. A superscalar in-order frontend loads instructions. The GAP-specific configuration unit maps them dynamically onto a three-dimensional array of functional units (FUs). The array is organized in columns, rows, and configuration layers. Detailed information about the processor architecture is given by Uhrig et al. [27] and Shehan et al. [25]. The following three subsections cover different independent scenarios and evaluate the benefit of automatically generated rules.

4.1 Automatically Developing Rules

The goal of this subsection is to show the effectiveness of automatically generated rules in describing the set of individuals close to the Pareto front (see Section 3.4). The basis is the data gathered from the DSE of the hardware parameters of the Grid ALU Processor (GAP) described in [13]. It comprises 2833 individuals (out of which 2385 are unique) with six parameters (see Table 1) and values for two objectives, i.e. hardware complexity [13] and performance measured as clocks per instruction (CPI). Individuals with a weighted distance smaller than ε = 0.018 to their closest individual on the approximated Pareto front (or to an imaginary individual calculated with linear interpolation) are classified as perfect; these are 994 or about 35% of all evaluated individuals.
This result is good enough for our intentions. One should keep in mind that the rules should only point at the regions of the parameter space where perfect individuals can be found and not describe them sharply. The following table shows the confusion matrix; it can be concluded that the classification of individuals works well. Perfect Good ⇐ Classified as... Perfect 754 240 Good 285 1554 The complexity of the tree can be reduced further by sacrificing a bit of precision, i.e. by cutting away leafs which classify only a very small number of individuals. In our
(p1 ≤ 20) ∧ (3 < p2) ∧ (2^8 < p4) ∧ (p5 ≤ 2^4) ⇒ (652/150)
(p1 ≤ 20) ∧ (5 < p2) ∧ (2^11 < p4) ∧ (p5 ≤ 2^1) ⇒ (153/41)
(p1 ≤ 20) ∧ (3 < p2) ∧ (2^7 < p4 ≤ 2^8) ∧ (2^1 < p5 ≤ 2^4) ⇒ (81/15)
(p1 ≤ 8) ∧ (p2 ≤ 3) ∧ (2^7 < p4 ≤ 2^10) ∧ (2^0 < p5 ≤ 2^1) ⇒ (70/30)
(11 < p1 ≤ 18) ∧ (3 < p2 ≤ 5) ∧ (2^11 < p4) ∧ (p5 ≤ 2^2) ⇒ (34/9)
Fig. 4. Paths from the root of the decision tree to the five most important leaves describing perfect individuals. The numbers in brackets are the number of classified individuals and the number of incorrectly classified individuals.
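The F-measures reported above follow directly from the confusion matrix and can be checked with a few lines:

```python
def f_measure(tp, fn, fp):
    # Precision, recall and F1 score for one class of a confusion matrix.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Rows of the confusion matrix: actual class; columns: classified as.
f_perfect = f_measure(tp=754, fn=240, fp=285)   # ≈ 0.742
f_good = f_measure(tp=1554, fn=285, fp=240)     # ≈ 0.855
```

The class-weighted average, (994 · 0.742 + 1839 · 0.855) / 2833, reproduces the reported 0.816 as well.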
Fig. 5. Classification of the configurations evaluated in the DSE with five rules; only a part of the objective space is displayed
case, 64 classifications are lost by selecting only the five most important leaves. The next step is to compact and sort the conditions on the paths from the root to the selected leaves. The paths to the five most important perfect leaves are shown in Figure 4. For a graphical demonstration, all configurations evaluated during the DSE of GAP's hardware parameters have been classified with the conditions described by the five most important paths. Figure 5 shows the results, where e.g. Perfect ⇒ Good denotes perfect configurations classified as good, i.e. classified incorrectly. In the most important regions, i.e. very close to the Pareto front and with complexity below 1200, there are nearly no incorrectly classified individuals. Far away from the Pareto front, the density of correctly classified individuals is high, too. Although there is a region close to the Pareto front with configurations incorrectly classified as perfect, this should not cause problems in the DSE because they help to sustain diversity. Also, their number is comparably small, as described by the confusion matrix. The two clusters with perfect configurations classified as good could be caused by picking only five rules to classify perfect individuals. Based on Section 3.4, it is easy to translate these rules into transformation rules and to represent them as fuzzy rules; 20 of them are generated. Three additional rules are
introduced to provide full coverage of all values of the domains of parameters p1, p4, and p5. To evaluate the rule set, random individuals are generated and evaluated with different probabilities α of applying the fuzzy rules. 50 iterations with 1000 unique individuals each were performed; then the average ratio of unique perfect generated individuals was calculated. This measure is used because (a) it is not helpful in the DSE process if the rules generate the same individuals again and again (little diversity) and (b) a high ratio of generated perfect individuals is still desired. If knowledge is totally ignored (α = 0), the ratio is about 11%. This value increases linearly to 51% if the parameter values calculated with knowledge are always used (α = 1). As expected, the influence of the rules shrinks with a decreasing probability of using values generated by them. The ratio of unique and perfect individuals is high enough to have a significant impact on the DSE process and low enough to sustain diversity. Apart from the GAP, we also tried to find transformation rules for M-Sim 2 [14]; clocks per instruction (CPI) and energy consumption are used as objectives. One third of all individuals is defined as perfect. The number of parameters was reduced from 30 to 9. The decision tree has 72 leaves, 32 of which are labeled perfect. It classifies 74% of all individuals correctly, and 10 rules are enough to describe more than 85% of the individuals classified as perfect by the decision tree (F-measure 0.541 for perfect, 0.814 for good). In conclusion, the proposed approach works well for M-Sim 2, too.

4.2 Accelerated General DSE of GAP's Hardware Parameters

Because a DSE with a representative number of benchmarks needs much time, we propose to run a special case first, i.e. a DSE with a single benchmark, gain rules representing the outcome of this DSE automatically, and then run the general DSE supported by these rules.
We picked stringsearch, one of the smallest benchmarks of the MiBench Benchmark Suite [9] in terms of execution time. A DSE was performed with it, and from the 1100 unique results a set of four rules was gained automatically as described in Section 3.4. Supported by these rules, the DSE process is then run with 10 benchmarks five times; as a reference, we also run DSEs without support by rules. Figure 6 shows the average hypervolumes of five runs with and without rules in relation to the number of evaluated unique individuals. The hypervolume, also called hyperarea, is the region between the so-called hypervolume reference point and the approximation of the Pareto front. The higher the hypervolume, the better the found approximation of the Pareto front. But neither can its numerical value be interpreted, nor can its optimum be defined; both would require (a) the true Pareto front, which is per se unknown, and (b) a very deep understanding of the evaluated problem, which would even supersede a DSE. After two generations, the runs supported with transformation rules show better results until the exploration is aborted after 40 generations. The quality of the results gained with rule support is never reached without rules. Typically a DSE can be aborted if the hypervolume stops showing progress for some generations. For the run without rules this is the case after 871 configurations with a 5-generation threshold; the hypervolume is nearly the same as after 780 configurations or 24 generations. If the automatically calculated rules are used, the same value for
R. Jahr et al.
the hypervolume is reached after 423 evaluated configurations or 11 generations. In conclusion, in our setup approx. 46% fewer evaluations are necessary to reach the same value for the hypervolume with rule support. Concerning the time necessary to (a) run the simple DSE, calculate rules, and perform the complex DSE, in comparison to (b) just running the complex DSE, one always has to keep in mind that with rules results of a superior quality can be reached. So even running the DSE without rules twice as long will not lead to comparable results. The simple DSE finished in about 2 hours; calculating the rules, a semi-automated task, took only about 15 minutes. Without rules, an evaluation took on average 18 hours; with rules it was 11 hours to gain the same hypervolume (durations are given as an example to understand the difference in time between the tasks). So the reduction is approx. 4:45 hours, which is about a quarter.2 Nevertheless, as the rule-supported exploration still shows progress and reaches a higher level of the hypervolume, one should let it run longer and profit from the improved results.
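The two notions used in this subsection, the hypervolume of a two-objective (minimization) Pareto-front approximation and the stagnation-based abort criterion with a 5-generation threshold, can be sketched as follows. The front, the reference point and the hypervolume history below are made-up values, not data from the paper.

```python
def hypervolume_2d(front, ref):
    """Area dominated by a 2-objective minimization front, bounded by the
    hypervolume reference point (simple staircase sweep; larger is better)."""
    pts = sorted(p for p in front if p[0] < ref[0] and p[1] < ref[1])
    hv, prev_y = 0.0, ref[1]
    for x, y in pts:
        if y < prev_y:                  # non-dominated step of the staircase
            hv += (ref[0] - x) * (prev_y - y)
            prev_y = y
    return hv

def should_abort(hv_history, threshold=5, eps=1e-9):
    """Abort once the hypervolume made no progress for `threshold` generations."""
    if len(hv_history) <= threshold:
        return False
    recent = hv_history[-(threshold + 1):]
    return max(recent) - recent[0] <= eps

front = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]   # made-up Pareto approximation
print(hypervolume_2d(front, ref=(5.0, 5.0)))   # 11.0
print(should_abort([0.2, 0.5, 0.8, 0.9] + [0.95] * 6))  # True: stagnation
```

As noted in the text, the absolute hypervolume value carries no meaning on its own; only the comparison between runs (and its progress over generations) is informative.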
Fig. 6. Average hypervolume for five DSEs of the hardware parameters of GAP in relation to the number of unique individuals which have to be evaluated. The data points represent the hypervolume gained after completing a generation starting with the first generation on the very left.
Our results show a speedup of the DSE's progress of ca. 50% when using the rules gained from the single-benchmark DSE. Because running a DSE with a single benchmark does not consume much time, the break-even point in terms of duration and result quality can be reached easily. Moreover, using the rules from the previous run leads to even better results.

4.3 Specializing the GAP for Application Domains

In recent years IP cores have become more and more important. They can be configured with optimal parameters for the targeted application domain. A DSE can be used to find
2 As the explorations with and without rules were run in parallel and with the same number of cores, the interaction of the runs, also caused by results stored in the database (see [13]), is assumed to be little.
Boosting Design Space Explorations
optimal parameters, and the vendor could support this by providing a set of rules describing optimal configurations in typical environments. In a case study the GAP was configured for (a) encoding/decoding images with JPEG and (b) encrypting/decrypting data with the Rijndael algorithm (AES). A set of five rules as shown earlier in Figure 4 was used to support the explorations with FADSE; the rules were calculated automatically based on a DSE with multiple benchmarks. With support from the rule set, superior results were found in shorter time and with fewer simulations than in comparative runs without rules.
5 Summary, Conclusion and Outlook

We have presented an approach to accelerate the progress of the NSGA-II algorithm for automatic design space explorations (DSEs). It works by directing the mutation operator, i.e. the component of the algorithm which is used to sustain diversity, in the first generations of the exploration to regions of the design space where good individuals are supposed to be located. The modified mutation operator presented for NSGA-II could also be integrated into algorithms like OMOPSO [26] and SMPSO [19], because they also incorporate a mutation operator. Technically speaking, fuzzy rules are used to describe the necessary transformations of individuals to improve their quality. These rules, also called transformation rules, are specified in the Fuzzy Control Language (FCL), a highly readable domain-specific language. Hence they can easily be specified by an engineer according to his knowledge or intuition. It is also possible to use data mining methods to calculate transformation rules. We showed how a decision tree can be constructed automatically from a prior similar DSE and converted into transformation rules. These rules, describing areas of the parameter space with many high-quality configurations, are a very profitable result of the analyzed DSE because they allow conclusions about good values for the parameters. In the evaluation of the approach the Grid ALU Processor (GAP) is used as an example. When performing DSEs for processor architectures, typically extensive simulations have to be run. We were able to show that a decision tree can be constructed automatically to describe regions of the design space with a high density of high-quality configurations for the GAP as well as for M-Sim 2. With the transformation rules derived from this tree, a respectable speed-up of about 25% for similar DSEs with slightly changed setups was gained for the GAP.
As future work we plan to further extend FADSE to make it able to cope with much larger design spaces, with the goal of using it for adaptive code optimizations (see e.g. [28]), where parameters for multiple code optimizations and accordingly chosen hardware parameters have to be found. We already tested the proposed technique (generating the transformation rules) on the parameters of the M-Sim 2 simulator with very good results. The algorithm used to construct the classification trees is well known and should scale to larger design spaces easily. When dealing with many parameters, not all of them might influence the output very much, so some of them can be removed from the decision tree because of their low entropy.
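The pruning idea mentioned above can be illustrated with a small entropy computation: a parameter whose split yields (almost) no information gain barely influences the classification and is a candidate for removal from the tree. The configurations and labels below are invented for illustration.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a label multiset, in bits."""
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def information_gain(samples, labels, param):
    """Expected entropy reduction when splitting on `param`."""
    by_value = {}
    for s, l in zip(samples, labels):
        by_value.setdefault(s[param], []).append(l)
    remainder = sum(len(ls) / len(labels) * entropy(ls)
                    for ls in by_value.values())
    return entropy(labels) - remainder

# hypothetical configurations labeled perfect/other
samples = [{"p1": 1, "p2": 0}, {"p1": 1, "p2": 1},
           {"p1": 2, "p2": 0}, {"p1": 2, "p2": 1}]
labels = ["perfect", "perfect", "other", "other"]
print(information_gain(samples, labels, "p1"))  # 1.0: p1 decides the class
print(information_gain(samples, labels, "p2"))  # 0.0: p2 could be pruned
```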
References 1. Ascia, G., Catania, V., Nuovo, A.G.D., Palesi, M., Patti, D.: Efficient design space exploration for application specific systems-on-a-chip. Journal of Systems Architecture 53(10), 733–750 (2007); Embedded Computer Systems: Architectures, Modeling, and Simulation 2. Beltrame, G., Bruschi, D., Sciuto, D., Silvano, C.: Decision-theoretic exploration of multiprocessor platforms. In: Proceedings of the 4th International Conference on Hardware/Software Codesign and System Synthesis, CODES+ISSS 2006, pp. 205–210 (October 2006) 3. Calborean, H.: Multi-Objective Optimization of Advanced Computer Architectures using Domain-Knowledge. PhD thesis, Lucian Blaga University of Sibiu, Romania (PhD Supervisor: Prof. Lucian Vintan, PhD) (2011) 4. Calborean, H., Jahr, R., Ungerer, T., Vintan, L.: Optimizing a superscalar system using multiobjective design space exploration. In: Proceedings of the 18th International Conference on Control Systems and Computer Science (CSCS), Bucharest, Romania, Calea Grivitei, nr. 132, 78122, Sector 1, Bucuresti, May 24-27, vol. 1, pp. 339–346. Editura Politehnica Press (2011) ISSN 2066-4451 5. Calborean, H., Vintan, L.: An automatic design space exploration framework for multicore architecture optimizations. In: 9th Roedunet International Conference (RoEduNet), Sibiu, Romania, pp. 202–207 (June 2010) 6. Calborean, H., Vintan, L.: Toward an efficient automatic design space exploration frame for multicore optimization. In: ACACES 2010 Poster Abstracts, Terassa, Spain, pp. 135–138 (July 2010) 7. Cook, H., Skadron, K.: Predictive design space exploration using genetically programmed response surfaces. In: Proceedings of the 45th annual Design Automation Conference, DAC 2008, pp. 960–965. ACM, New York (2008) 8. Deb, K., Pratap, A., Agarwal, S., Meyarivan, T.: A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation 6(2), 182–197 (2002) 9. 
Guthaus, M., Ringenberg, J., Ernst, D., Austin, T., Mudge, T., Brown, R.B.: MiBench: A free, commercially representative embedded benchmark suite. In: 4th IEEE International Workshop on Workload Characterization, pp. 3–14 (December 2001) 10. Hall, M.: Correlation-based Feature Selection for Machine Learning. PhD thesis, University of Waikato (1999) 11. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The WEKA data mining software: an update. SIGKDD Explor. Newsl. 11, 10–18 (2009) 12. IEC 1131 - programmable controllers, part 7 - fuzzy control programming (January 1997) 13. Jahr, R., Ungerer, T., Calborean, H., Vintan, L.: Automatic multi-objective optimization of parameters for hardware and code optimizations. In: Smari, W.W., McIntire, J.P. (eds.) Proceedings of the 2011 International Conference on High Performance Computing & Simulation (HPCS 2011), pp. 308–316. IEEE (July 2011) ISBN 978-1-61284-381-0 14. Sharkey, J.J., Ponomarev, D., Ghose, K.: M-Sim: A flexible, multithreaded architectural simulation environment. Technical Report CS-TR-05-DP01, State University of New York at Binghamton (October 2005) 15. Mariani, G., Brankovic, A., Palermo, G., Jovic, J., Zaccaria, V., Silvano, C.: A correlation-based design space exploration methodology for multi-processor systems-on-chip. In: Proceedings of the 47th Design Automation Conference, DAC 2010, pp. 120–125. ACM, New York (2010) 16. Mariani, G., Palermo, G., Silvano, C., Zaccaria, V.: Meta-model assisted optimization for design space exploration of multi-processor systems-on-chip. In: Proceedings of the 2009 12th Euromicro Conference on Digital System Design, Architectures, Methods and Tools, DSD 2009, pp. 383–389. IEEE Computer Society, Washington, DC (2009)
17. Mariani, G., Palermo, G., Silvano, C., Zaccaria, V.: Multi-processor system-on-chip design space exploration based on multi-level modeling techniques. In: International Symposium on Systems, Architectures, Modeling, and Simulation, SAMOS 2009, pp. 118–124 (July 2009) 18. Mariani, G., Palermo, G., Zaccaria, V., Silvano, C.: An efficient design space exploration methodology for multi-cluster VLIW architectures based on artificial neural networks. In: Proc. IFIP International Conference on Very Large Scale Integration, VLSI-SoC 2008, Rhodes Island, Greece, October 13-15 (2008) 19. Nebro, A., Durillo, J., García-Nieto, J., Coello, C.A., Luna, F., Alba, E.: SMPSO: A new PSO-based metaheuristic for multi-objective optimization. In: Proceedings of the IEEE Symposium Series on Computational Intelligence, pp. 66–73 (2009) 20. Ozisikyilmaz, B., Memik, G., Choudhary, A.: Efficient system design space exploration using machine learning techniques. In: Proceedings of the 45th Annual Design Automation Conference, DAC 2008, pp. 966–969. ACM, New York (2008) 21. Palermo, G., Silvano, C., Zaccaria, V.: Discrete particle swarm optimization for multi-objective design space exploration. In: Proceedings of the 2008 11th EUROMICRO Conference on Digital System Design Architectures, Methods and Tools, pp. 641–644. IEEE Computer Society, Washington, DC (2008) 22. Quinlan, J.R.: C4.5: Programs for Machine Learning, 1st edn. Morgan Kaufmann Series in Machine Learning. Morgan Kaufmann (January 1993) 23. Roychowdhury, S., Pedrycz, W.: A survey of defuzzification strategies. International Journal of Intelligent Systems 16(6), 679–695 (2001) 24. Sengupta, A., Sedaghat, R., Zeng, Z.: Rapid design space exploration by hybrid fuzzy search approach for optimal architecture determination of multi objective computing systems. Microelectronics Reliability 51, 502–512 (2010); 2010 Reliability of Compound Semiconductors (ROCS) Workshop; Prognostics and Health Management 25.
Shehan, B., Jahr, R., Uhrig, S., Ungerer, T.: Reconfigurable grid ALU processor: Optimization and design space exploration. In: Proceedings of the 13th Euromicro Conference on Digital System Design (DSD), Lille, France (2010) 26. Sierra, M.R., Coello Coello, C.A.: Improving PSO-Based Multi-objective Optimization Using Crowding, Mutation and ε-Dominance. In: Coello Coello, C.A., Hernández Aguirre, A., Zitzler, E. (eds.) EMO 2005. LNCS, vol. 3410, pp. 505–519. Springer, Heidelberg (2005) 27. Uhrig, S., Shehan, B., Jahr, R., Ungerer, T.: The two-dimensional superscalar GAP processor architecture. International Journal on Advances in Systems and Measurements 3(1 and 2), 71–81 (2010) 28. Waterman, T.: Adaptive compilation and inlining. PhD thesis, Houston, TX, USA, Adviser: Cooper, K.D. (2006) 29. Zeng, Z., Sedaghat, R., Sengupta, A.: A framework for fast design space exploration using fuzzy search for VLSI computing architectures. In: Proceedings of 2010 IEEE International Symposium on Circuits and Systems (ISCAS), May 30-June 2, pp. 3176–3179 (2010) 30. Zitzler, E., Laumanns, M., Thiele, L.: SPEA2: Improving the strength Pareto evolutionary algorithm. Technical Report 103, Computer Engineering and Networks Laboratory (TIK), Swiss Federal Institute of Technology (ETH), Zurich, Switzerland (2001)
IBPM: An Open-Source-Based Framework for InfiniBand Performance Monitoring

Michael Hoefling 1, Michael Menth 1, Christian Kniep 2, and Marcus Camen 2

1 University of Tuebingen, Chair of Communication Networks, Sand 13, 72076 Tuebingen, Germany {hoefling,menth}@informatik.uni-tuebingen.de
2 science + computing ag, Hagellocher Weg 73, 72070 Tuebingen, Germany {c.kniep,m.camen}@science-computing.de
Abstract. In this paper, we present a tool for performance measurement of InfiniBand networks. Our tool analyzes the network and presents a comprehensible visualization of the performance and health of the network. InfiniBand network operators can use the tool to detect potential bottlenecks and optimize the overall performance of their network.
1 Introduction
InfiniBand (IB) [1] is a communication technology used for interconnection in high-performance computing (HPC) data centers. With increasing computational needs, IB networks become more complex. Finding hot spots in the network, detecting congestion, and providing an overall health map of the network are features strongly needed to operate the network as efficiently as possible. Vendors of IB hardware offer proprietary software to manage IB networks. Mellanox’ Unified Fabric Manager (UFM) [2] and QLogic’s InfiniBand Fabric Suite (IFS) [3] allow network administrators and operators to monitor the current state of the network and optimize the routing. In this work, we present the IB performance monitoring tool (IBPM). We implemented IBPM as an extensible framework based on open-source components. Our goal was to give network administrators a tool to monitor and analyze their IB networks. The paper is organized as follows. In the next section, we briefly present the core idea of rate measurement in IB networks. The features of IBPM are presented in Section 3. Section 4 describes our architecture and implementation and Section 5 concludes this work.
2 Rate Measurement in IB Networks
The IB network utility software package [4] provides tools to extract raw information from the network. These tools are outlined in the following. We use them, analyze their output, and derive statistics about the performance of the network.

J.B. Schmitt (Ed.): MMB & DFT 2012, LNCS 7201, pp. 236–239, 2012. © Springer-Verlag Berlin Heidelberg 2012
Topology Extraction. The tool ibnetdiscover performs subnet discovery in the IB network. It produces a human-readable file displaying all nodes and links of the topology. We further process the output to produce a graphical representation of the network, which is easier to comprehend than the textual representation, especially if the network is large.

Remote Counter Readout. The IB standard [1] defines performance counters for each port of an IB device. Tools such as perfquery can be used to remotely read out IB counters. These counters measure, e.g., the amount of transferred data in the absence of congestion, transferred data in the presence of congestion, errors, or changes of the physical state of the link. We use the term performance counter for the rest of the paper.

Counter Limitations. Performance counters are unsigned 32 bit wide saturating counters. Each counter step represents a doubleword of transferred data. Thus, the maximum amount of transferred data which can be measured by an IB performance counter is (2^32 − 1) · 32 bit ≈ 16 Gbyte. After counter saturation, no further traffic measurements can be conducted unless the counter is reset. IB uses 8b/10b line coding, i.e., the effective data rate is 20% lower than the line rate. Given a 40 Gbit/s QDR link, the saturation time of the corresponding performance counter is calculated as follows:

saturation time = max. counter value / max. transfer speed = ((2^32 − 1) · 32 bit) / ((4/5) · 40 · 1024^3 bit/s) ≈ 4.0 s

Hence, we may underestimate the counter value if the readout interval is greater than 4 seconds.
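The saturation-time calculation above can be reproduced directly; this is a sketch following the text's numbers, where the 1024^3 factor reflects the binary interpretation of 40 Gbit/s used in the formula:

```python
# Saturation time of a 32-bit IB performance counter on a 40 Gbit/s QDR link.
MAX_COUNTER_VALUE = (2**32 - 1) * 32   # bits (one counter step = 32 bit)
LINE_RATE = 40 * 1024**3               # bit/s (40 Gbit/s QDR, binary prefix)
EFFECTIVE_RATE = LINE_RATE * 4 // 5    # 8b/10b coding: 20% overhead

saturation_time = MAX_COUNTER_VALUE / EFFECTIVE_RATE
print(round(saturation_time, 3))  # 4.0 -> counters must be read out faster
```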
3 Features
IBPM offers innovative measurement and analysis methods for IB networks. As a basic feature, our tool provides automatic topology extraction and visualization. Besides the technical view on the network, statistics are available as well. To add value to the plain topology view, additional performance information can be included as an overlay. IBPM features include traffic locality visualization, measurement of congestion, and measurement of link utilization. In the following, we present the features traffic locality and link utilization.

Traffic Locality. We define the traffic locality as follows. Let X be a set of connected nodes. We consider a node x ∈ X. The functions out(x) and in(x) denote traffic rates leaving or entering the set X in x, and the functions gen(x) and con(x) denote the traffic rates generated or consumed in x. The locality of generated traffic is defined by

l_gen(X) = (Σ_{x∈X} (gen(x) − out(x))) / (Σ_{x∈X} gen(x)) = 1 − (Σ_{x∈X} out(x)) / (Σ_{x∈X} gen(x))
and the locality of consumed traffic is

l_con(X) = (Σ_{x∈X} (con(x) − in(x))) / (Σ_{x∈X} con(x)) = 1 − (Σ_{x∈X} in(x)) / (Σ_{x∈X} con(x)).

The locality of all traffic with respect to node set X is

l(X) = 1 − (Σ_{x∈X} out(x) + Σ_{x∈X} in(x)) / (Σ_{x∈X} gen(x) + Σ_{x∈X} con(x)).

Thus, the locality nicely shows the percentage of local traffic with respect to the overall traffic. The concept of traffic locality is useful to assess whether nodes within a subtree of an IB network mainly communicate with each other or whether they heavily depend on communication with nodes outside their subtree. For a network with a tree structure, the visualization of the traffic locality is simple. Each subtree defines a set of nodes X for which the traffic locality can be computed; the obtained value can be associated with the root of the subtree, which serves to color a node map. For general network topologies, the node sets X for the computation of the traffic locality need to be explicitly defined.

Link Utilization. Link utilization profiles show the time-dependent utilization of a link. These profiles are a powerful tool to enable network administrators to identify bottleneck links. In case no bottleneck link exists in the network, the profiles can be used to carefully downsize the backbone network while still providing the desired quality of service (QoS) to the customers. In addition, time-averaged utilization values are used to color a network map.
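The overall locality l(X) defined above can be evaluated with a few lines of code; the per-node rates below are invented for illustration, each node given as a tuple (gen, con, out, in):

```python
def locality(nodes):
    """Overall traffic locality l(X) for a node set X, per the definition
    above. Each node is a tuple (gen, con, out, in) in a common rate unit."""
    gen = sum(n[0] for n in nodes)
    con = sum(n[1] for n in nodes)
    out = sum(n[2] for n in nodes)
    inn = sum(n[3] for n in nodes)
    return 1.0 - (out + inn) / (gen + con)

# two nodes of a subtree: 40 of 200 summed rate units cross the set boundary
subtree = [(60.0, 50.0, 10.0, 5.0), (40.0, 50.0, 10.0, 15.0)]
print(round(locality(subtree), 6))  # 0.8 -> 80% of the traffic is local
```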
4 Architecture and Implementation
IBPM is implemented in a modular fashion which makes it easy to extend. Figure 1 provides an overview of the program structure. Each module is displayed together with its main features that are currently implemented. The application core is formed by Nagios, a computer and network monitoring software, as controller, and Foswiki as the corresponding graphical user interface. Open-source packages used in the modules include Gnuplot, RRDtool, Graphviz, and many more. A monitoring run with IBPM normally consists of several steps. First, the topology of the network is automatically extracted, normalized, and stored for further processing. At regular intervals, performance data is collected from all nodes and switches in the network. In the configuration module, the user selects nodes and switches and specifies measurement periods for statistical analysis. After enough data is collected, the general analysis of both topology and performance data is invoked and the analytical results are computed. Eventually the analytical results are interpreted by choosing one of the proposed comprehensible views, i.e., as a network graph overlay or a statistic.
Fig. 1. Program structure of IBPM. Around the application core (Nagios and Foswiki), four stages are shown with their currently implemented features: (1) INPUT: Topology Discoverer (node and switch detection, link status detection) and Performance Collector (collection of PerfMgr data, reset of IB counters); (2) CONFIG: Configuration Module (node and switch selection, measurement period selection); (3) CALC: Calculation Module (calculation of port utilization, traffic share, and traffic locality); (4) OUTPUT: Topology Visualizer (simple and augmented fabric view, port utilization, traffic locality, congestion, job locality) and Statistics Visualizer (traffic share CDFs, congestion CDFs, histograms).
5 Conclusion
In this paper, we presented IBPM, an extensible framework for IB performance monitoring. IBPM offers a clear interface to network administrators to view a health map of the network and perform measurements. In addition, it provides many options for the visualization of the measured performance data. Our approach defines a set of visualization scenarios that are valuable for network administrators and operators of HPC data centers. Acknowledgments. The authors would like to thank science + computing ag for providing them with access to IB test and production networks to test and evaluate IBPM.
References
1. InfiniBand Trade Association: The InfiniBand Architecture Specification (2011), http://www.infinibandta.org/ (last visited November 2011)
2. Mellanox Technologies: Unified Fabric Manager Software for Data Center Management (2011), http://www.mellanox.com/ (last visited November 2011)
3. QLogic Corporation: InfiniBand Fabric Suite (2011), http://www.qlogic.com/ (last visited November 2011)
4. OpenFabrics Alliance: OpenSM and InfiniBand Diagnostic Utilities (2011), https://www.openfabrics.org/ (last visited November 2011)
A Workbench for Internet Traffic Analysis

Philipp M. Eittenberger and Udo R. Krieger

Faculty of Information Systems and Applied Computer Science, Otto-Friedrich University Bamberg, Germany
[email protected]
Abstract. The specification and development of models for Internet traffic is mainly conducted by measurement-driven performance modeling based upon statistical analysis. Yet this type of analysis can be a challenging task due to the complexity and, especially with large sample sizes, the sheer quantity of the data. For this reason, preliminary data examination by graphical means and appropriate visualization techniques can be of great value. In this paper, we present the recent developments of the network traffic analyzer Atheris. Atheris has been designed specifically for measuring network traffic and for visualizing its inherent properties. As part of its functionality, Atheris performs traffic measurements at the packet layer, data extraction, and flow analysis, and enables the visual inspection of statistical characteristics.
1 Introduction
Measurement-driven traffic modeling typically involves trace-based analysis of captured packet streams to detect, identify, and quantify their intrinsic characteristics. Such a statistical analysis is a multi-layered endeavor. For the purpose of performance modeling, graphical methods are intuitive and appealing means for a preliminary examination of the data, especially when the sample size is massive. As an example, the initial process of distribution selection is usually a combination of visual inspection and summary statistics. After the packet capture has finished, the usual procedure for performance modeling is the following: At first, the metrics of interest (such as flow or session features, the packet size distribution, etc.) are exported to a text file. Subsequently, the data file is imported for further statistical analysis into the standard software of choice, e.g. R or Octave. Of course, it would be beneficial to get a first glance at common criteria without this media discontinuity. Since such a statistical analysis is an explorative process that is normally conducted iteratively, the repetition of exporting and re-importing data sets can be an annoying and time-consuming task. Despite numerous measurement studies (e.g. [9] or [6] among many others), publicly available measurement tools which are tailored to analyze these traffic features are rarely provided yet. To summarize: for the purpose of performance modeling, visual inspection can be an integral complement to summary statistics, and there is no solution available that provides powerful visualization techniques. In order to overcome this deficiency, we have developed

J.B. Schmitt (Ed.): MMB & DFT 2012, LNCS 7201, pp. 240–243, 2012. © Springer-Verlag Berlin Heidelberg 2012
Visualizing Internet Traffic with Atheris
Fig. 1. Visualization Window of Atheris
the traffic analyzer Atheris, based on our teletraffic analysis concept [10]. The target of this development was not only to enable straight-through processing, but also to lay the foundations for a unified measurement workbench. We presented a demonstration of the first version of Atheris at the poster session of the P2P 2010 conference [8], and a more detailed description of the first version was presented at the PDP 2011 conference [7]. In this paper we illustrate the recent developments, which extend and enrich the functionality of Atheris.
2 Atheris
Atheris is developed in Java 1.6 due to Java's platform independence and the availability of suitable libraries. The first version of Atheris used Jpcap [3] as a Java wrapper for the libpcap/WinPcap library. Libpcap [4] and its Windows counterpart WinPcap [5] intercept the packets in the kernel space and copy them to the user space, so that traffic analyzers can dissect and store them. Libpcap and WinPcap are, of course, written in C; therefore, one needs a wrapper to get the packets into Java. Since the development of Jpcap has been discontinued and we encountered serious limitations and bugs, we migrated Atheris to jNetPcap [2], which now serves as the wrapper for the libpcap library. After the completion of the migration to the jNetPcap wrapper, the open-source version of Atheris was made publicly available [1]. The new version of Atheris also includes the following new functionality: We redesigned its GUI from scratch (see Figure 1 for the new visualization window) and included drag-and-drop functionality to ease its usability. Its workbench functionality has been
Fig. 2. Multithreaded Capture Engine of Atheris
extended, i.e. it is now possible to export the packets of selected flows or conversations to Wireshark or to CSV files. As part of ongoing work, this export functionality will be extended to incorporate export to SQL databases and to the statistical software R. Finally, the addition of new plots to Atheris has been simplified. Due to its modular architecture, existing plots can easily be extended, and to add a new plot, only one interface needs to be implemented.
3 Multi-threaded Packet Analysis
We have now implemented a fully multi-threaded approach to exploit the power of the modern generation of CPUs. Special attention had to be paid to the synchronization of the method calls to the underlying libpcap library. Since libpcap is single-threaded and thus not reentrant, every call to the libpcap library must be properly synchronized. Without synchronization of the method invocations, each parallel function call to libpcap can lead to a crash of libpcap and, due to that, a crash of the Java virtual machine. Figure 2 illustrates the core architecture of the capture engine. Upon start of a capture session, one thread per interface is used to capture the packets and dump them to the storage device. Of course, as Atheris relies on the libpcap/WinPcap APIs, the usual filter expressions can be applied too, e.g. to capture only the traffic of one IP address. The packets are dumped to the storage device in order to avoid
excessive memory consumption and to have the possibility of splitting large traces (e.g. over 4 GB) into several smaller files. After a packet is dumped, another dedicated thread reads the packets from the packet dump and puts them into two queues. One queue is used to display the packets in the GUI. The other queue is part of a producer-consumer pattern: the analysis thread "consumes" the packets and "feeds" them to the various analyzers. These analyzers can be separated into two distinct groups: The layer analyzers collect layer-specific information, like the total number of packets and bytes seen up to a given time, time stamps, port numbers, etc., with regard to the protocol of the particular layer (e.g. IP or TCP/UDP). The connection analyzers collect the same information, but for a distinct connection, which can be a flow or a conversation (bi-flow). In addition, the evaluation of statistical session information is carried out by this thread too. For high-speed links, e.g. Gbit/s links, there is also the possibility of just analyzing the packets without storing them. This is beneficial to avoid storage exhaustion and to prevent the capture throughput from depending on the read/write speed of the storage device.
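The described capture architecture, one lock-guarded capture thread per interface feeding a GUI queue and a producer-consumer analysis queue, can be sketched as follows. Atheris itself is written in Java; this Python sketch replaces the actual libpcap calls with a stand-in and only illustrates the threading pattern.

```python
import queue
import threading

pcap_lock = threading.Lock()  # libpcap is not reentrant: serialize all calls

def capture(iface, analysis_q, gui_q, n_packets):
    """Capture thread: every call into the (single-threaded) capture library
    is guarded by pcap_lock; packets are fanned out to two queues."""
    for i in range(n_packets):
        with pcap_lock:          # synchronized stand-in for a pcap_next() call
            pkt = (iface, i)     # stand-in for a captured packet
        gui_q.put(pkt)           # queue 1: display in the GUI
        analysis_q.put(pkt)      # queue 2: producer-consumer analysis

def analyze(analysis_q, results):
    """Consumer: feeds packets to the analyzers (here: just collects them)."""
    while True:
        pkt = analysis_q.get()
        if pkt is None:          # sentinel: capture finished
            break
        results.append(pkt)

analysis_q, gui_q, results = queue.Queue(), queue.Queue(), []
consumer = threading.Thread(target=analyze, args=(analysis_q, results))
consumer.start()
producers = [threading.Thread(target=capture, args=(i, analysis_q, gui_q, 100))
             for i in ("eth0", "eth1")]   # one capture thread per interface
for t in producers:
    t.start()
for t in producers:
    t.join()
analysis_q.put(None)
consumer.join()
print(len(results))  # 200
```

Without `pcap_lock`, two interface threads could enter the non-reentrant library concurrently, which in the real Java/jNetPcap setting crashes libpcap and with it the JVM.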
4 Conclusion
In this paper, we presented the recent development of the open source traffic analyzer Atheris. We reported about its new functionality and its multi-threaded packet processing design. To the best of our knowledge Atheris is the first publicly available multi-threaded implementation of a graphical traffic analyzer. In future releases we plan to incorporate P2P functionality to enable the monitoring of applications in a fully distributed manner.
References
1. Atheris, https://sourceforge.net/projects/atheris/
2. jNetPcap, http://jnetpcap.com/
3. Jpcap, http://netresearch.ics.uci.edu/kfujii/jpcap/doc
4. Libpcap, http://www.tcpdump.org
5. WinPcap, http://www.winpcap.org
6. Baset, S., Schulzrinne, H.: An analysis of the Skype peer-to-peer internet telephony protocol. In: INFOCOM 2006, pp. 1–11 (2006)
7. Eittenberger, P.M., Krieger, U.: Atheris: A first step towards a unified peer-to-peer traffic measurement framework. In: 19th Euromicro International Conference on Parallel, Distributed and Network-Based Computing (PDP 2011), pp. 574–581 (2011)
8. Eittenberger, P.M., Krieger, U., Biernacki, A., Markovich, N.: Integrated measurement and analysis of peer-to-peer streaming traffic by the Java tool Atheris. In: P2P 2010, pp. 155–157 (2010)
9. Hei, X., Liang, C., Liang, J., Liu, Y., Ross, K.W.: Insights into PPLive: A measurement study of a large-scale P2P IPTV system. In: Proc. of IPTV Workshop, International World Wide Web Conference (2006)
10. Markovich, N.M., Biernacki, A., Eittenberger, P., Krieger, U.R.: Integrated Measurement and Analysis of Peer-to-Peer Traffic. In: Osipov, E., Kassler, A., Bohnert, T.M., Masip-Bruin, X. (eds.) WWIC 2010. LNCS, vol. 6074, pp. 302–314. Springer, Heidelberg (2010)
A Modelling and Analysis Environment for LARES

Alexander Gouberman, Martin Riedl, Johann Schuster, and Markus Siegle

Institut für Technische Informatik, Universität der Bundeswehr München
{firstname.lastname}@unibw.de

Abstract. This paper presents a toolset for modelling and analysis of fault tolerant systems based on LARES (LAnguage for REconfigurable Systems), an easy-to-learn formalism that allows its users to describe system structure, dynamic failure and repair behaviour.
1 Introduction
Modern societies rely on the correct and timely functioning of complex systems, e.g. in the communication, transportation or energy sectors, which makes it extremely important that we understand and are able to improve the dependability of these systems. In this short paper, we present the LARES toolset, a new software library which supports dependability modelling and analysis. It is designed to be integrated into the development cycle of modern IT-based systems, since it offers transformations from / to various other modelling languages. At the core of the toolset is the domain-specific LARES language, first presented in [3] and since then employed in several modelling case studies, e.g. [4]. This language allows the user to specify in a clear and concise way all aspects of a system which are relevant for its dependability. Apart from basic combinatorial failure conditions, complex error situations such as error propagation or common-cause failures, arbitrary redundancy structures and different repair strategies can be modelled in a strictly modular and hierarchical fashion. The LARES toolset, presented here for the first time, does not contain its own analysis engine, but relies on external tools for qualitative and quantitative analysis. For this purpose, transformations to various target formalisms such as stochastic process algebra (SPA), stochastic Petri nets (SPN) or some simulation language need to be provided. Some of these have already been implemented; in particular, the present paper focuses on the transformation to the SPA tool CASPA [1], which consists of several steps, as discussed below.
2
The LARES Modelling Language
For explaining how modularity and hierarchy can be modelled in LARES, a small example of a fault-tolerant system is given in Fig. 1. The system RedComp consists of two components iC and iSer whose descriptions are captured inside the (abstract) module definitions mComp and mSerialC. The module mComp includes a behaviour bComp which defines the state space for each instance of this module. Concretely, the lifetime of the component iC is phase-type distributed with an initial state Good, an erroneous state Error and an absorbing state Failed indicating a failure of the component. From the erroneous state the component can recover to the state Good with rate 0.2 or finally fail with rate 0.3. The component iSer consists of two subcomponents iComp[1] and iComp[2] which have the same lifetime distribution as iC, since they are instantiated from the same abstract module mComp. Inside a module, conditions can be defined which capture information about the states of the subcomponents; e.g., the component iSer fails (defined in the condition failed) if at least one of its subcomponents fails. This condition is used to lift state information towards the system level in the structural model hierarchy. On the level of the system, if both components iC and iSer fail, the event intEvent is generated, which immediately forces the system to fail. Besides its two components, the system has an additional error behaviour bErr: it can fail either if the guard is triggered (if both components fail) or due to an external event after an exponentially distributed lifetime with rate 0.082. Since the system does not provide an explicit start state for bErr, the first occurring state is used.

Behavior bErr(externalEventRate) {
  Transitions from Good
    if → Failed, delay exponential externalEventRate
    if <intEvent> → Failed
}
System RedComp : bErr(externalEventRate = 0.082) {
  Behavior bComp {
    Transitions from Good
      if → Error, delay exponential 0.1
    Transitions from Error
      if → Good, delay exponential 0.2
      if → Failed, delay exponential 0.3
  }
  Module mComp : bComp {
    Condition failed = bComp.Failed
    Initial init = bComp.Good
  }
  Module mSerialC {
    expand (i in {1..2}) {
      Instance iComp[i] of mComp
    }
    Condition failed = OR[i in {1..2}] iComp[i].failed
  }
  Instance iSer of mSerialC
  Instance iC of mComp
  iSer.failed & iC.failed guards bErr.<intEvent>
}

Fig. 1. Fault Tolerant Example Model

J.B. Schmitt (Ed.): MMB & DFT 2012, LNCS 7201, pp. 244–248, 2012. © Springer-Verlag Berlin Heidelberg 2012
In this model we restrict ourselves to exponentially distributed time delays (via phase-type distributions) in order to be able to analyse the model exactly with Markov chain methods. The LARES language also allows for general distributions, in which case the model needs to be analysed with simulative methods.
3
Tooling and Transformations
Fig. 2 shows the tools used (dark grey filled rectangles) and the different transformation steps into specific formalisms such as SPN or SPA. A LARES model
Fig. 2. Tools and Transformation Workflows
can be specified using an editor component that supports the user with syntax highlighting, auto-completion and, of course, syntactic and partial semantic checks of the model's correctness. A multistage transformation first evaluates all expressions, thereby expanding all parameter-dependent statements, then resolves the hierarchical indirections over Condition statements, leading to resolved logical propositions on states used inside the guards statements, and finally resolves the indirections of the guard label references over the hierarchy. These steps yield the LARESBASE subset, from which two different types of transformation are supported.

Firstly, the transformation into a stochastic process algebra, where synchronisation is used to handle the different cases, i.e.: Does a certain state of a component contribute to a valid generative composed state? Which reaction, based on the transitions referred to by a guard label reference, could follow? From this, a sequential stochastic process is generated and integrated by synchronisation information to specify which transitions have to interact with their environment. While decomposing the LARES hierarchy, the SPA composition tree PACT is built, consisting of the individual processes at its leaves and the subsequently generated intermediate composition nodes with their synchronisation sets. The generated composition structure is then decomposed, resulting in an SPA specification which can be analysed using the CASPA solver. Secondly, the hierarchy can be resolved, leading to a flat stochastic Petri net (flatSPN) with enabling conditions for transitions. Based on this model, either a reachability analysis can be performed, resulting in a transition system (TRA file), or alternatively it can be translated to an eDSPN model to apply TimeNET [5] or to an SPNP model [2].

Applying the transformation workflow into CASPA SPA, we can automatically derive certain measures. Fig. 3 shows a curve obtained by transient analysis of our example for a number of timepoints for a given state measure defined on the Failed state of the system error model. Since the model has an absorbing state (i.e. components cannot recover), the curve converges to one.

Fig. 3. Analysis Example (Prob[Failed] over time)
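Such a transient analysis can be sketched directly on the generator matrix of the example's bComp behaviour, using the rates 0.1, 0.2 and 0.3 from Fig. 1. The following is a minimal Python illustration using uniformization; it is not part of the toolset, and the function name is our own:

```python
import math

# CTMC of the bComp behaviour: states Good, Error, Failed (absorbing),
# with the rates 0.1, 0.2, 0.3 from the example model.
Q = [[-0.1, 0.1, 0.0],
     [ 0.2, -0.5, 0.3],
     [ 0.0,  0.0, 0.0]]

def transient(Q, p0, t, eps=1e-12):
    """Transient distribution p(t) = p0 * exp(Q t), computed by uniformization."""
    n = len(Q)
    lam = max(-Q[i][i] for i in range(n)) or 1.0   # uniformization rate
    # embedded DTMC P = I + Q / lam
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / lam for j in range(n)]
         for i in range(n)]
    v = p0[:]                      # p0 * P^k, updated iteratively
    w = math.exp(-lam * t)         # Poisson weight e^(-lam t) (lam t)^k / k!
    total, k = w, 0
    p = [w * x for x in v]
    while total < 1.0 - eps and k < 100_000:
        k += 1
        v = [sum(v[i] * P[i][j] for i in range(n)) for j in range(n)]
        w *= lam * t / k
        total += w
        p = [p[j] + w * v[j] for j in range(n)]
    return p

# probability of having reached Failed by t = 20, starting in Good
p_failed_20 = transient(Q, [1.0, 0.0, 0.0], 20.0)[2]
```

As t grows, the probability mass accumulates in the absorbing Failed state, mirroring the convergence to one visible in Fig. 3.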
4
Implementation Details
A textual editor for LARES has been implemented using Eclipse TMF Xtext in order to provide a suitable modelling environment to the user (a graphical editor is under development). Xtext provides an easy way to implement domain-specific languages such as LARES. Moreover, we chose the Scala language to implement the transformations into the solver formalisms; its functional and object-oriented concepts helped us obtain code which stays close to the mathematical formalisation. The abstract syntax tree definition has been built using algebraic data types (i.e. Scala case classes), and the concrete syntax has been implemented using Scala parser combinators. By applying the root parser to a LARES specification, the abstract syntax tree is created, i.e. a model is loaded. Next, the transformation has to be performed. Classically, the visitor pattern is applied to ensure a separation between the abstract syntax tree implementation and the transformation code. As the tree consists of algebraic data types, one can use Scala pattern matching for decomposition instead of traversing the tree with the visitor concept, while retaining this separation. The LARES instance tree is traversed multiple times to apply all transformation steps described above. After thus obtaining the intermediate PACT representation, a composition routine is executed, resulting in the SPA model. Lastly, a simple model-to-text transformation is performed on the SPA model, whose result is subsequently handed over to the CASPA solver.

Transformation validation can be performed, which helps to assure the correctness of the transformation workflows even when changes are applied to the code. This is done using ScalaCheck, a unit testing library that allows a number of test cases to be defined. When performing a test, two different workflows are executed, resulting in two transition systems on which a comparison (i.e. a bisimulation equivalence check) is performed. ScalaCheck finally states whether the two transition systems are bisimilar, or whether an error occurred during the transformation. Whenever the code changes, all test cases are evaluated again.
5
Conclusion
We have presented an environment in which LARES dependability models can be specified and analysed. We introduced the LARES formalism by example and gave insight into the different transformation steps which lead to an analysable model. Moreover, we briefly touched on transformation validation. Readers interested in the toolset are asked to contact the authors.

Acknowledgments. We would like to thank the Deutsche Forschungsgemeinschaft (DFG), which supported this work under grant SI 710/7-1, and acknowledge partial support by the DFG/NWO Bilateral Research Programme ROCKS.
References
1. Bachmann, J., Riedl, M., Schuster, J., Siegle, M.: An Efficient Symbolic Elimination Algorithm for the Stochastic Process Algebra Tool CASPA. In: Nielsen, M., Kučera, A., Miltersen, P.B., Palamidessi, C., Tůma, P., Valencia, F. (eds.) SOFSEM 2009. LNCS, vol. 5404, pp. 485–496. Springer, Heidelberg (2009)
2. Ciardo, G., Muppala, J., Trivedi, K.: SPNP: Stochastic Petri Net Package. In: Proc. of the Third Int. Workshop on Petri Nets and Performance Models, pp. 142–151 (December 1989)
3. Gouberman, A., Riedl, M., Schuster, J., Siegle, M., Walter, M.: LARES – A Novel Approach for Describing System Reconfigurability in Dependability Models of Fault-Tolerant Systems. In: ESREL 2009: Proc. of the European Safety and Reliability Conf., pp. 153–160. Taylor & Francis Ltd. (2009)
4. Walter, M., Lê, M.: Clear and Concise Models for Fault-Tolerant Systems with Limited Repair using the Modeling Paradigm LARES+. In: 19th AR2TS Advances in Risk and Reliability Technology Symposium, pp. 310–321 (2011)
5. Zimmermann, A., Knoke, M., Huck, A., Hommel, G.: Towards version 4.0 of TimeNET. In: German, R., Heindl, A. (eds.) MMB, pp. 473–476. VDE Verlag (2006)
Simulation and Statistical Model Checking for Modestly Nondeterministic Models

Jonathan Bogdoll, Arnd Hartmanns, and Holger Hermanns

Saarland University, Computer Science, Saarbrücken, Germany

Abstract. Modest is a high-level compositional modelling language for stochastic timed systems with a formal semantics in terms of stochastic timed automata. The analysis of Modest models is supported by the Modest Toolset, which includes the discrete-event simulator modes. modes handles arbitrary deterministic models as well as models that include nondeterminism due to concurrency through the use of methods inspired by partial order reduction. In this paper, we present version 1.4 of modes, which includes several enhancements compared to previous prototypical versions, such as support for recursive data structures, interactive simulation and statistical model checking.
1
Introduction
Modest [6] is a high-level compositional modelling language based on a formal semantics in terms of stochastic timed automata (STA) that provides an expressive syntax with features such as recursive processes, user-defined functions and exception handling. STA are a rich semantic model that includes nondeterministic and discrete probabilistic choices as in probabilistic automata (PA, [9]), hard real-time behaviour as in timed automata (TA, [1]) as well as stochastic sampling and delays according to arbitrary probability distributions. In fact, many well-known models, such as Markov chains, PA or TA are special cases of STA, and most are easy to identify on the syntactic level in Modest. The analysis of models specified in Modest is supported by the Modest Toolset, available at www.modestchecker.net, which provides several tools for model-checking different subsets of Modest/STA as well as a discrete-event simulator, modes, that supports almost the entire Modest language. The tools are integrated into a graphical modelling and analysis environment. This paper focuses on modes, which was released in a prototypical version in the first half of 2011 [5], and has been significantly extended since. The focus of modes is to allow simulation of nondeterministic models in a sound way. While most discrete-event simulators rely on hidden schedulers to resolve nondeterministic choices, which may influence the results in unexpected ways [8], modes uses methods inspired by partial order reduction [2] to decide, on-the-fly, whether any nondeterminism it encounters can be safely resolved in an arbitrary way, or whether doing so could skew the simulation results.
This work has been supported by the DFG as part of SFB/TR 14 AVACS and by the DFG/NWO bilateral research project ROCKS. It has received funding from the EU FP7 programme as part of the MEALS project, grant agreement no 295261.
J.B. Schmitt (Ed.): MMB & DFT 2012, LNCS 7201, pp. 249–252, 2012. c Springer-Verlag Berlin Heidelberg 2012
Since its original presentation, modes has continually been improved and extended in order to make it more robust, applicable to more case studies, and more user-friendly. To aid with model debugging, a new interactive simulation mode has been added. Aside from the original case studies presented in [5] (Arcade [7] dependability evaluation models and the IEEE 802.3 binary exponential backoff protocol), modes has since been applied to the analysis of wireless sensor networks [3] and to network delay and queueing models as part of the Data Networks course taught at Saarland University in summer 2011. The unprejudiced use by roughly 100 students has greatly improved the tool's robustness.
2
Language Enhancements
Modest is a modelling language combining features from process algebra with convenient constructs from programming languages, with a focus on succinctness and expressivity. The interested reader is invited to refer to [6] for details concerning the language design and setup. To improve the usability and applicability of modes, we added four extensions that are fully supported by modes: recursive data structures, user-defined functions, binary and broadcast synchronisation of actions, and a new delay keyword to succinctly specify stochastic delays.

Data Structures and Functions. The original Modest language definition was abstract w.r.t. the handling of data. Consistent and expressive means for data manipulation, however, are crucial to building complex, but readable models. Prior to the version presented in this paper, the only data types supported by modes were atomic types (like bool or real), fixed-size arrays thereof and C-like structs whose members had to be atomic types. We have replaced the latter with ML-like data types. For example,

  datatype list = { real hd, list option tl };

declares a linked list of real numbers. For an instance l of that type, l.hd accesses its head and l.tl accesses its tail, which is a list option, i.e. it can be none or a list. To define operations on these types, but also to perform more complex computations, modes supports user-defined functions; as an example, to compute the length of a list, one could use the following function len:

  function len(list option l) = if l == none then 0 else 1 + len(l!.tl);

These functions can be (mutually) recursive; however, if a function call on a simulation run does not terminate, neither will the run itself. User-defined data types and functions have been used, for example, to model and simulate a network with queues that prioritise packets according to length, using a sorted list.

New Synchronisation Modes.
Modest models are usually specified as the parallel composition of a set of concurrent processes, which then run asynchronously with the possibility of synchronising on certain action labels. As part of an effort to connect Modest to Uppaal [4], we have extended Modest to support CCS-style binary and Uppaal-style broadcast communication in addition to the previously available CSP-style multi-way synchronisation. These additions,
which greatly simplify certain modelling scenarios and allow more concise value passing via global variables (because the assignments of a sender (a! {x := 7}) are performed before the assignments of the receivers (a? {y := x})), are fully supported by modes. A Shorthand for Stochastic Delays. Delays in a timed model can be specified using guards and deadlines; for example, if c is a clock, the Modest behaviour when(c ≥ 2) urgent(c ≥ 4) tau will result in an edge labelled tau being available for two time units starting when c ≥ 2; as soon as c reaches 4, this edge has to be taken. To specify stochastic delays, a value is first sampled from some probability distribution and then used as above: {= c = 0, x = Exp(λ) =}; when(c ≥ x) urgent(c ≥ x) tau causes tau to be executed precisely after an amount of time that is exponentially distributed with rate λ. To improve readability and lower the initial learning curve for models using such stochastic or deterministic delays, we have added a delay shorthand, so one can simply write delay(Exp(λ)) tau instead.
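The recursive list type and len function from the Data Structures paragraph above can be mirrored in Python. This is only an illustrative analogue of the Modest snippet, with helper names of our own choosing:

```python
from dataclasses import dataclass
from typing import Optional

# Python analogue of Modest's  datatype list = { real hd, list option tl }
@dataclass
class List:
    hd: float
    tl: Optional["List"]  # 'list option': either None or another List

def length(l: Optional[List]) -> int:
    # mirrors: function len(list option l) = if l == none then 0 else 1 + len(l!.tl)
    return 0 if l is None else 1 + length(l.tl)

def insert_sorted(l: Optional[List], x: float) -> List:
    """Sorted insertion, as in the length-prioritising packet queue example."""
    if l is None or x <= l.hd:
        return List(x, l)
    return List(l.hd, insert_sorted(l.tl, x))

def to_python(l: Optional[List]) -> list:
    """Flatten the recursive list into a plain Python list (for inspection)."""
    return [] if l is None else [l.hd] + to_python(l.tl)
```

Sorted insertion corresponds to the packet queue that prioritises packets according to length, as mentioned in the text.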
3
Towards Statistical Model Checking
When preparing the simulation of a model, two important decisions that have to be made are when to terminate a single simulation run and how many simulation runs to perform in order to obtain sufficient confidence in the results computed. Termination Criteria for Runs. In previous versions, modes could be instructed to terminate a simulation run after a fixed number of steps, after a certain amount of model time, or when a given predicate evaluates to true for the first time. When simulating to compute the value of some property, e.g. the expected time until a packet is transmitted, these criteria were problematic: A run would either be aborted prematurely, i.e. when the transmission was not complete and the time was not yet known, or it would continue long after the packet had arrived, wasting simulation time. To avoid these situations, modes now supports terminating simulation runs precisely at the moment when all properties specified in the model can be decided. Beyond Fixed Batch Sizes. In order to perform a statistical evaluation for a model property, the results of a number of simulation runs have to be collected. The accuracy of the result reported, e.g. specified by a standard deviation and a confidence interval, depends on how rare the event is and how many simulation runs were performed. Like many other simulators, previous versions of modes required the number of runs to be specified a priori. This had the unfortunate consequence that either too many runs were performed for a frequent event, or too few for a rare event—in which case another, larger batch of runs had to be performed, wasting the previous runs. modes version 1.4 offers two approaches to handle this problem: Keeping the current statistical evaluation based on confidence intervals with a user-selected confidence level (default 95 %), the simulator can now be instructed
to keep generating new runs only until the confidence interval becomes smaller than a specified width (an absolute value or relative to the standard deviation) for properties computing a value (e.g. an expected value or a probability), or until the one-sided confidence limit confirms or contradicts the bound in a Boolean property (e.g. “is the expected time to reach a safe state below x?”). For Boolean properties comparing a probability to some bound, modes 1.4 can also use sequential probability ratio testing [10], using an indifference region and risk levels specified by the user. Usage of this test is typically called statistical model checking, which has the advantage of requiring only a very low number of runs if the bound is far from the actual probability. For better usability, the risk levels are computed from the global confidence level in a symmetric fashion by default, but may be adjusted if desired.
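The Boolean-property test described above is Wald's sequential probability ratio test [10]. A generic, self-contained sketch of it follows; this is not the tool's actual implementation, and the parameter names are ours:

```python
import math
import random

def sprt(sample, theta, delta=0.01, alpha=0.05, beta=0.05, max_runs=1_000_000):
    """Wald's SPRT for the hypothesis P(event) >= theta + delta versus
    P(event) <= theta - delta, with indifference region half-width delta
    and risk levels alpha, beta."""
    p0, p1 = theta - delta, theta + delta
    lo = math.log(beta / (1 - alpha))   # accept 'bound does not hold' below this
    hi = math.log((1 - beta) / alpha)   # accept 'bound holds' above this
    llr, n = 0.0, 0
    while lo < llr < hi and n < max_runs:
        x = sample()                    # one simulation run -> True/False
        llr += math.log(p1 / p0) if x else math.log((1 - p1) / (1 - p0))
        n += 1
    return llr >= hi, n                 # (does the bound hold?, runs used)

# e.g. testing "P(event) > 0.1" against runs whose true probability is 0.9
random.seed(42)
holds, runs = sprt(lambda: random.random() < 0.9, theta=0.1)
```

When the bound is far from the true probability, the log-likelihood ratio drifts quickly towards one of the two thresholds, which is why very few runs suffice in that case.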
4
Conclusion
In this paper, we have presented modes 1.4, a discrete-event simulator for the Modest language that still is, to our knowledge, the only simulation tool that can deal with (some class of) nondeterministic models in a sound way. Compared to previous releases, which we rather consider research prototypes, version 1.4 has seen significant improvements in terms of ease of use, in particular ease of modelling and support for statistical model checking, as well as in robustness.
References
1. Alur, R., Dill, D.L.: A theory of timed automata. Theoretical Computer Science 126(2), 183–235 (1994)
2. Baier, C., D'Argenio, P.R., Größer, M.: Partial order reduction for probabilistic branching time. Electr. Notes in Theor. Comput. Sci. 153(2), 97–116 (2006)
3. Baró Graf, H., Hermanns, H., Kulshrestha, J., Peter, J., Vahldiek, A., Vasudevan, A.: A verified wireless safety critical hard real-time design. In: IEEE WoWMoM 2011. IEEE (2011)
4. Behrmann, G., David, A., Larsen, K.G.: A Tutorial on Uppaal. In: Bernardo, M., Corradini, F. (eds.) SFM-RT 2004. LNCS, vol. 3185, pp. 200–236. Springer, Heidelberg (2004)
5. Bogdoll, J., Ferrer Fioriti, L.M., Hartmanns, A., Hermanns, H.: Partial Order Methods for Statistical Model Checking and Simulation. In: Bruni, R., Dingel, J. (eds.) FORTE 2011 and FMOODS 2011. LNCS, vol. 6722, pp. 59–74. Springer, Heidelberg (2011)
6. Bohnenkamp, H.C., D'Argenio, P.R., Hermanns, H., Katoen, J.P.: MoDeST: A compositional modeling formalism for hard and softly timed systems. IEEE Transactions on Software Engineering 32(10), 812–830 (2006)
7. Boudali, H., Crouzen, P., Haverkort, B.R., Kuntz, M., Stoelinga, M.: Architectural dependability evaluation with Arcade. In: DSN 2008, pp. 512–521. IEEE CS Press (2008)
8. Hartmanns, A.: Model-Checking and Simulation for Stochastic Timed Systems. In: Aichernig, B.K., de Boer, F.S., Bonsangue, M.M. (eds.) FMCO 2010. LNCS, vol. 6957, pp. 372–391. Springer, Heidelberg (2011)
9. Segala, R.: Modeling and Verification of Randomized Distributed Real-Time Systems. Ph.D. thesis. MIT, Cambridge (1995)
10. Wald, A.: Sequential analysis. Wiley, New York (1959)
UniLoG: A Unified Load Generation Tool

Andrey Kolesnikov

Department of Computer Science, University of Hamburg, Vogt-Kölln-Str. 30, 22527 Hamburg
[email protected]
1
Motivation
Utilities for generating artificial (synthetic) loads are very important for analysing the performance and behaviour of networks and their offered services. Load generators implemented by the industry¹ are mainly dedicated hardware components with very high performance and stringent precision requirements. In research and academia, software-based load generators are commonly used because of their expected higher flexibility in operation and maintenance (e.g. due to easy deployment of constituent load generating modules in the network, code customisations for a specific research purpose, etc.), while components of real operating systems and protocol stacks can be used to guarantee realistic load generation at lower cost. However, many existing tools are dedicated to a specific modelling study (e.g., Guernica [1] along with its specific Dweb model for Web traffic, or Harpoon [2] modelling IP traffic flows) or focus on generating traffic at some specific interface in a network (e.g., ITG [3] and Brute [4] were designed to generate traffic at the UDP and TCP service interfaces). The proposed solutions often do not provide adequate flexibility, e.g. in case the underlying model is to be modified or a completely new model is to be used. Therefore, the unified load generator UniLoG is presented in this paper, which combines the specification and generation of network loads in one single coherent approach. The basic principle underlying the design and elaboration of UniLoG is to start with a formal description of an abstract load model by means of a finite user behaviour automaton (UBA, introduced in Sec. 2) and thereafter to use interface-specific adapters to map the abstract requests to the concrete requests as they are "understood" by the service-providing component at the real interface in question. An overview of the distributed UniLoG architecture is given in Sec. 3, and a concrete example of its practical use in QoS studies for video streaming is demonstrated in Sec. 4.
2
Load Specification Method
In our terms, a load L = L(E, S, IF, T) is defined as a sequence of requests offered by an environment E (which consists of virtual, i.e. modelled, service users) to a service system S (e.g., the TCP service) at a well-defined interface IF

¹ e.g., Agilent N2X, IXIA Test Applications, LANforge FIRE.
J.B. Schmitt (Ed.): MMB & DFT 2012, LNCS 7201, pp. 253–257, 2012. c Springer-Verlag Berlin Heidelberg 2012
(e.g., the TCP service interface) during the time interval T. Each abstract request in this sequence is a tuple (t_i, r_i), where t_i ∈ T denotes the arrival time of the abstract request r_i at IF, with t_i ∈ R, i = 1, 2, . . . , n, n ∈ N, and t_i ≤ t_j for i < j. The user of UniLoG (which may be, e.g., a single researcher, experimenter, test engineer or a whole quality assurance team) can choose the interface IF for load modelling depending on the objectives of the specific study to be carried out. For example, the TCP service interface is chosen in the use case presented in Sec. 4 in order to analyse a set of QoS metrics for RTSP video streaming over TCP under different TCP background loads. Further, the behaviour of one or many service users at IF is described by means of the corresponding UBA, using the following three kinds of user states to specify the possible sequences of abstract requests as induced by virtual service users:

1) Request- or R-states model the generation of requests of exactly one abstract request type (e.g., TCP connect or TCP send),
2) System- or S-states model the waiting for certain kinds of system events (e.g., receipt of a TCP socket error or blocking status message), and
3) Delay- or D-states are used to model the delays between subsequent requests or system events (e.g., interarrival times between TCP send requests).

The values of request attributes (e.g., dest_addr for modelling the IP address(es) of TCP receiver(s) or data_len for the payload length of TCP requests) in R-states and delays in D-states can be specified by means of different statistical distributions, traces (e.g., PCAP files preprocessed by tshark using its different dissectors and filters to extract the required request attributes) or as constants in very simple cases. The transitions between user states can be specified by means of transition probabilities (yet assuming a uniform distribution) or by means of recently implemented "guards" (i.e. conditions on variables from the global memory of the UBA which can evaluate either to true or to false). Finally, the execution of the parameterized UBA (PUBA) model results in a specific trajectory of the underlying stochastic process, and a concrete sequence of abstract requests is generated which conforms to the specified load L.
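As a toy illustration of this execution semantics (not UniLoG code; the state names, dictionary encoding and function are our own, and S-states are omitted for brevity), a two-state UBA with one D-state and one R-state can be executed as follows:

```python
import random

# Minimal PUBA sketch: a D-state samples a think time, an R-state emits one
# abstract request; successors are chosen by transition probabilities.
uba = {
    "D_think": {"kind": "D", "delay": lambda: random.expovariate(1.0),
                "succ": [("R_send", 1.0)]},
    "R_send":  {"kind": "R", "request": "TCP_send",
                "succ": [("D_think", 0.9), ("STOP", 0.1)]},
}

def run_uba(uba, start, horizon):
    """Execute the UBA until it stops or model time exceeds `horizon`,
    returning the abstract request sequence [(t_i, r_i), ...]."""
    t, state, trace = 0.0, start, []
    while state != "STOP" and t <= horizon:
        s = uba[state]
        if s["kind"] == "D":
            t += s["delay"]()          # advance model time
        elif s["kind"] == "R":
            trace.append((t, s["request"]))
        # choose a successor according to the transition probabilities
        r, acc = random.random(), 0.0
        for nxt, p in s["succ"]:
            acc += p
            if r < acc:
                state = nxt
                break
    return trace

random.seed(7)
reqs = run_uba(uba, "D_think", horizon=100.0)
```

The resulting trace is exactly a load in the sense above: a time-ordered sequence of (t_i, r_i) pairs, which an adapter would then map to real requests.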
3
Overview of the UniLoG Architecture
To provide a high degree of flexibility in generating traffic mixes of different structure and intensity for various scenarios, a distributed architecture has been designed for UniLoG (cf. Fig. 1). To create the load required to measure, e.g., the throughput of a link, or to test a web application running on a powerful server, the resources of a single workstation can be insufficient. Likewise, in order to analyse the performance of different routing mechanisms or transport protocols, a kind of geographical distribution of traffic sources (and sinks) may be required. UniLoG resolves this problem by using several load generating nodes, in each of which a special service called UniLoG load generation agent (or simply load agent) is installed. Load agents are responsible for the generation of traffic loads from virtual users and can be controlled by the experimenter from one central point in the network (management station, cf. Fig. 1) by means of different commands (e.g. loadPUBAModel(), start(), startAt(), stop(), etc.) encapsulated into HTTP messages for transport (a simplistic HTTP server with SSL/TLS support is included in each load
agent for this reason). Further, each load agent contains a generator component (generator of abstract requests, GAR), which is responsible for the execution of the PUBA model and the generation of initially abstract load requests according to the specifications therein. The actual generation and injection of real requests at the chosen service interface (e.g. the TCP service interface) is accomplished by means of the corresponding interface-specific adapter (e.g. the UniLoG.TCP adapter). Among others, values of parameters needed to generate real TCP requests (e.g., source and destination port numbers) can be supplied by the adapter in case they were omitted by the experimenter in the PUBA. Adapters for the UDP, TCP and IP service interfaces are already available, and a new adapter for generating realistic Web traffic loads at the HTTP service interface (using a real, remotely controlled Web browser) has been developed recently.
Fig. 1. Distributed architecture of the UniLoG tool
The performance of the adapters is characterised mainly by the maximum packet and data rate of the generated streams (i.e. UDP, TCP or IP streams) achievable by the corresponding adapter and the (mean) difference between the specified and actual packet injection times. For example, in a 1 Gbit/s Ethernet LAN, the UniLoG.IP adapter can generate IP streams with a data rate of up to 725 Mbit/s while spending ca. 18 µs per IP request on UBA execution and request preparation overhead. Compared to earlier work [5], significant progress has been achieved in the functionality of UniLoG by the integration of passive measurement tools like tshark, in order to obtain required parameters for the UBA models or to estimate characteristics of traffic loads observed and/or generated during the experiments. Moreover, the performance of each single load agent has been increased by a prototypical implementation of the TCP and UDP adapters on the basis of a real-time operating system (RTOS-32).
4
QoS Measurements for RTSP Streaming under Load
Along with many other scenarios, e.g. those mentioned in Sec. 3, UniLoG can be effectively used in QoS studies for video streaming applications, as shown in Fig. 1. Here, an H.264-coded HD video of ca. 10 min length is transmitted (with a mean required IP throughput of 5.7 Mbit/s and peaks up to 18.7 Mbit/s) from the VoD server located in the LAN segment to the VoD client in the WLAN, using the real-time streaming protocol (RTSP) for the streaming session and RTP over TCP for the real-time transport of video frames. In each streaming experiment, the background load in the WLAN is generated by UniLoG and consists of a fixed number of constituent TCP flows (from the load agents to the load sink) with the same required throughput², which remains constant during the experiment. In a series of experiments, the number of TCP flows is kept constant, but their required throughput is increased from 0.5 to 16 Mbit/s.
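The constant interarrival times used for the background flows follow directly from the segment size; a small sketch of the arithmetic (protocol header overhead ignored, as the footnote leaves it unspecified):

```python
# Constant interarrival time for one background TCP flow, assuming
# 1460-byte segments as in the experiment setup (header overhead ignored).
SEGMENT_BYTES = 1460

def interarrival_s(throughput_mbit_s: float) -> float:
    """Seconds between segment injections to sustain the required throughput."""
    return SEGMENT_BYTES * 8 / (throughput_mbit_s * 1e6)
```

At a required throughput of 1 Mbit/s this gives 11.68 ms between segments; at 16 Mbit/s, 0.73 ms.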
Fig. 2. IP throughput (left) and Jitter (right) of the RTSP/RTP video stream using TCP under different TCP background loads
Values of important quantitative QoS parameters like IP throughput, RTP jitter, packet loss, sequence errors and round-trip time are obtained from the PCAP traces captured at the VoD client and server, using the tshark tool combined with custom Python scripts (cf. Fig. 2). The comparison of the streaming characteristics obtained for stream delivery over the reliable TCP transport service with the results presented in [5] for the unreliable RTP/UDP transport can provide useful guidance (e.g., for network operators) for configuring reliable infrastructures for high-quality video streaming services.
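Metrics of this kind can be computed from a decoded trace; the sketch below assumes packets exported as (timestamp, IP length) pairs (e.g. with `tshark -T fields -e frame.time_epoch -e ip.len`). The helper names and the jitter definition (mean absolute deviation of interarrival times) are illustrative assumptions, not the authors' scripts:

```python
# Sketch: QoS metrics from a decoded PCAP trace given as (timestamp_s, ip_len_bytes).
def throughput_mbps(pkts, t0, t1):
    """Mean IP throughput over the window [t0, t1)."""
    bits = sum(length * 8 for ts, length in pkts if t0 <= ts < t1)
    return bits / (t1 - t0) / 1e6

def mean_jitter_ms(times):
    """Mean absolute deviation of packet interarrival times, in ms."""
    gaps = [b - a for a, b in zip(times, times[1:])]
    mean = sum(gaps) / len(gaps)
    return sum(abs(g - mean) for g in gaps) / len(gaps) * 1e3
```

A sliding window over `throughput_mbps` yields the per-second throughput curves of Fig. 2.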
² Using TCP segments of constant length (1460 bytes) and constant interarrival times chosen to achieve the required throughput of 0.5, 1, 2, ..., 16 Mbit/s for one TCP flow.

References

1. Pena-Ortiz, R., et al.: Dweb model: Representing Web 2.0 dynamism. Computer Communications 32(6), 1118–1128 (2009)
2. Sommers, J., Barford, P.: Self-Configuring Network Traffic Generation. In: Proc. of IMC 2004, Taormina, Sicily, pp. 68–80 (2004)
3. Avallone, S., Pescape, A., Ventre, G.: Analysis and experimentation of Internet Traffic Generator. In: Proc. of New2an 2004, St. Petersburg, Russia, pp. 70–75 (2004)
4. Bonelli, N., et al.: BRUTE: A High Performance and Extensible Traffic Generator. In: Proc. of SPECTS 2005, Philadelphia, pp. 839–845 (2005)
5. Kolesnikov, A., Kulas, M.: Load Modeling and Generation for IP-Based Networks: A Unified Approach and Tool Support. In: Müller-Clostermann, B., Echtle, K., Rathgeb, E.P. (eds.) MMB&DFT 2010. LNCS, vol. 5987, pp. 91–106. Springer, Heidelberg (2010)
Non Preemptive Static Priority with Network Calculus: Enhancement

William Mangoua Sofack and Marc Boyer
ONERA – The French Aerospace Lab – F31055 Toulouse, France
{William.Mangoua_Sofack,Marc.Boyer}@onera.fr
Abstract. The paper addresses the worst-case performance analysis of non-preemptive static priority scheduling within the network calculus theory. Previous studies exist, each one generalizing some others [8,1,7,3,10], needing weaker hypotheses or improving the accuracy of the results. This paper presents a very general result, with an accuracy that appears, on preliminary examples, to be as good as all the other ones¹.

Keywords: Network Calculus, Service curve, QoS, Worst case Performance, Real-Time Calculus.
1 Introduction
Embedded systems are increasingly communicating systems. For example, in a modern aircraft, there are more than a hundred computers connected to the avionic backbone, with several applications that exchange information in real time. Obviously, a correct behaviour of these applications must be guaranteed. From the point of view of the network, this means that any message transmitted over the network must arrive at its destination on time. This implies having a good bound on the end-to-end delay (known as the Worst Case Traversal Time, WCTT). The network calculus theory [4,8,2] provides a formal framework for modelling communication networks and computing such WCTTs.

The aim of this work is to reduce the pessimism of the method in a specific case. We consider a node shared by several flows, with a non-preemptive static priority policy. We are interested in the residual service left by the higher-priority flows to the lower-priority ones. This policy has already been studied in [8,1,7], but these results appear not to carefully consider the hypothesis of non-preemption. In particular, when a non-preemptive flow is served, it benefits from the full speed of the server, even if, from a long-term point of view, it gets only a fractional part. This assessment was also the starting point of [3], but it leaves two open questions: is there any analytical formulation of their result? And is the result tight? In this paper, an analytical expression (with its proof) is given, in a more general context.
¹ This work has been partially funded by the French ANR agency under project id ANR09-SEGI-009.
J.B. Schmitt (Ed.): MMB & DFT 2012, LNCS 7201, pp. 258–272, 2012. © Springer-Verlag Berlin Heidelberg 2012
Network calculus is first presented in Section 2.1, and the previous works [8,1,7,3] are detailed in Section 2.2. This presentation is necessary to understand our contribution, summarized in Section 2.3. Our modelling of the non-preemptive static priority policy in networks is recalled in Section 3. The technical presentation of our contribution can then be done (Section 4), with Theorem 41. Our approach is illustrated by examples, and the results are compared to existing ones (Section 5). Section 6 concludes.
2 State of the Art
The analysis of time-related performance during the design of embedded systems has been approached from various angles, such as scheduling analysis, completion time analysis and model checking in timed automata, each with limitations inherent to its field [11]. Network Calculus is a theory of deterministic queuing systems, developed to compute worst-case end-to-end communication bounds in networks [4,5,8].

2.1 Network Calculus
The network calculus analysis focuses on worst-case performances. The information about the system features is stored in functions, such as arrival curves characterising the traffic or service curves quantifying the service guaranteed at the network nodes. These functions can be combined together thanks to special network calculus operations, in order to compute bounds on buffer sizes or delays.

Mathematical background: the (min, +) dioid. Here are presented some operators of the (min, +) dioid used by network calculus. Beyond usual operations like the minimum or the addition of functions, network calculus makes use of several classical operations which are the translations of (+, ×) filtering operations into the (min, +) setting, as well as a few other transformations. Network calculus mainly uses non-decreasing functions and related operators. Here are those used in this article.

The set F. Let F denote the set of wide-sense increasing functions f : R → R ∪ {+∞} such that f(t) = 0 for t < 0.
The function [·]+ : x → max(x, 0).
The vertical deviation, defined for two functions f and g by v(f, g) = sup_{t≥0} {f(t) − g(t)}.
The horizontal deviation, defined for two functions f and g by h(f, g) = sup_{t≥0} {inf{d ≥ 0 | f(t) ≤ g(t + d)}}.
The min-plus convolution, defined for two functions f and g by (f ∗ g)(t) = inf_{0≤s≤t} {f(t − s) + g(s)}.
The positive and non-decreasing upper closure, defined for a function f by f↑(t) = [sup_{0≤s≤t} f(s)]+.
The pseudo-inverse. The inverse f⁻¹ of a function f ∈ F cannot always be assumed to exist; however, two pseudo-inverses can be defined [12]:
f⁻¹_inf(u) = inf{t | f(t) ≥ u} = sup{t | f(t) < u}    (1)
f⁻¹_sup(u) = sup{t | f(t) ≤ u} = inf{t | f(t) > u}    (2)

With these definitions, if u ≤ f(t), then t ≥ f⁻¹_inf(u); if f ∈ F and u ≥ f(t), then t ≤ f⁻¹_inf(u). Moreover, ∀t < f⁻¹_sup(u), f(t) ≤ u, and if f(t) > u, then t ≥ f⁻¹_sup(u). The equivalence of the definitions with strict inequality is given in [8, Th 3.1.2].
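On a discrete time grid, these pseudo-inverses are straightforward to compute. The following Python sketch is an illustration only (the rate-latency curve is an assumption, not taken from the paper) and uses the inf-based characterisations of eqs. (1)–(2):

```python
def f_inv_inf(f, u, horizon=1000):
    """f_inf^{-1}(u) = inf{t | f(t) >= u}, searched on an integer grid."""
    return next(t for t in range(horizon + 1) if f(t) >= u)

def f_inv_sup(f, u, horizon=1000):
    """f_sup^{-1}(u) = inf{t | f(t) > u} (second characterisation in eq. (2))."""
    return next(t for t in range(horizon + 1) if f(t) > u)

beta = lambda t: 2 * max(t - 3, 0)   # assumed rate-latency curve: rate 2, latency 3
assert f_inv_inf(beta, 4) == 5       # first t with beta(t) >= 4
assert f_inv_sup(beta, 4) == 6       # first t with beta(t) > 4
```

Quantities such as ψi = β⁻¹_sup(li) in Theorem 41 below are exactly such pseudo-inverses: the time by which the server guarantees li units of service.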
Network calculus: reality modelling. A network calculus model for a communication network consists of the three following components: a partition of the network into subsystems (often called nodes), which may have different scales (from elementary hardware like a processor to large subnetworks); a description of the data flows, where each flow follows a path through a specified sequence of subsystems and is shaped by some arrival curve just before entering the network; and a description of the behaviour of each subsystem, that is, service curves bounding the performances of each subsystem, as well as service policies in case of multiplexing (several flows entering the same subsystem and thus sharing its service).

In network calculus, the real flows are modelled by cumulative functions R ∈ F: R(t) counts the total amount of data produced by the flow up to time t. Consider a system S, which we view as a black box; S receives an input flow R(t) and delivers the data after a variable delay. Call R′(t) the output flow: R −S→ R′. We have the relation R′ ≤ R, meaning that data goes out after having entered. The system S might be, for example, a single buffer served at a constant rate, a complex communication node, or even a complete network. Figure 1 shows input and output functions for a single server queue.

The backlog is the amount of bits that are held inside the system; if the system is a single buffer, it is the queue length. In contrast, if the system is more complex, then the backlog is the number of bits "in transit", assuming that we can observe input and output simultaneously [8]. For a system where R is the input and R′ the output, the backlog at time t is b(t) = R(t) − R′(t). Obviously, b(t) ≤ v(R, R′). A backlogged period is a period during which the backlog is not zero. For t, a moment in a backlogged period, this backlogged period has started at StBl(t) = sup{u ≤ t | R′(u) = R(u)}. We limit the following to left-continuous functions, to ensure:

R′(StBl(t)) = R(StBl(t))    (3)

The virtual delay at a time t is the delay that a bit entered at time t will wait until going out, defined by d(t) = inf{τ ≥ 0 | R(t) ≤ R′(t + τ)}. Obviously, d(t) ≤ h(R, R′).
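The backlog and the start of a backlogged period can be read directly off sampled cumulative functions. A minimal sketch, with an assumed toy flow (3 units arriving at t = 3, served at rate 1):

```python
# Cumulative input R and output R' on an integer grid (assumed toy flow).
R  = [0, 0, 0, 3, 3, 3, 3, 3]   # 3 units arrive at t = 3
Rp = [0, 0, 0, 0, 1, 2, 3, 3]   # served at rate 1 from t = 3 on

def backlog(R, Rp, t):
    """b(t) = R(t) - R'(t)."""
    return R[t] - Rp[t]

def StBl(R, Rp, t):
    """Start of the backlogged period containing t: sup{u <= t | R'(u) = R(u)}."""
    return max(u for u in range(t + 1) if Rp[u] == R[u])

assert backlog(R, Rp, 4) == 2
assert StBl(R, Rp, 5) == 2      # the backlog started just after t = 2
```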
Network calculus: contract modelling. To provide guarantees to data flows, one needs to know some contracts on the traffic and the services in the network. For this purpose, network calculus provides the concepts of arrival curve and service curve.

Arrival curve. A flow R ∈ F is constrained by α ∈ F if and only if for all s ≤ t: R(t) − R(s) ≤ α(t − s). We also say that R has α as an arrival curve, or that R is α-smooth. This condition is equivalent to R ≤ R ∗ α.

Service curve. The behaviour of a server is modelled by the concept of service curve, modelling some guarantees on the service provided to flows. The literature offers several definitions for different flavours of service; [1] proposes a comparative study. Consider a system R −S→ R′, i.e. a server S with input R and output R′ (Figure 1). We say that S offers to the flow a simple service of curve β if and only if R′ ≥ R ∗ β. We say that a system S offers a strict service of curve β if, during any backlogged period [t, s[, we have R′(s) − R′(t) ≥ β(s − t). Note that if t is not a moment of a backlogged period, then StBl(t) = t. There is a hierarchy between these service notions: a strict service is also a weak service. As discussed in Section 2.2, the need for these different definitions comes from the decomposition of the residual service.

Let us now present the main network calculus result:

Theorem 21 (Backlog and delay bound). Assume a flow, constrained by an arrival curve α, traverses a system that offers a service curve β. The backlog b(t) for all t satisfies: b(t) ≤ v(α, β). The virtual delay d(t) for all t satisfies: d(t) ≤ h(α, β).
Fig. 1. Servers (left: a single flow R −S→ R′; right: an aggregate (R1, R2) −S→ (R′1, R′2))
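Theorem 21 can be checked numerically: for a token-bucket arrival curve α(t) = b + rt and a rate-latency service β(t) = R·[t − T]+ (with r ≤ R), the classical bounds are b + rT and T + b/R. A sketch under these assumed curves (chosen for illustration, not from the paper):

```python
def v_dev(f, g, horizon):
    """Vertical deviation v(f, g) = sup_t f(t) - g(t) (backlog bound)."""
    return max(f(t) - g(t) for t in range(horizon + 1))

def h_dev(f, g, horizon):
    """Horizontal deviation h(f, g): sup over t of the least d with f(t) <= g(t + d)."""
    return max(min(d for d in range(2 * horizon) if f(t) <= g(t + d))
               for t in range(horizon + 1))

alpha = lambda t: 2 + t              # token bucket: b = 2, r = 1
beta  = lambda t: 2 * max(t - 3, 0)  # rate-latency: R = 2, T = 3

assert v_dev(alpha, beta, 50) == 5   # backlog bound b + r*T = 5
assert h_dev(alpha, beta, 50) == 4   # delay bound  T + b/R = 4
```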
2.2 Related Works
Modelling aggregation is an important issue in network calculus. Aggregation means that the service is shared by different flows: for example, if a server S offers an aggregated simple service of curve β to two flows R1 and R2 ((R1, R2) −S→ (R′1, R′2), Figure 1), it means that it offers this service to the flow R = R1 + R2 (i.e. R′1 + R′2 ≥ (R1 + R2) ∗ β), but the repartition of the service between the flows depends on the flow priorities and the server policy (common policies are FIFO, strict priorities, etc.). Several results exist to derive the residual service Si offered to each flow (Ri −Si→ R′i) [8, § 6.2][8, § 2.1][9][3]. But the flavour of service is crucial: some results make the assumption of strict service, others of simple service. And the residual service can itself be simple or strict. This is of importance in the case of more than two flows, since a residual service can be shared by aggregated flows.

Table 1. Example 1
Flow i | Period | Frame size (li) | Arrival curve αi | delay [1,7] | delay [10] | Our delay
R1     | 3      | 1               | t/3              | 4           | 4          | 4
R2     | 9      | 3               | 3t/9             | 6           | 5          | 5
R3     | 4      | 1               | t/4              | 6           | 6          | 6

This paper focuses on non-preemptive static priorities, and on the residual service of the low-priority flows. In the following, in all presented results, i < j implies that Ri has a higher priority than Rj. Several results have been published on this subject, each one refining the previous ones. They are presented in the sequel.

Simple residual service. The initial result of [8, Cor. 6.2.1] claims that, if a server with strict service β is shared by two flows R1 and R2, R1 having arrival curve α1, then the low-priority flow R2 receives a minimal residual service of curve [β − α1]+. It has two limitations: the residual service is limited to non-decreasing results, and it is not strict. The limitation to non-decreasing results leads to the exclusion of some real cases². The "non strict" aspect prevents decomposing the residual service: with a server shared by three flows, with three priority levels, the residual service for the highest- and lowest-priority flows can be computed, but not for the intermediate one, since the residual service left by the highest one to the two others is not strict.

Strict residual service. The two problems described above have been independently studied by [1,7], with exactly the same result³. Consider n flows R1, ..., Rn, each Ri having an arrival curve αi and a maximum packet size l_i^max, and still a strict service of curve β. Then, each flow Ri receives the simple service of curve βi and the strict service of curve β′i, defined by:

βi = (β − Σ_{j=1}^{i−1} αj − max_{i<j≤n} l_j^max) ↑    (4)
β′i = (β − Σ_{j=1}^{i−1} αj − max_{i≤j≤n} l_j^max) ↑    (5)

² For example, if β is an affine function, modelling a constant-rate service, and α1 is a stair-case function, modelling a per-packet arrival, β − α1 is locally decreasing at each packet arrival.
³ At first glance, one could consider both results different, since one applies to network calculus and the other to real-time calculus. But as shown in [1], the strict minimal service of network calculus is equivalent to the service curve of RTC.
Note that the only difference between both is that, in the strict service case, the flow Ri competes against its own maximal packet size (the term −l_i^max), and it does not in the simple service one. In [1] it is shown on an example that this "self competitive term" cannot be ignored in the strict case⁴. As an example, consider the data in Table 1, and let us look at the delay of the third flow. According to equation (5), we can write: β′3 = (β − (α1 + α2) − l3)↑. Figure 2 shows the curves of β′3 and α3. In this figure, one can read the delay (6 ms) corresponding to the horizontal deviation between α3 and β′3. From this relation, the service offered to the flow R3 decreases if it has large packets. This is due to the self competitive term.
[Figure: two plots (bits over time) showing the residual service curves β′2^[1,7] and β′3^[1,7] against the arrival curves α2 and α3]
Fig. 2. Residual service curves: applying [1,7] on example of Table 1
Figure 2 shows the residual service offered to the flow R2 using the results of [1,7] (β′2^[1,7] on the figure). With these results, the delay of the flow R2 is 6 ms, as shown in Table 1. The curve β′2^[1,7] shows that the remaining service increases by one unit, then by two units subsequently. However, the flow R2 is expected to benefit from the full server speed when it is served. In addition, R2 has packets of size 3. As the priority is non-preemptive, R2 must send a whole packet of size 3 without interruption. Briefly, these approaches only model the negative impact of non-preemption (the addition of a negative term −l2^max) in the residual service, but not the positive one: when a non-preemptive flow is served, it benefits from the full speed of the server.

Strict residual service and non-preemption. The impact of non-preemption described above has been studied in [3], which gives an algorithm to compute a residual service with a better delay bound than the one of [1,7], assuming fixed-size packets. The main idea of [3] is to take non-preemption into account using the effective size of the packets of lower priority. The curve of [1,7] is then modified by an algorithm that increases the residual service curve each time a non-conforming growth is found; a growth is non-conforming whenever it is not a multiple of the packet size. Although [3] improves the delay bound, the remaining service offered is a simple service, not a strict one. Thus, as mentioned above, this remaining service is not suitable for decomposition. In addition, this result is limited to servers offering a linear service curve.

The impact of non-preemption has also been studied in [10], with the proposal of a theorem to evaluate the residual service curve of the lower-priority flow. In the same article, a comparison is made with the results of [6]; this comparison shows that the self competitive term is not properly taken into account in some cases.

⁴ This notion of "competition versus itself" is a bit counter-intuitive and deserves an informal explanation. For example, with two priority levels, β2 and β′2 are given by β2 = (β − α1)↑ and β′2 = (β − α1 − l2^max)↑. Consider a backlogged period of the low-priority flow R2. Before obtaining the server, the flow R2 must wait for the end of service of the high-priority flow R1 (the term α1). But due to the non-preemption, the high-priority flow can have been blocked by one packet of the low-priority flow, and this packet may belong to a previous backlogged period of the flow R2, not the considered one, as in Figure 4. This makes the difference between the strict service (considering any backlogged period) and the simple service one, based on convolution, which considers only some periods.

Table 2. Example 2: CAN
Flow i | Period | Size (li) | αi           | exact worst delay | Delay [10] | Our delay
R1     | 2.5    | 2.5       | 2.5 · (2t/5) | 2                 | 2          | 2
R2     | 3.5    | 2.5       | 2.5 · (2t/7) | 3                 | 3          | 3
R3     | 3.5    | 2.5       | 2.5 · (2t/7) | 3.5               | 6          | 3.5
Table 2 contains a sample of the comparison. The pessimism found in [10] is due to the following reason: the method does not take into account the arrival curve of the lower-priority flow. The evaluation of the service curve, and thus of the delay, is based on the maximum period of time during which the priority flows can occupy the server. However, it is possible (see Figure 3) that, during this maximum period of time, the lower-priority flow is not in a backlogged period.

[Figure: timeline of the service of packets of R1, R2 and R3, showing the maximum period of service]
Fig. 3. Maximum period of service for R1 + R2
In this figure, the maximum period of service for R1 + R2 is 5 ms. Knowing the arrival curve of R3, it is easy to verify that it is not possible for this flow to be in a backlogged period during the whole 5 ms. The backlogged period of R3 starts 3.5 ms after the start of service of R1 + R2. This shift of 3.5 ms should be considered when determining the service curve of R3.
2.3 Our Contribution
In this paper, we address the same problem as [10] and reduce the pessimism left in [10], using the same approach. The significant difference is the inclusion of the arrival curve of the lower-priority flow in the assessment of its residual service. In the following, we present a theorem and its proof, which correct the pessimism found in [10].
3 Scheduling Modelling
This section presents how the scheduling, and specially the non-preemptive priority scheduling, is modelled in our framework, and the basic properties of this modelling. In papers, this modelling is sometimes implicit, and the properties are given as details of the proof. We chose to make it all explicit.

We consider a server shared by a set of flows R1, ..., Rn. Let us denote by R′i the output related to each input flow Ri, and let R = Σi Ri and R′ = Σi R′i.

Mutual exclusion. When there is no backlog (i.e. R′(t) = R(t)), the server is efficient enough to serve all flows simultaneously, and there is no influence of the scheduling policy. On backlogged periods, we will consider that the flows share the services of the server in mutual exclusion. Everything happens as if there were a scheduler responsible for allocating the server to the streams. We assume that there exists, in each backlogged period [StBl(t), s[, an increasing sequence of scheduling points sci such that, at each of these instants, the server is allocated to a flow⁵. Moreover, there exists a function Sched : sci → [1, n] that, at each scheduling instant, gives the number of the flow served by the server. Then, Sched(sci) = j means that the server is allocated to flow j on the interval [sci, sci+1[. By extension to intervals, for a backlogged period [t, s[, Sched([t, s[) = i means that the server is allocated to flow i on [t, s[. Note that there is no condition imposing Sched(sci) ≠ Sched(sci+1).

We call a period of service P for a flow Ri a period of time during which the stream is served by the server, i.e. Sched(P) = i. A period of service begins at a precise moment called the start date of service. We define StSc(t), the start date of service for a moment t: if t is in a period of service of flow i, i.e. Sched(t) = i, then StSc(t) denotes the beginning of this period of service; otherwise, it is not defined. We have:

StSc(t) ≤ t    (6)

Note that being in a period of service does not mean that there is some output. Consider two flows RH and RL, sharing a server with a service of curve βR,T, RH having the high priority, and an instant t such that each flow sends a packet.

⁵ To avoid Zeno behaviour, on any finite interval, there is only a finite number of such points.
Then, from t and up to t + T, the output can be null (R′H(t + T) = R′H(t)), but [t, t + T] is a period of service of RH. We will consider a period of service as an interval [t, s[, closed on the left and open on the right. During the period of service of one flow, there is no output of the other flows.
Fig. 4. Scheduling example

Fig. 5. Static modelling
These notions can be illustrated with two flows, A and B, A having a higher priority than B, A sending packets P1A, P2A, and B sending packets P1B, P2B (cf. Figure 4). The service times are represented in the figure. Then [1, 5[ is a backlog period of A, but so is any sub-interval ([2, 4], ]2, 5[, etc.). Similarly, [0, 2[ and [2, 7[ are backlog periods of B, and [0, 7[ is a backlog period of A + B. The start of service can also be illustrated: StScA(1) = StScA(3) = StScA(3.2) = 1, StScB(1) = 0, StScB(3) = StScB(6) = 5. This also illustrates the notion of "self competing term" presented in eq. (5): the B backlog period [3, 7[ has to wait for the service because the flow A has been delayed by the packet P1B.

Non-preemptive strict priority. The behaviour of non-preemptive strict priority is not so easy to model: consider Figure 5. The high-priority flow R1 has backlog from 0 to u, emits data at the server speed up to v, and then stops. What is the decision of the scheduler at time u? There is no more R1 backlog, so it could switch to a low-priority flow. Nevertheless, R1 is still sending data. Since we are studying the low-priority flow, we make the pessimistic assumption that, at time u, the server is able to see that R1 is still sending data, and still serves R1 up to v.

Then, the strict priority is modelled as follows: let t be the start of a backlogged period. If there is input of the high-priority flow, the server is allocated to this flow until it stops emitting. Otherwise, the server is allocated to the low-priority one until one packet has been transmitted (i.e. until s such that R′2(s) − R′2(t) = l2). At the end of one period of service, the scheduling policy allocates the server as if it were the start of a backlogged period.
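The mutual-exclusion scheduler described above can be prototyped with a simple event loop. The sketch below is an assumption-laden illustration (packet format and helper name are invented, not the paper's formal model); it serves whole packets by static priority and exhibits the non-preemptive blocking discussed above:

```python
def np_static_priority(packets, rate=1.0):
    """Non-preemptive static-priority server (illustrative sketch).
    packets: list of (arrival, size, prio); a lower prio value means higher priority.
    Returns (arrival, prio, finish_time) tuples in order of service."""
    pending = sorted(packets)                   # sorted by arrival time
    queue, out, t = [], [], 0.0
    while pending or queue:
        while pending and pending[0][0] <= t:   # admit packets arrived by t
            queue.append(pending.pop(0))
        if not queue:
            t = pending[0][0]                   # server idles until next arrival
            continue
        queue.sort(key=lambda p: (p[2], p[0]))  # pick highest priority first
        arrival, size, prio = queue.pop(0)
        t += size / rate                        # whole packet served, no preemption
        out.append((arrival, prio, t))
    return out

# A low-priority packet in service blocks a later high-priority arrival:
trace = np_static_priority([(0.0, 2.0, 2), (0.5, 1.0, 1)])
assert trace == [(0.0, 2, 2.0), (0.5, 1, 3.0)]
```

The final assertion shows the high-priority packet finishing at 3.0 instead of 1.5: exactly the blocking term −l^max of eqs. (4)–(5).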
Formal properties related to scheduling. Our modelling of fixed-priority scheduling implies some properties that will be used in the proof, especially for servers with strict service⁶. Consider a server S with strict service of curve β. By definition, for any backlogged period [t, s[ of the aggregate flow R = Σi Ri:

R′(s) − R′(t) ≥ β(s − t)    (7)
Let [t, s[ be a backlogged period for a flow i. Then, the service always starts after the start of the backlog (eq. 8). Modelling the scheduling as mutual exclusion implies that, when a flow is not served, it has no output (eq. 9). Moreover, if S offers a strict service of curve β, the served flow output is at least β(s − t) (eq. 10):

∀u ∈ [t, s[: StSci(u) ≥ StBli(u)    (8)
Sched([t, s[) = i =⇒ ∀j ≠ i, ∀u, v ∈ [t, s[: R′j(u) = R′j(v)    (9)
Sched([t, s[) = i, strict service β =⇒ R′i(s) − R′i(t) ≥ β(s − t)    (10)
By definition of start of service and backlog period:

R′(StSc(t)) = R′(StBl(t))    (11)

and then,

R′i(StSci(t)) = R′i(StBli(t))    (12)

4 Contributions
Theorem 41. Consider a server that offers to three flows R1, R2 and R3 a strict service curve represented by a non-decreasing function β. Suppose that the flow R2 (resp. R3) emits its data in packets of fixed size l2 (resp. of maximal size l3). If the flow R1 (resp. R2) is α1 (resp. α2) upper-constrained, R1 has a non-preemptive priority over the flow R2, and R2 has a non-preemptive priority over the flow R3, then the server guarantees to R2 a strict service curve β2S^np defined in eq. (13):

β2S^np(t) = min{ i × l2, β(t) − β(χ′i) + (i − 1) × l2, β(t) + β(Δ + ψ2) − β(χ″i + ψ2) + (i − 1) × l2 }    (13)

with i = max{j : χj ≤ t} and the definitions:

ψ1 = ((β − α)↑)⁻¹_sup(0)    ψi = β⁻¹_sup(li) for i ∈ {2, 3}
χ′i = ((β − α1)↑)⁻¹_inf(l3 + (i − 1) × l2)    χ″i = ((β − α1)δψ2)⁻¹_inf(i × l2)
χi = max{χ′i, χ″i}    Δ = (α2)⁻¹_inf(2 × l2) − ψ2

⁶ Our modelling does not require the server to be strict but, in the proof, we need the strict property.
Some informal semantics of these terms can be given: ψi is an upper bound on the time required to serve a packet of length li, and χi is an upper bound on the waiting time of the ith packet to be served. The explicit definitions of these terms are: ψ1 = inf{t | β(t) − α(t) > 0}, ψi = inf{t | β(t) ≥ li} (i > 1), χ′i = inf{t | β(t) − α1(t) > l3 + (i − 1)l2}, χ″i = inf{t | β(t + ψ2) − α1(t + ψ2) > i × l2}. For some technical reason, not presented here, the non-decreasing upper closure is useless when using the inf-based definitions.

Table 3. Notations related to the NC modelling
Notation related to real behaviour:
Ri — cumulative function of the input flow i
R′i — cumulative function of the output flow i
xi — ith start of service of a packet of Ri in [t, s[
wi — duration of the ith period of service in [t, s[
StBli — start date of backlog of Ri
StSci — start date of service of Ri
tci — StSci(StBl1(t))

Notation related to contract:
β — service provided by the server
αi — constraint respected by the flow Ri
χi — upper bound on the period before the start date of the ith period of service of a packet
ψi — upper bound on the time required to serve a packet of length li
Δ — upper bound on the period during which there is no backlog between the emission of two packets
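The quantities of Theorem 41 can be evaluated numerically on a time grid. Everything below is a hedged sketch: the curves (a rate-2, latency-1 server and staircase-like arrivals) and packet sizes are assumptions chosen for illustration, and the χ′/χ″ placement in the min follows the proof of Section 4.1:

```python
l2, l3 = 2, 3                                  # packet sizes (assumptions)
beta   = lambda t: 2 * max(t - 1, 0)           # strict service: rate 2, latency 1
alpha1 = lambda t: t // 2 + 1                  # arrival curve of R1 (assumption)
alpha2 = lambda t: 2 * (t // 4 + 1)            # arrival curve of R2 (assumption)

def inf_t(pred, horizon=500):
    """inf{t >= 0 | pred(t)} on the integer grid."""
    return next(t for t in range(horizon) if pred(t))

psi2   = inf_t(lambda t: beta(t) >= l2)                                         # psi_2
chi_p  = lambda i: inf_t(lambda t: beta(t) - alpha1(t) > l3 + (i - 1) * l2)     # chi'_i
chi_pp = lambda i: inf_t(lambda t: beta(t + psi2) - alpha1(t + psi2) > i * l2)  # chi''_i
chi    = lambda i: max(chi_p(i), chi_pp(i))                                     # chi_i
Delta  = inf_t(lambda t: alpha2(t) >= 2 * l2) - psi2

def beta2_np(t):
    """Strict residual service of R2, eq. (13)."""
    js = [j for j in range(1, t + 2) if chi(j) <= t]
    if not js:
        return 0
    i = max(js)                                # i = max{j : chi_j <= t}
    return min(i * l2,
               beta(t) - beta(chi_p(i)) + (i - 1) * l2,
               beta(t) + beta(Delta + psi2) - beta(chi_pp(i) + psi2) + (i - 1) * l2)

assert (psi2, Delta) == (2, 2)
assert beta2_np(10) == 8
```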
4.1 Proof

Let us consider [t, s[, a backlogged period of R2; it will be the interval considered in the whole proof.

Some more definitions. Let xi be the ith start of service of a packet of R2 after t, i.e. x1 = min{sci | sci ≥ t, Sched(sci) = 2}. The ith period of service of a packet of R2 during the backlogged period [t, s[ is an interval [xi, xi + wi[, with wi = inf{d | R′2(xi + d) − R′2(xi) ≥ l2}. It means that

R′2(xi) − R′2(x1) = (i − 1) × l2    (14)
R′2(xi + wi) − R′2(x1) = i × l2    (15)
Between t and x1, R2 may have received some service. However, since t is arbitrary, t is not necessarily the beginning of a service. Then,

0 ≤ R′2(x1) − R′2(t) < l2    (16)
The following is deduced from (14), (15) and (16):

R′2(xi) − R′2(t) ≥ (i − 1) × l2    (17)
R′2(xi + wi) − R′2(t) ≥ i × l2    (18)

[Figure: the different cases a), c), d), e) for the start of service, positioning StBl1(t), StBl2(t), StSc1(t), StSc2(t) and the bounds ψ1, ψ2, ψ3]
Fig. 6. Different cases for StSc2(t)
The proof itself. Let us recall the two interesting cases for the worst case. As shown in [10], these two cases generalize all possible cases.

a) case 3-1-2: R3 is served. In this case (Figure 6-d), R2 waits for the transmission of one packet of R3 and for R1 to empty its queue. Notice that, during the service time of R3, R1 can accumulate some backlog.

b) case 2-1-2: another backlog period of R2 is served. This is the equivalent of the "self competitive" term presented in eq. (5) and in the example of Figure 4. Figure 6-e shows this case.

In the cases (d-e), R1 begins to accumulate backlog at StBl1(t) while the flow Ri (i ∈ {2, 3}) is being served. StBl1(t) belongs to a period of service of Ri. This period of service began at the moment StSci(StBl1(t)). In the rest of the proof, let us denote tci = StSci(StBl1(t)), and note that

tci ≤ StBl1(t)    (19)
Let us recall an important lemma (from [10]) before the proof.

Lemma 42. Case 2-1-2: xi − tc2 ≤ χ″i + ψ2. Case 3-1-2: xi − tc3 ≤ χ′i.

Proof. From [10, Lemma 4.2], StSc1(t) − tc2 ≤ ψ2 and xi − StSc1(t) ≤ χ″i lead to xi − tc2 ≤ χ″i + ψ2. The same holds for case 3-1-2.

Consider [t, s[, a backlogged period of R2. Let i = max{j ∈ N | xj ≤ s}.
If s ≥ xi + wi : s is not in a period of service of R2 R2 (s) − R2 (t) ≥ i × l2 (by definition of xi and wi , & eq. 18) If xi ≤ s < xi + wi : s is in a period of service of R2 i.e. (R1 (s) = R1 (xi ) and R3 (s) = R3 (xi ): eq. 9). Then R1 (s) − R1 (t) = R1 (xi ) − R1 (t) R3 (s) − R3 (t) = R3 (xi ) − R3 (t) then, R2 (s) − R2 (t) ≥ β(s − t) − (R1 (xi ) − R1 (t) + R3 (xi ) − R3 (t)) In the following, one need some (good) bounds on R1 (xi ) − R1 (t) and R3 (xi ) − R3 (t). – case 2-1-2 (cf. e in Figure 6) In this case, it is clear that R3 (xi ) − R3 (t) = 0, which gives R2 (s) − R2 (t) ≥ β(s − t) − (R1 (xi ) − R1 (t). Moreover, since R2 is served between tc2 and StSc1 (t), then R2 (StSc1 (t)) − c R2 (t2 ) = l2 (fixed packet size) and R1 (StSc1 (t)) = R1 (tc2 ) (no R1 output during service of R2 ) Consider R1 (xi ) − R1 (t): R1 (xi ) − R1 (t) ≤ R1 (xi ) − R1 (StBl2 (t)) (since always StBl2 (t) ≤ t)
= (R1 (xi ) − R1 (tc2 )) − (R1 (StBl2 (t)) − R1 (StSc1 (t))) (since R1 (tc2 ) = R1 (StSc1 (t)) = (R1 (xi )−R1 (tc2 ))−((R (StBl2 (t))−R (tc2 )) − l2 ) (since R2 (StSc1 (t))−R2 (tc2 ) = l2 ) = (R1 (xi ) − R1 (tc2 )) − ((R (StBl2 (t)) − R (tc2 )) − l2 ) (since R1 (tc2 ) = R1 (tc2 )) ≤ (R1 (xi ) − R1 (tc2 )) − ((R (StBl2 (t)) − R (tc2 )) − l2 ) (since R1 (xi ) ≤ R1 (xi )) ≤ α1 (xi − tc2 ) − ((R (StBl2 (t)) − R (tc2 )) − l2 ) ≤ α1 (χ”i + ψ2 ) − β(StBl2 (t) − tc2 ) + l2 (from Lemma 42)
Then, R2 (s) − R2 (t) ≥ β(s − t) − β(χ”i + ψ2 ) + β(StBl2 (t) − tc2 ) + (i − 1) × l2 . Last step: getting a bound on StBl2 (t) − tc2 . At tc2 , there is an incoming packet of flow R2 . The arrival of the next packet occurs at StBl2 (t). This means that R2 (StBl2 (t)) − R2 (tc2 ) = 2 × l2 . Then, 2 × l2 = R2 (StBl2 (t)) − R2 (tc2 ) ≤ α2 (StBl2 (t) − tc2 ) (α2 is arrival curve of R2 ). StBl2 (t) − tc2 ≥ (α2 )−1 inf (2 × l2 ) and by definition of Δ: StBl2 (t) − tc2 ≤ Δ + ψ2 . We have, R2 (s) − R2 (t) ≥ β(s − t) − β(χ”i + ψ2 ) + β(Δ + ψ2 ) + (i − 1) × l2 – case 3-1-2 R1 (xi ) − R1 (t) ≤ R1 (xi ) − R1 (StSc1 (t)) (since StSc1 (t) ≤ t, eq 6)
$$\begin{aligned}
&\le R'_1(x_i) - R'_1(StBl_1(t)) && \text{(since } StBl_1(t) \le StSc_1(t)\text{, Eq. 8)}\\
&\le R'_1(x_i) - R_1(StBl_1(t)) && \text{(since } R_1(StBl_1(t)) = R'_1(StBl_1(t))\text{, Eq. 3)}\\
&\le R_1(x_i) - R_1(StBl_1(t)) && \text{(since } R'_1(x_i) \le R_1(x_i)\text{)}\\
&\le R_1(x_i) - R_1(t^c_3) && \text{(from Eq. 19)}\\
&\le \alpha_1(x_i - t^c_3) \le \alpha_1(\chi_i) && \text{(from Lemma 42)}.
\end{aligned}$$

Then we have $R'_1(x_i) - R'_1(t) \le \alpha_1(\chi_i)$.
With $R'_3(x_i) - R'_3(t) \le l_3$, this gives $R'_2(s) - R'_2(t) \ge \beta(s - t) - \alpha_1(\chi_i) - l_3$. By definition of $\chi_i$, $\beta(\chi_i) - \alpha_1(\chi_i) > l_3 + (i - 1) l_2$. Then

$$R'_2(s) - R'_2(t) \ge \beta(s - t) - \beta(\chi_i) + (i - 1) l_2.$$
Non Preemptive Static Priority with Network Calculus: Enhancement
Finally,

$$R'_2(s) - R'_2(t) \ge \min\left\{\, i \times l_2,\;\ \beta(s - t) - \beta(\chi_i) + (i - 1) \times l_2,\;\ \beta(s - t) - \beta(\chi''_i + \psi_2) + \beta(\Delta + \psi_2) + (i - 1) \times l_2 \,\right\}.$$

4.2 Generalisation to Any Number of Flows
Corollary 43 (Generalisation to any number of flows). The result with 3 flows can be generalised to any number of flows $A_1, \ldots, A_n$. Let us consider the flow $A_i$ and make the renaming $R_2 = A_i$, $R_1 = \sum_{j=1}^{i-1} A_j$, $R_3 = \sum_{j=i+1}^{n} A_j$. The sum of the higher priority flows can be seen as a unique flow, whose arrival curve is the sum of the individual arrival curves, and the same holds for the lower priority ones⁷.
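The renaming in Corollary 43 can be sketched numerically: the flows above (and below) the flow of interest are each merged into a single flow whose arrival curve is the pointwise sum of the individual curves. The affine curves and values below are our own illustration, not data from the paper.

```python
# Illustration of Corollary 43's renaming (our own toy curves, not from the
# paper): several higher priority flows are merged into the single flow R1,
# constrained by the pointwise sum of their arrival curves.

def affine_curve(b, r):
    """Token-bucket arrival curve: alpha(t) = b + r*t for t > 0, 0 at t = 0."""
    return lambda t: 0.0 if t <= 0 else b + r * t

def aggregate(curves):
    """Arrival curve of the merged flow: pointwise sum of the member curves."""
    return lambda t: sum(c(t) for c in curves)

# Flows A1, A2 have higher priority than the flow of interest; they are
# renamed into the single high-priority flow R1:
alpha_A1 = affine_curve(b=2.0, r=1.0)
alpha_A2 = affine_curve(b=1.0, r=0.5)
alpha_R1 = aggregate([alpha_A1, alpha_A2])

print(alpha_R1(4.0))  # (2 + 1*4) + (1 + 0.5*4) = 9.0
```

The same construction applies to the lower priority flows renamed into $R_3$.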
5 Application
Finally, we illustrate our method with the two examples described in Table 1 and Table 2. In the first example (Table 1), we obtain the same results as in [10]. This is explained by the fact that the maximum period of service of the priority flow actually corresponds to an expectation of service from the lower priority flow. In the second example (Table 2), since we consider the arrival curve of the low priority flow, we obtain a better delay than in [10]. This delay is the same as the real delay for this case presented in [6].
6 Conclusions
Static priority is a simple and widely used scheduling policy. It has been studied in network calculus, and the first results [8] had some technical limitations. Further works have overcome these limitations [1,7], but they only model the negative impact of non-preemption (the high priority flow can be interrupted by one low priority packet), not its benefits (when the low priority flow is served, it has the full server speed, even if, from a long-term point of view, it only gets a residual part). This aspect is taken into account by [3], which gives an algorithm to improve the residual curve of [7], but the residual service is simple. [10] proposes a residual service in a more general context than [3], but one that is also deficient in some cases. In this paper, we correct this pessimism under the same assumptions. Compared to other theories, our approach is more general: more general than [6] (which assumes periodic flows and a constant service), more general than [3] (only the considered flow must have a fixed packet size, and the residual service is strict), and it gives exact results on common case studies. Several interesting generalisations remain: first, it would be interesting to generalise the approach to the case of variable-size packets in the low priority flows; second, we conjecture that the result also holds for weakly strict service [1]; finally,
⁷ In case of $A_i = A_n$, just consider $R_3 = \alpha_3 = l_3 = 0$.
it would also be interesting to see whether this approach computes the exact worst case (up to now, no counterexample has been found). It would further be interesting to see how this general result can be applied, for example, when combining P-GPS scheduling and fixed priority, or with a variable speed service (in the case of voltage scaling, etc.).
References

1. Bouillard, A., Jouhet, L., Thierry, E.: Service curves in network calculus: dos and don'ts. Research Report RR-7094, INRIA (November 2009)
2. Chang, C.-S.: Performance Guarantees in Communication Networks. Springer, Heidelberg (2000). ISBN 1-85233-226-3
3. Chokshi, D.B., Bhaduri, P.: Modeling fixed priority non-preemptive scheduling with real-time calculus. In: RTCSA 2008: Proc. of the 14th IEEE Int. Conf. on Embedded and Real-Time Computing Systems and Applications. IEEE Computer Society, Washington, DC (2008)
4. Cruz, R.L.: A calculus for network delay, part I: Network elements in isolation. IEEE Transactions on Information Theory 37(1), 114–131 (1991)
5. Cruz, R.L.: A calculus for network delay, part II: Network analysis. IEEE Transactions on Information Theory 37(1), 132–141 (1991)
6. Davis, R.I., Burns, A., Bril, R.J., Lukkien, J.J.: Controller Area Network (CAN) schedulability analysis: refuted, revisited and revised. Real-Time Systems 35(3), 239–272 (2007)
7. Haid, W., Thiele, L.: Complex task activation schemes in system level performance analysis. In: ESWeek 2007: Proc. of the 5th IEEE/ACM Int. Conf. on Hardware/Software Codesign and System Synthesis, Salzburg, Austria, September 30 – October 3, pp. 173–178. ACM, New York (2007)
8. Le Boudec, J.-Y., Thiran, P.: Network Calculus. LNCS, vol. 2050. Springer, Heidelberg (2001)
9. Lenzini, L., Mingozzi, E., Stea, G.: Delay bounds for FIFO aggregates: a case study. Computer Communications 28, 287–299 (2004)
10. Mangoua Sofack, W., Boyer, M.: Non preemptive static priority with network calculus. In: Proc. of the 16th IEEE Int. Conf. on Emerging Technologies and Factory Automation (ETFA 2011) (September 2011)
11. Perathoner, S., Wandeler, E., Thiele, L., Hamann, A., Schliecker, S., Henia, R., Racu, R., Ernst, R., González Harbour, M.: Influence of different abstractions on the performance analysis of distributed hard real-time systems. Design Automation for Embedded Systems 13, 27–49 (2009)
12. Pollex, V., Lipskoch, H., Slomka, F., Kollmann, S.: Runtime improved computation of path latencies with the real-time calculus. In: Proc. of the 1st International Workshop on Worst-Case Traversal Time (WCTT 2011), pp. 58–65. ACM (2011)
A Demand-Response Calculus with Perfect Batteries

Jean-Yves Le Boudec and Dan-Cristian Tomozei

EPFL – LCA2, Bâtiment BC, Station 14, CH-1015 Lausanne
{jean-yves.leboudec,dan-cristian.tomozei}@epfl.ch
Abstract. We consider an electricity consumer equipped with a perfect battery, who needs to satisfy a non-elastic load, subject to external control signals. The control imposes a time-varying upper-bound on the instantaneous energy consumption (this is called “Demand-Response via quantity”). The consumer defines a charging schedule for the battery. We say that a schedule is feasible if it successfully absorbs the effects of service reduction and achieves the satisfiability of the load (making use of the battery). Our contribution is twofold. (1) We provide explicit necessary and sufficient conditions for the load, the control, and the battery, which ensure the existence of a feasible battery charging schedule. Furthermore, we show that whenever a feasible schedule exists, we can explicitly define an online (causal) feasible schedule. (2) For a given arrival curve characterizing the load and a given service curve characterizing the control, we compute a sufficient battery size that ensures existence of an online feasible schedule. For an arrival curve determined from a real measured trace, we numerically characterize the sufficient battery size for various types of service curves.
1 Introduction Growing requirements for integration of renewable energy in the power grid pose a difficult challenge to Distribution Systems Operators (DSO). The increase in the penetration level of renewables, coupled with the volatility and lack of predictability of such energy sources (e.g., wind turbines, photovoltaic cells) may produce instabilities in the transport and distribution grid, which in turn may result in blackouts. It is possible that such sources produce too little energy when demand is high, or too much energy when demand is low. Studies have shown [1] that in order to guarantee a smooth operation of the grid in its current state, operators need to react quickly to changes in renewable output via use of ancillary services (e.g., gas turbines). Such services are able to provide large quantities of energy on short notice; however they constitute an investment that actors in the energy market are reluctant to perform, since it indirectly increases the cost of “green” energy.
The power measurements in Section 5 were performed by Peng (Peter) Gao during his Master studies at EPFL. The authors would like to thank Mario Paolone for meaningful discussions about battery characteristics and models, as well as the three anonymous reviewers for their useful feedback.
J.B. Schmitt (Ed.): MMB & DFT 2012, LNCS 7201, pp. 273–287, 2012. © Springer-Verlag Berlin Heidelberg 2012
A method proposed in the next generation grid for dealing with variability of production is Active Demand Management, or Demand-Response. Specifically, at critical peak periods, Demand-Response mechanisms throttle flexible energy demands (called “loads”), such as Plug-in Electric Vehicles, heating, or air conditioning, thus adjusting demand to the available production. The control of electricity consumption is done either via price, or via quantity. Demand-Response via price consists simply of informing consumers of an increase in electricity price, which provides an incentive for consumers to reduce their load. This method has the advantage of being decentralized and is widely embraced in the literature [6]. Its main drawback is the fact that it exposes consumers to the variability of electricity prices. It has been shown [2] that real-time wholesale prices can fluctuate wildly in a fair deregulated electricity market. The second option, Demand-Response via quantity, has been implemented by companies such as Voltalis or Peaksaver [11, 8]. These companies have control over a large number of flexible loads (typically heating appliances) and act as virtual energy providers during peak consumption hours by throttling the loads of their customers, while guaranteeing hard limits on the service reductions. Such an approach can be modeled via service curve-based agreements between suppliers and consumers [5]. In this paper, we focus on Demand-Response via quantity. In both cases, the effect of Demand-Response can be interpreted as a time-varying imposed upper-bound on electricity consumption from the grid. Indeed, when control is done via pricing, such an upper-bound can be computed locally by a Smart Home Controller, which aims to minimize electricity cost. When control is done via quantity, the upper-bound is explicit. We refer to this imposed upper-bound as the control signal received from the grid. 
Demand-Response requires that loads be flexible, which is naturally the case for some appliances, as mentioned earlier, but not for most. One way to make a load flexible is to insert some form of energy storage (we call it a “battery”) between the load and the electrical grid. The problem we study in this paper is how to size the battery, and how to schedule its charge, in order to make sure that the inflexible load can be served transparently, in presence of Demand-Response via quantity. Contributions. We consider a non-elastic load equipped with a battery (interposed between the load and the grid), subject to time-varying control signals (e.g., via throttling from Demand-Response mechanisms). Despite imposed reductions in energy consumption from the grid, the load should be satisfied (from the grid and the battery). In order to ensure this, the battery needs to act like a buffer and provide energy to the load when the grid is unable to do so. We make the hypothesis that the battery is perfect (i.e., there are no thermal losses, and its efficiency is 1). In this paper we study battery charging schedules that ensure satisfiability of the load. We call such schedules feasible schedules. The first main contribution consists in providing explicit necessary and sufficient conditions on the load, the control, the battery capacity, and the initial charge level of the battery, that guarantee existence of such feasible charging schedules. Moreover, we show that there exist feasible schedules, if and only if there exist causal feasible schedules (i.e., schedules that can be computed online, without knowledge of the future).
In addition, we examine the problem of sizing the battery. We consider loads that satisfy an arrival curve and control signals that are constrained by a service curve. Such control signals have been considered for performing Demand-Response [5]. The second main contribution of this paper consists in determining a sufficient battery capacity that ensures existence of feasible online charging schedules for all loads (and control signals) characterized by a given arrival curve (and a given service curve, respectively). Using an arrival curve computed based on real measurement traces, we characterize numerically the dependence between the battery size and various types of service curves that can be supported. Related Work. The problem of determining feasible battery charging schedules is somewhat related to the problem of optimal multimedia smoothing [4, Chapter 5]. The analysis of the latter employs techniques from min-plus system theory. We provide a novel formulation of the former and analyze it using similar min-plus system techniques: we determine the set of all feasible charging schedules (Theorem 1) and compute sufficient battery capacity (Theorem 4), assuming a perfect battery. Outline. In Section 2 we formally define the problem. In Section 3 we characterize the feasible battery charging schedules. In Section 4 we study bounds on the required battery size which ensure existence of feasible online schedules. In Section 5 we present numerical evaluations on real traces. We conclude in Section 6. Technical lemmas are given in Appendix A.
2 Problem Definition

We consider a load $L(t) = \int_0^t \ell(s)\,ds$, i.e., the energy required over the interval $[0, t]$ by a consumer who is subject to control signals $G(t) = \int_0^t g(s)\,ds$ (i.e., the maximum available energy over the interval $[0, t]$). The consumer owns a battery of capacity $B$. We assume a perfect battery, with no thermal loss and perfect efficiency (i.e., all the energy stored in the battery can be retrieved at a later time). The actual load $U(t) = \int_0^t u(s)\,ds$ is the load drawn from the grid. It is computed by the consumer and may differ from $L(t)$, since the consumer may use the battery instead of the main supply; we say that $U(t)$ is a "schedule". Let $B_0$ be the battery level at time $t = 0$. An actual load $U(t)$, defined for $t \ge 0$, is feasible if and only if for all times $s, t$ such that $0 \le s \le t$:

$$L(t) \le B_0 + U(t), \tag{1}$$
$$U(t) - L(t) + B_0 \le B, \tag{2}$$
$$0 \le U(t) - U(s) \le G(t) - G(s). \tag{3}$$

Eq.(1) expresses the no-underflow condition for the battery, i.e., all load comes either from the main supply $U(t)$ or from the initial buffer $B_0$. Eq.(2) expresses the no-overflow condition. Eq.(3) expresses that the supply is positive, i.e., there is no local production, and that the supply is constrained by the control signals $G(t)$. Note that if $U$ is feasible, the battery level at time $t$ is

$$B(t) = U(t) - L(t) + B_0. \tag{4}$$
The consumer setup is represented in Figure 1.

Fig. 1. The consumer setup. The load $L(t)$ can be satisfied from the battery $B(t)$ and the grid $U(t)$ under DR constraints given by the control $G(t)$. We have denoted $(a)^+ = \max(a, 0)$.
The control signals $G(t)$ and the load $L(t)$ are given. The consumer-side problem is to compute an online schedule $U(t)$, i.e., a function $U(t)$ whose definition depends only on the past and present values of $L$, $G$ and $U$. A typical example is the policy $\tilde U$ that greedily maximizes the battery level subject to the constraints given by Eq.(2) and Eq.(3).

Before turning to the online problem, consider an oracle that knows the past, present and future of the load $L(t)$ and the control signals $G(t)$, and asks whether there exists an actual load $U(t)$ that is feasible. We assume that $U(t) = 0$ for all $t \le 0$, i.e., there was no load before time 0. We find necessary and sufficient conditions for the existence of such a schedule. We also show that any feasible schedule is bounded by a minimum $U^*_{\min}$ and a maximum $U^*_{\max}$. We show that the schedule $U^*_{\max}$ is computable online and corresponds to the policy $\tilde U$ that greedily maximizes the battery level.

Returning to the online problem, our result implies that if there exists a feasible schedule, then there is an online one. This comes as a surprise, since the set of online schedules is much smaller than the set of all possible schedules available to the oracle. In particular, our result implies that computing a feasible schedule is either impossible, or can be done online.

In Section 4 we consider the case where the control signals $G(t)$ are constrained by the service curve $\beta$. We derive from the previous results the minimum battery capacities required to guarantee the existence of a schedule. Our results use min-plus system theory [4, Section 4.1 and Theorem 4.3.1].
3 Feasible Schedules

Theorem 1. Assume $L(t)$ and $G(t)$ are known and $0 \le B_0 \le B < \infty$. There exists a feasible schedule $U(t)$ if and only if both conditions hold:

$$B_0 \ge \sup_{t \ge 0} \left(L(t) - G(t)\right), \tag{5}$$
$$B \ge \sup_{0 \le s \le t} \left(L(t) - L(s) - G(t) + G(s)\right). \tag{6}$$

If these conditions hold, there exist two feasible schedules $U^*_{\min}$ and $U^*_{\max}$:

$$U^*_{\min}(t) = \max\left(0,\ \sup_{\tau \ge t} \left(L(\tau) - G(\tau)\right) + G(t) - B_0\right), \tag{7}$$
$$U^*_{\max}(t) = \min\left(G(t),\ \inf_{s:\,0 \le s \le t} \left(L(s) + G(t) - G(s)\right) + B - B_0\right), \tag{8}$$

such that any feasible schedule $U$ satisfies

$$U^*_{\min}(t) \le U(t) \le U^*_{\max}(t) \text{ for all } t \ge 0. \tag{9}$$
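In discrete time, the conditions and the two extreme schedules of Theorem 1 can be evaluated directly from sampled $L$ and $G$. The following sketch is our own (note that the supremum over $\tau \ge t$ in Eq.(7) is necessarily truncated to the length of the trace):

```python
# Discrete-time sketch of Theorem 1 (our own code; the sup over tau >= t
# is truncated to the end of the sampled trace).

def feasible(L, G, B0, B):
    """Check Eq.(5) and Eq.(6) on sampled cumulative load L and control G."""
    n = len(L)
    c5 = max(L[t] - G[t] for t in range(n))
    c6 = max(L[t] - L[s] - G[t] + G[s] for t in range(n) for s in range(t + 1))
    return B0 >= c5 and B >= c6

def u_min(L, G, B0):
    """Minimum feasible schedule, Eq.(7) (anti-causal: needs the future)."""
    n = len(L)
    return [max(0.0, max(L[tau] - G[tau] for tau in range(t, n)) + G[t] - B0)
            for t in range(n)]

def u_max(L, G, B0, B):
    """Maximum feasible schedule, Eq.(8) (causal: needs only the past)."""
    n = len(L)
    return [min(G[t], min(L[s] + G[t] - G[s] for s in range(t + 1)) + B - B0)
            for t in range(n)]

L, G = [0, 1, 2, 3], [0, 2, 3, 4]    # cumulative load and control samples
print(feasible(L, G, 1, 2))          # True
print(u_min(L, G, 1), u_max(L, G, 1, 2))
```

On this toy trace the two extreme schedules bracket every feasible schedule, as Eq.(9) states.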
Proof. The idea of the proof is first to relax constraint (1); the relaxed problem has a maximal solution, $U^*_{\max}$, and the original problem is feasible if and only if $U^*_{\max}$ satisfies constraint (1). Similarly, by relaxing constraint (2), we obtain a problem with a minimal solution $U^*_{\min}$. More precisely, consider the problem

(P1): $U$ is non-decreasing and $U(0) = 0$; $U(t) \le B - B_0 + L(t)$ for all $t \ge 0$; $U(t) - U(s) \le G(t) - G(s)$ for all $0 \le s \le t$,

where the unknown is the function $U : \mathbb{R}^+ \to \mathbb{R}^+$. This problem is equivalent to Problem (P1′) of Lemma 1 with $f(t) = B - B_0 + L(t)$ for all $t > 0$, $f(0) = 0$, and $g(t) = G(t)$ for all $t \ge 0$. Thus (P1) has a maximal solution, let us call it $U^*_{\max}$, given by

$$U^*_{\max}(t) = \inf_{0 \le s \le t} \left(g(t) - g(s) + f(s)\right),$$

which, after some re-arrangement, gives Eq.(8).

Now we show that there exists a feasible schedule if and only if $U^*_{\max}$ satisfies the relaxed condition Eq.(1). One implication is obvious: if $U^*_{\max}$ satisfies Eq.(1), then it is a feasible schedule. Conversely, assume that there exists a feasible schedule, say $U$. Then $U$ is a solution of Problem (P1), therefore $U(t) \le U^*_{\max}(t)$ for all $t$. Since $U$ satisfies Eq.(1), so does $U^*_{\max}$.

Last, saying that $U^*_{\max}$ satisfies Eq.(1) is equivalent to saying that both $U(t) = G(t)$ and $U(t) = \inf_{s:\,0 \le s \le t} \left(L(s) + G(t) - G(s)\right) + B - B_0$ satisfy Eq.(1), which gives Eq.(5) and Eq.(6).

It remains to show that there is a minimum solution. To this end, we consider the problem (P2), obtained by relaxing Eq.(2):

(P2): $U$ is non-decreasing and $U(0) = 0$; $U(t) \ge -B_0 + L(t)$ for all $t \ge 0$; $U(t) - U(s) \le G(t) - G(s)$ for all $0 \le s \le t$,
where the unknown is the function $U : \mathbb{R}^+ \to \mathbb{R}^+$. (P2) is equivalent to Problem (P2′) of Lemma 2 with $f(t) = \max(0, L(t) - B_0)$ for all $t \ge 0$ and $g(t) = G(t)$ for all $t \ge 0$. Thus (P2) has a minimal solution, let us call it $U^*_{\min}$, given by

$$U^*_{\min}(t) = \sup_{s \ge t} \left(g(t) - g(s) + f(s)\right),$$

which, after some re-arrangement, gives Eq.(7). Note that there is a feasible solution to the original problem if and only if $U^*_{\min}$ satisfies the relaxed condition Eq.(2). After some algebra, this gives Eq.(5) and Eq.(6), which were already obtained. It follows that, if these conditions hold, any feasible schedule $U$ is a solution of Problem (P2) and therefore $U(t) \ge U^*_{\min}(t)$ for all $t \ge 0$. □

Note that $U^*_{\max}$ is causal, i.e., can be computed online. In contrast, $U^*_{\min}$ is not causal; it is in fact anti-causal, i.e., it depends only on the present and the future of $L$ and $G$. Also note that different schedules result in different energy flows; however, by Eq.(4), $U(t) - B(t)$, i.e., the bought energy minus the stored one, is the same for all feasible schedules, since we assume that there is no loss.

The maximum schedule introduced in Theorem 1 is causal but is defined in abstract terms; this was required to prove the theorem. We show next that it corresponds to a simple, implementable online policy. To see this, consider discrete time and define the policy $\tilde U(t) := \sum_{s=0}^{t} \tilde u(s)$, where
$$\tilde u(t + 1) = \min\left(g(t + 1),\ B - \tilde B(t) + \ell(t + 1)\right), \tag{10}$$
with $\tilde B(t)$ denoting the battery level at time $t$ when the policy $\tilde U$ is used. This policy greedily maximizes the battery level subject to the constraints given by Eq.(2) and Eq.(3), i.e., it consists in storing into the battery the maximum possible amount of energy, considering the constraints that (1) the load needs to be served, (2) the battery has a maximum capacity $B$, and (3) the energy drawn from the grid cannot exceed the control signals.

Theorem 2. The schedules $U^*_{\max}$ and $\tilde U$ are equal.
Proof. Let us show by induction on $t$ that

$$U^*_{\max}(t) - U^*_{\max}(t - 1) = \tilde u(t). \tag{11}$$

Denote by $B^*(t) = U^*_{\max}(t) - L(t) + B_0$ the battery level at time $t$ when the maximal schedule $U^*_{\max}$ is used. At time $t = 1$, since $B - B_0 \ge 0$,

$$U^*_{\max}(1) - U^*_{\max}(0) = \min\left[G(1), L(1) + B - B_0\right] - 0 = \min\left[g(1), B - \tilde B(0) + \ell(1)\right] = \tilde u(1).$$
Suppose that Eq.(11) holds at times $1, \ldots, t$. We show that it also holds at time $t + 1$:

$$\begin{aligned}
U^*_{\max}(t + 1) &= \min\left[G(t) + g(t + 1),\ \inf_{s:\,0 \le s \le t}\left(L(s) + G(t) - G(s)\right) + B - B_0 + g(t + 1),\ L(t) + \ell(t + 1) + B - B_0\right]\\
&= U^*_{\max}(t) + \min\left[g(t + 1),\ \ell(t + 1) + B - B^*(t)\right].
\end{aligned}$$

Since $B^*(t) = \tilde B(t)$ by the induction hypothesis, we can conclude. □
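Theorem 2 can also be checked numerically: running the greedy rule of Eq.(10) step by step reproduces $U^*_{\max}$ computed from Eq.(8). The trace below is our own toy example, not data from the paper.

```python
# Numerical check of Theorem 2 on a toy trace (our own example): the greedy
# policy of Eq.(10) equals the maximal schedule of Eq.(8).
from itertools import accumulate

def greedy(ell, g, B0, B):
    """Run Eq.(10): draw as much as allowed while never overflowing the battery."""
    U, level = [0.0], B0
    for t in range(1, len(ell)):
        u = min(g[t], B - level + ell[t])   # Eq.(10)
        level += u - ell[t]                 # battery update, Eq.(4)
        U.append(U[-1] + u)
    return U

def u_max(L, G, B0, B):
    """Maximal schedule of Eq.(8) on cumulative samples."""
    return [min(G[t], min(L[s] + G[t] - G[s] for s in range(t + 1)) + B - B0)
            for t in range(len(L))]

ell, g = [0, 1, 1, 2], [0, 2, 1, 3]          # per-step load and allowed draw
L, G = list(accumulate(ell)), list(accumulate(g))
print(greedy(ell, g, 1, 2))                  # [0.0, 2.0, 3.0, 5.0]
print(u_max(L, G, 1, 2))                     # [0, 2, 3, 5]
```

The two printed schedules coincide, as Theorem 2 predicts.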
4 Battery Size with Service Curve Constraints

Assume now that the control signals $G(t)$ are constrained by the service curve $\beta$, i.e., $G(t) - G(s) \ge \beta(t - s)$ for all $0 \le s \le t$, where $\beta$ is a positive super-additive function, known in advance. The super-additivity of $\beta$ is implicit, since it defines a minimum guaranteed value over any sliding window, and thus $\beta(t + t') \ge \beta(t) + \beta(t')$ for all non-negative $t, t'$ [5, Section IV]. A direct consequence of Theorem 1 is:

Theorem 3. Assume $L(t)$ and the service curve $\beta$ are known and $0 \le B_0 \le B < \infty$, with the service curve satisfying the condition

$$\bar\beta(v) := \sup_{u \ge 0}\left[\beta(u + v) - \beta(u)\right] < \infty. \tag{12}$$

Then there exists a feasible actual load $U(t)$ for any realization of the control signal $G(t)$ compatible with the service curve $\beta$ if and only if both conditions hold:

$$B_0 \ge \sup_{t \ge 0}\left(L(t) - \beta(t)\right), \tag{13}$$
$$B \ge \sup_{0 \le s \le t}\left(L(t) - L(s) - \beta(t - s)\right). \tag{14}$$
Proof. Consider a control signal $G$ constrained by $\beta$. It satisfies $G(t) - G(s) \ge \beta(t - s)$ for all $0 \le s \le t$. If Eq.(13) and Eq.(14) are satisfied, then it follows that Eq.(5) and Eq.(6) are verified. Thus, by Theorem 1, there exists a feasible actual load $U(t)$.

Conversely, we need to show that if for all controls $G$ constrained by $\beta$ there exists a feasible actual load, then Eq.(13) and Eq.(14) hold. By contraposition, it suffices to show that if either Eq.(13) or Eq.(14) does not hold, then there exists a control $\bar G$ for which there is no feasible actual load.

If Eq.(13) does not hold, then there exists some $t_0 \ge 0$ such that $B_0 < L(t_0) - \beta(t_0)$. We define $\bar G(t) := \beta(t)$. This control is compatible with $\beta$ via the super-additivity property of the service curve. Hence $B_0 < L(t_0) - \bar G(t_0)$, and thus Eq.(5) does not hold for $\bar G$. By Theorem 1 it follows that there exists no feasible actual load.

If Eq.(14) does not hold, then there exist some $t_0 \ge s_0 \ge 0$ such that $B < L(t_0) - L(s_0) - \beta(t_0 - s_0)$. Consider some constant $A > 0$ and define the following control:

$$\bar G(s_0 + u) = \begin{cases} A + \beta(u), & \text{if } u \ge 0 \\ A - \bar\beta(|u|), & \text{if } u < 0. \end{cases}$$
Let us check that $\bar G$ satisfies the service curve $\beta$.

First, consider $0 \le v \le u$. Then $\bar G(s_0 + u) - \bar G(s_0 + v) = \beta(u) - \beta(v) \ge \beta(u - v)$, by super-additivity of $\beta$.

Moreover, $\bar G(s_0 - v) - \bar G(s_0 - u) = \bar\beta(u) - \bar\beta(v) \ge \beta(u - v)$. Indeed,

$$\bar\beta(v) + \beta(u - v) = \sup_{w \ge 0}\left[\beta(w + v) + \beta(u - v) - \beta(w)\right] \le \sup_{w \ge 0}\left[\beta(w + u) - \beta(w)\right] = \bar\beta(u),$$

again by super-additivity of $\beta$.

Finally, consider $u, v \ge 0$ and let us check that $\bar G(s_0 + u) - \bar G(s_0 - v) \ge \beta(u + v)$. We can rewrite the above inequality as $\beta(u) + \bar\beta(v) \ge \beta(u + v)$. But this holds by definition of $\bar\beta(v) := \sup_{u \ge 0}\left[\beta(u + v) - \beta(u)\right]$.

To ensure the positivity of $\bar G$, it suffices to take a large enough constant, for example $A = \bar\beta(s_0)$. Thus, $\bar G$ is a valid control satisfying $\beta$. By definition, $\bar G(t_0) - \bar G(s_0) = \beta(t_0 - s_0)$. Since $B < L(t_0) - L(s_0) - \bar G(t_0) + \bar G(s_0)$, we have that Eq.(6) does not hold for $\bar G$. Again, by Theorem 1, there exists no feasible actual load for $\bar G$. □

Note that the constraint on $\beta$ expressed in Eq.(12) simply states that over any time window of finite length, the service curve $\beta$ guarantees only a finite amount of energy, which depends on the length of the window. It is not a strong assumption in this context.

Let $\alpha$ be a sub-additive function of $t \ge 0$, non-decreasing and such that $\alpha(0) = 0$. We say that the load $L(t)$ is $\alpha$-smooth iff

$$L(t) - L(s) \le \alpha(t - s) \text{ for all } 0 \le s \le t. \tag{15}$$
The smallest such possible $\alpha$ for which a given load $L(t)$ is $\alpha$-smooth is obtained via min-plus deconvolution:

$$\alpha^*(t) \stackrel{\text{def}}{=} (L \oslash L)(t) = \sup_{s \ge 0}\left[L(t + s) - L(s)\right].$$
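On a finite sampled trace the deconvolution supremum is necessarily truncated to the available window, as done for $\alpha_m$ in Section 5.2. A minimal sketch with our own sample values:

```python
# Min-plus deconvolution alpha*(t) = sup_s [L(t+s) - L(s)] on a finite
# sampled trace (our own toy values); the sup is trace-limited.

def deconvolve(L):
    """Smallest arrival curve alpha* of the sampled cumulative load L."""
    n = len(L)
    return [max(L[s + t] - L[s] for s in range(n - t)) for t in range(n)]

L = [0, 2, 3, 7, 8]        # cumulative load samples
print(deconvolve(L))       # [0, 4, 5, 7, 8]
```

Each entry $t$ is the worst-case energy consumed over any window of $t$ samples.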
It follows that we can compute the minimum required battery capacity:

Theorem 4. Assume an $\alpha$-smooth load and let the control signals be constrained by $\beta$. Let

$$B^* := \sup_{s \ge 0}\left(\alpha(s) - \beta(s)\right). \tag{16}$$
If $B_0 \ge B^*$, there exists a feasible load schedule for any realization of the control signal $G(t)$ compatible with the service curve $\beta$. One such schedule is computable online, using only past and present observations.

Note that the condition on the initial battery level $B_0 \ge B^*$ implies the condition on the battery capacity $B \ge B^*$.

Proof. Since $\alpha(t) \ge L(t) - L(0)$ and $L(0) = 0$, it follows that $B^* \ge \sup_{t \ge 0}(L(t) - \beta(t))$. Additionally, for any $0 \le s \le t$, we have that $\alpha(t - s) \ge L(t) - L(s)$, and hence $B^* \ge \sup_{0 \le s \le t}(L(t) - L(s) - \beta(t - s))$. Thus, $B \ge B_0 \ge B^*$ implies Eq.(13) and Eq.(14). By Theorem 3 we conclude. □
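For concrete curves, the supremum of Eq.(16) can be evaluated on a grid. The token-bucket/rate-latency pair below is our own toy example ($\beta$ here is a generic rate-latency curve, not the $\beta_1$ of Section 5):

```python
# Toy evaluation of Eq.(16) (our own curves): sufficient battery
# B* = sup_s (alpha(s) - beta(s)) for a token-bucket arrival curve and a
# rate-latency service curve, sampled on a grid.

def alpha(s):
    return 0.0 if s <= 0 else 3.0 + 1.0 * s      # token bucket: b=3, r=1

def beta(s):
    return max(0.0, 2.0 * (s - 1.5))             # rate-latency: R=2, T=1.5

# Since R > r, alpha - beta is eventually decreasing, so a bounded grid suffices:
grid = [k * 0.01 for k in range(2001)]           # s in [0, 20]
B_star = max(alpha(s) - beta(s) for s in grid)
print(round(B_star, 2))                          # 4.5, attained near s = 1.5
```

The finiteness of this supremum is exactly what Theorem 5 below characterizes for the measured curves.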
5 Numerical Evaluations

Consider now a data center that subscribes to Demand-Response via quantity to reduce its electricity costs. This allows an external controller to reduce the power consumption of the data center via load control signals at peak hours. The DR contract guarantees an upper bound on the amount of service interruption per day. We consider that the load of the data center is not controllable, and we wish to render the DR mechanism transparent. To this end, we need to install batteries in the data center. We make use of Theorem 4 to determine the required capacity of the batteries. The service curve $\beta$ is agreed upon in the DR contract. To determine a reasonable arrival curve $\alpha$ we use measured loads. We again assume that the batteries are perfect.

5.1 Voltalis-Like Service Curve

We consider the specific example of Voltalis's BluePod [11], which disconnects appliances from the grid for at most 30 minutes per day. In addition, we impose a maximum allowed power consumption $z_{\max}$ which can never be exceeded. Previously [5] we have proposed the following service curve: for some $0 \le t_0 < t_1$ and some $z_{\max} > 0$, define

$$\beta_1(t) := z_{\max}(t_1 - t_0)\lfloor t/t_1 \rfloor + z_{\max}\left(t - \lfloor t/t_1 \rfloor t_1 - t_0\right)^+, \tag{17}$$
where $(a)^+ := \max(a, 0)$. The Voltalis on/off control satisfies the service curve $\beta_1$ with $t_0 = 30$ mn and $t_1 = 24$ hours (see Figure 2). However, the service curve $\beta_1$ allows more complex controls, such as limiting the power consumption to $z_{\max}/2$ over a time interval of at most $2 t_0$ per day.

Fig. 2. The service curve $\beta_1$, shown here, allows the distributor to switch off the load for at most 30 mn every day, or (for example) to reduce the load to $z_{\max}/2$ for 60 mn every day
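The piecewise formula of Eq.(17) can be written down directly; under our floor-based reading, each complete day contributes the full-day guarantee $z_{\max}(t_1 - t_0)$, plus the within-day remainder (parameters and units below are illustrative):

```python
# Sketch of the service curve beta_1 of Eq.(17): full power z_max except
# for at most t0 hours per day (parameters and units are illustrative).
import math

def beta1(t, z_max=1.0, t0=0.5, t1=24.0):
    """Guaranteed energy over any window of length t hours, Eq.(17)."""
    days = math.floor(t / t1)                     # complete days in the window
    return z_max * (t1 - t0) * days + z_max * max(0.0, t - days * t1 - t0)

print(beta1(0.25))   # 0.0  (the whole window may fall in the off period)
print(beta1(24.0))   # 23.5 (one full day: z_max * (t1 - t0))
```

One can check numerically that this $\beta_1$ is super-additive, as required in Section 4.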
5.2 Akamai Arrival Curve

We measured the power consumption of a desktop PC under different loads. The PC is equipped with a dual core Intel Pentium processor (2.8GHz) and is running Fedora Linux. We used the SPECpower_ssj2008 benchmark software developed by SPEC [10] and a power analyzer. The benchmark models a server application with a large number of users. It targets 11 possible request rates, from the maximum supported rate (100% load) to 0-rate (idle, 0% load). Requests arrive at random intervals with lengths following an exponential distribution. Bursty arrivals result in requests queuing up.
The process starts with a calibration period, during which the maximum supported request rate is determined, i.e., the maximum number of requests per second that can be treated. Subsequently, for each load, the consumption is measured as follows: a ramp-up period during which the specific load is targeted (as a fraction of the measured maximum load) is followed by a measurement period, and then by a ramp-down period. During the measurement period, power consumption values are stored, and the average power consumption is computed. The resulting values are shown in Figure 3(a).

Fig. 3. Measured arrival curve. (a) Measured power for load varying from 0 to 100%; the error bars represent the empirical standard deviation. (b) Plot of $d\alpha_m/dt$ determined via min-plus deconvolution for the Akamai data set.
We used the real traffic data from the Akamai data set presented in the work of Qureshi et al. [9]. There, the authors plotted incoming traffic at roughly half the servers of Akamai (in millions of hits per second) measured over a period of 24 days. We regard this data as a representative real-world workload. We assume that all the servers in a cluster are given roughly the same workload; this assumption allows us to map the Akamai traffic data to the load (from 0 to 1) of a single server. Several models for estimating dynamic power usage [9, 3] assume that the power consumed by a cluster is roughly proportional to its utilization. We use a similar model: we map server load to power consumption using the roughly linear power measurements in Figure 3(a) obtained as described above. We thus obtain a realistic estimation of the power consumption of a single server over 24 days (Figure 5).

We use this estimated power consumption to derive a minimum measured arrival curve $\alpha_m(\cdot)$ by applying min-plus deconvolution. Since the length of the trace is $T_{\max} = 24$ days, we determine $\alpha_m(t) := \sup_{0 \le s \le T_{\max} - t}\left[L(t + s) - L(s)\right]$ for $t$ less than some value $\tau \le T_{\max}$ (i.e., we determine the maximum energy consumption over time windows of size less than $\tau$). For ease of visualization, we plot the derivative of the measured arrival curve (rather than the curve itself) in Figure 3(b) for $\tau = 24$ hours.

5.3 Battery Sizing

Let us now determine feasible parameters of the service curve $\beta_1$ (with $t_1 := 24$h). Namely, we seek pairs $(z_{\max}, t_0)$ of maximum allowed power and interruption time per day which allow satisfiability of $\alpha_m$-smooth loads using a battery of finite capacity.
Theorem 4 states that taking the supremum over the difference between the arrival curve $\alpha$ and the service curve $\beta$ provides a sufficient battery size. Theorem 5 below gives a sufficient condition for the finiteness of this supremum: the guaranteed delivered energy needs to be greater than or equal to the maximum required energy on time windows of some length $T$. Moreover, the same theorem allows us to compute $B^*$ by taking the supremum over the interval $[0, T]$ instead of the semi-axis $[0, \infty)$.

Theorem 5. Consider $\alpha(\cdot)$ and $\beta(\cdot)$, arrival and service curves respectively. If there exists some $T > 0$ such that

$$\alpha(T) \le \beta(T), \tag{18}$$

then

$$\sup_{0 \le s \le T}\left[\alpha(s) - \beta(s)\right] = \sup_{s \ge 0}\left[\alpha(s) - \beta(s)\right]. \tag{19}$$
Proof. Take s > T. We can write s = kT + s′, where 0 ≤ s′ < T and k ∈ N. Then, by subadditivity of α and superadditivity of β, we have that α(s) − β(s) ≤ k[α(T) − β(T)] + α(s′) − β(s′) ≤ α(s′) − β(s′).

After determining the empirical arrival curve αm(t) for t ≤ t1 in Section 5.2, we wish to determine service curves β1 with parameters (zmax, t0), such that Eq.(18) is satisfied for some T ≤ t1 (i.e., we wish to guarantee a fully charged battery at some time during each day). We obtain a feasible region given by

D := { (zmax, t0) : 0 ≤ t0 ≤ t0*(zmax) := sup_{0≤T≤t1} [ T − αm(T)/zmax ] }.

We plot the resulting feasible region in the first plot of Figure 4(a). We compute B* for the points on the boundary of the feasible region using Eq.(16) and Theorem 5, and we plot it in the second plot of Figure 4(a) (we assume a 12V battery). The latter plot tells us that when there are no service interruptions (i.e., t0 = 0), a maximum power of 134W and a 12V battery of 13.96Ah suffice to satisfy any αm-smooth load. Let us now determine, for a fixed "reasonable" maximum power zmax, the interruption times and required battery sizes which ensure satisfiability of the load. In Figure 4(b) we pick 4 values for zmax, vary the interruption time t0, and plot the corresponding required battery charge B* for a 12V battery. For example, pick t0 = 1h. Then with zmax = 155W, a 12V battery of roughly 12.55Ah suffices. However, since a high Depth of Discharge (DoD) shortens the lifetime of a battery [7], we choose a higher charge, say 40Ah. This corresponds to a standard car battery, which (for αm-smooth loads) is guaranteed to have a charge level roughly above two thirds of its maximum charge at all times. In Figure 5 we show the evolution of the battery charge over time, when the greedy schedule Ũ is used. The control G is binary (on/off) and is compatible with the service curve β1 (zmax = 155W and t0 = 1h).
Time is discrete and we plot only the time steps during which an interruption is triggered. For verification, we plot the battery level (40Ah − B ∗ (155W, 1h)). As expected, the battery charge never drops below this level. Additionally, an oversized battery provides protection against unexpected blackouts.
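The battery sizing step above can be sketched numerically. The rate-latency form of β used below, β(t) = zmax · max(t − t0, 0), is an assumption for illustration (the paper's β1 additionally carries the per-day structure with t1); T is found as the first time satisfying Eq.(18), and B* follows from Theorem 5:

```python
import numpy as np

def battery_size(alpha, z_max, t0):
    """Sufficient battery energy B* = sup_{0<=s<=T} [alpha(s) - beta(s)] (Theorem 5),
    where T is the first horizon with alpha(T) <= beta(T) (Eq.(18)).
    Illustrative rate-latency service curve: beta(t) = z_max * max(t - t0, 0)."""
    t = np.arange(len(alpha))
    beta = z_max * np.maximum(t - t0, 0.0)
    ok = np.where((alpha <= beta) & (t > 0))[0]
    if len(ok) == 0:
        return None  # Eq.(18) not satisfied on this horizon: B* may be unbounded
    T = ok[0]
    return float(np.max(alpha[: T + 1] - beta[: T + 1]))

# toy arrival curve: at most 100 Wh per hour, i.e. alpha(t) = 100 t (t in hours)
alpha = 100.0 * np.arange(25)
B_star = battery_size(alpha, z_max=155.0, t0=1)  # energy in Wh; Ah at 12V: B_star / 12
```

With these toy numbers, a one-hour latency at 155 W requires a buffer of 100 Wh, i.e. roughly 8.3 Ah at 12 V.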
J.-Y. Le Boudec and D.-C. Tomozei
(a) Feasible maximum power zmax and interruption time t0 . Battery charge B ∗ for points on the feasibility region boundary.
(b) For several values of zmax , we plot the required battery size B ∗ for a desired interruption time t0 .
Fig. 4. Finding battery charge B ∗ for measured arrival curve αm , for various parameters of β1
[Fig. 5 plot area: three stacked panels - Load [W], Power interruption (on/off), Battery charge [Ah] - against Time (h), from 0 to 576.]
Fig. 5. Evolution of a 12V-40Ah battery charge, using greedy schedule Ũ, subject to control satisfying β1 with zmax = 155W and t0 = 1h. The charge level remains above roughly 2/3.
Note that the deployment of such a system depends only on monetary incentives. Namely, the savings generated by using a Demand-Response service compared to the normal cost of electricity need to justify equipping each server with a battery. In this section we have provided the order of magnitude of the required battery size.
6 Conclusion

We have studied the dimensioning of a perfect battery in a scenario where an inflexible load is subject to Demand-Response via quantity. We have shown that the techniques used in Network Calculus can easily be applied and provide interesting results. Namely, we have shown that when the load is characterized using a known arrival curve, and when Demand-Response is constrained by a known service curve, a large enough battery charge can be determined such that there exists an online battery charging schedule. The perfect battery assumption allows the derivation of closed-form expressions for the minimal and maximal charging schedules. In upcoming work we will focus on models for non-perfect energy storage, which do not seem to possess the same property.
References

[1] CAISO and GE Consulting: Integration of renewable resources. Operational requirements and generation fleet capability at 20% RPS. Technical report, CAISO (August 2010)
[2] Cho, I., Meyn, S.: Efficiency and marginal cost pricing in dynamic competitive markets with friction. Theoretical Economics 5, 215–239 (2010)
[3] Fan, X., Weber, W.-D., Barroso, L.A.: Power provisioning for a warehouse-sized computer. In: Proceedings of ISCA (2007)
[4] Thiran, P., Le Boudec, J.-Y.: Network Calculus. LNCS, vol. 2050, pp. 3–81. Springer, Heidelberg (2001), http://infoscience.epfl.ch/record/282
[5] Le Boudec, J.-Y., Tomozei, D.-C.: Demand response using service curves. In: Proceedings of PES ISGT - Europe. IEEE (2011) (to appear)
[6] Li, N., Chen, L., Low, S.H.: Optimal demand response based on utility maximization in power networks. In: 2011 IEEE Power and Energy Society General Meeting, pp. 1–8 (July 2011)
[7] Linden, D., Reddy, T.: Handbook of Batteries, 3rd edn. McGraw-Hill (2002)
[8] Peaksaver, http://www.peaksaver.com/
[9] Qureshi, A., Weber, R., Balakrishnan, H., Guttag, J., Maggs, B.: Cutting the electric bill for internet-scale systems. In: Proceedings of ACM SIGCOMM 2009, pp. 123–134. ACM, New York (2009)
[10] SPEC: Design document SSJ workload (2011), http://www.spec.org/power/docs/SPECpower_ssj2008-Design_ssj.pdf
[11] Voltalis: Bluepod, http://www.voltalis.com/
A Technical Lemmas

The proof of Theorem 1 is based on min-plus and max-plus system theoretic lemmas [4, Chapter 4], which are given in this section. We denote with G the set of functions f : [0, ∞) → R that are wide-sense increasing, and with F the subset of all f ∈ G such that f(0) = 0. Note that the functions L, G and U are all in F.
A.1 Maximal Solution

Lemma 1. Let f ∈ F and g ∈ F. Consider the problem

(P10)  U ∈ F,
       U(t) ≤ f(t) for all t ≥ 0,
       U(t) − U(s) ≤ g(t) − g(s) for all t ≥ 0 and s ≥ 0 with s ≤ t,

where the unknown is the function U. This problem has a maximal solution Ū, defined by

Ū(t) = inf_{s: 0≤s≤t} (g(t) − g(s) + f(s)),    (20)

i.e., Ū is a solution and any other solution U satisfies U(t) ≤ Ū(t) for all t ≥ 0.

Proof. Note that the problem always has one trivial solution, U(t) = 0 for all t ≥ 0. Let Π be the operator G → G defined by (ΠU)(t) = inf_{s: 0≤s≤t} (g(t) − g(s) + U(s)). The fact that ΠU ∈ G whenever U ∈ G follows from [4, Chapter 3], where this operator is called Π = h_g. Also note that (ΠU)(0) = U(0), thus Π maps F to F. Consider the problem (P10′) obtained by relaxing the condition U(0) = 0:

(P10′) U ∈ G,
       U(t) ≤ f(t) for all t ≥ 0,
       U(t) − U(s) ≤ g(t) − g(s) for all t ≥ 0 and s ≥ 0 with s ≤ t.

The problem (P10′) is equivalent to U(t) ≤ min(f(t), (ΠU)(t)) for all t ≥ 0. The operator Π is min-plus linear [4, Chapter 4], therefore it is isotone and upper-semicontinuous. Thus, by [4, Theorem 4.3.1], the problem (P10′) has a maximal solution Ū = Π̄f, where Π̄ is the min-plus closure of Π. The min-plus closure of Π is the infimum of the identity and the iterates of Π. Now note that Π is idempotent, i.e., for all U ∈ F, Π(ΠU) = ΠU; thus the iterates of Π are equal to Π. Further, (ΠU)(t) ≤ U(t). Therefore the min-plus closure of Π is Π̄ = Π. Thus the maximal solution of (P10′) is Ū = Πf, which gives Eq.(20). Last, note that Ū ∈ F because f ∈ F. Thus Ū is a solution to (P10); further, any solution U of (P10) is a solution of (P10′), thus is upper-bounded by Ū.
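Eq.(20) can be checked numerically in discrete time. The sketch below is our own illustration (not from [4]): the infimum over s ≤ t becomes a running minimum of f − g, and idempotence of Π is verified directly:

```python
import numpy as np

def maximal_solution(f, g):
    """Discrete-time sketch of Eq.(20): U_bar(t) = inf_{0<=s<=t} (g(t) - g(s) + f(s)).
    Equivalently, U_bar = g + running minimum of (f - g)."""
    return g + np.minimum.accumulate(f - g)

# toy nondecreasing f, g with f(0) = g(0) = 0
f = np.array([0.0, 2, 3, 7, 8])
g = np.array([0.0, 1, 3, 4, 6])
U_bar = maximal_solution(f, g)
```

One can verify that `U_bar <= f`, that its increments are bounded by those of `g`, and that applying the operator a second time leaves `U_bar` unchanged (idempotence of Π).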
A.2 Minimal Solution

Lemma 2. Let f, g ∈ F such that f(t) ≤ g(t) for all t ≥ 0. Consider the problem

(P20)  U ∈ F,
       U(t) ≥ f(t) for all t ≥ 0,
       U(t) − U(s) ≤ g(t) − g(s) for all t ≥ 0 and s ≥ 0 with s ≤ t,

where the unknown is the function U. This problem has a minimal solution U̲, defined by

U̲(t) = sup_{s: s≥t} (g(t) − g(s) + f(s)),    (21)
i.e., U̲ is a solution and any other solution U satisfies U(t) ≥ U̲(t) for all t ≥ 0.

Proof. Note that the problem always has at least one solution, U(t) = g(t). Let Q be the operator G → G defined for U ∈ G by

(QU)(t) = sup_{s≥t} (g(t) − g(s) + U(s)).

Q is the max-plus equivalent of Π. More precisely, define H(t, s) for s ≥ 0 and t ≥ 0 by

H(t, s) = g(t) − g(s)  if t ≤ s,
H(t, s) = 0            if t ≥ s,    (22)

so that (QU)(t) = sup_{s≥0} (H(t, s) + U(s)),
and further, t → H(t, s) is nondecreasing and s → H(t, s) is nonincreasing; therefore Q is a max-plus linear operator from G to G ([4, Theorem 4.1.3]). Now consider the problem (P20′) obtained by relaxing the condition U(0) = 0:

(P20′) U ∈ G,
       U(t) ≥ f(t) for all t ≥ 0,
       U(t) − U(s) ≤ g(t) − g(s) for all t ≥ 0 and s ≥ 0 with s ≤ t.

The problem (P20′) is equivalent to U(t) ≥ max(f(t), (QU)(t)) for all t ≥ 0. We have shown that the operator Q is max-plus linear, therefore it is isotone and lower-semicontinuous. Thus, by [4, Theorem 4.3.2], the problem (P20′) has a minimal solution U̲ = Q̄f, where Q̄ is the max-plus closure of Q. The max-plus closure of Q is the supremum of the identity and the iterates of Q. Next, it is straightforward to check that Q is idempotent, i.e., Q ◦ Q = Q, and that (QU)(t) ≥ U(t) for all t ≥ 0 and U ∈ G. It follows that Q̄ = Q. Therefore, Problem (P20′) has a minimal solution, equal to U̲ = Qf, which is the same as Eq.(21). It remains to check that U̲ ∈ F, i.e., that U̲(0) = 0. Note that Q does not map F to F. However, we have assumed that f(0) = g(0) = 0 and f(s) ≤ g(s) for all s ≥ 0; therefore

U̲(0) = sup_{s≥0} (−g(s) + f(s)) = 0.
Thus U̲ is a solution of (P20). As in the end of the proof of Lemma 1, this shows that U̲ is the minimal solution of (P20).
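Eq.(21) admits the same kind of discrete-time sketch as Eq.(20) (again our own illustration): the sup over s ≥ t becomes a suffix maximum of f − g, truncated at the end of the finite trace:

```python
import numpy as np

def minimal_solution(f, g):
    """Discrete-time, finite-horizon sketch of Eq.(21):
    U_low(t) = sup_{s>=t} (g(t) - g(s) + f(s)) = g + suffix maximum of (f - g).
    On a finite trace the sup over s >= t is truncated at the last sample."""
    d = f - g
    return g + np.maximum.accumulate(d[::-1])[::-1]

# toy curves with f <= g pointwise and f(0) = g(0) = 0 (hypothesis of Lemma 2)
f = np.array([0.0, 0, 1, 5, 5])
g = np.array([0.0, 2, 3, 5, 8])
U_low = minimal_solution(f, g)
```

As Lemma 2 asserts, the result satisfies U̲(0) = 0, U̲ ≥ f, and has increments bounded by those of g.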
A Formal Definition and a New Security Mechanism of Physical Unclonable Functions Rainer Plaga and Frank Koob Federal Office for Information Security (BSI), D-53175 Bonn, Germany {rainer.plaga,frank.koob}@bsi.bund.de
Abstract. The characteristic novelty of what is generally meant by a "physical unclonable function" (PUF) is precisely defined, in order to supply a firm basis for security evaluations and the proposal of new security mechanisms. A PUF is defined as a hardware device which implements a physical function with an output value that changes with its argument. A PUF can be clonable, but a secure PUF must be unclonable. This proposed meaning of a PUF is cleanly delineated from the closely related concepts of "conventional unclonable function", "physically obfuscated key", "random-number generator", "controlled PUF" and "strong PUF". The structure of a systematic security evaluation of a PUF enabled by the proposed formal definition is outlined. Practically all current and novel physical (but not conventional) unclonable physical functions are PUFs by our definition. Thereby the proposed definition captures the existing intuition about what is a PUF and remains flexible enough to encompass further research. In a second part we quantitatively characterize two classes of PUF security mechanisms, the standard one, based on a minimum secret readout time, and a novel one, based on challenge-dependent erasure of stored information. The new mechanism is shown to allow in principle the construction of a "quantum-PUF", that is absolutely secure while not requiring the storage of an exponentially large secret. The construction of a PUF that is mathematically and physically unclonable in principle does not contradict the laws of physics.
1 Introduction

1.1 Aims and Outline of This Work
"Physical unclonable functions" (PUFs) are electronic hardware devices that are hard to reproduce and can be uniquely identified [14,8]. They promise to enable qualitatively novel security mechanisms (see e.g. [2,9,10]) and have consequently become a "hot topic" in hardware security [5]. The present work asks the question: "What characteristics exactly define the qualitative novelty of the PUF concept?" We hope that a precise answer will aid the security evaluation of existing PUFs and help to develop new ideas for PUF security mechanisms. We searched for

J.B. Schmitt (Ed.): MMB & DFT 2012, LNCS 7201, pp. 288–301, 2012. © Springer-Verlag Berlin Heidelberg 2012
1. a formal definition of the properties that are required from a hardware device to be called "PUF", and
2. a formal definition of the criteria that have to be fulfilled to consider a PUF "unclonable".

The formal PUF definition should not suffer from weaknesses of previous definitions (see section 1.2), encompass at least the large majority of the existing PUF constructions, and be as flexible as possible, i.e. not restrict further progress in PUF development (e.g. by demanding constructional details, like the amount of stored information). This aim is achieved in section 2.1. After formulating a simple definition of PUF-security (based on Armknecht et al. [1]) in section 2.2, we delineate PUFs from some closely related security concepts (section 3) and outline the elements of a PUF-security evaluation (section 4). In the second part of the paper we systematically analyse and classify PUF security mechanisms and calculate their quantitative security levels against attacks that attempt mathematical cloning (section 5). The aims of this section are to give a quantitative answer to Maes & Verbauwhede's [13] question whether mathematically unclonable PUFs are possible in principle, and to apply and thereby illustrate the PUF definitions of the first part of the paper. In section 6 we characterise the qualitative novelty of PUFs as a new primitive of physical cryptography and discuss the future use and development of PUFs.

1.2 Previous Work on the Definition of a PUF

There have already been several proposals for the first definition of required PUF properties. Gassend et al. [8], who invented the term "PUF" (earlier work by Pappu was on the slightly different concept of a physical one-way function [14]), demand that the function must be "easy to evaluate", i.e. it must efficiently yield a response value "R" for a challenge argument "C", and "hard to predict (characterize)". The latter property means that an attacker who has obtained a polynomial number of C–R pairs (CRPs) but no longer has physical access to the PUF can only extract a negligible amount of information about the R for a random C. Rührmair et al. [15] criticised this definition because the information content of finite physical objects is always polynomially bounded, and therefore no PUF fulfilling this definition can exist. They propose an alternative formal definition in which the PUF must only be hard to predict for an attacker "who may execute any physical operation allowed by the current stage of technology". Maes & Verbauwhede [13] chose to exclude unpredictability from their "least common property subset" of PUFs, because they put into question whether it is possible in principle to construct a mathematically unclonable PUF. They demand that a PUF is "easy to evaluate" (property "evaluatable") and that it is "reproducible", meaning that a C always leads to the same R within a small error. Moreover they demand "physical unclonability", i.e. that it must be "hard" for an attacker to construct a device that reproduces the behaviour of the PUF. However, PUFs that are mathematically clonable are also physically clonable, because the mathematical algorithm for PF can then be implemented
on a device that is then a functional physical clone of the PUF. Summarizing, a first generation of definitions roughly defined PUFs to be devices that are efficiently evaluatable and are mathematically and physically unclonable. They remain unsatisfactory for two reasons:

1. Most of the devices currently called PUFs do not fulfill these definitions (according to Rührmair et al. [15] there are only some "candidates"), i.e. the definitions evidently do not really capture the PUF concept.
2. They combine the definition of a PUF with the definition of its security, i.e. points 1. and 2. above. A PUF is defined by its unclonability, i.e. its security against attacks. This is problematic because an open-ended security analysis of a PUF clearly must have an "insecure PUF" as one a priori possible outcome. Based on the above definitions an "insecure PUF" is a paradox; PUFs would be secure by definition.

These two problems were elegantly solved in a seminal paper by Armknecht et al. [1], who propose to formalize a PUF as a "physical function" (PF) - a physical device that maps bit-string challenges "C" to bit-string responses "R". The unclonability is recognized by Armknecht et al. as only one crucial security property, which they further formally define in great detail. We will supply a simplified version of their general security definition in section 2.2 below. Following Armknecht et al., the PUF definition 1. consists in an answer to the question: What are the required characteristics of PF() in order to be a PUF? Armknecht et al. do not demand any specific mathematical properties but only that a PF is a "probabilistic procedure" that maps a set of challenges to a set of responses, and that internally PF is a combination of a physical component and an evaluation procedure that creates a response. Armknecht et al. explain that the responses rely heavily on the properties of the physical component but also on uncontrollable random noise (hence "probabilistic").
This definition of PF() still faces the following problem: Consider a standard authentication chip with a stored secret in a physically protected memory that calculates a response from the challenge and the secret. Such a chip must contain a "physical component" (the memory) and an evaluation procedure (its read-out) that fulfills Armknecht et al.'s definition, because some (very small) uncontrollable random noise is unavoidable even in standard computer memories. There is also no reason why a well designed standard authentication chip cannot possess Armknecht et al.'s security properties. Therefore, even though Armknecht et al.'s definitions constitute great progress of lasting value, they still do not capture the distinctive properties of the PUF concept. In practice, Armknecht et al. define all devices that run any challenge-response protocol as PUFs.
2 A Model of the PUF Concept
2.1 Formal Definition of "PUF"
In the following we assume Armknecht et al.'s model of a PUF as a physical function PF() (see section 1.2). We break up the physical function PF() into three steps (see fig. 1). C, Sr, S and R are bit strings.

1. In the first "physical read-out" step PF1(C) = Sr, internal information Sr (the "raw secret") is physically read out from the PUF in response to a challenge C foreseen by the system architecture.
2. In an optional second step PF2(Sr) = S, error correction and/or privacy amplification are performed, such that errors in the read-out are corrected and parts of Sr which may be known by the attacker (e.g. by guessing parts of the challenge) are removed by privacy amplification algorithms.
3. In an optional third step PF3(S) = R, some additional algorithm is performed with S as input to calculate the final response R. Typically PF3 is some cryptographic protocol that proves the possession of S without disclosing it.

In many existing PUF architectures the challenge C is an address of information inside the PUF which is output as the response R. E.g. in arbiter PUFs [11], C defines the choice of a set of delay switches whose cumulative delay path defines S (and from this R). Our idea is that the possibility for this mode of addressing, rather than its "unclonability", defines a PUF. The challenge C can then be understood as a key required for physical access to the response R. R remains secret without access to C. A security architecture based on this idea requires certain properties of PF() which define the PUF concept:

Formal Definition 1 of a PUF. A hardware device is called "PUF" if:
a. a physical function PF2(PF1()), which is deterministic for a specific set of challenges M, can be evaluated with each challenge at least once, and
b. the value S = PF2(PF1(C)) changes with its argument for challenges C ∈ M, i.e. PF2(PF1(C)) = S is not a constant function.
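The three-step decomposition S = PF2(PF1(C)), R = PF3(S) can be sketched as a toy software model. Everything below (the class `ToyPUF`, the cell addressing, the noise level, the repetition-vote error correction, the hash as PF3) is invented for illustration and is not a construction from the literature; real PUFs realize PF1 in physics, not code:

```python
import hashlib
import random

class ToyPUF:
    """Hypothetical toy model of S = PF2(PF1(C)) and R = PF3(S).
    Device-specific physics is simulated by a per-seed random bit array."""

    def __init__(self, seed, flip_prob=0.05):
        rng = random.Random(seed)            # stands in for manufacturing variation
        self._cells = [rng.getrandbits(1) for _ in range(1024)]
        self._flip_prob = flip_prob          # read-out noise level (simulated)
        self._noise = random.Random()

    def _pf1(self, c):
        # Step 1: physical read-out. The challenge C addresses internal cells;
        # each read bit may flip with probability flip_prob (uncontrollable noise).
        idx = [(c * 131 + i * 17) % len(self._cells) for i in range(64)]
        return [self._cells[j] ^ (self._noise.random() < self._flip_prob)
                for j in idx]

    def _pf2(self, reads):
        # Step 2: error correction -- bitwise majority vote over repeated read-outs.
        return [int(sum(col) * 2 > len(reads)) for col in zip(*reads)]

    def response(self, c):
        # Step 3: PF3 proves possession of S without disclosing it (here: a hash).
        s = self._pf2([self._pf1(c) for _ in range(5)])
        return hashlib.sha256(bytes(s)).hexdigest()
```

With noise switched off (`flip_prob=0.0`) the response is reproducible for a fixed challenge, illustrating property a.; different challenges address different cells, illustrating property b.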
One difference to some previous PUF definitions is that PF() is not required to be easily evaluatable. An efficient evaluation of S is certainly a desirable design goal, but there is no reason why a device with inefficient read-out cannot be a PUF by definition. Another difference to most previous definitions is that it allows a PUF to be clonable. As an example, consider the following physical function that fulfills the above definition 1:

– PF1(any C with more 1s than 0s) = 1001101101
– PF1(any C with more 0s than 1s or equal number of 1s and 0s) = 0001101000

Clearly a PUF with this PF1 can be reproduced by a trivial algorithm, i.e. it is trivially mathematically clonable. This is a desirable property because "clonable
Fig. 1. Symbolic model of a PUF. The box delineates the PUF that receives a challenge C (shown with an example bit string) and sends a response R that is determined in three distinct steps: the first step is the physical read-out, the second the correction of errors that can occur in the first step, and the third step includes all operations of mathematical cryptography.
PUFs" do exist in the real world and should not present a PUF definition with a paradox. "Unclonability" is then a property that is aimed for, rather than achieved by definition. Analogously, "cryptography" aims for secrecy (crypto) rather than achieving it by definition. Even though it is a child's game to break it, the Caesar cipher is a valid cryptographical algorithm according to this definition. Consequently, cryptographic algorithms are commonly defined to be "key-dependent injective" (rather than "unbreakable") mappings [20]. Where does this leave PUF security? It is not possible in principle to extract the secret S from a PUF without knowledge of the challenge. This is true even for the above insecure PUF. However, in the example above it is easy to reproduce PF1, and therefore, as soon as the challenge becomes known, S becomes known. Therefore the crucial necessary objective for the security of a PUF is the unclonability of PF2(PF1()). In the next section we make this insight more precise. A complete and quantitative set of security requirements (i.e. with requirements on the length of the secrets, the number of independent challenges N, etc.) can only be made in the context of a concrete PUF architecture. One example is discussed further in section 5.1.

2.2 Security of a PUF
Requirements for prediction. Even though the response of a PUF can in principle be used for various purposes, we will conclude in section 6 that one central PUF capability is the distribution of remote authentication secrets. If S
is used for authentication purposes, an attacker must be able to fully predict it, i.e. a partial prediction of S = PF2(PF1(C)) for a given argument C will not be considered a successful attack in the following. Therefore, the natural "basic objective" of PUF security is that the attacker cannot predict a complete, correct bit string S for a given bit string C.

Attack models. Security can only be defined relative to an attack model, which lays down the assumptions about the security environment. We assume in the following two models from the literature that seem realistic in practice. The first one models an attempt to break Armknecht et al.'s [1] selective unclonability¹. It does not put any restriction on the attack strategy; therefore adaptive choices of challenges are possible². The second one is an attempt to do the same with a certain reasonable amount of insider knowledge. Both models assume that the attacker has only access to one single PUF, i.e. attacks exploiting correlations between different PFs are excluded by assumption (see Armknecht et al. [1] for the general case).

Attack model 1: "Outsider attack": The attacker has physical access to the attacked PUF only for a finite amount of time Δta. After this access period, she tries to predict a secret S from the PUF to a challenge C, randomly chosen from the set of all challenges. She has no knowledge of the set of challenges and secrets that will be used during the active lifetime of the PUF or any further previous knowledge of the PUF.

Attack model 2: "Insider attack": The attacker has physical access to the attacked PUF only for a finite amount of time Δta. After this access period she tries to predict a secret S from the PUF to a challenge C, randomly chosen from the set of all challenges.
She has no knowledge of the set of challenges and secrets that will be used during the active lifetime of the PUF, but she has all other information that the manufacturer of the PUF has about the attacked individual PUF. The attack models assume that the attacker tries to predict S rather than R, because PF3 might be protected with non-PUF security mechanisms, e.g. with a

¹ Rührmair et al.'s [15] PUF definition demanded that the original manufacturer of the PUF cannot produce two PUFs which are clones of each other (Armknecht et al. [1] demand this "existential unclonability" only optionally). "Selective unclonability" [1] means that given physical access to the device an attacker cannot produce a clone. In practice existential unclonability would hardly enhance the security against a malicious manufacturer, for the following reason. She could produce "quasi-existential-PUF" devices that do not meet the PUF definition 1, but algorithmically simulate - e.g. with a keyed hash function - an output that cannot be discriminated from the one of an existential PUF. These quasi-existential-PUFs could be easily cloned by the malicious manufacturer, and could serve exactly the same purpose as clonable PUFs. As an alternative to existential unclonability we will propose a weaker "resistance-against-insider-attacks" security level in this section 2.2.
² Therefore strong unpredictability in the sense of Armknecht et al. [1] will be necessary to protect the PUF.
secure tamper-resistance scheme in combination with a secure crypto algorithm. Such a security mechanism shall remain out of our consideration because we aim to define the security of the PUF itself. Security against a model-2 attacker corresponds to unclonability against an attacker who has most of the inside knowledge about the PUF production, but who cannot directly manipulate the production process. This unclonability is weaker than "existential unclonability" (see footnote 1) but perhaps more relevant in practice.

Definition of a secure PUF. The PUF-security definition now follows from the requirement that the attack shall be unsuccessful:

Formal definition 2 of the PUF-security objective. A PUF is secure against an attack of a model-1 ("selectively unclonable") attacker if a model-1 attacker can compute or physically copy the function PF2(PF1(C)) = S for not more than a negligible fraction L of challenges from the set of all possible challenges.

Here "compute" means via a computation independent of the PUF and corresponds to "mathematical cloning". "Physically copy" means to create a device that functionally reproduces PF2(PF1(C)) and corresponds to "physical cloning". Replacing the model-1 by a model-2 attacker defines a PUF that is "insider selectively unclonable". L is the security level of a secure PUF, i.e. the probability for an attacker to successfully predict the secret S for a challenge C without being in possession of the PUF after the access period. A precise quantification of "negligible", i.e. the decision which upper limit of L is required, cannot be made on the level of this general definition because it depends on the detailed security environment. L is analogous to the required probability p of a successful brute-force attack in classical cryptography, which depends on the key length. We propose as a reasonable upper limit on L that it be "negligible on a terrestrial scale", which has been estimated by Émile Borel as < 10^-15 [4].
3 Relation of PUFs to Closely Related Concepts
In this section we delineate the concept of a PUF as defined in sections 2.1 and 2.2 from five closely related concepts.

3.1 PUFs and Conventional Unclonable Functions ("CUFs") Are Qualitatively Different
Let us first differentiate between a PUF and a conventional physical function that serves the same purpose as a PUF (called "conventional unclonable function", CUF, in the following). A CUF contains secret information that is protected by tamper resistance, by anti-side-channel- and fault-induction-attack measures,
and by a cryptographic algorithm that protects the secret from disclosure via the response. A CUF does not fulfill the PUF definition 1, because the secret does not depend on the challenge. In other words: the first physical secret read-out step PF1(C) is a constant function in a CUF. PUFs and CUFs differ qualitatively in the way they protect the secret. In a PUF the lack of knowledge of the challenges protects the secret S, in a similar sense as the lack of knowledge of a cryptographical key protects the clear text in a cipher text. There is no analogous "key" in a CUF. Its security mechanisms merely rely on physical barriers and arrangements that prevent access to secret information.

3.2 PUFs and Physically Obfuscated Keys Are Independent Concepts
Devices that extract physical information with "non-standard" methods are currently called PUFs even if there is no (or effectively a single fixed internal) challenge (e.g. in SRAM PUFs [10]). In this case PF1() is formally constant, so that such devices are not PUFs in the sense of our definition 1. We endorse Rührmair et al.'s suggestion [15] to call information extracted in this way in general "physically obfuscated keys" (POKs). This limit of N = 1 is the only one where devices that are currently called PUFs would no longer be classified as PUFs under our proposed definition. We find this appropriate because, while POKs can enable valuable tamper-resistance mechanisms (see below), they are not the qualitatively novel primitive of physical cryptography that PUFs promise to be (see section 6 for further discussion of the nature of this primitive). The protection by obfuscation is valuable: it consists in the extra time an attacker needs to learn the non-standard read-out mechanism or the position in a standard memory where an obfuscated key has to be stored at least temporarily. POKs are delineated from CUFs only by the "non-standard" qualifier, because stored information is always physical [12]. The secrets of PUFs will usually be stored in a non-standard way, i.e. they will also be POKs. But this is no necessary requirement for a PUF. There is no fundamental reason why PUFs cannot have "standard" computer memories (see e.g. SHICs [17], a PUF using a standard crossbar memory). Physically obfuscated functions (POFs) may also appear in PUF architectures. They are defined as computation with non-standard physical processes, e.g. via scattering of light or folding of proteins.

3.3 Random Number Generators
In both deterministic and physical random number generators the initial readout step PF1 (the readout of the seed) does not depend on a challenge C. In secure deterministic RNGs PF1(C) must be a constant function. In physical RNGs PF1 is not constant but intrinsically random, i.e. not deterministic. Therefore, RNGs do not meet PUF definition 1.
R. Plaga and F. Koob

3.4 Controlled PUFs: A PUF with Additional Tamper Resistance
In controlled PUFs[7,9] tamper-resistance measures prevent the attacker from obtaining C – Sr pairs from the PUF. Only the C – R pairs, from which Sr cannot be derived if PF3 is a suitable, secure cryptographic algorithm, can be accessed by an attacker. It seems likely that PUFs used e.g. in smart cards will eventually all be controlled PUFs, because such an additional, well-understood security layer stands to reason. However, the security of PUFs themselves should be analysed under the assumption of no such control, because if one trusts the control mechanism, mathematically clonable PUFs suffice anyway.

3.5 “Strong PUFs”: Not the Only Path to Strength
Rührmair et al.[15] defined a PUF to be “strong” if it “has so many C – R pairs ... that an attack ... based on exhaustively measuring the C – R pairs has a negligible probability of success”. In our nomenclature a strong PUF is roughly an MRT-PUF that fulfills our second security requirement (see section 5.1 below for further explanation of MRT). It is thus appropriate to call them “strong”, but there can be secure PUFs which are not “strong” in Rührmair et al.’s sense, e.g. EUR-PUFs (see section 5.2 below for further explanation of EUR).
4 Security Evaluation of PUFs
A main purpose of the proposed formal definitions 1 and 2 of the concept “secure PUF” is to establish a consistent basis for security evaluations and certifications of PUFs. What is the structure of an evaluation on this basis? If the proposed PUF fulfills definition 1, the basic informal questions of a security evaluation based on definition 2 are:

1. What form does PF1(C) have, and by which physical mechanism is Sr extracted?
2. What is the form of PF2(Sr) = S, and how is the function evaluated physically?
3. What is the total information content in the set of all secrets S?
4. For what fraction L of the allowed challenges can PF2(PF1(C)) be either mathematically computed or physically copied?
5. Which comprehensible physical security mechanisms prevent an attacker from computing or copying PF2(PF1(C)) for more than a fraction L of all challenges?
Answers to questions 1–4 allow one to evaluate quantitative and comprehensible security levels against “mathematical-cloning brute-force attacks” (see section 5). Question 5 will have a more qualitative answer, similar to answers to the question whether a mathematical cryptographic algorithm is secure against non-brute-force attacks.
5 Analysis of PUF Security Mechanisms
A Formal Definition and a New Security Mechanism of PUFs

The holy grail of PUF construction is to construct PUFs that are unclonable, i.e. fulfill the security definition 2 (section 2.2). If an attacker succeeds in accessing the PUF’s internal secrets, she will usually be able to compute PF2. Because physical reproduction of a PUF without knowledge of its internal secrets will probably be hard in practice^3, PUF security mechanisms must above all prevent the attacker from computing PF2. In other words: mathematical unclonability is the hardest nut. Therefore we will classify the known PUF security mechanisms and calculate their security level against brute-force mathematical-cloning attacks. Up to now all proposed and constructed PUFs^4 are based on a mechanism that we propose to call “minimum readout time” (MRT), which is further discussed in subsection 5.1. All these existing PUFs turn out to fulfill our PUF definition 1, i.e. they “remain” PUFs even in case they have turned out to be clonable (see below). Because currently the MRT mechanism dominates the field, one might be tempted to equate the very concept of PUFs with it. However, the flexibility of our definition allows a completely different PUF security mechanism for devices, which we call “erasure upon readout” (EUR) (see section 5.2). One concrete EUR PUF, the quantum PUF, will be introduced below. These examples show that our proposed definitions have achieved their aims: nearly all existing (MRT) PUFs can be included in their scope, but their flexibility allows the inclusion of completely novel PUF constructions (the EUR PUFs).

5.1 “Minimum Readout Time” PUFs
This well-known PUF security mechanism is to store a large enough number N of C – S pairs on the PUF so that the total time

Δt_t = Δt_r × N    (1)

to read them all out is much longer than the time Δt_a during which an attacker possesses the PUF. Δt_r is the readout time for one C – S pair. The maximal fraction of pairs the attacker can read out is then Δt_a/Δt_t = L_bf. L_bf is the security level against mathematical-cloning brute-force attacks. Pappu’s optical PUF[14], the arbiter PUF[8] and nearly all other current PUFs are MRT-PUFs^5. These constructions are valid PUFs according to our definition because their values of PF2 change with the challenge. However, many of the existing PUFs are insecure according to our definition, because Rührmair et al.[16] succeeded in employing machine-learning methods that infer PF2(PF1()) from a small fraction of all C – R pairs, for which only a short Δt_a is necessary. Because all C – S pairs can thus be predicted, the security level against machine-learning attacks is L_ml = 1, which is “not negligible” in general, i.e. the PUF must be considered mathematically clonable according to PUF-security definition 2.

^3 But not necessarily impossible. She could e.g. succeed in cloning a PUF by exactly copying its production process.
^4 In the sense of this paper, i.e. excluding POKs.
^5 The only exception is “PUFs” with only one challenge, which we propose to call only “POKs” in the future, see section 3.2.
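The brute-force level of eq. (1) is easy to evaluate numerically. The following sketch uses our own variable names and the representative values from the text (Δt_a = 1 day, Δt_r = 1 second, L = 10^-15):

```python
# Brute-force security level of an MRT-PUF, following eq. (1):
# Delta_t_t = Delta_t_r * N  and  L_bf = Delta_t_a / Delta_t_t.

dt_a = 24 * 3600          # attacker possession time: 1 day, in seconds
dt_r = 1.0                # readout time per C-S pair: 1 second
L = 1e-15                 # required brute-force security level

# number of stored C-S pairs needed so that L_bf <= L
N_required = dt_a / (dt_r * L)
print(f"N_required = {N_required:.2e}")   # ~8.64e+19, i.e. on the order of 10^20

# conversely, the level achieved by a given N
N = 10**20
L_bf = dt_a / (dt_r * N)
print(f"L_bf = {L_bf:.2e}")               # below the 1e-15 threshold
```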
The exact form of PF() depends on the detailed architecture of the MRT PUF. In general MRT PUFs can be hardened against mathematical cloning if their PF2(PF1) fulfills the following demands:

Security requirements for the MRT-PUF
– N must satisfy: N ≥ L^(-1)(Δt_a/Δt_r).
– Suppose PF2(PF1(C_n)) = S_n with n = 1...N, where both C_n and S_n contain λ bits. Then the combined information content (entropy) I of all C_n and S_n must satisfy: I ≥ 2λN.
– The set of challenges to be used in operation must not be contained in any form in the PUF.
– The lengths λ_C and λ_S of the challenge C and response S must both fulfill: λ_C, λ_S ≥ log2(N).

The first condition expresses that, to prevent brute-force mathematical-cloning attacks, the number of stored C – R pairs N must be extremely large if L = 10^-15 (see section 2.2 on the choice of L). With representative values of Δt_a = 1 day and Δt_r = 1 second, the required N would be on the order of 10^20, which is far larger than what is storable e.g. in common data storage devices of much larger size than a typical PUF. This is the sense in which a secure MRT-PUF requires the storage of an “exponentially large” secret. The second condition expresses that, in order to reliably ward off successful machine-learning attacks, PF2 must be just an ordered list of C – S pairs with random values that cannot be represented in any more compact form. The third requirement prevents an attack in which only the set of challenges to be used in the field operation of a PUF (which is much smaller than N in secure MRT PUFs) is extracted. The fourth constraint is necessary to avoid a decrease in the effective L.

5.2 “Erasure Upon Readout” PUFs – Quantum PUFs
Consider a PUF with only a single C – S pair foreseen by the system architecture. Because there is at least one other non-foreseen C, there are then at least two possible C. A novel PUF security mechanism requires the following:

Security requirements for “Erasure Upon Readout” (EUR) PUF
– The correct S is returned if the challenge C is correct (i.e. the one foreseen by the PUF’s architecture); S is erased and a random value is returned if it is not.
– The lengths λ_C and λ_S of the challenge C and response S must both fulfill: λ_C, λ_S ≥ log2(1/L).
– The set of challenges to be used in operation must not be contained in any form in the PUF.

EUR PUFs can fulfill PUF definition 1 if they are non-constant PFs that are deterministic for the foreseen set of challenges. For EUR PUFs - completely opposite to the MRT case (see section 5.1) - the total number of challenges “N” can remain as small as 2 but still be secure, because by way of the second and
third security requirement, the probability to guess the correct challenge is only L, and challenging with a wrong challenge will erase S by the first requirement. N can be chosen as large as the number of different challenges actually needed during the practical deployment of the PUFs. The only concrete “Erasure Upon Readout” architecture proposed up to now is Wiesner’s “quantum money” and “quantum unforgeable subway token”[21,3], which can be described as an electronic hardware device running a challenge-response protocol (such a kind of “money” or “token” has to be) and which fulfills our definition 1 of a PUF. In such a “quantum-PUF” the secret information consists of quantum-mechanical two-state systems (“qubits”) that are prepared either in one of the two quantum-mechanical so-called “Fock” states |0⟩ or |1⟩ (base #0), or in one of the states (1/√2)(|0⟩ + |1⟩) or (1/√2)(|0⟩ − |1⟩) (base #1). |0⟩ and (1/√2)(|0⟩ + |1⟩) encode a “0” secret bit, and |1⟩ and (1/√2)(|0⟩ − |1⟩) encode a “1” secret bit. The challenge bits indicate the correctly chosen measurement bases. The raw secret Sr is encoded by the choice of the state within a chosen basis according to the rule stated above. In order to decode or copy Sr, it is necessary to know in which of the two bases #0 or #1 the qubits for one challenge were prepared. If a qubit is read out in a wrong base, the laws of quantum mechanics determine that the readout result is a perfect random number, and additional readout attempts will again yield this random number rather than the original, correct number. The physical function PF of the quantum-PUF is thus given as:

Quantum-EUR PF1():
– First readout: PF1(correct base bit) = correct bit of Sr; PF1(incorrect base bit) = random bit.
– Any further readout in the same base: PF1() = same bit as in first readout.

Evidently in the first readout PF1 is not constant and is deterministic for the foreseen C, i.e. a quantum-PUF fulfills definition 1.
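The erasure behaviour required of an EUR PUF can be illustrated with a toy software model. This is purely illustrative and of course not physically secure (the class, challenge values and secret values are our own inventions):

```python
import secrets

class ToyEURPUF:
    """Toy software model of erasure-upon-readout: the stored secret
    survives only readouts with the foreseen challenge."""

    def __init__(self, foreseen_challenge: int, secret: int):
        self._challenge = foreseen_challenge
        self._secret = secret
        self._erased = False

    def read(self, challenge: int) -> int:
        if self._erased or challenge != self._challenge:
            # wrong challenge: the secret is erased and replaced
            # by a random value, as the first EUR requirement demands
            self._erased = True
            self._secret = secrets.randbits(128)
        return self._secret

puf = ToyEURPUF(foreseen_challenge=0xC0FFEE, secret=0x5EC12E7)
assert puf.read(0xC0FFEE) == 0x5EC12E7   # correct challenge: correct S
puf.read(0xBAD)                          # wrong challenge erases S
assert puf.read(0xC0FFEE) != 0x5EC12E7   # the secret is gone (w.h.p.)
```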
Reading out a C – S pair more than once is possible, but after the first readout the information is no longer secure, because the qubits are no longer in a quantum-mechanical superposition of states. In the simplest case without any readout errors or inefficiencies (so that no further processing is done, i.e. PF2(Sr) = Sr) and without implementation mistakes (an assumption that will be difficult to fulfill [18]), the only potentially successful attack is to guess the challenge. On average, for half of the bits the guess will be correct and the correct corresponding bits of Sr will be output. For the other half the probability to get the correct output bit is 1/2. The total probability to get a correct output bit of Sr is therefore 0.75, and L_a = (3/4)^(λ_S), where λ_S is the length of Sr. This is the absolute (i.e. not only mathematical-cloning brute-force) security level of a quantum-PUF against this attack. E.g. with a secret Sr consisting of 128 qubits, L < 10^-15, thus fulfilling the criterion for a secure PUF with Borel’s estimate for an upper bound on L (see section 2.2). Wiesner’s quantum money, interpreted
as a “quantum PUF”, thus proves that an absolutely unclonable PUF is not in contradiction with the laws of physics. The use of quantum-PUFs for authentication is beyond the reach of current technology, because qubits are unavoidably read out on very short timescales by interactions with their environment (presently qubits cannot be isolated for longer than milliseconds[6]). As explained above, quantum-PUFs are no longer secure after readout. Quantum cryptography[18] can be described as sending a quantum-PUF in the form of a chain of photons in order to distribute its secret S for use as a cryptographic key. In the laboratory such a “light-field” PUF remains in the initially prepared coherent state for no longer than about a millisecond.
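The security level of the challenge-guessing attack quoted above can be reproduced numerically (a sketch; variable names are ours):

```python
# Challenge-guessing attack on a quantum-PUF: a base bit is guessed correctly
# with probability 1/2 (the correct bit of Sr is then returned); with a wrong
# base the readout is random, so it is still correct with probability 1/2.
p_bit = 0.5 * 1.0 + 0.5 * 0.5      # = 0.75 per-bit success probability

n_qubits = 128                     # length of the secret Sr from the text
L_a = p_bit ** n_qubits            # probability of reproducing all of Sr
print(f"L_a = {L_a:.2e}")          # ~1.0e-16, i.e. below the 1e-15 bound
```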
6 Discussion
The protection of secrets in hardware devices that need to access these secrets in their normal operation - a necessary condition for any authentication procedure - cannot be implemented with methods of mathematical cryptography alone. Some physical protection mechanism is needed. The conventional tamper-resistance mechanisms (employed in CUFs, see section 3.1) rely on protecting the memory with physical barriers. CUFs withstand known, vigorous direct attacks typically for no longer than a few months[19]. We showed that PUFs are a qualitatively novel alternative. The secret is protected by the absence of information from the device of where the challenge is stored. In CUFs and POKs this information must exist on the device, because otherwise the response cannot be evaluated, even if it is protected by direct, physical barriers. Thereby PUFs protect the secret by a novel, genuine primitive of physical cryptography. The possibility of realizing PUFs based on the principles of quantum mechanics demonstrates that in principle the laws of physics allow the construction of absolutely secure PUFs. This situation motivates more security-related physics research on unclonable quantum-PUFs and MRT-PUFs, and on inventing entirely new PUF construction principles. The real PUF promise is PUFs that withstand any known, practical attack, period, i.e. that provide a level of authenticity protection similar to the one provided by mathematical cryptography for confidentiality. In the future PUFs will probably authenticate hardware devices. If Alice knows the C – Sr pairs of a PUF she gave to Bob (e.g. from the designer of the PUF), she can publicly broadcast a challenge and be sure that the correct response S can only be created on Bob’s original PUF. Therefore PUFs effectively allow the remote distribution of authenticated secret entropy (the S for Bob) via sending the challenges (the C chosen and sent by Alice) over standard channels.
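The authentication flow just described can be sketched in a few lines. Here the physical PUF is emulated by a keyed hash purely for illustration; the hash-based stand-in and all names are our assumptions, not part of the original scheme:

```python
import hashlib
import secrets

# Toy sketch: Alice holds a table of C-S pairs measured from Bob's PUF
# before deployment; a real physical PUF replaces the keyed hash below.

def toy_puf_response(device_secret: bytes, challenge: bytes) -> bytes:
    # stand-in for PF2(PF1(C)) evaluated on Bob's physical device
    return hashlib.sha256(device_secret + challenge).digest()

device_secret = secrets.token_bytes(32)        # physically bound to the PUF
alice_table = {}
for _ in range(4):                             # pre-deployment enrolment
    c = secrets.token_bytes(16)
    alice_table[c] = toy_puf_response(device_secret, c)

# later, over a public channel: Alice broadcasts one unused challenge
challenge, expected = alice_table.popitem()
response = toy_puf_response(device_secret, challenge)  # computed on Bob's PUF
print("authenticated:", response == expected)          # authenticated: True
```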
This entropy could “update” the secrets in conventional unclonable functions. In this way existing architectures based on CUFs could be augmented by PUFs without the need for a completely new PUF security architecture.

Acknowledgements. We thank R. Breithaupt, U. Gebhardt, M. Ullmann, C. Wieschebrink and anonymous referees at the TrustED 2011 and PILATES 2012 workshops for helpful discussion and criticism of earlier versions of this manuscript.
References

1. Armknecht, F., et al.: A Formal Foundation for the Security Features of Physical Functions. In: IEEE Symposium on Security and Privacy (SSP), pp. 397–412. IEEE Computer Society (May 2011)
2. Armknecht, F., Maes, R., Sadeghi, A.-R., Sunar, B., Tuyls, P.: Memory Leakage-Resilient Encryption Based on Physically Unclonable Functions. In: Matsui, M. (ed.) ASIACRYPT 2009. LNCS, vol. 5912, pp. 685–702. Springer, Heidelberg (2009)
3. Bennett, C.H., Brassard, G., Breidbart, S., Wiesner, S.: Quantum Cryptography, or Unforgeable Subway Tokens. In: Advances in Cryptography: Proceedings of CRYPTO 1982, pp. 267–275. Plenum Press (1983)
4. Borel, E.: Probabilities and Life. Dover (1962)
5. Busch, H., Sotáková, M., Katzenbeisser, S., Sion, R.: The PUF Promise. In: Acquisti, A., Smith, S.W., Sadeghi, A.-R. (eds.) TRUST 2010. LNCS, vol. 6101, pp. 290–297. Springer, Heidelberg (2010)
6. Fischer, J., Loss, D.: Dealing with decoherence. Science 324, 1277 (2009)
7. Gassend, B., Clarke, D., van Dijk, M., Devadas, S.: Controlled physical random functions. In: Proceedings of the 18th Annual Computer Security Applications Conference, ACSAC 2002 (2002)
8. Gassend, B., Clarke, D., van Dijk, M., Devadas, S.: Delay-Based Circuit Authentication and Applications. In: Proc. of the 18th Annual ACM Symposium on Applied Computing (March 2003)
9. Gassend, B., van Dijk, M., Clarke, D.E., Torlak, E., Tuyls, P.: Controlled physical random functions and applications. ACM Trans. Inf. Syst. Secur. 10(4), article 15 (2008)
10. Guajardo, J., Kumar, S.S., Schrijen, G.-J., Tuyls, P.: FPGA Intrinsic PUFs and Their Use for IP Protection. In: Paillier, P., Verbauwhede, I. (eds.) CHES 2007. LNCS, vol. 4727, pp. 63–80. Springer, Heidelberg (2007)
11. Lim, D., et al.: Extracting Secret Keys From Integrated Circuits. IEEE Trans. on Very Large Scale Integration (VLSI) Systems 13(10), 1220 (2005)
12. Landauer, R.: Information is physical. Physics Today 23 (May 1991)
13. Maes, R., Verbauwhede, I.: A Discussion on the Properties of Physically Unclonable Functions. In: TRUST 2010 Workshop, Berlin (2010)
14. Pappu, R.: Physical One-Way Functions. PhD thesis, MIT (2001); Pappu, R., Recht, B., Taylor, J., Gershenfeld, N.: Science 297, 2026 (2002)
15. Rührmair, U., Söltner, J., Sehnke, F.: On the Foundations of Physical Unclonable Functions. Cryptology ePrint Archive, Report 2009/277
16. Rührmair, U., Sehnke, F., Sölter, J., Dror, G., Devadas, S., Schmidhuber, J.: Modeling attacks on physical unclonable functions. In: ACM Conference on Computer and Communications Security (CCS), pp. 237–249 (2010)
17. Rührmair, U., Jaeger, C., Algasinger, M.: An Attack on PUF-based Session Key Exchange, and a Hardware-based Countermeasure: Erasable PUFs. In: 15th International Conference on Financial Cryptography and Data Security, St. Lucia, February 28–March 4 (2011)
18. Scarani, V., Kurtsiefer, C.: The black paper of quantum cryptography: real implementation problems. arXiv:0906.4547v1 (2009)
19. Tarnovsky, C.: Deconstructing a “secure” processor. In: Black Hat Conference, Washington (2010), https://www.blackhat.com/presentations/bh-dc-10/ Tarnovsky Chris/BlackHat-DC-2010-Tarnovsky-DASP-slides.pdf
20. van Tilborg, H.C.A. (ed.): Encyclopedia of Cryptography and Security. Springer, New York (2005)
21. Wiesner, S.: Conjugate coding. Sigact News 15, 78 (1983)
Modeling and Analysis of a P2P-VoD System Based on Stochastic Network Calculus

Kai Wang1,2, Yuming Jiang3, and Chuang Lin1

1 Department of Computer Science and Technology, Tsinghua University, China
2 Department of Computing and Mathematical Sciences, California Institute of Technology, USA
3 Department of Telematics & Q2S Center, Norwegian University of Science and Technology, Norway
[email protected] [email protected] [email protected]
Abstract. Peer-to-peer video-on-demand (P2P-VoD) is one of the most commercially important and technically challenging applications. In such systems, each video file is typically divided into sub-pieces that are stored possibly at different sending peers, and the receiving peer often needs to coordinate multiple senders by specifying a transmission schedule for them. In this paper, we investigate this transmission scheduling problem for a specific P2P-VoD system, PPLive. We formalize the problem of scheduling sub-piece transmission based on stochastic network calculus, and analyze its delay performance. Based on the delay analysis, an optimization problem is formulated to design the schedule for sending peers to transmit their sub-pieces. The objective of this optimization is to maximize a defined reward and at the same time meet the delay constraint required to ensure video quality at the receiver.

Keywords: P2P, Video-on-Demand, Stochastic Network Calculus.
1 Introduction
Peer-to-peer based video-on-demand (P2P-VoD) streaming services are gaining popularity, and several P2P-VoD streaming systems have been successfully deployed to date, including PPLive [1], QQVideo [2] and PPStream [3]. In 2010, PPLive launched a new generation of client software named PPTV [1] that integrates P2P live streaming and VoD, and it is reported that PPLive has a total of 200 million users worldwide [4]. In November 2011, there were more than 500,000 VoD channels concurrently on-line [1], ranging from movies and TV dramas to cartoons. It is a great challenge to design a highly scalable P2P-VoD system at such a large scale.

Different from P2P live streaming systems, peers in the PPLive P2P-VoD system can watch different videos simultaneously, and users’ interactive behaviors dilute their ability to share content [5]. To compensate, instead of only

J.B. Schmitt (Ed.): MMB & DFT 2012, LNCS 7201, pp. 302–316, 2012.
© Springer-Verlag Berlin Heidelberg 2012
having the playback buffer in memory, each peer contributes a fixed amount of storage (2GB in PPLive), which can be used for buffering as well. The system is essentially a highly dynamic P2P replication system, with a sophisticated, distributed scheduling mechanism guiding peers to help each other in real time [6]. In the current design of the PPLive P2P-VoD system, in order to efficiently support the huge number of concurrent channels and users, the granularity of video replication is a cluster. As an aggregation of 10 chunks, a cluster serves as the basic unit for video download and replacement, while a sub-piece is the basic unit for transmission. When a receiving peer asks multiple neighbor peers for sub-pieces simultaneously, it has to coordinate the sub-piece transmission from its senders. More precisely, a receiver runs a scheduling algorithm to compose a transmission schedule, which specifies for each sender the assigned sub-pieces and their transmission times, so as to maximize the perceived video quality delivered by the on-time sub-pieces.

The purpose and contribution of this paper is two-fold. One is to define models and derive results establishing a framework for delay analysis of the P2P-VoD system. This framework largely makes use of concepts and results from stochastic network calculus, hence presenting an interesting application of stochastic network calculus in real systems. Another is to formulate the scheduling problem into an optimization problem, where delay results from the analytical framework contribute as a crucial constraint. By solving this optimization problem, optimal schedules can be further obtained.

The paper is organized as follows. We first introduce the architecture of the considered PPLive P2P-VoD system in Section 2, where the sub-piece transmission scheduling problem is highlighted. We then in Section 3 define models and present results for delay analysis of the system, which stem from stochastic network calculus.
Also in this section, a playback delay bound is derived, which acts as a crucial constraint for designing the optimal sub-piece transmission schedule in Section 4. In Section 5, a case study is conducted, where the parameters in the analysis are based on the setting of a real system. In Section 6, we review related work in the literature. Finally, we summarize the contribution of this paper.
2 System Model
In this section, we first provide an architectural overview of the P2P-VoD system investigated in this paper, and then state the transmission scheduling problem with a mathematical formulation.

2.1 The PPLive P2P-VoD System Architecture
Similar to many P2P streaming systems, besides the client peers, the PPLive P2P-VoD system has the following key components:

– A set of video servers as the source of content.
– A bootstrap server performing bootstrapping functions, e.g., helping peers to find a suitable tracker.
– A set of trackers that help peers connect to others to share the same content.
K. Wang, Y. Jiang, and C. Lin
The system architecture of PPLive is depicted in Fig.1. Each peer runs client software obtained from the P2P-VoD operator. With well-designed protocols, the client software communicates with the servers and other peers to share content.
Fig. 1. The system architecture of PPLive P2P-VoD
2.2 Building Blocks of the System
In the PPLive P2P-VoD system, a video (typically 300MB in size for a movie) is composed of a number of chunks (each 2MB in size), which are the basic unit for storage and advertisement. A chunk is composed of a number of pieces (each 16KB in size) that are dictated by the media player, while a piece is further divided into sub-pieces (each 1KB in size) as the unit for transmission. Each peer is asked to contribute a fixed amount of hard-disc storage (2GB). Only when all pieces (respectively sub-pieces) in a chunk (respectively piece) are available locally is the chunk (respectively piece) advertised to other peers. In the current design of PPLive, for efficient content replication, every 10 chunks are aggregated into a cluster, which serves as the basic unit for file download and replacement. The segmentation of a movie file in PPLive is shown in Fig.2.

Fig. 2. The file segmentation of the PPLive P2P-VoD system. A cluster is the basic unit for video download and replacement, and a sub-piece is the basic unit for transmission.

A peer downloads chunks from partner peers using a pull method, and multiple videos are cached on the hard disc [6]. The goal of the replication strategy is to make the chunks as available to the user population as possible, and to meet users’ viewing demand without incurring excessive additional transmission overheads. When there is no more room on the hard disc, some algorithm is used to choose a video to remove [5]. For efficient replication, the granularity of replacement is a cluster, i.e., as soon as one chunk of a cluster is removed, all chunks in the same cluster are candidates for replacement. When choosing a video, a peer obtains from the tracker a list of peers that currently have the corresponding cluster. The peer then establishes partner relationships with the peers on the list. The peer asks its partners for their Chunk Bitmaps, which tell which chunks a peer has. The information about which pieces a peer has is kept in a Piece Bitmap, which is locally stored in the peer’s cache. In PPLive, a peer asks multiple neighbors for sub-pieces simultaneously and
periodically tries to find new neighbors for backup use. Only when the neighbors cannot provide a sufficient downloading rate does a peer turn to the server. Once a peer has obtained a chunk, it informs its tracker that it is replicating that cluster; additionally, it also tells its tracker when it no longer holds a cluster in its cache. For playback continuity, an intuitive idea is to download as fast as possible for future use. However, in case the user jumps to a new position, the pre-downloaded video would be a waste of system resources. PPLive uses an impulse-like download mechanism with which the downloading rate (typically > 100KB/s) is higher than the playback rate (typically 50KB/s) [1]. On downloading a cluster, the peer stops and waits until the playback track-bar approaches the end of the current cluster. A schedule is created for some fixed interval of sequential sub-pieces. This interval is referred to as a frame (which typically contains a number of pieces), and should be transmitted in the scheduled time slot. A cluster is composed of a number of frames, and the sliding window that covers one frame moves faster than the playback window. In the following, the time axis is discretized into time slots, and it takes Td time slots to download an entire cluster, while it takes Tp (> Td) time slots for the downloaded cluster to be played back.

2.3 Problem Statement and Formulation
The problem addressed in this paper is to develop a transmission schedule, decided by the receiving peer, for the sending peers to transmit sub-pieces. The aim of the design is to maximize a defined reward, for which the delay constraint required to ensure video quality at the receiver is crucial. Transmission schedules are computed for recurring time windows: before the current time slot ends, the scheduling algorithm is invoked again to compute the transmission schedule for the next one. The algorithm is also invoked whenever the set of senders or their characteristics change. The scheduling algorithm must therefore be computationally efficient and run in real time, since it is invoked frequently within short periods.
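As a rough consistency check of the impulse-like download mechanism from Section 2.2, using the representative rates quoted there (variable names are ours):

```python
# Cluster timing under the representative PPLive values: a cluster is
# 10 chunks of 2 MB, downloaded at ~100 KB/s and played back at 50 KB/s.

CLUSTER_MB = 10 * 2                  # cluster size in MB
download_rate_kb = 100               # typical downloading rate, KB/s
playback_rate_kb = 50                # typical playback rate, KB/s

T_d = CLUSTER_MB * 1024 / download_rate_kb   # seconds to download a cluster
T_p = CLUSTER_MB * 1024 / playback_rate_kb   # seconds to play it back
print(T_d, T_p, T_d < T_p)                   # 204.8 409.6 True
```

The download window is indeed shorter than the playback window (Td < Tp), which is what lets the peer pause downloading until the track-bar catches up.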
We focus on a single component of the PPLive-like P2P-VoD system: the transmission scheduler. There is no assumption on the other components of the system, which are supposed to function well according to the mechanisms dictated by the specific system. A set of potential partner peers will eventually be presented to a receiver for downloading the sub-pieces. Given the data availability of each partner sender, the goal is to make the best out of this set of senders for the receiver. We present a rigorous design of the scheduler with mathematical formulation and analytical guarantees on its performance. We then formulate the sub-piece scheduling problem as a time-indexed integer linear programming problem. Specifically, we consider a single swarm in the system, and study the problem of transmitting a cluster of sub-pieces from multiple senders to a receiving peer. In PPLive, each peer is allowed to have at most 20 concurrent upload connections and 25 concurrent download connections; the latter number is larger due to the existence of silent peers [6]. To compose a feasible schedule, the receiver monitors its senders for sub-piece availability and upload bandwidth. Each sender employs some bandwidth estimation method to estimate its upload bandwidth and evenly divides it among all connected receivers, and a receiver keeps track of the upload bandwidth of its senders by querying each sender. We denote by u_j(t) the upload bandwidth of sender j. With the piece availability and sender bandwidth, the receiver composes a sub-piece schedule for a frame of n sub-pieces. Suppose that a scheduling decision is made for every frame at the beginning of each time slot. The goal is essentially to calculate an assignment of n sub-pieces to m senders. We denote the set of all sub-pieces by N (N = {1, 2, ..., n}), the set of all sending peers by M (M = {1, 2, ..., m}), and the video server by v.
We let x_{i,k}(t) be a 0–1 variable for each i ∈ N, k ∈ M ∪ {v} and t = 1, 2, ..., Td, where x_{i,k}(t) = 1 if sub-piece i is assigned to sender k at time t, and x_{i,k}(t) = 0 otherwise. Note that each sender can transmit multiple sub-pieces, and for efficient management, we assume the assigned set of sub-pieces is contiguous. Similarly, the boolean variable x_{s,k}(t) indicates whether or not a set of contiguous sub-pieces s is assigned to sender k at time slot t. A sender k gets s ∈ S at time slot t if and only if x_{s,k}(t) = 1. Specifically, each sender can be assigned one set of contiguous sub-pieces, and each sub-piece is assigned to one and only one sender. We let a(s, k) be the availability of sub-pieces s on sender k; it is set to 1 if sender k has the set of sub-pieces s, and to 0 otherwise. Fig.3 depicts an example of feasible sub-piece scheduling for the senders. We use S to denote the collection of all sets of contiguous sub-pieces, i.e., S ⊆ P(N), where P(N) denotes the collection of all subsets of N, and ∀s ∈ S, s = {i, i + 1, ..., i + l} (1 ≤ i ≤ i + l ≤ n). ∀s ∈ S, we denote by E(s) the earliest sub-piece in s, and by L(s) the latest sub-piece in s. For s ∈ S and s' ∈ S, we say that s intersects s' if s ∩ s' ≠ ∅. The total number of sub-pieces assigned to sender k at time slot t is denoted by n_k(t) = Σ_{i∈s} x_{i,k}(t), and we have Σ_{k∈M∪{v}} n_k = n. All symbols used in the paper are summarized in Table 1.
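To make the contiguity and exactly-once constraints concrete, here is a toy exhaustive search for feasible schedules on a tiny instance (a sketch with our own names; a real scheduler would optimize the reward r(s, k) rather than enumerate):

```python
from itertools import product

def feasible_schedules(n, senders_avail):
    """senders_avail[k] = set of sub-pieces (1-based) that sender k holds."""
    m = len(senders_avail)
    # each cut pattern splits sub-pieces 1..n into contiguous runs
    for cuts in product([0, 1], repeat=n - 1):
        runs, start = [], 1
        for i, c in enumerate(cuts, start=2):
            if c:                        # cut before sub-piece i
                runs.append(range(start, i))
                start = i
        runs.append(range(start, n + 1))
        if len(runs) > m:
            continue
        # give each run to a distinct sender that holds all its sub-pieces
        for senders in product(range(m), repeat=len(runs)):
            if len(set(senders)) != len(senders):
                continue
            if all(set(r) <= senders_avail[k] for r, k in zip(runs, senders)):
                yield [(list(r), k) for r, k in zip(runs, senders)]

avail = [{1, 2, 3}, {3, 4}]              # sender 0 holds 1-3, sender 1 holds 3-4
schedule = next(feasible_schedules(4, avail))
print(schedule)
```

Every schedule yielded covers each of the n sub-pieces exactly once with contiguous runs, mirroring the constraints on x_{s,k}(t) above.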
Fig. 3. A feasible sub-piece scheduling for the receiving peer. Each blue-colored tuple (sub-piece, sender) denotes the sub-piece assignment of the feasible schedule, and the sub-pieces assigned to each sender are contiguous.

Table 1. List of symbols used in this paper

Symbol     Description
N          the set of all sub-pieces in a frame (N = {1, 2, ..., n})
M          the set of all sending peers (M = {1, 2, ..., m})
t          time slot t
i          sub-piece i, i ∈ N
j          sending peer j, j ∈ M
v          the video server
k          sender k, k ∈ M ∪ {v}
uk(t)      the upload bandwidth of sender k at time slot t
nk(t)      the total number of sub-pieces assigned to sender k at time slot t
xi,k(t)    0–1 variable indicating that sender k transmits sub-piece i at time t
s          a set of contiguous sub-pieces, s ∈ S
xs,k(t)    0–1 variable indicating that sender k transmits the set s at time t
S          the collection of all sets of contiguous sub-pieces
a(s, k)    the availability of the contiguous sub-pieces s on sender k
Td         the number of time slots for downloading an entire cluster
Tp         the number of time slots for playing back a cluster (Tp > Td)
r(s, k)    the reward for assigning s to sender k in the schedule
3 Stochastic P2P Network Calculus
308 K. Wang, Y. Jiang, and C. Lin

We have now formulated the framework of sub-piece scheduling; next, some objective must be optimized at each time step. A crucial constraint in this optimization is the playback delay: to provide high video quality and user satisfaction at the receiver, a low playback delay is preferred. In this section, we present an original application of stochastic network calculus to model and analyze the P2P-VoD system, with a focus on delay. The lemmas and theorems given below can be derived similarly to [7]; their proofs are omitted due to the page limit.

3.1 A View from the Sending Peer
Consider a sending peer j in the P2P-VoD system in the time slot interval (0, t], (t = 1, 2, ..., Td). We denote by Ri,j (t) the cumulative download request (in sub-pieces) from peer i; by Ui,j (t) the cumulative amount of service (upload capacity) provided by the sender to peer i; and by R∗i,j (t) the cumulative departures of sub-pieces from the sender to peer i. We then focus on a single receiving peer i, whose index is omitted in the following. We call Rj (t), Uj (t), and Rj∗ (t) the demand process, the supply process and the upload process of the sending peer respectively, with Rj (0) = Uj (0) = Rj∗ (0) = 0 (in sub-pieces). Stochastic network calculus is based on properly defined traffic models that upper-bound the cumulative arrivals and service models that lower-bound the cumulative service [7][8]. Analogously, for each sending peer in the system, we introduce in the following a stochastic demand curve as the traffic model and a stochastic supply curve as the service model. Throughout this paper, we assume all stochastic demand curve and supply curve functions are non-negative and non-decreasing. All bounding functions, which express the permissible probability that the processes violate the desired performance, are assumed to be non-negative and non-increasing.

Definition 1: A sending peer j's download request Rj (t) is said to have a stochastic demand curve αj with bounding function fj, denoted by Rj ∼ fj, αj, if for all t ≥ 0 and all x ≥ 0, there holds

P{sup0≤s≤t [Rj (s, t) − αj (t − s)] > x} ≤ fj (x). (1)
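Definition 1 can be checked empirically on a cumulative request trace by evaluating the supremum on the left-hand side directly. The sketch below is our own illustration with an invented token-bucket-shaped trace; it is not taken from the paper.

```python
# Sketch: empirical check of Definition 1 on a cumulative request trace.
# worst_excess computes sup_{0<=s<=t} [R(s, t) - alpha(t - s)] for one t.

def worst_excess(R, alpha, t):
    """R[k] is the cumulative request up to slot k, so R(s, t) = R[t] - R[s]."""
    return max(R[t] - R[s] - alpha(t - s) for s in range(t + 1))

# An invented trace shaped by a token bucket: R(s, t) <= 2 * (t - s) + 3.
R = [0, 3, 5, 7, 9, 11, 13]
alpha = lambda tau: 2 * tau + 3

# The excess never rises above 0, so the (deterministic) demand curve holds
# with bounding function f(x) = 0 for every x >= 0.
print(max(worst_excess(R, alpha, t) for t in range(len(R))))   # -> -2
```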
Definition 2: A sending peer j's supply process Uj (t) is said to provide a stochastic supply curve βj with bounding function gj, denoted by Uj ∼ gj, βj, if for all t ≥ 0 and all x ≥ 0, there holds

P{Rj ⊗ βj (t) − Rj∗ (t) > x} ≤ gj (x). (2)
With the download request and the upload capacity of a sending peer, we can characterize its upload process as follows:

Theorem 1 (Upload Property). Consider a sending peer j with download request Rj (t). If the download request has a stochastic demand curve αj with bounding function fj, i.e., Rj ∼ fj, αj, and the peer provides to the demand a stochastic supply curve βj with bounding function gj, i.e., Uj ∼ gj, βj, then the upload has a stochastic upload curve αj ⊘ βj with bounding function fj ⊗ gj, i.e., Rj∗ ∼ fj ⊗ gj, αj ⊘ βj, where ⊘ denotes min-plus deconvolution.
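The min-plus convolution ⊗ used in Definition 2 (and in the service composition of the case study later) can be evaluated numerically on sampled curves. The following sketch, with illustrative rate–latency curves of our own choosing, checks the standard fact that the convolution of two rate–latency curves is again rate–latency, with the minimum rate and the summed latencies.

```python
# Sketch: min-plus convolution on integer time slots (illustrative values).
# (A ⊗ B)(t) = min over 0 <= s <= t of A(s) + B(t - s).

def conv(A, B, horizon):
    return [min(A(s) + B(t - s) for s in range(t + 1)) for t in range(horizon)]

def rate_latency(R, T):
    """beta(t) = R * (t - T)+, the rate-latency shape used in this paper."""
    return lambda t: R * max(t - T, 0)

b1 = rate_latency(32, 2)
b2 = rate_latency(80, 2)
expected = rate_latency(32, 4)   # min(32, 80) rate, 2 + 2 latency

assert conv(b1, b2, 20) == [expected(t) for t in range(20)]
print(conv(b1, b2, 20)[:8])      # -> [0, 0, 0, 0, 0, 32, 64, 96]
```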
To ensure the stability of the system, we assume βj (t) ≥ αj (t) when t is sufficiently large. When fj (x) = 0 and gj (x) = 0, we have Rj (s, t) ≤ αj (t − s) and Rj∗ (t) ≥ Rj ⊗ βj (t). For a sending peer to transmit its sub-pieces in time, the assigned sub-pieces should not exceed the upload capacity (determined by the upload bandwidth) in the time slot, i.e., nj (t) ≤ uj (t), (t = 1, 2, ..., Td), from which we further get Rj (t) ≤ Uj (t). With the constraint βj (t) ≥ αj (t), it follows that the assigned sub-pieces will be transmitted to the corresponding receiving peer in time or within some delay bound. The upload transmission delay of sender j at time slot t, Duj (t), is defined as Duj (t) = inf{d ≥ 0 : Rj (t) ≤ Rj∗ (t + d)}. To ensure a low startup delay, we have to guarantee Duj (t) ≤ DUj for t = 1, 2, ..., Td, where DUj denotes the acceptable delay bound on upload transmission.

Lemma 1. If the peer has a stochastic demand curve αj with bounding function fj, i.e., Rj ∼ fj, αj, and the peer provides to the download request a stochastic supply curve βj with bounding function gj, i.e., Uj ∼ gj, βj, then for all t ≥ 0 and all x ≥ 0, the upload delay Duj (t) is bounded by

P{Duj (t) > h(αj + x, βj)} ≤ fj ⊗ gj (x), (3)

where h(αj, βj) = sups≥0 {inf{τ ≥ 0 : αj (s) ≤ βj (s + τ)}} denotes the maximum horizontal distance between αj (t) and βj (t). When fj (x) = 0 and gj (x) = 0, we have Duj ≤ h(αj, βj).

3.2 A View from the Receiving Peer
Similarly, consider a receiving peer in the P2P-VoD system in the time interval (0, t]. We denote by Dj (t) the cumulative download from peer j; by D(t) the aggregate download flow from all the partner peers; by B(t) the cumulative amount of service (download capacity) provided by the network; and by D∗ (t) the cumulative departures of sub-pieces from the peer (for playback). D(t), B(t), and D∗ (t) are called the download process, the buffering process and the playback process of the receiving peer respectively, with D(0) = B(0) = D∗ (0) = 0. For each receiving peer in the system, we introduce a stochastic download curve as the traffic model and a stochastic buffering curve as the service model.

Definition 3: A receiving peer's download process Dj (t) is said to have a stochastic download curve λj with bounding function ηj, denoted by Dj ∼ ηj, λj, if for all t ≥ 0 and all x ≥ 0, there holds

P{sup0≤s≤t [Dj (s, t) − λj (t − s)] > x} ≤ ηj (x). (4)
Comparing Theorem 1 and Definition 3, we notice that the download traffic has the same characterization as the upload traffic, and we have the following corollary:

Corollary 1. The upload process of a sending peer j to a receiving peer, i.e., Rj∗, can be considered as the download process of the corresponding receiving peer, i.e., Dj, and we have Dj (t) = Rj∗ (t), λj = αj ⊘ βj, and ηj = fj ⊗ gj.
Definition 4: A receiving peer's buffering process from sending peer j, Bj (t), is said to provide a stochastic buffering curve (provided by the network) μj with bounding function ϕj, denoted by Bj ∼ ϕj, μj, if for all t ≥ 0 and all x ≥ 0, there holds

P{Dj ⊗ μj (t) − Dj∗ (t) > x} ≤ ϕj (x). (5)

The download transmission delay of sender j at time slot t, Ddj (t), is defined as Ddj (t) = inf{d ≥ 0 : Dj (t) ≤ Dj∗ (t + d)}. To ensure a low download transmission delay, we should guarantee Ddj (t) ≤ DDj for t = 1, 2, ..., Td, where DDj denotes the acceptable delay bound on download transmission.

Lemma 2. If the peer has a stochastic download curve λj with bounding function ηj, i.e., Dj ∼ ηj, λj, and the network provides to the download a stochastic buffering curve μj with bounding function ϕj, i.e., Bj ∼ ϕj, μj, then for all t ≥ 0 and all x ≥ 0, the download delay Ddj (t) is bounded by

P{Ddj (t) > h(λj + x, μj)} ≤ ηj ⊗ ϕj (x). (6)

3.3 Video Playback Delay
The playback delay of sub-stream j from sender j at time slot t, Dpj (t), is defined as Dpj (t) = inf{d ≥ 0 : Rj (t) ≤ Dj∗ (t + d)}. Here we slightly abuse the index j: sub-stream j denotes the sequence of sub-pieces from sender j. To ensure a low playback delay, we should guarantee Dpj (t) ≤ DPj for t = 1, 2, ..., Td, where DPj denotes the acceptable delay bound on playback. Combining Lemma 1 and Lemma 2 via the concatenation property of stochastic service curves [7], we have the following corollary¹ for the playback delay bound of each sub-stream j:

Corollary 2. If the sending peer has a stochastic demand curve αj with bounding function fj, i.e., Rj ∼ fj, αj, and the peer provides to the download request a stochastic supply curve βj with bounding function gj, i.e., Uj ∼ gj, βj, and in addition the network provides to the download process a stochastic buffering curve μj with bounding function ϕj, i.e., Bj ∼ ϕj, μj, then for all t ≥ 0 and all x ≥ 0, the playback delay Dpj (t) is bounded by

P{Dpj (t) > h(αj + x, βj ⊗ μj)} ≤ fj ⊗ gj ⊗ ϕj (x). (7)
Note that the receiving peer asks multiple neighbor peers for sub-pieces simultaneously. Once the delay bound for each sub-stream j is derived, the playback delay of the full stream follows: it is the maximum delay among all the sub-streams. We can conclude the following theorem on the playback delay of the swarm in consideration:

¹ Note that we have made a simplification in presenting the concatenation property. In its precise form, the stochastic service curves and bounding functions require small changes [7]. Nevertheless, for the case study, where the service curves are mostly deterministic, (7) holds.
Theorem 2. If a receiving peer has playback delay bound Dpj for each sub-stream j, with delay violation probability ε, then the playback delay Ds(t) for the full stream, i.e., the video swarm in consideration, is bounded by

P{Ds(t) > Dpmax} ≤ ε, (8)

where Dpmax = maxj∈M {Dpj}.
4 An Optimization Problem
When the neighbor sending peers cannot provide a sufficient downloading rate, a receiving peer will turn to the video server. Uploading a number of sub-pieces to the receiving peer yields a reward to the corresponding sending peer, or to the video server operator. We assume the reward for both a sending peer and the video server is the same function r(s, k) of the number of sub-pieces that have been assigned and then transmitted. Specifically, we assume the reward function r: S × (M ∪ {v}) → R+ is proportional to the number of sub-pieces assigned, i.e., r(s, k) = c · nk (c is a constant and nk = Σi∈s xi,k), so that r(s, k) is convex in nk. We consider a general sub-piece transmission scheduling (TS) problem for the PPLive-like P2P-VoD (PV) system, given a receiving peer with n sub-pieces, m senders, and the video server. As a scheduling decision is made for every frame at the beginning of each time slot and can be repeated (if the senders' characteristics do not change during the time period Td), we omit the time variable t. In one time slot, for each set of contiguous sub-pieces s ∈ S and each sender k (k ∈ M ∪ {v}), we have a reward r(s, k). We intend to find the most advantageous way to assign the sets s ∈ S to peers and the video server, so that the total reward to the video server operator is minimized. The resulting schedule is sent to the senders, which transmit sub-pieces accordingly. The PVTS problem is formalized as the following binary integer programming problem:

min Σ(s,k)∈S×{v} r(s, k) · xs,k (9)

s.t.
  Σs:i∈s, k∈M∪{v} xs,k ≤ 1, ∀i ∈ N
  Σs∈S xs,k ≤ 1, ∀k ∈ M ∪ {v}
  xs,k ≤ as,k, ∀s ∈ S, k ∈ M ∪ {v}
  nk ≤ uk, ∀k ∈ M ∪ {v}
  Σk∈M∪{v} nk = n
  P{Ds > Dpmax} ≤ ε (10)
we only schedule sub-pieces to a sender which holds them, i.e., the required sub-pieces are available in the set of potential partner peers of the receiver. The fourth one ensures that the assigned sub-pieces do not exceed the upload capacity in the time slot. In the last constraint, Dpmax denotes the acceptable delay bound for the video swarm, and ε is the corresponding permissible delay violation probability. With the assumption above, the total reward for transmitting the n sub-pieces is fixed, so minimizing the total reward of the video server is equivalent to maximizing the total reward to the sending peers. Since peers exchange the Chunk Bitmap as well as the Piece Bitmap, we assume the sending peers that have established a neighborhood relationship with the receiving peer have all the pieces the receiving peer is requesting. The remaining problem is the assignment of the sub-pieces in each frame to the m sending peers, i.e., determining nj, the number of sub-pieces assigned to each sending peer j (j ∈ M). Since we are interested in the system performance bound, we further assume that all n sub-pieces are assigned to the peers, and the above optimization problem (9) can be transformed into the following one:
max Σj=1..m r(nj, j) (11)

s.t. nj ≤ uj, ∀j ∈ M
     P{Ds > Dpmax} ≤ ε (12)

5 Case Study
To demonstrate the use of the analytical framework presented above, we conduct a case study in the following. We rely on our understanding of the PPLive architecture as described in Section 2, coupled with results from measurement studies, to guide the choice of model parameters. Note that the model and the solution method are flexible with respect to the choice of parameters and the functions used to represent the service capacity: one may modify the parameters and functions of the model and still solve it using the proposed solution method. In the following, the chosen parameters are based on the default settings of PPLive.

5.1 Setup and Delay Analysis
For ease of presentation, we assume the system is a collection of homogeneous peers. Multiple classes of peers can be incorporated at the cost of added notational complexity, which we do not consider here.

Demand curve. In the system design of PPLive, the typical video playback rate is 400 kbps, and for stable downloading that ensures playback continuity, a receiving peer simultaneously connects to 25 neighbor peers for downloading.
PPLive uses an impulse-like downloading mechanism in which the (total) download rate (typically 100 KB/s or 800 kbps) is higher than the playback rate (typically 50 KB/s or 400 kbps). The total download rate is achieved from the 25 peers. In the case of homogeneous peers, each frame is divided into 25 sub-streams whose pieces are evenly placed (in their locations) in the frame. These 25 sub-streams are downloaded from the 25 sending peers in parallel, with each sub-stream coming from a different sending peer. In order to start playing a video, the video player embedded in the client software needs to download a number of sub-pieces ahead of time; here we assume the amount of pre-buffered data is 50 KB. Based on this setup, the corresponding demand curve to sending peer j is deterministic: αj (t) = 32 · t + 16, where 32 is in kbps and 16 is in kb.

Supply curve. In addition, we assume every sending peer j has the same upload capacity with an average of 512 kbps, which is typical of ADSL upload link capacity in China. Specifically, this 512 kbps upload capacity is the average packet transmission rate. In PPLive, each peer is allowed a number of concurrent upload connections. Here we assume the number of concurrent upload connections is 16, and the peer's total upload capacity (512 kbps) is evenly distributed among these connections through some internal processor sharing and link scheduling policy. The average capacity of each upload connection is then 32 kbps. In addition, for each dedicated peer, we assume the maximum error or latency, from when a packet is ready for transmission to when it is put on the network link, to be 2 s. The uploading process then has a deterministic supply curve βj (t) = 32 · (t − 2)+, where 32 is in kbps.

Buffering curve. Similarly, the network transmission bandwidth is also evenly distributed over all the connections, each with 10 KB/s.
For a single transmission connection, we assume that the network provides a deterministic buffering curve μj (t) = Bj · (t − Pj)+, where Pj takes the propagation delay into account. Without loss of generality, for each download connection, we assume the minimum Bj is 80 kbps and the maximum Pj is 2 s.

Delay analysis. With the parameter setup above, a delay bound can be easily calculated from Theorem 2 for the homogeneous case. Specifically, under this setup we have: for the sending peer j, its demand process has Rj ∼ 0, αj; for each sending peer, its supply process has Uj ∼ 0, βj; for the receiving peer, its buffering process has Bj ∼ 0, μj, where αj (t) = 32·t + 16, βj (t) = 32·(t − 2)+ and μj (t) = 80·(t − 2)+. Then for all t ≥ 0 and all x ≥ 0, the delay of sub-stream j, Dpj (t), is bounded by

Dpj (t) ≤ h(αj, βj ⊗ μj) ≤ 4.5 s. (13)

Hence, for the homogeneous case, the maximum delay needed for continuous playback is 4.5 s.
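The 4.5 s bound in (13) can be reproduced numerically: sample βj ⊗ μj on a grid and evaluate the maximum horizontal deviation h. The helper names and the grid step below are our own choices; the curve parameters follow the case study.

```python
# Numerical sketch of the delay bound (13): h(alpha, beta ⊗ mu) on a grid.
# Units follow the case study: time in seconds, data in kb, rates in kbps.
# A grid step of 0.25 s keeps all arithmetic exact in binary floating point.

STEP = 0.25

def conv(A, B, t, step=STEP):
    """Min-plus convolution (A ⊗ B)(t) = inf_{0<=s<=t} A(s) + B(t - s)."""
    k = int(round(t / step))
    return min(A(i * step) + B(t - i * step) for i in range(k + 1))

alpha = lambda t: 32 * t + 16          # demand curve: 32 kbps rate, 16 kb burst
beta = lambda t: 32 * max(t - 2, 0)    # supply curve: rate-latency, 2 s latency
mu = lambda t: 80 * max(t - 2, 0)      # buffering curve: rate-latency, 2 s latency
net = lambda t: conv(beta, mu, t)      # end-to-end service curve beta ⊗ mu

def h(A, B, horizon=10.0, step=STEP):
    """Maximum horizontal deviation sup_s inf{tau >= 0 : A(s) <= B(s + tau)}."""
    worst, s = 0.0, 0.0
    while s <= horizon:
        tau = 0.0
        while A(s) > B(s + tau):
            tau += step
        worst = max(worst, tau)
        s += step
    return worst

print(h(alpha, net))   # -> 4.5, matching the bound in (13)
```

Since βj ⊗ μj collapses to the rate–latency curve 32·(t − 4)+, the horizontal gap to αj is 4 + 16/32 = 4.5 s at every s, which the grid search recovers exactly.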
Note that in the above discussion and analysis, we have assumed an even schedule of sub-pieces among the 25 peers, which is indeed optimal for this homogeneous case, as discussed in the next subsection. Otherwise, some peers would have to contribute the transmission of more sub-pieces. Consequently, for such a peer, its demand curve αj (t) = ρ · t + σ would have different rate ρ and burstiness σ parameters, with σ > 16. This would lead to a delay bound higher than 4.5 s.

5.2 Sub-piece Transmission Schedule
Recall that the system is a collection of homogeneous peers, each with uniform upload capacity u. Under these assumptions, we can simplify the form of (11) by noting that, if m is fixed, the remaining optimization over the nj is convex. Thus, we can use the KKT conditions to determine the optimal scheduling rule n∗. This yields n∗1 (t) = n∗2 (t) = ... = n∗m (t) = n/m, which implies that once m is fixed, the optimal dispatching rule is to load-balance across the peers. Given that load balancing is always optimal, we can decouple dispatching (nj, t) from service capacity planning and simplify (11) into a pure service capacity planning optimization:

max Σj=1..m r(n/m, j) (14)

s.t. n/m ≤ u
     P{Ds > Dpmax} ≤ ε (15)
Note that if the allowed maximum playback delay Dpmax is larger than 4.5 s, there exist other schedules that also meet the delay requirement; nevertheless, even load balancing is always optimal. If the allowed maximum playback delay Dpmax is exactly 4.5 s, even load balancing is the only feasible schedule. If, however, the allowed maximum playback delay Dpmax is less than 4.5 s, no schedule is able to meet the requirement.

5.3 Discussion
It is worth highlighting that in modeling and analyzing the system, all relevant processes are treated as generally stochastic, although in the above case study they appear deterministic. One reason for this is that we would like the reader to focus on how easily the presented framework can be used, rather than on how complex the analysis could become if stochastic versions were adopted. In addition, in a real system, to avoid computational complexity without losing insight, some measured worst-case parameters may be adopted, which also leads to deterministic models of the various processes, provided the number of peers remains unchanged, as in this case study. Note that an important characteristic of P2P systems is user churn, which leads to random changes in the available sending peers and further randomizes these otherwise deterministic processes.
Building the P2P-VoD delay analysis framework on stochastic network calculus allows user churn to be brought into its models and analysis, which will be a focus of our future work.
6 Related Work
Several commercial P2P streaming systems have been deployed to date [9][2][3]. To maximize the perceived video quality, it is computationally expensive to construct an optimal segment schedule; thus, most existing P2P live streaming systems resort to simple heuristic algorithms [10]. [11] describes a stochastic model to compare different chunk selection strategies and proposes a mixed one. To explore the optimal policy, [12] presents an analytical framework to analyze a large class of chunk selection policies. To address the hardness of the scheduling problem, the authors of [9] define a utility for each segment as a function of its rarity and urgency and transform the problem into a min-cost flow problem, while the authors of [13] formulate an optimization problem and solve it using an iterative descent algorithm. As an analytical tool for performance evaluation, most work on stochastic network calculus focuses mainly on theoretical development, while its application lags behind. This is due to the inherent difficulty of substantially extending the theoretical framework for each realistic scenario. Recent work [14] adapts stochastic network calculus to evaluate the reliability of a power grid integrating renewable energy. Rather than the data itself, the calculus for information-driven networks in [15] targets the data's information content, i.e., Shannon entropy, which bridges communication networks with information theory. [16] develops an analytical approach that extends stochastic network calculus theory to evaluate network coding. To the best of our knowledge, we are the first to extend the framework of stochastic network calculus to analyze P2P systems.
7 Conclusion
In this paper, we have considered the sub-piece transmission scheduling problem for the PPLive P2P-VoD system (PVTS). Specifically, we formulated this problem as an optimization problem in which a crucial constraint is to meet the delay requirement needed to ensure the perceived video quality at the receiver. With a close exploration of the system architecture, we demonstrated an application of stochastic network calculus to the delay analysis of this P2P-VoD system. To give insight into the analysis, views from both the sending peer and the receiving peer were presented, based on which models and delay analyses were defined and introduced respectively. While these models and delay analyses stem from stochastic network calculus, the two views helped to further analyze the video playback delay needed at the receiver. This delay analysis lays a foundation for the optimal schedule design.
Acknowledgments. The first author is in the Tsinghua-Caltech joint PhD program supported by the State Scholarship Fund of China. This research is supported by the National Grand Fundamental Research 973 Program of China (No. 2010CB328105 and No. 2009CB320504) and the National Natural Science Foundation of China (No. 60973107). The authors also wish to thank engineers from Shanghai Synacast Media Tech (PPLive) for fruitful discussions.
References

1. PPTV, http://www.pptv.com/
2. QQVideo, http://v.qq.com/download.html
3. PPS, http://www.ppstream.com/
4. Synacast, http://www.synacast.com/en/article/14
5. Huang, Y., Fu, T.Z., Chiu, D.M., Lui, J.C., Huang, C.: Challenges, design and analysis of a large-scale p2p-vod system. In: Proceedings of the ACM SIGCOMM 2008 Conference on Data Communication (2008)
6. Wang, K., Lin, C.: Insight into the p2p-vod system: performance modeling and analysis. In: Proceedings of the 18th International Conference on Computer Communications and Networks (ICCCN 2009) (August 2009)
7. Jiang, Y., Liu, Y.: Stochastic Network Calculus. Springer, Heidelberg (2008)
8. Ciucu, F., Burchard, A., Liebeherr, J.: A network service curve approach for the stochastic analysis of networks. In: Proc. of the 2005 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems (2005)
9. Zhang, M., Xiong, Y., Zhang, Q., Sun, L., Yang, S.: Optimizing the throughput of data-driven peer-to-peer streaming. IEEE Transactions on Parallel and Distributed Systems 20(1), 97–110 (2009)
10. Zhang, X., Liu, J., Li, B., Yum, T.: CoolStreaming/DONet: a data-driven overlay network for peer-to-peer live media streaming. In: Proc. of IEEE INFOCOM 2005 (2005)
11. Zhou, Y., Chiu, D.M., Lui, J.: A simple model for analyzing p2p streaming protocols. In: Proc. ICNP 2007 (2007)
12. Fan, B., Andersen, D.G., Kaminsky, M., Papagiannaki, K.: Balancing throughput, robustness, and in-order delivery in p2p vod. In: Proc. of ACM CoNEXT (2010)
13. Chakareski, J., Frossard, P.: Utility-based packet scheduling in p2p mesh-based multicast. In: Proc. of SPIE International Conference on Visual Communication and Image Processing (VCIP 2009) (2009)
14. Wang, K., Low, S., Lin, C.: How stochastic network calculus concepts help green the power grid. In: Proceedings of the 2nd IEEE International Conference on Smart Grid Communications (2011)
15. Wu, K., Jiang, Y., Hu, G.: A calculus for information-driven networks. In: Proc. of the 17th International Workshop on Quality of Service (IWQoS 2009) (2009)
16. Yuan, Y., Wu, K., Jia, W., Jiang, Y.: Performance of acyclic stochastic networks with network coding. IEEE Transactions on Parallel and Distributed Systems 22(7), 1238–1245 (2011)
Using NFC Phones for Proving Credentials

Gergely Alpár¹,², Lejla Batina¹,³, and Roel Verdult¹

¹ Radboud University Nijmegen, ICIS/Digital Security Group, Heyendaalseweg 135, 6525 AJ Nijmegen, The Netherlands
{gergely,lejla,rverdult}@cs.ru.nl
² TNO Information and Communication Technology, The Netherlands
³ K.U. Leuven ESAT/SCD-COSIC and IBBT, Kasteelpark Arenberg 10, B-3001 Leuven-Heverlee, Belgium
[email protected]
Abstract. In this paper we propose a new solution for mobile payments called Tap2 technology. To use it, users need only their NFC-enabled mobile phones and the credentials implemented on their smart cards. An NFC device acts as a bridge between service providers and secure elements, and the secure credentials (on the card) are never revealed. In this way, secure authentication can be obtained by means of anonymous credentials, implemented on a smart card to provide the functionality with minimal data disclosure. We propose to use zero-knowledge proofs based on attribute-based anonymous credentials to provide the security and privacy requirements in mobile payments. Other use cases include online shopping, easy payment, eGovernment proofs, etc.

Keywords: NFC, smart phone, smart card, NFC reader, anonymous credential, zero-knowledge proofs.
1 Introduction

Smart phones, smart cards and other smart devices are already omnipresent in our daily lives and are used for payments, access control, transportation, etc. In particular, the ubiquity of mobile devices and the variety of services they provide have led to many new research challenges, and securing mobile communication has become essential. In addition, the necessity for cheap implementations of security protocols (due to firm constraints on area, memory, power and energy) poses risks to the security and privacy of the individuals carrying the devices. As a consequence, privacy-friendly protocols are required that also meet (sometimes very complex) security requirements. Since most mobile phones in the near future will support Near Field Communication (NFC), the importance of this technology is growing. NFC-enabled mobile phones can communicate with each other and with other such devices, e.g., contactless cards, creating in this way an NFC-based Internet of Things. NFC-enabled phones are used in many applications, providing links to smart posters, mobile payments, etc.

J.B. Schmitt (Ed.): MMB & DFT 2012, LNCS 7201, pp. 317–330, 2012.
© Springer-Verlag Berlin Heidelberg 2012

318 G. Alpár, L. Batina, and R. Verdult

All these services require secure authentication
and communication on both sides; furthermore, the threats and the capabilities of adversaries are ever increasing. Mobile payments, in particular, pose a challenge due to the requirements involved. There exist contactless payment schemes developed by MasterCard, VISA, etc. These online payment applications use NFC channels as a new communication means; one disadvantage is that peer-to-peer payments are not possible. Existing online mobile payments exhibit weaknesses on both the network side and the phone side, as there are simply too many things that can go wrong, especially considering implementation issues, malware, etc. In this work we attempt to overcome the issues mentioned above. We propose to rely only on an NFC phone and a smart card to enable secure services such as mobile banking. More precisely, because an NFC phone can also act as a reader, we separate the two, making current personal smart card readers obsolete. This setting simplifies all interesting scenarios and improves the security of the system. It is made possible by applying zero-knowledge proofs with anonymous credentials to maintain strong security and privacy. In this work we explain how to use this concept for various services, e.g., e-banking, online shopping, content protection, etc. Our solution is more convenient for both users and service providers (SPs). Users are increasingly aware of the importance of privacy and anonymity in digital communication, and in this case there is no additional burden, as their phone is all they need. A service provider, on the other hand, benefits from higher security by relying on credentials (issued by the SP or a trusted authority). In short, we advocate the transition to new authentication means.
Instead of the current situation, in which a user has many authentication methods that require either carrying devices around (USB token, smart card, random reader, etc.) or remembering some secret information (password, PIN), a user needs only his NFC-enabled phone to complete various security services. This way of using mobile phones as readers allows location flexibility (i.e., mobility) for the users, which is a step forward in today's digital evolution.

1.1 Contribution
The contributions of this work are as follows:
– We propose a new solution for mobile banking as a particular example of the Tap2 technology, aiming to improve on existing solutions. In the same way, use cases such as online shopping, streaming services, eGovernment applications, etc. can be devised.
– We propose to use zero-knowledge proofs based on attribute-based anonymous credentials to provide the security and privacy requirements in the use cases mentioned above.
– We introduce the separation of smart card and phone. As NFC devices can act as a bridge between service providers and secure elements, it is reasonable to keep credentials in a secure environment so that they are never revealed. Only proofs about these credentials (e.g., attributes, ownership, etc.) are disclosed upon a legitimate request. NFC phones, having more computational and memory resources, are suitable platforms to implement
complex protocols behind anonymous credentials. This enables stronger security at lower cost.
– This separation also leads to a more unified approach in terms of modularity. By joining the forces of a powerful mobile device and a tamper-resistant smart card, not only is secure authentication (identity proof) possible, but also more fine-grained proofs. Anonymous credentials, implemented on a smart card, provide functionalities to prove qualities of the credential as well as of its owner with high security assurance and minimal data disclosure.

1.2 Outline
The remainder of this paper is organized as follows. In Sect. 2 we give basic information about NFC technology and mention some related work on NFC solutions, i.e., protocols and their security issues. In addition, we describe an existing e-banking solution and outline the necessary cryptographic concepts underlying our proposal. The model for our solution is detailed in Sect. 3. Our new solution, called Tap2 technology, is introduced in Sect. 4, and one of its possible use cases (mobile banking) is explained in Sect. 5. Section 6 briefly discusses possible threats and attacks. Section 7 concludes the paper and suggests some avenues for future research.
2 Background
There are various protocols that could be improved, in terms of security and usability, by using NFC technology. However, to limit the scope of this paper, we apply our generic credential-proving method only to payment services. In this section we first introduce NFC features and technology, and then describe an e-banking solution that is frequently used in Europe to perform on-line financial transactions. We also introduce our model notation and describe the protocol we aim to improve. In addition, we summarize the cryptographic concepts we use as the main building blocks of our model.

2.1 Technology
The Near Field Communication (NFC) technology is an extension of several Radio Frequency IDentification (RFID) proximity communication standards [12,11,17]. It combines the high-frequency (13.56 MHz) RFID standards and reformulates them, adding further features, into two new communication standards [13,14]. The two main new features added in these standards are peer-to-peer connections between two active NFC devices (NFCIP) and the emulation of a passive proximity RFID tag. The initial goal behind NFC technology is to establish more complex wireless channels that operate at proximity distance. This makes NFC much more ambitious than RFID systems, which are limited to plain identification, tracking of unique card numbers, or storing small monetary values in the memory of an RFID tag. Because NFC is backwards compatible with RFID systems, several deployed devices can be accessed by an NFC-enabled device. These include electronic passports and identity cards based on the ICAO standard [10], most contactless public transport tickets, and access control tags that operate at 13.56 MHz. While the RFID standards merely specify the modulation, encoding, and start and stop conditions of the communication, NFC extends this specification by adding application formats [23,20,22,21] and the integration of (multiple) secure elements [6,5,16,15]. Secure elements are comparable to regular smart cards and are available in many forms, e.g., a contactless smart card, a Universal Integrated Circuit Card (UICC), a MicroSD card with an RF interface, or an internal embedded chip integrated into an NFC controller chip. The latter is used in the popular Google Nexus S phone, alongside the UICC that contains the Subscriber Identity Module (SIM) and is supplied by the user's telephone company. Since the introduction of NFC technology, a number of protocols have been proposed [4,24,28,1,9] for specific applications. None of these protocols generalizes the use of the technology such that only the essential security requirements for proving a credential are addressed. A more generic approach could serve most applications without redesigning the original protocol. Several security issues have been found in drafts of the standards and in early prototypes of NFC-enabled devices [19,29,26]. These papers address weak spots in rapidly built systems and protocols that have not been carefully tested or formally verified.

2.2 E-Banking
Current payment solutions use complex security protocols to authenticate, verify, and approve credentials. Service providers, such as banks, often use the most advanced protocols to prevent fraudulent transactions. This paper therefore proposes the use of NFC technology in the widely deployed on-line banking solution based on the EMV Chip Authentication Program (CAP). EMV-CAP was introduced by Mastercard International in 2004 to prove possession of a set of credentials by the user. These credentials include the smart card and the PIN code corresponding to the user's bank account. E-banking protocols often make use of a second secure channel provided by a dedicated smart card and a reader, both deployed by the bank itself. These protocols heavily depend on complex cryptographic calculations performed by the smart card. The smart cards and readers are made of tamper-resistant hardware to provide a trusted infrastructure on the consumer's side. Protocols like these are considered more secure because of their two-factor authentication (smart card and PIN code) and their strict separation between the on-line channel (Internet) and the off-line channel (smart card and reader). The downside of using a separate smart card and reader is the extra user interaction required for each transaction. We present the required steps in a model of the EMV-CAP protocol in Figure 1; the protocol is currently used by several European banks such as Barclays, RBS, ABN AMRO, Rabobank, KBC, and Nordea.
Fig. 1. Protocol used by EMV-CAP (participants: smart card SC, reader R, user U, and bank SP). The user enters the PIN and the bank's nonce n on the reader; the card computes Proof(n), which the user copies from the reader's display back to the bank; the transaction details td are only displayed.
What immediately strikes us in Figure 1 is the fact that the transaction details td are not signed by the smart card. They are shown to the user, but not included in the proof generated by the smart card. Some banks provide an optional extra user input after entering the nonce n when transferring a large amount of money. This, however, increases the number of user interactions during the transaction and makes it less attractive for regular transactions. Since this extra user input is used only occasionally, we discard it as part of the protocol. More details about this protocol are available under a non-disclosure agreement with Mastercard, but most of its operations have been reverse engineered and published by Drimer et al. [3].

2.3 Cryptographic Concepts
Smart cards are reasonably cheap hardware devices that provide tamper resistance for secret keys and are equipped to perform cryptographic computations. As smart cards are prevalent, convenient for users, and clearly linked to distinct services, they are expected to remain dominant for the foreseeable future. Smart cards are able to support not only traditional public-key infrastructures (PKIs), but also more complex cryptographic protocols. A zero-knowledge (ZK) proof is the most important example of such a cryptographic technique for achieving privacy and security at the same time. A ZK protocol is a challenge–response algorithm that enables a prover to convince a verifier of the validity of a statement without releasing any knowledge beyond the validity of the statement. While an interactive ZK proof requires the presence of both participants during the protocol, a non-interactive ZK proof can be generated by the prover alone. Interactive proof protocols can be turned into non-interactive ones by applying the Fiat–Shamir heuristic [7], that is, by substituting a cryptographic hash for the verifier's random challenge. By extending the input of the hash function with a message, a non-interactive proof of knowledge of a secret value (such as a private key) can be transformed into a digital signature. An anonymous credential is a credential, a data structure signed by a trusted issuing authority, that reveals no information about the identity of its owner. In order to verify a credential, the verifier is assumed to know the public key of the issuer. The most important anonymous credential technologies, such as U-Prove [25] and Idemix [27], allow several attributes to be signed in one credential. While a basic ZK proof demonstrates merely the fact of owning a credential signed by a certain issuer (e.g., "I have a driving license"), with selective disclosure a user can also reveal some attributes (e.g., "I have a driving license which is due to expire on 12-Dec-2012"). The most advanced functionality of an anonymous credential enables credential owners to provide property proofs about the attributes (e.g., "I have a driving license which is still valid (i.e., today ≤ expiry day) and the card is not revoked (i.e., the card number is not on the revocation list)"). Note that the identity of the prover is not revealed in these examples. Zero-knowledge proofs and anonymous credentials are important building blocks in applications to achieve privacy for the users (provers) as well as security for the service providers (verifiers). Since wireless technologies are susceptible to eavesdropping and man-in-the-middle attacks, these techniques play an essential role in preventing information leakage.
To summarize, smart cards are an ideal means for the construction of privacy-friendly and secure protocols by carrying anonymous credentials and delivering zero-knowledge proofs about attributes.
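As an illustration of these concepts, the following sketch applies the Fiat–Shamir heuristic to Schnorr's identification protocol, turning an interactive proof of knowledge of a private key into a non-interactive one bound to a message. The group parameters are toy-sized and all names are ours, not those of U-Prove or Idemix:

```python
# Non-interactive ZK proof of knowledge of x with h = g^x mod p, obtained
# by replacing the verifier's challenge with a hash (Fiat-Shamir).
# Toy parameters for readability; real deployments use groups of
# cryptographic size with properly generated parameters.
import hashlib
import secrets

p = 2**127 - 1   # toy prime modulus (not a secure choice in practice)
q = p - 1        # exponents are reduced modulo the group order
g = 3            # assumed generator

def challenge(h: int, a: int, msg: bytes) -> int:
    """Fiat-Shamir: hash of public key, commitment, and message."""
    return int.from_bytes(
        hashlib.sha256(f"{h}|{a}|".encode() + msg).digest(), "big") % q

def prove(x: int, msg: bytes):
    h = pow(g, x, p)                 # public key
    r = secrets.randbelow(q)         # random commitment exponent
    a = pow(g, r, p)                 # commitment
    c = challenge(h, a, msg)         # would be the verifier's challenge
    s = (r + c * x) % q              # response; reveals nothing about x
    return h, a, s

def verify(h: int, a: int, s: int, msg: bytes) -> bool:
    c = challenge(h, a, msg)
    return pow(g, s, p) == (a * pow(h, c, p)) % p   # g^s == a * h^c

x = secrets.randbelow(q)             # the credential's secret key
h, a, s = prove(x, b"nonce-42")
assert verify(h, a, s, b"nonce-42")  # convinces without revealing x
```

Including the message in the hash is what turns the proof into a signature on that message; verifying the same proof against a different message fails, because the recomputed challenge no longer matches.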
3 Model
In this section we define our model and specify the security requirements of our system.

3.1 Participants
In our model a user U is a human being who is assumed to have an NFC-enabled mobile phone and a smart card SC. A smart card carries an anonymous credential issued by a trusted issuing authority IP. Although a user can have several smart cards in practice, we assume only one smart card in a protocol run. A mobile phone has two states:
– When it is in normal mode, the mobile phone is denoted by M. Though it provides full functionality, it is untrusted, e.g., prone to be infected by malware.
– In order to carry out security-sensitive tasks, U can switch the mobile phone to its trusted mode, denoted by TM.
In a protocol, a user, employing his hardware devices, is required by a service provider SP to prove some statement. The credential contains a secret key and further attributes, signed by IP, that enable the user to construct the expected proof. Table 1 lists all the participants with their abbreviations.

Table 1. Participants

U    User
SC   Tamper-resistant smart card with a credential
M    NFC-enabled mobile phone in normal mode
TM   NFC-enabled mobile phone in trusted mode
SP   Service provider
IP   (implicitly) Trusted authority issuing credentials

3.2 Security Requirements
Here we elaborate on the security requirements that we assume or wish to obtain. As mentioned above, we aim at privacy-friendly solutions that meet the security requirements. This requires substantial computational power, but considering the trends in mobile phones, such power seems readily available. In addition, as mobile devices are becoming more and more powerful, it is reasonable to assume that individuals will want to carry out most tasks on them instead of on desktop computers. On the other hand, mobile devices, especially as they represent more and more value, are vulnerable to theft. For this purpose, we envision an intuitive new technology: the trusted mode of mobile devices. A user should be able to switch a mobile phone between trusted and normal modes using a hardware button. When in trusted mode, not only does the phone restrict its set of functionalities, but it also changes its appearance. A mobile phone in trusted mode, while emitting some distinct visual sign, can only communicate through NFC, and it has internal access only to a trusted domain (such as a secure element) that allows cryptographic computations and secure storage. Note that no software means may switch between modes and that the visual sign must be out of reach of other hardware or software elements. To summarize, our model assumes or implies the following:
– A secure and privacy-preserving authentication method for the user
– Reliable information about the user for the SP, issued by a trusted authority (IP) (because of the credential)
– Credentials never leave the smart card; only a proof of ownership (and possibly some knowledge of attributes) of the credential does. In other words, we assume selective disclosure of attributes and property proofs of the attributes, i.e., the use of attribute-based anonymous credentials such as U-Prove [25] or Idemix [27].
– The authentication is only possible if both the user and the smart card are present.
– The service provider obtains reliable information about the user (which is not the case for password authentication).
– In the case of loss or theft of a mobile phone or a smart card, no unauthorized party should be able to gain access to the user's credentials or authenticate on his behalf.
– The smart card is assumed to be tamper-resistant; attacks aiming at key recovery using side-channel information, such as the power consumption [18] or electromagnetic emanations of the device, are out of the scope of this paper.
Considering the issue of trust in mobile phones, we assume the following for the resulting protocols:
– If the mobile phone is trusted, it can act simply as a card reader. We foresee that several smart cards will stay separate from the mobile phone, such as bank and credit cards, identity cards, driving licenses, etc. So that the phone cannot store any valuable information about the credentials, smart cards should communicate only proofs about the credentials and attributes. (Note, however, that the fact that the smart card has produced a proof might be stored.)
– If the mobile phone is partially trusted, limited permissions can be provided, as in ABN AMRO's app1, in which a separate PIN code has to be used (not the one on the card) and only read-only account information and transfers to the user's own accounts are allowed.
– If the mobile phone is assumed to be prone to infection by malicious software, we propose a trusted state for mobile phones. In this trusted state a phone has restricted access to its own resources: it can use NFC functionality, perform cryptographic computations, and take secure input.
– Another solution is the inclusion of a secure element, which is out of the scope of this work.
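The trusted-mode assumption above can be summarized in a small state-machine sketch; the class, mode, and capability names are illustrative, not an existing phone API:

```python
# Sketch of the assumed phone-mode model: in trusted mode (TM) only NFC,
# cryptographic operations, and secure input remain available, and the
# switch is triggered by a hardware event, never by software.
class Phone:
    TRUSTED_CAPS = {"nfc", "crypto", "secure_input"}
    NORMAL_CAPS = TRUSTED_CAPS | {"internet", "apps", "display"}

    def __init__(self):
        self.mode = "M"  # normal mode, full functionality but untrusted

    def hardware_switch(self):
        """Only a physical button toggles M <-> TM (no software path)."""
        self.mode = "TM" if self.mode == "M" else "M"

    def capabilities(self):
        return self.TRUSTED_CAPS if self.mode == "TM" else self.NORMAL_CAPS

phone = Phone()
assert "internet" in phone.capabilities()      # normal mode
phone.hardware_switch()                        # user presses the button
assert phone.mode == "TM"
assert "internet" not in phone.capabilities()  # TM: NFC-only communication
```

The key design point mirrored here is that no method other than `hardware_switch` changes `mode`, modeling the requirement that malware on the phone cannot fake the trusted state.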
4 Tap2 Technology
In [2] the Snap2 user-friendly technology was designed for authentication and payment in web applications. A QR code (a two-dimensional barcode) containing channel information is generated by the service provider's web server and scanned by the user's mobile phone. Using this information, the mobile phone can generate an authentication (or a payment) and send it to the web server through the Internet. Referring to the similarities and differences, we call our technology Tap2. A Tap2 scheme includes the participants described in Table 1, and it requires the user to have his/her mobile phone and the relevant smart card present. A user can perform fundamental proofs (see Table 2) by letting the mobile phone and the smart card communicate through NFC and by switching the mobile phone to the trusted mode for sensitive tasks such as entering a PIN code.

1 http://www.abnamro.nl/en/prive/slimbankieren/mobiel-bankieren/introduction.html
Table 2. Types of proofs

Abbreviation  Type                         Components
CRED          Owning credential            signature
ID            Having certain identity      identifiable attributes
PROP          Having some characteristics  attributes and/or attribute properties
APPR          Approval                     proof from above and authentic consent
A smart card SC contains an attribute-based anonymous credential and the logic required for the following proofs. To prove that a credential is present, SC generates a signature on a given nonce, based on the proof of knowledge of the corresponding secret key. If identification is required, SC selectively discloses an identifying attribute (or an identifying set of attributes) and proves that it is indeed contained in the credential. This proof implicitly contains the credential proof itself. Similarly, a property proof also inherently proves knowledge of an anonymous credential's secret key; however, the proof contains zero-knowledge subproofs about some properties of attributes. Finally, an approval is any of the proofs above augmented by the user's approval of a transaction. Figure 2 shows the dependencies of these proofs on each other and on the participants.
Fig. 2. Dependencies of proofs
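The dependencies of Table 2 and Fig. 2, with each proof type building on the credential proof, can be sketched as follows; the dictionary structure and function names are illustrative stand-ins for the actual zero-knowledge machinery:

```python
# Toy model of the proof hierarchy: ID and PROP extend the basic CRED
# proof, and APPR wraps any of them with the user's authentic consent.
def prove_cred(nonce):
    """Signature on the nonce, proving knowledge of the credential's key."""
    return {"type": "CRED", "signature": f"zkproof({nonce})"}

def prove_id(nonce, attrs, disclose):
    proof = prove_cred(nonce)                # implicitly proves CRED
    proof.update(type="ID",
                 disclosed={k: attrs[k] for k in disclose})
    return proof

def prove_prop(nonce, attrs, predicate):
    proof = prove_cred(nonce)                # implicitly proves CRED
    proof.update(type="PROP",
                 property=predicate(attrs))  # result of a ZK sub-proof
    return proof

def approve(inner_proof, consent):
    return {"type": "APPR", "proof": inner_proof, "consent": consent}

attrs = {"name": "Alice", "expiry": "2012-12-12"}
p = approve(prove_prop("n1", attrs,
                       lambda a: a["expiry"] >= "2012-03-19"),  # still valid?
            consent="PIN-verified")
assert p["type"] == "APPR" and p["proof"]["property"] is True
```

Note how every outer proof carries the inner `signature` field, mirroring the statement that ID and PROP proofs implicitly contain the credential proof.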
5 Applications

5.1 A Use Case: Mobile Banking
On-line banking is carried out with two-factor authentication and transaction approval; besides a password authentication, a second means is applied: a temporary one-time password is generated either by the bank and sent by e-mail or SMS, or by the user's on-line banking card reader. Our scheme (see Figure 3) is highly secure, yet simple and intuitive for the user, requiring no additional special device or network channel. Since there is much interaction between the card and the mobile phone, the simplest way for a user is to place the phone on the bank card and follow the instructions. However, the user has to take care to enter the PIN code only when the mobile device is in trusted mode (TM).
Fig. 3. E-banking scheme (participants: user U, mobile phone in trusted mode TM, smart card SC, mobile phone in normal mode M, and bank SP). The bank sends td and n; the card verifies td (step 1); the user is alerted to switch to trusted mode (step 2); in trusted mode td is displayed and the PIN with the user's confirmation is passed to the card (step 3); after switching back to normal mode, Proof(td, n) is sent to the bank (step 4).
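The message flow of Fig. 3 can be walked through in a toy script; HMACs stand in for the bank's signature and the card's anonymous-credential proof, and all keys and names are illustrative:

```python
# Toy walk-through of the Fig. 3 flow (steps 1-4). The real scheme uses the
# anonymous-credential proofs of Sect. 2.3 rather than shared-key HMACs.
import hashlib
import hmac

def sign(key: bytes, data: str) -> str:
    return hmac.new(key, data.encode(), hashlib.sha256).hexdigest()

bank_key, card_key, pin = b"bank-key", b"card-key", "1234"

# 1. The bank sends the transaction details td (with its signature) and
#    a fresh nonce n to the phone, which forwards them over NFC.
td = "from:NL01 to:NL02 amount:100EUR"
n = "7d2c"
td_sig = sign(bank_key, td)

# 2. The smart card verifies the bank's signature on td.
assert hmac.compare_digest(td_sig, sign(bank_key, td))

# 3. In trusted mode the user reads td and enters the PIN; the card
#    checks the PIN before it is willing to sign anything.
assert pin == "1234"

# 4. The card signs td and n together, so the approved transaction
#    cannot be swapped for another one (unlike plain EMV-CAP, Fig. 1).
proof = sign(card_key, td + n)

# The bank accepts only a proof over the exact td it sent.
assert hmac.compare_digest(proof, sign(card_key, td + n))
```

The essential difference from the EMV-CAP model is visible in step 4: the proof binds the transaction details, not just the nonce.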
A money transfer transaction in the mobile banking application works as follows from the user’s point of view:
1. After the user prepares a money transfer and sends it to the server, the server requests an authorized approval, that is, the user's consent and a proof that the authentic user and the required bank card are present. The request contains the transaction details td, i.e., information about the transfer and the bank's signature on it, and a nonce n. These variables are communicated to the mobile phone through the Internet and to the smart card through NFC.
2. The smart card verifies the bank's signature in td. The mobile phone displays an alert to the user to switch the device into its trusted mode.
3. Having switched M to TM, the user sees the relevant information about the transfer on the mobile phone: bank, account numbers, and amount. To approve the transaction, the user enters his PIN on the mobile phone, which is communicated to the smart card through NFC.
4. While the user switches back to M (normal mode), the smart card signs td as a proof of approval. The proof is sent to the bank's server through the mobile device.
The transaction details td play an essential role in the protocol. They show the identity of the bank and the description of the transaction (e.g., in the case of a money transfer: source bank account, destination bank account, and sum). The same tuple td should be used both to display information to the user and as input to the smart card. The user has important roles in providing authenticity and security. Firstly, the user has to switch his mobile phone to its trusted state for the most sensitive tasks. Secondly, on receiving the transaction details, he has to approve the transaction by providing his PIN code. Finally, he has to switch the phone back to its normal state so that it can connect to the bank's server through the Internet and send the proof signature from the card to the server.

5.2 Other Applications
Next we give a list of examples, with a classification that points out which features of the protocol could be used.
– Pay the bill. We assume that an invoice is equipped with an NFC tag that contains a prepared payment transaction to the provider's bank account. By tapping the tag first and then a bank card, a user can approve the transaction with a mobile phone using the credentials from the bank. For financial transactions it is preferable to use trusted mode.
– Peer-to-peer payment. User A prepares a signed statement with the transaction details on the mobile phone. By tapping A's device and B's bank card, B can approve the transaction. The security requirements are similar to those of Pay the bill.
– eGovernment identity proofs. To prove his identity, a user may decide not to switch to trusted mode. No secret information is entered by the user; the identity card provides the zero-knowledge credential proof, and the user only has to actively bridge the credential to the service provider (e.g., for a customs declaration).
– Ordering cigarettes from a vending machine. An identity card could also be used to provide a proof that a user is old enough to buy cigarettes. For such a simple proof, the user could skip trusted mode as well as confirmation, since it reveals hardly any private detail about the user.
6 Threat Analysis
In this section we briefly consider the threats that apply and that we aim to protect against. Our solution is intended to protect against the following attacks.
– Phishing: An attacker mounting an online phishing attack would fail because no PC is involved, in contrast to other solutions, e.g., those using SSL.
– Relay attacks such as those described by Francis et al. [8] apply when one assumes that NFC communication can be eavesdropped or even modified. However, an attacker cannot generate a valid proof without knowing the credentials on the smart card.
– In the case of phone or smart card theft or loss, the fact that two devices are required for authentication somewhat reduces the risks. In addition, if a phone is stolen or lost, Tap2 can require the authentication of the user; if this fails, the phone is locked. Also, if a phone is lost, users can revoke their Tap2 credentials at the SP. The same applies to the loss of a smart card.
– Malware on phones is probably the most serious threat; it should be solved in general, so we do not consider it specifically in this work.
– Implementation or physical attacks using a side channel such as power consumption or electromagnetic emanation are out of the scope of this paper, as mentioned above.
7 Conclusions and Future Work
A new solution for mobile payments called Tap2 technology is proposed in this work; it requires from users only their NFC-enabled mobile phones and the credentials implemented on their smart cards. Secure authentication (proof) is obtained by means of anonymous credentials implemented on a smart card, which provide the functionality with minimal data disclosure. The idea can be extended to other use cases such as online shopping, streaming services, and eGovernment proofs. For future work, it would be interesting to incorporate a mobile trusted module (secure element) into the protocols. An interesting research direction is to investigate the possible integration of open technologies, such as OpenID, in this context. Another natural extension, given the possibilities of Idemix (handling several anonymous credentials with the same user master key), is to include more than one smart card in a single protocol.
Acknowledgement. This work was supported in part by the European Commission under contract number ICT-2007-216676 ECRYPT NoE phase II and by the research programme Sentinels as project Mobile IDM (10522). Sentinels is being financed by Technology Foundation STW, the Netherlands Organization for Scientific Research (NWO), and the Dutch Ministry of Economic Affairs.
References

1. Chen, W., Hancke, G.P., Mayes, K.E., Lien, Y., Chiu, J.-H.: NFC Mobile Transactions and Authentication Based on GSM Network. In: International Workshop on Near Field Communication, pp. 83–89 (2010)
2. Dodson, B., Sengupta, D., Boneh, D., Lam, M.S.: Secure, Consumer-Friendly Web Authentication and Payments with a Phone. In: Conference on Mobile Computing, Applications, and Services (MobiCASE 2010), Santa Clara, CA, USA (2010)
3. Drimer, S., Murdoch, S.J., Anderson, R.J.: Optimised to Fail: Card Readers for Online Banking. In: Dingledine, R., Golle, P. (eds.) FC 2009. LNCS, vol. 5628, pp. 184–200. Springer, Heidelberg (2009)
4. Dunnebeil, S., Kobler, F., Koene, P., Leimeister, J.M., Krcmar, H.: Encrypted NFC Emergency Tags Based on the German Telematics Infrastructure. In: International Workshop on Near Field Communication, pp. 50–55 (2011)
5. Smart Cards; UICC - Contactless Front-end (CLF) Interface; Host Controller Interface (HCI), ETSI TS 102 613 (2008)
6. Smart Cards; UICC - Contactless Front-end (CLF) Interface; Part 1: Physical and data link layer characteristics, ETSI TS 102 613 (2011)
7. Fiat, A., Shamir, A.: How to Prove Yourself: Practical Solutions to Identification and Signature Problems. In: Odlyzko, A.M. (ed.) CRYPTO 1986. LNCS, vol. 263, pp. 186–194. Springer, Heidelberg (1987)
8. Francis, L., Hancke, G., Mayes, K., Markantonakis, K.: Practical NFC Peer-to-Peer Relay Attack Using Mobile Phones. IACR e-print archive (April 2010)
9. Gauthier, V.D., Wouters, K.M., Karahan, H., Preneel, B.: Offline NFC Payments with Electronic Vouchers. In: Proceedings of the 1st ACM Workshop on Networking, Systems, and Applications for Mobile Handhelds, MobiHeld 2009, pp. 25–30. ACM, New York (2009)
10. Machine Readable Travel Documents (2003)
11. Identification cards — Contactless integrated circuit(s) cards — Vicinity cards, ISO/IEC 15693 (2000)
12. Identification cards — Contactless integrated circuit cards — Proximity cards, ISO/IEC 14443 (2001)
13. Information technology — Telecommunications and information exchange between systems — Near field communication interface and protocol 1 (NFCIP-1), ISO/IEC 18092 (2004)
14. Information technology — Telecommunications and information exchange between systems — Near field communication interface and protocol 2 (NFCIP-2), ISO/IEC 21481 (2005)
15. Information technology — Telecommunications and information exchange between systems — Near field communication wired interface (NFC-WI), ISO/IEC 28361 (2007)
16. Information technology — Telecommunications and information exchange between systems — Front-end configuration command for NFC-WI (NFC-FEC), ISO/IEC 16353 (2011)
17. Specification of Implementation for Integrated Circuit(s) Cards (JICSAP/JSA JIS X 6319) (2005)
18. Kocher, P., Jaffe, J., Jun, B.: Differential Power Analysis. In: Wiener, M. (ed.) CRYPTO 1999. LNCS, vol. 1666, pp. 388–397. Springer, Heidelberg (1999)
19. Mulliner, C.: Vulnerability Analysis and Attacks on NFC-enabled Mobile Phones. In: Proceedings of the 1st International Workshop on Sensor Security (IWSS) at ARES, Fukuoka, Japan, pp. 695–700 (March 2009)
20. Technical Specification, NFC Data Exchange Format (NDEF), NDEF 1.0 (2006)
21. Technical Specification, NFC Record Type Definition (RTD), RTD 1.0 (2006)
22. Technical Specification, Connection Handover, Connection Handover 1.2 (2010)
23. Technical Specification, Smart Poster Record Type Definition (2006)
24. Opperman, C.A., Hancke, G.P.: A Generic NFC-enabled Measurement System for Remote Monitoring and Control of Client-side Equipment. In: International Workshop on Near Field Communication, pp. 44–49 (2011)
25. Paquin, C.: U-Prove Cryptographic Specification V1.1. Technical report, Microsoft (February 2011), https://connect.microsoft.com/site1188/Downloads
26. Roland, M., Langer, J., Scharinger, J.: Security Vulnerabilities of the NDEF Signature Record Type. In: International Workshop on Near Field Communication, pp. 65–70 (2011)
27. IBM Research Zürich Security Team: Specification of the Identity Mixer Cryptographic Library, Version 2.3.3. Technical report, IBM Research, Zürich (June 2011), https://prime.inf.tu-dresden.de/idemix/
28. Steffen, R., Preissinger, J., Schollermann, T., Muller, A., Schnabel, I.: Near Field Communication (NFC) in an Automotive Environment. In: International Workshop on Near Field Communication, pp. 15–20 (2010)
29. Verdult, R., Kooman, F.: Practical Attacks on NFC Enabled Cell Phones. In: International Workshop on Near Field Communication, pp. 77–82 (2011)
Author Index
Abdollahpouri, Alireza Alpár, Gergely 317 Andova, Suzana 136
182
Baldine, Ilia 197 Batina, Lejla 317 Bogdoll, Jonathan 249 Boyer, Marc 258
Lê, Minh 31 Le Boudec, Jean-Yves Lin, Chuang 302
258
1
Eittenberger, Philipp M.
Plaga, Rainer
288
Remke, Anne Riedl, Martin
166 244
240
Ferrer Fioriti, Luis María
151
Georgievska, Sonja 136 German, Reinhard 91 Gilani, Wasif 206 Gouberman, Alexander 244 Hartmanns, Arnd 249 Haverkort, Boudewijn R. 106, 166 Hermanns, Holger 151, 249 Hettinga, Sjors 166 Hielscher, Kai-Steffen 91 Hoefling, Michael 236 Höfig, Kai 61 Huang, Shu 197 Ivkovic, Nikola
273
Mangoua Sofack, William Marshall, Alan 206 Meitner, Matthias 46 Menth, Michael 236
Calborean, Horia 221 Camen, Marcus 236 Daduna, Hans
Koob, Frank 288 Kresic, Dario 91 Krieger, Udo R. 240
Sadre, Ramin 166 Saglietti, Francesca 46 Saul, Lars Peter 1 Schuster, Johann 244 Sharma, Arpit 121 Siegle, Markus 244 Tomozei, Dan-Cristian Ungerer, Theo
273
221
Vastag, Sebastian 76 Verdult, Roel 317 Vintan, Lucian 221
91
Jahr, Ralf 221 Jiang, Yuming 302 Jongerden, Marijn R.
106
Kniep, Christian 236 Kolesnikov, Andrey 253
Walter, Max 31 Wang, Jinting 16 Wang, Kai 302 Winkler, Ulrich 206 Wolfinger, Bernd E. 182 Zhang, Feng
16