Lecture Notes in Computer Science Commenced Publication in 1973 Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board David Hutchison Lancaster University, UK Takeo Kanade Carnegie Mellon University, Pittsburgh, PA, USA Josef Kittler University of Surrey, Guildford, UK Jon M. Kleinberg Cornell University, Ithaca, NY, USA Alfred Kobsa University of California, Irvine, CA, USA Friedemann Mattern ETH Zurich, Switzerland John C. Mitchell Stanford University, CA, USA Moni Naor Weizmann Institute of Science, Rehovot, Israel Oscar Nierstrasz University of Bern, Switzerland C. Pandu Rangan Indian Institute of Technology, Madras, India Bernhard Steffen University of Dortmund, Germany Madhu Sudan Massachusetts Institute of Technology, MA, USA Demetri Terzopoulos University of California, Los Angeles, CA, USA Doug Tygar University of California, Berkeley, CA, USA Gerhard Weikum Max-Planck Institute of Computer Science, Saarbruecken, Germany
5382
Frank S. de Boer Marcello M. Bonsangue Susanne Graf Willem-Paul de Roever (Eds.)
Formal Methods for Components and Objects 6th International Symposium, FMCO 2007 Amsterdam, The Netherlands, October 24-26, 2007 Revised Papers
Volume Editors Frank S. de Boer Centre for Mathematics and Computer Science, CWI Kruislaan 413, 1098 SJ Amsterdam, The Netherlands E-mail: [email protected] Marcello M. Bonsangue Leiden University, Leiden Institute of Advanced Computer Science P.O. Box 9512, 2300 RA Leiden, The Netherlands E-mail: [email protected] Susanne Graf VERIMAG 2 Avenue de Vignate, Centre Equitation, 38610 Grenoble-Gières, France E-mail: [email protected] Willem-Paul de Roever Christian-Albrechts University Kiel Institute of Computer Science and Applied Mathematics Hermann-Rodewald-Straße 3, 24118 Kiel, Germany E-mail: [email protected]
Library of Congress Control Number: 2008940690
CR Subject Classification (1998): D.2, D.3, F.3, D.4
LNCS Sublibrary: SL 2 – Programming and Software Engineering
ISSN 0302-9743
ISBN-10 3-540-92187-7 Springer Berlin Heidelberg New York
ISBN-13 978-3-540-92187-5 Springer Berlin Heidelberg New York
Preface

Large and complex software systems provide the necessary infrastructure in all industries today. In order to construct such large systems in a systematic manner, the focus in development methodologies has switched in the last two decades from functional issues to structural issues: both data and functions are encapsulated into software units which are integrated into large systems by means of various techniques supporting reusability and modifiability. This encapsulation principle is essential to both the object-oriented and the more recent component-based software engineering paradigms.

Formal methods have been applied successfully to the verification of medium-sized programs in protocol and hardware design. However, their application to the development of large systems requires more emphasis on specification, modeling and validation techniques supporting the concepts of reusability and modifiability, and their implementation in new extensions of existing programming languages like Java.

The 6th Symposium on Formal Methods for Components and Objects was held in Amsterdam, The Netherlands, during October 24–26, 2007. It was realized as a concertation meeting of European projects focusing on formal methods for components and objects. This volume contains the contributions submitted after the symposium by the speakers of each of the following European IST projects involved in the organization of the program jointly with the bilateral NWO/DFG project Mobi-J:

– The IST-FP6 project Mobius, aiming at developing the technology for establishing trust and security for the next generation of global computers, using the proof-carrying code paradigm. The contact persons are Martin Hofmann (Ludwig Maximilians University Munich, Germany) and Gilles Barthe (IMDEA Software, Spain).
– The IST-FP6 project SelfMan, on self-management for large-scale distributed systems based on structured overlay networks and components. The contact person is Peter Van Roy (Université Catholique de Louvain, Belgium).
– The IST-FP6 project GridComp and the FP6 CoreGRID Network of Excellence, on grid programming with components. The contact person is Denis Caromel (INRIA Sophia-Antipolis, France).
– The Real-time component cluster of the Network of Excellence on Embedded System Design ARTIST. This cluster focuses on design processes and architectures for real-time embedded systems. The contact person is Albert Benveniste (INRIA / IRISA, France).
– The IST-FP6 project CREDO, on modelling and analysis of evolutionary structures for distributed services. The contact person is Frank de Boer (CWI, The Netherlands).
The proceedings of the previous editions of FMCO have been published as volumes 2852, 3188, 3657, 4111, and 4709 of Springer’s Lecture Notes in Computer Science. We believe that these proceedings provide a unique combination of ideas on software engineering and formal methods which reflect the expanding body of knowledge on modern software systems. Finally, we thank all authors for the high quality of their contributions, and the reviewers for their help in improving the papers for this volume.
September 2008
Frank de Boer Marcello Bonsangue Susanne Graf Willem-Paul de Roever
Organization
The FMCO symposia are organized in the context of the project Mobi-J, a project funded by a bilateral research program of The Dutch Organization for Scientific Research (NWO) and the Central Public Funding Organization for Academic Research in Germany (DFG). The partners of the Mobi-J project are: the Centrum voor Wiskunde en Informatica, the Leiden Institute of Advanced Computer Science, and the Christian-Albrechts-Universität Kiel. This project aims at the development of a programming environment which supports component-based design and verification of Java programs annotated with assertions. The overall approach is based on an extension of the Java language with a notion of component that provides for the encapsulation of its internal processing of data and its composition in a network by means of mobile asynchronous channels.
Sponsoring Institutions The Dutch Organization for Scientific Research (NWO) The Dutch Institute for Programming research and Algorithmics (IPA) The Centrum voor Wiskunde en Informatica (CWI), The Netherlands The Leiden Institute of Advanced Computer Science (LIACS), The Netherlands
Table of Contents
The MOBIUS Project

The MOBIUS Proof Carrying Code Infrastructure (An Overview) . . . . . . 1
Gilles Barthe, Pierre Crégut, Benjamin Grégoire, Thomas Jensen, and David Pichardie

Certification Using the Mobius Base Logic . . . . . . 25
Lennart Beringer, Martin Hofmann, and Mariela Pavlova

Safety Guarantees from Explicit Resource Management . . . . . . 52
David Aspinall, Patrick Maier, and Ian Stark

Universe Types for Topology and Encapsulation . . . . . . 72
Dave Cunningham, Werner Dietl, Sophia Drossopoulou, Adrian Francalanza, Peter Müller, and Alexander J. Summers

COSTA: Design and Implementation of a Cost and Termination Analyzer for Java Bytecode . . . . . . 113
Elvira Albert, Puri Arenas, Samir Genaim, German Puebla, and Damiano Zanardini

The GridCOMP Project

Active Objects and Distributed Components: Theory and Implementation . . . . . . 133
Denis Caromel, Ludovic Henrio, and Eric Madelaine

The SELFMAN Project

Self Management for Large-Scale Distributed Systems: An Overview of the SELFMAN Project . . . . . . 153
Peter Van Roy, Seif Haridi, Alexander Reinefeld, Jean-Bernard Stefani, Roland Yap, and Thierry Coupaye

The ARTIST Project

Causal Semantics for the Algebra of Connectors (Extended Abstract) . . . . . . 179
Simon Bliudze and Joseph Sifakis

Multiple Viewpoint Contract-Based Specification and Design . . . . . .
Albert Benveniste, Benoît Caillaud, Alberto Ferrari, Leonardo Mangeruca, Roberto Passerone, and Christos Sofronis
The MOBIUS Proof Carrying Code Infrastructure (An Overview)

Gilles Barthe (1), Pierre Crégut (3), Benjamin Grégoire (2), Thomas Jensen (4), and David Pichardie (4)

1 IMDEA Software, Madrid, Spain
2 INRIA Sophia-Antipolis Méditerranée, France
3 France Télécom, France
4 INRIA Rennes Bretagne, France
Abstract. The goal of the MOBIUS project is to develop a Proof Carrying Code architecture to secure global computers that consist of Java-enabled mobile devices. In this overview, we present the consumer side of the MOBIUS Proof Carrying Code infrastructure, for which we have developed formally certified, executable checkers. We consider wholesale Proof Carrying Code scenarios, in which a trusted authority verifies the certificate before cryptographically signing the application. We also discuss retail Proof Carrying Code, where the verification is performed on the consumer device.
1 Introduction

MOBIUS [BBC+06] is a European integrated project developing basic technologies to ensure reliability and security in global computers formed of a host of Java-enabled devices, such as phones, PDAs and PCs, which provide a common runtime environment for a vast number of mobile applications. Its aim is to give users independent guarantees of the safety and security of mobile applications, using the concept of security through verifiable evidence emphasized by the Proof Carrying Code (PCC) paradigm [Nec97]. The fundamental view behind PCC is that mobile code components come equipped with a certificate that can be checked efficiently and independently by the code consumer, to ensure that downloaded components issued by the producer respect the consumer's policy. PCC complements standard security infrastructures such as PKI, which only guarantee the origin and the integrity of code, and makes an appropriate basis for the security of global computers; however, there remain significant challenges to generalize its use in security architectures for global computing, in particular:

– Need for comprehensive policies: PCC has mostly been used to enforce safety properties of applications, such as type safety and memory management safety. One goal of the MOBIUS project is to show the adequacy of PCC for enforcing basic security policies such as non-interference and resource control, and for the verification of functional properties of applications.
– Need for enhanced PCC tools: programming logics and type systems are the two basic enabling technologies for PCC. However, developing programming logics and type systems in the context of a full-blown, object-oriented programming
language such as Java raises a number of challenging issues regarding the efficiency, scalability, and trustworthiness of the PCC infrastructure itself. One goal of the MOBIUS project is to develop efficient and scalable mechanisms to generate and to check certificates, and to prove formally that the security-critical part of the PCC infrastructure is correct.

The purpose of this article is to present intermediate achievements of the project with respect to its goal of achieving efficient and trustworthy PCC tools. We consider two scenarios, wholesale and retail Proof Carrying Code, and detail for each scenario how to achieve reliable certificate checking.
2 Proof Carrying Code

The purpose of this section is to recall existing approaches to Proof Carrying Code, and to present the main characteristics of the MOBIUS approach.

2.1 Type-Based Proof Carrying Code

The most successful instance and most widely deployed application of PCC technology to date, namely the use of stackmaps in lightweight bytecode verification, uses type systems as its enabling technology.

A primer on bytecode verification. Bytecode verification [Ler03] is a central element of the Java security architecture. Its purpose is to check that applets are correctly formed and correctly typed, and that they do not attempt to perform malicious operations during their execution. To this end, the bytecode verifier (BCV) performs a structural analysis and a static analysis of bytecode programs. The structural analysis checks the absence of basic errors such as calling a method that does not exist, jumping outside the scope of the program, or not respecting modifiers such as final. While simple to enforce, these checks are important, and failing to enforce them may open the door to attacks [Gow04]. The second analysis is a static analysis of the program and is meant to ensure that programs execute in adherence with a set of safety properties, and in particular that values are used with their correct type (to avoid forged pointers). This second pass of bytecode verification is implemented as a data-flow analysis using Kildall's algorithm [Kil73]. The analysis aims at computing solutions of data-flow equations over a lattice derived from the subtyping relation between JVM types, and from a typed virtual machine which operates on the same principles as the standard JVM, except for two crucial differences: the typed virtual machine manipulates types instead of values, and executes one method at a time. In a nutshell, the algorithm manipulates so-called stackmaps that store, for each program point, an abstract state (stack type) that is the least upper bound of the abstract states that have been previously reached at this program point. The stackmap is initialized to the initial state of the method being verified for the first program point, and to a default state for the other program points. One step of execution proceeds by iterating the execution function of the virtual machine over the stackmap. A non-default state is chosen and the result of the execution of the typed
virtual machine on this state is propagated to its possible successors (by taking pointwise the least upper bound of the computed and stored abstract state types). Termination of the analysis is guaranteed since the set of states does not have infinite ascending chains, and the state stored in the history structure is increasing.

Lightweight bytecode verification. In the context of devices with limited resources, applications are verified off-device and, in case of a successful verification, signed and loaded on-device. Such a solution is not optimal, in the sense that it leaves a crucial component of the security architecture outside the perimeter of the device. In order to remedy this deficiency, there are several proposals for circumscribing the trusted computing base to the device using on-device bytecode verification. The KVM overcomes this deficiency by relying on lightweight bytecode verification (LBCV) [Ros03], a variant of bytecode verification whose objective is to minimize computations by requiring that the program come equipped with the solution of the dataflow equations. Thus, the role of the lightweight verifier is confined to checking that the solution is correct, which can be performed in one pass. More technically, a certificate in lightweight bytecode verification is a function that attaches a stackmap to each junction point in the program, where a junction point is a program point with more than one predecessor, i.e., a program point where the results of execution potentially need to be merged. Instead of performing the merging, a lightweight bytecode verifier will merely verify that, for each program point, the stackmap computed by dataflow analysis, using the stackmaps of its predecessors, is compatible with the stackmap provided by the certificate, and continue its computation with the latter. In this way, a lightweight bytecode verifier essentially checks that the candidate fixpoint is indeed a fixpoint, and that it relates suitably to the program. Such a procedure minimizes computations, since one just needs to check that the stackmap is indeed a fixpoint. This, as already mentioned, can be done in a single pass over the program, while simultaneously computing a stackmap for program points that are not junction points. Lightweight bytecode verification is sound and complete with respect to bytecode verification, in the sense that if a program P equipped with a certificate c is accepted by a LBCV, then P is accepted by a BCV, and conversely, if P is accepted by a BCV, then there exists a certificate c (which can be extracted directly from the fixpoint computed by the BCV) such that P equipped with the certificate c is accepted by a LBCV.

Although lightweight bytecode verification is currently limited to verifying basic typing and initialisation properties of downloaded code, there is technological potential for verifying a much richer set of program properties that are useful to further secure the code. Abstraction-carrying code (ACC) [APH05] is a generalization of the bytecode verification approach to PCC which is based throughout on the use of abstract interpretation [CC77] as its enabling technology. The use of abstract interpretation allows ACC to generate certificates which encode complex properties, covering traditional safety issues but also resource-related properties such as resource consumption.
Intuitively, the abstract interpreter implements a logic dedicated to a particular verification, making it possible to develop efficient and dedicated checkers that manipulate condensed certificates. Figure 1 shows the overall PCC architecture and protocol for certificates based on type systems or abstract interpretation.
Fig. 1. PCC architecture and protocol for type-based certificates. Compilers add type information to byte code, which therefore includes a certificate. The code consumer type checks the code before executing it. Only the type checker (byte code verifier) is part of the trusted computing base.
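To make the one-pass checking performed by a lightweight verifier concrete, the following is a minimal Coq sketch of a generic fixpoint checker in the style of LBCV and ACC. It is illustrative only: the names AbsState, leq, transfer, succs and stackmap are hypothetical placeholders, not part of the MOBIUS development.

Require Import List Bool Utf8.

Section LightweightChecker.
  Variable AbsState : Type.
  (* decidable order on abstract states *)
  Variable leq : AbsState → AbsState → bool.
  (* abstract execution of the instruction at a program point *)
  Variable transfer : nat → AbsState → AbsState.
  (* successors of a program point in the control flow graph *)
  Variable succs : nat → list nat.
  (* the candidate fixpoint (stackmap) shipped with the code *)
  Variable stackmap : nat → AbsState.

  (* a program point is accepted if the abstract state it produces is
     below the state claimed by the stackmap for every successor *)
  Definition check_pc (pc : nat) : bool :=
    forallb (fun l => leq (transfer pc (stackmap pc)) (stackmap l))
            (succs pc).

  (* the whole method is checked in a single pass, with no iteration *)
  Definition check (points : list nat) : bool :=
    forallb check_pc points.
End LightweightChecker.

Note that the checker never computes least upper bounds: merging was done once by the (untrusted) producer, and the consumer only verifies the inequations.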
2.2 Logic-Based Proof Carrying Code

The original PCC infrastructure proposed by Necula and Lee, which is described in Figure 2, is built upon several elements:

– A formal logic for specifying and verifying policies. The specification language is used to express requirements on the incoming component, and the logic is used to verify that the component meets the expected requirements. Requirements are typically expressed as pre-conditions, post-conditions or invariants.
– A verification condition generator (VCGen). The VCGen produces, for each component and safety policy, a set of proof obligations whose provability is sufficient to ensure that the component respects the safety policy.
– A proof checker. Certificates provide a formal representation of proofs, and are used to convey to the code consumer efficiently verifiable evidence that the code it receives satisfies its specification. The proof checker verifies that the certificate does indeed establish the proof obligations generated by the VCGen.

First, although it builds upon ideas from program verification, which in its full generality requires interactive proofs, PCC is transparent for end-users and does not require the code consumers to build proofs; rather, it requires code consumers to check proofs, which is fully automatic. Second, logic-based PCC is general and can be used for different policies; in particular, the VCGen and the proof checker are independent of the policy. Besides, the only restriction on the security policy is that it should be expressible in the formal logic, which is often very expressive.

Certifying compilation and proof-transforming compilation. One fundamental issue to be addressed by any practical deployment of logic-based PCC is the generation of certificates. If logic-based certificates are to be used to verify basic safety properties of code, and it is expected that large classes of programs carry a certificate, then it is important that certificates are generated automatically. Certifying compilers [NL98] extend traditional compilers with a mechanism to generate certificates automatically for sufficiently simple safety properties, exploiting the information available about a program during its compilation to produce a certificate that can be checked by the proof checker.
Fig. 2. PCC architecture and protocol. Code producers generate verification conditions by applying the VCGen to the compiled bytecode program. The conditions are then proved by the prover, resulting in a PCC certificate. Like producers, consumers apply the VCGen to bytecode. The certificate must contain sufficient information to allow the proof checker to discharge these conditions. Only after successful checking, the code is executed.
The certifying compiler does not form part of the Trusted Computing Base (TCB); nevertheless, it is an essential ingredient of PCC, since for specific properties it reduces the burden of verification on the code producer side. MOBIUS also develops proof-transforming compilers as a means to generate certificates interactively for more complex properties that cannot be established automatically. The primary objective of proof-transforming compilation is to provide a means to generate a certificate for the bytecode program using interactive program verification environments for source code programs [BGP08, MN07].

2.3 Foundational Proof Carrying Code

The verification infrastructure on the consumer side is the central element in the PCC security architecture, and it is therefore of utmost importance to achieve the highest guarantees that its design and implementation are correct. However, providing a correct implementation of a verification infrastructure for a realistic language is a significant challenge: subtle errors may arise both at the conceptual level (e.g., by formulating unsound proof rules) and at the implementation level (e.g., by omitting some checks). In order to prevent such flaws, which could be exploited by malicious code, Appel [App01] suggests pursuing a foundational approach in which the adherence of the program to the policy is formally justified using a model of the code semantics formalized in a proof assistant. This foundational approach offers several advantages:

1. Small TCB: Foundational Proof Carrying Code (FPCC) considerably reduces the TCB, since the verification condition generator is formally proved sound w.r.t. the operational semantics. In addition, FPCC provides the highest level of guarantee for the correctness of the certificate checking infrastructure.
2. Uniform support for verification methods: since all justifications are ultimately given in terms of the operational semantics, FPCC naturally supports a vast range of security policies (more precisely, policies that can be expressed in terms of the
program semantics) and the use of different verification methods, such as type systems and program logics.

The MOBIUS Proof Carrying Code architecture is heavily inspired by FPCC, but departs significantly from it in the way certificates are generated and checked. FPCC is deductive by nature, i.e., typing rules are proved as lemmas and combined using the rules of logic. In contrast, the MOBIUS infrastructure exploits the computational power of the underlying proof assistant and relies on a tight integration between deduction and computation to ensure scalability of proof checking. In particular, the MOBIUS architecture makes intensive use of computational reflection, whose goal is to replace deduction by computation, and which provides an effective means to carry out computation-intensive proofs of facts that cannot be established by purely deductive means. In Section 5, we illustrate the reflective proof carrying code used in MOBIUS by considering a verification condition generator and an information flow checker that have been implemented in a reflective style, and that have been verified formally. Since we provide proof objects of their semantical correctness, the verifiers are removed from the TCB (as is the case for FPCC), which now only contains the Coq type checker and the semantics of the JVM. Finally, the table below summarizes the TCB of the different PCC approaches presented in this section.

PCC approach        TCB components
Type-based PCC      Lightweight bytecode verifier
Logic-based PCC     VCGen + proof checker
Foundational PCC    Coq type checker + semantics of the JVM
3 Informal Development

The purpose of this section is to provide a brief introduction to the proof assistant Coq [Coq04], and an overview of the approach pursued within MOBIUS. Coq is a general purpose proof assistant based on the Calculus of Inductive Constructions, a dependent type theory that supports higher-order logic. Coq features a specification language that is sufficiently expressive to model programming language semantics, program analysis and verification. Section 4 describes Bicolano, which provides a formalisation of the Java Virtual Machine as a state transition relation exec : program → state → state → Prop. Following the approach of FPCC, we use this semantics to prove the correctness of certificate checkers.

One essential feature of Coq is its support for writing executable specifications, which can be extracted to a functional programming language, or run inside the system itself. In the latter case, Coq warrants the use of computations to prove properties of programs, and allows deduction and computation to be combined, in particular to rely on computation in places where deductive reasoning is too cumbersome. For example, the Coq system can be used in the following manner to prove the correctness of, and execute (in the system itself), type-based certificate checkers, as reported in Section 5.2:
– program a computable function check : program → bool that is a verification procedure for a safety property on programs;
– formalise rigorously the safety predicate Safe : program → Prop with respect to the semantics of the language given by Bicolano;
– prove that check is indeed a correct verification procedure, that is, build a term check_correct of type ∀ p, check p = true → Safe p.

Then, to prove Safe p for a particular program p, if we know that check p reduces to true, we can simply use check_correct p (refl_equal true), where (refl_equal true) is a proof of true = true. Indeed, the conversion rule of Coq allows the type of a term to be changed to a convertible one; thus the term (refl_equal true) is also a proof that check p = true, because check p reduces to true, so true = true is convertible with check p = true.

The above technique, called reflection, is also of interest for reasoning about program correctness, as illustrated in Section 5.3. Consider for example that we have a verification condition generator that computes a sufficient condition for safety:

Parameter vcg : program → form.
Parameter interp : form → Prop.
Parameter vcg_correct : ∀ p, interp (vcg p) → Safe p.
where form is an inductive definition of formulae, and interp is an interpretation of the formulae in the logic of Coq. Then, we can prove program safety for p with the term vcg_correct p c, where c is a proof of interp (vcg p). One can improve the overall efficiency of the approach by combining vcg with an executable simplifier for formulae:

Parameter smp : form → form.
Parameter smp_correct : ∀ f, interp (smp f) → interp f.
In this case, the proof of safety of p will be of the form po_correct p c, where c is a proof of interp (smp (vcg p)) and po_correct is the correctness lemma obtained by composition of smp_correct and vcg_correct.¹

¹ Our approach introduces an inductive representation of formulae (known as a deep embedding in the literature) in the above example. In fact, it is often possible to use a shallow embedding and model proof obligation generation as a Prop-valued function, but our presentation is based on a deep embedding for efficiency reasons (in particular, the use of a simplifier is more natural for a deep embedding).

Proofs by reflection are efficient and yield smaller certificates (of course, the proof term check_correct may be large, but the proof is only done and type-checked once, and is shared by all the instantiations). However, proofs by reflection are not appropriate for PCC scenarios where the consumer infrastructure is subject to resource constraints, as considered in Section 6. Rather than using a complete proof assistant as a checker and a logic as the language for proof certificates, it is possible to design a lightweight form of PCC where the certificate checker is a functional program extracted from Coq. Intuitively, the extraction mechanism in Coq produces Caml programs from Coq terms by eliding those parts of the terms that do not have computational content. Such parts are only necessary to ensure the well-typing of the Coq term (and thereby the
correctness of the corresponding programs), but are not necessary to reduce the term to normal form (to evaluate programs). In particular, we can use the extraction mechanism to obtain a certified Caml implementation of the procedure check above. The correctness of the Coq extraction mechanism ensures that the extracted Caml checker satisfies the same soundness property as its Coq ancestor. While this approach is effective for several program verification techniques (such as those based on static analysis or on type systems), it is not well adapted to verifying programs using proof obligations, since the checker does not return a boolean but a formula which must then be proved. To apply the extraction scenario to proof obligations, the formula simplifier should be sufficiently powerful to reduce valid formulas to the trivial formula true (i.e., the simplifier should be a certified automatic prover).
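As a toy, self-contained instance of the check/check_correct pattern described in this section, the following Coq fragment uses the standard library's Nat.even in the role of check and Nat.Even in the role of Safe; the names are illustrative and unrelated to the actual MOBIUS checkers.

Require Import Arith Utf8.

Definition check : nat → bool := Nat.even.  (* executable verifier *)
Definition Safe : nat → Prop := Nat.Even.   (* semantic property *)

(* the once-and-for-all correctness proof of the checker *)
Lemma check_correct : ∀ n, check n = true → Safe n.
Proof. intros n H. exact (proj1 (Nat.even_spec n) H). Qed.

(* proof by reflection: check 256 reduces to true, so by conversion
   (eq_refl true), written (refl_equal true) in the text above, is
   accepted as a proof of check 256 = true *)
Example safe_256 : Safe 256 := check_correct 256 (eq_refl true).

The certificate here is essentially empty: all the work is done by the conversion rule, which evaluates check 256 during type checking.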
4 Bicolano

The soundness of the whole MOBIUS Proof Carrying Code infrastructure requires a formal specification of the JVM. This specification is formalised in the Coq proof assistant [Coq04] and is called Bicolano. Bicolano is situated at the bottom of the trusted base of the MOBIUS PCC infrastructure. It is a formal description of the Java Virtual Machine (JVM), giving a rigorous mathematical description of Java bytecode program executions. It closely follows the official description of the JVM [LY99] as provided by Sun. Since the correctness of Bicolano is not formally provable, the close connection with the official specification is essential to gain trust in the specification. This requirement results in two important design decisions for Bicolano. First, we formalize a small step semantics that relates consecutive JVM states during program execution, as is done in the official description. Second, we try to keep our description of the JVM at the same level of detail as in the official specification. Nevertheless, some simplifications have been made with respect to the official documentation, some of which are motivated by the fact that we concentrate on the Connected Limited Device Configuration (CLDC), which is the primary Java configuration for mobile devices [Sun03].

Figure 3 presents the global architecture of the development. At the core of Bicolano is the axiomatic base that describes the notion of program, and specifies the semantic domains and machine arithmetic that we use. We use the Coq module system to model these different components. The operational semantics is defined on top of this axiomatic base. We define a small step and a big step semantics, and we prove equivalence between them. Finally, to show that the axiomatisations that we use are consistent, we give concrete instantiations of the different modules, which make it possible to represent particular bytecode programs. The instantiations can also be used to obtain executable verifiers.
Fig. 3. Bicolano architecture: an operational semantics layer (small step semantics in SmallStep.v, big step semantics in BigStep.v, and their equivalence proof in EquivSmallAndBigStep.v) built on top of an axiomatic base (program syntax in Program.v, semantic domains in Domain.v, machine arithmetic in Numeric.v), together with implementations of the axiomatic base (programs with lists in ImplemProgramWithList.v, programs with maps in ImplemProgramWithMap.v, semantic domains in ImplemDomain.v, machine arithmetic in ImplemNumeric.v)

4.1 Axiomatic Base

The starting point of Bicolano is an axiomatisation of program syntax, semantic domains and machine arithmetic. For simplicity, we adopt a post-linking view of programs: in
particular, we only handle complete programs, and hence cannot deal with dynamic linking. This choice simplifies the specification considerably, because a large part of the description in the official Sun specification is related to linking. Furthermore, we omit some of the numerical domains: 64-bit values (double and long) and floating-point numbers are not considered.

Program syntax. The syntax of a JVM program is modelled abstractly, using signatures of the Coq module system to achieve a cleaner separation of concerns in the formalisation of the semantics. The whole set of axioms is put in a module type named PROGRAM. This axiomatisation is based on various abstract data types:

Parameter Program Class Interface Method Field Var MethodSignature FieldSignature PC : Set.
Note that some notions, such as methods and fields, are modelled in two forms: the standard form and the signature form. Such a distinction is necessary because bytecode instructions do not contain direct pointers to methods or fields. Each abstract type has a list of accessors to manipulate them. We group the accessors of a given type in the same sub-module type. Here we give an example of Program accessors:

(** Contents of a Java program *)
Module Type PROG_TYPE.
  (** accessor to a class from its qualified name *)
  Parameter class : Program → ClassName → option Class.
  Parameter name_class_invariant1 : ∀ p cn cl,
    class p cn = Some cl → cn = CLASS.name cl.
  (** accessor to an interface from its qualified name *)
  Parameter interface : Program → InterfaceName → option Interface.
  Parameter name_interface_invariant1 : ∀ p cn cl,
    interface p cn = Some cl → cn = INTERFACE.name cl.
End PROG_TYPE.
Declare Module PROG : PROG_TYPE.
Notice that the Program structure contains several internal invariants, like name_class_invariant1. These are properties that we require to hold for any instantiation of the module, and that can be assumed in the operational semantics. The axiomatisation of programs ends with a list of definitions using the previous notions, such as subtyping and method lookup.

Semantic domains. Semantic domains are axiomatised in a module type named SEMANTIC_DOMAIN. Each sub-domain is specified in a sub-module type. For example,
the set of local variables is specified as follows:

Module Type LOCALVAR.
  Parameter t : Set.
  Parameter get : t → Var → option value.
  Parameter update : t → Var → value → t.
  Parameter get_update_new : ∀ l x v, get (update l x v) x = Some v.
  Parameter get_update_old : ∀ l x y v,
    x <> y → get (update l x v) y = get l y.
End LOCALVAR.
Declare Module LocalVar : LOCALVAR.
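To see that such a signature is realisable, here is a minimal sketch of an instantiation, with Var and value instantiated by nat purely for illustration; Bicolano's own implementations, described in Section 4.3, are richer.

Require Import Arith Bool Utf8.

(* Var and value instantiated by nat purely for illustration *)
Definition Var := nat.
Definition value := nat.

Module LocalVarFun.
  (* local variables as a function from variables to optional values *)
  Definition t := Var → option value.
  Definition get (l : t) (x : Var) : option value := l x.
  Definition update (l : t) (x : Var) (v : value) : t :=
    fun y => if Nat.eqb y x then Some v else l y.

  Lemma get_update_new : ∀ l x v, get (update l x v) x = Some v.
  Proof. intros; unfold get, update; now rewrite Nat.eqb_refl. Qed.

  Lemma get_update_old : ∀ l x y v,
      x <> y → get (update l x v) y = get l y.
  Proof.
    intros l x y v H; unfold get, update.
    destruct (Nat.eqb_spec y x); [congruence | reflexivity].
  Qed.
End LocalVarFun.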
The most complex axiomatisation of this file concerns the heap, for which we provide an axiomatic treatment based on the work of Poetzsch-Heffter and Müller [PHM98].

Module Type HEAP.
  Parameter t : Set.

  Inductive AddressingMode : Set :=
  | StaticField : FieldSignature → AddressingMode
  | DynamicField : Location → FieldSignature → AddressingMode
  | ArrayElement : Location → Int.t → AddressingMode.

  Inductive LocationType : Set :=
  | LocationObject : ClassName → LocationType
  | LocationArray : Int.t → type → LocationType.

  Parameter get : t → AddressingMode → option value.
  Parameter update : t → AddressingMode → value → t.
  Parameter typeof : t → Location → option LocationType.
  Parameter new : t → Program → LocationType → option (Location * t).
  ...
The abstract type of heaps (called t inside the module type HEAP) has two accessors, get and typeof, and two modifiers, update and new. These functions are based on the notions of AddressingMode and LocationType. AddressingMode gives the kind of entry in the heap: a field signature for static fields, a location together with a field signature for field values of objects, and a location together with an integer for the elements of an array. The definition get gives access to the value attached to an indicated address. The definition typeof gives the type associated with a location (if there is any). This type is either a class name for objects, or a length and an element type for arrays. The definition update allows a value at a given address to be modified. Finally, the definition new allows a new object or a new array to be allocated.

4.2 Operational Semantics

Bicolano proposes two different operational semantics for sequential JVM bytecode: a small step and a big step semantics, for which equivalence is proved formally.

Small step semantics. The small step semantics follows exactly the reference semantics given in the official specification. It consists of an elementary relation named step between states of the virtual machine. A standard state is of the form (St h (Fr m pc s l) sf) where h is a heap; (Fr m pc s l) is the current frame, composed of the current method m, the current program point pc, the local variables l and the operand stack s; and finally sf is the call stack. An exceptional state is of the form (StE h (FrE m pc loc l) sf) where all elements are similar to those found in a standard state, except the location of the exception object loc, which replaces the operand stack. Exceptional states occur when an exception is thrown but control has not yet reached the corresponding exception handler. The step relation is defined inductively. We give here a fragment describing the semantics of the getfield instruction.

Inductive step (p:Program) : State.t → State.t → Prop :=
...
| getfield_step_ok : ∀ h m pc pc' s l sf loc f v cn,
    instructionAt m pc = Some (Getfield f) →
    next m pc = Some pc' →
    Heap.typeof h loc = Some (Heap.LocationObject cn) →
    defined_field p cn f →
    Heap.get h (Heap.DynamicField loc f) = Some v →
    step p (St h (Fr m pc (Ref loc::s) l) sf)
           (St h (Fr m pc' (v::s) l) sf)
...
This case reads as follows: if the current instruction (given by instructionAt m pc) is Getfield f, then a normal execution step is possible under several conditions: the next program counter is valid (next m pc = Some pc'), and the current operand stack is non-empty and starts with a reference to an object of class name cn such that the field signature f is defined in the class named cn. Under these conditions, the value v of the field f in the object pointed to by the location loc is pushed on top of the operand stack. We omit here the second case, where the top of the operand stack is null and an exceptional state is created. The definition of this relation closely follows the definition given in the official specification. Together with the axiomatisation presented above, it forms the trusted base of Bicolano.

Big step semantics. Since JVM states contain a frame stack to handle method invocations, for showing the correctness of static analyses and program logics it is often convenient to use an equivalent semantics where method invocations are performed in one big step transition. Therefore we introduce a new semantic relation:

IntraStep (p:Program) : Method →
  IntraNormalState → IntraNormalState + ReturnState → Prop
This relation denotes transitions of a method between two internal states, i.e., JVM states that only contain one frame (instead of a frame stack), or between an internal state and a return state, i.e., a pair of a heap and a final result (a JVM value, or an exception object in case of termination by an uncaught exception). While the small step semantics uses a call stack to store the calling context and retrieve it during a return instruction, the big step semantics directly invokes the full evaluation of the called method from an initial state to a return value and uses it to continue the current computation. We have formally proved the correctness of the big step semantics with respect to the reference small step semantics, by showing that the notion “evaluation of method m in program p from state s terminates with the final value ret” coincides in both semantics.

Semantics of programs with safety annotations. We rely on preliminary exception analyses to reduce the control flow graph of applications. Curbing the explosion of the control flow graph is essential for maintaining a minimum of precision in an information flow analysis, and for reducing the size of verification conditions. This is especially a problem in a language like Java, because a large number of bytecode instructions are defensive, i.e., they perform dynamic verifications (such as testing the nullity of pointers that must be dereferenced, or testing whether indices are within the bounds of the arrays they access) on the current state before executing the instruction. When these verifications fail at execution time, a JVM exception is thrown and the control flow is redirected towards an adequate handler of the current method, if it has one; otherwise the current method is stopped and the exception is recursively handled by the next method in the call stack. If the call stack is empty, the program halts with an uncaught exception. Exceptions substantially complicate the control flow graph. Moreover, in most Java applications, JVM exceptions are not intentionally manipulated by the programmer, and only the normal branches of
these instructions are executed in practice. Part of these unreachable branches can be detected by static analysis. Bicolano uses a notion of safety annotations that makes it possible to execute a static exception checker before any other verification tool, and hence lets such verification tools benefit from these control flow simplifications at a semantic level. Safety annotations are exploited in an instrumented semantics, where extra properties taken from the annotation information are assumed in the premises of the transition rules. Annotations take the form of flags safe attached to program points where the pre-analyser predicts that no exception can be thrown. Exceptions hence only happen at program points which are not annotated as safe. We also equip each method with an over-approximation of the set of exceptions that can be raised from it without being caught (building such an over-approximation is feasible since we consider complete programs). Assuming the annotations are correct, it is straightforward to prove that each judgment of the big step semantics implies the corresponding judgment of the instrumented big step semantics. This pre-verification technique fits well in a PCC approach, as explained in Figure 4. Annotations are first generated by an exception analyser and then transmitted with the program. Using lightweight verification techniques, they are then checked on the consumer side by fixpoint verification. With this scheme, any verification tool can then safely reason on the annotated semantics of the program.
Fig. 4. Hybrid certificate with safety annotations
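One illustrative way to represent such safety annotations in Coq is sketched below; Method, PC and ClassName stand in for Bicolano's abstract types, and the record layout is hypothetical, not the actual Bicolano definition.

Require Import List Utf8.

(* Method, PC and ClassName stand in for Bicolano's abstract types *)
Parameter Method PC ClassName : Set.

Record safety_annots : Type := {
  (* program points flagged safe: no JVM exception thrown there *)
  safe_at : Method → PC → bool;
  (* over-approximation of the exceptions escaping each method *)
  may_escape : Method → list ClassName
}.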
4.3 Implementations of Module Interfaces

Bicolano also provides implementations of the different modules used in the formalization of the JVM. These implementations serve two purposes: they guarantee that the axiomatizations are consistent, and they can also be used to get executable verifiers (in which case efficiency is important). We currently propose two different implementations of these modules; one is based on lists and is more readable, whereas the second is based on maps and is more efficient.
5 Reflective Proof Carrying Code for Wholesale Checking

The purpose of this section is to present the first MOBIUS scenario, and to elaborate on the presentation of Section 3.

5.1 Scenario and Requirements

In wholesale Proof Carrying Code, mobile code transits through a trusted intermediary, e.g., a mobile phone operator. As explained in Figure 5, PCC is used by code producers (who are external to the phone operators and untrusted by them) to supply proofs which establish that the application is secure. The operator then digitally signs the code before distributing it to the code consumers (the customers, who rely on the operator). This scenario for “wholesale” verification by a code distributor effectively combines the best of both PCC and trust, and brings important benefits to all participating actors. For the end user in particular, the scenario does not add PCC infrastructure complexity to the device, but still allows effective enforcement of advanced security policies. From the point of view of phone operators, the proposed scenario enables achieving the required level of confidence in applications developed by third parties, since they can reproduce the program verification process performed by producers, but completely automatically and thus at low cost. Furthermore, Foundational Proof Carrying Code is an appropriate approach, since it brings the highest guarantees and there are no stringent restrictions on resource (memory, CPU, bandwidth) consumption. From the software producer perspective, the scenario removes the bottleneck of the manual approval/rejection of code by the operator. This results in a significant increase in market opportunity. Of course, this comes at a cost: producers have to verify their code and generate a certificate before shipping it to the operator, in return for access to a market with a large potential which has remained rather closed to independent software companies.

5.2 Information Flow Lightweight Type Checker

As a first instance of the reflective Proof Carrying Code approach developed in MOBIUS, we present a lightweight type checker [BPR07] that enforces confidentiality of JVM applications with the same modularity principles as Java bytecode verification: provided class files are suitably extended with appropriate security annotations, a program can be checked method by method, without fixpoint iterations. More concretely, we suppose that the levels of confidentiality of data are abstracted into a lattice of security levels (a level k1 is lower than a level k2, written k1 ≤ k2, if k1 is less confidential than k2), and that programs are equipped with security annotations (whose type is named secure_annots in the rest of the section), indicating the confidentiality level of each method's local variables, of its return value (if any), and of class fields. Each method is also equipped with a security environment se that tracks potential indirect flows (when the execution of a given instruction indirectly depends on a branching instruction involving confidential information) and assigns a security level to each program point. In order to abstract the operand stacks, the type system computes for each program point a stack of security levels. This information is reconstructed thanks to stackmaps, in the same spirit as those used in bytecode verification.
Fig. 5. The MOBIUS scenario
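Concretely, the lattice of security levels can be as simple as the classical two-point lattice below. This Coq sketch is illustrative only, since the MOBIUS checker is parametrized by an arbitrary lattice.

Inductive Level : Set := Low | High.

(* decidable order: Low ≤ Low, Low ≤ High, High ≤ High *)
Definition leq (k1 k2 : Level) : bool :=
  match k1, k2 with
  | Low, _ => true
  | High, High => true
  | High, Low => false
  end.

(* least upper bound *)
Definition lub (k1 k2 : Level) : Level :=
  match k1, k2 with
  | Low, Low => Low
  | _, _ => High
  end.

(* with kobs = Low, the attacker observes only Low-labelled data *)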
The semantic property enforced by the type checker is a non-interference property, which ensures that there is no flow of information from secret to public data [SM03]. It depends on a security level kobs that determines the observational capabilities of the attacker. Essentially, the attacker can observe fields, local variables, and return values whose level is below kobs. This is modeled with notions of equivalence ∼in (for program inputs) and ∼out (for program outputs) between Bicolano memory states. These relations depend on the security annotations of a program. A program is non-interferent if, for each method m, two terminating runs of m with ∼in-equivalent inputs, i.e., inputs that cannot be distinguished by an attacker, yield ∼out-equivalent results, i.e., results that cannot be distinguished by the attacker. Formally, the security condition is expressed relative to the Bicolano big step semantics, which is captured by judgments of the form hin, a ⇓m r, hout, meaning that executing the method m with initial heap hin and parameters a yields the final heap hout and the result r. The semantic notion of non-interference is then expressed in Coq with a predicate non_interferent:program→ secure_annots→ Prop that models the following property on a program p:

    ∀ m, hin, a, r, hout, h'in, a', r', h'out:
      hin, a ⇓m r, hout  ∧  h'in, a' ⇓m r', h'out  ∧  (hin, a) ∼in (h'in, a')
        ⇒  (hout, r) ∼out (h'out, r')

The type checker operates on annotated programs, and is parametrized by control dependence regions (CDRs) that approximate the scope of branching instructions, i.e., the set of instructions that execute under the scope of a given branching instruction. CDRs are required to detect indirect flows and to enforce global constraints that prevent assignments of public information from occurring under guards that depend on confidential data. Formally, a point j is in a control dependence region of a branching point i if j is reachable from i (in the current method) and there may be an execution path from i to a return point which does not contain j. Although CDRs can be computed using post-dominators, we follow a lightweight verification approach: we require the CDRs to be transmitted with the program and develop a CDR checker.
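As a toy illustration of the ∼in relation over the two-point lattice sketched above, two input stores are indistinguishable at kobs = Low when they agree on all Low-labelled variables; the types here are deliberate simplifications of Bicolano states.

Require Import Utf8.

(* toy stand-ins for Bicolano states and the security annotations *)
Definition vmap := nat → nat.       (* variable → value *)
Definition labels := nat → Level.   (* variable → security level *)

(* two input stores agree on every variable labelled Low *)
Definition equiv_in (lab : labels) (s1 s2 : vmap) : Prop :=
  ∀ x, lab x = Low → s1 x = s2 x.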
Both the type system and the CDR analyzer operate on annotated programs. The annotations are used to minimize the size of regions, and are essential to guarantee some precision in the analysis: indeed, the ability of an information flow type system to accept a program directly depends on the number of branching instructions in this program. Each access to a confidential object may reveal its nullity to an attacker if NullPointer exceptions are not explicitly caught in the program. In such cases the type system has to be restrictive on the operations that may modify a public part of the memory after this access. This restriction is not necessary if the nullity case is provably unreachable. The typing rules are designed to prevent information leakage by imposing appropriate constraints. In a nutshell, typing rules are of the form below (we omit here the special case of return instructions):

    P[i] = ins    constraints
    ─────────────────────────
       se, i ⊢ st ⇒ st'

where st, st' are the stacks of security levels before and after the execution of the instruction ins found at point i in program P, and se is the security environment of the current method. For the special case of the getfield instruction, the rule looks like this:

    P[i] = getfield f    ¬Safe(i) ⇒ ∀ j ∈ region(i), k ≤ se(j)
    st' = (k ⊔ ft(f) ⊔ se(i)) :: st            if Safe(i)
    st' = lift_k ((k ⊔ ft(f) ⊔ se(i)) :: st)   otherwise
    ──────────────────────────────────────────────────────────
       se, i ⊢ k :: st ⇒ st'

The instruction getfield f at program point i is typable if the current stack type is of the form k :: st, with k the security level of the object which is read. The constraint ∀ j ∈ region(i), k ≤ se(j) imposes that all points j in the control dependence region of i have a security environment se(j) whose confidentiality is equal to or greater than the level k, because reaching one of the program points in region(i) may depend on the nullity of the read object. Safe(i) represents here the safety annotation that predicts the absence of NullPointer exceptions if the predicate holds, and is inconclusive otherwise. The previous constraint is only imposed if Safe(i) does not hold; otherwise the current point i is not a branching point. The next stack type also depends on Safe(i). If the predicate holds, we pop the top of the stack and push a security level at least as high as the level ft(f) of the field f and the level k of the location (to prevent explicit flows), and at least as high as se(i) (to prevent implicit flows). If the predicate does not hold, we use the same stack type but lift it. Here lift_k is the point-wise extension to stack types of λl. k ⊔ l. This lifting operation is necessary to avoid implicit flows through operand stack leakage. Finally, the checker we formalize in Coq is of type

iflow_check : program → secure_annots → iflow_annots → bool
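The stack types and the lift_k operation used in the getfield rule can be sketched over the illustrative two-point lattice introduced above (again, a sketch rather than the MOBIUS formalization):

Require Import List.

Definition stacktype := list Level.

(* lift_k raises every level on the stack type to at least k, modelling
   the lifting performed when a branch may leak through the operand stack *)
Definition lift (k : Level) (st : stacktype) : stacktype :=
  map (lub k) st.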
As indicated above, the annotations are composed of safety annotations, control dependence regions, security environments, and stackmaps of security levels. The checker successively checks these different annotations, and finally ensures that the program is typable according to the information flow typing rules and the security annotations that express its security policy. The soundness theorem establishes the soundness of the checker with respect to the non-interference policy.
iflow_check_correct : ∀ p policy annot, iflow_check p policy annot = true → non_interferent p policy
This checker is fully executable, and can be run inside Coq to obtain a foundational proof of the confidentiality of a program, as described in Section 3. Alternatively, it can be extracted into a stand-alone Caml type checker and executed on-device, as described in Section 6.

5.3 Verification Condition Generator

As another instance of reflective Proof Carrying Code, we present a verification condition generator (VCgen) that can be used to ensure functional and non-functional properties of programs. The VCgen operates on annotated programs, and returns a set of proof obligations whose validity ensures that the program meets its specification. More precisely, program annotations are contained in a method specification table, which assigns to every method a pre- and a post-condition, and a local specification table, which attaches to some program points an assertion (in general a loop invariant) that should be valid each time program execution reaches them; the annotations are logical formulae that may refer to the current and initial states (heap and operand stack) and, in the case of the postcondition, to the termination mode and the result. The semantic property ensured by the VCgen is expressed using the Coq predicate correct:program→ funct_annots→ Prop modeling the following property on an annotated program p:

    ∀ m, hin, a, r, hout:  hin, a ⇓m r, hout ∧ pre_m (hin, a) ⇒ post_m ((hin, a), (r, hout))
The validity of the proof obligations ensures that, upon termination of the execution of a method m, the post-condition post_m of m will hold, provided the method was called in an initial state satisfying the pre-condition pre_m. There are two kinds of proof obligations: those that ensure the coherence of the annotations at a global level, e.g., that the specification respects behavioral subtyping [LW94], and those that ensure the coherence of the annotations for each method. The latter are computed using a weakest precondition calculus, and ensure that the annotation is valid each time program execution reaches an annotated program point i (a local annotation is attached to i, or i is an exit point). In order to be effective, the weakest precondition calculus assumes that the methods are sufficiently annotated. Under such a condition, the weakest precondition wp_i at program point k is defined using the general scheme:

    wp_i(k)(s0, s) = ⋀_{l ∈ succ(k)} ( C_(k,l)(s) ⇒ P_(k,l)(wp_l(l), s0, s) )
where s0 is the initial state (initial heap and arguments), C_(k,l)(s) is the condition that needs to be satisfied by s in order for the execution to go from k to l in one step, and P_(k,l)(wp_l(l), s0, s) is a predicate transformer that updates s in correspondence with the instruction at k and applies it to the weakest precondition of the successor l (wp_l(l)).
Computing P_(k,l)(wp_l(l), s0, s). The function wp_l(k) simply searches the local annotation table: if k is annotated, the precondition is the annotation; otherwise it is wp_i(k), so the two functions are defined by mutual recursion. For the getfield instruction the rule is:

    P[k] = getfield f
    φ = wp_l(l)(s0, (h', v' :: [], ρ))   if Handler(k, NullPointer) = l
    φ = m.post(s0, h', v')               otherwise
    ──────────────────────────────────────────────────────────────────
    wp_i(k)(s0, (h, v :: os, ρ)) =
          v ≠ null ⇒ wp_l(k+1)(s0, (h(v.f) :: os, ρ))
        ∧ v = null ⇒ ∀ h' v', New(h, NullPointer) = (h', v') ⇒ φ
In the rule above, (h, v :: os, ρ) represents the current state: h stands for the heap, v :: os is the operand stack, and ρ is the mapping from local variables to their values. For a getfield instruction, the stack must not be empty. If the top value v is not null, the instruction executes normally, pushing the value of field f onto the top of the stack. If the top value is null, a runtime exception object v′ is created, leading to a new heap h′ (New(h, NullPointer) = (h′, v′)). Then two cases can arise. The exception may be caught (Handler(k, NullPointer) = l), and execution continues at program counter l, in which case the postcondition of k is the precondition of l. Otherwise, the exception is uncaught and the method terminates abruptly, in which case the postcondition of k is the exceptional postcondition of the method, m.post, applied to the initial state, the final heap h′ and the raised exception v′. Since the exception handler is statically known, the precondition of the getfield is a conjunction of two postconditions.

Computing C_(k,l)(s). The function C_(k,l) uses safety information to eliminate impossible branches: if the program point k is annotated as safe and Safe(k) is incompatible with the condition C_(k,l), the new condition is simply False, and the simplifier can then remove the trivially true proposition False ⇒ P_(k,l)(wp_l(l), s0, s) from the conjunction corresponding to wp_i(k)(s0, s). Here the VCgen uses information provided by static analyses. It is also possible [GS07] to transfer part of the assertions contained in the specification to the analysis, so that the analysis can produce more accurate results, which the VCgen can in turn use to simplify the proof obligations, leading to a kind of cross-fertilization between the static analyses and the VCgen.

Correctness. Finally, the VCgen we implement in Coq is of type vcgen : program → funct_annots → Prop
(strictly speaking, it takes as an additional argument a proof that the program is well-annotated). Its correctness lemma is:

vcgen_correct : ∀ p annot, vcgen p annot → correct p annot
The verification condition generator is executable, and can be run inside Coq, as described in Section 3.
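For the executable checkers, extraction to stand-alone Caml code (used in the on-device scenario of Section 6) amounts to a short Coq session. The following sketch is illustrative: the identifier iflow_check is assumed to be the boolean top-level checker, and the target file name is arbitrary (the Prop-valued vcgen is not itself extracted, since propositions are erased by extraction):

    (* Hypothetical extraction session producing a stand-alone checker. *)
    Require Extraction.
    Extraction Language OCaml.
    Extraction "checker.ml" iflow_check.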
6 Certified Static Analysis as Lightweight Proof Carrying Code

The purpose of this section is to present the first MOBIUS scenario, and to elaborate on the presentation of Section 3.

6.1 Scenario and Requirements

Rather than using a complete proof assistant as a checker and a logic as the language for proof certificates, it is possible to rely on a lightweight form of PCC, inspired by ACC, where the code consumer checks the certificate itself. Compared to the "wholesale" scenario, this scenario does not require a PKI infrastructure and a trusted third party, as the checking phase can be done directly on the end-user device. This form of retail PCC is beneficial to the trusted intermediary, for which the cost of digital signatures can be prohibitive (it is not unusual to have hundreds of versions of a single application, because of the fragmentation of execution platforms, networks and internationalization), and to the code consumers, who may be offered an opportunity to customize the policies they want to have enforced. These embedded verifiers are part of the trusted computing base, and should therefore be validated with respect to Bicolano: implementing a verifier is simpler than implementing an analyser, but it is still error-prone. The formal framework for this validation is certified abstract interpretation [BJP06b], which proposes a unified framework to specify both an analysis to infer program invariants and a checker to verify them. The correctness of the checker is proved formally in Coq. This results in a proof-carrying code architecture (outlined in Figure 6) where both correctness proofs and actual program certificates arise from abstract interpretation. In this architecture, a verified fixpoint checker is combined with an untrusted fixpoint engine that the code producer uses to produce the stackmaps. Stackmaps are further reduced by a (still untrusted) fixpoint compression technique, yielding the final stackmap for the program. On the code consumer side, the verification proceeds in two stages. The first stage is off-device and only takes place when the consumer must update his on-device verifier. The consumer first receives a proposed fixpoint checker together with a proof of its semantic correctness. He combines this proof with a formalisation of the language semantics and the targeted security policy (here given by Bicolano) and then checks the proof off-device with the type checker of the Coq kernel. Once the proof has been checked, the code consumer extracts the fixpoint verifier using Coq's program extraction mechanism and installs it on the device. This extracted verifier can then be used to verify (on-device) the stackmaps accompanying a piece of downloaded software.

6.2 Embedding the Verifiers: Design Choices

There are several technical choices to make when porting a checker to a mobile handset:
– at least three underlying OS could be used: Windows Mobile, Symbian or Linux;
– the extraction from Coq could target Scheme, Objective Caml, Haskell or F# (for Windows mobile phones).
[Figure 6 shows the architecture: on the producer side, the program is fed to an untrusted fixpoint solver and an untrusted fixpoint compressor, yielding the certificate; on the consumer side, a certified verifier is checked against the semantics and security policy with the Coq kernel and then extracted with Coq extraction; on the device, the extracted verifier checks the certificate and accepts the program as safe.]

Fig. 6. A PCC architecture based on certified abstract interpretation
Because Bicolano makes heavy use of functors, it seems natural to use Objective Caml as the target language, since Objective Caml also supports this feature natively. To ease the porting of the Objective Caml back-end, we have selected two Linux-based devices: a Nokia 770 tablet, an open Linux-based "web appliance" with an easy-to-use development environment, and a Motorola A780 phone that runs a closed version of Linux (Montavista). Both are based on an ARM application processor. The resource constraints of the mobile handset (the operating system only leaves 1.5 MB of RAM to share among all the applications) are very typical of medium-end smartphones. One main technical difficulty is adapting the Objective Caml compiler to the lack of hardware floating-point support on the limited processors of those devices. We have chosen to disable the support for floating point completely, as the code extracted from Coq does not need it.

6.3 Embedding the Verifiers: Case Study

In order to illustrate the feasibility of verifying the results of an analysis on-device, we have considered the interval analysis from [BJPT07]. The analysis operates on a restricted instruction set, which is nevertheless sufficiently expressive to program a simple search algorithm, shown below with its annotations, consisting of linear relations between variables. The annotations provide pre- and post-conditions as well as invariants between local variables and parameters. With these invariants it is possible to prove, e.g., that all accesses to buffers (arrays) happen within their bounds and do not overflow into the surrounding memory.

// PRE: True
static int bsearch(int key, int[] vec) {
  // (I1) |vec0| = |vec| ∧ 0 ≤ |vec0|
The invariants given here are already compressed using an automatic pruning technique. As an example, the original invariant (I5) was:

// (I5) key0 = key ∧ |vec0| = |vec| ∧ −2 + 3·low ≤ 2·high + mid
//      ∧ −1 + 2·low ≤ high + 2·mid ∧ −1 + low ≤ mid ≤ 1 + high
//      ∧ high ≤ low + mid ∧ 1 + high ≤ 2·low + mid
//      ∧ 1 + low + mid ≤ |vec0| + high ∧ 2 ≤ |vec0|
//      ∧ 2 + high + mid ≤ |vec0| + low
As invariant reconstruction can be expensive, the optimal amount of information to supply to the checker is (I2) and (I5), so that checking is limited to verifying the inclusion of polyhedra. Notice that (I2) would have been sufficient if the checker could compute a convex hull of two polyhedra. As this is a rather expensive operation, the MOBIUS project has developed a technique for producing extremely compact and easily verifiable certificates of polyhedra inclusion, which can often replace a convex hull computation (see [BJPT07] for details).

6.4 Case Study: Benchmarks

Time measurements in Figure 7 are given for the interval analysis [BJP06b] on the Motorola handset. The certified function Coq.BytecodeChecker.checker is iterated 100 times to avoid problems with the resolution of the timer. Parsing time is negligible. These figures show that the phone is roughly ten times slower than the machine used in [BJP06b]. Time measurements in Figure 8 are given for ten iterations of the polyhedral checker of [BJPT07].

program | Bubble Sort | Convolution Product | Floyd Wharshall | HeapSort | Polynom Product | QuickSort
Time    | 3.92        | 2.94                | 8.49            | 49.48    | 5.77            | 56.91

Fig. 7. Benchmarks of the interval-based array bound checker
program     | BSearch | FFT  | HeapSort | Jacobi | LU   | QuickSort | Random | SparseCompRow
Time        | 0.30    | 5.65 | 0.97     | 0.23   | 3.02 | 34.86     | 1.70   | 0.14
Checks made | 4       | 29   | 12       | 5      | 30   | 12        | 24     | 2
% Checked   | 100     | 78   | 100      | 50     | 44   | 100       | 82     | 33

Fig. 8. Benchmarks of the polyhedral array bound checker

6.5 Research Agenda

We intend to improve these preliminary experiments along three directions:
– property coverage: many of the properties that are relevant for security can be checked by static analysis. We intend to study the feasibility of embedding verifiers for a resource analysis that counts the number of calls to a given method, and also for a points-to analysis. The latter is used by several other analyses (e.g. to resolve virtual method dispatch statically) and can be used directly to compute an approximation of the arguments of all calls to dangerous methods (see [CA05] or [LL05]);
– language coverage: the language recognised by the checker must be extended to accommodate a more complete fragment of the CLDC/MIDP specification. This involves extending the instruction set to a large subset of CLDC bytecodes and refining the semantics of Bicolano to account for at least the intricate details of method dispatch, object initialisation and exception handling. Furthermore, the MIDP libraries with their links to the native platform must be taken into account. Such an extension opens several software engineering challenges in terms of modularity and maintainability;
– constraints: the code of the checker must fit on constrained devices such as mobile phones. Benchmarks have shown that the CPU load is reasonable on very small examples, but we must reduce the size of the representation of the bytecode handled by the checker before we extend our analysis to the complete bytecode instruction set. A modified version of the Bicolano representation of the JVM bytecode is under development. Other ideas, such as mapping Coq integer types to native Objective Caml integers during extraction of the checker (as used in [Chl0x]), should also be considered; a sketch of such an extraction mapping is given below.
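A minimal sketch of this last idea, assuming the checker's top-level function is named checker (a placeholder): the standard library module ExtrOcamlZInt instructs extraction to map Coq's Z onto OCaml's native int, which is faster on the handset but forfeits the formal protection against arithmetic overflow.

    (* Hypothetical extraction with native integers; "checker" stands in
       for the actual top-level function of the development. *)
    Require Import ZArith Extraction.
    Require Import ExtrOcamlZInt.  (* maps Z and positive to OCaml int *)
    Extraction "checker_native.ml" checker.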
7 Conclusion

This article presents two PCC scenarios explored in MOBIUS, and their associated infrastructures for checking certificates. Both approaches rely on proof assistants to ensure the trustworthiness of certificate checkers: we advocate the use of reflective PCC for wholesale scenarios, and the use of certified certificate checkers for retail scenarios. In parallel to developing certified certificate checkers, the MOBIUS project has been actively investigating certificate generation; relevant material is available from the project web page: http://mobius.inria.fr

Acknowledgments. This work is supported by the Integrated Project MOBIUS, within the Global Computing II initiative.
References

[APH05] Albert, E., Puebla, G., Hermenegildo, M.V.: Abstraction-carrying code. In: Baader, F., Voronkov, A. (eds.) LPAR 2004. LNCS, vol. 3452, pp. 380–397. Springer, Heidelberg (2005)
[App01] Appel, A.W.: Foundational proof-carrying code. In: Halpern, J. (ed.) Logic in Computer Science, p. 247. IEEE Press, Los Alamitos (2001)
[BBC+06] Barthe, G., Beringer, L., Crégut, P., Grégoire, B., Hofmann, M., Müller, P., Poll, E., Puebla, G., Stark, I., Vétillard, E.: MOBIUS: Mobility, ubiquity, security. In: Montanari, U., Sannella, D., Bruni, R. (eds.) TGC 2006. LNCS, vol. 4661, pp. 10–29. Springer, Heidelberg (2007)
[BGP08] Barthe, G., Grégoire, B., Pavlova, M.: Preservation of proof obligations from Java to the Java Virtual Machine. In: Armando, A., Baumgartner, P., Dowek, G. (eds.) IJCAR 2008. LNCS, vol. 5195, pp. 83–99. Springer, Heidelberg (2008)
[BJP06a] Besson, F., Jensen, T., Pichardie, D.: A PCC architecture based on certified abstract interpretation. In: Emerging Applications of Abstract Interpretation. Elsevier, Amsterdam (2006)
[BJP06b] Besson, F., Jensen, T., Pichardie, D.: Proof-carrying code from certified abstract interpretation and fixpoint compression. Theoretical Computer Science 364(3), 273–291 (2006); extended version of [BJP06a]
[BJPT07] Besson, F., Jensen, T., Pichardie, D., Turpin, T.: Result certification for relational program analysis. Research Report 6333, IRISA (September 2007)
[BPR07] Barthe, G., Pichardie, D., Rezk, T.: A certified lightweight non-interference Java bytecode verifier. In: De Nicola, R. (ed.) ESOP 2007. LNCS, vol. 4421, pp. 125–140. Springer, Heidelberg (2007)
[CA05] Crégut, P., Alvarado, C.: Improving the security of downloadable Java applications with static analysis. In: Bytecode Semantics, Verification, Analysis and Transformation. Electronic Notes in Theoretical Computer Science, vol. 141. Elsevier, Amsterdam (2005)
[CC77] Cousot, P., Cousot, R.: Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. In: Principles of Programming Languages, pp. 238–252 (1977)
[Chl0x] Chlipala, A.: Modular development of certified program verifiers with a proof assistant. Journal of Functional Programming (to appear)
[Coq04] Coq development team: The Coq proof assistant reference manual V8.0. Technical Report 255, INRIA, France (March 2004), http://coq.inria.fr/doc/main.html
[Gow04] Gowdiak, A.: Java 2 Micro Edition (J2ME) security vulnerabilities. In: Hack In The Box Conference, Kuala Lumpur, Malaysia (2004)
[GS07] Grégoire, B., Sacchini, J.: Combining a verification condition generator for a bytecode language with static analyses. In: Barthe, G., Fournet, C. (eds.) TGC 2007. LNCS, vol. 4912, pp. 23–40. Springer, Heidelberg (2008)
[Kil73] Kildall, G.A.: A unified approach to global program optimization. In: Principles of Programming Languages, pp. 194–206. ACM Press, New York (1973)
[Ler03] Leroy, X.: Java bytecode verification: algorithms and formalizations. Journal of Automated Reasoning 30(3–4), 235–269 (2003)
[LL05] Livshits, V., Lam, M.: Finding security vulnerabilities in Java applications with static analysis. In: USENIX Security Symposium (2005)
[LW94] Liskov, B., Wing, J.M.: A behavioral notion of subtyping. ACM Transactions on Programming Languages and Systems 16(6) (1994)
[LY99] Lindholm, T., Yellin, F.: The Java Virtual Machine Specification, 2nd edn. Sun Microsystems, Inc. (1999), http://java.sun.com/docs/books/vmspec/
[MN07] Müller, P., Nordio, M.: Proof-transforming compilation of programs with abrupt termination. In: SAVCBS 2007: Proceedings of the 2007 Conference on Specification and Verification of Component-Based Systems, pp. 39–46. ACM, New York (2007)
[Nec97] Necula, G.C.: Proof-carrying code. In: Principles of Programming Languages, pp. 106–119. ACM Press, New York (1997)
[NL98] Necula, G.C., Lee, P.: The design and implementation of a certifying compiler. In: Programming Language Design and Implementation, pp. 333–344. ACM Press, New York (1998)
[PHM98] Poetzsch-Heffter, A., Müller, P.: Logical foundations for typed object-oriented languages. In: Gries, D., De Roever, W. (eds.) Programming Concepts and Methods (PROCOMET), pp. 404–423 (1998)
[Ros03] Rose, E.: Lightweight bytecode verification. Journal of Automated Reasoning 31(3–4), 303–334 (2003)
[SM03] Sabelfeld, A., Myers, A.: Language-based information-flow security. IEEE Journal on Selected Areas in Communications 21, 5–19 (2003)
[Sun03] Sun Microsystems, Inc.: Connected Limited Device Configuration, Specification Version 1.1. Java 2 Platform, Micro Edition (J2ME) (March 2003)
Certification Using the Mobius Base Logic

Lennart Beringer¹, Martin Hofmann¹, and Mariela Pavlova²

¹ Institut für Informatik, Universität München, Oettingenstrasse 67, 80538 München, Germany
{beringer,mhofmann}@tcs.ifi.lmu.de
² Trusted Labs, Sophia-Antipolis, France
[email protected]
Abstract. This paper describes a core component of Mobius’ Trusted Code Base, the Mobius base logic. This program logic facilitates the transmission of certificates that are generated using logic- and type-based techniques and is formally justified w.r.t. the Bicolano operational model of the JVM. The paper motivates major design decisions, presents core proof rules, describes an extension for verifying intensional code properties, and considers applications concerning security policies for resource consumption and resource access.
1 Introduction: Role of the Logic in Mobius
The goal of the Mobius project consists of the development of proof-carrying code (PCC) technology for the certification of resource-related and information-security-related program properties [16]. According to the PCC paradigm, code consumers are invited to specify conditions ("policies") which they require transmitted code to satisfy before they are willing to execute such code. Providers of programs then complement their code with formal evidence demonstrating that the program adheres to such policies. Finally, the recipient validates that the obtained evidence ("certificate") indeed applies to the transmitted program and is appropriate for the policy in question before executing the code. One of the cornerstones of a PCC architecture is the trusted computing base (TCB), i.e. the collection of notions and tools in whose correctness the recipient implicitly trusts. Typically, the TCB consists of a formal model of program execution, plus parsing and transformation programs that translate policies and certificates into statements over these program executions. The Mobius architecture applies a variant of the foundational PCC approach [2] where large extents of the TCB are represented in a theorem prover, for the following reasons.
– Formalising a (e.g. operational) semantics of transmitted programs in a theorem prover provides a precise definition of the model of program execution, making explicit the underlying assumptions regarding arithmetic and logic.
– The meaning of policies may be made precise by giving formal interpretations in terms of the operational model.
– Theorem provers offer various means to define formal notions of certificates, ranging from proof scripts formulated in the user interface language (including tactics) of the theorem prover to terms in the prover's internal representation language for proofs (e.g. lambda-terms).
In particular, the third item allows one to employ a variety of certificate notions in a uniform framework, and to explore their suitability for different certificate generation techniques or families of policies. In contrast to earlier PCC systems, which targeted mostly type- and memory-safety [2,27], policies and specifications in Mobius are more expressive, ranging from (upper) bounds on resource consumption, via access regulations for external resources and security specifications limiting the flow of information, to lightweight functional specifications [16]. Thus, the Mobius TCB is required to support program analysis frameworks such as type systems and abstract interpretation, but also logical reasoning techniques.
Fig. 1. Core components of the MOBIUS TCB
Figure 1 depicts the components of the Mobius TCB and their relations. The base of the TCB is formed by a formalised operational model of the Java Virtual Machine, Bicolano [30], which will be briefly described in the next section. Its purpose is to define the meaning of JVML programs unambiguously and to serve as the foundation on which the PCC framework is built. In order to abstract from inessential details, a program logic is defined on top of Bicolano. This provides support for commonly used verification patterns such as the verification
of loops. Motivated by verification idioms used in higher-level formalisms such as type systems, the JML specification language, and verification condition generators, the logic complements partial-correctness style specifications by two further assertion forms: local annotations are attached to individual program points and are guaranteed to hold whenever the annotated program point is visited during a program execution. Strong invariants assert that a particular property will continue to hold for all future states during the execution of a method, including states inside inner method invocations. The precise interpretation of these assertion forms, and a selection of proof rules, will be described in Section 3. We also present an extension of the program logic that supports reasoning about the effects of computations. The extended logic arises uniformly from a corresponding generic extension of the operational semantics. Using different instantiations of this framework one may obtain domain-specific logics for reasoning about access to external resources, trace properties, or the consumption of resources. Policies for such domains are difficult if not impossible to express purely in terms of relations between initial and final states. The extension is horizontal in the sense of Czarnik and Schubert [20] as it is conservative over the non-extended ("base") architecture. The glue between the components is provided by the theorem prover Coq, i.e. many of the soundness proofs have been formalised. The encoding of the program logics follows the approach advocated by Kleymann and Nipkow [25,29] by employing a shallow embedding of formulae. Assertions may thus be arbitrary Coq-definable predicates over states. Although the logic admits the encoding of a variety of program analyses and specification constructs, it should be noted that the architecture does not mandate that all analyses be justified with respect to this logic. Indeed, some type systems for information flow, for example, are most naturally expressed directly in terms of the operational semantics, as already the definition of information flow security is a statement over two program executions. In neither case do we need to construct proofs for concrete programs by hand, which would be a daunting task in all but the simplest examples. Such proofs are always obtained from a successful run of a type system or program analysis by an automatic translation into the Mobius infrastructure. Examples of this method are given in Sections 4 and 5.2.

Outline. We give a high-level summary of the operational model Bicolano [30], restricted to a subset of instructions relevant for the present paper, in Section 2. In Section 3 we present the program logic. Section 4 contains an example of a type-based verification and shows how a bytecode-level type system guaranteeing a constant upper bound on the number of heap allocations may be encoded in the logic. The extended program logic is outlined in Section 5, together with an application concerning a type system for numeric correspondence assertions [34]. We first discuss some related work.
1.1 Related Work
The basic design decisions for the base logic were presented in [8], and the reader is referred to loc.cit. for a more in-depth motivation of the chosen format of
assertions and rules. In that paper, we also presented a type system for constant heap space consumption for a functional intermediate language, such that typing derivations could be translated into program logic derivations over an appropriately restricted judgement form. In contrast, the type system given in the present paper works directly on bytecode and hence eliminates the language translation from the formalised TCB. The first proposal for a program logic for bytecode we are aware of is the one by Quigley [31]. In order to justify a rule for while loops, Quigley introduces various auxiliary notions for relating initial states to intermediate states of an execution sequence, and for relating states that behave similarly but apply to different classes. Bannwart and Müller [4] present a logic where assertions apply at intermediate states and are interpreted as preconditions of the assertions decorating the successor instructions. However, the occurrence of these local specifications in positive and negative positions in this interpretation precludes the possibility of introducing a rule of consequence. Indeed, our proposed rule format arose originally from an attempt to extend Bannwart and Müller's logic with a rule of consequence and machinery for allowing assertions to mention initial states. Strong invariants were introduced by the Key project [6] for reasoning about transactional safety of Java Card applications using dynamic logics [7]. Regarding formal encodings of type systems into program logics, Hähnle et al. [23], and Beringer and Hofmann [9], consider the task of representing information flow type systems in program logics, while the MRG project focused on formalising a complex type system for input-dependent heap space usage [10]. Certified abstract interpretation [11] complements the type-based certificate generation route considered in the present paper. Similar to the relationship between Necula-Lee-style PCC [27] and foundational PCC by Appel et al. [2], certified abstract interpretation may be seen as a foundational counterpart to Albert et al.'s abstraction-carrying code [1]. Bypassing the program logic, the approach chosen in [11] justifies the program analysis directly with respect to the operational semantics. A generic framework for certifying program analyses based on abstract interpretation is presented by Chang et al. [14]. The possibility to view abstract interpretation frameworks as inference engines for invariants and other assertions in program logics in general was already advocated in one of the classic papers by Cousot and Cousot [18]. Nipkow et al.'s VeryPCC project [33] explores an alternative foundational approach by formally proving the soundness of verification condition generators. In particular, [32] presents generic soundness and completeness proofs for VCGens, together with an instantiation of the framework to a safety policy preventing arithmetic overflows. Generic PCC architectures have recently been developed by Necula et al. [15] and the FLINT group [22].
2 Bicolano
Syntax and States. We consider an arbitrary but fixed bytecode program P that assigns to each method identifier M a method implementation mapping
instruction labels l to instructions. We use the notation M(l) to denote the instruction at program point l in M, and init_M, suc_M(l), and par_M to denote the initial label of M, the successor label of l in M, and the list of formal parameters of M, respectively. While the Bicolano formalisation supports the full sequential fragment of the JVML, this paper treats the simplified language given by the basic instructions

basic(M, l) ≡ M(l) ∈ { load x, store x, dup, pop, push z, unop u, binop o, new c, athrow, getfield c f, putfield c f, getstatic c f, putstatic c f }

and additionally conditional and unconditional jumps ifz l and goto l, static and virtual method invocations invokestatic M and invokevirtual M, and vreturn.

Values and states. The domain V of values is ranged over by v, w, ... and comprises constants (integers z and Null) and addresses a, ... ∈ A. States are built from operand stacks, stores, and heaps:

O ∈ O = V list    S ∈ S = X ⇀fin V    h ∈ H = A ⇀fin (C × (F ⇀fin V))

where X, C and F are the domains of variables, class names, and field names, respectively. In addition to local states comprising operand stacks, stores, and heaps, s, r ∈ Σ = O × S × H, we consider initial states Σ0 and terminal states T:

s0 ∈ Σ0 = S × H    t ∈ T ::= NormState(h, v) | ExcnState(h, a)
These capture states which occur at the beginning and the end of a frame's execution. Terminal states t are tagged according to whether the return value represents a pointer to an unhandled exception object (constructor ExcnState(., .)) or an ordinary return value (constructor NormState(., .)). For s0 = (S, h) we write state(s0) = ([ ], S, h) for the local state that extends s0 with an empty operand stack. For par_M = [x1, ..., xn] and O = [v1, ..., vn] we write par_M → O for [x_i → v_i]_{i=1,...,n}. We write heap(s) to access the heap component of a state s, and similarly for initial and terminal states. Finally, lv(.) denotes the local variable component of a state and getClass(h, a) extracts the dynamic class of the object at location a in heap h.

Operational judgements. Bicolano defines a variety of small-step and big-step judgements, with compatibility proofs where appropriate. For the purpose of the present paper, the following simplified setup suffices¹ (cf. Figure 2):
¹ The formalisation separates the small-step judgements for method invocations from the execution of basic instructions and jumps, and then defines a single recursive judgement combining the two. See [30] for the formal details.
Non-exceptional steps. The judgement M ⊢ l, s ⇒norm l′, r describes the (non-exceptional) execution of a single instruction, where l′ is the label of the next instruction (given by suc_M(l) or jump targets). The rules are largely standard, so we only give a rule for the invocation of static methods, InvsNorm.

Exceptional steps. The judgement M ⊢ l, s ⇒excn h, a describes exceptional small steps where the execution of the instruction at program point M, l in state s results in the creation of a fresh exception object, located at address a in the heap h. In the case of method invocations, a single exceptional step is also observed by the caller if the invoked method raised an exception that could not be locally handled (cf. rule InvsExcn).

Small-step judgements. Non-exceptional and handled exceptional small steps are combined into the small-step judgement M ⊢ l, s ⇒ l′, r using the two rules NormStep and ExcnStep. The reflexive transitive closure of this relation is denoted by M ⊢ l, s ⇒* l′, r.

Big-step judgements. The judgement form M ⊢ l, s ⇓ t captures the execution of method M from the instruction at label l onwards, until the end of the method. This relation is defined by the three rules Comp, Vret and Uncaught.

Deep-step judgements. The judgement M ⊢ l, s ⇑ r is defined similarly to the big-step judgement, by the rules D-Refl, D-Trans, D-Invs, and D-Uncaught. This judgement associates states across invocation boundaries, i.e. r may occur in a subframe of the method M. This is achieved by rule D-Invs, which associates a call state of a (static) method with states reachable from the initial state of the callee. A similar rule for virtual methods is omitted from this presentation.

Small- and big-step judgements are mutually recursive due to the occurrence of a big-step judgement in hypotheses of the rules for method invocations on the one hand and rule Comp on the other.
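For instance, the reflexive transitive closure used above has the usual two-rule inductive definition; the following Coq fragment is a schematic sketch over placeholder types:

    (* Reflexive-transitive closure of the small-step relation (schematic). *)
    Parameters label state : Type.
    Parameter step : label -> state -> label -> state -> Prop.

    Inductive star : label -> state -> label -> state -> Prop :=
    | star_refl : forall l s, star l s l s
    | star_step : forall l s l' s' l'' s'',
        step l s l' s' -> star l' s' l'' s'' -> star l s l'' s''.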
3 Base Logic
This section outlines the non-resource-extended program logic.
3.1 Phrase-Oriented Assertions and Judgements
The structure of assertions and judgements of the logic are governed by the requirement to enable the interpretation of type systems as well as the representation of core idioms of JML. High-level type systems typically associate types (in contexts) to program phrases. Compiling a well-formed program phrase into bytecode yields a code segment that is the postfix of a JVM method, i.e. all program points without control flow successors contain return instructions. Consequently, judgements in the logic associate assertions to a program label which represents the execution of the current method invocation from the current point (i.e. a state applicable at the program point) onwards. In case of method termination, a partial-correctness assertion (post-condition) applies that relates this
InvsNorm:
  M(l) = invokestatic M′    M′ ⊢ init_M′, ([ ], par_M′ → O, h) ⇓ NormState(k, v)
  ──────────────────────────────────────────────────────────────
  M ⊢ l, (O@O′, S, h) ⇒norm suc_M(l), (v :: O′, S, k)

InvsExcn:
  M(l) = invokestatic M′    M′ ⊢ init_M′, ([ ], par_M′ → O, h) ⇓ ExcnState(k, a)
  ──────────────────────────────────────────────────────────────
  M ⊢ l, (O@O′, S, h) ⇒excn k, a

NormStep:
  M ⊢ l, s ⇒norm l′, r
  ─────────────────────
  M ⊢ l, s ⇒ l′, r

ExcnStep:
  M ⊢ l, (O, S, h) ⇒excn k, a    getClass(k, a) = e    Handler(M, l, e) = l′
  ──────────────────────────────────────────────────────────────
  M ⊢ l, (O, S, h) ⇒ l′, ([a], S, k)

Comp:
  M ⊢ l, s ⇒ l′, s′    M ⊢ l′, s′ ⇓ t
  ────────────────────────────────────
  M ⊢ l, s ⇓ t

Vret:
  M(l) = vreturn
  ─────────────────────────────────────────
  M ⊢ l, (v :: O, S, h) ⇓ NormState(h, v)

Uncaught:
  M ⊢ l, s ⇒excn h, a    getClass(h, a) = e    Handler(M, l, e) = ∅
  ──────────────────────────────────────────────────────────────
  M ⊢ l, s ⇓ ExcnState(h, a)

D-Refl:
  ─────────────
  M ⊢ l, s ⇑ s

D-Trans:
  M ⊢ l, s ⇒ l′, s′    M ⊢ l′, s′ ⇑ s″
  ─────────────────────────────────────
  M ⊢ l, s ⇑ s″

D-Invs:
  M(l) = invokestatic M′    M′ ⊢ init_M′, ([ ], par_M′ → O, h) ⇑ s
  ──────────────────────────────────────────────────────────────
  M ⊢ l, (O@O′, S, h) ⇑ s

D-Uncaught:
  M ⊢ l, s ⇒excn h, a    getClass(h, a) = e    Handler(M, l, e) = ∅
  ──────────────────────────────────────────────────────────────
  M ⊢ l, s ⇑ ([a], ∅, h)

Fig. 2. Bicolano: selected judgements and operational rules
current state to the return state. As the guarantee given by type soundness results often extends to infinite computations (e.g. type safety, i.e. absence of type errors), judgements furthermore include assertions that apply to non-terminating computations. These strong invariants relate the state valid at the subject label to each future state in the current method invocation. This interpretation includes states in subframes, i.e. in method invocations that are triggered in the phrase represented by the subject label. Infinite computations are also covered by the interpretation of local annotations in JML, i.e. assertions occurring at arbitrary program points which are to be satisfied whenever the program point is visited. The logic distinguishes these explicitly given annotations from strong invariants, as the former are not necessarily present at all program points. A further specification idiom of JML that has a direct impact on the form of assertions is \old, which refers to the initial state of a method invocation and may appear in post-conditions, local annotations, and strong invariants. Formulae that are shared between postconditions, local annotations, and strong invariants, and additionally only concern the relationship between the subject state and the initial state of the method, may be captured in pre-conditions.
Thus, the judgements of the logic are of the form G ⊢ {A} M, l {B} (I), where M, l denotes a program point (composed of a method identifier and an instruction label), and the assertion forms are as follows, where B denotes the set of booleans.

Assertions. A ∈ Assn = Σ0 × Σ → B occur as preconditions A and local annotations Q, and relate the current state to the initial state of the current frame.

Postconditions. B ∈ Post = Σ0 × Σ × T → B relate the current state to the initial and final state of a (terminating) execution of the current frame.

Invariants. I ∈ Inv = Σ0 × Σ × Σ → B relate the initial state of the current method, the current state, and any future state of the current frame or a subframe of it.

The component G of a judgement represents a proof context and is represented as an association of specification triples (A, B, I) ∈ Assn × Post × Inv to program points. The behaviour of methods is described using three assertion forms.

Method preconditions. R ∈ MethPre = Σ0 → B are interpreted hypothetically, i.e. their satisfaction implies that of the method postconditions and invariants but is not directly enforced to hold at all invocation points.

Method postconditions. T ∈ MethSpec = Σ0 × T → B constrain the behaviour of terminating method executions and thus relate only initial and final states.

Method invariants. Φ ∈ MethInv = Σ0 × Σ → B constrain the behaviour of terminating and non-terminating method executions by relating the initial state of a method frame to any state that occurs during its execution.

A program specification is given by a method specification table M that associates to each method a method specification S = (R, T, Φ), a proof context G, and a table Q of local annotations Q ∈ Assn. From now on, let M denote some arbitrary but fixed specification table satisfying dom M = dom P.
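In the Coq formalisation these assertion spaces become simple function types into Prop, via the shallow embedding mentioned in the introduction (the mathematical presentation above uses B, but assertions are in fact arbitrary Coq-definable predicates). The following is an illustrative sketch with placeholder state types:

    (* Assertion spaces of the logic (schematic shallow embedding). *)
    Parameters Sigma0 Sigma Term : Type. (* initial, local, terminal states *)

    Definition Assn     := Sigma0 -> Sigma -> Prop.           (* A, Q *)
    Definition Post     := Sigma0 -> Sigma -> Term -> Prop.   (* B *)
    Definition Inv      := Sigma0 -> Sigma -> Sigma -> Prop.  (* I *)
    Definition MethPre  := Sigma0 -> Prop.                    (* R *)
    Definition MethSpec := Sigma0 -> Term -> Prop.            (* T *)
    Definition MethInv  := Sigma0 -> Sigma -> Prop.           (* Phi *)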
3.2 Assertion Transformers
In order to notationally simplify the presentation of the proof rules, we define operators that relate assertions occurring in judgements of adjacent instructions. The following operators apply to the non-exceptional single-step execution of basic instructions:

Pre(M, l, l′, A)(s0, r)     = ∃ s. M ⊢ l, s ⇒norm l′, r ∧ A(s0, s)
Post(M, l, l′, B)(s0, r, t) = ∀ s. M ⊢ l, s ⇒norm l′, r → B(s0, s, t)
Inv(M, l, l′, I)(s0, r, t)  = ∀ s. M ⊢ l, s ⇒norm l′, r → I(s0, s, t)

These operators resemble WP-operators, but are separately defined for preconditions, post-conditions, and invariants.
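As a Coq sketch, with placeholder types and the step relation fixed for the program point at hand, the first two operators read:

    (* Schematic Pre/Post transformers over one ⇒norm step. *)
    Parameters Sigma0 Sigma Term : Type.
    Parameter stepnorm : Sigma -> Sigma -> Prop.  (* ⇒norm at (M, l, l') *)

    Definition PreT (A : Sigma0 -> Sigma -> Prop)
      : Sigma0 -> Sigma -> Prop :=
      fun s0 r => exists s, stepnorm s r /\ A s0 s.

    Definition PostT (B : Sigma0 -> Sigma -> Term -> Prop)
      : Sigma0 -> Sigma -> Term -> Prop :=
      fun s0 r t => forall s, stepnorm s r -> B s0 s t.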
Exceptional behaviour of basic instructions is captured by the operators

Pre_excn(M, l, e, A)(s0, r)     = ∃ s h a. M ⊢ l, s ⇒excn h, a ∧ getClass(h, a) = e ∧ r = ([a], lv(s), h) ∧ A(s0, s)
Post_excn(M, l, e, B)(s0, r, t) = ∀ s h a. M ⊢ l, s ⇒excn h, a → getClass(h, a) = e → r = ([a], lv(s), h) → B(s0, s, t)
Inv_excn(M, l, e, I)(s0, r, t)  = ∀ s h a. M ⊢ l, s ⇒excn h, a → getClass(h, a) = e → r = ([a], lv(s), h) → I(s0, s, t)

In the case of method invocations, we replace the reference to the operational judgement by a reference to the method specifications, and include the construction and destruction of a frame. For example, the operators for non-exceptional execution of static methods are

Pre_sinv(R, T, A, [x1, ..., xn])(s0, s) =
  ∃ O S h k v v_i. (R([x_i → v_i]^n_{i=1}, h) → T(([x_i → v_i]^n_{i=1}, h), (k, v)))
    ∧ s = (v :: O, S, k) ∧ A(s0, ([v1, ..., vn]@O, S, h))

Post_sinv(R, T, B, [x1, ..., xn])(s0, r, t) =
  ∀ O S h k v v_i. (R([x_i → v_i]^n_{i=1}, h) → T(([x_i → v_i]^n_{i=1}, h), (k, v)))
    → r = (v :: O, S, k) → B(s0, ([v1, ..., vn]@O, S, h), t)

Inv_sinv(R, T, I, [x1, ..., xn])(s0, s, r) =
  ∀ O S h k v v_i. (R([x_i → v_i]^n_{i=1}, h) → T(([x_i → v_i]^n_{i=1}, h), (k, v)))
    → s = (v :: O, S, k) → I(s0, ([v1, ..., vn]@O, S, h), r)

The exceptional operators for static methods cover exceptions that are raised during the execution of the invoked method but not handled locally. Due to space limitations we omit the operators for exceptional (null-pointer exceptions w.r.t. the invoking object) and non-exceptional behaviour of virtual methods.
3.3 Selected Proof Rules
In addition to influencing the types of assertions, type systems also motivate the use of a certain form of judgements and proof rules. Indeed, one of the advantages of type systems is their compositionality, i.e. the fact that statements regarding a program phrase are composed from the statements referring to the constituent phrases, as in the following typical proof rule for a language of expressions:

⊢ e1 : int    ⊢ e2 : int
─────────────────────────
⊢ e1 + e2 : int

Transferring this scheme to bytecode leads to a rule format where hypothetical judgements refer to the control flow successors of the phrase in the judgement's conclusion. In addition to supporting syntax-directed reasoning, this orientation renders the explicit construction of a control flow graph unnecessary, as no control flow predecessor information is required to perform a proof.
Figure 3 presents selected proof rules. These are motivated as follows.

Instr:
  basic(M, l)    SC1    SC2    l′ = suc_M(l)
  G ⊢ {Pre(M, l, l′, A)} M, l′ {Post(M, l, l′, B)} (Inv(M, l, l′, I))
  ∀ l′ e. Handler(M, l, e) = l′ →
      G ⊢ {Pre_excn(M, l, e, A)} M, l′ {Post_excn(M, l, e, B)} (Inv_excn(M, l, e, I))
  ∀ s0 s h a. (∀ e. getClass(h, a) = e → Handler(M, l, e) = ∅) →
      M ⊢ l, s ⇒excn h, a → A(s0, s) → B(s0, s, (h, a))
  ──────────────────────────────────────────────────────────────
  G ⊢ {A} M, l {B} (I)

Goto:
  M(l) = goto l′    SC1    SC2
  G ⊢ {Pre(M, l, l′, A)} M, l′ {Post(M, l, l′, B)} (Inv(M, l, l′, I))
  ──────────────────────────────────────────────────────────────
  G ⊢ {A} M, l {B} (I)

If0:
  M(l) = ifz l′    SC1    SC2    l″ = suc_M(l)
  G ⊢ {Pre(M, l, l′, A)} M, l′ {Post(M, l, l′, B)} (Inv(M, l, l′, I))
  G ⊢ {Pre(M, l, l″, A)} M, l″ {Post(M, l, l″, B)} (Inv(M, l, l″, I))
  ──────────────────────────────────────────────────────────────
  G ⊢ {A} M, l {B} (I)

InvS:
  M(l) = invokestatic M′    M(M′) = (R, T, Φ)    SC1    SC2
  ∀ s0 O S h O′ r v_i. (R(par_M′ → O, h) → Φ((par_M′ → O, h), r)) →
      A(s0, (O@O′, S, h)) → I(s0, (O@O′, S, h), r)
  A1 = Pre_sinv(R, T, A, par_M′)    B1 = Post_sinv(R, T, B, par_M′)
  G ⊢ {A1} M, suc_M(l) {B1} (Inv_sinv(R, T, I, par_M′))
  ∀ l′ e. Handler(M, l, e) = l′ →
      G ⊢ {Pre_sinv^excn(R, T, A, e, par_M′)} M, l′ {Post_sinv^excn(R, T, B, e, par_M′)} (Inv_sinv^excn(R, T, I, e, par_M′))
  ∀ s0 O S h O′ k a. (R(par_M′ → O, h) → Φ((par_M′ → O, h), (k, a))) →
      (∀ e. getClass(k, a) = e → Handler(M, l, e) = ∅) →
      A(s0, (O@O′, S, h)) → B(s0, (O@O′, S, h), (k, a))
  ──────────────────────────────────────────────────────────────
  G ⊢ {A} M, l {B} (I)

Ret:
  M(l) = vreturn    SC1    SC2
  ∀ s0 v O S h. A(s0, (v :: O, S, h)) → B(s0, (v :: O, S, h), (h, v))
  ──────────────────────────────────────────────────────────────
  G ⊢ {A} M, l {B} (I)

Conseq:
  G ⊢ {A′} ℓ {B′} (I′)
  ∀ s0 s. A(s0, s) → A′(s0, s)
  ∀ s0 s t. B′(s0, s, t) → B(s0, s, t)
  ∀ s0 s r. I′(s0, s, r) → I(s0, s, r)
  ──────────────────────────────────────────────────────────────
  G ⊢ {A} ℓ {B} (I)

Ax:
  G(ℓ) = (A, B, I)
  ∀ s0 s. A(s0, s) → I(s0, s, s)
  ∀ Q. Q(ℓ) = Q → (∀ s0 s. A(s0, s) → Q(s0, s))
  ──────────────────────────────────────────────────────────────
  G ⊢ {A} ℓ {B} (I)

Fig. 3. Program logic: selected syntax-directed rules
Rule Instr describes the behaviour of basic instructions. The hypothetical judgement for the successor instruction involves assertions that are related to the assertions in the conclusion by the transformers for normal termination. A further hypothesis captures exceptions that are handled locally, i.e. those exceptions e to which the exception handler of the current method associates a handling instruction (predicate Handler(M, l, e) = l′). Exceptions that are not handled locally result in abrupt termination of the method. Consequently, these exceptions are modelled in a side condition that involves the method postcondition rather than a further judgemental hypothesis. Finally, the side conditions SC1 and SC2 ensure that the invariant I and the local annotation Q (if existing) are satisfied in any state reaching label l:

SC1 = ∀ s0 s. A(s0, s) → I(s0, s, s)
SC2 = ∀ Q. Q(M, l) = Q → (∀ s0 s. A(s0, s) → Q(s0, s))

In particular, SC2 requires us to prove any annotation that is associated with label l. Satisfaction of I in later states, and satisfaction of local annotations Q at later program points, are guaranteed by the judgement for suc_M(l). The rules for conditional and unconditional jumps include hypotheses for the control flow successors, and the same side conditions for local annotations and invariants as rule Instr. No further hypotheses or side conditions regarding exceptional behaviour are required, as these instructions do not raise exceptions. These rules also account for the verification of loops, which at the level of bytecode are rendered as jumps. Loop invariants can be inserted as postconditions B at their program point. Rule Ax allows one to use such invariants, whereas according to Definition 1 they must be established once in order for a verification to be valid. In rule InvS, the invariant of the callee, namely Φ (more precisely: the satisfaction of Φ whenever the initial state of the callee satisfies the precondition R), and the local precondition A may be exploited to establish the invariant I. This ensures that I will be satisfied by all states that arise during the execution of M′, as these states will always conform to Φ. The callee's post-condition T is used to construct the assertions that occur in the judgement for the successor instruction. Both conditions reflect the transfer of the method arguments and return values between the caller and the callee. This protocol is repeated in the hypothesis and the side condition for the exceptional cases, which otherwise follow the pattern mentioned in the description of rule Instr. A similar rule for virtual methods is omitted. The rule for method returns, Ret, ties the precondition A to the post-condition B w.r.t. the terminal state that is constructed using the topmost value of the operand stack. Finally, the logical rules Conseq and Ax arise from the standard rules by adding suitable side conditions for strong invariants and local assertions.
3.4 Behavioural Subtyping and Verified Programs
We say that method specification (R, T, Φ) implies (R′, T′, Φ′) if
– for all s0 and t, R(s0) → T(s0, t) implies R′(s0) → T′(s0, t), and
– for all s0 and s, R(s0) → Φ(s0, s) implies R′(s0) → Φ′(s0, s).

Furthermore, we say that M satisfies behavioural subtyping for P if whenever P contains an instruction invokevirtual M′ with M(M′) = (S′, G′, Q′), and M overrides M′, then there are S, G and Q with M(M) = (S, G, Q) such that S implies S′. Finally, we call a derivation G ⊢ {A} M, l {B} (I) progressive if it contains at least one application of a non-logical rule.

Definition 1. P is verified with respect to M, notation M ⊢ P, if
– M satisfies behavioural subtyping for P, and
– for all M, M(M) = (S, G, Q), and S = (R, T, Φ),
  • a progressive derivation G ⊢ {A} M, l {B} (I) exists for any l, A, B, and I with G(M, l) = (A, B, I), and
  • a progressive derivation G ⊢ {A} M, init_M {B} (I) exists for
    A(s0, s) ≡ s = state(s0) ∧ R(s0)
    B(s0, s, t) ≡ s = state(s0) → T(s0, t)
    I(s0, s, r) ≡ s = state(s0) → Φ(s0, r).

As the reader may have noticed, behavioural subtyping only affects method specifications but not the proof contexts G or annotation tables Q. Technically, the reason for this is that no constraints on these components are required in order to prove the logic sound. Pragmatically, we argue that proof contexts and local annotation tables of overriding methods indeed should not be related to contexts and annotation tables of their overridden counterparts, as both kinds of tables expose the internal structure of method implementations. In particular, entries in proof contexts and annotation tables are formulated w.r.t. specific program points, which would be difficult to interpret outside the method boundary or indeed across different (overriding) implementations of a method. The distinction between progressive and non-progressive derivations prevents attempts to justify a proof context or method specification table simply by applying the axiom rule to all entries. In program logics for high-level languages, the corresponding effect is silently achieved by the unfolding of the method body in the rule for method invocations [29]. As our judgemental form does not permit such an unfolding, the auxiliary notion of progressive derivations is introduced. In our formalisation, the separation between progressive and other derivations is achieved by the introduction of a second judgement form, as described in [8].
3.5 Interpretation and Soundness
Definition 2. The triple (Q, B, I) is valid at (M, l) for (s0, s) if
– for all t, if M ⊢ l, s ⇓ t then B(s0, s, t),
– for all l′ and r, if M ⊢ l, s ⇒* l′, r and Q(l′) = Q, then Q(s0, r), and
– for all r, if M ⊢ l, s ⇑ r then I(s0, s, r).
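Rendered in Coq, this validity notion is a simple conjunction over the three operational judgements; the sketch below uses hypothetical placeholder relations for big steps, the small-step closure, and deep steps:

    (* Definition 2 as a Coq predicate (schematic). *)
    Parameters label istate state tstate : Type.
    Parameter bigstep  : label -> state -> tstate -> Prop.
    Parameter star     : label -> state -> label -> state -> Prop.
    Parameter deepstep : label -> state -> state -> Prop.

    Definition valid (Q : label -> option (istate -> state -> Prop))
      (B : istate -> state -> tstate -> Prop)
      (I : istate -> state -> state -> Prop)
      (l : label) (s0 : istate) (s : state) : Prop :=
      (forall t, bigstep l s t -> B s0 s t) /\
      (forall l' r q, star l s l' r -> Q l' = Some q -> q s0 r) /\
      (forall r, deepstep l s r -> I s0 s r).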
Note that the second clause applies to annotations Q associated with arbitrary labels l′ in method M that will be visited during the execution of M from (l, s) onwards. Although these annotations are interpreted without recourse to the state s, the proof of Q(s0, r) may exploit the precondition A(s0, s). The soundness result is then as follows.

Theorem 1. For M ⊢ P let M(M) = (S, G, Q), let G ⊢ {A} M, l {B} (I) be a progressive derivation, and let A(s0, s). Then (Q, B, I) is valid at (M, l) for (s0, s).

In particular, this theorem implies that for M ⊢ P all method specifications in M are honoured by their method implementations. The proof of this result may be performed in two ways. Following the approach of Kleymann and Nipkow [25,29,3], one would first prove that the derivability of a judgement entails its validity, under the hypothesis that contextual judgements have already been validated. For this task, the standard technique involves the introduction of relativised notions of validity that restrict the interpretation of judgements to operational judgements of bounded height. Then, the hypothesis on contextual judgements is eliminated using structural properties of the relativised validity. An alternative to this approach has been developed by Benjamin Grégoire in the course of the formalisation of the present logic. It consists of (i) defining a family of syntax-directed judgements (one judgement form for each instruction form, inlining the rule of consequence), (ii) proving that the property M ⊢ P implies that the last step in a derivation of G ⊢ {A} M, l {B} (I) can be replaced by an application of the syntax-directed judgement corresponding to the instruction at M, l (in particular, an application of the axiom rule is replaced by the derivation for the corresponding code blocks from G), and (iii) proving the main claim of Theorem 1 by treating the three parts of Definition 2 separately, each one by induction over the respective operational judgement.
4 Type-Based Verification
In this section we present a type system that ensures a constant bound on the heap consumption of bytecode programs. The type system is formally justified by a soundness proof with respect to the MOBIUS base logic, and may serve as the target formalism for type-transforming compilers. The requirement imposed on programs is similar to that of the analysis presented by Cachera et al. in [13], in that recursive program structures are denied the facility to allocate memory. However, our analysis is presented as a type system, while the analysis presented in [13] is phrased as an abstract interpretation. In addition, Cachera et al.'s approach involves the formalisation of the calculation of the program representation (control flow graph) and of the inference algorithm (fixed point iteration) in the theorem prover. In contrast, our presentation separates the algorithmic issues (type inference and checking) from semantic issues (the property expressed or guaranteed), as is typical for a type-based formulation. Depending on the verification infrastructure available at the code consumer side, the PCC certificate may either consist of (a digest of) the
typing derivation or an expansion of the interpretation of the typing judgements into the MOBIUS logic. The latter approach was employed in our earlier work [10] and consists of understanding typing judgements as derived proof rules in the program logic and using syntax-directed proof tactics to apply the rules in an automatic fashion. In contrast to [10], however, the interpretation given in the present section extends to non-terminating computations, albeit for a far simpler type system. The present section extends the work presented in [8], as the type system is now phrased for bytecode rather than an intermediate functional language and includes the treatment of exceptions and virtual methods.

Bytecode-level type system. The type system consists of judgements of the form Σ, Λ ⊢ ℓ : n, expressing that the segment of bytecode whose initial instruction is located at ℓ is guaranteed not to allocate more than n memory cells. Here, ℓ denotes a program point M, l, while the signatures Σ and Λ assign types (natural numbers n) to identifiers of methods and to bytecode instructions (in particular, when those are part of a loop), respectively.
C-New:
  M(l) = new C    n ≥ 1    Σ, Λ ⊢ M, suc_M(l) : n − 1
  ────────────────────────────────────────────────────
  Σ, Λ ⊢ M, l : n

C-Instr:
  basic(M, l)    M(l) ≠ new C    n ≥ 1    Σ, Λ ⊢ M, suc_M(l) : n
  ∀ l′ e. Handler(M, l, e) = l′ → Σ, Λ ⊢ M, l′ : n − 1
  ────────────────────────────────────────────────────
  Σ, Λ ⊢ M, l : n

C-If:
  M(l) = ifz l′    n ≥ 0    Σ, Λ ⊢ M, l′ : n    Σ, Λ ⊢ M, suc_M(l) : n
  ────────────────────────────────────────────────────
  Σ, Λ ⊢ M, l : n

C-Invoke:
  M(l) ∈ {invokestatic M′, invokevirtual M′}    Σ(M′) = k    n ≥ 1    k ≥ 0
  Σ, Λ ⊢ M, suc_M(l) : n
  ∀ l′ e. Handler(M, l, e) = l′ → Σ, Λ ⊢ M, l′ : n − 1
  ────────────────────────────────────────────────────
  Σ, Λ ⊢ M, l : n + k

C-Ret:
  M(l) = vreturn
  ────────────────
  Σ, Λ ⊢ M, l : 0

C-Sub:
  Σ, Λ ⊢ ℓ : n    n ≤ k
  ─────────────────────
  Σ, Λ ⊢ ℓ : k

C-Assum:
  Λ(ℓ) = n
  ─────────────
  Σ, Λ ⊢ ℓ : n

Fig. 4. Type system for constant heap space
39
for the code continuation by one unit. The rule C-Assum allows for using the annotation attached to the instruction if it matches the type of the instruction. A typing derivation Σ,Λ : k is called progressive if it does not solely contain applications of rules C-Sub and C-Assum. Furthermore, we call P well-typed for Σ, notation Σ P , if for all M and n with Σ(M ) = n there is a local specification table Λ such that a progressive derivation Σ,Λ M, init M : n exists, and for all with Λ() = k we have a progressive derivation Σ,Λ : k. Type checking and inference. The tasks of checking and automatically finding (inference) of typing derivations are not our main concern here. Nevertheless, we discuss briefly how this can be achieved. For this simple type system checking a given typing derivation amounts to verifying the inequations that arise as side conditions. Furthermore, given Σ, Λ a corresponding typing derivation can be reconstructed by applying the typing rules in a syntax-directed fashion. In order to construct Σ, Λ as well (type inference) one writes down a “skeleton derivation” with indeterminates instead of actual numeric values and then solves the arising system of linear inequalities. Alternatively, one can proceed by counting allocation statements along paths and loops in the control-flow graph. Our main interest here is, however, the use of existing type derivations however obtained in order to mechanically construct proofs in the program logic. This will be described now. Interpretation of the type system. The interpretation for the above type system is now obtained by defining for each number n a triple n = (A, B, I) consisting of a precondition A, a postcondition B, and an invariant I, as follows. ⎛ ⎞ λ (s0 , s). True, n ≡ ⎝ λ (s0 , s, t). |heap(t)| ≤ |heap(s)| + n, ⎠ λ (s0 , s, r). |heap(r)| ≤ |heap(s)| + n Here, |h| denotes the size of heap h and heap(s) extracts the heap component of a state. We specialise the main judgement form of the bytecode logic to G {n} ≡ let (A, B, I) = n in G {A} {B} (I). By the soundness of the MOBIUS logic, the derivability of a judgement G {n} guarantees that executing the code located at will not allocate more that n items, in terminating (postcondition B) and non-terminating (invariant I) cases, provided that M P holds. For (A, B, I) = n we also define the method specification Spec n ≡ (λ s0 . True, λ (s0 , t). B(s0 , state(s0 ), t), λ (s0 , s). I(s0 , state(s0 ), s)), and for a given Λ we define GΛ pointwise by GΛ () = Λ(). Finally, we say that M satisfies Σ, notation M |= Σ, if for all methods M , M(M ) = (Spec n, GΛ , ∅) holds precisely if Σ(M ) = n, where Λ is the context associated with M in Σ P . Thus, method specification table M contains for each
40
L. Beringer, M. Hofmann, and M. Pavlova
method the precondition, postcondition and invariant from Σ, the (complete) context determined from Λ, and the empty local annotation table Q. We can now prove the soundness of the typing rules with respect to this interpretation. By induction on the typing rules, we first show that the interpretation of a typing judgement is derivable in the logic. Proposition 1. For M |= Σ let M be provided in M with some annotation table Λ such that Σ,Λ M, l : n is progressive. Then GΛ M, l {n}. From this, one may obtain the following, showing that well-typed programs satisfy the verified-program property: Theorem 2. Let M |= Σ and Σ P , and let M satisfy behavioural subtyping for P . Then M P . Discussion. In order to improve the precision of the analysis, a possibility is to combine the type system with a null-pointer analysis. For this, we would specialise the proof rules for instructions which might throw a null-pointer exception. At program points for which the analysis guarantees absence of such exceptions, we may then use a specialised typing rule. For example, a suitable rule for the field access operation is the following. C-Getfld1
getField (m, l) refNotNull (m, l) Σ,Λ m, suc m (l) : n Σ,Λ m, l : n
Program points for which the analysis is unable to discharge the side condition refNotNull(m, l) would be dealt with using the standard rule. Similarly, instructions that are guaranteed not to throw runtime exceptions (like load x, store x, dup) may be typed using the optimised rule

  C-noRTE
      Σ,Λ ⊢ m, suc_m(l) : n    noExceptionInstr(m, l)
      ────────────────────────────────────────────────
      Σ,Λ ⊢ m, l : n
We expect that justifying these specialised rules using the program logic would not pose major problems, while the formal integration with other program analyses (such as the null-pointer analysis) is a topic for future research.
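Returning to type inference for the system of this section, the path-counting approach can be made concrete. The following sketch (ours, not part of the MOBIUS tool chain) computes the allocation bound n for a loop-free control-flow graph by taking the maximum allocation count over any path from the entry block; handling loops would additionally require solving the linear inequalities mentioned above. The CFG and its labelling are hypothetical illustration data.

from functools import lru_cache

# Control-flow graph as adjacency lists; allocs[v] counts the allocation
# instructions executed in basic block v.
succ = {"entry": ["then", "else"], "then": ["exit"], "else": ["exit"], "exit": []}
allocs = {"entry": 1, "then": 2, "else": 0, "exit": 1}

@lru_cache(maxsize=None)
def bound(v: str) -> int:
    # Maximum number of allocations on any path starting at block v.
    return allocs[v] + max((bound(w) for w in succ[v]), default=0)

assert bound("entry") == 4  # entry(1) -> then(2) -> exit(1)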
5 Resource-Extended Program Logic
In this section we give a brief overview of an extension of the MOBIUS base logic as described in Section 3 for dealing with resources in a generic way. The extension addresses the following shortcomings of the basic logic:

Resource consumption. Specific resources that we would like to reason about include instruction counters, heap allocation, and frame stack height. A well-known technique for modelling these resources is code instrumentation, i.e. the introduction of (real or ghost) variables and instructions manipulating these. However, code instrumentation appears inappropriate for a PCC
environment, as it does not provide an end-to-end guarantee that can be understood without reference to the program at hand. In particular, establishing the overall satisfaction of a resource property using code instrumentation requires an analysis of the annotated program, i.e. a proof that the instrumentation variables are introduced and manipulated correctly. Furthermore, the interaction between additional variables of different domains, and between auxiliary variables and proper program variables, is difficult to reason about.

Execution traces. Here, the goal is to reason about properties concerning a full terminating or non-terminating execution of a program, for example by imposing that an execution satisfies a formula expressed in temporal logic or a policy given in terms of a security automaton. Such specifications may concern the entire execution history, i.e. be defined over a sequence of (intermediate) Bicolano states, and are thus not expressible in the MOBIUS base logic. Ghost variables are heavily used in JML, both for resource-accounting purposes and for functional specifications, but are not directly expressible in the base logic.

In this section we extend the base logic by a generic resource-accounting mechanism that may be instantiated to the above tasks. In addition to the work reported here, we have also performed an analysis of the usage made of ghost variables in JML, and have developed interpretations of ghost variables in native and resource-extended program logics [24]. In particular, loc. cit. contains a formalised proof demonstrating how resource counting using ghost variables in native logics may be effectively eliminated, by translating each proof derivation into a derivation in the resource-extended logic.

5.1 Semantic Modelling of Generic Resources
In order to avoid the pitfalls of code instrumentation discussed above, a semantic modelling of resource consumption was chosen. The logic is defined over an extended operational semantics, the judgements of which are formulated over the same components as the standard Bicolano operational semantics, plus a further resource-accounting component [20]. The additional component is of the a priori unspecified type ACT, and occurs as a further component in initial, final, and intermediate states. In addition, we introduce transfer functions that update the content of this component according to the other state components, including the program counter. The operational semantics of the extended framework is then obtained by embedding each non-extended judgement form in a judgement form over extended states and invoking the appropriate transfer functions on the resource component. While these definitions of the operational semantics are carried out once and for all, the implementation of the transfer functions themselves is programmable. Thus, realisations of the framework for particular resources may be obtained by instantiating ACT to some specific type and implementing the transfer functions as appropriate. The program logic remains conceptually untouched, i.e. it is structurally defined as the logic from Section 3,
but the definitions of assertion transformers and rules, and the soundness proof, are adapted to extended states and modified operational judgements. In comparison to admitting ad-hoc extensions of the program logic, we argue that the chosen approach is better suited to PCC applications, as the consumer has a single point of reference at which to specify his policy, namely the implementation of the transfer functions.
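The shape of this extension can be sketched in a few lines. The following is our illustration, not the Bicolano formalisation: states carry an extra component of a pluggable type ACT, and a programmable transfer function updates it at every step of the otherwise unchanged semantics. The instruction map and event names are assumptions made for the example.

from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

ACT = TypeVar("ACT")

@dataclass
class ExtState(Generic[ACT]):
    pc: int        # stands in for the full (non-extended) Bicolano state
    resource: ACT  # the additional resource-accounting component

def step(s: ExtState[ACT], transfer: Callable[[int, ACT], ACT]) -> ExtState[ACT]:
    # One step of the extended semantics: the underlying semantics advances
    # the state (here simply pc+1) while `transfer` updates the resource.
    return ExtState(pc=s.pc + 1, resource=transfer(s.pc, s.resource))

# Instantiation 1: instruction counting (ACT = int).
count_instr = lambda pc, n: n + 1

# Instantiation 2: event traces, as used for block-booking in Section 5.2.
program = {0: "auth(2)", 1: "send", 2: "send"}  # hypothetical pc -> event
def record_event(pc: int, trace: tuple) -> tuple:
    return trace + (program[pc],) if pc in program else trace

s = ExtState(pc=0, resource=())
for _ in range(3):
    s = step(s, record_event)
assert s.resource == ("auth(2)", "send", "send")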
5.2 Application: Block-Booking
As an application of the resource-extended program logic, we consider a scenario where an application repeatedly sends some data across a network, provided that each such operation is sanctioned by an interaction with the user. In order to avoid authorisation requests for individual send operations, a high-level language might contain a primitive auth(n) that asks the user to authorise n messages in one interaction. A reasonable resource policy for the code consumer then is to require that no send operation be carried out without authorisation, and that at each point of the execution, the acquired authorisations suffice for servicing the remaining send operations. (For simplicity, we assume that refusal by the user to sanction an authorisation request simply blocks or leads to immediate non-termination without any observable effect.) We note that, as in the case of the logic, loop constructs from the high-level language are mapped to conditional and unconditional jumps that must be typed using the corresponding rules.

We now outline a bytecode-level type and effect system for this task, for a sublanguage of scalar (integer) values and unary static methods. Effects τ are rely-guarantee pairs (m, n) of natural numbers: a code fragment with this effect satisfies the above policy whenever executed in a state with at least m unused authorisations, with at least n unused authorisations being left over upon termination. The number of authorisations that are additionally acquired, and possibly used, during the execution is unconstrained. Types C, D, . . . are sets of integers constraining the values stored in variables or operand stack positions. Judgements take the form Δ, η, Ξ ⊢Σ,Λ ℓ : C, τ, with the following components:

– the abstract store Δ maps local variables to types
– the abstract operand stack η is represented as a list of types
– Ξ is an equivalence relation ranging over identifiers ρ from dom Δ ∪ dom η, where dom η is taken to be the set {0, . . . , |η| − 1}. The role of Ξ is to capture equalities between values on the operand stack and the store.
– instruction labels ℓ = (M, l) indicate the current program point, as before
– the type C describes the return type
– the effect τ captures the pre-post-behaviour of the subject phrase with respect to authorisation and send events
– the proof context Λ associates sets of tuples (Δ, η, Ξ, C, τ) to labels l (implicitly understood with respect to method M)
– the method signature table Σ maps method names to type signatures of the
form ∀i∈I. Ci −(mi,ni)→ Di. Limiting our attention to static methods with a single parameter, such a poly-variant signature indicates that for each i in some (unspecified) index set I, the method is of type Ci −(mi,ni)→ Di, i.e. it takes arguments satisfying constraint Ci to return values satisfying Di, with (latent) effect (mi, ni). In addition to ignoring virtual methods (and consequently avoiding the need for a condition enforcing behavioural subtyping of method specifications), we also ignore exceptions. Finally, while our example program contains simple objects, we do not give proof rules for object construction or field access. We argue that this impoverished fragment of the JVML suffices for demonstrating the concept of certificate generation for effects, and leave an extension to larger language fragments as future work.

For an arbitrary relation R, we let Eq(R) denote its reflexive, transitive and symmetric closure. We also define the operations Ξ − ρ, Ξ + ρ and Ξ[ρ := ρ′] on an equivalence relation Ξ and identifiers ρ and ρ′, as follows:

  Ξ − ρ ≡ Ξ \ {(ρ1, ρ2) | ρ = ρ1 ∨ ρ = ρ2}
  Ξ + ρ ≡ Ξ ∪ {(ρ, ρ)}
  Ξ[ρ := ρ′] ≡ Eq((Ξ − ρ) ∪ {(ρ, ρ′)})

The interpretation of a position ρ in a pair (O, S) is given by ⟦x⟧(O,S) = S(x) and ⟦n⟧(O,S) = O(n). The interpretation of a triple Δ, η, Ξ in a pair (O, S) is given by the formula

  ⟦Δ, η, Ξ⟧(O,S) =  dom Δ ⊆ dom S ∧ |η| = |O| ∧
                    ∀x ∈ dom Δ. S(x) ∈ Δ(x) ∧
                    ∀i < |η|. O(i) ∈ η(i) ∧
                    ∀(ρ, ρ′) ∈ Ξ. ⟦ρ⟧(O,S) = ⟦ρ′⟧(O,S)

With the help of these operations, the type system is now defined by the rules given in Figure 5. Due to the formulation at the bytecode level, the authorisation primitive does not have a parameter but obtains its argument from the operand stack. The rule for conditionals, E-If, exploits the outcome of the branch condition by updating the types of all variables associated with the top operand stack position in Ξ. This limited form of copy propagation will be made use of in the verification of an example program below. In the rule of consequence, E-Sub, subtyping on types is denoted by C <: D and given by subset inclusion; it is extended to abstract stores (notation Δ <: Δ′) and abstract operand stacks (notation η <: η′) in a pointwise fashion. Sub-effecting is given by the reflexive closure of the rule

  k ≥ m + d    l ≤ n + d
  ──────────────────────
    (m, n) <: (k, l)
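The interpretation of abstract states can be executed directly. The following sketch (hypothetical names, ours) checks ⟦Δ, η, Ξ⟧ against a concrete operand stack O and store S, with types modelled as predicates over integers and positions in Ξ being either variable names or stack indices.

def value_of(rho, O, S):
    # [[x]](O,S) = S(x) for a variable x, [[n]](O,S) = O(n) for a stack index n
    return O[rho] if isinstance(rho, int) else S[rho]

def models(Delta, eta, Xi, O, S) -> bool:
    return (all(x in S for x in Delta)
            and len(eta) == len(O)
            and all(Delta[x](S[x]) for x in Delta)
            and all(eta[i](O[i]) for i in range(len(eta)))
            and all(value_of(r1, O, S) == value_of(r2, O, S) for (r1, r2) in Xi))

# Example: variable j has singleton type {3}; the stack top is a copy of j,
# recorded in Xi as the pair (0, "j").
Delta = {"j": lambda v: v == 3}
eta = [lambda v: isinstance(v, int)]
Xi = {(0, "j")}
assert models(Delta, eta, Xi, O=[3], S={"j": 3})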
  E-Send
      M(l) = send    Δ, η, Ξ ⊢Σ,Λ M, suc_M(l) : D, (m − 1, n)
      ─────────────────────────────────────────────────────────
      Δ, η, Ξ ⊢Σ,Λ M, l : D, (m, n)

  E-Auth
      M(l) = auth    ∀i ∈ C. i ≥ k    Δ, η, Ξ − |η| ⊢Σ,Λ M, suc_M(l) : D, (m + k, n)
      ─────────────────────────────────────────────────────────────────────────────
      Δ, C :: η, Ξ ⊢Σ,Λ M, l : D, (m, n)

  E-Goto
      M(l) = goto l′    Δ, η, Ξ ⊢Σ,Λ M, l′ : D, (m, n)
      ─────────────────────────────────────────────────
      Δ, η, Ξ ⊢Σ,Λ M, l : D, (m, n)

  E-If
      M(l) = ifz l′    Ξ′ = Ξ − |η|
      Δ1 = Δ[x → Δ(x) ∩ (Z \ {0})](|η|,x)∈Ξ    η1 = η[i → η(i) ∩ (Z \ {0})](|η|,i)∈Ξ ∧ 0≤i<|η|
      Δ2 = Δ[x → Δ(x) ∩ {0}](|η|,x)∈Ξ          η2 = η[i → η(i) ∩ {0}](|η|,i)∈Ξ ∧ 0≤i<|η|
      Δ1, η1, Ξ′ ⊢Σ,Λ M, suc_M(l) : D, (m, n)    Δ2, η2, Ξ′ ⊢Σ,Λ M, l′ : D, (m, n)
      ─────────────────────────────────────────────────────────────────────────────
      Δ, C :: η, Ξ ⊢Σ,Λ M, l : D, (m, n)

  E-Store
      M(l) = store x    Ξ′ = (Ξ[x := |η|]) − |η|    Δ[x → C], η, Ξ′ ⊢Σ,Λ M, suc_M(l) : D, (m, n)
      ─────────────────────────────────────────────────────────────────────────────────────────
      Δ, C :: η, Ξ ⊢Σ,Λ M, l : D, (m, n)

  E-Load
      M(l) = load x    Ξ′ = Ξ[|η| := x]    Δ, Δ(x) :: η, Ξ′ ⊢Σ,Λ M, suc_M(l) : D, (m, n)
      ─────────────────────────────────────────────────────────────────────────────────
      Δ, η, Ξ ⊢Σ,Λ M, l : D, (m, n)

  E-Push
      M(l) = push c    Δ, {c} :: η, Ξ + |η| ⊢Σ,Λ M, suc_M(l) : D, (m, n)
      ───────────────────────────────────────────────────────────────────
      Δ, η, Ξ ⊢Σ,Λ M, l : D, (m, n)

  E-Binop
      M(l) = binop ⊕    C = {z | z = x ⊕ y, x ∈ C1, y ∈ C2}
      Δ, C :: η, ((Ξ − |η|) − (|η| + 1)) + |η| ⊢Σ,Λ M, suc_M(l) : D, (m, n)
      ─────────────────────────────────────────────────────────────────────
      Δ, C1 :: C2 :: η, Ξ ⊢Σ,Λ M, l : D, (m, n)

  E-InvS
      M(l) = invokestatic M′    Σ(M′) = ∀i∈I. Ci −τi→ Di    k ∈ I
      Ξ′ = (Ξ − |η|) + |η|    Δ, Dk :: η, Ξ′ ⊢Σ,Λ M, suc_M(l) : D, (nk, n)
      ─────────────────────────────────────────────────────────────────────
      Δ, Ck :: η, Ξ ⊢Σ,Λ M, l : D, (mk, n)

  E-Vret
      M(l) = vreturn
      ──────────────────────────────────
      Δ, D, Ξ ⊢Σ,Λ M, l : D, (0, 0)

  E-Sub
      Δ′, η′, Ξ′ ⊢Σ,Λ ℓ : C, τ    Δ <: Δ′    η <: η′    C <: D    τ <: τ′    Ξ′ ⊆ Ξ
      ──────────────────────────────────────────────────────────────────────────────
      Δ, η, Ξ ⊢Σ,Λ ℓ : D, τ′

  E-Ax
      (Δ, η, Ξ, D, τ) ∈ Λ(l)
      ───────────────────────────────
      Δ, η, Ξ ⊢Σ,Λ M, l : D, τ

  E-Univ
      ∀O S. ⟦Δ, η, Ξ⟧(O,S) = False
      ──────────────────────────────────
      Δ, η, Ξ ⊢Σ,Λ M, l : D, (m, n)
Fig. 5. Type and effect system for block-booking
The final rule, E-Univ, allows us to associate an arbitrary effect and result type to a code segment under the condition that the constraints Δ, η, Ξ on the initial state are unsatisfiable. The main use of this rule is in cases where branch conditions render one branch dead code.

In order to prove the soundness of the type system in the extended program logic, we instantiate the parameter ACT to the type of finite words over the set {send} ∪ {auth(z) | z ≥ 0} and implement the transfer functions such that each execution of the primitives send and auth results in appending the appropriate action to the trace – in the case of authorisation events, the number z is obtained by inspecting the topmost value of the operand stack. We interpret a judgement Δ, η, Ξ ⊢Σ,Λ M, l : D, (m, n) as the logic statement

  Λ_M ⊢ {λ s0. True} M, l {⟦Δ, η, Ξ, m, n, D⟧} (⟦Δ, η, Ξ, m⟧),

with the following components. The postcondition ⟦Δ, η, Ξ, m, n, D⟧ is

  λ(s0, (O, S, h, X), (h′, v, Y)). ⟦Δ, η, Ξ⟧(O,S) → (∃Z. v ∈ D ∧ Y = XZ ∧ |Z|_auth + m ≥ |Z|_send + n).

For any terminating execution starting in an initial store and operand stack conforming to the abstractions Δ and η, and respecting the equivalence relation Ξ, this property guarantees that the return value satisfies D. Furthermore, the sub-traces for authorisation and send events (obtained by projecting from the trace Z of all events encountered during the execution of the phrase) satisfy the inequality interpreting the effect. A similar explanation holds for the definition of the invariant ⟦Δ, η, Ξ, m⟧,

  λ(s0, (O, S, h, X), (O′, S′, h′, X′)). ⟦Δ, η, Ξ⟧(O,S) → (∃Z. X′ = XZ ∧ |Z|_auth + m ≥ |Z|_send).

The local proof context Λ_M is given by

  [(M, l) → (True, ⟦Δ, η, Ξ, m, n, D⟧, ⟦Δ, η, Ξ, m⟧)]_{Λ(l)=(Δ,η,Ξ,D,(m,n))},

i.e. by translating the entries of Λ pointwise. Finally, each specification entry Σ(M) = ∀i∈I. Ci −(mi,ni)→ Di results in an entry M(M) = (R, T, Φ) in the bytecode logic specification table, where

  R(s0) = True
  T((S, h, X), (h′, v, Y)) = ∀i ∈ I. S(arg) ∈ Ci → (∃Z. v ∈ Di ∧ Y = XZ ∧ |Z|_auth + mi ≥ |Z|_send + ni)
  Φ((S, h, X), (O′, S′, h′, X′)) = ∀i ∈ I. S(arg) ∈ Ci → (∃Z. X′ = XZ ∧ |Z|_auth + mi ≥ |Z|_send)

where arg is the formal parameter. Based on this interpretation, certificate generation may now be obtained by deriving the typing rules from the program logic
and introducing appropriate notions of progressive derivations and well-typed programs (in the absence of virtual methods: without a behavioural subtyping condition), in a similar way as in Section 4. The formalisation of this is left as future research.
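The trace inequality that interprets an effect can be checked directly on a given event word Z. The following sketch (ours, with hypothetical event encodings) implements the postcondition reading of an effect (m, n), where |Z|_auth sums the authorised quantities and |Z|_send counts the send events; the invariant imposes the same inequality, without the summand n, on every prefix of the trace.

def respects_effect(Z, m: int, n: int) -> bool:
    # Z is a list of events: ("auth", k) for auth(k), or "send".
    authorised = sum(e[1] for e in Z if isinstance(e, tuple) and e[0] == "auth")
    sends = sum(1 for e in Z if e == "send")
    return authorised + m >= sends + n

assert respects_effect([("auth", 2), "send", "send"], m=0, n=0)
assert not respects_effect(["send"], m=0, n=0)   # unauthorised send
assert respects_effect(["send"], m=1, n=0)       # covered by the rely m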
5.3 Example
We assume two built-in integer-valued functions: size_string, yielding the number of SMS messages required to send a given string, and size_book, which gives the size of an address book. Figure 6 presents Java-style pseudocode for sending a given string to all addresses of a given address book after requiring the necessary permissions. The program first computes the total number of SMS messages and then sends the messages, where authorisations are acquired in blocks of size p, for arbitrary fixed p ≥ 0. The primitives for sending and authorising messages are modelled as additional (static) methods.

public interface Parameters {
  int p = ...; // some constant >= 0
}

class BlockBooking {
  static void send () {...};
  static void auth (int p) {...};

  void block_send(Java.lang.String s, addrbook b) {
    int n = size_string(s);
    int m = size_book(b);
    int nb_sms = n * m;
    int j = 0;
    int sent = 0;
    while (nb_sms - sent > 0) {
      if j > 0 { // current authorisations suffice
        send();
        sent = sent + 1;
        j = j - 1
      } else { // acquire p new authorisations
        auth (Parameters.p);
        j = Parameters.p;
      }
    }
    return 0;
  }
}

Fig. 6. Program for sending a message using authorisation chunks of size p
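For intuition, the loop of Figure 6 can be transliterated into Python and instrumented with the accounting that the resource manager performs implicitly; the assertion below expresses the block-booking policy that sends never outnumber authorisations. This is our illustration, not generated code; for p = 0 the original loop diverges without sending, which the policy permits, so the sketch requires p ≥ 1.

def block_send(nb_sms: int, p: int) -> None:
    assert p >= 1
    authorised = sent = j = 0
    while nb_sms - sent > 0:
        if j > 0:                              # current authorisations suffice
            sent, j = sent + 1, j - 1          # send()
            assert sent <= authorised          # the block-booking policy
        else:
            authorised, j = authorised + p, p  # auth(p)

block_send(nb_sms=7, p=3)  # authorises 9 in three chunks of 3, sends 7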
 0  aload 1        // variable s
 1  invokestatic sizestring
 4  istore 3       // variable n
 5  aload 2        // variable b
 6  invokestatic sizebook
 9  istore 4       // variable m
11  iload 3
12  iload 4
14  imul
15  istore 5       // variable nbsms
17  iconst 0
18  istore 6       // variable j
20  iconst 0
21  istore 7       // variable sent
Fig. 7. Bytecode for method BlockBooking.block_send
Figure 7 shows the bytecode for method block_send, which comprises six basic blocks. In order to verify that this method does not send more messages than authorised, we derive the typing

  [s → C, b → D], [ ], ∅ ⊢Σ,Λ block_send, 0 : {0}, (0, 0)

where C and D are arbitrary and

  Σ ≡ [sizestring → {C −(0,0)→ Z}, sizebook → {D −(0,0)→ Z}]
  Λ ≡ [23 → {spec_d | 0 ≤ d}]
  spec_d ≡ (Δd, [ ], Ξd, {0}, (d, 0))
  Δd ≡ [n → Z, m → Z, nbsms → Z, j → {d}, sent → Z≥0]
  Ξd ≡ {(n, n), (m, m), (nbsms, nbsms), (j, j), (sent, sent)}.
The proof context Λ contains a single entry, namely a polyvariant loop invariant for instruction 23. The invariant contains one entry for each 0 ≤ d, where the index specifies precisely the content of variable j and links this value to the pre-effect. The equivalence relation relevant at this program point contains merely the reflexive entries for all (integer) variables. The verification of the above judgement applies the rules syntax-directedly for instructions 0, . . . , 21, and then applies the axiom rule for label 23, guarded by an application of rule E-Sub.
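The branch refinement that the verification of the loop body exploits (described below) can be replayed in a few lines. This sketch (ours) intersects the invariant type {d} of variable j with Z \ {0}, as rule E-If does on the fall-through branch, and finds the refined type empty exactly when d = 0 — the case that rule E-Univ then discharges as dead code.

def refine_nonzero(ty: set) -> set:
    # Fall-through branch of ifz: intersect the type with Z \ {0}.
    return {v for v in ty if v != 0}

for d in range(4):
    j_type = {d}                       # from the loop invariant spec_d
    dead = refine_nonzero(j_type) == set()
    assert dead == (d == 0)            # only the case d = 0 is dead code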
The overall verification complements the verification of the above judgement with a justification of the context Λ, by providing a progressive derivation for the loop invariant. Again, this verification proceeds syntax-directedly through the loop, terminating in (subtyping-protected) applications of the rule E-Ax. At the point where method send is invoked (instruction label 36) a case split is performed on the condition d = 0. If this condition holds, a vacuous statement is obtained, as the invocation occurs in the branch j > 0 and our invariant ensures that j contains the value d. The vacuity is detected as the entry for j in Δ is ∅ at that point: the load instruction at label 36 inserts (0, j) into Ξ, hence the type associated with j in the fall-through hypothesis of the branch at label 33 (in particular: at label 36) is {d} ∩ (Z \ {0}) = ∅, where the term {d} was propagated unmodified to instruction 36 from instruction 23. Consequently, the case d = 0 may be immediately discharged by an invocation of rule E-Univ. The case d > 0 admits the application of the proof rule E-Send, and the remainder of the branch is again proven in a syntax-directed fashion.

Type checking and inference. Again, we briefly discuss these issues for this system. The type system is generic in that types may be arbitrary sets of integers. In order to support effective type checking and inference, one must of course restrict these sets themselves, and also the sets of types that arise in annotations and method specifications. A popular way, sufficient for our intended application, consists of restricting types to convex polyhedra specified by a system of linear inequalities, and of confining sets of types to those arising by intersecting a fixed convex polyhedron with a hyperplane specified by one or more additional parameters. Notice that the types in our running example are all of this form. When we make this restriction (formally, by applying the subtyping rule immediately after each rule to bring the types back into the polyhedral format), type checking amounts to checking inclusion of convex polyhedra, which can be performed efficiently by linear programming. Furthermore, Farkas' Lemma also furnishes short, efficiently computable, and efficiently checkable certificates [21,28]. Indeed, since any convex polyhedron is an intersection of half-spaces, deciding containment of convex polyhedra reduces to deciding whether a convex polyhedron H = {x | Ax ≤ b} is contained in a half-space of the form P = {x | cᵀx ≤ d}. This, however, is the case iff max{cᵀx | x ∈ H} ≤ d; a linear programming problem. Now, the latter inequality can be certified by providing a vector r ≥ 0 (componentwise) such that rᵀA = cᵀ and rᵀb ≤ d. For then, whenever x ∈ H, i.e., Ax ≤ b, we have cᵀx = rᵀAx ≤ rᵀb ≤ d. Farkas' Lemma asserts that such a vector r exists whenever max{cᵀx | x ∈ H} ≤ d. Given its existence, we can efficiently compute it by minimising yᵀb subject to yᵀA = cᵀ and y ≥ 0.

Regarding automatic type inference, as opposed to type checking, one has to find unknown convex polyhedra specified by fixpoint equations. Besson et al. [12] report that this can be done by iteration, using widening heuristics from [19]. The range and efficiency of these heuristics remain, however, unexplored in loc. cit. In our particular application we expect the constraints to be sufficiently simple that these heuristics
or those proposed in [26] will be successful. Inference of the equivalence relations Ξ can be achieved by employing standard copy-propagation techniques known from compiler construction.
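The certificate check described above amounts to a few lines of exact arithmetic. The following sketch (an illustration, not the certified checker of [12]) verifies a Farkas certificate r for the containment of H = {x | Ax ≤ b} in {x | cᵀx ≤ d} by checking r ≥ 0, rᵀA = cᵀ and rᵀb ≤ d.

from fractions import Fraction as F

def check_farkas(A, b, c, d, r) -> bool:
    # Exact rational arithmetic avoids spurious failures of the check.
    rows, cols = len(A), len(A[0])
    if len(r) != rows or any(ri < 0 for ri in r):
        return False
    rTA = [sum(r[i] * A[i][j] for i in range(rows)) for j in range(cols)]
    rTb = sum(r[i] * b[i] for i in range(rows))
    return rTA == list(c) and rTb <= d

# {x | -x <= 0 and x <= 2} is contained in {x | x <= 5}:
A = [[F(-1)], [F(1)]]
b = [F(0), F(2)]
c = [F(1)]
r = [F(0), F(1)]                 # r^T A = c^T and r^T b = 2 <= 5
assert check_farkas(A, b, c, F(5), r)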
6 Discussion
We have described the use of the Mobius base logic as a unified backend for both program analyses and type systems. The Mobius base logic has been formally proved sound with respect to the Bicolano formalisation of the JVM. Compared to direct soundness proofs of type systems and analyses with respect to Bicolano, the use of the Mobius base logic as an intermediary offers two distinctive advantages. First, the soundness proof of the Mobius base logic already does much of the work that is common to soundness proofs, in particular inducting on steps in the operational semantics and stack height. The Mobius logic is more transparent and allows for proof by invariant and recursion. Second, the standardised format of assertions in the Mobius base logic makes it easier to compare results of different type systems and analyses, and also to assess whether the asserted property coincides with the intuitively desired property.

The resource extension to both Bicolano and the Mobius base logic allows for direct specification and certification of resource-related intensional properties, without having to go through indirect observations such as values of ordinary program variables that are externally known to reflect some resource behaviour. This is particularly important in the PCC scenario, where providers and users of specifications and certificates do not coincide and might have different objectives. Similarly, the strong invariants enhance the expressive power of the Mobius base logic compared to standard Hoare logics, in that the resource behaviour of non-terminating programs is appropriately accounted for. In this way, the usual strong guarantees of type systems and program analyses may be adequately reflected in the logic. We have demonstrated this use of the Mobius base logic on one of the Mobius case studies: a block-booking scheme whose deployment could avoid the inflation of permission requests that leads to social vulnerabilities.

Acknowledgements. This work was funded in part by the Information Society Technologies programme of the European Commission, Future and Emerging Technologies under the IST-2005-015905 MOBIUS project. This paper reflects only the authors' views and the Community is not liable for any use that may be made of the information contained therein. We are grateful to all members of the MOBIUS Working Group on work package 3, in particular Benjamin Grégoire, David Pichardie, Aleksy Schubert and Randy Pollack, for the numerous discussions on program logics, JML, and types, and on formalising these in theorem provers. The constructive feedback from the reviewers helped us to improve the content and presentation of the paper.
References

1. Albert, E., Puebla, G., Hermenegildo, M.V.: Abstraction-carrying code. In: Baader, F., Voronkov, A. (eds.) LPAR 2004. LNCS, vol. 3452, pp. 380–397. Springer, Heidelberg (2005)
2. Appel, A.W.: Foundational proof-carrying code. In: Halpern, J. (ed.) Logic in Computer Science, p. 247. IEEE Press, Los Alamitos (invited talk, 2001)
3. Aspinall, D., Beringer, L., Hofmann, M., Loidl, H.-W., Momigliano, A.: A program logic for resource verification. In: Slind, K., Bunker, A., Gopalakrishnan, G.C. (eds.) TPHOLs 2004. LNCS, vol. 3223, pp. 34–49. Springer, Heidelberg (2004)
4. Bannwart, F.Y., Müller, P.: A program logic for bytecode. In: Spoto, F. (ed.) Bytecode Semantics, Verification, Analysis and Transformation. Electronic Notes in Theoretical Computer Science, vol. 141, pp. 255–273. Elsevier, Amsterdam (2005)
5. Barthe, G., Fournet, C. (eds.): TGC 2007 and FODO 2008. LNCS, vol. 4912. Springer, Heidelberg (2008)
6. Beckert, B., Hähnle, R., Schmitt, P.H. (eds.): Verification of Object-Oriented Software. LNCS, vol. 4334. Springer, Heidelberg (2007)
7. Beckert, B., Mostowski, W.: A program logic for handling Java Card's transaction mechanism. In: Pezzé, M. (ed.) FASE 2003. LNCS, vol. 2621, pp. 246–260. Springer, Heidelberg (2003)
8. Beringer, L., Hofmann, M.O.: A bytecode logic for JML and types. In: Kobayashi, N. (ed.) APLAS 2006. LNCS, vol. 4279, pp. 389–405. Springer, Heidelberg (2006)
9. Beringer, L., Hofmann, M.: Secure information flow and program logics. In: IEEE Computer Security Foundations Workshop. IEEE Press, Los Alamitos (2007)
10. Beringer, L., Hofmann, M., Momigliano, A., Shkaravska, O.: Automatic certification of heap consumption. In: Baader, F., Voronkov, A. (eds.) LPAR 2004. LNCS, vol. 3452, pp. 347–362. Springer, Heidelberg (2005)
11. Besson, F., Jensen, T., Pichardie, D.: Proof-carrying code from certified abstract interpretation and fixpoint compression. Theoretical Computer Science (2006)
12. Besson, F., Jensen, T., Pichardie, D., Turpin, T.: Result certification for relational program analysis. Inria Research Report 6333 (2007)
13. Cachera, D., Jensen, T.P., Pichardie, D., Schneider, G.: Certified memory usage analysis. In: Fitzgerald, J.S., Hayes, I.J., Tarlecki, A. (eds.) FM 2005. LNCS, vol. 3582, pp. 91–106. Springer, Heidelberg (2005)
14. Chang, B., Chlipala, A., Necula, G.: A framework for certified program analysis and its applications to mobile-code safety. In: Emerson, E.A., Namjoshi, K.S. (eds.) VMCAI 2006. LNCS, vol. 3855, pp. 174–189. Springer, Heidelberg (2005)
15. Chang, B., Chlipala, A., Necula, G., Schneck, R.: The open verifier framework for foundational verifiers. In: Morrisett, J., Fähndrich, M. (eds.) Proceedings of TLDI 2005: 2005 ACM SIGPLAN International Workshop on Types in Languages Design and Implementation, pp. 1–12. ACM Press, New York (2005)
16. MOBIUS Consortium. Deliverable 1.1: Resource and information flow security requirements (2006), http://mobius.inria.fr
17. MOBIUS Consortium. Deliverable 3.1: Bytecode specification language and program logic (2006), http://mobius.inria.fr
18. Cousot, P., Cousot, R.: Automatic synthesis of optimal invariant assertions: mathematical foundations. In: ACM Symposium on Artificial Intelligence & Programming Languages, Rochester, NY; ACM SIGPLAN Not. 12(8), 1–12 (1977)
19. Cousot, P., Halbwachs, N.: Automatic discovery of linear restraints among variables of a program. In: Conference Record of the Fifth ACM Symposium on Principles of Programming Languages, pp. 84–97 (1978)
20. Czarnik, P., Schubert, A.: Extending operational semantics of the Java bytecode. In: Barthe, Fournet [5], pp. 57–72
21. Detlefs, D., Nelson, G., Saxe, J.B.: Simplify: a theorem prover for program checking. Journal of the ACM 52(3), 365–473 (2005)
22. Feng, X., Ni, Z., Shao, Z., Guo, Y.: An open framework for foundational proof-carrying code. In: Proc. 2007 ACM SIGPLAN International Workshop on Types in Language Design and Implementation (TLDI 2007), pp. 67–78. ACM Press, New York (2007)
23. Hähnle, R., Pan, J., Rümmer, P., Walter, D.: Integration of a security type system into a program logic. In: Montanari, U., Sannella, D., Bruni, R. (eds.) TGC 2006. LNCS, vol. 4661, pp. 116–131. Springer, Heidelberg (2007)
24. Hofmann, M., Pavlova, M.: Elimination of ghost variables in program logics. In: Barthe, Fournet [5], pp. 1–20
25. Kleymann, T.: Hoare Logic and VDM: Machine-Checked Soundness and Completeness Proofs. PhD thesis, LFCS, University of Edinburgh (1998)
26. Müller-Olm, M., Seidl, H.: Precise interprocedural analysis through linear algebra. In: Proc. ACM POPL 2004, pp. 330–341 (2004)
27. Necula, G.C.: Proof-carrying code. In: Principles of Programming Languages, pp. 106–119. ACM Press, New York (1997)
28. Nelson, G.: Techniques for program verification. Technical Report CSL-81-10, Xerox PARC Computer Science Laboratory (June 1981)
29. Nipkow, T.: Hoare logics for recursive procedures and unbounded nondeterminism. In: Bradfield, J. (ed.) CSL 2002 and EACSL 2002. LNCS, vol. 2471, pp. 103–119. Springer, Heidelberg (2002)
30. Pichardie, D.: Bicolano – Byte Code Language in Coq. Summary appears in [7] (2006), http://mobius.inria.fr/bicolano
31. Quigley, C.L.: A Programming Logic for Java Bytecode Programs. In: Basin, D.A., Wolff, B. (eds.) TPHOLs 2003. LNCS, vol. 2758, pp. 41–54. Springer, Heidelberg (2003)
32. Wildmoser, M.: Verified Proof Carrying Code. PhD thesis, Institut für Informatik, Technische Universität München (2005)
33. Wildmoser, M., Nipkow, T., Klein, G., Nanz, S.: Prototyping proof carrying code. In: Lévy, J.-J., Mayr, E.W., Mitchell, J.C. (eds.) Theoretical Computer Science, pp. 333–347. Kluwer Academic Publishing, Dordrecht (2004)
34. Woo, T.Y., Lam, S.S.: A semantic model for authentication protocols. In: RSP: IEEE Computer Society Symposium on Research in Security and Privacy (1993)
Safety Guarantees from Explicit Resource Management

David Aspinall, Patrick Maier, and Ian Stark

Laboratory for Foundations of Computer Science
School of Informatics, The University of Edinburgh, Scotland
{David.Aspinall,Patrick.Maier,Ian.Stark}@ed.ac.uk
Abstract. We present a language and a program analysis that certifies the safe use of flexible resource management idioms, in particular advance reservation or “block booking” of costly resources. This builds on previous work with resource managers that carry out runtime safety checks, by showing how to assist these with compile-time checks. We give a small ANF-style language with explicit resource managers, and introduce a type and effect system that captures their runtime behaviour. In this setting, we identify a notion of dynamic safety for running code, and show that dynamically safe code may be executed without runtime checks. We show a similar static safety property for type-safe code, and prove that static safety implies dynamic safety. The consequence is that typechecked code can be executed without runtime instrumentation, and is guaranteed to make only appropriate use of resources.
1 Introduction

Safe management of resources is a crucial aspect of software correctness. Bad resource management impacts reliability and security. The more expensive a resource or the more complex its usage pattern, the more important is good management. For example, a media player could crash badly, leaving the hardware in a messy state, if its memory management was governed by the overly optimistic assumption that every request for memory will succeed. Malware on a mobile phone can defraud an unaware user by maliciously sending text messages to premium rate numbers, if there is no effective management of network access [12]. On current mobile platforms such as Java MIDP 2.0, management of network access is commonly left to the user, but users can easily be deceived by social engineering attacks.

Unfortunately, current programming languages do not provide special mechanisms for resource management. Therefore, programmers can only hope that their applications are resource safe, or use necessarily imprecise analyses to try to show this. For example, there are type systems that over-approximate (hopefully tightly) the memory requirements of an application [6], and static analyses that over-approximate the number of text messages being sent by an application [7]. These approaches may fail if a dynamic set of resources must be managed, as with bulk messaging, where the user wants to send a text message to a number of recipients selected from an address book. Because of the cost of sending text messages, the user must authorise each recipient (i. e., their phone number) explicitly. This could happen individually, just before each message is being sent, or collectively, before sending the first message.
send_bulk :: λ
  let (r) = res_from_nums (nums) in
  let (m) = init () in
  let (m',r') = enable (m,r) in
  let (n) = size (r') in
  if n then let () = consume (m') in ret ()
  else let (m'') = send_msgs (msg,nums,m') in
       let (m''') = assertEmpty (m'') in
       let () = consume (m''') in ret ()
  : (msg:str, nums:str[]) → ()

res_from_nums :: λ
  let (i) = length (nums) in
  let (r) = empty () in
  let (r') = res_from_nums' (nums,r,i) in
  ret (r')
  : (nums:str[]) → (r':res{})

res_from_nums' :: λ
  if i then let (i') = sub (i,1) in
       let (num) = read (nums,i') in
       let (c) = fromstr (num) in
       let (r_c) = single (c,1) in
       let (r'') = sum (r, r_c) in
       let (r') = res_from_nums' (nums,r'',i') in
       ret (r')
  else let (r') = id (r) in ret (r')
  : (nums:str[], r:res{}, i:int) → (r':res{})

send_msgs :: λ
  let (i) = length (nums) in
  let (m') = send_msgs' (msg,nums,m,i) in
  ret (m')
  : (msg:str, nums:str[], m:mgr) → (m':mgr)

send_msgs' :: λ
  if i then let (i') = sub (i,1) in
       let (num) = read (nums,i') in
       let (m'') = send_msg (msg,num,m) in
       let (m') = send_msgs' (msg,nums,m'',i') in
       ret (m')
  else let (m') = id (m) in ret (m')
  : (msg:str, nums:str[], m:mgr, i:int) → (m':mgr)

send_msg :: λ
  let (c) = fromstr (num) in
  let (r) = single (c,1) in
  let (m',m_r) = split (m,r) in
  let (m_r') = assertAtLeast (m_r,r) in
  let () = prim_send_msg (msg,num) in
  let () = consume (m_r') in
  ret (m')
  : (msg:str, num:str, m:mgr) → (m':mgr)

prim_send_msg :: λ ... : (msg:str, num:str) → ()
Fig. 1. Bulk messaging application
Collective authorisation, or block booking of resources, is preferable but requires detailed resource management, keeping track of the (multi-)set of authorised resources – in this case the permitted phone numbers.

In this paper, we present a language-based mechanism that provides programmers with a safe way to control complex resource usage patterns using a notion of resource manager. Figure 1 shows the code of a bulk messaging application using resource managers in our intermediate-level functional programming language. The language and functions used will be explained in full detail in Section 2; for now, we just give an outline of operation. The function send_bulk calls send_msgs to send the message msg to the phone numbers stored in the array nums. Along with these two arguments, send_msgs takes a resource manager m' which encapsulates the resources that have been authorised (during the call to enable) to send the messages. For each phone number in nums, send_msgs calls the wrapper function send_msg, passing along a resource manager. Prior to calling the primitive send function prim_send_msg, the wrapper checks (using assertAtLeast) whether its input manager m contains the resource required to send a message to num; if the resource is not present, the program will abort with a runtime error; otherwise send_msg removes the resource from the manager (using split) and returns the modified manager as m'.

The bulk messaging application is (dynamically) resource safe by construction, as the resource managers will trap attempts to abuse resources. The resource manager
abstraction works in tandem with a static analysis, so that programs which can be proved resource safe statically can be treated more efficiently at runtime by removing the dynamic accounting code. In Section 3.2, we prove resource safety statically for the bulk messaging application.

Our contribution is two-fold. In Section 2, we develop a functional programming language for coding complex resource idioms, such as block booking resources in the bulk messaging application. The language is essentially a first-order functional language in administrative normal form (ANF) [10] with a novel type system serving two purposes. First, the type system names input and output parameters of functions and avoids shadowing of previously bound names, thus allowing functions to be viewed as relations (expressed by logical formulae) between their input and output parameters. Second, the language includes a special, linear type for resource managers, where linearity serves as a means of introducing stateful objects into an otherwise pure functional language. Resource managers track what resources a program is allowed to use, and the operational semantics causes the program to go wrong (i. e., abort with a runtime error) as soon as it attempts to abuse resources. This induces a notion of dynamic resource safety, which holds if a program never attempts to abuse resources, in which case accounting is not necessary. As our first result, we show that erasing resource managers does not alter the semantics of dynamically resource safe programs.

Decisions about which resources programs may use are typically guided by resource policies. From the point of view of a program, a policy is simply an oracle determining what resources to grant, and we abstract this as a non-deterministic operation on resource managers. This covers many concrete policy mechanisms, both static (e. g., Java-style policy files) and dynamic (e. g., user interaction); see [3] for more on the interaction of resource managers and policies.

In Section 3 we present our second contribution, an effect type system for deriving relational approximations of functions. These approximations are expressed as pairs of constraints in a first-order logic, specifying a pre- and postcondition (or rather, a state transforming action) of a given function, similar to Hoare type theory [11]; note that the use of logical formulae as effects is the rationale behind choosing a programming language where functions have named input and output parameters. Typability of functions in the effect type system induces a notion of static resource safety. As our second result, we prove a soundness theorem stating that static resource safety implies dynamic resource safety. As a corollary, we show that resource managers can always be erased from statically resource safe programs. Proofs have been omitted due to lack of space.
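As a preview of the multiset semantics that Section 2.2 assigns to the manager operations, the following prototype (ours, in Python; the real language is first-order ANF) models managers as multisets of resources. The phone number and the policy's chosen grant are hypothetical; failure of the assertion check plays the role of the error stack of Section 2.3.

from collections import Counter

def contains(big: Counter, small: Counter) -> bool:
    # Multiset inclusion: small is a sub-multiset of big.
    return all(big[k] >= n for k, n in small.items())

def init() -> Counter:
    return Counter()                       # empty manager

def enable(m: Counter, r: Counter, granted: Counter):
    # The policy grants `granted`, a sub-multiset of r; the paper models this
    # choice non-deterministically. Returns (m', r') with m' + r' = m + r.
    assert contains(r, granted)
    return m + granted, r - granted

def split(m: Counter, r: Counter):
    m2 = m & r                             # largest sub-multiset of r held by m
    return m - m2, m2                      # (m1', m2') with m1' + m2' = m

def assert_at_least(m: Counter, r: Counter) -> Counter:
    if not contains(m, r):
        raise RuntimeError("resource abuse trapped")
    return m

num = Counter({"+441234567890": 2})        # authorise two sends to one number
m, leftover = enable(init(), num, granted=num)
m, chunk = split(m, Counter({"+441234567890": 1}))
assert_at_least(chunk, Counter({"+441234567890": 1}))  # send is permitted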
2 A Programming Language for Resource Management

We introduce a simple programming language with built-in constructs for handling resource managers. The language is essentially a simply-typed first-order functional language in ANF [10], with the additional features that functions take and return tuples of values, function types name input and output arguments, scoping avoids shadowing, and the type of resource managers enforces a linearity restriction on its values. The first three of these features are related to giving the language a relational appeal: for the purpose of specifying and reasoning logically, functions ought to be viewed as relations between input and output parameters.
fundecl  ::= prodtype → prodtype | λ exp : prodtype → prodtype

exp      ::= if val then exp else exp
           | let (var,. . .,var) = fun (val,. . .,val) in exp
           | ret (var,. . .,var)

val      ::= const | var

prodtype ::= (var:type,. . .,var:type)

type     ::= datatype | mgr

datatype ::= unit | int | str | res | res{} | datatype[]

Fig. 2. BNF grammar
The fourth feature is a means of introducing state into a functional language. The choice of such a language has been inspired by Grail [2], another first-order functional language in ANF. Moreover, Appel [1] argues that ANF, the intermediate language used by many compilers for functional languages, and SSA, the intermediate representation used by most compilers for imperative languages, are essentially the same thing. Therefore, our language should capture the essence of first-order programming languages, whether functional or imperative.

2.1 Syntax and Static Semantics

Grammar. Figure 2 shows the grammar of the programming language. The nonterminals fun, var and const represent functions, variables and constants, respectively. A program Π is a partial function from fun to fundecl, i. e., Π maps functions to function declarations, which are either type declarations for built-in functions or λ-abstractions (with type annotations serving as variable binders). We use the notation Π(f) = [λ . . . ]σ → σ′ if we are only interested in the type of f, regardless of whether f is built-in or a λ-abstraction. By dom(Π), we denote the domain of Π. We denote the restriction of Π to the built-in functions by Π0, i. e., Π(f) is a λ-abstraction if and only if f ∈ dom(Π) \ dom(Π0). We assume that Π0 declares exactly the functions that are shown in Figure 4.

The grammar of expressions e ∈ exp and values v ∈ val is quite standard for a first-order functional language in ANF. Throughout, functions operate on tuples of values, which is reflected by the syntax for function call and return. The sets of free and bound (by the let-construct) variables of an expression e, denoted by free(e) and bound(e) respectively, are defined in the usual way.

Datatypes τ ∈ datatype comprise the unit type, integers, strings, resources, multisets of resources, and arrays. A type τ ∈ type is either a datatype or the special type of resource managers, denoted mgr. See Section 2.2 for the interpretations of types. A tuple (x1:τ1,. . .,xn:τn) ∈ prodtype is a product type if the variables x1, . . . , xn are pairwise distinct. Product types appear to associate types to variables, but they really associate variables and types to positions in tuples. A pair of product
types of the form (x1:τ1,. . .,xm:τm) → (x1′:τ1′,. . .,xn′:τn′) forms a function type if the variable sets {x1, . . . , xm} and {x1′, . . . , xn′} are disjoint. We call the product types to the left and right of the arrow argument type and return type, respectively. As an example, consider the type of the function send_msg from Figure 1. It states that send_msg takes two strings and a resource manager and returns a resource manager, while at the same time binding the names of the formal input parameters msg, num and m and announcing that the formal output parameter will be m'.

Static typing. A type environment Γ is a functional association list of type declarations of the form x:τ, where x is a variable and τ a type. Being functional implies that whenever Γ contains two type declarations x:τ and x:τ′ we must have τ = τ′. Therefore, Γ can be seen as a partial function mapping variables to types. By dom(Γ), we denote the domain of this partial function, and for x ∈ dom(Γ), we may write Γ(x) for the unique type which Γ associates to x. We write type environments as comma-separated lists, the empty list being denoted by ∅. The restriction Γ|X of Γ to a set of variables X is defined in the usual way and induces a partial order ⊑ on type environments, where Γ ⊑ Γ′ iff Γ′|dom(Γ) = Γ. We call a type environment Γ = x1:τ1, . . . , xn:τn linear if the variables x1, . . . , xn are pairwise distinct. Note that such a linear type environment Γ may be viewed as a product type σ = (x1:τ1,. . .,xn:τn), and vice versa. Occasionally, we will write Π(f) = [λ . . . ]Γ → Δ to emphasise that argument and return types of the function f are to be viewed as linear type environments.

Figure 3 shows the typing rules for the programming language. The judgement C; Γ ⊢ v : τ expresses that the value v has type τ in type environment Γ and context C, where a context is a set of variables (generally the set of variables occurring in some super-expression of v). Note that (T-const) restricts program constants to the unit value, integers and strings, which are the interpretations of the types unit, int and str, respectively (see Section 2.2). All other types are abstract in the sense that their values can only be accessed through built-in functions. The judgement C; Γ ⊢Π e : σ means that the expression e has product type σ in type environment Γ, context C and program Π. If the program is understood, we may write C; Γ ⊢ e : σ. There are three things worth noting about expression typing. First, although the type system is linear, weakening and contraction are available to all types but mgr, rendering mgr the sole linear type of the language. Second, the side condition of (T-let) ensures that let-bound variables do not shadow any variables in the context (which is generally a superset of the set of variables occurring in the let-expression). Third, the rule (T-ret) matches the variables in the return expression to the variables in the product type, thus enforcing that an expression uniformly uses the same variables to return its results (even though these return variables may be let-bound in different branches of the expression). Note that (T-ret) is the only rule to exploit type information about variables. Finally, the judgement Γ ⊢ e : σ (or Γ ⊢Π e : σ if we want to stress the program Π) means that e has product type σ in a linear type environment Γ. The judgement Π ⊢ f states that f is a well-typed λ-abstraction in program Π.
Note that the syntax of λ-abstractions does not appear to bind variables, yet it does bind the variables hidden in the argument type. Note also that the restriction on function types means that the return variables of the body of a λ-abstraction must be disjoint from its argument variables.
Typing of values C; Γ ⊢ v : τ

  (T-var)    C; x:τ ⊢ x : τ       if x ∈ C

  (T-const)  C; ∅ ⊢ d : τ         if d ∈ τ ∧ τ ∈ {unit, int, str}

Typing of expressions C; Γ ⊢ e : σ

             C; Γ ⊢ e : σ
  (T-weak)   ─────────────────────    if x ∈ C ∧ τ ≠ mgr
             C; Γ, x:τ ⊢ e : σ

             C; Γ, x:τ, x:τ ⊢ e : σ
  (T-contr)  ──────────────────────    if τ ≠ mgr
             C; Γ, x:τ ⊢ e : σ

             C; Γ ⊢ v : int    C; Γ′ ⊢ e1 : σ    C; Γ′ ⊢ e2 : σ
  (T-if)     ───────────────────────────────────────────────────
             C; Γ, Γ′ ⊢ if v then e1 else e2 : σ

Fig. 3. Typing rules (excerpt)
Finally, we call a program Π well-typed if Π ⊢ f for all f ∈ dom(Π) \ dom(Π0).

Lemma 1. Let e be an expression (referring to an implicit program Π), Γ a type environment and σ a product type.
1. If Γ ⊢ e : σ then free(e) ⊆ dom(Γ) and bound(e) ∩ dom(Γ) = ∅.
2. If Γ ⊢ e : σ and X ⊇ free(e) then Γ|X ⊢ e : σ.

2.2 Interpretation of Types and Effects of Built-in Functions

Constraints. To provide a formal semantics for the built-in functions, we introduce a many-sorted first-order language L with equality. Sorts of L are the datatypes of the programming language (note that this excludes the type mgr). Formulae of L are formed from atomic formulae using the usual Boolean connectives ¬, ∧, ∨, ⇒ and ⇔ (in decreasing order of precedence), and the quantifiers ∀x:τ and ∃x:τ, where x ∈ var is a variable and τ ∈ datatype a sort. Atomic formulae are the Boolean constants ⊤ and ⊥, or are constructed from terms using the binary equality predicate ≈ (which
is available for all sorts), the binary inequality predicate ≤ on sort int, or the binary inclusion predicate ⊆ on sort res{}. Terms are constructed from variables in var and the term constructors, which are introduced below, alongside the specific interpretations associated with the sorts.

Sort unit is interpreted by the one-element set {()}. Its only constant is (). There are no function symbols.

Sort int is interpreted by the integers with infinity. Constants are the integers plus ∞. Function symbols are the usual − : int → int and +, ·, /, % : int × int → int (where / and % denote integer division and remainder, respectively).

Sort str is interpreted by the set of strings (over some fixed but unspecified alphabet). Constants are all strings. The only function symbol is ++ : str × str → str (concatenation).

Sort res is interpreted by an arbitrary infinite set (whose elements are termed resources). There are no constants, and fromstr : str → res, an embedding of strings into resources, is the only function symbol.

Sort res{} is interpreted by multisets of resources. It features the constant ∅ (empty multiset) and the function symbols ∩, ∪, ⊎ : res{} × res{} → res{} (intersection, union and sum of multisets, respectively), | | : res{} → int (size of a multiset), count : res{} × res → int (counting the multiplicity of a resource in a multiset) and { : } : res × int → res{} (constructing a "singleton" multiset containing a given resource with a given multiplicity and nothing else).

Sort τ[] is interpreted by integer-indexed arrays of elements of sort τ, where an integer-indexed array is a function from an initial segment of the natural numbers to τ. This sort features the constant null (array of length 0) and the function symbols len : τ[] → int (length of an array), [ ] : τ[] × int → τ (reading at a given index) and [ := ] : τ[] × int × τ → τ[] (updating a given index with a given value). Note that the values of a[i] and a[i:=v] are generally unspecified if the index i is out of bounds (i. e., i < 0 or i ≥ len(a)). As an exception, for i = len(a), the array a[i:=v] properly extends a, i. e., len(a[i:=v]) = len(a) + 1. This models vectors that can grow in size.

Treating the type mgr as an alias for the sort res{}, type environments can be seen as associating sorts to variables. Given a type environment Γ and a constraint φ ∈ L, we write Γ ⊢ φ if φ is well-sorted w. r. t. Γ; note that this entails free(φ) ⊆ dom(Γ), where free(φ) is the set of free variables in φ.

Substitutions. A substitution μ maps variables x ∈ var to values μ(x) ∈ val (which are variables again or constants, not arbitrary terms). We denote the domain of a substitution μ by dom(μ). Given a type environment Γ, we write Γμ for the type environment that arises from substituting the variables in Γ according to μ. This is defined recursively: ∅μ = ∅, and (Γ, x:τ)μ equals Γμ, x:τ if x ∉ dom(μ), or Γμ, μ(x):τ if μ(x) ∈ var, or Γμ if μ(x) ∈ const. Note that Γμ need not be linear even if Γ is. Given a formula φ such that Γ ⊢ φ, we write φμ for the formula obtained by substituting the free variables of φ according to μ, avoiding capture. Note that Γ ⊢ φ implies Γμ ⊢ φμ.
Valuations. Let Γ be a type environment. A Γ-valuation α maps variables x ∈ dom(Γ) to elements α(x) in the interpretation of the sort Γ(x); we call α a valuation if we do not care about the particular type environment Γ. We denote the domain of α by dom(α). Note that dom(α) ⊆ dom(Γ) but not necessarily dom(α) = dom(Γ); we call α a maximal Γ-valuation if dom(α) = dom(Γ). Given a Γ-valuation α and a set of variables X, we denote the restriction of α to X by α|X; note that dom(α|X) = dom(α) ∩ X. Restriction induces a partial order ⊑ on Γ-valuations, where α ⊑ α′ iff α′|dom(α) = α. Given n pairwise distinct variables xi ∈ dom(Γ) and corresponding elements di in the interpretation of Γ(xi), we write α{x1 → d1, . . . , xn → dn} for the Γ-valuation α′ that maps the xi to di and all other x ∈ dom(α) to α(x). In the special case dom(α) = ∅, we may drop α and simply write {x1 → d1, . . . , xn → dn}.

Entailment. Let φ, ψ ∈ L be constraints such that Γ ⊢ φ and Γ ⊢ ψ. Given a Γ-valuation α with free(φ) ⊆ dom(α), we write α |= φ if α satisfies φ. We write |= φ if α |= φ for all Γ-valuations α with free(φ) ⊆ dom(α), and we write φ |= ψ if α |= φ implies α |= ψ for all Γ-valuations α with free(φ) ∪ free(ψ) ⊆ dom(α). Entailment induces a theory T = {φ | free(φ) = ∅ ∧ |= φ}, with respect to which entailment can be reduced to unsatisfiability. Note that unsatisfiability w. r. t. T is not even semi-decidable, as T contains Peano arithmetic. Thus for reasoning purposes, we will generally approximate T by weaker theories.

Effects. Let f be a built-in function with Π(f) = Γ → Δ (viewing the argument and return types of f as type environments Γ and Δ, respectively). An effect for f is a pair of constraints φ and ψ such that Γ ⊢ φ and Γ, Δ ⊢ ψ. (Note that Γ → Δ being a function type implies dom(Γ) ∩ dom(Δ) = ∅, hence Γ, Δ is a type environment.) We write φ → ψ to denote such an effect, and we call φ its precondition and ψ its action. An effect environment maps the built-in functions f ∈ dom(Π0) to effects for f. Figure 4 displays the effect environment Θ0, providing an axiomatic, relational semantics for all f ∈ dom(Π0). This semantics ties most built-in functions to corresponding logical operators in a straightforward way; note the non-trivial preconditions for division, reading and writing arrays, and constructing singleton multisets. The effects of the functions operating on resource managers warrant some explanation.

init returns an empty manager m′.

enable non-deterministically adds some sub-multiset of r to manager m, returning the result in manager m′; the complement of the added multiset is returned in r′. In an implementation [3] the multiset to be added to m would be chosen by some policy, perhaps involving security profiles or user input; we use non-determinism to abstractly model such policy mechanisms.

split splits the multiset held by manager m and distributes it to the managers m1′ and m2′ such that m2′ gets the largest possible sub-multiset of r.

join adds the multisets held by managers m1 and m2, returning their sum in m′.

consume is an explicit destructor for manager m and all its resources; the linear type system means that calls to consume are necessary even if m is known to be empty.

assertEmpty acts as identity on managers, but subject to the precondition that m is empty; it will be treated specially by the programming language semantics.
f              Π0(f)                                   Θ0(f)

id_τ           (x:τ) → (x′:τ)                          → x′ ≈ x
eq_τ           (x1:τ, x2:τ) → (i′:int)                 → i′ ≈ 1 ∧ x1 ≈ x2 ∨ i′ ≈ 0 ∧ x1 ≉ x2

add            (i1:int, i2:int) → (i′:int)             → i′ ≈ i1 + i2
sub            (i1:int, i2:int) → (i′:int)             → i′ ≈ i1 + (−i2)
mul            (i1:int, i2:int) → (i′:int)             → i′ ≈ i1 · i2
div            (i1:int, i2:int) → (i′:int)             i2 ≉ 0 → i′ ≈ i1 / i2
mod            (i1:int, i2:int) → (i′:int)             i2 ≉ 0 → i′ ≈ i1 % i2
leq            (i1:int, i2:int) → (i′:int)             → i′ ≈ 1 ∧ i1 ≤ i2 ∨ i′ ≈ 0 ∧ i1 ≰ i2

conc           (w1:str, w2:str) → (w′:str)             → w′ ≈ w1 ++ w2

fromstr        (w:str) → (c′:res)                      → c′ ≈ fromstr(w)

null_τ         () → (a′:τ[])                           → a′ ≈ null
length_τ       (a:τ[]) → (i′:int)                      → i′ ≈ len(a)
read_τ         (a:τ[], i:int) → (x′:τ)                 0 ≤ i ∧ i < len(a) → x′ ≈ a[i]
write_τ        (a:τ[], i:int, x:τ) → (a′:τ[])          0 ≤ i ∧ i ≤ len(a) → a′ ≈ a[i:=x]

empty          () → (r′:res{})                         → r′ ≈ ∅
single         (c:res, i:int) → (r′:res{})             i ≥ 0 → r′ ≈ {c:i}
inter          (r1:res{}, r2:res{}) → (r′:res{})       → r′ ≈ r1 ∩ r2
union          (r1:res{}, r2:res{}) → (r′:res{})       → r′ ≈ r1 ∪ r2
sum            (r1:res{}, r2:res{}) → (r′:res{})       → r′ ≈ r1 ⊎ r2
size           (r:res{}) → (i′:int)                    → i′ ≈ |r|
count          (r:res{}, c:res) → (i′:int)             → i′ ≈ count(r, c)
include        (r1:res{}, r2:res{}) → (i′:int)         → i′ ≈ 1 ∧ r1 ⊆ r2 ∨ i′ ≈ 0 ∧ r1 ⊈ r2

init           () → (m′:mgr)                           → m′ ≈ ∅
enable         (m:mgr, r:res{}) → (m′:mgr, r′:res{})   → r′ ⊆ r ∧ m′ ⊎ r′ ≈ m ⊎ r
split          (m:mgr, r:res{}) → (m1′:mgr, m2′:mgr)   → m2′ ≈ m ∩ r ∧ m ≈ m1′ ⊎ m2′
join           (m1:mgr, m2:mgr) → (m′:mgr)             → m′ ≈ m1 ⊎ m2
consume        (m:mgr) → ()                            → ⊤
assertEmpty    (m:mgr) → (m′:mgr)                      m ≈ ∅ → m′ ≈ m
assertAtLeast  (m:mgr, r:res{}) → (m′:mgr)             r ⊆ m → m′ ≈ m
Fig. 4. Types and effects of built-in functions. The subscripts τ indicate families of functions indexed by τ ∈ datatype, except for idτ , which is indexed by τ ∈ type.
assertAtLeast acts as identity on managers, but subject to the precondition that the manager m contains the multiset r; it will be treated specially by the programming language semantics.

To facilitate the presentation of the programming language semantics, we capture the logical semantics of effects directly in terms of valuations. Given a built-in function f with Π0(f) = Γ → Δ and Θ0(f) = φ → ψ, we define Eff_Θ0^Π0(f) to be the set of maximal (Γ, Δ)-valuations such that α ∈ Eff_Θ0^Π0(f) if and only if α |= φ ∧ ψ.

2.3 Small-Step Reduction Semantics

We present a stack-based reduction semantics (which is essentially a continuation semantics) for our programming language. We will show that reduction preserves the
resources stored in resource managers, thanks to linearity. Throughout this section, let Π be a fixed well-typed program.

Stacks. We call a tuple ⟨x1, . . . , xn | α, e⟩ a frame if x1, . . . , xn is a list of pairwise distinct variables, α is a valuation and e is an expression such that
– dom(α) ∩ {x1, . . . , xn} = ∅ and
– dom(α) ⊆ free(e) ⊆ dom(α) ∪ {x1, . . . , xn}.
The roles of e (redex) and α (providing values for the free variables of e) should be clear. The xi are only present if the frame is suspended waiting for a function to return, in which case the xi act as slots for the return values. A pre-stack is either ⊥ or ε or F :: S, where F is a frame and S is a pre-stack. (Pre-stacks essentially correspond to continuations in an abstract machine interpreting λ-terms in ANF [10].) A stack (or Π-stack if we want to emphasise the program Π) is a pre-stack of the form ⊥ or ⟨| α, e⟩ :: S. We call ⊥ the error stack. A stack of the form ⟨| α, ret (x1,. . .,xn)⟩ :: ε is called terminal. If F :: S is a stack then F is its top frame.

Reduction. Figure 5 presents the rules generating the reduction relation ⇒Π on stacks. We denote the reflexive-transitive closure of ⇒Π by ⇒*Π. As usual, the subscript Π may be omitted if it is understood. Note that reduction performs an eager garbage collection in that it deallocates unused variables immediately by restricting the valuation α in the post stack to the free variables of the expression e. Reduction is deterministic, except for calls to the built-in function enable.

Proposition 2. For all stacks S0 there is at most one stack S1 such that S0 ⇒ S1, unless S0 is of the form ⟨| α, let (m′,r′) = enable (m,r) in e⟩ :: S0′.

Typed stacks. Reduction is untyped, since type information is not needed at runtime. However, various properties of reduction are best stated if the type of variables is known. Therefore, we annotate stacks with type environments and conservatively extend reduction to typed stacks. Given a frame ⟨x1, . . . , xn | α, e⟩, we call ⟨x1, . . . , xn | α, e⟩Γ a typed frame if Γ is a linear type environment such that
– dom(Γ) = dom(α) ∪ {x1, . . . , xn},
– α is a Γ-valuation, and
– Γ ⊢ e : σ for some product type σ.
A typed pre-stack is ⊥, or ε, or F :: ε where F is a typed frame, or F :: F′ :: S where S is a typed pre-stack and F = ⟨x1, . . . , xm | α, e⟩Γ and F′ = ⟨x1′, . . . , xn′ | α′, e′⟩Γ′ are typed frames such that Γ ⊢ e : (z1:Γ′(x1′),. . .,zn:Γ′(xn′)) for some variables z1, . . . , zn. A typed stack is a typed pre-stack of the form ⊥ or ⟨| α, e⟩Γ :: S. Given a typed frame F = ⟨x1, . . . , xn | α, e⟩Γ, we denote its underlying frame ⟨x1, . . . , xn | α, e⟩ by ⌊F⌋. We extend this notation to typed (pre-)stacks, writing ⌊S⌋ for the (pre-)stack underlying the typed (pre-)stack S. The following proposition shows that reduction does not break the invariants maintained by typed stacks.
62
(R-ret)
D. Aspinall, P. Maier, and I. Stark α = α {x1 → α(x1 ), . . . , xn → α(xn )} |α, ret (x1 ,. . .,xn ) :: x1 , . . . , xn |α , e :: S |α |free (e ) , e :: S
Π(f ) = λe : (z1 :τ1 ,. . .,zm :τm ) → σ α = {z1 → α(v1 ), . . . , zm → α(vm )} (R-lettl1 ) |α, let (x1 ,. . .,xn ) = f (v1 ,. . .,vm ) in ret (x1 ,. . .,xn ) :: S |α |free(e) , e :: S
(R-let1 )
Π(f ) = λe : (z1 :τ1 ,. . .,zm :τm ) → σ e = ret (x1 ,. . .,xn ) α = {z1 → α(v1 ), . . . , zm → α(vm )} |α, let (x1 ,. . .,xn ) = f (v1 ,. . .,vm ) in e :: S |α |free (e) , e :: x1 , . . . , xn |α|free (e ) , e :: S
Π0 (f ) = (z1 :τ1 ,. . .,zm :τm ) → (z1 :τ1 ,. . .,zn :τn ) 0 αf = {z1 → α(v1 ), . . . , zm → α(vm )} αf ∈ Eff Π αf αf Θ0 (f ) α = α{x1 → αf (z1 ), . . . , xn → αf (zn )} (R-let2 ) |α, let (x1 ,. . .,xn ) = f (v1 ,. . .,vm ) in e :: S |α |free(e ) , e :: S Π0 (f ) = (z1 :τ1 ,. . .,zm :τm ) → σ f ∈ {assertEmpty, assertAtLeast} 0 α = {z → α(v ), . . . , z → α(v )} ∀αf ∈ Eff Π 1 1 m m f Θ0 (f ) : αf αf (R-let 2) |α, let (x1 ,. . .,xn ) = f (v1 ,. . .,vm ) in e :: S (R-if1 )
α(v) =0 |α, if v then e1 else e2 :: S |α|free (e1 ) , e1 :: S
(R-if2 )
α(v) = 0 |α, if v then e1 else e2 :: S |α|free (e2 ) , e2 :: S
Fig. 5. Small-step reduction relation (for a fixed program Π). Application of valuations α extends to values v ∈ val in the natural way, i. e., α(v) = v if v is a constant.
Proposition 3. Let Sˆ0 be a typed stack and S1 a stack. If Sˆ0 S1 then there is a typed stack Sˆ1 such that Sˆ1 = S1 . The proposition justifies the view of reduction on typed stacks as a conservative extension of the reduction relation defined in Figure 5, where reduction on typed stacks is defined by Sˆ0 Π Sˆ1 if and only if Sˆ0 Π Sˆ1 ; as usual Π may be omitted if it is understood. We call a stack S0 stuck if there is no stack S1 such that S0 S1 , and S0 is neither terminal nor the error stack. Our next result shows that reduction on typed stacks will get stuck only at calls to built-in functions (other than assertEmpty and assertAtLeast), and only if the preconditions of these calls fail. As the effects listed in Figure 4 reveal, reduction will get stuck only upon attempts to divide by 0, access arrays out of bounds or construct singleton multisets with negative multiplicity. Proposition 4. Let Sˆ be a typed stack. If Sˆ is stuck then it is of the form |α, let (x1 ,. . .,xn ) = f (v1 ,. . .,vm ) in e :: S ,
Safety Guarantees from Explicit Resource Management
63
0 f ∈ dom(Π0 ) \ {assertEmpty, assertAtLeast}, and there is no αf ∈ Eff Π Θ0 (f ) such that αf αf , where αf is defined as in rule (R-let2 ).
Preservation of resources. Given a typed frameF = x1 , . . . , xn |α, eΓ , we define the multiset res(F ) of resources in F by res(F ) = {α(x) | x ∈ dom(α), Γ (x) = mgr}. We extend res to typed non-error stacks by defining res( ) = ∅ and res(F :: S) = res(F ) res(S). Proposition 5 states resource preservation: The sum of all resources in the system remains unchanged by reduction, unless the built-in functions enable and consume are called. The former admits increasing (but not decreasing) the resources, whereas the latter behaves the other way round. Obviously, resource preservation depends on the linearity restriction on type mgr, otherwise resources could be duplicated by re-using managers. Proposition 5. Let S0 and S1 be typed stacks such that S0 S1 = . 1. If S0 is of the form |α, let (m ,r ) = enable (m,r) in eΓ :: S0 then res(S0 ) ⊆ res(S1 ). 2. If S0 is of the form |α, let () = consume (m) in eΓ :: S0 then res(S0 ) ⊇ res(S1 ). 3. In all other cases, res(S0 ) = res(S1 ). 2.4 Erasing Resource Managers According to the reduction semantics, a call to assertEmpty or assertAtLeast either does nothing1 or goes wrong, and calling one of these two tests is the only way to go wrong. Hence, if we know that a program cannot go wrong (and Section 3 will present a type system for proving just that) then we can erase all calls to these built-ins (or rather, replace them by true no-ops) and obtain an equivalent program. In fact, we can do more than that. Once the assertion built-ins are gone, it is even possible to remove the resource managers themselves. By the design of the programming language (in particular, the choice of built-in operations on resource managers) the contents of resource managers cannot influence the values of variables of any other type. Informally, this justifies replacing the resource managers themselves by variables of type unit whenever we know that a program cannot go wrong. Erasing resource managers also means that the built-in functions acting on managers can be replaced by simpler ones on unit: all of which are no-ops, except for enable itself.2 The remainder of the section formalises this intuition. Figure 6 shows the necessary program transformations to erase resource managers. Most fundamentally, erasure maps the manager type mgr to the unit type unit. Erasure on types determines erasure on product types, type environments, programs and valuations (where erasure uniformly maps the values of mgr-variables to , the only value of type unit), which in turn determines erasure on typed stacks. As outlined 1
2
Due to the linearity restriction on resource managers these functions must copy the input manager to an output manager; a true no-op would violate resource preservation. We do keep the calls in place, so that erasure preserves the structure of programs; this simplifies reasoning, and does not preclude optimising away no-op calls at a later stage.
64
D. Aspinall, P. Maier, and I. Stark
Erasure τ ◦ of types τ τ ◦ = unit if τ = mgr τ◦ = τ otherwise
Erasure Γ ◦ of type environments Γ ∅◦ = ∅ (Γ, x:τ )◦ = Γ ◦ , x:τ ◦
above, erasure on effect environments trivialises the effect of resource manager builtins, except enable, and preserves the effects of all built-ins not operating on managers. The effect of enable after erasure is to non-deterministically choose a sub-multiset of r and return its complement in r . This reflects the fact that calls to enable provide points of interaction for the policy (e. g., the user) to decide how many resources the system is granted. Erasing resource managers does not mean that policy decisions are fixed, it just removes the managers’ book keeping about those decisions. Lemma 6. Let Π be a well-typed program and S a typed Π-stack. Then Π ◦ is a welltyped program and S ◦ a typed Π ◦ -stack. Erasure makes trivial the effects of assertEmpty and assertAtLeast, and in particular, replaces their precondition by . Thus a program cannot go wrong after erasure, as rule (R-let2 ) will never apply. Proposition 7. Let Π be a well-typed program and S a Π ◦ -stack S. Then S ∗Π ◦ . The next result states that the small-step reduction relation Π of a program Π is almost bisimulation equivalent to the reduction relation Π ◦ of its erasure. In fact,
Safety Guarantees from Explicit Resource Management
65
it shows that the relation R = {S, S ◦ | S is a Π-stack} would be a bisimulation if Π could not reduce stacks to the error stack . Put differently, if Π cannot go wrong then Π and Π ◦ are bisimulation equivalent. The proof of this theorem is by case analysis on the reduction relation Π of the unerased program. As a corollary, we get that reachability in the erased program is essentially the same as reachability in the unerased one, provided that the unerased program cannot go wrong. Theorem 8. Let Π be a well-typed program and Sˆ0 a typed Π-stack with Sˆ0 Π . 1. For all typed Π-stacks Sˆ1 , if Sˆ0 Π Sˆ1 then Sˆ0◦ Π ◦ Sˆ1◦ . 2. For all typed Π ◦ -stacks S1 , if Sˆ0◦ Π ◦ S1 then there is a typed Π-stack Sˆ1 such that Sˆ0 Π Sˆ1 and Sˆ1◦ = S1 . Corollary 9. Let Π be a well-typed program and S0 a typed Π-stack. If S0 ∗Π then {S ◦ | S0 ∗Π S} = {S | S0◦ ∗Π ◦ S}. What distinguishes erasure of resource managers from other erasure results (e. g., type erasure during compilation, Java generics erasure) is that here, erasure does not completely remove a language construct. Instead, it removes the book keeping but retains the semantically important bit that deals with dynamic policy decisions. 2.5 Big-Step Relational Semantics The reduction semantics presented in Section 2.3 is good for showing preservation properties, like the preservation of resources. However, it does not easily yield a relational view on functions, relating input and output parameters. This is achieved by a relational semantics, which we will prove equivalent to the reduction semantics. Contrary to the reduction semantics, which was originally untyped and had type environments added conservatively, the relational semantics will be typed from the start. (Types do not hurt here, as the relational semantics is not geared towards execution.) Throughout this section, we assume that Π is a well-typed program. A state β is either the error state or a normal state Γ ; α, where Γ is a linear type environment and α a maximal Γ -valuation. Given an expression e, a normal state Γ ; α and a state β , we define the judgement e, Γ ; α ⇓Π β (or e, Γ ; α ⇓ β if Π is understood) by the rules in Figure 7 if dom(Γ ) ∩ bound(e) = ∅ and there are Γe and σ such that Γ Γe and Γe e : σ. The intended meaning of e, Γ ; α ⇓ β is that evaluating expression e in state Γ ; α may terminate and result in state β . The reduction semantics deallocates variables once they become unused (an eager garbage collection, so to say), which is essential for the linear variables as otherwise resource preservation would not hold. However, the intermediate values of variables are thus lost. In contrast, the relational semantics names and records all intermediate values, even the linear ones, as e, Γ ; α ⇓ Γ ; α implies Γ Γ and α α. By definition, violations of resource safety manifest themselves in reductions ending in the error stack, and hence reductions which diverge or get stuck cannot violate resource safety. Therefore, resource safety is not affected by the fact that the relational semantics ignores such reductions. Under this proviso, Proposition 10 shows the equivalence of reduction and relational semantics.
e1 , Γ ; α ⇓ β if α(v) =0 if v then e1 else e2 , Γ ; α ⇓ β
(E-if2 )
e2 , Γ ; α ⇓ β if α(v) = 0 if v then e1 else e2 , Γ ; α ⇓ β Fig. 7. Big-step evaluation relation (for a fixed program Π)
Proposition 10. Let Γ ; α and Γ ; α be states. Let e be an expression such that dom(Γ ) = free(e) and Γ e : σ for some product type σ. Then 1. e, Γ ; α ⇓ if and only if |α, eΓ :: ∗ , and 2. e, Γ ; α ⇓ Γ ; α if and only if there is a typed stack |α , ret (x1 ,. . .,xn )Γ :: such that |α, eΓ :: ∗ |α , ret (x1 ,. . .,xn )Γ :: and Γ Γ and α α .
3 Effect Type System In this section, we will develop a type system to statically guarantee dynamic resource safety, i. e., the absence of reductions to the error stack . We will do so by annotating functions with effects and then extending the notion of effect to a judgement on expressions, which we will define by a simple set of typing rules.
Safety Guarantees from Explicit Resource Management
67
3.1 Effect Type System We extend the notion of effect φ → ψ from built-in functions to λ-abstractions. To be precise, φ → ψ is an effect for f if Γ φ and Γ, Δ ψ, where Π(f ) = [λ . . . ]Γ → Δ, regardless of whether f is built-in or a λ-abstraction. In line with this extension, an effect environment Θ maps all functions f ∈ dom(Π) to effects Θ(f ) for f . In order to derive the effects of λ-abstractions, we generalise effects to effect types for expressions and develop a type system for inductively constructing such effect types. Effects relate input and output parameters of functions by logical formulae. Likewise, effect types shall relate input and output parameters of expressions. Here, the input parameters of an expression are its free variables; the output parameters are those variables that are not free yet but will become free during reduction, i. e., the (let-)bound variables. Formally, an effect type Γ ; φ → Δ; ψ is a pair of constraints φ and ψ together with a pair of type environments Γ and Δ such that dom(Γ ) ∩ dom(Δ) = ∅ and Γ φ and Γ, Δ ψ. We call φ and ψ precondition and action, and Γ and Δ input and output (parameters), respectively. Given an expression e, we say that an effect type Γ ; φ→Δ; ψ is an effect type for e if dom(Γ ) ∩ bound (e) = ∅. We say that an effect type Γ ; φ→Δ; ψ is stronger than an effect type Γ ; φ →Δ ; ψ , denoted by Γ ; φ → Δ; ψ ⊇ Γ ; φ → Δ ; ψ , if φ |= φ and (φ ∧ ψ) |= ψ , i. e., the stronger effect type Γ ; φ → Δ; ψ has a weaker precondition but stronger action. The stronger-than relation ⊇ is a quasi-order, i. e., reflexive and transitive, and induces an equivalence relation on effect types, the as-strong-as relation, which we denote by ≡. Note that for every effect type Γ ; φ → Δ; ψ is as strong as an effect type Γ ; φ → Δ ; ψ with linear type environments Γ and Δ . Figure 8 presents the typing rules for deriving effect types. There, the judgement Θ Π e : Γ ; φ → Δ; ψ states that expression e has effect type Γ ; φ → Δ; ψ in the context of program Π and effect environment Θ. If Π is understood, we may omit it and write Θ e : Γ ; φ → Δ; ψ instead. The judgement Π, Θ f means that the effect type ascribed to a λ-abstraction f by Θ and Π is consistent with the effect type derived for the body of f . We say that Θ is an admissible effect environment for a program Π if Π, Θ f for all λ-abstractions f ∈ dom(Π) \ dom(Π0 ). Lemma 11. Let e be an expression, Θ an effect environment (referring to an implicit program Π) and Γ ; φ → Δ; ψ an effect type. If Θ e : Γ ; φ → Δ; ψ then Γ ; φ → Δ; ψ is an effect type for e. Theorem 12 states soundness of effect typing w. r. t. the big-step relational semantics. The proof is by double induction on the derivation of relational semantics judgements over the derivation of effect type judgements. As a corollary, we get that reduction starting from a state that satisfies the precondition can’t go wrong, hence resource managers can be erased. In fact, the untyped reductions in the erased program match exactly the typed reductions in the original program. Theorem 12. Let Θ be an admissible effect environment for a well-typed program Π. Let e be an expression and Γ ; φ → Δ; ψ an effect type such that Θ e : Γ ; φ → Δ; ψ. Let Γ ; α and β be states such that e, Γ ; α ⇓ β (which implies Γe e : σ for some Γe , σ). If α |= φ then β = Γ ; α for some Γ and α such that α |= φ ∧ ψ. (In particular, if α |= φ then β = .)
Fig. 8. Typing rules for effect types (for a fixed program Π)
Corollary 13. Let Θ be an admissible effect environment for a well-typed program Π. Let e be an expression and Γ ; φ → Δ; ψ an effect type such that Θ Π e : Γ ; φ → Δ; ψ. Let α be a maximal Γ -valuation, and let Sˆ0 = |α|free(e) , eΓ |free(e) :: be a typed Π-stack (which implies Γ |free(e) Π e : σ for some σ). If α |= φ then 1. Sˆ0 ∗Π and 2. for all (untyped) Π ◦ -stacks S, Sˆ0◦ ∗Π ◦ S if and only if there is a typed Π-stack Sˆ such that Sˆ0 ∗Π Sˆ and Sˆ◦ = S. (In particular, Sˆ0◦ ∗Π ◦ .) 3.2 Example: Bulk Messaging Application To illustrate the use of the effect type system, we revisit the example from Figure 1. The interesting bits of code are in the functions send bulk and send msg. The function send bulk first builds up a multiset of resources r by converting the strings representing phone numbers in nums into resources. Next it attempts to authorise the use of all resources by having enable add r to an empty resource manager m. If this fails, i. e., the multiset r’ returned by enable is of non-zero size, send bulk terminates (after destroying m’ and whatever resources it holds).3 If authorising all resources succeeds, send bulk calls send msgs to actually send the messages while checking 3
A more sophisticated version of the application could deal more gracefully with enable granting only part of the requested resources. This would require more complex code to inspect the multisets r and r’ (but not the resource manager m’).
Safety Guarantees from Explicit Resource Management
f
69
Θ(f )
send bulk → res from nums → r ≈ bagof (map fromstr (nums)) 0 ≤ i ≤ len(nums) ∧ r’ ≈ bagof (map fromstr (subarray(nums, i, len(nums)))) res from nums’ → r ≈ bagof (map fromstr (nums)) send msgs bagof (map fromstr (nums)) ⊆ m → m ≈ m’ bagof (map fromstr (nums)) 0 ≤ i ≤ len(nums) ∧ bagof (map fromstr (subarray(nums, 0, i))) ⊆ m send msgs’ → m ≈ m’ bagof (map fromstr (subarray(nums, 0, i))) send msg count(m, fromstr (num)) ≥ 1 → m ≈ m’ {|fromstr (num):1|} prim send msg → ∀a : len(map fromstr (a)) ≈ len(a) ∀a∀i : 0 ≤ i < len(a) ⇒ map fromstr (a)[i] ≈ fromstr (a[i]) ∀a∀j∀k : 0 ≤ j ≤ k ≤ len(a) ⇒ len(subarray(a, j, k)) = k + (−j) ∀a∀j∀k∀i : 0 ≤ j ≤ k ≤ len(a) ∧ 0 ≤ i < len(subarray(a, j, k)) ⇒ subarray(a, j, k)[i] = a[j + i] ∀a : |bagof (a)| ≈ len(a) ∀a : len(a) ≈ 1 ⇒ bagof (a) ≈ {a[0]:1} ∀a∀k : 0 ≤ k ≤ len(a) ⇒ bagof (a) ≈ bagof (subarray(a, 0, k)) bagof (subarray(a, k, len(a)))
Fig. 9. Bulk messaging application: admissible effect environment Θ and axiomatisation of theory extension; for the sake of readability sort information is suppressed in the axioms
that the manager m’ contains the required resources. After that, send bulk checks that send msgs has used up all resources by asserting that the returned manager m” is empty; failing this assertion will trigger a runtime error. Finally, send bulk explicitly destroys the empty manager m’” and terminates. The function send msg sends one message, checking whether the resource manager m holds the resource required. It does so by converting the string num into a singleton multiset of resources r. Then it splits the manager m into m’ and m r, so that m r contains at most the resources in r. Next, send msg asserts that m r contains at least r; failing this assertion will trigger a runtime error. Succeeding the assertion, send msg calls the primitive send function, destroys the now used resource by consuming m r’, and returns the remaining resources in the manager m’. The bulk messaging example is statically resource safe, as witnessed by the admissible effect environment displayed in Figure 9. Of particular interest is the effect → ascribed to the main function send bulk. This least informative effect expresses nothing about the function itself but implies the absence of runtime errors via Corollary 13. The effects require an extension of the theory T (see Section 2.2) by three new functions, axiomatised in Figure 9. The function map maps an array of strings to an array of resources, subarray takes an array and cuts out the sub-array between two given indices, and bagof converts an array of resources to a multiset (containing the same elements with the same multiplicity). Note that the axiomatisation of bagof is not complete4 but sufficient for our purposes. Effect type checking, e. g., for checking admissibility of the effect environment Θ from Figure 9, requires checking the side condition of the weakening rule (ET-weak), 4
A complete axiomatisation of bagof is possible in the full first-order theory of multisets and arrays but it is much more complicated and unusable in practise.
70
D. Aspinall, P. Maier, and I. Stark
which involves checking logical entailment w. r. t. to an extension of the theory T . Due to the high undecidability of T , we actually check entailment w. r. t. (an extension of) an approximation of T ; in particular, we approximate multiplication and division by uninterpreted functions. For the bulk messaging example, we used an SMT solver [4] that can handle linear integer arithmetic and arrays. We added axioms for multisets and the axioms in Figure 9. Due to an incomplete quantifier instantiation heuristic, we had to instantiate a number of these axioms by hand, yet eventually, the solver was able to prove all the entailments required by the weakening rules. Even though arising from a single example, we believe that the extension of the theories of multisets and arrays with the functions subarray and bagof is quite generic and could prove useful in many cases.
4 Conclusion We have presented a programming language with support for complex resource management, close to the standard SSA/ANF forms of compiler intermediate languages [1]. By construction, programs are dynamically resource safe in that any attempts to abuse resources are trapped. We have extended the language with an effect type system which guarantees the for well-typed programs no such attempts occur: we have static resource safety. In addition, for such programs the bookkeeping required by dynamic resource management can be erased. Related Work. Many tools and methods have been proposed to assist with resource management at runtime, e.g., in Java, the JRes [9] and J-Seal [8] frameworks. Generally, these aim to enable programs to react to fluctuations of resources caused by an unpredictable environment. Our aim, however is to track the flow of resources through the program, where the environment can influence the availability of resources only at well-understood points of interaction with the program and with clear availability policies. This offers the chance for more precise resource control whose behaviour can be predicted statically. This paper builds on previous work [3] with a Java library implementing resource managers and focusing on the dynamic aspects of resource management policies. This Java library supports essentially the same operations on resource managers as our functional language, except that state is realised by destructive updates instead of linear types. While [3] does not provide a static analysis to prove static resource safety, it does outline how dynamic accounting could be erased if static resource safety were provable. Our work here shows one way to do just that. Our approach is in line with a general trend of providing the programmer with language-based mechanisms for security and additional static analyses (often using type systems) which use these mechanisms. This combination provides a desirable graceful degradation: if static analysis succeeds in proving certain properties, then the program may be optimised without affecting security. Yet, even if the analyses fail the language based mechanisms will enforce the security properties at runtime. The context of our work is the MOBIUS project [5] on proof-carrying code (PCC) for mobile devices. Our effect type system is very simple and in principle well-suited for a PCC setting where checkers themselves are resource bounded. However, the weakening
Safety Guarantees from Explicit Resource Management
71
rule relies on checking logical entailment in a first-order theory, which is undecidable in general. Therefore, a certificate for PCC need not only provide a type derivation tree but also proofs (in some proof system) for the entailment checks in the weakening rule. The development of a suitable such proof system is a topic for further research, as is the investigation of decidable fragments of relevant first-order theories. Acknowledgements. This work was funded in part by the Sixth Framework programme of the European Community under the MOBIUS project FP6-015905. This paper reflects only the authors’ views and the European Community is not liable for any use that may be made of the information contained therein. Ian Stark was also supported by an Advanced Research Fellowship from the UK Engineering and Physical Sciences Research Council, EPSRC project GR/R76950/01.
References [1] Appel, A.W.: SSA is functional programming. SIGPLAN Notices 33(4), 17–20 (1998) [2] Aspinall, D., Beringer, L., Hofmann, M., Loidl, H.-W., Momigliano, A.: A program logic for resources. Theoret. Comput. Sci. 389(3), 411–445 (2007) [3] Aspinall, D., Maier, P., Stark, I.: Monitoring external resources in Java MIDP. Electron. Notes Theor. Comput. Sci. 197, 17–30 (2008) [4] Barrett, C., de Moura, L., Stump, A.: Design and results of the 2nd annual satisfiability modulo theories competition. Form. Meth. Syst. Des. 31(3), 221–239 (2007) [5] Barthe, G., Beringer, L., Cr´egut, P., Gr´egoire, B., Hofmann, M., M¨uller, P., Poll, E., Puebla, G., Stark, I., V´etillard, E.: MOBIUS: Mobility, ubiquity, security. In: Montanari, U., Sannella, D., Bruni, R. (eds.) TGC 2006. LNCS, vol. 4661, pp. 10–29. Springer, Heidelberg (2007) [6] Beringer, L., Hofmann, M., Momigliano, A., Shkaravska, O.: Automatic certification of heap consumption. In: Baader, F., Voronkov, A. (eds.) LPAR 2004. LNCS, vol. 3452, pp. 347–362. Springer, Heidelberg (2005) [7] Besson, F., Dufay, G., Jensen, T.P.: A formal model of access control for mobile interactive devices. In: Gollmann, D., Meier, J., Sabelfeld, A. (eds.) ESORICS 2006. LNCS, vol. 4189, pp. 110–126. Springer, Heidelberg (2006) [8] Binder, W., Hulaas, J., Villaz´on, A.: Portable resource control in Java. In: Proc. OOPSLA 2001, pp. 139–155. ACM, New York (2001) [9] Czajkowski, G., von Eicken, T.: JRes: A resource accounting interface for Java. In: Proc. OOPSLA 1998, pp. 21–35. ACM, New York (1998) [10] Flanagan, C., Sabry, A., Duba, B.F., Felleisen, M.: The essence of compiling with continuations. In: Proc. PLDI 1993, pp. 237–247. ACM, New York (1993) [11] Nanevski, A., Ahmed, A., Morrisett, G., Birkedal, L.: Abstract predicates and mutable ADTs in Hoare type theory. In: De Nicola, R. (ed.) ESOP 2007. LNCS, vol. 4421, pp. 189–204. Springer, Heidelberg (2007) [12] Unknown: Redbrowser. A, J2ME trojan. Identified in February 2006 as Redbrowser. A (F-Secure), J2ME/Redbrowser.a (McAfee), Trojan. Redbrowser. A (Symantec), TrojanSMS.J2ME.Redbrowser.a (Kaspersky Lab)
Universe Types for Topology and Encapsulation Dave Cunningham1 , Werner Dietl2 , Sophia Drossopoulou1 , Adrian Francalanza3, Peter M¨ uller2 , and Alexander J. Summers1 1
Imperial College London {david.cunningham04,s.drossopoulou,alexander.j.summers}@imperial.ac.uk 2 ETH Zurich {Werner.Dietl,Peter.Mueller}@inf.ethz.ch 3 University of Southampton [email protected]
Abstract. The Universe Type System is an ownership type system for object-oriented programming languages that hierarchically structures the object store; it is used to reason modularly about programs. We formalise Universe Types for a core subset of Java in two steps: We first define a Topological Type System that structures the object store hierarchically into an ownership tree, and demonstrate soundness of the Topological Type System by proving subject reduction. Motivated by concerns of modular verification, we then present an Encapsulation Type System that enforces the owner-as-modifier discipline; that is, that object updates are initiated by the owner of the object. The contributions of this paper are, firstly, an extensive type-theoretic account of the Universe Type System, with explanations and complete proofs, and secondly, the clean separation of the topological from the encapsulation concerns.
1
Introduction
Imperative object-oriented programming languages, such as C++, Java and C#, use references to build object structures and share state. Aliasing allows multiple references to the same object and gives much of the power of object-oriented programming. However, it makes several other programming aspects more difficult, including reasoning about programs, garbage collection and memory management, code migration, parallelism and the analysis of atomicity. To address these issues, different ownership type systems have been proposed: ownership types [6,9,10], ownership domains [1], Universe Types [15,26] and similar other type systems [2,18]. All have in common that they organise the heap as an ownership tree where each object is owned by at most one other object. It is common practice to depict ownership through a box in an object graph, where all objects that share the same owner are within the box of the owning object. For example, in Figure 1 the dashed box around object 1 of class Bag indicates that it owns objects 2 and 13, while object 2 of class Stack owns F.S. de Boer et al. (Eds.): FMCO 2007, LNCS 5382, pp. 72–112, 2008. c Springer-Verlag Berlin Heidelberg 2008
Universe Types for Topology and Encapsulation
73
objects 3, 4 and 5. Objects that are not owned by any other object, such as 1, 6 and 9 are contained within the outermost dashed box labelled root, which gives us a tree. However, different ownership type systems enforce different encapsulation policies, that is, put different restrictions on what references between objects might exist or limit the use of certain references. Ownership types enforce the owneras-dominator policy, guaranteeing that every reference chain from an object in the root context to an object o goes through o’s owner. Ownership domains allow the declaration of flexible encapsulation policies by special link declarations. Universe Types enforce the owner-as-modifier discipline that ensures that all modifications of an object are initiated by the object’s owner. root
1: Bag
6: Object
10: Bag state
state 13: Stack
2: Stack
7: Object
11: Stack
top
top value
value 3: Node
8: Object
12: Node
next value 4: Node
9: Object
next value 5: Node
Fig. 1. Depicting object ownership and references in a heap
The hierarchic heap topology and the different encapsulation policies can be exploited in different ways: Andreae et al. [2] speed up garbage collection, because, if the owner-as-dominator policy is enforced, as soon as an owner is unreachable, all owned objects are unreachable, too. Flanagan et al. [18] and Boyapati et al. [5] can guarantee that a program will not have races, because locking an object implicitly locks all owned objects. Cunningham et al. [11] show that Universe Types can be used to guarantee race free programs. Clarke et al. [9] use ownership types to calculate the effects of computations, and thus determine when they do not affect each other. Banerjee et al. [3] use the owneras-dominator policy to prove representation independence of data structures. M¨ uller et al. [27] use the hierarchic topology for defining modular verification techniques of object invariants. Universe Types [26,15] were developed to support modular reasoning about programs. The type system is one of the simplest possible in the family of ownership type systems. There are three Universe Modifiers: rep, peer and any, which denote the relative placement of objects in the ownership hierarchy. The qualifier
74
D. Cunningham et al.
rep (short for representation) expresses that the object is owned by the currently active object, while peer expresses that the object has the same owner as the currently active object. The qualifier any abstracts over the object’s position in the hierarchy and does not give any static information about the owner. 1 2 3 4 5
class Bag { rep Stack state ; impure void add(any Object o) { this . state .push(o); } pure Bool isEmpty() { return ( this . state .top == null); } }
6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
class Stack { rep Node top; impure void push(any Object o) { this .top := new rep Node(o, this.top) ; } impure any Object pop() { any Object o := null ; if ( this .top != null ) { o := this .top. value ; this .top := this .top.next; } return o; } }
21 22 23 24 25 26
class Node { any Object value ; peer Node next; Node(any Object v, peer Node n) { value := v; next := n; } } Fig. 2. Code augmented with Universe Modifiers
An example appears in Figure 2. The Stack field state in class Bag is declared as rep and indeed, we see that in Figure 1 the Stack objects 2 and 11 are owned by Bag objects 1 and 10, respectively; similarly, in class Node the field next is declared as peer, and indeed the Node objects 3, 4 and 5 have the same owner. The field value in class Node is declared as any Object and can therefore refer to any object in the hierarchy. In that sense, the Universe Modifier any plays a similar role to that played by the any ownership parameter in [24]. The code in Figure 2 also augments methods with the keywords pure, qualifying methods without side-effects, and impure, for methods that might affect the state of the program; these purity annotations are used to guarantee encapsulation. In contrast with those ownership type systems that enforce the owner-asdominator policy [1,2,6,10], Universe Types do not restrict references into the
Universe Types for Topology and Encapsulation
75
boxes, as long as they are carried out through any references. Thus, a reference from 3 to 12 would be legal through the field value, because this field has the type any Object. Nevertheless, Universe Types can be used to impose encapsulation, in the sense of guaranteeing that the state of an object can only be modified when the object’s owner is one of the currently active objects. This owner-as-modifier discipline [15] is checked by the type system by forbidding all field updates on any references and requiring that only pure methods can be called when the receiver is an any reference. The owner-as-modifier discipline supports modular reasoning by restricting the objects whose invariants can be broken [23,27]. In this paper we give a formal, type theoretical description of Universe Types. We distinguish the Topological Type System from the Encapsulation Type System, and describe the latter on top of the Topological System. The reasons for this distinction are: – the distinction clarifies the rationale for the type systems; – some systems, such as those required for race detection, atomicity and deadlock detection, only require the topological properties; – one might want to add an encapsulation part onto a different Topological Type System or use different encapsulation policies. Universe Types were already introduced and proven sound [26], but the description was in terms of proof theory, rather than the type theoretic machinery we adopt in this report; they were further described in the context of JML [15,22]. Extensions to Universe Types, such as the addition of generics [14], already use the type theoretic approach, but do not separate topology and encapsulation. This paper thus aims to fill a gap in the Universe Types literature, by giving a full type theoretical account of the basic type system, proofs, sufficient examples, explanations, and elucidate the distinction between the Topological and the Encapsulation System. We describe Universe Types (UT) for a small Java-like language, which contains classes, inheritance, fields, methods, dynamic method binding and casts. The rest of the paper is organised as follows: we introduce our base language in Section 2, followed by a discussion of Universe Modifiers and owners in Section 3. In Section 4 we give the operational semantics for our language. Section 5 presents the Topological Type System and states subject reduction. Section 6 covers the Encapsulation Type System. Section 7 discusses related work and Section 8 concludes. Finally, Appendix A gives supporting results and proofs.
2
Source Language
UT models a subset of Java supporting class definitions, inheritance, field lookup and update, method invocations and casting. On top of this core subset, we augment types with Universe Modifiers and method signatures with purity annotations.
76
D. Cunningham et al.
Universe Modifiers were originally defined [26] as the set {rep, peer, readonly}. In this paper, we present a generalisation of this approach. Firstly, we replace the modifier readonly with the modifier any, since the name readonly applies only to the intentions of the Encapsulation System, but not to the Topological. Secondly, we extend the set of modifiers with two new values: self and lost, which do not occur in source programs but are key in the operation of the Topological Type System we will present in Section 5. Universe Modifiers are attached to object references, and provide relative information about the location (that is, position in the heap topology) of the referred-to object with respect to the current object. More specifically: – rep states that the referred object constitutes part of the current object’s direct representation. Stated otherwise, the current object owns the object pointed to by the reference. – peer states that the referred object constitutes part of the same representation to which the current object belongs. Stated otherwise, the current object and the object pointed to by the reference are owned by the same object. – self is a specialisation of peer, referring to the current object. – any does not provide information about the location of the referred object. It is a ‘don’t care’ modifier: such a reference may refer to objects with arbitrary owners. – lost does not provide information about the location of the referred object. It is a ‘don’t know’ modifier: it is used when the type system cannot accurately describe the location of the object1 . Example 2.1 (Comparing any and lost). We illustrate the difference between any and lost using the example from Figure 2 and a reference node of type any Node. The field access node.value has type any Object because the declared type of field value uses the any modifier and therefore permits references to arbitrary objects; the field doesn’t care about the location of the value object. Consequently, an update of value is statically safe regardless of the owners of the receiver and the right-hand side. In particular, the update node.value := y is legal provided that y’s class is a subclass of Object. Conversely, the field access node.next has type lost Node because the declared type of the field requires that the next node and the current node have the same owner. We do not statically know the owner of node as its Universe Modifier is any. Hence, we cannot express that the next field has the same owner. Since we don’t know the accurate ownership information for node.next, the field update node.next := z is potentially unsafe and has to be rejected by the type system, as we cannot ensure the topology of the heap would be preserved. 1
As we said earlier, Lu and Potter’s any ownership parameter [24] is the counterpart to our any Universe Modifier. Their unknown ownership parameter corresponds to our lost Universe Modifier—however, the main motivation for unknown is to preserve effective ownership rather than topology. Furthermore, our Universe Modifiers any and lost introduce a form of existential types [25].
Universe Types for Topology and Encapsulation
77
The example above also illustrates that lost variables cannot be assigned meaningful values, therefore our language does not permit any explicit occurrences of lost in the program. Analogously, variables with modifier self can only be aliases of this and so do not add expressibility. For these reasons, when writing source programs we only make use of Universe Modifiers from the set {rep, peer, any}. The syntax of our source language is given in Figure 3. We assume three countably infinite, distinct sets of names: one for classes, c ∈ Idc , one for fields, f ∈ Idf , and another for methods, m ∈ Idm . A program P is defined in terms of three partial functions, F , M and MBody, and a relation over class names ≤c ; these functions and relations are global. F associates field names accessible2 in a class to types, M associates method names accessible in a class to method signatures and MBody associates method names accessible in a class to method bodies. The reflexive, transitive relation ≤c denotes subclassing. Types, denoted by t, constitute a departure from standard Java. They consist of a pair, u c, where class names c are preceded by Universe Modifiers u. Source types, denoted by s are the subset of types t which are allowed to occur in source programs; i.e., they feature only Universe Modifiers from the set {rep, peer, any}. Method signatures also deviate slightly from standard Java. They consist of a triple, denoted as p : s1 (s2 ) where s2 is the type of the (single) method parameter, s1 is the return type of the method and p is a purity annotation, ranging over the set {pure, impure}. An extension to multiple method parameters is straightforward because each argument can be type-checked independently.
e ∈ SrcExpr ::= this | x | null | new s | (s) e | e.f | e.f := e | e.m(e) u ∈ Universe Modifiers ::= rep | peer | self | any | lost s ∈ SrcType ::= rep c | peer c | any c t ∈ Type ::= s | self c | lost c MethSig ::= p : s (s) p ∈ Purity Tag ::= pure | impure Fig. 3. Syntax of source programs. The arrow indicates partial mappings.
Source expressions, denoted by e, are a standard subset of Java. They consist of the self reference this, parameter identifier x, the basic value null, new object creation new s, casting (s) e, field access e.f , field update e.f := e and method invocation e.m(e). 2
By “accessible”, we mean those which are either defined in the class or inherited from a superclass. We do not formalise visibility; all declarations are implicitly public.
78
3
D. Cunningham et al.
Universe Modifiers and Owners
Universe Types structure the heap as an ownership tree. Every object a in a heap is owned by a single owner, o, which is either another object in the heap or the root of the ownership tree, root. The direct representation of an object in a heap h is defined to be all the objects it owns (i.e., is the owner of). Ownership is required to be acyclic. When we do not want to refer to a particular heap, we find it convenient to refer to objects as pairs with their owners (a, o). For example, (1, root) and (2, 1), meaning 1 owned by root and 2 owned by 1, respectively. We recall that the Universe Modifiers self, peer and rep are interpreted with respect to the current object and do not mean anything without this viewpoint. One can assign a Universe Modifier to an object (a , o ) with respect to another object (a, o) using the judgement (a, o) (a , o ) : u
(1)
which is defined as the least relation satisfying the rules in Figure 4. It states that, from the point of view of a (owned by o), a (owned by o ) has Universe Modifier u. The lost and any modifiers can always be assigned because they do not express any ownership information. (a, o) (a, o) : self
(Self)
( , o) ( , o) : peer
(a, ) ( , a) : rep ( , ) ( , ) : lost
(Lost)
(Peer)
(Rep)
( , ) ( , ) : any
(Any)
Fig. 4. Assigning Universe Modifiers to objects
Example 3.1 (Universe Modifiers and objects). In the heap shown in Figure 1, from the point of view of 2 (owned by 1) object 3 (owned by 2) has Universe Modifier rep that is (2, 1) (3, 2) : rep (2) Similarly, we can derive (3, 2) (4, 2) : peer
(3)
Each object can view itself as self or peer: (2, 1) (2, 1) : self
(2, 1) (2, 1) : peer
(4)
Also, we can assign any to any object from any viewpoint using rule (Any): (3, 2) (6, root) : any
(2, 1) (3, 2) : any
(3, 2) (4, 2) : any
(5)
Universe Types for Topology and Encapsulation
3.1
79
Universe Ordering
We define the following reflexive ordering for Universe Modifiers u ≤u u : u ≤u u
self ≤u peer ≤u lost
rep ≤u lost
lost ≤u any
It states that peer and rep are smaller (more precise) than lost, self is similarly a specialisation of peer, and any is the least specific modifier. The ‘don’t care’ modifier any is treated as more general than the ‘don’t know’ modifier lost; this is because we want to be able to assign to an any field even those objects whose location cannot be expressed in the type system. Lemma 3.2 states that the Universe ordering relation (≤u ) is consistent with the judgement of Figure 4. Thus, any object that is assigned rep, peer or self can also be assigned Universe Modifier any or lost, and any object that is assigned self can be assigned Universe Modifier peer, as we have already seen in Example 3.1. Lemma 3.2 (Universe object judgements respect Universe ordering) (a, o) (a , o ) : u =⇒ (a, o) (a , o ) : u u ≤u u Proof. By a simple case analysis of (a, o) (a , o ) : u.
Example 3.3 (Subtyping and Universe Modifiers). We illustrate how the Universe Modifiers of the fields in the classes of Figure 2 characterise the references in Figure 1. For instance, the Stack object 2 has the rep top field correctly assigned to Node 3, since from (2) above we know that 3 has Universe Modifier rep with respect to 2. In fact, this reference can only point to (Node) objects 3, 4 and 5 since these are the only objects owned by object 2. Similarly, Node 3 has the peer next field correctly assigned to Node 4, which is owned by the same owner as 3; see (3) above. Trivially, the any value field of Node 3 assigned to Object 6 also respects the Universe Modifier because of (5) above. It can however point to any object in the heap since any type t is a subtype of any Object. 3.2
Viewpoint Adaptation
The ownership information given by Universe Modifiers is relative with respect to a particular viewpoint. To adapt Universe Modifiers from one viewpoint to another, we define an operation u1 u2 called viewpoint adaptation [14]. This operation takes two Universe Modifiers u1 and u2 , and yields a new Universe Modifier, as defined in Figure 5. The resulting modifier can be intuitively described as follows: “if a1 sees a2 as u1 , and a2 sees a3 as u2 , then a1 sees a3 as u1 u2 ”3 . If there is no modifier to explicitly describe this relationship, then the operation yields the modifier lost. For example, rep rep = lost, since there is no modifier to explicitly express that a referred object is in the ‘transitive 3
For readers familiar with work on Ownership Types, this operation is vaguely analogous with the notion there of substitution.
80
D. Cunningham et al.
u1 u2 self peer u1 rep any lost
self self lost lost lost lost
peer peer peer rep lost lost
u2 rep rep lost lost lost lost
any any any any any any
lost lost lost lost lost lost
Fig. 5. Viewpoint adaptation
representation’ of the current object. Also note that the modifiers any and lost do not depend on a viewpoint. Therefore, if u2 is any or lost, the result will again be any or lost, respectively. In Lemma 3.4, we show that the intuition of is sound with respect to the interpretation of Universe Modifiers as object ownership in a heap, that is, judgement (1). We also show how we can recover information from a Universe Modifier u u , so long as it is not lost. Lemma 3.4 (Sound viewpoint adaptation) (a1 , o1 ) (a2 , o2 ) : u1 =⇒ (a1 , o1 ) (a3 , o3 ) : u1 u2 (a2 , o2 ) (a3 , o3 ) : u2 ⎫ (a1 , o1 ) (a2 , o2 ) : u1 ⎬ (a1 , o1 ) (a3 , o3 ) : u1 u2 =⇒ (a2 , o2 ) (a3 , o3 ) : u2 ⎭ u 1 u2 = lost Proof. By a case analysis of u1 and u2 and inspection of Figures 4 and 5. The argument for the second part depends essentially on the fact that, if there exists a u2 such that u1 u2 = lost then it is unique (in other words, u1 u2 = u1 u2 = lost implies that u2 = u2 ). This can be seen by inspection of Figure 5. Example 3.5 (Viewpoint adaptation). From judgements (2) and (3) from Example 3.1, using Lemma 3.4, we derive (2, 1) (4, 2) : rep
(6)
because rep peer = rep. Conversely, from (2) and (6), using Lemma 3.4 and rep = rep peer = lost, we recover (3) (3, 2) (4, 2) : peer
4
Operational Semantics
We give the semantics of our Java subset in terms of a small-step operational semantics. We assume a countably infinite set of addresses, ranged over by a, b.
Universe Types for Topology and Encapsulation
81
a, b ∈ Addr : N o ∈ Owner ::= a | root v ∈ Value ::= a | null flds ∈ Flds : Idf Value h ∈ Heap : Addr (Owner × Idc × Flds) σ ∈ StackFrame : (Addr × Value) e ∈ RunExpr ::= v | this | x | frame σ e | e.f | e.f := e | e.m(e) | new s | (s) e E[·] ::= [·] | E[·].f | E[·].f := e | a.f := E[·] | E[·].m(e) | a.m(E[·]) | (s) E[·] Fig. 6. Syntax of runtime expressions
At runtime, a value, denoted by v, may be either an address or null. Owners, denoted by o, can be either an address or the special owner root. Runtime expressions are described in Figure 6. During execution, expressions may contain addresses as values; they may also contain the keyword this and the parameter identifier x. Thus, a runtime expression is interpreted with respect to a heap, h, which gives meaning to addresses, and a stack frame, σ, which gives meaning to the keyword this and parameter identifier x. Evaluation contexts E define expressions with ‘holes’ [17]; they are used in the semantics to permit reductions to take place below the top level of the expression. A heap is defined in Figure 6 as a partial function from addresses to objects. An object is denoted by the triple (o, c, flds). Every object has an immutable owner o, belongs to a class c, and has a state, flds, which is a mutable field map (a partial function from field names to values). In the remaining text we use the following heap operations: def
def
owner(h, a) = h(a)↓1
class(h, a) = h(a)↓2
def
def
fields(h, a) = h(a)↓3 h(a.f ) = fields(h, a)(f ) def h[(a, f ) → v] = h a → owner(h, a), class(h, a), fields(h, a)[f → v] The first three operations extract the components making up an object, where ↓i is used for the i-th projection. The fourth operation is merely a shorthand notation for field access in a heap. The fifth operation is heap update, updating the field f of an object mapped to by the address a in the heap h to the value v. A stack frame σ is a pair (a, v), of an address and a value. The address a denotes the currently active object referred to by this whereas v denotes the value of the parameter x. We find it convenient to define the following operations on stack frames: def
σ(this) = σ↓1
def
σ(x) = σ↓2
For evaluating method calls, we require to push and pop new address-value pairs onto the stack. To model this, runtime expressions also include the expression
82
D. Cunningham et al.
frame σ e, which denotes that the sub-expression e is evaluated with respect to the inner stack frame σ. We employ a further operation called heap extension, written alloc(h, σ, s), which extends a heap h with a new mapping from a fresh address a to a newlyinitialised object of type s; it is defined by the following function: def
alloc(h, σ, u c) = (h , a) if u ∈ {rep, peer} where a∈ / dom(h) h = h[a
→ (o, c, flds)] σ(this) if u = rep o= owner(h, σ(this)) if u = peer flds = {f → null | F (c, f ) = } The owner is initialised according to the Universe Modifier specified in the type s and the current stack frame σ. The above function is partial: it is only defined for Universe Modifiers rep and peer since the owner of a new object cannot be determined if the Universe Modifier is any. The values of the fields of class c, and all its superclasses, are initialised to null. Expressions e are evaluated in the context of a heap h and a stack frame σ. We define the small-step semantics σ e, h e , h
(7)
in terms of the reduction rules in Figure 7. We will use ∗ to indicate a consecutive sequence of (zero or more) small-step reductions.
σ x, h σ(x), h
(rVar)
σ this, h σ(this), h (h , a) = alloc(h, σ, s)
σ new s, h a, h
(rThis)
(rNew)
h, σ a : s (rCast) σ (s) a, h a, h σ a.f, h h(a.f ), h
(rField)
h = h[(a, f ) → v]
(rAssign)
σ a.f := v, h v, h e = MBody(class(h, a), m) σ a.m(v), h frame (a, v) e, h
(rCall)
σ e, h e , h (rEvalCtx) σ E[e], h E[e ], h σ e, h e , h (rFrame1) σ frame σ e, h frame σ e , h σ frame σ v, h v, h
(rFrame2)
Fig. 7. Small-step operational semantics
Most of the rules in Figure 7 are straightforward. When creating a new object (rNew) the alloc(h, σ, s) function defined above determines the new heap h and fresh address a. In (rCall), a method call creates a new stack frame σ
Universe Types for Topology and Encapsulation
83
to evaluate the body of the method, where σ (this) is the receiver object a and σ (x) is the value passed by the call, v. Once a frame evaluates to a value v, we discard the sub-frame and return to the outer frame, as shown in (rFrame2). We also note that the rule (rEvalCtx) dictates the evaluation order of an expression, based on the evaluation contexts E[·] defined in Figure 6. The type judgement in rule (rCast) expresses that the object at address a has type s; see Subsection 5.3. Note that the source expression null is identical to the value null, which makes a special rule dispensable. Example 4.1 (Runtime execution). Let h denote the heap depicted in Figure 1 and the current stack frame be σ = (2, 9). Then, if we consider the program in Figure 2, and execute the expression this .push(7) we get the following reductions4 , where the rule names on the side indicate the main reduction rule applied to derive the reduction. We simplify the example slightly, by not mentioning the use of context rules ( rEvalCtxt) and ( rFrame1). σ this .push(7), h 2.push(7), h ( rThis) frame σ this .top:= new rep Node(x, this.top), h ( rCall) frame σ 2.top:= new rep Node(x, this.top), h ( rThis) frame σ 2.top:= new rep Node(7, this.top), h ( rVar) frame σ 2.top:= new rep Node(7, 2.top), h ( rThis) frame σ 2.top:= new rep Node(7, 3), h ( rField) frame σ 2.top:= 14, h ( rNew) frame σ 14, h [(2, top) → 14] ( rAssign) 14, h [(2, top) → 14] ( rFrame2) where σ = (2, 7), h = h {14 → (2, Node, {value → 7, next → 3})}, and 14 is a fresh address in the heap h.
5
Topological System
In this section, we define the Topological Type System for UT. The formalism is based on earlier work [26,15], but has some differences: as we said in the introduction, we focus here on the hierarchical topology imposed by Universe Types, but do not enforce the owner-as-modifier discipline at this stage—this is dealt with in Section 6. The main result of this section is Topological Subject Reduction, stating that a type assigned to an expression and the ownership hierarchical heap structure are preserved during execution. 4
In order to follow the Java code of Figure 2, the reductions use an object constructor that immediately initialises values to the parameters passed. This is more advanced than the simpler new construct considered in our language, which initialises all the fields of a fresh object to null. These details are however orthogonal to the determination of the owner of the object upon creation, which is the relevant issue for our work. Similarly, we return the value of the method body even if the method is declared to be void.
84
D. Cunningham et al.
5.1
Subtyping and Viewpoint Adaptation
As was already stated in Section 2, types, t, are made up of two components: a Universe Modifier u and a class name c. Using the Universe ordering ≤u of Section 3.1 and subclassing ≤c , we define the subtype relation as: u c ≤ u c = u ≤u u and c ≤c c def
(8)
We extend defined in Section 3.2 for Universe Modifiers, to an operator on a Universe Modifier and a type, and that produces a type, denoted by u t: def
u (u c) = (u u ) c We use this auxiliary operator whenever we need to change the viewpoint of the types. 5.2
Static Types
We type-check UT source expressions with respect to a type environment Γ , which keeps type information for this and the method parameter x. The types in a method signature are meant to be interpreted with respect to this, the currently active object. We assign the self Universe Modifier to Γ (this) when type-checking method bodies and note that self u = u for all u. The use of a specific self Universe Modifier is a variation from previous models of the Universe Type System [14,15,26] and of other ownership type systems. In the work on Universe Types, the expression this was treated separately, viewpoint adaptation was omitted for access through this, and additional checks had to be made to ensure the protection of the representation of an object. This special treatment of the this expression can also be compared to the static visibility constraint of Ownership Types [10], which ensures that a type that contains rep is only accessible by this. Even when not enforcing the static visibility constraint, the this parameter in a type needs to be treated specially upon type application [1,4]. The use of a special self ownership modifier makes the special role of the current object more explicit, while at the same time simplifying the overall system5 . For example, attempting to update the representation of another object using a peer reference results in lost ownership information, i.e., peer rep = lost and the update is forbidden. On the other hand, updating the representation of the current object preserves ownership information, i.e., self rep = rep and an update is allowed. Definition 5.1 (Type environment). A type environment Γ consists of a pair of types, (t, t ), assigning types to the currently active object this and the parameter x, respectively. We define the following operations on Γ : def
Γ (this) = Γ ↓1 5
def
Γ (x) = Γ ↓2
Note that in the works mentioned on Ownership Types [10,1,4], types such as A or A<self> do not correspond with our Universe Modifier self (which indicates the current receiver): their types would instead be represented in our system by the type rep A.
Universe Types for Topology and Encapsulation
85
The source expression type judgement takes the form

   Γ ⊢ e : t

denoting that expression e has type t with respect to the type environment Γ. Note that we do not restrict t to source types; although only source types may be written explicitly in the program, an inferred Universe Modifier may well be lost, for example. The judgement is defined as the least relation satisfying the rules given in Figure 8. We sometimes find it convenient to use the shorthand judgement notations

   Γ ⊢ e : u        Γ ⊢ e : c

whenever one component of the type judgement is not important, that is, Γ ⊢ e : u _ and Γ ⊢ e : _ c, respectively. Most of the rules are standard, with the exception of the type rules (Field), (Assign) and (Call), which use the auxiliary operation u ▷ t to adapt types from one viewpoint to another. For example, consider the rule for field lookup (Field). The first premise says that e can be assigned class c, and that from the current point of view, e's position in the heap topology can be described by Universe Modifier u. The second premise states that the field f is declared in class c with source type s. Since the Universe Modifier of s describes the location of the field with respect to the point of view of e, to assign a type to this field from the current point of view we take into account e's relative position; that is, we adapt the type s with respect to u.

Example 5.2 (Type viewpoint adaptation). If Γ ⊢ this.top : rep Node and field next in class Node has type peer Node, then using (Field), the dereference this.top.next has type rep ▷ (peer Node) = rep Node, that is,

   Γ ⊢ this.top.next : rep Node

Conversely, we use (Assign) to check that when

   Γ ⊢ this.top : rep Node   and   Γ ⊢ new rep Node : rep Node

then the assignment this.top.next := new rep Node respects the field type assigned to next in class Node. For this calculation we use F(Node, next) = peer Node and rep ▷ peer = rep ≠ lost.

The source expression type judgement allows us to define well-formed classes by requiring consistency between subclasses. In particular, we require that field types in a subclass match those in any superclasses in which the same fields are present (this requirement is trivially met if fields cannot be overridden), and that method signatures in subclasses are specialisations of the signatures of overridden methods. In addition to the usual variance on argument and return types, we allow pure methods to override impure methods (but not the opposite).
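The following hypothetical pair of classes (our own illustration; the class names are not from the paper) shows the overriding constraints in action:

   class Collection {
       impure peer Object take() { /* removes and returns an element */ }
   }
   class SortedCollection extends Collection {
       pure   peer Object take() { /* merely inspects an element */ }  // OK: pure may override impure
   }

The reverse situation — an impure method overriding a pure one — would violate the condition p′ = pure ⟹ p = pure in Definition 5.3 below and be rejected.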
(Null)    Γ ⊢ null : t

(Var)     Γ ⊢ x : Γ(x)

(This)    Γ ⊢ this : Γ(this)

(New)     u ∈ {rep, peer}  ⟹  Γ ⊢ new u c : u c

(Cast)    Γ ⊢ e : t  ⟹  Γ ⊢ (s) e : s

(Sub)     Γ ⊢ e : t′,  t′ ≤ t  ⟹  Γ ⊢ e : t

(Field)   Γ ⊢ e : u c,  F(c, f) = s  ⟹  Γ ⊢ e.f : u ▷ s

(Assign)  Γ ⊢ e : u c,  F(c, f) = s,  u ▷ s ≠ lost,  Γ ⊢ e′ : u ▷ s  ⟹  Γ ⊢ e.f := e′ : u ▷ s

(Call)    Γ ⊢ e : u c,  M(c, m) = _ : sr(sx),  u ▷ sx ≠ lost,  Γ ⊢ e′ : u ▷ sx  ⟹  Γ ⊢ e.m(e′) : u ▷ sr

Fig. 8. Source type system (each rule is written premises ⟹ conclusion)
Furthermore, we require that method bodies are consistent with their signatures. A program P is well-formed if all the defined classes are well-formed:

Definition 5.3 (Well-formed classes and programs)

(WFClass)  ⊢ c holds if:
   ∀c′, f . (c ≤c c′ ∧ F(c′, f) = s) ⟹ F(c, f) = s
   ∀c′, m . (c ≤c c′ ∧ M(c′, m) = p′ : s′r(s′x)) ⟹ M(c, m) = p : sr(sx)
      where sr ≤ s′r, s′x ≤ sx, and p′ = pure ⟹ p = pure
   ∀m . M(c, m) = _ : sr(sx) ⟹ (self c, sx) ⊢ MBody(c, m) : sr

   ⊢ P  ⟺def  ∀c ∈ Idc . ⊢ c

5.3   Runtime Types
We define a type system for runtime expressions. These are type-checked with respect to the stack frame σ, which contains actual values for the current receiver this and the parameter x. Since runtime expressions also contain addresses, we also need to type-check them with respect to the current heap, so as to retrieve the class membership and owner information for addresses. The runtime Universe Type System allows us to assign Universe Types to runtime expressions with respect to a particular heap h and stack frame σ, through a judgement of the form

   h, σ ⊢ e : t

It is defined as the least relation satisfying the rules in Figure 9. Once again, we use the shorthand notations h, σ ⊢ e : u and h, σ ⊢ e : c whenever the other component of t in the judgement is not important. In the rule (tAddr), the
type of an address in a heap is derived from the class of the object and the Universe Modifier obtained using judgement (1) of Section 3. The three rules (tField), (tAssign) and (tCall) use viewpoint adaptation in the same way as their static-expression counterparts in Figure 8. The new rule (tFrame) also uses viewpoint adaptation to adapt the type of the sub-expression, obtained with respect to the local stack frame, to the current frame’s viewpoint.
(tNull)   h, σ ⊢ null : t

(tVar)    h, σ ⊢ σ(x) : t  ⟹  h, σ ⊢ x : t

(tThis)   h, σ ⊢ σ(this) : t  ⟹  h, σ ⊢ this : t

(tSub)    h, σ ⊢ e : t′,  t′ ≤ t  ⟹  h, σ ⊢ e : t

(tCast)   h, σ ⊢ e : t  ⟹  h, σ ⊢ (s) e : s

(tNew)    u ∈ {rep, peer}  ⟹  h, σ ⊢ new u c : u c

(tAddr)   class(h, a) = c,  (σ(this), owner(h, σ(this))) ⊢ (a, owner(h, a)) : u  ⟹  h, σ ⊢ a : u c

(tField)  h, σ ⊢ e : u c,  F(c, f) = s  ⟹  h, σ ⊢ e.f : u ▷ s

(tAssign) h, σ ⊢ e : u c,  F(c, f) = s,  u ▷ s ≠ lost,  h, σ ⊢ e′ : u ▷ s  ⟹  h, σ ⊢ e.f := e′ : u ▷ s

(tCall)   h, σ ⊢ e : u c,  M(c, m) = _ : sr(sx),  u ▷ sx ≠ lost,  h, σ ⊢ e′ : u ▷ sx  ⟹  h, σ ⊢ e.m(e′) : u ▷ sr

(tFrame)  h, σ′ ⊢ e : t,  h, σ ⊢ σ′(this) : u  ⟹  h, σ ⊢ frame σ′ e : u ▷ t

Fig. 9. Runtime type system (each rule is written premises ⟹ conclusion)
Lemma 5.4 shows that viewpoint adaptation respects the judgements of the runtime type system. The viewpoint adaptations of Lemma 5.4 trivially hold for the case v = null, since rule (tNull) immediately yields the desired judgements. The proof is relegated to Appendix A.

Lemma 5.4 (Determining the relative Universe Types of values)
(i) If h, σ ⊢ a : u and h, (a, _) ⊢ v : t, then h, σ ⊢ v : u ▷ t.
(ii) If h, σ ⊢ a : u and h, σ ⊢ v : u ▷ t and u ▷ t ≠ lost, then, for any value v′, we have h, (a, v′) ⊢ v : t.

Example 5.5 (Relative viewpoints in a heap). In Figure 1, using (2) and (3) from Example 3.1 and rule (tAddr), we derive

   h, (2, _) ⊢ 3 : rep Node   and   h, (3, _) ⊢ 4 : peer Node

From Lemma 5.4(i) we immediately derive h, (2, _) ⊢ 4 : rep Node.
Conversely, using h, (2, _) ⊢ 3 : rep Node and h, (2, _) ⊢ 4 : rep Node, as well as Lemma 5.4(ii), we can recover h, (3, _) ⊢ 4 : peer Node.

We now have enough machinery to define well-formed addresses, heaps and stack frames (Definition 5.7). An address is well-formed in a heap whenever its owner is valid (that is, it is another address in the heap, or root) and the types of its fields respect the types of the fields defined in F. A heap is well-formed, denoted ⊢ h, if transitive ownership always includes root (this implies that the ownership relation is acyclic, since each address has one owner and root has no owner) and all its addresses are well-formed. Finally, a stack frame is well-formed with respect to a heap if the receiver address it contains is defined in the heap (we make no requirements about the argument on the stack, since these are enforced where necessary by the type system). We use owner+(h, o) to denote the transitive closure of owner(h, o).

Definition 5.6 (Transitive ownership). owner+(h, o) is the smallest set such that owner(h, o) ∈ owner+(h, o), and if o′ ∈ owner+(h, o) with o′ ∈ dom(h), then owner(h, o′) ∈ owner+(h, o).
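A minimal executable sketch of the resulting well-formedness check (our own illustration, representing the heap simply as a map from addresses to owners, with 0 standing for root) could look as follows:

   import java.util.Map;

   class Ownership {
       static final int ROOT = 0;

       // Returns true iff root is among the transitive owners of address a,
       // i.e. root ∈ owner+(h, a); a missing or cyclic owner chain yields false.
       static boolean ownedByRoot(Map<Integer, Integer> owner, int a) {
           int steps = 0;
           Integer o = owner.get(a);
           while (o != null && steps++ <= owner.size()) {  // bound guards against cycles
               if (o == ROOT) return true;
               o = owner.get(o);
           }
           return false;
       }
   }

Since every address has exactly one owner, requiring root ∈ owner+(h, a) for all addresses is equivalent to requiring acyclicity of the ownership relation, as noted above.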
Definition 5.7 (Well-formed addresses, heaps and stack frames)

(WFAddr)   h ⊢ a holds if:  owner(h, a) ∈ (dom(h) ∪ {root}),  class(h, a) = c,  and  ∀f . F(c, f) = s ⟹ h, (a, _) ⊢ h(a.f) : s

(WFHeap)   ⊢ h holds if:  ∀a ∈ dom(h) . root ∈ owner+(h, a)  and  ∀a ∈ dom(h) . h ⊢ a

(WFStack)  h ⊢ σ holds if:  σ(this) ∈ dom(h)

We conclude the subsection by showing the correspondence between the source type system and the runtime type system. Lemma 5.8 below states that, with respect to a suitable stack frame σ, where σ(this) and σ(x) match the respective type assignments in Γ, a well-typed source expression is also a well-typed runtime expression.

Lemma 5.8 (Source typing to runtime typing)

   Γ ⊢ e : t,   h, σ ⊢ x : Γ(x),   h, σ ⊢ this : Γ(this)   ⟹   h, σ ⊢ e : t

Proof. By induction on the structure of the derivation of Γ ⊢ e : t, considering the last rule applied. Comparing the two type systems, all cases follow by straightforward induction, except for the rules (Var) and (This); these are guaranteed by the conditions on σ.
5.4   Subject Reduction
In this subsection, we present the first main result of the paper. It states that if a well-typed runtime expression e reduces with respect to a stack frame σ and a well-formed heap h, then the resulting expression preserves its type (with respect to the new heap), and the resulting heap preserves its well-formedness as well as the well-formedness of the stack frame. Because our definition of well-formed heaps imposes strong topological constraints in correspondence with the Universe Modifiers in the program, this result means in particular that the implied topology is preserved during execution.

Theorem 5.9 (Topological Subject Reduction). For well-formed programs, the following property holds:

   ⊢ h,   h ⊢ σ,   h, σ ⊢ e : t,   σ ⊢ e, h ⇝ e′, h′   ⟹   ⊢ h′,   h′ ⊢ σ,   h′, σ ⊢ e′ : t

Proof. We build up to this result by first proving a number of intermediary lemmas concerning the evolution of the heap under reduction and the extraction of object information from types (see Appendix A). The owner and class components of an object in a heap are immutable during execution (Lemma A.2). During execution, we never remove existing addresses from the heap (Lemma A.3). Earlier, in Section 4, we discussed how the reduction rules use two operations to update a heap, in the form of alloc(h, σ, s) and h[(a, f) ↦ v]. Lemma A.4 shows that the heap extension operation creates a new object of the requested type in the heap. Lemma A.6 states that, under appropriate conditions, the heap update and heap extension operations preserve heap well-formedness. Lemma A.7 states that the type judgement h, σ ⊢ a : u c implies that the class of a in h is a subclass of c and that a has Universe Modifier u from the current viewpoint σ(this). We relegate these five lemmas to Appendix A. The proof also uses Lemma 5.4 and Lemma 5.8 from Section 5.3. The main cases of the Subject Reduction proof are given in Appendix A.
6   Encapsulation System
In Section 5, we showed how the Topological Type System guarantees that the topology of the objects in the heap agrees with the one described by the Universe Types. In this section, we enhance the Topological Type System and obtain the Encapsulation Type System. We show that the latter system guarantees the owner-as-modifier discipline [15], which localises the effects of execution in a heap with respect to the currently active object. We prove two related theorems: the Encapsulation Theorem (6.8) guarantees that an encapsulated expression can only modify objects transitively owned by
the owner of the current receiver, while the Owner-as-Modifier Theorem (6.14) guarantees that execution of an encapsulated expression starting from the initial configuration may update an object only when the object's owner is on the call stack. Notice that, although related, the two theorems do not follow from each other. In terms of our running example, the Encapsulation Theorem guarantees that execution of a method by receiver 13 can modify — at most — objects 2, 3, 4, 5 and 13. On the other hand, the Owner-as-Modifier Theorem guarantees that execution of an encapsulated expression starting from the initial configuration may modify 13 only while 1 is on the stack.

6.1   Encapsulation Types
For the subsequent discussion we find it convenient to define contexts C[·], which are generally used to describe the field updates and method calls present within an expression. These are more liberal than the evaluation contexts E[·] previously defined, which are used to specify where evaluation should next take place. For example, x.f.m(·) is a C[·] context, but not an E[·] context. Like E[·] contexts, C[·] contexts are restricted not to include frame expressions, which allows us to express relationships between the sequence of stack frames in the expression (e.g., see rule (Enc) in Definition 6.2 below).

Definition 6.1 (Frame-free contexts)

   C[·] ::= [·] | C[·].f | C[·].f := e | e.f := C[·] | C[·].m(e) | e.m(C[·]) | (s) C[·]

We write pure(c, m) to mean that m is declared to be pure in c:

   pure(c, m)  =def  M(c, m) = pure _ : _(_)

The Encapsulation Type System imposes extra restrictions so as to enforce the owner-as-modifier discipline and to guarantee restrictions on the effects of method calls. We define an encapsulation judgement for expressions, Γ ⊢enc e, reflecting the expression restrictions needed to enforce the owner-as-modifier discipline. These restrictions state that for an expression e to respect encapsulation, it may only assign to, and call impure methods on, the current object, rep receivers or peer receivers. To determine (conservatively) when a method is actually pure, we require a purity judgement for expressions, Γ ⊢pure e. An expression e is pure by this judgement if it never assigns to fields and only calls methods declared to be pure. We use this very strict notion of purity to simplify the rules; weaker purity requirements [14,30] suffice to enforce the owner-as-modifier discipline.
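For example, under this strict notion the following hypothetical class (our own illustration; the names Counter, current and step are not from the paper) has one method that satisfies the purity judgement and one that does not:

   class Counter {
       int count;
       peer Counter twin;

       pure   int  current() { return this.count; }       // no field update, no impure call
       impure void step()    { this.count = count + 1; }   // contains a field update: not pure
   }

Here current would satisfy the purity judgement for its body, while step would only be expected to satisfy the weaker encapsulation judgement, since its single field update has the self receiver this.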
Definition 6.2 (Purity and encapsulation for source expressions)

(Pure)  Γ ⊢pure e holds if:
   Γ ⊢ e : t
   ∀C, e1, e2, f . e ≠ C[e1.f := e2]
   ∀C, e1, e2, m . e = C[e1.m(e2)] ⟹ ∃c . Γ ⊢ e1 : c ∧ pure(c, m)

(Enc)  Γ ⊢enc e holds if:
   Γ ⊢ e : t
   ∀C, e1, e2, f . e = C[e1.f := e2] ⟹ ∃u . Γ ⊢ e1 : u ∧ u ∈ {peer, rep}
   ∀C, e1, e2, m . e = C[e1.m(e2)] ⟹ ∃u, c . Γ ⊢ e1 : u c ∧ (u ∈ {peer, rep} ∨ pure(c, m))

Note that if an expression is pure, it automatically respects encapsulation; i.e., Γ ⊢pure e ⟹ Γ ⊢enc e. In terms of our example code (Figure 2), suppose Γ = (self Bag, any Object). Then we can derive Γ ⊢enc this.state.push(x), since the expression is typeable, contains no field updates, and the only method call has this.state as receiver, where Γ ⊢ this.state : rep Stack.

A class is well-formed with respect to encapsulation, denoted ⊢enc c, if and only if all pure methods have bodies that are pure, and all impure methods have bodies that are encapsulated, according to the corresponding definitions above. We recall that, according to Definition 5.3, a method declared to be pure can only be overridden by another method declared to be pure. A program P is well-formed with respect to encapsulation if all its classes are encapsulated.

Definition 6.3 (Encapsulated well-formed classes and programs)

(WFEncClass)  ⊢enc c holds if:
   ∀m . M(c, m) = pure _ : sr(sx) ⟹ (self c, sx) ⊢pure MBody(c, m)
   ∀m . M(c, m) = impure _ : sr(sx) ⟹ (self c, sx) ⊢enc MBody(c, m)

   ⊢enc P  ⟺def  ⊢ P ∧ ∀c ∈ Idc . ⊢enc c

Example 6.4 (Comparing the Topological and Encapsulation Systems). We compare the Topological Type System and the Encapsulation Type System in the context of Example 2.1 (that is, the program from Figure 2 and a variable node of type any Node). We explained in Example 2.1 that the update node.value := y is valid in the Topological Type System provided that y's class is a subclass of Object: since the viewpoint-adapted type of field value is not lost, the conditions of rule (Assign) in Figure 8 are satisfied. However, the encapsulation rule (Enc) in Definition 6.2 forbids the update through the any reference node.
We also define encapsulation and purity judgements for runtime expressions, subject to a heap h and a stack frame σ; these judgements are denoted h, σ ⊢enc e and h, σ ⊢pure e, respectively. Encapsulation and purity for runtime expressions impose requirements similar to those for source expressions, but add an extra clause for frame expressions. In particular, encapsulation for frames, h, σ ⊢enc frame σ′ e′, requires that the receiver in σ′, that is σ′(this), is a self, peer or rep of that in σ. This condition is expressed through the predicate h ⊢ σ′ ⊑enc σ, defined below.

Definition 6.5 (Frame encapsulation)

   h ⊢ σ′ ⊑enc σ  =def  ∃u ∈ {peer, rep} . h, σ ⊢ σ′(this) : u

Definition 6.6 (Purity and encapsulation for runtime expressions)

(rPure)  h, σ ⊢pure e holds if:
   h, σ ⊢ e : t
   ∀C, e1, σ1 . e = C[frame σ1 e1] ⟹ h, σ1 ⊢pure e1
   ∀C, e1, e2, f . e ≠ C[e1.f := e2]
   ∀C, e1, e2, m . e = C[e1.m(e2)] ⟹ ∃c . h, σ ⊢ e1 : c ∧ pure(c, m)

(rEnc)  h, σ ⊢enc e holds if:
   h, σ ⊢ e : t
   ∀C, e1, σ1 . e = C[frame σ1 e1] ⟹ (h ⊢ σ1 ⊑enc σ ∧ h, σ1 ⊢enc e1) ∨ h, σ1 ⊢pure e1
   ∀C, e1, e2, f . e = C[e1.f := e2] ⟹ ∃u . h, σ ⊢ e1 : u ∧ u ∈ {peer, rep}
   ∀C, e1, e2, m . e = C[e1.m(e2)] ⟹ ∃u, c . h, σ ⊢ e1 : u c ∧ (u ∈ {peer, rep} ∨ pure(c, m))

We can show that the source and runtime notions of purity and encapsulation are closely related.

Lemma 6.7 (Source encapsulation to runtime encapsulation)

   Γ ⊢pure e,   h, σ ⊢ x : Γ(x),   h, σ ⊢ this : Γ(this)   ⟹   h, σ ⊢pure e

   Γ ⊢enc e,   h, σ ⊢ x : Γ(x),   h, σ ⊢ this : Γ(this)   ⟹   h, σ ⊢enc e

Proof. Using Lemma 5.8.

We now state the Encapsulation Theorem. It says that if an expression respects encapsulation (with respect to some h and σ), then during its execution it will only update objects that form part of the representation of the owner of the currently active object.
Theorem 6.8 (Encapsulation)

   ⊢enc P,   h, σ ⊢enc e,   σ ⊢ e, h ⇝∗ e′, h′,   a ∈ dom(h),   owner(h, σ(this)) ∉ owner+(h, a)   ⟹   h(a) = h′(a)

In terms of our running example, the Encapsulation Theorem guarantees that execution of a method by receiver 2 (i.e., σ(this) = 2) will not modify the objects 1, 6, 7, 8, 9, 10, 11 and 12. It may, however, modify the fields of 2, 3, 4, 5 and 13. On the other hand, consider the further Stack object 13, which is also owned by 1; execution of a method by 13 would be allowed to update the fields of 2, 3, 4 and 5, for instance by calling an impure method on 2, which in turn would update the fields of 2, 3, 4 and 5. These updates are permissible according to our theorem because 1, the owner of 13, is among the transitive owners of 2, 3, 4 and 5.

Before we can prove Theorem 6.8, we need to introduce a number of auxiliary lemmas. In the following lemma we show that execution preserves purity and encapsulation, and that the execution of pure expressions preserves the contents of allocated objects.

Lemma 6.9 (Preservation of purity and encapsulation). For any program such that ⊢enc P, if σ ⊢ e, h ⇝ e′, h′ then:
1. If h, σ ⊢pure e then (a) h′, σ ⊢pure e′ and (b) a ∈ dom(h) ⟹ h′(a) = h(a).
2. If h, σ ⊢enc e then (a) h′, σ ⊢enc e′.
Proof. See Appendix A.
The following definition of extended runtime contexts, D[·], allows for contexts within any number of nested calls: an expression e can be decomposed as e = D[frame σ′ e′] if and only if it contains a nested method call with receiver and argument as described by σ′ and method body e′.6

Definition 6.10 (Extended runtime contexts)

   D[·] ::= E[·] | E[frame σ D[·]]

6 D[·] contexts are more liberal than E[·] contexts; however, no such relation exists between D[·] and C[·] contexts. For example, x.f.m(·) is a C[·] but not a D[·] context, while a.m(frame σ ·) is a D[·] but not a C[·] context.
Lemma 6.11 guarantees that the execution of an encapsulated expression e can only modify an object a if a is directly mentioned in one of the nested calls (e = D[frame σ′ E[a.f := v]] or e = E[a.f := v]); furthermore, a must be a rep or peer of the receiver of the nested call which causes the modification. In terms of our running example, if execution of an encapsulated expression were to modify object 2, then one of the objects 1, 2 or 13 will be either the outermost receiver or the receiver in one of the stack frames in the expression itself.

Lemma 6.11 (Encapsulated expressions have limited write effects). If ⊢enc P, and h, σ ⊢enc e, and σ ⊢ e, h ⇝ e′, h′, and h(a) ≠ h′(a) for some a ∈ dom(h), then there exist σ′, f, v, D[·] and E[·] such that
1. e = D[frame σ′ E[a.f := v]] or (σ′ = σ and e = E[a.f := v]), and
2. h, σ′ ⊢ a : rep or h, σ′ ⊢ a : peer.

Proof. The proof proceeds by induction on the derivation of σ ⊢ e, h ⇝ e′, h′, considering cases for the last rule applied in the derivation, and using the preservation of encapsulation (Lemma 6.9) and the definition of encapsulated expressions (Definition 6.6). In Appendix A we outline some interesting cases.

Lemma 6.12 guarantees that, for an encapsulated expression e, the outermost receiver (σ(this)) is either a peer or a transitive owner of any of the receivers of non-pure method calls in e. In terms of our running example, if an expression e were encapsulated from the point of view of σ and contained a method call with receiver 2, i.e., if h, σ ⊢enc . . . frame (2, . . .) . . ., then the outermost receiver, i.e., σ(this), will be either 13, 2 or 1.

Lemma 6.12 (Owners of receivers precede them on the stack)

   ⊢enc P,   h, σ ⊢enc e,   e = D[frame σ′ e′],   σ′ = (a, _)   ⟹   h, σ′ ⊢pure e′   or   owner(h, σ(this)) ∈ owner+(h, a)

Proof. By induction on the definition of D[·] (cf. Definition 6.10). We freely use the fact that any evaluation context E[·] is trivially an expression context C[·] (note that neither kind of context contains frames).

(Case D[·] = E[·]): Then e = E[frame σ′ e′]. By Definition 6.6, we obtain that either h, σ′ ⊢pure e′ (in which case we are done) or h ⊢ σ′ ⊑enc σ. In the latter case, by Definition 6.5, we obtain that either σ(this) = owner(h, a) or owner(h, σ(this)) = owner(h, a). In either case, owner(h, σ(this)) ∈ owner+(h, a), as required.

(Case D[·] = E[frame σ′′ D′[·]]): Then e = E[frame σ′′ D′[frame σ′ e′]]. By Definition 6.6, we obtain that either h, σ′′ ⊢pure D′[frame σ′ e′] (in which case we are done) or both h ⊢ σ′′ ⊑enc σ and h, σ′′ ⊢enc D′[frame σ′ e′]. By induction, we obtain that either h, σ′ ⊢pure e′ (and we are done) or owner(h, σ′′(this)) ∈ owner+(h, a). By
combining this latter statement with h ⊢ σ′′ ⊑enc σ, we can show that owner(h, σ(this)) ∈ owner+(h, a) by an argument similar to the previous case.

Using the lemmas above, we can now prove the encapsulation theorem itself.

Theorem 6.8 (Encapsulation)

   ⊢enc P,   h, σ ⊢enc e,   σ ⊢ e, h ⇝∗ e′, h′,   a ∈ dom(h),   owner(h, σ(this)) ∉ owner+(h, a)   ⟹   h(a) = h′(a)

Proof. We prove the equivalent assertion that ⊢enc P, h, σ ⊢enc e, σ ⊢ e, h ⇝∗ e′, h′, a ∈ dom(h) and h(a) ≠ h′(a) together imply that owner(h, σ(this)) ∈ owner+(h, a). The proof proceeds by induction over the length of the reduction σ ⊢ e, h ⇝∗ e′, h′. The base case trivially holds, since we have an execution of length zero and thus h = h′. For the inductive step, we have σ ⊢ e, h ⇝∗ e′′, h′′ ⇝ e′, h′. By application of Lemma 6.9 we obtain h′′, σ ⊢enc e′′.

1st Case: h(a) ≠ h′′(a). The assertion follows from the inductive hypothesis.

2nd Case: h(a) = h′′(a). Because of the assumption that h(a) ≠ h′(a), we obtain h′′(a) ≠ h′(a). Therefore, by the fact that h′′, σ ⊢enc e′′ and Lemma 6.11, we obtain that there exist D[·], E[·] and σ′ such that (h′′, σ′ ⊢ a : rep or h′′, σ′ ⊢ a : peer) and (e′′ = D[frame σ′ E[a.f := v]] or (σ′ = σ and e′′ = E[a.f := v])). The first part of the conjunction gives owner(h′′, σ′(this)) ∈ owner+(h′′, a), while the latter, together with the fact that h′′, σ ⊢enc e′′ and an application of Lemma 6.12, gives owner(h′′, σ(this)) ∈ owner+(h′′, σ′(this)). The last two assertions give that owner(h′′, σ(this)) ∈ owner+(h′′, a), and because a and σ(this) were already defined in h, and owners do not change during execution, we also obtain that owner(h, σ(this)) ∈ owner+(h, a).

6.2   Owner-as-Modifier Discipline
The owner-as-modifier discipline [15] guarantees that any update to the field of an object is initiated by the object's owner. By "initiated", we mean that the owner is still on the stack when the modification takes place. This guarantee can only be made if we consider executions starting at the root of our heap topology; otherwise, there is no guarantee that the call stack will reflect the hierarchy of the heap topology. We formalise the notion of an initial heap and stack as follows: hinit indicates an initial heap which contains only one object (belonging to root) at address 1, while σinit indicates an initial stack, where σinit = (1, null).
Theorem 6.14 states the owner-as-modifier guarantee formally. In terms of our running example, any modification of the fields of, say, object 4 is, according to the owner-as-modifier discipline, guaranteed to happen only while 2 is on the stack or is the outermost receiver (i.e., either a direct or an indirect caller). We first prove Lemma 6.13 below, which guarantees that, if we consider reductions that begin from an initial heap and stack, then the resulting sequence of stack frames has the property that either the corresponding expression is pure (in which case the frame may result from a call in an arbitrary position in the heap topology, via an any or a lost reference), or else all of the (transitive) owners of the receiver in the stack frame (except root, which is not an object anyway) are receivers in a preceding stack frame. Note that this is subtly different from the requirements on the sequence of stacks imposed by the judgement hinit, σinit ⊢enc e, which says that if a stack frame is in the sequence, then it conforms to the restrictions imposed by the ⊑enc relation. Applying Lemma 6.13 to our running example, an execution of an encapsulated expression starting from the initial configuration and leading to an impure expression containing a method call with receiver 12 is guaranteed to have a method call with receiver 10 enclosing the earlier method call.

Lemma 6.13 (All owners are preserved on the stack)

   ⊢enc P,   hinit, σinit ⊢enc e,   σinit ⊢ e, hinit ⇝∗ e′, h′,   e′ = D[frame σ′ e′′],   a ∈ owner+(h′, σ′(this)) \ {root, 1}
   ⟹   h′, σ′ ⊢pure e′′   or   ∃D′[·], D′′[·], σ such that D[·] = D′[frame σ D′′[·]] and σ(this) = a

Proof. By induction on the length of the reduction σinit ⊢ e, hinit ⇝∗ e′, h′. For the base case, i.e., when e = e′, we use induction over the structure of D[·]. For the inductive step, i.e., when σinit ⊢ e, hinit ⇝∗ e′′, h′′ ⇝ e′, h′, we proceed by case analysis over the last step in the derivation.

We now state the owner-as-modifier guarantee, and prove it using the lemma above, the preservation of encapsulation (Lemma 6.9), and the fact that encapsulated expressions have limited write effects (Lemma 6.11).

Theorem 6.14 (Owner-as-modifier)

   ⊢enc P,   hinit, σinit ⊢enc e,   σinit ⊢ e, hinit ⇝∗ e′′, h′′ ⇝ e′, h′,   a ∈ dom(h′′),   h′′(a) ≠ h′(a),   a′ = owner(h′′, a) ≠ root
   ⟹   σinit(this) = a′   or   ∃D[·], D′[·], σ, e′′′ such that e′′ = D[frame σ D′[e′′′]] and σ(this) = a′

Proof. If a′ = 1, then we are done by construction of σinit. Therefore, we can proceed assuming that a′ ∉ {root, 1}.
The first three premises and Lemma 6.9 give h′′, σinit ⊢enc e′′. This, together with the fourth and fifth premises and Lemma 6.11, gives that there exist f, v, D1[·], E1[·] and σ′ such that (h′′, σ′ ⊢ a : rep or h′′, σ′ ⊢ a : peer) and (e′′ = D1[frame σ′ E1[a.f := v]] or (σ′ = σinit and e′′ = E1[a.f := v])).

The second part of the conjunction above gives the following two cases:

1st Case: σ′ = σinit and e′′ = E1[a.f := v]. Then, because h′′, σinit ⊢enc e′′, using the definition of encapsulated expressions we obtain that h′′, σinit ⊢ a : rep (which gives a′ = 1, in which case we are done) or h′′, σinit ⊢ a : peer (which gives a′ = root, and then we are done again).

2nd Case: e′′ = D1[frame σ′ E1[a.f := v]]. The first part of our conjunction from earlier gives that either σ′(this) = a′ or owner(h′′, σ′(this)) = a′.

2.1st Case: σ′(this) = a′. We choose σ = σ′, D[·] = D1[·], D′[·] = E1[·] and e′′′ = a.f := v. This concludes the case.

2.2nd Case: owner(h′′, σ′(this)) = a′. Because a′ ∉ {root, 1}, we can apply Lemma 6.13 and obtain that there exist further contexts D2[·], D3[·] and a frame σ such that D1[·] = D2[frame σ D3[·]] and σ(this) = a′. We now choose D[·] = D2[·], D′[·] = D3[·] and e′′′ = frame σ′ E1[a.f := v], and conclude the case.
7   Related Work
Over the past ten years, there have been a large number of publications on ownership and ownership type systems. In this section, we discuss work that is most closely related to the focus of this paper, namely the separation of ownership topologies from encapsulation policies and the formalisation of ownership type systems. Most ownership type systems combine the enforcement of an ownership topology and an encapsulation policy. Ownership Types [10] and its descendants [4,6,8,9,29] enforce an ownership topology as well as the owner-as-dominator encapsulation policy, which guarantees that every reference chain from an object in the root context to an object goes through the object's owner. Similarly, Universe Types [14,15,26,28] enforce an ownership topology as well as the owner-as-modifier encapsulation policy, which guarantees that every modification of an object is initiated by the object's owner. In this paper, we showed how to separate the Topological System from the Encapsulation System. This separation is facilitated by distinguishing between the 'don't care' modifier any and the 'don't know' modifier lost, because the Topological Type System treats them differently. Ownership domains [1] was the first ownership system that separated the ownership topology from the encapsulation policy. This is achieved by allowing programmers to distinguish between private and public ownership domains and to declare links between ownership domains. While the Encapsulation Type
System presented in this paper enforces a fixed encapsulation policy, it is possible to combine our Topological System with various encapsulation policies. Dietl and Müller [16] encoded ownership types on top of Dependent Classes [19]. Dependent Classes are used to enforce the ownership topology, whereas encapsulation has to be enforced separately. Most ownership type systems have been formalised for a small programming language similar to the one used in this paper. The formalisation of OGJ [29] is based on Java generics. Ownership information is encoded in the type parameters, which makes the formalisation simple. Dynamic ownership [23], as available in Spec#, uses ghost state to encode the ownership topology and the Boogie verification methodology to enforce an encapsulation policy similar to that of Universe Types. The Topological Type System presented in this paper can be combined with the Boogie methodology. Type checkers for the Universe Type System are implemented in the JML tools [12,22] and as a pluggable type system for Scala [13]. In this work, in keeping with most works on Universe or Ownership Types, each object is owned directly by at most one other object, and the ownership hierarchy forms a tree. This view can, however, be generalised to allow several direct owners, with the ownership hierarchy forming a DAG [7].
8   Conclusion
We presented UT, a new formalisation of the Universe Type System, which is given in two steps: first presenting a Topological Type System that builds the ownership topology, and then augmenting it to obtain the Encapsulation Type System. The two-step formalisation permits a gentler presentation of the mathematical machinery we develop, and primarily allows for separation of concerns when extending this work, as some extensions and applications of Universe Types do not require encapsulation properties. Both of these factors facilitate the adoption of the work as a starting point for further work. We introduced the distinction between the 'don't care' modifier any and the 'don't know' modifier lost. We proved subject reduction (for both the Topological and the Encapsulation Type System) for a small-step operational semantics of a subset of Java. Like UT, most ownership type systems have been formalised on paper. We also formalised a version of Universe Types including arrays in Isabelle and proved type safety [21]. The main difference is that there we use a big-step semantics, whereas here we use a small-step semantics. This formalisation of the Universe Type System is the basis for various future extensions. We plan to extend our work on Generic Universe Types [14] to also separate topology from encapsulation. We are also planning to improve the expressiveness of Universe Types by adding path-dependent types. Adapting the type system to Java bytecode is further future work. This will permit the use of Universe Types for the verification of mobile bytecode in a Proof-Carrying-Code architecture such as the one proposed by the Mobius project [20].
Acknowledgements We thank our reviewer for extensive feedback and many useful suggestions. This work was funded in part by the Information Society Technologies program of the European Commission, Future and Emerging Technologies under the IST-2005-015905 MOBIUS project, and the EPSRC grant Practical Ownership Types for Objects and Aspect Programs, EP/D061644/1.
References

1. Aldrich, J., Chambers, C.: Ownership domains: Separating aliasing policy from mechanism. In: Odersky, M. (ed.) ECOOP 2004. LNCS, vol. 3086, pp. 1–25. Springer, Heidelberg (2004)
2. Andreae, C., Coady, Y., Gibbs, C., Noble, J., Vitek, J., Zhao, T.: Scoped types and aspects for real-time Java. In: Thomas, D. (ed.) ECOOP 2006. LNCS, vol. 4067, pp. 124–147. Springer, Heidelberg (2006)
3. Banerjee, A., Naumann, D.: Representation independence, confinement, and access control. In: Principles of Programming Languages (POPL), pp. 166–177. ACM Press, New York (2002)
4. Boyapati, C.: SafeJava: A Unified Type System for Safe Programming. PhD thesis, MIT (2004)
5. Boyapati, C., Lee, R., Rinard, M.: Ownership types for safe programming: Preventing data races and deadlocks. In: Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA). ACM, New York (2002)
6. Boyapati, C., Liskov, B., Shrira, L.: Ownership types for object encapsulation. In: Principles of Programming Languages (POPL), pp. 213–223. ACM Press, New York (2003)
7. Cameron, N., Drossopoulou, S., Noble, J., Smith, M.: Multiple ownership. In: Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pp. 441–460. ACM Press, New York (2007)
8. Clarke, D.: Object Ownership and Containment. PhD thesis, University of New South Wales (2001)
9. Clarke, D., Drossopoulou, S.: Ownership, encapsulation and the disjointness of types and effects. In: Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pp. 292–310. ACM, New York (2002)
10. Clarke, D., Potter, J., Noble, J.: Ownership types for flexible alias protection. In: Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), vol. 33(10), pp. 48–64. ACM Press, New York (1998)
11. Cunningham, D., Drossopoulou, S., Eisenbach, S.: Universe Types for race safety. In: Verification and Analysis of Multi-threaded Java-like Programs (VAMP), pp. 20–51 (2007)
12. Dietl, W.: JML2 Eclipse plug-in, http://pm.inf.ethz.ch/research/universes/tools/eclipse/
13. Dietl, W.: Universe type system tools for Scala, http://pm.inf.ethz.ch/research/universes/tools/scala/
14. Dietl, W., Drossopoulou, S., Müller, P.: Generic Universe Types. In: Ernst, E. (ed.) ECOOP 2007. LNCS, vol. 4609, pp. 28–53. Springer, Heidelberg (2007)
15. Dietl, W., Müller, P.: Universes: Lightweight ownership for JML. Journal of Object Technology (JOT) 4(8), 5–32 (2005)
16. Dietl, W., Müller, P.: Ownership type systems and dependent classes. In: Foundations of Object-Oriented Languages (FOOL) (2008)
17. Felleisen, M., Friedman, D.P., Kohlbecker, E., Duba, B.: A syntactic theory of sequential control. Journal of Theoretical Computer Science 52, 205–237 (1987)
18. Flanagan, C., Qadeer, S.: Types for atomicity. In: Types in Language Design and Implementation (TLDI), pp. 1–12. ACM Press, New York (2003)
19. Gasiunas, V., Mezini, M., Ostermann, K.: Dependent classes. In: Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pp. 133–152. ACM Press, New York (2007)
20. Global Computing Proactive Initiative. Mobius: Mobility, Ubiquity and Security. IST-15905, http://mobius.inria.fr/
21. Klebermaß, M.: An Isabelle formalization of the Universe Type System. Master's thesis, Technical University Munich and ETH Zurich (2007), http://pm.inf.ethz.ch/projects/student_docs/Martin_Klebermass/
22. Leavens, G.T., Poll, E., Clifton, C., Cheon, Y., Ruby, C., Cok, D., Müller, P., Kiniry, J., Chalin, P., Zimmerman, D.M., Dietl, W.: JML reference manual. Department of Computer Science, Iowa State University (2008), http://www.jmlspecs.org
23. Leino, K.R.M., Müller, P.: Object invariants in dynamic contexts. In: Odersky, M. (ed.) ECOOP 2004. LNCS, vol. 3086, pp. 491–516. Springer, Heidelberg (2004)
24. Lu, Y., Potter, J.: Protecting representation with effect encapsulation. In: Principles of Programming Languages (POPL), pp. 359–371. ACM Press, New York (2006)
25. Mitchell, J.C., Plotkin, G.D.: Abstract types have existential type. ACM Transactions on Programming Languages and Systems 10(3), 470–502 (1988)
26. Müller, P.: Modular Specification and Verification of Object-Oriented Programs. LNCS, vol. 2262. Springer, Heidelberg (2002)
27. Müller, P., Poetzsch-Heffter, A., Leavens, G.T.: Modular invariants for layered object structures. Science of Computer Programming 62, 253–286 (2006)
28. Müller, P., Rudich, A.: Ownership transfer in Universe Types. In: Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pp. 461–478. ACM Press, New York (2007)
29. Potanin, A., Noble, J., Clarke, D., Biddle, R.: Generic ownership for generic Java. In: Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pp. 311–324. ACM Press, New York (2006)
30. Salcianu, A., Rinard, M.C.: Purity and side effect analysis for Java programs. In: Cousot, R. (ed.) VMCAI 2005. LNCS, vol. 3385, pp. 199–215. Springer, Heidelberg (2005)
A   Supporting Results and Proofs
Lemma A.1 (Owners are defined in a well-formed heap). If ⊢ h and a ∈ dom(h), then (owner+(h, a) \ {root}) ⊆ dom(h).

Proof. By induction on the definition of owner+(h, a), using the rule (WFAddr).
Lemma A.2 (Object owner and class preservation)

(i)   h(a) = (o, c, _)  and  (h′, a′) = alloc(h, σ, t)   ⟹   h′(a) = (o, c, _)  and  owner+(h′, a) = owner+(h, a)
(ii)  h(a) = (o, c, _)  and  h′ = h[(a′, f) ↦ v]   ⟹   h′(a) = (o, c, _)  and  owner+(h′, a) = owner+(h, a)
(iii) h(a) = (o, c, _)  and  σ ⊢ e, h ⇝ e′, h′   ⟹   h′(a) = (o, c, _)  and  owner+(h′, a) = owner+(h, a)

Proof. (i) Immediate from the definition of alloc(h, σ, t), noting that it is not possible that a = a′, since a ∈ dom(h) and a′ ∉ dom(h). (ii) Follows from the definition of h[(a′, f) ↦ v], in which the owner and class information is explicitly preserved. (iii) By induction on the derivation of σ ⊢ e, h ⇝ e′, h′, using parts (i) and (ii).

Lemma A.3 (Heap domain inclusion)

(i)   If (h′, a) = alloc(h, σ, t), then a ∉ dom(h) and dom(h′) = dom(h) ∪ {a}.
(ii)  If a ∈ dom(h) and h′ = h[(a, f) ↦ v], then dom(h′) = dom(h).
(iii) If σ ⊢ e, h ⇝ e′, h′, then dom(h) ⊆ dom(h′).

Proof. (i) Immediate from the definition of alloc(h, σ, t). (ii) Immediate from the definition of h[(a, f) ↦ v]. (iii) By induction on the derivation of σ ⊢ e, h ⇝ e′, h′, using parts (i) and (ii).

Lemma A.4 (Soundness of object creation)

   u ∈ {rep, peer},   h ⊢ σ,   (h′, a) = alloc(h, σ, u c)   ⟹   h′, σ ⊢ a : u c

Proof. By case analysis of u, the definition of alloc(h, σ, u c) and rule (tAddr).

Lemma A.5 (Heap operations preserve value types). If h, σ ⊢ v : t, then
(i)  if (h′, a) = alloc(h, σ, t′), then h′, σ ⊢ v : t;
(ii) if h′ = h[(a, f) ↦ v′], then h′, σ ⊢ v : t.
Proof. There are two sub-cases. If v = null, we can still derive the type judgement using (tNull). If v = b, neither of the operations changes the ownership and class information of an existing object in the heap, as we saw in Lemma A.2; thus we can still derive h′, σ ⊢ b : t in both cases, using (tAddr).

Lemma A.6 (Heap operations and well-formedness). If ⊢ h and h ⊢ σ, then
(i)  if u ∈ {rep, peer} and (h′, a) = alloc(h, σ, u c), then ⊢ h′;
(ii) if h, σ ⊢ a : u c, F(c, f) = s and h, (a, _) ⊢ v : s, then ⊢ h[(a, f) ↦ v].

Proof. (i) First, we note that by Lemma A.3(i) and Lemma A.2(i) we know:

   a ∉ dom(h) ∧ dom(h′) = dom(h) ∪ {a}                             (9)
   ∀b ∈ dom(h) . class(h′, b) = class(h, b)                        (10)
   ∀b ∈ dom(h) . owner(h′, b) = owner(h, b)                        (11)
   ∀b ∈ dom(h) . owner+(h′, b) = owner+(h, b)                      (12)

We aim to show ⊢ h′ by rule (WFHeap, Definition 5.7). By the assumption ⊢ h, and the form of the rule (which is the only rule which can derive such judgements), we must have:

   b ∈ dom(h) ⟹ root ∈ owner+(h, b)                                (13)
   ∀b ∈ dom(h) . h ⊢ b                                             (14)

We show the second premise of (WFHeap) first, i.e., that ∀b ∈ dom(h′) . h′ ⊢ b. To show this, we consider two cases for b:

(b ≠ a): Then by (9), we know b ∈ dom(h). By (14) we have h ⊢ b. From the form of (WFAddr), and using (10) and (11), it suffices to prove (where class(h′, b) = class(h, b) = c) that F(c, f) = s ⟹ h′, (b, _) ⊢ h′(b.f) : s. Since h ⊢ b, we know that F(c, f) = s ⟹ h, (b, _) ⊢ h(b.f) : s. We complete the case by applying Lemma A.5(i).

(b = a): We show that the premises of (WFAddr) hold directly, to deduce h′ ⊢ a. Since h ⊢ σ, we have in particular that σ(this) ∈ dom(h). By Lemma A.1, we also know that either owner(h, σ(this)) ∈ dom(h) or owner(h, σ(this)) = root. From the definition of alloc(h, σ, u c) we can then show owner(h′, a) ∈ (dom(h) ∪ {root}) ⊆ (dom(h′) ∪ {root}). Furthermore, by Lemma A.4, we know that h′, σ ⊢ a : u c, as required.

To show the first premise of (WFHeap), suppose that b ∈ dom(h′). By (9), we know that either b ∈ dom(h) or b = a. In the former case, we have (by (13) and (12)) root ∈ owner+(h, b) = owner+(h′, b), as required. In
the latter case, from the argument above we know that owner(h′, a) ∈ (dom(h) ∪ {root}). Therefore, by (13) and the definition of owner+(h′, a), we know that root ∈ owner+(h′, a), as required. Thus we have all the premises needed to apply the rule (WFHeap) and obtain ⊢ h′.

(ii) Let h′ = h[(a, f) ↦ v] in what follows. We aim to deduce ⊢ h′ by rule (WFHeap). By Lemma A.3(ii) we know that dom(h′) = dom(h). By Lemma A.2(ii), we therefore also know that the ownership and class information defined in h′ is exactly that defined in h. Therefore, given the assumption ⊢ h, and considering the premises of the rules (WFHeap) and (WFAddr), it suffices to prove that:

   a′ ∈ dom(h′),   class(h′, a′) = c′,   F(c′, f′) = s′   ⟹   h′, (a′, _) ⊢ h′(a′.f′) : s′

We now consider two cases. Firstly, if either a′ ≠ a or f′ ≠ f, then by definition of h[(a, f) ↦ v] we have h′(a′.f′) = h(a′.f′). Therefore h′, (a′, _) ⊢ h′(a′.f′) : s′ follows from the assumption ⊢ h (examining the premises of (WFHeap) and (WFAddr)). On the other hand, if both a′ = a and f′ = f, then c′ = class(h′, a′) = class(h, a) = c, and so s′ = F(c′, f′) = F(c, f) = s. Therefore, it suffices to prove h′, (a, _) ⊢ h′(a.f) : s. This follows from applying Lemma A.5(ii) to the assumption h, (a, _) ⊢ v : s, noting that by definition of h[(a, f) ↦ v] we have h′(a.f) = v.

Lemma A.7 (Extracting information from address type judgements). If h, σ ⊢ a : u c, then
(i)   a ∈ dom(h);
(ii)  class(h, a) ≤c c;
(iii) (σ(this), owner(h, σ(this))) ⊢ (a, owner(h, a)) : u.

Proof. By induction on the derivation of h, σ ⊢ a : u c, specifically using the rules (tAddr), (tSub) and Lemma 3.2.

Lemma A.8 (Extracting information from method type judgements). If h, σ ⊢ e1.m(e2) : t, then there exist u1 and c1 such that:
(i)   h, σ ⊢ e1 : u1 c1
(ii)  M(c1, m) = _ : sr(sx)
(iii) u1 ▷ sx ≠ lost
(iv)  h, σ ⊢ e2 : u1 ▷ sx

Note that we make no requirements on how u1 and c1 are related to t; the assumption simply insists that the call is typeable somehow.

Proof. By induction on the derivation of h, σ ⊢ e1.m(e2) : t, specifically using the rules (tCall) and (tSub).
Lemma A.9 (Viewpoint adaptation preserves subtyping)

   s ≤ s′  ⟹  u ▷ s ≤ u ▷ s′

Proof. By case analysis of u and the relation u1 ≤u u2.
Lemma 5.4 (Determining the relative Universe Types of values)
(i) If h, σ ⊢ a : u and h, (a, _) ⊢ v : t, then h, σ ⊢ v : u ▷ t.
(ii) If h, σ ⊢ a : u and h, σ ⊢ v : u ▷ t and u ▷ t ≠ lost, then, for any value v′, we have h, (a, v′) ⊢ v : t.

Proof. We use Lemma A.7(iii) to extract the Universe determination judgement for addresses. We have two cases for v:
– If v = null, then we can trivially derive any type judgement using (tNull).
– If v = b, we use Lemma A.7(iii) to extract Universe judgements for the addresses a and b. Then we apply Lemma 3.4 and the rule (tAddr) to obtain the desired judgement.

Theorem 5.9 (Topological Subject Reduction). If a program is well-formed, then

   ⊢ h,   h ⊢ σ,   h, σ ⊢ e : t,   σ ⊢ e, h ⇝ e′, h′   ⟹   ⊢ h′,   h′ ⊢ σ,   h′, σ ⊢ e′ : t

Proof. By induction over the structure of

   h, σ ⊢ e : t                                                    (15)
The most interesting cases are when the last rules used to derive (15) are (tField), (tAssign) and (tCall), which we show here. We leave the other cases to the interested reader. In all cases, the conclusion h′ ⊢ σ follows straightforwardly, using Lemmas A.3 and A.7 as necessary.

(tField): From the premises of the rule we know

   e = e1.f                                                        (16)
   t = u ▷ s                                                       (17)
   h, σ ⊢ e1 : u c                                                 (18)
   F(c, f) = s                                                     (19)

From (16) we know the reduction was derived using either (rField) or (rEvalCtxt).

(rField): we know

   e1 = a                                                          (20)
   h′ = h                                                          (21)
   e′ = h(a.f) = v                                                 (22)
From (18) and (20), and applying Lemma A.7(i), we know that

   a ∈ dom(h)                                                      (23)

From ⊢ h, the premises of the rule (WFHeap) and (23), we know in particular that

   h ⊢ a                                                           (24)

From (24) and the premise of (WFAddr), along with (19) and (22), we deduce

   h, (a, _) ⊢ v : s                                               (25)

By (21), (20), (18), (25) and Lemma 5.4 we obtain h, σ ⊢ v : u ▷ s. The resultant heap h′ is trivially shown to be well-formed from the assumption ⊢ h and (21).

(rEvalCtxt): we know

   e′ = e1′.f                                                      (26)
   σ ⊢ e1, h ⇝ e1′, h′                                             (27)

By ⊢ h, (18), (27) and the inductive hypothesis, we know

   h′, σ ⊢ e1′ : u c                                               (28)
   ⊢ h′                                                            (29)

By (28), (19), (17), (26) and (tField) we derive our first required conclusion, h′, σ ⊢ e′ : t, and (29) gives us the second required conclusion.

(tAssign): From the premises of the rule we know

   e = e1.f := e2                                                  (30)
   h, σ ⊢ e1 : u c                                                 (31)
   F(c, f) = s                                                     (32)
   u ▷ s ≠ lost                                                    (33)
   h, σ ⊢ e2 : u ▷ s                                               (34)

From the structure of e, (30), we know the reduction could have been derived using either (rAssign) or (rEvalCtxt). We here consider the former case, (rAssign), and leave the (easier) latter case to the interested reader. From (rAssign) we know:

   e1 = a                                                          (35)
   e2 = e′ = v                                                     (36)
   h′ = h[(a, f) ↦ v]                                              (37)
Applying Lemma 5.4(ii), using (35), (31), (36), (34), (33), we obtain

   h, (a, _) ⊢ v : s                                               (38)

By the assumption ⊢ h, (35), (31), (32), (38) and Lemma A.6 we get ⊢ h[(a, f) ↦ v]. Also, by (36) and (34) we obtain

   h, σ ⊢ e′ : u ▷ s                                               (39)

and by (39) and Lemma A.5 we get h[(a, f) ↦ v], σ ⊢ e′ : u ▷ s, as required.

(tCall): From the premises of this rule we know

   e = e1.m(e2)                                                    (40)
   h, σ ⊢ e1 : u1 c1                                               (41)
   M(c1, m) = p : sr1(sx1)                                         (42)
   u1 ▷ sx1 ≠ lost                                                 (43)
   h, σ ⊢ e2 : u1 ▷ sx1                                            (44)
   t = u1 ▷ sr1                                                    (45)

From the structure of e derived from (40), we know the reduction could have been derived using either (rCall) or (rEvalCtxt). We here consider the case for (rCall) and leave the other case to the interested reader. From (rCall) and its assumptions we know

   e1 = a and e2 = v                                               (46)
   h′ = h                                                          (47)
   σ′ = (a, v)                                                     (48)
   ca = class(h, a)                                                (49)
   eb = MBody(ca, m)                                               (50)
   e′ = frame σ′ eb                                                (51)

From (47) and the assumption ⊢ h, we know that the resulting heap is well-formed. Thus, from (51) and (45), we only need to show that

   h, σ ⊢ frame σ′ eb : u1 ▷ sr1                                   (52)

The rest of the proof is dedicated to showing this. From (41), (46), (49) and Lemma A.7 we derive

   ca ≤c c1                                                        (53)
From (53), (42), the premises of the rule (WFClass) and the assumption of well-formed programs, giving ⊢ ca, we derive

   M(ca, m) = p : sr(sx)                                           (54)
   sr ≤ sr1                                                        (55)
   sx1 ≤ sx                                                        (56)

Also, by (50), (54), ⊢ ca and the premises of (WFClass) we derive

   Γ′ ⊢ eb : sr                                                    (57)
   where Γ′ = (self ca, sx)                                        (58)

Since σ′(this) = a (using (48)), we can use the rule (Self) (of Figure 4) to derive

   (σ′(this), owner(h, σ′(this))) ⊢ (a, owner(h, a)) : self        (59)

Using (49), (59) and (tAddr) we derive

   h, σ′ ⊢ a : self ca                                             (60)

Also, using Lemma 5.4(ii) with (46), (48), (41), (44), we get

   h, σ′ ⊢ v : sx1                                                 (61)

and by (61), (56) and (tSub) we derive

   h, σ′ ⊢ v : sx                                                  (62)

By (48), (57), (58), (60), (62) and Lemma 5.8 we derive

   h, σ′ ⊢ eb : sr                                                 (63)

and by (63), (41), (46), (48) and (tFrame) we get

   h, σ ⊢ frame σ′ eb : u1 ▷ sr                                    (64)

From (55) and Lemma A.9 we derive

   u1 ▷ sr ≤ u1 ▷ sr1                                              (65)
(65)
and thus by (64), (65) and (tSub) we get h, σ frame σ eb : u1 sr1 as required by (52). Lemma 6.9 Preservation of purity and encapsulation. For any program such that enc P , if σ e, h e , h then:
108
D. Cunningham et al.
1. If h, σ pure e then (a) h , σ pure e (b) a ∈ dom(h) ⇒ h (a) = h(a) 2. If h, σ enc e then (a) h , σ enc e Proof. The proof proceeds by induction on the derivation of σ e, h e , h
(66)
considering cases for the last rule applied in the derivation. We show here the interesting cases and leave the simpler ones for the interested reader. (rAssign): Then we know e = (b.f := v) e = v
(67) (68)
h = h[(b, f ) → v]
(69)
From (67) we know h, σ pure e. So we do not need to consider case (1). To show case (2), we assume h, σ enc b.f := v
(70)
By using (70) and unravelling Definition 6.6, we obtain that (for some type t): h, σ e : t
(71)
By applying the Topological Subject Reduction Theorem 5.9 and using (68), we therefore know h, σ v : t
(72)
Combining this with (69), we obtain h , σ v : t, and then apply Definition 6.6 to obtain h , σ enc v as required. (rCall): Let c = class(h, b). Then we know e = b.m(v) e = frame (b, v) eb
(73) (74)
eb = MBody(c , m) h = h
(75) (76)
We consider the two cases we need to show in turn: 1. (h, σ pure e): From Definition 6.6 we know that: h, σ b.m(v) : t
(77)
h, σ b : c
(78)
pure(c, m)
(79)
Universe Types for Topology and Encapsulation
109
Using the rule (Self) of Figure 4, and the rule (tAddr) we can derive h, (b, v) b : self c
(80)
Applying rule (tThis), we can then deduce h, (b, v) this : self c
(81)
From the assumption enc P we know enc c and thus by (75) and Definition 6.3 we can write: M (c, m) = pure : sr (sx )
(82)
By (78) and Lemma A.7, we deduce c ≤ c c
(83)
By Definition 5.3 and (82) and (83), we obtain: M (c , m) = pure : sr (sx ) sx
sx ≤ sr ≤ sr
(84) (85) (86)
From the assumption enc P we know enc c and thus by (75) and Definition 6.3 we know that: (self c , sx ) pure eb
(87)
Returning to (77), and applying Lemma A.8, we obtain (for some u , c ): h, σ b : u c M (c , m) = p : sr (sx )
u sx = lost h, σ v : u sx
(88) (89) (90) (91)
We can now take (88), (90) and (91) and apply Lemma 5.4(ii) to obtain: h, (b, v) v : sx
(92)
By (88) and Lemma A.7, we deduce c ≤c c
(93)
Combining this with Definition 5.3 and (89), we obtain in particular: sx ≤ sx
(94)
110
D. Cunningham et al.
By (92), (94), Lemma A.9 and (tSub), we obtain h, (b, v) v : sx
(95)
From this, we apply the rule (tVar) to obtain h, (b, v) x : sx
(96)
and as a result of Lemma 6.7, (87), (81) and (96) we get: h, (b, v) pure eb
(97)
By (77), (66), (74) and the Topological Subject Reduction Theorem 5.9 we get h, σ frame (b, v) eb : t (98) and hence by (97), (98), (74), (76), and Definition 6.6 we obtain h , σ pure e which completes the case. 2. (h, σ enc e): From Definition 6.6 we know we have two sub-cases. The first sub-case states that the method called is pure and the proof then progresses as the previous case for h, σ pure e. Therefore, it suffices to consider the case when, for some u ∈ {peer, rep} and source types sr , sx , we have: h, σ b : u c h, σ b.m(v) : t
(99) (100)
M (c, m) = impure : sr (sx )
(101)
Since the method m is declared impure in c, and we have assumed enc P (and thus enc c), it follows from (75) and Definition 6.3 that we know (self c, sx ) enc eb
(102)
By similar argument to the previous case, we can deduce from (99) and (100) that h, (b, v) this : self c
(103)
h, (b, v) x : sx
(104)
and thus by (103), (104), (102) and Lemma 6.7 we get h, (b, v) enc eb
(105)
From (99) we derive (see Definition 6.5): h (b, v) enc σ
(106)
Universe Types for Topology and Encapsulation
111
Also, by (100), (66), (74) and the Topological Subject Reduction we know h, σ frame (b, v) eb : t (107) Thus by (107), (106), (105), (74), (76). and Definition 6.6 we conclude h , σ enc e as required. (rFrame2): 1. (h, σ pure e): Similar to the following case. 2. (h, σ enc e): From the rule we know e = frame σ v e = v
(108) (109)
h = h
(110)
From Definition 6.6 we know that either h, σ pure v or h, σ frame σ v : t h σ enc σ
(111) (112)
h, σ enc v
(113)
By (111), (66), the Topological Subject Reduction Theorem 5.9 and (109) we get h, σ v : t (114) and by (114), (76), and Definition 6.6 we obtain h , σ enc v as required.
Lemma 6.11 (Encapsulated expressions have limited write effects) If enc P , and h, σ enc e, and σ e, h e , h , and h(a) = h (a) for some a ∈ dom(h), then there exist σ , f, v, D[·], and E[·] such that 1. e = D[frame σ E[a.f := v]] or (σ = σ and e = E[a.f := v]). and 2. h, σ a : rep or h, σ a : peer Proof. The proof proceeds by induction on the derivation of σ e, h e , h considering cases for the last rule applied in the derivation We show here the interesting cases: (rAssign): Then, because h(a) = h (a), we know that there exists a field f , and value v such that e = (a.f := v), and h = h[(a, f ) → v]. From the latter, the encapsulation property, and the conditions of Definition 6.6 we know there exists u ∈ {self, rep, peer} such that: h, σ a : u. We choose σ = σ and E[·] = [·], and the rest follows easily.
112
D. Cunningham et al.
(rCall): Then h = h, and thus the case is vacuous. (rFrame2): Then h = h, and thus the case is vacuous. (rFrame1): Then we know that there exist σ1 , e1 and e1 , such that e = frame σ1 e1 , and σ1 e1 , h e1 , h , and e = frame σ1 e1 . By application of the induction hypothesis, we obtain that there exists a σ , f, v, D1 [·], and E1 [·] such that 1. h, σ a : rep or h, σ a : peer and 2. e1 = D1 [frame σ E1 [a.f := v]], or (σ = σ1 and e1 = E1 [a.f := v]). The second part of the conjunction above gives two cases: 1st Case e1 = D1 [frame σ E1 [a.f := v]]. We then choose E[·] = E1 [·], and D[·] = frame σ1 D1 [frame σ E1 [·]], and the rest follows. 2nd Case σ = σ1 and e1 = E1 [a.f := v]. We then choose E[·] = E1 [·], and D[·] = frame σ1 E[·], and the rest follows.
COSTA: Design and Implementation of a Cost and Termination Analyzer for Java Bytecode E. Albert1 , P. Arenas1 , S. Genaim2 , G. Puebla2 , and D. Zanardini2 1
2
DSIC, Complutense University of Madrid, E-28040 Madrid, Spain CLIP, Technical University of Madrid, E-28660 Boadilla del Monte, Madrid, Spain
Abstract. This paper describes the architecture of costa, an abstract interpretation based cost and termination analyzer for Java bytecode. The system receives as input a bytecode program, (a choice of) a resource of interest and tries to obtain an upper bound of the resource consumption of the program. costa provides several non-trivial notions of cost, as the consumption of the heap, the number of bytecode instructions executed and the number of calls to a specific method. Additionally, costa tries to prove termination of the bytecode program which implies the boundedness of any resource consumption. Having cost and termination together is interesting, as both analyses share most of the machinery to, respectively, infer cost upper bounds and to prove that the execution length is always finite (i.e., the program terminates). We report on experimental results which show that costa can deal with programs of realistic size and complexity, including programs which use Java libraries. To the best of our knowledge, this system provides for the first time evidence that resource usage analysis can be applied to a realistic object-oriented, bytecode programming language.
1
Introduction
Research about automatic cost analysis goes back to the seminal work by Wegbreit in 1975 [29], which proposes to analyze the performance of a program by deriving closed-form expressions for its execution behavior. This approach consists of two phases. (1) In the first phase, given a program and the description of some cost measure, a set of equations is produced, which captures the cost of the program in terms of the size of its input data. Such equations are generated by converting the iteration constructs (loops and recursion) of the program into recurrence, and by inferring size relations which approximate how the size of arguments varies between calls. This set of equations can be regarded as a set of Recurrence Relations (RR for short). (2) The aim of the second phase is to obtain a non-recursive representation (solution) of the equations, known as closed-form solution. In most cases, it is not possible to find an exact solution, and the closed-form corresponds to an upper bound. F.S. de Boer et al. (Eds.): FMCO 2007, LNCS 5382, pp. 113–132, 2008. c Springer-Verlag Berlin Heidelberg 2008
114
E. Albert et al.
There are a good number of cost analysis frameworks for a wide variety of programming languages, including functional, logic and imperative [23,14,3]. Despite such a large amount of work, applying cost analysis to realistic languages, and programs with realistic size and complexity, is still an open issue, and, there is a lack of working tools. Termination analysis [10,19] can be regarded as another kind of resource usage analysis, and it has also been studied in the context of several programming languages. Termination analysis tries to prove that a program cannot infinitely run by considering its iterative and recursive structures and by proving that the number of times they can be executed in a program run is bounded. Putting cost and termination analysis together in the same tool makes sense because of the tight relation between them: proving termination implies that the amount of resources used at runtime is finite. In practical terms, cost and termination analysis share most of the system machinery, as they need to consider and infer roughly the same information about the program. We will use the term resource usage analysis (RUA) to refer to either cost or termination analyses. The present paper describes the design and implementation features of costa, a tool which is, to the best of our knowledge, the first RUA tool for an objectoriented, stack-based programming language, namely, Java bytecode [21]. The goal of the system is to infer the cost of a program with respect to some cost measure, and to prove its termination. costa sets up an accurate RR from the bytecode in an efficient way (phase 1 above) and is connected to a termination prover [1] and to an upper bound solver [2] to carry out phase 2. costa can currently work with different cost models, formalizing the idea of what a resource is, and how it is consumed at runtime: the number of instructions, Minst ; the heap consumption in bytes, Mheap ; the number of calls to a given method, Mcalls (e.g., the library method for sending text messages in mobile phones). The system allows the user to decide whether the analysis has to consider libraries as part of the analyzed program, i.e., if it must go and analyze the library code, or take cost information in the form of cost interfaces. Interfaces are needed when some code is not available, or is written in another language. However, they can only be used if it is guaranteed that the external code will not generate call-backs to the user code. In the absence of interfaces, the system gives symbolic names to the cost of libraries, and they remain as unknown functions in the upper bound. These options make costa a flexible RUA tool, as we show in the next example. Example 1. Figure 1 shows the Java and bytecode of the running example, whose most relevant feature is the use of Java libraries. The Java code at the top is only shown for clarity, since costa works directly on the bytecode. At the left-middle, we depict the bytecode of the method inter, which computes the intersection of a linked list l and an array a, both non-sorted and containing objects which implement the interface java.lang.Comparable. The class CompList is user-defined, and implements a linked list of Comparable elements in the standard way. The result of the intersection is stored in a java.util.ArrayList object al. The method main (right-middle) allocates memory for a, l and al by means of their constructors.
    public static void inter(CompList l, Comparable[] a, ArrayList al) {
        while (l != null) {
            for (int i = 0; i < a.length; i++)
                if (l.data.compareTo(a[i]) == 0)
                    al.add(l.data);
            l = l.next;
        }
    }

    public static void main(String[] args) {
        Comparable[] a = new Integer[12];
        ArrayList al = new ArrayList();
        CompList l = new CompList();
        loadArray(a);
        loadList(l);
        inter(l, a, al);
    }

    0 bipush 12
    2 anewarray #6 //Integer
    [remainder of the bytecode listing and the upper-bound tables in Fig. 1 not reproduced]
Fig. 1. The running example, with upper bounds computed for different cost models
Afterwards, calls to static methods loadArray and loadList fill, resp., the array and the list with objects of the library class java.lang.Integer, also implementing java.lang.Comparable. For brevity, the code of both static (user-defined) methods, which terminate and have constant cost, is omitted. Note that parameters a and l of inter are non-null and have constant length, and al is also non-null. At the bottom, we show the upper bounds computed by costa for main and inter, with the cost models Minst and Mheap . Variables a and l in the solutions denote, resp., the length of the array a and the maximal path-length (Sec. 5) of l. The left column is computed by analyzing all required library methods. In the right column, library methods are not analyzed; instead, their cost appears as c1 , . . . , c6 , where c1 and c2 stand for, resp., java.util.ArrayList.ArrayList and java.util.ArrayList.add, and c3 , . . . , c6 stand for Integer.Integer, Object.Object, System.arraycopy and Comparable.compareTo, all from java.lang. When analyzing libraries, upper bounds for main depend only on the cost of c5 , which corresponds to the native Java method arraycopy invoked within ArrayList.add inside inter. When we analyze inter independently of main, c2 and c6 are not analyzed, as the objects have not been created and their class is not statically known. When libraries are not considered, c5 is not reached. While c1 and c2 are invoked, resp., in main and inter, c3 originates from loadList and loadArray, which create Integer objects by invoking their constructor. Due to inheritance, c4 is also required. In Mheap , inter does not consume any heap locations by itself,
as it does not allocate any object. Yet, the analysis considers the heap usage of c2 and c6, which could allocate memory.
2 Architecture of the Cost and Termination Analyzer
Figure 2 shows the overall architecture of the costa analyzer. The dashed frames represent the two main phases of the analysis: (i) the transformation of the bytecode into a suitable internal representation; and (ii) the actual resource usage analysis. Input and output of the system are depicted on the left: costa takes a Java bytecode program JBC and a description of the cost model, and yields as output an upper bound UB of its cost, and information TERM about termination. Ellipses (such as CFG) represent what the system produces at each intermediate stage of the analysis; rounded boxes (such as CFG build) indicate the main steps of the analysis process; square boxes (such as class analysis), which are connected to the main steps by dashed arrows, denote auxiliary analyses which make it possible to obtain more precise results or to improve efficiency.

During the first phase, depicted in the upper half of the figure, the incoming JBC is transformed into a rule-based representation (RBR), through the construction of the control flow graph (CFG). The purpose of this transformation (Sec. 3) is twofold: (1) to recast the unstructured control flow of the bytecode in procedural form (e.g., goto statements are transformed into recursion); and (2) to have a uniform treatment of variables (e.g., operand stack cells are represented as local variables). Several optimizations are performed on the rule-based representation to enable more efficient and accurate subsequent analyses: in particular, class analysis is used to approximate the method instances which can actually be called at a given program point in case of virtual invocation; loop extraction makes it possible to deal effectively with nested loops by extracting loop-like constructs from the control flow graph; stack variables elimination, constant propagation and static single assignment make the rest of the analyses simpler, more efficient and more precise.
[Fig. 2. Architecture of costa. The figure shows the analysis pipeline: the input JBC and the cost model enter CFG build (assisted by class analysis and loop extraction), which produces the CFG; RBR build produces the RBR; RBR optim (stack variables elimination, constant propagation, static single assignment) produces RBR'; slicing and abstract compilation (assisted by the nullity and sign analyses) produce ABS; size analysis produces SIZE; RR build produces the RR, which the PUBS solver turns into the outputs UB and TERM.]
Essentially, the construction of the RBR turns out to be effective for developing a (compositional) RUA (Sec. 3.5).

In the second phase, depicted in the lower half of the figure, the system performs cost and termination analysis on the RBR. Abstract compilation, which is helped by auxiliary static analyses, prepares the input to size analysis, whose aim is to find interesting relations between execution states (Sec. 5). As usual in object-oriented languages, nullity analysis improves the accuracy of size analysis, together with class analysis, which was performed previously. Finally, sign analysis helps in dealing with operations on integers. Afterwards, costa sets up a Recurrence Relation (RR) for the selected cost model. The latter is given as an input, selected among the available models; new cost models can also be defined easily, by simply associating a cost to each bytecode instruction. For cost analysis, the system performs slicing of the RBR in order to remove those variables which are useless to the analysis. Up to this point, phase (1) of Sec. 1 has essentially been achieved. In order to deal with phase (2), costa integrates the dedicated upper bound solver of [2,1], which finds closed-form solutions for RRs and proves termination (Sec. 6).
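Since a cost model just associates a cost to each bytecode instruction, the extension point can be pictured roughly as follows (an illustrative sketch only; the interface and names are invented and are not costa's actual API):

    // Illustrative only: a cost model as a mapping from bytecode
    // instructions to costs.
    interface CostModel {
        long cost(String bytecode);   // cost charged per execution of the instruction
    }

    class CostModels {
        static final CostModel M_INST  = b -> 1;                              // every instruction costs 1
        static final CostModel M_CALLS = b -> b.startsWith("invoke") ? 1 : 0; // count method invocations
    }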
3 From the Bytecode to the Rule-Based Representation
Unlike other bytecode analyses, which are performed directly on the CFG, in order to study cost and termination an essential step is to transform the JBC into an appropriate recursive rule-based representation. Basically, this facilitates the task of identifying loops (necessary for termination) and of producing, from (the RBR of) the bytecode, a recurrence relation which represents its cost.
3.1 The Java Bytecode Language
A (sequential) JBC program consists of a set of class files, one for each class. A class file contains information about its name, the class it extends, and the fields and methods it defines. Each method has a unique signature m containing the class where it is defined, its name and its type. The bytecode associated with m is a sequence pc1:b1, . . . , pcn:bn, where each bi is a bytecode instruction and pci is its address. Local variables are denoted by l0, . . . , lk−1, where l0 is the this reference (explicit in JBC) and l1, . . . , ln, with n < k, are the formal parameters of m. Similarly, each field f has a unique signature, containing its name and the name of the class it belongs to. It is outside the scope of this paper to provide a thorough description of the JVM (see the specification [21] for details).

Example 2. Let us explain some instructions in Fig. 1 related to object-oriented features. Indexes 0, . . . , 3 in the bytecode correspond, resp., to the parameters l, a, al and the local variable i. As the method is static, there is no this reference. The instruction 15: aload 0 pushes the reference to l on the stack. The next instruction, 16: getfield #2, fetches the field data from l: the top of the stack, l, is popped, and #2 is used to build an index into the runtime constant pool (RCP) of the class, where the reference to the name is stored. When this reference is fetched, it is
pushed on the stack. As another example, 19: invokeinterface #3 pops a[i] and l.data from the stack, and searches for the closest method with the correct signature, by looking first in the class of the dispatching object and then going up the inheritance chain. As before, #3 is used to search for the name of the method in the RCP. The method result is then pushed on the stack.

The execution environment of the JVM consists of a stack of activation records and a heap. Each activation record contains a program counter, a local operand stack, and a table of local variables. The heap is a global data structure which contains the objects (and arrays) allocated by the program. Each method invocation generates a new activation record according to its signature, number and type of local variables, and maximum size of the operand stack. Different activation records may contain references to the same objects in the heap.
3.2 Generation of Control Flow Graphs Guided by Class Analysis
The control flow of JBC is unstructured. Conditional and unconditional jumps are allowed, as well as other implicit sources of branching such as virtual method invocation and exception throwing. The notion of Control Flow Graph (CFG) facilitates reasoning about programs in unstructured languages. costa transforms the bytecode of a method into CFGs by using techniques from compiler theory. In particular, the instruction sequence is split into maximal sub-sequences of non-branching instructions, which form the basic blocks (nodes) of the initial graph. Basic blocks of a method m are given a unique identifier mi, where i is an index, and are connected by guarded edges which describe all possible transitions. Guarded edges are introduced by considering the last bytecode instruction of each block, and represent the condition under which control passes from one block to another. Edges take the form mi→mj, φij, where mi and mj are the source and destination nodes, and φij is a boolean condition. Branching instructions include conditional jumps, dynamic dispatching and exceptions.

Example 3. Figure 3 depicts the CFGs of method inter (Ex. 1). The edge from inter1 to inter2 takes the form inter1→inter2, ifnonnull, indicating that the top of the stack must be non-null for control to pass from pc 1 (the last instruction of inter1) to pc 4 (the first one of inter2). Guards which are always true are left implicit.

Virtual invocation implies that more than one method can be executed at a given program point. In practice, computing a precise approximation of the reachable methods is not trivial, and asking the user to provide such information is not practical. As customary in the analysis of OO languages, costa uses class analysis [25] in order to approximate this information precisely. First, the CFG of the initial method is built, and class analysis is applied in order to approximate the possible runtime classes at each program point. This information is used to resolve virtual invocations. Methods which can be called at runtime are loaded, and their corresponding CFGs are constructed. Class analysis is applied to their bodies to include possibly more classes, and the process continues iteratively.
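A guarded CFG edge can be represented by a data structure along these lines (an illustrative sketch only; the names are invented and do not mirror costa's internal representation):

    import java.util.List;

    // Illustrative sketch of basic blocks and guarded edges.
    class Block {
        final String id;               // e.g. "inter1"
        final List<String> bytecodes;  // maximal non-branching sub-sequence
        Block(String id, List<String> bytecodes) { this.id = id; this.bytecodes = bytecodes; }
    }

    class Edge {
        final Block from, to;
        final String guard;            // e.g. "ifnonnull"; implicit guards are "true"
        Edge(Block from, Block to, String guard) { this.from = from; this.to = to; this.guard = guard; }
    }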
Fig. 3. CFGs for the Java bytecode program in Fig. 1 after loop extraction
Once a fixpoint is reached, it is guaranteed that all reachable methods have been loaded and the corresponding CFGs generated. To handle realistic programs, we implemented a simple class analysis which does not keep class information at the level of reference variables, but just computes the set of classes reachable from any point in the program. This simple class analysis turned out to be crucial for the overall practicality of the analyzer, especially for analyzing methods which are defined in the Object class, such as those found in most libraries. It drastically reduces the number of methods to be analyzed while remaining quite efficient in practice. In the running example, class analysis detects that only one instance of compareTo (the one in java.lang.Integer) and one instance of add can be called, resp., at bytecodes 19 and 32, so that the virtual invocations invokeinterface and invokevirtual can actually be considered as non-branching.

As regards exceptions, costa handles internal exceptions (i.e., those associated with bytecodes, as stated in the JVM specification), exceptions which are explicitly thrown (bytecode athrow) and possibly propagated back through methods, as well as finally clauses (even when they are compiled using the bytecode jsr). Exceptions are handled by adding edges to the corresponding handlers. When the type of the exception is not statically known, as happens when exceptions come from calls to methods, mutually exclusive edges are generated which capture all possible instantiations. For inferring resource usage, costa provides the options of ignoring internal exceptions only, ignoring all possible exceptions, or considering them all. In Fig. 3, exceptions are not explicitly shown as edges; instead, they are indicated by marks (*) (see Ex. 4 and 5) on the bytecodes which may produce them. For instance, bytecode 8 might generate a NullPointerException, marked (N), if a is null in the access a.length.
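The iterative CFG construction just described can be pictured as a worklist fixpoint; the sketch below is only illustrative (the helper methods are invented signatures, not costa's code):

    import java.util.*;

    abstract class ReachabilityFixpoint {
        interface Cfg {}
        // Hypothetical helpers (signatures only), not costa's actual API:
        abstract Cfg buildCfg(String method);
        abstract Set<String> classAnalysis(Cfg cfg);
        abstract List<String> virtualCallSites(Cfg cfg);
        abstract Set<String> resolve(String callSite, Set<String> classes);

        // Load all methods reachable from the entry until a fixpoint is reached.
        Set<String> loadReachable(String entryMethod) {
            Set<String> loaded = new HashSet<>();
            Deque<String> worklist = new ArrayDeque<>(List.of(entryMethod));
            while (!worklist.isEmpty()) {
                String m = worklist.poll();
                if (!loaded.add(m)) continue;              // already processed
                Cfg cfg = buildCfg(m);                     // blocks + guarded edges
                Set<String> classes = classAnalysis(cfg);  // approximate reachable classes
                for (String call : virtualCallSites(cfg))
                    worklist.addAll(resolve(call, classes)); // possible targets become new work
            }
            return loaded;
        }
    }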
3.3 Compositional Analysis by Means of Loop Extraction
A subsequent loop extraction transformation is applied to the initial CFG in order to separate sub-graphs corresponding to loops. Loop extraction has been
well studied in the area of program decompilation [6], and it has been proposed in termination analysis [1]; yet, to the best of our knowledge, its use in cost analysis is new. It is crucial when the program contains nested loops, since it allows analyzing the program compositionally, in the sense that it is possible to reason about termination and cost by taking one loop at a time. This is important for finding ranking functions, which are required to bound the number of iterations of loops (an essential piece of information for both cost and termination). costa implements an existing efficient algorithm [26] to identify the loops, and modifies it so that loops, in addition to having a single entry, also have a single exit. The latter condition is required to avoid multiple return branches from loops; additional exits are allowed only when they correspond to exceptions which can be caught and thrown by the caller. Whenever a loop is extracted, the corresponding subgraph is replaced by a new instruction call loop(j, o), where j is a fresh integer identifier and o (often omitted for brevity) is the set of local variables of m which are modified by the execution of the loop. Besides, a new CFG is generated for each sub-graph, whose entry block has mj as its identifier. Hence, after extracting the loops, there is one CFG which corresponds to the entry of m, and the remaining CFGs correspond to loops.

Example 4. Figure 3 shows the CFGs of inter after applying loop extraction. The middle graph corresponds to the loop called in inter0 by call loop(1), while the inner loop (right graph) is called from inter2. The (E) mark indicates that an exception can be generated in the loop and propagated back to the caller block. loop exit(j) denotes the normal exit from loops, which transfers control to the bytecode following call loop (bytecode 50, in the case of the outer loop). Exceptional exits from loops are omitted for brevity.
3.4 Rule-Based Representation
As already mentioned, for a method m and its CFGs, the system obtains a rule-based representation (RBR) for m whose purpose is twofold: (1) to transform iteration into recursion; and (2) to flatten the operand stack, in the sense that its contents are represented as a series of local variables. The latter is possible because, in valid bytecode, the stack height at each program point can be statically determined. This is done in one pass over the CFGs, where the stack height is computed and saved at the entry and exit of each block. The formal translation from a CFG to the rule-based representation can be found in previous work [3,1]. In the present paper the CFG is different, as class analysis and loop extraction have been introduced in costa; this results in a more accurate and compositional representation. Intuitively, the system computes the rule-based representation of a JBC program by producing, for each basic block mj of its associated CFGs, a rule which: (1) contains the set of bytecode instructions within the basic block, with the variables (local and stack) they operate on appearing explicitly in the instructions; (2) if there is a method invocation within the instructions, includes a call to the corresponding rule; and
(3) at the end, contains a call to a continuation rule mcj. The definition of a continuation must include mutually exclusive rules which cover all possible continuations from the block, guarded by the respective conditions.

Example 5. When analyzing libraries, and taking exceptions into account, the RBR for inter contains 59 rules. Let us show the rules associated with block inter4 in the CFG. For clarity, exception rules are not shown; we just annotate with "%" the instructions susceptible of throwing exceptions. For them, there are rules in the RBR which capture the corresponding behavior.

inter4(l, a, al, i, i') := . . . , nop(ifne(s1)), interc4(l, a, al, i, s1, i').
interc4(l, a, al, i, s1, i') := guard(ifeq(s1)), inter5(l, a, al, i, i').
interc4(l, a, al, i, s1, i') := guard(ifne(s1)), inter6(l, a, al, i, i').
It can be seen that there are two possible continuations (rule interc4), depending on the result of comparing the method output with zero. The comparison is the bytecode ifne, which is wrapped in a nop marker, meaning that the bytecode must be ignored at this point but its cost must be taken into account later. The continuation rule may call inter5 or inter6, depending on which condition holds at the entry, as made explicit by the guards before the calls.
3.5 Optimizations on the Rule-Based Representation
Several automatic transformations can be performed on the RBR to improve both the accuracy and the efficiency of the rest of the analysis. Basically, the optimizations aim at removing variables in order to obtain a simpler program representation.

Static Single Assignment. A Static Single Assignment (SSA) [13] transformation is performed on the bytecodes of the RBR. SSA enables simple, yet efficient, denotational program analyses. For example, an instruction iadd(s0, s1, s0) is transformed into iadd(s0, s1, s0'), where s0' refers to the value of s0 after the instruction. Our implementation of SSA keeps, for each rule, a mapping from variable names (as they appear in the rule) to new variable names (constraint variables). E.g., the rule for block inter3 takes the following form after SSA:

inter3(l, a, al, i, i') := iload(i, s1), aload(a, s2), arraylength(s2, s2'), nop(ifcmpge(s1, s2')), interc3(l, a, al, i, s1, s2', i').
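The per-rule renaming can be sketched as follows (illustrative only; the names are invented and this is not costa's implementation):

    import java.util.*;

    // Illustrative SSA renaming for the straight-line body of one RBR rule:
    // each assignment to a variable introduces a fresh version of its name.
    class SsaRenamer {
        private final Map<String, Integer> version = new HashMap<>();

        String use(String v) {                    // current version of v
            int k = version.getOrDefault(v, 0);
            return k == 0 ? v : v + "'".repeat(k);
        }

        String def(String v) {                    // fresh version for an assignment to v
            int k = version.merge(v, 1, Integer::sum);
            return v + "'".repeat(k);
        }
    }

    // iadd(s0, s1, s0) becomes iadd(use("s0"), use("s1"), def("s0")) = iadd(s0, s1, s0')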
Stack Variable Elimination. While SSA introduces new variables, it also enables the removal of a large number of stack variables which correspond to intermediate states. costa unifies stack elements, local variables and constants occurring in instructions which move data to and from the stack, such as iload, iconst,
istore and ireturn. These unifications reduce the number of (distinct) variables which occur in the rule. After stack variable elimination, rule inter3 becomes:

inter3(l, a, al, i, i') := iload(i, i), aload(a, a), arraylength(a, a), nop(ifcmpge(i, a)), interc3(l, a, al, i, i, a, i').
Note that the unification in arraylength does not mean that the length of a is written in a. Actually, this kind of unification is only meant to make size analysis easier. Most stack variables can be removed. In most cases, only those stack variables associated with operations on the heap, such as aaload(s1, s2, s1'), and with the return values of methods, such as s1 (s1' after SSA) in inter5, are kept in the RBR. Also, arguments which are duplicated in all possible call patterns to a rule can be filtered out. Rules inter3 and interc3 are transformed into:

inter3(l, a, al, i, i') := iload(i, i), aload(a, a), arraylength(a, a), nop(ifcmpge(i, a)), interc3(l, a, al, i, i').
interc3(l, a, al, i, i') := guard(ifcmplt(i, a)), inter4(l, a, al, i, i').
interc3(l, a, al, i, i') := guard(ifcmpge(i, a)), loop exit(3)(l, a, al, i, i').
Note that iload, aload and arraylength in the SSA form of rule inter3 have no effect here, and can be ignored by size analysis. However, they are not removed, since their cost has to be taken into account when generating the recurrence relation.

Inter-Block Constant Propagation. The above optimizations only achieve intra-block constant propagation, as variables are unified with values within the scope of a rule (i.e., a block), but values are not propagated to other rules. Clearly, inter-block constant propagation is interesting in terms of both accuracy (more knowledge about values) and efficiency (fewer variables to consider). costa performs a simple, yet effective, constant propagation post-process, where constants are propagated forward to continuation rules. In a nutshell, when a block call is found, the current calling pattern is stored and, if it is guaranteed that the block is only invoked from that point, constants in the calling pattern are propagated to its body. For instance, the call pattern to inter3 from call loop(3) takes 0 for the counter i. However, this block is also invoked from inter6, so that the value cannot be propagated. For correctness, constant propagation must stop as soon as a variable whose value is being propagated is assigned a new value. This is automatically dealt with by the use of unification in the SSA transformation above.
4 Context-Sensitive (Pre-)Analyses to Improve Accuracy
costa implements two context-sensitive analyses based on abstract interpretation [11]: nullity and sign. The aim of these analyses is to improve the accuracy (and efficiency) of subsequent steps. Both analyses infer information from individual bytecodes and propagate it via a standard, top-down fixpoint computation. They are designed to achieve good performance by implementing the abstract operations using bitmaps, which allow the analysis information to be accessed and updated in constant time.
4.1 Nullity Analysis
A simple nullity analysis is performed on the RBR in order to keep track of non-null objects. For instance, the bytecode new(si) makes it possible to assign the abstract
value non-null to si. Afterwards, this information can be propagated by means of bytecodes like astore(si, lj), which copies the non-null abstract value of si into lj. The results of nullity analysis often make it possible to remove the rules corresponding to NullPointerException, essentially those guarded by guard(ifnull(si)). Nullity analysis is very effective when methods are analyzed context-sensitively. For instance, in the main program in Fig. 1, which calls inter with non-null lists l and al and a non-null array a, nullity analysis of inter guarantees that no NullPointerException can be thrown when accessing fields or invoking methods belonging to the arguments of inter. Thus, the bytecode instructions annotated with (N) in Fig. 3 will not generate exception branches. This is clearly beneficial in terms of both precision and efficiency of the remaining analysis steps.
4.2 Sign Analysis
Sign analysis keeps track of the sign of variables. The abstract domain contains the elements ≥0, ≤0, >0, <0, =0, ≠0, ⊤ and ⊥, partially ordered in a lattice. The domain operations can be implemented efficiently with bitmaps (three bits for each abstract value). For instance, sign analysis of const(si, V) evaluates the integer value V and assigns the corresponding abstract value =0, >0 or <0 to si, depending, resp., on whether V is zero, positive or negative [11]. Information from arithmetic bytecode instructions is inferred as expected. Knowing the sign of data makes it possible to remove RBR rules associated with arithmetic exceptions which are guaranteed never to be thrown. In addition, sign information plays a crucial role in cost analysis, as it allows obtaining accurate upper bounds for logarithmic methods. E.g., consider a method with a simple recursive call of the form void m(int n) { .. m(n/2); ..}, for which we want to measure the number of instructions executed. According to the JVM specification, without knowing the sign of n it is not possible to know whether n/2 will be rounded towards the next (if negative) or the previous (if positive) integer. Therefore, unless accurate sign information is available, it is not possible to obtain a logarithmic upper bound for m; instead, a less accurate (linear) upper bound is found.

After this step, a post-process on the RBR unfolds intermediate rules which correspond to unique continuations. This iterative process finishes when a continuation is not unique, or when direct recursion is reached.
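The three-bit encoding mentioned above can be sketched as follows (illustrative only, not costa's code): one bit each for "may be negative", "may be zero" and "may be positive", so that the join is a bitwise or and the lattice ordering is bitmask inclusion.

    // Illustrative 3-bit sign domain: bit 0 = may be negative,
    // bit 1 = may be zero, bit 2 = may be positive.
    final class Sign {
        static final int BOT = 0b000, LT = 0b001, EQ = 0b010, GT = 0b100;
        static final int LEQ = LT | EQ, GEQ = GT | EQ, NEQ = LT | GT, TOP = 0b111;

        static int ofConst(int v) {               // abstraction of const(si, V)
            return v == 0 ? EQ : (v > 0 ? GT : LT);
        }
        static int join(int a, int b) { return a | b; }           // least upper bound
        static boolean leq(int a, int b) { return (a | b) == b; }  // lattice ordering
    }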
5 Size Analysis of Java Bytecode
From the RBR, size analysis takes care of inferring relations between the values of variables at different points of the execution. To this end, the notion of size measure is crucial. The size of a piece of data at a given program point is an abstraction of the information it contains, which may be fundamental for proving termination and inferring cost. The costa system uses several size measures:

– Integer-value maps an integer to the value itself (i.e., the size of an integer is its value). It is typically used in loops with an integer counter, to approximate the number of iterations by detecting how the size of the counter changes at each pass through the loop body.
– Path-length [18] maps an object to the length of the maximum path reachable from it by dereferencing. E.g., null has size 0, and a non-null reference x has size 1 plus the maximum path-length of the fields of x which are in turn references. Therefore, for a non-cyclic data structure x, the size of x is greater than the size of any reference field of x, i.e., the size of a data structure decreases as fields are dereferenced. This measure can be used to predict the behavior of loops which traverse objects, since the path-length is supposed to strictly decrease through the loop (a concrete illustration is sketched below).
– Array-length maps an array to its length, and is used to predict the behavior of loops which traverse arrays.

Sec. 5.1 shows how it is possible to improve the efficiency of size analysis by removing useless information before abstract compilation. Finally, a description of the actual size analysis is given in Sec. 5.2 and 5.3.
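For intuition, on a concrete acyclic structure the path-length measure coincides with the following recursive computation (an illustrative sketch for the CompList of the running example; the fields data and next are assumed from the class description):

    // Illustrative only: concrete path-length of an acyclic CompList.
    class CompList {
        Comparable data;
        CompList next;
    }

    class PathLength {
        // 0 for null; otherwise 1 plus the maximum path-length of the reference fields.
        static int pathLength(CompList l) {
            if (l == null) return 0;
            int viaData = (l.data == null) ? 0 : 1;   // data points to a leaf object here
            return 1 + Math.max(viaData, pathLength(l.next));
        }
    }

Of course, the analysis never evaluates this at runtime; it reasons about path-length symbolically, through the constraints produced by abstract compilation (Sec. 5.2).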
5.1 Slicing of Useless Variables
When looking at the RBR, it is sometimes possible to note that some variables are not relevant for the specific purpose of obtaining cost information, and can therefore be removed in order to make the analysis more efficient and the solving process more feasible [4]. In this sense, a variable is relevant if it directly or indirectly affects some guard, i.e., the control flow (and thus, potentially, the cost), or if it is needed by the cost model (e.g., in our example, Mheap needs the length of an array created by newarray in order to infer the allocated memory). Non-relevant variables can be removed from the RBR. As an example, an accumulator variable which only stores partial results of a computation (e.g., the sum of the elements of a list, where a temporary variable is updated during the loop) is essential to the semantics, but can be removed since, in general, it does not affect the cost.

To this end, a variant of backward program slicing [27] is used, where variables are removed instead of program statements. The slicing criterion consists of the variables occurring in guards or needed by the cost model; these are propagated backwards through the rules by means of a simple dependency calculus, so that the variables which directly or indirectly affect the criterion are kept in the slice. As a result, variables which cannot affect the cost are removed. Unlike in normal slicing, soundness is not an issue here: removing variables which are actually relevant may result in a loss of precision, but the correctness of (upper-bound) cost and termination results is preserved. In fact, losing precision would make the upper bound bigger (possibly infinite, meaning that it was not possible at all to infer the cost of the program), or make it impossible to prove termination, but such a result would still be correct (since a bigger upper bound is correct whenever a smaller one is correct, and not proving termination is trivially correct). Because of this, the treatment of calls to methods or loop rules can be simplified: when a call to m is found, the relevant variables of m are taken (i.e., those which affect its cost), but the relevant variables in the caller rule are not propagated through the call (context-insensitivity). Such a slicing on the rules is unsound in the traditional sense, and differs from a previous, analogous
algorithm [4], where this information is dealt with correctly (context-sensitivity). The result is a slicing that is less precise, and unsound in the traditional sense, but more efficient and, importantly, scalable.
5.2 Abstract Compilation
The purpose of size analysis is to detect how the size of variables changes during execution [14]. For example, when analyzing a loop where an integer counter i goes from 0 to a threshold, as in the inner loop of Ex. 1, size analysis w.r.t. Integer-value should detect that the size of i in the n-th iteration of the loop is greater by 1 than its size in the (n−1)-th iteration. This information is essential for inferring how many times the loop body will be executed, which is a crucial piece of information in cost and termination analyses. Each bytecode, call or guard is abstracted by linear constraints on the sizes of its variables: for example, iadd(s0, s1, s0') is abstracted by the constraint s0'=s1+s0, meaning that the size of s0 after executing the instruction is the sum of the sizes of s0 and s1 before it. Similarly, getfield(f, s0, s0') is abstracted by s0>s0', meaning that the (path-length) output size is smaller than the input size, due to the field access. This only holds if the non-cyclicity of s0 can be proven; otherwise, no information can be obtained, and an empty constraint is produced. We refer to [18] for details on path-length and its requirements. This step results in an abstract constraint program, or simply abstract compilation, which approximates the cost and termination behavior of the original program w.r.t. the chosen size abstractions. E.g., rules inter3 and interc3, after the RBR optimizations, are abstract-compiled into:

inter3(l, a, al, i, i') := {} interc3(l, a, al, i, i')
interc3(l, a, al, i, i') := {i<a} inter4(l, a, al, i, i')
interc3(l, a, al, i, i') := {i≥a} loop exit(3)(l, a, al, i, i')
Expressions in brackets are constraints which describe the size behavior of the bytecodes. The abstract rules for the loops in the example are:

inter2(l, a, al, i, l', i') := {i''=0, l>l''} inter3(l, a, al, i'', i'''), inter1(l'', a, al, i''', l', i')
inter6(l, a, al, i, i') := {i''=i+1} inter3(l, a, al, i'', i')
Multiple Viewpoint Contract-Based Specification and Design

Abstract. We present the mathematical foundations and the design methodology of the contract-based model developed in the framework of the SPEEDS project. SPEEDS aims at developing methods and tools to support "speculative design", a design methodology in which distributed designers develop different aspects of the overall system in a concurrent but controlled way. Our generic mathematical model of contract supports this style of development. This is achieved by focusing on behaviors, by supporting the notion of "rich component", where diverse (functional and non-functional) aspects of the system can be considered and combined, by representing rich components via their sets of associated contracts, and by formalizing the whole process of component composition.
1 Introduction

Several industrial sectors involving complex embedded systems design have recently experienced drastic moves in their organization, aerospace and automotive being typical examples. Initially organized around large, vertically integrated companies supporting most of the design in house, these sectors were restructured in the 1980s due to the emergence of sizeable competitive suppliers. OEMs performed system design and integration by importing entire subsystems from suppliers. This, however, shifted a significant portion of the value to the suppliers, and eventually contributed to late errors that caused delays and excessive additional cost during the system integration phase. In the last decade, these industrial sectors went through a profound reorganization in an attempt by OEMs to recover value from the supply chain, by focusing on those parts of the design at the core of their competitive advantage. The rest of the system was instead centered around standard platforms that could be developed and shared by otherwise competing companies. Examples of this trend are AUTOSAR in the automotive industry [1] and Integrated Modular Avionics (IMA) in aerospace [2]. This new organization
requires extensive virtual prototyping and design space exploration, where component or subsystem specification and integration occur at different phases of the design, including the early ones [3]. Component-based development has emerged as the technology of choice to address the challenges that result from this paradigm shift. In the particular context of (safety-critical) embedded systems with complex OEM/supplier chains, the following distinguishing features must be addressed. First, the need for high-quality, zero-defect software systems calls for techniques in which component specification and integration are supported by clean mathematics that encompass both static and dynamic semantics; this means that the behavior of components and their composition, and not just their port and type interfaces, must be mathematically defined. Second, system design includes various aspects (functional, timeliness, safety and fault tolerance, etc.) involving different teams with different skills, using heterogeneous techniques and tools. Third, since the structure of the supply chain is highly distributed, a precise separation of responsibilities between its different actors must be ensured. This is addressed by relying on contracts. Following [4], a contract is a component model that sets forth the assumptions under which the component may be used by its environment, and the corresponding promises that are guaranteed under such correct use.

The semantic foundations that we present in this paper are designed to support this methodology by addressing the above three issues. At its basis, the model is a language-based abstraction where composition is by intersection. This basic model can then be instantiated to cover functional, timeliness, safety, and dependability requirements across all system design levels. No particular model of computation and communication is enforced, and continuous-time dynamics such as those needed in physical system modeling are supported as well. In particular, executable models such as state transition systems and hybrid automata can be used for the description of the behaviors. On top of the basic model, we build the notion of a contract, which is central to our methodology, by distinguishing between assumptions and promises.

This paper focuses on developing a generic compositional theory of contracts, providing relations of contract satisfaction and refinement (called dominance), and deriving operators for the correct construction of complete systems. Our key contribution is the handling of multiple viewpoints. We observe that combining contracts for different components and combining contracts for different viewpoints attached to the same component require different operators. Thus, in addition to traditional parallel composition, and to enable formal multi-viewpoint analysis, our model includes boolean meet and join operators that compute the conjunction and disjunction of contracts. To be able to blend both types of operations in a flexible way, we introduce a new operator that combines composition and conjunction to compute the least specific contract that satisfies a set of specifications, while at the same time taking their interaction into account. The operators are complemented by a number of relations between contracts and their implementations. Of particular interest are the notion of satisfaction between an implementation and its contract, and the relations of compatibility and consistency between contracts.
Specifications are also introduced to model requirements or obligations that must be checked throughout the design process. Our second contribution consists in organizing these relations into a design and analysis methodology that
spans a wide range of levels of abstraction, from the functional definition to the final hardware implementation. The rest of the paper is organized as follows. We first review previous work related to the concept of contract, and discuss it in the context of our contribution, in Section 2. We then introduce our model by first motivating our choices, and then by formally defining the notions of component, contract, and their implementations and specifications, in Section 3. In addition, in the same section, we introduce and discuss a number of operators and relations that support the incremental construction and verification of multi-viewpoint systems. After that, we discuss the design methodology in Section 4. Finally, Section 5 presents an illustrative example of the use of the model.
2 Related Work

The notion of contract was first applied by Meyer in the context of the programming language Eiffel [5]. In his work, Meyer uses preconditions and postconditions as state predicates for the methods of a class, and invariants for the class itself. Preconditions correspond to the assumptions under which the method operates, while postconditions express the promises at method termination, provided that the assumptions are satisfied. Invariants must be true in all states of the class, regardless of any assumption. The notion of class inheritance is used, in this case, as a refinement, or subtyping, relation. To guarantee safe substitutability, a subclass is only allowed to weaken assumptions and to strengthen promises and invariants. Similar ideas were already present in seminal work by Dijkstra [6] and Lamport [7] on weakest preconditions and predicate transformers for sequential and concurrent programs, and in more recent work by Back and von Wright, who introduce contracts [8] into the refinement calculus [9]. In this formalism, processes are described with guarded commands operating on shared variables. Contracts are composed of assertions (higher-order state predicates) and state transformers. This formalism is best suited to reasoning about discrete, untimed process behavior.

Dill presents an asynchronous model based on sets of sequences and parallel composition (trace structures) [10]. Behaviors (traces) can either be accepted as successes or rejected as failures. The failures, which are still possible behaviors of the system, correspond to unacceptable inputs from the environment, and are therefore the complement of the assumptions. Safe substitutability is expressed as trace containment between the successes and failures of the specification and the implementation. The conditions obtained by Dill are equivalent to requiring that the implementation weaken the assumptions of the specification while strengthening the promises. Wolf later extended the same technique to a discrete synchronous model [11]. More recently, De Alfaro and Henzinger have proposed Interface Automata, which are similar to synchronous trace structures, where the failures are implicitly all the traces that are not accepted by the automaton representing the component [12]. Composition is defined on automata, rather than on traces, and requires a procedure of state-space restriction that is equivalent to the process called autofailure manifestation by Dill and Wolf. The authors have also extended the approach to other kinds of behaviors, including resources and asynchronous behaviors [13,14]. A more general approach along the lines proposed by Dill
and Wolf is the work by Negulescu with Process Spaces [15], and by Passerone with Agent Algebra [16], both of which extend the algebraic approach to generic behaviors introduced by Burch [17]. In both cases, the exact form of the behaviors is abstracted, and only the properties of composition are used to derive general results that apply to both asynchronous and synchronous models. An interesting aspect of Process Spaces is the identification of several derived algebraic operators. In contrast, Agent Algebra defines the exact relation between concepts such as parallel composition, refinement and compatibility in the model.

Our notion of contract supports speculative design, in which distributed teams develop partial designs concurrently and synchronize by relying on the notions of rich component [4] and associated contracts. We define assumptions and promises in terms of behaviors, and use parallel composition as the main operator for decomposing a design. This choice is justified by the reactive nature of embedded software, and by the increasing use of component models that support not only structured concurrency, capable of handling timed and other non-functional properties, but also heterogeneous synchronization and communication mechanisms. Contracts in [8] are of a very different nature, since there is no clear indication of the role (assumption or promise) that a state predicate or a state transformer may play. We developed our theory on the basis of assertions, i.e., languages of traces or runs (not to be confused with the assertions of [8], which are state predicates). Our contracts are intended to be abstract models of a component, rather than implementations, which, in our context, may equivalently be done in hardware or software. Similarly to Process Spaces and Agent Algebra, we develop our theory on the basis of languages of generic "runs". However, to attain the generality of a metamodel, and to cover non-functional aspects of the design, we also develop a concrete model enriched with real-time information that achieves the expressive power of hybrid systems. Behaviors are decomposed into assumptions and promises, as in Process Spaces, a representation that is more intuitive than, albeit equivalent to, the one based on the successes and failures of asynchronous trace structures. Unlike Process Spaces, however, we explicitly consider inputs and outputs, which we generalize to the concept of controlled and uncontrolled signals. This distinction is essential in our framework to determine the exact roles and responsibilities of the users and suppliers of components. It is concretized in our framework by a notion of compatibility which depends critically on the particular partition of the signals into inputs and outputs. We also extend the use of receptiveness from asynchronous trace structures, which is absent in Process Spaces, to formally define the condition of compatibility of components for open systems. Our refinement relation between contracts, which we call dominance to distinguish it from refinement between implementations of the contracts, follows the usual scheme of weakening the assumptions and strengthening the guarantees. The order induces boolean operators of conjunction and disjunction, which resemble those of asynchronous trace structures and Process Spaces. To address multiple viewpoints for multiple components, we define a new fusion operator that combines the operations of composition and conjunction for a set of contracts.
This operator is introduced to make it easier for the user to express the interaction between contracts related to different viewpoints of a component.
The model that we present in this paper is based on execution traces, and is therefore inherently limited to representing linear-time properties. The branching structure of a process whose semantics is expressed in our model is thus abstracted, and the exact state at which non-deterministic choices are taken is lost. Despite this, the equivalence relation induced by our notion of dominance between contracts is more distinguishing than the traditional trace containment used when executions are not represented as pairs (assumptions, promises). This was already observed by Dill, with the classic example of the vending machine [10]; see also Brookes et al. on refusal sets [18]. There, every accepted sequence of actions is complemented by the set of possible refusals, i.e., by the set of actions that may not be accepted after executing that particular sequence. Equivalence is then defined as equality of sequences together with their refusal sets. Under these definitions, it is shown that the resulting equivalence is stronger than trace equivalence (equality of trace sets), but weaker than observation equivalence [19,20]. A precise characterization of the relationships with our model, in particular with regard to the notion of composition, is deferred to future work.
3 Model Overview

In the SPEEDS project, a major emphasis has been placed on the development of a model that supports concurrent system development in the framework of complex OEM-supplier chains. This implies the ability to support abstraction mechanisms and to work with multiple viewpoints that are able to express both functional (discrete and continuous evolutions) and non-functional aspects of a design. In particular, the model should not force a specific model of computation and communication (MoCC). The objective of this paper is to develop a theory and methodology of component-based development for use in complex supply chains or OEM/supplier organizations. Two broad families of approaches can be considered for this purpose:

– Building systems from library components. This is perhaps the most familiar case of component-based development. Here, the emphasis is on reuse and adaptation, and the development process is largely in-house dominated. Components are exposed in a simplified form, called their interface, where some details may be omitted. The interface of a component is typically obtained by a mechanism of abstraction. This ensures that, if interfaces match, then components can be safely composed and deliver the expected service.

– Distributed systems development with highly distributed OEM/supplier chains. This second situation raises the additional and new issue of splitting and distributing responsibilities between the different actors of the OEM/supplier chain, possibly involving different viewpoints. The OEM wants to define and know precisely what a given supplier is responsible for. Since components or sub-systems interact, this implies that each entity in the area of interaction must be precisely assigned to the responsibility of a given supplier, and must remain out of the control of the others. Thus each supplier is given a design task in the following form: a goal, also called guarantee or promise, is assigned to the supplier. This goal involves only entities the supplier is responsible for. Other entities, which are not under the responsibility of this
supplier, may still be subject to constraints, which are thus offered to this supplier as assumptions. Assumptions are under the responsibility of other actors of the OEM/supplier chain, and can be used by this supplier for achieving its own promises. This mechanism of assumptions and promises is structured into contracts, which form the essence of distributed systems development involving complex OEM/supplier chains.

3.1 Components and Contracts

Our model is based on the concept of component. A component is a hierarchical entity that represents a unit of design. Components are connected together to form a system by sharing and agreeing on the values of certain ports and variables. A component may include both implementations and contracts. An implementation M is an instantiation of a component and consists of a set P of ports and variables (in the following, for simplicity, we will refer only to ports) and of a set of behaviors, or runs, also denoted by M, which assign a history of "values" to the ports. This model essentially follows the Tagged-Signal model introduced by Lee and Sangiovanni-Vincentelli [21], which has been shown to be appropriate for expressing the behaviors of a wide variety of models of computation. However, unlike the Tagged-Signal model, we do not need a predetermined form of behavior for our basic definitions, which will remain abstract. Instead, the way sets of behaviors are represented in specific instances will define their structure. For example, an automata-based model will represent behaviors as sequences of values or events. Conversely, behaviors in a hybrid model will consist of alternations of continuous flows and discrete jumps. Our basic definitions will not vary, and only the way the operators are implemented is affected. In this way, our definitions are independent of the particular model chosen for the design. Because implementations and contracts may refer to different viewpoints, we refer to the components in our model as heterogeneous rich components (HRC).

We build the notion of a contract for a component as a pair of assertions, which express its assumptions and promises. An assertion E is a property that may or may not be satisfied by a behavior. Thus, assertions can again be modeled as sets of behaviors over ports, precisely as the sets of behaviors that satisfy them. Note that this is unlike preconditions and postconditions in program analysis, which constrain the state space of a program at a particular point. Instead, assertions in our context are properties of entire behaviors, and therefore talk about the dynamics of a component. An implementation M satisfies an assertion E whenever they are defined over the same set of ports and all the behaviors of M satisfy the assertion, i.e., when M ⊆ E.

A contract is an assertion on the behaviors of a component (the promise) subject to certain assumptions. We therefore represent a contract C as a pair (A, G), where A corresponds to the assumption and G to the promise. An implementation of a component satisfies a contract whenever it satisfies its promise, subject to the assumption. Formally, M ∩ A ⊆ G, where M and C have the same ports. We write M |= C when M satisfies a contract C. Satisfaction can be checked using the following equivalent formulas, where ¬A denotes the set of all runs that are not runs of A:

M |= C ⟺ M ⊆ G ∪ ¬A ⟺ M ∩ (A ∩ ¬G) = ∅

There exists a unique maximal (by behavior containment) implementation satisfying a contract C, namely MC = G ∪ ¬A. One can interpret MC as the implication A ⇒ G.
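Since assertions are just sets of runs, the satisfaction check and the maximal implementation can be prototyped directly over finite sets (an illustrative sketch only; a run is crudely represented as a String, and a finite universe of runs is assumed so that ¬A is computable):

    import java.util.*;

    // Illustrative finite-set prototype of a contract C = (A, G).
    record Contract(Set<String> A, Set<String> G) {

        // M |= C  iff  M ∩ A ⊆ G
        boolean satisfiedBy(Set<String> M) {
            for (String run : M)
                if (A.contains(run) && !G.contains(run)) return false;
            return true;
        }

        // Maximal implementation M_C = G ∪ ¬A, relative to a finite universe of runs.
        Set<String> maximalImplementation(Set<String> universe) {
            Set<String> m = new HashSet<>(universe);
            m.removeAll(A);       // ¬A
            m.addAll(G);          // ∪ G
            return m;
        }
    }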
Clearly, M |= (A, G) if and only if M |= (A, MC), if and only if M ⊆ MC. Because of this property, we can restrict our attention to contracts of the form C = (A, MC), which we say are in canonical form, without losing expressiveness. The operation of computing the canonical form, i.e., replacing G with G ∪ ¬A, is well defined, since the maximal implementation is unique, and it is idempotent. Working with canonical forms simplifies the definition of our operators and relations, and provides a unique representation for equivalent contracts.

In order to construct contracts more easily, it is useful to have an algebra for expressing more complex contracts in terms of simpler ones. The combination of contracts associated with different components can be obtained through the operation of parallel composition, denoted by the symbol ∥. If C1 = (A1, G1) and C2 = (A2, G2) are contracts (possibly over different sets of ports), the composite C = C1 ∥ C2 must satisfy the guarantees of both, implying an operation of intersection. The situation is more subtle for assumptions. Suppose first that the two contracts have disjoint sets of ports. Intuitively, the assumptions of the composite should simply be the conjunction of the assumptions of each contract, since the environment should satisfy all the assumptions. In general, however, part of the assumptions A1 will already be satisfied by composing C1 with C2, which acts as a partial environment for C1. Therefore, G2 can contribute to relaxing the assumptions A1, and vice-versa. Formally, this translates into the following definition.

Definition 1 (Parallel Composition). Let C1 = (A1, G1) and C2 = (A2, G2) be contracts. The parallel composition C = (A, G) = C1 ∥ C2 is given by

A = (A1 ∩ A2) ∪ ¬(G1 ∩ G2),   (1)
G = G1 ∩ G2.   (2)
This definition is consistent with similar definitions in other contexts [12,10,15]. C1 and C2 may have different ports. In that case, we must extend their behaviors to a common set of ports before applying (1) and (2). This can be achieved by an operation of inverse projection. Projection, or elimination, in contracts requires handling assumptions and promises differently, in order to preserve their semantics.

Definition 2 (Elimination). For a contract C = (A, G) and a port p, the elimination of p in C is given by

[C]p = (∀p A, ∃p G)   (3)
where A and G are seen as predicates. Elimination trivially extends to finite sets of ports, denoted by [C]P, where P is the considered set of ports. For inverse elimination in parallel composition, the set of ports P to be considered is the union of the sets of ports P1 and P2 of the individual contracts.

Parallel composition can be used to construct complex contracts out of simpler ones, and to combine contracts of different components. Despite having to be satisfied simultaneously, however, multiple viewpoints associated with the same component do not, in general, compose by parallel composition. Take, for instance, a functional viewpoint Cf and an orthogonal timed viewpoint Ct for a component M. Contract Cf specifies the allowed data patterns for the environment, and sets forth the corresponding behavioral
property that can be guaranteed. For instance, if the environment alternates the values T, F, T, . . . on port a, then the value carried by port b never exceeds a given value x. Similarly, Ct sets timing requirements and guarantees on meeting deadlines. For example, if the environment provides at least one data item per second on port a (1ds), then the component can issue at least one data item every two seconds (.5ds) on port b. Parallel composition fails to capture their combination, because the combined contract must accept environments that satisfy either the functional assumptions, or the timing assumptions, or both. In particular, parallel composition computes assumptions that are too restrictive. Figure 1 illustrates this. Figure 1(a) shows the two contracts (Cf on the left and Ct on the right) as truth tables. Figure 1(b) shows the corresponding inverse projections. Figure 1(d) is the parallel composition computed according to our previous definition, while Figure 1(c) shows the desired result. That is, we would like to compute the conjunction of the contracts, so that if M |= Cf ⊓ Ct, then M |= Cf and M |= Ct. This can best be achieved by first defining a partial order on contracts, which formalizes a notion of substitutability, or refinement.
Fig. 1. Truth tables for the synchronization of categories. The four diagrams on the top are the truth tables of the functional category Cf, with its assumption Af and promise Gf, and similarly for the timed category Ct. Note that these two contracts are in canonical form. In the middle, we show the same contracts lifted to the same set of variables b, db, x, dx, combining function and timing. On the bottom, the two tables on the left are the truth tables of the greatest lower bound Cf ⊓ Ct. For comparison, we show on the right the truth tables of the parallel composition C1 ∥ C2, revealing that the assumption is too restrictive and not the one expected.
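Instantiating Definition 1 above (and anticipating the conjunction operator of Definition 4 below), the two combinations of Cf = (Af, Gf) and Ct = (At, Gt) compared in Fig. 1 are:

Cf ⊓ Ct = (Af ∪ At, Gf ∩ Gt)
Cf ∥ Ct = ((Af ∩ At) ∪ ¬(Gf ∩ Gt), Gf ∩ Gt)

The conjunction accepts any environment satisfying at least one of the two assumptions, as required, while the composition, through the term Af ∩ At, essentially insists on environments satisfying both; this is why the assumption computed in Fig. 1(d) is too restrictive.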
Definition 3 (Dominance). We say that C = (A, G) dominates C′ = (A′, G′), written C ⪯ C′, if and only if A ⊇ A′, G ⊆ G′, and the contracts have the same ports. Dominance amounts to relaxing assumptions and reinforcing promises, therefore strengthening the contract. Clearly, if M |= C and C ⪯ C′, then M |= C′. Given the ordering of contracts, we can compute greatest lower bounds and least upper bounds, which correspond to taking the conjunction and disjunction of contracts, respectively.

Definition 4 (Bounds). For contracts C1 = (A1, G1) and C2 = (A2, G2) (in canonical form), we have

C1 ⊓ C2 = (A1 ∪ A2, G1 ∩ G2),    (4)
C1 ⊔ C2 = (A1 ∩ A2, G1 ∪ G2).    (5)
The resulting contracts are in canonical form. Conjunction of contracts amounts to taking the union of the assumptions, as required, and can therefore be used to compute the overall contract for a component starting from the contracts related to multiple viewpoints. The operations of parallel composition and conjunction are related by the result below, which allows the designer to relate in a precise way the designs obtained by following different implementation flows:

Theorem 1. Let C, C1 and C2 be contracts. Then,

(C ⊓ C1) ∥ (C ⊓ C2) ⪯ C ⊓ (C1 ∥ C2)

when both sides of the inequality are defined.

Proof. Let C = (A, G), C1 = (A1, G1) and C2 = (A2, G2) be contracts. Then, by (1), (2) and (4),

C ⊓ (C1 ∥ C2) = (A ∪ (A1 ∩ A2) ∪ ¬(G1 ∩ G2), G ∩ G1 ∩ G2).

Similarly,

(C ⊓ C1) ∥ (C ⊓ C2) = ((A ∪ A1) ∩ (A ∪ A2) ∪ ¬(G ∩ G1 ∩ G2), G ∩ G1 ∩ G2).

Clearly, ¬(G1 ∩ G2) ⊆ ¬(G ∩ G1 ∩ G2). In addition, A ∪ (A1 ∩ A2) = (A ∪ A1) ∩ (A ∪ A2). The result then follows by the definition of dominance.
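To make these definitions concrete, the following is a minimal executable sketch (ours, not the authors'; assertions are modelled as finite sets over a toy universe of behaviours, and the composition formula is the canonical-form one appearing in the proof above), including a randomised spot-check of Theorem 1. All names are illustrative.

```python
import random
from itertools import combinations

U = frozenset(range(4))                 # toy finite universe of behaviours

def compl(s):                           # complement relative to U
    return U - s

def compose(c1, c2):
    """Parallel composition in canonical form, as used in the proof of
    Theorem 1: promises intersect, and the assumption is weakened by
    the behaviours already excluded by the composite promise."""
    (a1, g1), (a2, g2) = c1, c2
    g = g1 & g2
    return ((a1 & a2) | compl(g), g)

def conj(c1, c2):                       # greatest lower bound, eq. (4)
    (a1, g1), (a2, g2) = c1, c2
    return (a1 | a2, g1 & g2)

def dominates(c, cp):                   # Definition 3: c dominates cp
    return c[0] >= cp[0] and c[1] <= cp[1]

subsets = [frozenset(x) for r in range(len(U) + 1)
           for x in combinations(sorted(U), r)]
canonical = [(a, g) for a in subsets for g in subsets if compl(a) <= g]

for _ in range(1000):                   # randomised spot-check of Theorem 1
    c, c1, c2 = (random.choice(canonical) for _ in range(3))
    assert dominates(compose(conj(c, c1), conj(c, c2)),
                     conj(c, compose(c1, c2)))
```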
The left-hand side of the formula of Theorem 1 yields the contract obtained by first combining viewpoints for each component and then composing the components. The right-hand side of the same formula yields the contract obtained by applying the converse flow. Thus, the theorem expresses that component-centric design (left-hand side) results in less flexibility in the implementations than viewpoint-centric design (right-hand side) would.

3.2 System Obligations

Contracts are not the only way a designer may wish to express a system's requirements. System obligations are typically high-level requirements that the designer would like to hold without considering any environment or assumption. System obligations are useful both for overall system requirements and for overall properties of the computing platform. System obligations are formally defined as assertions, i.e., sets of behaviors. An important point is that system obligations should be checked on contracts as early as possible in the design flow, because this significantly reduces the analysis effort required to prove or disprove the obligation, and the design effort required to revise the contract if the obligation is not met. To formalize this idea we introduce the conformance relation: a contract C = (A, G) conforms to a system obligation B if A ∩ G ⊆ B. With each contract C = (A, G) we can associate an obligation, which we call the contract obligation, defined as BC = A ∩ G. Hence, a contract conforms to a system obligation if its contract obligation is contained in (i.e., is stronger than) the system obligation. There is a simple relationship between the maximal implementation MC of contract C and the corresponding contract obligation BC, namely A ∩ MC = BC and MC = BC ∪ ¬A. Indeed:

A ∩ MC = A ∩ (G ∪ ¬A) = A ∩ G = BC
BC ∪ ¬A = (A ∩ G) ∪ ¬A = (A ∩ G) ∪ (¬A ∩ G) ∪ ¬A = G ∪ ¬A = MC

The formulation of the conformance relation in terms of the contract obligation suggests an extension of the notion of conformance to contracts: a contract C2 conforms to a contract C1 if BC2 ⊆ BC1. This definition ensures that conformance is transitive, thereby implying that contract C2 conforms to any system obligation which C1 conforms to. Conformance is compositional with respect to parallel composition. This follows from the fact that BC1∥C2 = BC1 ∩ BC2, i.e., the contract obligation associated with the parallel composition of C1 and C2 is the intersection of their contract obligations. This can be shown by using equations (1) and (2) of parallel composition. Contracts and system obligations are specifications that are intended to guide the designer(s) towards a consistent system implementation. Hence, in the design process we intend to relate implementations to contracts and system obligations. In particular, implementations are used in the contexts defined by contracts and are meant to satisfy all system obligations. In more precise terms, given an implementation that satisfies a contract that conforms to a system obligation, we want such an implementation to also satisfy, in some sense, the system obligation. To formalize this we say that an implementation M satisfies a system obligation B through a contract C = (A, G) if A ∩ M ⊆ B. It can be readily observed that if an implementation M satisfies a contract C = (A, G), and if C conforms to a system obligation B, then M satisfies B through C.
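Continuing the finite-set sketch above, the contract obligation, the maximal implementation, and the identities just derived can be checked mechanically (again, the finite universe and all names are our own illustrative assumptions):

```python
def obligation(c):                    # contract obligation: B_C = A ∩ G
    return c[0] & c[1]

def max_impl(c):                      # maximal implementation: M_C = G ∪ ¬A
    return c[1] | compl(c[0])

def conforms(c, b):                   # C conforms to B iff A ∩ G ⊆ B
    return obligation(c) <= b

for c in canonical:                   # the two identities from the text
    a, _ = c
    assert a & max_impl(c) == obligation(c)          # A ∩ M_C = B_C
    assert obligation(c) | compl(a) == max_impl(c)   # B_C ∪ ¬A = M_C

for _ in range(1000):                 # compositionality of obligations
    c1, c2 = random.choice(canonical), random.choice(canonical)
    assert obligation(compose(c1, c2)) == obligation(c1) & obligation(c2)
```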
Conformance and dominance between contracts are complementary, in the sense that one does not imply the other. Nevertheless, there is a strong relationship between the two. Specifically, given two contracts C1 = (A1, G1) and C2 = (A2, G2), if C2 dominates C1, then C2 conforms to C1 if and only if A2 ⊆ A1 ∪ ¬G2, as shown below:

G2 ⊆ G1 ⇒ A2 ∩ G2 ⊆ G1
A2 ⊆ A1 ∪ ¬G2 ⟺ A2 ∩ G2 ⊆ A1

Note that the condition A2 ⊆ A1 ∪ ¬G2 requires that if a given behavior is not allowed by contract C1 (i.e., is not in A1), but is possible in C2 (i.e., is in G2), then it must be disallowed also by contract C2 (i.e., is not in A2). This condition together with the dominance relation is called strong dominance.

3.3 The Asymmetric Role of Ports

So far we have ignored the role of ports and the corresponding splitting of responsibilities between the implementation and its environment; see the discussion above. Such a splitting of responsibilities avoids competition between environment and implementation in setting the value of ports and variables. Intuitively, an implementation can only provide promises on the value of the ports it controls. On ports controlled by the environment, instead, it may only declare assumptions. Therefore, we will distinguish between two kinds of ports for implementations and contracts: those that are controlled and those that are uncontrolled. Uncontrollability can be formalized as a notion of receptiveness: for E an assertion and P′ ⊆ P a subset of its ports, E is said to be P′-receptive if and only if for every run σ′ restricted to ports belonging to P′, there exists a run σ ∈ E such that σ and σ′ coincide over P′. In words, E accepts any history offered to the subset P′ of its ports. This closely resembles the classical notion of inputs and outputs in programs and HDLs; it is more general, however, as it encompasses not only horizontal compositions within the same layer, but also cross-layer integration, such as the integration between application and execution platform performed at deployment. In some cases, different viewpoints associated with the same component need to interact through some common ports. This motivates providing a scope for ports, by partitioning them into ports that are visible (outside the underlying component) and ports that are local (to the underlying component). The above discussion can be summarized as a profile π = (vis, loc, u, c), which partitions a set of ports P into subsets such that

P = vis ⊎ loc = {visible} ⊎ {local}
P = u ⊎ c = {uncontrolled} ⊎ {controlled}

Thus, in addition to sets of runs, components, implementations and contracts can be characterized by a profile over a set of ports P. As before, for a contract C = (A, G) or an implementation M, the sets A, G and M are constrained to include only runs over P. The satisfaction and dominance relations are easily extended to take profiles into account, by simply insisting that the implementations and the contracts that are put in relation have the same profile. Consequently, conjunction (the greatest lower bound) can only be taken between contracts over the same profiles. If two contracts have different profiles, then an operation of inverse projection is required. However, the
resulting profiles must be consistent regarding which ports are controlled, which are uncontrolled, and which are local. This restriction highlights the fact that the logical operations that we have defined are relative to contracts that refer to the same components, and which must therefore treat controlled and uncontrolled ports in the same way. The situation is different for parallel composition. Here, we enforce the property that each port be controlled by at most one contract. Hence, parallel composition is defined only if the sets of controlled ports of the contracts are disjoint. However, one contract may regard a port as controlled, and the other as uncontrolled. In this case, we are simply stating that the controlling contract determines the value of the port for the other contract. Thus, in the composite contract, a port is controlled exactly when it is controlled by one of the component contracts. Uncontrolled ports of the contracts remain uncontrolled in the composite provided that they are not already controlled by the other contract. A similar reasoning is applied to visible and local ports. In this case, however, we distinguish between the composition of contracts for the same component, and contracts for different components. In the first case, local ports have no effect on the composition, since the scope of local ports extends to the entire component. In the second case, instead, the set of local ports of one contract must be disjoint from the set of ports of the other contract. More formally, for contracts C1 = (π1, A1, G1) and C2 = (π2, A2, G2) for the same underlying component, parallel composition is defined if and only if c1 ∩ c2 = ∅, and in that case it is the contract C = (π, A, G) defined by:

vis = vis1 ∪ vis2,    loc = (loc1 ∪ loc2) − (vis1 ∪ vis2),
c = c1 ∪ c2,    u = (u1 ∪ u2) − (c1 ∪ c2).

The formulas are the same for contracts of different components, where composition is defined only if loc1 ∩ P2 = loc2 ∩ P1 = ∅.

3.4 Consistency and Compatibility

The notion of receptiveness and the distinction between controlled and uncontrolled ports are at the basis of our relations of consistency and compatibility between contracts. Our first requirement is that an implementation M with profile π = (vis, loc, u, c) be u-receptive, formalizing the fact that an implementation has no control over the values of ports set by the environment. For a contract C we say that C is

– consistent if G is u-receptive, and
– compatible if A is c-receptive.

The sets A and G are not required to be receptive. However, if G is not u-receptive, then the promises constrain the uncontrolled ports of the contract. In particular, the contract admits no receptive implementation. This is against our policy of separation of responsibilities, since we stated that uncontrolled ports should remain entirely under
the responsibility of the environment. Corresponding contracts are therefore called inconsistent. The situation is dual for assumptions. If A is not c-receptive, then there exists a sequence of values on the controlled ports that is refused by all acceptable environments. However, by our definition of satisfaction, implementations are allowed to output such a sequence. Unreceptiveness, in this case, implies that a hypothetical environment that wished to prevent a violation of the assumptions would actually have to prevent the behavior altogether, something it cannot do since the port is controlled by the contract. Therefore, unreceptive assumptions denote the existence of an incompatibility internal to the contract, which cannot be avoided by any environment. The notions of consistency and compatibility can therefore be extended to pairs of contracts. We say that two contracts C1 and C2 are consistent or compatible whenever their parallel composition is consistent or compatible. Consistency and compatibility may not be preserved by Boolean operations and by parallel composition. For example, one obtains an inconsistent contract when taking the greatest lower bound of two contracts, one of which promises that certain behaviors will never occur in response to a certain input, while the other promises that the remaining behaviors will not occur in response to the same input. This is because the contracts control the same ports, and composition of promises is by intersection. Similarly, assumptions may become unreceptive as a result of taking the least upper bound. We do not generally use least upper bounds, so we do not elaborate further on this situation. In general, however, the conjunction of compatible contracts is still compatible, since assumptions compose by union. Another form of inconsistency may arise when taking parallel composition. In this case, certain input sequences may be prevented from happening because they might activate unstable zero-delay feedback loops. The resulting behaviors may have no representation in our model, thus resulting in an empty promise. This problem can be avoided by modeling the oscillating behaviors explicitly (perhaps using a special value that denotes oscillation) [11]. We assume that this kind of inconsistency is taken care of by the user or by the tools. Assumptions may also become unreceptive as a result of a parallel composition even if they are not so individually. This is because the set of controlled ports after a composition is strictly larger than before the composition. In particular, ports that were uncontrolled may become controlled, because they are controlled by the other contract. In this case, satisfying the assumptions is the responsibility of the other contract, which acts as a partial environment. If the assumptions are not satisfied by the other contract, then the assumptions of the composition become unreceptive. That is, a hypothetical environment that wished to prevent a violation of the assumptions would actually have to prevent the behavior altogether, something it cannot do since the port is controlled by one of the contracts. Therefore, unreceptive assumptions denote the existence of an internal incompatibility within the composition. Finally, we point out that the operation of transforming a contract to its canonical form preserves consistency (and compatibility), since the promises G are replaced by their most permissive version G ∪ ¬A.
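For intuition, receptiveness admits a direct brute-force check in a toy setting where runs are single-instant valuations of a fixed port set over a finite value domain (a deliberate simplification of the runs used in the paper; all names are ours):

```python
from itertools import product

PORTS = ('x', 'y')        # toy port set; a run is a single-instant valuation
VALS = (0, 1)

def receptive(e, p):
    """E is P'-receptive iff every history on the ports in p extends
    to some run of E (checked by brute force over the value domain)."""
    idx = [PORTS.index(q) for q in p]
    return all(any(all(run[i] == v for i, v in zip(idx, vals)) for run in e)
               for vals in product(VALS, repeat=len(p)))

def consistent(c, u):   return receptive(c[1], u)    # G must be u-receptive
def compatible(c, ctl): return receptive(c[0], ctl)  # A must be c-receptive

# A promise that pins the uncontrolled port x to 0 admits no receptive
# implementation, so a contract with this promise is inconsistent:
G = {(0, 0), (0, 1)}          # runs over (x, y)
print(receptive(G, ('x',)))   # -> False: the history x = 1 is refused
print(receptive(G, ('y',)))   # -> True
```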
3.5 Fusion

When several viewpoints and several components are present in a system, combining conjunction and parallel composition may not be trivial. To overcome the problem, we define a unique operator, which combines the operations of conjunction and parallel composition, and results in an overall contract for a system. We call this operation the fusion of contracts. The fusion operator takes a finite set of contracts (Ci)i∈I as operand, together with a set of ports Q to be eliminated, because they are internal to the component. The fusion of (Ci)i∈I with respect to Q is defined by

[[(Ci)i∈I]]Q = ⊓J⊆I [ ∥j∈J Cj ]Q,    (6)

where J ranges over the set of all subsets of I for which composition is defined and, after the composition, no input is contained in Q. In other words, fusion considers only compositions of contracts for which internal connections have been fully established and discharged, and which therefore talk only about the global input-to-output behavior. To guarantee maximal flexibility in fusion, the subsets J are also chosen to be maximal with respect to containment. By doing so, we avoid considering partial compositions which, when taking the conjunction, restrict the range of accepted environments, and therefore strengthen the assumptions. Certain particular cases are of interest. For instance, when Q = ∅, the fusion reduces to the greatest lower bound: [[(Ci)i∈I]]∅ = ⊓i∈I Ci. Likewise, if for i = 1, 2,

∀Q (Ai ∪ ¬Gi) ⊇ ∀Q (A1 ∪ A2)    (7)
then fusion reduces to the parallel composition operator: [[(Ci)i∈{1,2}]]Q = [C1 ∥ C2]Q. Condition (7) says that the restriction to Q of each contract is a valid environment for the restriction to Q of the other contract. This situation corresponds to two components that interact through ports in Q, which are subsequently hidden from the outside. In practice, fusion computes the parallel composition of contracts attached to different sub-components of a composite, whereas contracts attached to the same composite that involve the same inputs and outputs (including their direction) fuse via the operation of conjunction. The general case lies in between and is given by formula (6).
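Formula (6) can be read structurally as the following sketch, in which the underlying contract operations, and the side condition that no input remains in Q after composition, are abstracted into the parameters `compose`, `conj`, `eliminate` and `composable`, which a concrete contract model would have to supply:

```python
from functools import reduce
from itertools import combinations

def fusion(cs, q, compose, conj, eliminate, composable):
    """Structural reading of formula (6): conjoin, over all maximal
    index sets J for which `composable` holds (composition defined and
    no input left in Q), the Q-elimination of the J-composition."""
    maximal = []
    for r in range(len(cs), 0, -1):               # larger J first
        for j in combinations(range(len(cs)), r):
            if any(set(j) <= set(k) for k in maximal):
                continue                          # keep only maximal J
            if composable([cs[i] for i in j]):
                maximal.append(j)
    parts = [eliminate(reduce(compose, [cs[i] for i in j]), q)
             for j in maximal]
    return reduce(conj, parts)
```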
4 Methodology

The aforementioned relations and operations among contracts set the basis for the composition and manipulation of components whose contracts belong to more than one viewpoint. Moreover, we provide a methodology to orchestrate the usage of the relations, and give guidelines to the user on how to design and verify her model against a number of requirements and constraints that follow the laws presented in the previous sections. We distinguish between the design and the analysis methodology. The former defines the design steps that the user can take for the evolution of her system, while the latter specifies the relations that should be established (or re-established) depending on the corresponding design step.
Fig. 2. Abstract view of the design methodology
An abstract representation of the design methodology is shown in Figure 2. Initially, from a set of requirements, we derive a system model composed of rich components and an (initially empty) set of relations between them. From this point, the user may take a number of design steps and perform analysis, either to enhance this set of relations or to perform design space exploration; the latter is the subject of future work. We detail the design steps later on, after we specify the different elements of the design.

4.1 Elements of the Design

We distinguish between four categories of design elements, as defined in Section 3: contracts, implementations, system obligations, and relations. A rule of thumb for distinguishing the relations is that the consistency, compatibility and dominance relations are established between two contracts; the satisfaction relation between an implementation and a contract; and the conformance relation between a contract and a system obligation. As part of the design methodology, we consider an organization of these elements into three design spaces, and further into layers. The spaces are: the implementation space, the contract space and the system obligation space. Relations (since they are not syntactic elements of the model) are represented as "connections" between elements of the same or different spaces. For example, dominance "connects" two elements within the contract space, while satisfaction "connects" one element from the contract space and one from the implementation space. Note that a relation may "connect" an element with itself, as in the case of the compatibility relation. Each space may be further subdivided into layers. In the context of the SPEEDS project, only the layering of the contract space is relevant, because the main purpose of the HRC model is to represent contracts. However, a layering of the obligation and implementation spaces is also possible.
Fig. 3. Organization of spaces and layers
Figure 3 shows a possible layering of the contract space in the case of automotive applications. Here we identify three layers: 1) the functional layer, describing the basic functional requirements and guarantees; 2) the Engine Control Unit (ECU) layer, which sets forth the high-level timing and architecture assumptions and guarantees; and 3) the Hardware (HW) layer, corresponding to a more detailed description of the individual platform components. In addition, mapping an element from one layer to an element from another creates an element that does not belong to either of the operands' layers. Thus, we have three extra layers for all the possible mappings, as depicted in Figure 4: one layer for the mapping of Function to ECU, one for the mapping of ECU to HW, and one for the mapping of all the layers.

4.2 Design Steps

A design step is a step in the evolution of the development of the design, which can also be seen as the evolution of the design in time. We use the elements defined in the previous section to define the basic design steps that a user can follow during design. In principle, a design step is defined as a tuple of source and target rich components. Moreover, we specify a number of relations that are "required" between the elements of the rich components participating in the step. A design step is said to be validated if the required relations hold. This validation is performed using high-level analysis services which are being developed within the SPEEDS project, and which include tools that can check satisfaction, dominance, compatibility and consistency for a variety of models (from purely discrete to hybrid) using both formal and semi-formal (simulation with dynamic property checking) techniques. Moreover, we introduce the notion of a valid rich component, that is, a rich component whose contract is compatible and consistent, is satisfied by its implementation, and conforms to its obligations.
Fig. 4. Layers derived when mapping elements from different layers
When a design step is validated, and if the source rich components are valid, then the target components are also valid, which means that the resulting component can be used "safely" in place of the originating one, i.e., we can substitute without losing any verification and validation results obtained previously. We subdivide the design steps into two categories:

1. Design steps on single rich components. The first category contains design steps which specify or modify only one element of a rich component, resulting in a new rich component where the remaining elements are unchanged. Let RC = {B, C, M} be the source and RC′ = {B′, C′, M′} the target rich components, where B, B′ are the system obligations, C, C′ the contracts, and M, M′ the implementations.¹ The design steps and the corresponding relations for their validation are described below.

System obligation modification design step is when C′ = C and M′ = M, whereas B′ ≠ B. For the validation of this step there are two options:
– verify that B ⊆ B′, or
– verify that C conforms to B′ ∪ ¬B.

Contract modification design step is when B′ = B and M′ = M, whereas C′ ≠ C. For the validation of this step there are two options:
– verify that C′ strongly dominates C, or
– verify that C′ is compatible, consistent, conforms to C, and M satisfies it.

Implementation modification design step is when B′ = B and C′ = C, whereas M′ ≠ M. For the validation of this step there are two options:
– verify that M′ refines M, or
– verify that M′ satisfies C.

The above steps are called "modifications" even though we may not have a prior definition of the "modified" element, in which case we consider the trivial element.
¹ Even though a rich component may have more than one contract or system obligation, their composition results in a unique one; thus, without loss of generality, we make this assumption for the rest of the document.
2. Design steps on multiple rich components. The second category contains those design steps having more than one source or more than one target rich component. Let RCn = {Bn, Cn, Mn} for n ∈ [1..κ] be the κ source and RC′m = {B′m, C′m, M′m} for m ∈ [1..λ] be the λ target rich components.

Decomposition design step: For decomposition we have κ = 1 and λ ≥ 2. Let RC′ = {B′, C′, M′} be the parallel composition of all rich components RC′m for m ∈ [1..λ]. For the validation of this step there are two options:
– verify that C′ strongly dominates C1, or
– verify that C′ is compatible, consistent, conforms to C1, and M′ satisfies C1.

Composition design step: For composition we have κ ≥ 2 and λ = 1. Since parallel composition preserves (strong) dominance, satisfaction and refinement, no verification task is needed for integration.

Mapping design step: Mapping is a composition (fusion) of design elements from different modeling layers, and we therefore refer the reader to the discussion for composition.

Using the design steps. The above design steps are all possible actions that can advance the system design and are the "bricks" with which to build the design methodology. The design methodology that we follow uses these building blocks in a viewpoint-centric approach. This means that we should not apply any contract prior to performing decomposition. In that way, and following Theorem 1, we retain a greater level of flexibility for the implementations that should satisfy the decomposed components. We can see this in Figure 5, where component RC, containing two contracts Cr and Cf from the real-time and functional viewpoints respectively, has two possible decompositions: rich
Fig. 5. Two possible decompositions of RC
components RC′ and RC′′. Therefore, we propose the decomposition in which the real-time viewpoint is not applied (composed) to the decomposed components. Note that the relations between the different elements of this decomposition should hold according to the definition of the decomposition design step above, if we want a valid design step. Thus, since we have no implementation or system obligation in our example, the following must hold: Cf1 ∥ Cf2 ⪯ Cf.
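Reusing the finite-set contract sketch from Section 3, the validation obligation of the decomposition step (first option) can be phrased as a small checker; `strongly_dominates` follows the definition at the end of Section 3.2, and all helper names are our own:

```python
from functools import reduce

def strongly_dominates(c2, c1):
    """Strong dominance: C2 dominates C1 and, in addition,
    A2 is contained in A1 ∪ ¬G2, so that C2 also conforms to C1."""
    (a1, g1), (a2, g2) = c1, c2
    return dominates(c2, c1) and a2 <= (a1 | compl(g2))

def validate_decomposition(source, targets):
    """Decomposition step, option 1: the parallel composition of the
    target contracts must strongly dominate the source contract."""
    return strongly_dominates(reduce(compose, targets), source)
```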
5 Illustrative Example

Our approach aims at supporting component-based development of heterogeneous embedded systems with multiple viewpoints, both functional and non-functional. The following simple example illustrates this for the case of functional, timed, and safety viewpoints. The top-level view of our system is shown in Figure 6. It consists of a system controller that can let the underlying plant "start", "stop", or "work" (signals r, s, and w). The system controller promises that the mean amount of work performed by the plant does not exceed a maximum and that the work is not paused for too long. A human operator may decide to reinitialize the controller by sending the "reset" message z. The system controller, which is the part of the system under design, must conform to the following obligations:

Protocol obligation: "work" requests can be sent only after a "start" and before a "stop". A "stop" must follow a "start" or a "work" request.

Longest idle time obligation: a "work" request must follow a "start" or a previous request at most τmax seconds later.

Maximum mean work obligation: from the last operator "reset", the amount of "work" requests per unit of time must not be greater than 1/ξ.

Figure 7 shows the automata that specify the obligations. For this example, the notation [g]s/a denotes a transition. It is a triple consisting of a guard g, a triggering event s, and an action a. Action a may, in turn, assign some variables and/or emit some output(s). The idle time and mean work obligations are specified in terms of hybrid automata. These hybrid automata use a timer x bound to physical time, thus satisfying the differential equation x˙ = 1 (x increases with constant speed 1). The system controller is decomposed into several components as shown in Figure 8. It consists of a simple controller that is responsible for sending the "start", "stop", and "work" signals to the underlying plant. The controller is deployed over a computing
Fig. 6. System view
Fig. 7. System obligations (the protocol, longest idle time, and maximum mean work obligations)
Fig. 8. The decomposition of the system controller
platform subject to "failure" f. This component guarantees that the underlying plant receives "work" requests within a maximum amount of time. The supervisor component, instead, limits the "work" requests sent by the controller (the plant has limited capacity) by moving the controller into a "blocked" mode. This is achieved by mimicking the token bucket mechanism used for traffic shaping in communication networks: every unit of time, the supervisor accumulates a token for doing "work"; every request of "work" reduces the token amount by ξ. The supervisor monitors the flow of w's. When they get too frequent, i.e., no token is available, an "overloaded" message o is sent to the controller, stopping it from emitting further w requests. Only after an appropriate amount of time, long enough to let a token accumulate, does the supervisor emit a clear "c"
message to the controller to enable the emission of additional w requests. The supervisor resets the accumulated tokens when the human operator sends the "reset" message z. This system involves three viewpoints: a functional viewpoint, a Quality of Service (QoS) viewpoint of a timed nature, and a safety viewpoint. The contracts for the different viewpoints are depicted in Figures 9–11. For each contract, we show its assumption (top) and promise (bottom). Assumptions are specified as observers, meaning that the corresponding diagrams define the negation
Fig. 9. The two contracts Cfunct and CQoS specifying the two viewpoints of the controller. The assumption is put on top of the promise and both are separated by the implication symbol ⇓.
Fig. 10. The contract Csafety specifying the contract of the computing platform
Fig. 11. Contract Cs of the supervisor and its behaviour
of these assumptions. In these diagrams, the circles filled in black denote non-accepting states. Figure 9 depicts the set of contracts associated with the controller. The first contract, Cfunct, describes the functional aspect under a trivial assumption. The promise of the contract CQoS indicates that there exist two modes: nominal, corresponding to normal operation, and blocked, in which w's are not emitted. Contract CQoS relates to timing. This contract assumes that, if the controller is blocked, it will move to the nominal mode (by receiving an event c) at most τn seconds after the last "work" request. If not, the observer will move to a non-accepting state. When in nominal mode, the controller guarantees that the time interval between two successive "work" requests is at most τn seconds. The timer h is dedicated to computing the elapsed time for both the assumption and the promise.
Contract Csafety, shown in Figure 10, is attached to the computing platform and asserts an assumption of no failure. The failure f is abstracted and considered as an input to the platform itself. The promise is not provided and can be thought of as "any" possible behavior (i.e., the "universe"). If a failure event arrives, then the assumption moves to a non-accepting state, meaning that nothing can be guaranteed about the provided behavior of the platform. This kind of contract is useful to introduce assumptions without altering the guarantees of the system.

Figure 11 depicts the QoS contract for the supervisor, which is in charge of avoiding the system collapse that may occur when an excessive amount of "work" is supplied to the plant. The assumption says that no w must occur when the system is in the overloaded state. The promise is specified in terms of a hybrid automaton. This hybrid automaton uses two clocks x and h bound to physical time, thus satisfying the differential equations x˙ = 1, h˙ = 1. Timer x is used to implement the token bucket mechanism, while timer h is used to guarantee that a w request will be delayed by at most τn seconds. When action w occurs, timer x decreases and, if w occurs too frequently, x eventually drops below ξ, which causes the emission of message o and switches the mode to "blocked", where the latency is at most the smaller of ξ and τn. At some point, when x is again greater than ξ, a clearing message c is sent to the controller to switch to mode "nominal". It is guaranteed that the sending of c occurs at most τn seconds after the last "work" command. It is also guaranteed that, if the operator sends a reset by an event z, then the system resets the timer value x and after ξ seconds the system turns back to its nominal mode.

Contract conformance to the system obligations. As introduced in Section 3.2, system obligations are compositional, i.e., if a contract C1 conforms to a system obligation B1 and a contract C2 conforms to a system obligation B2, then the parallel composition of C1 and C2 conforms to the system obligation B1 ∩ B2. This property allows us to "allocate" system obligations (see Fig. 7) to components in order to check conformance of the corresponding contracts separately, thereby reducing the complexity of the verification task. In order for this check to be successful it is necessary that the system obligation's interface be part of the allocated component's interface. For example, the protocol obligation relates r, s and w, which are outputs of the controller component (see Fig. 8). Hence, we can allocate the protocol obligation to the controller for the conformance check. Similarly, the longest idle time obligation relates r and w, so that it can also be allocated to the controller component for the conformance check. Conversely, the maximum mean work obligation relates z and w. Hence, this obligation can either be allocated to the supervisor, because z and w are the supervisor's inputs, or to the parallel composition of the supervisor with the controller, because w is also a controller output. In the former case, the verification task is simpler, because it does not require the parallel composition of the supervisor and the controller. Nonetheless, the conformance check may fail in this case, because w is controlled by the controller and not by the supervisor, so that the parallel composition might in the end be necessary to verify conformance to the system obligation.
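As an aside, the supervisor's token-bucket discipline described above can be simulated with a few lines of discrete-event code. This is a sketch under our own event encoding; clearing is only observed at event times here, and the τn deadline bookkeeping is omitted:

```python
def supervisor_trace(events, xi=3.0):
    """Discrete-event sketch of the supervisor's token bucket:
    x grows at rate 1, an accepted w costs xi; with no token available
    an 'o' blocks the controller, and a 'c' clears it once x >= xi
    again. `events` is a sorted list of (time, 'w' | 'z') pairs."""
    x, last, blocked, out = 0.0, 0.0, False, []
    for t, ev in events:
        x += t - last                 # token accumulation (rate 1)
        last = t
        if blocked and x >= xi:       # a token has accumulated: clear
            blocked = False
            out.append((t, 'c'))
        if ev == 'z':                 # operator reset
            x, blocked = 0.0, False
        elif ev == 'w' and not blocked:
            if x >= xi:
                x -= xi               # [x >= xi] w / x := x - xi
            else:
                blocked = True        # no token: overloaded
                out.append((t, 'o'))
    return out

print(supervisor_trace([(0.5, 'w'), (0.6, 'w'), (5.0, 'w')]))
# -> [(0.5, 'o'), (5.0, 'c')]
```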
To illustrate how the conformance check works, we show that the controller’s contract conforms to the protocol and the longest idle time obligations. Let us first consider
Fig. 12. Composed contract of the functional and safety viewpoints of the system controller

Fig. 13. Controller's contract obligation and negation of the corresponding system obligation
the protocol obligation. Observe that the controller's promise (Fig. 9) is equal to the protocol obligation. Conformance of a contract to a system obligation requires that the contract's promise subject to the contract's assumption (also called the contract obligation) be contained in the system obligation. In formulas: A ∩ G ⊆ B, where (A, G) denotes the contract and B denotes the system obligation. Since the controller's promise is equal to the protocol obligation, the controller's contract conforms to the protocol obligation under any assumption. This shows that the controller's contract conforms to the protocol obligation even after composition with the safety viewpoint (Figs. 10 and 12). Let us now consider the longest idle time obligation. The conformance check of the controller's contract against this obligation is not as trivial as in the case of the protocol obligation. To verify conformance in this case we need to compute the contract obligation A ∩ G and check the containment relation with the longest idle time obligation. To do so we can check that A ∩ G ∩ ¬B = ∅. The contract obligation and the negation of the longest idle time obligation are shown in Fig. 13. To compute the negation of the longest idle time obligation, we first complete the obligation's specification with its non-accepting states (not shown, for clarity). Since the resulting automaton is
deterministic, its negation can be computed by exchanging accepting and non-accepting states. If we now take the intersection of the two automata shown in Fig. 13, we obtain an automaton that has no accepting states provided τn ≤ τmax, therefore representing the empty assertion. This proves that conformance is met.
6 Conclusion

We have presented mathematical foundations for a contract-based model for embedded systems design. Our generic mathematical model of contracts supports "speculative design". This is achieved by focusing on component behavior, via compositions of contracts, with which diverse (functional and non-functional) aspects of the system can be expressed. This enabled a formalization of the whole process of component and multiple viewpoint composition through the general mechanism of contract fusion. A key contribution of our approach is that the incremental consideration of components and viewpoints can be handled with flexibility — whether through a component-centric or a viewpoint-centric methodology. The formalism and the design methodology have been illustrated through a multi-viewpoint example.

Future work includes the development of effective algorithms to handle contracts, coping with the problems raised by complementation. Taking complements is a delicate issue: hybrid automata are not closed under complementation; in fact, no model class is closed under complementation beyond deterministic automata. To account for this fact, various countermeasures can be considered. First, the designer has the choice to specify either E or its complement ¬E (e.g., by considering observers). However, the parallel composition of contracts requires manipulating both E and its complement ¬E, which is the embarrassing case. To get compact formulas, our theory was systematically developed using canonical forms for contracts. Not enforcing canonical forms provides room for flexibility in the representation of contracts, which can be used to avoid manipulating both E and ¬E at the same time. A second idea is to redefine an assertion as a pair (E, Ē), where Ē is an approximate complement of E, e.g., involving some abstraction. In doing so, one of the two characteristic properties of complements, namely E ∩ Ē = ∅ or E ∪ Ē = ⊤, does not hold. However, either necessary or sufficient conditions for contract dominance can still be given. The above techniques are the subject of ongoing work and will be reported elsewhere.
Acknowledgments The authors would like to acknowledge the entire SPEEDS team for their contribution to the project and to the ideas presented in this paper.
References 1. Damm, W.: Embedded system development for automotive applications: trends and challenges. In: Proceedings of the 6th ACM & IEEE International conference on Embedded software (EMSOFT 2006), Seoul, Korea, October 22–25 (2006)
2. Butz, H.: The Airbus approach to open Integrated Modular Avionics (IMA): technology, functions, industrial processes and future development road map. In: International Workshop on Aircraft System Technologies, Hamburg (March 2007)
3. Sangiovanni-Vincentelli, A.: Reasoning about the trends and challenges of system level design. Proc. of the IEEE 95(3), 467–506 (2007)
4. Damm, W.: Controlling speculative design processes using rich component models. In: Fifth International Conference on Application of Concurrency to System Design (ACSD 2005), St. Malo, France, June 6–9, pp. 118–119 (2005)
5. Meyer, B.: Applying "design by contract". IEEE Computer 25(10), 40–51 (1992)
6. Dijkstra, E.W.: Guarded commands, nondeterminacy and formal derivation of programs. Communications of the ACM 18(8), 453–457 (1975)
7. Lamport, L.: win and sin: Predicate transformers for concurrency. ACM Transactions on Programming Languages and Systems 12(3), 396–428 (1990)
8. Back, R.J., von Wright, J.: Contracts, games, and refinement. Information and Computation 156, 25–45 (2000)
9. Back, R.J., von Wright, J.: Refinement Calculus: A Systematic Introduction. Graduate Texts in Computer Science. Springer, Heidelberg (1998)
10. Dill, D.L.: Trace Theory for Automatic Hierarchical Verification of Speed-Independent Circuits. ACM Distinguished Dissertations. MIT Press (1989)
11. Wolf, E.S.: Hierarchical Models of Synchronous Circuits for Formal Verification and Substitution. PhD thesis, Department of Computer Science, Stanford University (October 1995)
12. de Alfaro, L., Henzinger, T.A.: Interface automata. In: Proceedings of the Ninth Annual Symposium on Foundations of Software Engineering, pp. 109–120. ACM Press, New York (2001)
13. Chakrabarti, A., de Alfaro, L., Henzinger, T.A., Stoelinga, M.: Resource interfaces. In: Alur, R., Lee, I. (eds.) EMSOFT 2003. LNCS, vol. 2855, pp. 117–133. Springer, Heidelberg (2003)
14. Henzinger, T.A., Jhala, R., Majumdar, R.: Permissive interfaces. In: Proceedings of the 13th Annual Symposium on Foundations of Software Engineering (FSE 2005), pp. 31–40. ACM Press, New York (2005)
15. Negulescu, R.: Process spaces. In: Palamidessi, C. (ed.) CONCUR 2000. LNCS, vol. 1877. Springer, Heidelberg (2000)
16. Passerone, R.: Semantic Foundations for Heterogeneous Systems. PhD thesis, Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA 94720 (May 2004)
17. Burch, J., Passerone, R., Sangiovanni-Vincentelli, A.: Overcoming heterophobia: Modeling concurrency in heterogeneous systems. In: Proceedings of the 2nd International Conference on Application of Concurrency to System Design, Newcastle upon Tyne, UK, June 25–29 (2001)
18. Brookes, S.D., Hoare, C.A.R., Roscoe, A.W.: A theory of communicating sequential processes. Journal of the Association for Computing Machinery 31(3), 560–599 (1984)
19. Engelfriet, J.: Determinacy → (observation equivalence = trace equivalence). Theoretical Computer Science 36, 21–25 (1985)
20. Brookes, S.D.: On the relationship of CCS and CSP. In: Díaz, J. (ed.) ICALP 1983. LNCS, vol. 154. Springer, Heidelberg (1983)
21. Lee, E.A., Sangiovanni-Vincentelli, A.L.: A framework for comparing models of computation. IEEE Transactions on Computer Aided Design of Integrated Circuits and Systems 17(12), 1217–1229 (1998)
Coordination: Reo, Nets, and Logic

Dave Clarke

CWI, Amsterdam, The Netherlands
[email protected]
Abstract. This article considers the coordination language Reo, a Petri net variant called zero-safe nets, and intuitionistic temporal linear logic (ITLL). The first part examines the semantics of the coordination language Reo in relation to zero-safe nets. Although the external presentations of the two models are quite different, the difference in underlying semantics is rather small. In fact, Reo connectors can be compositionally encoded into zero-safe nets. This means that the tools and techniques developed for Petri nets over the last 30 years, as well as various extensions of the zero-safe net model, such as reconfigurable and dynamic nets, can be adapted to the Reo setting. The second part re-examines the idea of using linear logic as a basis for coordination languages. Specifically, we argue that intuitionistic temporal linear logic (ITLL) can encode the semantics of Reo and zero-safe nets, by encoding their notion of transaction. Moreover, by adapting the encoding and exploiting the additional connectives of ITLL, it can form the basis of an expressive coordination language which goes beyond these models, by introducing means for explicitly reasoning about choices made by the environment and by providing more fine-grained control over the timing of interaction.
1 Introduction
Pundits of coordination languages and models argue that concurrent, component-based software should be broken into two independent parts: the entities performing computation (components), and the entities coordinating the data flow and resource usage between the components. This was crystallised into the equation [23]: Concurrent Programming = Computation + Coordination. To date, numerous coordination models have been proposed, each realising this philosophy in a different manner [4,37]. The ones we are interested in in this paper endeavour to schedule collections of actions together into multi-party transactions, in the sense that either some desired set of actions occur together or none of them do, possibly at the exclusion of other collections of actions. The following variant of the well-known holiday booking example illustrates well what we are after. Consider the following:
This work is in the context of the EU project IST-33826 CREDO: Modeling and analysis of evolutionary structures for distributed services (http://credo.cwi.nl)
– a flight to Rio, a hotel on the beach, and a dune-buggy; or
– a flight to Zurich, a hotel on the mountain, and a 4x4; or
– stay home.
The goal is to book, with suitably matching dates, either all the entries in the first box and none of the entries in the second box, or all the entries in the second box and none of those in the first box, or nothing at all (the third box). Our assumption is that various components (or services) providing the basic functionality for booking flights, hotels, dune buggies, 4x4s, etc. exist, and the task of the coordinator is to plug these together in such a way to achieve the desired goal. In order to discuss the kind of coordination we are interested in this paper, we first need to establish some terminology. We consider coordination not simply as a one-off event, but rather as an ongoing activity. Conceptually, time is partitioned into a number of contiguous epochs, which we consider as the units of coordination, meaning that the coordinator attempts to achieve something within each epoch, and that, from the perspective of an external observer, no finer granularity of time is observable. From the perspective of the coordinator, a lot may occur within an epoch. We say that a set of actions A is atomic wrt to another set of actions B if and only if (1) either all of A occur or none of A within an epoch; and (2) the actions A are not interleaved with the actions in B within that epoch. The actions A are said to be mutually exclusive with the actions B. It is clear that this notion of atomic can be broken into two part. Indeed, this is how atomicity is conceived in the coordination language Reo [6]. In Reo, and in this article, the notion of synchrony is associated with (1), that is, a set of actions are considered synchronous if and only if they either all occur or none of them occurs within an epoch. The notion of asynchrony or mutual exclusion is associated with (2), that is, two sets of actions A and B are asynchronous if and only if if some a ∈ A occurs within an epoch, then no element of B can occur within that epoch, and vice versa. Simply put, two events that necessarily occur within the same epoch are considered to be synchronous, and two events that necessarily occur within different epochs are considered to be asynchronous. Synchrony can be also be equated with a primitive notion of transaction. Using this terminology, we say that the actions underlying the Rio-holiday are synchronous, the actions underlying the Zurich-holiday are synchronous, and the Rio-holiday is asynchronous with the Zurich-holiday. We will work with the following definition of coordination: coordination is the task of (continuously/repeatedly) scheduling groups of dependent actions within temporal epochs. In this definition, ‘scheduling’ captures both that actions may be scheduled to occur or to not occur—or perhaps even that they occur and are rolled-back. Furthermore, ‘dependent’ captures that one action may have temporal or data
dependencies on another action, so even though all actions are conceptually considered to happen at the same time (synchronously), there may be a sequential ordering among them, though only the fact that they occur within the same epoch is relevant for our purposes. The two coordination models considered in this paper are Reo [6] and zero-safe nets [15]. These models, more or less, employ the notions defined above, coordinating by grouping together actions into synchronous epochs at the exclusion of other possibilities. In both cases, the possibilities available in different epochs change due to state changes of the primitive elements of the coordination model.

Reo. Reo is a channel-based coordination model designed by Farhad Arbab, and developed extensively by his group at CWI [6]. In Reo, component connectors are built by plugging together primitive channels.¹ Channels in Reo come in a variety of behavioural variants, but one characteristic persistent across the entire set is that some channels are synchronous, as defined above, whereas others are asynchronous; other channels admit variants of these possibilities or change their behaviour depending upon their state. Channels in Reo can be composed to express a wide range of possible behaviours. Composition occurs at nodes—collections of coincident channel ends—which consist of a number of channel ends writing data into the node, and a number of ends taking data from the node. The node selects one data item written to it, at the exclusion of all others, and duplicates it to all the ends taking data, synchronously—thus all such ends must be capable of accepting the data. Nodes have the policy that they cannot buffer data; it must be passed onward. The power of Reo comes from the combination of the synchrony and mutual exclusion constraints of channels and nodes, resulting in the propagation of synchrony and exclusion across a connector. This means, for example, that if two primitives require that their ports a, b, c and d, e, respectively, are synchronised, and they are plugged together, joining c and d, then the synchronisation propagates over the composite of the two primitives, meaning that a, b, c(= d), e will all be synchronised. If action f is mutually exclusive with a, then it will be mutually exclusive with b, c(= d), and e as well after the composition. Synchronisation and mutual exclusion constraints restrict the possible ways that data can flow through a connector. As we shall see, this means that data cannot simply flow through a connector; rather, only some of the possible ways data can flow are valid.

Consider the example in Figure 1(a). Primitives 1(Anpr), 3(moB), and 7(stC) are replicators. They move data synchronously from A, m and s, respectively, to npr, oB, and tC. Primitives 2(nm) and 6(rs) are lossy sync channels. They move data synchronously from n and r to m and s, respectively, or simply lose the data at n and r, respectively. Primitive 5(otq) is a merger. It non-deterministically chooses between moving data from o to q or from t to q. Finally, primitive 4(pq) is a synchronous drain. It will remove data from p and q synchronously. The semantics of this connector can be understood as the following 'token game': the goal is to move a token from boundary node
We sometimes write primitive instead of channel to de-emphasise the requirement that they have two ends.
Fig. 1. (a) An exclusive router in Reo. (The numbering of primitives will be used later). (b) Valid data flow. (c) Stuck data flow.
A so that (1) moving it through the connector obeys all the above-mentioned rules imposed by the primitives, (2) no token gets stuck on an internal node {n, m, o, p, q, s, r, t}, and (3) no primitive is used more than once, so that (4) the token possibly ends up on B and/or C or is lost. Two of the possible 'plays' are presented in Figures 1(b) and 1(c). The dashed (blue) line marks the path of the token(s) moved through the connector. In the first case, choices were made satisfying the rules of the game, and thus this corresponds to a correct behavioural possibility of the connector. In the second case, we were over-zealous and allowed the data to pass through both lossy syncs nm and rs; because the merger can choose only one item from o and t, the token will get stuck on, for example, o, and thus the behavioural possibility indicated (copying data from A to B and C) is invalid. After playing with the various possibilities for this connector, the conclusion should be that it is possible to move the token on A to either B or C, exclusively. This connector is in fact called an exclusive router.

Zero-safe Nets. Coordination in zero-safe nets [15] is achieved in a different way. Zero-safe nets are a variant of Petri nets which includes certain places called zero places. These places cannot be observed, meaning that any transition involving a marking of such a place is internal. In contrast, normal places are called stable. Only markings of stable places are observable. The basic firing of transitions of a zero-safe net is the same as for an ordinary Petri net. The main difference is that the semantics is expressed at two levels. The micro-level is more or less the same as in an ordinary Petri net, except that a micro-step may involve multiple firings. Precisely, it may involve the movement of any number of tokens into and out of zero places, but a token that is moved into a stable place may not be moved out of it again. In this way a series of micro-steps makes up a so-called macro-step, which models a transaction between observable states. The internal transitions can be seen as synchronising the observable ones. As usual, choice in a Petri net is modelled by the competition between different transitions for the same token. Figure 2(a) presents an example zero-safe net solving the dining philosophers problem (adapted from Bruni et al. [15]). A token may be moved from
Fig. 2. (a) A Zero-safe Petri net solving the dining philosophers problem. Stable places are represented using large circles. Zero places are represented using small circles. (b) A successful transaction. Notice that it involves multiple internal steps. A stuck configuration would result, for example, if the far right hand choice had selected the fork.
fork (a stable place) to one of the zero places between the choice and the take transitions. A take transition will fire only if both connected zero places and the hungry place have tokens. The transition from having two forks (on the table) to eating involves a number of internal transitions. Only combinations of internal transitions that do not get stuck are considered to be valid transactions. Thus, in this example, the two choice transitions need to coordinate in order to ensure that the transaction succeeds (Figure 2(b)).

This Work. On the surface, the coordination models Reo and zero-safe nets do not look particularly close. We show that they are surprisingly similar by casting their semantic models into a common, simple framework. The semantics of Reo and zero-safe nets are cast in terms of transitions between elements of a pair of monoids. The monoids respectively represent the stable parts of a configuration (states of buffers or stable places) and the internal parts of a configuration (the nodes or the zero places), and transitions represent the synchronous flow of data through Reo primitives or the firing of Petri net transitions. A common set of operational rules gives the semantics for the two models, and the difference is ultimately in the details of the monoid used. After casting Reo's and zero-safe nets' semantics into a common semantic framework, we show that Reo connectors can be compositionally encoded into zero-safe nets. This result applies to causal Reo models which do not include any notion of context dependence, priority or maximal progress. This creates the opportunity for transferring tools, theoretical results, and implementation techniques for Petri nets to Reo.

The second part of the paper explores the idea of using intuitionistic temporal linear logic [26] as a meta-language for describing coordination models that have a transactional flavour. We then use the logic to encode both Reo and zero-safe nets. Transitions are encoded as linear formulæ describing the transfer of data between ports and state changes. In addition, the temporal modalities of the logic are used to control in which epochs events may, must and cannot occur. For example, □m ⊗ z ⊸ ◯□m′ ⊗ z′ states that a connector in state (or stable place) m, in the presence of data on node z (or in zero place z), can be synchronously
transformed into state (or stable place) m′, which cannot be used until the next step or thereafter, and data on node (or in zero place) z′, which must be consumed in this epoch. The □ modality on the m represents that connector (or place) m may or may not be used in the epoch, and the absence of such a modality on z′ states that it must be used within the epoch. The semantics of Reo and zero-safe nets are soundly encoded as proofs of judgements of a certain shape, which succeed in either consuming all formulæ that must be consumed within an epoch or by delaying the consumption of formulæ until another epoch. The encoding consists of encoding the common operational framework and showing how the firing of primitive elements can be encoded. A side benefit of using an intuitionistic logic is that our model of Reo does not suffer from the causality problems which blight most other models of Reo. Finally, we describe how the additional connectives of ITLL can express coordination patterns beyond those expressible in Reo and zero-safe nets, by exploiting the game-like nature of the logic to express choices made by the environment, in contrast to the choices made by the coordinator, and by varying the use of the temporal modalities to change both the specification of the epoch in which an event occurs and who decides when it occurs.

Caveat. Many semantic models for Reo have been presented in the literature to address more complex classes of connectors possessing mechanisms such as context dependency [18]. In this article, we take the simplest published semantics of Reo as our basis [8], though the semantics we present does not suffer from any causality problems, so is somewhat closer in spirit to the operational semantics of Mousavi et al. [35]. In any case, there is no natural semantics for Reo—Reo makes sense only when given a semantic model, and an encoding of channel semantics and composition in that model—so we pursue but a reasonable model of Reo.

Organization. The paper is organized as follows: Section 2 presents a unified semantic framework into which both Reo's and zero-safe nets' semantics are cast, and shows how Reo can be encoded into zero-safe nets. Section 3 describes intuitionistic temporal linear logic. Section 4 describes how both Reo and zero-safe nets are encoded into ITLL. Section 5 explores ITLL as a coordination model. Section 6 describes related work. Section 7 concludes the paper.
2 Unified Semantic Framework
One of the goals of this paper is to compare the coordination language Reo and zero-safe nets. We do so by casting both formalisms into a common semantic framework. An appropriate, simple framework was developed by Bruni et al. [13] for studying zero-safe nets, which extends the well-known 'Petri nets are monoids' approach [34]. Typically, the monoids are so-called place monoids modelling the markings of places—essentially, the monoids are multi-sets, with multi-set union as the multiplication operation. The variant adapted for zero-safe nets considers the net as two monoids, one for stable places and one for zero places.
Here we model Reo simply by choosing different (partial) monoids as the basis of the semantics, while keeping the same operational rules as for zero-safe nets. By making the monoid partial, we are able both to place an upper bound on the placement of tokens, modelling the requirement in Reo that nodes (temporarily) hold at most one token, and to track the fact that each Reo primitive can fire at most once per epoch.

The semantics is based on two reduction relations, macro-step reduction ⇒Δ ⊆ M × M and micro-step reduction →Δ ⊆ (M × Z) × (M × Z), between configurations capturing the state of a system, where the configurations are elements of M and M × Z, respectively, and M and Z are monoids. Both relations are determined by a set of primitive firings Δ ⊆ (M × Z) × (M × Z), described below. External configurations M represent externally observable states. Internal configurations in M × Z represent, in addition, the internal, unobservable states. M denotes the stable part of a configuration, which persists between epochs, and is externally observable. Z denotes the internal or unobservable part of a configuration. States in Z are used in internal steps to coordinate externally observable transitions, but do not themselves persist from one epoch to another. Thus, macro-steps capture what occurs from one epoch to another epoch as far as the outside world is concerned, whereas micro-steps capture what occurs within an epoch in order to make a macro-step happen—the coordination. One can think of the macro-step semantics as an extensional description of behaviour, and the micro-step semantics as an intensional description of behaviour.

In Reo, M records the states of the primitive connectors, whereas Z records the data flow on nodes, as the semantics of Reo do not permit data to be (observably) buffered on nodes. The chosen monoids (Section 2.2) will ensure that each primitive can be in only one state and can fire only once per epoch, and that only one data item is permitted on a node. In zero-safe nets, M records the markings of the stable places, whereas Z records the markings of the zero places. Only stable places can form part of external configurations. The stable places may change at most once during each epoch. More precisely, each stable token in a zero-safe net may be moved out of a stable place and, eventually, into another stable place at most once per epoch. The zero places are used to coordinate the activities occurring within an epoch. These may be used an arbitrary number of times, and can be fired either sequentially or in parallel.

We now present a generic operational semantics which can be used to give the semantics of both Reo and zero-safe nets. The semantic rules are parameterised both by two monoids and by a set of transitions, Δ ⊆ (M × Z) × (M × Z), describing the firing of primitives (such as channels and nodes in Reo and transitions in zero-safe nets). Each element of Δ is written as (m, z) ↦ (m′, z′), where m, m′ ∈ M and z, z′ ∈ Z, stating that there is a transition taking internal configuration (m, z) to internal configuration (m′, z′). Note that only transitions between internal configurations are provided (though z and z′ may be 0, and hence not involve internal states). These form the basis for micro-steps; macro-steps are derived from them using the rules of the operational semantics.
(firing)
    (m, z) ↦ (m′, z′) ∈ Δ    m″ ∈ M    z″ ∈ Z
    ──────────────────────────────────────────
    (m ∘ m″, z · z″) →Δ (m′ ∘ m″, z′ · z″)

(parallel)
    (m₁, z₁) →Δ (m₁′, z₁′)    (m₂, z₂) →Δ (m₂′, z₂′)
    ─────────────────────────────────────────────────
    (m₁ ∘ m₂, z₁ · z₂) →Δ (m₁′ ∘ m₂′, z₁′ · z₂′)

(concatenation)
    (m₁, z) →Δ (m₁′, z′)    (m₂, z′) →Δ (m₂′, z″)
    ──────────────────────────────────────────────
    (m₁ ∘ m₂, z) →Δ (m₁′ ∘ m₂′, z″)

(commit)
    (m, 0_Z) →Δ (m′, 0_Z)
    ─────────────────────
    m ⇒Δ m′

Fig. 3. Operational Semantics Rules. A transition is defined only when each of its constituent parts is defined. (M, ∘, 0_M) and (Z, ·, 0_Z) are monoids (Section 2.1).
Figure 3 presents the generic operational semantic rules for the micro- and macro-steps of a system represented by monoids M and Z and primitive transition relation Δ. The rule (firing) describes how a primitive firing embeds into the rule format; it also captures the parts of a configuration that do not change during a particular micro-step. The rule (parallel) describes two independent micro-steps firing in parallel, in different parts of the connector/net. The rule (concatenation) describes the sequential composition of two series of micro-steps. The two micro-step sequences typically deal with different parts of the configuration. This rule acts as a coordinator for the various actions which may occur in parallel. The rule (commit) describes macro-steps as a series of micro-steps starting from an external configuration and ending in an external configuration. At the beginning and end of an epoch, the Z component must equal the unit element of the monoid, meaning, in zero-safe nets, that there are no tokens on zero places, and, in Reo, that no data is buffered in nodes.

Some advantages of the monoid-based semantics are that it can define the semantics of a system in a piecemeal fashion and that it can express non-interleaved concurrency. For example, a part of a Reo connector (or zero-safe net) may be described by configuration (m₁, z₁) and another part by (m₂, z₂). The configuration (m₁ ∘ m₂, z₁ · z₂) describes the combination of these two connector (net) parts, assuming that it is defined. If we have micro-step transitions (m₁, z₁) →Δ (m₁′, z₁′) and (m₂, z₂) →Δ (m₂′, z₂′), the transition (m₁ ∘ m₂, z₁ · z₂) →Δ (m₁′ ∘ m₂′, z₁′ · z₂′) may be possible in the larger connector, depending on certain conditions.
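To make the rule format concrete, here is a minimal executable sketch (ours, not from the paper) of the micro- and macro-step semantics. Monoid elements are represented as multisets of atoms (Python Counters), so composition is multiset union; partiality is imposed by an `ok` predicate that rejects undefined compositions. All names are our own. The (parallel) rule is not implemented separately: for reachability its effect is subsumed by interleaving micro-steps.

```python
from collections import Counter, deque

def contains(big, small):
    """Is the multiset `small` contained in `big`?"""
    return all(big[k] >= n for k, n in small.items())

def micro_steps(conf, delta, ok):
    """One application of the (firing) rule inside a larger configuration:
    split off a firing's left-hand side and keep the rest as an unchanged frame."""
    m, z = conf
    for (fm, fz), (fm2, fz2) in delta:
        if contains(m, fm) and contains(z, fz):
            nm, nz = (m - fm) + fm2, (z - fz) + fz2
            if ok(nm, nz):  # partial monoids: reject undefined compositions
                yield nm, nz

def macro_steps(m0, delta, ok=lambda m, z: True, fuel=10_000):
    """All m' with m0 => m': chain micro-steps, (concatenation)-style, from
    (m0, 0_Z), and (commit) whenever the Z component becomes empty again."""
    start = (m0, Counter())
    freeze = lambda c: (frozenset(c[0].items()), frozenset(c[1].items()))
    seen, queue, results = {freeze(start)}, deque([start]), set()
    while queue and fuel > 0:
        fuel -= 1
        conf = queue.popleft()
        for nxt in micro_steps(conf, delta, ok):
            if not +nxt[1]:                 # empty Z component: commit
                results.add(frozenset(nxt[0].items()))
            if freeze(nxt) not in seen:
                seen.add(freeze(nxt))
                queue.append(nxt)
    return results
```

The breadth-first search makes the transactional nature of (commit) visible: partial sequences of micro-steps that strand tokens in the Z component are explored but never reported as macro-steps.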
2.1 Partial Commutative Monoids
The generic semantic rules presented above are parameterized by two partial commutative monoids—we will typically just say monoid for partial commutative monoid. We now define these and present a number of example monoids used in this paper. But first, some notation.

Notation 1. Recall that A^B = {f : B → A} describes the set of functions from B to A, and that B ⇀ A is the set of partial functions from B to A. If f : B ⇀ A, we write dom(f) for the part of B on which f is defined, i.e., dom(f) = {b ∈ B | f(b) is defined}. Given a set C and a C-indexed collection of sets Q_{c∈C}, define the partial dependent function space C ⇀ Q_{c∈C} to be the subset of the partial functions
C ⇀ ⋃_{c∈C} Q_c such that if c ∈ dom(f), then f(c) ∈ Q_c. Such a function maps each defined c to a member of Q_c. Let Data be a non-empty set representing the data domain.

Definition 1 (Partial Commutative Monoid). A partial commutative monoid M^∘ = (M, ∘, 1) consists of a set M, a unit element 1 ∈ M, and a partial function ∘ : M × M ⇀ M, obeying the following axioms:
– (a ∘ b) ∘ c = a ∘ (b ∘ c)   (associativity)
– a ∘ b = b ∘ a               (commutativity)
– a ∘ 1 = 1 ∘ a = a           (neutral element)
Definition 2 (Product). Given two monoids M^∘ = (M, ∘, 1_M) and N^· = (N, ·, 1_N), their product M^∘ × N^· is defined as M^∘ × N^· = (M × N, •, 1_{M×N}), where:
– (m, n) • (m′, n′) = (m ∘ m′, n · n′), whenever both components are defined, and undefined otherwise; and
– 1_{M×N} = (1_M, 1_N).

The following two monoids help define semantics for Petri nets. The first models the number of tokens on each place in the net, whereas the second models coloured tokens in coloured variants.

Definition 3 (Place Monoid). Given a set of places P, define the monoid PM(P) = (N^P, ·, 0) as:
– (p₁ · p₂)(x) = p₁(x) + p₂(x), and
– 0(x) = 0.

Definition 4 (Coloured Place Monoid). Given a set of places P, define the monoid CPM(P) = (P → (Data → N), ·, 0) with
– (p₁ · p₂)(x) = λd.(p₁(x)(d) + p₂(x)(d)), and
– 0(x) = λd.0.

Typically, we would require that the function located at each place has finite support—meaning that each place holds a finite multiset of data items.

The following three monoids are used to define the semantics of Reo. The first two model data on nodes: the black-and-white node monoid models the case where data values have no significance (they are signals), whereas the coloured node monoid models the case where the data values are of interest. In both cases, elements of the monoid represent a partial mapping of nodes to data values representing the value stored on the node (or just whether it has data in the B/W case). The key point to note is that composition is undefined for two elements of the monoid that give a data value for the same node. The state monoid is similar to the coloured node monoid, in that it describes a partial function; the main difference is that each element of the domain of the function has its own codomain. The domain C is used to represent the primitives, and each component Q_c of the codomain represents the state set of component c. Again, only one state per connector is permitted, so the composition operation is also partial.
Definition 5 (B/W Node Monoid). Given a set of nodes N, define the monoid NM(N) = (2^N, ·, ∅) with
– n₁ · n₂ = n₁ ∪ n₂ if n₁ ∩ n₂ = ∅, and undefined otherwise.

Definition 6 (Coloured Node Monoid). Let N be a set of nodes. A coloured node monoid is a triple CNM(N) = (N ⇀ Data, ·, 0), where
– f₁ · f₂ = f₁ ∪ f₂ if dom(f₁) ∩ dom(f₂) = ∅, and undefined otherwise;
– 0 is the nowhere defined function.

Definition 7 (State Monoid). Let C be a set, and for each c ∈ C, let Q_c be a non-empty set. Define the monoid SM(C, Q_{c∈C}) = (C ⇀ Q_{c∈C}, ·, 0) where:
– f₁ · f₂ = f₁ ∪ f₂ if dom(f₁) ∩ dom(f₂) = ∅, and undefined otherwise;
– 0 is the nowhere defined function.
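In the multiset representation of the executable sketch above, these monoids differ only in which compositions count as defined. The following validity predicates (our names, matching that sketch; `pm_ok` is the default used by `macro_steps`) capture Definitions 3, 5 and 7:

```python
from collections import Counter

def pm_ok(m: Counter, z: Counter) -> bool:
    """Place monoids PM (Definition 3): every multiset of places is a valid
    element, so composition (multiset union) is total."""
    return True

def nm_ok(nodes: Counter) -> bool:
    """B/W node monoid NM (Definition 5): elements are *sets* of nodes, so a
    union is defined only when the operands are disjoint; equivalently, no
    node may carry more than one token."""
    return all(n <= 1 for n in nodes.values())

def sm_ok(states: Counter) -> bool:
    """State monoid SM (Definition 7): a partial map from primitives to
    states; a union is defined only on disjoint domains, i.e. at most one
    (primitive, state) entry per primitive."""
    prims = [c for (c, _q) in states.elements()]
    return len(prims) == len(set(prims))

def reo_ok(m: Counter, z: Counter) -> bool:
    """The combination used for Reo in Section 2.2: primitive states in M,
    node tokens in Z."""
    return sm_ok(m) and nm_ok(z)
```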
2.2 Semantics of Reo
Much of the hard work has now been done to give an operational semantics to Reo connectors. We need simply to instantiate the monoids M and Z, and provide transitions for the primitives. We present two variants, a 'black and white' one which ignores data and only sends signals through connectors, and a 'coloured' one which considers data as well.

Let Port be a denumerable set of port names. Let i, o, a, b, . . . range over elements of Port and I, O, A, B, . . . range over subsets of Port. A Reo primitive P is a 5-tuple P = (Q, q, I, O, Δ), where Q is a set of states, q ∈ Q is the initial state, I, O ⊆ Port are the sets of input and output ports, respectively, and Δ defines the transition relation. The form of the transitions depends upon whether the model we are interested in is black and white or coloured.

B/W. Initially, transitions are of the form Δ ⊆ (Q × 2^I) × (Q × 2^O), satisfying the requirement that if (q, A) ↦ (r, B) ∈ Δ, then A ∩ B = ∅, for causality reasons. The transition (q, A) ↦ (r, B) states that in state q the connector can synchronously accept input on ports A (excluding ports I \ A), and output on ports B (excluding ports O \ B). Now let M = SM(1, Q) and Z = NM(I ∪ O), where 1 is a one-element set. It is clear that the following embeddings exist: Q × 2^I → Q × 2^{I∪O} → M × Z, and similarly for Q × 2^O. Thus we consider the transitions Δ in the desired format as Δ ⊆ (M × Z) × (M × Z) using these embeddings.

Coloured. This time transitions have the form Δ ⊆ (Q × (I ⇀ Data)) × (Q × (O ⇀ Data)), where for (q, f) ↦ (r, g) ∈ Δ we again require that dom(f) ∩ dom(g) = ∅, for causality reasons. The transition (q, f) ↦ (r, g) states that in state q the connector can synchronously accept input on ports dom(f) (excluding ports I \ dom(f)), and output on ports dom(g) (excluding ports
O \ dom(g)). The partial functions f and g describe the actual data that flows. Now let M = SM(1, Q) and Z = CNM(I ∪ O). Given the embeddings Q × (I ⇀ Data) → Q × (I ∪ O ⇀ Data) → M × Z, and similarly for O, we can consider the transitions Δ in the desired format as Δ ⊆ (M × Z) × (M × Z) using these embeddings.

Figure 4 presents the firing relations for a number of primitives. These consist of Reo's selection of channels, mergers and replicators (to model Reo's n-m nodes), and takers and writers, to model boundary interaction. We assume that the channels are defined over port set {a, b}, and that mergers and replicators are defined over port set {a, b, c}.

Primitive    | Arity         | States | Init. | B/W Trans.        | Coloured Trans.
Sync         | {a} → {b}     | {◦}    | ◦     | (◦, a) ↦ (◦, b)   | (◦, a → d) ↦ (◦, b → d)
SyncDrain    | {a, b} → ∅    | {◦}    | ◦     | (◦, ab) ↦ (◦, 1)  | (◦, a → d + b → d′) ↦ (◦, 1)
SyncSpout    | ∅ → {a, b}    | {◦}    | ◦     | (◦, 1) ↦ (◦, ab)  | (◦, 1) ↦ (◦, a → d + b → d′)
AsyncDrain   | {a, b} → ∅    | {◦}    | ◦     | (◦, a) ↦ (◦, 1)   | (◦, a → d) ↦ (◦, 1)
             |               |        |       | (◦, b) ↦ (◦, 1)   | (◦, b → d) ↦ (◦, 1)
AsyncSpout   | ∅ → {a, b}    | {◦}    | ◦     | (◦, 1) ↦ (◦, a)   | (◦, 1) ↦ (◦, a → d)
             |               |        |       | (◦, 1) ↦ (◦, b)   | (◦, 1) ↦ (◦, b → d)
LossySync    | {a} → {b}     | {◦}    | ◦     | (◦, a) ↦ (◦, b)   | (◦, a → d) ↦ (◦, b → d)
             |               |        |       | (◦, a) ↦ (◦, 1)   | (◦, a → d) ↦ (◦, 1)
FIFO1        | {a} → {b}     | {◦, •} | ◦     | (◦, a) ↦ (•, 1)   | (◦, a → d) ↦ (•(d), 1)
             |               |        |       | (•, 1) ↦ (◦, b)   | (•(d), 1) ↦ (◦, b → d)
Merger       | {a, b} → {c}  | {◦}    | ◦     | (◦, a) ↦ (◦, c)   | (◦, a → d) ↦ (◦, c → d)
             |               |        |       | (◦, b) ↦ (◦, c)   | (◦, b → d) ↦ (◦, c → d)
Replicator   | {a} → {b, c}  | {◦}    | ◦     | (◦, a) ↦ (◦, bc)  | (◦, a → d) ↦ (◦, b → d + c → d)

Fig. 4. Some Reo Primitives. ab denotes the set {a, b}. a → d + b → d′ is a function mapping a to d and b to d′.
A Reo connector is simply a set of primitives P = (Q_n, q_n, I_n, O_n, Δ_n)_{n∈N}, where N is used to name the primitives, subject to the conditions: (1) for all a ∈ Port, if a ∈ I_n and a ∈ I_m, for n, m ∈ N, then n = m, and, similarly, if a ∈ O_n and a ∈ O_m, for n, m ∈ N, then n = m; and (2) for all a ∈ Port, if a ∈ I_n then there is at most one m ≠ n such that a ∈ O_m, and if a ∈ O_n, then there is at most one m ≠ n such that a ∈ I_m. Condition (1) ensures that the input/output ports are unique for each primitive, and condition (2) ensures that the primitives' ports are plugged together 1:1, output to input—that is, if a ∈ O_n ∩ I_m, then output port a of n is plugged into input port a of m.

The semantics of a Reo connector is given by taking the two monoids M = SM(N, Q_{n∈N}) and Z = NM(⋃_{n∈N}(I_n ∪ O_n)) (or Z = CNM(⋃_{n∈N}(I_n ∪ O_n)) for coloured models) and the transition relation Δ = ⋃_{n∈N} Δ_n, where each primitive firing (q, A) ↦ (r, B) (or (q, f) ↦ (r, g)) is embedded in the obvious fashion. Figure 5 gives an example derivation for the connector presented in Figure 1.
(1234567, A) → (1234567, C)                       (Concat)
├─ (13, A) → (13, npr)                            (Firing), from (1, A) ↦ (1, npr)
└─ (24567, npr) → (24567, C)                      (Concat)
   ├─ (2567, npr) → (2567, pqC)                   (Concat)
   │  ├─ (2, npr) → (2, pr)                       (Firing), from (2, n) ↦ (2, 0)
   │  └─ (567, pr) → (567, pqC)                   (*)
   └─ (4, pqC) → (4, C)                           (Firing), from (4, pq) ↦ (4, 0)

where (*) is:

(567, pr) → (567, pqC)                            (Concat)
├─ (67, pr) → (67, ptC)                           (Concat)
│  ├─ (6, pr) → (6, ps)                           (Firing), from (6, r) ↦ (6, s)
│  └─ (7, ps) → (7, ptC)                          (Firing), from (7, s) ↦ (7, tC)
└─ (5, ptC) → (5, pqC)                            (Firing), from (5, t) ↦ (5, q)
Fig. 5. Derivation for Exclusive Router (Figure 1). The primitives are all stateless, so the M (first) component really just records which primitives are used. The Z (second) component records the data flow.
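Continuing the executable sketch from above (our construction, not the paper's), here is a toy connector in the same representation: a writer feeding a Sync channel into a FIFO1. Stateless primitives are given an explicit ready/used flag so that each fires at most once per epoch, which is the constraint the state monoid is meant to enforce; only FIFO1's filling transition is included, and all names are our own.

```python
from collections import Counter
# Internal configurations are (states, node-tokens), both Counters, as in
# the sketch after Figure 3; reo_ok is the validity predicate from Section 2.1.
C = Counter
delta = [
    # Writer W offers a datum on node a (a boundary write):
    ((C({('W', 'ready'): 1}), C()),         (C({('W', 'used'): 1}), C({'a': 1}))),
    # Sync channel S from a to b (Figure 4), with a once-per-epoch flag:
    ((C({('S', 'ready'): 1}), C({'a': 1})), (C({('S', 'used'): 1}), C({'b': 1}))),
    # FIFO1 F fed by b: filling moves its state from empty to full (Figure 4):
    ((C({('F', 'empty'): 1}), C({'b': 1})), (C({('F', 'full'): 1}), C()))
]
start = C({('W', 'ready'): 1, ('S', 'ready'): 1, ('F', 'empty'): 1})
print(macro_steps(start, delta, ok=reo_ok))
```

Running this reports a single committed configuration, with the writer and Sync used and the buffer full; the intermediate configurations in which a token is still sitting on node a or b never commit, exactly as in the token game of Figure 1.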
2.3 Zero-Safe Nets
We directly give the semantics of zero-safe nets in terms of monoids, borrowing from (a compressed version of) the development of Bruni et al. [15].

Definition 8 (Zero-safe net). A zero-safe net is a tuple N = (S, T, F, u_in, Z) where
– S is the set of places, a, a′, . . .;
– T is the set of transitions t, t′, . . . (with S ∩ T = ∅);
– F ⊆ (S × T) ∪ (T × S) is called the flow relation; the elements of the flow relation are called arcs, and we write x F y for (x, y) ∈ F;
– a finite multiset u_in : S → N is the initial marking of N; and
– the set Z ⊆ S is the set of zero places. The places M = S \ Z are called the stable places.

A stable marking is a multiset of stable places, and the initial marking u_in must be stable. A marking for a Petri net with places S is a map S → N indicating the number of tokens at each place—this can equally be seen as a multiset of places. Each transition t ∈ T connects the places •t = {s ∈ S | s F t} with the places t• = {s ∈ S | t F s}. The transitions generate a firing relation Δ ⊆ (S → N) × (S → N) such that f ↦ g ∈ Δ if and only if

    f(s) = 1 if s ∈ •t, and 0 otherwise;    g(s) = 1 if s ∈ t•, and 0 otherwise,

for some t ∈ T. This could easily be extended to deal with weighted transitions.

We can view markings in S → N equivalently as functions in (M ⊎ Z) → N, or as pairs in (M → N) × (Z → N), giving the stable and the zero markings. Thus the firing relation Δ ⊆ (S → N) × (S → N) can trivially be seen as a relation
Fig. 6. (a) Synchronous chain 'coordinating' actions (FIFO buffers). (b) Again in zero-safe nets.
Δ ⊆ ((M → N) × (Z → N)) × ((M → N) × (Z → N)), which is a relation on the product of two place monoids, namely PM(M) × PM(Z); so, finally, Δ ⊆ (PM(M) × PM(Z)) × (PM(M) × PM(Z)). Thus, the two monoids underlying the semantic model for zero-safe nets are M = PM(M) and Z = PM(Z), where, as mentioned above, M are the stable places and Z are the zero places. The remainder of the operational semantics is given by the rules in Figure 3. It is an easy exercise to work out the derivations corresponding to the behaviour demonstrated in Figure 2. A coloured variant can easily be described, simply by replacing PM(M) × PM(Z) by CPM(M) × CPM(Z) and redefining the firing relation Δ.
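In the executable sketch from earlier in this section, this instantiation needs no partiality at all, since place monoids are total. A minimal zero-safe net (our example, not from the paper), with stable places p, q and a zero place z, exhibits a two-firing transaction:

```python
from collections import Counter
# Transitions t1: p -> z and t2: z -> q, in the (stable, zero) Counter
# representation used throughout the sketch. Note that the sketch does not
# enforce the side condition that a token moved into a stable place stays
# put for the rest of the macro-step; for this small net it makes no difference.
C = Counter
delta_zsn = [
    ((C({'p': 1}), C()), (C(), C({'z': 1}))),  # t1 consumes stable p, emits zero z
    ((C(), C({'z': 1})), (C({'q': 1}), C())),  # t2 consumes z, fills stable q
]
# {p} => {q} is the only macro-step: firing t1 alone leaves a token on the
# zero place z, so (commit) does not apply until t2 has also fired.
print(macro_steps(C({'p': 1}), delta_zsn))
```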
2.4 Comparison of Reo and Zero-Safe Nets
On the surface, Reo and zero-safe nets look quite different, but delving a little deeper into their semantics reveals that they are in fact very similar in many respects. In fact, we can encode Reo into zero-safe nets.
– Both models enforce a notion of transaction, in that valid, externally observable macro-steps consist of a sequence of internal micro-steps, and not all possible sequences of micro-steps result in a valid macro-step.
– Reo's nodes are akin to zero-safe places, and its primitives' states are stable, that is, they are part of the observable configuration. A FIFO buffer behaves similarly to a (bounded/1-safe) place in a Petri net.
– Choice in Reo is handled by mergers and by primitives such as asynchronous drains (which can in fact be encoded using a merger). Choice in Petri nets is handled by multiple transitions from a place competing for the token.
– Replication in Reo is handled by replicators. Replication in Petri nets is handled by multiple target places of a transition.
– A chain of synchronous channels must fire sequentially at the micro level, within an epoch. This is very similar to how a chain of zero-safe places would fire sequentially. Replication (via replicators or Petri transitions) can then be used to coordinate activities, respecting the sequential ordering. Parallel reductions occur in Reo, as in zero-safe nets, in independent parts of a connector. In both cases, the topology of the connector/net guides the sequential/parallel micro-step reduction. Figure 6 illustrates the idea for both Reo and zero-safe nets.
– The semantics of Reo primitives are expressed externally as a set of transitions, whereas the nature of zero-safe net transitions is predefined, based on the variant of Petri net under consideration.
– The major difference between the two is that in any macro-step (epoch), a Reo primitive can participate at most once. Equally, in Reo data may flow through each node at most once in an epoch. On the other hand, a zero-safe net permits each transition to fire an arbitrary number of times in an epoch (though this is easy to control, as we show below).
– In zero-safe nets, places can hold an arbitrary number of tokens. Reo nodes (equivalent to zero-safe places) can hold only one token. Variants of Petri nets exist where places hold at most one token (called 1-safe). Such a variant is closer to Reo.

In order to establish a correspondence between the two models, in one direction at least, we need to demonstrate how to restrict zero places so that they can be used at most once per epoch. This is quite simple, and is illustrated in Figure 7. A stable place is used to ensure that the zero place can fire only once: the token leaves the stable place when a token enters the zero place, and re-enters the stable place when the token leaves the zero place. After these two steps, the stable place can no longer participate in the epoch.

Figure 8 presents the encoding of each of Reo's primitives into zero-safe nets. Each encoded connector uses the above trick to ensure that it fires only once per epoch. The labelled zero places represent the boundary/interface ports of the connectors. Composition of two encoded Reo connectors consists of superimposing the two zero places corresponding to the boundary nodes—that is, the two Reo ends plugged together to form a node are encoded as the same zero place. Encoding more general Reo primitives based on their semantic description is straightforward, as we simply follow the patterns used in the encodings above. The encoding of FIFO1 demonstrates how to encode a state machine, the encodings of AsyncDrain and LossySync demonstrate how to encode choice, and the encodings of the Sync∗ primitives demonstrate how to encode synchronisation. Thus, we argue, Reo can be compositionally encoded in terms of zero-safe nets by encoding Reo's primitives, and then composing them.

Now that Reo can be encoded into zero-safe nets, extensions to the zero-safe model such as dynamic and reconfigurable nets [13] can be applied in the context of Reo. Exploring these directions is left for future work. Indeed, reconfiguration of Reo connectors, in particular when the reconfiguration is caused by dynamic behaviour in the connector, is already an active area of research [17,31].

In the other direction, encoding zero-safe nets into Reo is not possible in general. This is simply because places in zero-safe nets may hold more than one token, and thus zero-safe nets can model multiple interactions on a given port within a single epoch.

Finally, observe that Reo and zero-safe nets could easily be combined into a single model, simply by taking the product of both their normal and zero components, and by adding non-trivial additional firing rules to link the Reo
Fig. 7. Ensuring that zero places fire at most once per epoch
[Panels, one per encoded primitive: Sync, SyncDrain, SyncSpout, AsyncDrain, AsyncSpout, LossySync, FIFO1, Merger, Replicator; the labelled places a, b, c mark the ports.]
Fig. 8. Zero-safe net encodings of standard Reo primitives. Labelled places represent Reo's ports (ends). The other places are used to ensure that each primitive fires at most once per epoch, plus the buffer for FIFO1. (Compare the behaviours with those described in Figure 4.)
connector and the zero-safe net, that is, by having transitions from M_Reo × M_zsn to Z_Reo × Z_zsn. This approach is at present being explored to connect Reo and the workflow language YAWL [41].
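As one plausible reading of Figures 7 and 8 (our reconstruction, since the diagrams themselves did not survive extraction), the Sync channel becomes a single transition that consumes port a's zero place together with a stable control token, and produces port b's zero place while returning the control token:

```python
from collections import Counter
# Sync from a to b as a zero-safe net: zero places a, b and a stable
# control place s. Because a stable token may make at most one out-and-in
# move per macro-step, the encoded channel fires at most once per epoch.
# The executable sketch above does not enforce that stable-token rule, so
# here the discipline is only documented in this comment, not checked.
C = Counter
sync_zsn = [
    ((C({'s': 1}), C({'a': 1})), (C({'s': 1}), C({'b': 1}))),
]
```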
3 Intuitionistic Temporal Linear Logic
We now shift focus away from Reo and zero-safe nets, and introduce intuitionistic temporal linear logic (ITLL)² both as a meta-language for reasoning about the sort of coordination models we are interested in, and as a vehicle for enabling more refined behavioural descriptions beyond those expressible in Reo or zero-safe nets. For example, neither of these models can make the distinction between internal and external choice, whereas in ITLL the formula A & B expresses an internal choice, and the formula A ⊕ B expresses an external choice.

Intuitionistic temporal linear logic (ITLL) is an extension of linear logic [24], explored by Hirai [26], that includes linear variants of the temporal modalities ○−, □−, and ◇−. ITLL is a resource-conscious logic with modalities for reasoning about how resource usage changes over time. ITLL has been used to reason about timed Petri nets [25], for modelling agent choices, agent interaction [38,39], and agent negotiation [33], and as a logic programming language [9]. For now, we will consider the following fragment of ITLL (additional connectives will be discussed later):

    A ::= a | 1 | A ⊗ A | A −◦ A | A & A | A ⊕ A | □A | ○A

where a ranges over primitive formulæ. In general, we use lower case letters to range over primitive formulæ and upper case letters to range over formulæ. Our interpretation of ITLL is that formulæ can describe epochs of time, where within each epoch, formulæ describe resource usage in a linear fashion. Ultimately, we associate coordination (or ability-to-coordinate) with the ability to perform proofs of a certain form. Thus, we base this interpretation on a standard reading of proofs. The standard understanding of the game semantics underlying linear logic formulæ [12], adapted to account for the temporal connectives, has guided our interpretation. The connectives can be read as follows:
    a       availability of the single resource a, now
    1       no resources, now
    A ⊗ B   availability of (resources) A and B, now
    A −◦ B  availability of a one-time converter of (resources) A to B, now
    A & B   an internal choice between (resources) A and B, now
    A ⊕ B   an external choice between (resources) A and B, now
    □A      (resource) A is available any time, exactly once
    ○A      (resource) A is available in the next epoch, exactly once
It is important to note that □A does not mean the same as its linear temporal logic (LTL) counterpart. To be compatible with linear logic, Hirai linearised (in the sense of linear logic) the meaning of □A: □A can be used at any time, but only once. Note that we have no exponentials (!A), though we do suggest how they
² A distinguishing feature of intuitionistic linear logic is that judgements allow only a single conclusion. Furthermore, it lacks the par connective, among others which require multiple conclusions.
can be used as an alternative in our encodings. Further interpretation of ITLL from a coordination perspective will be given in Section 5.

The proof rules for ITLL are presented in Figure 9. Let Γ denote a possibly empty sequence of formulæ. Let □Γ = □A, □B, . . .; similarly for ○Γ. Rules are labelled to indicate whether the connective is introduced on the left or right of the turnstile ⊢; so, for example, −◦l labels the rule introducing −◦ on the left.

    (Ax)  A ⊢ A                  Γ ⊢ A    A, Δ ⊢ B
                                 ───────────────── (Cut)
                                     Γ, Δ ⊢ B

    Γ ⊢ A    B, Δ ⊢ C                     Γ, A ⊢ B
    ────────────────── (−◦l)              ────────── (−◦r)
    Γ, A −◦ B, Δ ⊢ C                      Γ ⊢ A −◦ B

    Γ, A, B ⊢ C                  Γ ⊢ A    Δ ⊢ B
    ──────────── (⊗l)            ─────────────── (⊗r)
    Γ, A ⊗ B ⊢ C                 Γ, Δ ⊢ A ⊗ B

    Γ, A ⊢ C            Γ, B ⊢ C            Γ ⊢ A    Γ ⊢ B
    ──────────── (&1l)  ──────────── (&2l)  ────────────── (&r)
    Γ, A & B ⊢ C        Γ, A & B ⊢ C        Γ ⊢ A & B

    Γ, A ⊢ C    Γ, B ⊢ C          Γ ⊢ A             Γ ⊢ B
    ───────────────────── (⊕l)    ───────── (⊕1r)   ───────── (⊕2r)
    Γ, A ⊕ B ⊢ C                  Γ ⊢ A ⊕ B         Γ ⊢ A ⊕ B

    Γ ⊢ C                               A, Γ ⊢ B         □Γ ⊢ A
    ──────── (1l)      ⊢ 1  (1r)        ───────── (□l)   ──────── (□r)
    Γ, 1 ⊢ C                            □A, Γ ⊢ B        □Γ ⊢ □A

    Γ, A, B, Δ ⊢ C               □Γ, Γ′ ⊢ A
    ──────────────── (Ex)        ───────────── (○)
    Γ, B, A, Δ ⊢ C               □Γ, ○Γ′ ⊢ ○A

Fig. 9. The proof rules for intuitionistic temporal linear logic
The first five rows of rules are standard. From a coordination perspective, the computational interpretation of the rules &l and ⊕l reveals the difference between the two choice connectives. &l will enable a proof using A & B to go through even if only one of A or B is suitable. The interpretation is that the prover (the coordinator) can choose which. On the other hand, ⊕l requires that both A and B in A ⊕ B can make the proof go through. The coordinator needs to be able to produce an appropriate proof in both cases for coordination to succeed, modelling that the choice can be made externally. This is consistent with existing interpretations of linear logic [1]. Rule □l enables a resource that can be used at any time (□A) to be used now (A). Rule □r, which we do not use directly, states that a result that depends only on resources that can be used at any time can itself be used at any time. Rule ○ states that reasoning about future epochs (such as the next one) is done by shifting the future (next state) resources to the present state (removing the ○) and then reasoning as per usual. This requires that there are no 'now' resources, only □− and ○− ones.

In the encodings of Reo and zero-safe nets presented later in this section, the proof rules for ITLL are used as the workhorse handling the details of the semantic
framework, whereas the specific firing rules for the primitives are encoded as externally specified axioms. Let Σ denote a set of axioms of the form Γ ⊢ A. We write Γ ⊢_Σ A to denote judgements in the proof system extended with the axioms Σ and the following rule:

    (Γ ⊢ A) ∈ Σ
    ──────────── (Σ-axiom)
    Γ ⊢_Σ A

Note that proving the judgement Π ⊢_Σ B is equivalent to proving Σ*, Π ⊢ B in the logic extended with the standard axioms for exponentials along with adaptations of the proof rules above (see [26]), where Σ* converts each (Γ ⊢ A) ∈ Σ into the formula !(Γ⊗ −◦ A), and Γ⊗ is the formulæ in Γ connected by ⊗. Thus, if Γ = A₁, . . . , Aₙ, then Γ⊗ = A₁ ⊗ · · · ⊗ Aₙ, and if Γ is empty, then Γ⊗ = 1.
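As a small worked derivation (ours, not from the paper; typeset with proof.sty's \infer), the ○ rule shows that an 'anytime' resource is still an anytime resource from the next epoch on. This derivation shape reappears in the frame parts of the encodings of Section 4.3:

```latex
% \Box a |- O \Box a: the O rule applies because the antecedent consists
% solely of \Box-formulae (here taking Gamma' to be empty).
\[
  \infer[\bigcirc]{\Box a \vdash \bigcirc\Box a}{
    \infer[\mathit{Ax}]{\Box a \vdash \Box a}{}
  }
\]
```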
3.1 Semantics of ITLL
Hirai [25] presented a semantics of ITLL as an adaptation of the original phase space semantics of Girard [24]. No game semantics exists for ITLL. Nonetheless, we can (and have) drawn from game semantics for linear logic when giving our intuitive interpretation of ITLL's connectives. The open question remains: what is the game semantics for full ITLL? Here we provide only some intuition, leaving the complete treatment of game semantics for ITLL for future work.

In game semantics for linear logic [12], one telling feature is that the difference between the choice connectives − & − and − ⊕ − is who makes the choice. In the first case, the player needs to win only one of the possible games encoded in the choice in order to win the game. In the second case, the player can only win the game if he can win both games encoded in the choice, as the opponent makes the choice. To extend this to deal with the temporal modalities, we anticipate the following changes. Games are played over multiple epochs. ○A is interpreted as a game that is played in the following epoch. □A is a game for which the player (coordinator) can choose which epoch the game corresponding to A is played in — the moves are thus 'play now' and 'play later', when cast in terms of a game played 'now'. Similarly, ◇A is the dual, which means that the opponent (environment) makes the choice of when to play the game. Finally, for a player to win a game (= valid/provable judgement), she needs to win every epoch (until the game peters out).
4 Encodings
After investing the effort to cast the semantics of Reo and zero-safe nets into a uniform framework, encoding them into ITLL is quite easy: we need to encode the general framework and then the primitives for each system. The basic assumption we run with is that the elements of the respective monoids can be encoded as primitive formulæ or tensor products (⊗) of such formulæ. We typically do
not insert explicit conversions between the elements of the monoids and their representation as formulæ. We demonstrate the soundness of our encoding, but do not give a completeness result. There are two reasons: (1) completeness simply would not hold for the entire class of monoids we consider, though we expect that it will hold for monoids which are not partial and are sufficiently free; and (2) we do not know whether a cut elimination result holds for ITLL, yet this seems to be crucial for proving completeness.
4.1 Encoding Configurations
The encoding of a configuration depends on whether it is on the left or the right hand side of the arrow. Given a reduction (m, z) →Δ (m′, z′), where m, m′ ∈ M and z, z′ ∈ Z, the configuration on the left-hand side is encoded as (□m)⊗ ⊗ z⊗ (or, equivalently, as a sequent context of the form □m, z, without the separating ⊗), and the configuration on the right-hand side is encoded as (○□m′)⊗ ⊗ (z′)⊗. Note that the primitive firings themselves are encoded slightly differently, as we shall see in Section 4.2.

The reading of (□m)⊗ ⊗ z⊗ is that (1) any of the elements in m can be used at any time, and (2) all of the elements in z must be used in this step. Similarly, (○□m′)⊗ ⊗ (z′)⊗ states, in addition, that (3) in the next step, but not now, any of the elements of m′ can be used at any time. The part involving z′ is analogous to (2). When encoding the left and right hand sides of m ⇒Δ m′, the encodings can be simplified to (□m)⊗ and (○□m′)⊗.
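For instance (our worked example, with our names): a FIFO1 f in state ◦ together with a pending datum on node a is the internal configuration ({f ↦ ◦}, {a}), encoded as follows:

```latex
% Left of a micro-step: the state f may be used at any time within the
% epoch, while the token on node a must be consumed now. After the
% buffer-filling firing, the new state is delayed to the next epoch:
\[
  (\Box f_\circ) \otimes a
  \qquad\leadsto\qquad
  \bigcirc\Box f_\bullet
\]
```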
4.2 Encoding a Primitive Firing
Each primitive firing (m, z) ↦ (m′, z′) ∈ Δ is encoded as an axiom as follows:

    [[(m, z) ↦ (m′, z′)]] = m, z ⊢ (○□m′)⊗ ⊗ (z′)⊗

Proofs for a system with primitive firing relation Δ are carried out in the logic Γ ⊢_Σ A, where Σ = [[Δ]].

Example 1. The transition (◦, a) ↦ (◦, bc) of a Reo replicator would be encoded as the axiom: ◦, a ⊢ ○□◦ ⊗ b ⊗ c.

An alternative approach is to encode the state machines as a single formula, using an exponential:

    [[Δ]] = !([[(m₁, z₁) ↦ (m₁′, z₁′)]] & · · · & [[(mₙ, zₙ) ↦ (mₙ′, zₙ′)]])

where Δ = {(m₁, z₁) ↦ (m₁′, z₁′), . . . , (mₙ, zₙ) ↦ (mₙ′, zₙ′)}
    [[(m, z) ↦ (m′, z′)]] = m⊗ ⊗ z⊗ −◦ (○□m′)⊗ ⊗ (z′)⊗
In this format, the above example would be expressed as !(◦ ⊗ a −◦ ○□◦ ⊗ b ⊗ c).
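As a further worked instance (ours), the two B/W firings of FIFO1 from Figure 4 become the following axioms; filling the buffer delays the full state to the next epoch, while emptying it emits b now:

```latex
% FIFO1 under the encoding of this section (our instantiation):
\[
  [[(\circ, a) \mapsto (\bullet, 1)]] \;=\; \circ, a \vdash \bigcirc\Box\bullet
  \qquad
  [[(\bullet, 1) \mapsto (\circ, b)]] \;=\; \bullet \vdash \bigcirc\Box\circ \otimes b
\]
```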
4.3 Encoding the Semantic Rules into ITLL
We now show how to encode the semantic rules from Figure 3 into ITLL proof fragments. This is the essence of soundness. In our encodings, we use the exchange rule implicitly on both sides of the turnstile, which implies an implicit use of Cut. We also use −∗ to mark multiple applications of a rule or pattern.

Firing. The rule (firing) of Figure 3 is encoded as the proof:

    (m, z ⊢_Σ (○□m′)⊗ ⊗ (z′)⊗) ∈ Σ
    ─────────────────────────────── (Σ-axiom)
    m, z ⊢_Σ (○□m′)⊗ ⊗ (z′)⊗
    ─────────────────────────────── (□∗l)
    □m, z ⊢_Σ (○□m′)⊗ ⊗ (z′)⊗        (1) □m″ ⊢_Σ (○□m″)⊗        (2) z″ ⊢_Σ (z″)⊗
    ──────────────────────────────────────────────────────────────────────────── (⊗∗r)
    □m, □m″, z, z″ ⊢_Σ (○□m′)⊗ ⊗ (○□m″)⊗ ⊗ (z′)⊗ ⊗ (z″)⊗

where (1) is multiple occurrences of the following proof, one for each atom m_a, joined using ⊗r:

    □m_a ⊢_Σ □m_a   (Ax)
    ──────────────── (○)
    □m_a ⊢_Σ ○□m_a

and (2) is similar, but simpler.

Parallel. The rule (parallel) is encoded as the proof fragment:

    □m₁, z₁ ⊢_Σ (○□m₁′)⊗ ⊗ (z₁′)⊗    □m₂, z₂ ⊢_Σ (○□m₂′)⊗ ⊗ (z₂′)⊗
    ──────────────────────────────────────────────────────────────── (⊗r)
    □m₁, □m₂, z₁, z₂ ⊢_Σ (○□m₁′)⊗ ⊗ (○□m₂′)⊗ ⊗ (z₁′)⊗ ⊗ (z₂′)⊗

Concatenation. The rule (concatenation) is encoded as the proof fragment:

                                       □m₂, z′ ⊢_Σ (○□m₂′)⊗ ⊗ (z″)⊗
                                       ────────────────────────────── (⊗∗l)
    □m₁, z ⊢_Σ (○□m₁′)⊗ ⊗ (z′)⊗        □m₂, (z′)⊗ ⊢_Σ (○□m₂′)⊗ ⊗ (z″)⊗
    ───────────────────────────────────────────────────────────────── (Cut)
    □m₁, □m₂, z ⊢_Σ (○□m₁′)⊗ ⊗ (○□m₂′)⊗ ⊗ (z″)⊗

An interesting aspect of this encoding is that the main coordination rule, (concatenation), corresponds to Cut. Cut elimination in proof theory is often thought of as communication or interaction. Not all proofs go through, however, so coordination is more accurately thought of as constrained interaction, as suggested by Wegner [43].

The result we have established is the soundness of our encoding. Specifically, we have the following:

Theorem 1. Let Σ = [[Δ]].
1. If (m, z) →Δ (m′, z′), then □m, z ⊢_Σ (○□m′)⊗ ⊗ (z′)⊗.
2. If m ⇒Δ m′, then □m ⊢_Σ (○□m′)⊗.

More generally, we have shown that ITLL is expressive enough to encode both Reo and zero-safe nets. In the next section, we demonstrate through examples that ITLL is more expressive.
5 Coordination via ITLL
We have thus far presented, in a very general sense, how ITLL can encode the coordination patterns expressible in Reo and in zero-safe nets. Doing so explored only a fraction of ITLL's expressiveness. We now explore ITLL further, mainly focussing on the various pieces of the coordination puzzle, such as the behaviour or protocol of the parties involved in coordination, and on variants of how coordination is done in Reo and in zero-safe nets, in particular with regard to the role of epochs. We assume in addition that two parties are involved in a particular interaction: the coordinator, whose choices are controllable, and the environment (or the rest), whose choices are uncontrollable.³ After presenting a number of
³ We ignore the data flowing through connectors/nets—thus we are working in the 'black-and-white' models. Adding colour raises no difficult issues. For example, the filter over a binary domain that accepts only zeros is modelled by: Σ = {Filter ⊗ a(0) −◦ b(0) ⊗ ○□Filter, Filter ⊗ a(1) −◦ ○□Filter}, where a(0) means that datum 0 is on node a.
variations, we pull all the pieces together into an SoC protocol which would form the essence of a coordinator for the holiday booking problem presented in the introduction.

Choice. ITLL can express choice in two different ways: choices made by the coordinator and choices made by the environment. (This is well known, and is particularly well exemplified in the game semantics of linear logic [12].) Formulæ of the form A & B express a choice between A and B that can be made by the coordinator, whereas formulæ of the form A ⊕ B express that the choice between A and B is made by the environment.

Progress of Time. ITLL can also express the occurrence of events within epochs. Recapping, a formula of the form A without a preceding modality expresses that A must be dealt with in the current epoch. □A expresses an action that the coordinator may choose to deal with in the current epoch. ○A states that A must be dealt with in the next epoch. Combinations of these are possible. For example, ○□A states that the coordinator may deal with A in some future epoch (but not now), and □○A states that the coordinator may deal with A in a future epoch, though the decision must be made one epoch in advance. ○⁷□○⁷A states that any time after 7 epochs the coordinator may decide, 7 epochs in advance, to deal with A. Note that a dual to □− exists [26]: ◇A states that the environment decides some time when A must be dealt with.⁴ ◇A behaves very much like □A, except that the choice is made by the environment instead of the coordinator.

State Machines (Revisited). An encoding of state machines was hinted at above. It is in fact an adaptation of an existing encoding [29, for example]. The difference here is that we use the modalities to control when transitions may or may not be taken. An example state machine, in the encoding described above, with 4 states (s, t, u, v) and 4 external actions (A, B, C, D) is encoded as axioms:⁵
[State diagram: transitions A: s → t, B: t → s, C: s → u, D: u → v, D: v → s]

    Σ = { s −◦ (A ⊗ ○□t) & (C ⊗ ○□u),    t −◦ B ⊗ ○□s,
          u −◦ D ⊗ ○□v,                   v −◦ D ⊗ ○□s }
The current state of such an automaton is recorded as a formula of the form □s, meaning that a transition from s can be taken at a time chosen by the coordinator.
⁴ The rules for ◇− introduction and elimination are:

    Γ, A ⊢ ◇B                 Γ ⊢ A
    ─────────── (◇→)          ──────── (→◇)
    Γ, ◇A ⊢ ◇B                Γ ⊢ ◇A
⁵ There are a number of equivalent ways of presenting our axioms. For example, we could write (s ⊢ t & u) ∈ Σ or (1 ⊢ s −◦ t & u) ∈ Σ or even (1 ⊢ s −◦ t), (1 ⊢ s −◦ u) ∈ Σ or (s ⊢ t), (s ⊢ u) ∈ Σ. We typically choose the single-formula representation s −◦ t & u.
This particular automaton implements the regular expression (AB | CDD)∗ over events, where each event occurs in a different epoch. The fact that the events occur in different epochs is given by the form of the next-state formulæ, namely ○□s. Hence, the automaton is disabled after performing a transition in any given epoch, but it is re-enabled in the next epoch. All Reo primitives would have state machines of this form. We can change the formulæ describing the state machine behaviour in a number of ways:
– The next-state formula can be modified from ○□s to □s, giving a transition such as t −◦ B ⊗ □s, to model that the primitive in state t, after making a transition to state s, may optionally perform an additional transition from state s in the given epoch.
– The next-state formula can be simply s, giving a transition such as t −◦ B ⊗ s, which means that the primitive, after moving from state t to state s, would be forced to perform another transition in this epoch. (Applying the two previous variations to the semantics of a Reo primitive would give primitives that could perform interactions involving more than one step within a given epoch [20]. We will give an example shortly.)
– The choice in the transition s −◦ (A ⊗ ○□t) & (C ⊗ ○□u) is determined by the coordinator, as − & − is used. An alternative is to use − ⊕ −, as follows: s −◦ (A ⊗ ○□t) ⊕ (C ⊗ ○□u), meaning that the choice is made externally. This can model, for example, interaction with an external component.
Naturally, other variations are possible.

Boundary Interactions. In Reo, a component interacts with a connector by issuing a write or a take (read) on a port, which is subsequently satisfied by the connector (unless the writer/taker times out). We can model these in ITLL in a number of ways (showing only a writer, as a taker is quite similar, and ignoring time-outs):
1. A Writer has two states {Writing, NotWriting} and the following axioms:

       Writing −◦ a ⊗ ○(□Writing ⊕ NotWriting)
       NotWriting −◦ ○(□Writing ⊕ NotWriting)

Notice that the next-state formula □Writing ⊕ NotWriting forces the environment to choose whether the component will be writing or not. When the component is writing, the coordinator can determine whether to allow the write to succeed when it chooses. When the component is not writing, the decision whether to write must be made again in the following epoch, determined by the second axiom. The initial state is represented by □Writing ⊕ NotWriting, forcing the environment to make a choice in the first epoch.
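By analogy (our guess at the symmetric encoding the text alludes to, with our names), a Taker requesting data on port b would be:

```latex
% Symmetric to the Writer: the environment decides whether the component
% is taking; when it is, the coordinator decides when the take succeeds.
\[
  \mathit{Taking} \multimap b \otimes \bigcirc(\Box\mathit{Taking} \oplus \mathit{NotTaking})
  \qquad
  \mathit{NotTaking} \multimap \bigcirc(\Box\mathit{Taking} \oplus \mathit{NotTaking})
\]
```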
If we were also considering data, the formula stating that data flows on channel a, namely a, would be replaced by ⊕_{d∈Data} a(d), to denote that there is a value on the channel and that it is externally selected.
2. An alternative encoding of a Writer exploits ◇− to handle the fact that the decision to perform a write is determined by the environment, after which the coordinator decides when the write succeeds (via □−):

       Writing −◦ a ⊗ NotWriting
       NotWriting −◦ ◇□Writing

In this case, the initial state is NotWriting, which is effectively ◇□Writing.

The mode of interaction between the coordinator and the environment need not be restricted to that which is assumed by Reo. For example, interacting by making method calls to standard objects, or by calling web services, are equally useful alternatives, yet distinct from Reo's. In these circumstances the interaction originates with the coordinator, not the environment. A service/method that takes input described by A and produces output described by B can be modelled by a formula as simple as A −◦ B. This, however, makes the assumptions that the service is guaranteed to produce a result and that the coordinator must wait for the result. A more asynchronous variant is modelled by the formula A −◦ ◇B or even A −◦ ◇□B. The choice of which model to adopt depends upon the desires of the system builder; the choice of ITLL formula modelling the mode of interaction determines the dependency of the coordination on the environment. Other modes of interaction can be (at least partially) described using ITLL—one-way message sends, compulsory interactions, interrupts. A complete study of ITLL's expressiveness and limitations would be very interesting.

User Interaction. ITLL formulæ can capture the essence of synchronous user interaction, which, when combined with the extended Reo we are hinting at, permits coordination of multi-action events that include user interaction, such as obtaining the user's approval of a particular holiday package. A simple formula expressing user interaction is:

       User ⊗ Q −◦ (Yes ⊕ No) ⊗ ○□User

User is the (single, uninteresting) state. When asked a question on port Q, a reply can be supplied on either port Yes or port No, where the choice is made by the environment (or user). The question is asked and its answer is given all in the same epoch. The test to see whether a coordinator can handle both the Yes and the No answers is whether certain judgements (of the shape described in Theorem 1) are provable. When the proof does not go through, this means that during coordination time a choice can be made that the coordinator cannot handle. Roll-back or some sort of recovery would be required when this occurs.

It is worth pausing to consider whether this kind of user interaction can be handled in Reo. The short answer is that it cannot. External components in
Reo perform either a single write or take in each epoch—this is the only kind of event that the coordinator has to deal with. Any (inter)action that involves multiple steps must be dealt with over multiple epochs in Reo, even though the steps conceptually belong to the same action. Existing implementations of Reo cannot handle a channel whose choices are made externally within a single epoch (though an attempt was made in the implementation Reolite [16]). Thus the ability to perform multi-step actions, possibly involving external choices, within an epoch goes beyond what Reo can do. Yet, in our opinion, these possibilities make better use of the abstractions (synchrony and asynchrony) provided by Reo, by enabling more interesting interaction to occur during an atomic step.

A Service-Oriented Computing Protocol. We pull together the ideas presented in this section to give a more realistic example, which ties in with our original holiday booking example. The left hand side of Figure 10 presents a connector whose job is to monitor a transaction (adopting a simplified version of the WS-Transaction protocol [22]), so that it can be incorporated into a larger Reo connector. An informal description of its behaviour is given by the 'automaton' on the right hand side of Figure 10. This automaton is initiated by a try action (try to perform the transaction), which results, eventually, in a success or fail action, the choice of which is made by the environment. Afterwards, the transaction can be reset (which also causes a no action) or, in the case that it is successful, committed (which also causes a yes action). The choice of whether to reset or commit is made by the coordinator. The key here is that success/failure is externally determined. In order to have this controlled by Reo (or a zero-safe net), we make all steps occur within a single epoch, which would, for example, allow the combination of two or more WC-Transactors to form proper transactions which occur atomically within a synchronous slice of time (i.e., an epoch). The ITLL encoding consists of the following formulæ, where the initial state is Init; the nodes marked with bold font are on the boundary with the user, whereas the others are used to interact with the underlying transaction:

    Σ = { Init ⊗ try −◦ t,          ps ⊗ commit −◦ yes ⊗ Init,
          t −◦ s ⊕ f,               ps ⊗ reset −◦ no ⊗ Init,
          s −◦ success ⊗ ps,        pf ⊗ reset −◦ no ⊗ Init,
          f −◦ fail ⊗ pf }

A more compressed variant would replace t −◦ s ⊕ f, s −◦ success ⊗ ps, and f −◦ fail ⊗ pf by t −◦ (success ⊗ ps) ⊕ (fail ⊗ pf).

The Reo-like connector presented in Figure 11(a) combines two WC-Transactors so that either both transactions are committed or neither is. The choice in the middle selects whether to commit or reset both WC-Transactors, giving priority to committing (indicated by !).⁶ The simple connector in Figure 11(b) enforces that either exactly one yes (using the merger) or two nos
⁶ Priority was not discussed in this paper. Our work on 3-colouring deals with this fairly effectively [18]. A full treatment of priority in Reo, especially in the present setting, is a whole new kettle of worms that we prefer to leave on the back-burner.
Fig. 10. Coordinator for a simplified model of the WS-Transaction protocol. There are 7 ports to this coordinator, grouped in the following logical units: try initiates part of a transaction; success & fail are the results reported by the component being controlled; commit & reset commit to the result (only on success) or reset (either on success or failure); and yes & no report whether the transaction succeeded and is committed to.
result, thus ensuring that either one holiday package is booked or none is. This example could be extended with data and user interaction to gain the user's approval of the chosen holiday, but clearly more work would be required to make it completely realistic. In any case, the present example cannot be handled by Reo as it exists now, unless the transaction handling is implemented over multiple epochs, thus failing to fully exploit synchrony.
6 Related Work
Many semantic models for Reo have been presented [7,8,18,35], though all of them, apart from the operational semantics of Mousavi et al. [35], suffer from causality problems. This problem is avoided in operational models such as the one presented here. The problem is that most models have a non-constructive, classical semantics which would give a semantics to certain loops stating that data flows in the loop, even though the loop has no source of data. Only the connector colouring model [18] deals with priority. Our semantics is superior to Mousavi et al.'s on the following counts: the semantics of primitives such as channels are not built into our rules, thus we have only 4 rules compared to 16; we deal more cleanly with the treatment of data on nodes and with the fact that data flow may occur in only a part of a connector; and, perhaps most importantly, our rule format enables an easier comparison with existing systems. On the other hand, Mousavi et al.'s semantics has a globally defined notion of maximal progress for filtering possible behaviours, which we do not consider here.

Many, many variants of Petri nets exist, far too many to review here. Bruni et al. [13] show how the zero-place approach can be applied to coloured, reconfigurable, and dynamic nets. Given the close correspondence between Reo and zero-safe nets, it would be fruitful for the development of Reo to transfer these
Fig. 11. (a) Combining two WC-Transactors. The 1 connectors either always offer or always accept a signal, depending upon the direction of the arrow. The exclusive-router node makes a choice of sending the data to exactly one of its outputs. ! indicates priority. A three-way variant of this can be constructed to implement the coordination for each of the holiday options (ignoring the actual data). (b) The connector combines two instances of the connector in (a), enforcing that either exactly one yes or two no results occur, thus ensuring that either one holiday package is booked or none is.
ideas into Reo. More generally, Bruni et al. [14] present a modular high-level account of the relationship between, in our terms, the micro- and macro-step levels, demonstrating the naturalness of the two-level approach to describing transactions.

A large volume of literature considers the relationship between linear logic and Petri nets (and other concurrency models) [21,10,42,28, among others]. We adapted a simple encoding of Petri nets to use the temporal modalities of ITLL to model epochs. The closest work to ours, in this respect, is Hirai's thesis and subsequent article [26,25]. Hirai reasons about timed Petri nets [11] (a model wherein transitions have a waiting time during which they are unusable), whereas we encode zero-safe nets (and Reo), which have a more transactional flavour. A different study of Petri nets uses the logic of bunched implications (BI) [36,40]. From a semantic perspective, it makes a lot of sense to adapt this approach to our domain, as one semantic model of BI involves partial commutative monoids. We would be interested to see what BI could bring to our application domain.

Hirai [26,25] also briefly considers variants of process encodings into ITLL. Specifically, he observes that ○− can be used to model synchronous message passing, which, he claims, is not expressible in linear logic. Specifically, he compares the pair of linear logic 'processes' !(p −◦ (m ⊗ p)) and !(q −◦ (m −◦ q)) with the ITLL processes !(p −◦ (m ⊗ ○p)) and !(q −◦ (m −◦ ○q)). In both cases, the first process sends an m message to the second one, repeatedly. In the LL encoding, the m is sent asynchronously, as both parts of m ⊗ p proceed in parallel. In contrast, m is sent and received synchronously in the ITLL encoding, as the recursive call to p cannot occur until the next step. Even though we are using the same
ingredients, our encoding (or our reading) is quite different, as we equate synchrony with what occurs (necessarily) within some epoch. In fact, we would say that the first case expresses entirely synchronous processes—that is, their entire development occurs within an epoch—whereas the second expresses synchrony for the message m, asynchronously with any subsequent behaviour. Because of the different granularity, we typically exploit the parallelism inherent in − ⊗ −, rather than worry about the asynchrony it introduces.

Küngas [33] and Pham et al. [38] use temporal linear logic to model the choices of agents and agent negotiation. Both articles make use of the well-known distinction between internal (− & −) and external (− ⊕ −) choice. Formulæ express agent choices and time-sliced resource usage (such as printers). Küngas also models a kind of promise by formulæ such as ○A −◦ B, meaning that the offer to do A in the next step is traded for B now. Küngas uses partial evaluation (via a technique called partial deduction) to give more efficient programs (applicable to the ideas presented here), but also to derive offers, so that agents can negotiate the use of resources. Pham et al. include reductions corresponding to the choices made by the various choice makers. This feature would be useful in our setting to express the dynamics in the models we considered beyond Reo and zero-safe nets. Both papers only use the ○− modality, whereas we suggest uses for the whole spectrum.

Kanovich and Ito [30] present an alternative (and earlier) formulation of ITLL. Hirai [26] extends Kanovich and Ito's formulation with the modal storage operator (that is, the exponential !), and removes possible infinite derivations which were at odds with the notion of the passage of time—specifically, Zeno's paradox could arise. ITLL has also been used as the basis for a logic programming language [9]. The main goal was to give more fine-grained control over resources than is possible in linear logic programming languages, and to enable complex data structures which are not entirely available all the time, and thus can be the target of more efficient implementations.

Linear logic has been used as the basis for the coordination language LO (Linear Objects) [5]—though this can equally be seen as a way of combining logic and object-oriented programming—and its successor, CLF (the Coordination Language Facility) [3]. Both languages have a very strong logic programming flavour and rely on a notion of transaction, so they are close in spirit to our logical encoding. Their base language, however, does not have temporal modalities, so they cannot make as fine-grained distinctions as we can.

Several authors [32, for example] have used linear logic to analyse and synthesise planning problems. This corresponds to our intuition that in both Reo and in zero-safe nets one cannot just send the data optimistically through channels or push tokens through the net and hope to arrive at a suitable (external) configuration. Either backtracking, planning or constraint solving [19] is required.

Alexiev [2] presents a now-dated overview of many applications of linear logic. One of the key points that he makes is that 'everything is connected
to everything else.” This is certainly made no less true by the present article. Kamide [27] presents linear and affine logics extended with temporal, spatial and epistemic modalities. This looks like very fertile ground for exploring further extensions to coordination languages, perhaps even taking on a more agent-like character.
7 Conclusion and Future Work
In this article, we cast both the coordination language Reo and zero-safe nets into a common semantic framework. The key difference between the two models is the choice of monoids used in the framework, demonstrating that, although the two models differ significantly on the surface, they are very similar underneath. In fact, Reo can be compositionally encoded into zero-safe nets. We then encoded both systems into intuitionistic temporal linear logic (ITLL), to explore its applicability as a unifying framework for coordination languages. The meaning of the logical formulæ used in the encodings provided useful intuition into the meaning of configurations and reductions in the respective semantics. In both cases, the reading corresponds to our intuitive understanding of the language being modelled. In the latter part of the paper, we explored the remainder of ITLL, and demonstrated, mostly through examples, the range of behaviour that can be expressed in ITLL but in neither Reo nor zero-safe nets. In particular, we described more complex per-epoch behaviour, choices attributable to different players (the coordinator vs. the environment), and optional and compulsory actions. None of these distinctions can be made in Reo or in zero-safe nets. This work merely scratches the surface of an interesting and useful connection between Reo, zero-safe nets, and ITLL. We hope that this paper will serve as a source of ideas for further developing Reo and the coordination languages that will follow it. A detailed game-semantic study of ITLL is warranted in order to fully elucidate the ideas presented here. In particular, such a model would provide a semantics not only for Reo, but also for all the extensions described in Section 5. Additional ideas we would like to investigate include increasing the expressiveness to cover notions such as co-operation, wherein the possibilities for a choice may be limited by both the coordinator and the environment, and the actual choice is determined by both; the distinction between reversible and irreversible actions; groups and operations on them; and many more.
References

1. Abramsky, S.: Computational interpretations of linear logic. Theor. Comput. Sci. 111(1-2), 3–57 (1993)
2. Alexiev, V.: Applications of linear logic to computation: An overview. Logic Journal of IGPL 2(1), 77–107 (1994)
3. Andreoli, J.-M., Freeman, S., Pareschi, R.: The Coordination Language Facility: coordination of distributed objects. Theory and Practice of Object Systems 2(2), 77–94 (1996)
4. Andreoli, J.-M., Hankin, C., Le Metayer, D. (eds.): Coordination Programming: Mechanisms, Models and Semantics. Imperial College Press (1996)
5. Andreoli, J.-M., Pareschi, R.: Linear objects: logical processes with built-in inheritance. New Generation Computing 9 (1991)
6. Arbab, F.: Reo: a channel-based coordination model for component composition. Math. Struct. in Comp. Science 14(3), 329–366 (2004)
7. Arbab, F., Rutten, J.J.M.M.: A coinductive calculus of component connectors. In: Wirsing, M., Pattinson, D., Hennicker, R. (eds.) WADT 2003. LNCS, vol. 2755, pp. 34–55. Springer, Heidelberg (2003)
8. Baier, C., Sirjani, M., Arbab, F., Rutten, J.: Modeling component connectors in Reo by constraint automata. Science of Computer Programming 61(2), 75–113 (2006)
9. Banbara, M., Kang, K.-S., Hirai, T., Tamura, N.: Logic programming in a fragment of intuitionistic temporal linear logic. In: Codognet, P. (ed.) ICLP 2001. LNCS, vol. 2237, pp. 315–330. Springer, Heidelberg (2001)
10. Bellin, G., Scott, P.J.: On the pi-calculus and linear logic. Theoretical Computer Science 135, 11–65 (1994)
11. Bestuzheva, I.I., Rudnev, V.V.: Timed Petri nets: Classification and comparative analysis. Automation and Remote Control 51(10), 1308–1318 (1990)
12. Blass, A.: A game semantics for linear logic. Annals of Pure and Applied Logic 56, 151–156 (1992)
13. Bruni, R., Melgratti, H.C., Montanari, U.: Extending the zero-safe approach to coloured, reconfigurable and dynamic nets. In: Desel, J., Reisig, W., Rozenberg, G. (eds.) Lectures on Concurrency and Petri Nets. LNCS, vol. 3098, pp. 291–327. Springer, Heidelberg (2004)
14. Bruni, R., Meseguer, J., Montanari, U.: Tiling transactions in rewriting logic. In: WRLA 2002, Rewriting Logic and Its Applications. Electronic Notes in Theoretical Computer Science, vol. 71, pp. 90–109 (April 2004)
15. Bruni, R., Montanari, U.: Zero-safe nets: Comparing the collective and individual token approaches. Information and Computation 156(1-2), 46–89 (2000)
16. Clarke, D.: Reolite implementation (December 2005), http://www.cwi.nl/~dave/reolite
17. Clarke, D.: A basic logic for reasoning about connector reconfiguration. Fundam. Inform. 82(4), 361–390 (2008)
18. Clarke, D., Costa, D., Arbab, F.: Connector colouring I: Synchronisation and context dependency. Science of Computer Programming 66(3), 205–225 (2007)
19. Clarke, D., Proença, J., Lazovik, A., Arbab, F.: Deconstructing Reo. In: FOCLASA 2008. ENTCS (July 2008) (to appear)
20. Diakov, N., Arbab, F.: Adaptation of software entities for synchronous exogenous coordination: An initial approach. In: Proceedings of the Second International Workshop on Coordination and Adaptation of Software Entities, W-CAT 2005 (July 2005)
21. Engberg, U.H., Winskel, G.: Linear logic on Petri nets. In: de Bakker, J.W., de Roever, W.-P., Rozenberg, G. (eds.) REX 1993. LNCS, vol. 803, pp. 176–229. Springer, Heidelberg (1994)
22. Cabrera, L.F., et al.: Web Services Atomic Transaction (WS-AtomicTransaction). MSDN Library (November 2004)
23. Gelernter, D., Carriero, N.: Coordination languages and their significance. Commun. ACM 35(2), 97–107 (1992)
24. Girard, J.-Y.: Linear logic. Theoretical Computer Science 50, 1–102 (1987)
25. Hirai, T.: Propositional temporal linear logic and its application to concurrent systems. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences (Special Section on Concurrent Systems Technology) E83-A(11), 2219–2227 (2000)
26. Hirai, T.: Temporal Linear Logic and Its Application. PhD thesis, The Graduate School of Science and Technology, Kobe University, Japan (September 2000)
27. Kamide, N.: Linear and affine logics with temporal, spatial and epistemic logics. Theoretical Computer Science 252, 165–207 (2006)
28. Kanovich, M.I.: Linear logic as a logic of computations. Annals of Pure and Applied Logic 67(1–3), 183–212 (1994)
29. Kanovich, M.I.: Linear logic automata. Annals of Pure and Applied Logic 78, 147–188 (1996)
30. Kanovich, M.I., Ito, T.: Temporal linear logic specifications for concurrent processes (extended abstract). In: Twelfth Annual IEEE Symposium on Logic in Computer Science, pp. 48–57 (1997)
31. Koehler, C., Costa, D., Proença, J., Arbab, F.: Reconfiguration of Reo connectors triggered by dataflow. In: Electronic Communications of the EASST: Graph Transformation and Visual Modeling Techniques, vol. 10 (2008)
32. Küngas, P.: Analysing AI planning problems in linear logic – a partial deduction approach. In: Bazzan, A.L.C., Labidi, S. (eds.) SBIA 2004. LNCS, vol. 3171, pp. 52–61. Springer, Heidelberg (2004)
33. Küngas, P.: Temporal linear logic for symbolic agent negotiation. In: Zhang, C., Guesgen, H.W., Yeap, W.-K. (eds.) PRICAI 2004. LNCS, vol. 3157, pp. 23–32. Springer, Heidelberg (2004)
34. Meseguer, J., Montanari, U.: Petri nets are monoids. Information and Computation 88, 105–155 (1990)
35. Mousavi, M.R., Sirjani, M., Arbab, F.: Formal semantics and analysis of component connectors in Reo. Electronic Notes in Theoretical Computer Science 154(1), 83–99 (2006)
36. O’Hearn, P.W., Yang, H.: Petri net semantics of bunched implications (October 1999) (unpublished; available from Peter O’Hearn’s webpage)
37. Papadopoulos, G.A., Arbab, F.: Coordination models and languages. In: Zelkowitz, M. (ed.) The Engineering of Large Systems. Advances in Computers, vol. 46, pp. 329–400. Academic Press, London (1998)
38. Pham, D.Q., Harland, J., Winikoff, M.: Modelling agent’s choices in temporal linear logic. In: Baldoni, M., Son, T.C., van Riemsdijk, M.B., Winikoff, M. (eds.) DALT 2007. LNCS, vol. 4897, pp. 140–157. Springer, Heidelberg (2008)
39. Pham, D.Q., Harland, J.: Temporal linear logic as a basis for flexible agent interactions. In: Durfee, E.H., Yokoo, M., Huhns, M.N., Shehory, O. (eds.) 6th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2007). IFAAMAS (2007)
40. Pym, D.J.: The Semantics and Proof Theory of the Logic of Bunched Implications. Applied Logic Series, vol. 26. Kluwer Academic Publishers, Dordrecht (2002)
41. van der Aalst, W.M.P., ter Hofstede, A.H.M.: YAWL: Yet another workflow language. Information Systems 30(4), 245–275 (2005)
42. Watkins, K., Cervesato, I., Pfenning, F., Walker, D.: A concurrent logical framework: The propositional fragment. In: Berardi, S., Coppo, M., Damiani, F. (eds.) TYPES 2003. LNCS, vol. 3085, pp. 355–377. Springer, Heidelberg (2004)
43. Wegner, P.: Coordination as constrained interaction. In: Ciancarini, P., Hankin, C. (eds.) COORDINATION 1996. LNCS, vol. 1061, pp. 28–33. Springer, Heidelberg (1996)
An Object-Oriented Component Model for Heterogeneous Nets

Einar Broch Johnsen, Olaf Owe, Joakim Bjørk, and Marcel Kyas

Department of Informatics, University of Oslo, Norway
{einarj,olaf,joakimbj,kyas}@ifi.uio.no
Abstract. Many distributed applications can be understood in terms of components interacting in an open environment. This interaction is not always uniform as the network may consist of subnets with different quality: Some components are tightly connected with order preservation of communicated messages, whereas others are more loosely connected such that overtaking of messages and even message loss may occur. Furthermore, certain components may communicate over wireless networks, where sending and receiving must be synchronized, since the wireless medium cannot buffer messages. This paper proposes a formal framework for such systems, which allows high-level modeling and formal analysis of distributed systems where interaction is managed by a variety of nets, including wireless ones. We introduce a simple modeling language for object-oriented components, extending the Creol language. An operational semantics for the language is defined in rewriting logic, which directly provides an executable implementation in Maude.
1 Introduction

Object-oriented modeling languages [3, 11, 22] aim for a high level of abstraction, and typically capture systems in a platform-independent manner, as advocated by model-driven architecture [24]. Consequently, these languages abstract from low-level communication details such as the specific properties of the communication medium that components use. However, modern distributed applications often require a certain quality of service which cannot be modeled when perfect channels are assumed; e.g., a maximum latency or a minimum throughput. In practice, the properties of a specific connection may even evolve during execution. In particular, connections to other components may appear or disappear, and network components may be shared between several applications with different requirements. Consequently, the quality of a connection between two components may vary over time, and connections to components with the same functional interface may vary significantly in bandwidth or robustness. In many cases, the behavioral properties of the modeled system depend on the specific properties of the net. For such systems, it is desirable to enable cross-layer designs [23] by reflecting aspects of the (low-level) connectivity in the high-level modeling language. In this paper, we develop a light-weight, timed component model with an executable semantics. We present a kernel language, extending our previous work on Creol [11,12].
(This research is in the context of the EU project IST-33826 CREDO: Modeling and analysis of evolutionary structures for distributed services, http://credo.cwi.nl.)
Creol is a modeling language for distributed concurrent objects communicating by means of asynchronous method calls. However, this previous work did not address the communication medium in which the concurrent objects live. In this paper, we introduce language primitives to reflect links with different qualities. This allows us to model communication in heterogeneous nets; these are nets in which some components may be tightly and reliably connected whereas others may be loosely connected through unreliable or wireless links. This allows certain aspects of the connection quality between components to be taken into account during the analysis of a model. Furthermore, a component may decide on its actions depending on how it is connected to other components. In particular, we consider radio communication as well as multicast and broadcast communication, in order to integrate object-oriented modeling with wireless networks. The language abstracts from many implementation details; e.g., it uses a functional sublanguage for side-effect-free expressions, and execution may be highly non-deterministic. The language has an operational semantics defined in rewriting logic [15] and is executable on the Maude platform [5], which supports various forms of analysis such as simulation and breadth-first search through the execution space. To illustrate the language and analysis using Maude’s simulation support, an example of a simple sensor network is given. The paper is structured as follows: Section 2 discusses modeling of network aspects, Sect. 3 presents the modeling language, and Sect. 4 provides an example based on wireless medical sensors. Sect. 5 defines the operational semantics of the language, Sect. 6 discusses related and future work, and Sect. 7 concludes the paper.
2 Modeling of Network Information

This paper considers modeling of distributed systems that communicate over different links, and introduces a novel framework in which such systems may be modeled and simulated, and in which system properties may be subjected to formal analysis. When modeling a distributed system, the model should not only describe the components and their behaviors, but also how the different communication media involved are composed, since media properties often affect the overall properties of the system. However, it is not desirable to address all aspects of the network and communication details in a high-level model. We will here focus on safety properties such as “the sender is aware of sent messages that have arrived”, but also certain liveness properties such as “a message will arrive at its destination in at most n hops”. This means that certain aspects of the communication and network media must be formalized, for instance whether communication preserves message order, whether communication is immediate or delayed, and whether messages may get lost. Such factors typically affect overall system properties. In particular, we consider here tight, loose, and wireless links. A tight link between two nodes provides a reliable communication channel that guarantees FIFO ordering of messages sent between the linked objects. A typical example is a serial line link between the two nodes, or a TCP/IP connection. A loose link between two nodes is still reliable but it does not guarantee the FIFO ordering; rather, some messages take more time than others. A wireless link provides synchronous transmission of messages, but
Name             Description and Examples                              Simple Model
7. Application   Web browser, file transfer, mail transfer
6. Presentation  Data representation (MIME, XML) and encryption        3. Application
5. Session       Interhost communication (RPC, iSCSI)
4. Transport     End-to-end connections & reliability (TCP, UDP)       2. Transport
3. Network       Path determination & logical addressing (IP)
2. Data link     Physical addressing (802.3 (Ethernet),                1. Media
                 802.11a/b/g MAC/LLC (Wireless))
1. Physical      Media (100BASE-TX (Ethernet), IEEE 802.11a/b/g PHY
                 (Wireless)), signal, binary coding

(OSI layers 7–4 are the “Host Layers”, layers 3–1 the “Media Layers”.)

Fig. 1. Network layering model
simultaneous messages may be lost if they are within reach of each other (message collision). A message is received if the sender sends, the message does not collide in transmission, and the receiver receives at the same time; otherwise the message is lost. (Remark that the models considered in this paper are non-deterministic but not probabilistic.) A network built from tight links is called a tight network. Analogously, we define loose networks and wireless networks. A network may have parts that are loose, tight, and wireless. A loose network may have parts that are tight, but not vice versa. We say that a tight link is better than any other link. Our model is based on the Open System Interconnection Basic Reference Model (ISO/IEC 7498) [26], the OSI model for short. This is considered one of the standard models for describing networks and applications. It allows application-level programming without knowledge of the underlying network protocols. However, the abstractions provided by the OSI model sometimes make it difficult to exploit the capabilities of the underlying network and protocols. For example, applications in wireless networks often have specific requirements on memory and energy use and still need to guarantee their service with a certain quality. Such applications call for cross-layer designs, where the abstractions of the OSI model are weakened with APIs that enable the control of aspects of the lower layers (cf. [23] and below). In addition, by using different protocols over the same net, one may obtain different network qualities, such as lossy and fast versus non-lossy and slow, or FIFO and slow versus reordering and fast. In both cases a model design for a given application may benefit from some low-level network (protocol) knowledge, and possibly also from the ability to reprogram certain network-related aspects. By using the same language for the application-level modeling and the network-related aspects, such as the programming of network protocols and wireless radio controllers, one may obtain a uniform model with the desired properties. This paper considers a simplified version of the OSI model, which we compare to the original ISO/OSI model (see Fig. 1). Our model allows application-level programming as well as the programming of network protocols and wireless radio controllers. Details that are not relevant for high-level modeling and analysis are abstracted away, while other details are included, such as the presence of communication buffers,
Syntactic categories.
C, I, m in Names    n in Network    t in Label     g in Guard      p in MtdCall
s in Stm            x in Var        e in Expr      o in ObjExpr    b in BoolExpr

Definitions.
IF  ::= interface I {x : I} {inherits I} begin {with I Sg} end
CL  ::= class C {x : I} {inherits C} {implements I}
        begin {var x : I {= e}} {with I} M end
M   ::= Sg == {var x : I {= e}; } s
Sg  ::= op m ({in x : I} {out x : I})
n   ::= loose | tight | wless
g   ::= b | t? | g ∧ g | g ∨ g
p   ::= m(e) | o.m(e)
s   ::= (s) | s; s | x := e | x := t.get | x := new C{(e)}{in o} | tick(n)
      | if b then s {else s} fi | while b do s od | await g
      | {t :=}!p | {x :=}p | await x := p | !o.m(e) | !all : I.m(e)
s_wless ::= send | receive
s_net   ::= link o n o | unlink o n o

Fig. 2. The language syntax. Overlined terms such as e, x, and s denote lists over the corresponding syntactic categories, and curly brackets denote optional elements. Additional constructs for low-level wireless programming are given by s_wless, and for network connections by s_net.
ordering properties, immediateness of transmission, radio transmission synchronization, and message collision of wireless messages. Our model consists of three levels:

1. the media layer is represented by rules in the operational semantics, formalizing the transport of messages in the different nets; it is partly “programmable” in that link and unlink statements allow links between components to be established and severed, as well as in the synchronization of radio sending and receiving;
2. the transport layer is partly represented by rules of the operational semantics (formalizing the meaning of “tight” and “loose”) and partly programmable by language primitives, allowing, e.g., the programming of routing in wireless systems;
3. the application level represents top-level programs. At this level the actual underlying net is invisible, in the sense that one may use the same high-level communication primitives, including broadcast and multicast primitives, regardless of the actually used network.

The chosen primitives enable cross-layer designs of wireless network applications, which arise from the necessity to adapt properties of lower layers to the applications under design [23]. Such designs allow the network to be adapted for better quality of service [19] or to optimize its energy consumption [10]. The framework may be adjusted to cater for other communication properties, such as message loss and packet size.
3 A Modeling Language for Components in Heterogeneous Nets

We introduce an executable modeling language for components in heterogeneous networks, based on the object-oriented language Creol [11, 12]. Creol proposes imperative programming constructs for distributed concurrent objects, built around asynchronous method calls and processor release points. Asynchronous method calls may be seen
as triggers of concurrent activity, resulting in new processes in the called object. Objects are dynamically created instances of classes, which are organized in an inheritance hierarchy. Concurrent objects encapsulate an execution thread and an internal process queue. Active behavior, triggered by a run method, is interleaved with passive behavior by means of the processor release points. The modeling language includes a standard expression language for values of basic data types, which will not be explained in detail. Objects have unique identities (names); communication takes place between named objects, and object identities may be exchanged between objects. Object variables are typed by interfaces. The language is strongly typed: invoked methods are supported by the called object (when not null), such that formal and actual parameters match. In contrast, the modeling language considered in this paper targets network components in heterogeneous networks. It combines object-oriented components and definitions of actual networks, including tight, loose and wireless networks, and dynamic network changes. Technically, we extend the concurrent object communication model of Creol with representations of components and networks, incorporate a notion of (local and global) time, and introduce multicasts and forms of broadcast. Finally we define primitives for dealing with the special needs of wireless networks inside this model.

Network components. In order to model the units of the heterogeneous network, we introduce a light-weight notion of multi-object network components. The objects inside a component are tightly connected and communicate directly with each other. A component supports all interfaces supported by its objects; thus the caller may call a method on a component if the called method is supported by some object in that component. In case several objects in the component support the called method, one of these is chosen non-deterministically. However, if the caller knows the identity of a preferred object inside the component, the caller may call that object directly. Component creation has the syntax x := new C(e), where C is the class name and e the list of actual class parameters, if any. In fact, this statement creates a network component consisting of a single object. Both components and objects have identity: for an object identity o, the expression component(o) gives the component identity of o (for a component identity c, component(c) = c). The statement x := new C(e) in o creates a new (object) instance of C inside the component o.

Basic statements. The basic language syntax is given in Fig. 2. A program consists of interface and class declarations. Classes CL contain definitions of attributes x (with initial values) and methods M. A method contains a list of statements s, which may access class attributes, locally defined variables, and the formal parameters of the method (given by the keywords in and out). An interface IF contains method signatures Sg associated with a cointerface I, denoting the (minimal) type of a client of IF (given by a with clause). Both classes and interfaces may also contain parameters and inherit other classes and interfaces, respectively. Finally, a class implements a list of interfaces. In order to allow type-correct call-backs, a method may use the implicit caller parameter, which supports the cointerface of the method. Input parameters, as well as the self-reference this, are read-only.
Note that remote attribute access is not permitted, so method interaction is the only means of communication in the language. Assignment and if- and while-constructs are standard. We assume that purely local operations take no time; local delays may be captured by the statement tick(n), for n time units.
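As a small illustrative sketch of these creation and timing primitives (the class names Sorter and Buffer are hypothetical, not taken from the paper's examples):

var c, b : Any;
c := new Sorter;       *** creates a new component containing a single Sorter object
b := new Buffer in c;  *** creates a Buffer object inside the same component,
                       *** so component(b) = component(c)
tick(5);               *** a purely local delay of five time units

The first new creates a fresh single-object component, whereas the in clause reuses the existing component identity.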
In the statement await g, the guard g is used to control processor release and may consist of Boolean conditions and return tests (see below). If g evaluates to false in the active process, the process is suspended and the execution thread becomes idle. When the execution thread is idle, any enabled process may be chosen from the local process queue. Therefore explicit signaling is not part of the language. The run method of an object is called upon creation, and initiates active behavior. Release points in the run method allow processes in the process queue to be handled.

Communication. After making an asynchronous method call t := !o.m(e), the caller may proceed with its execution without waiting for the method reply. Here o is an object expression and e are (data value or object) expressions. The tag t will be assigned a unique tag value identifying the call (relative to the current object), which may later be used to refer to that call in two different ways. First, the guard await t? suspends the active process unless a return to the call associated with t has arrived. Second, the return values are accessed by the blocking reply statement x := t.get, once a return has arrived. We identify certain special cases of these communication primitives: For local calls the dot-notation and o are omitted; e.g., t := !m(e). If no return value is desired by the caller, the tag may be omitted; e.g., !o.m(e). The sequence t := !o.m(e); x := t.get gives a blocking call, abbreviated x := o.m(e), whereas the call sequence t := !o.m(e); await t?; x := t.get gives a non-blocking call, abbreviated await x := o.m(e). A multicast !o.m(e) is an asynchronous method call with a set of target objects. The multicast is sent simultaneously to all callees. A broadcast !all : I.m(e) is an asynchronous method invocation which targets all objects of a certain interface I. For strong typing, I must provide a declaration of the invoked method. The use of these communication primitives is illustrated below and in Sect. 4.

Heterogeneous nets. In a given model, the network connecting the components need not be uniform. Actual nets are defined by means of a number of direct links between components, which may have different characteristics. In this paper, we consider three basic forms of links: wireless, loose, and tight. In order to model the heterogeneous net, links are declared by the statements link o wless u, link o loose u, and link o tight u (for sets of component expressions o and u). These statements respectively add wless, loose, or tight links from each component in o to each component in u. Correspondingly, links are explicitly broken by, e.g., the statement unlink o wless u. Remark that (wired) loose or tight links go both ways, but this is not generally the case for wireless links. Furthermore, links to self are redundant.

Example 1. Initial links may be made inside the run method of a class System from which the initial components are created.

op run ==
  var a, b, c, d : Any;
  a := new Class1; b := new Class2; c := new Class3; ...
  link a wless b, c;
  link b, c, d wless a, b, c, d;
Here a must use b or c to communicate with d, but d may communicate directly with a. Wireless communication. A wireless component needs radio functionality to handle the sending and receiving of messages. In order to control the timing of this sending and
receiving precisely, we model the radio functionality in a separate radio object in the wireless component. Therefore a component acting in wireless media will consist of at least two active objects: the main processing unit and the radio object. It is the task of the radio object to make wireless messages available for regular processing by the objects inside the component. The objects themselves act as if they work in a wired network, using the standard primitives for communication. A component acting in a wireless net must be able to wait for and to send a message in a given time interval. We use two explicit non-blocking primitives to capture wireless sending and receiving: receive to receive a wireless message and make it available to the component’s objects, and send to send the first pending wireless message.

Example 2. A cycle of the radio unit of a wireless component could be to receive, then send, receive again, and finally sleep. Instead of defining a controller in the sensor’s central processing unit, we use the radio’s run method to control the cycle.

class Radio(sendtime:Nat, sleeptime:Nat, cycle:Nat, sync:Nat) implements Controllable
begin
  var on:Bool := true, timer:Nat := 0
  op run ==
    while on do
      await (clock - sync) rem cycle = 0;   *** synchronize
      timer := clock;
      while clock < timer + sleeptime do
        if clock = timer + sendtime then send else receive fi
      od
    od
  with Any
    op turnoff == on := false
    op turnon == on := true
    op reset(in time:Nat) == sync := time
    op setSend(in time:Nat) == if time < sleeptime then sendtime := time fi
    op setSleep(in time:Nat) == if sendtime < time then sleeptime := time fi
end
When the radio is turned on, the cycle consists of an active phase where the radio is sending in a specified interval (here of length 1 time unit) and otherwise receiving, followed by a sleeping phase. In addition there are methods to turn the radio on and off, to adjust the sending and sleeping intervals, and for synchronizing the radio cycle. These methods form an interface, Controllable, allowing external control of the radio. When sleeping, the processor is released and invocations of the radio methods may be processed. For simplicity, we here assume a fixed cycle length (set at creation time).
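To illustrate how the call primitives of this section interact with the Controllable interface, the following is a minimal sketch (the class Monitor is hypothetical, not part of the sensor model, and we assume labels may be declared with a type Label, as suggested by the syntactic categories of Fig. 2):

class Monitor(r : Controllable)
begin
  op run ==
    var t : Label;
    !r.setSleep(8);        *** fire-and-forget asynchronous call
    t := !r.reset(clock);  *** tagged asynchronous call
    await t?;              *** release the processor until the completion arrives
    !r.turnoff()
end

The await t? release point lets other processes of the component run while the radio handles the request; the blocking form t.get would instead keep the object's execution thread busy.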
4 Example: A Model of a Wireless Sensor Network

A typical biomedical sensor network consists of a number of sensors, a sink, and users. The example in Fig. 3 has five sensors and one sink connected by wireless links. The sink sends signals which are sufficiently strong for all sensors to receive them. The sensors, which could be inside patients, run on battery, and save power by reducing their signal strength, which again limits their range. Hence some sensors are not directly connected with the sink and depend on other sensors to forward their messages. The sink is connected to the end users by a tight network. The interfaces of the system components are given in Fig. 4. The Forwarder interface declares a forward operation, and
is inherited by both the Sink and the Sensor interfaces. Methods of Forwarder have Forwarder as cointerface (given by the with clause) because a Forwarder object should communicate with another Forwarder object. The Sensor interface adds a method for updating the distance to the sink, which is accessible only to sink objects, as Sink is the cointerface. The User interface declares a newData method for receiving data from the sink. The run method of the System class constructs the system model by creating all objects and setting up the initial network (not given here). A sensor component is created as in

s1 := new TempSensor(10); r1 := new Radio(t1, 6, 10, 1) in component(s1);
and similarly for the sink component, ensuring that the sending intervals of the different radios are disjoint, by appropriate radio parameter values. It also reconfigures the network at runtime to simulate the patients’ movements. For example, we express that sensor s5 moves too far to reach the sink after 200 time units by severing the link:

await clock > 200; unlink s5 wless sink;
The sensors, sink, and users are described by the classes TempSensor, Sink, and User in Fig. 4. The sensors operate in cycles with a period given by the class parameter interval. In each cycle a sensor reads the current temperature by calling an instance of the TempMeter interface, using a non-blocking call. The TempMeter interface models access to the hardware. After reading the temperature, the sensor sends a message with the temperature to all reachable components that implement the Forwarder interface by the statement

!all:Forwarder.forward(this, distToSink + 2, 1, clock, temp)
Here, reachable means that there is a direct link from the caller to the callee. If there is more than one link between the two components, the best link is selected. If the link is tight, the message will move directly from the caller’s out-queue to the callee’s in-queue. If the link is wireless, then the radio unit transports the message.
interface Forwarder
begin
  with Forwarder op forward(...)
end

interface Sensor inherits Forwarder
begin
  with Sink op setDistToSink(...)
end

interface Sink inherits Forwarder begin ... end

interface User
begin
  with Sink op newData(...)
end

class TempSensor(interval:Nat) implements Sensor
begin
  var forwarded:List[Oid*Nat] := emp, distToSink:Nat := 10,
      distUpdTime:Nat := 0, timer:Nat, temp:Int, tempm:TempMeter
  op run ==
    tempm := new TempMeter in component(this);
    while true do
      timer := clock;
      await temp := tempm.getTemp();
      !all:Forwarder.forward(this, distToSink + 2, 1, clock, temp);
      await clock > timer + interval
    od
  with Sink
    op setDistToSink(in time:Nat, dist:Nat) ==
      if distUpdTime < time ∨ (distUpdTime = time ∧ dist < distToSink)
      then distToSink := dist; distUpdTime := time fi
  with Forwarder
    op forward(in origin:Oid, htl:Nat, steps:Nat, timestamp:Nat, data:Int) ==
      if not((origin, timestamp) in forwarded) ∧ origin ≠ this ∧ htl > distToSink then
        if length(forwarded) > 9 then forwarded := after(forwarded, length(forwarded) - 9) fi;
        forwarded := forwarded ++ (origin, timestamp);
        !all:Forwarder.forward(origin, htl - 1, steps + 1, timestamp, data)
      fi
end

class Sink() implements Sink
begin
  var forwarded:List[Oid*Nat*Int] := emp
  with Forwarder
    op forward(in origin:Oid, htl:Nat, steps:Nat, timestamp:Nat, data:Int) ==
      !origin.setDistToSink(timestamp, steps);
      if not((origin, timestamp) in forwarded) then
        if length(forwarded) > 9 then forwarded := after(forwarded, length(forwarded) - 9) fi;
        forwarded := forwarded ++ (origin, timestamp);
        !all:User.newData(origin, timestamp, data)
      fi
end

class User(criticalLow:Nat, criticalHigh:Nat) implements User
begin
  var allData:List[Oid*Nat*Nat*Int] := emp
  op alarm() ...
  with Sink
    op newData(in origin:Oid, timestamp:Nat, data:Int) ==
      var i:Nat := 1;
      if data < criticalLow ∨ data > criticalHigh then alarm() fi;
      while i <= length(allData) ∧ index(index(allData, i), 1) ≠ origin do i := i + 1 od;
      while i <= length(allData) ∧ index(index(allData, i), 1) = origin
            ∧ index(index(allData, i), 2) < timestamp do i := i + 1 od;
      allData := insertAtIndex(allData, i, (origin, timestamp, clock, data))
end

Fig. 4. Model of the temperature sensor, the sink, and a user. We here write ++ for list append.
In class Sink, the forward method has the following parameters: origin gives the identity of the sensor that has provided the data; htl (hops to live) gives the number of remaining hops the message should live; steps gives the number of hops this message has taken so far; timestamp stores the time when the origin sent this message; and data is the temperature measured in degrees centigrade. When a sensor receives a forward call, it adds the message to forwarded and forwards the call to all reachable forwarders, unless the message has already been forwarded. The length of forwarded is limited to ten entries, a common limitation of, e.g., biomedical sensors. The distance to the sink from a sensor may be measured by the minimum number of hops needed. Because sensors may move, this distance may change. When the sink gets a message, it sends the number of hops taken by this message to the original sender, using setDistToSink. If the data are new to the sink, they are broadcast to all users. The user stores data in a list allData, which is sorted by sensor name and sending time to simplify queries. Observe that none of the remote calls made in any class are blocking; consequently, the system is deadlock free (assuming that local calls always terminate).
5 Operational Semantics

The operational semantics of the language is defined using rewriting logic (RL) [15]. A rewrite theory is a 4-tuple R = (Σ, E, L, R), where the signature Σ defines the function symbols of the language, E defines equations between terms, L is a set of labels, and R is a set of labeled rewrite rules. A state configuration in RL will be modeled as a multiset of terms representing local system states, of given types. These types are specified in (membership) equational logic (Σ, E), the functional sublanguage of RL which supports algebraic specification in the OBJ [9] style. RL extends algebraic specification techniques with transition rules: The dynamic behavior of a system is captured by rewrite rules, supplementing the equations which define the term language. Assuming that all terms can be reduced to normal form, rewrite rules transform terms modulo the equations of E. A rewrite rule t −→ t′ if c may be seen as a local transition rule allowing an instance of the pattern t to evolve into the corresponding instance of the pattern t′, where the optional condition c is a conjunction of rewrites and equations which must hold for the main rule to apply. If several rules can be applied to distinct subconfigurations, they can be executed in a concurrent rewrite step. As a result, concurrency is implicit in rewriting logic semantics. Rules in RL may be formulated at a high level of abstraction, similar to a compositional operational semantics. In fact, RL provides a semantic framework unifying equational and operational semantics [16]. Many concurrency models have been successfully represented in RL [15, 5], including Petri nets, CCS, Actors, and Unity. RL also offers its own model of object orientation [5]. In RL, objects are commonly represented by terms ⟨o : C | a1 : v1, . . ., an : vn⟩, where o is the object’s identity, C is its class, the ai’s are the names of the object’s fields, and the vi’s are the corresponding values [5]. We adopt this form of presentation and define the elements of our semantics as RL objects. When auxiliary functions are needed, these are defined in equational logic and evaluated in between transitions [15]. Whitespace is used as the associative and commutative constructor of multisets with identity
element empty, whereas semicolon is used as the associative constructor of lists, also with identity element empty. Variables of the operational semantics are written in upper case letters, whereas variables of the modeling language (as well as auxiliary functions) are written in lower case letters. As before, variables for lists or multisets of a semantic construct M are written with an overline, M̄.

5.1 Configurations, Local and System Transitions

The modeling language considered in this paper depends on a notion of time. For timed distributed systems, time is either modeled by a global clock (or, equivalently, local clocks which evolve with the same rate), or by local clocks. For simplicity we use a so-called fictitious clock model [1] based on a global clock, which allows us to ignore clock synchronization between objects. In this clock model, the clock value is just a number that serves to group simultaneous events. The values need not correspond to values of real clocks, and are therefore usually chosen to be natural numbers that count steps. Effects such as radio broadcast are confined to a particular instance of time, and disappear as soon as time advances. Local clocks coordinate the objects’ behavior with the global progress of time, enforcing the invariant that an object may only make a step when the local time is less than or equal to the global time. The global clock advances as soon as its value is less than the values of all local clocks. A state configuration, of sort Configuration, is a multiset which consists of objects, classes, interfaces, queues, messages, and links. The empty configuration is denoted empty. A basic link is written [O N O′], where O and O′ range over object identities and N over networks (i.e., wless, tight, and loose). In order to capture the global clock in RL, we let a term @ C clock(N) @ of sort System include a configuration C with at least one object and a (global) clock, denoted clock(N), where the variable N ranges over natural numbers. There are three different kinds of rewrite rules:

– Code execution rules correspond to the different program statements;
– Transport rules move messages between objects, components, and the network;
– System level rules manage low-level activities such as global clock update and table lookup for classes and interfaces.

Remark that code execution and transport rules apply to local configurations and allow concurrent execution, whereas system level rules apply to the whole system. Components consist of tightly connected objects, and are represented by a naming discipline: a component name may be extracted from every object identity by a function component : Oid → Cid, where the component name is of sort Cid (a subsort of Oid). The objects with the same component name form a component. A component O has one in-queue object ⟨O : InQu | EvQ: M⟩, where the queue M is FIFO ordered, and one out-queue object ⟨O : OutQu | EvQ: M, Tag: K⟩, where M has a simple form of priority ordering and the tag K (together with the object identity) is used to uniquely identify outgoing messages. (Other forms of priority queues could be considered in more specialized settings; e.g., LIFO out-queues would give priority to fresh messages.) The queues have the same name as the component, and provide (a controlled form of) shared data structures for the component’s objects. The queues may
interact with the net at the same time as internal actions occur inside the component’s objects. The specific message processing depends on the different networks linked to an object. Remark that by using the same component name for all objects in a component, we need no further encapsulation syntax for components. In many cases, including the examples in this paper, full object names are not needed and component names suffice. A concurrent object is represented by ⟨O : C | Pr: Q, PrQ: Q̄, Att: V⟩, where O is the object identity, of sort Oid, C the class name, Pr the active process (which includes code and local variables), PrQ a multiset of suspended processes with unspecified queue ordering, and Att the object state variables, including some predefined system variables such as clock, which represents the local clock of the object. A process is modeled as a pair consisting of code and local state, (S, W), where S is a statement list and W is a state mapping from (local) variable names to values, using + for concatenation (and overwriting) and _ ↦ _ for constructing variable-to-value associations. The suspended processes in the process queue represent remaining parts of method activations. Programs have read-only access to the clock, so the programmer may not assign to the clock variable. Let [[E]]V denote the evaluation of an expression E in the state V.

Example. A wireless sensor may have one radio object O2, responsible for sending and receiving wireless messages in interaction with the network, together with a main object O1 doing the main computations. Such a component may have the form:

⟨O : InQu | EvQ: M⟩ ⟨O : OutQu | EvQ: M′, Tag: K⟩
⟨O1 : C | Pr: Q1, PrQ: Q̄1, Att: V1⟩ ⟨O2 : Radio | Pr: Q2, PrQ: Q̄2, Att: V2⟩

where component(O1) = component(O2) = O, and a class Radio defines active behavior controlling the wireless sending and receiving of messages (see Section 4). In a class ⟨C : Cl | Ifc: I, Inh: C̄, Par: Y, Att: V, Mtds: P⟩, C is the class name, Ifc is the list of interfaces supported by the class, Inh is the list of superclasses, Par the list of class parameters, Att a list of attributes with initial values, and Mtds a multiset of methods (including the initialization method init and a method run defining active object behavior). The attributes include a system variable token used for unique naming of generated objects. When an object needs a method, it is loaded from the Mtds multiset of the object’s class. Similarly, in an interface ⟨I : Ifc | Inh: Ī⟩, I is the name and Ī the inherited interfaces. The inheritance list is used for broadcasts at run-time to determine all (connected) objects of a given super-interface. Method and cointerface declarations in an interface are used for type checking purposes and may be ignored at run-time. Heterogeneous networks are represented by sets of links. We let Link be a subsort of Config, thereby allowing (multi)sets of links directly in the configuration, with terms [O N O′] (where N denotes a net; i.e., either wless, tight, or loose). We assume that the transmission strength in a wireless link may vary, in contrast to a wired link. Consequently, wireless links are directed and not symmetric, whereas wired connections are both symmetric and transitive. In the multiset of links, duplicates as well as links to self are ignored (i.e., we have the equations CN CN = CN and [O N O] U = U, where CN denotes some link and O a component). The link statements described in Sect. 3 result in changes of link configurations. We define a function bestcon : Oid Oid Configuration → Net+ to identify the best connection between two objects, exploiting the transitivity and symmetry of wired
(BIND)
@ ⟨O : C | PrQ: Q⟩ ⟨O : InQu | Ev: invoc m(E); M⟩ U @
= @ ⟨O : C | PrQ: Q; bind(m, E, C, U)⟩ ⟨O : InQu | Ev: M⟩ U @
if supports(C, m, U)

(GUARD)
⟨O : C | Pr: (await G; S, W), PrQ: Q, Att: V⟩ ⟨O : InQu | Ev: M⟩
−→ ⟨O : C | Pr: (S, W), PrQ: Q, Att: V⟩ ⟨O : InQu | Ev: M⟩
if enabled(G, (V+W), M)

(SUSPEND)
⟨O : C | Pr: (S, W), PrQ: Q, Att: V⟩ ⟨O : InQu | Ev: M⟩
−→ ⟨O : C | Pr: idle, PrQ: Q; (S, W), Att: V⟩ ⟨O : InQu | Ev: M⟩
if not enabled(S, (V+W), M)

(PRQ-READY)
⟨O : C | Pr: idle, PrQ: (S, W); Q, Att: V⟩ ⟨O : InQu | Ev: M⟩
−→ ⟨O : C | Pr: (S, W), PrQ: Q, Att: V⟩ ⟨O : InQu | Ev: M⟩
if enabled(S, (V+W), M)

(IDLESTEP)
⟨O : C | Pr: idle, PrQ: Q, Att: V⟩ ⟨O : InQu | Ev: M⟩ clock(T)
−→ ⟨O : C | Pr: idle, PrQ: Q, Att: advance(V)⟩ ⟨O : InQu | Ev: M⟩ clock(T)
if not enabled(Q, V, M) and [[clock]]V = T

Fig. 5. Rules for process queue handling. In the rules we omit object fields not relevant for the rule. Note that matching is modulo associativity, commutativity, and identity for the multiset constructor, and modulo associativity and identity for the list constructor.
networks. The sort Net+ extends the sort Net with the constant noNet, and we define bestcon(O, O′, U) = noNet if there is no connection path from (the component of) O to (the component of) O′. Otherwise, the connection between the two objects is tight if there is a connection path from O to O′ consisting of tight direct connections only; loose if there is one or more loose direct connections in the path; and wless if there is a wireless connection between O and O′. Objects in the same component are always tightly connected:

bestcon(O, O′, U) = tight if component(O) = component(O′)

For example, bestcon(o1, o3, [o1 wless o3][o1 tight o2][o3 loose o2]) = loose, whereas bestcon(o1, o3, [o1 wless o3][o1 tight o2]) = wless.

Messages. There are three different kinds of message bodies MB: these have the form invoc m(par) for invocation messages, where m is the name of the called method and par are actual parameters; comp(par) for completion messages; and error(name) for error messages, capturing network errors or other kinds of errors. The actual parameters include system-generated parameters such as the caller identity and tag value. With full header information, a message has the form MB from O to O′ by NET, where O is the sender object, O′ the destination (either a single object or a list of objects), and NET is the network to be used: loose, tight, wless, or noNet. For simplicity, we omit sender information from messages inside out-queues and keep only message bodies inside in-queues. The network information of a message is determined when the message is placed in the out-queue (by means of an equation taking the total network into consideration). Remark that messages by noNet cannot be sent.
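For instance, in the sensor model of Sect. 4, a pending invocation of forward might appear in the configuration as the following message (the concrete values are hypothetical; the first two actual parameters are the implicit caller identity and tag value):

invoc forward(s1, 7, s1, 12, 1, 20, 37) from s1 to sink by wless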
5.2 The Rewrite Rules

The operational semantics ensures that clock values increase, and that the global clock is less than or equal to each local clock. The global clock is updated by the rule (CLOCK) in Fig. 6, in which the variable U ranges over configurations, clockmin gives the smallest local clock value in U, and refresh removes any remaining receive statements and wireless messages from the configuration (since such messages are not persistent). The function clockmin is defined by the following equations (where OB denotes an object):

clockmin(OB OB′ U) = min(clockval(OB), clockmin(OB′ U))
clockmin(OB U) = clockval(OB) otherwise
clockval(⟨O : C | . . ., Att: V⟩) = [[clock]]V

Note that in RL, equations marked by otherwise only apply when no other equations are applicable [5]. In general, an object may only compute when its local clock value equals the global clock. Thus a rule modeling object behavior typically has the form

object clock(T) −→ object′ clock(T) if clockval(object) = T

where object is a pattern representing an object (possibly with its associated in-queue) and object′ is the resulting object state, typically with local time increased. In particular, the rule (IDLESTEP) in Fig. 5 increases the local time of objects that are idle; i.e., objects with no active process and in which no process in the process queue is enabled. Similarly, the rule (NOREPLY) in Fig. 6 increases local time when the active process is blocked (i.e., on x := t.get) with no matching label value in the in-queue. In addition, all communication statements increase the local clock (see Fig. 6); e.g., send, receive, method calls, return, link, and unlink. The principle that local computation requires the local and global clock values to be the same may be relaxed for internal object actions; i.e., by allowing local actions of objects which do not involve any interaction (affected by or affecting other objects). For example, assignments to local and state variables do not need to depend on the global clock. Thus, the assignment rule, which may be given by
(ASSIGN)
⟨O : C | Pr: (X := E; S, W), Att: V⟩
−→ if X in V then ⟨O : C | Pr: (S, W), Att: V + (X ↦ [[E]](W+V))⟩
   else ⟨O : C | Pr: (S, W + (X ↦ [[E]](W+V))), Att: V⟩ fi
does not increase the local time. The tick(n) statement evaluates as an assignment on the local clock; i.e., clock := clock + n. The rules for if and while (given in Fig. 9), as well as guards and process queue handling, do not involve clocks, except (IDLESTEP) (given in Fig. 5). The (BIND) rule additionally uses the class hierarchy to bind methods, and the supports function checks if a class supports a method in a given class hierarchy. To simplify, object names in the rules are abstracted to component names. This allows a direct matching by names (O = O′) rather than matching by component name (as in component(O) = component(O′)). Note that method binding based on component names is non-deterministic when the component has several objects supporting the method. With this simplification, all remote calls are made to components. For many practical purposes, including the sensor example, this simplification works well.
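As a hedged illustration (this is not one of the paper's numbered rules), the treatment of tick described above could be rendered as an equation that unfolds it into the corresponding local-clock assignment, which (ASSIGN) then executes without consulting the global clock:

(TICK)
⟨O : C | Pr: (tick(N); S, W), Att: V⟩ = ⟨O : C | Pr: (clock := clock + N; S, W), Att: V⟩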
(CLOCK)
@ clock(T) U @ −→ @ clock(clockmin(U)) refresh(T, U) @ if T < clockmin(U)

(NET)
@ ⟨O : OutQu | Ev: M; (MB to O′); M′⟩ U @
= @ ⟨O : OutQu | Ev: M; (MB to O′ by bestcon(O, O′, U)); M′⟩ U @

(NONET)
⟨O : OutQu | Ev: M; (MB to O′ by noNet)⟩
= ⟨O : OutQu | Ev: M; (error(“noNet”) to O by tight)⟩

(MULTIMSG1)
⟨O : OutQu | Ev: M; (MB to (O′; Ō))⟩ = ⟨O : OutQu | Ev: M; (MB to O′); (MB to Ō)⟩

(MULTIMSG2)
⟨O : OutQu | Ev: M; (MB to empty)⟩ = ⟨O : OutQu | Ev: M⟩

(MULTIMSG3)
@ ⟨O : OutQu | Ev: M; (MB to all : I)⟩ U @ = @ ⟨O : OutQu | Ev: M; (MB to all(O, I, U))⟩ U @

(MULTICAST)
⟨O : C | Pr: (!Ō.m(E); S, W), Att: V⟩ ⟨O : OutQu | Ev: M, Tag: K⟩ clock(T)
−→ ⟨O : C | Pr: (S, W), Att: advance(V)⟩
⟨O : OutQu | Ev: M; (invoc m(O, K, [[E]]V+W) to [[Ō]]V+W), Tag: K+1⟩ clock(T)
if [[clock]]V = T

(UNICAST)
⟨O : C | Pr: (L := !O′.m(E); S, W), Att: V⟩ ⟨O : OutQu | Ev: M, Tag: K⟩ clock(T)
−→ ⟨O : C | Pr: (L := K; S, W), Att: advance(V)⟩
⟨O : OutQu | Ev: M; (invoc m(O, K, [[E]]V+W) to [[O′]]V+W), Tag: K+1⟩ clock(T)
if [[clock]]V = T

(BROADCAST)
⟨O : C | Pr: (!all : I.m(E); S, W), Att: V⟩ ⟨O : OutQu | Ev: M, Tag: K⟩ clock(T)
−→ ⟨O : C | Pr: (S, W), Att: advance(V)⟩
⟨O : OutQu | Ev: M; (invoc m(O, K, [[E]]V+W) to all : I), Tag: K+1⟩ clock(T)
if [[clock]]V = T

(RETURN)
⟨O : C | Pr: (return(E); S, W), Att: V⟩ clock(T) ⟨O : OutQu | Ev: M, Tag: K⟩
−→ ⟨O : C | Pr: (S, W), Att: advance(V)⟩ clock(T)
⟨O : OutQu | Ev: M; (comp([[(label, E)]]V+W) to [[caller]]W), Tag: K⟩
if [[clock]]V = T

(REPLY)
⟨O : C | Pr: (X := L?; S, W), Att: V⟩ clock(T) ⟨O : InQu | Ev: M; comp(K, E); M′⟩
−→ ⟨O : C | Pr: (X := E; S, W), Att: advance(V)⟩ clock(T) ⟨O : InQu | Ev: M; M′⟩
if [[clock]]V = T and K = [[L]]V+W

(NOREPLY)
⟨O : C | Pr: (X := L?; S, W), Att: V⟩ clock(T) ⟨O : InQu | Ev: M⟩
−→ ⟨O : C | Pr: (X := L?; S, W), Att: advance(V)⟩ clock(T) ⟨O : InQu | Ev: M⟩
if [[clock]]V = T and not inqueue([[L]]V+W, M)

Fig. 6. Rewrite equations and rules for message processing
(TIGHT1)
⟨O : OutQu | Ev: (MB to O′ by tight); M⟩ ⟨O′ : InQu | Ev: M′⟩
= ⟨O : OutQu | Ev: M⟩ ⟨O′ : InQu | Ev: M′; MB⟩

(TIGHT2)
(M1 to O1 by wless); (M2 to O2 by N) = (M2 to O2 by N); (M1 to O1 by wless)
if not N = wless

(LOOSE1)
⟨O : OutQu | Ev: (MB to O′ by loose); M⟩ = ⟨O : OutQu | Ev: M⟩ (MB from O to O′ by loose)

(LOOSE2)
(MB from O to O′ by loose) ⟨O′ : InQu | Ev: M⟩ −→ ⟨O′ : InQu | Ev: M; MB⟩

Fig. 7. Tight and loose networks
Basic Statements and Creation. The rules for basic statements, such as skip, if, while, link, and unlink, are straightforward (see Fig. 9). The rule for object creation creates a new object and associated queues. In case of a new object in a given component, the component identity is reused for the new object and no further queues are created. Local calls are defined by remote calls to self (by the obvious equation). For simplicity, the rules for reentrance are ignored in this paper.

Network processing. The rewrite rules for network processing are given in Fig. 6. In the equation (NET), the network determines how to send messages. Notice that the equation is on the total system, such that all possible connections are considered. The equation represents network actions, realized by hardware or the operating system. Equation (NONET) reflects that messages to noNet cannot be sent. Such messages may represent serious communication failures and should be communicated to the caller. In our framework this is indicated by an error message. Message sending to multiple destinations is defined by means of messages to single destinations in equation (MULTIMSG1), and by ignoring messages sent to empty destination lists in (MULTIMSG2). The rules for tight and loose networks are given in Fig. 7. A tight network from o to o′ is defined by a transport equation (TIGHT1), which takes the first message from the out-queue of o marked by “tight” directly into the in-queue of o′. We let wired messages in out-queues have priority over wireless ones, as stated in equation (TIGHT2). Loose networks are modeled by using an equation to move messages from an out-queue into the configuration, and by a (non-deterministic) rule taking a message from the configuration into the appropriate in-queue ((LOOSE2)). Messages over loose nets are automatically sent to the network (i.e., placed in the configuration multiset) by the equation (LOOSE1).

Communication. Messages in the out-queue are created by call and return statements in the associated objects. We consider uni-cast, multicast, and broadcast; the rules are given in Fig. 6. Of these, only labeled uni-casts allow the caller to request a result of the call. We have seen that blocking (synchronous) method calls are understood in terms of labeled asynchronous method calls: t := !o.m(e), where o is an object expression. The label value provides a way of identifying the call and the reply. In rule (UNICAST), a labeled call has only one callee, which ensures that there is a unique reply. A call’s result value is communicated in rule (RETURN) as a completion message, caused by a
(WSEND1)
⟨O : C | Pr: (send; S, W), Att: V⟩ ⟨clock(T)⟩ ⟨O : OutQu | Ev: (MB to O′ by wless); M⟩
  = ⟨O : C | Pr: (S, W), Att: advance(V)⟩ ⟨clock(T)⟩ ⟨O : OutQu | Ev: M⟩ (MB from O to O′ by wless)
  if [[clock]]_V = T

(WSEND2)
⟨O : C | Pr: (send; S, W), Att: V⟩ ⟨clock(T)⟩ ⟨O : OutQu | Ev: empty⟩
  ⟶ ⟨O : C | Pr: (S, W), Att: advance(V)⟩ ⟨clock(T)⟩ ⟨O : OutQu | Ev: empty⟩
  if [[clock]]_V = T

(COLLIDE)
(M1 from O1 to O by wless) (M2 from O2 to O by wless)
  = (error("collision") from null to O by wless)

(RECEIVE)
{⟨O : C | Pr: (receive; S, W), Att: V⟩ ⟨O : InQu | Ev: M⟩ (M′ from O′ to O by wless) [O′ wless O] ⟨clock(T)⟩ U}
  = {⟨O : C | Pr: (S, W), Att: advance(V)⟩ ⟨O : InQu | Ev: M; M′⟩ [O′ wless O] ⟨clock(T)⟩ U}
  if [[clock]]_V = T, otherwise

(REFRESH1)
refresh(T, (MB from O to O′ by wless) U) = refresh(T, U)

(REFRESH2)
refresh(T, ⟨O : C | Pr: (receive; S, W), Att: V⟩ U) = ⟨O : C | Pr: (S, W), Att: advance(V)⟩ refresh(T, U)
  if [[clock]]_V = T

(REFRESH3)
refresh(T, U) = U, otherwise

Fig. 8. Wireless communication
return statement in the callee, where caller and label are the implicit local parameters identifying the caller and the tag. A blocking reply statement is captured by the rule (REPLY) when the corresponding completion has arrived. Otherwise, the rule (NO REPLY) ensures that the local clock progresses. A similar rule ensures the progress of the local clock when an object is idle without any enabled process in the queue.

Unlabeled calls may have multiple destinations, given by a list O of object expressions. These are captured by rule (MULTICAST), where advance(V) is defined by V + (clock ↦ [[clock]]_V + 1). We let semicolon denote both concatenation and append, for sequences of statements as well as of messages. Notice that the caller O and tag-value K are added as implicit parameters. Broadcast, captured by the rule (BROADCAST), is restricted to all objects of a given interface, using the notation all : I. This restriction is needed to maintain strong typing. A broadcast message MB to all : I should arrive (in a single copy) at all I-objects in the system which are connected to O. The all : I expression is expanded by means of the equation (MULTIMSG3) on the total system, using a function all to collect all objects of interface I (or better) in a configuration:

all(O, I, U ⟨O′ : C | ...⟩) = O′; all(O, I, U)   if supports(C, I, U) and bestcon(O, O′, U) ≠ noNet
all(O, I, U) = empty   otherwise

Here, supports(C, I, U) checks whether class C implements interface I, or a subinterface of I, in the configuration U.
(LINK)
⟨O : C | Pr: (W, (link E NW E′); S), Att: V⟩
  ⟶ ⟨O : C | Pr: (W, S), Att: V⟩ [component([[E]]_(V+W)) NW component([[E′]]_(V+W))]

(UNLINK)
⟨O : C | Pr: (W, (unlink E NW E′); S), Att: V⟩ [O′ NW O′′] ⟶ ⟨O : C | Pr: (W, S), Att: V⟩
  if component([[E]]_(V+W)) = O′ and component([[E′]]_(V+W)) = O′′

(IF-EL)
⟨O : C | Pr: (W, if E then S1 else S2 fi; S), Att: V⟩
  ⟶ if [[E]]_(V+W) then ⟨O : C | Pr: (W, S1; S), Att: V⟩ else ⟨O : C | Pr: (W, S2; S), Att: V⟩ fi

(WHILE)
⟨O : C | Pr: (W, while E do S1 od; S2)⟩
  ⟶ ⟨O : C | Pr: (W, (if E then (S1; while E do S1 od) else skip fi); S2)⟩

(NEW1)
{⟨O : C | Pr: (W, (X := new C′(E); S)), Att: V⟩ U}
  ⟶ {⟨O : C | Pr: (W, (X := O′); S), Att: V + (token ↦ [[token]]_V + 1)⟩ U
      ⟨O′ : C′ | Pr: init(C′, [[E]]_(V+W), U), PrQ: empty, Att: (clock …)⟩}

Fig. 9. Rules concerning basic statements and object creation. The function newId is used to create new object identities (composed of the parent object and a counter). The auxiliary functions inherit and init are used to define multiple inheritance and initial code. A clause of the form O′ := E represents a let expression.
5.3 Wireless Sending and Receiving

The two language primitives send and receive model synchronized wireless communication. Their operational semantics, given in Fig. 8, depends on the refresh function, used in the (CLOCK) rule, which erases wireless messages and receive statements as time passes. The equation (WSEND1) defines the semantics of sending: the first wireless message in the out-queue is sent, making the from-part of the message header explicit. When there is no message to send, the send statement is skipped by rule (WSEND2). Recall that wireless messages in a network (i.e., configuration) disappear when global time advances. This is captured by an equation (REFRESH1) on the refresh function used in the global clock rule. Two wireless messages with the same destination which occur in the configuration at the same time cause a collision and destroy each other. We model this by an equation (COLLIDE), which results in an error message. The equation detects collisions of two or more messages. The receiving itself is modeled by an equation (RECEIVE). Here, the otherwise-clause implies that the equation should have lower priority than
(COLLIDE). Therefore, the left-hand side considers the whole system state and includes all possible matches of the (COLLIDE) equation. The rule also checks that there is a wless connection, since it may have been disconnected after the message was sent. By advancing the local clock, we ensure that two wireless messages cannot be received at the same time. Notice that the condition on the clocks implies that the receiving of a message happens at the same time as its sending, thereby modeling synchronous transmission. Recall that refresh applies whenever the global clock is advanced, which should also happen when the local time of receiving objects is equal to the global time. We therefore add an equation for this case:
clockval(⟨O : C | Pr: (receive; S, W), Att: V⟩) = [[clock]]_V + 1

and otherwise use the previous equation for clockval.

5.4 Simulation and Search

The operational semantics outlined above is executable on the RL platform Maude [5], and thus forms the basis for an analysis tool for the modeling language of Sect. 3. We illustrate this by showing how to perform some simple simulations of the wireless sensor network example of Sect. 4 in Maude.

The model of Sect. 4 focuses on the behavior of the components. A special class System is used to set up the system model by creating components and establishing the links between components. The initial System object can later be used to modify these links, thus changing the topology of the network. Interesting behavior of the communication medium may be investigated by simulating the system given in Fig. 4; e.g., can message overtaking occur in this net? By message overtaking, we mean that a message arrives at the user before an older one from the same sender. This may happen if an additional link from the sensor s1 to the sink in the network of Fig. 3 is introduced at runtime and the interval between two measurements is sufficiently short. To exhibit this behavior, the following code is added to the run method of the System class:

await (clock > int(25)); connect s1 wless sink;

Now the link between s1 and the sink is added when the local clock has reached 25 time units. Simulating the system will then show that a later message may overtake an earlier message, because the earlier message is waiting to be forwarded at sensor s2, while the later message is sent directly to the sink via the newly established link.

Maude also allows searching in a breadth-first manner through all possible executions from a given initial state. The state space of concurrent and distributed systems is huge, usually growing exponentially in the number of components. In order to make searching feasible, abstractions that reduce the state space are needed, preferably by eliminating components. For example, objects of class TempSensor may be replaced by equations for forwarding messages, moving these from in-queues to out-queues while updating the "step" and the "hops to live" attributes. This way, simple searches about communication patterns may be performed while abstracting from the internal functionality of the sensors.
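Concretely, such a breadth-first search can be issued as a Maude command. The following is only a sketch: the initial-state constant init and the predicate overtaken, which would test a configuration for an out-of-order arrival, are hypothetical names rather than part of the actual model signature.

search [1] init =>* C:Configuration such that overtaken(C) .

The bound [1] stops the search at the first state satisfying the predicate; omitting it enumerates all reachable states that do.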
6 Related and Future Work

Our approach is based on modelling with active objects. Active objects have been used to model mobile ad-hoc networks, which are similar to our biomedical sensor networks, in [7]. However, in contrast to our work, cross-layer design is not considered, because no means for reasoning about the network are provided.

Formal automata models have been used to analyse protocols and channels. The properties of communication media are usually modelled as automata, too. For example, Nancy Lynch models communication media by processes in [14]. A lossy channel is modeled by a process that randomly drops messages. In contrast to these approaches, which apply ad-hoc techniques to model various kinds of links and networks, our approach fully integrates into the modelling language a set of primitives to describe dynamically evolving network topologies.

TinyOS [6] is a popular operating system for wireless sensor nodes. The associated programming language nesC [8] takes an approach similar to ours: programs in nesC are structured in components. However, the number of components in nesC is statically fixed, and each component resembles a single Creol object. In contrast, our components may be created dynamically and contain a number of concurrent objects. In nesC, tasks correspond to our processes and are cooperatively scheduled, because sensor nodes usually do not permit dynamic scheduling. In contrast, our approach abstracts from particular scheduling schemes; in fact, our models could be refined with application-specific schedulers (see [21]). This may be a starting point for a development technique for applications which target TinyOS. We are currently investigating the relationship between our models and nesC programs in more detail.

For the analysis of networks, the current state of the art focuses on discrete event simulation software, such as OMNeT++ [25], that defines accurate models of wireless communication networks and channels. These simulators target the quantitative aspects of the model, such as throughput figures, whereas we are mainly concerned with functional aspects, e.g., the correctness of the deployed protocol and sensor functionality. Verisim [2] is a simulator similar to OMNeT++, which is used to validate functional properties of wireless networks. Verisim allows models from discrete event simulations to be used directly, and integrates well with established design methods. Monte Carlo simulation is used to record traces of events, which may be queried using a special language. In contrast, our approach is based on a simple, high-level modeling language with a formal semantics. Furthermore, the integration with Maude makes it possible to customize the simulation strategy, as well as to apply search techniques to the models.

Ölveczky and Thorvaldsen [18] have shown how Real-Time Maude [17] can be applied to model and analyse advanced wireless sensor network algorithms, using, e.g., Monte Carlo simulations for performance evaluation for networks with up to 800 nodes. Rodríguez [20] has similarly used Real-Time Maude to analyse flooding in WSN protocols. These papers focus on the modeling and analysis of protocol algorithms. Our work complements this approach by emphasising sensor functionality and behavior, as well as heterogeneous media. However, we intend to investigate how their techniques for simulation may apply in our setting.
Compared to earlier work on Creol [11, 12], the main contribution of this paper is the extension to heterogeneous networks and the introduction of language abstractions
suitable for a unified modeling of network components and different kinds of networks. In particular, the extension consists of the notion of network components with tightly connected objects sharing in- and out-queues, the specification of different and dynamic network architectures, including wireless networks and radio programming, multi- and broadcast, as well as the extension to a timed semantics. The proposed primitives are useful to establish connections at the network level but may also be exploited at the application level, for instance in service-oriented architectures [4].

Since our approach allows the radio level to be programmed inside the modeling language, we may experiment with different radio solutions. For instance, the active radio object used in the example of Sect. 4 may be replaced by a passive radio model. This can be done by letting the sensor class inherit a passive radio class, and letting the sensor object control the radio sender and receiver. Furthermore, our approach may be adjusted to allow more realistic models of wireless communication by considering factors like battery capacity, power consumption of sending and receiving, signal strength, and location. Stochastic modeling, however, is less trivial, but might be addressed using the probabilistic Maude tool [13] as a basis for the operational semantics.
7 Concluding Remarks

The main contribution of this paper is a modeling framework for heterogeneous networks. The framework allows the unified modeling of network components in different kinds of networks, as well as network changes. In particular, we consider wireless networks and radio communication, as well as loosely and tightly connected wired networks. Our approach extends the object-oriented paradigm by suggesting novel language abstractions related to heterogeneous networks, using Creol as an underlying language for concurrent and distributed objects. The extended language may be used for high-level application programming (without knowledge of the particular network available) as well as for network-aware programming such as radio controllers.

Our framework is based on formal methods and may serve as a basis for reasoning about system properties and semantical analysis. The formal semantics of the language is presented through a high-level operational semantics, defined in rewriting logic. The operational semantics is executable, allowing simulation and formal analysis by means of the Maude tool. The language is demonstrated through an example of a wireless sensor network, and some initial simulations and analyses have been performed.

The value of a formal framework, as a basis for an intuitive understanding of a language and for practical modeling and reasoning, depends crucially on the simplicity of the semantics. The presented operational semantics consists of 32 rules or equations, apart from auxiliary function definitions. This covers the semantics of basic statements and object-oriented issues such as object creation, inheritance, late binding, and (asynchronous) method calls and replies, as well as extensions for heterogeneous networks, including broad- and multicast, timing (with global and local clocks), programming and re-programming of network links, and primitives for wireless receiving and sending. The semantical simplicity may also be taken as an argument for the appropriateness of the proposed abstractions and their integration within the object-oriented paradigm.
The long-term goal of this work is to adapt the object-oriented paradigm to the setting of modern distributed systems by exploring suitable language abstractions and constructs that at the same time support simplicity both in reasoning and in semantics. The present paper may be seen as a first step in the direction of high-level, object-oriented, and formal modeling of heterogeneous systems where properties of the different networks are directly modeled.

Acknowledgments. We are grateful for comments by Wolfgang Leister and Xuedong Liang on sensor network modeling and analysis.
References

1. Alur, R., Henzinger, T.A.: Logics and models of real time: A survey. In: Huizing, C., de Bakker, J.W., Rozenberg, G., de Roever, W.-P. (eds.) REX 1991. LNCS, vol. 600, pp. 74–106. Springer, Heidelberg (1992)
2. Bhargavan, K., Gunter, C.A., Kim, M., Lee, I., Sokolsky, O., Viswanathan, M.: Verisim: Formal analysis of network simulations. IEEE Transactions on Software Engineering 28(2), 129–145 (2002)
3. Booch, G., Rumbaugh, J., Jacobson, I.: The Unified Modeling Language User Guide. Addison-Wesley, Reading (1999)
4. Clarke, D., Johnsen, E.B., Owe, O.: Concurrent Objects à la Carte. In: Correctness, Concurrency, and Components: Festschrift for Willem-Paul de Roever. LNCS. Springer, Heidelberg (to appear, 2008)
5. Clavel, M., Durán, F., Eker, S., Lincoln, P., Martí-Oliet, N., Meseguer, J., Quesada, J.F.: Maude: Specification and programming in rewriting logic. Theoretical Computer Science 285, 187–243 (2002)
6. Culler, D.E., Hill, J.L., Buonadonna, P., Szewczyk, R., Woo, A.: A network-centric approach to embedded software for tiny devices. In: Henzinger, T.A., Kirsch, C.M. (eds.) EMSOFT 2001. LNCS, vol. 2211, pp. 114–130. Springer, Heidelberg (2001)
7. Dedecker, J., Belle, W.V.: Actors for mobile ad-hoc networks. In: Yang, L.T., Guo, M., Gao, G.R., Jha, N.K. (eds.) EUC 2004. LNCS, vol. 3207, pp. 482–494. Springer, Heidelberg (2004)
8. Gay, D., Levis, P., von Behren, J.R., Welsh, M., Brewer, E.A., Culler, D.E.: The nesC language: A holistic approach to networked embedded systems. In: Proc. Conf. on Programming Language Design and Implementation (PLDI 2003), pp. 1–11. ACM, New York (2003)
9. Goguen, J.A., Winkler, T., Meseguer, J., Futatsugi, K., Jouannaud, J.-P.: Introducing OBJ. In: Goguen, J.A., Malcolm, G. (eds.) Software Engineering with OBJ: Algebraic Specification in Action, Advances in Formal Methods, ch. 1, pp. 3–167. Kluwer Academic Publishers, Dordrecht (2000)
10. Goldsmith, A.J., Wicker, S.B.: Design challenges for energy-constrained ad hoc wireless networks. IEEE Wireless Communications 9(4), 8–27 (2002)
11. Johnsen, E.B., Owe, O.: An asynchronous communication model for distributed concurrent objects. Software and Systems Modeling 6(1), 35–58 (2007)
12. Johnsen, E.B., Owe, O., Yu, I.C.: Creol: A type-safe object-oriented model for distributed concurrent systems. Theoretical Computer Science 365(1–2), 23–66 (2006)
13. Kumar, N., Sen, K., Meseguer, J., Agha, G.: A rewriting based model for probabilistic distributed object systems. In: Najm, E., Nestmann, U., Stevens, P. (eds.) FMOODS 2003. LNCS, vol. 2884, pp. 32–46. Springer, Heidelberg (2003)
14. Lynch, N.A.: Distributed Algorithms. The Morgan Kaufmann Series in Data Management Systems. Morgan Kaufmann Publishers, San Francisco (1996)
15. Meseguer, J.: Conditional rewriting logic as a unified model of concurrency. Theoretical Computer Science 96, 73–155 (1992)
16. Meseguer, J., Rosu, G.: Rewriting logic semantics: From language specifications to formal analysis tools. In: Basin, D., Rusinowitch, M. (eds.) IJCAR 2004. LNCS, vol. 3097, pp. 1–44. Springer, Heidelberg (2004)
17. Ölveczky, P.C., Meseguer, J.: Specification of real-time and hybrid systems in rewriting logic. Theoretical Computer Science 285(2), 359–405 (2002)
18. Ölveczky, P.C., Thorvaldsen, S.: Formal modeling and analysis of the OGDC wireless sensor network algorithm in Real-Time Maude. Theoretical Computer Science (to appear, 2008)
19. Raisinghani, V.T., Iyer, S.: Cross-layer design optimizations in wireless protocol stacks. Computer Communications 27(8), 720–724 (2004)
20. Rodríguez, D.E.: On modelling sensor networks in Maude. In: Denker, G., Talcott, C. (eds.) Proc. 6th Intl. Workshop on Rewriting Logic and its Applications (WRLA 2006). Electronic Notes in Theoretical Computer Science, vol. 176, pp. 199–213. Elsevier, Amsterdam (2007)
21. Schlatte, R., Aichernig, B., de Boer, F., Griesmayer, A., Johnsen, E.B.: Testing concurrent objects with application-specific schedulers. In: Fitzgerald, J.S., Haxthausen, A.E., Yenigun, H. (eds.) ICTAC 2008. LNCS, vol. 5160, pp. 319–333. Springer, Heidelberg (2008)
22. Smith, G.: The Object-Z Specification Language. Advances in Formal Methods. Kluwer Academic Publishers, Dordrecht (2000)
23. Srivastava, V., Motani, M.: Cross-layer design: A survey and the road ahead. IEEE Communications Magazine 43(12), 112–119 (2005)
24. Mellor, S.J., Scott, K., Uhl, A., Weise, D.: Model-driven architecture. In: Bruel, J.-M., Bellahsène, Z. (eds.) OOIS 2002. LNCS, vol. 2426, pp. 233–239. Springer, Heidelberg (2002)
25. Varga, A.: OMNeT++. IEEE Network Interactive 16(4) (July 2002)
26. Zimmermann, H.: OSI reference model—the ISO model of architecture for open systems interconnection. IEEE Transactions on Communications 28(4), 425–432 (1980)
Coordinating Object Oriented Components Using Data-Flow Networks

Mohammad Mahdi Jaghoori

CWI, Amsterdam, The Netherlands
[email protected]
Abstract. We propose a framework for component-based modeling of distributed systems. It provides separation of concerns between computation (in object oriented components), coordination (via connectors) and dynamic reconfiguration (by the network manager). This framework builds upon the object oriented modeling language Creol for modeling the components, and uses the ideas of Reo for exogenous coordination using data-flow networks.
1 Introduction
The Internet and systems distributed over it, such as service-oriented software, are becoming increasingly popular. This brings about the need for modeling and analysis of these systems before implementation. The growing size of such systems favors the use of high-level languages supporting component-based design techniques. Internet-based software usually consists of loosely coupled components, possibly provided by different parties, interacting with each other. Component-based software development has been proposed by several authors as a solution to the increasing complexity of software development. Components are assumed to be individual and independent units of functionality and deployment; thus, to turn them into an application, a mechanism for component composition is needed.

In this paper, we propose a framework for component-based modeling based on Creol [10]. The basic concept of standard Creol is to provide a formal object-oriented solution for modeling distributed systems. Creol objects have their own processors and communicate by asynchronous method calls. Although Creol is suitable for distributing objects over different processors, it does not in itself separate the coordination issues from computation. Nonetheless, we can translate our component models to standard Creol, allowing us to use all the techniques developed for Creol. Furthermore, we get the formal semantics of the model for free.

Our work integrates data-flow network modeling into an object-oriented modeling language, namely Creol, by providing a syntax for modeling components consisting of objects. For modeling a distributed system in this framework, three different perspectives should be considered. First of all, one
This work has been funded by the European IST-33826 STREP project CREDO on Modeling and Analysis of Evolutionary Structures for Distributed Services.
should define the components in the system. The inside of a component is modeled as a number of (standard Creol) objects communicating by message passing. Every component implements some facades, which in turn declare (disjoint) sets of ports. The objects inside the component may read, write or wait for a signal on the ports. The computation inside a component does not depend on how ports will be connected, hence allowing the modeling of reusable off-the-shelf components.

The second perspective focuses on the exogenous coordination of components and their communication. At this level, as proposed in [15], a software component is considered a static abstraction (black box) with plugs (called ports here). We use connectors to connect components. A connector has a fixed number of connector-ends, which can be plugged into component ports. In other words, component ports are to be bound to connector-ends. Connectors are independent of any specific components. We have developed a library of mobile connectors for some basic coordination patterns. A connector is mobile in the sense that its ends can move during execution, i.e., connect different components at different times. Mobility, together with dynamic creation of connectors (and components), provides the means for a dynamically reconfigurable network, in a similar way as modeled in π-calculus [14].

The third perspective of modeling addresses reconfiguration. Reconfiguration policies are defined in network managers. In addition to ports, a component facade specifies the events it may raise, e.g., a request/announcement for a specific service. Events are handled by the network manager. As a result of a sequence of events (possibly from different components), the network manager may reconfigure the network (by creating new connectors and/or changing port bindings). For a given system, one or more network managers can be developed, modeling different reconfiguration policies. This shows the level of decoupling in the component-based framework.

1.1 Related Work
Object-based modeling languages built on method calls [1,10,20] are more suitable for modeling and analysis of the internals of components than of the communication between independently developed components. On the other hand, languages like [2,5,8] consider components as black boxes and focus only on the interactions between them. In our model, these aspects can be modeled and analyzed separately, as well as integrated into a Creol model and analyzed together as a whole.

The data-flow networks in our model are inspired by Reo [2]. We can compare our work with the implementations of Reo mobile channels (MoCha) in Java [17] and π-calculus [18]. MoCha, like our framework, allows for modeling component behavior as a separate issue from coordination. The main difference is that we also propose a syntax for modeling components. In addition, we allow for modeling dynamic reconfigurations in the network.

We can also compare our work with JCSP [21,22], as both bring channel-based communication to the world of object-oriented languages. The difference is in the
fact that JCSP processes are connected by, and synchronize upon, a small set of primitives, such as message-passing channels and multiway events. By contrast, we allow the construction of complex connectors (independently of the components) which provide exogenous coordination. Another difference from the two works above is that the underlying concurrency model is different, i.e., multi-threaded Java vs. concurrent objects in Creol. One may also use Creol features like dynamic upgrades [23] or its executable semantics [10].

Among other component-based modeling frameworks are BIP [4] and Ptolemy II [6]. BIP provides an algebraic formalization of connectors, wherein atomic components are modeled as a set of transitions. Ptolemy II, implemented in Java, is aimed at modeling heterogeneous components. In Ptolemy, the components communicate by contacting the receiver object of one another, whereas in our model components are exogenously coordinated. Furthermore, we use an operational formalization of components and connectors using the object-oriented language Creol.

1.2 Paper Structure
In the next section, the basics of Creol and Reo are described. Then we explain how Creol can be lifted up to the level of component-based modeling in Section 3. Section 3.1 covers the syntax proposed for modeling the components. Section 3.2 describes how coordination can be modeled using connectors. A case study demonstrates the typical use of the framework in Section 4. Section 5 concludes the paper.
2 Preliminaries

2.1 Creol
The basic concept of the Creol modeling language is to provide a formal object-oriented solution for modeling distributed systems. The (simplified) syntax of Creol is given in Fig. 1. A complete presentation of the formal semantics of Creol (given in rewrite logic) in [10] is beyond the scope of this paper; here we give an overview of some Creol characteristics.

Creol objects have dedicated processors and communicate by asynchronous method calls. The caller can choose to wait for an answer, thus simulating synchronous method calls. Creol objects are typed by interfaces, whereas classes can implement as many interfaces as necessary. The notion of co-interfaces can be used to restrict who can call the operations provided in interfaces. As a result, the callee can communicate with the caller in a type-safe way.

Fig. 2 shows a very simple Creol model. The Simple interface defines a callMe operation that can be called only by instances of type Simple, while the response operation does not require any co-interface. The run method in a class defines its active behavior; thus the class Easy starts by calling its own callMe operation.
IF ::= interface N {(Par)}? {inherits Inh}? begin {with N Msig+}? end
Inh ::= {N {(E)}?}+,
Par ::= {{v}+, : N}+,
Msig ::= op N {({in Par}? {out Par}?)}?
CL ::= class N {(Par)}? {implements Inh}? {inherits Inh}? begin Vdcl? {{with N}? Mtd}* end
Vdcl ::= var {{v}+, : N {= e}?}+,
Mtd ::= {Msig == {Vdcl;}? S}+
g ::= b | t? | ¬g | g ∧ g
p ::= x.m | m
S ::= s | s; S
s ::= (S) | V := E | skip | v := new N(E) | !p(E) | t!p(E) | t?(V) | p(E; V)
    | if b then S else S end | await g | await t?(V) | await p(E; V) | release

Fig. 1. BNF grammar for Creol. Curly brackets are used as meta-parentheses, superscript ? for optional parts, superscript * for repetition zero or more times, whereas {...}+, denotes repetition one or more times with , as delimiter. Identifiers N denote interface, class, type, or method names. Capitalized terms such as E, V, and S denote lists of the syntactic categories of the corresponding lower-case terms [10,11].
interface Simple
begin
  with Simple op callMe
  with Any op response
end

class Easy implements Simple
begin
  op run == await this.callMe()
  with Simple op callMe == !caller.response()
  with Any op response == skip
end

Fig. 2. A simple Creol model
It ‘awaits’ until the requested operation is accomplished (a synchronous method call with releasing the processor). The this keyword can be used to refer to the same object. The caller keyword refers to the object calling the current method, and it is typed by the co-interface used. The callMe method calls the caller back asynchronously (denoted by the ! sign).

Each object in Creol shows reactive behavior upon receiving messages. When an object receives a message, a new process is created inside the object for responding to the message. The processes inside an object are interleaved. Any of the enabled processes can be scheduled to run (nondeterministically) when the current process finishes. Alternatively, the currently executing process may, at its own discretion, release the processor (at some predefined processor release point, using the await keyword), allowing other enabled processes to start execution. Releasing the processor can be unconditional (via the release command) or guarded by boolean expressions. These guards can also test whether an asynchronous call has been accomplished. This is handled implicitly by the semantics with a
completion notification. An example of a release point is ‘await this.callMe()’ in Fig. 2. In general, the caller may expect a return value, thus simulating synchronous method calls. Notice that waiting for a return value without releasing the processor blocks the object, preventing other processes in the object from using the processor.

To make a system run, one should provide the initialization script. In the Creol convention, one can have a class for the initialization. This is similar to the class containing the main method in Java.

Creol is backed by a formal operational semantics, and its strong typing allows for dynamic class upgrades [23]. Since the Creol semantics is given in rewrite logic [13], Creol specifications can be readily executed and analyzed on the Maude [7] platform. Maude is a rewrite engine that can perform analyses like simulation, model checking, etc., on transition systems specified using rewrite logic.
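To illustrate these mechanisms, the following schematic fragment (our own sketch, not taken from [10]; the label declaration and the callee o are assumed) contrasts an asynchronous call and its release point:

var t : Label;
t!o.m(e);      // asynchronous call: the caller continues immediately
// ... other statements may execute here ...
await t?(x)    // release point: suspend this process until the reply arrives

By contrast, o.m(e; x) performs the call synchronously, and t?(x) without await waits for the reply while keeping, and thus blocking, the processor.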
2.2 Reo: A Coordination Language
Reo is a model for building component connectors in a compositional manner. It allows for modeling the behavior of such connectors, formally reasoning about them, and once proven correct, automatically generating the so-called glue code from the specification. Reo’s notion of components and connectors is depicted in Figure 3, where component instances are represented as boxes, channels as straight lines, and connectors are delineated by dashed lines. Each connector in Reo is, in turn, constructed compositionally out of simpler connectors, which are ultimately composed out of primitive channels.
[Diagrams omitted: boxes C1–C6 connected by channels]
(a) a 3-way connector; (b) a 6-way connector; (c) two 3-way connectors and a 6-way connector

Fig. 3. Components and Connectors
Reo is a compositional approach to defining component connectors. Reo connectors (also called circuits) are constructed in the same spirit as logic and electronics circuits: take basic elements (e.g., wires, diodes and transistors) and connect them. Basic connectors in Reo are channels. Each channel has exactly two ends, which can be a sink end or a source end. A sink end is where data flows out of a channel, and a source end is where data flows into a channel. It is possible that the ends of a channel are both sinks or both sources. A channel
must support a certain set of primitive operations, such as I/O, on its ends; beyond that, Reo places no restriction on the behavior of a channel. This allows an open-ended set of different channel types to be used simultaneously together in Reo, each with its own policy for synchronization, buffering, ordering, computation, data retention/loss, etc. Channels are connected to make a circuit. Connecting channels means putting channel ends together in a node; so, a node is a set of channel ends. A node in Reo has a certain semantics: for all the source channel ends on a node, a fork operation takes place, copying the outgoing data to all of these channel ends (replicator); for all the sink channel ends on a node, a merge operation takes place, making a nondeterministic choice between incoming data (merger).
3 Mobile Connectors Framework
In this section, we explain how components and mobile connectors can be modeled in Creol. Fig. 4(a) shows the class diagram of the general framework. We have extended the standard notation of class diagrams with a dotted arrow with a black head between two interfaces (shown as ovals), which represents the required co-interfaces. This arrow should not be confused with dashed arrows (with white heads) showing that a class implements an interface (cf. Fig. 8). As shown in Fig. 4(a), every component must implement the Component interface to have access to connector-ends. In the next subsection, we provide an abstract syntax for modeling a component hiding these details. A component modeled using these abstractions can be automatically translated into the model in Fig. 4(a). Then we explain the modeling of mobile connectors, which are the basic elements for constructing a reconfigurable network. Connectors can transfer data between components and/or synchronize their actions. Each connector provides a fixed number of connector-ends, to which component ports can be bound. Fig. 4(b) shows the object diagram of a connector
[Class diagram omitted: NetworkManager; Connector with create(;ce); ConnectorEnd with connect(;) and disconnect(;); Component with its ports]
(a) General class diagram; (b) Object diagram for a connector

Fig. 4. The general model of the framework
with three ends. We model each connector-end with an object, while the behavior of the connector is provided by another object. Connector-end objects are similar to the buttons of a coffee machine, where the coffee machine resembles the connector object. The output of the machine depends on the button pressed, i.e., the machine needs to distinguish between its input buttons to be able to produce the correct behavior. Similarly, since a connector needs to distinguish between its connector-ends, we have used one object per connector-end. This object scheme is especially necessary because of the mobility of the connector, i.e., the components attached to each connector may change dynamically. For a connector, however, the connector-end objects do not change during its whole lifetime. Therefore, the connector object only needs to establish the proper synchronization and data-flow paths among its fixed connector-end objects.

3.1 Component View
A component provides one or more facades. Facades encapsulate the internal behavior of a component, in the same way as interfaces encapsulate the internals of a class. A facade declares a set of ports and events. Fig. 5 shows the extensions to Creol syntax for defining facades and components. Furthermore, a class can be defined to be inside (the facade of) a component, to be able to raise events and access the ports of that component.

A port is essentially just a reference to a connector-end and is used for communication with other (anonymous) components. The actual binding of ports (to connector-end objects) is beyond the component's control. In addition to communicating with other components, components may raise events. Examples of events include a request/announcement for a specific service, reporting a time-out for a given request, or simply acknowledging that a request was accomplished successfully. These events guide possible reconfigurations of the context-aware network. Since components have no explicit reference to the network manager, an event can be raised by the abstract command raise event.

There are three types of ports. Objects inside a component may read from inports, write to outports and wait for synchronization on syncports. These actions are performed synchronously. Apart from these basic operations, ports can only appear as parameters of events. Whenever necessary, the component requests its ports to be bound by raising a proper event (at the so-called reconfiguration

Fcd ::= facade N {(Par)}? {inherits Inh}? begin {Pdcl}* {Edcl}* end
Pdcl ::= port {{v}+, : N}+,
Edcl ::= {sync event}? N {({in Par}? {out Par}?)}?
CL ::= class N {(Par)}? {inside N}? ...
Cmp ::= component N {implements Inh}? begin {N := new N(E)}+ end
S ::= ... | raise event N(E{; V}?)

Fig. 5. Extended syntax for modeling the component view. The type of a port variable in Pdcl can only be inport, outport or syncport. Triple dots are used to avoid repeating the parts similar to Fig. 1.
points). Ports cannot be used elsewhere; e.g., they cannot be assigned explicitly as normal variables in the code, nor sent around as message parameters. Events are by default asynchronous. However, some events, especially those expecting return values (e.g., at reconfiguration points), can be declared synchronous (using the keyword sync event).

Fig. 6 shows the facade of a component in a peer2peer system. This example is explained in detail in Section 4. The facade Peer inherits the ports and (synchronous) events defined in the ServerSide and ClientSide facades. In addition, Peer adds two asynchronous events. A peer component (providing a client and a server side) can use the ports and raise the events declared in these facades.

Having defined the facade and the internal interfaces of a component, one can implement the internal classes to provide the functionality of the component. A concrete component is obtained by instantiating these classes. Fig. 7 shows a Node component based on the classes in Appendix B implementing the Peer facade. Notice that the classes used in this component are defined to be inside a Peer. Section 4 elaborates more on this example. A class can have access to the ports defined in a facade only if it is defined to be inside that facade. Notice that a class can be inside only one facade. A class can raise an event only if it is inside a facade declaring that event. Section 3.4 will explain how facade and component definitions can be translated to normal Creol syntax adhering to the diagram in Fig. 4.
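To indicate how ports and events are used together, the following fragment is a hypothetical sketch (it is not the appendix code; the class name and the elided port I/O are ours) of a class inside the ClientSide facade raising the reconfiguration events:

class SimpleClient(store : Store, req : outport, ans : inport) inside ClientSide
begin
  with User op search(in key : Data out result : Data) ==
    var found : Boolean;
    // sync event at a reconfiguration point: the manager may rebind req/ans
    raise event openClient(key, req, ans; req, ans, found);
    if found then
      skip  // write key to req and take the result from ans (port I/O elided)
    else
      skip  // no provider for key was found in the network
    end;
    raise event closeClient(req, ans; req, ans)  // release the connection
end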
3.2 Coordination View
Coordination is modeled in a network consisting of connectors. In general, every connector should provide a create() method that can be called by the network manager (as the co-interface). This method should create the necessary connector-end objects and return the list of their references. The behavior of a connector is completely independent of the components and the rest of the system. This makes connectors suitable for reuse. A library of connectors has been developed, and Appendix A includes the Creol code for some of these connectors.

From the connector's point of view, a connector-end object represents the component port(s) attached to that end. However, more than one component port may be attached to a connector-end (i.e., have a reference to the object). This sharing is handled by the connector-end object by forwarding only one request to the connector object at a time.

Connectors may offer three types of connector-ends. Source and Sink connector-ends are used for writing to and reading from a connector, respectively. A Signal connector-end is used when only a signal is to be provided. For example, a Synchronizer connector (see Fig. 8) provides (rendezvous) synchronization between (the components attached to) its Signal connector-ends. The ConnectorEnd interface contains connect and disconnect operations to be implemented by all types of connector-ends. A component may use the connect operation to have exclusive access to a connector-end. If more than one component tries to connect to one connector-end, nondeterministically one of them will
facade ClientSide
begin
  port myReq : outport
  port myAns : inport
  sync event openClient(in k : Data, reqi : outport, ansi : inport
                        out reqo : outport, anso : inport, f : Boolean)
  sync event closeClient(in reqi : outport, ansi : inport
                         out reqo : outport, anso : inport)
end

facade ServerSide
begin
  port exReq : inport
  port exAns : outport
  sync event openServer(in reqi : inport, ansi : outport
                        out reqo : inport, anso : outport)
  sync event closeServer(in reqi : inport, ansi : outport
                         out reqo : inport, anso : outport)
end

facade Peer inherits ClientSide, ServerSide
begin
  register(in keyList : List[Data])  // async event
  update(in keyList : List[Data])    // async event
end

Fig. 6. The facade of a component in a peer2peer system
component Node1 implements Peer
begin
  store := new DataStore("1", "Data one")
  cl := new ClientImp(store, myReq, myAns)
  srv := new ServerImp(store, exReq, exAns)
  user := new Tester1(cl)
end

[Diagram omitted: the user, cl, srv and store objects inside the component, with ports mR/mA (myReq/myAns) and eR/eA (exReq/exAns)]

Fig. 7. A concrete component definition and its component-object diagram
[Class diagram omitted: the Connector interface (create(;ce)) is specialized by WConnector (write(d;)), RConnector (take(;d)) and SConnector (wait(;)); the ConnectorEnd interface (connect(), disconnect()) is specialized by Source (write(d;)), Sink (take(;d)) and Signal (wait(;)), implemented by the classes CSource, CSink and CSignal; Synchronizer and Merger are example connectors]

Fig. 8. Class diagram for connector interfaces
[Class diagram omitted: the Channel interface (create(;e1,e2)) specializes Connector (create(;ce)); WChannel, RChannel and SChannel specialize WConnector (write(d)), RConnector (take(;d)) and SConnector (wait()); DrainChannel and SyncChannel are example channels]

Fig. 9. Class diagram for channels with two ends
be selected (by the Creol scheduler). Other connect messages will be granted after the first one disconnects. Each connector-end sub-type introduces its specific operation. The write on a Source takes an input, a take on a Sink produces an output, while the wait operation on a Signal has no parameter. Accordingly, connectors should provide
(a subset of) the same operations. For instance, the WConnector interface provides a write operation to a Source connector-end. In Fig. 8, this is reflected with a co-interface (dotted arrow) relation. Therefore, a connector with source ends needs to implement the WConnector interface. Similarly, a connector with sink (resp. signal) ends needs to implement the RConnector (resp. SConnector) interface, which in turn provides a take (resp. wait) operation.

A channel is a special connector with exactly two ends. This fact is reflected in the model by overloading the create method (cf. Fig. 9). Since channels have exactly two ends, this method returns exactly two connector-ends instead of a list. Channel interfaces can be used for modeling the equivalents of Reo channels. Similar to connectors, the Channel interface has three sub-types for different operations.

To define connectors and channels, one needs to implement the proper interfaces shown in Figs. 8 and 9. A merger has two sources and one sink. It propagates the data on one of its two input sources onto its output sink. Therefore, it should provide write and read capabilities (by implementing WConnector and RConnector). A drain channel only has two sources and only needs to implement the WChannel interface.
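As an indication of how such a channel can be realized, here is a hypothetical two-source channel in the spirit of the drain channel (the class name, the counters and the pairing scheme are our own sketch; the actual library code is given in Appendix A):

class SyncDrain implements WChannel
begin
  var e1 : Source
  var e2 : Source
  var w1 : Int = 0  // writes accepted so far on e1
  var w2 : Int = 0  // writes accepted so far on e2
  with NetworkManager op create(out end1 : ConnectorEnd, end2 : ConnectorEnd) ==
    e1 := new CSource(this); e2 := new CSource(this);
    end1 := e1; end2 := e2
  with CSource op write(in d : Data) ==
    var my : Int;
    if caller = e1 then
      w1 := w1 + 1; my := w1;
      await (w2 >= my)   // the k-th write on e1 waits for k writes on e2
    else
      w2 := w2 + 1; my := w2;
      await (w1 >= my)
    end   // the data d itself is discarded, as in a drain
end

Each write completes only when a matching write has arrived on the opposite end, which gives the rendezvous behavior of a synchronous drain.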
3.3 Network Manager
The network manager is the entity responsible for reconfigurations in the network. It is context-aware in the sense that it performs reconfigurations based on the events raised by the components. A network manager must provide event handlers for all of the events defined in the facades of the components in the model. This can be implemented in different ways providing different strategies.
with NetworkManager op create(out ce : List[ConnectorEnd]) ==
  sig := new CSignal(this);
  snk := new CSink(this);
  src := new CSource(this);
  ce := nil |- src |- snk |- sig

Fig. 10. A typical create method
The list of event declarations in a facade is a contract between components and the network manager. This list, in fact, shows the event handlers that must be implemented in the network manager. A special event initPorts is implicitly sent by all components upon creation. This event informs the network manager of a new component in the system whose ports may need to be initialized. Note that a facade does not include the initPorts event, which must be handled by the network manager.
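The shape of such handlers can be sketched as follows; this fragment is hypothetical (the class name and the stubbed bookkeeping are ours; the paper's SimpleBroker in Appendix B is the actual implementation), with events written as the operations they translate to:

class NaiveBroker implements Broker
begin
  with Peer op register(in keyList : List[Data]) ==
    skip  // record that caller provides the keys in keyList
  with Peer op update(in keyList : List[Data]) ==
    skip  // replace the keys previously recorded for caller
  with Peer op openClient(in k : Data, reqi : outport, ansi : inport
                          out reqo : outport, anso : inport, f : Boolean) ==
    skip  // look up a provider for k, possibly create or reuse a connector,
          // return the (re)bound ends in reqo/anso, and report success in f
  // openServer, closeClient and closeServer would be handled analogously
end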
Unlike connectors and components, a network manager is designed specifically for a particular system. It creates the necessary connectors by calling their create methods (Fig. 11(a)). Each connector returns the list of its connector-ends to the network manager. A typical create method is shown in Fig. 10. The network manager, knowing the components and having the references to the connector-ends, binds component ports to connector-ends properly. Fig. 11(b) depicts a specific configuration of the system. Later on, as the system evolves, and after certain (complementary) events are raised (possibly by different components), the network configuration may change. This takes place by reassigning connector-end references to the ports of components. The network manager can change port bindings in response to events raised at reconfiguration points, i.e., events that have ports as parameters.
3.4 Translation to Creol
The model provided in the higher-level syntax (in Section 3.1) can be translated to normal Creol syntax such that the Creol interpreter can be used to run (simulate, analyze, etc.) the model. This also gives the formal semantics of our model in terms of Creol's operational semantics. The algorithm for translation is given below.

– ‘facade N(P)’ results in:
  • interface N(mgr: NetworkManager, P)
    ∗ each event is changed into an operation with co-interface ‘Any’
  • class N Boundary(mgr: NetworkManager, P) implements N(mgr, P)
    ∗ change ports to var declarations
    ∗ add op run == await mgr.initN(; allPorts)
    ∗ implement each event as forwarding the message to the network manager
– ‘component M(P) implements N’ results in:
  • class M Boundary(mgr: NetworkManager, P) inherits N Boundary(mgr, P)
    ∗ ‘V := new T(Args)’ generates ‘var V : T’
  • add a run method
    ∗ every ‘V := new T(Args)’ is added
– ‘class C(P) inside N’ results in:
  • class C(b: N Boundary, P)
  • change raise event to ‘await b.’ (for sync events) or ‘!b.’ (for async events)

In a nutshell, events are translated to operations and ports to connector-end references (inport to Sink, outport to Source, and syncport to Signal). Notice that this step is transparent to the modeler. A boundary class is automatically generated for each facade and each concrete component definition (cf. Fig. 7). For instance, Fig. 12 shows the boundary for the client side facade. A similar boundary class is created for any facade in the model. This boundary class initializes the ports defined in that facade. It also provides the definitions for operations corresponding to events.
[Diagrams omitted]
(a) Network initialization; (b) Running network configuration

Fig. 11. Network Manager. Components are viewed as black boxes here.
Fig. 12 shows the boundary class for Node1 (cf. Fig. 7). One instance of the boundary class will be created per instance of the component. A boundary object creates the objects as specified in the component definition (see the run method in Fig. 12, compared to Fig. 7). It also receives all the events raised (by the internal objects), and forwards them to the network manager. This ensures that
interface ClientSide(mgr : NetworkManager) inherits Component(mgr)
begin
  with Any op openClient(in k : Data, reqi : Source, ansi : Sink
                         out reqo : Source, anso : Sink, f : Boolean);
  with Any op closeClient(in reqi : Source, ansi : Sink
                          out reqo : Source, anso : Sink);
end

class ClientSideBoundary(mgr : NetworkManager) implements ClientSide(mgr)
begin
  var myReq : Source
  var myAns : Sink
  op run == await mgr.initClientSide(; myReq, myAns);
  with Any op openClient(in k : Data, reqi : Source, ansi : Sink
                         out reqo : Source, anso : Sink, f : Boolean) ==
    await mgr.openClient(k, reqi, ansi; reqo, anso, f)
  with Any op closeClient(in reqi : Source, ansi : Sink
                          out reqo : Source, anso : Sink) ==
    await mgr.closeClient(reqi, ansi; reqo, anso)
end

class Node1Boundary(mgr : NetworkManager) inherits PeerBoundary(mgr)
begin
  var store : Store
  var cl : Client
  var srv : Server
  var user : User
  op run ==
    store := new DataStore("1", "Data one", this);
    cl := new ClientImp(store, myReq, myAns, this);
    srv := new ServerImp(store, exReq, exAns, this);
    user := new Tester1(cl, this)
end

Fig. 12. Boundary classes for the ClientSide facade and a Node component
the ‘caller’ of all events from the same component remains unique. In other words, if each object (inside the component) were set up to send an event directly to the manager, the callers of the events would be different objects, whereas the manager should see them coming from one component.

The implementation of the classes inside a component must be changed such that a reference to the boundary object, say ‘b’, is provided as a parameter to the class definitions. The boundary object is, in turn, provided (as a parameter) with a reference to the network manager, to which it sends the raised events. Finally, raising an event should be replaced with a call to the boundary object; i.e., raise event will become ‘await b.’ for sync events, and ‘!b.’ for async events.
4 Case Study
Peer-to-peer networks are now a commonly used way of sharing data. In such networks, each node shares some data and in return can (search for and) get the data on other nodes. We model a hybrid p2p architecture (like that of Napster [19]) in which there is a central server (called the broker) that keeps track of (the keywords for) the data in every node. Each node, upon creation, registers its data with the broker. Later it may query the broker for some new data (using its keywords), and the broker connects it to the node that has the data.

interface Store
begin
  with Client op add(in key : Data, info : Data)
  with Server op find(in key : Data out info : Data)
end

interface Client(store : Store, req : outport, ans : inport)
begin
  with User op search(in key : Data out result : Data)
end

interface User(cl : Client)
begin
end

interface Server(store : Store, req : inport, ans : outport)
begin
end

Fig. 13. Interfaces of objects inside a Peer component
interface Broker inherits NetworkManager
begin
  with Peer async event register(in keyList : List[Data])
  with Peer async event update(in keyList : List[Data])
  with Peer sync event openServer(in reqi : inport, ansi : outport
                                  out reqo : inport, anso : outport)
  with Peer sync event closeServer(in reqi : inport, ansi : outport
                                   out reqo : inport, anso : outport)
  with Peer sync event openClient(in k : Data, reqi : outport, ansi : inport
                                  out reqo : outport, anso : inport, f : Boolean)
  with Peer sync event closeClient(in reqi : outport, ansi : inport
                                   out reqo : outport, anso : inport)
end
Fig. 14. The Broker interface
4.1 A Peer Component
The facade of a peer component, including a server and a client side, was defined in Fig. 6, Section 3.2. A component implementing Peer may raise events inherited from ClientSide or ServerSide, as well as ‘register’ and ‘update’ declared in Peer itself. In this model, openClient/closeClient and openServer/closeServer are reconfiguration points, because every time a node asks for some data (as a client) or services a request (as a server), it is possibly connected to different nodes, i.e., its port bindings may (or may not) be updated. Note that at reconfiguration points, ports are used as in/out parameters of the event.

We consider four interfaces for the internal objects of peer components (Fig. 13). The ‘User’ object represents the interface of the component to a user who can ask for new pieces of data; and, in turn, it drives the ‘Client’ object. The client object uses the two ports myReq and myAns from the ClientSide facade. As a client, a node writes its request on myReq and expects the result on myAns. The local store is updated upon acquiring the requested data, using the ‘add’ operation in the ‘Store’ interface.

Each node also shares its data; i.e., it can service the requests from other nodes. A component implementing the ServerSide facade reads a request (a ‘key’ to some data) from exReq and writes the data corresponding to the given key
[Diagrams omitted: nodes N1, N2 and N3 with ports eR/eA (exReq/exAns) and mR/mA (myReq/myAns), connected via connectors]
(a) Two nodes requesting data from the same node with the SimpleBroker
(b) Two nodes providing the requested data with the AlternateBroker

Fig. 15. Peer-to-peer network configurations
on exAns. A server, therefore, needs to query the local data store for existing information. The implementation of these classes is given in Appendix B.

The final step in defining the components is to instantiate the objects inside a component. The Client and Server objects are the same in all nodes. The Store objects are initialized with different data. Different implementations of the User interface can be used for testing the model. Fig. 7 shows the definition of a component Node1, which is based on a specific implementation of User in the class Tester1. Components Node2 and Node3 are defined similarly, using the Tester2 and Tester3 classes as User. The implementation of these classes is given in Appendix B.

4.2 Network Manager and System Setup
A network manager must provide event handlers for the events defined in the facades of the components (Peer in this example). Fig. 14 shows the Broker interface, which handles all the events that a Peer may raise. This interface can be implemented in different ways, providing different strategies: a simple broker connects the requesters to at most one provider (possibly using a shared connection, as shown in Fig. 15.a), while an alternate broker may connect a requester to all the nodes that can provide the requested data (Fig. 15.b). Appendix B includes the implementation of SimpleBroker. To make the whole system run, one should provide an initialization script. Following the Creol convention, one can use a class for initialization; the only things needed are to instantiate a network manager and some components.
class NetOne
begin
  op init ==
    var mgr: Broker;
    var n1: Peer;
    var n2: Peer;
    var n3: Peer;
    mgr := new SimpleBroker;
    n1 := new Node1(mgr);
    n2 := new Node2(mgr);
    n3 := new Node3(mgr)
end

Fig. 16. The system setup
Fig. 16 shows a possible system setup using the simple broker. For simplicity, one instance of each component type is created. After translating to normal Creol syntax (cf. Section 3.4) and generating the Maude code for the model (using the Creol compiler), one can perform different kinds of analyses in Maude. The simplest analysis is to run (simulate) the model. As defined by the tester classes in Appendix B.1, the data store in node one should obtain data '2', and the data store in node three should obtain data '1' and '2', in addition to their initial data. As explained before, one can run NetOne (see Fig. 16) to see how the simple broker implementation works; to do so, one can execute the command 'rew [100000] init main("NetOne", emp) .' in Maude. By examining the Maude output, we can see that with the simple broker, the nodes finally receive the requested data.
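Beyond simulation, Maude's search command can in principle explore the reachable states of the model. The following is only a sketch: the first line repeats the simulation command above, while the search pattern assumes that the Creol interpreter's state has sort Configuration, which depends on the interpreter version at hand.

  rew [100000] init main("NetOne", emp) .                     *** simulate one run
  search [1] init main("NetOne", emp) =>! C:Configuration .   *** find a terminal state

A such-that condition on C could then express that the data stores contain the expected keys, provided one knows the term structure the interpreter uses for object states.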
5 Conclusions and Future Work
In this paper, we presented a framework for component-based modeling of distributed systems with dynamically reconfigurable networks, using mobile connectors in Creol. Components communicate anonymously, i.e., they let the connectors decide exogenously how they communicate with the other components in the system. The task of network reconfiguration is left to the network manager. This separates the modeling (and analysis) of computation, coordination, and reconfiguration. Components and connectors are modeled independently and can be reused in other system models. We have implemented a library of connectors that provides a variety of basic coordination schemes and can be used in modeling different systems. Nevertheless, the framework is not restricted to the currently implemented set of connectors and can be extended by introducing new ones.
However, a network manager must be modeled specifically for each system. The network is context-aware, in the sense that reconfigurations depend on the events raised by components; different network managers can be designed to model different reconfiguration policies. In addition to automatic discovery of services (from a component's point of view), the framework allows modeling different ways of connecting the components. One can use all features of Creol in modeling the inside of a component, for example, dynamic class upgrades. Finally, the component-based model can be translated to standard Creol. Thus we obtain an executable high-level modeling language and can use all the techniques developed for analyzing Creol.

5.1 Future Work
The first extension to consider is adding a type system. We are working on protocol types as an extension of session types [9]. A protocol type should be attached to each facade, specifying abstractly how the ports in that facade will be used; the object manipulating the ports in that facade should be type-correct with respect to this protocol. This allows analyzing the network based on the protocol types of the participating components, without looking into the component implementations.

We are also working on allowing components to create ports dynamically when needed. For example, a service provider should preferably create new ports upon request, so that more requests can be serviced in parallel. Protocol types for each facade can be used to ensure type safety.

In this paper, we focused on implementing the components as pliant entities. More research remains to be done on the systematic implementation of network managers. One approach is to study a high-level way of specifying the network manager, for example by broadcast and late binding of anonymous messages, implemented for instance using Maude reflective programming techniques, so as to avoid keeping redundant information about all objects and components.

Another direction for future work is real-time modeling. Timing constraints, such as a time-out on performing a network operation, can be added to the model. Furthermore, the schedulability of real-time tasks on each component is being studied.
Acknowledgement I would like to thank Frank de Boer, Tom Chothia, the anonymous reviewers and others who provided ideas and comments on developing this framework.
References

1. Agha, G.: The structure and semantics of actor languages. In: Proc. of the REX Workshop, pp. 1–59 (1990)
2. Arbab, F.: Reo: A channel-based coordination model for component composition. Mathematical Structures in Computer Science 14, 329–366 (2004)
3. Basu, A., Bozga, M., Sifakis, J.: Modeling heterogeneous real-time components in BIP. In: Proc. Software Engineering and Formal Methods (SEFM 2006), pp. 3–12. IEEE Computer Society, Los Alamitos (2006)
4. Bliudze, S., Sifakis, J.: The algebra of connectors: structuring interaction in BIP. In: Proc. Embedded Software (EMSOFT 2007), pp. 11–20. ACM Press, New York (2007)
5. Bowles, J.K.F., Moschoyiannis, S.: Concurrent logic and automata combined: A semantics for components. ENTCS 175(2), 135–151 (2007)
6. Cervin, A., Eker, J.: The control server: A computational model for real-time control tasks. In: Proc. Real-Time Systems (ECRTS 2003), pp. 113–120. IEEE Computer Society Press, Los Alamitos (2003)
7. Clavel, M., Durán, F., Eker, S., Lincoln, P., Martí-Oliet, N., Meseguer, J., Quesada, J.F.: Maude: specification and programming in rewriting logic. Theoretical Computer Science 285(2), 187–243 (2002)
8. Cook, W.R., Misra, J.: Computation orchestration, a basis for wide-area computing. Software and Systems Modeling 6(1), 83–110 (2007)
9. Honda, K., Vasconcelos, V.T., Kubo, M.: Language primitives and type discipline for structured communication-based programming. In: Hankin, C. (ed.) ESOP 1998. LNCS, vol. 1381, pp. 122–138. Springer, Heidelberg (1998)
10. Johnsen, E.B., Owe, O.: An asynchronous communication model for distributed concurrent objects. Software and Systems Modeling 6(1), 35–58 (2007)
11. Johnsen, E.B., Owe, O., Yu, I.C.: Creol: A type-safe object-oriented model for distributed concurrent systems. Theoretical Computer Science 365(1-2), 23–66 (2006)
12. Lau, K.-K., Wang, Z.: A survey of software component models (second edition). Technical Report CSPP-38, School of Computer Science, The University of Manchester (2006)
13. Meseguer, J.: Conditional rewriting logic as a unified model of concurrency. Theoretical Computer Science 96(1), 73–155 (1992)
14. Milner, R.: Communicating and Mobile Systems: The π-Calculus. Cambridge University Press, Cambridge (1999)
15. Nierstrasz, O., Dami, L.: Component-oriented software technology. In: Object-oriented software composition, pp. 3–28. Prentice Hall, Englewood Cliffs (1995)
16. Unified modeling language: Superstructure, version 2.0, http://www.omg.org
17. Scholten, J.G., Arbab, F., de Boer, F.S., Bonsangue, M.M.: Mobile channels, implementation within and outside components. ENTCS 66(4) (2002)
18. Scholten, J.G., Arbab, F., de Boer, F.S., Bonsangue, M.M.: MoCha-pi, an exogenous coordination calculus based on mobile channels. In: Proc. of the 2005 ACM Symposium on Applied Computing, pp. 436–442. ACM Press, New York (2005)
19. Shirky, C.: Listening to Napster. In: Peer-to-Peer: Harnessing the Power of Disruptive Technologies. O'Reilly, Sebastopol (2001)
20. Sirjani, M., Movaghar, A., Shali, A., de Boer, F.S.: Modeling and verification of reactive systems using Rebeca. Fundamenta Informaticae 63(4), 385–410 (2004)
21. Welch, P.H., Brown, N., Moores, J., Chalmers, K., Sputh, B.: Integrating and extending JCSP. In: Proc. Communicating Process Architectures (CPA 2007), pp. 349–370. IOS Press, Amsterdam (2007)
22. Welch, P.H., Martin, J.M.R.: A CSP model for Java multithreading. In: Proc. Software Engineering for Parallel and Distributed Systems (PDSE), pp. 114–122 (2000)
23. Yu, I.C., Johnsen, E.B., Owe, O.: Type-safe runtime class upgrades in Creol. In: Gorrieri, R., Wehrheim, H. (eds.) FMOODS 2006. LNCS, vol. 4037, pp. 202–217. Springer, Heidelberg (2006)
A Connectors Implementation
In this appendix, the Creol implementation of the three connector-end types is given, together with the implementation details of some connectors. These implementations are based on the class diagram in Fig. 8 of the main paper. Each instance of a ConnectorEnd, upon creation, should be supplied with a reference to the connector object to which it belongs. This parameter is specialized in the sub-types of ConnectorEnd to the corresponding sub-types of Connector: instances of Source (similarly Sink or Signal) must be provided, at creation, with a reference to a WConnector (RConnector or SConnector, respectively). In other words, the sub-types of the ConnectorEnd interface specialize their parameters, which gives them access to the proper operations of their connectors.

In order to prevent the requests of different components from being mixed, a connector-end propagates only one request at a time to the connector. This is achieved by using a blocking synchronous call such as cnct.write(d;). After the current request finishes, one of the suspended requests is chosen nondeterministically (by the Creol scheduler, from among the enabled processes) and propagated to the connector.

If a component needs to perform a sequence of actions on a connector-end atomically, it can first request an uninterruptible connection to the connector-end (via the connect method). In this case, the variable cc holds the identity of (i.e., a reference to) the currently connected component. After this connection is made, the connector-end suspends any requests from other components until the connected component explicitly disconnects (via the disconnect method). When no component is connected, cc is null, and requests from any component are accepted. Note that the implementation of the disconnect operation allows only the currently connected component to perform it.

The implementation of Sink and Signal is essentially the same as that of Source. Fig. 17 depicts how these connector-ends are implemented. The main difference is that the call to write is replaced with take and wait for sink and signal, respectively. Other details, for example the connect and disconnect operations (not repeated in Fig. 17), are identical in all three types of connector-ends; a small usage sketch is given after Fig. 17.

Sync Channel. The synchronous channel is the simplest connector. It has a source and a sink connector-end, and allows any data written to its source to be taken from its sink. The actions on its two ends need to be synchronized. Fig. 18 shows the implementation of the synchronous channel. As shown in the figure, the channel implements the two forms of the create method, i.e., the one for general connectors and the one specific to channels.
class CSource(cnct: WConnector) implements Source(cnct)
begin
  var cc: Component := null

  with Component
    op write(in d: Data) ==
      await (caller = cc || cc = null);
      cnct.write(d;)

  with Component
    op connect ==
      await cc = null;
      cc := caller

  with Component
    op disconnect ==
      if (caller = cc) then
        cc := null
      end
end

class CSink(cnct: RConnector) implements Sink(cnct)
begin
  var cc: Component := null
  with Component
    op take(out d: Data) ==
      await (caller = cc || cc = null);
      cnct.take(;d)
  // connect and disconnect are the same as in Source
end

class CSignal(cnct: SConnector) implements Signal(cnct)
begin
  var cc: Component := null
  with Component
    op wait ==
      await (caller = cc || cc = null);
      cnct.wait(;)
  // connect and disconnect are the same as in Source
end

Fig. 17. The implementation of connector-ends
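As a small usage sketch (not taken from the paper's code base; the connector-end p and the data items d1 and d2 are hypothetical names), a component holding a reference p to a Source end could perform two writes atomically as follows:

  p.connect(;);     // suspend until no other component is connected
  p.write(d1;);     // requests from other components cannot be interleaved
  p.write(d2;);
  p.disconnect(;)   // release the end; only the connected component may do this

Between connect and disconnect, the cc variable of the connector-end holds the caller's identity, so the awaits in write, take, and wait only let that component's requests through.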
class SyncChannel implements RChannel, WChannel
begin
  var end1: ConnectorEnd
  var end2: ConnectorEnd
  var dd: Data
  var ready: Boolean

  op init == ready := false

  with Sink
    op take(out d: Data) ==
      await ready;
      d := dd;
      ready := false   // synchronize with 'write'

  with Source
    op write(in d: Data) ==
      dd := d;
      ready := true;
      await ~ready     // wait for next take

  with NetworkManager
    op create(out e1: ConnectorEnd, e2: ConnectorEnd) ==
      end1 := new CSource(this);
      end2 := new CSink(this);
      e1 := end1;
      e2 := end2

  with NetworkManager
    op create(out ce: List[ConnectorEnd]) ==
      end1 := new CSource(this);
      end2 := new CSink(this);
      ce := nil |- end1 |- end2
end

Fig. 18. Synchronous channel
For returning the connector-ends as a list, it creates the connector-ends in the same way as the channel-standard create operation and returns them in a list. This implementation of the create operations is the same in all channels with one source and one sink connector-end. A variable of type Data (namely dd) is used to store the value written to the source end; using the type Data enables passing values of any type.
The Boolean variable ready is used for synchronizing the actions on the two ends. The write action sets this variable, indicating that some data is available; this makes it possible for a pending take action to proceed. After the take succeeds, it resets the ready flag, enabling the write to continue. Conversely, if a take action happens first, it cannot proceed until the next write action appears. Note that each connector-end propagates only one action at a time; therefore, the connector need not be concerned with situations where two write (or take) actions arrive consecutively on the same connector-end before the previous one has succeeded. Furthermore, for synchronization we have used an extra flag (ready) rather than a constraint on the value of dd. This allows passing any value (including 'null') as data along the channel.

Replicator. A replicator connector has one source and can have any number of sinks. It replicates the data written to its source on all its sink ends; the actions on all connector-ends need to be synchronized. It is interesting to note that by using a simple sync channel and letting many components hold a reference to its sink end, no replication takes place: each component attached to the sink end has a chance to synchronize with the source end and take the data, but only one component (from those sharing the reference) can participate in this procedure at a time. In other words, the take operations are serialized instead of being synchronized.

Fig. 19 shows the implementation of a replicator. The parameter n of the class gives the number of sink ends, so the connector has n+1 ends in total. A naive solution to the problem of synchronizing all the sinks is to let each sink set its corresponding flag and wait until all flags are set. With this approach, only the last sink can continue: as soon as it resets its own flag, the condition 'all flags set' no longer holds, and the other sinks remain suspended. Instead, every sink except the last one sets its flag to true and waits until it is reset, which is in fact done by the last sink. All calls to the take operation, except the last one, wait for the last call (the await on arrive[i] in Fig. 19). The last call knows that it is the last by checking whether all elements of the arrive flag are set. It then waits for the source end; after that, it releases all the sinks (by resetting all elements of arrive) as well as the source (by resetting the ready flag). Conversely, the source can proceed with the write operation only after the last sink arrives (i.e., all sinks have arrived). The last sink also makes a copy of the data (the assignment dr := dw), because after synchronization the source may overwrite the data (dw) as a result of the next write action, before the other sinks have had a chance to read the previous value (which is kept in dr and read by the final assignment d := dr). It is worth noting again that each connector-end propagates only one request at a time; thus, consecutive requests from the same end are not confused.

This connector has a parametric number of sink ends. Therefore, one needs to write a recursive function for creating the n sinks; note that while and for loops are not allowed in Creol. A similar function is defined in all connectors with a parametric number of connector-ends. This function is called synchronously, so that all the connector-ends are created in one step.
class SyncReplicator(n: Nat) implements WConnector, RConnector
begin
  var sinkEnd: List[ConnectorEnd] := nil
  var arrive: List[Boolean] := nil
  var src: ConnectorEnd := null
  var dw, dr: Data := null
  var ready: Boolean := false

  op createNSink(in m: Nat) ==
    var s: Sink;
    if (m > 0) then
      s := new CSink(this);
      sinkEnd := sinkEnd |- s;
      arrive := arrive |- false;
      createNSink(m-1;)
    end

  with Sink
    op take(out d: Data) ==
      var i: Nat;
      i := index(sinkEnd, caller);
      arrive[i] := true;
      if (arrive = true) then   // all flags set: this caller is the last sink
        await ready;
        ready := false;
        arrive := false;
        dr := dw                // copy the data before the next write
      else
        await arrive[i] = false // wait for the last sink
      end;
      d := dr

  with Source
    op write(in d: Data) ==
      dw := d;
      ready := true;
      await ~ready              // wait for next take

  with NetworkManager
    op create(out ce: List[ConnectorEnd]) ==
      src := new CSource(this);
      createNSink(n;);
      ce := src -| sinkEnd
end

Fig. 19. Synchronous Replicator
class NMerger(n: Nat) implements WConnector, RConnector
begin
  var srcList: List[Source] := nil
  var snk: Sink := null
  var ready: Boolean := false
  var written: Boolean := false
  var dd: Data := null

  op createNSource(in m: Nat) ==
    var s: Source;
    if (m > 0) then
      s := new CSource(this);
      srcList := srcList |- s;
      createNSource(m-1;)
    end

  with Sink
    op take(out d: Data) ==
      ready := true;
      await written;
      written := false;
      d := dd

  with Source
    op write(in d: Data) ==
      await ready;
      ready := false;
      dd := d;
      written := true

  with NetworkManager
    op create(out ce: List[ConnectorEnd]) ==
      createNSource(n;);
      snk := new CSink(this);
      ce := srcList |- snk
end

Fig. 20. General nondeterministic merger
Nondeterministic Merger. A nondeterministic merger can have any number of inputs (source ends), but has one output (sink end). Whenever a take appears on the sink, it should synchronize with exactly one of the sources, if there is some
data available on the source (i.e., a write action pending). If more than one source is ready, one of them is chosen nondeterministically; the data from the chosen source flows to the sink, and thereby both the write action on that source and the take on the sink succeed.

Creol supports a binary choice operator, which nondeterministically chooses one of its enabled operands and suspends (possibly releasing the processor) if none is enabled. The main problem is that, in the general case, the number of entities participating in the choice is not statically known; therefore, the choice operator cannot be used here. To solve the problem of a general nondeterministic choice among a dynamic number of entities, we make use of the nondeterministic choice of the Creol scheduler from among the enabled processes: each available writer becomes a suspended process, one of which succeeds in synchronizing with the reader.

Fig. 20 shows the implementation of this connector; n is the number of source ends. When a source tries to write, it is suspended until the ready flag becomes true. When there is a take action, the ready flag is set to true. This enables exactly one writer process, because immediately afterwards the chosen writer resets the ready flag back to false. The sink can continue when some data has been written, which is indicated by the written flag. If more than one writer is pending, the Creol scheduler selects one nondeterministically; if no writer is available, the sink waits until the first source provides some data. The current implementation of Creol uses a completely nondeterministic choice between the processes in an object. It is interesting to note that once Creol allows defining schedulers for objects, one could use a particular scheduler for the merger object to implement different selection policies, e.g., priorities.
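To see why the binary choice operator alone does not suffice, consider a hypothetical merger with exactly two sources, where each source writes to its own flag and buffer (written1/dd1 and written2/dd2 are invented names, and [] denotes the binary choice operator; this is a sketch, not part of the implemented library):

  with Sink
    op take(out d: Data) ==
      (await written1; d := dd1; written1 := false)  // alternative 1
      []                                             // nondeterministic choice
      (await written2; d := dd2; written2 := false)  // alternative 2

For n sources, one would need n such alternatives in the program text, which is impossible when n is only known at run time; hence the encoding of Fig. 20, which turns each pending writer into a suspended process and leaves the choice to the Creol scheduler.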
B Peer-to-Peer Implementation

B.1 Sample Components Implementing Peer
The implementation of a component includes, among other things, the implementation of its internal objects (i.e., classes implementing the interfaces of the internal objects). The interfaces of these objects are given in Fig. 13. The client provides the search operation for the User interface; it can be invoked with a given key. The openClient and closeClient events are reconfiguration points in the Client. After openClient, the client expects to be connected to (at least) one node that has the requested data. It then writes the 'key' for the required data on the 'req' port and expects the result on the 'ans' port. Finally, it updates the local store by adding the new (key, ans) pair. If no node has the data, the manager returns 'false' and the client does not continue with the session. From openClient until closeClient, the current process must be in a critical region, so as to disallow concurrent 'search' operations: suppose the first search is waiting for a server to read the 'key'; if another search could be interleaved, it
class ClientImp(store: Store, req: outport, ans: inport)
  inside Peer implements Client(store, req, ans)
begin
  var busy: Boolean
  op init == busy := false

  with User
    op search(in key: Data out result: Data) ==
      var found: Boolean;
      await ~busy;
      busy := true;
      raise event openClient(key, req, ans; req, ans, found);
      if (found) then
        await req.write(key;);
        await ans.take(; result);
        !store.add(key, result)
      end;
      raise event closeClient(req, ans; req, ans);
      busy := false
end

class ServerImp(store: Store, req: inport, ans: outport)
  inside Peer implements Server(store, req, ans)
begin
  var busy: Boolean
  op init == busy := false

  op run ==
    var key, result: Data;
    await ~busy;
    busy := true;
    raise event openServer(req, ans; req, ans);
    await req.take(; key);
    await store.find(key; result);
    await ans.write(result;);
    raise event closeServer(req, ans; req, ans);
    busy := false;
    !run()
end
class DataStore(key: Data, info: Data)
  inside Peer implements Store
begin
  var infoLst: List[Data]
  var infoKey: List[Data]

  op init ==
    infoLst := empty |- info;
    infoKey := empty |- key

  op run ==
    raise event register(infoKey)

  with Client
    op add(in key: Data, info: Data) ==
      infoLst := infoLst |- info;
      infoKey := infoKey |- key;
      raise event update(infoKey)

  with Server
    op find(in key: Data; out info: Data) ==
      info := nth(infoLst, index(infoKey, key))
end

class Tester1(cl: Client) inside Peer implements User(cl)
begin
  op run ==
    var reply: Data;
    await !cl.search("2"; reply)
end

class Tester2(cl: Client) inside Peer implements User(cl)
begin
end

class Tester3(cl: Client) inside Peer implements User(cl)
begin
  op run ==
    var reply: Data;
    !cl.search("0");
    !cl.search("1");
    !cl.search("2")
end
may reconfigure the network and change the port assignments, causing the first search to take its answer from a wrong connection.

After the openServer event is performed, a server is connected to a client. The server receives a key to some data on the 'req' port; it then finds the data associated with the key and writes the answer onto the 'ans' port. To find the data, the server queries the store object. After servicing a request, the server invokes its 'run' method again in order to service further requests. Since a server has a fixed number of ports, it can service one request at a time.

The store keeps the data residing in the current node. The implementation is provided with a (key, info) pair as the initial data in the node. It first registers (the keys to) its local data at the broker. Later, the client can add newly acquired data to the store (the 'add' operation), after which the store updates the broker's information. A server can look up data in the store (the 'find' operation).

The user represents the entity who invokes search on the client. Three different implementations of the User interface are provided in this example; by putting these implementations in different Peer components, one can make different components behave differently. The three implementations are used as test cases, and one can provide other implementations to test the model.

B.2 Broker Implementation
In a simple broker, (the client side of) the requesting node and (the server side of) the providing node are connected by synchronous channels: one channel connects the 'req' ports and another connects the 'ans' ports. Therefore, a client is connected to at most one server. Fig. 21 shows the implementation of a simple broker with this policy. When a node is added to the system, it requests initialization. For a server, in initServer, the simple broker creates two synchronous channels for the new node and assigns the server-side ports (exReq and exAns) accordingly (see Fig. 15.a). When a node raises an openClient event along with the desired keyword, the network manager finds the node that has the data (matching the given key) and assigns the client-side ports (myReq and myAns) of the requester to point to the channels of the node with the data.

Fig. 15.a shows a particular configuration of three nodes, where both N1 and N2 try to read some (possibly different) data from N3. The network manager is not shown in this figure, but as can be seen, each node has two pairs of in/out-ports, and a pair of synchronous channels is associated with the server-side ports. The dashed arrows show references to the connector-ends. The requesting node sends its keywords to the channel; they are taken by the node on the other end (the server object) via its exReq port, and the response then flows from the exAns port of the latter to the myAns port of the former. Note that even if more than one node requests data from the same node (as in Fig. 15.a), each node gets the correct data, due to the synchronous nature of the channels.
class SimpleBroker implements Broker
begin
  var nodeList: List[Peer]
  var dataList: List[List[Data]]
  var srcList: List[Source]
  var snkList: List[Sink]

  op init ==
    nodeList := nil;
    dataList := nil;
    srcList := nil;
    snkList := nil

  // find in dataList
  op findFirst(in key: Data; out ind: Nat) ==
    subFindFirst(1, key, dataList; ind)

  op subFindFirst(in k: Nat, key: Data, lst: List[List[Data]] out ind: Nat) ==
    if lst = nil then
      ind := 0
    else
      if has(head(lst), key) then
        ind := k
      else
        subFindFirst(k+1, key, tail(lst); ind)
      end
    end

  with Peer
    op initClient(out myReq: Source, myAns: Sink) == skip

  with Peer
    op initServer(out exAns: Source, exReq: Sink) ==
      var c1: Source;
      var k1: Sink;
      var temp: SyncChannel;
      temp := new SyncChannel;
      temp.create(; c1, exReq);
      temp := new SyncChannel;
      temp.create(; exAns, k1);

Fig. 21. A simple Broker implementation
      nodeList := nodeList |- caller;
      dataList := dataList |- nil;
      srcList := srcList |- c1;
      snkList := snkList |- k1

  with Peer
    op register(in lst: List[Data]) ==
      dataList := replace(dataList, index(nodeList, caller), lst)

  with Peer
    op update(in lst: List[Data]) ==
      dataList := replace(dataList, index(nodeList, caller), lst)

  with Peer
    op openServer(in reqi: Sink, ansi: Source
                  out reqo: Sink, anso: Source) ==
      reqo := reqi; anso := ansi

  with Peer
    op closeServer(in reqi: Sink, ansi: Source
                   out reqo: Sink, anso: Source) ==
      reqo := reqi; anso := ansi

  with Peer
    op openClient(in k: Data, reqi: Source, ansi: Sink
                  out reqo: Source, anso: Sink, f: Boolean) ==
      var i: Nat;
      findFirst(k; i);
      if i = 0 then
        f := false
      else
        f := true;
        reqo := nth(srcList, i);
        anso := nth(snkList, i)
      end

  with Peer
    op closeClient(in reqi: Source, ansi: Sink
                   out reqo: Source, anso: Sink) ==
      reqo := reqi; anso := ansi
end

Fig. 21. (continued)