Lecture Notes in Computer Science 2629
Edited by G. Goos, J. Hartmanis, and J. van Leeuwen
Berlin Heidelberg New York Hong Kong London Milan Paris Tokyo
Ali E. Abdallah, Peter Ryan, and Steve Schneider (Eds.)
Formal Aspects of Security
First International Conference, FASec 2002
London, UK, December 16-18, 2002
Revised Papers
Series Editors
Gerhard Goos, Karlsruhe University, Germany
Juris Hartmanis, Cornell University, NY, USA
Jan van Leeuwen, Utrecht University, The Netherlands

Volume Editors
Ali E. Abdallah
London South Bank University
School of Computing, Information Systems and Mathematics
Centre of Applied Formal Methods
Southwark Campus, 103 Borough Road, London, SE1 0AA, UK
E-mail: [email protected]

Peter Ryan
University of Newcastle upon Tyne
School of Computing Science
Newcastle upon Tyne, NE1 7RU, UK
E-mail: [email protected]

Steve Schneider
Royal Holloway, University of London
Department of Computer Science
Egham, Surrey, TW20 0EX, UK
E-mail: [email protected]

Cataloging-in-Publication Data applied for. A catalog record for this book is available from the Library of Congress. Bibliographic information published by Die Deutsche Bibliothek: Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data is available on the Internet at http://dnb.ddb.de.

CR Subject Classification (1998): E.3, D.4.6, C.2.0, D.2.4, K.4.4, K.6.5
ISSN 0302-9743
ISBN 3-540-20693-0 Springer-Verlag Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law.

Springer-Verlag is a part of Springer Science+Business Media (springeronline.com)
© Springer-Verlag Berlin Heidelberg 2003
Printed in Germany
Typesetting: Camera-ready by author, data conversion by Boller Mediendesign
Printed on acid-free paper
SPIN: 10975541 06/3142 5 4 3 2 1 0
This Proceedings Volume Is Dedicated to the Memory of Roger Needham
Roger was to have been one of the two keynote speakers at FASec. A few weeks before the event we heard that Roger was indisposed and would not be able to attend. A little later we learnt the sad news that Roger had in fact been diagnosed with terminal cancer and then, in March, that he had passed away. We have decided to dedicate these proceedings to his memory. Roger was very much a “Cambridge Man.” He took his first degree at Cambridge, followed by a Ph.D.; he went on to become Head of the Computer Laboratory in 1980, a post he held until 1995, and was made a professor in 1981. Roger was one of the outstanding pioneers of the field of computer science, being one of the driving forces behind the Cambridge Ring, the Cambridge Model Distributed System, and the UNIVERSE project. He was also responsible for a number of seminal contributions in the field of computer security. He introduced the idea of storing the cryptographic hashes of passwords rather than the passwords themselves. With Schroeder he co-invented the authentication protocols that bear their names. He was also a co-author, with Burrows and Abadi, of the seminal BAN logic, the first rigorous framework for the analysis of cryptographic security protocols. In 1997 Roger was lured away from the Computer Laboratory to set up Microsoft Research in Cambridge, which brought together many of the best researchers in computer science and information security. He was a Fellow of the Royal Society, the Royal Academy of Engineering and the British Computer Society. In 1998, Roger was awarded the IEE Faraday Medal. His insight, vision and humanity will be missed by all of us.
Preface
Formal Aspects of Security (FASec) was held at Royal Holloway, University of London, 18–20 December 2002. The occasion celebrated a Jubilee, namely the 25th anniversary of the establishment of BCS-FACS, the Formal Aspects of Computing Science specialist group of the British Computer Society. FASec is one of a series of events organized by BCS-FACS to highlight the use of formal methods, emphasize their relevance to modern computing, and promote their wider application. As the architecture model of information systems evolves from unconnected PCs, through intranet (LAN) and internet (WAN), to mobile internet and grids, security becomes increasingly critical to all walks of society: commerce, finance, health, transport, defence and science. It is no surprise, therefore, that security is one of the fastest-growing research areas in computer science. The audience of FASec includes those in the formal methods community who have (or would like to develop) a deeper interest in security, and those in security who would like to understand how formal methods can make important contributions to some aspects of security. The scope of FASec is deliberately broad and covers topics that range from modelling security requirements through specification, analysis, and verification of cryptographic protocols to certified code. The discussions at FASec 2002 encompassed many aspects of security: from theoretical foundations through support tools and on to applications. Formal methods have made a substantial contribution to this exciting field in the past. Our intended keynote speaker, Prof. Roger Needham, to whom this proceedings volume is dedicated, was one of the first researchers to suggest, almost 25 years ago, that formal methods could be useful for assuring the correctness of security protocols [Needham and Schroeder, Using encryption for authentication in large networks of computers, CACM, 1978]. Judging by the quality of the papers in this volume, formal methods promise to make significant contributions to security in the future. We were very privileged to include in the conference program contributions from a number of outstanding international invited speakers:

Fred Schneider, Cornell University, USA
Ernie Cohen, Microsoft Research, UK
Dieter Gollmann, Microsoft Research, UK
Andy Gordon, Microsoft Research, UK
Lawrence Paulson, University of Cambridge, UK
Bart Preneel, Catholic University of Leuven, Belgium
Susan Stepney, University of York, UK
Our gratitude goes to the authors for submitting their papers and responding to the feedback provided by the referees. Our thanks go to the referees for their valuable efforts in providing detailed and timely reviews of the papers. We owe special thanks to the BCS-FACS steering committee and its chairman, Jonathan P. Bowen, for their solid support of this event. Special thanks are also due for the generous contributions of our sponsors: MSR (Microsoft Research), CSR (Centre for Software Reliability), Adelard, and DSTL (Defence Science and Technology Laboratory). Finally, we are very grateful to the local organization team, especially Janet Hales, for their professionalism and hard work, which ensured the smooth running of the local arrangements.

Online information concerning the conference is available at http://www.lsbu.ac.uk/menass/fasec or from the BCS-FACS Web site: http://www.bcs-facs.org

FASec attracted more than sixty participants from the UK, Europe, the USA, Canada, and Australia. The audience comprised a unique mixture of participants from different backgrounds and organizations (industrial and academic). The program contained an interesting combination of exciting topics in invited and refereed talks. These factors, combined with the charm of the Royal Holloway venue, the bright sun for the whole duration of the conference (yes, unbelievable, pleasant December English weather!), and a wonderful after-dinner speech by Tom Anderson (CSR) in the beautiful surroundings of the famous Picture Gallery, greatly helped in making FASec a memorable, intellectually stimulating, lively, and enjoyable event. We hope this proceedings volume captures some of the spirit of the event.
London and Newcastle, March 2003
Ali Abdallah, Peter Ryan and Steve Schneider
Organization
Program Committee
Ali Abdallah, London South Bank University, UK (Conference Co-chair)
Jonathan Bowen, London South Bank University, UK (BCS-FACS Chair)
John Cooke, Loughborough University, UK
Neil Evans, Royal Holloway, University of London, UK
Cedric Fournet, Microsoft Research, UK
Dieter Gollmann, Microsoft Research, UK
Jeremy Jacob, University of York, UK
Wenbo Mao, HP Labs, UK
Lawrence Paulson, University of Cambridge, UK
Peter Ryan, University of Newcastle, UK (Conference Co-chair)
Steve Schneider, Royal Holloway, University of London, UK (Conference Co-chair)
Local Organizers
Neil Evans, Royal Holloway, University of London, UK
Mark Green, Oxford Brookes University, UK
Janet Hales, Royal Holloway, University of London, UK
Etienne Khayat, London South Bank University, UK
Steve Schneider, Royal Holloway, University of London, UK

Sponsors
Microsoft Research (MSR), the Centre for Software Reliability (CSR), Adelard, and the Defence Science and Technology Laboratory (DSTL)
Table of Contents
Keynote Talk
Lifting Reference Monitors from the Kernel . . . . . . . . . . . . . . . . . . . . . . . . . . 1
F.B. Schneider

Invited Talks I
Authenticity Types for Cryptographic Protocols . . . . . . . . . . . . . . . . . . . . . . 3
A. Gordon

Verifying the SET Protocol: Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
L.C. Paulson

Protocol Verification
Interacting State Machines: A Stateful Approach to Proving Security . . . . 15
D. von Oheimb

Automatic Approximation for the Verification of Cryptographic Protocols . . . . 33
F. Oehl, G. Cécé, O. Kouchnarenko, D. Sinclair

Towards a Formal Specification of the Bellare-Rogaway Model for Protocol Analysis . . . . 49
C. Boyd, K. Viswanathan

Invited Talks II
Critical Critical Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
S. Stepney

Analysing Security Protocols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
D. Gollmann

Analysis of Protocols
Analysis of Probabilistic Contract Signing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
G. Norman, V. Shmatikov

Security Analysis of (Un-)Fair Non-repudiation Protocols . . . . . . . . . . . . . . 97
S. Gürgens, C. Rudolph

Modeling Adversaries in a Logic for Security Protocol Analysis . . . . . . . . . 115
J.Y. Halpern, R. Pucella
Security Modelling and Reasoning
Secure Self-certified Code for Java . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
M. Debbabi, J. Desharnais, M. Fourati, E. Menif, F. Painchaud, N. Tawbi

Z Styles for Security Properties and Modern User Interfaces . . . . . . . . . . . . 152
A. Hall
Invited Talks III
Cryptographic Challenges: The Past and the Future . . . . . . . . . . . . . . . . . . . 167
B. Preneel

TAPS: The Last Few Slides . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
E. Cohen
Intrusion Detection Systems and Liveness
Formal Specification for Fast Automatic IDS Training . . . . . . . . . . . . . . . . . 191
A. Durante, R. Di Pietro, L.V. Mancini

Using CSP to Detect Insertion and Evasion Possibilities within the Intrusion Detection Area . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
G.T. Rohrmair, G. Lowe

Revisiting Liveness Properties in the Context of Secure Systems . . . . . . . . 221
F.C. Gärtner
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
Lifting Reference Monitors from the Kernel
Fred B. Schneider
Computer Science Department, Cornell University, Ithaca, New York 14853
[email protected]
Abstract. Much about our computing systems has changed since reference monitors were first introduced, 30 years ago. Reference monitors haven’t—at least, until recently—but new forms of execution monitoring are now possible, largely due to research done in the formal methods and programming languages communities. This talk will discuss these new approaches: why they are attractive, what can be done, what has been done, and what problems remain.
In contrast to 1972, operating systems today are too large to be considered trustworthy, and security policies are needed not only to protect one user from another but also to protect programs from themselves, since so much of today’s software is designed to be extensible. Thus, while the principle of complete mediation remains sound, it no longer makes sense to locate mechanisms that do execution monitoring in the operating system kernel:
– The integrity of such a reference monitor is difficult to guarantee, by virtue of its location in a large, complex body of code.
– Access to only certain resources could be observed, which restricts the vocabulary of abstractions that policies could then govern.
An alternative to deploying software—in the kernel or elsewhere—that intercepts run-time events is to automatically rewrite programs prior to execution, effectively in-lining the reference monitor. The approach, called an “in-lined reference monitor” (IRM), has been prototyped for x86 and the JVM as well as for a variety of high-level languages. It has been a clear candidate for commercial deployment, though none has yet occurred. The added run-time checks do not seem to affect performance, and an extremely broad class of policies can be enforced.
With the expressive power of IRMs comes a burden: formulating the policies to enforce. To be sure, the translation of informal requirements into formal specifications is not a new challenge, though whether for security we know what those informal requirements should be is certainly a valid question. Security policies also present fundamentally new technical difficulties. The writer of a security policy seemingly must understand not only the semantics of a system’s interfaces but also what is hidden by those interfaces. Moreover, policy formalizations can have subtle ramifications, not only with regard to run-time efficiency but also with regard to the trusted computing base. The technical problems can be posed; promising directions for solutions can only be suggested.
Finally, the exploration of execution monitoring and program rewriting in policy enforcement is providing a new lens through which security policies can be viewed. Ad hoc arguments about the virtues of one protection mechanism over another are starting to be replaced by mathematical arguments about limits and by a rigorously defined hierarchy of enforcement mechanisms. Some very recent results (joint work with Kevin Hamlen and Greg Morrisett) concerning the power of program rewriting for policy enforcement will be discussed.
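As an illustration of the rewriting idea, the following Python sketch (ours, not from the talk; all names are hypothetical) models a policy as a security automaton and approximates program rewriting by wrapping each security-relevant operation so that the automaton checks every event before it takes effect.

    from functools import wraps

    class PolicyViolation(Exception):
        pass

    class Policy:
        """A security automaton: states plus a transition table.
        Missing transitions mean 'forbidden'; the monitor stops the
        offending event (here: by raising) before it takes effect."""
        def __init__(self, start, transitions):
            self.state = start
            self.transitions = transitions  # (state, event) -> next state
        def step(self, event):
            key = (self.state, event)
            if key not in self.transitions:
                raise PolicyViolation(f"event {event!r} forbidden in state {self.state!r}")
            self.state = self.transitions[key]

    def monitor(policy, event):
        """'In-line' the reference monitor: rewrite a function so that the
        policy automaton observes the event before the call proceeds."""
        def rewrite(f):
            @wraps(f)
            def checked(*args, **kwargs):
                policy.step(event)
                return f(*args, **kwargs)
            return checked
        return rewrite

    # Example policy: no network send after reading a local file.
    policy = Policy("clean", {("clean", "read"): "tainted",
                              ("clean", "send"): "clean"})

    @monitor(policy, "read")
    def read_file(path): return "secret"

    @monitor(policy, "send")
    def send(data): pass

    send("hello")                 # allowed while still 'clean'
    read_file("/etc/passwd")      # moves the automaton to 'tainted'
    try:
        send("secret")            # forbidden in state 'tainted'
    except PolicyViolation as e:
        print("blocked:", e)

Real IRM implementations rewrite object code or bytecode rather than wrapping functions, but the enforcement principle is the same: every security-relevant event is routed through the automaton before it happens.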
References
1. Úlfar Erlingsson and Fred B. Schneider. SASI enforcement of security policies: A retrospective. Proceedings of the New Security Paradigms Workshop (Caledon Hills, Ontario, Canada, September 1999), Association for Computing Machinery, 87–95. http://www.cs.cornell.edu/fbs/publications/sasiNSPW.ps
2. Úlfar Erlingsson and Fred B. Schneider. IRM enforcement of Java stack inspection. Proceedings 2000 IEEE Symposium on Security and Privacy (Oakland, California, May 2000), IEEE Computer Society, Los Alamitos, California, 246–255. http://www.cs.cornell.edu/fbs/publications/sasiOakland.ps
3. Fred B. Schneider. Enforceable security policies. ACM Transactions on Information and System Security 3, 1 (February 2000), 30–50.
Authenticity Types for Cryptographic Protocols
Andy Gordon
Microsoft Research, 7 J J Thomson Ave, Cambridge CB3 0FB, UK
[email protected]
Abstract. Cryptographic protocols are essential for the security of many critical networking applications, such as authenticating various financial transactions. Moreover, many new consumer-to-business and business-to-business protocols are being proposed and need cryptographic protection. It is famously hard to specify and verify such protocols, even if we assume that the underlying cryptographic algorithms cannot be cryptanalysed. My talk describes a new approach to specifying and verifying authenticity properties of security protocols, based on a typed process algebra. The theory has been developed in a series of papers with Alan Jeffrey. Our approach requires little human effort per protocol, puts no bound on the size of the opponent attacking the protocol, and requires no state space enumeration. Moreover, the types for protocol data provide some intuitive explanation of how the protocol works. My talk explains the basic ideas by example, states our main results, and discusses some applications.
Acknowledgment This abstract is based on joint work with Alan Jeffrey, DePaul University, http://cryptyc.cs.depaul.edu/.
References
1. A.D. Gordon and A. Jeffrey. Authenticity by typing for security protocols. In 14th IEEE Computer Security Foundations Workshop (CSFW 2001), pages 145–159, Cape Breton, June 11–13, 2001. IEEE Computer Society. A journal version is to appear in the Journal of Computer Security.
2. A.D. Gordon and A. Jeffrey. Types and effects for asymmetric cryptographic protocols. In 15th IEEE Computer Security Foundations Workshop (CSFW 2002), pages 77–91, Cape Breton, June 24–26, 2002. IEEE Computer Society.
3. A.D. Gordon and R. Pucella. Validating a web service security abstraction by typing. In 2002 ACM Workshop on XML Security (XML Security 2002), George Mason University, November 22, 2002.
Verifying the SET Protocol: Overview
Lawrence C. Paulson
Computer Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge CB3 0FD, England
[email protected]
Abstract. The project to verify SET, an e-commerce protocol, is described. The main tasks are to comprehend the written documentation, to produce an accurate formal model, to identify specific protocol goals, and finally to prove them. The main obstacles are the protocol’s complexity (due in part to its use of digital envelopes) and its unusual goals involving partial information sharing. Brief examples are taken from the registration and purchase phases. The protocol does not completely satisfy its goals, but only minor flaws have been found. The primary outcome of the project is experience with handling enormous and complicated protocols.
1 Introduction

SET (Secure Electronic Transaction) is an e-commerce protocol devised by Visa and MasterCard. It enables credit card holders to pay for purchases while protecting their personal information, which includes both their account details and their purchasing habits. Most research on protocol verification focuses on simple protocols that have simple objectives. One reason for verifying SET is to demonstrate that verification technology is mature enough to cope with the demands of a huge, complex industrial protocol.

Protocol verification techniques fall into several categories. A general-purpose model checker can verify protocols, as pioneered by Lowe and his colleagues at Oxford [7]. A general-purpose proof tool can also be effective, as in my work [13]. Additionally, there exist several specialized protocol analysis tools. Most perform an exhaustive search in the spirit of model checking; among the best is Meadows’ NRL [11], which has deductive capabilities. Cohen’s TAPS processes the protocol specification and verifies the desired properties using a resolution theorem prover [6]. Formal proof is preferable for establishing properties, while model checking is best for finding attacks. Exhaustive search is only feasible if the model is kept as small as possible, for example by minimizing the number of permitted executions. If the assumptions are too strong, the absence of an attack does not guarantee correctness. Interactive proof tools are not automatic, but offer flexibility in expressing specifications and proofs. Models need not be finite and can therefore be more realistic.

My colleagues and I have verified [2,3] the main phases of the SET protocol using the inductive approach and the theorem prover Isabelle. A substantial proportion of the effort was devoted to understanding the documentation rather than to proving properties. This paper is a brief overview of the project, referring to other papers that describe the separate tasks.
The paper begins by outlining the SET protocol (Sect. 2). It briefly introduces the inductive approach and Isabelle (Sect. 3). It discusses the issues we faced in converting the documentation into a formal model (Sect. 4). It outlines our proofs of the registration protocols (Sect. 5) and the payment protocol (Sect. 6). Finally, there are some general conclusions (Sect. 7).
2 The SET Protocol

People today pay for online purchases by sending their credit card details to the merchant. A protocol such as SSL or TLS keeps the card details safe from eavesdroppers, but does nothing to protect merchants from dishonest customers or vice versa. SET addresses this situation by requiring cardholders and merchants to register before they may engage in transactions. A cardholder registers by contacting a certificate authority, supplying security details and the public half of his proposed signature key. Registration allows the authorities to vet an applicant, who if approved receives a certificate confirming that his signature key is valid. All orders and confirmations bear digital signatures, which provide authentication and could potentially help to resolve disputes.

A SET purchase involves three parties: the cardholder, the merchant, and the payment gateway (essentially a bank). The cardholder shares the order information with the merchant but not with the payment gateway. He shares the payment information with the bank but not with the merchant. A SET dual signature accomplishes this partial sharing of information while allowing all parties to confirm that they are handling the same transaction. The method is simple: each party receives the hash of the withheld information. The cardholder signs the hashes of both the order information and the payment information. Each party can confirm that the hashes in its possession agree with the hash signed by the cardholder. In addition, the cardholder and merchant compute equivalent hashes for the payment gateway to compare; he confirms their agreement on the details withheld from him.

All parties are protected. Merchants do not normally have access to credit card numbers. Moreover, the mere possession of credit card details does not enable a criminal to make a SET purchase; he needs the cardholder’s signature key and a secret number that the cardholder receives upon registration. The criminal would have better luck with traditional frauds, such as ordering by telephone. It is a pity that other features of SET (presumably demanded by merchants) weaken these properties. A merchant can be authorized to receive credit card numbers and has the option of accepting payments given a credit card number alone.

SET is a family of protocols. The five main ones are cardholder registration, merchant registration, purchase request, payment authorization, and payment capture. There are many minor protocols, for example to handle errors. SET is enormously more complicated than SSL, which merely negotiates session keys between the cardholder’s and merchant’s Internet service providers. Because of this complexity, much of which is unnecessary, the protocol is hardly used. However, SET contains many features of interest:
– The model is unusual. In the registration protocols, the initiator possesses no digital proof of identity. Instead, he authenticates himself by filing a registration form
whose format is not specified. Authentication takes place outside the protocol, when the cardholder’s bank examines the completed form.
– The dual signature is a novel construction. The partial sharing of information among three peers leads to unusual protocol goals.
– SET uses several types of digital envelope. A digital envelope consists of two parts: one, encrypted using a public key, contains a fresh symmetric key K and identifying information; the other, encrypted using K, conveys the full message text. Digital envelopes keep public-key encryption to a minimum, but the many symmetric keys complicate the reasoning. Most verified protocols distribute just one or two secrets. (A concrete sketch of the construction follows below.)
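For concreteness, here is a minimal sketch of the digital-envelope construction in Python, using the third-party cryptography package; the function names and the exact formats are our own, and SET’s actual encoding (its OAEP-style padding, identifying information, and so on) differs.

    import os
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import rsa, padding
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    OAEP = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)

    def seal(recipient_public_key, message, ident):
        """Part 1: a fresh symmetric key K plus identifying information,
        under the recipient's public key. Part 2: the full message text
        under K."""
        k = AESGCM.generate_key(bit_length=128)
        nonce = os.urandom(12)
        part1 = recipient_public_key.encrypt(k + ident, OAEP)
        part2 = nonce + AESGCM(k).encrypt(nonce, message, None)
        return part1, part2

    def open_envelope(private_key, part1, part2):
        blob = private_key.decrypt(part1, OAEP)
        k, ident = blob[:16], blob[16:]
        nonce, body = part2[:12], part2[12:]
        return ident, AESGCM(k).decrypt(nonce, body, None)

    priv = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    p1, p2 = seal(priv.public_key(), b"full message text", b"C-ID")
    ident, msg = open_envelope(priv, p1, p2)
    assert msg == b"full message text" and ident == b"C-ID"

The point of the construction is that only the short key-plus-identity header needs expensive public-key encryption; for verification, however, each such fresh key K is one more secret whose compromise must be reasoned about.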
3 Isabelle and Inductive Protocol Verification

My colleagues and I used the Isabelle theorem prover with the inductive approach to protocol verification. It is not clear that model checking could cope with this protocol’s complexity. Specialized verification tools are more powerful than Isabelle, but less flexible. Most protocols, even esoteric ones like non-repudiation and fair exchange protocols, involve the standard cast of characters: Alice, Bob, and a trusted third party. SET is different: it has cardholders, merchants, payment gateways, and a hierarchy of certificate authorities. Changing Isabelle’s theory of protocols to use SET’s cast of characters was easy.

The inductive approach [13] verifies protocols using the standard techniques of operational semantics. An inductive definition defines the possible executions of a system consisting of the honest protocol participants and an active attacker. An execution comprises any number of attempted protocol runs and is a trace of message transmissions and other events. A standard theory of messages and their operations underlies these inductive models. Safety properties are proved by induction on traces. For example, we can prove that any trace containing a particular event x must also contain some other event y; such properties can express authentication or agreement.

Secrecy properties are hardest to prove. For example, if we are concerned with the secrecy of a certain key K, then we must prove K ≠ K′ for each key K′ that might be compromised. Every encrypted message produces a case split, since we must prove that K is secure whether or not the encrypting key is. Huge case analyses can arise. Despite the difficulties, we can use established techniques and tools in our attempt to prove secrecy. The model includes a set of honest agents. Typically we can prove (perhaps optimistically) that their long-term keys cannot become compromised. The spy controls another set of agents, with full access to their internal states. The spy also controls the network and retains every transmitted message. Session keys may become compromised. If a key becomes compromised then the spy can read all ciphertexts encrypted using that key, and if it has been used to encrypt other keys, then the consequential losses cascade. Proving secrecy in this situation requires special techniques, which I have presented for the Yahalom protocol [15] and applied also to Kerberos [5].

Messages in our model have types. A nonce can never equal an agent name or a session key, for example. Such assumptions can be defended: in the real world, different kinds of items are likely to have different lengths. However, our model does not allow reasoning about operators like exclusive-OR. Because (X ⊕ Y) ⊕ Y = X, exclusive-OR can yield a result of essentially any type. Reasoning about exclusive-OR probably requires a bit-level formalization.

Isabelle/HOL [12] is an interactive proof tool for higher-order logic. Isabelle provides a simplifier, a predicate calculus theorem prover, a choice of proof languages, and automatic generation of LaTeX documents. Isabelle’s support for inductive definitions is particularly strong, both in its specification language and in its prover automation. However, other tools for higher-order logic could be suitable. We have applied the inductive approach to a wide range of protocols, including industrial ones such as Kerberos and TLS [14].
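The cascading losses mentioned above can be illustrated executably. The sketch below (ours, in Python rather than the paper’s Isabelle/HOL) computes the closure of the spy’s knowledge under pair projection and decryption with already-known keys, in the spirit of the inductive model’s message theory; the message constructors and the example key names are illustrative only.

    def analz(knowledge):
        """Least fixed point: everything the spy can extract from a set of
        messages by splitting pairs and decrypting with known keys."""
        known = set(knowledge)
        changed = True
        while changed:
            changed = False
            for m in list(known):
                new = set()
                if m[0] == "pair":
                    new |= {m[1], m[2]}
                elif m[0] == "enc" and ("key", m[1]) in known:
                    new.add(m[2])
                fresh = new - known
                if fresh:
                    known |= fresh
                    changed = True
        return known

    # KC3 encrypts KC2, and KC2 encrypts a nonce: losing KC3 loses all three.
    traffic = {("enc", "KC3", ("key", "KC2")),
               ("enc", "KC2", ("nonce", "NonceCCA"))}
    assert ("nonce", "NonceCCA") not in analz(traffic)
    assert ("nonce", "NonceCCA") in analz(traffic | {("key", "KC3")})

In the Isabelle proofs, the corresponding insight is captured by lemmas bounding exactly which secrets a single key compromise can release; the fixed-point computation above is the operational intuition behind those lemmas.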
4 Modelling Issues

Researchers compete to produce the fastest automatic tools. However, the main obstacle to protocol verification lies in digesting the documentation and producing a formal model. Understanding hundreds of pages of text is a massive undertaking. Meticulous care is essential to avoid defining an incorrect model. The main SET documents are the Business Description [8], the Programmer’s Guide [10], and the Formal Protocol Definition [9]. SET is defined using Abstract Syntax Notation One (ASN.1, http://www.asn1.org). The Programmer’s Guide presents each message format as a figure based on the underlying ASN.1 definition, augmented with a detailed English description of how to process each message. The Formal Protocol Definition consists of the Programmer’s Guide with the ASN.1 notation inserted and the English text removed. Since the ASN.1 adds little to the figures, the Formal Protocol Definition essentially consists of the syntax without the semantics. We derived our model from the Programmer’s Guide.

The enormous size and complexity of the SET message formats demanded simplification. As we have discussed elsewhere [4], this was not always straightforward. A field might be described as optional and yet seem to play an essential role. Additional simplifications were necessary, forcing us to decide what constituted SET’s core feature set. One detail that we eliminated was payment by instalments. Most payment cards provide payment by instalments anyway, so SET does not have to provide a similar mechanism. However, critics might reject this reasoning.

Attacks against protocols often arise from unclear assumptions about the operating environment rather than from flaws in the protocols themselves. Experts can dispute whether the formal model accurately reflects the real world and thus whether the attack is realistic. For example, Lowe’s famous attack [7] against the Needham-Schroeder public-key protocol relies on the possibility that insiders can be compromised. However, Needham and Schroeder designed the protocol with the express purpose of protecting the honest insiders from outsiders. SET has a much more complex environment, and parts of its operation are specifically left “out of band.” Our formal model has to make reasonable assumptions about these undefined parts. It also must specify which insiders can be compromised, and innumerable other details. It also has to define the protocol goals, since the documentation outlines them only in general management terms.
5 Verifying the Registration Protocols

The cardholder registration protocol (Fig. 1) comprises three message exchanges between the cardholder and a certificate authority. In the first exchange, the cardholder requests registration and is given the certificate authority’s public keys. In the second exchange, the cardholder supplies his credit card number, called the PAN, or Primary Account Number; he receives an application form suitable for the bank that issued his credit card. In the third exchange, the cardholder returns the completed application form; in addition, he delivers his public signature key and supplies a 20-byte secret number (the CardSecret). Finally, the cardholder receives a certificate that contains his public signature key and another 20-byte secret number, the PANSecret. The registration protocol for merchants is simpler: it has only two message exchanges and involves no credit card number. My colleagues and I verified both cardholder registration and merchant registration; cardholder registration is the one I discuss below.
[Fig. 1 depicts the cardholder registration dialogue between the Cardholder Computer and the Certificate Authority (CA) Process: the cardholder initiates registration (Initiate request / Initiate response), requests a registration form (Registration form request / Registration form), and finally completes the form and requests a certificate (Cardholder certificate request / Cardholder certificate), which the CA issues after checking the registration form.]

Fig. 1. Cardholder Registration
Conceptually, cardholder registration is straightforward. Its chief peculiarity is that the cardholder is authenticated by the registration form, not by the possession of a secret key. The protocol as defined in SET, however, is difficult to verify, mainly because it employs digital envelopes. While Yahalom and Kerberos have a dependency chain of length one — one session key encrypts just one secret — with digital envelopes
the dependency chains could be arbitrarily long. (In the current model of cardholder registration, the chain links only three items, though at one point it was longer.) I was able to generalise the previous technique [15] in order to cope with arbitrary dependency relationships. A relation must be defined in higher-order logic, identifying the protocol events that cause one secret to depend upon another. (This relation is necessarily transitive.) Lemmas must be proved, saying in effect that the loss of a key can cause no losses other than the obvious ones given by the relation. Such lemmas put a bound on the consequential losses. The proofs employ induction, and the intermediate subgoals can be many pages long.

Let us consider these points more precisely. Here is the fifth message, Cardholder Certificate Request:

    5. C → CA : Crypt_{KC3}(m, S), Crypt_{pubEK CA}(KC3, PAN, CardSecret)
       where m = {C, NC3, KC2, pubSK C}
       and S = Crypt_{priSK C}(Hash(m, PAN, CardSecret))

The cardholder chooses an asymmetric signature key pair. He gives the CA the public key, pubSK C, and the number CardSecret. This message is a digital envelope, sealed using the key KC3; it contains another key, KC2, which the CA uses for encrypting the Cardholder Certificate:

    6. CA → C : Crypt_{KC2}(Sign_{CA}(C, NC3, CA, NonceCCA),
                            Cert_{CA}(pubSK C, PANSecret),
                            Cert_{RCA}(pubSK CA))
       where PANSecret = CardSecret ⊕ NonceCCA

The CA returns a certificate for the cardholder’s public signature key. The certificate also includes the cryptographic hash of the PANSecret. This 20-byte number is the exclusive-OR of the CardSecret and NonceCCA, a nonce chosen by the CA. The cardholder must use the PANSecret to prove his identity when making purchases.

Cardholder registration does not have to be this complicated. Since the cardholder has a private signature key, why does he also need the PANSecret? If he really does need the PANSecret to prove his identity, why must the CA contribute to its calculation through NonceCCA? The point of such a calculation is to avoid sending the secret across the network, but the cardholder must disclose the PANSecret each time he makes a purchase. Eliminating NonceCCA would eliminate the need to encrypt message 6, which would contain only public-key certificates. We could dispense with the key KC2 and eliminate the dependency chain KC3, KC2, NonceCCA. These changes would make the protocol simpler and more secure, as we shall see.

Figure 2 presents the Isabelle specification of message 5. You will find it hard to read, but comparing it with the informal notation above conveys an idea of the syntax. The inductive definition consists of one rule for each protocol message, which extends a given trace. (Note that # is Isabelle’s syntax for the list “cons” operator. In message 5,
the current trace is called evs5.) One of the rule’s preconditions is that CardSecret must be fresh:

    Nonce CardSecret ∉ used evs5
The nonce NC3 and the two symmetric keys (KC2 and KC3) must also be fresh. Other preconditions check that the cardholder has sent an appropriate instance of message 3 to the CA and has received a well-formed reply. If the preconditions are satisfied, then C can generate the corresponding instance of message 5.

    [[evs5 ∈ set cr;  C = Cardholder k;
      Nonce NC3 ∉ used evs5;  Nonce CardSecret ∉ used evs5;  NC3 ≠ CardSecret;
      Key KC2 ∉ used evs5;  KC2 ∈ symKeys;
      Key KC3 ∉ used evs5;  KC3 ∈ symKeys;  KC2 ≠ KC3;
      Gets C {|sign (invKey SKi) {|Agent C, Nonce NC2, Nonce NCA|},
               cert (CA i) EKi onlyEnc (priSK RCA),
               cert (CA i) SKi onlySig (priSK RCA)|} ∈ set evs5;
      Says C (CA i) {|Crypt KC1 {|Agent C, Nonce NC2, Hash (Pan (pan C))|},
                      Crypt EKi {|Key KC1, Pan (pan C),
                                  Hash {|Agent C, Nonce NC2|}|}|}
        ∈ set evs5]]
    =⇒ Says C (CA i)
          {|Crypt KC3 {|Agent C, Nonce NC3, Key KC2, Key (pubSK C),
                        Crypt (priSK C) (Hash {|Agent C, Nonce NC3, Key KC2,
                                                Key (pubSK C), Pan (pan C),
                                                Nonce CardSecret|})|},
            Crypt EKi {|Key KC3, Pan (pan C), Nonce CardSecret|}|}
          # evs5 ∈ set cr

Fig. 2. Cardholder Registration in Isabelle (Message 5)
My colleagues and I did not discover any attacks against cardholder registration. Under reasonable assumptions, the PAN, the PANSecret and other sensitive information remain secure. However, merely by inspection, I observed a flaw. The PANSecret is computed by exclusive-OR, which gives the certificate authority full control over its value. One would like to be able to trust the certificate authorities, but banks have issued insecure personal identification numbers [1, p. 35]:

One small upper-crust private bank belied its exclusive image by giving all its customers the same PIN. This was a simple programming error; but in another, more down-market institution, a programmer deliberately arranged things so that only three different PINs were issued, with the idea that this would provide his personal pension fund.
The remedy is trivial: compute the PANSecret by hashing instead of exclusive-OR. Another remedy is to leave its choice entirely to the cardholder’s computer — after all, it exists for the cardholder’s protection.
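A few lines of Python (our illustration) make the difference concrete: with exclusive-OR, a certificate authority that sees the CardSecret before choosing its nonce can force any PANSecret it likes, whereas with hashing it would need to invert the hash function.

    import hashlib, secrets

    card_secret = secrets.token_bytes(20)   # chosen by the cardholder
    target = bytes(20)                      # PANSecret a rogue CA wants: all zeros

    # XOR: the CA simply solves NonceCCA = CardSecret XOR target.
    nonce_cca = bytes(a ^ b for a, b in zip(card_secret, target))
    pan_secret = bytes(a ^ b for a, b in zip(card_secret, nonce_cca))
    assert pan_secret == target             # full control for the CA

    # Hashing: the CA's nonce still randomizes the result, but forcing a
    # chosen PANSecret would require finding a hash preimage.
    pan_secret_h = hashlib.sha1(card_secret + nonce_cca).digest()
    # No efficient way to pick nonce_cca so that pan_secret_h == target.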
6 Verifying the Purchase Phase

A SET purchase can involve three protocols: purchase request, payment authorization, and payment capture. The first two of these often behave as a single protocol, which is how we model them. (We have yet to investigate payment capture.) The protocol is too complex to present here. Even the means of identifying the transaction is complicated. The cardholder and merchant may each have an identifying number; sometimes a third number is chosen. The choice of method is actually left open. For the sake of simplicity, we discard all but one of the identification options and use the merchant’s transaction identifier.

The essential parameters of any transaction are the order description (presumably a text string) and the purchase amount. The cardholder forms a dual signature on the order information and payment information, as outlined above, and sends it to the merchant. The merchant forwards the payment information, under his signature, to the payment gateway. Only the payment gateway can read the account details, which include the Primary Account Number and the PANSecret. If they are acceptable, he replies to the merchant, who confirms the transaction with the cardholder. A look at message 3 illustrates the complexity of the dual signature:

    3. C → M : PIDualSign, OIDualSign

Here, the cardholder C has computed

    HOD        = Hash(OrderDesc, PurchAmt)
    PIHead     = {LID M, XID, HOD, PurchAmt, M, Hash(XID, CardSecret)}
    OIData     = {XID, Chall C, HOD, Chall M}
    PANData    = {PAN, PANSecret}
    PIData     = {PIHead, PANData}
    PIDualSign = Sign_{priSK C}(Hash(PIData), Hash(OIData)),
                 Crypt_{pubEK P}(PIHead, Hash(OIData), PANData)
    OIDualSign = OIData, Hash(PIData)

Figure 3 presents this message using Isabelle syntax. Because of the hashing, all the information appears repeatedly. Although in the real world the hash of any message is a short string of bytes, in the formal model the hash of message X is literally Hash X: a construction involving X. The formal model of message 3 therefore involves massive repetition. Most digital envelopes involve hashing, causing further repetition.

Other details of our model include a dummy message to model the initial shopping agreement, which lies outside SET. Our model also includes the possibility of unsigned purchases.
    [[evsPReqS ∈ set pur;  C = Cardholder k;  CardSecret k ≠ 0;
      Key KC2 ∉ used evsPReqS;  KC2 ∈ symKeys;
      Transaction = {|Agent M, Agent C, Number OrderDesc, Number PurchAmt|};
      HOD = Hash {|Number OrderDesc, Number PurchAmt|};
      OIData = {|Number LID M, Number XID, Nonce Chall C, HOD, Nonce Chall M|};
      PIHead = {|Number LID M, Number XID, HOD, Number PurchAmt, Agent M,
                 Hash {|Number XID, Nonce (CardSecret k)|}|};
      PANData = {|Pan (pan C), Nonce (PANSecret k)|};
      PIData = {|PIHead, PANData|};
      PIDualSign = {|sign (priSK C) {|Hash PIData, Hash OIData|},
                     EXcrypt KC2 EKj {|PIHead, Hash OIData|} PANData|};
      OIDualSign = {|OIData, Hash PIData|};
      Gets C (sign (priSK M) {|Number LID M, Number XID, Nonce Chall C,
                               Nonce Chall M,
                               cert P EKj onlyEnc (priSK RCA)|}) ∈ set evsPReqS;
      Says C M {|Number LID M, Nonce Chall C|} ∈ set evsPReqS;
      Notes C {|Number LID M, Transaction|} ∈ set evsPReqS]]
    =⇒ Says C M {|PIDualSign, OIDualSign|} # evsPReqS ∈ set pur

Fig. 3. The Signed Purchase Request Message
These allow unregistered cardholders to use SET with a credit card number alone, and they offer little protection to merchants. SET perhaps offers this option in order to provide an upgrade path from SSL.

Because the SET documentation did not tell us what properties to prove, we specified them ourselves. Obviously, the PAN and PANSecret must remain secure. Correctness also means that each party to a purchase must be assured that the other parties agree on all the essential details: namely, the purchase amount, the transaction identifier, the order description, and the names of the other agents. We were able to prove most of these properties.

Digital envelopes cause further problems, however. Agreement among principals obviously refers to important fields such as the order description and purchase amount. While we certainly hope the two parties will agree on which session key was used in a digital envelope, that property is not essential. Given the choice of either devoting much effort to proving agreement on session keys or ignoring them, I adopted the latter course.

Agreement fails in one important respect: the payment gateway cannot be certain that the cardholder intended him to take part in the transaction. Message 3 involves six copies of the field XID (transaction identifier) and nine copies of the field PurchAmt (purchase amount), but it never mentions the identity of the intended payment gateway! Although the failure of this property is disappointing, it does not appear to allow a significant attack. It could only be exploited by a rogue payment gateway, who would presumably prefer harvesting credit card numbers to causing anomalous SET executions. Thus, we must reject the dualistic view that every protocol is either correct or vulnerable to attack. Anomalous executions that do little harm cannot be called attacks.
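The dual-signature mechanism itself is easy to make concrete. The following Python fragment (our illustration; real SET signs the final digest with the cardholder’s private signature key, which is elided here) shows how each peer checks agreement while seeing only the hash of the data withheld from it.

    import hashlib

    def h(x: bytes) -> bytes:
        return hashlib.sha256(x).digest()

    order_info = b"order: 1 book, GBP 10"      # shared with the merchant only
    payment_info = b"PAN=4000...; amount=10"   # shared with the gateway only

    hash_oi, hash_pi = h(order_info), h(payment_info)
    dual_digest = h(hash_oi + hash_pi)         # what the cardholder signs

    # Merchant: holds order_info and hash_pi, never payment_info.
    assert h(h(order_info) + hash_pi) == dual_digest

    # Payment gateway: holds payment_info and hash_oi, never order_info.
    assert h(hash_oi + h(payment_info)) == dual_digest

Both parties thus confirm they are handling the same transaction without either seeing the data withheld from it, which is exactly the partial-sharing goal described in Sect. 2.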
7 Conclusions

Our study demonstrates that enormous protocols such as SET are amenable to formal analysis. Such work is challenging, however. Understanding the documentation and defining a formal model can take months. Some assertions are too long to be comprehensible, comprising a dozen or two dozen lines of formalism. Whether those assertions are specifications or theorem statements, their incomprehensibility raises the possibility that they could be misinterpreted. During an interactive proof, Isabelle may present the user with subgoals that are hundreds of lines long. Such monstrosities impose a heavy burden on the computer: a single simplification step can take 10 or 20 seconds on a 1.8 gigahertz processor. Diagnosing a failed proof requires meticulous examination of huge and unintuitive formulae, in which all abbreviations have been fully expanded.

The bar chart below shows the runtime required to execute the proofs for several protocols on a 1.8 GHz machine; there are three SET protocols (dark shading) and three others (light shading). This data is suggestive rather than compelling, because minor changes to a proof script can cause dramatic changes to the required runtime. It suggests that merchant registration is very simple. Cardholder registration requires more effort, partly because it is longer and partly because it demands more secrecy proofs. The purchase phase is twice as difficult again.

[Bar chart: proof runtimes for Otway-Rees, TLS, Kerberos, Merchant Registration, Cardholder Registration, and Purchase, on a scale from 0s to 350s.]
I doubt that existing methods can cope with protocols that are more complicated than SET. (Perhaps such protocols should not exist.) The single greatest advance would be a method of abstraction allowing constructions such as the digital envelope to be verified independently. We could then model these constructions abstractly in protocol specifications. In the case of SET, we could replace all digital envelopes by simple encryptions. Assertions would become more concise; proofs would become much simpler. Abstraction in the context of security is ill understood, however, and can mask grave flaws [16]. The other advance can happen now, if protocol designers will co-operate. They should provide a Formal Protocol Definition worthy of the name. It should not employ a logical formalism — people would disagree on which one to use — but it should precisely specify several things:
1. an abstract version of the message flow, comprising the core security features only
2. the protocol’s precise objectives, expressed as guarantees to each party
3. the protocol’s operating environment, including the threat model
At present, we are forced to reverse engineer the protocol’s core design from its documentation, and we have to guess what the protocol is supposed to achieve. Acknowledgements. The SET verification is joint work with Giampaolo Bella, Fabio Massacci and Piero Tramontano. Bella also commented on this paper. The EPSRC grant GR/R01156/R01 Verifying Electronic Commerce Protocols supported the Cambridge work. In Italy, CNR and MURST grants supported Massacci.
References
1. R. Anderson. Why cryptosystems fail. Comm. of the ACM, 37(11):32–40, Nov. 1994.
2. G. Bella, F. Massacci, and L. C. Paulson. The verification of an industrial payment protocol: The SET purchase phase. In V. Atluri, editor, 9th ACM Conference on Computer and Communications Security, pages 12–20. ACM Press, 2002.
3. G. Bella, F. Massacci, and L. C. Paulson. Verifying the SET registration protocols. IEEE J. of Selected Areas in Communications, 21(1), 2003. In press.
4. G. Bella, F. Massacci, L. C. Paulson, and P. Tramontano. Formal verification of cardholder registration in SET. In F. Cuppens, Y. Deswarte, D. Gollmann, and M. Waidner, editors, Computer Security — ESORICS 2000, LNCS 1895, pages 159–174. Springer, 2000.
5. G. Bella and L. C. Paulson. Kerberos version IV: Inductive analysis of the secrecy goals. In J.-J. Quisquater, Y. Deswarte, C. Meadows, and D. Gollmann, editors, Computer Security — ESORICS 98, LNCS 1485, pages 361–375. Springer, 1998.
6. E. Cohen. TAPS: A first-order verifier for cryptographic protocols. In Proc. of the 13th IEEE Comp. Sec. Found. Workshop, pages 144–158. IEEE Comp. Society Press, 2000.
7. G. Lowe. Breaking and fixing the Needham-Schroeder public-key protocol using CSP and FDR. In T. Margaria and B. Steffen, editors, Tools and Algorithms for the Construction and Analysis of Systems: Second International Workshop, TACAS ’96, LNCS 1055, pages 147–166. Springer, 1996.
8. Mastercard & VISA. SET Secure Electronic Transaction Specification: Business Description, May 1997. Available electronically at http://www.setco.org/set_specifications.html.
9. Mastercard & VISA. SET Secure Electronic Transaction Specification: Formal Protocol Definition, May 1997. Available electronically at http://www.setco.org/set_specifications.html.
10. Mastercard & VISA. SET Secure Electronic Transaction Specification: Programmer’s Guide, May 1997. Available electronically at http://www.setco.org/set_specifications.html.
11. C. Meadows. Analysis of the Internet Key Exchange protocol using the NRL Protocol Analyzer. In SSP-99, pages 216–231. IEEE Comp. Society Press, 1999.
12. T. Nipkow, L. C. Paulson, and M. Wenzel. Isabelle/HOL: A Proof Assistant for Higher-Order Logic. Springer, 2002. LNCS Tutorial 2283.
13. L. C. Paulson. The inductive approach to verifying cryptographic protocols. J. of Comp. Sec., 6:85–128, 1998.
14. L. C. Paulson. Inductive analysis of the internet protocol TLS. ACM Trans. on Inform. and Sys. Sec., 2(3):332–351, 1999.
15. L. C. Paulson. Relations between secrets: Two formal analyses of the Yahalom protocol. J. of Comp. Sec., 9(3):197–216, 2001.
16. P. Ryan and S. Schneider. An attack on a recursive authentication protocol: A cautionary tale. Inform. Processing Lett., 65(1):7–10, 1998.
Interacting State Machines: A Stateful Approach to Proving Security
David von Oheimb
Siemens AG, Corporate Technology, D-81730 Munich
[email protected]
Abstract. We introduce Interacting State Machines (ISMs), a general formalism for abstract modeling and verification of reactive systems. We motivate and explain the concept of ISMs and describe their graphical representation with the CASE tool AutoFocus. The semantics of ISMs is defined using Higher-Order Logic within the theorem prover Isabelle. ISMs can be seen as high-level variants of Input/Output Automata, so we also give a semantic translation from ISMs to IOAs. Using the “benchmark” example of Lowe’s fix of the Needham-Schroeder protocol, we demonstrate the strengths of the ISM approach for expressing and proving security properties in an elegant and machine-checked way.
1 Motivation and Related Work
When investigating the correctness, safety and security of complex IT systems, formal modeling and verification of their key properties are essential for fulfilling strong quality requirements. A prominent example of this approach is security analysis according to the upper evaluation assurance levels defined in the IT security evaluation criteria catalog ITSEC [ITS91] and its successor, the Common Criteria [CC99]. We present a formalism and tool support facilitating abstract formal analysis of a wide range of reactive IT systems including smart cards, embedded systems, network protocols, operating systems, databases, etc. During the development of the framework [NO02], an important requirement was that it must be simple and practical enough for industrial application, where results are to be obtained quickly and with limited effort. Therefore it should allow one to express the key aspects of such systems in a convenient, flexible, and intuitive way. It should be supported by a well-developed theory and mature tools for verification and for textual as well as graphical documentation. From an abstract perspective, common to the IT systems mentioned above is the notion of concurrently running distributed components with local state that interact, typically in an asynchronous way, via messages. Thus any modeling framework for such systems should provide built-in capabilities for both state transitions and buffered communication. Many classical system modeling techniques focus either on state transitions, like e.g. the B [Abr96] and Z [Spi92] notations, or on interaction, like the process algebra CSP [Hoa80] and the Pi-calculus [MPW92]. There are also efforts to combine the best of both approaches,
e.g. translating CSP to B [But99] or Z to CSP [Fis00]. The drawback of such hybrids is that the user has to deal with two different non-trivial formalisms. Moreover, theorem proving support respecting the structure of the mixed-style specifications seems not to be available. There are at least three formalisms – with confusingly similar, or even equal, names – that pursue the approach of extending non-interactive automata with explicit input/output: both [MIB98] and [Jür02b, 3] build their definitions on Gurevich’s ASMs [Gur97] and consequently call the resulting formalisms Interacting Abstract State Machines. Common to both approaches is the use of unordered input/output buffers, which also constitutes the main conceptual difference to our approach. AutoFocus automata [HSSS96], as well as Interactive Algebraic State Machines [Jür02a], provide essentially unbuffered clock-synchronous communication. Focus [BS01] uses the very abstract and mathematically elegant – yet on the other hand more difficult – model of stream-processing functions to describe the behavior of reactive systems. For none of these formalisms is mechanical theorem proving support available. There is a simple, well-developed formalism that has been designed for modeling state-oriented asynchronous distributed computation from the outset: I/O Automata (IOAs) [LT89]. IOAs come with a mature meta theory that offers compositional refinement. System properties, both safety and liveness ones, may be described using temporal logics and proved manually or with suitable tools. The IOA approach has been implemented by Müller [Mül98] using the theorem prover Isabelle [Pau94]. This implementation supports not only interactive verification but also model checking [Ham99]. Thus IOAs seem to be a good candidate for the desired framework, but from our perspective they suffer from one severe drawback: their interaction scheme is rather low-level. Buffered communication has to be modeled explicitly, and transitions involving several related input, internal processing, and output activities cannot be expressed atomically. Instead, any such transition has to be split into multiple low-level transitions, and between these, any number of further input events may take place due to the input-enabledness of IOAs. This typically makes both modeling and verification rather cumbersome. Our solution to avoid these disadvantages is to add extra structure, essentially by designating parts of the local state of an automaton as input/output buffers and introducing transitions with simultaneous input/output, inspired by AutoFocus automata [HSSS96]. The notion of ISMs implements these ideas. In contrast to the article [OL02], which emphasizes the application of ISMs to security modeling, the present article focusses on the ISM semantics and on verification techniques, in particular for analyzing authentication protocols.
2 Interacting State Machines
In this section, which is the core of the current article, we introduce the notion of Interacting State Machines both informally and mathematically. Then we introduce their graphical representation within AutoFocus and their textual
representation within Isabelle/HOL. Finally, we give a semantic translation of ISMs to IOAs.
2.1 Concepts
An Interacting State Machine (ISM) is an automaton whose state transitions may involve multiple inputs and outputs simultaneously on any number of ports. As the name suggests, the key concepts of ISMs are states (and in particular the transitions between them) and interaction. By interaction we mean explicit buffered communication via named ports (which are also called connections), where on each port one receiver listens to possibly many senders. An ISM system is the interleaved parallel composition of any number of ISM components, where the state of the whole system is essentially the Cartesian product of the states of its components. The state of an ISM consists of its input buffers and a local state. The local state may have arbitrary structure but typically is the Cartesian product of a control state, which is of finite type, and a data state, which is a record of named fields representing local variables. Each ISM has a single local initial state. (If a non-singleton set of initial states is required, this may be simulated by nondeterministic spontaneous transitions from a single dummy initial state.)
Trans
Out
Local State: Control State
Data State
Fig. 1. ISM structure
Each ISM declares two sets of port names, one for input and the other for output. The input buffers are a family, indexed by the port names, of (unbounded) message FIFOs. Message exchange is triggered by any outputting ISM within the system or by the environment. Inputs cannot be blocked, i.e. they may occur at any time, appending the received value to the corresponding FIFO. Values stored in the input buffers of an ISM may be processed by the ISM when it is ready to do so. This is done by user-defined transitions, which may be nondeterministic and can be specified in any relational style. Thus the user has the choice to define them in an operational (i.e., executable) or axiomatic (i.e., property-oriented) fashion or a mixture of the two. Transition rules specify that – potentially under some precondition that may include matching of messages in the input buffers – the ISM consumes as much input from its buffers as appropriate, makes a local state transition, and produces some output. The output 1
If a non-singleton set of initial states is required, this may be simulated by nondeterministic spontaneous transitions from a single dummy initial state.
is forwarded to the input buffers of all ISMs listening on the respective ports, which may result in direct or indirect feedback. A run of an ISM or ISM system is any finite (but unbounded) prefix of the sequence of configurations reachable from the initial configuration. (Finiteness allows for a simple trace semantics, but on the other hand implies that we cannot handle liveness properties. Yet we do not consider this a real restriction, because security properties are essentially safety properties: if they involve guarantees about the existence of future events at all, these typically involve timeouts.) Transitions of different ISMs that are composed in parallel are related only by causality with respect to the messages interchanged. Execution gets stuck when there is no component that can perform any step. As is typical for reactive systems, there is no built-in notion of final or "accepting" states.
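For a first concrete impression of this structure, the following OCaml sketch (our own illustration; all names are invented) models the state of a single ISM as per-port input FIFOs plus a local state split into control and data parts:

(* Sketch of an ISM state: per-port input FIFOs plus a local state. *)
type port = string

type ('m, 'cs, 'ds) ism_state = {
  inbuf   : (port * 'm list) list;  (* unbounded message FIFO per input port *)
  control : 'cs;                    (* control state, of finite type *)
  data    : 'ds;                    (* data state: record of local variables *)
}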
2.2 Semantics
This subsection gives the logical meaning of ISMs in detail. It is meant as a precise reference and may be skipped on a first reading of the article.

Message Families  Let M be the type of all messages potentially exchanged by ISMs and PN the type of port names. Then the message families, which are used to denote both input buffers and output patterns, have type MSGs = PN → M*, where M* is the type of finite sequences of elements of M. The symbol ⋄ denotes the empty message family λp. ⟨⟩, the term dom(m) abbreviates {p. m(p) ≠ ⟨⟩}, i.e. the domain of a message family m, and the infix operation m .@. n denotes the pointwise concatenation λp. m(p) @ n(p) of two message families m and n.

States and Transitions  The type of an ISM state is STATE(Σ) = MSGs × Σ, where the parameter Σ stands for the type of the local state. The set of transitions has type TRANS(Σ) = ℘(STATE(Σ) × MSGs × STATE(Σ)). Each of its elements has the form ((i, σ), o, (i′, σ′)) and means that the ISM can perform a step from local state σ to σ′, taking the current input buffer contents i to i′ (thus consuming as much input as required) and producing output o. Recall that i, i′ and o each denote whole families of message FIFOs.

Single Automata  An ISM is given as a quadruple a = (In, Out, σ₀, Trans(a)) of type ISM(Σ) = ℘(PN) × ℘(PN) × Σ × TRANS(Σ) where
– In is the set of input port names,
– Out is the set of output port names,
– σ₀ is the initial local state, and
– Trans(a) is the transition relation.
Such a definition is well-formed iff all the port names actually used in the transitions for input or output are contained in the sets In or Out, respectively. Note that In and Out may overlap, which enables direct feedback.
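A direct OCaml reading of message families and the operations just introduced might look as follows; this is a sketch under the assumption that ports are strings and that an absent port denotes the empty FIFO:

(* Message families MSGs = PN -> M*, modeled as association lists. *)
type 'm msgs = (string * 'm list) list

(* look up the FIFO of a port; absent ports have the empty FIFO *)
let get (m : 'm msgs) (p : string) : 'm list =
  try List.assoc p m with Not_found -> []

(* dom(m): the ports holding a non-empty FIFO *)
let dom (m : 'm msgs) : string list =
  List.filter (fun (_, q) -> q <> []) m |> List.map fst

(* pointwise concatenation m .@. n *)
let concat (m : 'm msgs) (n : 'm msgs) : 'm msgs =
  let ports = List.sort_uniq compare (List.map fst m @ List.map fst n) in
  List.map (fun p -> (p, get m p @ get n p)) ports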
Runs  The runs Runs(a) ∈ ℘((MSGs × STATE(Σ))*) of an ISM a are finite sequences of configurations, inductively defined by the rules

    dom(in) ⊆ In
    ─────────────────────────
    ⟨(⋄, (in, σ₀))⟩ ∈ Runs(a)

    cs ⌢ ⟨(o, (b, σ))⟩ ∈ Runs(a)    ((b, σ), o′, (b′, σ′)) ∈ Trans(a)    dom(in) ⊆ In
    ──────────────────────────────────────────────────────────────────────
    cs ⌢ ⟨(o, (b, σ)), (o′, (b′ .@. in, σ′))⟩ ∈ Runs(a)

The operator ⌢ appends elements to a sequence. ISM traces have the form ⟨(⋄, (in₀, σ₀)), (o₁, (b₁ .@. in₁, σ₁)), (o₂, (b₂ .@. in₂, σ₂)), ...⟩, where each element of the sequence consists of the output, the input buffer contents, and the local state of the current step. Note that in each step the environment can provide arbitrary input in for the next step of the ISM. All ISM output is visible in the trace, whereas output not used for internal communication or feedback (within parallel composition, as described below) is discarded.

Parallel Composition  The parallel composition ‖i∈I ai (with global input buffers, not directly supporting multicast) of a family of ISMs A = (ai)i∈I is an ISM defined as the quadruple (AllIn\AllOut, AllOut\AllIn, S₀, PTrans(A)):
– AllIn = ⋃i∈I Ini
– AllOut = ⋃i∈I Outi
– S₀ = Πi∈I (σ₀)i is the Cartesian product of all initial local states
– PTrans(A) ∈ TRANS(Πi∈I Σi) is the parallel composition of the transition relations, defined by

    j ∈ I    ((b, σ), o′, (b′, σ′)) ∈ Trans(aj)
    ──────────────────────────────────────────────────────────────────
    ((b, S[j ↦ σ]), o′|PN\AllIn, (b′ .@. o′|AllIn, S[j ↦ σ′])) ∈ PTrans(A)

S[j ↦ σ] denotes the replacement of the j-th component of the tuple S by σ. m|C denotes the restriction λp. if p ∈ C then m(p) else ⟨⟩ of the message family m to the set C. The subterm o′|PN\AllIn denotes those parts of the output o′ provided to the environment, while o′|AllIn denotes the internal output to peer ISMs or feedback, which is added to the current buffer contents b′. A parallel composition is well-formed iff all its components are well-formed. Note that there are no inter-component restrictions. This means in particular that inputs of different components may overlap (which leads to competition on inputs without fairness guarantees) and outputs may overlap as well (which leads to nondeterministic interleaving of outputs). An ISM system is called closed if AllIn = AllOut, i.e. there is no interaction with the environment. When composing ISMs, it is occasionally necessary to prevent name clashes or to hide connections, which can be achieved by suitable renaming of ports.
Parallel Runs  For simplicity, we define the runs of a parallel composition also directly (yet non-compositionally) as PRuns(A) ∈ ℘((STATE(Πi∈I Σi))*), where we do not include ISM output in the trace:

    dom(in) ⊆ AllIn\AllOut
    ──────────────────────
    ⟨(in, S₀)⟩ ∈ PRuns(A)

    j ∈ I    cs ⌢ ⟨(b, S[j ↦ σ])⟩ ∈ PRuns(A)    ((b, σ), o′, (b′, σ′)) ∈ Trans(aj)    dom(in) ⊆ AllIn\AllOut
    ─────────────────────────────────────────────────────────────────────
    cs ⌢ ⟨(b, S[j ↦ σ]), (b′ .@. o′|AllIn .@. in, S[j ↦ σ′])⟩ ∈ PRuns(A)

One can easily show that running ISMs directly in parallel is equivalent to first composing the components in parallel, then running the system and projecting the output away from its trace: PRuns(A) = {map π₂ (cs) | cs ∈ Runs(‖i∈I ai)}.
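Operationally, one step of a run can be read as follows. This OCaml sketch reuses the msgs type and concat from above, takes the user-defined transition relation as a function enumerating successor triples (o′, (b′, σ′)), and lets env stand for the arbitrary environment input of the next step; all names are ours:

(* One run step: apply a transition, then append environment input. *)
type ('m, 's) config = { out : 'm msgs; buf : 'm msgs; loc : 's }

let step (trans : 'm msgs * 's -> ('m msgs * ('m msgs * 's)) list)
         (env : 'm msgs) (c : ('m, 's) config) : ('m, 's) config list =
  trans (c.buf, c.loc)
  |> List.map (fun (o, (b', s')) -> { out = o; buf = concat b' env; loc = s' })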
2.3 Graphical Representation
When designing and presenting system models, a graphical representation is very helpful since it gives a good overview of the system structure and a quick intuition about its behavior. This is particularly important in an industrial setting: models are developed in collaboration with clients and documented for their further use, where strong familiarity with formal notations cannot be assumed. Therefore we have designed the structure of ISMs in such a way that they can be easily displayed using an already available graphical tool, in this case AutoFocus.
AutoFocus [HSSS96] is a freely available prototype CASE tool for the specification and simulation of distributed systems. Components and their behavior are specified by a combination of system structure diagrams (SSDs), state transition diagrams (STDs) and auxiliary data type definitions (DTDs). Their execution is visualized using extended event traces (EETs).
As an illustrating example, take two figures from our model of Lowe's fix of the classical Needham-Schroeder public-key authentication protocol [Low96]. This model, which we call NSL, will be described in more detail in 3. The system structure diagram in Figure 2 shows the four components with their local variables and the named connections between them, all including type information. The meaning of the diagram, i.e. the mapping to the ISM semantics, should be obvious.
Concerning the syntactic structure of systems, AutoFocus automata are richer than ISMs, which we have kept as basic as possible in order to simplify their semantics and to facilitate verification:
– We merge the AutoFocus notions of channels and ports into the notion of ports. The motivation for the more complex AutoFocus notions is easy re-use of components in different contexts. One can achieve an analogous effect by renaming ports where required.
Fig. 2. NSL System Structure Diagram: the four components Alice (local variables agent Apeer, nonce AnA, nonce AnB), Bob (agent Bpeer, nonce BnA, nonce BnB), Intruder (set(msg) known = initState(Intruder)), and NGen (set(nonce) used = N0), connected by the msg-typed ports AI and IA (between Alice and Intruder), IB and BI (between Intruder and Bob), and NA and NB (from NGen to Alice and Bob, respectively)
– AutoFocus offers direct support for multicasts, which can be simulated by replication of output.
– AutoFocus automata may be hierarchical, which can be simulated by hierarchical use of the parallel composition operator, together with renaming in order to hide internal connections.

The state transition diagram in Figure 3 shows the three control states of the ISM Alice and the transitions between them, which have the general format precondition : inputs : outputs : assignments (a small machine-readable rendering of this format is sketched after the figure). Each input is given by a port name, the ? symbol, and a message pattern, while each output is given by a port name, the ! symbol, and a message value. The initial state is marked with a black bullet on the left.
Fig. 3. NSL State Transition Diagram: Alice, with the three control states Init, Wait, and Fine and the transitions
Init → Wait:
  : NA ? Nonce(nA) : AI ! Crypt(pubK(B), {| Nonce(nA), Agent(Alice) |}) : Apeer := B, AnA := nA
Wait → Fine:
  nA = AnA & B = Apeer : IA ? Crypt(pubK(Alice), {| Nonce(nA), Nonce(nB), Agent(B) |}) : AI ! Crypt(pubK(B), {| Nonce(nB) |}) : AnB := nB
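To make the transition format concrete, the following OCaml record is one possible machine-readable rendering of precondition : inputs : outputs : assignments; the type and field names are ours, and patterns and expressions are kept as plain strings for brevity:

type transition_rule = {
  pre     : string list;             (* precondition guards *)
  inputs  : (string * string) list;  (* (port, message pattern), after '?' *)
  outputs : (string * string) list;  (* (port, message value), after '!' *)
  post    : (string * string) list;  (* (local variable, assigned value) *)
}

(* Alice's first transition from Fig. 3, transcribed into this record *)
let init_to_wait : transition_rule = {
  pre     = [];
  inputs  = [ ("NA", "Nonce(nA)") ];
  outputs = [ ("AI", "Crypt(pubK(B), {| Nonce(nA), Agent(Alice) |})") ];
  post    = [ ("Apeer", "B"); ("AnA", "nA") ];
}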
Concerning state transitions, the expressiveness of ISMs is higher than that of AutoFocus automata, because transition rules may be generic in the sense that each of them can describe a whole family of transitions:
– multiple source and/or destination control states
– non-constant port names, offering limited support for dynamic topologies
– unbounded nondeterminism concerning the values used for outputs and/or assignments
– changes to the local state can be given as arbitrary relational expressions

We use AutoFocus as a graphical front-end to our Isabelle implementation of ISMs. In a typical application of our framework, a user first "paints" ISMs using AutoFocus, saves them in the so-called Quest file format, and translates them into suitable Isabelle theory files, described in the next subsection, utilizing a tool program [Nan02, NO02]. Unfortunately, we cannot make use of the simulation, code generation and model checking capabilities of the current AutoFocus and its back ends, which may be acquired by purchase from Validas [S+]. This is because its underlying semantics is still clock-synchronous, due to the original emphasis of AutoFocus on embedded systems. In contrast, for most of our applications an asynchronous buffered semantics is more adequate. An alternative asynchronous semantics is currently under consideration for future versions of AutoFocus. In any case, the current semantic incompatibility is not a real obstacle to us, since we are interested mainly in the graphical capabilities of AutoFocus, and the AutoFocus syntax is general enough to cover our deviating semantics as well.
2.4 Isabelle/HOL Representation
When aiming at system verification, one has the fundamental choice between (automatic) model checking and (interactive) theorem proving techniques. We opt for the latter because the systems that we model are typically too complex for the capabilities of model checkers, and data abstraction techniques are either not applicable or would lead to counterintuitive modifications of the models. We employ Isabelle/HOL because of our excellent experience with this tool. Isabelle [Pau94] is a generic interactive theorem prover that has been instantiated to many logics, in particular the very practical Higher-Order Logic (HOL). The only drawback of using Isabelle/HOL for applications like ours is the lack of dependent types: for each system modeled there is a single type of message contents into which all message data has to be injected, and the same holds for the local states of automata. Despite this nuisance, we consider Isabelle/HOL the most flexible and mature verification environment available. Using it, security properties can be expressed easily and adequately, and can be verified using powerful proof methods. Furthermore, Isabelle offers good facilities for textual presentation and documentation.
In order to represent ISMs in Isabelle theories in an adequately abstract way that has an (almost) one-to-one correspondence to the AutoFocus
representation, we have designed a new theory section for Isabelle/HOL. This section is introduced by the keyword ism and has the following general structure³:

ism name =
  ports pn_type
    inputs I_pns
    outputs O_pns
  messages msg_type [buffers buffer_name]
  states [glob_state_type]
    [control cs_expr0 :: cs_type]
    [data ds_expr0 [, ds_name] :: ds_type]
  [transitions
    (tr_name: [cs_expr -> cs_expr']
       [pre (bool_expr)⁺]
       [in (I_pn I_msgs)⁺]
       [out (O_pn O_msgs)⁺]
       [post ((lvar_name := expr)⁺ | ds_expr')]
    )⁺
  ]

The meaning of the individual parts is as follows.
– pn_type is the Isabelle/HOL type of the port names, while I_pns and O_pns denote the sets of input and output port names, respectively.
– msg_type is the type of the messages, which is typically an algebraic datatype with a constructor for each kind of message. The optional name buffer_name, which defaults to ism_buffers, is the name of a logical variable that may be used to refer to the contents of the input buffers within transition rules.
– The optional glob_state_type should be given if the current ISM forms part of a parallel composition and the state types of the ISMs involved differ. In this case, glob_state_type should be a free algebraic datatype with a constructor for each state type of the ISMs involved.
– cs_expr0 and ds_expr0 specify the initial values of the control and data state, respectively, while cs_type and ds_type give their types. Either the control state or the data state (i.e., not both!) may be absent. The optional logical variable name ds_name, which defaults to st, may be used to refer to the whole data state within transition rules.
– Transitions are given via named rules. The control state (if any) before and after the transition is specified by the expressions⁴ cs_expr and cs_expr'. Any expression within a rule may refer to the two logical variables mentioned above. In particular, the value of any local variable lvar of the ISM may be referred to by st lvar if st is the name of the data state variable. The scope of free variables appearing in a rule is the whole rule, i.e. free variables are implicitly universally quantified (immediately) outside each rule.
³ [...] means optional parts; (...)⁺ means one or more comma-delimited occurrences.
⁴ These need not be constant but may also contain variables, which is useful for modeling generic transitions. In this case, one such transition has to be represented by a set of transitions within AutoFocus.
All the following parts of a transition rule are optional:
– The pre part contains guard expressions bool_expr, i.e. preconditions constraining the transition.
– The in part gives a set of input port names I_pn, each in conjunction with a list I_msgs of message patterns expected to be present in the corresponding input buffer. When performing a transition, free variables in the patterns are bound to the actual values that have been input. Any input port not explicitly mentioned is left untouched.
– The out part gives a set of output port names O_pn, each in conjunction with an expression O_msgs denoting a list of values designated for output to the corresponding port. Any output port not mentioned does not obtain new output.
– The post part describes assignments of values expr to the local variables lvar_name of the data state. Variables not mentioned remain invariant. Alternatively, an expression ds_expr' may be given that represents the new data state after the transition. Assignments to the local variables suit an operational style, whereas an axiomatic style can be achieved using ds_expr' and suitable constraints in the preconditions.

An ism theory section as described above is translated to standard Isabelle concepts in a straightforward way using an extension to Isabelle/HOL, as described in [Nan02]. In particular, each ISM section is translated to a record with the appropriate fields, the most complex one being the transition relation, which is defined via an inductive (but not actually recursive) definition. The meta theory of ISMs that we have defined in Isabelle/HOL includes all concepts mentioned in 2.2, in particular well-formedness, renaming, parallel composition, runs, and composite runs. Further auxiliary concepts are introduced as well, in particular reachability and several induction schemes related to ISM runs; two of these will be given in 3.3 and 3.4. The characteristic properties of these concepts, as required for system verification, are derived within Isabelle/HOL. All details of the meta theory may be found in [Ohe02].
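Putting these pieces together, the Alice component of Figures 2 and 3 might be rendered in this section format roughly as follows. This is our reconstruction for illustrative purposes only – the authoritative version is given in the appendix, and the type names (acontrol, adata), rule names, and initial data values here are guesses:

ism Alice =
  ports port
    inputs {NA, IA}
    outputs {AI}
  messages msg
  states
    control Init :: acontrol
    data (|Apeer = anyAgent, AnA = anyNonce, AnB = anyNonce|) :: adata
  transitions
    (init_wait: Init -> Wait
       in NA [Nonce nA]
       out AI [Crypt (pubK B) {|Nonce nA, Agent Alice|}]
       post Apeer := B, AnA := nA)
    (wait_fine: Wait -> Fine
       pre (nA = st AnA), (B = st Apeer)
       in IA [Crypt (pubK Alice) {|Nonce nA, Nonce nB, Agent B|}]
       out AI [Crypt (pubK B) {|Nonce nB|}]
       post AnB := nB)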
2.5 IOA Semantics
Besides the standard semantics of ISM runs given in 2.2, our Isabelle/HOL formalization [Ohe02] provides two alternatives:
– the clock-synchronous semantics of AutoFocus mentioned in 2.3, which we do not describe further in this article, and
– a translation of ISMs to special instances of Input/Output Automata [LT89], which yields an (essentially) equivalent semantics.
In this subsection we give a semi-formal description of the latter translation in order to show both the similarities and the differences of the two automata concepts.
The intuition behind the translation is that there is a one-to-one correspondence between IOAs and ISMs if each IOA is augmented by a pair of input and
output buffers holding the built-in input buffers of the corresponding ISM and the ISM output not yet transmitted, respectively. Each ISM transition involving input of n messages and output of m messages is split into n + 1 + m IOA actions (which may be interleaved with other actions): after the n input messages have arrived and have been stored via n (singleton) input actions of the IOA, an internal action performs the transition of the local state, consuming the n messages from the input buffers and appending the m output messages to the output buffers. These messages are later transmitted to their recipients via m (singleton) output actions. The internal actions within any one IOA need not be distinguished, while the internal actions of different IOAs must be kept distinct, which we achieve easily by augmenting them with the name of the ISM they are derived from. Recalling that IOAs communicate by synchronizing the sender and potentially many receivers on equal external actions, it becomes clear that each message sent or received by an ISM on a given port has to be represented on the IOA level by an external action holding both the port name and the message content.
More formally, a well-formed ISM C whose input and output ports do not overlap is translated to an IOA A as follows.
– sig(A), the action signature, is the triple S = (in(S), out(S), int(S)) where
  • in(S), the set of input actions, contains all entities of the form Extern pn v, where pn is an input port name and v any message potentially transferred on the port pn;
  • out(S), the set of output actions, contains all entities of the form Extern pn v, where pn is an output port name and v any message potentially transferred on the port pn;
  • int(S), the set of internal actions, contains the single entity Step C, where C is the name of the ISM.
– states(A), the set of IOA states, contains all possible configurations, i.e. tuples of the form (o, (i, σ)) giving the current output and input buffer state as well as the local state of the ISM.
– start(A), the set of initial states, contains the single element (⋄, (⋄, σ₀)) representing the empty output buffers, the empty input buffers, and the initial local state σ₀ of C.
– steps(A), the transition relation, consists of the following three sorts of transitions:
  • for each input port name pn of C and any value v that is potentially input on the given port, there is a transition labeled Extern pn v that appends v to the input buffer associated with pn;
  • for each output port name pn of C and any value v currently at the head position of the output buffer associated with the port pn, there is a transition labeled by the output action Extern pn v that removes v from the buffer;
  • for each (user-defined) transition of C there is a transition labeled Step C that reflects the change to the local state, removes from the input buffers all that is consumed, and appends all output to the output buffers.
– part(A), the equivalence relation used for describing fairness, is empty since we do not consider fairness, i.e. we define a so-called safe IOA [Mül98, Definition 2.2.2].
Note that the resulting automaton A is input-enabled, i.e. it accepts input in any state, and has a well-formed signature S, i.e. in(S), out(S) and int(S) are pairwise disjoint. We have designed the format of the actions of A carefully such that it is also possible to perform the inverse translation from A to C, which essentially means projecting the input and output actions to input and output port names, removing the output buffers from the state, and keeping only the user-defined transitions, where the ISM output is determined by the current difference of the output buffer states. It can easily be shown that the translation from ISMs to IOAs and back to ISMs is the identity mapping (even for ISMs that are not well-formed). When considering finite executions only and disallowing the overlapping of input ports of ISMs composed in parallel (which would lead to multicasts according to the IOA semantics and nondeterminism according to the ISM semantics), the direct ISM semantics and the one obtained by translation to IOAs are equivalent. In particular, for any set of well-formed ISMs whose output ports do not overlap (which is a standard precondition for IOA composition), the given translation commutes with parallel composition on the ISM and IOA levels.
By the translation of ISMs to IOAs we get access to all the concepts available for IOAs. Reasoning about the properties of ISMs can be done on the level of IOAs, which means that we do not have to develop all proof methodology and tools from scratch. Thus, in particular, compositional reasoning on automaton refinement carries over to ISMs, as well as model checking support [Ham99]. Of course, it is desirable to transfer these concepts from the IOA level to the ISM level, which we do as far as our applications require and our development capacity allows. We can conclude from this subsection that the expressive power of ISMs and IOAs is the same, while ISMs offer higher-level transitions and thus more structure, which allows reactive systems to be modeled and verified more adequately.
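As a small sketch of the action splitting described above, the following OCaml fragment enumerates the n + 1 + m IOA actions corresponding to one ISM transition; the constructors mirror the Extern and Step entities, while the function and parameter names are ours:

type action =
  | Extern of string * string  (* port name, message value *)
  | Step of string             (* internal action, tagged with the ISM name *)

(* n buffered input actions, one internal step, m output actions *)
let ioa_actions ~ism_name ~(inputs : (string * string) list)
                ~(outputs : (string * string) list) : action list =
  List.map (fun (p, v) -> Extern (p, v)) inputs
  @ [ Step ism_name ]
  @ List.map (fun (p, v) -> Extern (p, v)) outputs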
3 Authentication Protocol Verification
In this section we give an example of using the ISM framework for the verification of security properties in Isabelle/HOL. In order to allow for an easy comparison with other theorem proving approaches, in particular [Sch97], [Pau98] and [Coh00], we take the well-known example of the classical Needham-Schroeder public-key authentication protocol in the version fixed by Lowe [Low96].
3.1 Modeling
The honest agent Alice tries to initiate a single connection with Bob with the help of a nonce server NGen but in the presence of an attacker (according to
the Dolev-Yao model) called Intruder. A diagram of the global structure of the system model as well as the state transition diagram for Alice is given in 2.3, while the textual representation of the four ISMs, each describing the behavior of one system component, can be found in the appendix. Our Isabelle/HOL formalization inherits the representation of messages and the intruder's capabilities from Paulson's protocol verification theories explained in [Pau98]. In comparison to the versions given in [OL02], the model is slightly extended: we have added the local variable AnB to the state of Alice, representing her view of the nonce received from her peer (which hopefully is the responder Bob).
In the remainder of this subsection we extend the comparison, given in [OL02], of our NSL model with Paulson's model [Pau98] to Schneider's model [Sch97]. There are major differences in two aspects. The first of them is obvious: Schneider's approach uses the CSP notation, which does not have a notion of state; therefore system properties can be stated only with respect to message exchange. Sometimes even artificial messages have to be introduced just for specification and verification purposes. The same holds for Paulson's approach, which is not based on CSP but also operates on message traces only. The second is more subtle: Schneider makes the implicit assumption that the nonces used by the honest agents are known in advance and that all of them are distinct. Paulson's model generates fresh nonces by choosing a value not already contained in the current trace of the system. Our model makes the computation and distinctness of nonces explicit by introducing the nonce server NGen.
3.2 Authentication Theorem
We aim to prove the most critical property of NSL, namely authentication of Alice to Bob, including session agreement. In our state-oriented ISM approach, this can be expressed very naturally, namely entirely on the basis of the states (in particular, knowledge and expectations) of the two honest agents:

theorem Bob_trusts_NSL: "Alice ∉ bad −→
  (b,s)#cs ∈ PRuns(NSL) −→
  (∃nA. Bob_state s = (Conn, (|Bpeer = Alice, BnA = nA, BnB = nB|))) −→
  (∃(b’,s’) ∈ set cs.
     (∃nA. Alice_state s’ = (Fine, (|Apeer = Bob, AnA = nA, AnB = nB|))))"
If Alice does not give away her private key and in some state of a run⁵ of NSL Bob believes he is connected with Alice in a session identified by the nonce nB that he had brought up, then some time earlier Alice has indeed accepted a connection with Bob identified by the same nonce nB.
3.3 Invariant Proof
We prove the theorem using a variant of Schneider's rank function approach [Sch97]. Essentially, we show that as long as Alice does not reach the state of being happily connected to Bob in a session identified by nB (which we formulate
⁵ For technical reasons, ISM traces are constructed from right to left in Isabelle/HOL.
as Alice_Fine_with Bob nB below), Bob cannot receive the final acknowledgment concerning nB (which is implied by the predicate noleak_all nB below) when trying to accept a connection with Alice (expressed by Bob_Resp_only_to Alice). Freshness of nonces, which we deal with in the next subsection, also plays a role, captured by the predicate Alice_Wait_nA nB. The key lemma just paraphrased can be stated formally in Isabelle/HOL as

lemma Bob_trusts_NSL_lemma: "Alice ∉ bad −→ cs ∈ PRuns(NSL) −→
  (∀(b,s) ∈ set cs. constrain nB s) −→ (∀c ∈ set cs. noleak_all nB c)"
where the predicate on the left-hand side of the implication is defined as

"constrain nB s ≡ ¬Alice_Wait_nA nB s ∧ Bob_Resp_only_to Alice s ∧
                  ¬Alice_Fine_with Bob nB s"
"Alice_Wait_nA nB s ≡ ∃as. Alice_state s = (Wait, as) ∧ AnA as = nB"
"Bob_Resp_only_to A s ≡ ∀bs. Bob_state s = (Resp, bs) −→ Bpeer bs = A"
"Alice_Fine_with B nB s ≡ ∃nA. Alice_state s = (Fine, (|Apeer = B, AnA = nA, AnB = nB|))"
and the right-hand side is a simplified version of the invariant given in [Sch97], adapted to use within our stateful approach. It lifts the predicate noleak over all messages in the channels between the two agents and the intruder, and over all information in the intruder's state:

"noleak_all nB (b,s) ≡ ∀p ∉ {NA, NB}. (∀m ∈ set (b p). noleak nB m) ∧
                       (∀m ∈ Intruder_known s. noleak nB m)"
"noleak nB (Agent a)   = True"
"noleak nB (Nonce n)   = (n ≠ nB)"
"noleak nB (Key k)     = (Key k ∈ initState Intruder)"
"noleak nB {|m, n|}    = (noleak nB m ∧ noleak nB n)"
"noleak nB (Crypt k m) = (k = pubK Alice ∧ (∃n. m = {|n, Nonce nB, Agent Bob|}) ∨
                          noleak nB m)"
The protocol-specific predicate noleak nB describes a superset of the messages actually transmitted by any agent (including the intruder) during runs of the constrained protocol, such that these messages cannot lead to sending Crypt (pubK Bob) (Nonce nB). This means in particular that nB must not be leaked to the intruder. In this case, one can prove that the intruder is unable to derive nB, i.e. noleak nB is an invariant of the intruder's behavior:

lemma init_noleak: "∀m ∈ initState Intruder. noleak nB m"
lemma synth_analz_maintains_noleak:
  "Alice ∉ bad −→ (∀m ∈ known. noleak nB m) −→
   (∀m ∈ synth (analz known). noleak nB m)"
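Read operationally, noleak is a structural predicate on messages. The following OCaml transcription may help; it assumes a message datatype in the spirit of Paulson's msg, with {|m, n|} read as nested pairs and key/nonce identifiers as integers – all concrete names here are ours:

type msg =
  | Agent of string
  | Nonce of int
  | Key   of int
  | Pair  of msg * msg        (* {| m, n |} *)
  | Crypt of int * msg        (* Crypt k m *)

let pubk_alice : int = 0      (* hypothetical identifier for pubK Alice *)

let rec noleak ~nB ~intruder_keys (m : msg) : bool =
  match m with
  | Agent _ -> true
  | Nonce n -> n <> nB                  (* nB itself must not occur *)
  | Key k -> List.mem k intruder_keys   (* only initially known keys *)
  | Pair (m1, m2) ->
      noleak ~nB ~intruder_keys m1 && noleak ~nB ~intruder_keys m2
  | Crypt (k, m') ->
      (k = pubk_alice
       && (match m' with                (* {| n, Nonce nB, Agent Bob |} *)
           | Pair (_, Pair (Nonce n', Agent b)) -> n' = nB && b = "Bob"
           | _ -> false))
      || noleak ~nB ~intruder_keys m'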
The great advantage of the given formulation of the key lemma is that it has the form of an invariant that can be proved by induction on the protocol runs. Since both sides of the rightmost implication in the key lemma are universal quantifications over the configurations contained in a trace, the induction can be done rather conveniently. In this case we can make use of the following (derived) induction rule for a well-formed closed system of ISMs A = (aᵢ)ᵢ∈I:
    P((⋄, S₀)) −→ Q((⋄, S₀))

    ∀j b σ o′ b′ σ′ cs S. (j ∈ I ∧ ((b, σ), o′, (b′, σ′)) ∈ Trans(aj) ∧
        cs ⌢ ⟨(b, S[j ↦ σ])⟩ ∈ PRuns(A) ∧ (∀c ∈ cs ⌢ ⟨(b, S[j ↦ σ])⟩. P(c) ∧ Q(c)) −→
        P((b′ .@. o′, S[j ↦ σ′])) −→ Q((b′ .@. o′, S[j ↦ σ′])))
    ─────────────────────────────────────────────────────────────
    ∀cs ∈ PRuns(A). (∀c ∈ cs. P(c)) −→ (∀c ∈ cs. Q(c))
Just observe that in order to apply this rule, one has only to show Q((⋄, S₀)) and Q((b′ .@. o′, S[j ↦ σ′])), while all other formulas are premises that may be taken advantage of. Proving the key lemma – including subproofs – takes about 70 lines of proof script (in the conventional semi-automatic tactics style of Isabelle), while applying it to get the main result takes about 20 lines.
3.4 Freshness Proof
We take about 40 extra lines of proof script to prove the freshness of the nonces produced by NGen. The lemma interfacing this fact to the above proofs reads as

lemma Alice_Wait_not_BnB: "(b,s)#cs ∈ PRuns(NSL) −→ Bob_state s = (Resp, bs) −→
  (∀(b’,s’) ∈ set ((b,s)#cs). ¬Alice_Wait_nA (BnB bs) s’)"
which means that the nonce that Alice had brought up, and which she expects to get back from her peer, is not the one brought up by Bob. We transform this proof goal into one stating that the nonce Alice expects can only be a value received on her port NA:

lemma Alice_Wait_NA: "cs ∈ PRuns(NSL) −→
  (∀(b,s) ∈ set cs. Nonce nB ∉ set (b NA)) −→
  (∀(b,s) ∈ set cs. ¬Alice_Wait_nA nB s)"
which can be proved easily by making use of the above induction rule again. The transformation requires eight straightforward freshness lemmas like

lemma NB_disjoint_NA_past: "(b,s)#cs ∈ PRuns(NSL) =⇒ Nonce n ∈ set (b NB) −→
  (b’,s’) ∈ set cs −→ Nonce n ∉ set (b’ NA)"
all of which are proved in a very schematic way using the following simple induction rule for well-formed closed systems of ISMs A = (aᵢ)ᵢ∈I:

    P(⟨⟩, ⋄, S₀)

    ∀j b σ o′ b′ σ′ cs S. (j ∈ I ∧ ((b, σ), o′, (b′, σ′)) ∈ Trans(aj) ∧
        cs ⌢ ⟨(b, S[j ↦ σ])⟩ ∈ PRuns(A) ∧ P(cs, b, S[j ↦ σ]) −→
        P(cs ⌢ ⟨(b, S[j ↦ σ])⟩, b′ .@. o′, S[j ↦ σ′]))
    ───────────────────────────────────────
    ∀cs ⌢ ⟨(b, s)⟩ ∈ PRuns(A). P(cs, b, s)
All details of the proofs may be found in [Ohe02].
4 Conclusion and Future Work
We have introduced Interacting State Machines, a formalism for modeling and verifying the correctness and security of reactive state transition systems.
ISMs can be seen as high-level Input/Output Automata with the same expressiveness but significantly improved structuring. The ISM semantics can be translated to the IOA semantics, inheriting all its semantic concepts (except for fairness and liveness) as well as proof support. Yet for practical reasons we prefer to define, respectively implement, them directly on the ISM level. Future work includes carrying over the concepts of refinement, compositionality and temporal logics, as well as the related proof methods and tools. For applications of restricted complexity, model checking support might be useful.
ISMs can also be viewed as variants of AutoFocus automata with asynchronous buffered communication. Users of the approach can specify and present ISMs graphically as AutoFocus diagrams and then translate them to Isabelle theories. Alternatively, they may define ISMs directly within Isabelle/HOL. The Isabelle representation enforces and supports fully formal – and thus maximally reliable – system modeling and verification. For verification, the powerful semi-automatic proof tools of Isabelle/HOL are available.
As our example of using ISMs for authentication protocol analysis shows, our stateful approach (in contrast to, e.g., Schneider's approach using CSP [Sch97], Paulson's inductive method [Pau98] or Cohen's TAPS [Coh00], which deal with communication events only) turns out to make system modeling as well as the formulation of security properties rather intuitive. The proofs also provide more insights (via the invariants and freshness properties required), while due to the extra amount of detail, our proofs for NSL take more effort than Schneider's and in particular Paulson's and Cohen's.
Acknowledgments. We thank Volkmar Lotz, Thomas Kuhn, Haykal Tej, Jan Jürjens, Guido Wimmel and several anonymous referees for fruitful discussions and their comments on earlier versions of this paper.
References

[Abr96] Jean-Raymond Abrial. The B-Book: Assigning Programs to Meanings. Cambridge University Press, 1996.
[BS01] Manfred Broy and Ketil Stølen. Specification and Development of Interactive Systems. Springer, 2001.
[But99] Michael Butler. csp2B: A practical approach to combining CSP and B. In Proceedings of FM'99: World Congress on Formal Methods, pages 490–508, 1999.
[CC99] Common Criteria for Information Technology Security Evaluation (CC), Version 2.1, 1999. ISO/IEC 15408.
[Coh00] Ernie Cohen. TAPS: A first-order verifier for cryptographic protocols. In 13th IEEE Computer Security Foundations Workshop — CSFW'00, pages 144–158, Cambridge, UK, 3–5 July 2000. IEEE Computer Society Press.
[Fis00] Clemens Fischer. Combination and Implementation of Processes and Data: from CSP-OZ to Java. PhD thesis, Univ. of Oldenburg, 2000.
[Gur97] Y. Gurevich. Draft of the ASM guide. Technical Report CSE-TR-336-97, EECS Dept., University of Michigan, 1997.
[Ham99] Tobias Hamberger. Integrating theorem proving and model checking in Isabelle/IOA. Technical report, TU München, 1999.
[Hoa80] C. A. R. Hoare. Communicating sequential processes. In R. M. McKeag and A. M. Macnaghten, editors, On the Construction of Programs – an Advanced Course, pages 229–254. Cambridge University Press, 1980.
[HSSS96] Franz Huber, Bernhard Schätz, Alexander Schmidt, and Katharina Spies. AutoFocus – a tool for distributed systems specification. In Proceedings FTRTFT'96 – Formal Techniques in Real-Time and Fault-Tolerant Systems, volume 1135 of LNCS, pages 467–470. Springer-Verlag, 1996. See also http://autofocus.in.tum.de/index-e.html.
[ITS91] Information Technology Security Evaluation Criteria (ITSEC), 1991.
[Jür02a] Jan Jürjens. Algebraic state machines: Concepts and applications to security, 2002. Submitted for publication.
[Jür02b] Jan Jürjens. Principles for Secure Systems Design. PhD thesis, Oxford University Computing Laboratory, Trinity Term 2002.
[Low96] Gavin Lowe. Breaking and fixing the Needham-Schroeder public-key protocol using FDR. In Proceedings of TACAS, volume 1055 of LNCS, pages 147–166. Springer-Verlag, 1996.
[LT89] Nancy Lynch and Mark Tuttle. An introduction to input/output automata. CWI Quarterly, 2(3):219–246, 1989. http://theory.lcs.mit.edu/tds/papers/Lynch/CWI89.html.
[MIB98] M. Maia, V. Iorio, and R. Bigonha. Interacting Abstract State Machines. In Proceedings of the 28th Annual Conference of the German Society of Computer Science. Technical Report, Magdeburg University, 1998.
[MPW92] Robin Milner, Joachim Parrow, and David Walker. A calculus of mobile processes – parts I+II. Information and Computation, 100(1):1–77, September 1992.
[Mül98] Olaf Müller. A Verification Environment for I/O Automata Based on Formalized Meta-Theory. PhD thesis, Technische Universität München, 1998. See also http://isabelle.in.tum.de/IOA/.
[Nan02] Sebastian Nanz. Integration of CASE Tools and Theorem Provers: A Framework for System Modeling and Verification with AutoFocus and Isabelle. Master's thesis, TU München, 2002. http://home.in.tum.de/nanz/csthesis/.
[NO02] Sebastian Nanz and David von Oheimb. ISM Framework: Manual and distribution, 2002. http://ddvo.net/ISM/.
[Ohe02] David von Oheimb. The Isabelle/HOL implementation of Interacting State Machines, 2002. Technical documentation, available on request.
[OL02] David von Oheimb and Volkmar Lotz. Formal Security Analysis with Interacting State Machines. In Proc. of the 7th European Symposium on Research in Computer Security (ESORICS). Springer, 2002. http://ddvo.net/papers/FSA_ISM.html.
[Pau94] Lawrence C. Paulson. Isabelle: A Generic Theorem Prover, volume 828 of LNCS. Springer-Verlag, 1994. For an up-to-date description, see http://isabelle.in.tum.de/.
[Pau98] Lawrence C. Paulson. The inductive approach to verifying cryptographic protocols. Journal of Computer Security, 6:85–128, 1998.
[S+] Oscar Slotosch et al. Validas Model Validation AG. http://www.validas.de/.
[Sch97] Steve Schneider. Verifying authentication protocols with CSP. In Proceedings of the 10th Computer Security Foundations Workshop (CSFW). IEEE Computer Society Press, June 1997.
[Spi92] J. Mike Spivey. The Z Notation: A Reference Manual. Prentice Hall International Series in Computer Science, 2nd edition, 1992.
Automatic Approximation for the Verification of Cryptographic Protocols

Frédéric Oehl¹, Gérard Cece², Olga Kouchnarenko², and David Sinclair¹

¹ School of Computer Applications, Dublin City University
  {foehl,dsinclai}@computing.dcu.ie
² L.I.F.C., Université de Franche-Comté
  {cece,kouchnarenko}@univ-fcomte.fr
Abstract. This paper presents an approximation function developed for the verification of cryptographic protocols. The main properties of this approximation are that it can be built automatically and that its computation is guaranteed to terminate, unlike Genet and Klay's algorithm. The approximation has been used for the verification of the Needham-Schroeder, Otway-Rees and Woo Lam protocols. To be more precise, it allows us to check secrecy and authenticity properties of the protocols.
1 Introduction
Cryptography is used to secure the exchange of information over open networks. Cryptographic protocols define the rules (message formats and message order) used to establish secure communications. But with some cryptographic protocols, information is not safe even when strong encryption algorithms are used. Since such flaws have been discovered in protocols considered to be secure, several methods have been developed to verify cryptographic protocols.
One of the first papers presenting a method to verify cryptographic protocols was [BAN89]. In this paper, Burrows, Abadi and Needham introduce a logic to model and to analyze cryptographic protocols. The idea is to reason about the beliefs of the agents in the network and the evolution of these beliefs after each protocol step. The lack of an automatic tool and of a complete semantics has encouraged the development of other logics [GNY90, AT91].
Existing techniques have also been extended for cryptographic protocol verification. In [Mea94, Mea96], a method based on model-checking techniques is presented. The technique extends the Dolev-Yao model [DY83] and also integrates the notion of belief introduced in [BAN89]. The protocol is modeled by sets of rules that describe the intruder's abilities; a narrowing technique then checks whether an insecure state is reachable. With the NRL Protocol Analyzer [Mea96] several flaws have been discovered. The main advantage of this tool is that the verification is done automatically. [JRV00] also introduces an automatic tool that has been successfully tested on simple protocols [CJ97]. They use rewriting rules to model the protocol and the intruder behaviour, then
they apply those rules with a variant of ac-narrowing to an initial configuration. When the tool finds an inconsistency, the protocol is flawed.
In [Pau98], Paulson introduces a method based on proof by induction to verify cryptographic protocols. This method allows the verification of a large range of properties. But in this approach the secrecy and authenticity properties/theorems are very difficult to prove³. The proofs require an experienced user to inject the right lemma at the right time to make the proofs converge. The proofs of the remaining properties (freshness, regularity, ...) are simpler and are largely the same for all protocols. In [Bol96], Bolignano presents a method based on a clear distinction between reliable and unreliable agents. His method allows a precise specification of the protocol. The properties are modeled with temporal logic features and proved with the help of invariants of the protocol and axioms about knowledge. The technique has been tested with the Coq prover [Bol95]. The list of techniques to verify cryptographic protocols is long; for a better view of this particular field of research, the reader can consult the surveys [GSG99] and [AGH+01].
Automata and tree automata are well known tools for modeling infinite systems. Recently, methods using tree automata to verify cryptographic protocols have been introduced [Mon99, GK00, GL00]. [Mon99] was the first paper where tree automata were used to verify cryptographic protocols. In [Mon99] and [GL00], tree automata model the set of messages that intruders are able to construct. In [GK00], tree automata model the network (traces of the protocol plus capabilities of the intruder) and the current intruder knowledge. Another difference between these approaches is that in [Mon99] and [GL00] results are limited to a given number of agents and sessions, which is not the case in [GK00]. These methods use an abstract analysis technique to compute the limit (reached when no new information is added to the model) at which the computation must stop. To be more precise, the system that must be verified, here a cryptographic protocol, is too big to be explored. To reduce the size of the system, and thus be able to verify some of its properties, an approximated system is built in which only information relevant for the verification is kept. For the techniques previously cited, the approximated system is a superset of the concrete one. The result is that if a property is satisfied at the end of the computation then the protocol verifies this property; otherwise nothing can be said. Notice that the main problem remains the termination of this semi-effective computation. These methods have been used to verify secrecy and authentication properties. Secrecy guarantees that information defined as secret (like shared keys) cannot be obtained by an intruder during protocol runs. Authentication guarantees that an agent in the network can identify the sender of a message.
To build an approximated system, or an approximation of a system, mathematical functions are used. These functions, called approximation functions,
³ To see what is involved, look at the proofs of the Needham-Schroeder protocol: http://www.cl.cam.ac.uk/Research/HVG/Isabelle/library/HOL/Auth/
define which information/parts of the system will be abstracted and how. Unlike Genet and Klay's approximation algorithm, the approximation algorithm that we present is guaranteed to terminate. With our approximation function, secrecy and authentication properties have been analysed for the Needham-Schroeder protocol (public key without server, shared key with server) [NS78, Low95], the Woo Lam Pi3 protocol [WL94], and the simplified version of Otway-Rees [AN96].
The paper is organized as follows. Section 2 introduces some useful definitions. Section 3 presents the approximation of [Gen98] and our approximation. Section 4 explains why our approximation is well suited to the verification of cryptographic protocols.
2 Definitions
To facilitate the understanding of the rest of the paper, some notations and basic definitions are introduced in this section. Let F be a finite set of symbols associated with an arity function, X a countable set of variables, T(F, X) the set of terms, and T(F) the set of ground terms (terms without variables). Let Var(t) denote the set of variables of the term t ∈ T(F, X).

Definition 1 A term rewriting system (TRS) R is a set of rewrite rules l → r, where l, r ∈ T(F, X), l ∉ X, and Var(r) ⊆ Var(l). If s|p denotes the subterm of s at the position p and s[rσ]p denotes the term obtained by substituting the term rσ for the subterm s|p at the position p, then the relation →R means that for any s, t ∈ T(F, X) we have s →R t if there exist a rule l → r in R, a position p ∈ Pos(s), where Pos(s) is the set of positions in s, and a substitution σ such that lσ = s|p and t = s[rσ]p. The set of R-descendants of a set E of ground terms is denoted by R*(E) and defined by R*(E) = {t ∈ T(F) | ∃s ∈ E. s →*R t}, where →*R is the reflexive-transitive closure of →R.

Definition 2 Let R be a TRS defined on T(F, X). A term t ∈ T(F, X) is linear if every variable of Var(t) has exactly one occurrence in t. A rewrite rule is left-linear if the left-hand side of the rule is linear. R is left-linear if every rewrite rule of R is left-linear.

Definition 3 A bottom-up finite tree automaton is a quadruple A = (F, Q, Qf, ∆) where F is a finite set of symbols, Q is a finite set of states, Qf ⊆ Q is the set of terminal states, and ∆ is a set of transitions. A transition of ∆ is a rewrite rule c →A q, where c ∈ T(F ∪ Q) and q ∈ Q. The tree language recognized by A is L(A) = {t ∈ T(F) | ∃q ∈ Qf. t →*A q}.

From now on, we consider bottom-up finite tree automata, and we say tree automata for short.
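As a toy illustration of these definitions, the following OCaml sketch implements first-order terms, matching, substitution, and a single rewrite step at the root; it is our own reading of Definition 1, not code from the paper:

type term = Var of string | App of string * term list

(* try to match pattern l against term t, extending substitution s *)
let rec matches s l t =
  match l, t with
  | Var x, _ ->
      (match List.assoc_opt x s with
       | Some u -> if u = t then Some s else None
       | None -> Some ((x, t) :: s))
  | App (f, ls), App (g, ts) when f = g && List.length ls = List.length ts ->
      List.fold_left2
        (fun acc l' t' ->
          match acc with None -> None | Some s' -> matches s' l' t')
        (Some s) ls ts
  | _ -> None

(* apply a substitution to a term *)
let rec subst s = function
  | Var x -> (try List.assoc x s with Not_found -> Var x)
  | App (f, ts) -> App (f, List.map (subst s) ts)

(* one rewrite step at the root with rule l -> r, if it applies;
   e.g. with the rule s(x) -> s(s(x)), the term s(0) rewrites to s(s(0)) *)
let rewrite_root (l, r) t =
  match matches [] l t with Some s -> Some (subst s r) | None -> None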
3 Approximations
The idea of Genet and Klay is the following. Given an initial automaton A (recognizing the initial configuration of the network, where everybody wants to communicate with everybody), a term rewriting system R (modeling the protocol steps and the intruder abilities), and an approximation function (see Section 3.1 for more detail), an automaton TR↑(A) recognizing an approximation of the possible configurations of the network reachable by R from A is built. Moreover, R*(L(A)) ⊆ L(TR↑(A)). The technique to compute the approximation automaton and the approximation function used by Genet and Klay is explained in the next section. Then our approximation function is introduced and two of its properties are established:
1. TR↑(A) computed with our approximation verifies R*(L(A)) ⊆ L(TR↑(A)).
2. The computation of TR↑(A) with our approximation stops.
3.1 Approximation of Genet and Klay
Let A = (F, Q, Qf, ∆) be a tree automaton.

Definition 4 Given a configuration s ∈ T(F ∪ Q) \ Q, an abstraction of s is a mapping

  α : {s|p | p ∈ PosF(s)} → Q

where PosF(s) denotes the set of positions (sequences of integers) in the term s. The mapping α is extended to T(F ∪ Q) by defining α as the identity on Q.

Definition 5 Let s → q be a transition such that s ∈ T(F ∪ Q), q ∈ Q, and α an abstraction of s. The set Normα(s → q) of normalized transitions is inductively defined by:
1. if s = q, then Normα(s → q) = ∅, and
2. if s ∈ Q and s ≠ q, then Normα(s → q) = {s → q}, and
3. if s = f(t1, ..., tn), then Normα(s → q) = {f(α(t1), ..., α(tn)) → q} ∪ ⋃ᵢ₌₁ⁿ Normα(ti → α(ti)).

Definition 6 Let Q be a set of states, Qnew be any set of new states such that Q ∩ Qnew = ∅, and Q*new the set of sequences q1 ... qk of states in Qnew. Let Σ(Q, X) be the set of substitutions of variables in X by the states in Q. An approximation function γ is a mapping

  γ : R × (Q ∪ Qnew) × Σ((Q ∪ Qnew), X) → Q*new

i.e. it maps a rewrite rule, a state, and a substitution to a sequence of new states, such that γ(l → r, q, σ) = q1 ... qk, where PosF(r) is the set of positions in r and k = Card(PosF(r)).

In the rest of the paper, let Qnew be any set of new states such that Q ∩ Qnew = ∅, and Qu = Q ∪ Qnew.
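The normalization of Definition 5 can also be read operationally. The sketch below is our OCaml paraphrase: configurations over F ∪ Q are trees whose leaves may be states, alpha is a user-supplied abstraction that must be the identity on states, and each returned pair (c, q) stands for a normalized transition c → q:

type conf = State of string | Node of string * conf list

let rec norm (alpha : conf -> string) (s : conf) (q : string)
    : (conf * string) list =
  match s with
  | State q' -> if q' = q then [] else [ (State q', q) ]   (* cases 1 and 2 *)
  | Node (f, ts) ->                                        (* case 3 *)
      let qs = List.map alpha ts in
      (Node (f, List.map (fun qi -> State qi) qs), q)
      :: List.concat (List.map2 (fun ti qi -> norm alpha ti qi) ts qs)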
From every γ(l → r, q, σ) = q1 ... qk, the states q1, ..., qk can be associated with the positions p1, ..., pk ∈ PosF(r) by defining the corresponding abstraction function α on the restricted domain {rσ|p | ∀l → r ∈ R, ∀p ∈ PosF(r), ∀σ ∈ Σ(Q, X)}: α(rσ|pi) = qi for all pi ∈ PosF(r) = {p1, ..., pk}, with pi < pi+1 (where < is the lexicographic ordering). In the following, Normγ is the normalization function where the value of α is defined according to γ as above.
Starting from a left-linear TRS R, an initial automaton A0 = A and an approximation function γ, Genet and Klay construct Ai+1 from Ai by:
1. searching for a critical pair, i.e. a state q ∈ Q, a rewrite rule l → r and a substitution σ ∈ Σ(Q, X) such that lσ →*Ai q but not rσ →*Ai q;
2. setting Ai+1 = Ai ∪ Normγ(rσ → q).
The above process is iterated until it stops on a tree automaton Ak such that there is no critical pair.
Before introducing our approximation, we give an example that will be reused later to illustrate the difference between the approximation γ (Definition 6) and our approximation γf (Definition 8).

Example 1 Consider the alphabet F = {0 : 0, s : 1}, the initial automaton A0 = {F, {q0, q1}, {q0, q1}, {0 → q0, s(q0) → q1}} and the term rewriting system R = {s(x) → s(s(x))}. If we apply the above process to compute Ai+1 from Ai, the computation will never stop: new states are introduced at each normalization. The computation without normalization looks like:
  s(q0) →R s(s(q0)) →R s(s(s(q0))) ...
Now if we apply the normalization, we have:
  s(q0) →R s(q2) →R s(q3) ...
with γ(R, q1, x = q0) = q2, γ(R, q1, x = q2) = q3, .... The computation will go on like that forever!
3.2 Our Approximation Function
In our approach we keep the same normalization as in Definition 5, but we refine the approximation of Genet and Klay (γ, Definition 6), and we have:

Definition 7 Let Q*u be the set of sequences q1 ... qk of states in Qu. Let A = (F, Qu, Qf, ∆) be a tree automaton. Let PosF(r) = {p1, ..., pk}. An approximation function is a mapping γe : R × Qu × Σ(Qu, X) → Q*u, such that γe(l → r, q, σ) = q1 ... qk s.t.
1. ∀i. i ∈ [1, k] ⇒ ((∃q′. q′ ∈ Qu ∧ (rσ|pi) →*A q′) ⇒ qi = q′)
2. ∀i. i ∈ [1, k] ⇒ ((∃f. f ∈ F ∧ rσ = f(t1, ..., tk) ∧ f(q1, ..., qk) → q ∧ (rσ|pi) = f(q1, ..., qk)) ⇒ qi = q)
3. ∀i. i ∈ [1, k] ⇒ (((∃σ′, q′, l′ → r′. σ′ ∈ Σ(Qu, X) ∧ q′ ∈ Qu ∧ l′ → r′ ∈ R ∧ γe(l′ → r′, q′, σ′) = q′1 ... q′z) ⇒ (∃j. j ∈ [1, z] ∧ (r′σ′|pj) = (rσ|pi))) ⇒ qi = q′j).

To facilitate the understanding of Definition 7, we can say:
– the first rule says that if a subterm of rσ is already recognized by a state q′ of the current automaton, then q′ is used for the normalization of this subterm;
– the second rule explains that if the subterm of rσ at the position pi equals the left-hand side f(q1, ..., qk) of an existing transition f(q1, ..., qk) → q, then the state qi used in γe(l → r, q, σ) is replaced by the state q;
– the third rule says that two equal subterms occurring in two applications of γe get the same normalization state.

We now establish that our function γe (Definition 7) gives an upper approximation of R*(L(A)).

Proposition 1 Let R be a left-linear TRS. Let γe be an approximation function defined as in Definition 7. Let A0 and Aek be two tree automata such that Aek is computed as explained in Section 3.1 with R, γe and the initial automaton A0. Then R*(L(A0)) ⊆ L(Aek).

Proof Proposition 1 is guaranteed by Theorem 1 in [GK00].
□
The number of substitutions and of new states is infinite, so the computation of the approximated automaton with the function of Definition 7 may go on forever. To guarantee the termination of the computation, we have to refine Definition 7. The idea is to generate an approximation γ′ where only variables that can be replaced by symbols of arity zero are replaced by the states corresponding to those symbols in the tree automaton. Then, from the results of γ′, the new approximation γf is built (Definition 8).

Definition 8 Let R be a left-linear term rewriting system. Let A = (F, Qu, Qf, ∆) be a tree automaton. Let PosF(r) = {p1, ..., pk}. Let γ′ be an approximation function defined as in Definition 7 with the set of substitutions

  Σ(Qu, X) ≝ ⋃ᵢ₌₀^Card(E) {Fi | Fi ⊆ E ∧ Card(Fi) = i}

where E = {x = q | x ∈ X ∧ ∃a. (a ∈ T(F) ∧ a →A q) ∧ ∀y. ((y ∈ X ∧ y ≠ x) ⇒ (y = q) ∉ E)}. An approximation function γf is a mapping γf : R × Qu × Σ(Qu, X) → Q*u with γf(l → r, q, σ) = q1 ... qk s.t.
– ∃σ′. (σ′ ⊆ σ ∧ γ′(l → r, q, σ′) = q′1 ... q′k) and qi = q′i (1 ≤ i ≤ k), with the maximum of matches between σ′ and σ,
– and:
1. ∀i. i ∈ [1, k] ⇒ ((∃j. j ∈ [1, k] ∧ (rσ|pi) = (rσ′|pj)) ⇒ qi = q′j);
2. ∀i. i ∈ [1, k] ⇒ ((∃f. f ∈ F ∧ rσ = f(t1, ..., tk) ∧ f(q1, ..., qk) → q ∧ (rσ|pi) = f(q1, ..., qk)) ⇒ qi = q);
3. ∀i. i ∈ [1, k] ⇒ ((∃j. j ∈ [1, k] ∧ (rσ|pi) = (rσ|pj)) ⇒ qi = qj).
As for Definition 7, we give a brief explanation of the rules of Definition 8:
– the first rule says that if a subterm of rσ is already recognized by a state of the current automaton, then that state is used for the normalization of this subterm;
– the second rule explains that if the subterm of rσ at position pi equals the left-hand side of an existing transition with target state q, then the state qi is replaced by q;
– the third rule says that two equal subterms of rσ get the same normalization state.

This definition of γf (Definition 8) also verifies Proposition 1 (a consequence of Theorem 1 in [GK00]).

Proposition 2 Let R be a left-linear TRS. Let Af0 = (F, Q, Qf, ∆) be a tree automaton. Let γf be an approximation function defined as in Definition 8. If the number of rules in R and the number of states in Q are finite, then the computation of Afk converges to a fixpoint.

Proof [Sketch]
– The first approximation γ′ terminates.
  • Initial stage:
    ∗ The substitutions used in Definition 8 only substitute variables that model symbols of arity 0 by states (the remaining variables can take any values), so the number of substitutions is finite.
    ∗ The number of rewriting rules is finite.
    ∗ The number of automaton states is finite, hence the number of rules lσ → rσ, with l → r ∈ R and σ a substitution such that lσ →*Afi q but not rσ →*Afi q with q ∈ Q, is finite.
    If we apply the normalization process to all the previous rules with the function γ′, then we add a finite number of states.
  • n-th stage:
    ∗ The number of substitutions is still finite, as the substitutions are the same as at the initial stage.
    ∗ The number of rewriting rules is finite.
    ∗ The number of automaton states is still finite, having been increased by a finite number of states by the last normalization process; hence the number of such critical rules lσ → rσ remains finite.
    If every triple γ′(l → r, q, σ) has been used once, the normalization process does not add any more states to the automaton (because of the substitutions used and the definition of γ′); otherwise, a finite number of states is added.
  • After a finite number of normalization processes, the computation stops.
– the principal approximation γf terminates γf does not introduce new states, it uses the states introduced by γ or those of the automaton. So the computation terminates. 3 We conclude this section by two examples. The first example was presented in Section 3.1 and the second example, originally from [Gen98], is used to illustrate the utility of γ . Recall that in [Gen98], the computation of this example does not terminate with the approximation of Definition 6. Example 2 We have the alphabet F = {0 : 0, s : 1}, the initial automaton A0 = {F , {q0 , q1 }, {q0 , q1 }, {0 → q0 , s(q0 ) → q1 }} and the term rewriting system R = {s(x) → s(s(x))}. If we apply the Genet and Klay process to compute Ai+1 from Ai with our approximation γf , the computation stops which is shown below. The computation without normalization looks like: s(q0 ) →R s(s(q0 )) →R s(s(s(q0 )))... Now if we apply the normalization, we have: s(q0 ) →R s(q1 ) →R s(q2 ) →R s(q2 ) We have γ (R, q1 , x = q0 ) = q1 and γ (R, q1 , ∅) = q2 . The first normalization adds no transition (s(q0 ) → q1 is already in A0 ). The next normalizations add the transitions s(q1 ) → q2 and s(q2 ) → q2 (because of γ (R, q1 , ∅) = q2 ) and after no more critical pair can be found so the computation stops. Example 3 In this example taken from [Gen98], we have A a tree automaton where ∆={app(q0 , q0 ) → q1 , cons(q2 , q1 ) → q0 , nil → q0 , nil → q1 , a → q2 }, rl = app(cons(x, y), z) → cons(x, app(y, z)), R = {rl}, and γf (Definition 8) the approximation function mapping every tuple (rl, q, σ) to one state. Now, we apply the Genet and Klay process to compute Ai+1 from Ai : 1. We have to add N ormγf (cons(q2 , app(q1 , q0 )) → q1 ) to ∆. N ormγf (cons(q2 , app(q1 , q0 )) → q1 ) = {cons(q2 , q3 ) → q1 , app(q1 , q0 ) → q3 }, to find this set of normalized transitions, we have to compute γ and then to use the right γ to have γf : γ (rl, q1 , {x = q2 , y = q1 , z = q0 }) = q3 . ..
γ(rl, q1, {x = q2, z = q0}) = q4
...
γ(rl, q1, ∅) = q5

– γf(rl, q1, {x = q2, y = q1, z = q0}) = γ(rl, q1, {x = q2, y = q1, z = q0}) = q3; so the transitions cons(q2, q3) → q1 and app(q1, q0) → q3 are added to the current automaton's set of transitions.
2. We have to add Normγf(cons(q2, app(q3, q0)) → q3) to Δ.
Normγf(cons(q2, app(q3, q0)) → q3) = {cons(q2, q4) → q3, app(q3, q0) → q4}. To find this set of normalized transitions, we repeat the same process as in the first step:

γ(rl, q3, {x = q2, y = q1, z = q0}) = q3
...
γ(rl, q3, {x = q2, z = q0}) = q4
...
γ(rl, q3, ∅) = q5

– γf(rl, q1, {x = q2, y = q3, z = q0}) = γ(rl, q1, {x = q2, z = q0}) = q4; so the transitions cons(q2, q4) → q3 and app(q3, q0) → q4 are added to the current automaton's set of transitions.

3. We have to add Normγf(cons(q2, app(q4, q0)) → q4) to Δ.
Normγf(cons(q2, app(q4, q0)) → q4) = {cons(q2, q4) → q4, app(q4, q0) → q4}. To find this set of normalized transitions, we repeat the same process as in the first step:

γ(rl, q4, {x = q2, y = q1, z = q0}) = q3
...
γ(rl, q4, {x = q2, z = q0}) = q4
...
γ(rl, q4, ∅) = q5

– γf(rl, q4, {x = q2, y = q4, z = q0}) = γ(rl, q4, {x = q2, z = q0}) = q4; so the transitions cons(q2, q4) → q4 and app(q4, q0) → q4 are added to the current automaton's set of transitions.

4. The computation stops, unlike the computation with the γ function of [Gen98, GK00] given in Definition 6.

We have seen that our approximation is an upper-approximation of what we want to approximate and that the computation of the approximation stops; a sketch of the completion loop is given below. Now we can see why this approximation is well suited to the verification of cryptographic protocols, and in particular to the verification of secrecy and authentication properties.
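The completion process used in these examples can be summarised by a short program. The following OCaml sketch is ours, not Timbuk's code: find_critical_pairs and approx are assumed functions, the latter playing the role of the approximation function γf by choosing a normalization state for each proper subterm. Each right-hand side rσ is normalized into transitions targeting the state q of its critical pair, and the process iterates until no new transition is added.

(* Minimal sketch of tree automaton completion in the Genet-Klay style. *)
type term = State of int | App of string * term list
type transition = { sym : string; args : int list; dest : int }

(* Normalize [t] so that it is recognized by [target]: the top symbol gets
   a transition into [target]; each proper subterm is sent to the state
   chosen by [approx], which plays the role of gamma_f. *)
let rec norm approx t target acc =
  match t with
  | State q -> (q, acc)
  | App (f, ts) ->
      let qs, acc =
        List.fold_left
          (fun (qs, acc) s ->
             let q, acc = norm approx s (approx s) acc in
             (qs @ [q], acc))
          ([], acc) ts
      in
      (target, { sym = f; args = qs; dest = target } :: acc)

(* [find_critical_pairs delta] is assumed to return the pairs (r_sigma, q)
   such that l_sigma ->* q holds in delta but r_sigma ->* q does not. *)
let rec complete find_critical_pairs approx delta =
  match find_critical_pairs delta with
  | [] -> delta
  | cps ->
      let add d tr = if List.mem tr d then d else tr :: d in
      let delta' =
        List.fold_left
          (fun d (rhs, q) ->
             let _, trs = norm approx rhs q [] in
             List.fold_left add d trs)
          delta cps
      in
      if List.length delta' = List.length delta then delta
      else complete find_critical_pairs approx delta'

With a finite image for approx, as assumed in Proposition 2, only finitely many states and transitions can ever be created, so the recursion reaches the fixpoint; in Example 2 the loop stops as soon as s(q2) → q2 has been added.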
4 Cryptographic Protocols
In this section we explain why our approximation is quite efficient for the verification of cryptographic protocols. We implemented it in OCAML [RV98, LDG+01] (http://caml.inria.fr/ocaml/index.html) to be used by Timbuk [GT01] (http://www.irisa.fr/lande/genet/timbuk/index.html). In [GK00], the authors explain how cryptographic protocols can be verified with their approximation function (Definition 6). Their idea is to compute a superset of all the reachable states (approximation automaton) from the initial
configuration of the network, where everybody wants to communicate with everybody (initial automaton), the protocol steps (term rewriting system) and the approximation function. Then the negation of the secrecy and authentication properties are each modeled by a tree automaton (negation automata). The verification of secrecy and authentication is done by checking the intersection of the approximation automaton with the negation automata. If the intersection is empty the property is satisfied; otherwise another technique must be used to verify the property. To facilitate the understanding of the following comments, the syntax and the semantics used in [GK00] are summarized in Table 1.

agt(x)            x is an agent
c_init(x, y, z)   x thinks he has established a communication with y but he really communicates with z
c_resp(x, y, z)   x thinks he responds to a request of communication from y but he really communicates with z
cons(x, y)        concatenation of the information x and y
encr(x, y, z)     z is encrypted by y with x
goal(x, y)        x wants to communicate with y
hash1(x, y)       y is hashed by x
hash2(x, y, z)    z is hashed by y with the key x
mesg(x, y, z)     z is a message sent by x to y
N(x, y)           nonce created by x to communicate with y
pubkey(x)         public key of x
serv(x)           x is a server
sharekey(x, y)    key shared by x and y
Table 1. Description of the terms used

The messages exchanged during the protocol runs are composed of basic pieces of information (e.g. an agent name, a shared key, ...) or of a concatenation of basic pieces of information (e.g. an agent name and a shared key, encrypted, ...). To reduce the number of messages that can be sent, we decide to fix the format of the messages by typing them. So in the term rewriting system (TRS), instead of having for example pubkey(x), you have pubkey(agt(x)) to indicate that x can only be an agent. In a message, two types of information can be distinguished: information understood by the agent (e.g. agent names, ...) and information that cannot be understood by the agent (e.g. an agent cannot access a piece of information that has been encrypted if he does not have the right decryption key, ...). In the TRS this distinction is visible. If we take the example of a nonce, an agent can identify a nonce if he has created it: in the TRS, when agt(x) has created a nonce to communicate with agt(y) we have N(agt(x), agt(y)), and when it is a nonce created by someone else we have N(w, z). Our approximation also makes this distinction: in one case we will have the state corresponding to the precise nonce, and in the other case we will introduce a new state (because of the approximation γ).
This approximation γ gives precise states for known information (as known information contains variables that can be substituted by constants) and abstract states for unknown information. The goal is to verify that information exchanged during protocol runs is kept secret (secrecy properties) and that actors can identify the senders of messages (authentication properties). In [GK00] the authors assume the existence of Alice, Bob (two trusted agents), an unbounded number of untrusted agents and an intruder. They gather together all the untrusted agents to consider only Alice, Bob and the Rest, and they verify the secrecy and authentication properties for Alice and Bob. Alice, Bob and the Rest can be seen as constants. In γ, only variables that model them are substituted by a state. The computation of the approximated automaton with this γ using this substitution has no effect on the verification of our two properties. To have a better view of what we have just said, we can look at the Needham-Schroeder-Lowe protocol [Low95] (cf. Fig. 1). Two agents, Alice and Bob, want to establish a secure communication using a public key infrastructure. Alice initiates a protocol run, sending a nonce Na and her name A to Bob.

Message 1: A =⇒ B : {Na, A}Kb

Bob responds to Alice's message with a further nonce Nb.

Message 2: B =⇒ A : {Na, Nb, B}Ka

Alice proves her existence by sending Nb back to Bob.

Message 3: A =⇒ B : {Nb}Kb
Fig. 1. Good version of the Needham-Schroeder-Lowe protocol

So, for this protocol we verify that the nonce Na (resp. Nb) created by Alice (resp. Bob) to communicate with Bob (resp. Alice) is kept secret during the protocol runs. We also verify that at the end of each run Alice (resp. Bob) really communicates with Bob (resp. Alice). The approximation function (corresponding to Definition 8) used by Timbuk [GT01] to compute the approximation automaton of Needham-Schroeder is available from the FASec web site. The reader can see that we have all the possible messages: from Alice to Alice, from Alice to Bob, from Alice to someone else, .... To help understand this approximation, we can take the second step of the protocol. The rewriting rule corresponding to this step is given in Figure 2. This rule means that when agt(b) receives an agent name and a nonce encrypted with his public key, he encrypts and sends to that agent his own name, a nonce created by himself and the nonce received (the message is added to the current trace LHS by U(LHS, mesg(...))). The approximation function is used to normalize and to add the new terms to the current automaton. So, basically, it gives a normalization process for the right-hand side of each rewriting rule.
mesg(agt(a_3), agt(b), encr(pubkey(agt(b)), agt(a_2),
    cons(N(a_1, b_1), agt(a))))
->
U(LHS, mesg(agt(b), agt(a), encr(pubkey(agt(a)), agt(b),
    cons(N(a_1, b_1), cons(N(agt(b), agt(a)), agt(b))))))
Fig. 2. Rewriting rule of the second step of Needham-Schroeder-Lowe

Figure 3 gives the approximation function for the second step of the protocol when Bob, agt(q2), sends the message. The function is composed of rules of the form "[...] -> [...]", where the first part of the rule is the term to normalize and the second one is the normalization process to use. As can be seen, a precise state is given to information known by Bob, e.g. N(q5, q5) -> q15, and a global state is given to unknown information, e.g. N(a_1, b_1) -> q45 (a_1 and b_1 will be replaced during the computation by q3 (the Rest), q4 (Alice) or q5 (Bob)). It is clear from the figure that the nonces created by Bob to communicate with himself, with Alice and with someone else are not gathered together, so the verification of the secrecy of the nonce created to communicate with Alice (being sure that the intruder does not catch N(q5, q4)) is not affected by our approximation. For the same reason, the distinction of the communications between Alice, Bob and the Rest, our approximation does not affect the verification of authentication. This distinction is very helpful when the intersection of the approximation automaton and the negation property automaton is not empty. By looking at the approximation automaton together with the approximation function, information can be deduced that helps the user to verify whether the property is satisfied with another method [Mea96, Pau98, JRV00]. In particular, by studying the states of the automaton the user can find which particular step can lead to an attack, and thus have an idea of how to direct the verification with the other technique. Two comments conclude this section. First, the term rewriting systems used in the protocol case are not necessarily left-linear, but this has no consequence for the computation of the approximation. The non-linearity only concerns the agents present in the network, and each of them is initially recognized by a precise state. Those states are initially deterministic (there is one state for Alice, one for Bob and one for the Rest) and this property is preserved during the computation (see [GK00] for more detail). Second, the examples used in the previous section are not directly related to cryptographic protocols, so it seems that our approximation might also be used in other contexts where one is confronted with an infinite number of reachable states. A sketch of the typing and of the precise/abstract state distinction is given below.
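To make the typing of messages and the precise/abstract distinction concrete, the following OCaml sketch (our own illustration, not part of the tool) models some of the typed terms of Table 1 and a state assignment for nonces in the spirit of Figure 3: nonces created by Bob for one of the three fixed principals receive the dedicated states of the figure, while any other nonce is folded into an abstract state.

(* Typed protocol terms, following Table 1: e.g. a public key is only
   built over an agent, pubkey(agt(x)). *)
type principal = Alice | Bob | Rest

type term =
  | Agt of principal                (* agt(x) *)
  | Pubkey of term                  (* pubkey(agt(x)) *)
  | Nonce of term * term            (* N(x, y) *)
  | Cons of term * term             (* cons(x, y) *)
  | Encr of term * term * term      (* encr(key, sender, payload) *)

(* States for Bob's nonces, mirroring Fig. 3: N(q5,q5) -> q15,
   N(q5,q4) -> q19, N(q5,q3) -> q23. An unknown nonce gets an abstract
   state (q45 in the figure; the real function introduces one such state
   per context, which we simplify to a single state here). *)
let state_of_nonce = function
  | Nonce (Agt Bob, Agt Bob) -> 15
  | Nonce (Agt Bob, Agt Alice) -> 19
  | Nonce (Agt Bob, Agt Rest) -> 23
  | Nonce (_, _) -> 45
  | _ -> invalid_arg "state_of_nonce: not a nonce"

Because the three known nonces keep distinct precise states, checking that the intruder never reaches the state of N(q5, q4) remains meaningful even after the abstraction.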
5 Conclusion
A tool that generates the approximation function (Definition 8) for protocols without timestamps has been implemented in OCAML [RV98, LDG+01]. This tool also generates the term rewriting system and the initial automaton for the protocol we want to verify.
[U(LHS, mesg(agt(q2), agt(q2), encr(pubkey(agt(q2)), agt(q2),
    cons(N(q5, q5), cons(N(agt(q2), agt(q2)), agt(q2)))))) -> q13]
-> [LHS -> q13  agt(q2) -> q5  agt(q2) -> q5  N(q5, q5) -> q15
    cons(q15, q5) -> q16  cons(q15, q16) -> q46  pubkey(q5) -> q17
    encr(q17, q5, q46) -> q13  mesg(q5, q5, q13) -> q13]

[U(LHS, mesg(agt(q2), agt(q2), encr(pubkey(agt(q2)), agt(q2),
    cons(N(a_1, b_1), cons(N(agt(q2), agt(q2)), agt(q2)))))) -> q13]
-> [LHS -> q13  agt(q2) -> q5  agt(q2) -> q5  N(q5, q5) -> q15
    cons(q15, q5) -> q16  N(a_1, b_1) -> q45  cons(q45, q16) -> q46
    pubkey(q5) -> q17  encr(q17, q5, q46) -> q13  mesg(q5, q5, q13) -> q13]

[U(LHS, mesg(agt(q2), agt(q1), encr(pubkey(agt(q1)), agt(q2),
    cons(N(a_2, b_2), cons(N(agt(q2), agt(q1)), agt(q2)))))) -> q13]
-> [LHS -> q13  agt(q2) -> q5  agt(q1) -> q4  N(q5, q4) -> q19
    cons(q19, q5) -> q20  N(a_2, b_2) -> q48  cons(q48, q20) -> q49
    pubkey(q4) -> q21  encr(q21, q5, q49) -> q13  mesg(q5, q4, q13) -> q13]

[U(LHS, mesg(agt(q2), agt(q0), encr(pubkey(agt(q0)), agt(q2),
    cons(N(q5, q3), cons(N(agt(q2), agt(q0)), agt(q2)))))) -> q13]
-> [LHS -> q13  agt(q2) -> q5  agt(q0) -> q3  N(q5, q3) -> q23
    cons(q23, q5) -> q24  cons(q23, q24) -> q52  pubkey(q3) -> q25
    encr(q25, q5, q52) -> q13  mesg(q5, q3, q13) -> q13]

[U(LHS, mesg(agt(q2), agt(q0), encr(pubkey(agt(q0)), agt(q2),
    cons(N(a_3, b_3), cons(N(agt(q2), agt(q0)), agt(q2)))))) -> q13]
-> [LHS -> q13  agt(q2) -> q5  agt(q0) -> q3  N(q5, q3) -> q23
    cons(q23, q5) -> q24  N(a_3, b_3) -> q51  cons(q51, q24) -> q52
    pubkey(q3) -> q25  encr(q25, q5, q52) -> q13  mesg(q5, q3, q13) -> q13]

...
Fig. 3. Approximation of the second step of the Needham-Schroeder-Lowe protocol

With this tool and the Timbuk library, verifications of protocol properties have been carried out. Secrecy and authentication have been analysed for the Needham-Schroeder protocol (public key without server, shared key with server) [NS78, Low95], the Woo Lam Pi3 protocol [WL94] and the simplified version of Otway-Rees [AN96]. Table 2 gives the results of our tests. We found that Needham-Schroeder with symmetric keys is safe. We proved that the key shared between Alice and Bob is known only to them and to the server. In our approach the intruder cannot break encryption if he does not have the key, so we cannot find the flaw presented in [DS81] (they assume that the shared key between Alice and Bob is compromised). If we assume that this key is compromised, by adding to the automata a transition that models the intruder as holding the key used by Alice and Bob, we arrive at the result that authentication may not be verified (the result of [DS81]). We also consider that our agents differentiate all the basic information and know the format of the messages.
Protocols                         Secrecy property   Authentication property
Needham-Schroeder symmetric key   verified           verified
Needham-Schroeder                 may be a flaw      may be a flaw
Needham-Schroeder-Lowe            verified           verified
Otway-Rees                        verified           verified
Woo Lam Pi3                       none               verified
Table 2. Test results

So the attack described in [CJ97] (where an agent accepts a set of information as a possible key) on the Otway-Rees protocol cannot be found. The results for the other protocols are those one might expect. Of course these protocols are well known, and when you read "may be a flaw" in the table, you already know from the previous verifications of these protocols that they are flawed. If you were facing a new protocol, the answer "may be a flaw" would require another verification approach to check whether this is the case or not. If we compare Genet and Klay's approximation function with ours, we can say that our function is generated automatically and guarantees that the computation stops. As for computation time, we cannot compare the results. For our tests we used Timbuk, which was specially designed after [GK00] to compute an approximation automaton from an initial automaton, a term rewriting system and an approximation function; the approximation function can also be generated automatically before the computation. Our approximation works well on basic protocols. We are going to test it on more complex protocols (SET [Gro96a], TLS [Gro96b], ...). We will also determine whether, by extending the semantics of [GK00], it would still be possible with our approximation function to verify freshness properties. Since we are using an abstraction technique, we can only say that the property is satisfied when the intersection of tree automata is empty. Otherwise we have to use another technique [Mea96, Pau98, JRV00] to verify whether the property is satisfied or not. In [OS01] we explain how to combine Paulson's idea and the approximation technique to exploit the strengths of each method. This combined approach is illustrated with the Needham-Schroeder-Lowe protocol [Low95] in [OS02]. The goal is to develop a technique/tool that could be used by experts in protocols rather than in theorem proving. In the same context of cryptographic protocol verification, we will also look at [BT02] and see how we can adapt their technique to our approach. The examples solved in this paper with our approximation have no close relation to cryptographic protocols. We will also look for other fields in which our approximation can be successfully used, perhaps in processor/hardware design or algorithm verification.

Acknowledgements. Thanks are due to Thomas Genet (Institut de Recherche en Informatique et Systèmes Aléatoires de Rennes), Alain Giorgetti
(Laboratoire Informatique de Franche-Comté), Michael Rusinowitch (Laboratoire Lorrain de Recherche en Informatique et ses Applications) and Laurent Vigneron (Laboratoire Lorrain de Recherche en Informatique et ses Applications) for useful comments and discussions. Of course, any mistakes are entirely our responsibility.
References

[AGH+01] B. Aziz, D. Gray, G. Hamilton, F. Oehl, J. Power, and D. Sinclair. Implementing Protocol Verification for E-Commerce. In Proceedings of the 2001 International Conference on Advances in Infrastructure for Electronic Business, Science, and Education on the Internet (SSGRR 2001), 2001. http://student.dcu.ie/~oehlf2/.
[AN96] M. Abadi and R. Needham. Prudent Engineering Practice for Cryptographic Protocols. IEEE Transactions on Software Engineering, 22(1):6–15, 1996.
[AT91] M. Abadi and M. Tuttle. A Semantics for a Logic of Authentication. In Proceedings of the Tenth Annual ACM Symposium on Principles of Distributed Computing, pages 201–216, 1991.
[BAN89] M. Burrows, M. Abadi, and R. Needham. A Logic of Authentication. Technical Report 39, DIGITAL Systems Research Center, February 1989. http://www.research.digital.com/SRC/publications/.
[Bol95] D. Bolignano. Vérification formelle de protocoles cryptographiques à l'aide de Coq, 1995.
[Bol96] D. Bolignano. An Approach to the Formal Verification of Cryptographic Protocols. In ACM Conference on Computer and Communications Security, pages 106–118, 1996.
[BT02] A. Bouajjani and T. Touili. Extrapolating Tree Transformations. In Proc. 14th Int. Conf. on Computer Aided Verification (CAV'02), Lecture Notes in Computer Science. Springer-Verlag, 2002. http://verif.liafa.jussieu.fr/~touili/.
[CJ97] J. Clark and J. Jacob. A Survey of Authentication Protocol Literature: Version 1.0, 1997.
[DS81] D. E. Denning and G. M. Sacco. Timestamps in key distribution protocols. Communications of the ACM, 24(8):533–536, 1981.
[DY83] D. Dolev and A. Yao. On the security of public-key protocols. IEEE Transactions on Information Theory, 2(29), 1983.
[Gen98] T. Genet. Decidable Approximations of Sets of Descendants and Sets of Normal Forms. In RTA, pages 151–165, 1998.
[GK00] T. Genet and F. Klay. Rewriting for Cryptographic Protocol Verification. In CADE: International Conference on Automated Deduction, 2000. http://citeseer.nj.nec.com/genet99rewriting.html.
[GL00] J. Goubault-Larrecq. A method for automatic cryptographic protocol verification (extended abstract). In Proc. Workshop on Formal Methods in Parallel Programming, Theory and Applications (FMPPTA 2000), number 1800 in Lecture Notes in Computer Science, pages 977–984. Springer-Verlag, 2000. http://www.dyade.fr/fr/actions/vip/publications.html.
[GNY90] L. Gong, R. Needham, and R. Yahalom. Reasoning About Belief in Cryptographic Protocols. In Deborah Cooper and Teresa Lunt, editors, Proceedings 1990 IEEE Symposium on Research in Security and Privacy, pages 234–248. IEEE Computer Society, 1990.
[Gro96a] SET Working Group. SET(TM) Specification, Books 1, 2 and 3, 1996. http://www.setco.org/set_specifications.html.
[Gro96b] TLS Working Group. The TLS Protocol Version 1.0, 1996. http://www.ietf.org/html.charters/tls-charter.html.
[GSG99] S. Gritzalis, D. Spinellis, and P. Georgiadis. Security Protocols over Open Networks and Distributed Systems: Formal Methods for their Analysis, Design, and Verification. Computer Communications, 22(8):695–707, 1999. http://citeseer.nj.nec.com/gritzalis99security.html.
[GT01] T. Genet and V. Viet Triem Tong. Reachability Analysis of Term Rewriting Systems with Timbuk. In Proc. Logic for Programming, Artificial Intelligence and Reasoning (LPAR 2001), LNAI. Springer-Verlag, 2001.
[JRV00] F. Jacquemard, M. Rusinowitch, and L. Vigneron. Compiling and Verifying Security Protocols. In Logic Programming and Automated Reasoning, pages 131–160, 2000.
[LDG+01] X. Leroy, D. Doligez, J. Garrigue, D. Rémy, and J. Vouillon. The Objective Caml System, Release 3.02, 2001.
[Low95] G. Lowe. An Attack on the Needham-Schroeder Public-Key Authentication Protocol. Information Processing Letters, 56(3):131–133, 1995.
[Mea94] C. Meadows. A Model of Computation for the NRL Protocol Analyzer. In CSFW, 1994.
[Mea96] C. Meadows. The NRL Protocol Analyser: An Overview. Journal of Logic Programming, 26(2):113–131, February 1996.
[Mon99] D. Monniaux. Abstracting Cryptographic Protocols with Tree Automata. In Static Analysis Symposium, Lecture Notes in Computer Science, pages 149–163. Springer-Verlag, 1999. http://citeseer.nj.nec.com/monniaux99abstracting.html.
[NS78] R. Needham and M. Schroeder. Using encryption for authentication in large networks of computers. Communications of the ACM, 21(2):120–126, 1978.
[OS01] F. Oehl and D. Sinclair. Combining Two Approaches for the Verification of Cryptographic Protocols. In Workshop Specification, Analysis and Validation for Emerging Technologies in Computational Logic (SAVE 2001), 2001. http://student.dcu.ie/~oehlf2/.
[OS02] F. Oehl and D. Sinclair. Combining ISABELLE and Timbuk for Cryptographic Protocol Verification. In Workshop Sécurité des Communications sur Internet (SECI 2002), 2002. http://student.dcu.ie/~oehlf2/.
[Pau98] L. C. Paulson. The Inductive Approach to Verifying Cryptographic Protocols. Journal of Computer Security, 6, 1998. http://www.cl.cam.ac.uk/users/lcp/papers/protocols.html.
[RV98] D. Rémy and J. Vouillon. Objective ML: An Effective Object-Oriented Extension to ML, 1998.
[WL94] T. Y. C. Woo and S. S. Lam. A Lesson on Authentication Protocol Design. Operating Systems Review, 28(3):24–37, 1994.
Towards a Formal Specification of the Bellare-Rogaway Model for Protocol Analysis

Colin Boyd and Kapali Viswanathan

Information Security Research Centre, Queensland University of Technology, Brisbane, Australia
{boyd,kapalee}@isrc.qut.edu.au
Abstract. We propose a way to unify two approaches to the analysis of protocol security, namely complexity-theoretic cryptographic analysis and formal specification with machine analysis. We present a specification in Sum of the Bellare–Rogaway cryptographic model and demonstrate its use with a specification of a protocol of Jakobsson and Pointcheval. Although this protocol has a proof in the Bellare–Rogaway model, its original version was flawed. We show that our specification can be used to find the flaw.
1 Introduction

Even though the problem of key establishment is now widely understood in its basic form, new protocols continue to be proposed for specialised situations such as use of devices of low computational power, multi-user groups, and where the shared keys are low entropy passwords. Meanwhile, more complex protocols for application areas like electronic payments, electronic auctions and fair exchange of assets are too difficult to be handled in full generality by most analysis techniques. There is therefore still a pressing need for improved methods for protocol analysis. The problem of how to gain confidence in the security of protocols has been approached by two different research communities: the cryptography community and the computer security community. The cryptography community has built on the definitions for cryptographic primitives to provide security proofs for a relatively small number of protocols. The computer security community has generally used formal methods to specify protocols, which are generally used in one of two ways. The first way is to search for insecure points within the state space using a model checker such as FDR [15] or Murφ [16]. The second way is to use a theorem prover, such as Isabelle [17], to arrive at results which are true in all states. Both cryptographic analysis and formal specification have their strengths and limitations, but the two communities have worked almost independently and there are even contradictions in what can constitute a secure protocol. A major issue is how cryptography is handled. In the cryptography community proofs are usually reduction proofs, for which the aim is to show that if the protocol is broken then some computationally difficult problem (or perhaps a generic cryptographic primitive) can be broken. The computer security community generally has assumed a model of "perfect cryptography", which means that it is impossible to obtain a ciphertext from a plaintext, or a plaintext from a ciphertext, without the correct key.
Apart from ignoring probabilistic outcomes, this also means that authentication and confidentiality are not treated independently and that different kinds of confidentiality cannot be differentiated. A second major difference is that proofs in the cryptography community are human generated ("mathematician's proofs") rather than machine checkable proofs which can be obtained with software theorem provers. This paper contributes to unifying the two approaches by suggesting and exploring a way of combining the ideas from the two communities. Abadi and Rogaway [1] have examined the difference in approach between the "perfect cryptography" assumption and cryptographic definitions of confidentiality. However, no attempt seems to have been made to analyse the same adversarial model with the two different approaches. We believe that this is a promising way forward. In this paper the Bellare–Rogaway model used for cryptographic proofs is formally specified, animated and analysed in a machine checkable way. The long-term aim of this is to gain the benefits of both worlds, by allowing proofs which are more accessible, more meaningful and more likely to be correct. This ultimately means a greater assurance in the security of protocols used to protect our communications. We illustrate the potential use of our formal specification by using it to explore a protocol of Jakobsson and Pointcheval [14] that was proved secure in Bellare and Rogaway's model. Despite this proof the original version of the protocol turned out to be flawed, as shown by Wong and Chan [18]. We demonstrate that the flaw could be revealed by machine analysis of our specification. We contend that this demonstrates the value of a combined approach to the problem of formal analysis of protocols. The remainder of this paper is structured as follows. In the rest of this section we outline the Bellare–Rogaway model and the formal language and tool that we used. In Section 2 we give more details of the Bellare–Rogaway model and describe how we captured the model in the formal specification. Section 3 describes the Jakobsson–Pointcheval protocol and our formal analysis of it.

1.1 The Bellare–Rogaway Model

In the 1990s academic cryptography started its move towards a mature science by demanding new standards of proof. In the last decade commonly accepted formal definitions have been established for the main cryptographic services such as confidentiality, integrity and signature, and many specific algorithms have been proven to satisfy such definitions on the assumption of the computational difficulty of well-established problems such as integer factorisation and discrete logarithms. Proofs for cryptographic protocols have taken longer to establish than those for the primitives that they use. An important direction in cryptographic protocol research was pioneered by Bellare and Rogaway in 1993 [5] when they published the first mathematical proof that a simple entity authentication protocol was secure. This work, which covered only the two-party case, was followed up with a paper on server-based protocols [3], and various authors have extended the same idea to include public-key based key transport [6], key agreement protocols [7], password-based protocols [4,8], and conference key protocols [10,9].
The general approach is to produce a mathematical model that defines a protocol in which a powerful adversary plays a central role. The adversary essentially controls all the principals and can initiate protocol runs between any principals at any time. Insider attacks are modelled by allowing the adversary to corrupt any principals, and the adversary can also obtain previously used keys. Cryptographic algorithms may be modelled either with generic properties or as specific transformations. Security of protocols is defined in terms of matching conversations (for authentication) and indistinguishability (for confidentiality of keys). Because cryptographic protocols are notoriously difficult to design correctly, a proof of security is a very valuable property. Nevertheless, there are as yet relatively few protocols available which have a security proof. Most new protocol designs continue to be published without any attempt to prove security, leading to the traditional cycle of protocol attack being followed by a fix before a new attack is found. Some of the reasons for this are the following.

Proofs are difficult to understand. Security proofs tend to be difficult to understand for the average practitioner. They typically run to several pages of mathematical reasoning. The lack of accessibility of the proofs means that there are few people who check the proofs in any detail.

Proofs can be wrong. Although there are relatively few protocols with security proofs available, a number of protocols which have been proven secure have turned out to have significant security weaknesses. We give a detailed example of the protocol of Jakobsson and Pointcheval later.

The significance of proofs can be hard to assess. Protocol goals used in the Bellare–Rogaway model do not correspond closely to the traditional protocol goals of authentication and key establishment. In particular, a goal such as entity authentication is defined in terms of the protocol-specific property of matching conversations. Protocols with different traditional properties cannot be differentiated by such a property [12].

A comparison between the Bellare–Rogaway model and analysis using formal specifications shows that they are largely complementary. Proofs of security are long and complex and therefore prone to error and difficult to understand. Formal specifications, on the other hand, have not used the complete cryptographic definitions. It therefore seems a natural way forward to combine the benefits of both approaches. The bridge to unify these separate domains is the model of the protocol and the adversary. The same model used in the mathematical proof will be formally specified. Understanding and validation of the model can also be enhanced through animation of the specification.

1.2 Sum and Possum

Because of the abstract nature of the model, we used a high level specification modelling the global view of the adversary. This means that we have no need to specify explicit communicating processes; instead we use a state based specification modelling operations as the possible adversary actions in the Bellare–Rogaway model. Our choice of formal language for the specifications was Z [2]. Some of the reasons for the choice of Z are the following.
– There are widely used and understood conventions for modelling state based systems in Z; this is the obvious way to structure a highly abstract protocol specification. It is appropriate to specify the Bellare–Rogaway model as a state based specification, since the adversary has a global view of the protocol which is updated following each action of the adversary.
– The notation and semantics of Z are based on set theory and first order predicate logic. This means that Z specifications are easily accessible to most computer scientists and mathematically trained users.
– Z is widely used in the research community and by practitioners, and is also approaching international standardisation. There are a number of friendly tools to support the development and animation of Z specifications which are freely available, at least for academic research purposes.

Animation is an established technique in software engineering to allow clients to verify that a formal specification corresponds to the real world expectations of the system. This informal process is part of the user acceptance process and allows the system functionality to be examined before implementation takes place. Many of these features make animation a valuable addition to the process of formal modelling of security protocols. We used the software tool Possum. Possum was developed at the Software Verification Research Centre at the University of Queensland [13]. It provides animation of specifications written in Sum, which is essentially a markup of Z. Possum supports Z conventions for modelling state based systems and allows for manual as well as script based animations. It also supports a graphical front end to enable visual animation of specifications if required.
2 Specifying the Bellare–Rogaway Model

In this section we give an informal definition of the Bellare–Rogaway model and outline how the model was specified in Sum. We first consider the model of communication, which governs what the adversary is allowed to do. We then explore the definition of security.

2.1 Model of Communication

The model of communication used is independent of the details of the protocol and is the same for all protocols with the same set of principals and the same protocol goals. The adversary controls all the communications that take place and does this by interacting with a set of oracles, each of which represents an instance of a principal in a specific protocol run. The principals are defined by an identifier U from a finite set, and an oracle Π_U^s represents the actions of principal U in the protocol run indexed by integer s. Interactions with the adversary are called oracle queries and the list of allowed queries is summarised in Table 1. This list applies to the model appropriate for server based key transport protocols, as described in Bellare and Rogaway's 1995 paper [3]; additional queries are appropriate for other protocol types. We now describe each one informally.
Send(U, s, M)   Send message M to oracle Π_U^s
Reveal(U, s)    Reveal session key (if any) accepted by Π_U^s
Corrupt(U, K)   Reveal state of U and set long-term key of U to K
Test(U, s)      Attempt to distinguish session key accepted by oracle Π_U^s

Table 1. Queries available to the adversary in the Bellare-Rogaway model

Send(U, s, M) – This query allows the adversary to make the principals run the protocol normally. The oracle Π_U^s will return to the adversary the next message that an honest principal U would send if sent message M according to the conversation so far. (This includes the possibility that M is just a random string, in which case Π_U^s may simply halt.) If Π_U^s accepts the session key or halts, this is included in the response. The adversary can also use this query to start a new protocol instance by sending an empty message M, in which case U will start a protocol run with a new index s.

Reveal(U, s) – This query models the adversary's ability to find old session keys. If a session key Ks has previously been accepted by Π_U^s then it is returned to the adversary. An oracle can only accept a key once (of course a principal can accept many keys, modelled in different oracles).

Corrupt(U, K) – This query models insider attacks by the adversary. The query returns the oracle's internal state and sets the long-term key of U to be the value K chosen by the adversary. The adversary can then control the behaviour of U with Send queries.

Test(U, s) – Once the oracle Π_U^s has accepted a session key Ks, the adversary can attempt to distinguish it from a random key as the basis for determining the security of the protocol. A random bit b is chosen; if b = 0 then Ks is returned, while if b = 1 a random string is returned from the same distribution as session keys.

The Send(U, s, M) query implicitly assumes a specification of the protocol which is being analysed. This is provided in a rather informal manner in typical use of the Bellare–Rogaway model, by simply stating what each principal should do on receipt of each protocol message. In our formal specification we give abstract operations corresponding to each possible message that can be invoked by the adversary. We explicitly define instances of each principal. The Reveal(U, s) query corresponds to an explicit operation in the specification. Figure 1 shows the schema. The principal and instance are chosen as inputs, and if the instance has accepted then its name is added to the set of revealed instances and the key it accepted is added to the set of exposed keys. We did not explicitly model the Corrupt(U, K) query in our specification since it was not needed to demonstrate the attack. This can easily be added, however, by allowing the long-term key of principal U to be added to the set of keys known to the adversary. Since the Test(U, s) query models the probabilistic advantage of the adversary (discussed further below), we deliberately avoid modelling this query explicitly.
op schema Reveal is
  dec
    p? : Players;
    i? : Instances
  pred
    (p?,i?) in accepted;
    revealed' = revealed union {(p?,i?)};
    exposedKeys' = exposedKeys union {entityKeys(p?,i?)};
    changes_only{revealed,exposedKeys}
end Reveal;
Fig. 1. Sum schema specifying Reveal(U, s) query
Instead we look only for 'trivial' losses of session keys by recording which keys are known to the adversary. This allows us to keep the specification simple, while at the same time it seems sufficient to capture mechanistic attacks which can be missed when concentrating on a complexity-theoretic analysis.

2.2 Defining Security

Success of the adversary is measured in terms of its advantage in distinguishing the session key from a random key after running the Test query. If we define Good-Guess to be the event that the adversary guesses correctly whether b = 0 or b = 1 then

Advantage = 2 · Pr[Good-Guess] − 1.

A critical element in the definition of security is the notion of the partner of an oracle, which captures the idea of the principal with which any oracle 'thinks' it is communicating. The way of defining partner oracles has varied in different papers using the Bellare–Rogaway model. In the more recent research, partners have been defined by having the same session identifier (SID), which consists of a concatenation of the messages exchanged between the two. Partners must both have accepted the same session key and recognise each other as partners. The Test query may only be used for an oracle which has not been corrupted and which has accepted a session key that has not been revealed in a Reveal query. In addition, the partner of the oracle to be tested must not have had a Reveal query. Bellare and Rogaway define a protocol in this model to be secure if:

1. when the protocol is run without any intervention from the adversary, both principals accept the same session key;
2. Advantage is a computationally negligible function (it decreases faster than any polynomial function in terms of the length of the cryptographic keys).

The first condition is a completeness criterion that guarantees that the protocol will run as expected in normal circumstances. The second condition says that the adversary is unable to find anything useful about the session key after interacting in the specified way.
One may ask how this condition relates to more conventional protocol goals. For example, key establishment protocols are typically required to provide key authentication, which means that a principal should know which other principals have, or may have, the new session key. Although the above definition appears to be concerned only with key confidentiality, it does imply key authentication. For suppose that the session key is known to an oracle Π_U^s different from that recorded in an oracle Π_U^t to be tested. Then Π_U^s is not the partner of Π_U^t, so it can be opened by the adversary, and so the protocol cannot be secure. A key is defined as fresh in the Bellare–Rogaway model if it has been accepted by an oracle which has not been opened, its partner (if any) has not been opened, and the user it represents has not been corrupted. Since the Test query can only be performed on oracles with fresh keys, all keys accepted in a secure protocol must be fresh. In the formal specification every schema operation may potentially change the state of the protocol. Rather than define security directly, we found it more natural to test the insecurity of the state of the protocol in the schema operation Insecure shown in Figure 2. The set exposedKeys is specified to denote the set of exposed keys. The only mechanism that can update exposedKeys is the operation Reveal. The Insecure operation verifies that for some key in the set of exposed keys there exists an instance of a player (an oracle) such that the following conditions hold:

1. the oracle has accepted the exposed key;
2. the oracle has not been revealed (and has accepted the exposed key); and
3. the partner of the oracle has not been revealed and has accepted the exposed key.

This means that a Reveal query can be used to obtain a key that should not be known by the adversary. The Insecure operation may be invoked at any time to test whether an insecure state has been reached.

op schema Insecure is
  pred
    exists k: Keys @
      (k in exposedKeys) and
      (exists p: Players; i: Instances @
        ((p,i) in accepted diff revealed) and
        (entityKeys(p,i) = k) and
        (partnerOf(p,i) in accepted diff revealed));
    changes_only{}
end Insecure
Fig. 2. Sum schema specifying an insecure state
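As a cross-check of this definition, the following OCaml sketch mirrors the state components used by the specification (accepted, revealed, exposedKeys, partnerOf) and transcribes the Insecure predicate: a state is insecure when some exposed key is held by an oracle that has accepted but has not been revealed and whose partner has likewise not been revealed. The concrete types and the insecure function are our own illustrative encoding, not part of the Sum specification.

(* Oracles are (player, instance) pairs, as in the specification. *)
type entity = int * int

type state = {
  accepted     : entity list;          (* oracles that have accepted a key *)
  revealed     : entity list;          (* oracles asked a Reveal query *)
  exposed_keys : int list;             (* keys returned by Reveal queries *)
  entity_key   : entity -> int;        (* key accepted by each oracle *)
  partner_of   : entity -> entity;     (* partner function *)
}

(* Direct transcription of the Insecure predicate. *)
let insecure st =
  let unrevealed o =
    List.mem o st.accepted && not (List.mem o st.revealed) in
  List.exists
    (fun k ->
       List.exists
         (fun o -> unrevealed o && st.entity_key o = k
                   && unrevealed (st.partner_of o))
         st.accepted)
    st.exposed_keys

In the animation of Section 3, the query Reveal{2/p?,0/i?} puts the key of oracle (2,0) into exposed_keys; since a server oracle has accepted the same key, is still its own partner, and has not been revealed, insecure then returns true.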
3 Specification and Analysis

Jakobsson and Pointcheval's protocol [14] was designed specifically for use between a low-power computational device and a powerful server. It is a combination of a Diffie-Hellman exchange and a Schnorr signature. Computation takes place in a subgroup of Z_p^*, where p is a suitably large prime.
The element g generates the subgroup, which has prime order q with q | (p − 1). In Figure 3 the low-power entity A is able to complete all the computations required to send the first message off-line, as long as the public key of the server, B, is previously available. This means that only simple calculations need to be performed by A during the protocol run itself. A by-product of the improved efficiency is that forward secrecy is not provided, since once x_B is compromised any previously agreed session keys become available.

Shared Information: Three hash functions h1, h2, h3 with outputs of lengths l1, l2, l3 respectively. Security parameter k (suggested value k = 64).

A:  r_A ∈_R Z_q;  K = y_B^{r_A};  t_A = g^{r_A};  r = h2(g^{r_A})

Message 1:  A → B:  B, t_A, r

B:  K = t_A^{x_B};  X = h3(y_B, t_A, K);  e ∈_R {0,1}^k

Message 2:  B → A:  X, e

A:  check X ?= h3(y_B, t_A, K);  d = r_A − e·x_A mod q

Message 3:  A → B:  A, d

B:  check r ?= h2(g^d · y_A^e);  K_AB = h1(y_A, t_A, K)
Fig. 3. Jakobsson-Pointcheval protocol

Figure 3 shows a traditional informal description of the protocol. This is the original version. In the revised version the message field r was defined by r = h2(g^{r_A}, y_B, t_A, K); this change connects the signature (d, r) to the session key, to prevent an impersonation attack discovered by Wong and Chan [18]. The protocol, in both its original and revised versions, carried a proof of security in the Bellare–Rogaway model. We specified the Jakobsson–Pointcheval protocol in Sum in a very abstract manner. We used the Possum animator to explore the specification and demonstrate the attack.

3.1 Data Types and Protocol State

The data types that are central to the specification can be broadly classified into the following categories:

1. Names for every instance of every player. Every oracle, which is an instance of a protocol principal, is specified by an ordered pair of the form (playerNumber, instanceNumber). When any two instances have the same playerNumber, it is assumed that both instances contain the same long-term secret values that the player is specified to possess. The data type for these 2-tuples is named Entities == Players × Instances.
2. Names for every type of message that could be exchanged in the protocol. Every message which the original protocol design specifies for communication is specified as a type. Therefore, the specification contains as many message types as the number of messages communicated during a single protocol run without any intervention from the adversary. Every communication of the protocol can then be specified as a collection of message types sent by one oracle to another.

The state of the protocol is specified to hold a set of datasets which provides the global view of the protocol that an adversary could possess. The person animating the specification then has the view of an adversary who has access to every message communicated by every entity and who has the capability to induce any entity to change state by providing the appropriate inputs. The adversary obtains such inputs either by choosing random elements belonging to a particular message type or by using oracle queries to the various entities, which are represented as schema operations. The schema operations are the only mechanisms that can be used to change the state of the protocol.
3.2 Specification of Secret Values

The cryptographic operations using the secret keys that various entities possess are not explicitly modelled. Rather, they are modelled as operations that a valid entity, which possesses a set of valid secret keys, can perform. For example, the oracles that are modelled by the schema operation SendMessage1 are assumed to possess the long-term private key of the server. Therefore, this schema specifies the capability to compute shared Diffie-Hellman keys, which is specified using the function dhKeys(.). Thus, this schema specifies the operations that any instance of the server would perform when it receives the first communication of the protocol run. Similarly, the entities that are modelled by the schema operation SendMessage2 are assumed to possess the long-term private key of the corresponding client. Therefore, this schema specifies the capability to perform Schnorr signature operations by the following:

.......
d' = d + 1;
dSeenBy' = dSeenBy func_override {(p?, j?) |--> d};
.......

That is, it updates the signature counter, specified by d, and the "seen by" function, specified by dSeenBy, which is specified to contain either the set of valid signatures that valid entities generated during some past operation or the set of claimed signatures that a particular entity received. A claimed signature is valid if it is claimed to be from an entity that indeed generated the signature, and invalid if it is claimed to be from an entity that did not generate it. The schema operation SendMessage3 specifies a signature verification operation using the dSeenBy function along with some other functions. That is, valid signature generators, who possess the corresponding long-term private key, can write to the dSeenBy dataset, and any verifier can read the dSeenBy dataset. A rough sketch of this bookkeeping is given below.
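An OCaml analogue of this bookkeeping might look as follows, with the counter d and the dSeenBy table named after the specification (the OCaml encoding itself is ours): signature generation stamps the signing oracle with the current counter value and increments the counter, while verification merely looks the stamped value up and compares it with what was received.

(* Adversary's-eye view of Schnorr signatures, abstracted as counter
   values rather than actual group computations. *)
module OracleMap = Map.Make (struct
  type t = int * int                  (* (player, instance) *)
  let compare = compare
end)

type sig_state = { d : int; d_seen_by : int OracleMap.t }

(* SendMessage2: the signing oracle records the current value and the
   counter is incremented, matching d' = d + 1 in the schema. *)
let generate_signature oracle st =
  { d = st.d + 1; d_seen_by = OracleMap.add oracle st.d st.d_seen_by }

(* SendMessage3: verification succeeds iff the claimed signer's recorded
   value matches the value the verifier received. *)
let verify_signature claimed_signer received st =
  match OracleMap.find_opt claimed_signer st.d_seen_by with
  | Some v -> v = received
  | None -> false

Only entities holding the private key ever call generate_signature, which captures the "writers versus readers" discipline on dSeenBy described above.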
3.3 The Initialisation Operation

The protocol state is initialised using the schema named init. The initialisation function deletes the memory of every oracle, initialises the counters for the respective message types, initialises the lists of accepted and revealed oracles, and updates the partnerOf function. Immediately after initialisation, every oracle is a partner of itself. The reason for this initialisation is to allow a consistent definition of the security of the protocol state, as discussed in Section 2.2. This is because security depends on both the oracle and its partner (which may or may not exist) not having been asked a Reveal query. The schema operation MatchPartner, shown in Figure 4, is the only mechanism that can alter the behaviour of the partnerOf function. The MatchPartner schema operation will update the partnerOf function only when certain preconditions are satisfied, which ensure that both parties possess the same session identifier.

op schema MatchPartner is
  dec
    i?: Instances;
    client?: Players
  pred
    exists c: Instances |
      (bSeenBy(0,i?) = bSeenBy(client?,c)) and
      (aSeenBy(0,i?) = aSeenBy(client?,c)) and
      (rSeenBy(0,i?) = rSeenBy(client?,c)) and
      (dSeenBy(0,i?) = dSeenBy(client?,c)) and
      (eSeenBy(0,i?) = eSeenBy(client?,c)) @
        (partnerOf' = partnerOf func_override
           {(0,i?) |--> (client?,c), (client?,c) |--> (0,i?)});
    // this server instance thinks that client? is its partner
    changes_only{partnerOf}
end MatchPartner;
Fig. 4. Sum specification of partner function
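The partner-matching rule can be paraphrased in OCaml as follows (our own encoding; sid_of and instances_of are assumed accessors over the five "seen by" datasets): a server oracle (0, i) and an instance c of a client become partners exactly when their session identifiers coincide, and otherwise the partner function is left unchanged.

(* An oracle's session identifier: the five values it has seen
   (a, b, r, d, e in the specification). *)
type sid = { a : int; b : int; r : int; d : int; e : int }

(* MatchPartner: returns the updated partner function, or None when the
   precondition fails because no client instance shares the server's SID. *)
let match_partner ~sid_of ~instances_of partner_of i client =
  let server = (0, i) in
  match
    List.find_opt
      (fun c -> sid_of (client, c) = sid_of server)
      (instances_of client)
  with
  | None -> None
  | Some c ->
      Some
        (fun o ->
           if o = server then (client, c)
           else if o = (client, c) then server
           else partner_of o)

Because every oracle starts as its own partner, a failed match, as in the number-12 steps of Figure 5 below, leaves the server oracle self-partnered, which is exactly the situation the Insecure operation exploits.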
3.4 Analysis of the Specification

When a command can be executed, that is, when its preconditions are met, the animator provides an output (a solution). The solution is the state information before and after the operation. Figure 5 shows a series of animation commands that were executed. Each numbered command is a user input, and the number is incremented if the command is successful. The outputs are not provided here to save space; instead the string "solution" is inserted to represent the state information. When there is no solution the animator outputs the string "no solution." Finally an insecure state is reached, revealing a successful attack. (In fact the attack shown is different from the one given by Wong and Chan [18], although it exploits the same weakness.)
Towards a Formal Specification of the Bellare-Rogaway Model for Protocol Analysis
59
4 MutualAuth: StartClient{1/p?,0/j?}
solution
5 MutualAuth: StartClient{2/p?,0/j?}
solution
6 MutualAuth: SendMessage1{0/i?,10/b?,31/r?}
solution
7 MutualAuth: SendMessage1{1/i?,11/b?,30/r?}
solution
8 MutualAuth: SendMessage2{1/p?,0/j?,20/a?,41/e?}
solution
9 MutualAuth: SendMessage2{2/p?,0/j?,21/a?,40/e?}
solution
10 MutualAuth: SendMessage3{0/i?,2/client?,51/d?}
solution
11 MutualAuth: SendMessage3{1/i?,1/client?,50/d?}
solution
12 MutualAuth: MatchPartner{0/i?,1/client?}
no solution
12 MutualAuth: MatchPartner{1/i?,1/client?}
no solution
12 MutualAuth: MatchPartner{1/i?,2/client?}
no solution
12 MutualAuth: MatchPartner{0/i?,2/client?}
no solution
12 MutualAuth: Insecure
no solution
12 MutualAuth: Reveal{2/p?,0/i?}
solution
13 MutualAuth: Insecure
solution
Fig. 5. Animated attack on the Sum specification
Note that during the initialisation phase, every oracle was specified to be a partner of itself. Therefore, if an oracle has accepted a session key and is still its own partner (that is, the schema operation named MatchPartner fails to match a partner for this oracle), then the protocol state can become insecure when an appropriate Reveal operation is performed. Therefore, to provide security, it is important that the schema operation MatchPartner has a solution whenever the corresponding SendMessage3 schema operation has a solution. The correspondence between these schema operations is specified by the input variable denoting the instance of a server oracle (i?) and the input variable denoting the client (client?). The proposed "fix" to the protocol by Jakobsson and Pointcheval adopts such a mechanism by specifying a similar sequence of operations for the SendMessage3 and MatchPartner schema operations. The whole attack of Figure 5 can also be read as a short adversary program, sketched below.
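The sketch below replays the animation against a hypothetical transition-system interface: step and insecure are assumed functions standing for the schema operations, and the interface itself is our invention, but the operation names and arguments mirror the transcript of Figure 5.

(* The Fig. 5 attack as a script: "MatchPartner" fails for every oracle,
   yet a Reveal on client 2 exposes a key that an unrevealed (and still
   self-partnered) server oracle has accepted, so Insecure succeeds. *)
type op =
  | StartClient of int * int                 (* p?, j? *)
  | SendMessage1 of int * int * int          (* i?, b?, r? *)
  | SendMessage2 of int * int * int * int    (* p?, j?, a?, e? *)
  | SendMessage3 of int * int * int          (* i?, client?, d? *)
  | Reveal of int * int                      (* p?, i? *)

let attack =
  [ StartClient (1, 0); StartClient (2, 0);
    SendMessage1 (0, 10, 31); SendMessage1 (1, 11, 30);
    SendMessage2 (1, 0, 20, 41); SendMessage2 (2, 0, 21, 40);
    SendMessage3 (0, 2, 51); SendMessage3 (1, 1, 50);
    Reveal (2, 0) ]

(* [step] and [insecure] are assumed implementations of the schema
   operations; [replay] answers whether the final state is insecure. *)
let replay step insecure init =
  insecure (List.fold_left step init attack)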
4 Conclusion We have demonstrated that the Bellare–Rogaway model can be specified formally in a state-based specification with a simplified security definition. Using an example we have shown that this model could have found attacks that were missed even when protocols have been proven secure in the model. We believe that such an approach complements the human derived proofs. Furthermore the specification can help to clarify the meaning of these proofs, and animation of the specification allows the model to be more accessible to practitioners. We regard the work in this paper as a first step, demonstrating the potential of unifying cryptographic proofs and formal specifications. There are a number of ways that we are planning to extend this work. – We intend to conduct a similar analysis of other protocols to gain a better understanding of how best to use the model. – We plan to explore protocols which use different cryptographic properties. These will be modelled using different oracles available to the adversary, particularly encryption and decryption oracles. – We would like to experiment with use of purpose-built model checkers to automate searching of the specification. – We would like to use a theorem prover to provide machine checkable proofs at a high level to complement the human derived proofs. – We intend to specify other models which provide cryptographic reduction proofs, particularly Canetti-Krawczyk’s modular method [11]. The ultimate goal of this work would be a fully formalised proof of security for cryptographic protocols which is able to capture the probabilistic reduction to a known cryptographic primitive or computational problem. At present such a goal appears out of reach.
References

1. Martín Abadi and Phillip Rogaway. Reconciling two views of cryptography (the computational soundness of formal encryption). Journal of Cryptology, to appear.
2. B. Potter, J. Sinclair, and D. Till. An Introduction to Formal Specification and Z. Prentice Hall, 1991.
3. M. Bellare and P. Rogaway. Provably secure session key distribution – the three party case. In Proceedings of the 27th ACM Symposium on the Theory of Computing, 1995.
4. Mihir Bellare, David Pointcheval, and Phillip Rogaway. Authenticated key exchange secure against dictionary attacks. In Advances in Cryptology – Eurocrypt 2000, pages 139–155. Springer-Verlag, 2000.
5. Mihir Bellare and Phillip Rogaway. Entity authentication and key distribution. In Advances in Cryptology – CRYPTO'93, pages 232–249. Springer-Verlag, 1993. Full version at www-cse.ucsd.edu/users/mihir.
6. S. Blake-Wilson and A. Menezes. Security proofs for entity authentication and authenticated key transport protocols employing asymmetric techniques. In Security Protocols Workshop. Springer-Verlag, 1997.
7. Simon Blake-Wilson and Alfred Menezes. Authenticated Diffie-Hellman key agreement protocols. In Selected Areas in Cryptography, pages 339–361. Springer-Verlag, 1999.
8. Victor Boyko, Phillip MacKenzie, and Sarvar Patel. Provably secure password-authenticated key exchange using Diffie-Hellman. In Advances in Cryptology – Eurocrypt 2000. Springer-Verlag, 2000.
9. Emmanuel Bresson, Olivier Chevassut, and David Pointcheval. Provably authenticated group Diffie-Hellman key exchange – the dynamic case. In Advances in Cryptology – Asiacrypt 2001, pages 290–309. Springer-Verlag, 2001.
10. Emmanuel Bresson, Olivier Chevassut, David Pointcheval, and Jean-Jacques Quisquater. Provably authenticated group Diffie-Hellman key exchange. In CCS'01, pages 255–264. ACM Press, November 2001.
11. Ran Canetti and Hugo Krawczyk. Analysis of key-exchange protocols and their use for building secure channels. In Advances in Cryptology – Eurocrypt 2001, volume 2045 of LNCS, page ??. Springer-Verlag, 2001. http://eprint.iacr.org/2001/040.pg.gz.
12. Dieter Gollmann. Authentication – myths and misconceptions. Progress in Computer Science and Applied Logic, 20:203–225, 2001.
13. Dan Hazel, Paul Strooper, and Owen Traynor. Possum: An animator for the Sum specification language. In Asia-Pacific Software Engineering Conference and International Computer Science Conference, pages 42–51. IEEE Computer Society, 1997.
14. Markus Jakobsson and David Pointcheval. Mutual authentication for low-power mobile devices. In Proceedings of Financial Cryptography, pages ??–??. Springer-Verlag, 2001.
15. Gavin Lowe. Breaking and fixing the Needham-Schroeder public key protocol using FDR. In Tools and Algorithms for the Construction and Analysis of Systems, pages 147–166. Springer-Verlag, 1996.
16. John C. Mitchell, Mark Mitchell, and Ulrich Stern. Automated analysis of cryptographic protocols using Murφ. In IEEE Symposium on Security and Privacy, pages 141–151. IEEE Computer Society Press, 1997.
17. Lawrence Paulson. The inductive approach to verifying cryptographic protocols. Journal of Computer Security, 6:85–128, 1998.
18. Duncan S. Wong and Agnes H. Chan. Efficient and mutually authenticated key exchange for low power computing devices. In Advances in Cryptology – Asiacrypt 2001, pages 272–289. Springer-Verlag, 2001.
Critical Critical Systems

Susan Stepney

Department of Computer Science, University of York, Heslington, York, YO10 5DD, UK.
[email protected]
Abstract. I discuss the view of communication networks as self-organised critical systems, the mathematical models that may be needed to describe the emergent properties of such networks, and how certain security hygiene schemes may push a network into a super-critical state, potentially leading to large-scale security disasters.
1 Introduction
The word “critical” is used in two very different technical senses, both of which are appropriate for considering the security of communication networks, such as the Internet. The use probably most familiar to delegates at this conference is that of “being indispensable or vital”, of being a “high consequence” system. So we have safety- and security-critical systems, where safety and security are indispensable (or even literally vital) issues of concern to the users of the systems. The second use is that of “being at a turning point, or a sudden change”. I explore this second sense further here (within the context of the first sense). In particular, I discuss the notion of “super-critical” systems, systems in a state where the potential for the “sudden change” is magnified, and discuss whether certain security defences may be increasing the probability of such states.
2 Critical Systems

2.1 Controlled Critical Systems
Critical systems can exist in two (or more) phases, dependent on some controlling parameter. These phases are separated by a complex boundary state, “the edge of chaos”, where the controlling parameter has a critical value. The prototypical example is a physical phase change where one phase is an ordered “frozen” solid or liquid, the other is a random “chaotic” liquid or gas (the word gas has the same root as the word chaos), and the controlling parameter is temperature. In other cases there are analogues of these states. Systems in the boundary state are in some sense “fluid”, and often have particularly interesting behaviour: neither too frozen nor too chaotic. These systems are in a state of
constant churning flux and change: they are far from equilibrium. The study of physical systems has tended to focus on equilibrium states, however, as these are much easier to model. Another physical example is the transition from an unmagnetised to a magnetised state through the critical temperature. Road traffic flow exhibits an ordered state of free laminar flow, a chaotic state of gridlock, and a transition state of propagating jams on all scales, controlled by the traffic flow rate parameter [29]. Pushing the analogy further, computation may be viewed as the interesting and complex “edge of chaos” between pure frozen memory and pure chaotic process.
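The road traffic example lends itself to a concrete illustration. The sketch below is a minimal Nagel-Schreckenberg-style cellular automaton, a standard traffic model though not one cited in this paper; here car density stands in for the flow-rate control parameter, and vmax, the braking probability, and the densities swept are illustrative choices. At low density, throughput grows with density (free laminar flow); past a critical density, spontaneous jams form and throughput collapses.

```python
# A minimal Nagel-Schreckenberg-style traffic cellular automaton (illustrative
# parameters; a sketch, not the model used in the cited work).
import random

def step(road, vmax=5, p_slow=0.3):
    """One parallel update of a circular road; road[i] is a car's speed or None."""
    n = len(road)
    cars = [i for i in range(n) if road[i] is not None]
    new = [None] * n
    for idx, i in enumerate(cars):
        gap = (cars[(idx + 1) % len(cars)] - i - 1) % n   # free cells ahead
        v = min(road[i] + 1, vmax, gap)                   # accelerate, avoid collision
        if v > 0 and random.random() < p_slow:            # random braking
            v -= 1
        new[(i + v) % n] = v
    return new

def mean_flow(density, cells=1000, steps=400):
    """Average cell-moves per cell per step: a throughput measure."""
    road = [0 if random.random() < density else None for _ in range(cells)]
    total, recorded = 0.0, 0
    for t in range(steps):
        road = step(road)
        if t >= steps // 2:                               # discard the transient
            total += sum(v for v in road if v) / cells
            recorded += 1
    return total / recorded

for d in (0.05, 0.10, 0.15, 0.25, 0.50):
    print(f"density {d:.2f}: flow {mean_flow(d):.3f}")
```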
2.2 Self Organising Critical Systems
In these physical examples the controlling parameter needs to be externally tuned to hold the system at the critical point. A self organising critical system (SOCS), on the other hand, adjusts itself so that its controlling parameter moves to the critical value. Its dynamics has an attractor at the critical point; the critical point is an emergent property of the dynamics. The prototypical example of a SOCS is that of slowly adding simulated sand to a simulated sand pile [2]. The controlling parameter is the slope. The frozen state has low slope: nothing happens. The chaotic state has high slope: the pile collapses. The system self-organises to the critical slope: if the slope is low, nothing happens, and adding more sand to the pile increases the slope; if the slope is high, the pile collapses, reducing the slope. At the critical slope, avalanches happen on all scales, from a single grain to the entire pile, with a power law distribution in the frequency of their size; so no preferred or typical size is singled out. (It turns out that real sand is too dense to behave in this way; but rice can.) SOCS are typified by such power laws: they lack any scale in time or space, exhibiting power law temporal fluctuations and fractal spatial organisations [19]. This means that events on all scales have the same cause, so no special catastrophic cause is needed to explain the catastrophic events. Other examples of SOCS may include earthquakes, forest fires (with the controlling parameter being the density of unburned trees), autocatalytic chemical networks [20], ecological food webs [12] and industrial supply webs, stock market prices, control systems with in-built self-regulation, and large communication networks.
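The sandpile dynamics just described can be captured in a few lines. The following is a minimal sketch in the spirit of the simulated sand pile of [2]; the grid size, threshold, and grain count are illustrative choices, not taken from the reference. Grains are dropped slowly, a site holding four or more grains topples, shedding one grain to each neighbour (grains fall off the edge), and a histogram of avalanche sizes exhibits the scale-free distribution discussed above.

```python
# A minimal sandpile sketch (illustrative parameters; not the cited model).
import random
from collections import Counter

N, THRESHOLD = 30, 4
grid = [[0] * N for _ in range(N)]

def add_grain():
    """Drop one grain at a random site; return the resulting avalanche size."""
    x, y = random.randrange(N), random.randrange(N)
    grid[x][y] += 1
    toppled = 0
    stack = [(x, y)]
    while stack:
        i, j = stack.pop()
        while grid[i][j] >= THRESHOLD:
            grid[i][j] -= 4                      # topple: shed to 4 neighbours
            toppled += 1
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < N and 0 <= nj < N:  # grains at the edge fall off
                    grid[ni][nj] += 1
                    stack.append((ni, nj))
    return toppled

sizes = Counter(add_grain() for _ in range(50_000))
for s in (1, 2, 4, 8, 16, 32, 64):
    print(f"avalanches of size {s:3}: {sizes[s]}")
```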
2.3 Communication Networks
Communication networks and their security are what concern us here. Evidence for a move towards critical behaviour of comms networks includes network traffic jams, computer virus propagation [22], and small changes in administrative policies occasionally having cascading knock-on effects (for example, restricting a protocol that is managing other protocols or resources can introduce a bottleneck; legitimately turning off a data flow can result in a chain of resource
exhaustion effects due to upstream application buffer exhaustion; security access controls stopping even a small percentage of system traffic can result in the higher layer protocols causing a cascading halt [9]). As networks become ever larger and more dynamic, their behaviour becomes ever more internally adaptive [1], rather than being hand-tuned by external SysAdmins, and they move from being simple critical systems to full SOCS. Any defences we wish to design need to take this into account. SOCS have a driving force timescale very much longer than the relaxation force timescale [19]. Some kind of pressure slowly builds up on the slower driving force timescale, until it is big enough to overcome a threshold, leading to “cascades” of relaxation on the faster timescale. The difference in timescales means that the entire cascade of avalanche events can be considered to occur between the driving force events. In the case of attacks on a network, attackers can be considered to be applying the driving force. For example, they may attempt to cause jams by flooding the network with messages. One particular kind of attack is to attempt to tune a system “to the brink”, and then use one small change to push it over the edge. For example, attempting to nearly exhaust each of a range of resources, and then letting a final small resource request (possibly made by an innocent third party) push the entire system over the limit. These kinds of attack can actually be harder to achieve in the context of a SOCS, for two reasons. Firstly, the system is already close to critical, so a small change may simply trigger a cascade, stopping the attacker being able to build up a large pressure. (Of course, the system being close to critical, a small change may well trigger a large cascade. But that is a different issue, and no different for an attacker than for a legitimate user.) Secondly, the attacker may have less access to the detailed behaviour of the system, so may not be able to fine tune an attack. For example, it is difficult to fine tune a resource exhaustion attack in the face of an adaptive stochastic resource allocation policy. It should be noted that there is not a perfect analogy between intelligent attacks and a classic SOCS. As noted, with a classic SOCS, the driving force timescale is much longer than the relaxation timescale. An intelligent attacker, however, may be able to drive the system on a much shorter timescale, possibly of the same order as the relaxation timescale, and hence build up the pressure much more quickly. New driving force events may occur during the relaxation cascades. For example, an intelligent attacker may be able to design a new virus on a timescale comparable to the defence response time, rather than on the longer evolutionary timescale that nature requires, or simply release multiple diverse viruses simultaneously. This similarity in timescales may result in a qualitatively different behaviour from classic SOCS [29] (but not one that is in any way more understandable or predictable).
3 Modelling Critical Systems

3.1 What to Model
This move towards SOCS mirrors a move in the kinds of models we need to build to understand and design our systems. Classical models emphasise static aspects: entities, states, events, fitness landscapes. SOCS, and nature-inspired models in general, must emphasise dynamic aspects: processes, relationships, environment, growth and change, attractors, trajectories [15]. When we are modelling, designing and predicting a complex network, there are several things we are interested in. There are specific properties we want it to have, such as stability and resilience in the face of errors (parts of a large enough network will always be broken) and growth (new instances and new kinds of nodes, connections, and communications); availability and throughput properties; and the like. There are more general properties, such as what information the system needs in order to self-organise, and how this information can be made easily available to the system. This raises further concerns, such as whether that availability would compromise privacy requirements. Also this organising information itself becomes a target for attack. Spatial properties of systems are crucial: a SOCS with spatial extent, where quantities can “diffuse” from one neighbourhood to another, behaves very differently from one that is a homogeneous mass [29]; spatially propagating waves of global behaviour can occur. In an artefact such as a communication network, the concept of proximity is not as clear cut as in a natural system: it can mean spatial proximity, but it might also refer to connectivity [26], or even similarity in physical design. It simply needs to be some property that has the potential for supporting some kind of diffusive process.
3.2 Current Modelling Languages
We have a vast resource of modelling languages and techniques to draw upon. There are languages for defining computational processes, such as CSP [17,27] and CCS [24]. More recently, languages designed to cope with mobility, locality, change, and reconfiguration have appeared, such as the pi-calculus [25] and Ambient Logic [6]. There are languages and techniques for performance modelling, such as queuing theory, and Markov models [16]. We need to remember that SOCSs are far from equilibrium, however. There are languages for probabilistic reasoning under uncertainty (for example, [18]). And there are techniques from biology, such as epidemiological models of disease propagation (for example, [8]), and much work on biological and chemical networks (for example [13]).
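As a taste of the epidemiological techniques mentioned above, the following sketch integrates the classic SIR (susceptible-infected-recovered) model; the rate constants are illustrative, and this is the generic textbook model rather than any specific model from [8].

```python
# A minimal Euler integration of the textbook SIR epidemic model:
#   dS/dt = -beta*S*I,  dI/dt = beta*S*I - gamma*I,  dR/dt = gamma*I
def sir(beta=0.3, gamma=0.1, s=0.999, i=0.001, dt=0.1, days=300):
    peak = i
    for _ in range(int(days / dt)):
        new_inf = beta * s * i * dt
        new_rec = gamma * i * dt
        s, i = s - new_inf, i + new_inf - new_rec
        peak = max(peak, i)
    return peak, 1.0 - s - i          # peak prevalence, final epidemic size

print("beta/gamma = 3.0:", sir())            # large outbreak
print("beta/gamma = 0.8:", sir(beta=0.08))   # infection fizzles out
```

Below the epidemic threshold (basic reproduction number beta/gamma less than 1) the infection dies out; above it, a macroscopic fraction of the population is eventually infected: exactly the kind of sharp qualitative transition that matters for virus propagation in networks.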
3.3 Modelling Networks and Emergent Properties
We need more powerful models of complex networks, both artificial and natural. There is much mathematical theory of networks and graphs, but this tends to be of static, homogeneous, structured, closed networks. SOCS on the other hand needs theories of dynamic, heterogeneous, unstructured, open networks.
– Dynamic: it is not in steady state or equilibrium, but is far from equilibrium, governed by attractors and trajectories. Swarm networks may offer insights here [4].
– Heterogeneous: the nodes, the connections, and the communications can be of many different types.
– Unstructured: there is no regularity in the network connectivity: it is not regular, or fully connected, or even random. Some recent advances in Small World networks offer intriguing new insights [3,32] (a minimal construction is sketched after this list).
– Open: the components are not fixed: nodes and connections may come and go; new kinds of nodes and connections may appear.
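As promised above, here is a minimal sketch of the Watts-Strogatz “small world” construction [32]: starting from a regular ring lattice, randomly rewiring even a few percent of the edges collapses the average path length while the network remains locally clustered. Sizes and the rewiring probability are illustrative.

```python
# A minimal Watts-Strogatz rewiring sketch (illustrative parameters).
import random
from collections import deque

def ring_lattice(n=200, k=4):
    """Ring of n nodes, each linked to its k nearest neighbours."""
    return n, {tuple(sorted((i, (i + j) % n)))
               for i in range(n) for j in range(1, k // 2 + 1)}

def rewire(n, edges, p=0.05):
    out = set()
    for a, b in edges:
        if random.random() < p:                    # rewire one endpoint at random
            b = random.randrange(n)
            while b == a or tuple(sorted((a, b))) in out:
                b = random.randrange(n)
        out.add(tuple(sorted((a, b))))
    return out

def avg_path_length(n, edges):
    adj = {i: set() for i in range(n)}
    for a, b in edges:
        adj[a].add(b); adj[b].add(a)
    total, pairs = 0, 0
    for src in range(0, n, 10):                    # sample sources for speed
        dist, q = {src: 0}, deque([src])
        while q:                                   # breadth-first search
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        total += sum(dist.values()); pairs += len(dist) - 1
    return total / pairs

n, edges = ring_lattice()
print("regular lattice:  ", avg_path_length(n, edges))
print("after 5% rewiring:", avg_path_length(n, rewire(n, edges)))
```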
3.4 Modelling Emergent Properties
We also need models that let us express emergent properties, and design and build systems that exhibit these properties. Abstract models are needed to gain deeper understanding, and to help derive and state general laws. Such laws would not help predict fine details of behaviour, but would capture general properties that could be used to guide the understanding and design of systems with emergent properties. [21] puts it well (in the context of evolution, but the argument holds for other kinds of emergent behaviour):

We can never hope to predict the exact branchings of the tree of life, but we can uncover powerful laws that predict and explain their general shape.
3.5 Simulations
In addition to abstract general models, detailed executable simulations are also necessary. Simulation is needed in order to gain knowledge about detailed behaviour, and to make detailed predictions. In some cases simulation may be the only way to gain such insight, as the details of emergent properties may not be predictable in general. However, simulations by themselves may not impart much additional understanding of the system: a complicated, messy, incomprehensible reality has merely been replaced by a complicated, messy, incomprehensible simulation.
4 Artificial Immune Systems
If we are interested in detecting and preventing security attacks in networks, we can take inspiration from the vertebrate immune system. The vertebrate immune system is much more sophisticated than that of lower animals or plants;
it uses antibodies, which allows it to adapt to new previously unseen threats, and remember previously encountered threats. (Nevertheless, it is not infallible.) As well as its defensive function, it also has an important maintenance function. The classical view of the immune system is that of passive guardian. It lies dormant, waiting for attack, and then springs into action, defeating the invader, and then sleeps until the next attack. A more recent view is that of a dynamic SOCS [7], constantly reacting and adapting to its environment. The view of the vertebrate immune system as SOCS is an illuminating metaphor for network security. Artificial Immune Systems [10,28] are currently being developed as general pattern recognition and classifier systems. The application of AIS of interest here is the one that originally spawned the metaphor: intrusion detection and prevention. This takes the analogy of communication network as body, legitimate traffic as the body’s ordinary behaviour, and faults and attacks as wounds and infections (see for example, [30]). Current AIS are very simplistic models of the incredibly complicated natural immune system. Nevertheless, the metaphor immediately raises some issues. It warns us that homogeneity (lack of diversity in hardware and software) in the network may increase vulnerability to attacks. Certain email viruses and internet worms have already exploited this vulnerability. It also suggests that, even with good defences in place, there may be the possibility of (analogues of) “antibiotic resistant superbugs” caused by an analogue of the evolutionary arms race. Indeed, with the attackers being intelligent agents, this possibility is far greater than for natural immune systems with their intelligent protectors producing vaccines and other medicines. Such agents have all the power of (artificial) evolutionary approaches at their disposal, and in addition, can add in intelligent search strategies.
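To make the metaphor slightly more concrete, the following sketch implements negative selection, one classic AIS mechanism due to Forrest and colleagues; the text above does not commit to a particular algorithm, and the string length, matching threshold, and population sizes here are arbitrary illustrative choices. Detectors are generated at random and censored against “self” patterns; anything a surviving detector later matches is flagged as anomalous.

```python
# A minimal negative-selection sketch (illustrative parameters; one of several
# AIS mechanisms, not a description of any system cited in the text).
import random

L, R = 12, 8                          # pattern length, contiguous-match threshold

def matches(detector, s, r=R):
    """r-contiguous-bits matching rule."""
    return any(detector[i:i + r] == s[i:i + r] for i in range(len(s) - r + 1))

def censor(self_set, n_detectors=300):
    """Generate random detectors, discarding any that match 'self'."""
    detectors = []
    while len(detectors) < n_detectors:
        d = "".join(random.choice("01") for _ in range(L))
        if not any(matches(d, s) for s in self_set):
            detectors.append(d)
    return detectors

self_set = {"".join(random.choice("01") for _ in range(L)) for _ in range(50)}
detectors = censor(self_set)

probe_self = random.choice(sorted(self_set))
probe_nonself = "1" * L
print("self flagged:   ", any(matches(d, probe_self) for d in detectors))    # False by construction
print("nonself flagged:", any(matches(d, probe_nonself) for d in detectors)) # typically True
```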
5 Super Critical Systems
Once we recognise that communication networks may be critical systems, even SOCS, we can use our growing understanding of the dynamics of such systems to see that certain kinds of defences may do more harm than good, in the long term. The defences are attempts to stop the many cascades that happen at the critical point. However, such defences may simply push the critical system into a super-critical state. The system becomes “an accident waiting to happen”, and the eventual inevitable cascades are simply bigger [5]. For example, forest fires are a classical example of SOCS. The models are simple grids of “trees”, that randomly “burn”, with the fire jumping to neighbouring trees if it can. This is a kind of percolation model [31], and as such it has a fairly sharply defined density threshold, below which the fires burn out rapidly, above which they consume the entire forest. Natural fires may self-organise real forests to this density threshold. Attempts to control natural fires by putting them out increase the density (particularly of underbrush), which may increase the probability that a future fire will be a large catastrophic cascade. The US is
changing its forest fire policy from the traditional Smokey Bear’s zero tolerance to one where “naturally ignited wildland fires may burn freely as an ecosystem process”. See [14] for an in-depth discussion of Yellowstone Park’s 1988 fire. Another example where technology can push a system into a super-critical state is that of traffic management. At the critical point, between smooth free flow and solid jams, the throughput is maximised. A management system artificially moves the critical point by controlling the traffic, enabling higher throughput. But if the management system goes down, the traffic instantly grid-locks, as it finds itself in an unmanaged super-critical state. This is an example of the fragility of efficiency, exhibited by many “Just in Time” systems. Additionally, although throughput is maximised at the critical point, it is also the point where the traffic patterns display maximum unpredictability (cascades of jams on all scales) [29]. So maximum global efficiency leads to maximum unpredictability, which may not be desirable. An example of super-critical state problems much closer to the metaphor of immune systems is that of the so-called “hygiene hypothesis”. Studies suggest that insufficient exposure of the immune system to challenges in childhood, caused by living in a clean environment, may lead to immune system problems such as allergic asthma [23]. (To be fair, the chemicals used to make the home environment clean have not yet been ruled out as causes.) This “use it or lose it” view suggests the immune system needs to be exercised, that SOCS should not be pushed into a super-critical state. If large communication networks are, or are becoming, SOCS, this raises the obvious question: are security hygiene practices making the system super-critical? Are they merely deferring, and magnifying, possible disasters? For example, something as simple as trying to maximise throughput by tuning the system, or allowing it to tune itself, to the critical point could result in accidental denial of service behaviours.
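The forest-fire percolation picture can be demonstrated directly. The sketch below is a minimal site-percolation model in the spirit of [31] (lattice size and densities are illustrative): trees occupy each cell with the given density, a fire starting on the left edge spreads to occupied neighbours, and the fraction of the forest consumed jumps sharply near the percolation threshold, roughly 0.59 for a square lattice.

```python
# A minimal site-percolation "forest fire" sketch (illustrative parameters).
import random

def burn_fraction(density, n=100):
    forest = [[random.random() < density for _ in range(n)] for _ in range(n)]
    burning = [(i, 0) for i in range(n) if forest[i][0]]   # ignite the left edge
    burnt = set(burning)
    while burning:                                          # flood-fill the fire
        i, j = burning.pop()
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < n and 0 <= nj < n and forest[ni][nj] and (ni, nj) not in burnt:
                burnt.add((ni, nj))
                burning.append((ni, nj))
    trees = sum(sum(row) for row in forest)
    return len(burnt) / trees if trees else 0.0

for density in (0.40, 0.55, 0.59, 0.65, 0.80):
    print(f"density {density:.2f}: fraction burnt {burn_fraction(density):.2f}")
```

A hygiene policy that suppresses small fires pushes the density up through this threshold, which is precisely the super-critical “accident waiting to happen”.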
6 Conclusions
Communication networks are, or are becoming, SOCS. We can use our understanding of SOCS to design and predict certain emergent properties of these systems. That understanding is still growing. We need to develop a theory of complex networks: ones that are heterogeneous, unstructured, dynamic, and open. And we need theories and models of emergent properties. We must avoid the trap of pushing the systems into super-critical states, possibly in a misguided attempt to prevent problems. Rather, we should discover how to make high consequence systems sub-critical. This will mean being willing to accept a constant (but low) level of security “illness”. We will need to welcome inefficiency, or “slack” [11] as a requirement of sub-criticality. We may even learn to welcome naive hackers as an immunisation resource!
7 Acknowledgments
I would like to thank John Clark, Howard Chivers and Fiona Polack for interesting discussions about, and comments on, this paper.
References

1. M. J. Addis, P. J. Allen, Y. Cheng, M. Hall, M. Stairmand, W. Hall, D. DeRoure. Spending less time in Internet traffic jams. PAAM99: Proceedings of the Fourth International Conference on the Practical Application of Intelligent Agents and Multi-Agents, 1999.
2. Per Bak. How Nature Works: the science of self-organized criticality. Oxford University Press, 1997.
3. Albert-László Barabási. Linked: the new science of networks. Perseus, 2002.
4. Eric W. Bonabeau, Marco Dorigo, Guy Theraulaz. Swarm Intelligence: from natural to artificial systems. Oxford University Press, 1999.
5. Mark Buchanan. Ubiquity: the science of history. Weidenfeld & Nicolson, 2000.
6. Luca Cardelli and Andrew D. Gordon. Ambient Logic, 2002 (submitted).
7. Irun R. Cohen. Tending Adam’s Garden: evolving the cognitive immune self. Academic Press, 2000.
8. C. Castillo-Chavez, S. Blower, P. van den Driessche, D. Kirschner, eds. Mathematical Approaches for Emerging and Re-emerging Infectious Diseases Part I: An Introduction to Models, Methods, and Theory. Springer, 2002.
9. Howard Chivers. Private communication, 2002.
10. Dipankar Dasgupta, ed. Artificial Immune Systems and Their Applications. Springer, 1999.
11. Tom DeMarco. Slack. Broadway, 2001.
12. B. Drossel, A. J. McKane. Modelling Food Webs. In Handbook of Graphs and Networks. Wiley, 2002.
13. Walter Fontana, Günter Wagner, Leo W. Buss. Beyond digital naturalism. Artificial Life, 1:211–227, 1994.
14. Mary Ann Franke. Yellowstone in the Afterglow: lessons from the fires. 2000.
15. Brian Goodwin. How the Leopard Changed Its Spots: the evolution of complexity. Phoenix, 1994.
16. Olle Häggström. Finite Markov Chains and Algorithmic Applications. Cambridge University Press, 2002.
17. C. A. R. Hoare. Communicating Sequential Processes. Prentice-Hall, 1985.
18. F. V. Jensen. Bayesian Networks and Decision Graphs. Springer, 2001.
19. Henrik Jeldtoft Jensen. Self-Organized Criticality: emergent complex behaviour in physical and biological systems. Cambridge University Press, 1998.
20. Stuart A. Kauffman. The Origins of Order: self-organization and selection in evolution. Cambridge University Press, 1993.
21. Stuart A. Kauffman. At Home in the Universe: the search for laws of complexity. Viking, 1995.
22. Jeffrey O. Kephart and Steve R. White. Directed-Graph Epidemiological Models of Computer Viruses. Proceedings of the IEEE Computer Society Symposium on Research in Security and Privacy, Oakland, California, 1991.
23. P. M. Matricardi, F. Rosmini, S. Riondino, et al. Exposure to foodborne and orofecal microbes versus airborne viruses in relation to atopy and allergic rhinitis: epidemiological study. British Medical Journal, 320:412–417, 2000.
24. Robin Milner. Communication and Concurrency. Prentice-Hall, 1989.
25. Robin Milner. Communicating and Mobile Systems: the pi-calculus. Cambridge University Press, 1999.
26. Robin Milner. Bigraphs as a Model for Mobile Interaction. ICGT: First International Conference on Graph Transformation, LNCS 2505. Springer, 2002.
27. A. W. Roscoe. The Theory and Practice of Concurrency. Prentice-Hall, 1997.
28. Lee A. Segel, Irun R. Cohen, eds. Design Principles for the Immune System and Other Distributed Autonomous Systems. Oxford University Press, 2001.
29. Ricard Solé, Brian Goodwin. Signs of Life: how complexity pervades biology. Basic Books, 2000.
30. A. Somayaji, S. Hofmeyr, S. Forrest. Principles of a Computer Immune System. New Security Paradigms Workshop, 1998.
31. D. Stauffer, A. Aharony. Introduction to Percolation Theory. 2nd edition. Taylor and Francis, 1994.
32. Duncan J. Watts. Small Worlds: the dynamics of networks between order and randomness. Princeton University Press, 1999.
Analysing Security Protocols

Dieter Gollmann

Microsoft Research, 7 J J Thomson Avenue, Cambridge CB3 0FB, United Kingdom
[email protected]

Abstract. We assess how formal methods can contribute to the design and analysis of security protocols. We explain some of the pitfalls when applying formal methods in too naïve a fashion and stress the importance of identifying implicit assumptions about the environment a protocol would be deployed in that may be hidden in verification methodologies or in off-the-shelf security properties.
1 Introduction
The formal analysis of security protocols has been a fertile area of research since the publication of the BAN logic of authentication [8,9]. An overview of earlier work is given in [19]. Research is driven by the perception that the design of security protocols is difficult and error prone, and that finding attacks is too complex for the human mind. Hence, the use of formal methods is advocated for the design and analysis of security protocols. We will examine how formal methods actually help in the design of security protocols, and where their contributions may be detrimental, e.g. by adding to confusion about the goals a security protocol is supposed to achieve. Our assessment will be structured around five ‘promises’ of formal analysis:
– Formal modelling makes security properties precise.
– Formal modelling makes security properties clear.
– Formal specification makes security protocols precise.
– Formal methods can deal with the complexities of security analysis.
– With formal methods we can prove security.

2 Contributions of Formal Analysis
We will take examples from two strands of formal protocol analysis. In cryptography, protocols are analysed by hand, often in a mathematical framework founded in complexity theory. The methodology developed by Bellare and Rogaway is a prime example [4], but is not the only framework for deriving the security guarantees of a protocol from properties of the primitives used. The second strand is protocol analysis in the formal methods community, often based on logics or process algebras, and more likely to come with tool support. Comparing the literature in these two fields, it is surprising how little they know about each other. It is also our aim to help bridge this gap.
2.1 Making Properties Precise
Formal (mathematical) definitions are precise in a technical sense. However, formal definitions do not necessarily capture intuitive security goals precisely. As an example, consider “authentication” and correspondence properties. Frequently, the goal of authentication is described as ‘knowing whom you are talking to’, i.e. to verify the identity of a corresponding party. However, the papers that introduced correspondence properties to the analysis of authentication protocols did not try to capture the authentication of parties but of protocol runs [6,5]. Note that the requirement is that the exchange is authenticated, and not the parties themselves [6]. Nevertheless, correspondence properties are today the definition of choice in much of formal methods research on authentication protocols. A more extensive discussion of this issue can be found in [13]. Note that cryptography today uses entity authentication very much in the sense of the quote above, see e.g. the international standard IS 9798-3 [16] or the definition of entity authentication in [25]. Formal definitions can manufacture artificial security properties. It is a necessary evil that for any given verification method security properties have to be phrased in a way that makes them amenable to analysis. Sometimes, properties can be expressed quite naturally. For example, the assertion

(i, EndInit(r)) → (r, BeginRespond(i))    (1)
in [30] captures nicely the purpose of a challenge-response protocol: the initiator i should commit only if the intended responder r had replied. (The symbol → is read as ‘is preceded by’.) However, the dual assertion in the same paper,

(r, EndRespond(i)) → (i, BeginInit(r))    (2)
requiring the responder to authenticate the source of the challenge, does not match any of the standard authentication objectives found in the security literature. Formal analysis may highlight the necessity to differentiate security goals and properties of security mechanisms. Over time, the cryptographic community has separated various aspects that once were all known as ‘entity authentication’:
– Peer entity authentication: authenticate the party at the other end of a given connection (IS 7498 [12]).
– Key establishment: create a secure session when there is no a-priori connection by establishing a session key; in key agreement both parties contribute to the generation of the session key, in key transport one of the parties generates the session key.
– Entity authentication: verify that the intended correspondent was involved in the protocol run; e.g. through a challenge-response protocol; dead peer detection meets a similar purpose.
(Entity) authentication
– Peer entity authentication (IS 7498)
– Key establishment: key transport, key authentication, key agreement, key confirmation, explicit key authentication
– Entity authentication (IS 9798): dead peer detection

Fig. 1. A taxonomy of authentication
Figure 1 sketches the current taxonomy. Precise definitions can be found in [25]. As an example for a refined taxonomy of security primitives we consider cryptographic hash functions. In early days the technically inaccurate term collision-free was used to indicate that it should not be possible to find two values that hash to the same image. Today, preimage resistance, 2nd preimage resistance, and collision resistance describe resistance to different attacks against hash functions [25]. The development of terminology was to a large extent driven by efforts to formally analyze cryptographic protocols. In this way, formal analysis has led to an improved understanding of security properties and security primitives.
2.2 Making Properties Clear
It is tempting to look for universal security properties, where we could agree on the ‘right’ formal definitions once and for all, and then use these definitions later in security analyses. However, real security requirements are application specific and it is not always appropriate to apply an established requirement to a new application. Thus, security experts and application experts have to be able to communicate to validate security requirements. Formal methods may force properties to be expressed in a particular style. On occasion, the style itself or the specification of a particular property may appear non-intuitive to the application experts. Justifying highly formalized technical security definitions is a necessary but onerous and often neglected task, but without such a justification it may remain unclear whether a proof of an arbitrary formal property or an ‘attack’ against such a property are security relevant. Formal definitions are precise but not necessarily clear to the parties that should validate if a given property captures their requirements. Again, an obvious way how formal specifications can clarify security properties is by differentiating between properties that superficially look to be the same.
2.3 Making Protocols Precise
Attempts to formally specify a protocol may expose inconsistencies or omissions in the original specifications. Reports on the formal analysis of SSL/TLS [27] or SET [3,2] like to emphasize the effort that has to be put into arriving at a formal specification and note that the major benefit of the entire exercise is probably not the confirmation that a protocol meets its stated objectives but to highlight problems in the published specifications that could lead to flawed implementations. Formal methods may have idiosyncrasies regarding the specification of certain system aspects. For example, in CSP channels have to be defined to model communications between parties, even if the original protocol did not assume anything about the mode of communications. Artificial components in a model are not a problem per se but are a potential cause for problems when they introduce unwarranted assumptions about the way a protocol works. Particular circumspection is required when the formal specification of the protocol goals refers to a modelling artefact. An ‘attack’ violating such a property may not constitute an attack against the protocol in the environment it was designed for. Formalisation necessarily implies abstraction. A formal specification may thus leave out important details of the system being analyzed. In this respect, the modelling of cryptographic mechanisms has received much attention. If a protocol is formalized in a way that captures only the idealized behaviour of cryptographic algorithms, attacks that depend on specific properties of the algorithms go undetected. Conversely, implementations that use inadequate mechanisms may be subject to attacks that do not exist at the abstract level. We are facing the problem that a high-level abstract protocol specification, despite being precise in a formal sense, may not be precise enough for an implementer. Indeed, many abstract protocol specifications are silent on time-related aspects, and it is left to the implementers to handle, for example, the timing out of protocol participants. This issue is at the core of the attack presented in [14]. On the other hand, a very detailed protocol specification may be silent on the actual security goals of the protocol, forcing the analyzer to guess its true intentions. In an ideal world, we would be given an abstract high level specification with a clear indication of the protocol goals, and a more detailed concrete specification for the implementers.
2.4 Dealing with Complexity
The academic security protocols studied in the bulk of the literature on protocol analysis aren’t complex at all, nor are the attacks that have been found. New attacks on established protocols are less the fruit of sophisticated analysis than the results of changes in assumptions about the protocol goals or the environment. In fact, once the desired security properties and the assumptions about the environment have been explicitly stated, most attacks are quite obvious and would have been found sooner or later by direct inspection. The one advantage of
tool support for protocol verification is that the attacks are found sooner rather than later. Certainly, security proofs even of simple protocols can become quite complex. It is a matter of debate whether this complexity is unavoidable or whether we have not yet found the appropriate abstractions for conducting security proofs. Similar considerations apply when comparing automated and hand-crafted proofs of mathematical theorems. ‘Industrial’ security protocols like SSL/TLS, SET, or IKE are more complex, in particular at the implementation level. Multi-phase protocols, e.g. for optimistic fair exchange, and multi-party protocols, e.g. for group key agreement, are also more complex. Here, the advantages of formal verification may become more pronounced, in particular when security guarantees are to be derived in settings where parties are allowed to pull out of a protocol run. True complexity, however, is met outside the traditional playing ground of Alice & Bob protocols. For illustration, we briefly sketch a key insertion attack on a key management module documented in [22]. Keys have tags defining their intended use. There are data encrypting keys (tags kds and kdr for sending and receiving encrypted data), key encrypting keys (tags kis and kir), and change keys for re-encrypting keys (tag kc). Only encrypted keys should leave the module. The following table gives the functions for manipulating keys.

function   input                 tags       output             tags
SECENC     DATA, eKm(KD)         -, kds     eKD(DATA)          -
SECDEC     eKD(DATA), eKm(KD)    -, kdr     DATA               -
KEYGEN     eKm(K)                kis        eKm(KG), eK(KG)    kds, kdr
RTMK       eKm(K1), eK1(K2)      kir, kdr   eKm(K2)            kdr
EMKKC      eKm(KM), eKM(K)       kc, -      eKm(K)             kc
EMKKIS     eKm(KM), eKM(K)       kc, -      eKm(K)             kis
EMKKIR     eKm(KM), eKM(K)       kc, -      eKm(K)             kir

Can an attacker with access to encrypted keys insert a key K of her choice encrypted under the master key Km using these functions? It is a non-trivial exercise for the reader to construct such an attack, or to prove that the system is not vulnerable in this way. The solution to this exercise in [22] happens to be an attack that starts from an encrypted key eKm(u) with tag kc (the value of u is unknown!) and a random block interpreted as eu(X) (in the attack the value X remains unknown) and yields eKm(K) with tag kis. Key insertion attacks are now receiving renewed attention [7]. They can be viewed as a particular instance of privilege escalation. Code based access control where code can assert privileges, as used in the .NET framework [21], raises similar issues. In these settings, there is also potential complexity in the formulation of security goals as we may have to account for a large number of privileges. Formal methods are very useful in the security analysis of systems already at minor levels of complexity, but analysing three-line protocols using canonical security properties is not the application that contributes most to our understanding of security.
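This exercise invites exactly the kind of mechanised symbolic search advocated above. The sketch below is speculative: it models only an assumed reading of the three EMKK* rows of the table, not the full module of [22], and it reproduces just the first step suggested by the solution quoted above (re-tagging a key of unknown value as a key encrypting key), not the complete key insertion.

```python
# A speculative symbolic-search sketch: terms are nested tuples e(key, data);
# each rule is an assumed reading of one row of the table above.
from itertools import product

KM = "Km"                                    # the module's master key

def e(key, data):
    return ("e", key, data)

def emkk(tag_out):
    """EMKKC/EMKKIS/EMKKIR: re-encrypt e_KM'(K) under the master key."""
    def rule(x, y):
        (t1, tag1), (t2, _) = x, y
        if (tag1 == "kc" and t1[:2] == ("e", KM)
                and t2[0] == "e" and t2[1] == t1[2]):
            return (e(KM, t2[2]), tag_out)
    return rule

RULES = [emkk("kc"), emkk("kis"), emkk("kir")]

# Attacker's starting knowledge: the encrypted change key e_Km(u) (u unknown)
# and a random block that the module cannot distinguish from e_u(X).
known = {(e(KM, "u"), "kc"), (e("u", "X"), None)}

for _ in range(3):                           # bounded forward closure
    known |= {out for x, y in product(known, repeat=2)
              for r in RULES if (out := r(x, y)) is not None}

# A key of unknown value X is now installed as a key encrypting key:
print((e(KM, "X"), "kis") in known)          # True
```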
2.5 Proving Security
Obviously, it is not possible to ‘prove security’. We can only show that certain properties hold under certain assumptions about the environment. The right choice of properties depends on the given application. The right assumptions about the environment depend on the given application. The results of formal analysis have to be put in context with the given application:

However, the formal analysis process is largely one of working within the rules of a formal system that achieves its results by manipulating uninterpreted symbols. Extensive review of the meanings and implications of the tautologies produced by this process is needed before a supportable conclusion can be drawn on the relevance and applicability of such results. Equally important, the scope of the formal findings as well as the assumptions on which they are based need to be understood [28].

It is thus important to understand the environment one is operating in, and to understand that environments keep changing. The environments most security research originated in can be described as closed systems. In closed (local) systems, names and identities have meaning [11] and there exist central points of control and of authority. There is a clear dividing line between the entities inside the system and those outside. Security mechanisms protected against outside attackers; the principals inside were treated as honest:

If they were people they were honest people; if they were programs they were correct programs [26].

It would be dangerous to read too much into the word ‘honest’. Of course organisations knew that insiders posed a considerable threat to security, but security mechanisms in the IT system would only address a minor part of this threat. Implicit assumptions specific to closed systems are still pervasive in security analysis. In the BAN logic [9], this assumption turns up in many places and is explicitly stated. The cryptologic framework of Bellare and Rogaway [4] makes the same assumption. Hence, these methodologies can only be used for security analysis if the given application is a closed system. As we cannot prove ‘security’ in general but only show that specified properties hold under specified circumstances, we should not exaggerate the value of security proofs. Attackers may still break into a system at a level we have not modelled. There are numerous examples in the cryptographic literature where a provably secure system could be broken. Side-channel analysis of cryptographic devices analyses signals at the physical level to find keys without breaking the cryptographic algorithm. An algorithm provably secure against differential cryptanalysis [29] was attacked using differentials in a non-standard fashion [20]. The AES S-boxes were designed to resist known cryptanalysis methods but their structures may help algebraic analysis [10]. (No such attack has been found yet, though.) When modifying a protocol just to make it ‘provably secure’ we may thus pay a price in performance without gaining substantial security benefits.
3 Challenge for Protocol Analysis
Global e-commerce is a favourite paradigm for describing today’s security challenges. This is an example of an open system, where no party has authority over all participants and there is no central point of control. Another example is that of ad-hoc networks, where network availability takes priority over controlling participation [24,23]. Such open systems should not be confused with closed systems running over an open network, the typical setting for security research in the last two decades. In an open system the enemy is within by default as there is no boundary line to an outside world. We have to deal with insider attacks and security analysis cannot assume that principals are honest. In an open system identities tend to carry little security relevant information. Strong controls on parties joining the system are not necessarily in place, so we cannot rely too much on identity information provided. Even if such controls did exist, having reliable information may be inconsequential when we do not have recourse to someone who has authority over the party we have identified. Changes to the environment have wider implications for the design and analysis of security protocols. Consider, for example, the prudent engineering practices proposed by Abadi and Needham [1] that contain a recommendation to include names of principals in all messages, for the purpose of making it more difficult to use messages from one protocol run in another run. However, including the name of the correspondent in a signed message conflicts with plausible deniability1, a privacy requirement now considered for the IKEv2 key management protocol [18]. Of course, security goals keep changing too. This should come as no surprise as there is a strong pull towards standardisation in security protocols. Hence, new protocols are predominantly designed for new environments or to meet new security requirements. The transition from IP security to mobile IP security illustrates how a change in the environment leads to a re-evaluation of security goals [17]. It is thus an important challenge for protocol analysis to identify implicit assumptions about the environment in verification methodologies and in off-the-shelf security goals so that they are not applied out of context. Crucially, defining security properties is difficult and error prone. Designing protocols that meet well understood requirements is much less of a challenge.
4 Summary and Conclusions
We listed five areas where formal methods promise to advance the design and analysis of security protocols.
1 Two parties can communicate without anyone being able to prove that they did have a conversation, even with the collusion of one of the parties [15].
– Making security properties precise: The rigour of formal analysis helps to refine concepts and terminology. This is indeed a major contribution of formal methods to security analysis, maybe the major contribution. An excellent reference for the concepts and terminology developed in cryptography is [25].
– Making security properties clear: On one hand, a refined terminology does clarify nuances of security properties. On the other hand, highly complex formal definitions may not capture any relevant security properties.
– Making security protocols precise: Preparing a protocol for formal analysis can highlight omissions and inconsistencies. You may have to understand the protocol first before it can be properly formalized.
– Dealing with complexity: This is a promising area but much of the published work on formal protocol analysis does not feature complex protocols or complex attacks.
– Proving security: This is a non-goal. We have to be precise and explicit when specifying security goals and assumptions about the intended environment and remain aware of the limitations of our proofs.
It may seem paradoxical for someone coming from mathematics or formal methods that proofs are (almost) the least important contribution of formal methods to security. However, once you realize that attackers tend to proceed by invalidating assumptions your proof relies on, it becomes apparent that proofs only cover a limited set of threats. Thus, formal methods are better used for symbolic debugging. Failure to find a proof may expose a weakness in a protocol. A successful proof indicates that we have reached the limits of our verification methodology and need a new model to find further attacks. Proofs tell us when we should move on if we want to break a protocol.
References

1. Martín Abadi and Roger Needham. Prudent engineering practice for cryptographic protocols. In Proceedings of the 1994 IEEE Symposium on Research in Security and Privacy, pages 122–136, 1994.
2. Giampaolo Bella, Fabio Massacci, and Lawrence C. Paulson. Verifying the SET registration protocols. IEEE Journal on Selected Areas in Communications, 21(1):77–87, January 2003.
3. Giampaolo Bella, Fabio Massacci, Lawrence C. Paulson, and Piero Tramontano. Formal verification of cardholder registration in SET. In F. Cuppens et al., editor, Computer Security – ESORICS 2000, LNCS 1895, pages 159–174. Springer Verlag, 2000.
4. Mihir Bellare and Phillip Rogaway. Entity authentication and key distribution. In D. R. Stinson, editor, Advances in Cryptology – CRYPTO’93, LNCS 773, pages 232–249. Springer Verlag, 1994.
5. Ray Bird, I. Gopal, Amir Herzberg, Philippe A. Janson, Shay Kutten, Refik Molva, and Moti Yung. Systematic design of a family of attack-resistant authentication protocols. IEEE Journal on Selected Areas in Communications, 11(5):679–693, June 1993.
6. Ray Bird, Inder Gopal, Amir Herzberg, Phil Janson, Shay Kutten, Refik Molva, and Moti Yung. Systematic design of two-party authentication protocols. In J. Feigenbaum, editor, Advances in Cryptology – CRYPTO’91, LNCS 576, pages 44–61. Springer Verlag, 1992.
7. Mike Bond and Ross Anderson. API-level attacks on embedded systems. IEEE Computer, 34(10):67–75, October 2001.
8. Michael Burrows, Martín Abadi, and Roger Needham. Authentication: A practical study in belief and action. In M. Y. Vardi, editor, Theoretical Aspects of Reasoning About Knowledge, pages 325–342, 1988.
9. Michael Burrows, Martín Abadi, and Roger Needham. A logic of authentication. DEC Systems Research Center, Report 39, revised February 22, 1990.
10. Nicolas T. Courtois and Josef Pieprzyk. Cryptanalysis of block ciphers with overdefined systems of equations. In Y. Zheng, editor, Advances in Cryptology – Asiacrypt 2002, LNCS 2501, pages 267–287. Springer Verlag, 2002.
11. Carl M. Ellison, Bill Frantz, Butler Lampson, Ron Rivest, Brian M. Thomas, and Tatu Ylonen. SPKI Certificate Theory, September 1999. RFC 2693.
12. International Organisation for Standardization. Basic Reference Model for Open Systems Interconnection (OSI) Part 2: Security Architecture. Genève, Switzerland, 1989.
13. Dieter Gollmann. Authentication by correspondence. IEEE Journal on Selected Areas in Communications, 21(1):88–95, January 2003.
14. Sigrid Gürgens and Carsten Rudolph. Security analysis of (un-)fair non-repudiation protocols. In this volume.
15. Dan Harkins, Charlie Kaufman, Tero Kivinen, Stephen Kent, and Radia Perlman. Design Rationale for IKEv2, February 2002. Internet Draft, draft-ietf-ipsec-ikev2-rationale-00.txt.
16. International Organization for Standardization. Information technology – Security techniques – Entity authentication mechanisms; Part 3: Entity authentication mechanisms using a public key algorithm. Genève, Switzerland, August 1993. ISO/IEC 9798-3.
17. D. Johnson, C. Perkins, and J. Arkko. Mobility Support in IPv6, January 2003. Internet Draft, draft-ietf-mobileip-ipv6-20.txt.
18. Charlie Kaufman. Internet Key Exchange (IKEv2) Protocol, January 2003. Internet Draft, draft-ietf-ipsec-ikev2-04.txt.
19. Richard A. Kemmerer. Analyzing encryption protocols using formal verification techniques. IEEE Journal on Selected Areas in Communications, 7(4):448–457, May 1989.
20. Lars R. Knudsen and Vincent Rijmen. On the decorrelated fast cipher (DFC) and its theory. In L. Knudsen, editor, Fast Software Encryption – FSE’99, LNCS 1636, pages 81–94. Springer Verlag, 1999.
21. Brian LaMacchia, Sebastian Lange, Matthew Lyons, Rudi Martin, and Kevin Price. .NET Framework Security. Addison Wesley Professional, 2002.
22. D. Longley and S. Rigby. An automatic search for security flaws in key management schemes. Computers & Security, 11(1):75–89, March 1992.
23. Silja Mäki and Tuomas Aura. Towards a survivable security architecture for ad-hoc networks. In B. Christiansen et al., editor, Security Protocols, 9th International Workshop, Cambridge, LNCS 2467, pages 63–73. Springer Verlag, 2002.
24. Silja Mäki, Tuomas Aura, and Maarit Hietalahti. Robust membership management for ad-hoc groups. In Proceedings of the 5th Nordic Workshop on Secure IT Systems (NORDSEC 2000), 2000.
25. A. J. Menezes, P. C. van Oorschot, and S. A. Vanstone. Handbook of Applied Cryptography. CRC Press, Boca Raton, FL, 1997.
26. Roger Needham. Keynote address: The changing environment (transcript of discussion). In B. Christiansen et al., editor, Security Protocols, 7th International Workshop, Cambridge, LNCS 1796, pages 1–5. Springer Verlag, 2000.
27. Lawrence C. Paulson. Inductive analysis of the internet protocol TLS. ACM Transactions on Information and System Security, 2(3):332–351, August 1999.
28. Marvin Schaefer. Symbol security condition considered harmful. In Proceedings of the 1989 IEEE Symposium on Security and Privacy, pages 20–46, 1989.
29. Serge Vaudenay. Provable security for block ciphers by decorrelation. In STACS’98, LNCS 1373, pages 249–275. Springer Verlag, 1998.
30. Thomas Y. C. Woo and Simon S. Lam. A semantic model for authentication protocols. In Proceedings of the 1993 IEEE Symposium on Research in Security and Privacy, pages 178–194, 1993.
Analysis of Probabilistic Contract Signing

Gethin Norman 1 and Vitaly Shmatikov 2

1 School of Computer Science, University of Birmingham, Birmingham B15 2TT, U.K.
[email protected]
2 SRI International, 333 Ravenswood Avenue, Menlo Park, CA 94025, U.S.A.
[email protected]
Abstract. We consider the probabilistic contract signing protocol of Ben-Or, Goldreich, Micali, and Rivest as a case study in formal verification of probabilistic security protocols. Using the probabilistic model checker PRISM, we analyse the probabilistic fairness guarantees the protocol is intended to provide. Our study demonstrates the difficulty of combining fairness with timeliness in the context of probabilistic contract signing. If, as required by timeliness, the judge responds to participants’ messages immediately upon receiving them, then there exists a strategy for a misbehaving participant that brings the protocol to an unfair state with arbitrarily high probability, unless unusually strong assumptions are made about the quality of the communication channels between the judge and honest participants. We quantify the tradeoffs involved in the attack strategy, and discuss possible modifications of the protocol that ensure both fairness and timeliness.
1 Introduction

Consider several parties on a computer network who wish to exchange some items of value but do not trust each other to behave honestly. Fair exchange is the problem of exchanging data in a way that guarantees that either all participants obtain what they want, or none do. Contract signing is a particular form of fair exchange, in which the parties exchange commitments to a contract (typically, a text string spelling out the terms of the deal). Commitment is often identified with the party’s digital signature on the contract. In commercial transactions conducted in a distributed environment such as the Internet, it is sometimes difficult to assess a counterparty’s trustworthiness. Contract signing protocols are, therefore, an essential piece of the e-commerce infrastructure.
Contract signing protocols. The main property a contract signing protocol should guarantee is fairness. Informally, a protocol between A and B is fair for A if, in any situation where B has obtained A’s commitment, A can obtain B’s commitment regardless of B’s actions. Ideally, fairness would be guaranteed by the simultaneous execution of commitments by the parties. In a distributed environment, however, simultaneity cannot be assured unless a trusted third party is involved in every communication. Protocols for contract signing are inherently asymmetric, requiring one of the parties to make the
Supported in part by the EPSRC grant GR/N22960.
Supported in part by DARPA contract N66001-00-C-8015 “Agile Management of Dynamic Collaboration.”
first move and thus put itself at a potential disadvantage in cases where the other party misbehaves.
Another important property of contract signing protocols is timeliness, or timely termination [4]. Timeliness ensures, roughly, that the protocol does not leave any participant “hanging” in an indeterminate state, not knowing whether the exchange of commitments has been successful. In a timely protocol, each participant can terminate the protocol unilaterally and in a timely manner, e.g., by contacting a trusted third party and receiving a response that determines the status of the exchange.
Research on fair contract signing protocols dates to the early work by Even and Yacobi [17], who proved that fairness is impossible in a deterministic 2-party contract signing protocol. Since then, randomized contract signing protocols based on a computational definition of fairness have been proposed [15,16], along with protocols based on gradual release of commitments [12,8], as well as non-probabilistic contract signing protocols that make optimistic use of the trusted third party (a.k.a. the judge). In an optimistic protocol, the trusted third party is invoked only if one of the participants misbehaves [4,18]. In this paper, we focus on probabilistic contract signing, exemplified by the probabilistic contract signing protocol of Ben-Or, Goldreich, Micali, and Rivest [6] (henceforth, the BGMR protocol).
Analysis technique. Our main contribution is a demonstration of how probabilistic verification techniques can be applied to the analysis of fairness properties of security protocols. While formal analysis of fair exchange is a very active area of research (see the related work section below), formalization and verification of fairness in a probabilistic setting is quite subtle. We are interested in verifying fairness guarantees provided by the protocol against an arbitrarily misbehaving participant. Therefore, we endow that participant with nondeterministic attacker operations in addition to the probabilistic behaviour prescribed by the protocol specification. The resulting model for the protocol combines nondeterminism and probability, giving rise to a Markov decision process. We discretize the probability space of the model and analyse chosen finite configurations using PRISM, a probabilistic finite-state model checker.
Timeliness and fairness in the BGMR protocol. The original BGMR protocol as specified in [6] consists of two phases: the “negotiation” phase of pre-agreed duration, in which participants exchange their partial commitments to the contract, and the “resolution” phase, in which the judge issues decisions in case one or both of the participants contacted him during the negotiation phase. The BGMR protocol does not guarantee timeliness. On the one hand, the negotiation phase should be sufficiently long to enable two honest participants to complete the exchange of commitments without involving the judge. On the other hand, if something goes wrong (e.g., a dishonest party stops responding), the honest party may contact the judge, but then has to wait until the entire period allotted for the negotiation phase is over before he receives the judge’s verdict and learns whether the contract is binding on him or not. We study a variant of the BGMR protocol that attempts to combine fairness with timeliness by having the judge respond immediately to participants’ messages, in a manner similar to state-of-the-art non-probabilistic contract signing protocols such as
the optimistic protocols of Asokan et al. [4] and Garay et al. [18]. Our analysis uncovers that, for this variant of the BGMR protocol, fairness is guaranteed only if the judge can establish a communication channel with A, the initiator of the protocol, and deliver his messages faster than A and B are communicating with each other. If the channel from the judge to A provides no timing guarantees, or the misbehaving B controls the network and (passively) delays the judge’s messages, or it simply takes a while for the judge to locate A (the judge knows A’s identity, but they may have never communicated before), then B can exploit the fact that the judge does not remember his previous verdicts and bring the protocol to an unfair state with arbitrarily high probability. We quantify the tradeoff between the magnitude of this probability and the expected number of message exchanges between A and B before the protocol reaches a state which is unfair to A. Informally, the longer B is able to delay the judge’s messages (and thus continue communicating with A, who is unaware of the judge’s attempts to contact him), the higher the probability that B will be able to cheat A. Related work. A variety of formal methods have been successfully applied to the study of nondeterministic contract signing protocols, including finite-state model checking [29], alternating transition systems [22,23], and game-theoretic approaches [9,11,10]. None of these techniques, however, are applicable to contract signing in a probabilistic setting. Since fairness in protocols like BGMR is a fundamentally probabilistic property, these protocols can only be modelled with a probabilistic formalism such as Markov decision processes and verified only with probabilistic verification tools. Recently, Aldini and Gorrieri [1] used a probabilistic process algebra to analyse the fairness guarantees of the probabilistic non-repudiation protocol of Markowitch and Roggeman [26] (non-repudiation is a restricted case of contract signing). Even for non-fairness properties such as secrecy, authentication, anonymity, etc., formal techniques for the analysis of security protocols have focused almost exclusively on nondeterministic attacker models. Attempts to incorporate probability into formal models have been limited to probabilistic characterization of non-interference [20,30,31], and process formalisms that aim to represent probabilistic properties of cryptographic primitives [25]. This paper is an attempt to demonstrate how fully automated probabilistic analysis techniques can be used to give a quantitative characterization of probability-based security properties.
2 Probabilistic Model Checking

Probability is widely used in the design and analysis of software and hardware systems: as a means to derive efficient algorithms (e.g., the use of electronic coin flipping in decision making); as a model for unreliable or unpredictable behaviour (e.g., fault-tolerant systems, computer networks); and as a tool to analyse system performance (e.g., the use of steady-state probabilities in the calculation of throughput and mean waiting time). Probabilistic model checking refers to a range of techniques for calculating the likelihood of the occurrence of certain events during the execution of such a system, and can be used to establish performance measures such as “shutdown occurs with probability at most 0.01” and “the video frame will be delivered within 5ms with probability at least
0.97”. The system is usually specified as a state transition system, with probability measures on the rate of transitions, and a probabilistic model checker applies algorithmic techniques to analyse the state space and calculate performance measures. In the distributed scenario, in which concurrently active processors handle a great deal of unspecified nondeterministic behaviour exhibited by their environment, the state transition systems must include both probabilistic and nondeterministic behaviour. A standard model of such systems is the Markov decision process (MDP) [13]. Properties of MDPs can be specified in the probabilistic branching-time temporal logic PCTL [21,7], which allows one to express properties such as “under any scheduling of nondeterministic choices, the probability of φ holding until ψ is true is at least 0.78/at most 0.04”.

2.1 PRISM Model Checker

We use PRISM [24,28], a probabilistic model checker developed at the University of Birmingham. The current implementation of PRISM supports the analysis of finite-state probabilistic models of the following three types: discrete-time Markov chains, continuous-time Markov chains and Markov decision processes. These models are described in a high-level language, a variant of reactive modules [2] based on guarded commands. The basic components of the language are modules and variables. A system is constructed as a number of modules which can interact with each other. A module contains a number of variables which express the state of the module, and its behaviour is given by a set of guarded commands of the form:

    [] <guard> → <command>;

The guard is a predicate over the variables of the system and the command describes a transition which the module can make if the guard is true (using primed variables to denote the next values of variables). If a transition is probabilistic, then the command is specified as:

    <prob> : <command> + · · · + <prob> : <command>

PRISM accepts specifications in either the logic PCTL or CSL, depending on the model type. This allows us to express various probabilistic properties such as “some event happens with probability 1” and “the probability of cost exceeding C is 95%”. The model checker then analyses the model and checks if the property holds in each state. In the case of MDPs, specifications are written in the logic PCTL, and for the analysis PRISM implements the algorithms of [21,7,5].
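To illustrate the kind of computation such a tool performs, the following Python sketch implements the standard value-iteration fixpoint for the maximum probability of reaching a set of target states in an MDP, i.e., the worst case over all schedulings of the nondeterministic choices. The data representation is our own simplification, not PRISM’s:

    def max_reach_probability(states, choices, target, eps=1e-10):
        # choices[s] is a list of distributions, one per nondeterministic
        # choice in state s; each distribution is a list of
        # (successor, probability) pairs.
        p = {s: 1.0 if s in target else 0.0 for s in states}
        while True:
            delta = 0.0
            for s in states:
                if s in target or not choices[s]:
                    continue
                best = max(sum(pr * p[t] for t, pr in dist)
                           for dist in choices[s])
                delta = max(delta, abs(best - p[s]))
                p[s] = best
            if delta < eps:
                return p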
3 BGMR Protocol

The objective of the probabilistic contract signing protocol of Ben-Or, Goldreich, Micali, and Rivest [6] (the BGMR protocol) is to enable two parties, A and B, to exchange their commitments to a pre-defined contract C. It is assumed that there exists a third party, called the judge, who is trusted by both A and B. The protocol is optimistic,
i.e., an honest participant following the protocol specification only has to invoke the judge if something goes wrong, e.g., if the other party stops before the exchange of commitments is complete (a similar property is called viability in [6]). Optimism is a popular feature of fair exchange protocols [27,3,4]. In cases where both signers are honest, it enables contract signing to proceed without participation of a third party, and thus avoids communication bottlenecks inherent in protocols that involve a trusted authority in every instance.

3.1 Privilege and Fairness

In the BGMR protocol, it can never be the case that the contract is binding on one party, but not the other. Whenever the judge declares a contract binding, the verdict always applies to both parties. For brevity, we will refer to the judge’s ruling on the contract as resolving the contract. Privilege is a fundamental notion in the BGMR protocol. A party is privileged if it has the power to cause the judge to rule that the contract is binding. The protocol is unfair if it reaches a state where one party is privileged (i.e., it can cause the judge to declare the contract binding), and the other is not.

Definition 1 (Probabilistic fairness). A contract signing protocol is (v, ε)-fair for A if, for any contract C, if A follows the protocol, then at any step of the protocol in which the probability that B is privileged is greater than v, the conditional probability that A is not privileged given that B is privileged is at most ε. The fairness condition for B is symmetric.

Probabilistic fairness implies that at any step of the protocol where one of the parties has acquired the evidence that will cause the judge to declare the contract binding with probability x, the other party should possess the evidence that will cause the judge to issue the same ruling with probability of no less than x − ε. Informally, ε can be interpreted as the maximum fairness gap between A and B permitted at any step of the protocol.

3.2 Main Protocol

Prior to initiating the protocol, A chooses a probability v which is sufficiently small so that A is willing to accept a chance of v that B is privileged while A is not. A also chooses a value α > 1 which quantifies the “fairness gap” (see section 3.1) as follows: at each step of the protocol the conditional probability that A is privileged given that B is privileged should be at least 1/α, unless the probability that B is privileged is under v. B also chooses a value β > 1 such that at any step where A is privileged, the conditional probability that B is privileged should be at least 1/β. Both parties maintain counters, λa and λb, initialized to 0. A’s commitment to C has the form “sigA (With probability 1, the contract C shall be valid)”. B’s commitment is symmetric. It is assumed that the protocol employs an unforgeable digital signature scheme.
All messages sent by A in the main protocol have the form “sigA (With probability p, the contract C shall be valid)”. Messages sent by B have the same form and are signed by B. If both A and B behave correctly, at the end of the main protocol A obtains B’s commitment to C, and vice versa. At the abstract level, the main flow of the BGMR protocol is as follows:

    A → B : sigA (With prob. pa1, the contract C shall be valid) = ma1
    A ← B : sigB (With prob. pb1, the contract C shall be valid) = mb1
        ...
    A → B : sigA (With prob. pai, the contract C shall be valid) = mai
        ...
    A → B : sigA (With prob. 1, the contract C shall be valid) = man
    A ← B : sigB (With prob. 1, the contract C shall be valid) = mbn
In its first message ma1, A sets pa1 = v. Consider the ith round of the protocol. After receiving message sigA (With prob. pai, the contract C shall be valid) from A, honest B checks whether pai ≥ λb. If not, B considers A to have stopped early, and contacts the judge for resolution of the contract as described in section 3.3. If the condition holds, B computes λb = min(1, pai · β), sets pbi = λb and sends message sigB (With prob. pbi, the contract C shall be valid) to A. The specification for A is similar. Upon receiving B’s message with probability pbi in it, A checks whether pbi ≥ λa. If not, A contacts the judge, otherwise he updates λa = max(v, min(1, pbi · α)), sets pai+1 = λa and sends message sigA (With prob. pai+1, the contract C shall be valid), initiating a new round of the protocol. The main protocol is optimistic. If followed by both participants, it terminates with both parties committed to the contract.

3.3 The Judge

Specification of the BGMR protocol assumes that the contract C defines a cutoff date D. When the judge is invoked, he does nothing until D has passed, then examines the message of the form sigX (With prob. p, the contract C shall be valid) supplied by the party that invoked him and checks the validity of the signature. If the judge has not resolved contract C before, he flips a coin, i.e., chooses a random value ρC from a uniform distribution over the interval [0, 1]. If the contract has been resolved already, the judge retrieves the previously computed value of ρC. In either case, the judge declares the contract binding if p ≥ ρC and cancelled if p < ρC, and sends his verdict to the participants. To make the protocol more efficient, ρC can be computed as fr(C), where r is the judge’s secret input, selected once and for all, and fr is the corresponding member of a family of pseudo-random functions [19]. This enables the judge to produce the same value of ρC each time contract C is submitted without the need to remember his past flips. The judge’s procedure can thus be implemented in constant memory. Observe that even though the judge produces the same value of ρC each time C is submitted, the judge’s verdict depends also on the value of p in the message submitted
by the invoking party and, therefore, may be different each time. If the judge is optimized using a pseudo-random function with a secret input to work in constant memory as described above, it is impossible to guarantee that he will produce the same verdict each time. To do so, the judge needs to remember the first verdict for each contract ever submitted to him, and, unlike ρC, the value of this verdict cannot be reconstructed from subsequent messages related to the same contract.

3.4 Timely BGMR

Asokan et al. [4] define timeliness as “one player cannot force the other to wait for any length of time—a fair and timely termination can be forced by contacting the third party.” The BGMR protocol as specified in sections 3.2 and 3.3 does not guarantee timeliness in this sense. To accommodate delays and communication failures on a public network such as the Internet, the duration of the negotiation phase D should be long. Otherwise, many exchanges between honest parties will not terminate in time and will require involvement of the judge, making the judge a communication bottleneck and providing no improvement over a protocol that simply channels all communication through a trusted central server.

If D is long, then an honest participant in the BGMR protocol can be left “hanging” for a long time. Suppose the other party in the protocol stops communicating. The honest participant may contact the judge, of course, but since the judge in the original BGMR protocol does not flip his coin until D has passed, this means that the honest party must wait the entire period allotted for negotiation before he learns whether the contract will be binding on him or not. This lack of timeliness is inconvenient and potentially problematic if the contract requires resource commitment from the participants or relies on time-sensitive data.

In this paper, we investigate a variant of BGMR that we call TBGMR (for “Timely BGMR”). The only difference between BGMR and TBGMR is that, in TBGMR, the judge makes his decision immediately when invoked by one of the protocol participants. Once the verdict is announced and reaches an honest participant, the latter stops communicating. The rest of the protocol is as specified in sections 3.2 and 3.3. The TBGMR protocol is timely. Any party can terminate it unilaterally and obtain a binding verdict at any point in the protocol without having to wait for a long time.
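To make the decision rule concrete, here is a minimal Python sketch of the judge’s procedure as we read it from section 3.3 (BGMR and TBGMR differ only in when this procedure is invoked). The use of HMAC-SHA256 as a stand-in for the pseudo-random function family fr, and all names below, are our illustrative assumptions rather than part of the protocol specification:

    import hmac, hashlib

    def rho(r: bytes, contract: bytes) -> float:
        # Derive the coin value rho_C from the contract and the judge's
        # secret input r; repeated invocations for the same contract then
        # yield the same value without storing past flips.
        digest = hmac.new(r, contract, hashlib.sha256).digest()
        return int.from_bytes(digest[:8], 'big') / 2**64   # value in [0, 1)

    def verdict(r: bytes, contract: bytes, p: float) -> str:
        # Ruling on a message "with probability p, the contract shall be valid".
        return 'binding' if p >= rho(r, contract) else 'cancelled'

Because the verdict depends on p as well as on ρC, a judge implemented this way can rule differently on successive invocations for the same contract, which is exactly the behaviour exploited in section 5.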
4 Model

We formalize both the BGMR and TBGMR protocols as Markov decision processes. We then use PRISM on a discretized model to determine if TBGMR is fair.

4.1 Overview of the Model

First, we model the normal behaviour of protocol participants and the judge according to the BGMR protocol specification, except that the judge makes his coin flip and responds with a verdict immediately. A dishonest participant might be willing to deviate from the protocol in an attempt to cheat the honest participant. We assume that B is
the dishonest participant and A the honest one. We equip B with an additional set of dishonest actions, any of which he can take nondeterministically at any point in the protocol. To obtain a finite probabilistic model, we fix the parameters chosen by the participants (see section 3.2), and discretize the probability space of the judge’s coin flips.

Modelling the dishonest participant. Conventional formal analysis of security protocols is mainly concerned with security against the so-called Dolev-Yao attacker, following [14]. A Dolev-Yao attacker is a nondeterministic process that has complete control over the communication network and can perform any combination of a given set of attacker operations, such as intercepting any message, splitting messages into parts, decrypting if he knows the correct decryption key, assembling fragments of messages into new messages and replaying them out of context, etc. We assume that the digital signature scheme employed by the protocol to authenticate messages between participants is secure. Therefore, it is impossible for the misbehaving participant to forge messages from the honest participant or modify their content. We will also assume that the channels between the participants and the judge are resilient: it is possible for the attacker to delay messages and schedule them out of order, but he must eventually deliver every message to its intended recipient. Dishonest actions available to a misbehaving participant are thus limited to i) invoking the judge even though the honest party has not stopped communicating in the main flow of the protocol, and ii) delaying and re-ordering messages between the judge and the honest party. The misbehaving participant can nondeterministically attempt any of these actions at any point in the protocol. When combined with the probabilistic behaviour of the judge, the nondeterminism of the dishonest participant gives rise to a Markov decision process.

Modelling fairness. To model fairness, we compute the maximum probability of reaching a state where the dishonest participant is privileged and the honest one is not. Note that this probability is not conditional on the judge’s coin not having been flipped yet, because the dishonest participant may choose to contact the judge even after the coin has been flipped, ρC computed and verdict rendered. In contrast, the proof of fairness in [6] calculates this probability under the assumption that the coin has not been flipped yet.

4.2 Analysis Technique

We now describe our method for modelling the protocol in PRISM’s input description language. Since PRISM is currently only applicable to finite configurations and the input language allows only integer-valued variables, we need to make two simplifications to model the protocol. First, we must discretize the judge’s coin by fixing some N ∈ N and supposing that when the judge flips the coin, it takes the value i/N for i = 1, . . . , N with probability 1/N. Second, we must fix the parameters of the parties, namely v, α and β. Once the parameters v, α and β are fixed, the possible messages that can be sent between the parties and the ordering of the messages are predetermined. More precisely,
as shown in fig. 1, the probability values included in the messages between the parties are known. Note that, for v, α, β > 0, these values converge to 1 and there are only finitely many distinct values. Therefore, with a simple script, we can calculate all the probability values included in the messages of the two parties. Now, to model the parties in PRISM, we encode the state of each party as the probability value included in the last message the party sent to the other party, i.e., the states of A and B are identified by the current values of λa and λb respectively. Formally, the states of each party range from 0 up to k, for some k such that pai, pbi = 1 for all i ≥ k, where A is in state 0 when A has sent no messages to B, and in state i ∈ {1, . . . , k} when the last message A sent to B included the probability value min(1, v · α^(i−1) · β^(i−1)). Similarly, B is in state 0 when B has not sent any messages to A, and in state i ∈ {1, . . . , k} when the last message B sent to A included the probability value min(1, v · α^(i−1) · β^i).
    A → B : pa1 = v
    A ← B : pb1 = v·β
    A → B : pa2 = v·β·α
    A ← B : pb2 = v·β^2·α
        ...
    A → B : pai = v·(β·α)^(i−1)
    A ← B : pbi = v·(β·α)^(i−1)·β

Fig. 1. Probability values included in the messages sent between the parties.

Following this construction process, in the case when N = 100, the specification of the BGMR protocol used as input into PRISM is given in fig. 2. Note that there is a nondeterministic choice as to when the judge’s coin is flipped, i.e., when a party first sends a message to the judge asking for a verdict. Furthermore, once the coin is flipped, the parties cannot send any messages to each other, that is, this model corresponds to the original protocol. In the timely variant of the protocol (TBGMR), if the misbehaving participant is capable of delaying messages from the judge, then the parties may continue sending messages to each other even after the coin has been flipped. To model this, the two lines corresponding to the parties sending messages are replaced with:

    // parties can continue sending messages after the coin has been flipped
    [] turn=0 → lambdaA' = min(k, lambdaA + 1) ∧ turn' = 1;
    [] turn=1 → lambdaB' = min(k, lambdaB + 1) ∧ turn' = 0;
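The “simple script” mentioned before fig. 1 is not reproduced in the paper; a Python sketch that computes the finitely many distinct probability values (and hence the bound k) might look as follows:

    def message_values(v, alpha, beta):
        # Returns the sequence pa1, pb1, pa2, pb2, ... of probability values,
        # capped at 1; since alpha*beta > 1, the sequence reaches 1 after
        # finitely many steps.
        values, p = [], v
        while True:
            values.append(min(1.0, p))       # A's message pai
            p = min(1.0, p * beta)
            values.append(p)                 # B's reply pbi
            if values[-2:] == [1.0, 1.0]:
                return values
            p = min(1.0, p * alpha)

For example, message_values(0.1, 1.1, 1.05) enumerates the values for the first parameter set of Table 1, and k can be read off as the round at which the values reach 1.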
Recall that if the coin has been flipped and the last messages sent by parties A and B include the probability values pa and pb respectively, then in the current state of the protocol B is privileged and A is not if and only if the value of the coin is in the interval (pb, pa]. Therefore, in the PRISM model we can specify the set of states where B is privileged and A is not. Let us call this set of states unfair. We then use PRISM to calculate the maximum probability, from the initial state (where no messages have been sent and the coin has not been flipped), of reaching a state in unfair. Table 1 summarizes the results obtained using PRISM for a number of different parameter choices, for both the BGMR and TBGMR versions of the protocol.
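In present-day PRISM property syntax this quantity would be written as a maximum reachability property, roughly as follows (our reconstruction; the paper does not list the exact property, and the syntax of the 2002 tool version may have differed):

    Pmax=? [ F "unfair" ]

where the label "unfair" marks the states in which c=1 and the coin value lies in the interval (pb, pa] determined by lambdaB and lambdaA.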
    module
      lambdaA : [0..k];  // probability value in the last message A sent to B
      lambdaB : [0..k];  // probability value in the last message B sent to A
      turn : [0..1];     // which party sends message next (0 – party A and 1 – party B)
      c : [0..1];        // status of the coin (0 – not flipped and 1 – flipped)
      rho : [0..100];    // value of the coin (rho=x and c=1 means its value is x/100)

      // parties alternately send messages and stop when the coin has been flipped
      [] turn=0 ∧ c=0 → lambdaA' = min(k, lambdaA + 1) ∧ turn' = 1;
      [] turn=1 ∧ c=0 → lambdaB' = min(k, lambdaB + 1) ∧ turn' = 0;

      // flip coin any time after a message has been sent
      [] c=0 ∧ (lambdaA>0 ∨ lambdaB>0) →   1/100 : (c'=1) ∧ (rho'=1)
                                         + 1/100 : (c'=1) ∧ (rho'=2)
                                           ...
                                         + 1/100 : (c'=1) ∧ (rho'=100);
    endmodule

Fig. 2. PRISM code for the BGMR protocol.

    v      α      β         N     number of    max. probability of reaching a state
                                     states    where only B is privileged
                                                    BGMR      TBGMR
    0.1    1.1    1.05        10         408      0.1000     0.8000
    0.1    1.1    1.05       100       3,738      0.1000     0.7000
    0.1    1.1    1.05     1,000      37,038      0.1000     0.7080
    0.1    1.1    1.01        10         518      0.1000     0.9000
    0.1    1.1    1.01       100       4,748      0.1000     0.9000
    0.1    1.1    1.01     1,000      47,048      0.1000     0.9180
    0.01   1.01   1.005       10       6,832      0.1000     0.6000
    0.01   1.01   1.005      100      62,722      0.0100     0.6600
    0.01   1.01   1.005    1,000     621,622      0.0100     0.6580
    0.01   1.01   1.001       10       9,296      0.1000     1.0000
    0.01   1.01   1.001      100      85,346      0.0100     0.9400
    0.01   1.01   1.001    1,000     845,846      0.0100     0.9070
    0.001  1.001  1.0005      10     101,410      0.1000     0.6000
    0.001  1.001  1.0005     100     931,120      0.0100     0.7200
    0.001  1.001  1.0005   1,000   9,228,220      0.0010     0.6840
    0.001  1.001  1.0001      10     138,260      0.1000     0.9000
    0.001  1.001  1.0001     100   1,269,470      0.0100     0.8700
    0.001  1.001  1.0001   1,000  12,581,570      0.0010     0.9010

Table 1. Model checking results obtained with PRISM.
5 Analysis Results

The probability of reaching a state in unfair is maximized, over all nondeterministic choices that can be made by a misbehaving B, by the following strategy. As soon as B
receives the first message from A, he invokes the judge, pretending that A has stopped communicating and presenting A’s message to the judge. In TBGMR, this will cause the judge to flip the coin and announce a verdict immediately. B delays the judge’s announcement message to A, and keeps exchanging messages with A in the main flow of the protocol. After each message from A, B invokes the judge and presents A’s latest message, until one of them results in a positive verdict. Depending on his choice of β, B can steer the protocol to a state unfair to A with arbitrarily high probability. Table 1 lists the maximum probabilities that can be achieved by this strategy in the TBGMR protocol.

Recall that the results obtained with PRISM are for a simplified model, where the coin is discretized. Nevertheless, due to the simplicity of this strategy, we were able to write a simple MATLAB script which calculated the probability of reaching a state which is unfair to A under this strategy for a general coin, that is, a coin whose flips take a value uniformly chosen from the [0, 1] interval. Fig. 3 (left chart) shows the probability, for various choices of parameters, of reaching a state in which B is privileged and A is not. Note that B can bring this probability arbitrarily close to 1 by choosing a small β. We assume that the value of β is constant and chosen by B beforehand. In practice, B can decrease his β adaptively. A may respond in kind, of course, but then the protocol will crawl to a halt even if B is not cheating.
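The MATLAB script itself is not reproduced in the paper. Under our reading of the strategy, B wins exactly when the coin value ρ falls into one of the intervals (pb(i−1), pai]: the first positive verdict then arrives while A’s best evidence is still below ρ. For a coin uniform on [0, 1] the probability is the total length of these intervals, which can be sketched in Python as:

    def win_probability(v, alpha, beta):
        # Sum the lengths of the intervals (pb_{i-1}, pa_i]; pb_0 = 0
        # since B has sent nothing before A's first message.
        prob, pa, pb_prev = 0.0, v, 0.0
        while True:
            prob += pa - pb_prev
            if pa >= 1.0:
                return prob
            pb_prev = min(1.0, pa * beta)    # B's honest-looking reply
            pa = min(1.0, pb_prev * alpha)   # A's next message

The computed value approaches 1 as β approaches 1, reproducing the shape of the left chart of fig. 3.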
Fig. 3. Probability (left) and expected time (right) of B’s win, depending on β. Left chart: probability of reaching a state where B is privileged and A is not. Right chart: expected time (×10^4) until a party is privileged. Curves shown for v=0.001, α=1.001; v=0.0005, α=1.0005; v=0.00025, α=1.00025; and v=0.0001, α=1.0001.
This attack on TBGMR is feasible because the judge remembers (or reconstructs, using a pseudo-random function with a secret input—see section 3.3) only the value of his coin flip, ρC, and not his previous verdicts. Therefore, B induces the coin flip as soon as possible, and then gets the judge to produce verdict after verdict until the probability value contained in A’s message is high enough to result in a positive verdict.

Accountability of the judge. Under the assumption that the channels are resilient (i.e., eventual delivery is guaranteed), at some point A will receive all of the judge’s verdicts that have been delayed by B. Since these verdicts are contradictory (some declare the contract binding, and some do not), A may be able to argue that somebody—either B,
or the judge, or both—misbehaved in the protocol. Note, however, that the protocol specification does not require the judge to produce consistent verdicts, only consistent values of ρC. To repair this, the protocol may be augmented with “rules of evidence” specifying what constitutes a violation (see also the discussion in section 6). Such rules are not mentioned in [6] and appear to be non-trivial. A discussion of what they might look like is beyond the scope of this paper.

5.1 Attacker’s Tradeoff

Although a Dolev-Yao attacker is assumed to be capable of delaying messages on public communication channels, in practice this might be difficult, especially if long delays are needed to achieve the attacker’s goal. Therefore, it is important to quantify the relationship between B’s probability of winning (bringing the protocol into a state that is unfair to A) and how long B needs to delay the judge’s messages to A. As a rough measure of time, we take the expected number of message exchanges between A and B. The greater the number of messages A has to send before B reaches a privileged state, the longer the delay.

We wrote a MATLAB script to calculate the expected number of messages sent before a party becomes privileged. The results are shown in fig. 3 (right chart) for various values of v and α as β varies. As expected, the lower β, the greater the number of messages that the parties have to exchange before B becomes privileged. Recall, however, that lower values of β result in a higher probability that B eventually wins (left chart of fig. 3).
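The second MATLAB script is likewise not shown; assuming a party becomes privileged with the first message whose probability value reaches the coin, the expectation can be sketched in Python as:

    def expected_messages(v, alpha, beta):
        # E[n], where n is the index of the first message in the sequence
        # v, v*beta, v*beta*alpha, ... whose value is >= rho ~ U[0,1].
        exp, q_prev, q, n = 0.0, 0.0, v, 0
        while q_prev < 1.0:
            n += 1
            exp += n * (q - q_prev)   # P(the n-th message is the first such one)
            q_prev = q
            q = min(1.0, q * (beta if n % 2 == 1 else alpha))
        return exp

As in the right chart of fig. 3, the result grows rapidly as β approaches 1.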
Fig. 4. Expected time to B’s win vs. probability of B’s win (x-axis: probability of reaching a state where B is privileged and A is not; y-axis: expected time, ×10^4, until a party is privileged; curves as in fig. 3).

The misbehaving participant thus faces a tradeoff. The higher the probability of winning, the longer the judge’s messages to the honest participant must be delayed. This
tradeoff is quantified by the chart in fig. 4. As the chart demonstrates, there is a linear tradeoff for a misbehaving B between the expected time to win and the probability of winning. If B is not confident in his ability to delay the judge’s messages to A and/or exchange messages with A fast enough (before the judge’s verdict reaches A), B can choose a large value of β and settle for a shorter delay and thus lower probability of steering the protocol into an unfair state.
6 Making BGMR Protocol Timely

As demonstrated in section 5, the BGMR protocol cannot be made timely by simply having the judge flip his coin immediately after he is invoked, because then fairness is lost. In this section, we discuss modifications to the protocol that may enable it to provide both timeliness and fairness.

Fast and secure communication channels. The attack described in section 5 can be prevented if the protocol requires the communication channel between the judge and the honest participant to be of very high quality. More precisely, both fairness and timeliness will be guaranteed if B is prevented from delaying the judge’s messages to A. Also, the judge’s channel to A must be faster than the channel between A and B. This ensures that A will be notified of the judge’s verdict immediately after the judge flips his coin and will stop communicating with B. While a high-quality channel from A to the judge might be feasible, it is significantly more difficult to ensure that the channel from the judge to A is faster than the channel between A and B. The judge is necessarily part of the infrastructure in which the protocol operates, servicing multiple instances of the protocol at the same time. Therefore, it is reasonable to presume that the judge is always available and responsive. On the other hand, A is simply one of many participants using the protocol, may not be easy to locate on a short notice, may not be expecting a message from the judge before the cutoff date D, etc. On a public network, it is usually much easier for a user to contact the authorities than for the authorities to locate and contact the user.

Judge with unbounded memory. The attack also succeeds because the judge does not remember his past verdicts, only the value of his coin flip ρC. If the judge is made to remember his first verdict, and simply repeat it in response to all subsequent invocations regarding the same contract, the attack will be prevented. Note, however, that such a judge will require unbounded memory, because, unlike ρC, the value of the first verdict cannot be reconstructed from subsequent evidence using a pseudo-random function. Therefore, the judge will need to store the verdict he made for every instance of the protocol he has ever been asked to resolve in case he is invoked again regarding the same instance. A possible optimization of this approach is to expunge all verdicts regarding a particular contract from the judge’s memory once the cutoff date for that contract has passed. This introduces additional complications, however, since participants are permitted to ask the judge for a verdict even after the cutoff date. If the verdict in this case is constructed from ρC and the newly presented evidence, it may be inconsistent
with the verdicts the judge previously announced regarding the same contract. To avoid this, the old verdicts must be stored forever and the judge’s memory must be infinite, regardless of the quality of the communication channels between the parties and the judge.

Signed messages to the judge. If all invocations of the judge are signed by the requesting party, and that signature is included in the judge’s verdict, then A will be able to prove B’s misbehaviour (that B invoked the judge more than once) when all of the judge’s verdicts finally reach him after having been delayed by B. If invoking the judge more than once is construed as a violation, however, the protocol might be problematic for an honest B in case there is a genuine delay on the channel between A and B. Such a delay may cause an honest B to time out, invoke the judge, and then, after A’s message with a new probability value finally arrives, invoke the judge again, which would be interpreted as an attempt to cheat.

Combining fairness with timeliness seems inherently difficult in the case of probabilistic contract signing. The approaches outlined above demand either extremely stringent requirements on the communication channels, which would limit applicability of the protocol, or introduction of the additional “rules of evidence” that define what is and what is not a violation. Such rules are not included in the protocol specification [6] and would have to be designed from scratch. Designing an appropriate set of rules appears difficult, as they should accommodate legitimate scenarios (a participant should be permitted to request a verdict from the judge more than once, because the judge’s messages might have been lost or corrupted in transmission), while preventing a malicious party from repeatedly contacting the judge until the desired verdict is obtained.
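Returning to the “judge with unbounded memory” fix above, a minimal Python sketch of such a judge (the names and the HMAC-based coin are our illustrative choices) shows where the unbounded storage enters:

    import hmac, hashlib

    class RememberingJudge:
        def __init__(self, r: bytes):
            self.r = r                   # secret input for the coin rho_C
            self.first_verdict = {}      # contract -> first verdict, kept forever

        def resolve(self, contract: bytes, p: float) -> str:
            if contract not in self.first_verdict:
                digest = hmac.new(self.r, contract, hashlib.sha256).digest()
                rho_c = int.from_bytes(digest[:8], 'big') / 2**64
                self.first_verdict[contract] = ('binding' if p >= rho_c
                                                else 'cancelled')
            return self.first_verdict[contract]

The dictionary can never be pruned: unlike ρC, the first verdict cannot be reconstructed from later evidence, which is precisely the objection raised above.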
7 Conclusions

We presented a case study demonstrating how formal analysis techniques can be used to verify properties of probabilistic security protocols. We analysed a timely variant of the probabilistic contract signing protocol of Ben-Or, Goldreich, Micali, and Rivest using PRISM, a probabilistic model checker, and showed that there exists an attack strategy for a misbehaving participant that brings the protocol into an unfair state with arbitrarily high probability. We also quantified the attacker’s tradeoff between the probability of winning and the need to delay messages from the judge to the honest participant. Designing a probabilistic contract signing protocol that is both timely and fair is a challenging task. Our case study, in addition to demonstrating the feasibility of probabilistic model checking as an analysis technique for security protocols, is a step in this direction.
References

1. A. Aldini and R. Gorrieri. Security analysis of a probabilistic non-repudiation protocol. In Proc. PAPM/PROBMIV’02, volume 2399 of LNCS, pages 17–36. Springer-Verlag, 2002.
2. R. Alur and T. Henzinger. Reactive modules. Formal Methods in System Design, 15:7–48, 1999.
3. N. Asokan, M. Schunter, and M. Waidner. Optimistic protocols for fair exchange. In Proc. 4th ACM Conference on Computer and Communications Security, pages 7–17, 1997.
4. N. Asokan, V. Shoup, and M. Waidner. Optimistic fair exchange of digital signatures. IEEE Selected Areas in Communications, 18(4):593–610, 2000.
5. C. Baier and M. Kwiatkowska. Model checking for a probabilistic branching time logic with fairness. Distributed Computing, 11(3):125–155, 1998.
6. M. Ben-Or, O. Goldreich, S. Micali, and R. Rivest. A fair protocol for signing contracts. IEEE Transactions on Information Theory, 36(1):40–46, 1990.
7. A. Bianco and L. de Alfaro. Model checking of probabilistic and nondeterministic systems. In Proc. Foundations of Software Technology and Theoretical Computer Science, volume 1026 of LNCS, pages 499–513. Springer-Verlag, 1995.
8. D. Boneh and M. Naor. Timed commitments. In Proc. CRYPTO’00, volume 1880 of LNCS, pages 236–254. Springer-Verlag, 2000.
9. L. Buttyán and J.-P. Hubaux. Toward a formal model of fair exchange — a game theoretic approach. Technical Report SSC/1999/39, Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland, 1999.
10. L. Buttyán, J.-P. Hubaux, and S. Čapkun. A formal analysis of Syverson’s rational exchange protocol. In Proc. 15th IEEE Computer Security Foundations Workshop, pages 193–205, 2002.
11. R. Chadha, M. Kanovich, and A. Scedrov. Inductive methods and contract-signing protocols. In Proc. 8th ACM Conference on Computer and Communications Security, pages 176–185, 2001.
12. I. Damgård. Practical and provably secure release of a secret and exchange of signatures. J. Cryptology, 8(4):201–222, 1995.
13. C. Derman. Finite-State Markovian Decision Processes. New York: Academic Press, 1970.
14. D. Dolev and A. Yao. On the security of public key protocols. IEEE Transactions on Information Theory, 29(2):198–208, 1983.
15. S. Even. A protocol for signing contracts. Technical Report 231, Computer Science Dept., Technion, Israel, 1982.
16. S. Even, O. Goldreich, and A. Lempel. A randomized protocol for signing contracts. Communications of the ACM, 28(6):637–647, 1985.
17. S. Even and Y. Yacobi. Relations among public key signature schemes. Technical Report 175, Computer Science Dept., Technion, Israel, 1980.
18. J. Garay, M. Jakobsson, and P. MacKenzie. Abuse-free optimistic contract signing. In Proc. CRYPTO’99, volume 1666 of LNCS, pages 449–466. Springer-Verlag, 1999.
19. O. Goldreich, S. Goldwasser, and S. Micali. How to construct random functions. J. ACM, 33(4):792–807, 1986.
20. J. Gray. Toward a mathematical foundation for information flow security. J. Computer Security, 1(3):255–294, 1992.
21. H. Hansson and B. Jonsson. A logic for reasoning about time and probability. Formal Aspects of Computing, 6(5):512–535, 1994.
22. S. Kremer and J.-F. Raskin. A game-based verification of non-repudiation and fair exchange protocols. In Proc. CONCUR’01, volume 2154 of LNCS, pages 551–565. Springer-Verlag, 2001.
23. S. Kremer and J.-F. Raskin. Game analysis of abuse-free contract signing. In Proc. 15th IEEE Computer Security Foundations Workshop, pages 206–220, 2002.
24. M. Kwiatkowska, G. Norman, and D. Parker. PRISM: Probabilistic symbolic model checker. In Proc. TOOLS’02, volume 2324 of LNCS, pages 200–204. Springer-Verlag, 2002.
25. P. Lincoln, J. Mitchell, M. Mitchell, and A. Scedrov. Probabilistic polynomial-time equivalence and security analysis. In Proc. World Congress on Formal Methods, volume 1708 of LNCS, pages 776–793. Springer-Verlag, 1999.
26. O. Markowitch and Y. Roggeman. Probabilistic non-repudiation without trusted third party. In Proc. 2nd Conference on Security in Communication Networks, 1999.
27. S. Micali. Certified e-mail with invisible post offices. Presented at RSA Security Conference, 1997.
28. PRISM web page. http://www.cs.bham.ac.uk/~dxp/prism/.
29. V. Shmatikov and J. Mitchell. Finite-state analysis of two contract signing protocols. Theoretical Computer Science, 283(2):419–450, 2002.
30. P. Syverson and J. Gray. The epistemic representation of information flow security in probabilistic systems. In Proc. 8th IEEE Computer Security Foundations Workshop, pages 152–166, 1995.
31. D. Volpano and G. Smith. Probabilistic non-interference in a concurrent language. In Proc. 11th IEEE Computer Security Foundations Workshop, pages 34–43, 1998.
Security Analysis of (Un-) Fair Non-repudiation Protocols

Sigrid Gürgens and Carsten Rudolph

Fraunhofer – Institute for Secure Telecooperation SIT, Rheinstrasse 75, D-64295 Darmstadt, Germany
{guergens,rudolphc}@sit.fraunhofer.de
Abstract. An approach to protocol analysis using asynchronous product automata (APA) and the simple homomorphism verification tool (SHVT) is demonstrated on several variants of the well-known Zhou-Gollmann fair non-repudiation protocol. Attacks on these protocols are presented that, to our knowledge, have not been published before. Finally, an improved version of the protocol is proposed.
1 Introduction
Non-repudiation is an essential security requirement for many protocols in electronic business and other binding tele-cooperations where disputes about transactions can occur. Especially the undeniable transfer of data can be crucial for commercial transactions. While non-repudiation can be provided by standard cryptographic mechanisms like digital signatures, fairness is more difficult to achieve. A variety of protocols has been proposed in the literature to solve the problem of fair message transfer with non-repudiation. One possible solution comprises protocols based on a trusted third party (TTP) with varying degrees of involvement. In the first published protocols, the messages are forwarded by the TTP. A more efficient solution was proposed by Zhou and Gollmann [17]. Here, the TTP acts as a lightweight notary. Instead of passing the complete message through the TTP and thus creating a possible bottleneck, only a short-term key is forwarded by the TTP and the encrypted message is directly transferred to the recipient. Based on this ingenious approach, several protocols and improvements have been proposed ([16,5]).

Cryptographic protocols are error-prone, and the need for formal verification of cryptographic protocols is widely accepted. However, non-repudiation protocols have been under development for only a comparatively short time, thus only very few of these protocols have been subject to a formal security analysis. Only a few existing protocol analysis methods have been extended and applied to non-repudiation protocols [1,13,19]. Recently, a new approach modelling non-repudiation protocols as games has been proposed by Kremer and Raskin [6].
(This paper describes results of work within the project CASENET, partly funded by the European Commission under IST-2001-32446.)
The development of formal methods for protocol analysis has mainly concentrated on authentication and key-establishment protocols. These methods cannot be directly applied to the security analysis of fair non-repudiation protocols. Obviously, formalisations for non-repudiation and fairness are required. Furthermore, the attack scenario for the analysis of fair non-repudiation protocols is different. Many models for protocol analysis consider an attacker to have control over the network, while protocol participants trust each other. In the case of the establishment of a new session key, the requirement of mutual trust results from the security requirement of confidentiality for the new session key. A malicious protocol agent could simply publish a newly established key and therefore no protocol would achieve secure key-establishment in this scenario. In contrast, fair non-repudiation is explicitly designed for scenarios where protocol participants may act maliciously, since no participant (except a trusted third party) is assumed to behave in accordance with the protocol specification. This has to be reflected in the model for protocol analysis as well as in the formalisation of security properties.

Remarkably, one protocol proposed by Zhou and Gollmann [17] that has been analysed using three different methods [1,13,19] does not provide fair non-repudiation under reasonable assumptions. We show possible attacks on this protocol and on two of its various versions in Sections 5.2, 6.1 and 6.2. Furthermore, in Section 6.3 we present a straightforward improvement for all three versions of the protocol.

The next section defines fair non-repudiation. Section 3 explains the protocol proposed by Zhou and Gollmann and Section 4 explains our approach to automated protocol analysis. The remaining sections describe possible attacks on three protocol variants and suggest an improvement.
2 Requirements for Fair Non-repudiation
This paper concentrates on message transfer protocols with certain security properties: Agent A sends message m to agent B, such that A can prove that B has received m (non-repudiation of receipt) and B can prove that A has sent m (non-repudiation of origin). Furthermore, the protocol should not give the originator A an advantage over the recipient B, or vice versa (fairness).

Non-repudiation of origin and non-repudiation of receipt require evidence of receipt (EOR) and evidence of origin (EOO). All agents participating in the protocol have to agree that these constitute valid proofs that the particular receive or send event has happened. In case of a dispute an arbitrator or judge has to verify EOO and EOR. Therefore, for every non-repudiation protocol one has to specify what exactly constitutes valid EOO and EOR. This can be done by specifying the verification algorithm the judge has to execute in order to verify the evidence for dispute resolution.

Even in fair non-repudiation protocols there are intermediate states where one agent seems to have an advantage, for example, if a TTP has transmitted evidence first to one agent and the other agent is still waiting for the next step
of the TTP. We say a protocol is fair if at the end of the protocol execution no agent has an advantage over the other agent. This means that if there is an unfair intermediate situation for one agent, this agent must be able to reach a fair situation without the help of other untrusted agents. For any agent P we say a protocol execution is finished for P if either P has executed all protocol steps or any remaining protocol step depends on the execution of protocol steps by other untrusted agents.

In this paper we consider a refined version of the definition of fair non-repudiation by Zhou and Gollmann [17]. We specify the security goals relative to the role in the protocol. The security goal of the originator of a message has to be satisfied in all scenarios where the originator acts in accordance with the protocol specification while the recipient may act maliciously. In contrast, the security goal of the recipient has to be satisfied in scenarios where the originator can act maliciously.

Definition 1. A message transfer protocol for originator A and recipient B provides fair non-repudiation if the following security goals are satisfied for all possible messages m.

Security goal for A (fair non-repudiation of receipt): At any possible end of the protocol execution in A’s point of view, either A owns a valid EOR by B for message m, or B has not received m and B has no valid EOO by A for m.

Security goal for B (fair non-repudiation of origin): At any possible end of the protocol execution in B’s point of view, either B has received m and owns a valid EOO by A for m, or A has no valid EOR by B for m.
3 The Basic Version of the Zhou-Gollmann Protocol
In this paper we discuss three versions of a non-repudiation protocol introduced by Zhou and Gollmann in [17,18,16]. The purpose of all protocols is to transmit a message from agent A to agent B and to provide evidence for B that the message originated with A while conversely providing evidence for A that B received the message. Thus the protocols shall provide fair non-repudiation as defined above. An online trusted third party TTP is involved in all three protocols. Variants of the protocols with offline TTP are not discussed in this paper, although these protocols show similar weaknesses.

The main idea of all protocols is to split the transmission of the message M into two parts. The first part consists of A sending a commitment C = eK(M) (message M encrypted with key K) and B acknowledging its receipt. Then, A submits the key K and signature sub K to an on-line trusted third party TTP which makes a signature con K available that serves both as the second part of the evidence of origin for B and as the second part of evidence of receipt for A. Consequently, evidence of origin EOO and evidence of receipt EOR consist of two parts:

– EOO is composed of EOO C (A’s signature on the commitment C) and con K (the confirmation of key K by the trusted third party).
– EOR is composed of EOR C (B’s signature on the commitment C) and con K (the confirmation of key K by the trusted third party).

We adopt the notation from Zhou and Gollmann:

– m, n: concatenation of two messages m and n.
– H(m): a one-way hash function applied to message m.
– eK(m) and dK(m): encryption and decryption of message m with key K.
– C = eK(m): commitment (ciphertext) for message m.
– L: a unique label to link all protocol messages.
– fEOO, fEOR, fSUB, fCON: message flags to indicate the purpose of the respective message.
– sSA(m): principal A’s digital signature on message m with A’s private signature key SA. Note that the plaintext is not recoverable from the signature, i.e. for signature verification the plaintext needs to be made available.
– EOO C = sSA(fEOO, B, L, C), EOR C = sSB(fEOR, A, L, C)
– sub K = sSA(fSUB, B, L, K), con K = sSTTP(fCON, A, B, L, K)
– A → B : m: agent A sends message m with agent B being the intended recipient.
– A ↔ B : m: agent A fetches message m from agent B using the “ftp get” operation (or by some analogous means).

We first concentrate on the basic version of the protocols, which is as follows:

1. A → B   : fEOO, B, L, C, EOO C
2. B → A   : fEOR, A, L, C, EOR C
3. A → TTP : fSUB, B, L, K, sub K
4. A ↔ TTP : fCON, A, B, L, K, con K
5. B ↔ TTP : fCON, A, B, L, K, con K
A protocol without a TTP puts the agent which is the first to provide all information in a disadvantageous position, since the second agent can just refrain from sending the acknowledgement message. This is avoided by involving TTP: once A has made the key available to TTP, A will always be able to retrieve the remaining evidence con K. The authors use the following assumptions:

– All agents are equipped with their own private signature key and the relevant public verification keys.
– B cannot block the message identified by fSUB permanently, thus A will eventually be able to obtain the evidence of receipt.
– The ftp communication channel is eventually available, thus also B will eventually be able to obtain K and therefore m and con K.
– TTP checks that A does not send two different keys K and K′ with the same label L and the same agents’ names. This is necessary because L serves as identifier for con K, i.e. TTP would overwrite con K with con K′, which causes a problem if either A or B have not yet retrieved con K.
– TTP stores message keys at least until A and B have received con K.
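The label check performed by TTP (the fourth assumption) can be pictured with a small Python sketch; this is our illustrative reading, with the signing of con K elided:

    class TrustedThirdParty:
        def __init__(self):
            self.submitted = {}                    # (A, B, L) -> K

        def submit(self, A, B, L, K):
            # Refuse a second, different key under the same label and agents'
            # names, instead of overwriting the stored con_K entry.
            stored = self.submitted.setdefault((A, B, L), K)
            if stored != K:
                raise ValueError('different key for an already used label')
            return ('f_CON', A, B, L, K)           # signed by TTP as con_K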
Additionally, A is required to choose a new label and a new key for each protocol run, but apart from the above check by TTP no means are provided to guarantee this.

Dispute resolution. A dispute can occur if B claims to have received m from A while A denies having sent m, or if A claims having sent m to B while B denies having received m. To resolve such a dispute, the evidence of origin and receipt, respectively, has to be sent to a judge who then checks

– that con K is TTP’s signature on (fCON, A, B, L, K), which means that TTP has indeed made the respective entry because of A’s message fSUB,
– that EOO C is A’s signature on (fEOO, B, L, C) (that EOR C is B’s signature on (fEOR, A, L, C), respectively), and
– that m = dK(C).

The authors conclude that the above protocol provides non-repudiation of origin and receipt and fairness for both agents A and B. However, in section 5.2 we will show that the protocol is unfair for B, since it allows A to retrieve evidence of receipt for a message m while B is neither able to retrieve m nor the respective evidence of origin. The scenario in which the attack can occur satisfies all assumptions stated above.
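The judge’s checks for a dispute about origin can be summed up in a short Python sketch; verify and decrypt stand for the signature verification and decryption operations, and the tuple encoding of the signed plaintexts is our own:

    from typing import Callable

    def valid_eoo(verify: Callable, decrypt: Callable,
                  A, B, L, C, K, m, eoo_c, con_k) -> bool:
        # con_K must be TTP's signature on (f_CON, A, B, L, K)
        ok_con = verify('TTP', ('f_CON', A, B, L, K), con_k)
        # EOO_C must be A's signature on (f_EOO, B, L, C)
        ok_eoo = verify(A, ('f_EOO', B, L, C), eoo_c)
        # the commitment must decrypt to the disputed message
        return ok_con and ok_eoo and decrypt(K, C) == m

The check for evidence of receipt is the same with EOR C in place of EOO C.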
4 Automated Security Analysis Using APA and SHVT
In this section we introduce our approach for security analysis of cryptographic protocols. We model a system of protocol agents using asynchronous product automata (APA). APA are a universal and very flexible operational description concept for cooperating systems [9] which “naturally” emerges from formal language theory [8]. APA are supported by the SH-verification tool (SHVT) that provides components for the complete cycle from formal specification to exhaustive analysis and verification [9].

An APA can be seen as a family of elementary automata. The set of all possible states of the whole APA is structured as a product set; each state is divided into state components. In the following, the set of all possible states is called the state set. The state sets of elementary automata consist of components of the state set of the APA. Different elementary automata are “glued” by shared components of their state sets. Elementary automata can “communicate” by changing the content of shared state components. Protocols can be regarded as cooperating systems, thus APA are an adequate means for protocol formalisation.

Figure 1 shows the structure of an asynchronous product automaton modeling a system of three protocol agents A, B and TTP. The boxes are elementary automata and the circles represent their state components. Each agent P taking part in the protocol is modeled by one elementary automaton P that performs the agent’s actions, accompanied by four state components SymkeysP, AsymkeysP, StateP, and GoalsP to store the symmetric and
asymmetric keys of P, P’s local state and the security goals P should reach within the protocol, respectively.

Fig. 1. Structure of the APA model for agents A, B and TTP (boxes: the elementary automata A, B and TTP; circles: their state components StateP, SymkeysP, AsymkeysP and GoalsP, and the shared component Network).

The only state component shared between all agents (all elementary automata) is the component Network, which is used for communication. A message is sent by adding it to the content of Network and received by removing it from Network. The neighbourhood relation N (graphically represented by an arc) indicates which state components are included in the state of an elementary automaton and may be changed by a state transition of this automaton. For example, automaton A may change StateA and Network but cannot read or change the state of StateB. The figure shows the structure of the APA. The full specification of the APA includes the state sets (the data types), the transition relations of the elementary automata and the initial state, which we will explain in the following paragraphs.

State sets, messages and cryptography. In the present paper we restrict our model to very basic data types, and the model of cryptography to those algorithms needed to specify the Zhou-Gollmann protocols. For the definition of the domains of the state components as well as for the definition of the set of messages, we need the following basic sets:

    IN          set of natural numbers
    Agents      set of agents’ names
    Nonce       set of nonces (numbers that have never been used before)
    Constants   set of constants to determine the agents’ states and thus to
                define state transition relations
    Keynames    set of constants to name keys, Agents ⊆ Keynames
    Symflags    {sym, . . .}
    Asymflags   {pub, priv, . . .}
    Keys        {(w, f, n) | w ∈ Keynames, f ∈ Symflags ∪ Asymflags, n ∈ IN}
    Predicates  set of predicates on global states
It is helpful to include the agents’ names in the set Keynames in order to be able to formalise, for example, the public key of agent P by (P, pub, n) (n ∈ IN). The second component of a key indicates its type. In order to distinguish between
different types of keys, more flags (like pubcipher, privcipher, etc.) can be added to the respective set. The third key component makes it possible to use more than one key with the same name and type. The key K, for example, in the first run of the Zhou-Gollmann protocol can be formalised by (K, sym, 1), the key of the next run by (K, sym, 2), and so on.

The union of the sets Agents, Nonce, Constants, Keys, and IN represents the set of atomic messages, based on which we define a set M of messages in the following way:

1. Every atomic message is an element of M.
2. If m1, . . . , mr ∈ M, then (m1, . . . , mr) ∈ M.
3. If k, m ∈ M, then encrypt(k, m) ∈ M, decrypt(k, m) ∈ M, sign(k, m) ∈ M and hash(m) ∈ M.

We define the standard functions elem(k, . . .) and length on tuples (m1, . . . , mr), which return the kth component (or, if k ≥ r, the rth component) and the number of components of a tuple, respectively. We model properties of cryptographic algorithms by defining properties of the symbolic functions listed above and by defining predicates. For the analysis of the Zhou-Gollmann protocol we need in particular

1. decrypt(k, encrypt(k, m)) = m
2. inverse((w, sym, n)) = (w, sym, n), inverse((P, pub, n)) = (P, priv, n)
3. verify((P, pub, n), m, sign((P, priv, n), m)) = true

(where k, m ∈ M, P, w ∈ Keynames, n ∈ IN). Our general model provides additional symbolic functions for the specification of other cryptographic protocols. The above properties define for each m ∈ M a unique shortest normal form. The set Messages is the set of all these normal forms of elements m ∈ M.

Now elements of Messages constitute the content of State components, while Network contains tuples of Agents × Agents × Messages, where the first component names the sender of the message and the second component the intended message recipient. In the analysis of a protocol which includes all necessary information (such as to whom to respond) in the messages itself, the first component of the tuple in Network is not evaluated. However, some protocols like the Zhou-Gollmann protocol assume that this information can be retrieved from some lower level transport layer, thus the information has to be provided in addition to the content of the message.

An asymmetric key key is stored in AsymkeysP using a tuple (Q, f, key), where Q is the name of the agent that P’s automaton will use to find the key and f = pub or f = priv is the flag specifying the type of the key. For the formal definition of the data structure of the state components, see [4].

The symbolic functions encrypt, decrypt, sign and hash together with the above listed properties model the cryptographic algorithms used in the various versions of the Zhou-Gollmann protocol. For this paper, we assume “perfect encryption”, i.e. we assume that keys cannot be guessed, that for generating encrypt(k, m) or sign(k, m), both k and m need to be known, that
encrypt(k, m) = encrypt(k′, m′) and sign(k, m) = sign(k′, m′) imply k = k′ and m = m′, and that hash(m) = hash(m′) implies m = m′.
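The computation of normal forms can be pictured as a one-rule term rewriting step; in the following Python sketch (our own encoding of terms as nested tuples) the property decrypt(k, encrypt(k, m)) = m is applied bottom-up:

    def normalize(t):
        # Terms are atoms or tuples such as ('encrypt', k, m); the only
        # rewrite rule applied here is decrypt(k, encrypt(k, m)) = m.
        if not isinstance(t, tuple):
            return t
        t = tuple(normalize(x) for x in t)
        if (len(t) == 3 and t[0] == 'decrypt' and isinstance(t[2], tuple)
                and len(t[2]) == 3 and t[2][0] == 'encrypt'
                and t[2][1] == t[1]):
            return t[2][2]
        return t

For example, normalize(('decrypt', 'K', ('encrypt', 'K', 'M'))) returns 'M'.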
State transition relation. To specify the agents’ actions we use so-called state transition patterns describing state transitions of the corresponding elementary automaton. Step 2 of the original Zhou-Gollmann protocol, where B receives the message fEOO and sends fEOR, can be specified as shown in Table 1.
    step 2                              Name of the transition pattern
    (A, M, message, plain, L, C,        Local variables used in the pattern
     EOO C, PKA, SKB)

    M ∈ Network                         Variable M is assigned a message in Network.
    A := elem(1, M)                     The variable A is assigned the assumed
                                        sender of the message.
    B = elem(2, M)                      B checks that he is the intended recipient.
    (respond, A) ∈ StateB               B can respond to A.
    (A, pub, PKA) ∈ AsymkeysB           B owns A’s public key, the variable PKA is
                                        assigned the respective value.
    (B, priv, SKB) ∈ AsymkeysB          B owns his own private key, the variable
                                        SKB is assigned the respective value.
    message := elem(3, M)               The variable message is assigned the data
                                        part of M.
    elem(1, message) = fEOO             B checks that the message contains the
                                        fEOO flag.
    elem(2, message) = B                B checks that he is named in the message.
    L := elem(3, message)               The variable L is assigned the third
                                        message element.
    C := elem(4, message)               The variable C is assigned the fourth
                                        message element.
    plain := (fEOO, B, L, C)            The variable plain is assigned what B
                                        assumes to be the plaintext of EOO C.
    EOO C := elem(5, message)           The variable EOO C is assigned the signature.
    verify(PKA, plain, EOO C) = true    B verifies EOO C.

    B →                                 The state transition is performed by B.

    M ← Network                         The message tuple is removed from Network.
    (fEOO, L, C, expects CON,           B stores all relevant data.
     (C, EOO C)) → StateB
    (B, A, (fEOR, A, L, C,              B sends message 2.
     sign(SKB, (fEOR, A, L, C))))
     → Network

Table 1. Detailed specification of step 2 of the Zhou-Gollmann non-repudiation protocol.
The lines above →B indicate the necessary conditions for automaton B to perform the state transition; the lines below it specify the changes of the state. → and ← denote that some data is added to and removed from a state component, respectively. B does not perform any other changes within this state transition. The syntax and semantics of state transition patterns for APA, as well as the formal definitions of the state sets, are explained in more detail in [3].

A complete specification in the APA framework additionally contains security relevant information. Most important are the security goals the protocol shall reach. The following paragraph explains the formalisation of the security goals defined in Section 2. For more details on a complete protocol specification, we refer the reader to [4].

Security goals. In our model, the state components Goals are used to specify security goals. Whenever an agent P performs a state transition after which a specific security goal shall hold from the agent’s view, a predicate representing the goal is added to the state of GoalsP. Note that the content of GoalsP has no influence on state transitions. A protocol is secure (within the scope of our model) if a predicate is true whenever it is an element of a Goals component. In the Zhou-Gollmann protocol, the security goals defined in Definition 1 in Section 2 can now be concretised. As the first goal, for example, states that at the end of a protocol execution by A, either A owns EOR or B does not own EOO, any state in which B owns EOO must allow A to continue the protocol execution and receive EOR without the help of an untrusted agent. This gives rise to the following definitions:

– For originator A the predicate NRR(B) is true if for any message m the following holds: If EOO C for m signed by A and also a matching con K are elements of StateB, then EOR C for m signed by B is an element of StateA and either a matching con K is in StateA, or con K is made available by TTP and not yet retrieved by A.
– For a recipient B the predicate NRO(A) is true if for any message m the following holds: If EOR C for m signed by B and a matching con K are elements of StateA, then EOO C for m signed by A is an element of StateB and either a matching con K is in StateB, or con K is made available by TTP and not yet retrieved by B.

The predicates NRR(B) and NRO(A) have to be satisfied in all possible states in all protocol runs. They can therefore be included in the initial state of GoalsA and GoalsB, respectively.

Initial state. The initial state for the Zhou-Gollmann protocol can now be specified as shown in Table 2.
StateA       := {(B, agent), (TTP, server), (start, B), (m1, m2, message)}
AsymkeysA    := {(A, priv, (A, priv, 1)), (B, pub, (B, pub, 1)), (TTP, pub, (TTP, pub, 1))}
GoalsA       := {NRR(B)}
StateB       := {(A, agent), (TTP, server), (respond, A)}
AsymkeysB    := {(B, priv, (B, priv, 1)), (A, pub, (A, pub, 1)), (TTP, pub, (TTP, pub, 1))}
GoalsB       := {NRO(A)}
StateTTP     := {(A, agent), (B, agent)}
AsymkeysTTP  := {(TTP, priv, (TTP, priv, 1)), (A, pub, (A, pub, 1)), (B, pub, (B, pub, 1))}

Table 2. The initial state
The tuple (m1, m2, message) represents a reservoir of messages that A can send; the agents’ Symkey components as well as Network are empty in the initial state.
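For illustration, Table 2 could be transcribed into executable form roughly as follows; this is a sketch in our own notation (Python literals with the key encoding used earlier), not the SHVT input syntax.

# Illustrative transcription of Table 2 (the encoding is ours).
state = {
    'A':   {('B', 'agent'), ('TTP', 'server'), ('start', 'B'),
            ('m1', 'm2', 'message')},               # reservoir of messages
    'B':   {('A', 'agent'), ('TTP', 'server'), ('respond', 'A')},
    'TTP': {('A', 'agent'), ('B', 'agent')},
}
asymkeys = {
    'A':   {('A', 'priv', ('A', 'priv', 1)), ('B', 'pub', ('B', 'pub', 1)),
            ('TTP', 'pub', ('TTP', 'pub', 1))},
    'B':   {('B', 'priv', ('B', 'priv', 1)), ('A', 'pub', ('A', 'pub', 1)),
            ('TTP', 'pub', ('TTP', 'pub', 1))},
    'TTP': {('TTP', 'priv', ('TTP', 'priv', 1)), ('A', 'pub', ('A', 'pub', 1)),
            ('B', 'pub', ('B', 'pub', 1))},
}
goals = {'A': {'NRR(B)'}, 'B': {'NRO(A)'}, 'TTP': set()}
symkeys = {p: set() for p in state}   # Symkey components start empty
network = set()                        # Network starts empty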
5 Protocol Analysis

5.1 APA Framework for Protocol Analysis
The APA specification of a protocol maintains an abstraction level where details like random number generation, the number of runs that shall be performed, the actions of agents, etc., are specified abstractly. This has to be transferred to an abstraction level where the SHVT can actually perform a state search. The APA framework provides various means to specify concrete analysis scenarios. This includes specifying the number and nature of runs that shall be validated (only finitely many runs can be checked), the concrete agents taking part in these runs and their roles (we may, for example, want to analyse a scenario with Alice acting in role A and Bob acting in role B), which of the agents may act dishonestly, and other details. For more details on the analysis APA, see [2].

Including dishonest behaviour. In order to perform a security analysis, our model includes the explicit specification of dishonest behaviour. For each type of dishonest behaviour, the APA includes one elementary automaton with the respective state components and state transition relations for specifying the concrete actions. (For the protocols discussed in this paper, we consider only one type of dishonest behaviour, namely the actions of a dishonest agent acting in role A. However, other protocols may require us to differentiate between actions of different dishonest agents.)
The elementary automaton of a dishonest agent can remove all tuples from Network, independently of whether it is named as the intended recipient. It can extract the first and second components of the tuples and add them as possible senders and recipients of messages to be sent by itself. (Note that it can use the name of any agent it knows as the sender of its own messages.) Furthermore, it can extract new knowledge from the messages and add this knowledge to the respective state components. A dishonest agent’s knowledge can be defined recursively in the following way (a sketch of this closure computation is given below):

1. A dishonest agent knows all data being part of his initial state (for example the names of other protocol agents, the keys he owns, etc.) and can generate random numbers and new keys.
2. He knows all messages and parts of messages, and the names of the sender and the intended recipient of messages he receives or maliciously removes from Network.
3. He can generate new messages by applying concatenation or the symbolic functions encrypt, sign and hash to messages he knows. (Note that he can use any message in his knowledge base as a key.)
4. He knows all plaintexts he can generate by deciphering a ciphertext he knows, provided he knows the necessary key as well.

With new messages generated according to the above rules, in every state of the system the dishonest agent’s elementary automaton can add new tuples (sender, recipient, message) to Network, sender and recipient being one of the agents’ names it has stored in its State component.
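The recursion above is a standard closure computation. The following sketch is our own (reusing the tuple term encoding from the earlier sketch; rule 1's generation of fresh nonces and keys is omitted): it first saturates the knowledge base under decomposition and decryption (rules 2 and 4), and then checks top-down whether a goal message can be composed (rule 3).

# Sketch of the dishonest agent's knowledge closure (rules 1-4 above);
# OPS and inverse repeat the earlier term encoding for self-containment.
OPS = ('encrypt', 'decrypt', 'sign', 'hash')

def inverse(key):
    name, flag, n = key
    return key if flag == 'sym' else (name, 'priv' if flag == 'pub' else 'pub', n)

def is_key(t):
    return isinstance(t, tuple) and len(t) == 3 and t[1] in ('sym', 'pub', 'priv')

def analyze(initial):
    """Rules 2 and 4: close the knowledge base under taking tuples apart
    and under decryption with keys that are (or become) known."""
    known = set(initial)
    changed = True
    while changed:
        changed = False
        for t in list(known):
            parts = []
            if isinstance(t, tuple) and not is_key(t) and t[0] not in OPS:
                parts = list(t)                       # components of a tuple
            elif isinstance(t, tuple) and t[0] == 'encrypt' \
                    and is_key(t[1]) and inverse(t[1]) in known:
                parts = [t[2]]                        # decrypt with a known key
            for p in parts:
                if p not in known:
                    known.add(p)
                    changed = True
    return known

def derivable(goal, known):
    """Rule 3: build goal by concatenation, encrypt, sign or hash."""
    if goal in known:
        return True
    if not isinstance(goal, tuple) or is_key(goal):
        return False                                  # unknown atoms and keys
    if goal[0] in ('encrypt', 'sign'):
        return derivable(goal[1], known) and derivable(goal[2], known)
    if goal[0] == 'hash':
        return derivable(goal[1], known)
    return all(derivable(c, known) for c in goal)     # concatenation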
5.2 An Attack on the Zhou-Gollmann Protocol
We now introduce a concrete scenario for the Zhou-Gollmann protocol. Using the SH verification tool we want to automatically analyse a scenario where Alice (in role A) starts a protocol run with Bob (in role B), and Server acts in role TTP. Furthermore, we assume that Alice may act dishonestly. The following details of the analysis scenario are of particular interest:

1. In order to model the assumptions made by Zhou and Gollmann that B cannot block the fSUB message permanently and that A and B are eventually able to retrieve the fCON message, we restrict dishonest behaviour in that we do not allow Alice to remove messages from Network whose intended recipient is anyone other than herself.
2. Zhou and Gollmann do not require the TTP to provide messages fCON forever (see [17]). A reasonable assumption is that these messages are available at least until A and B each have retrieved their message, and that the TTP may then delete them.¹ We simply model this by letting the server add these messages to Network. Since messages can only be removed from Network by the agent named as the intended recipient (assumption 1), these messages stay in Network until Alice and Bob remove them (which models the server’s delete action).
3. The Server’s check that the same label L may not be used together with two different keys can easily be modelled by storing a tuple (A, B, L, K) in the Server’s State component for each protocol run. Thus, fSUB messages that contain an L and K already being part of one stored tuple can be rejected.
4. We assume that all signatures are checked by the recipients. Thus messages with incorrect signatures will be rejected. In consequence, we restrict Alice’s behaviour further by allowing her only to send messages with correct signatures. Other than that, Alice may send anything she knows as L, K and eK(m).

¹ We assume that the TTP “magically” knows who has received con K. In practical implementations, either the TTP needs some confirmation of receipt by A and B before deleting the stored key, or a time limit may have to be set.

We then want to analyse whether there is a state in which the security goal NRO(A) in GoalsB is not satisfied. Starting with the initial state, the SHVT computes all reachable states until it finds a state in which Alice owns EOR C and con K for a particular message, while Bob is not able to get con K for this message. The SHVT outputs the state, indicating a successful attack. One can then let the SHVT compute a path from the initial state to this attack state, showing how the attack works.

In the first step, Alice generates a new label L and key K, stores these data in her State component for later use, and starts a first protocol run with message m1. For the rest of the protocol run, Alice acts honestly and the protocol run proceeds according to the protocol description. At the end of the protocol run, Alice owns in her state component StateAlice both Bob’s EOR C and the Server’s con K for m1, L and K:

StateAlice = {. . . , (m1, m2, message), (. . . , K, L, . . .),
              (. . . , sign((Bob, priv, 1), (fEOR, Alice, L, encrypt(K, m1))),
                      sign((Server, priv, 1), (fCON, Alice, Bob, L, K)))}

Alice can now start the next protocol run. Among the possibilities to do so is one state transition in which she chooses m2 as the next message to send, but does not generate a new label and a new key. Instead, she reuses the L and K she has stored, in order to mount an attack. Thus the state component Network contains the following data:

Network = {(Alice, Bob, (fEOO, Bob, L, encrypt(K, m2),
            sign((Alice, priv, 1), (fEOO, Bob, L, encrypt(K, m2)))))}

Bob may still own an EOO tuple for L, K and m1 in StateBob and may therefore be able to decrypt the ciphertext encrypt(K, m2). However, the protocol specification does not require him to check this, and Bob has no reason to try any old keys on the new message. The assumption that Bob stores all proofs he ever received is quite unrealistic in any case, so Bob may have deleted the particular key and EOO. Consequently, Bob answers with the fEOR message including EOR C for m2, which results in
Network = {(Bob, Alice, (fEOR, Alice, L, encrypt(K, m2),
            sign((Bob, priv, 1), (fEOR, Alice, L, encrypt(K, m2)))))}

Now, since Alice still owns con K for L and K, she owns a valid proof of Bob having received message m2 that will be accepted by any judge, and stops the protocol run. Bob, on the other hand, owns neither the key K nor the Server’s signature (or does not know that he owns these data). Thus, security goal NRO(A) is not satisfied, and Bob in fact will never be able to retrieve m2. This shows that while much care has been taken to assure fairness of the protocol for A, the protocol does not provide fairness for B.

Although this protocol and in particular its fairness have been analysed with two different analysis methods [13,1], this weakness was not discovered. Bella and Paulson have used the inductive analysis approach based on the theorem prover Isabelle [10], which had been developed for the analysis of authentication and key establishment protocols. They cannot find the above attack, because in their model the only malicious action of the protocol agents consists of abandoning protocol sessions before termination. This is not a realistic attack scenario for non-repudiation protocols. Schneider’s analysis uses CSP. This approach has also been used to analyse authentication and key establishment protocols [12,11]. The scenario in which the security proofs are carried out is more realistic than the scenario of Bella and Paulson. The behaviour of the originator A and recipient B is not restricted; they can execute all possible malicious actions. Even so, the attack does not exist in this scenario because of the rather unrealistic assumption that evidence con K and the keys remain available for download at the TTP forever. The protocol analysis performed by Zhou and Gollmann themselves [19] uses a modified version of the authentication logic SVO [14]. This logic (like all other authentication logics) cannot express the property of fairness, as stated by the authors of [19]; consequently, their analysis does not find the protocol weakness.
6 Protocol Variants

6.1 Unique Labels
Obviously, a critical point of the protocol is how the label L is chosen. In a different version of the protocol, Zhou and Gollmann suggest using L = H(m, K) (see [17,16]). This guarantees that whenever a new message is sent, the message will be accompanied by a new label, even if the same key K is used. The actions performed by the agents are essentially the same, except that labels are generated differently and the label check performed by the TTP is now obsolete. In a dispute, the judge will additionally check that L = H(dK(C), K). Unfortunately, the change of label generation does not avoid the unfairness of the protocol. We again model a scenario with dishonest Alice acting in role A and Bob acting in role B. Since the hash values are checked by a judge, we model Alice by requiring her to use the hash function for label generation, but we allow her to use anything she knows as parameters. Thus, Alice initiates the
protocol with a message including H(m2, K) and eK(m1) and the respective signature:

Network = {(Alice, Bob, (fEOO, Bob, H(m2, K), encrypt(K, m1),
            sign((Alice, priv, 1), (fEOO, Bob, H(m2, K), encrypt(K, m1)))))}

Bob removes the message tuple from Network and proceeds according to the description in Section 3. The protocol run ends with Alice and Bob each owning sSServer(fCON, Alice, Bob, H(m2, K), K). Additionally, Alice owns sSBob(fEOR, Alice, H(m2, K), eK(m1)), and Bob owns Alice’s respective signature. However, these present neither a valid proof for Bob that Alice has sent m1 nor a proof for Alice that Bob has received m1, as a judge would find that the label used in the signatures is not equal to H(m1, K). Alice knows this (after all, she generated the label), but Bob may not know it if he is not required to make the respective check after the last protocol step. Now Alice can start a second protocol run in which she uses the same label H(m2, K) and key K, but this time indeed sends the enciphered message m2:

Network = {(Alice, Bob, (fEOO, Bob, H(m2, K), encrypt(K, m2),
            sign((Alice, priv, 1), (fEOO, Bob, H(m2, K), encrypt(K, m2)))))}

Bob answers by sending his fEOR message. Now Alice owns a valid EOR C of Bob for m2 and, from the first protocol run, con K for H(m2, K) and K. Hence Alice owns a valid proof that Bob received message m2, while again Bob has no chance to ever retrieve m2 or the Server’s signature. Alice now stops the protocol run.
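Using the term encoding and the normalize function from the first sketch above, the judge's check and the reason the second run goes through can be replayed in a few lines (the helper names are ours):

# Sketch: the judge's check L = H(dK(C), K) against both protocol runs.
K = ('K', 'sym', 1)

def label(m, k):                           # L = H(m, K), hashes as terms
    return ('hash', (m, k))

def judge_accepts(L, C, k):
    plain = normalize(('decrypt', k, C))   # dK(C)
    return L == label(plain, k)

L_reused = label('m2', K)                  # Alice's label H(m2, K)
print(judge_accepts(L_reused, ('encrypt', K, 'm1'), K))  # run 1: False
print(judge_accepts(L_reused, ('encrypt', K, 'm2'), K))  # run 2: True
# The con K that Alice obtained in run 1 is bound to H(m2, K) and therefore
# matches the evidence of receipt Bob produces for m2 in run 2.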
6.2 Time-Stamping Evidence
In [16] Zhou et al. propose using time stamps generated by the TTP in evidence con K to identify when the key, and thus the message, was made available. The protocol remains the same except for the additional time stamp Tcon in con K = sSTTP(fCON, A, B, L, Tcon, K). In addition to fair non-repudiation (Definition 1), this protocol shall satisfy another security goal: the time stamp Tcon is supposed to identify when the message is made available to B. To model non-repudiation with time stamps we have introduced a discrete time model into our framework. An additional elementary automaton Time increases a natural number in a state component T. Assumptions about the agents’ behaviour relative to time can be modelled by the behaviour of the automaton Time. In the non-repudiation protocol example, we assume that when B expects to retrieve con K and con K has already been made available by the TTP, then B retrieves con K within the current time slot. In this case, T is only increased by Time after B has retrieved con K. In this scenario the security goal for B is that whenever B retrieves con K, the time stamp signed by the TTP in con K must be the current time contained in state component T.
In the scenario where Bob (in role B) acts honestly as described above, Alice (in role A) can execute protocol steps 3 and 4 without previous execution of steps 1 or 2 and receive a time-stamped con K with a specific Tcon. Alice can even collect several different time stamps for the same message. Later, Alice starts the protocol as usual with step 1 at time T > Tcon. Bob responds with EOR C. As steps 3 and 4 have already happened, Alice terminates the protocol run after step 2 and owns a valid evidence of receipt for time Tcon < T, although the message was not available to Bob before time T. In [15], where a more elaborate version of this protocol is introduced, Zhou has pointed out that by sending sub K before receiving EOR C, Alice enables Bob to get the message and evidence of origin without providing any evidence of receipt. However, as Bob cannot know that Alice has already submitted sub K to the TTP, he has no reason to retrieve it from the TTP, and if he does retrieve it, he only gets evidence of origin containing an old time stamp.
6.3 Improved Fair Non-repudiation Protocols
Obviously, the problem with all three protocol variants above is that B cannot control the connection between the different parts of the evidence. We suggest improving the protocol by letting B introduce his own label LB when receiving the first message of the protocol, and by including this label in all subsequent messages. Thus, the new specifications of EOR C, sub K and con K are as follows (see also the sketch below):

– EOR C = sSB(fEOR, A, L, LB, C)
– sub K = sSA(fSUB, B, L, LB, K)
– con K = sSTTP(fCON, A, B, L, LB, K)

All three attacks are prevented because A cannot reuse con K for a different message, and A is not able to get a valid con K before step 2 has been executed, as A cannot guess LB.
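In the term encoding used above, the amended evidence reads as follows (a sketch; the signature sS is modelled by the symbolic sign):

# Sketch: the improved evidence terms, each carrying B's label LB.
def EOR_C(skb, A, L, LB, C):
    return ('sign', skb, ('fEOR', A, L, LB, C))

def sub_K(ska, B, L, LB, K):
    return ('sign', ska, ('fSUB', B, L, LB, K))

def con_K(skttp, A, B, L, LB, K):
    return ('sign', skttp, ('fCON', A, B, L, LB, K))

# A judge accepts a pair (EOR_C, con_K) only if both carry the same (L, LB).
# Since B draws LB freshly in every run and A cannot guess it, a con_K from
# an earlier run can no longer be combined with the EOR_C of a later run.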
7 Relevance of the Attacks
One can easily construct scenarios in which the attacks described in this paper disappear. However, the requirements for these scenarios are not very intuitive. The following observations motivate the choice of our analysis scenario:

1. After the protocol is finished there should be no need for the TTP to keep evidence available for retrieval by protocol agents. Fairness of future protocol runs must not rely on evidence from past protocol runs stored at the TTP, unless the protocol specification explicitly mentions respective actions. It is obvious that the TTP cannot delete evidence con K before both Alice and Bob have retrieved it, as in this case the protocols cannot be fair. On the other hand, apart from dispute resolution, no further protocol steps are carried out after Alice and Bob have received their pieces of evidence. As
the TTP does not participate in the dispute resolution, there is no obvious need to store any evidence after completion of the protocol. Although any real-world TTP may store a permanent log of all transactions, this log cannot prevent any of the attacks. Data in the log is not available for further protocol executions, and subsequent investigation cannot detect any misbehaviour of Alice because the TTP is not involved in the second (malicious) protocol run.

2. Any agent must keep evidence as long as necessary to resolve disputes about the particular protocol run. However, at some point the agent will consider the respective transaction completed, so from then on the evidence is no longer relevant. It is therefore not reasonable to require that evidence has to be kept forever to be used in future protocol runs.

The time attack in Section 6.2 does not require any of the assumptions above. The attack may occur even if evidence is kept forever by the TTP.
8 Conclusions
In this paper we have demonstrated our method for the security analysis of cryptographic protocols using three variants of a non-repudiation protocol proposed by Zhou and Gollmann, and have shown possible attacks on all three variants. The attacks illustrate the need for more detailed protocol specifications as a basis for secure implementations. The security analyses were carried out using the SH-verification tool [9]. All attacks were found in less than 5 seconds on a Pentium III 550 MHz computer. No attack was found for the improved version of the protocol proposed in Section 6.3. Here, the tool computed the complete reachable state space for a scenario with three possible protocol runs in 38 minutes. Although no attacks were found, other attacks may exist in scenarios we have not checked. Our methods do not provide proofs of security, but are similar to model checking analysis approaches where a finite state space is searched for insecure states (see, for example, [7] or [11]). Compared to other approaches, our methods are both very flexible and minimal with respect to implicit assumptions (we use “perfect encryption” and assume that no unauthorised access to agents’ memory is possible) and use more detailed protocol specifications. We expect that the attacks can also be found using other formal analysis methods if the specification of the protocol is not too abstract and if the attacks are not hidden by implicit assumptions. The examples show that although a protocol has been carefully studied and proven to be secure, there may still be unknown attacks. Consequently, security proofs have to be treated with care. Such proofs could be based on improper explicit or implicit assumptions. Our current work includes security analyses of more sophisticated non-repudiation protocols with resolve and abort sub-protocols. Results will be published in a forthcoming paper.
References

1. G. Bella and L. Paulson. Mechanical proofs about a non-repudiation protocol. In Proceedings of 14th International Conference on Theorem Proving in Higher Order Logic, Lecture Notes in Computer Science, pages 91–104. Springer Verlag, 2001.
2. S. Gürgens, P. Ochsenschläger, and C. Rudolph. Role based specification and security analysis of cryptographic protocols using asynchronous product automata. GMD Report 151, GMD – Forschungszentrum Informationstechnik GmbH, 2001.
3. S. Gürgens, P. Ochsenschläger, and C. Rudolph. Authenticity and provability, a formal framework. In Infrastructure Security Conference InfraSec 2002, volume 2437, page 227, 2002.
4. S. Gürgens, P. Ochsenschläger, and C. Rudolph. Role based specification and security analysis of cryptographic protocols using asynchronous product automata. In DEXA 2002 International Workshop on Trust and Privacy in Digital Business. IEEE, 2002.
5. K. Kim, S. Park, and J. Baek. Improving fairness and privacy of Zhou-Gollmann's fair non-repudiation protocol. In Proceedings of 1999 ICPP Workshop on Security, pages 140–145, 1999.
6. S. Kremer and J.-F. Raskin. A game-based verification of non-repudiation and fair exchange protocols. In Proceedings of 12th International Conference on Concurrency Theory, Lecture Notes in Computer Science, pages 551–565. Springer Verlag, 2001.
7. G. Lowe. Breaking and fixing the Needham-Schroeder public-key protocol using CSP and FDR. In Second International Workshop, TACAS '96, volume 1055 of LNCS, pages 147–166. Springer Verlag, 1996.
8. P. Ochsenschläger, J. Repp, and R. Rieke. Abstraction and composition – a verification method for co-operating systems. Journal of Experimental and Theoretical Artificial Intelligence, 12:447–459, 2000.
9. P. Ochsenschläger, J. Repp, R. Rieke, and U. Nitsche. The SH-Verification Tool – Abstraction-Based Verification of Co-operating Systems. Formal Aspects of Computing, The Int. Journal of Formal Methods, 11:1–24, 1999.
10. L. C. Paulson. Proving Properties of Security Protocols by Induction. Technical Report 409, Computer Laboratory, University of Cambridge, 1996.
11. B. Roscoe, P. Ryan, S. Schneider, M. Goldsmith, and G. Lowe. The Modelling and Analysis of Security Protocols. Addison Wesley, 2000.
12. S. Schneider. Verifying authentication protocols with CSP. In IEEE Computer Security Foundations Workshop. IEEE, 1997.
13. S. Schneider. Formal analysis of a non-repudiation protocol. In IEEE Computer Security Foundations Workshop. IEEE, 1998.
14. P.F. Syverson and P.C. van Oorschot. On unifying some cryptographic protocol logics. In IEEE Symposium on Security and Privacy, pages 14–28, May 1994.
15. J. Zhou. Non-repudiation in Electronic Commerce. Computer Security Series. Artech House, 2001.
16. J. Zhou, R. Deng, and F. Bao. Evolution of fair non-repudiation with TTP. In Proceedings of 1999 Australasian Conference on Information Security and Privacy ACISP, Lecture Notes in Computer Science, pages 258–269. Springer Verlag, 1999.
17. J. Zhou and D. Gollmann. A fair non-repudiation protocol. In Proceedings of the 1996 IEEE Symposium on Research in Security and Privacy, pages 55–61, Oakland, CA, 1996. IEEE Computer Society Press.
18. J. Zhou and D. Gollmann. An efficient non-repudiation protocol. In Proceedings of 10th IEEE Computer Security Foundations Workshop, pages 126–132, 1997.
19. J. Zhou and D. Gollmann. Towards verification of non-repudiation protocols. In Proceedings of 1998 International Refinement Workshop and Formal Methods Pacific, pages 370–380, 1998.
Modeling Adversaries in a Logic for Security Protocol Analysis

Joseph Y. Halpern and Riccardo Pucella

Department of Computer Science, Cornell University, Ithaca, NY 14853, USA
{halpern,riccardo}@cs.cornell.edu
Abstract. Logics for security protocol analysis require the formalization of an adversary model that specifies the capabilities of adversaries. A common model is the Dolev-Yao model, which considers only adversaries that can compose and replay messages, and decipher them with known keys. The Dolev-Yao model is a useful abstraction, but it suffers from some drawbacks: it cannot handle the adversary knowing protocol-specific information, and it cannot handle probabilistic notions, such as the adversary attempting to guess the keys. We show how we can analyze security protocols under different adversary models by using a logic with a notion of algorithmic knowledge. Roughly speaking, adversaries are assumed to use algorithms to compute their knowledge; adversary capabilities are captured by suitable restrictions on the algorithms used. We show how we can model the standard Dolev-Yao adversary in this setting, and how we can capture more general capabilities including protocol-specific knowledge and guesses.
1 Introduction

Many formal methods for the analysis of security protocols rely on specialized logics to rigorously prove properties of the protocols they study.¹ Those logics provide constructs for expressing the basic notions involved in security protocols, such as secrecy, recency, and message composition, as well as providing means (either implicitly or explicitly) for describing the evolution of the knowledge or belief of the principals as the protocol progresses. Every such logic aims at proving security in the presence of hostile adversaries. To analyze the effect of adversaries, a security logic specifies (again, either implicitly or explicitly) an adversary model, that is, a description of the capabilities of adversaries. Almost all existing logics are based on a Dolev-Yao adversary model [9]. Succinctly, a Dolev-Yao adversary can compose messages, replay them, or decipher them if she knows the right keys, but cannot otherwise “crack” encrypted messages. The Dolev-Yao adversary is a useful abstraction, in that it allows reasoning about protocols without worrying about the actual cryptosystem being used. It also has the advantage of being restricted enough that interesting theorems can be proved with
¹ Here, we take a very general view of logic, to encompass formal methods where the specification language is implicit, or where the properties to be checked are fixed, such as Casper [22], Cryptyc [16], or the NRL Protocol Analyzer [25].
respect to security. However, in many ways, the Dolev-Yao model is too restrictive. For example, it does not consider the information an adversary may infer from properties of messages and knowledge about the protocol that is being used. To give an extreme example, consider what we’ll call the Duck-Duck-Goose protocol: an agent has an n-bit key and, according to her protocol, sends the bits that make up her key one by one. Of course, after intercepting these messages, an adversary will know the key. However, there is no way for security logics based on a Dolev-Yao adversary to argue that, at this point, the adversary knows the key.

Another limitation of the Dolev-Yao adversary is that it does not easily capture probabilistic arguments. After all, the adversary can always be lucky and just guess the appropriate key to use, irrespective of the strength of the cryptosystem.

There is another important problem with the Dolev-Yao adversary. Because the Dolev-Yao model uses an abstract cryptosystem, it cannot capture subtle interactions between the cryptographic protocol and the cryptosystem. It is known that various protocols that appear secure under abstract cryptography can be problematic when implemented using specific cryptosystems [29]. Two examples of this are the phenomenon of encryption cycles [2], and the use of exclusive-or in some protocol implementations [33]. A more refined logic for reasoning about security protocols will have to be able to handle adversaries more general than the Dolev-Yao adversary.

The importance of being able to reason about different adversaries was made clear by Lowe [21] when he exhibited a man-in-the-middle attack on the well-known authentication protocol due to Needham and Schroeder [31]. Up until that time, the Needham-Schroeder protocol was analyzed under the assumption that the adversary had complete control of the network, and could intercept and inject arbitrary messages (up to the Dolev-Yao capabilities) into the protocol runs. However, the adversary was always assumed to be an outsider, not being able to directly interact with the protocol principals as himself. Lowe showed that if the adversary is allowed to be an insider of the system, that is, to appear to the other principals as a bona fide protocol participant, then the Needham-Schroeder protocol does not guarantee the authentication properties it is meant to guarantee. Because they effectively build in the adversary model, existing formal methods for analyzing protocols are not able to reason directly about the effect of running a protocol against adversaries with properties other than those built in. The problem is even worse when it is not clear exactly what assumptions are implicitly being made about the adversary.

In this paper, we introduce a logic for reasoning about security protocols that allows us to model adversaries explicitly. The idea is to model the adversary in terms of what the adversary knows. This approach has some significant advantages. Logics of knowledge [14] have been shown to provide powerful methods for reasoning about trace-based executions of protocols. They can be given a semantics that is tied directly to protocol execution, thus avoiding problems of having to analyze an idealized form of the protocol, as is required, for example, in BAN logic [7]. A straightforward application of logics of knowledge allows us to conclude that in the Duck-Duck-Goose protocol, the adversary knows the key.
Logics of knowledge can also be extended with probabilities [13,18] so as to be able to deal with probabilistic phenomena. Unfortunately, traditional logics of knowledge suffer from a well-known problem known as the logical omniscience problem: an agent knows all tautologies and all the logical consequences of her knowledge. The reasoning that allows an agent to infer properties of the protocol also allows an attacker to deduce properties that cannot be computed by realistic attackers in any reasonable amount of time. To avoid the logical omniscience problem, we use the notion of algorithmic knowledge [14, Chapters 10 and 11]. Roughly speaking, we assume that agents (including adversaries) have “knowledge algorithms” that they use to compute what they know. The capabilities of the adversary are captured by its algorithm. Hence, Dolev-Yao capabilities can be provided by using a knowledge algorithm that can only compose messages or attempt to decipher using known keys. By changing the algorithm, we can extend the capabilities of the adversary so that it can attempt to crack the cryptosystem by factoring (in the case of RSA), using differential cryptanalysis (in the case of DES), or just by guessing keys, along the lines of a recent model due to Lowe [23]. Moreover, our framework can also handle the case of a principal sending the bits of its key, by providing the adversary’s algorithm with a way to check whether this is indeed what is happening. By explicitly using algorithms, we can therefore analyze the effect of bounding the resources of the adversary, and thus make progress toward bridging the gap between the analysis of cryptographic protocols and more computational accounts of cryptography. (See [2] and the references therein for a discussion of work bridging this gap.) Note that we need both traditional knowledge and algorithmic knowledge in our analysis. Traditional knowledge is used to model a principal’s beliefs about what can happen in the protocol; algorithmic knowledge is used to model the adversary’s computational limitations (for example, the fact that it cannot factor).

The rest of the paper is organized as follows. In Section 2, we define our model for protocol analysis and our logic for reasoning about implicit and explicit knowledge, based on the well-understood multiagent system framework. In Section 3, we show how to model different adversaries from the literature. We discuss related work in Section 4.
2 A Logic for Security Protocol Analysis

In this section, we review the multi-agent system framework and the logic of algorithmic knowledge. We then show how these can be tailored to define a logic for reasoning about security protocols.

Multiagent Systems. The multiagent systems framework [14, Chapters 4 and 5] provides a model for knowledge that has the advantage of also providing a discipline for modeling executions of protocols. A multiagent system consists of n agents, each of which is in some local state at a given point in time. We assume that an agent’s local state encapsulates all the information to which the agent has access. In the security setting, the local state of an agent might include some initial information regarding keys, the messages she has sent and received, and perhaps the reading of a clock. In a poker game, a player’s local state might consist of the cards he currently holds, the bets made by other players, any other cards he has seen, and any information he may have about the strategies of the other players (for example, Bob may know that Alice
likes to bluff, while Charlie tends to bet conservatively). The basic framework makes no assumptions about the precise nature of the local state. We can then view the whole system as being in some global state, which is a tuple consisting of each agent’s local state, together with the state of the environment, where the environment consists of everything that is relevant to the system that is not contained in the state of the agents. Thus, a global state has the form (se, s1, . . . , sn), where se is the state of the environment and si is agent i’s state, for i = 1, . . . , n. The actual form of the agents’ local states and the environment’s state depends on the application.

A system is not a static entity. To capture its dynamic aspects, we define a run to be a function from time to global states. Intuitively, a run is a complete description of what happens over time in one possible execution of the system. A point is a pair (r, m) consisting of a run r and a time m. For simplicity, we take time to range over the natural numbers in the remainder of this discussion. At a point (r, m), the system is in some global state r(m). If r(m) = (se, s1, . . . , sn), then we take ri(m) to be si, agent i’s local state at the point (r, m). We formally define a system R to consist of a set of runs (or executions). It is relatively straightforward to model security protocols as systems. Note that the adversary in a security protocol can be modeled as just another agent. The adversary’s information at a point in a run can be modeled by its local state.

The Basic Logic. The aim is to be able to reason about properties of such systems, including properties involving the knowledge of agents in the system. To formalize this type of reasoning, we first need a language. The logic of algorithmic knowledge [14, Chapters 10 and 11] provides such a framework. It extends the classical logic of knowledge by adding algorithmic knowledge operators. The syntax of the logic Ln^KX for algorithmic knowledge is straightforward. Starting with a set Φ0 of primitive propositions, which we can think of as describing basic facts about the system, such as “the key is k” or “agent A sent the message m to B”, complicated formulas of Ln^KX(Φ0) are formed by closing off under negation, conjunction, and the modal operators K1, . . ., Kn and X1, . . . , Xn.

Syntax of Ln^KX(Φ0):

p, q ∈ Φ0                 Primitive propositions
ϕ, ψ ∈ Φ ::= p            Primitive proposition
           | ¬ϕ           Negation
           | ϕ ∧ ψ        Conjunction
           | Ki ϕ         Implicit knowledge of ϕ (i ∈ 1..n)
           | Xi ϕ         Explicit knowledge of ϕ (i ∈ 1..n)
The formula Ki ϕ is read as “agent i (implicitly) knows the fact ϕ”, while Xi ϕ is read as “agent i explicitly knows fact ϕ”. In fact, we will read Xi ϕ as “agent i can compute fact ϕ”. This reading will be made precise when we discuss the semantics of the logic. As usual, we take ϕ ∨ ψ to be an abbreviation for ¬(¬ϕ ∧ ¬ψ) and ϕ ⇒ ψ to be an abbreviation for ¬ϕ ∨ ψ.
The standard models for this logic are based on the idea of possible worlds and Kripke structures [19]. Formally, a Kripke structure M is a tuple (S, π, K1, . . . , Kn), where S is a set of states or possible worlds, π is an interpretation which associates with each state in S a truth assignment to the primitive propositions (i.e., π(s)(p) ∈ {true, false} for each state s ∈ S and each primitive proposition p), and Ki is an equivalence relation on S (recall that an equivalence relation is a binary relation which is reflexive, symmetric, and transitive). Ki is agent i’s possibility relation. Intuitively, (s, t) ∈ Ki if agent i cannot distinguish state s from state t (so that if s is the actual state of the world, agent i would consider t a possible state of the world).

A system can be viewed as a Kripke structure, once we add a function π telling us how to assign truth values to the primitive propositions. An interpreted system I consists of a pair (R, π), where R is a system and π is an interpretation for the propositions in Φ that assigns truth values to the primitive propositions at the global states. Thus, for every p ∈ Φ and global state s that arises in R, we have π(s)(p) ∈ {true, false}. Of course, π also induces an interpretation over the points of R; simply take π(r, m) to be π(r(m)). We refer to the points of the system R as points of the interpreted system I. The interpreted system I = (R, π) can be made into a Kripke structure by taking the possible worlds to be the points of R, and by defining Ki so that ((r, m), (r′, m′)) ∈ Ki if ri(m) = r′i(m′). Clearly Ki is an equivalence relation on points. Intuitively, agent i considers a point (r′, m′) possible at a point (r, m) if i has the same local state at both points. Thus, the agents’ knowledge is completely determined by their local states.

To account for Xi, we provide each agent with a knowledge algorithm that he uses to compute his knowledge. We will refer to Xi ϕ as algorithmic knowledge. An interpreted algorithmic system is of the form (R, π, A1, . . . , An), where (R, π) is an interpreted system, and Ai is the knowledge algorithm of agent i. In local state ℓ, the agent computes whether he knows ϕ by applying the knowledge algorithm Ai to input (ϕ, ℓ). The output is either “Yes”, in which case the agent knows ϕ to be true, “No”, in which case the agent does not know ϕ to be true, or “?”, which intuitively says that the algorithm has insufficient resources to compute the answer. It is the last clause that allows us to deal with resource-bounded reasoners.

We define what it means for a formula ϕ to be true (or satisfied) at a point (r, m) in an interpreted system I, written (I, r, m) |= ϕ, inductively as follows:

(I, r, m) |= p       iff π(r, m)(p) = true
(I, r, m) |= ¬ϕ      iff (I, r, m) |= ϕ does not hold
(I, r, m) |= ϕ ∧ ψ   iff (I, r, m) |= ϕ and (I, r, m) |= ψ
(I, r, m) |= Ki ϕ    iff (I, r′, m′) |= ϕ for all (r′, m′) such that ri(m) = r′i(m′)
(I, r, m) |= Xi ϕ    iff Ai(ϕ, ri(m)) = “Yes”
The first clause shows how we use π to define the semantics of the primitive propositions. The next two clauses, which define the semantics of ¬ and ∧, are the standard clauses from propositional logic. The fourth clause is designed to capture the intuition that agent i knows ϕ exactly if ϕ is true in all the worlds that i thinks are possible. The last clause captures the fact that explicit knowledge is determined using the knowledge algorithm of the agent.
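The clauses translate directly into a recursive evaluator over a finite system. The following sketch is ours: formulas are nested tuples, and the interpretation, the set of points, the local-state accessor and the knowledge algorithms are assumed to be supplied by the caller.

# Sketch of the satisfaction relation; all data-structure choices are ours.
# Formulas: ('prop', p), ('not', f), ('and', f, g), ('K', i, f), ('X', i, f).
def sat(I, r, m, f):
    pi, points, local_state, alg = I      # interpretation, points, r_i(m), A_i
    tag = f[0]
    if tag == 'prop':
        return pi(r, m)(f[1])
    if tag == 'not':
        return not sat(I, r, m, f[1])
    if tag == 'and':
        return sat(I, r, m, f[1]) and sat(I, r, m, f[2])
    if tag == 'K':                        # true at all indistinguishable points
        i, g = f[1], f[2]                 # (enumeration assumes finitely many points)
        return all(sat(I, r2, m2, g)
                   for (r2, m2) in points
                   if local_state(i, r2, m2) == local_state(i, r, m))
    if tag == 'X':                        # ask agent i's knowledge algorithm
        i, g = f[1], f[2]
        return alg[i](g, local_state(i, r, m)) == 'Yes'
    raise ValueError('unknown formula')

Note how the last clause makes explicit knowledge cheap to evaluate: it is a single call to the agent's algorithm, whereas implicit knowledge quantifies over all points of the system.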
As we pointed out, we think of Ki as representing implicit knowledge, facts that the agent implicitly knows, given its information, while Xi represents explicit knowledge, facts whose truth the agent can compute explicitly. As defined, there is no necessary connection between Xi ϕ and Ki ϕ. An algorithm could very well claim that agent i knows ϕ (i.e., output “Yes”) whenever it chooses to, including at points where Ki ϕ does not hold. Although algorithms that make mistakes are common, we are often interested in knowledge algorithms that are correct. A knowledge algorithm is sound for agent i in the system I if for all points (r, m) of I and formulas ϕ, Ai(ϕ, ri(m)) = “Yes” implies (I, r, m) |= Ki ϕ, and Ai(ϕ, ri(m)) = “No” implies (I, r, m) |= ¬Ki ϕ. Thus, a knowledge algorithm is sound if its answers are always correct.

Specializing to Security. The systems and logic described earlier in this section are fairly general. We have a particular application in mind, namely reasoning about security protocols, especially authentication protocols. We now specialize the framework above by describing the local states of the systems under consideration, as well as pinning down the primitive propositions in our logic and the interpretation of those propositions. For the purposes of analyzing security protocols, we consider security systems to be those where the local state of a principal consists of the principal’s initial information followed by the sequence of events that the principal has been involved in. An event is either the reception recv(m) of a message m, or the sending send(i, m) of a message m to another agent i. To model the fact that adversaries can intercept all the messages exchanged by the principals, we assume that the adversary’s local state includes the set of messages exchanged between all the principals.

Since security protocols typically involve agents exchanging encrypted messages, we need to discuss the cryptographic concepts we require. We assume a set P of plaintexts, as well as a set K of keys. We define a cryptosystem C over P and K to be the closure M of P and K under a concatenation operation conc : M × M → M and an encryption operation encr : M × K → M. We often write m1 · m2 for conc(m1, m2) and {|m|}k for encr(m, k). There is no difficulty in adding more operations to the cryptosystems, for instance, to model hashes, signatures, or the ability to take the exclusive-or of two terms. The messages exchanged in security systems will be taken from this set M of messages. We assume that the set K of keys is closed under inverses; for a given key k ∈ K, we assume an inverse key k⁻¹ ∈ K such that encrypting with respect to the key k⁻¹ is equivalent to decrypting a message, that is, {|{|m|}k|}k⁻¹ = m. If the cryptosystem uses symmetric keys, then k⁻¹ = k; for public key cryptosystems, k and k⁻¹ are different. Formally, we define M to be the smallest set containing both P and K such that if m1, m2 ∈ M and k ∈ K, then {|m1|}k ∈ M and m1 · m2 ∈ M. We make no assumption in the general case as to the properties of encryption.

Define ⊑ on M as the smallest relation satisfying the following constraints:

1. m ⊑ m,
2. if m ⊑ m1, then m ⊑ m1 · m2,
3. if m ⊑ m2, then m ⊑ m1 · m2,
4. if m ⊑ m1, then m ⊑ {|m1|}k.
Intuitively, m1 ⊑ m2 if m1 could be used in the construction of m2. For example, if m = {|m1|}k = {|m2|}k, then both m1 ⊑ m and m2 ⊑ m. Therefore, if we want to establish that m1 ⊑ m2 for given m1 and m2, we have to look at all the possible ways in which m2 can be taken apart, either by concatenation or encryption, to finally decide whether m1 can be derived from m2.

To reason about security protocols, we consider a specific set of primitive propositions Φ0^S:

p, q ∈ Φ0^S ::= sendi(m)    Agent i sent message m
              | recvi(m)    Agent i received message m
              | hasi(m)     Agent i has message m
Intuitively, sendi(m) is true when agent i has sent message m at some point, and recvi(m) is true when agent i has received message m at some point. Agent i has a submessage m1 at a point (r, m), written hasi(m1), if there exists a message m2 ∈ M such that recv(m2) is in ri(m), the local state of agent i, and m1 ⊑ m2. Note that the hasi predicate is not restricted by issues of encryption. If hasi({|m|}k) holds, then so does hasi(m), whether or not agent i knows the key k⁻¹. Intuitively, the hasi predicate characterizes the messages that agent i has implicitly in his possession, given the messages that have been exchanged between the principals.

An interpreted security system is simply an interpreted system I = (R, πR^S) where R is a security system and πR^S is the following canonical interpretation of the primitive propositions in R:

πR^S(r, m)(sendi(m)) = true iff ∃j such that send(j, m) ∈ ri(m)
πR^S(r, m)(recvi(m)) = true iff recv(m) ∈ ri(m)
πR^S(r, m)(hasi(m))  = true iff ∃m′ such that m ⊑ m′ and recv(m′) ∈ ri(m)

An interpreted algorithmic security system is similarly defined as an interpreted algorithmic system (R, πR^S, A1, . . . , An), where R is a security system and πR^S is the canonical interpretation in R.
3 Modeling Adversaries

As we outlined in the last section, interpreted algorithmic security systems provide a foundation for representing security protocols, and support a logic for writing properties based on knowledge, both traditional and algorithmic. For the purposes of analyzing security protocols, we use traditional knowledge to model a principal’s beliefs about what can happen in the protocol, while we use algorithmic knowledge to model the adversary’s capabilities, possibly resource-bounded. To interpret algorithmic knowledge, we rely on a knowledge algorithm for each agent in the system. We haven’t said
anything yet as to what kind of algorithm we might consider, short of the fact that we typically care about sound knowledge algorithms. For the purpose of security, the knowledge algorithms we give to the adversary are the most important, as they capture the facts that the adversary can compute given what he has seen. In this section, we show how we can capture different capabilities of the adversary rather naturally in this framework. We first show how to capture the standard model of adversary due to Dolev and Yao. We then show how to account for adversaries in the Duck-Duck-Goose protocol, and for the adversary due to Lowe that can perform self-validating guesses.

The Dolev-Yao Adversary. As a first example of this, consider the standard Dolev-Yao adversary [9]. This model is a combination of assumptions on the cryptosystem used and on the capabilities of the adversaries. Specifically, the cryptosystem is seen as the free algebra generated by P and K over abstract operations · and {| |}. Perhaps the easiest way to formalize this is to view the set M as the set of abstract expressions generated by the grammar

m ::= p | k | {|m|}k | m · m

(with p ∈ P and k ∈ K). We then identify elements of M under the equivalence {|{|m|}k|}k⁻¹ = m. Notice that this model of cryptosystem implicitly assumes that there are no collisions; messages always have a unique decomposition. The only way that {|m|}k = {|m′|}k′ is if m = m′ and k = k′. We make the standard assumption that concatenation and encryption have enough redundancy to recognize that a term is in fact a concatenation m1 · m2 or an encryption {|m|}k.

The Dolev-Yao model can be formalized by a relation H ⊢DY m between a set H of messages and a message m. (Our formalization is equivalent to many other formalizations of Dolev-Yao in the literature, and is similar in spirit to that of Paulson [32].) Intuitively, H ⊢DY m means that an adversary can “extract” message m from a set of received messages and keys H, using the allowable operations. The derivation is defined using the following inference rules:

m ∈ H
─────────
H ⊢DY m

H ⊢DY {|m|}k    H ⊢DY k⁻¹
─────────────────────────
H ⊢DY m

H ⊢DY m1 · m2
─────────────
H ⊢DY m1

H ⊢DY m1 · m2
─────────────
H ⊢DY m2
In our framework, to capture the capabilities of a Dolev-Yao adversary, we specify how the adversary can tell whether she in fact has a message, by defining a knowledge algorithm Ai^DY for adversary i. Recall that a knowledge algorithm for agent i takes as input a formula and agent i’s local state (which we are assuming contains the messages received by i). The most interesting case in the definition of Ai^DY is when the formula is hasi(m). To compute Ai^DY(hasi(m), ℓ), the algorithm simply checks, for every message m′ received by the adversary, whether m is a submessage of m′, according to the keys that are known to the adversary. We assume that the adversary’s initial state consists of the set of keys initially known by the adversary. This will typically contain, in a public-key cryptography setting, the public keys of all the agents. We use initkeys(ℓ) to denote the set of initial keys known by agent i in local state ℓ. (Recall that a local state for agent i is the sequence of events pertaining to agent i, including any initial information
in the run, in this case, the keys initially known.) Checking whether m is a submessage of m′ is performed by a function submsg, which can take apart messages created by concatenation, or decrypt messages as long as the adversary knows the decryption key.

Dolev-Yao knowledge algorithm (extract):

Ai^DY(hasi(m), ℓ):
    K = keysof(ℓ)
    for each recv(m′) in ℓ do
        if submsg(m, m′, K) then return “Yes”
    return “No”

The full algorithm can be found in the appendix. According to the Dolev-Yao model, the adversary cannot explicitly compute anything interesting about what other messages agents have. Hence, for other primitives, including hasj(m) for j ≠ i, Ai^DY returns “?”. For formulas of the form Kj ϕ and Xj ϕ, Ai^DY also returns “?”. For Boolean combinations of formulas, Ai^DY returns the corresponding Boolean combination (where the negation of “?” is “?”, the conjunction of “No” and “?” is “No”, and the conjunction of “Yes” and “?” is “?”) of the answer for each hasi(m) query.

The following result shows that an adversary using Ai^DY recognizes (i.e., returns “Yes” to) hasi(m) in state ℓ exactly when m is in the set of messages that can be derived (according to ⊢DY) from the messages received in that state together with the keys initially known. Moreover, if a hasi(m) formula is derived at the point (r, m), then hasi(m) is actually true at (r, m) (so that Ai^DY is sound).

Proposition 1. In the interpreted algorithmic security system I = (R, πR^S, A1, . . . , An), where Ai = Ai^DY, we have that (I, r, m) |= Xi(hasi(m)) iff {m′ : recv(m′) ∈ ri(m)} ∪ initkeys(ri(m)) ⊢DY m. Moreover, if (I, r, m) |= Xi(hasi(m)) then (I, r, m) |= hasi(m).
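For concreteness, here is our reconstruction of the submsg test at the core of Ai^DY (the paper's full algorithm is in its appendix). The sketch uses the ('conc', ...)/('encr', ...) encoding from the previous sketch, assumes symmetric keys so that a ciphertext opens exactly when its key is in K, and omits the fixpoint computation needed when decryption reveals further keys.

# Sketch of submsg(m, m', K): can m be extracted from m' with the keys in K?
def submsg(m, t, K):
    if m == t:
        return True
    if isinstance(t, tuple) and t[0] == 'conc':
        return submsg(m, t[1], K) or submsg(m, t[2], K)
    if isinstance(t, tuple) and t[0] == 'encr':
        # unlike ⊑, an encryption may only be crossed with a known key
        return t[2] in K and submsg(m, t[1], K)
    return False

def A_DY(m, local_state, initkeys):    # A_i^DY on a has_i(m) query
    K = set(initkeys)                  # a full version would close K under
                                       # keys extractable from the messages
    for ev in local_state:
        if ev[0] == 'recv' and submsg(m, ev[1], K):
            return 'Yes'
    return 'No'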
The Duck-Duck-Goose Adversary. The key advantage of our framework is that we can easily change the capabilities of the adversary beyond those prescribed by the Dolev-Yao model. For example, we can capture the fact that if the adversary knows the protocol, she can derive more information than she could otherwise. For instance, in the Duck-Duck-Goose example, assume that the adversary maintains in her local state a list of all the bits received corresponding to the key of the principal. We can easily write the algorithm so that if the adversary’s local state contains all the bits of the key of the principal, then the adversary can decode messages that have been encrypted with that key. Specifically, assume that key k is being sent in the Duck-Duck-Goose example. Then for an adversary i, hasi(k) will be false until all the bits of the key have been received. This translates immediately into the following algorithm Ai^DDG:

Ai^DDG(hasi(k), ℓ):
    if all the bits recorded in ℓ form k then return “Yes”
    else return “No”

Ai^DDG handles other formulas in the same way as Ai^DY.
Of course, nothing keeps us from combining algorithms, so that we can imagine an adversary intercepting both messages and key bits, and using an algorithm Ai which is a combination of the Dolev-Yao algorithm and the Duck-Duck-Goose algorithm, such as:

Ai(ϕ, ℓ):
    if Ai^DY(ϕ, ℓ) = “Yes” then return “Yes”
    else return Ai^DDG(ϕ, ℓ)

This assumes that the adversary knows the protocol, and hence knows when the key bits are being sent. The algorithm above captures this protocol-specific knowledge.

The Lowe Adversary. For a more realistic example of an adversary model that goes beyond Dolev-Yao, consider the following adversary model introduced by Lowe [23] to analyze protocols subject to guessing attacks. The intuition is that some protocols provide a way to “validate” the guesses of an adversary. For a simple example of this, here is a simple challenge-based authentication protocol:

A → S : A
S → A : ns
A → S : {|ns|}pa

Intuitively, A tells the server S that she wants to authenticate herself. S replies with a challenge ns. A sends back to S the challenge encrypted with her password pa. Presumably, S knows the password, and can verify that she gets {|ns|}pa. Unfortunately, an adversary can overhear both ns and {|ns|}pa, and can “guess” a value g for pa and verify his guess by checking whether {|ns|}g = {|ns|}pa. The key feature of this kind of attack is that the guessing (and the validation) can be performed offline, based only on the intercepted messages.

To account for this capability of adversaries is actually fairly complicated. We present a slight variation of Lowe’s description, mostly to make it notationally consistent with the rest of the section; we refer the reader to Lowe [23] for a discussion of the design choices. Lowe’s model relies on a basic one-step reduction function, S ⊢l m, saying that the messages in S can be used to derive the message m. This is essentially the same as ⊢DY, except that it represents a single step of derivation. Moreover, the relation is “tagged” by the kind of derivation performed (l):

{m, k} ⊢enc {|m|}k
{{|m|}k, k⁻¹} ⊢dec m
{m1 · m2} ⊢fst m1
{m1 · m2} ⊢snd m2

Lowe also includes a reduction to derive m1 · m2 from m1 and m2. We do not add this reduction, to simplify the presentation. It is straightforward to extend the work in this section to account for this augmented derivation.
Given a set H of messages and a sequence t of one-step reductions, we define inductively the set [H]_t of messages obtained from the one-step reductions given in t.

Messages obtained from one-step reductions: [H]_t

  [H]_ε = H
  [H]_{(S ⊢_l m)·t} = [H ∪ {m}]_t   if S ⊆ H
                    = undefined     otherwise
Here, ε denotes the empty trace, and t_1 · t_2 denotes trace concatenation. A trace t is said to be monotone if, intuitively, it does not perform any one-step reduction that "undoes" a previous one-step reduction. For example, the reduction {m, k} ⊢_enc {|m|}_k undoes the reduction {{|m|}_k, k^{-1}} ⊢_dec m. (See Lowe [23] for more details on undoing reductions.)

We say that a set H of messages validates a guess m if, intuitively, H contains enough information to verify that m is indeed a good guess. Intuitively, this happens if a value v (called a validator) can be derived from the messages in H ∪ {m} in a way that uses the guess m, and either (a) the validator v can be derived in a different way from H ∪ {m}, (b) the validator v is already in H ∪ {m}, or (c) the validator v is a key whose inverse is derivable from H ∪ {m}. For example, in the protocol exchange at the beginning of this section, the adversary sees the messages H = {n_s, {|n_s|}_{p_a}}, and we can check that H validates the guess m = p_a: clearly, {n_s, m} ⊢_enc {|n_s|}_{p_a}, and {|n_s|}_{p_a} ∈ H ∪ {m}. In this case, the validator {|n_s|}_{p_a} is already present in H ∪ {m}. For other examples of validation, we again refer to Lowe [23].

We can now define the relation H ⊢_L m, which says that m can be derived from H by a Lowe adversary. Intuitively, H ⊢_L m if m can be derived by Dolev-Yao reductions, or m can be guessed and validated by the adversary, and hence is susceptible to an attack.

Lowe derivation: H ⊢_L m

H ⊢_L m iff H ⊢_DY m or there exists a monotone trace t, a set S, and a "validator" v such that:
  (1) [H ∪ {m}]_t is defined,
  (2) S ⊢_l v is in t,
  (3) there is no trace t′ such that S ⊆ [H]_{t′}, and
  (4) either:
      (a) there exists (S′, l′) ≠ (S, l) with S′ ⊢_{l′} v in t,
      (b) v ∈ H ∪ {m}, or
      (c) v ∈ K and v^{-1} ∈ [H ∪ {m}]_t.
One can verify that the above formalization captures the intuition about validation given earlier. Specifically, condition (1) says that the trace t is well-formed, condition (2) says that the validator v is derived from H ∪ {m}, condition (3) says that deriving the validator v depends on the guess m, and condition (4) specifies when a validator v validates a guess m, as given earlier.
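To make the guessing attack itself concrete, here is a small, self-contained Java sketch of our own. It uses a keyed hash as a stand-in for the encryption {|·|}_k, since any deterministic scheme suffices to illustrate offline validation, and dictionary-checks candidate passwords against the overheard pair (n_s, {|n_s|}_{p_a}); names and values are illustrative.

  import java.nio.charset.StandardCharsets;
  import java.security.MessageDigest;
  import java.util.Arrays;
  import java.util.List;

  final class OfflineGuessing {
      // Stand-in for {|m|}_k: a hash of key and message; a real attack would use
      // the protocol's actual cipher in exactly the same way.
      static byte[] enc(String m, String k) throws Exception {
          MessageDigest d = MessageDigest.getInstance("SHA-256");
          return d.digest((k + "|" + m).getBytes(StandardCharsets.UTF_8));
      }

      public static void main(String[] args) throws Exception {
          String ns = "nonce-12345";                    // overheard challenge n_s
          byte[] intercepted = enc(ns, "hunter2");      // overheard {|n_s|}_{p_a}
          List<String> dictionary = Arrays.asList("password", "letmein", "hunter2");
          for (String g : dictionary) {                 // entirely offline: no contact with S
              if (Arrays.equals(enc(ns, g), intercepted)) {
                  System.out.println("validated guess: " + g);
              }
          }
      }
  }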
We would now like to define a knowledge algorithm A_i^L to capture the capabilities of the Lowe adversary. Again, the only case of real interest is what A_i^L does on input has_i(m).

Lowe knowledge algorithm (extract):

  A_i^L(has_i(m), ℓ)
    if A_i^DY(has_i(m), ℓ) = "Yes" then return "Yes"
    if guess(m, ℓ) then return "Yes"
    return "No"

The full algorithm can be found in the appendix. (We have not concerned ourselves with matters of efficiency in the description of A_i^L; again, see Lowe [23] for a discussion of implementation issues.) As before, we can check the correctness and soundness of the algorithm:

Proposition 2. In the interpreted algorithmic security system I = (R, π_R^S, A_1, . . . , A_n), where A_i = A_i^L, we have that (I, r, m) |= X_i(has_i(m)) iff {m′ : recv(m′) ∈ r_i(m)} ∪ initkeys(r_i(m)) ⊢_L m. Moreover, if (I, r, m) |= X_i(has_i(m)) then (I, r, m) |= has_i(m).
4 Related Work

The issues we raise in this paper are certainly not new, and have been addressed, up to a point, in the literature. In this section, we review this literature and discuss where we stand with respect to other approaches that have attempted to tackle some of the same problems. As we mentioned in the introduction, the Dolev-Yao adversary is the most widespread adversary in the literature. Part of its attraction is its tractability, which makes it possible to develop formal systems that automatically check for safety with respect to such adversaries [27,28,32,22,25]. The idea of moving beyond the Dolev-Yao adversary is not new. As we pointed out in Section 3, Lowe [23] developed an adversary that can encode some amount of off-line guessing, and we showed there that we can capture such an adversary in our framework. Other approaches also leave room for extending the adversary model. For instance, the frameworks of Paulson [32], Clarke, Jha, and Marrero [8], and Lowe [22] describe the adversary via a set of derivation rules, which could be modified by adding new derivation rules. We could certainly capture these adversaries by appropriately modifying our A_i^DY knowledge algorithm. However, it does not seem that these other approaches have the flexibility of our approach in terms of capturing adversaries: not all adversaries can be conveniently described in terms of derivation rules. There are other approaches that weaken the Dolev-Yao adversary assumptions by either taking concrete cryptosystems into account, or at least adding new algebraic identities to the algebra of messages. Bieber [6] does not assume that the cryptosystem is a free algebra, following an idea due to Merritt and Wolper [26]. Even et al. [11] analyze ping-pong protocols under RSA, taking the actual cryptosystem into account.
The applied π-calculus of Abadi and Fournet [1] permits the definition of an equational theory over the messages exchanged between processes, weakening some of the abstract cryptosystem assumptions when the applied π-calculus is used to analyze security protocols. Since the cryptosystem used in our framework is a simple parameter to the logic, there is no difficulty in modifying our logic to reason about a particular cryptosystem, and hence we can capture these approaches in our framework. But again, it is not clear to what extent these other approaches have the same flexibility as ours. On a related note, the work of Abadi and Rogaway [2], building on previous work by Bellare and Rogaway [5], compares the results obtained by a Dolev-Yao adversary with those obtained by a more computational view of cryptography, and shows that, under various conditions, the former is sound with respect to the latter; that is, terms that are indistinguishable under abstract encryption remain indistinguishable under a concrete encryption scheme. It would be interesting to recast their analysis in our setting, which, as we argued, can capture both the Dolev-Yao adversary and more concrete adversaries.

The use of a logic based on knowledge or belief is also not new. A number of formal logics for the analysis of security protocols that involve knowledge and belief have been introduced, going back to BAN logic [7]; examples include [3,6,15,34,35,4]. The main problem with some of those approaches is that the semantics of the logic (to the extent that one is provided) is typically not tied to protocol executions or attacks. As a result, protocols are analyzed in an idealized form, and this idealization is itself error-prone and difficult to formalize [24].² While some of these approaches have a well-defined semantics and do not rely on idealization (e.g., [6,4]), they are still restricted to (a version of) the Dolev-Yao adversary. In contrast, our framework goes beyond Dolev-Yao, as we have seen, and our semantics is directly tied to protocol execution.

The problem of logical omniscience in logics of knowledge is well known, and the literature describes numerous approaches that try to circumvent it. (See [14, Chapters 10 and 11] for an overview.) In the context of security, this takes the form of using different semantics for knowledge, either by introducing hiding operators that hide part of the local state for the purpose of indistinguishability (as done, for example, in [3]), or by using notions such as awareness [12] to capture an intruder's inability to decrypt [4].³ Roughly speaking, the semantics for awareness can specify, for every point, a set of formulas of which an agent is aware. For instance, an agent may be aware of a formula without being aware of its subformulas. A general problem with awareness is determining the set of formulas of which an agent is aware at any point. One interpretation of algorithmic knowledge is that it prescribes algorithmically what formulas an agent is aware of: those for which the algorithm says "Yes". In that sense, we subsume approaches based on awareness by providing them with an intuition. (We should note that the use of awareness by Accorsi et al. [4] is not motivated by the desire to model more general adversaries, but by the desire to restrict the number of states one needs to consider in models. Hence, the thrust of their work is different from ours.)

² While more recent logical approaches (e.g., [10]) do not suffer from an idealization phase and are more tied to protocol execution, they do not attempt to capture knowledge and belief in any general way.
³ A notion of algorithmic knowledge was defined by Moses [30], and used by Halpern, Moses, and Tuttle [17] to analyze zero-knowledge protocols. Although related to the algorithmic knowledge defined here, Moses' approach does not use an explicit algorithm. Rather, it checks whether there exists an algorithm of a certain class (for example, a polynomial-time algorithm) that could compute such knowledge.
5 Conclusion

We have presented a framework for security analysis using algorithmic knowledge. The knowledge algorithm can be tailored to account for both the capabilities of the adversary and the specifics of the protocol under consideration. Of course, it is always possible to take a security logic and extend it in an ad hoc way to reason about adversaries with different capabilities. Our approach has many advantages over ad hoc approaches: it is a general framework (we simply need to change the algorithm used by the adversary to change its capabilities, or add adversaries with different capabilities), and it permits reasoning about protocol-specific issues (we can capture cases such as an agent sending the bits of its key). Another advantage of our approach is that it extends naturally to the probabilistic setting. For instance, we can easily handle probabilistic protocols by moving to multiagent systems with an associated probability distribution on the runs (see Halpern and Tuttle [18]). In a slightly less trivial setting, we can also deal with knowledge algorithms that are probabilistic. This leads to some difficulty, since the semantics of X_i only really makes sense if the knowledge algorithm is deterministic. We have extended the theory to handle such cases, and hope to report on the resulting framework in future work. This would allow us to capture probabilistic adversaries of the kind studied by Lincoln et al. [20].

The goal of this paper was to introduce a general framework for handling different adversary models in a natural way, not specifically to devise new attacks or adversary capabilities. With this framework, it should be possible to put new attacks introduced by the community on a formal foundation. We gave a concrete example of this with the "guess-and-confirm" attacks of Lowe [23]. We are in the process of incorporating the ideas of this paper into a logic for reasoning about security protocols.

It is fair to ask at this point what we can gain by using this framework. For one thing, we believe that the ability of the framework to describe the capabilities of the adversary will make it possible to specify the properties of security protocols more precisely. Of course, it may be the case that to prove correctness of a security protocol with respect to certain types of adversaries (for example, polynomial-time bounded adversaries), we will not be able to do much within the logic; we will need to appeal to techniques developed in the cryptography community. However, we believe that it may well be possible to extend current model-checking techniques to handle more restricted adversaries (for example, Dolev-Yao extended with random guessing). This is a topic that deserves further investigation. In any case, having a logic in which we can specify the abilities of adversaries is a necessary prerequisite to using model-checking techniques.
Acknowledgments

This research was inspired by discussions between the first author, Pat Lincoln, and John Mitchell on a wonderful hike in the Dolomites. We also thank Sabina Petride for useful comments. The authors were supported in part by NSF under grant CTC-0203535, by ONR under grants N00014-00-1-03-41 and N00014-01-10-511, by the DoD Multidisciplinary University Research Initiative (MURI) program administered by the ONR under grant N00014-01-1-0795, and by AFOSR under grant F4962002-1-0101.
A Algorithms
Dolev-Yao knowledge algorithm:

  A_i^DY(has_i(m), ℓ)
    K ← keysof(ℓ)
    for each recv(m′) in ℓ do
      if submsg(m, m′, K) then return "Yes"
    return "No"

  submsg(m, m′, K)
    if m = m′ then return true
    if m′ is {|m_1|}_k and k^{-1} ∈ K then return submsg(m, m_1, K)
    if m′ is m_1 · m_2 then return submsg(m, m_1, K) ∨ submsg(m, m_2, K)
    return false

  getkeys(m, K)
    if m ∈ K then return {m}
    if m is {|m_1|}_k and k^{-1} ∈ K then return getkeys(m_1, K)
    if m is m_1 · m_2 then return getkeys(m_1, K) ∪ getkeys(m_2, K)
    return {}

  keysof(ℓ)
    K ← initkeys(ℓ)
    loop until no change in K
      K ← ∪_{recv(m) ∈ ℓ} getkeys(m, K)
    return K
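As a cross-check of the pseudocode, here is a compact Java rendering of submsg. It is our own sketch, not part of the paper: the message algebra is reduced to atoms, concatenations, and encryptions under atomic keys, and the key-inverse convention is a simplification.

  import java.util.*;

  abstract class Msg {}

  final class Atom extends Msg {
      final String name;
      Atom(String name) { this.name = name; }
      @Override public boolean equals(Object o) { return o instanceof Atom && ((Atom) o).name.equals(name); }
      @Override public int hashCode() { return name.hashCode(); }
  }

  final class Concat extends Msg {                    // m_1 · m_2
      final Msg left, right;
      Concat(Msg l, Msg r) { left = l; right = r; }
      @Override public boolean equals(Object o) {
          return o instanceof Concat && ((Concat) o).left.equals(left) && ((Concat) o).right.equals(right);
      }
      @Override public int hashCode() { return 31 * left.hashCode() + right.hashCode(); }
  }

  final class Encrypt extends Msg {                   // {|m|}_k
      final Msg body; final Atom key;
      Encrypt(Msg b, Atom k) { body = b; key = k; }
      @Override public boolean equals(Object o) {
          return o instanceof Encrypt && ((Encrypt) o).body.equals(body) && ((Encrypt) o).key.equals(key);
      }
      @Override public int hashCode() { return 31 * body.hashCode() + key.hashCode(); }
  }

  final class DolevYao {
      // submsg(m, m', K): can m be extracted from m' given the set K of known keys?
      static boolean submsg(Msg m, Msg mp, Set<Atom> keys) {
          if (m.equals(mp)) return true;
          if (mp instanceof Encrypt) {
              Encrypt e = (Encrypt) mp;
              // peel off an encryption only if the inverse of its key is known
              return keys.contains(inverse(e.key)) && submsg(m, e.body, keys);
          }
          if (mp instanceof Concat) {
              Concat c = (Concat) mp;
              return submsg(m, c.left, keys) || submsg(m, c.right, keys);
          }
          return false;
      }

      // Hypothetical convention: "k" and "k-inv" are each other's inverses.
      static Atom inverse(Atom k) {
          String n = k.name;
          return new Atom(n.endsWith("-inv") ? n.substring(0, n.length() - 4) : n + "-inv");
      }
  }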
Lowe knowledge algorithm:

  A_i^L(has_i(m), ℓ)
    if A_i^DY(has_i(m), ℓ) = "Yes" then return "Yes"
    if guess(m, ℓ) then return "Yes"
    return "No"

  guess(m, ℓ)
    H ← reduce({m′ : recv(m′) in ℓ} ∪ initkeys(ℓ)) ∪ {m}
    reds ← {}
    loop until reductions(H) − reds is empty
      (S, l, v) ← pick an element of reductions(H) − reds
      if ∃(S′, l′, v) ∈ reds s.t. (S′, l′) ≠ (S, l) then return "Yes"
      if v ∈ H then return "Yes"
      if v ∈ K and v^{-1} ∈ H then return "Yes"
      reds ← reds ∪ {(S, l, v)}
      H ← H ∪ {v}
    return "No"

  reduce(H)
    loop until no change in H
      r ← reductions(H)
      for each (S, l, v) in r do H ← H ∪ {v}
    return H

  reductions(H)
    reds ← {}
    for each m = m_1 · m_2 in H do
      reds ← reds ∪ {({m}, fst, m_1), ({m}, snd, m_2)}
    for each m_1, m_2 in H do
      if m_2 ∈ K and sub({|m_1|}_{m_2}, H) then reds ← reds ∪ {({m_1, m_2}, enc, {|m_1|}_{m_2})}
      if m_1 is {|m′|}_k and m_2 is k^{-1} then reds ← reds ∪ {({m_1, m_2}, dec, m′)}
    return reds
  sub(m, H)
    if H = {m} then return true
    if H = {m_1 · m_2} then return sub(m, {m_1}) ∨ sub(m, {m_2})
    if H = {{|m′|}_k} then return sub(m, {m′})
    if |H| > 1 and H = {m′} ∪ H′ then return sub(m, {m′}) ∨ sub(m, H′)
    return false
References

1. M. Abadi and C. Fournet. Mobile values, new names, and secure communication. In Proceedings of the 28th ACM Symposium on Principles of Programming Languages (POPL'01), pages 104–115, 2001.
2. M. Abadi and P. Rogaway. Reconciling two views of cryptography (the computational soundness of formal encryption). In Proceedings of the IFIP International Conference on Theoretical Computer Science (IFIP TCS2000), volume 1872 of Lecture Notes in Computer Science, pages 3–22. Springer-Verlag, 2000.
3. M. Abadi and M. R. Tuttle. A semantics for a logic of authentication. In Proc. 10th ACM Symp. on Principles of Distributed Computing, pages 201–216, 1991.
4. R. Accorsi, D. Basin, and L. Viganò. Towards an awareness-based semantics for security protocol analysis. In Jean Goubault-Larrecq, editor, Electronic Notes in Theoretical Computer Science, volume 55. Elsevier Science Publishers, 2001.
5. M. Bellare and P. Rogaway. Entity authentication and key distribution. In Proceedings of the 13th Annual International Cryptology Conference (CRYPTO '93), volume 773 of Lecture Notes in Computer Science, pages 232–249, 1993.
6. P. Bieber. A logic of communication in hostile environment. In Proceedings of the Computer Security Foundations Workshop, pages 14–22. IEEE Computer Society Press, 1990.
7. M. Burrows, M. Abadi, and R. Needham. A logic of authentication. ACM Transactions on Computer Systems, 8(1):18–36, 1990.
8. E. M. Clarke, S. Jha, and W. Marrero. Using state space exploration and a natural deduction style message derivation engine to verify security protocols. In Proceedings of the IFIP Working Conference on Programming Concepts and Methods (PROCOMET), 1998.
9. D. Dolev and A. C. Yao. On the security of public key protocols. IEEE Transactions on Information Theory, 29(2):198–208, 1983.
10. N. Durgin, J. Mitchell, and D. Pavlovic. A compositional logic for protocol correctness. In Proceedings of the Computer Security Foundations Workshop, pages 241–255. IEEE Computer Society Press, 2001.
11. S. Even, O. Goldreich, and A. Shamir. On the security of ping-pong protocols when implemented using the RSA. In Proceedings of Crypto'85, volume 218 of Lecture Notes in Computer Science, pages 58–72. Springer-Verlag, 1985.
12. R. Fagin and J. Y. Halpern. Belief, awareness, and limited reasoning. Artificial Intelligence, 34:39–76, 1988.
13. R. Fagin and J. Y. Halpern. Reasoning about knowledge and probability. Journal of the ACM, 41(2):340–367, 1994.
14. R. Fagin, J. Y. Halpern, Y. Moses, and M. Y. Vardi. Reasoning about Knowledge. The MIT Press, 1995.
15. L. Gong, R. Needham, and R. Yahalom. Reasoning about belief in cryptographic protocols. In Proc. IEEE Symposium on Security and Privacy, pages 234–248, May 1990.
16. A. D. Gordon and A. Jeffrey. Authenticity by typing for security protocols. In Proceedings of the 14th IEEE Computer Security Foundations Workshop (CSFW 2001), pages 145–159. IEEE Computer Society Press, 2001.
17. J. Y. Halpern, Y. Moses, and M. R. Tuttle. A knowledge-based analysis of zero knowledge. In Proc. 20th ACM Symp. on Theory of Computing, pages 132–147, 1988.
18. J. Y. Halpern and M. R. Tuttle. Knowledge, probability, and adversaries. Journal of the ACM, 40(4):917–962, 1993.
19. S. Kripke. A semantical analysis of modal logic I: normal modal propositional calculi. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 9:67–96, 1963. Announced in Journal of Symbolic Logic, 24, 1959, p. 323.
20. P. Lincoln, J. C. Mitchell, M. Mitchell, and A. Scedrov. A probabilistic poly-time framework for protocol analysis. In Proceedings of the ACM Conference on Computer and Communications Security, pages 112–121, 1998.
21. G. Lowe. An attack on the Needham-Schroeder public-key authentication protocol. Information Processing Letters, 56:131–133, 1995.
22. G. Lowe. Some new attacks upon security protocols. In Proc. 9th IEEE Computer Security Foundations Workshop, pages 162–169, 1996.
23. G. Lowe. Analysing protocols subject to guessing attacks. In Proceedings of the Workshop on Issues in the Theory of Security (WITS'02), 2002.
24. W. Mao. An augmentation of BAN-like logics. In Proceedings of the 8th IEEE Computer Security Foundations Workshop, pages 44–56. IEEE Computer Society Press, 1995.
25. C. Meadows. The NRL protocol analyzer: An overview. Journal of Logic Programming, 26(2):113–131, 1996.
26. M. Merritt and P. Wolper. States of knowledge in cryptographic protocols. Unpublished manuscript, 1985.
27. J. K. Millen, S. C. Clark, and S. B. Freedman. The Interrogator: Protocol security analysis. IEEE Transactions on Software Engineering, 13(2):274–288, 1987.
28. J. Mitchell, M. Mitchell, and U. Stern. Automated analysis of cryptographic protocols using murϕ. In Proceedings of the IEEE Symposium on Research in Security and Privacy, pages 141–151. IEEE Computer Society Press, 1997.
29. J. H. Moore. Protocol failures in cryptosystems. Proceedings of the IEEE, 76(5):594–602, 1988.
30. Y. Moses. Resource-bounded knowledge. In M. Y. Vardi, editor, Proc. Second Conference on Theoretical Aspects of Reasoning about Knowledge, pages 261–276. Morgan Kaufmann, San Francisco, Calif., 1988.
31. R. M. Needham and M. D. Schroeder. Using encryption for authentication in large networks of computers. Communications of the ACM, 21(12):993–999, 1978.
32. L. C. Paulson. The inductive approach to verifying cryptographic protocols. Journal of Computer Security, 6(1/2):85–128, 1998.
33. P. Y. A. Ryan and S. A. Schneider. An attack on a recursive authentication protocol: A cautionary tale. Information Processing Letters, 65(1):7–10, 1998.
34. S. Stubblebine and R. Wright. An authentication logic supporting synchronization, revocation, and recency. In Proc. Third ACM Conference on Computer and Communications Security, pages 95–105, 1996.
35. P. Syverson. A logic for the analysis of cryptographic protocols. NRL Report 9305, Naval Research Laboratory, 1990.
Secure Self-certified Code for Java

M. Debbabi¹,², J. Desharnais¹, M. Fourati¹, E. Menif¹, F. Painchaud¹, and N. Tawbi¹

¹ LSFM Research Group, Computer Science and Software Engineering Department, Laval University, Quebec, Canada
² Panasonic Information and Networking Technologies, Princeton, New Jersey, USA
Abstract. Java is widely used on the Internet, which makes it a target of choice for malicious attackers. This fact stimulates the research work in the field of Java program verification in order to consolidate both Java safety and security. The results achieved so far in this sector are very promising and effective. Nevertheless, the current Java security architecture still suffers from some weaknesses in terms of flexibility, efficiency and robustness. We therefore propose an architecture, named the Java Certifying Compilation (JACC) system, for secure compilation and execution of Java mobile code. This architecture is based on a synergy between certifying compilation and formal automatic verification à la model-checking. Indeed, we have extended the certifying compilation and model-checking approaches to enforce high-level security policies in Java. In this paper, we present our work in designing and implementing the JACC system, which includes a certifying compiler, a security policy specification language and an extended bytecode verifier that integrates a model-checker. This system is flexible, efficient and robust and can be used to enforce both the safety and the security of Java programs.
1 Introduction
Information technology is becoming, more and more, a vitally important underpinning of our economy and our society. It is embedded in our everyday applications and animates a wide class of systems that range from small to large and from simple to extremely sophisticated. Actually, information systems increasingly govern nearly every aspect of our lives. This omnipresence is amplified by the dazzling expansion of the Internet, the World Wide Web, Java, parallel and distributed systems, and mobile computation. Lately, a surge of interest has been expressed in mobile code technology, which stands for systems in which general-purpose executable contents can run in remote locations. The idea of mobile code is not new, but it becomes an invaluable cutting-edge technology in the presence of the World Wide Web and Java compiled units, i.e., the so-called applets. This combination becomes a synergy that caters for easy, natural and flexible development and distribution of intranet/Internet, concurrent and distributed applications. Accordingly, plenty of systems have been advanced for creating and using mobile code. The most prominent systems are Java, JavaScript, VBScript and ActiveX.
The Java language [1, 2, 3] emerged as a multi-paradigmatic language that supports mobile code. It obeys the paradigm "write once, run everywhere". Indeed, programs in a compiled, platform-independent form (class file or Java bytecode) can migrate over the network. A Java bytecode can be executed on any platform that is endowed with a Java virtual machine (JVM) that emulates a processor architecture. The most popular way to achieve code mobility in Java is by putting links in web pages to Java class files (usually referred to as applets). Mobile code in general, and Java in particular, poses severe, and very interesting, challenges in terms of security, reliability and performance. The security issue is of paramount importance. The host client accepting a mobile code must check that the latter will not affect the secrecy (by leaking sensitive information), the integrity (by corrupting information), the authentication (by impersonating authorized principals), the availability (by denying service to legal users), etc., of the system. The current trend in mobile code and Java security is defensive (adding layers of firewalling, cryptographic protocols, network partitions, etc.), restrictive (sandbox models, rigid security policies, etc.) and ad hoc (dynamic checks, checksums, scanning, etc.). Consequently, there is a need for a security architecture that is flexible, application-dependent, efficient and practical. Moreover, such an architecture must be based on robust theoretical foundations.

The Java language uses multi-level mechanisms [4] in order to ensure its protection. More precisely, those mechanisms are embedded at the language, compiler, bytecode verifier and security manager levels. This architecture is one of the best for ensuring safe and secure execution of Java applications. However, it lacks a few interesting properties:

– Flexibility: the present security architecture of the JVM is rigid because it is application-independent. Security policies are only a series of access permissions for resource usage, independently of the execution context.
– Efficiency: security property verification is performed exclusively dynamically by the JVM.
– Robustness: the security architecture of the JVM is vaguely and sparsely specified in the official documentation published by JavaSoft Inc. The inner workings of the bytecode verifier are not well documented. Moreover, numerous errors have been found in the bytecode verifier, which shows that it was not developed on robust theoretical foundations.

In our current research, we are concerned with the formal static verification of expressive security properties written for Java programs, in order to enhance the current Java Security Architecture. To address this problem, we have studied several approaches used to ensure safe local execution of untrusted code. Among them, certifying compilation seems to be a very promising approach; it is based on programming language theory and implementation. The main idea of certifying compilation is to generate certified code that can be checked by the consumer, using a verifier, for compliance with a pre-established safety policy.
It is with this strategy in mind that we have developed this research project, named the Java Certifying Compilation (JACC) system. We have elaborated an architecture for secure compilation and execution of Java mobile code that is based on an extension of the certifying compilation approach. This architecture solves the three main problems previously enumerated. We have therefore designed and implemented a Java certifying compiler that inserts new type annotations into Java class files, a language for the specification of expressive security policies, and a bytecode verifier that integrates a model-checker. The rest of the paper is organized as follows. We begin with a description of related work (Section 2). We then present our approach (Section 3). In Section 4, we introduce JACC. We continue with the description of the security policy specification language (Section 5), where we focus on its syntax and semantics. Subsequently, in Section 6, we describe the JACC bytecode verifier, including the bytecode/JACC annotations correspondence verification (Section 6.1) and the JACC model-checker (Section 6.2). Section 7 proposes a case study. Finally, Section 8 offers a few concluding remarks and a discussion of future research work.
2 Related Work
To meet the growing needs of information technology security, several approaches have been proposed to ensure secure local execution of applications. Many of these are based on dynamic and static analyses. The latter, generally using only the source code or object code of an application, prove very effective in many cases and save execution time compared to dynamic approaches. Nevertheless, most static analysis methods are difficult to develop. This makes the approaches based on certification interesting. The principle of certifying compilation consists in generating from a source code, by means of a certifying compiler, an object code that incorporates a certificate of safety or security. This certificate makes it possible for a user of the object code to establish, by means of an object code verifier, the conformity of the object code with an implicitly or explicitly specified security policy. Contrary to the traditional approach, where code safety verification falls entirely to the consumer, this task is distributed between the producer and the consumer. Self-certified code approaches can be structured into three different classes: the PCC (Proof-Carrying Code), TAL (Typed Assembly Language) and ECC (Efficient Code Certification) models. In the PCC model [5, 6], the consumer requires from the producer a proof of safety (certificate) attesting that the received code complies with definite, published safety rules. Afterwards, the task of the consumer consists in validating this proof, thereby ensuring that the received code can be safely executed. The certificate in the TAL model [7, 8] is a type annotation that contains a static approximation of the dynamic behavior of the program. Before any execution of the object code, a verification process is activated to check the
conformity of the assembly code with the integrated type annotations. This verification is done using a verifier that tests the safety of the execution stack, the control flow and the typing. Once the code has been checked, the annotations can be removed and the code can be safely executed. The ECC model [9] sacrifices expressivity and generality in favor of performance. This is why ECC focuses mainly on fundamental safety aspects (control flow, memory accesses and execution stack management). This approach includes two principal components: a certifying compiler and a verifier. The former generates an object code and a safety certificate. This certificate corresponds to a set of structured annotations. The verifier uses these annotations to check a set of conditions.

In the following, we examine the state of the art in Java program verification. Klaus Havelund and his coauthors have worked on the elaboration of a framework for the verification and debugging of Java programs [10]. This framework is based on model-checking. Their system is specifically designed to verify multithreaded programs. Their work consists in the definition of Java PathFinder (JPF), a Java to PROMELA (PROcess MEta LAnguage) translator. PROMELA is the input language of the model-checker Spin. This work is interesting but has a few limitations. Indeed, it is not easily extensible to large industrial applications. Furthermore, more work is needed to translate all aspects of Java to PROMELA. This is a hard task, because Java and PROMELA are totally different languages, and the latter does not necessarily support everything that the former does. Cormac Flanagan and Stephen N. Freund present a concrete example of annotations that make it possible to detect thread synchronization errors in Java programs [11]. The programmer must manually add these annotations to the Java source code. An adapted Java compiler can then verify, during the compilation process, that there are no concurrency errors (also called race conditions). The authors do not consider the whole Java language. Indeed, they have isolated a subset of the Java language that is sufficient to handle concurrent Java programs. Christopher Colby, Peter Lee, George C. Necula and their coauthors explore the idea of developing a certifying compiler for Java [12]. They concentrate their efforts on type safety only, which is a safety property and not a security property. Eran Yahav presents a new verification technique, based on parametric frameworks, with which it is possible to verify given safety properties of concurrent Java programs [13]. The operational semantics of the instructions and conditional expressions of the Java language is specified using a meta-language based on first-order logic. The same meta-language is also used to express safety properties. The general idea of this technique is to generate multiple program configurations from an original configuration. These configurations are calculated by applying the semantics of the instructions and expressions of the Java language. It is finally verified that all these reachable configurations satisfy the given safety properties. The safety properties considered in this system are deadlocks, interference, shared abstract data types and illegal thread interactions.
3 JACC Approach
In the JACC architecture, the Java language stays totally unchanged. Therefore, the Java source code that is compiled is the exact same source code that Java developers are used to writing. However, the Java compiler is modified. Our certifying compiler is called JACC. Instead of producing only the bytecode and the standard type annotations, it also produces new annotations called JACC annotations. Instead of being directly executed by the Java Virtual Machine, the augmented class files are verified by a modified Java bytecode verifier called the JACC bytecode verifier. This verifier guarantees both safety properties and high-level security policies. The safety properties stay the same, but the security policies are now expressed in a custom, more expressive language based on the modal µ-calculus. These new security policies are called JACC Security Policies. Once the bytecode is successfully verified, it can be executed by a traditional Java Virtual Machine. Since high-level security is already guaranteed, the Java Security Manager can be turned off, and efficiency is therefore increased. Finally, secure execution of the bytecode is also achieved. JACC and its annotations, security policies and bytecode verifier are presented in the following sections.
4 JACC Certifying Compiler
JACC is a prototype of a certifying compiler for Java source code. It generates, in addition to the bytecode, new annotations (a certificate) that deal with security concerns. This certificate is statically verified by the JACC bytecode verifier. The JACC certifying compiler was developed by extending an existing Java compiler called Jikes (IBM) [14].

4.1 JACC Annotations
The annotations try to capture the behavior of a piece of software and represent it in an abstract form. Presently, five main categories of annotations are considered: file annotations describing file manipulations, system call annotations to bring out calls to the system, network access annotations (URL, socket, datagram), thread annotations describing thread manipulations, and window annotations describing window manipulations. JACC generates annotations for each opcode of a method. This helps to verify that the annotations and the bytecode correspond and that neither has been altered when the user receives the classes (see Section 6.1). Table 1 gives the syntax of the annotations.

Table 1. Annotations Syntax

  Γ ::= Γ ; Γ′ | rec(Γ, Γ′) | if(Γ, Γ′) | if(Γ, Γ′, Γ′′) | try(Γ, Ω) | try(Γ, Ω, finally(Γ′)) | switch(Γ, Φ) | switch(Γ, Φ, default(Γ′)) | #n(Λ)
  Ω ::= catch(Γ) | Ω, catch(Γ)
  Φ ::= case(Γ) | Φ, case(Γ)
  Λ ::= ∅ | or(Λ, Λ′) | α | β | ϕ | κ | ω | σ | λ | θ
  α ::= ∅ | value(x, t) | variable(V_i, t)
  ϕ ::= ∅ | open(x, F) | read(x, F) | write((x_1, k), (x_2, F)) | delete(x, F)
  κ ::= ∅ | open(x, C) | read(x, C) | write((x_1, k), (x_2, C)) | delete(x, C)
  ω ::= ∅ | open(x, W) | read(x, W) | write((x_1, k), (x_2, W))
  σ ::= ∅ | system call
  λ ::= ∅ | native method call
  θ ::= ∅ | create(V_i) | current(V_i) | start(V_i) | stop(V_i) | join(V_i) | destroy(V_i) | interrupt(V_i) | sleep(V_i) | suspend(V_i) | resume(V_i)

In this syntax, Γ corresponds to the annotations of a method, and Λ corresponds to those of a single opcode. The term α represents the memory effects, ϕ the file effects, κ the communication effects, ω the window effects, σ the system calls, λ the calls to native methods, and θ the thread effects. The separator ; delimits the annotations of
different opcodes. The expression rec(Γ, Γ′) represents a loop statement, where Γ corresponds to the annotations of the loop condition and Γ′ to the annotations of the loop body. The expressions if(Γ, Γ′) and if(Γ, Γ′, Γ′′) represent an if statement, where Γ corresponds to the annotations of the condition, Γ′ to the annotations of the then branch, and Γ′′ to those of the else branch. The expressions try(Γ, Ω) and try(Γ, Ω, finally(Γ′)) represent a try . . . catch . . . finally statement. The try block annotations are represented by Γ, Ω corresponds to the annotations representing the list of catch blocks, and finally(Γ′) corresponds to the annotations representing the finally block. The expressions switch(Γ, Φ) and switch(Γ, Φ, default(Γ′)) represent the annotations of a switch statement, with Γ being the annotations of the condition, Φ the annotations of the different cases, and default(Γ′) the annotations of the default block. The annotation #n(Λ) is used to assign an opcode offset (n) to the corresponding opcode annotation (Λ). The annotation or(Λ, Λ′) represents an alternative between two annotations Λ and Λ′. It occurs when the same variable has been assigned two different values in an if statement, for example. Let us now examine the atomic annotations. The first category of atomic annotations captures memory effects: value(x, t) corresponds to an opcode that declares a literal, where x is the value of the literal and t is its type. The annotation variable(V_i, t) corresponds to an uninitialized field, a method parameter, or a result returned by a method or a constructor that we do not consider critical. V_i is the variable name (where i is the variable number) and t is its type. The file effects capture file openings, deletions, and readings from or writings to a file. The network effects capture connection openings, sendings or receivings. The window effects are useful especially when the software tries to get confidential information by simulating interfaces generally used by the operating system. For these three categories we use the same annotations: open(x, K) for opening, read(x, K) for reading/receiving, write((x_1, k), (x_2, K)) for writing/sending, and
delete(x, K) for deleting (files and connections only). In these annotations, K is F for files, C for connections, and W for windows, and x, x_1 and x_2 represent the file name, the connection or the window number. The couple (x_1, k) in write represents the information to write. This information could be read from a file, a connection or a window, and thus (x_1, k) is identical to the parameters (x, K) of the read annotation. The annotations system call and native method call are respectively used to capture calls to the system and to native methods, which could be malicious, depending on the actions of these methods. Threads could be used to introduce attacks such as denial of service (DoS). Thread annotations capture thread manipulations to detect such attacks. In these annotations, V_i is a name assigned to a thread to distinguish between different threads. Depending on the effects we are trying to bring out (file manipulations, network accesses, thread manipulations, . . . ), not all opcodes are considered relevant. Irrelevant opcodes are given the nil annotation (∅).
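To suggest how the syntax of Table 1 might be realized inside a compiler, here is a minimal Java sketch of the annotation AST; it is our own illustration, only a few constructors are shown, and the class names are ours rather than JACC's.

  interface Gamma {}   // method-level annotations Γ
  interface Lambda {}  // opcode-level annotations Λ

  final class Seq implements Gamma {              // Γ ; Γ′
      final Gamma first, second;
      Seq(Gamma f, Gamma s) { first = f; second = s; }
  }

  final class Rec implements Gamma {              // rec(Γ, Γ′): loop condition and body
      final Gamma condition, body;
      Rec(Gamma c, Gamma b) { condition = c; body = b; }
  }

  final class AnnotatedOpcode implements Gamma {  // #n(Λ)
      final int offset; final Lambda annotation;
      AnnotatedOpcode(int n, Lambda a) { offset = n; annotation = a; }
  }

  final class Nil implements Lambda {}            // ∅

  final class FileOpen implements Lambda {        // open(x, F)
      final String resource;
      FileOpen(String x) { resource = x; }
  }

  final class FileDelete implements Lambda {      // delete(x, F)
      final String resource;
      FileDelete(String x) { resource = x; }
  }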
4.2 Critical Methods
For opcodes dealing with method or constructor invocations, the annotations depend on whether the method or the constructor is critical. A method or a constructor is considered critical if it is used to manipulate files, connections, threads, windows or system calls. Indeed, such manipulations could be malicious and affect the secrecy, integrity, etc., of the system. For instance, methods for opening, reading from or writing to files are considered critical, since they could manipulate confidential files. We have collected a non-exhaustive set of critical methods from the Java API. The rationale underlying this choice is that security breaches at the application level cannot be achieved unless some resource is accessed. The resource could be a file, a thread, a window, the network, the screen, etc. In the case of a non-critical constructor or method that does not return void, we generate the annotation variable(V_i, t), where V_i is a new variable name (i is a number incremented each time a new variable is created). If a non-critical method returns void, we generate a nil annotation. Any expression inside a method could use class fields or local variables. To generate the appropriate annotations for these expressions, we use an environment that helps us keep track of the annotations of program variables and propagate them when needed. For example, if we have the statement f.delete(), and in the environment the annotation open(name, F) is bound to the variable f, then we are able to generate the annotation delete(name, F) for this statement.
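The environment just described can be pictured as a simple map from program variables to their current annotations. The following toy Java sketch is our own, with a string-based encoding of annotations; it shows how the annotation of f.delete() would be derived from the annotation previously bound to f.

  import java.util.HashMap;
  import java.util.Map;

  final class AnnotationEnvironment {
      private final Map<String, String> bindings = new HashMap<>();

      void bind(String variable, String annotation) { bindings.put(variable, annotation); }

      // For f.delete(): if f is bound to open(name, F), produce delete(name, F).
      String annotateDelete(String variable) {
          String a = bindings.get(variable);
          if (a != null && a.startsWith("open(")) {
              return "delete(" + a.substring("open(".length());
          }
          return "delete(?, other)"; // resource unknown at compile time
      }

      public static void main(String[] args) {
          AnnotationEnvironment env = new AnnotationEnvironment();
          env.bind("f", "open(name, F)");
          System.out.println(env.annotateDelete("f")); // prints delete(name, F)
      }
  }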
5 JACC Security Policies Specification Languages
In this section, we introduce two languages that are helpful for defining security policies. The first one, the rather low-level Security Policies Specification Language, is based on the modal µ-calculus [15]. It is internally used by the certifying
compilation system that we have developed. The second one, the high-level Security Policies Specification Language, is used by the end user of the system to define security policies. Those security policies are properties that enhance the current Java Security Architecture [16]. Indeed, they are new kinds of properties that must be satisfied by programs in order for the Java platform to execute them. For example, a security policy could specify that some confidential file can be read by some principal but cannot be sent over the network. Such a restriction cannot be specified using the current Java Security Architecture. It is therefore of paramount importance to tackle this problem if we want to prevent such sophisticated attacks. The Java certifying compilation system that we are developing, and particularly its security policies specification languages, offers greater flexibility to the end user, who can now define more complex and complete security policies.

5.1 Low-Level Security Policies Specification Language
The low-level Security Policies Specification Language internally used by the certifying compilation system is based on the modal µ-calculus. Its syntax, given in Table 2, is based on the syntax of the modal µ-calculus [17]. The main extension we have made consists in parameterizing the actions so as to propagate data information in the formulas. This helps us achieve better precision in the verification process. For example, if a file is read and then a file is sent over the network, we can determine whether it is the same file or not. This means that it is not necessary to block all files at the network level to be sure that a given file will not be sent.

Table 2. Syntax of the Low-level Security Policies Specification Language

  φ ::= t | f | Z | ¬φ | φ_1 ∧ φ_2 | φ_1 ∨ φ_2 | [K]φ | ⟨K⟩φ | µZ.φ | νZ.φ
We now explain the syntax of the low-level Security Policies Specification Language. We assume that a finite set of actions Act is given (see Table 3). The formulas t and f respectively represent the assertions true and false. The formula Z represents a variable; there is a countably infinite set of variables. The ¬ operator gives the negation of a given formula. The operators ∧ and ∨ respectively return the conjunction (the "and") and the disjunction (the "or") of two formulas. The modal operators [K] and ⟨K⟩, where K is a finite set of actions in Act, respectively express necessity and possibility. When K = {a} for some a (that is, when K contains a single action), we simply write [a] instead of [{a}] and ⟨a⟩ instead of ⟨{a}⟩. The operators µ and ν are respectively the least and greatest fixed-point operators. They are the most complex but also the most powerful operators. Among other things, they let us represent infinite properties in a finite manner.
Finally, we mention that, as for the µ-calculus, the operators of the language are not independent, since the following relations exist between them:

  t = νZ.Z                        f = µZ.Z
  µZ.φ = ¬νZ.¬φ[¬Z/Z]             νZ.φ = ¬µZ.¬φ[¬Z/Z]
  φ_1 ∨ φ_2 = ¬(¬φ_1 ∧ ¬φ_2)      φ_1 ∧ φ_2 = ¬(¬φ_1 ∨ ¬φ_2)
  ⟨K⟩φ = ¬[K]¬φ                   [K]φ = ¬⟨K⟩¬φ

In these expressions, φ[¬Z/Z] is the formula obtained by replacing all free occurrences of Z in φ by ¬Z. Let us examine the syntax Act of actions. We are interested in actions that use the resources of the system. Each action in Act abstracts a particular behavior of a Java program, e.g., opening a file, deleting a file, or sending information over the network. Act is presented in Table 3. Most actions are parameterized; this is useful to link data to the actions that use them.

Table 3. The Syntax of Actions (Act)

  Act ::= open(n, k) | read(n, k) | write((n, k), (n′, k′)) | delete(n, k) | native method call | system call | create(v) | current(v) | start(v) | stop(v) | join(v) | destroy(v) | resume(v) | sleep(v) | suspend(v) | interrupt(v)
The symbol n designates a file, a window, a connection or an unknown resource. It can be a constant or a variable that will be initialized during the verification process. The use of variables in actions abstracts the internal program behavior from the user. When the resource is unknown, n must be a variable. The symbol k is used to identify the resource and can take the following values: "file" for a file, "connection" for a connection, "window" for a window and "other" for an unknown resource. The open(n, k) action, for example, represents the opening of a resource n that can be a string constant or a variable representing this resource. The latter can be a file, a window, a connection or an unknown resource. The write((n, k), (n′, k′)) action corresponds to writing data issuing from a file, a window, a connection or an unknown resource, abstracted by n, into a file, a window, a connection or an unknown resource, abstracted by n′. The system call action is the abstraction of a system call. The native method call action is the abstraction of a native method call. The remaining actions represent thread actions. For example, the create(v) action is the creation of a new thread represented by v, which must be a variable.
We now give an example of the use of the low-level Security Policies Specification Language of Table 2. Consider the following formula φ_1, which expresses the fact that after a file has been read, its content cannot, immediately or later on, be written into a window:

  φ_1 = νZ.([read(v, file)](νZ′.([write((v, file), (v′, window))]f ∧ [−]Z′)) ∧ [−]Z)

where [−] denotes the use of the modality with an arbitrary action. The formula stipulates that each time the action read(v, file), representing the reading of any file abstracted by the variable v, is executed, the action representing the writing of that file into a window, write((v, file), (v′, window)), cannot be executed, neither immediately nor later on, after the execution of other actions.
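Looking ahead to Section 6.2, where Algorithm 2 first converts policies to positive normal form, the dualities listed earlier let negations be pushed inward until they apply only to atomic propositions. As an illustrative computation of our own (it is not taken from the paper), negating an invariant of the shape of φ_1 gives:

  ¬νZ.([a]φ ∧ [−]Z) = µZ.(⟨a⟩¬φ ∨ ⟨−⟩Z)

where the fixed point flips from ν to µ, necessity flips to possibility, and the recursion variable is re-negated according to φ[¬Z/Z].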
5.2 High-Level Security Policies Specification Language
The low-level security policies specification language is very useful for specifying system properties and expressing the security policies internally used by our certifying compilation system. On the other hand, its complexity makes it difficult to use. This is the reason why we proposed a simple, user-friendly, easy-to-learn, high-level language, which hides the technical details that the end user would otherwise need to master in order to define security policies. To achieve this objective, we introduced several macros that abstract the particularities, and also the difficulties, of the low-level security policies specification language. These macros express basic properties that can be used to easily define sophisticated security policies. The syntax of the high-level security policies specification language is presented in Table 4.

Table 4. Syntax of the High-level Security Policies Specification Language

  P ::= true | false | not(P) | and(P_1, P_2) | or(P_1, P_2) | always(P) | eventually(P) | never(P) | loop(K_h) | implies(P_1, P_2) | possible(K_h, P) | necessarily(K_h, P)
The symbol P represents a security property, and the term K_h represents a finite set of actions in Act_h (see Table 5). The statement not(P) stipulates that the property P must not be satisfied. The statement and(P_1, P_2) means that both properties P_1 and P_2 must be satisfied. The statement always(P) means that the property P must be satisfied at all the program points reachable from the current one. The statement necessarily(K_h, P) stipulates that one of the actions in K_h is immediately inevitable and that any program point to which this action may lead satisfies the property P. The statement loop(K_h) means that one of the actions in K_h can be executed in a loop from a program point reachable from the current one.
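The paper does not spell out how these macros expand into the low-level language; the following correspondences are our own illustrative sketch, offered only as a plausible reading consistent with the fixed-point formula φ_1 of Section 5.1, not as the system's definitive semantics:

  always(P)        ≡ νZ.(P ∧ [−]Z)
  eventually(P)    ≡ µZ.(P ∨ ⟨−⟩Z)
  never(P)         ≡ νZ.(¬P ∧ [−]Z)
  implies(P_1, P_2) ≡ not(P_1) ∨ P_2

Under this reading, always and never unfold as greatest fixed points over the arbitrary-action modality [−], exactly as in φ_1.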
Let us examine the syntax of actions Act_h used in the high-level language. These actions slightly differ from those introduced in Table 3 of Section 5.1 (Act). To simplify the writing of security properties, we have modified some actions. Indeed, the sending over the network of some data issuing from a resource n is represented by the action send((n, k), n′), instead of being treated as the writing into a file attached to a network connection, which is the case in the Java language. The syntax of actions Act_h is presented in Table 5. As for Act, the symbol n designates a resource and the symbol k identifies the resource.

Table 5. The Syntax of Actions (Act_h)

  Act_h ::= openf(n) | readf(n) | writef((n, k), n′) | deletef(n) | openc(n) | send((n, k), n′) | receive(n) | native method call | system call | create(v) | current(v) | start(v) | stop(v) | join(v) | destroy(v) | interrupt(v) | sleep(v) | suspend(v) | resume(v) | openw(v) | readw(v) | writew((n, k), v)
5.3 Semantics
The semantics is defined using a satisfiability relation. The satisfiability rules of the JACC system direct the process of determining whether a given model or program satisfies a given formula or security policy. Indeed, for any formula representing a security property, we should be able to determine whether a given abstraction of a method satisfies that property. If all the methods of a program satisfy a set of properties, or a security policy, we say that the program satisfies the policy.
6 JACC Bytecode Verifier
The JACC bytecode verifier integrates into one product a traditional Java bytecode verifier and a model-checker (refer to [18] and [19] for greater detail). The Java bytecode verifier is responsible for ensuring basic low-level security properties, also called safety properties, like memory safety, type safety and control flow safety. The model-checker performs high-level security checks, verifying that a given program respects a certain security policy. When someone runs the JACC bytecode verifier, the Java bytecode verifier performs its low-level checks, and then the model-checker verifies that each method of the program respects the current security policy. In fact, the class files generated by the JACC certifying compiler are parsed by a component of the JACC bytecode verifier: the JACC class file parser. Using the JACC annotations, this parser is responsible for producing an abstract representation of the bytecode called a
model. Models of Java programs consist of structured annotations that are extracted from these programs. The structure of the annotations is an abstraction of the control flow structure. The model is used by the bytecode verifier, in conjunction with the model-checker, to verify that it satisfies the JACC security policy. This JACC security policy is also parsed by a component of the JACC bytecode verifier: the JACC security policy parser. This parser contains a translator that is responsible for translating the JACC security policy, expressed in the high-level specification language, into a formula expressed in the low-level specification language. This formula is the one used by the model-checker. Two steps have been added to the original Java verification process: bytecode/JACC annotations correspondence verification and model-checking. Both steps are rather complex but very interesting. They are explained in the following subsections.

6.1 Bytecode/JACC Annotations Correspondence Verification
The bytecode/JACC annotations correspondence verification is a very important step in the verification process of the JACC bytecode verifier, because it ensures that the bytecode included in the JACC class files (augmented Java class files) exactly corresponds to what is expressed by the associated JACC annotations. This is important because the model-checking algorithms, which are executed right after, enforce the security policies by considering only the JACC annotations and not the bytecode. However, what is ultimately executed by the Java Virtual Machine is, of course, the bytecode and not the JACC annotations. Therefore, it must be guaranteed that the critical actions performed in the bytecode are entirely expressed in the associated JACC annotations, and also that all critical actions expressed in the JACC annotations are really performed in the bytecode. This property ensures that the JACC annotations constitute a correct abstraction of the program. The verification step that ensures that the bytecode of a given Java program really corresponds to the associated JACC annotations, and vice versa, is divided into two substeps: correspondence verification and a new dataflow analysis. The correspondence verification substep is very simple. It ensures that each opcode in the bytecode of a Java program has an associated JACC annotation and that each JACC annotation has an associated opcode. This is performed on the basis of the opcode offsets that are included in the JACC annotations. Therefore, no semantic analysis of the opcodes or the JACC annotations is performed at this stage. This type of analysis is inherent to the dataflow analysis that follows the correspondence verification substep. The new dataflow analysis substep (Algorithm 1) is more complex, and also more interesting. It makes sure that the semantics of the JACC annotations corresponds to the semantics of the opcodes, and vice versa. This dataflow analysis is in fact similar to the one already used by the traditional Java bytecode verifier to ensure fundamental safety properties.
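The offset-based correspondence substep amounts to a set comparison between opcode offsets and annotation offsets. The following minimal Java sketch is our own illustration of that idea, not JACC's implementation:

  import java.util.*;

  final class CorrespondenceChecker {
      // opcodeOffsets: offsets of every opcode in the method body;
      // annotationOffsets: every n occurring in a #n(Λ) annotation.
      static List<String> check(Set<Integer> opcodeOffsets, Set<Integer> annotationOffsets) {
          List<String> problems = new ArrayList<>();
          for (int o : opcodeOffsets)
              if (!annotationOffsets.contains(o))
                  problems.add("opcode at offset " + o + " has no JACC annotation");
          for (int a : annotationOffsets)
              if (!opcodeOffsets.contains(a))
                  problems.add("annotation #" + a + " refers to no opcode");
          return problems; // an empty list means the substep succeeds
      }
  }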
Algorithm 1: Dataflow Analysis Used by the JACC Bytecode Verifier

Initialization:
  for each opcode o of the current method do
    set the modified flag of o to false;
    set the associated dataflow information of o to unknown;
  end
  set the modified flag of the first opcode of the current method to true;
  set the associated dataflow information of the first opcode of the current method to the information provided by the current method's signature;

Main loop:
  while there is an opcode o with a modified flag set to true in the current method do
    set the modified flag of o to false;
    simulate the execution of o on its associated dataflow information; this consists of verifying the semantics of the JACC annotation associated to o, and if this verification fails, the entire dataflow analysis fails;
    for each successor s of o do
      merge the associated dataflow information of o into the one associated to s; the resulting dataflow information is called r, and the merge operator is the union on sets;
      if r is not equal to the associated dataflow information of s then
        set the associated dataflow information of s to r;
        set the modified flag of s to true;
      end
    end
  end
  the dataflow analysis is a success;
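Algorithm 1 is a standard worklist-style fixpoint computation. As an illustration of its structure only (the opcode abstraction, the string-set facts, and the pluggable simulate predicate are all our own simplifications), it could be rendered in Java as follows:

  import java.util.*;
  import java.util.function.BiPredicate;

  final class JaccDataflow {
      // Hypothetical opcode abstraction: position i in the array is its id,
      // and successors lists the ids of the opcodes control may flow to.
      static final class Op {
          final int id; final int[] successors;
          Op(int id, int... successors) { this.id = id; this.successors = successors; }
      }

      // Returns true iff every simulated opcode passes its annotation check.
      static boolean analyze(Op[] method, Set<String> entryFacts,
                             BiPredicate<Op, Set<String>> simulate) {
          List<Set<String>> facts = new ArrayList<>();
          for (Op ignored : method) facts.add(new HashSet<>()); // "unknown" = empty set
          facts.set(0, new HashSet<>(entryFacts));              // from the method's signature
          Deque<Op> worklist = new ArrayDeque<>();              // stands in for the modified flags
          worklist.add(method[0]);
          while (!worklist.isEmpty()) {
              Op o = worklist.poll();
              if (!simulate.test(o, facts.get(o.id))) return false; // annotation semantics violated
              for (int s : o.successors) {
                  Set<String> merged = new HashSet<>(facts.get(s));
                  merged.addAll(facts.get(o.id));                // merge operator: set union
                  if (!merged.equals(facts.get(s))) {
                      facts.set(s, merged);
                      worklist.add(method[s]);                   // successor flagged as modified
                  }
              }
          }
          return true;
      }
  }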
When the opcode o is a critical constructor or critical method invocation opcode, the list of critical classes and methods used by the JACC certifying compiler must be consulted in order to determine the correct type of JACC annotation that this opcode must be associated with, and also to verify the JACC annotation's parameters. Therefore, the generated JACC annotation represents the appropriate action performed by the critical constructor or method invocation. When a non-critical constructor is invoked, the associated JACC annotation should be a variable, which represents the freshly created object. When a non-critical method is invoked, the associated JACC annotation should be nil if the method returns void, because no value is returned and therefore no annotation must be produced. Finally, the associated JACC annotation should be a variable if the method does not return void, in order to represent the returned value.
6.2 JACC Model-Checker
The model-checker performs high-level security checks by means of a model-checking algorithm, which determines whether or not a transition system M = (S, C, W, R, Sub) is a model for a JACC security policy p0. The transition system M is a quintuple, where:
– S is the set of states of the model, where a state represents a control point of the program's abstract model;
– C is the set of constants ci of the model;
– W is the set of variables wi of the model;
– R is a relation that associates a set of state couples (s, t) with each action a, such that an edge labeled by a exists between s and t; and
– Sub is a set, initially empty, tracking the substituted constants in the model.
The main method (eval) algorithm (Algorithm 2) is a recursive function which calculates the set of states that satisfy a given security policy. Since the parameters of actions can be variables or constants, the function eval computes a set for each action variant. The algorithm follows these steps:
1. Convert the property p0 to its equivalent PNF (positive normal form) p0′. A formula is in positive normal form if negations are applied only to the atomic propositions.
2. Compute the set of states at which p0′ holds (S′ = eval(p0′)).
Based on the set of states that satisfy the formula, the security policy verifier gives binary answers (true/false) to the question of formula satisfaction. If the verifier cannot determine whether the formula is satisfied because of dynamic aspects (manually entered filenames, for example), it reacts conservatively, stating that the formula is not satisfied. In the future, we plan to add more verbose and accurate outputs.
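Step 1 can be sketched as follows: negations are pushed inward using De Morgan's laws until they rest on the atomic propositions (the modal operators dualize in the same way, ¬⟨a⟩p ≡ [a]¬p, and are omitted here for brevity). The small Formula hierarchy is a hypothetical stand-in for the JACC policy AST, not the actual implementation.

abstract class Formula {
    abstract Formula pnf();        // this formula in positive normal form
    abstract Formula negate();     // the negation of this formula, already in PNF
}

class Atom extends Formula {       // atomic proposition, possibly negated
    final String name; final boolean negated;
    Atom(String name, boolean negated) { this.name = name; this.negated = negated; }
    Formula pnf() { return this; }
    Formula negate() { return new Atom(name, !negated); }
}

class And extends Formula {
    final Formula l, r;
    And(Formula l, Formula r) { this.l = l; this.r = r; }
    Formula pnf() { return new And(l.pnf(), r.pnf()); }
    Formula negate() { return new Or(l.negate(), r.negate()); }   // De Morgan
}

class Or extends Formula {
    final Formula l, r;
    Or(Formula l, Formula r) { this.l = l; this.r = r; }
    Formula pnf() { return new Or(l.pnf(), r.pnf()); }
    Formula negate() { return new And(l.negate(), r.negate()); }  // De Morgan
}

class Not extends Formula {
    final Formula f;
    Not(Formula f) { this.f = f; }
    Formula pnf() { return f.pnf().negate(); }     // eliminate the negation
    Formula negate() { return f.pnf(); }           // double negation: ¬¬f ≡ f
}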
Algorithm 2: The eval Function

input  : A transition system representing the model M = (S, C, W, R, Sub), and a security policy p0 containing the variables X1, X2, ..., Xn.
output : A set containing the states that satisfy p0.

Function eval(M, p0)
  variable S, S′;
begin
  switch the form of p0 do
    case true: S′ = S;
    case false: S′ = ∅;
    // Each recursion variable Yi corresponds to a set Si
    case Yi: S′ = Si;
    case ¬p: S′ = S \ eval(M, p);
    case p ∧ q: S′ = eval(M, p) ∩ eval(M, q);
    case p ∨ q: S′ = eval(M, p) ∪ eval(M, q);
    case ⟨delete(n, k)⟩p or [delete(n, k)]p: S′ = evalDelete(M, p, (n, k));
    case ⟨open(n, k)⟩p or [open(n, k)]p: S′ = evalOpen(M, p, (n, k));
    case ⟨read(n, k)⟩p or [read(n, k)]p: S′ = evalPossibleRead(M, p, (n, k));
    case ⟨write((n, k), (n′, k′))⟩p or [write((n, k), (n′, k′))]p: S′ = evalWrite(M, p, (n, k), (n′, k′));
    // Here A can be any thread action.
    case ⟨A(v)⟩p or [A(v)]p: S′ = evalA(M, p, v);
    case ⟨native method call⟩p or [native method call]p: S′ = evalNativeMethodCall(M, p);
    case ⟨system call⟩p or [system call]p: S′ = evalSystemCall(M, p);
    case ⟨nil⟩p or [nil]p: S′ = evalNil(M, p);
  return (S′);
end

7 Case Study

In this section, an example that exercises the main components of the JACC system is introduced and studied.

Example of a Java Program

import java.net.*;
import java.io.*;

class TestURL {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://somehost/cgi-bin/somecgi");
        URLConnection uc = url.openConnection();
        uc.setDoOutput(true);
        FileReader f = new FileReader("confidential.txt");
        PrintWriter out = new PrintWriter(uc.getOutputStream());
        int c;
        while ((c = f.read()) != -1) {
            out.print(c);
        }
        out.close();
        f.close();
    }
}

This small Java program opens a network connection to a distant computer and sends it the contents of a file known to be confidential.

7.1 JACC Annotations
When we compile the source code of this example, we get the annotations shown in Table 6 for the method main.

Table 6. Annotations of the Method main

#65537(params(param(variable(V0, “[Ljava/lang/String;”))));
#0(); #3();
#4(value(“http://somehost/cgi-bin/somecgi”, “Ljava/lang/String;”));
#6(open(“http://somehost/cgi-bin/somecgi”, C));
#9();
#10(open(“http://somehost/cgi-bin/somecgi”, C));
#11(open(“http://somehost/cgi-bin/somecgi”, C));
#14();
#15(open(“http://somehost/cgi-bin/somecgi”, C));
#16(value(“1”, “Z”));
#17(); #20(); #23();
#24(value(“confidential.txt”, “Ljava/lang/String;”));
#26(open(“confidential.txt”, F));
#29(); #30(); #33();
#34(open(“http://somehost/cgi-bin/somecgi”, C));
#35(open(“http://somehost/cgi-bin/somecgi”, C));
#38(open(“http://somehost/cgi-bin/somecgi”, C));
#41(); #43();
rec(#53(open(“confidential.txt”, F)); #54(read(“confidential.txt”, F));
    #57(); #58(); #60(value(“-1”, “I”)); #61(),
    #46(open(“http://somehost/cgi-bin/somecgi”, C));
    #48(read(“confidential.txt”, F));
    #50(write((“confidential.txt”, F), (“http://somehost/cgi-bin/somecgi”, C))));
#64(open(“http://somehost/cgi-bin/somecgi”, C));
#66();
#69(open(“confidential.txt”, F));
#70(); #73()
The annotation #65537 represents the parameters of the method; it is needed to verify the correspondence. In this case, we have only one parameter, which is args. The annotation #6 corresponds to the new URL("http://somehost/cgi-bin/somecgi") statement; we assume that a connection is opened when created. This annotation is bound to the url variable and propagated to opcode 11, which represents the invocation of the openConnection() method. The annotation #26 represents the file opening that corresponds to the new FileReader("confidential.txt") statement. To write into the connection, an output stream must be created. This is done by invoking the getOutputStream() method on the URLConnection instance (uc); the annotation corresponding to this invocation is #35. Reading confidential.txt and sending it are done within a loop (while). The while statement is represented by the rec annotation, in which the condition annotations range from opcode 53 to opcode 61 and the loop-body annotations range from opcode 46 to opcode 50. Reading the confidential.txt file in the loop condition (f.read()) is represented by the annotation #54. Sending the character read via the connection amounts to writing it to the PrintWriter instance; this is represented by the annotation #50, where we can see the source and the destination of the information.
7.2 JACC Security Policy
In this section, we explain two security policies for the given example.

Security Policy 1. Consider a property corresponding to the method main of the program:

never(possible(send((“confidential.txt”, file), v), true))

This property stipulates that the file named confidential.txt can never be sent over the network. After analyzing the annotations of the method main, the JACC model-checker states that the program does not satisfy this property.

Security Policy 2. Here is another property used to verify the method main of the program:

always(necessarily(readf(“confidential.txt”), never(possible(send((“confidential.txt”, file), v), true))))

This property expresses the fact that once the file confidential.txt has been read, it can never subsequently be sent over the network. The property is not satisfied by the program, since the file confidential.txt is read inside a while statement and then copied into the stream bound to an open connection.
8 Conclusion
In this paper, we have presented our work on defining and implementing an expressive architecture, the Java Certifying Compilation (JACC) system, for the secure compilation and execution of Java mobile code based on certifying compilation. The purpose of the JACC system is to statically detect as many potentially malicious behaviors as possible in Java programs; it is a language-based approach to handling security problems. The JACC system includes a certifying compiler and a bytecode verifier. The JACC certifying compiler generates annotations in addition to the bytecode; these annotations aim to capture every critical program behavior. The JACC bytecode verifier analyzes the generated annotations using a model-checker and verifies that they respect a given security policy. The JACC system enables us to statically detect several cases of suspicious program behavior. The most useful feature of the security policy specification language is the possibility of using parameterized actions. Thanks to these actions, we are able to find the file names and URLs used by a program. In cases where data is not statically known, the use of variables as action parameters has proven very useful as an abstraction for missing data: even if we cannot establish with certainty that a particular file is used, we can at least suspect it. The JACC system is more flexible (the security policy language is expressive and permits the specification of security properties), more efficient (the verification process is exclusively static) and more robust (the security model rests on formal foundations) than the existing Java security architecture. A major challenge is to extend the JACC system to handle the following features:
– Perform interprocedural analyses to detect attacks that involve cooperation between several methods.
– Add more verbose and accurate outputs to the security policy verifier.
– Extend the range of possible values of the variable k, used in actions, to include other resources such as the screen, the keyboard, etc.
Z Styles for Security Properties and Modern User Interfaces

Anthony Hall
Praxis Critical Systems, 20 Manvers Street, Bath BA1 1PX
[email protected]

Abstract. This paper describes two new styles for using Z. The first style, based on earlier work for the UK Government, is suitable for the specification of security properties in the form of a formal security policy model. The second, an extension of the Established Strategy, is useful for specifying systems with modern graphical user interfaces and also for showing satisfaction of security properties. The work is based on a successful industrial project.
1 Introduction

1.1 Background
One of the best-known formal notations, Z, consists of a small collection of set-theoretic and logical symbols together with a structuring tool, the schema. The semantics of Z are mathematical, not computational. This makes it a powerful, flexible and extensible notation. However, since there is no built-in correspondence between the meaning of a Z specification and any computational model, the use of Z to specify computing systems is a matter of convention. One dominant convention has grown up: the so-called Established Strategy. However, other styles are also possible, and some of them are more useful for expressing particular kinds of property. In particular, a convention for using Z to define security properties was developed at the UK Government's Communications-Electronics Security Group (CESG) [4]. Barden, Stepney and Cooper have published a practical guide [5] which gives good accounts of both the Established Strategy and the CESG approach. The CESG method for specifying secure systems is as follows:
1. Write a model which expresses the desired security policy (the Formal Security Policy Model, FSPM).
2. Write a specification of the system to be built (the Formal Top Level Specification, FTLS).
3. Prove a theorem about the correspondence between the FTLS and the FSPM.
In the FSPM, the system is specified as a transition relation between inputs plus starting state and final state plus output. The security properties are expressed as constraints on this system. These constraints define that subset of all systems which are secure systems.
The FTLS, on the other hand, is first expressed in the Established Strategy as a state plus a collection of operations, each operation defined by a schema. The method then transforms this, by a relatively mechanical process, into a transition relation similar to the FSPM. To carry out step 3, the FSPM is instantiated with the types of the FTLS, and an interpretation function is written which interprets the behaviour of the FTLS as a system in terms of the FSPM. The proof is constructed in two steps: first, each operation is considered separately to generate Operation Lemmas; then properties of the total transition relation are proved from the Operation Lemmas.
1.2 An Example Project
During 1998 Praxis Critical Systems developed a Certification Authority (CA) for MULTOS. The development was carried out according to the principles of the Information Technology Security Evaluation Criteria (ITSEC) level E6, although the system was not formally evaluated. ITSEC E6 requires a formal security policy and a formal design of the system. There is a description of the overall development process and its results in [6], and a more detailed account of the role of formal methods in the project in [7]. As part of this development we wrote an FSPM and an FTLS, following the basic principles of the CESG method. We did not try to prove correspondence between them: given the size of the system (well over 100 operations), this would have been a major task. However, we did write the specifications in such a way that this proof could be attempted in future. We based our FSPM on the CESG method. However, the security policy for the CA required particular operations to be supplied. It also depended on more complex properties of data than the mandatory security classifications which were the concern of [4]. We therefore adapted the CESG method in two respects: we modelled the system as a set of operations rather than a transition relation, and we had a more elaborate model of the properties of data. The CA has a modern user interface using windows, command buttons and selection from lists. The Established Strategy has a simple model of inputs and outputs, which does not represent such an interface faithfully. Many of the security properties of the CA, however, were constraints on this interface (for example, that no secrets were displayed). It was therefore important, if we were to establish these properties, that they should be represented in the FTLS. We therefore extended the conventions of the Established Strategy to represent important aspects of the user interface.
2 A Formal Security Policy Model

2.1 What We Had to Formalise
The user requirements included an informal security policy, which identified assets, threats and countermeasures. We formalised a subset of the whole policy.
Of the 45 items in the informal policy, 28 were technical as opposed to physical or procedural. Of these, 23 items related to the system viewed as a black box, whereas 5 were concerned with internal design and implementation details. We formalised these 23 items, turning them into 27 formal clauses which fell into three classes:
– Two of the clauses constrained the overall state of the system. Each of these became a state invariant in the formal model.
– Eight clauses required the CA to perform some function (for example, authentication). To formalise this we need to say that there must exist an operation with a particular property.
– Seventeen clauses were constraints applicable to every operation in the system (for example, that they were only to be performed by authorised users). For these we need to say that during any operation execution where the clause is applicable, the property must hold.
2.2 Overall Approach
We followed the overall approach of the CESG method by defining:
1. A general model of a system. We model any system as a state plus a set of operations; this is slightly different from the CESG approach, and the differences are discussed in more detail in Section 2.3.
2. A model of a CA system. This is a specialisation of the general model of a system, with some mapping between system state and real-world concepts such as users, sensitive data and so on. This idea is taken from the CESG approach, where the SYSTEM schema includes not just the transition relation but also application-specific concepts such as clearance. However, we implement the idea rather differently, using representation functions to relate system data to real-world constructs, as described in Section 2.4.
3. A definition of a secure CA system. Each formalisable clause in the security policy is turned into a predicate that constrains the CA system in some way. The overall definition of a secure CA system is one where all the constraints are satisfied. Section 2.5 gives an example of how we formalised each type of property.
2.3 A General Model of a System
The approach here differs from that in [4] in that we do not define the transition function of the system explicitly. Instead, we define a system in terms of a state and a set of operations. The difference is purely formal: given a constraint over our set of operations, we could turn it into a constraint over a transition relation using the method described in [4]. We have not done this because none of the items in the security policy need constraints over sequences of operation applications. Therefore the constraints over the transition relation would reduce directly
into constraints over single operation applications: the Operation Lemmas of [4].

The system state is simply a collection of data.

[DATA]

State == F DATA

We assume a set OPERATION which can be used to refer to individual operations. Each operation may succeed or fail: if it fails, it returns one or more errors.

[OPERATION, ERROR]

Each operation execution has some input. Each operation execution may produce output to be displayed and output to be transmitted outside the CA on some medium such as CD-ROM. These two kinds of output are distinguished. The item display refers to anything which may appear on the screen as a result of an operation; this may include the echo of input values, as well as any results or error messages from the operation. The item transmitted refers to any output which is to be transmitted as a result of an operation.

OperationExecution
  state, state′ : State
  operation : OPERATION
  input : F DATA
  display : F DATA
  transmitted : F DATA
  errors : F ERROR

We define a system as a set of possible states, a set of possible initial states, a set of possible operations and a collection of possible operation executions.

System
  states : P State
  initialStates : P State
  operations : F OPERATION
  opExecutions : P OperationExecution

  initialStates ⊆ states
  ∀ oe : opExecutions • oe.state ∈ states
  states = ({OperationExecution | θOperationExecution ∈ opExecutions • state ↦ state′})∗ (| initialStates |)
  {o : opExecutions • o.operation} = operations

The variable operations gives the identities of the operations that the system supports. Note that these operations must actually be possible in the system.
We do not allow systems to contain operations which can never be executed from any reachable state. This is of course a rather weak liveness condition, but it corresponds to the notion that certain operations must exist in the system. The possible states are just the initial states plus those reachable from the initial states by sequences of the possible operation executions. It would be a mechanical process to transform opExecutions, the set of all possible executions, into the transition relation used to characterise a system in [4].
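The reachability condition on states can be read operationally: states is the closure of initialStates under the step relation induced by opExecutions. A minimal Java sketch of that closure follows, with the system abstracted to a step map (illustrative types only; the Z above remains the authoritative definition).

import java.util.*;

final class Reachability {
    // Returns the initial states plus everything reachable from them by
    // following the step relation (a breadth-first closure).
    static <S> Set<S> states(Set<S> initial, Map<S, Set<S>> step) {
        Set<S> reached = new HashSet<>(initial);
        Deque<S> frontier = new ArrayDeque<>(initial);
        while (!frontier.isEmpty()) {
            S s = frontier.poll();
            for (S t : step.getOrDefault(s, Collections.emptySet()))
                if (reached.add(t)) frontier.add(t);
        }
        return reached;
    }
}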
2.4 A Model of a CA System
This section gives a general definition of a CA system. It describes every possible CA system, both secure and insecure. It has to be rich enough for us to write predicates which distinguish secure from insecure systems; it therefore has to contain all the concepts which are used in the security policy. The CESG approach defines the system in terms of generic types: specialisation to a particular kind of system is done by instantiating these generic types. Our problem did not lend itself to this approach, since the security properties of the system depend on a number of different kinds of data representing, for example, secret information, users, audit trails and so on. Instead of having a generic type DATA, therefore, we used a given set DATA and defined relevant properties of data in one of two ways:
1. Some of the security policy can be expressed in terms of very general properties of data, such as whether it is secret. Such properties can be expressed simply as subsets of DATA.
2. To define other parts of the policy, we need to understand what some parts of the data in the system actually represent. For example, we need to know that some data represent role holders and their properties. In that case, we construct a separate model (similar to the FTLS) of the relevant parts of the real world, and construct functions which map DATA to this real-world model.

General Properties of Data

For the purpose of this example, we consider two aspects of data: whether or not it is secret, and whether or not it needs some kind of confidentiality, non-repudiation or integrity protection. An important property is that the set of secret data is fixed: there are no operations which can change the classification of data.

secret : P DATA

For the purpose of this example, we simplify the need for protection to just two cases: data are either sensitive or insensitive. We also omit details of how mechanisms are combined.

sensitive, insensitive : P DATA

sensitive, insensitive partition DATA
Protection consists of applying some mechanism to a piece of data. If a data item has been protected, it is no longer sensitive. Protection must of course be reversible: it must be possible to recover the original data if the protection mechanism is known.

[MECHANISM]

protection, recovery : MECHANISM → DATA → DATA

∀ m : MECHANISM; d : DATA | d ∈ dom(protection m) •
  (protection m) d ∈ insensitive ∧ (recovery m)((protection m) d) = d

A data item cd is a (possibly protected) copy of a data item d if it is either d itself or the result of applying some protection to d.

copyOf : DATA ↔ DATA

∀ cd, d : DATA • cd copyOf d ⇔ (d = cd ∨ (∃ m : MECHANISM • cd = (protection m) d))

Mapping Data to the Real World

Some parts of the policy cannot be expressed simply as properties of data. Instead, we have to consider how the data represent security-relevant things in the real world. We do this as follows:
1. Identify the relevant real-world entities and their relationships. For simplicity in translating to items of data, relationships are all represented as simple binary relations (which may of course be functions).
2. Define representation functions which map DATA into these real-world entities. An item of data represents either an instance of an entity, in which case the state includes a set of such entities, or a pair of entities, in which case the state includes a set of such pairs: that is, a relation between two entity types. For example, an item of roleHolderData represents a single role holder; an item of roleHolderNameData represents a (role holder, name) pair.
As an example, part of the policy deals with roles and role holders. The real-world structure is described in the schema Access.

[ROLEHOLDER, ROLE, TEXT]

Access
  roleHolders : F ROLEHOLDER
  roles : ROLEHOLDER ↔ ROLE

  dom roles = roleHolders
This schema contains a well-formedness constraint that all role holders do in fact have roles. To represent this, we map role holders and roles to items of data. The variable roleHolderData represents the role holders and roleHolderRolesData represents the roles held by role holders.

roleHolderData, roleHolderRolesData : P DATA

disjoint ⟨roleHolderData, roleHolderRolesData⟩

rRoleHolder : roleHolderData → ROLEHOLDER
rRoleHolderRoles : roleHolderRolesData → ROLEHOLDER × ROLE

A state represents a value of the schema Access when the relevant parts of the state are exactly those produced by representing the values of the variables in Access.

rAccess : State ⇸ Access

rAccess = {s : State; a : Access |
  a.roleHolders = rRoleHolder (| s |) ∧
  a.roles = rRoleHolderRoles (| s |) • s ↦ a}

A CA system is one in which the states do indeed correspond to the structure of Access.

CASystem
  System

  ∀ s : states • s ∈ dom rAccess
2.5 The Security Policy
The security policy is a conjunction of predicates which constrain the system. A system which satisfies all the predicates in the policy is secure according to that policy.

A State Invariant: One Role Only

The informal policy states that it will be impossible for an individual to assume more than one role at any one time. This is formalised by saying that each role holder can only have one role. We formalise this in two stages: first we formalise the real-world constraint:

OneRoleReal
  Access

  roles ∈ ROLEHOLDER ⇸ ROLE
Then we state the property that the CA system itself must represent a real-world situation which respects the constraint. That is, every state in the system must represent a real world which respects the constraint.

OneRoleOnly
  CASystem

  ∀ s : states • rAccess s ∈ OneRoleReal

The Existence of an Operation: Backup Availability

The informal policy states that it shall be possible to continue operations on another site in the event of system failure. This requires two operations: save and restore. (Other features of the CA preclude any other solution to the informal requirement.) In this (simplified) definition, save, provided it is successful, generates a (possibly protected) copy of all data in the state. This copy may contain additional administrative information, but it must contain at least the state.

SavePossible
  System

  ∃ save : operations •
    (∀ o : opExecutions | o.operation = save ∧ o.errors = ∅ •
      (∀ d : o.state • (∃ cd : o.transmitted • cd copyOf d)))

A Property of All Operations: Data Transmission

The informal policy requires that the system will ensure that any data transmitted over a communications channel are afforded the security of fit-for-purpose confidentiality, integrity and non-repudiation mechanisms. The formal policy states that any transmitted data item is insensitive: that is, either it is an original item which does not need protection, or it has been protected.

ProtectTransmittedData
  System

  ∀ o : opExecutions • o.transmitted ⊆ insensitive

The Secure CA System

The secure CA system is defined as a CA system where all the clauses of the policy are respected:

SecureCASystem
  CASystem
  OneRoleOnly
  SavePossible
  ProtectTransmittedData
2.6 Relation to the FTLS
The formal top level specification must conform to the formal security policy model. To show conformance, the following steps are necessary:
1. For every data structure used in the FTLS, identify the data in the FSPM that represent it.
2. Where the data in the FSPM represent some real-world concept such as ROLEHOLDER, define the representation function in terms of FTLS structures. In some cases the FTLS structure will be identical to the real-world structure used to define the FSPM, but it is not necessary that the correspondence is the identity.
3. Using these correspondences, provide a mapping between possible states in the FTLS and the states in the FSPM, and show that all the states allowed by the FTLS conform with the predicates in the FSPM which constrain the system state.
4. Translate all the operation definitions in the FTLS into operations in the FSPM which represent them. An FSPM operation represents an FTLS operation if the FSPM data forming the inputs, outputs and state changes in the FSPM represent the inputs, outputs and state changes of the FTLS operation. All operations in the FTLS can be represented in the FSPM by translating their data in this way. Then show that all such FSPM operations conform to the predicates which constrain all operations.
5. Where the FSPM calls for the existence of a particular operation, demonstrate that there is a corresponding operation in the FTLS. Translate the FTLS operation into an FSPM operation which represents it, and demonstrate that the definition conforms with the definition of the required operation.
3 The FTLS
The structure of the FTLS is driven by two factors:
1. The operation definitions must be structured so that it is easy to translate them into the corresponding FSPM structures. This means that we must clearly state, for each operation, what is input, what is displayed and what is transmitted out of the CA system. The display must include any error messages; there is a requirement that all errors are reported, so we must allow for more than one error per operation.
2. The CA has a graphical user interface. This means that operations are chosen from a limited set available on the screen, and the operations available at any time depend on the state of the system. Furthermore, once an operation has been selected, there is a dialogue for the user to give the inputs to the operation. Inputs may be selected from a limited set of possibilities (for example in a list box) or they may be typed in by the user.
This means that we have to represent the fact that operations may not be available, that they are long-lived, and that some inputs may not be available to the user. We use three conventions to meet these requirements:
1. We have separate schemas for the inputs, displayed items and transmitted items for each operation. This allows us to match the operation specifications to the FSPM.
2. We have separate schemas to define what inputs to an operation are available, what inputs are correct and what inputs are invalid. This models the fact that some inputs are selected from a limited choice of possibilities.
3. We specify the execution of an operation in two phases. There is a generic specification StartOperation which models the selection of an operation by the user, and leaves the system in a state where that operation is active. There is then a specific definition for each operation describing its behaviour. This captures the fact that only certain operations are available at any one time, and the fact that operations are long-lived rather than atomic. It also allows tracing to rules in the FSPM which govern when certain operations may take place.
As an illustration, we use the operation to add a role holder. In this simplified specification we assume that the state consists simply of two maps, from role holder ids (which are text) to their passwords and roles. The only other independent component of the state is the current operation. This may or may not be present, and to represent this we use a convention (which seems to have been invented simultaneously by several authors) of representing optional items as sets containing either zero or one member.

optional X == {x : F X | #x ≤ 1}
nil[X] == ∅[X]
the[X] == {x : X • {x} ↦ x}

CAState
  roleHolderPassword : TEXT ⇸ TEXT
  roleHolderRole : TEXT ⇸ ROLE
  known : F TEXT
  currentOperation : optional OPERATION

  known = dom roleHolderPassword = dom roleHolderRole
OperationFrame
  ΔCAState

  currentOperation′ = nil
3.1 Inputs and Outputs
We need to state exactly what the inputs for each operation are, what is displayed on the screen, and what is transmitted to outside systems. Furthermore, we need to be able to extract this information systematically from the specification. Therefore, for each operation xxx, we specify up to three schemas: xxxIn, xxxDisp and xxxXmit. xxxIn contains the inputs specific to the operation; these are decorated with “?”. xxxDisp contains everything significant that is displayed on the screen. The only items that are not considered “significant” in this context are prompts and similar items which are fixed and not dependent on the data in the system; this is so we can check that the system does not display anything which contradicts the security policy. The display includes any listboxes which are put up for the user to choose from, and any echoes of user input. Where it contains echoes of user input, it repeats the name of the input variable. If the item displayed is not an echo of input, it is decorated with “!” in the usual way. xxxDisp always contains the declaration error! : F ERROR, which is used to report all errors to the user. If the operation succeeds, error! is always empty. We sometimes include part of the state in xxxDisp so as to describe constraints on the display. The state components are not, of course, displayed: the only items that are actually displayed are those which appear in xxxDisp and are decorated with “?” or “!”. xxxXmit contains everything that is output on other media, such as floppies, DAT tapes and CD-ROM; this is decorated with “!” as usual. The state change, inputs and outputs are collected together in a frame schema xxxFrame. This schema also defines parts of the state which are unchanged by the operation. For example, here is a simplified specification of the operation to add a new role holder to the system. The inputs are a role holder id which is typed in as text, a role which is selected from a given list of roles, and a password which is typed in as text.

RegisterRoleHolderIn
  roleHolderId? : TEXT
  role? : ROLE
  password? : TEXT

The display includes an echo of the role holder id and the selected role, but not of the password. It also includes the set of roles available. In this simplified example we omit the predicate defining what roles are actually displayed.
RegisterRoleHolderDisp
  roleHolderId? : TEXT
  role? : ROLE
  role! : F ROLE
  error! : F ERROR

In this particular case nothing is transmitted outside the CA, so we omit the schema RegisterRoleHolderXmit.

RegisterRoleHolderFrame
  OperationFrame
  RegisterRoleHolderIn
  RegisterRoleHolderDisp

  the currentOperation = registerRoleHolder
3.2 Availability and Validity
An operation will only have certain inputs available. For example, it might be physically impossible to provide invalid input values if input is done by selection from a menu. The conditions under which the operation can be attempted are described in a schema xxxAvailable. In some cases, it is possible to try the operation with particular parameters, but it won't work. The rules for correct input are defined in a schema xxxValid. The behaviour of the operation in this case is defined in the schema xxxOK. Conversely, the schema xxxError defines the effect of invoking the operation with invalid inputs. The total operation is then xxxOK ∨ xxxError. For example, in RegisterRoleHolder the user selects a role from the list of displayed roles.

RegisterRoleHolderAvailable
  RegisterRoleHolderFrame

  role? ∈ role!

The user can type any role holder id and password they like. However, the id they type must not already exist.

RegisterRoleHolderValid
  RegisterRoleHolderAvailable

  roleHolderId? ∉ known

Successful operation depends on the input being valid. The RegisterRoleHolderOK schema describes the effect in that case.
RegisterRoleHolderOK
  RegisterRoleHolderValid

  roleHolderPassword′ = roleHolderPassword ⊕ {roleHolderId? ↦ password?}
  roleHolderRole′ = roleHolderRole ⊕ {roleHolderId? ↦ role?}
  error! = ∅

The error schema describes what happens when each of the validity conditions is not met. It allows more than one error to be reported, and is in fact loose, in that other, implementation-defined errors are also possible.

RegisterRoleHolderError
  RegisterRoleHolderAvailable
  ΞCAState

  roleHolderId? ∈ known ⇔ theRoleHolderIdHasBeenUsed ∈ error!

The total definition of RegisterRoleHolder is:

RegisterRoleHolder == RegisterRoleHolderOK ∨ RegisterRoleHolderError

3.3 The Lifecycle of an Operation
The previous sections have defined how an operation behaves once it has been invoked. The action of invoking an operation is treated separately. There is an operation StartOperation, which describes how a user may attempt to start any operation that is available at the time. It succeeds provided that the rules for that operation are satisfied. The only input to StartOperation is the operation id.

StartOperationIn
  operation? : OPERATION

At certain times the user is offered a list of operations they can invoke. The user can then select an operation to be run from the listed operations. We do not define here all the rules for what operations are displayed, but certainly none are displayed if there is already a current operation. We include the currentOperation part of the state to allow us to represent that fact, but since it is not decorated, currentOperation is not actually displayed.

StartOperationDisp
  StartOperationIn
  operations! : F OPERATION
  currentOperation : optional OPERATION

  currentOperation ≠ nil ⇒ operations! = ∅
StartOperationFrame
  OperationFrame
  StartOperationDisp

An operation is only available to be started if it is one of those on the screen.

StartOperationAvailable
  StartOperationFrame

  operation? ∈ operations!

An operation can only be started when certain conditions are met. We omit the actual definition of StartOperationValid, but it captures rules such as only allowing certain operations if certain roles are present.

StartOperationValid
  StartOperationAvailable

If the operation is valid, it becomes the current operation.

StartOperationOK
  StartOperationValid

  the currentOperation′ = operation?
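The two-phase convention can also be mirrored operationally. The following Java fragment is purely illustrative (the Z above, not this code, is the specification): operations are offered only when no operation is current, an operation can be started only if it is offered, and completing it clears the current operation, as OperationFrame requires.

import java.util.*;

final class OperationLifecycle {
    private final Set<String> offered;   // operations the screen may offer
    private String current;              // null plays the role of nil

    OperationLifecycle(Set<String> offered) { this.offered = offered; }

    Set<String> displayedOperations() {  // cf. StartOperationDisp
        return current != null ? Collections.emptySet() : new HashSet<>(offered);
    }

    boolean start(String operation) {    // cf. StartOperationAvailable/OK
        if (!displayedOperations().contains(operation)) return false;
        current = operation;
        return true;
    }

    void complete() { current = null; }  // cf. OperationFrame: no current operation afterwards
}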
4 Summary
This paper describes two new styles for using Z. The first, which is relevant to the specialist security community, is a variant of the approach developed for the Communications-Electronics Security Group. It allows for a wider range of security properties, by supporting a richer characterisation of data and by making operation types explicit. It should also remove the need for one proof step, although to justify that claim we would have to complete a proof using our approach. The second style is more generally applicable to the specification of modern software systems. It is an extension of the Established Strategy which provides a richer model of inputs and outputs and a more faithful representation of a typical modern user interface. These styles have been used on a real commercial project and were successful in allowing us to represent important properties of the system we were building. The security policy forced us to consider security properties in the Formal Top Level Specification, and the FTLS structure supported the specification of the user interface and the subsequent development of the system. The outcome of the development was a system which has proven robust and accurate in commercial use.
One important next step in validating this work would be to carry out proofs within this framework. We have worked out the strategy that would be needed to prove conformance of the FTLS with the Formal Security Policy Model, but have not attempted any of the proofs. We did typecheck all our Z with fuzz. (This paper has also been typechecked, although we omitted some constant definitions to avoid cluttering the exposition.)
Acknowledgements We thank John Beric, Head of Security at Mondex International, for permission to publish this work.
References
1. J.M. Spivey, The Z Notation: A Reference Manual, Prentice Hall, Second Edition, 1992.
2. B. Potter, J. Sinclair and D. Till, An Introduction to Formal Specification and Z, Prentice Hall, 1991.
3. Multos GKC System User Requirements, Issue 1-9, 4 September 1997.
4. CESG Computer Security Manual "F": A Formal Development Method for High Assurance Systems, Issue 1.1, July 1995.
5. R. Barden, S. Stepney and D. Cooper, Z in Practice, Prentice Hall, 1994.
6. A. Hall and R. Chapman, Correctness by Construction: Developing a Commercial Secure System, IEEE Software, Jan/Feb 2002, pp. 18–25.
7. A. Hall, Correctness by Construction: Integrating Formality into a Commercial Development Process, Proceedings of the International Symposium of Formal Methods Europe, LNCS 2391, Springer, pp. 224–233.
Cryptographic Challenges: The Past and the Future

B. Preneel
Katholieke Univ. Leuven, Dept. Electrical Engineering-ESAT,
Kasteelpark Arenberg 10, B-3001 Leuven-Heverlee, Belgium
[email protected]
Abstract. In this paper we discuss our cryptanalysis of the Hagelin C-38/48 (or M-209) cryptograms sent between Brussels and several Belgian officials in Congo immediately after independence (1960–1961). This work was carried out for the Belgian Parliamentary investigation committee (1999–2001) which investigated the circumstances of the murder of Patrice Lumumba. In the second part of the article, we try to draw out the implications of this research for modern cryptology. We discuss the state of the art in cryptography and the research challenges that need to be addressed in the next decade.
1 Introduction
Cryptographic algorithms play a crucial role in the information society. When we use our ATM or credit card, call someone on a mobile phone, get access to health care services, or buy something on the web, cryptographic algorithms are used to offer protection. These algorithms guarantee that nobody can steal money from our account, place a call at our expense, eavesdrop on our phone calls, or get unauthorized access to sensitive health data. It is clear that information technology will become increasingly pervasive: in the short term we expect to see more e-government, e-voting and m-commerce; beyond that we can expect the emergence of ubiquitous (or pervasive) computing, ambient intelligence, and the like. These new environments and applications will present new security challenges, and there is no doubt that cryptographic algorithms and protocols will form part of the solution. In this article we briefly summarize our experience in cryptanalyzing C-38/48 telegrams dating back to 1960–1961. We continue by discussing the state of the art in cryptography; next we investigate why so many schemes are insecure. Finally, we discuss the approaches taken to address the cryptography challenge.
2 Decrypting the Hagelin C-38/48 Telexes
In the summer of 2001, we were contacted by two colleagues at the K.U.Leuven in their role as experts to the Belgian Parliamentary Investigation Committee that was examining "The Circumstances of the Murder of Patrice Lumumba."
2.1 Background
From 1885 Congo was the personal property of King Leopold II; it became a Belgian colony in 1908. On 30 June 1960, Congo obtained its independence and Patrice Lumumba became the first prime minister. Immediately a complex power struggle erupted in the new country. The parties involved included Belgium, the revolting provinces of South Kasai and Katanga (the latter under the control of Mr. Tshombé), the US and the USSR. In Belgium, the interested parties comprised the king, the prime minister, the minister of African affairs and industry. The United Nations intervened with limited success. Lumumba was dismissed on 5 September 1960 and arrested on 10 October; on 17 January 1961 he was transported to Katanga and executed the same day. In 1975, the US Congress investigated the role of the CIA in the death of Lumumba; the Church Committee asserted that, despite the initiation of earlier assassination attempts, the US was not involved in the death of Lumumba. Triggered by some new revelations, the Belgian Parliament decided in May 2000 to start an investigation into the Belgian involvement; the committee remained active until 31 October 2001.
2.2 The Question
In the context of this investigation, a number of encrypted telexes were found by the expert historians. Some telexes were encrypted using a simple substitution cipher; the experts had no problem deciphering these. For the others, some sources claimed that "the experts did not want to decrypt these," thereby insinuating that they did not want to uncover the truth. In August 2001, four telexes encrypted using 'OLTP' were provided to us. An initial cryptanalysis attempt did not give any results, but we quickly realized that 'OLTP' was the 'encryption' of 'OTP-L', or one-time pad modulo L (which is of course perfectly secure if the key material is properly generated). On 17 September 2001, we received 11 printex telexes sent between the Ministry of African Affairs in Belgium and Elisabethville (Katanga), Brazzaville and Rusur. For five of these telexes (part of) the plaintext was known; for one, a guess at the plaintext was available (which later proved to be incorrect). The question asked was to decrypt all the telexes within 3 weeks. An example of a ciphertext is given below:

Brazza 28b (stamp: 15-2-1961). Jacques to Nicolas
11150 HSMEO DOXPX NFPPA RNMXS RZPUG RSNUF
CJTQI HUKYM XZWBG ACKPT TDUYB ZJQZI VVRHP
ELHIL FXUKQ MNAFF ZPWSE LBZAI MXNFC ZZSHR
XVTZI DZABT LPEET CNHFV HTLMO SWLOH EVJLF
NOFYV ROSYC WXDTE WVEXE HSMEO 11150
Here '11' means part 1 of 1; '150' means that the plaintext contains 150 characters (i.e., 30 five-letter groups). HSMEO is the 'false key', which would now be called the encrypted 'session key'. The corresponding plaintext is:
CONTINUE INTRIGUES INQUIETANTES TANT LEO QU EVILLE JACQUES BISSECT VOUS PRIE VOUS INFORMER DISCRETEMENT MISSION EXACTE CONFIEE HUBERT STOP INTERESSEYX

One of the more important telexes was the following one, as it was sent a few days before the murder of Lumumba (note the date and hour field in the last character group):

Loos to Marliere, 14 January 1961 11h00
AHQNE XVAZW IQFFR JENFV NJVYB QVGOZ KFYQV
GEDBE HGMPS GAZJK RDJQC VJTEB DQCGF PWCVR
UOMWW LOGSO ZWVVV LDQNI YTZAA OIJDR CHTYN
HSUIY PKFPZ OSEAW SUZMY QDYEL FUVOA WLSSD
SHXRR MLQOK OUXBD LQWDB BXFRZ XNZZH MEVGS
ANLLB UEAAV RWYXH PAWSV ZVKPU ZSHKK PALWB
AHQNE 11205 141100
In addition, the experts provided us after a few days with some documentation from the 1950s on the use of the printex devices. From this documentation we learned that BISECT was used to indicate the beginning of the plaintext (this avoids known plaintext at the beginning of the telex) and that various schemes were in use to encrypt the session keys. Also, a space was replaced by 'W' before encrypting, but at random (in about 1 case out of 5) it was to be replaced by 'KW', hence providing a randomized encryption. Finally, random padding was used to pad the plaintext to a multiple of 5 characters.
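From that documentation, the message preparation can be sketched as follows. This is a reconstruction for illustration, not historical software; the one-in-five randomisation rate is the figure quoted above.

import java.util.Random;

final class PrintexPrep {
    // Encode spaces as 'W' (or 'KW' about one time in five) and pad the
    // result with random letters to a multiple of five characters.
    static String prepare(String plaintext, Random rnd) {
        StringBuilder sb = new StringBuilder();
        for (char c : plaintext.toUpperCase().toCharArray()) {
            if (c == ' ') sb.append(rnd.nextInt(5) == 0 ? "KW" : "W");
            else sb.append(c);
        }
        while (sb.length() % 5 != 0)
            sb.append((char) ('A' + rnd.nextInt(26)));   // random padding
        return sb.toString();
    }
}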
2.3 The printex Machine
Searching the cryptographic literature did not reveal any information on the printex cipher system. After talking to some cipher operators from that period, it became clear that this name was given to a class of cipher machines which would 'print' the ciphertext (when encrypting) or the plaintext (when decrypting) on a paper tape. The more likely candidate machines were the Hagelin C-35 (with five rotors) and the Hagelin C-38/48 (with six rotors), but from the 1950s onwards more advanced machines were in use. The Hagelin machines C-35 and C-38 were purely mechanical (and portable) machines put on the market in 1935 and 1938, respectively, by the Swiss company Crypto AG; they were designed by the Swede Boris Hagelin. The machines were aimed at both the business and military markets. He sold the machines to many countries (including Finland, France, Germany, Italy, Japan, Sweden and the US), which used them for securing tactical communications (the limited security level was well known). During World War II, a variant of the C-38, the C-48, was produced under license by Smith-Corona (the typewriter company); it was known as the M-209 (US Army) and the CSP-1500 (US Navy). After World War II, these machines were probably given away to allies. These purely mechanical machines consist of a number of rotors (with relatively prime lengths 17, 19, 21, 23, 25, 26) that all move over one position after a character has been processed. For each position, each rotor has a pin that is in the active or inactive state. For the C-35 this corresponds to 17+19+21+23+25 =
105 key bits, while the C-38 has 105+26 = 131 key bits. The periods of the devices are 3,900,225 and 101,405,850 respectively (the product of the individual rotor lengths). Five (or six) rotor pins form a 5-bit (or 6-bit) 'address' that enters a non-linear substitution table. The output of this table is an integer between 0 and 25 which forms a displacement d between the plaintext character p and the ciphertext character c, that is, c := (25 − p + d) mod 26 (note that decryption is equal to encryption). This substitution table is implemented using a movable cage with 25, 27 or 29 bars. Each bar has 2 lugs, each of which corresponds to one of the rotors. These lugs are fixed for the C-35 but movable for the C-38 (to 6 positions corresponding to the 6 rotors and to 2 neutral positions); hence the C-38 has a much larger key. For every character encryption, all the bars are moved past the 5 (or 6) rotor pins. A lug on a bar that meets an active rotor pin will activate that bar and increase the displacement d by one. The session keys are the initial rotor positions, indicated by 5 (or 6) characters written on the rotors. Later variants of the C-38 used an additional substitution: the letters on the plaintext dial were permuted using a simple substitution. Some information was available on these variants and on the C-41, C-52 and CD-57 (the C-52 has an irregular movement of one of the rotors). As the cryptanalysis of these is more complex, we could do nothing but hope that they were not in use in 1960 (although one would expect that the most advanced devices available would have been used for this application).
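The mechanism translates directly into a few lines of code. The sketch below simulates one simplified C-38/M-209 encryption step; pin and lug settings are supplied by the caller, each engaged bar adds one to the displacement, and the per-wheel read offsets of the real machine are ignored. Because the character map is an involution, the same routine also decrypts.

final class HagelinC38 {
    private static final int[] LEN = {26, 25, 23, 21, 19, 17};  // wheel lengths

    private final boolean[][] pins;  // pins[w][i]: is pin i of wheel w active?
    private final int[][] lugs;      // lugs[b] = {l1, l2}; 0 = neutral, 1..6 = wheel
    private final int[] pos;         // current rotor positions (the session key)

    HagelinC38(boolean[][] pins, int[][] lugs, int[] startPos) {
        this.pins = pins;
        this.lugs = lugs;
        this.pos = startPos.clone();
    }

    // Encrypt (and, by reciprocity, decrypt) one character in 0..25.
    int step(int p) {
        int d = 0;                                   // the displacement
        for (int[] bar : lugs) {                     // each engaged bar adds 1
            boolean engaged = false;
            for (int lug : bar)
                if (lug != 0 && pins[lug - 1][pos[lug - 1]]) engaged = true;
            if (engaged) d++;
        }
        for (int w = 0; w < 6; w++)                  // all wheels advance together
            pos[w] = (pos[w] + 1) % LEN[w];
        return (25 - p + d) % 26;                    // reciprocal Beaufort-style map
    }
}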
2.4 Cryptanalysis
A simple observation is that exhaustive key search would be completely infeasible. The next step was to perform a literature study; in view of the tight timing constraints, we restricted ourselves to [3,17,21]. The session keys of five characters clearly pointed to the C-35 with 5 rotors; we obtained a device from the Belgian Army. This machine is relatively easy to cryptanalyze [3, pp. 179–186]: one can translate the displacements d immediately into 1 or 2 values for 5 rotor pins, and reconstruct the rotor values quickly. However, our attempts failed badly. This could either imply that the known plaintexts were not correct, or that another machine was used. We moved on to the C-38/C-48, and hoped that the variant made available to us by the Belgian Army (produced by Smith-Corona) would indeed be the one used in 1960–1961. In the literature we found an attack published in 1978 by R. Morris [17]. This attack requires between 75 and 100 known plaintext characters. The ciphertext-only attack by R. Rivest [21] needs several thousand ciphertext characters. However, the use of session keys precluded the concatenation of the different messages into one large ciphertext. Moreover, the longest cryptogram available to us for one session key was 370 characters; longer plaintexts were divided into blocks of 370 characters, and the session key (the rotor setting) was changed in between. The attack by R. Morris requires a combination of programming and pattern recognition and needs about one day of work. We were able to implement it
and to recover part of the key for the message of 15 February 1961. However, in order to find the complete key (which we did on 23 September), we had to extend the attack because the lug positions were not chosen as prescribed: for a close-to-uniform distribution of the displacements, the numbers of lugs opposite the rotors should be of the form '1-2-4-8-10-12' (or a random permutation thereof). It turns out, however, that in the key setting for February 1961 the values '4-8-9-6-9-5' were used. Such a key setting is harder to recover using the Morris attack. The settings for the final 17 rotor pins were obtained by exhaustive search. This particular key setting is more vulnerable to a ciphertext-only attack (e.g., displacement values d of 1, 2, 3 and 7 could not occur), but there were not enough ciphertext characters available to exploit this weakness. We now had the February 1961 setting for the pins and lugs (the long-term key), but since the session key was encrypted, we were not able to set the rotors in the correct position to test the long-term key on other cryptograms. Moreover, we were not at all sure whether the same settings for the pins and lugs were used (the manuals recommend changing the key every month). In order to save time we programmed a quick check: we verified whether or not 1, 2, 3 and 7 occur as displacement values in the other cryptograms (with known plaintexts). The answer was positive, so we were convinced that the long-term key had been changed. The next step was to try to recover the pin and lug settings for some other known plaintext/ciphertext pairs. This work progressed only very slowly (part of the pin values could be recovered, but we failed to come up with complete solutions). After one week, we were about to give up: we were starting to believe that different devices were used on some of the other links. The reason for our failure, which was only discovered later, is that there were too many errors, such as typos and missing plaintext characters (e.g., 'KW' versus 'W'), in the other known plaintexts. Nevertheless, we noted on 30 September 2001 that the lug settings obtained from the other cryptograms had some vague similarity to those of February 1961, so we decided that maybe the key hadn't changed after all. This could be verified by exhaustive search over all 101 million starting positions of the rotors, or, in modern language, by an exhaustive search for the session key; this requires a few minutes on a PC. The approach proved successful: the key (pins and lugs) had not been changed! We then wrote a program that would try all the session keys for the ciphertexts for which no plaintext was known. The correct values would be among the plaintexts with more than 12% spaces (character W), which eliminated all but a few dozen to a few hundred candidates. A manual search for the word BISECT or BISSECT could then identify the correct plaintext. All plaintexts were decrypted on 1 October 2001 at 3:30 am, so we had some time to write up a report and meet our deadline (8 October). It was now also clear why the displacements 1, 2, 3 and 7 were found in the other plaintext/ciphertext pairs: again, they were the consequence of errors in the plaintexts (mainly missing characters).
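The session-key search is easily reconstructed. The sketch below reuses the HagelinC38 class sketched in Section 2.3; the 12% threshold is the space-frequency filter described above, and candidates are then inspected by hand for BISECT.

import java.util.Arrays;

final class SessionKeySearch {
    // Try every rotor starting position; print decryptions whose proportion
    // of 'W' (the encoded space) exceeds 12%.
    static void search(boolean[][] pins, int[][] lugs, int[] ciphertext) {
        int[] len = {26, 25, 23, 21, 19, 17};
        int[] pos = new int[6];
        do {
            HagelinC38 machine = new HagelinC38(pins, lugs, pos);
            int spaces = 0;
            StringBuilder pt = new StringBuilder();
            for (int c : ciphertext) {
                int p = machine.step(c);             // decryption = encryption
                if (p == 'W' - 'A') spaces++;
                pt.append((char) ('A' + p));
            }
            if (spaces * 100 > ciphertext.length * 12)
                System.out.println(Arrays.toString(pos) + " " + pt);
        } while (next(pos, len));                    // all 101,405,850 settings
    }

    private static boolean next(int[] pos, int[] len) {  // mixed-radix odometer
        for (int i = 0; i < pos.length; i++) {
            if (++pos[i] < len[i]) return true;
            pos[i] = 0;
        }
        return false;
    }
}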
In the following week, we discovered that the 6th rotor wheel was always initialized in the same position as the 5th, which explains why five characters are sufficient to communicate an encrypted session key indicating six rotor positions. This information reduces the time for an exhaustive search of the session key from minutes to seconds. From the 11 decrypted session keys, we were able to deduce that the mechanism used to encrypt these keys was the Playfair cipher, for which we were also able to recover the key (a random permutation of 25 letters). With this information, a ciphertext can be decrypted in a few microseconds. We were also able to decrypt a ciphertext that had remained indecipherable in 1960 due to a transcription error in the session key (but note that the telex was of course simply repeated in the clear).
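Putting these observations together gives a complete decryption pipeline for a single cryptogram. The following sketch is purely illustrative: it assumes the encrypted session key is transmitted as the leading five-letter group, and playfair_decrypt and decrypt_c38 are hypothetical helper functions.

def decrypt_message(cryptogram, playfair_key, pins, lugs,
                    playfair_decrypt, decrypt_c38):
    # First five letters: the Playfair-encrypted session key (assumption).
    key_group, body = cryptogram[:5], cryptogram[5:]
    positions = playfair_decrypt(playfair_key, key_group)  # five rotor positions
    rotor_start = positions + positions[-1]                # rotor 6 = rotor 5
    return decrypt_c38(body, pins, lugs, rotor_start)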
2.5 The Result
The plaintext of 14 January 1961 turned out to be very interesting:

DOFGD VISWA WVISW JOSEP HWXXW TERTI OWMIS SIONW BOMBO KOWVO
IRWTE LEXWC EWSUJ ETWAM BABEL GEWXX WJULE SWXXW BISEC TWTRE
SECVX XWRWV WMWPR INTEX WXXWP RIMOW RIENW ENVOY EWRUS URWWX
XWPOU VEZWR EGLER WXXWS ECUND OWREP RENDR EWDUR GENCE WPLAN
WBRAZ ZAWWC
This can be translated as follows: Top secret. Answer to your printex message. First. Nothing sent to Rusur. Can you sort this out? Second. Resume urgently plan Brazza with respect to Joseph. Third. Mission Bomboko. See telex on this topic from the Belgian Embassy. Jules. The first part is about a money transfer. The more interesting part is the middle one: 'Joseph' was the code name for Patrice Lumumba. However, the expert historians have pointed out that the details of the plan Brazza are unknown (it could be a plan conceived in Brazzaville, but it could also imply that Lumumba would be transported via Brazzaville). Moreover, they claim that this telex arrived too late to influence the decision to transport Lumumba to Katanga. It also became clear that in 1961 it was by no means obvious that the transportation of Lumumba would have his immediate execution as a consequence. In any case, the overall communication shows that several Belgian players never made any effort to save Lumumba, and one of the conclusions of the final report was that some parties carry a 'moral responsibility.' A lesson learned from the cryptograms is that the most sensitive information (which was always encrypted) was that related to financial transactions; these were considered to be more sensitive than the messages about Lumumba.
2.6 Lessons Learned
The most important lesson learned is that if one deals with secrets that need to be kept for a long time, one should use a cryptosystem that offers a sufficiently
high security margin. It should be noted that in 1960 the C-38 was clearly outdated, and it was known to offer only a limited security level. The progress in computational power and advances in cryptanalysis have further eroded its security. On the other hand, had no known plaintext been available, the cryptanalysis (at least within a few weeks) would have been very challenging due to the rigorous procedures (short messages, use of session keys, use of BISECT, randomization of spaces). One mistake made by the operators was resending an 'indecipherable' ciphertext in the clear (but some paraphrasing was used and the location of the BISECT was not known, hence this would not have delivered known plaintext). A second mistake was that the chosen secret lug setting was not among the recommended ones. Due to lack of time, we were not able to investigate this further, but this clearly reduces the security against ciphertext-only attacks – even if it made our known plaintext attack harder.
3 Cryptography and Security
While cryptography is an essential component, its importance should be put in the correct perspective. Indeed, failures of security systems can often be blamed on reasons other than the failure of cryptography (see for example Anderson [1]):
– incorrect specifications or requirements – this includes the use of solutions designed for one problem to solve another one;
– implementation errors – the more popular mistakes include bad key generation (predictable inputs) and buffer overflow problems; another source of implementation weaknesses is side-channel attacks (e.g., power analysis, timing analysis);
– protocol errors – incorrect assumptions about the parties and subtle errors can imply that the protocols do not achieve their goals;
– security management – inadequate controls and procedures, intrusion detection, or incident response;
– social engineering – the use of social skills to convince someone on the inside to bypass the security system.
Nevertheless, cryptographic algorithms are part of the foundations of the security house, and any house with weak foundations will collapse. There is thus no excuse whatsoever for employing weak cryptography; nevertheless, we encounter weak cryptography more frequently than necessary.
4 Insecure Cryptography
There are several reasons why applications use weak cryptography:
– Cryptography is a fascinating discipline, which tends to attract 'do-it-yourself' people, who are not aware of the scientific developments of the last 25
years; their home-made algorithms can typically be broken in a few minutes by an expert;
– Use of short key lengths, in part due to export controls (mainly in the US, which dominates the software market) which limited key sizes to 40 bits (or 56 bits) for symmetric ciphers, and to 512 bits for factoring-based systems (RSA) and for the discrete logarithm modulo a large prime (Diffie-Hellman). The US export restrictions were lifted to a large extent in January and October 2000 (see Koops [16] for details). In several countries, domestic controls were imposed; the best known example is France, where the domestic controls were lifted in January 1999. Nevertheless, it can take a long time before all applications are upgraded.
– Progress in cryptanalysis: open academic research started in the mid-1970s; cryptology is now an established academic research discipline, and the IACR (International Association for Cryptologic Research) has more than 1000 members. As a consequence, increasingly sophisticated techniques are developed to break cryptosystems, but fortunately also to improve their security.
– Progress in computational power: Moore's law, which was formulated in 1965, predicts that transistor density will double every 18 months. Empirical observations have proved him right (at least for data density) and experts believe that this law will hold for at least another 15 years. The variant of Moore's law for computational power states that the amount of computation that can be done for the same cost doubles every 18 months. This implies that a key for a symmetric algorithm will become a thousand times cheaper to find after 15 years (or needs to increase in length by 10 bits to offer the same security). An even larger threat may be the emergence of new computing models: if quantum computers can be built, factoring may become very easy (a 1994 result by Shor [22]). While early experiments are promising [24], experts are divided on the question whether sufficiently powerful quantum computers can be built in the next 15 years. For symmetric cryptography, quantum computers are less of a threat: they can reduce the time to search a 2n-bit key to the time to search an n-bit key (using Grover's algorithm [11]). Hence, doubling the key length offers adequate protection.
As a consequence of all these observations, insecure cryptographic algorithms are much more common than they should be. In order to avoid these problems, adequate control mechanisms should be established at several levels:
– Substantial evaluation is necessary before an algorithm can be used; experts seem to agree that a period of 3 to 5 years is required between the first publication and the use of an algorithm.
– Continuous monitoring is required during the use of a primitive, to verify whether it is still adequate. Especially for public-key primitives, which are parameterizable, a rigorous monitoring procedure is required to establish minimal key lengths.
– Adequate procedures should be foreseen to take an algorithm out of service or to upgrade an algorithm. Single DES is a typical example of an algorithm
which has been used beyond its lifetime (for most applications, 56 bits was no longer an adequate key length in the 1990s); another example is the GSM encryption algorithm A5/1: experts agree that it is not as secure as believed, but there is no way to upgrade it. The last problem especially should not be underestimated: for data authentication purposes, a newly discovered security weakness will typically not affect older events, and long-term security can be achieved by techniques such as re-signing. However, for confidentiality the problem is much more dramatic: one cannot prevent an opponent from having access to ciphertext, and in certain cases (e.g., medical applications) secrecy for 50–100 years is required. This means that an encryption algorithm used now will need to withstand the attacks employed in 2075. It is instructive to imagine how hard it must have been to design in 1925 an encryption system that needed to be secure for 75 years. There is no reason to believe that this problem is easier at the beginning of the 21st century.
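The rules of thumb quoted above (one extra bit of key length per 18 months, and a doubling of the key length against a quantum adversary running Grover's algorithm) make such long-term estimates easy to compute. The following back-of-the-envelope sketch is our illustration, not part of the original text; the baseline of 80 bits around the year 2000 is an assumption chosen for the example.

def required_key_bits(base_bits, base_year, target_year, quantum=False):
    """Symmetric key length needed to preserve the margin of base_bits."""
    months = (target_year - base_year) * 12
    bits = base_bits + months // 18      # +1 bit per 18 months (Moore's law)
    if quantum:
        bits *= 2                        # Grover: double the key length
    return bits

# e.g., protecting until 2075 a secret that 80 bits protected in 2000:
print(required_key_bits(80, 2000, 2075))                # 130
print(required_key_bits(80, 2000, 2075, quantum=True))  # 260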
5 Addressing the Challenge
Several elements are essential to tackle the challenge: standardization, research, and an open evaluation process. A first element is the use of open standardization mechanisms, which are based on scientific evaluation rather than on commercial pressure. It is clear that algorithms should only be included in standards if they have received sufficient scrutiny. Moreover, the standardization body should establish adequate maintenance mechanisms in order to allow for timely revocation of algorithms or upgrades of parameters. One problem is that there are many standardization bodies, each with their own approach (see Sect. 7 for an overview); algorithm revocation mechanisms are often too slow. A typical example of a successful standardization effort is the NIST selection process for the Advanced Encryption Standard; this was a 4-year effort resulting in the publication of FIPS 197 in December 2001. During the last 25 years, cryptographic research has been making substantial progress, and it is fair to say that cryptography has been evolving from an 'art' into a scientific discipline. Some of the most important developments have been made under the influence of theoretical computer science: rigorous security definitions have been developed (which can sometimes take many years to crystallize), and the reductionist approach, also known as 'provable security,' has been introduced. This means that formal proofs are given showing that a weakness in a cryptographic primitive would imply that a hard problem can be solved. It should be noted, however, that while this improves the state of the art significantly, it does not solve the crucial question: which problems are hard? Proving that a problem is hard is notoriously hard, or, to quote James L. Massey, "A hard problem is a problem that nobody works on." As a consequence, modern public-key cryptology depends on a limited set of problems believed to be hard. Most of these originate from algebraic number theory; the most popular ones are factoring the product of large primes and computing the discrete logarithm modulo a large prime. The discrete
logarithm in other algebraic structures (defined by elliptic and hyper-elliptic curves) is also receiving attention. However, there is a clear need to perform more research on hard problems and to construct new schemes on other classes of hard problems. In the area of symmetric cryptology, a similar reductionist approach is being used. In this case, however, the hard problems are typically not generic mathematical problems. The security is based on 'ad hoc' designs, which have been developed based on years of experience and on evaluation against existing and newly developed generic and specific attacks. In this area, performance is often very important. There is a need for new algorithms (stream ciphers, one-way functions, block ciphers) that have undergone a substantial security evaluation and that offer a better security/performance trade-off for new environments (64-bit processors, smart cards, ultra-low-power applications, ...). In order to guarantee the success of the standardization mechanisms, an independent and open evaluation effort is required to bridge the gap between the academic research community and the requirements of the applications. There are several reasons why such an effort is required:
– Academic research is more focused on providing a wide range of solutions with various properties rather than on providing a single solution.
– Academic research may not always fully specify all details of the algorithm, but rather focus on generic design approaches.
– Academic research often ignores certain 'small' problems that need to be addressed in applications and standards but that seem 'trivial.' However, such small details may have important security implications. A good example is the mechanism to indicate which hash function has been used together with the signature scheme; see Kaliski [15].
– Standardization bodies are not always in sync with the most recent academic developments.
Successful standards require a limited number of algorithms that are fully specified; this is required for interoperability. However, an algorithm is only useful if sufficient confidence has been built up, which illustrates the need for a thorough security evaluation. This may involve checking for statistical vulnerabilities and obvious weaknesses, applying known attacks, evaluating the security against new attacks, and carefully verifying all security proofs (the need for this can be illustrated by the error found by Shoup in the 7-year-old OAEP security proof [23]; the error has been corrected for RSA-OAEP by Fujisaki et al. in [10]). The selection procedure for the algorithms requires careful benchmarking of security (on which problem is the primitive based, which model is required to prove security, how tight is the reduction, ...), of performance in various environments, and of other issues such as intellectual property. Typically, standardization bodies do not have the resources for such careful benchmarking, and there is a clear need for an interface between the research community and the standardization bodies. In the following section we discuss the status of previous and ongoing projects which try to act as such an interface. Next, we briefly discuss the different standardization
bodies and their approach towards standardization of cryptographic algorithms.
6 Cryptographic Evaluation Efforts

6.1 AES
The Advanced Encryption Standard (AES) effort was coordinated by NIST (US National Institute of Standards and Technology) [19]; see also Sect. 7.6. Its goal was to select the AES, a 128-bit block cipher with keys of 128, 192, and 256 bits, that would become the successor to the DES (a 64-bit block cipher with a 56-bit key). The AES is a US Federal Information Processing Standard (FIPS) that is mandatory for sensitive but unclassified data. The AES process started with an open workshop to discuss the criteria and an open call launched in 1997. The AES competition has been an open competition, with 15 contenders (most of them from outside the US) and 5 finalists. Some limited evaluation (statistical evaluation) was performed by NIST, but most of the security and performance evaluation was taken care of by the designers of the candidates and by the broader research community. The NSA (National Security Agency) assisted with the evaluation of hardware performance and also delivered classified inputs to NIST on the security evaluation. The main role of NIST was to listen to all the inputs and to make the final decision. In October 2000, NIST announced that it had selected the Belgian proposal Rijndael of Joan Daemen and Vincent Rijmen as the AES. The AES standard FIPS 197 was published in December 2001 [8]. A similar approach was taken for the DES (Data Encryption Standard) in the mid-1970s. The first DES call had no acceptable result, and a second call was launched, to which IBM answered with the cipher that – after some modifications, in part designed by the NSA – became the DES.
6.2 NESSIE
NESSIE (New European Schemes for Signature, Integrity, and Encryption) [18] is a research project within the Information Societies Technology (IST) Programme of the European Commission. The participants of the project are: Katholieke Universiteit Leuven (Belgium), coordinator; École Normale Supérieure (France); Royal Holloway, University of London (U.K.); Siemens Aktiengesellschaft (Germany); Technion - Israel Institute of Technology (Israel); Université Catholique de Louvain (Belgium); and Universitetet i Bergen (Norway). NESSIE is a 40-month project, which started in January 2000. The goal of the NESSIE project is to put forward a portfolio of strong cryptographic primitives that have been obtained after an open call and evaluated using a transparent and open evaluation process. The project has also contributed to the final phase of the AES (Advanced Encryption Standard) process. NESSIE is also developing an evaluation methodology (both for security and for performance evaluation) and
a software toolbox to support the evaluation. The project intends to disseminate the project results widely and to build consensus based on these results by using the appropriate fora. In February 2000, the NESSIE project launched an open call for a broad set of primitives providing confidentiality, data integrity, and authentication. These primitives include block ciphers (not restricted to 128-bit block ciphers), stream ciphers, hash functions, MAC algorithms, digital signature schemes, and public-key encryption schemes. In addition, the call asked for evaluation methodologies. In September 2000, more than 40 primitives were received from major players in response to the NESSIE call. Two-thirds of the submissions came from industry, and there was some industry involvement in 5 out of every 6 algorithms. A first security and performance evaluation phase then took place over 12 months, supported by contributions from more than 50 external researchers. In September 2001, the selection of a subset of 26 primitives for the second phase was announced. In the second phase of the project, the remaining primitives are subjected to a thorough security evaluation, combined with a performance evaluation that will produce realistic performance estimates of optimized implementations. The project works closely together with an industry board consisting of 25 companies, including both technology companies and users of cryptography. The goal of NESSIE is to publish the recommended primitives and to work with the industry board to submit these primitives to standardization bodies. Some of the important intermediate conclusions from the NESSIE project are:
– The submitted stream ciphers were designed by experienced researchers and offer very good performance, but none of them seems to meet the very stringent security requirements;
– Most asymmetric primitives needed small modifications and corrections between the 1st and 2nd phase, which shows that many subtle issues are involved in the specification of these primitives;
– Intellectual property rights are an issue for about half of the primitives in the 2nd phase; in one exceptional case a significant problem has arisen.
An early predecessor of NESSIE was the RACE project RIPE (RACE Integrity Primitives Evaluation, 1988–1992) [20]; confidentiality algorithms were excluded from RIPE for political reasons.
6.3 CRYPTREC
The Japanese CRYPTREC project [2] has a scope similar to, but slightly broader than, that of the NESSIE project: CRYPTREC also includes key establishment protocols and pseudo-random number generators. CRYPTREC intends to validate algorithms for e-government standards.1
1 Note that the results of NESSIE will not be adopted by any government or by the European Commission.
The CRYPTREC project started in 2000, and intends to have its results available by 2003. Each year a new call is launched. The evaluation is performed by members of the CRYPTREC evaluation committee; part of the evaluation effort is subcontracted to outsiders.
7 Standardization Bodies
This section briefly summarizes the approach taken and the progress made by various standardization bodies in the area of cryptographic algorithms; it does not intend to be exhaustive.
7.1 EESSI
In the framework of the European Directive on Electronic Signatures [6], the European Electronic Signature Standardization Initiative (EESSI) [4] has drafted a number of technical standards; most of these documents became CEN Workshop Agreements or ETSI Technical Standards. However, at the time of writing this article (April 2002), the document which recommends signature algorithms and hash functions for digital signatures [5] has not yet received official status. It contains algorithms such as RSA (PKCS#1 and RSA-PSS), DSA, ECDSA and ECGDSA. For the discrete logarithm, it includes both groups modulo a large prime and groups derived from elliptic curves.
7.2 ETSI
The Security Algorithms Group of Experts (SAGE) has a long tradition (since 1990) of strong expertise in the design of cryptographic algorithms. This group was a closed group that initially worked mainly on secret algorithms (such as A5/1 and A5/2 for GSM). To quote from the SAGE website: "The motivation for this 'proprietary design' has been the lack of usable publicly available algorithms and the national export controls on equipment containing cryptographic algorithms." During recent years, the group has been opened to selected members outside the telecommunications industry. There is a trend nowadays to base the designs on publicly known algorithms and to publish the new algorithms; as an example, the UMTS algorithms KASUMI and MILENAGE were made public (for KASUMI this still created some political difficulties). MILENAGE is a design based on Rijndael/AES; A5/3 (a variant of KASUMI) was announced in July 2002.
7.3 IEEE
The working group P1363 of the IEEE [12] has published a very detailed standard in the area of public-key cryptography, which includes an extensive set of algorithms: factoring-based schemes (such as RSA-OAEP and RSA with hash),
and discrete-logarithm-based schemes – both modulo a prime and for elliptic curve groups – (Diffie-Hellman, DSA, Nyberg-Rueppel and MQV key agreement) [13]. A strong point of this standard is the detailed specification, which facilitates interoperability; a weaker point is that the standard may offer too many choices. Currently the group is working on P1363a, which will contain additional techniques such as lattice-based cryptography and password-based authentication protocols.
7.4 IETF
The Internet Engineering Task Force (IETF) work on algorithms is very important, as it decides on the algorithms that are integrated in influential standards such as TLS (WWW security), S/MIME (email security), and IPsec (IP-level security). However, there seems to be no consistent approach towards the selection of cryptographic algorithms across different working groups. Most working groups have selected one or more mandatory algorithms for interoperability, and have incorporated a negotiation mechanism to support a wide variety of others. Criteria for inclusion are often very ad hoc, and – in the IETF tradition – in part based on the availability of interworking applications. However, there is no centralized policy or management procedure for selecting and updating algorithms.
7.5 ISO
Within the International Organization for Standardization (ISO), cryptographic algorithms are standardized within Joint Technical Committee 1 (JTC1) between ISO and the IEC (International Electrotechnical Commission). The working group in charge is WG2 of SC27 [14]. Since its inception in 1990, SC27 has published an impressive collection of standards on hash functions, MAC algorithms and digital signature schemes; in 2000, work also started on encryption algorithms. During the last year, significant progress has been made in this area (except for stream ciphers). The standards developed within this group vary in their level of detail; in some cases it is probably very hard to achieve interoperability without specifying additional details. The selection of algorithms is based on consensus; this implies that most algorithms that reach a certain security threshold can gather sufficient support to be included. In some cases this has the consequence that the standard may offer too many choices. ISO/TC68 has standardized cryptographic algorithms for the banking area; its work is now closely aligned with that of JTC1/SC27.
7.6 NIST
For historic reasons, the US Federal Information Processing Standards published by NIST (National Institute of Standards and Technology, formerly NBS or National Bureau of Standards) have been very influential. The publication of the
block cipher DES in 1977 [7] and certainly the AES process (cf. Sect. 6.1) have given NIST a very high profile. In other areas, FIPS have also been important; however, NIST tends to follow quite a different (and less open) procedure for the selection of digital signature algorithms (DSA, ECDSA), hash functions (SHA-1, SHA-2, ...) [9] and MAC algorithms (HMAC and RMAC): a proposal containing a particular algorithm is drafted, published for comment and subsequently published as a FIPS.
7.7 Other Standards
Other influential standards include the work by ANSI (American National Standards Institute), PKCS (Public Key Cryptography Standards, RSA Security Inc.) and SECG (Standards for Efficient Cryptography Group, Certicom Inc.).
8 Conclusions
Cryptology is a key building block of the information society, but we are often confronted with applications containing weak cryptography. The basic research questions in cryptology (which problems are hard?) are in fact very hard problems themselves, and researchers have only scratched the surface in this area. Moreover, solutions that are sufficiently secure right now may no longer be secure within 5 or 10 years. This probably goes against the common belief outside the cryptographic community that the basic research problems have been solved. In view of this, it should not come as a surprise that it is hard for standards and applications to keep up with a developing science in which the right answer of today may be the insecure solution of tomorrow. We have tried to shed some light on the gap between research and practice and on the role of evaluation initiatives such as AES, CRYPTREC and NESSIE in bridging this gap. We hope that we have succeeded in making this fascinating world a little less cryptic to the reader.
References

1. R.J. Anderson, "Why cryptosystems fail," Communications of the ACM, Vol. 37, No. 11, November 1994, pp. 32–40.
2. CRYPTREC project, http://www.ipa.gov.jp/security/enc/CRYPTREC/index-e.html.
3. C.A. Deavours, L. Kruh, "Machine Cryptography and Modern Cryptanalysis," Artech House, 1985.
4. EESSI, http://www.ict.etsi.org/eessi/eessi-homepage.htm.
5. EESSI, "Algorithms and parameters for secure electronic signatures," http://www.ict.etsi.org/eessi/Documents/20011019_Algorithm_Proposal_V2.11.doc.
6. EU, "Directive 1999/93/EC of the European Parliament and of the Council of 13 December 1999 on a Community framework for electronic signatures," December 1999.
7. FIPS 46, "Data Encryption Standard," Federal Information Processing Standard, National Bureau of Standards, U.S. Department of Commerce, January 1977 (revised as FIPS 46-1:1988; FIPS 46-2:1993).
8. FIPS 197, "Advanced Encryption Standard (AES)," Federal Information Processing Standard, National Institute of Standards and Technology, U.S. Department of Commerce, December 6, 2001.
9. FIPS 180-2, "Secure Hash Standard," National Institute of Standards and Technology, U.S. Department of Commerce, Draft, May 30, 2001.
10. E. Fujisaki, T. Okamoto, D. Pointcheval, J. Stern, "RSA-OAEP is secure under the RSA assumption," Advances in Cryptology, Proceedings Crypto'01, LNCS 2139, J. Kilian, Ed., Springer-Verlag, 2001, pp. 260–274.
11. L.K. Grover, "A fast quantum mechanical algorithm for database search," Proc. 28th Annual ACM Symposium on Theory of Computing, 1996, pp. 212–219.
12. IEEE P1363, http://grouper.ieee.org/groups/1363.
13. IEEE P1363, "Standard Specifications for Public Key Cryptography," February 2000.
14. ISO/IEC JTC1/SC27, "Information technology – Security techniques," http://www.din.de/ni/sc27.
15. B. Kaliski, "On hash function firewalls in signature schemes," Topics in Cryptology, CT-RSA 2002, LNCS 2271, B. Preneel, Ed., Springer-Verlag, 2002, pp. 1–16.
16. B.-J. Koops, "Crypto law survey," http://rechten.kub.nl/koops/cryptolaw.
17. R. Morris, "The Hagelin cipher machine (M-209): Reconstruction of the internal settings," Cryptologia, Vol. 2, No. 3, 1978, pp. 267–289.
18. NESSIE, http://www.cryptonessie.org.
19. NIST, AES Initiative, http://www.nist.gov/aes.
20. RIPE, "Integrity Primitives for Secure Information Systems. Final Report of RACE Integrity Primitives Evaluation (RIPE-RACE 1040)," LNCS 1007, A. Bosselaers, B. Preneel, Eds., Springer-Verlag, 1995.
21. R.L. Rivest, "Statistical analysis of the Hagelin cryptograph," Cryptologia, Vol. 5, No. 1, 1981, pp. 27–32.
22. P.W. Shor, "Algorithms for quantum computation: discrete logarithms and factoring," Proc. 35th Annual Symposium on Foundations of Computer Science, S. Goldwasser, Ed., IEEE Computer Society Press, 1994, pp. 124–134.
23. V. Shoup, "OAEP reconsidered," Advances in Cryptology, Proceedings Crypto'01, LNCS 2139, J. Kilian, Ed., Springer-Verlag, 2001, pp. 239–259.
24. L.M.K. Vandersypen, M. Steffen, G. Breyta, C.S. Yannoni, M.H. Sherwood, I.L. Chuang, "Experimental realization of Shor's quantum factoring algorithm using nuclear magnetic resonance," Nature, 414, 2001, pp. 883–887.
TAPS: The Last Few Slides

Ernie Cohen

Microsoft Research, Cambridge UK, 7 J J Thomson Avenue, Cambridge CB3 0FB, United Kingdom
[email protected]
Abstract. In the last few years, a variety of methods have been used to verify cryptographic protocols in unbounded Dolev-Yao models. However, these methods typically rely on rather drastic assumptions (e.g., the injectivity of tupling and encryption), and it is unclear how to extend these methods to more realistic protocol models. We show how the first-order verification method of [1] (implemented in the verifier TAPS) can be extended to more faithfully capture some features of real protocols, including weak secrets, bitwise concatenation and projection, reversible encryption, and bitwise exclusive-or. (These extensions are usually relegated to the last few slides in presentations of TAPS, hence the title.)
1 TAPS Overview
TAPS [1] is one of the more successful automatic crypto protocol verifiers. It has proved safety properties of about 100 crypto protocols, including all but three protocols from the Clark and Jacob library, protocols with recursive structures (such as certificate chains and paywords), and large examples like SET (as modelled by Paulson et al. [5]). Verification is usually fast and automatic: typical verification time for toy protocols is under a second, with about 90% of verifications requiring no hints from the user, and the remaining protocols requiring only modest hints (about 40 bytes on average). In TAPS, protocols are modelled as transition systems, where the state is given by the set of steps that have been executed and the set of messages that have been published (i.e., sent in the clear). Given such a protocol, TAPS tries to construct first-order invariants that capture all relevant safety properties of the system. The most important of these is the secrecy invariant, which catalogues what can be inferred from the publication of a message. To construct this invariant, TAPS has to guess a fairly precise first-order characterization of the conditions under which each significant secret might be leaked to the adversary. For most protocols, TAPS does this fully automatically, but for more difficult protocols (particularly recursive ones), the user can provide TAPS with hints. TAPS then generates first-order proof obligations that suffice to show that the secrecy invariant is, in fact, an invariant of the protocol, and tries to discharge these obligations with a resolution theorem prover. If this succeeds, the secrecy invariant is known to hold in all protocol states, and TAPS tries to
prove any (user-specified) safety properties from all of the invariants by ordinary deduction (again, using the resolution prover). Details of the protocol model and of how TAPS constructs its invariants can be found in [1].
1.1 The Secrecy Invariant
Let pub(X) mean that the message X has been published. The TAPS secrecy invariant has the form pub(X) ⇒ ok(X), where ok is defined as follows:

ok(X) ⇔ prime(X) ∨ fakeable(X)
fakeable(X) ⇔ X = nil ∨ (∃ Y, Z : pub(Y) ∧ pub(Z) ∧ (X = cons(Y, Z) ∨ X = enc(Y, Z)))

where cons is injective pairing and enc(Y, Z) is Z encrypted with key Y. Intuitively, a fakeable message is one that an active adversary could synthesize from simpler messages that are already published. The definition of prime is protocol specific; typically, there is one prime case for each submessage of each message published by a protocol step, saying that the submessage is prime if the step has been executed and the adversary is able to strip off all of the surrounding encryption. For example, if the protocol has a step, recorded with the state predicate p0(A, Na), that generates a random nonce Na and publishes the message enc(k(A), Na), then the definition of prime will satisfy

p0(A, Na) ⇒ prime(enc(k(A), Na))
p0(A, Na) ∧ dk(k(A)) ⇒ prime(Na)

where dk(k(A)) means that there is some published key that decrypts messages encrypted under k(A). (For example, if k(A) is a symmetric key, dk(k(A)) is equivalent to pub(k(A)).) The key to the definition of ok is that the definition of fakeable has constructors (such as enc and cons), but no destructors (such as projection or decryption). Why is this important? Suppose we know that a message of the form enc(X, Y) is published. By the secrecy invariant, we know that this message is ok; because we assume that the ranges of enc, cons, and nil are disjoint, we know that this message cannot be nil or a tuple. Thus, either the message is prime (in which case we know that a concrete protocol step has been executed, which amounts to knowing something specific about the state of some principal) or it's fakeable. And if it's fakeable, then we know that both X and Y are published (because enc is injective). Of course X and Y might be complex terms, in which case we might have more cases to consider, but the key point is that X and Y are each syntactically simpler than enc(X, Y), so reasoning through the secrecy invariant forces us to consider the publication of only messages that are simpler than the message we started with. This would not be the case if destructors were allowed in the definition of fakeable. For example, if the definition of fakeable included a case corresponding to adversary decryption, like

(∃ Y : pub(enc(Y, X)) ∧ dk(Y))
consideration of how X might be published would force us to consider more and more complex messages, such as pub(enc(Y, X)) ∧ dk(Y), pub(enc(Z, enc(Y, X))) ∧ dk(Y) ∧ dk(Z), etc.; we would then face the same kind of infinite regression that plagues backward symbolic search. In the description above, we have assumed that encryption and tupling are injective, with disjoint ranges, and do not yield atoms. We will revisit these assumptions in later sections.
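The destructor-free case analysis is easy to mimic mechanically. The following Python sketch is an illustration written for these notes (it is not TAPS code): messages are tagged tuples, and asking how a published message could be ok only ever descends into strictly smaller subterms, so the analysis terminates.

# Messages are ('atom', a), ('cons', x, y) or ('enc', k, m).  Given the set
# 'published' and a protocol-specific predicate 'prime', enumerate the ways
# a published message can be ok; note the fakeable case only inspects the
# immediate (strictly smaller) subterms.
def explanations(msg, published, prime):
    if prime(msg):
        yield ('prime', msg)
    if msg == ('atom', 'nil'):
        yield ('nil',)
    if msg[0] in ('cons', 'enc'):
        _, y, z = msg
        if y in published and z in published:
            yield ('fakeable', y, z)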
2 Weak Secrets
Real systems implement security through a combination of strong secrets (e.g., randomly generated nonces and keys) and weak secrets (e.g., passwords and PIN numbers). Most verifiers treat these secrets the same way, but it is well-known that weak secrets require special handling to prevent offline guessing attacks. Gavin Lowe has reported on using Casper to search for such attacks [4]; TAPS has recently been extended to prove security in the presence of a Dolev-Yao attacker that can also engage in such attacks [2]. An offline guessing attack represents an additional action available to a Dolev-Yao spy. In this attack, the adversary constructs a Boolean function that returns true for exactly one combination of inputs chosen from a set of weak secrets. The function must be constructed from published messages and operations available to the spy ((en/de)cryption and (un)tupling), except for the final step, which is either an equality test or checking membership in a sparse, recognizable class of messages (such as asymmetric key pairs or English texts). A successful attack allows the adversary to publish the input messages for which the function yields true. To prove the absence of effective guessing attacks, we consider the set of computation steps used to compute the output true in a minimal guessing attack, and try to show that the set can't contain a combination of steps that can produce the inputs needed to make the final step give the result true. (For example, for the case where the final step is an equality test, we have to show that the same message can't be produced in two essentially different ways from the inputs, or produced both from the inputs and from the set of published messages.) Of course we can't just manipulate sets of computation steps; we need a first-order bound on this set of steps. Fortunately, guessing the set of computation steps that can occur in a minimal attack turns out to be very much like guessing the set of messages that can be published. In particular, the predicate characterizing the computation steps can be given a structure that parallels that of the secrecy invariant (though a bit more complicated, since it deals with computation steps instead of messages), and can be constructed using similar techniques (see [2] for details).
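As a concrete illustration of the attacks being modelled, consider a spy who has observed a ciphertext encrypted under a weak password and who owns a recognizer for a sparse class of plaintexts. The brute-force verifier is then only a few lines; this example is hypothetical and all names in it are invented.

# Offline guessing attack: every step uses only operations available to a
# Dolev-Yao spy; the final recognizer step (membership in a sparse,
# recognizable class) is what distinguishes the correct guess.
def offline_guess(ciphertext, dictionary, decrypt, recognizable):
    for guess in dictionary:             # enumerate the weak secret
        candidate = decrypt(guess, ciphertext)
        if recognizable(candidate):      # the final, distinguishing step
            return guess                 # the attack publishes the secret
    return None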
3 Working with Bits
Like most protocol verifiers, TAPS assumes injective tupling. However, real implementations manipulate bit strings, not tuples. The use of tuples leads to both false attacks (e.g., a value masquerading as another of different length) and false proofs of security (e.g., it can miss low-level bit-splicing attacks). In this section, we show how to modify the secrecy invariant to work at the level of bits. We begin by treating all messages as bit sequences. To do this, all terms have to be typed with their bit length. We eliminate tupling, replacing it with concatenation (represented with an infix .). We define bit projection and concatenation in the obvious way, where Xi is the i'th bit of X:

len(X.Y) = len(X) + len(Y)
X = Y ⇔ len(X) = len(Y) ∧ (∀i : 0 ≤ i < len(X) : Xi = Yi)
(X.Y)i = Xi if i < len(X), and (X.Y)i = Y(i−len(X)) otherwise

We assume that in the adversary, tupling and untupling are replaced with concatenation and bit projection. Note that we have also eliminated the message nil, which is just the bit string of length 0. Since we are working at the level of bits, we consider messages to be ok on a bit-by-bit basis. So the new secrecy invariant has the form

pub(X) ⇒ (∀i : 0 ≤ i < len(X) ⇒ ok(Xi))
ok(X) ⇔ prime(X) ∨ fakeable(X)
fakeable(X) ⇔ (∃ U, V, j : pub(U) ∧ pub(V) ∧ X = enc(U, V)j)

i.e., every bit of a published message must be ok, a bit is ok iff it is prime or fakeable, and a bit is fakeable iff it is a bit of an encryption of published messages. We also have to change the generation of prime cases to generate cases for bits, so instead of a clause like p0(A, Na) ∧ dk(k(A)) ⇒ prime(Na) we use the clause p0(A, Na) ∧ dk(k(A)) ⇒ prime(Nai). Finally, in order to support this finer-grained model, we need to strengthen the injectivity of encryption as follows:

enc(U, V)i = enc(X, Y)j ⇔ U = X ∧ V = Y ∧ i = j
atom(X) ⇒ Xi ≠ enc(U, V)j
atom(X) ∧ atom(Y) ∧ Xi = Yj ⇒ X = Y ∧ i = j

These axioms say that each bit value can be constructed in only one way. Since there are only two genuine bit values, how might we justify such a silly model?
One way to think about it is that we are allowing the attacker to control all scheduling and message delivery, and to see exactly what each principal does, but we are not allowing him to see the actual data values, and are considering only attacks that are guaranteed to succeed regardless of these values. An attractive aspect of this model is that a proof of security looks pretty much like a proof of security in the more usual model; instead of showing that a sequence of bits remains secret, we show that each bit of the sequence remains secret. Another nice feature is that we can get a more realistic, quantitative notion of security by allowing the adversary to guess up to k bits (simply add to the protocol an adversary action that publishes a fixed but unconstrained k-bit constant).
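A tiny sketch (invented here, not TAPS code) of this bit-level message model: a message is just its length together with a bit-indexing function, concatenation implements the (X.Y)i equation above, and equality is bitwise.

class Msg:
    # A typed bit string: 'length' bits, bit(i) yields bit i.
    def __init__(self, length, bit_fn):
        self.length = length
        self.bit = bit_fn

    def concat(self, other):
        # (X.Y)_i = X_i if i < len(X), else Y_(i - len(X))
        def bit(i):
            return self.bit(i) if i < self.length else other.bit(i - self.length)
        return Msg(self.length + other.length, bit)

    def __eq__(self, other):
        return (self.length == other.length and
                all(self.bit(i) == other.bit(i) for i in range(self.length)))

nil = Msg(0, lambda i: 0)           # the empty bit string of length 0
x = Msg(3, lambda i: [1, 0, 1][i])
assert nil.concat(x) == x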
4 Reversible Encryption
Symmetric-key protocols are usually built on top of reversible symmetric-key block permutations (e.g., DES). These functions satisfy

dec(K, enc(K, X)) = X
enc(K, dec(K, X)) = X

so encryption under these functions is obviously not injective (at least not in both arguments). Suppose we choose to work in the initial crypto algebra of these equations (i.e., where all values are built up from atoms, projection, concatenation, encryption and decryption). Terms in this algebra can be reduced to a normal form by using the equations above as rewrite rules. Therefore, if enc(X, Y) = enc(U, V) and both sides of the equation are in normal form, we can conclude X = U ∧ Y = V. However, normalization is a metalogical concept, not a logical one. Therefore, we introduce the following first-order approximation to normalization:

term(X) ⇔ (∀i : (∃ U, V, W, j : Xi = Uj ∧ (atom(U) ∨ nenc(V, W, U) ∨ ndec(V, W, U))))
nenc(X, Y, Z) ⇔ ne(X, Y) ∧ enc(X, Y) = Z
ndec(X, Y, Z) ⇔ nd(X, Y) ∧ dec(X, Y) = Z
ne(X, Y) ⇔ term(Y) ∧ (∀ U : ¬ndec(X, U, Y))
nd(X, Y) ⇔ term(Y) ∧ (∀ U : ¬nenc(X, U, Y))

Intuitively, nenc(X, Y, Z) (respectively ndec(X, Y, Z)) means that Z can be written as a normalized term of the form enc(X, Y) (respectively dec(X, Y)), and ne(X, Y) (respectively nd(X, Y)) means that Y is a term that cannot be written as a normalized term of the form dec(X, U) (respectively, enc(X, U)). term(X) means that every bit of X is either a bit of an atom, or a bit of a normalized encryption or decryption.
We then replace our previous injectivity axioms with the following properties:

enc(X, Y)i = enc(U, V)j ∧ ne(X, Y) ∧ ne(U, V) ⇒ X = U ∧ Y = V ∧ i = j
enc(X, Y) = enc(X, V) ⇒ Y = V
enc(X, Y) = enc(Z, Y) ⇒ X = Z
enc(X, Y) = enc(U, V) ∧ ne(X, Y) ∧ X ≠ U ⇒ (∃ W : ndec(U, W, V))
enc(X, Y) = dec(U, V) ∧ ne(X, Y) ⇒ (∃ W : nenc(U, W, V))
nenc(X, Y, Z) ∧ atom(U) ⇒ Zi ≠ Uj

plus corresponding axioms where we reverse the roles of enc and dec, ne and nd, and nenc and ndec. These axioms hold in the initial crypto algebra if we compute the functions ne, nd, nenc, ndec, and term starting with atoms, then encryptions and decryptions of atoms, etc. Finally, the new secrecy invariant requires a new definition of fakeable:

fakeable(X) ⇔ (∃ U, V, Y, j : X = Yj ∧ pub(U) ∧ pub(V) ∧ (nenc(U, V, Y) ∨ ndec(U, V, Y)))

This says that every ok bit is either prime or fakeable, where a fakeable bit is a bit of a normal encryption or decryption of published messages.
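For intuition, the metalogical normalization that these predicates approximate is just rewriting with the two cancellation equations until no redex remains. A toy version over symbolic terms, invented here for illustration, might look as follows.

# Terms are ('atom', a), ('enc', k, m) or ('dec', k, m).  The rules
#   dec(K, enc(K, X)) -> X   and   enc(K, dec(K, X)) -> X
# are applied bottom-up, so the result contains no cancelling enc/dec pair,
# which is what nenc/ndec assert logically.
def normalize(t):
    if t[0] == 'atom':
        return t
    op, k, m = t
    k, m = normalize(k), normalize(m)
    inverse = {'enc': 'dec', 'dec': 'enc'}[op]
    if m[0] == inverse and m[1] == k:    # cancelling pair: strip both
        return m[2]
    return (op, k, m)

K, X = ('atom', 'K'), ('atom', 'X')
assert normalize(('dec', K, ('enc', K, X))) == X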
5 Bitwise Exclusive-or
Many crypto protocols use bitwise exclusive-or (notation: ⊕), either directly as a form of encryption (Vernam encryption) or as part of the encryption process (e.g., in cipher block chaining). Although this operator plays a central role in a number of algorithms, no automatic tool handles it yet, and there is not yet a satisfactory proposal for how to do so in a Dolev-Yao model.1 In this section we show how ⊕ can be added to the crypto algebra of the last section. The main issue in adding ⊕ to the crypto algebra is how to account for the adversary's ability to combine messages with ⊕. If we take the direct approach and simply add to the definition of fakeable a case like

(∃ Y, Z : X = Y ⊕ Z ∧ pub(Y) ∧ pub(Z))

we immediately have to consider arbitrary sums when deducing the consequences of the publication of any message (even an atom). The problem with the case above is that there is no way to make sure that Y and Z are "smaller" than X. In order to achieve this size-reducing property, we essentially replace ⊕ with disjoint union of nonempty finite sets, as follows.
1 To my knowledge, the only serious Dolev-Yao type proof of a protocol with ⊕ is a PVS proof using rank functions by Steve Schneider and Neil Evans. The particular style of rank function they employed, however, works only for protocols that can tolerate publication of the ⊕ of any even number of secrets.
Define a unit to be any 1-bit message; we say a unit is basic if it is a bit of a term as defined in the last section (i.e., a bit of an atom or of a normal encryption or decryption). In the initial model, each unit can be represented uniquely as a sum over a finite set of basic units; two units are disjoint if the basic units of their representations are disjoint (as sets). Our new secrecy invariant says that a bit is ok iff it is a finite sum of pairwise disjoint units, each of which is either prime or fakeable:

ok(X) ⇔ (∃ Y : X = (⊕ j : Yj)
  ∧ (∀ j, k : j ≠ k ⇒ Yj and Yk are disjoint)
  ∧ (∀ j : prime(Yj) ∨ fakeable(Yj)))

Because the elements of Y must be pairwise disjoint, this formulation avoids the infinite regression mentioned above. For example, from this invariant, we can conclude that if X is a bit of an atom, then X must be prime; if X is a bit of a normal encryption, then X must be prime or fakeable. For this invariant to be preserved when the adversary publishes the sum of two published messages, we require an additional condition: every sum of two primes can be written as a disjoint sum of primes. In general, this requires expanding the definition of primality. For example, if for formulas f and g we have both f ⇒ prime((Na ⊕ Nb)i) and g ⇒ prime((Nb ⊕ Nc)i), then we need to add a prime case f ∧ g ⇒ prime((Na ⊕ Nc)i). For most examples, this process terminates with a reasonable number of primes, but there are cases where we can't achieve closure with any finite set of primes; for example, if a protocol receives a message X under encryption and publishes X ⊕ f(X), then the prime cases would have to include cases for X ⊕ f(f(X)), X ⊕ f(f(f(X))), etc., and the best we could hope for is a first-order approximation to this set. Fortunately, such protocols don't seem to arise in practice.
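The closure step on primes is mechanical. Representing each prime by the set of basic units it sums (so that ⊕ becomes symmetric set difference), one keeps adding the ⊕ of overlapping pairs until a fixpoint is reached; the sketch below is a hypothetical illustration of this process, not TAPS code.

# Saturate a set of primes under ⊕, modelling each prime as a frozenset of
# basic units; the sum of two disjoint primes needs no new case, while
# overlapping primes generate a new one (their symmetric difference).
def close_primes(primes):
    primes = set(primes)
    while True:
        new = {p ^ q for p in primes for q in primes
               if p != q and (p & q) and (p ^ q)}
        if new <= primes:
            return primes
        primes |= new

na_nb = frozenset({'Na0', 'Nb0'})    # stands for (Na ⊕ Nb)_0
nb_nc = frozenset({'Nb0', 'Nc0'})    # stands for (Nb ⊕ Nc)_0
assert frozenset({'Na0', 'Nc0'}) in close_primes({na_nb, nb_nc})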
6 Disclaimers
I would like to close with two important disclaimers. First, of the extensions presented here, only weak secrets have been implemented. Although the other extensions are straightforward in principle, it will require a lot of work (and a major rewrite of TAPS) to turn these ideas into an efficient mechanical implementation. So these should be considered work in progress, presented to stimulate people with other verification approaches to think about how they might attack the same kinds of extensions. Second, for formal methods persons looking to get involved with security, I do not recommend Dolev-Yao crypto protocol verification. There are already good tools, many more tool producers than consumers, and most of the low-lying fruit has been picked. There are other, more interesting opportunities in this area, e.g., mechanizing the reduction-style proofs favored by cryptographers, or the analysis of security properties of large networks. Nevertheless, there is still a lot of interesting work to be done to bring Dolev-Yao protocol analysis closer to protocol reality.
References

1. E. Cohen, First-order verification of cryptographic protocols. JCS, to appear. A preliminary version appears in CSFW XIII (2000).
2. E. Cohen, Proving protocols safe from guessing attacks. IJIS, to appear. Also in FCS/VERIFY workshop, 2002.
3. J. Heather, G. Lowe, and S. Schneider, How to prevent type flaw attacks on cryptographic protocols. CSFW XIII (2000).
4. G. Lowe, Analysing protocols subject to guessing attacks. WITS 2002.
5. L. Paulson, Verifying the SET protocol: overview. In this volume.
Formal Specification for Fast Automatic IDS Training

Antonio Durante, Roberto Di Pietro, and Luigi V. Mancini

Dipartimento di Informatica, Università di Roma "La Sapienza", Via Salaria 113, 00198 Roma
{durante,dipietro,mancini}@dsi.uniroma1.it
Abstract. This paper illustrates a methodology for synthesizing the behavior of an application program in terms of the set of system calls invoked by the program. The methodology is completely automated, with the exception of the description of the high-level specification of the application program, which is left to the system analyst. The technology employed for this synthesis (VSP/CVS) minimizes the effort required to code the specification of the application. The methodology is completely independent of the intrusion detection tool adopted, and appears suitable for deriving the expected behavior of a secure WEB server that can effectively support the increasing demand for security in e-commerce. As a case study, the methodology is applied to the Post Office Protocol daemon ipop3d.
1 Introduction

Nowadays computer systems operate in highly dynamic and distributed environments, which makes protection mechanisms essential to prevent intentional or unintentional violations of access constraints. At the moment, the access control policies implemented in commercial operating systems (OS) are not always sufficient to protect the integrity, availability and confidentiality of the system. Often, attackers are able to circumvent the access control mechanisms by exploiting security flaws in the applications or in the OS. As an example, in many cases attackers tend to hijack the control of privileged processes, such as daemon processes. Moreover, the compromised system can be used as a starting point to perform further attacks over the network. A well-known family of attacks of this kind is the buffer overflow attack [2]. Our proposed methodology is aimed at mapping the normal behavior of an application program onto its allowed system calls, thus enabling the detection of attacks that attempt to hijack the execution of privileged processes. The detection of TCP-layer attacks, as well as of application-layer attacks, is outside the scope of this paper. The contribution of this paper is intended to help the deployment of anomaly-based Intrusion Detection Systems (IDS). This kind of IDS allows the detection of anomalous behaviors of an application program, i.e., behaviors that differ from its expected behavior. Usually, the allowed behavior is described in terms of the invoked system calls [9,8,17,21,22].
The authors were partially supported by the project Web-MiNDS and by the Italian MIUR under the FIRB program.
The scope of this paper is the definition of the expected behavior of application programs. In particular, we describe a methodology that, starting from a high-level description of the application program, such as an IETF RFC [11], derives the set of system calls that can be invoked, which models the normal behavior of the program. Note that the generated system calls are specific to the particular implementation of the application program to which the proposed methodology is applied. For instance, consider an FTP daemon that receives USER/PASS requests; while processing these requests, the daemon could execute different kinds of system calls depending on its implementation. To perform authentication, the FTP daemon may need to read a security-sensitive file (using regular I/O system calls), or it may access the same sensitive file via memory mapping (using the mmap system call), or it may access the sensitive file via NIS (using socket connects, reads, and writes), etc. Note that the particular set of system calls used to perform the FTP authentication is decided by the programmer while implementing the daemon. Hence, we do not try to synthesize the allowed system calls of all possible implementations of a specification, but consider a specific implementation of the daemon that will run on the specific system under consideration. In this paper, we assume that the specific implementation of the application program to which the proposed methodology is applied does not contain malicious code, though it need not be trusted. In other words, the application program could contain potential bugs in the implementation of the high-level specification (e.g., bugs that could be exploited by a buffer overflow attack), but should not contain arbitrary malicious operations (e.g., a malicious programmer adding Trojan code which creates a root account in the password file, even though such an operation is not required by the high-level specification of the particular application program). We use a technology that has been successfully applied to protocol design and analysis [3,4]. Once the program is specified using the high-level language VSP, it is compiled with the CVS compiler, which translates the VSP into a Security Process Algebra (SPA) [5]. The methodology is completely independent of the availability of the source code, and it is completely modular with respect to any IDS. To show the effectiveness of the methodology, we have applied it to a particular specification-based IDS prototype: REMUS [1]. In [1], the system calls that are considered critical for the security of the system are intercepted by a LINUX kernel module specifically designed for this purpose. This module operates as a reference monitor that denies or allows the execution of a particular system call invoked by a daemon or by a setuid software program. The decision of the reference monitor is based on a kernel data structure, called the Access Control List (ACL), that maintains the set of authorized system calls and their parameters. The content of such a data structure can be seen as a classification of the behavior of a program. Our methodology represents a first attempt at the automatic definition of the normal profile of a privileged program in terms of the set of its invoked system calls. This issue is common to other IDSs [6,7,12,13,16,18,23]. The contributions of this paper are: (1) a methodology to speed up the synthesis of the normal behavior of an application program.
This methodology differs from those based on source code analysis, since our approach synthesizes the program behavior starting from both a high-level specification document, such as an IETF RFC, and a
particular implementation of the code. An implicit advantage of not relying on source code is that our approach is applicable even if the source code is not available for analysis. Moreover, the process is automated, with the only exception of the specification phase, which is a high-level human activity; (2) an improvement of IDSs based on system call interposition, such as REMUS. In particular, we automate the generation of the ACL, ensuring a good and fast approximation of the legal application program behavior. Our methodology can provide the same results as a methodology based on several months of interactive learning, while not raising false positives, within a much shorter time period; (3) finally, a finer classification could be introduced in the ACL, e.g., storing the logical addresses of the system calls that are normally executed, as detailed in section 4.2. These results, combined with the low overhead introduced by IDSs based on system call interposition, are considered the enabling factor for the effective deployment of a secure server on the WEB. The paper is organized as follows: the next section illustrates the proposed methodology. Section 3 offers a case study, applied to the ipop3d daemon, while section 4 presents a description of how the methodology has been applied to the REMUS prototype and one possible extension of the methodology. Section 5 deals with related work in the field. Finally, concluding remarks and further work are reported.
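To make the reference-monitor mechanism concrete, the following sketch models the decision step described above; it is hypothetical illustration code, not the actual REMUS implementation, and the ACL entries shown are invented.

# A critical system call is allowed only if the (program, syscall,
# parameters) triple appears in the ACL built during the training phase;
# anything else is denied.
ACL = {
    ("ipop3d", "open",   ("/var/spool/mail", "r")),
    ("ipop3d", "socket", ("AF_INET", "SOCK_STREAM")),
}

def reference_monitor(program, syscall, params):
    if (program, syscall, params) in ACL:
        return "ALLOW"
    return "DENY"    # e.g., a hijacked daemon trying to spawn a shell

assert reference_monitor("ipop3d", "execve", ("/bin/sh",)) == "DENY"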
2 Methodology
The methodology we propose takes as input a formal specification of an application A, and returns as output the set of system calls that a specific program implementation of A is allowed to invoke. Throughout this paper, we apply our methodology to RFC1939 (ipop3d) [10] as an example. Note that any other specification of an application program, with the level of detail of an RFC, could have been adopted as the starting point of our methodology. However, we have based our discussion on RFCs since we intend to address the implementation of secure Internet servers, which are mainly based on the execution of standard daemons, whose expected behavior is described through RFCs. In the following, we detail the steps of the methodology and subsequently develop a simple example. Note that we are not interested in detecting attacks at the application layer, such as brute-force username/password guessing. Indeed, these attacks do not manifest themselves as illegal patterns of invoked system calls. The first step of the methodology consists of modeling the application program behavior as a Finite State Machine (FSM) that can recognize any session of command executions of the application program A, triggered by a client. This step requires human intervention to express the RFC specification of the application program A with a state-transition semantics (an automaton). The states of the FSM are derived from the RFC, and the transitions between states are the possible commands that the application program can be requested to execute. The second step consists of formalizing the FSM using the VSP language [3]. This step too must be carried out by the system analyst. In the third step, the VSP specification is compiled using the CVS compiler [3], see section 3.4 for details. In particular, the compilation produces a Security Process Algebra (SPA) specification that we call FSM1.
The fourth step of the methodology consists of exploring the FSM1 to obtain the finite set of command sequences that may be invoked by an execution of the application program. A set of command executions accepted by the FSM can be equivalently represented as a subset of the command sequences produced by FSM1. Thus, executing the set of command sequences accepted by the FSM1 will invoke the same set of system calls invoked by the application program when executing the command sequences recognized by the FSM, under the assumptions detailed at the end of this section. In the fifth step, the sequences of commands produced by CVS are translated, through a simple parsing algorithm, into the sequences of commands executable by a tool called ILSC (Invocation of Legal Sequences of Commands); a sketch of this translation is given below. The ILSC executes such command sequences on a specific implementation of the application program. During this step the module REMUS is loaded in configuration mode to intercept and log all the system calls invoked by the application program. Then, the logged system calls are used to update the ACL. Note that the first and second steps above are carried out by the system analyst, while the others are automated. In the following, we illustrate the whole sequence of steps in the case study. The assumptions underlying our methodology are: (1) the correspondence between the IETF command specification and its implementation is maintained; (2) any command implementation invokes the same set of system calls regardless of the value and size of the input parameters. Such assumptions are consistent with the best practices applied in the Software Engineering field, in which the software development process focuses on standardization [19].
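As an illustration of the fifth step, the following Python fragment sketches the simple parsing that turns the abstract command sequences produced by CVS into concrete lines the ILSC can replay against the daemon. It anticipates the command names of the ipop3d case study in the next section; the concrete POP3 lines (user name, password, the STAT transaction command) are hypothetical stand-ins that would be chosen for the specific installation being profiled.

# Hypothetical mapping from abstract FSM commands to concrete protocol lines.
CONCRETE = {
    'USER name': 'USER alice',
    'PASS ****': 'PASS secret',
    'TRANS_INP': 'STAT',
    'QUIT':      'QUIT',
}

def to_ilsc(trace):
    """Map each abstract command of a trace to a line executable by the ILSC."""
    return [CONCRETE[cmd] for cmd in trace]

print(to_ilsc(['USER name', 'PASS ****', 'QUIT']))
# ['USER alice', 'PASS secret', 'QUIT']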
3 Case Study
3.1 POP3 Commands
When the ipop3d daemon service is started, it listens on TCP port 110 [10]. When a client host wishes to make use of the service, it establishes a TCP connection with the server host. When the connection is established, the ipop3d server sends a greeting. The client and the ipop3d server daemon then exchange commands and responses until the connection is closed or aborted. The commands of the post office protocol consist of a keyword followed by zero or one arguments. The response of the ipop3d daemon consists of a success indicator possibly followed by additional information. There are currently two indicators: positive ("+OK") and negative ("-ERR"). A post office protocol session progresses through a number of states during its lifetime. Once the TCP connection has been opened and the ipop3d server has sent the greeting, the session enters the AUTHORIZATION state. In this state, the client must identify itself to the ipop3d server. Once the client has been successfully identified, the server acquires the resources associated with the client's maildrop, and the session enters the TRANSACTION state. In this state, the client requests actions from the ipop3d server. When the client has finished its transactions, the session enters the UPDATE state.
In this state, the ipop3d server releases any resource acquired during the TRANSACTION state and says goodbye. The TCP connection is then closed. For a complete description of the post office protocol see RFC1939 [10].
3.2 The FSM (Step 1)
To model the interactions between a client and the ipop3d daemon we use a Finite State Machine, FSM. We define the FSM so that the transitions represent the commands invoked by a client and the states are those reached by the daemon as a consequence of these interactions. Figure 1 shows the FSM derived from the RFC for the ipop3d daemon. In each state an error can occur due to a bad input command, denoted BAD INP. The errors can be divided into two kinds, as reported in Table 1:
Error            Description
command error    the name of the command does not coincide with any of those specified in the RFC, given the current state of execution of the daemon.
parameter error  the parameter is omitted when required, or it is wrong: out of range, mismatch.

Table 1. Possible errors recognized by the ipop3d daemon

Each time the ipop3d daemon receives a bad input command (BAD INP), the software sends back to the client an error message err-"error message". When the client sends a "well formed" command (a command and its parameters are well formed if they respect the RFC specification), the daemon returns as output a +OK "message". The daemon terminates its execution when it reaches one of the two possible final states: UPDATE (U) or LOGOUT (L).
Fig. 1. The FSM of the ipop3d daemon (states A1, A2, T, U, and L; transitions USER name, PASS ****, TRANS_INP, QUIT, and BAD INP).
We call a trace a finite sequence of commands accepted by the FSM. If we consider the set of all the traces recognized by the FSM, they correspond to the set of all possible
196
A. Durante, R. Di Pietro, and L.V. Mancini
different sequences of commands invoked by a client and executed by the ipop3d daemon. Note that there is a correspondence between each command invocation and the set of system calls executed at kernel level.
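To make this correspondence concrete, the FSM of Figure 1 can be written down as a transition table. The following Python fragment is an illustrative reconstruction based on our reading of the figure (the state and command names come from Figure 1; the accepts helper is ours):

# States: A1, A2 (AUTHORIZATION), T (TRANSACTION), U (UPDATE), L (LOGOUT).
TRANSITIONS = {
    'A1': {'USER name': 'A2', 'QUIT': 'L', 'BAD INP': 'A1'},
    'A2': {'PASS ****': 'T',  'QUIT': 'L', 'BAD INP': 'A2'},
    'T':  {'TRANS_INP': 'T',  'QUIT': 'U', 'BAD INP': 'T'},
}
FINAL_STATES = {'U', 'L'}

def accepts(commands, state='A1'):
    """Return True if the command sequence is a trace recognized by the FSM."""
    for cmd in commands:
        state = TRANSITIONS.get(state, {}).get(cmd)
        if state is None:
            return False
    return state in FINAL_STATES

assert accepts(['USER name', 'PASS ****', 'TRANS_INP', 'QUIT'])

Under assumption (2) of section 2, each such accepted trace maps, on a given implementation, to a fixed set of invoked system calls.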
3.3 VSP Specification for ipop3d (Step 2)
To obtain the FSM1, the system analyst has to specify the daemon using the VSP language. VSP is a value-passing language, similar to value-passing CCS [15], that allows protocol specification. A VSP specification is translated into a Security Process Algebra (SPA) [5] specification using the CVS compiler. The process of describing an Internet daemon through VSP is an extension of the use for which VSP was initially intended: VSP was developed to describe protocols [3,4]. In general, a protocol consists of a set of messages (that contain a set of values) exchanged by two or more entities to reach a common goal (e.g., authentication). Accordingly, a daemon is specified in VSP via a set of message exchanges. A daemon can accept a command and give as output: (1) an error message if the command is not well formed; (2) an ok message if the command is well formed. Therefore, describing a daemon through messages is a task that can be achieved if we employ messages that contain as parameters the name of the command that the daemon has to execute and the parameters of the invoked command. Given this idea of how VSP can be employed to describe the behavior of a daemon, we detail below the four steps of the procedure that leads the system analyst to the VSP specification of the FSM:
– definition of the commands and the values of the command parameters; in this step the system analyst has to synthesize the set of commands that a daemon can accept and the values that the command parameters can assume during a normal daemon session;
– definition of the messages accepted by the daemon; as expressed above, the messages accepted by the daemon contain the name of the command that the daemon has to execute and the values of its parameters;
– declaration and definition of the body of the daemon process; this part specifies the "body" of the daemon server. The body of a process consists of a sequence of messages. There are two kinds of messages: (1) the input messages, which correspond to a command invocation; (2) the output messages, which correspond to the output of the ipop3d daemon. An output message can assume two values: (a) OK; (b) err-, according to whether the received command is well formed or not. It is not necessary to specify the body of the client process, as usually required by a VSP specification, because the specification of the behavior of the daemon covers all possible interactions that the daemon itself can perform with any client;
– definition of a generic session; it is sufficient to consider a single instance of the daemon VSP process because the VSP coding of the daemon process generates all the sequences of messages that could be executed during a session with a generic client.
3.4 Compiling VSP (Step 3)
The CVS compiler takes as input the VSP specification of the daemon and generates the FSM1. The body of the VSP process is made up of a linearly ordered sequence of input and output messages. The FSM1 obtained using the CVS compiler can be modeled as a tree. The corresponding tree model for the generation of the FSM1 can be obtained according to the rules in figure 2.

TreeGeneration(node *m, CartesianP *P)
Begin
  If (m == root) {
    GenerateSPA(m); m = m->next;
    TreeGeneration(m, P);
  }
  else If (m == InputMessage) {
    If (checkInBound(m)) {
      GenerateSPA(m); m = m->next;
      TreeGeneration(m, P);
    }
    else {
      while (P != Null) {
        P = GenerateCartesianP(m);
        TreeGeneration(m, P);
      }
    }
  }
  else If (m == OutMessage) {
    If (checkInBound(m)) {
      GenerateSPA(m); m = m->next;
      TreeGeneration(m, P);
    }
    else { print(err); exit; }
  }
  else If (m == Null)
    exit;
End
Fig. 2. The SPA generation code.

In the routine for the generation of the FSM1 code, we call m the message that we want to translate into SPA code. P is a possible instance of the message m. Each time the routine generates an FSM1 message, the routine moves to the next message via the statement m = m->next. When a message is translated from the VSP to the FSM1, we say that a VSP message is expanded into an FSM1 message. The routine in figure 2 works as follows: 1. the root of the tree is the FSM1 name of the process; 2. if the next examined message m is an input message, this message must consist of a set of parameters, say par1, par2, ..., parK, usually not instantiated. We check if the
current message has parameters that can assume only one possible value with the function checkInBound. If this is not the case, the CVS compiler, starting from this message, generates the Cartesian product of the values of the command parameters. Each element of the Cartesian product constitutes a different son of the root if the expanded message is the first one. Otherwise, the generated messages are the sons of the previously expanded message; 3. if the examined message m is an output message, then its parameters are usually instantiated. If the parameters are bound, that is, they assume just one value, the compiler generates a son of the previously expanded message. Such a node is labeled with the output message and the actual values of the parameters. If the parameters are not instantiated, that is, they assume more than one value, an exception is raised and the routine is stopped; 4. the routine terminates when there are no more messages to expand. Note that the representation of FSM1 contains neither links to other nodes at the same level, nor links to the ancestors. Moreover, each node has one parent only. Finally, there cannot be isolated messages, since the compiler always links a generated node to one and exactly one of the previously generated nodes. Therefore, we are assured that the generated FSM1 graph is actually a tree.
3.5 Visiting the FSM1 (Step 4)
Producing all the traces of the FSM1 is simple and consists of a depth-first search in the process algebra tree produced by the CVS compiler. We use the algorithm GetTraces, which takes as input: the first line of the SPA code, firstline; the root node, i.e., the name of the SPA process, firstnode; and an empty buffer that will contain the execution traces, i.e., the command sequences. Note that the algorithm GetTraces follows a classical depth-first visit, collecting all the root-to-leaf paths of the FSM1.
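The following Python sketch illustrates the depth-first visit performed by GetTraces. The actual algorithm walks the SPA text produced by CVS; here, for brevity, we assume the FSM1 tree is already available as linked node objects (the Node class and its field names are hypothetical):

class Node:
    def __init__(self, label, children=None):
        self.label = label              # command carried by this node
        self.children = children or []  # sons generated by the compiler

def get_traces(node, prefix=(), traces=None):
    """Depth-first visit of the FSM1 tree, collecting every root-to-leaf path."""
    if traces is None:
        traces = []
    path = prefix + (node.label,)
    if not node.children:               # a leaf closes one execution trace
        traces.append(path)
    for child in node.children:
        get_traces(child, path, traces)
    return traces

# A tiny FSM1 with two root-to-leaf paths:
# ipop3d USER PASS QUIT and ipop3d USER QUIT.
root = Node('ipop3d',
            [Node('USER', [Node('PASS', [Node('QUIT')]),
                           Node('QUIT')])])
print(get_traces(root))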
4 Applying the Methodology to the REMUS Project
To show the suitability of our methodology to an anomaly IDS that bases the profiling of the application program on the analysis of the invoked system calls, we have applied the methodology to the REMUS prototype [1]. The overall picture is as follows: the traces produced by the GetTraces algorithm are translated into command sequences that can be invoked by the ILSC module (step 5). The ILSC executes such command sequences on a specific implementation of the daemon (that we want to profile) when the module REMUS is loaded in the OS kernel in configuration mode with the control on the system calls activated. During the execution of the ILSC, the module REMUS stores the system calls invoked by the daemon in a file with the following format: system call name - parameters - invoking program - Program Counter value. The first field consists of the name of the system call that the application program can invoke; the second field consists of the argument values of the system call; the third field is the name of the monitored application program; the last field consists of the Program Counter (PC) value of the invoked system call at runtime. The content of this file is subsequently used to update the ACL.
Fig. 3. The manual and automated steps of the methodology: RFC -(step 1, system analyst)-> FSM -(step 2, system analyst)-> VSP spec -(step 3, CVS)-> FSM1 -(step 4, GetTraces)-> command sequences -(step 5, ILSC/REMUS)-> daemon session execution -> list of system calls.
A schema of the methodology applied to the REMUS IDS is shown in figure 3. The first two steps in figure 3 are carried out by the system analyst, while the others are automatic. During the fifth step the module REMUS is loaded in configuration mode to intercept and log all the system calls invoked by the ILSC. Then, the logged system calls are used to update the ACL. After the fifth step completes, the system is ready to provide its intended services. During this production mode, REMUS allows a system call execution if and only if the invoking process and the value of the arguments comply with the contents of the ACL previously built in the configuration mode. Thus, common penetration techniques that allow an attacker to hijack the control of a privileged process are blocked by this IDS.
4.1 Gains in Adopting the Methodology
The main advantages that we envisage in adopting this methodology can be summarized as follows:
1. the ability to avoid, by construction, false positives (granted that there are no omissions in the VSP specification); indeed, the daemon's normal behavior is completely mapped in terms of the set of system calls it can invoke during its execution;
2. the methodology is completely independent of the application whose normal behavior is modeled;
3. the automatic and fast ACL generation, as shown in the previous section.
A problem with non-specification-driven learning, e.g., adaptive learning, is the possibility of training on the basis of a faked version of the application. Indeed, an application may have been corrupted at run-time, and thus behavior that is unexpected from the point of view of the correct semantics of the application could be recorded as normal.
On the other hand, an off-line preparation of the correct set of invoked system calls protects against this risk, once it is assured that the analyzed application program does not contain malicious code. Indeed, the application program could be intrinsically insecure, e.g., the password file is copied and stored maliciously, but this problem is out of the scope of this paper and must be addressed by the software engineering certification branch (note that adaptive learning cannot cope with such a problem either).
4.2 Extending the Methodology
In recent papers [17,21,9], it has been proposed to monitor the sequences of system calls invoked by application programs to detect unexpected behavior. However, in [21,22] a new class of attack, the so-called mimicry attack, has been exposed that renders the recognition of system call sequences ineffective. An attacker can succeed in obtaining control of the application while the IDS is on, by producing the same sequence of system calls expected in a normal execution so as not to raise an alarm. Note that invoking the expected sequence of system calls with incorrect parameters could lead to system subversion. For an attacker to succeed, it is enough to find a path in the sequence that includes the system calls he has to perform, with the proper parameters to reach his goal. We are aware that capturing only the Program Counter (PC) is not enough to assure survivability against the mimicry attack, as exposed in [21,22]. Indeed, the PC will necessarily contain the logical address of the invoking system call, thus probably not preventing the attacker from generating such a logical address and thus bypassing the control. However, we have not been able to figure out an instance of such an attack. Moreover, note that by adopting our methodology it is also possible to record the exact sequences of invoked system calls. This could be accomplished by intervening on the ACL storing algorithm. However, our choice of recording only the PC of the invocation point of the system call produces a few gains: (1) it requires minimal data structures in the kernel; (2) it covers the majority of the effective threats; (3) it does not significantly affect the overall performance of the supported IDS. As a concluding remark, we can say that it is possible to extend the methodology to record the PC, and that recording only the PC, with respect to recording the complete system call sequences, is a good trade-off between the assured level of detection and the overhead introduced, both in terms of data structures and computation time. Indeed, monitoring the PC can achieve a reduction in the possibility of false negatives.
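To make the discussion concrete, the following Python sketch mimics the decision taken by the reference monitor when the ACL is extended with the PC of the invocation point. The real check happens inside the REMUS kernel module, which is written in C; the entry below (file name, open flags, addresses) is entirely hypothetical.

# Hypothetical ACL: for each (program, system call), the set of authorized
# (arguments, program counter) pairs logged during the configuration mode.
ACL = {
    ('ipop3d', 'open'): {(('/var/mail/user', 'O_RDONLY'), 0x0804a3f0)},
}

def allow(program, syscall, args, pc):
    """Reference-monitor decision: permit the call only if the program, the
    arguments, and the PC of the invocation all match an ACL entry."""
    return (args, pc) in ACL.get((program, syscall), set())

# The legitimate invocation point is accepted...
assert allow('ipop3d', 'open', ('/var/mail/user', 'O_RDONLY'), 0x0804a3f0)
# ...while replaying the same call from injected code (a different PC),
# or with different arguments, is denied.
assert not allow('ipop3d', 'open', ('/var/mail/user', 'O_RDONLY'), 0x08051234)
assert not allow('ipop3d', 'open', ('/etc/passwd', 'O_RDWR'), 0x0804a3f0)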
5 Related Work
There are several IDSs proposed in the literature, which we divide into two broad classes: network-based and host-based IDSs. The former try to detect attempts to subvert the normal behavior of the system by analyzing the traffic of the network [20]. The latter are intended to perform as the last line of defense: a host-based IDS strives to detect intrusions by analyzing the behavior of the system on which the IDS is run. Host-based IDSs can be further distinguished into three
categories: (1) anomaly detection, (2) misuse detection, (3) specification-based detection. In particular, the main characterization of the three methods can be summarized as follows: (1) the anomaly detection method is based on revealing behavior of the system that differs from an automatically updated profile depicting the normal behavior of the system; (2) misuse detection tries to classify all the possible known attacks on the system through a sort of signature; recognizing such a signature on the system raises an alarm; (3) the specification-based approach tries to specify the intended behavior of the monitored program; even a slight variation from this behavior raises an alarm. The performance of the approaches is measured in terms of: (1) false positives, i.e., an alarm raised in correspondence with a regular situation; (2) false negatives, i.e., the IDS does not raise an alarm while an intrusion occurs. The fundamental characteristic of the proposed approaches consists of defining the system behavior in terms of the sequences of system calls invoked by the monitored application [9]. However, the approaches differ since the system behavior can be modeled in different ways, e.g., formal specification [18], neural networks [8], sequences of patterns [13]. The strengths and weaknesses of each approach can be summarized as follows: (1) the strength of the anomaly detection approaches is related to the capacity of the algorithm to generalize the model of the normal behavior of the monitored program; the greater the generalization capacity of the algorithm, the higher the probability of identifying new types of attack. The drawback of such an approach is that when the IDS experiences a new behavior for the first time, it raises an alarm, which may be a false positive; (2) using the misuse detection approach, it is difficult to identify new kinds of attack, since this approach detects only the old ones, so false negatives can occur. However, when an alarm is raised, it is because a signature has been detected, and therefore false positives cannot occur. Note that the set of signatures could include ambiguous patterns that can be generated by an attacker as well as by a legitimate user; (3) the specification-based techniques try to cover the deficiencies of the anomaly detection and misuse detection approaches by defining the intended behavior of the controlled program. Any behavior that differs from the expected one is marked as illegal and an alarm is raised. The specification-based technique should have the precision of the misuse detection technique and also the capability of detecting new kinds of attack, as the anomaly detection technique does. However, specification-based techniques require a good level of technical competence: indeed, a good knowledge of the operations performed by the application program is needed, because such knowledge must be translated into a specification of the expected behavior in a format comprehensible to the IDS. On the other hand, IDSs based on anomaly and misuse detection techniques are, respectively, self-calibrating or already calibrated. Indeed, automatic techniques that guide the learning of the IDS have been proposed [17,21,9]. A more feasible specification-based approach is that proposed in [6]. Using this approach it is possible to implement several kinds of security mechanisms. Moreover, the described approach gives the possibility of combining various IDS mechanisms in different ways.
Among other approaches that cannot be classified in the above taxonomy, it is worth noting [16,23], which implement a Mandatory Access Control policy. If the defined security policy is too restrictive, a process has fewer privileges than the minimum
needed to execute its functionality, and then the system cannot work properly, requiring the intervention of the system administrator. However, such an approach detects all attempts to bypass the assigned privileges. We now focus on the REMUS prototype. Its design is based on the analysis of critical system calls, which achieves a good security level while intercepting only about 10% of the total number of system calls performed during execution. In particular, the overhead introduced by REMUS with respect to other IDSs is negligible. The system calls have been partitioned into levels of threat: the system calls of level 1 are those used by an attacker to gain complete control of the system. REMUS checks the system calls of level 1 only if the invoking process is a root daemon or if it is setuid to root; indeed, only in this case can the attacker gain access to the system as a privileged user. System calls belonging to other levels of threat are discarded by the IDS since they cannot lead to a subversion of a privileged process.
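A minimal sketch of this triage policy, assuming a hypothetical set of level-1 calls (the actual partition used by REMUS is given in [1]):

# Hypothetical examples of level-1 system calls: those an attacker could use
# to gain complete control of the system.
LEVEL_1 = {'execve', 'setuid', 'chown', 'chmod', 'mount'}

def must_check(syscall, is_root_daemon, is_setuid_root):
    """REMUS-style triage: only level-1 calls issued by privileged processes
    are submitted to the reference monitor; all other calls pass through."""
    return syscall in LEVEL_1 and (is_root_daemon or is_setuid_root)

This filtering is what keeps the interposition overhead low: the expensive ACL lookup is paid only on the small fraction of calls that could subvert a privileged process.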
6 Concluding Remarks
In this paper, we have presented a methodology to derive, starting from a high-level specification of an application program, the set of system calls the application can invoke. The possibility of incurring false positives depends on the completeness of the ACL: an ACL that does not contain all the possible system calls that the application program can invoke will block the execution of the application program each time it invokes a system call that is not present in the ACL. Note that this case can occur if the VSP specification is incomplete (e.g., the system analyst has incorrectly specified the VSP in the high-level specification phase). Our methodology does not need access to the source code; thus, it can be adopted even in those environments in which only the executable is available. Moreover, the methodology is completely independent of the IDS tool adopted. Note that when a new release of a daemon implementation becomes available, it is only necessary to execute the ILSC to upgrade the ACL (step 5), while preserving the efforts spent in following steps 1–4 of the methodology. Finally, except for the first two steps of the methodology described in section 2, which are at a high level of design, the process is completely automatic. The adoption of a specific technology (VSP/CVS), which is internally based on an automaton representation, allows us to obtain the profile of the normal behavior of the application program. The automated process based on VSP/CVS is efficient. Note that other training-based IDSs, such as anomaly detection systems, require a training period of months. Moreover, during such a period these training techniques may be exposed to the attacks reported in [9]. Among our next objectives, we want to derive a completely secure Web server that can effectively support the increasing demand for security in e-commerce. Indeed, our efforts are now concentrated on the application of the proposed methodology to the whole set of daemons that form the bulk of any Internet server (e.g., httpd, telnetd, sshd, ftpd, etc.). We intend to test such a suite under the DoD-certified attacks [14].
References
1. M. Bernaschi, E. Gabrielli, and L. V. Mancini. REMUS: a security-enhanced operating system. ACM Transactions on Information and System Security (TISSEC), 5(1):36–61, 2002.
2. C. Cowan, P. Wagle, C. Pu, S. Beattie, and J. Walpole. Buffer overflows: attacks and defences for the vulnerability of the decade. In Proceedings of the IEEE DARPA Information Survivability Conference and Expo, Jan. 2000.
3. A. Durante, R. Focardi, and R. Gorrieri. A compiler for analyzing cryptographic protocols using noninterference. ACM Transactions on Software Engineering and Methodology (TOSEM), 9(4):488–528, 2000.
4. A. Durante, R. Focardi, and R. Gorrieri. CVS at work: A report on new failures upon some cryptographic protocols. Lecture Notes in Computer Science, 2052:287–299, 2001.
5. R. Focardi and R. Gorrieri. The compositional security checker: A tool for the verification of information flow security properties. IEEE Transactions on Software Engineering, 23(9):550–571, 1997.
6. T. Fraser, L. Badger, and M. Feldman. Hardening COTS software with generic software wrappers. In IEEE Symposium on Security and Privacy, pages 2–16, 1999.
7. D. P. Ghormley, D. Petrou, S. H. Rodrigues, and T. E. Anderson. SLIC: An extensibility system for commodity operating systems. In Proceedings of the USENIX 1998 Annual Technical Conference, pages 39–52, Berkeley, USA, June 15–19, 1998. USENIX Association.
8. A. K. Ghosh, A. Schwartzbard, and M. Schatz. Learning program behavior profiles for intrusion detection. In Proceedings of the 1st USENIX Workshop on Intrusion Detection and Network Monitoring, pages 51–62, April 1999.
9. S. A. Hofmeyr, S. Forrest, and A. Somayaji. Intrusion detection using sequences of system calls. Journal of Computer Security, 6(3):151–180, 1998.
10. IETF Internet Draft, http://www.ietf.org/rfc.html.
11. RFC 1939, http://www.faqs.org/rfcs/rfc1939.html.
12. K. Ilgun, R. A. Kemmerer, and P. A. Porras. State transition analysis: A rule-based intrusion detection system. IEEE Transactions on Software Engineering, 21(3):181–199, March 1995.
13. J. L. Lin, X. S. Wang, and S. Jajodia. Abstraction-based misuse detection: High-level specifications and adaptable strategies. In PCSFW: Proceedings of the 11th Computer Security Foundations Workshop, pages 190–201. IEEE Computer Society Press, 1998.
14. R. P. Lippmann. Evaluating intrusion detection systems: The 1998 DARPA off-line intrusion detection evaluation. In Proceedings of the DARPA Information Survivability Conference and Exposition (DISCEX). IEEE Computer Society Press, Los Alamitos, CA, 2000.
15. R. Milner. Communication and Concurrency. Prentice Hall, 1989.
16. Security Enhanced Linux, http://www.nsa.gov/selinux.
17. R. Sekar, M. Bendre, D. Dhurjati, and P. Bollineni. A fast automaton-based method for detecting anomalous program behavior. In IEEE Symposium on Security and Privacy, pages 144–155, Oakland, CA, May 2001.
18. R. Sekar and P. Uppuluri. Synthesizing fast intrusion prevention/detection systems from high-level specifications. In Proceedings of the 8th USENIX Security Symposium, pages 63–78, Washington, DC, USA, August 1999.
19. C. Szyperski, D. Gruntz, and S. Murer. Component Software: Beyond Object-Oriented Programming. Addison-Wesley/ACM Press, 2002.
20. L. Portnoy, E. Eskin, and S. Stolfo. Intrusion detection with unlabeled data using clustering. In Proceedings of the ACM CSS Workshop on Data Mining for Security Applications, November 8, 2001.
21. D. Wagner and D. Dean. Intrusion detection via static analysis. In IEEE Symposium on Security and Privacy, pages 156–169, Oakland, CA, 2001.
22. D. Wagner and P. Soto. Mimicry attacks on host-based intrusion detection systems. In Ninth ACM Conference on Computer and Communications Security, Washington, DC, USA, 18–22 November 2002.
23. K. M. Walker, D. F. Sterne, M. L. Badger, M. J. Petkac, D. L. Sherman, and K. Oostendorp. Confining root programs with domain and type enforcement (DTE). In Proceedings of the 6th USENIX UNIX Security Symposium, San Jose, California, USA, July 1996.
Using CSP to Detect Insertion and Evasion Possibilities within the Intrusion Detection Area Gordon Thomas Rohrmair and Gavin Lowe Oxford University Computing Laboratory, Wolfson Building, Parks Road, Oxford, OX1 3QD, UK {gtr, gavin.lowe}@comlab.ox.ac.uk
Abstract. In this paper we will demonstrate how one can model and analyse Intrusion Detection Systems (IDSs) and their environment using the process algebra Communicating Sequential Processes (CSP) [11,21] and its model checker FDR [9]. An Intrusion Detection System (IDS) is a system that detects abuses, misuses and unauthorised uses in a network. We show that this analysis can be used to discover two attack strategies capable of blinding every IDS, even a hypothetical perfect one that knows all weaknesses of its protected host. We will give an exhaustive analysis of all such attack possibilities and will briefly discuss prevention techniques.
Keywords: intrusion detection, Communicating Sequential Processes, model checking, automated verification.
1 Introduction
This paper introduces the use of Communicating Sequential Processes (CSP) [11,21] within the intrusion detection area. We will show how one can use CSP and its model checker FDR [9] to analyse the interactions between Intrusion Detection Systems (IDSs) and their environment. Special attention is given to unexpected side effects that allow an attacker to render the IDS useless. An Intrusion Detection System is used to detect abuses, misuses and unauthorised uses in a network, caused by either insiders or outsiders. These systems identify intrusions by spotting known patterns or by revealing anomalous behaviour of protected resources (e.g., network traffic or main memory usage). There are two techniques with which it is possible to blind Intrusion Detection Systems, even a hypothetical perfect one that knows all weaknesses of its protected host. Although they are easy to understand, it took the security community a long time to reveal them [18]. We will give an exhaustive analysis of all attack possibilities that are based on these classes and will briefly discuss prevention techniques. The motivation for using formal methods is that conventional ways of specifying systems rely heavily on natural language and diagrammatic methods. Such approaches make it harder to write unambiguous specifications and therefore make systems more difficult to analyse. If omissions and errors introduced during the specification phase go undetected until late in the development cycle, they become very expensive to rectify. In the
security sector, every undetected bug has the potential to cause the loss of reputation, clients and/or information. The real power of formal methods, as mentioned in [6], becomes apparent when we try to detect emergent faults. These originate from unexpected interactions between the individual processes of the overall system. Our approach involves:
1. Building two models of small networks employing an IDS. The first model uses only two fields of the Internet Protocol version 4 (IPv4) [8] within a small network that is protected by a firewall. The firewall is built upon the screened subnet architecture described in [7]. This topology consists of two routers, placed on either side of the IDS. The extended model allows fragmentation and out-of-order communication between the nodes, based upon IPv4. The new fields required for this are the fragment offset field and the more fragments bit. The overall architecture of the network stays unchanged.
2. Developing a specification that permits one to discover every possibility of an attacker launching an attack that is not seen by the IDS.
3. Using the model checker Failures Divergences Refinement (FDR) [9] for automated verification of our models. FDR finds that the specifications are not met and provides us with counterexamples, which represent blinding possibilities.
The next section provides background information regarding intrusion detection. Section 3 shows how we modelled the Time-To-Live model described in [18,17] and discusses results that can be drawn from the FDR analysis. Section 4 deals with a more complex model that emulates the reassembly process of IPv4. Finally, in Section 5 we present some conclusions, together with a discussion of interesting questions for future work.
2 Background
In this section we describe the necessary background information.
2.1 Classes of Intrusion Detection Principles
We can distinguish three different detection principles of Intrusion Detection Systems (IDSs).
Misuse Detection Misuse detection based systems look for known signatures of attacks. A signature is the pattern that is used by the IDS to spot attacks [12]. It is a specific manifestation of a certain attack. Examples of such systems can be found in [3]. The signatures that are necessary for these systems are mostly developed by hand. The IDS usually obtains the required information from a network adaptor, which feeds it with raw data packets, or from the log-files of the hosting operating system. In industry the most widely used systems are network signature-based IDSs, because of their low total cost of ownership.
Advantages The system knows exactly how a certain attack manifests itself. This leads to a low false-positive ratio. The detection algorithm is based on pattern matching, for which efficient solutions exist.
Disadvantages Defining the manifestations of certain attacks is a time-consuming and difficult task. Due to the working principle of these systems, it is nearly impossible for them to detect novel attacks. Moreover, subtle variations in an attack can mislead them.
Anomaly Detection Such systems distinguish between normal and anomalous behaviour of guarded resources. Examples of monitored resource characteristics include CPU utilisation, system call traces, and network links. The decision regarding which behaviour class currently relates to certain events is made by means of a set of profiles. The profiles of normal behaviour for a resource are maintained by a self-learning algorithm. Examples of such systems can be found in [3].
Advantages The cost of maintaining the system is usually low, because the profiles are updated by the self-learning algorithm. Additionally, it can detect novel attacks as well as variations of already known ones.
Disadvantages The self-maintaining algorithm is usually computationally expensive. Sometimes unusual behaviour is not a precise indicator of an ongoing intrusion. The result is a high false-positive ratio. Finally, these systems can learn to classify intrusive event traces that are performed slowly as normal behaviour, which renders them useless.
Specification-Based Detection [13,14] and [24] were some of the first papers that recommended this approach. They distinguished between normal and intrusive behaviour by monitoring the traces of system calls of the target processes. A specification that models the desired behaviour of a process tells the IDS whether the actual observed trace is part of an attack or not. With this approach, they attempt to combine the advantages of misuse and anomaly detection. It should reach the accuracy of a misuse detection system and have the ability of anomaly detection to deal with future attacks. Their systems managed the detection by inspecting log files. This differs from [26], where a run-time engine was developed to detect violations in real time. This approach is even capable of intercepting intrusions.
Advantages More or less the same as for misuse detection. However, these systems manage to detect some types/classes of novel attacks. Additionally, they are more resistant against subtle changes in attacks.
Disadvantages Usually, for every program that is monitored, a specification has to be designed. Furthermore, the modelling process can be regarded as more difficult than the design of patterns for misuse detection systems. Additionally, some classes of attacks are not detectable at all.
2.2 Classes of Intrusion Detection Data Sources
Another important classification style is to distinguish IDSs according to their data source.
Network Intrusion Detection Systems (NIDS) A NIDS gets its information from a network adapter operating in promiscuous mode. It examines the traffic for an attack-symptomatic signature. Although anomaly detection has been implemented for these systems, the main detection principle remains misuse detection. A NIDS can provide surveillance for a whole network, because it is working with the raw network packets. In our further examination we model a NIDS because it is the most widely used system type [25].
Advantages Due to the fact that a single NIDS can monitor a whole network, its implementation and maintenance costs are low. Additionally these systems, since they work at the packet level, have all the information needed to sift out the difference between hostile and friendly intentions.
Disadvantages These systems are largely unable to read the traffic of encrypted connections. The only exception would be to include them in the security association; however, this can be regarded as computationally too expensive. Nevertheless, users have increasingly encrypted their communication, rendering the system obsolete. Additionally, networks are increasingly switched rather than broadcast. Due to the Ethernet working principle [4], the NIDS is only able to collect packets that travel through its collision domain. In a switched environment there is no real collision domain, hence the NIDS is not able to retrieve vital information [28].
Disadvantages For every monitored node a HIDS is required. This makes them very expensive in maintenance. HIDS are operating system dependent, therefore different implementations for different operating systems are required which makes them expensive in development. They also reduce the operational capacity of a network node, because they run their analysing processes in parallel with the business applications. Additionally, once an attack succeeds the security officer has to trust information collected from a corrupted host.
3 The Time-to-Live Model
In this section we present our first model. We consider whether the Internet Protocol version 4 (IPv4) [8] gives an attacker the opportunity to launch an undetected attack against the target. We first discuss how we can represent the protocol.
1. We need the data field, otherwise we could not communicate with the nodes in our simulated network.
2. Additionally, we include the Time-To-Live (TTL) field. The TTL bounds the distance a packet can travel; every router decreases its value by one; once zero is reached the packet is discarded. As shown in [18] the TTL offers an interesting evasion possibility.
Modelling assumptions The model is built under certain assumptions. One general assumption is that the IDS is a NIDS based on signature detection. We believe that this is the most relevant IDS because of the large usage of these systems. The IDS itself is considered to be perfect, in the sense that it knows all vulnerabilities that could be used to cause a security breach. Additionally, we consider only one-way, in-order communication. We now consider the network topology. We model a network with just one sender and one receiver node. We use a DeMilitarised Zone (DMZ) configuration, which is commonly used in industry [7]. It consists of an exterior filtering router and an internal filtering router (see Figure 1 below); the exterior one is responsible for protecting the network from most attacks; the interior one is the most restrictive, as it only allows traffic that is permitted for the internal network. The DMZ resides between these two routers; this is the place where companies maintain their public servers, such as the web server. This is also the preferred place for the IDS; due to the limitations of a network IDS, this is the only place where the IDS receives all the traffic that comes from outside. (An alternative place would be in front of the external router; however, this IDS would then raise more alerts than are actually relevant: it would include all attacks that are stopped by the exterior router.) If we find an attack under these restricted conditions, we will know that there is an attack in the real world.
3.1 The CSP Model
Each datagram consists of some data and a TTL value; hence the channels have the following structure: DATA.TTL. In order to reduce the state space of our model we ought to introduce further restrictions.
Fig. 1. The network topology: Attacker -a-> Router 1 -b-> IDS -c-> Router 2 -d-> Target.
– The TTL value will only range from 4 down to 0. We believe that this range of the TTL value is enough because the diameter of the resulting network will be smaller than 4.
– The Data field will communicate the bit patterns A, B and C. A represents the bit patterns that, once received, force the target to move into a pre-crashed state, or the IDS into a pre-alerted state. The pre-crashed or pre-alerted state indicates that the system will fail or alert on receiving a B bit pattern (which stands for the attack suffix). If the system is not in a pre-crashed or pre-alerted state and receives a B, it stays in its initial state. In the real world, A followed by B represents all possible real-world attacks that force a target or IDS to fail or alert. The final class of bit patterns, C, represents all strings that are not part of class A or B, i.e. the set of innocent patterns that are not part of an attack in any way.
For simulating an appropriate network we require two routers, an attacker, one target and one IDS, as shown in Figure 1; we describe each of these below.
The routers The routers are used for navigating packets from source to destination. Since we have not modelled the source and destination address we can ignore the whole routing functionality; our routers, therefore, act like relay-stations. Taking this simplification into account, we achieve a very simple router that decreases the TTL field by one and checks the result. In the case where the value is zero, the packet will be dropped. Otherwise the packet will be forwarded with the new TTL value. From this we get the following CSP description:

Router(in,out) = in?x?y ->
                   if y > 1 then out!x!(y-1) -> Router(in,out)
                   else Router(in,out)
Here x contains the data and y the TTL value. The parameters in and out are channel names that represent the input and output ports of the router.
The attacker The attacker process should be able to execute the same actions as an attacker in the real world. We model the attacker nondeterministically, so as to impose no limitation on the sequence of packets it sends. Consequently, FDR has to explore every possible input stream that the attacker process may create. The process is modelled by the following CSP description.

Attacker = a?x?y -> Attacker
The target The target process receives fragments and then reassembles them. Once the packet is reassembled, if an attack signature is found, the target should fail. The following CSP description describes this process.
Target(sigs, vulnerability) =
  d?x?y ->
    if member(<x>, vulnerability) then fail -> STOP
    else
      let vulnerability' = {s | <x'>^s <- union(sigs, vulnerability), x'==x}
      within Target(sigs, vulnerability')

This process is initialised by the two variables sigs and vulnerability. sigs is a set of sequences, namely all complete attack signatures. vulnerability keeps track of the progress of security breaches and indicates what the target has to receive in order to fail. The set comprehension is used to update vulnerability. One note about the fail event: this event does not mean that the computer crashes literally; it only indicates that the security policy has been violated. This can range from stealing or compromising data to root access and even to crashing.
This process is initialised by the two variables sigs and vulnerability. sigs is a set of sequences, namely all complete attack signatures. vulnerability keeps track of the progress of security breaches and indicates what the target has to receive in order to fail. The list comprehension is used to update vulnerability. One note about the fail event: this event does not mean that the computer crashes literally; it only indicates that the security policy has been violated. This can range from stealing or compromising data to root access and even to crashing. The IDS The IDS protects the target. We assume here that the IDS is a perfect signature based IDS and therefore knows all vulnerabilities that cause the target to fail. In practice this is impossible because many vulnerabilities are not revealed yet. We have to make this assumption to generalise all current existing IDSs. The following CSP description for the IDS differs from the above target component only in the following two ways: firstly, it forwards all received packets after inspecting them; secondly, it engages in an alert event rather than in a fail event once it has received an attack pattern. IDS(sigs,alerts) = b?x?y -> if member(<x>,alerts) then alert -> c!x!y -> STOP else let alerts’ = {s|<x’>ˆs <- union(sigs,alerts), x’==x} within c!x!y -> IDS(sigs,alerts’,dist))
The complete model We use parallel composition to synchronise the different processes according to the given network structure (Figure 1). We hide all internal events, leaving just the alert and fail events visible.
The Specification The specification expresses that there always has to be an alert before a fail event. In other words, the IDS should have a log entry once a successful attack has been performed. We can model this with a simple recursive CSP process:

Spec = alert -> fail -> Spec

We use FDR to check whether Spec [T= Model1 holds, that is, whether the traces of Model1 are a subset of the traces of Spec. The process Spec allows precisely the valid traces, so if the refinement holds then the traces of Model1 are just valid ones, where the IDS detects all attacks; if not, then we have discovered an attack not detected by the IDS.
Results FDR reveals that the refinement check fails, and provides us with the following trace:
< a.A.4, b.A.3, c.A.3, a.C.2, b.C.1, d.A.2, c.C.1, a.B.4, b.B.3, c.B.3, d.B.2, fail > This trace is displayed in the following sequence diagram.
Fig. 2. TTL Attack (sequence diagram of the trace above: the packets A.4, C.2 and B.4 sent by the attacker pass Router 1 and the IDS as A.3, C.1 and B.3, but only A.2 and B.2 reach the target).
This is similar to the observation of Ptacek in [18]. The attacker sends three packets with data A, C and B, respectively, where the packet with data C has a TTL value that is lower than its distance to the target. Therefore, this packet will be discarded by the last router. The IDS, however, takes it into account, so the reassembled packet deviates from the packet that is processed by the target. Hence the target fails, but the IDS does not raise an alert. Attacks like these, where the states of the IDS and target become de-synchronised, are called de-synchronisation attacks.
3.2 Discussion
The attack presented above uses the fact that the IDS does not have enough information about the topology of the network. We can solve this problem in two ways.
– We could redesign the IDS so that it takes the different distances into account (a sketch of such a distance-aware check is given below). The drawback of this solution is that the count has to be updated if changes in the network topology occur. We designed a CSP model corresponding to this proposed solution; the analysis found no attacks.
– The second possibility is harder to implement; we could implement a reassembly algorithm that raises an alert if the TTL value of one fragment in the stream is different from the others, and this TTL value is lower than the diameter of the network.
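The de-synchronisation found by FDR can also be reproduced by a small brute-force search outside of FDR. The Python sketch below is an abstraction of our model rather than the CSP itself: a packet is a (data, TTL) pair, each router decrements the TTL and drops the packet when it reaches zero, the IDS sits one hop from the attacker and the target two hops away, and the only signature is an A immediately followed by a B. The distance-aware variant from the first bullet point corresponds to letting the IDS ignore packets that cannot reach the target.

from itertools import product

PACKETS = [(d, t) for d in 'ABC' for t in range(1, 5)]   # (data, TTL)

def attacked(stream):
    """The signature <A,B>: an A immediately followed by a B."""
    return any(x == 'A' and y == 'B' for x, y in zip(stream, stream[1:]))

def view(seq, hops):
    """Data surviving 'hops' routers, each decrementing the TTL by one."""
    return [d for d, t in seq if t - hops >= 1]

def undetected(ids_hops):
    for seq in product(PACKETS, repeat=3):
        if attacked(view(seq, 2)) and not attacked(view(seq, ids_hops)):
            yield seq

print(next(undetected(ids_hops=1)))
# e.g. (('A', 3), ('C', 2), ('B', 3)): the C packet dies at the last router
print(next(undetected(ids_hops=2), 'no attack found'))
# the distance-aware IDS leaves no undetected attack: 'no attack found'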
Finally, we have to point out that these days it is very unusual for packets to get fragmented; therefore an IDS based on anomaly detection is well suited for spotting this kind of attack.
4 The Fragment-Overlapping Model
In this section we examine the behaviour of a system where data packets are fragmented and reassembled based upon RFC 791 [8]. Sometimes an IP packet has to be routed through different networks. Not all networks have the same properties. Therefore, a packet might have to be split up into fragments tagged with their position in the original packet (the fragment offset); this process is termed fragmentation. The target receives an increased supply of smaller fragments instead of one IP packet, and therefore has to reconstruct the initial packet; this process is called reassembly. The algorithm in [8] collects the fragments and puts them into the right place of the reserved buffer. Sometimes data is received at the same fragment offset as a previously received fragment. In such a case, a decision has to be made whether to favour old or new data. RFC 791 [8] leaves unspecified which should be preferred, but the recommendation is to prefer new data, so that if the algorithm receives data from the same position twice, the new data will overwrite the old. However, not all implementations follow this suggestion: favouring new data introduces great danger, as stated in [23], making it possible to circumvent filtering devices; for this reason some operating systems favour old data. Combining operating systems that favour new data (e.g. 4.4 BSD and Linux) with those that favour old data (e.g. Windows NT 4.0 and Solaris 2.6) introduces an evasion possibility if the IDS does not know what type of operating system the target is running; this was first discovered by Ptacek and Newsham [18].
4.1 The CSP Model
To analyse the interactions between the various types of operating systems and IDSs we have designed the following CSP model. The network topology is similar to that in Figure 1, although, for simplicity, we omit the routers.
Channels The channels of this model have to be extended. We require all fields that are necessary to re-assemble the fragment stream, namely the more fragments (MF) bit, which indicates whether this is the final fragment in the packet, and the fragment offset (FO) field, which indicates the offset of this fragment within the packet. We use the following channel description:

more_fragment_bit . fragment_offset . TTL . data

Therefore, event a.1.0.1.A represents a packet that travels along channel a with its more fragments bit set to one, a fragment offset of zero, a TTL value of one, and a data field containing a bit sequence A.
Attacker The attacker process remains unchanged.
Target The target should satisfy the same properties as the target process of the TTL model. Additionally, it should be able to deal with fragments and out-of-order traffic. Thus, it should be able to re-assemble an out-of-order fragment stream just as described in RFC 791. In order to consider the behaviour of the different types of operating system (favouring old or new data), we arrange for the target process to choose an operating system initially. The reassembly buffer is initialised to be empty (<>).

Target(sigs) = os_target?os -> Target(os, <>, sigs, 0)
The process Target(OS,buff,sigs,max) requires the following parameters: OS states whether the process is favouring old or new data; buff represents the allocated resources that are required for reassembling a packet; sigs carries the set of attack signatures; and max keeps track of the maximum size of the original packet. The target first receives a datagram and calculates the new buffer b1 with the function overwrite.

Target(OS, buff, sigs, max) =
  in?mf?fragmentoffset?ttl?data ->
    let b1 = overwrite(buff, fragmentoffset, data)
    within Target(OS, buff, sigs, max, mf, fragmentoffset, data, b1)
The following process models the case where the more fragments flag is equal to zero, indicating that this will be the last fragment. First the process checks whether a fragment with this offset has already been received, and if so acts according to its update policy (favouring old or new data). It then checks whether or not the packet is complete: if the packet is not complete it stores the maximum size of the packet; otherwise it verifies whether it has received an attack or not. The function nth(a,b) returns the value stored in position b of buffer a. allFilled(a,b) checks whether all data in buffer a up to position b have been received. check(a,b,c) compares the buffer a with the set b to depth c to validate the existence of any attack patterns in the buffer. After the reconstruction of a packet the buffer is initialised with <>, which indicates a clear buffer.

Target(OS, buff, sigs, max, 0, fragmentoffset, data, b1) =
  if nth(buff, fragmentoffset) != N and OS == 0
  then Target(OS, buff, sigs, max)
  else if allFilled(b1, fragmentoffset)
       then if check(b1, sigs, fragmentoffset) then fail -> STOP
            else Target(OS, <>, sigs, 0)
       else Target(OS, b1, sigs, fragmentoffset)
The following process models the case where a fragment with its more fragments bit set to one arrives, indicating that more fragments are following. The structure is nearly the same, except that we do not set any maximum.
    Target(OS,buff,sigs,max,1,fragmentoffset,data,b1) =
      if nth(buff,fragmentoffset) != N
      then if OS == 0
           then Target(OS,buff,sigs,max)
           else Target(OS,b1,sigs,max)
      else if allFilled(b1,max) and max != 0
           then if check(b1,sigs,max)
                then fail -> STOP
                else Target(OS,<>,sigs,0)
           else Target(OS,b1,sigs,max)
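For readers less familiar with CSP, the following Python sketch (our transcription, continuing the earlier sketch, under the simplifying assumption that signatures are contiguous sequences of data items) mirrors the two target cases above; the helpers all_filled and check follow the informal descriptions given earlier:

    def all_filled(buff, upto):
        # True if every position up to and including upto holds data.
        return len(buff) > upto and all(x is not N for x in buff[:upto + 1])

    def contains(seq, pat):
        # True if pat occurs as a contiguous subsequence of seq.
        return any(tuple(seq[i:i + len(pat)]) == tuple(pat)
                   for i in range(len(seq) - len(pat) + 1))

    def check(buff, sigs, depth):
        # True if the buffer, up to the given depth, contains a signature.
        return any(contains(buff[:depth + 1], sig) for sig in sigs)

    def target_step(os, buff, sigs, max_, mf, fo, data):
        # One reassembly step; os == 0 favours old data, os == 1 favours new.
        b1 = overwrite(buff, fo, data)
        if mf == 0:                               # last fragment of packet
            if nth(buff, fo) is not N and os == 0:
                return buff, max_, "OK"           # keep old, ignore fragment
            if all_filled(b1, fo):
                status = "FAIL" if check(b1, sigs, fo) else "OK"
                return [], 0, status              # complete: clear buffer
            return b1, fo, "OK"                   # remember the maximum size
        else:                                     # more fragments will follow
            if nth(buff, fo) is not N:
                return (buff if os == 0 else b1), max_, "OK"
            if all_filled(b1, max_) and max_ != 0:
                status = "FAIL" if check(b1, sigs, max_) else "OK"
                return [], 0, status
            return b1, max_, "OK"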
The IDS The IDS process structure is similar to the IDS model in the improved TTL version in that it considers the distance to the target. It is also capable of re-assembling fragments arriving out of order, like the target process presented above. We will not give a full account here because of the similarities to the target. The IDS raises an alert instead of a fail event and indicates its operating system with os_ids instead of os_target.

The Complete Model The complete system model is composed of an attacker, a target, and an IDS. The specification and refinement assertion remain the same as in the TTL example.

4.2 Results

FDR provided us with two distinct attacks that could both elude the IDS.
[Message sequence chart: the IDS chooses os_ids.1 and the target chooses os_target.0; the fragments 1.0.1.A, 1.0.2.C and 0.1.3.B travel from the attacker via the IDS to the target.]
Fig. 3. Attack 1
Attack 1 The IDS chooses to use an operating system that favours new data (indicated by the event os_ids.1), whereas the target chooses to favour old data (os_target.0). The attacker sends two fragments with fragment offset zero, the first containing a bit sequence A (1.0.1.A), the second containing an innocent bit sequence C (1.0.2.C). The result is that the IDS receives the A fragment and then overwrites it with C. However, the target receives the A fragment and refuses to store the C fragment, because it
favours old data. Therefore we have the situation where a C bit sequence is stored in the reassembly buffer of the IDS and an A bit sequence is stored in the buffer of the target process. Finally the attacker creates the last packet (0.1.3.B), with a fragment offset of one and the more fragment bit set to zero. Hence on receiving this fragment, both the target and the IDS re-assemble their packets. The IDS re-assembles <C,B> and the target re-assembles <A,B>, which causes a fail event without an alert.
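Under the same assumptions as the earlier Python sketches, Attack 1 can be replayed to show the divergence directly (the driver function replay is ours):

    def replay(os, fragments, sigs):
        # Feed a fragment stream into one reassembler; report the outcome.
        buff, max_, status = [], 0, "OK"
        for (mf, fo, ttl, data) in fragments:
            buff, max_, status = target_step(os, buff, sigs, max_,
                                             mf, fo, data)
        return status

    attack1 = [(1, 0, 1, "A"), (1, 0, 2, "C"), (0, 1, 3, "B")]
    sigs = {("A", "B")}                  # the attack signature <A,B>
    print(replay(1, attack1, sigs))      # IDS favours new data    -> OK
    print(replay(0, attack1, sigs))      # target favours old data -> FAIL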
[Message sequence chart: the IDS chooses os_ids.0 and the target chooses os_target.1; the fragments 0.1.0.C, 0.1.2.B and 1.0.3.A travel from the attacker via the IDS to the target.]
Fig. 4. Attack 2

Attack 2 This attack is the reverse of the former. The IDS chooses to be based on a type zero OS (favouring old data) and the target chooses to be based on a type one OS (favouring new data). The attacker sends a fragment with fragment offset one, more fragment bit set to zero, and a C bit sequence (0.1.0.C). Afterwards he submits a fragment with the same fragment offset but with a different bit sequence (0.1.2.B). This leads to a deviation of the IDS buffer from the target buffer: the IDS has stored a C in its second place, whereas the target overwrites the C with a B. The attacker then sends the final packet (1.0.3.A). The IDS re-assembles <A,C>, which is innocent, and the target re-assembles <A,B>, which leads to the fail event without an alert.

4.3 Discussion

There is really only one way to prevent these attacks; that is, to arrange for the IDS to take account of both possibilities for the target, i.e. favouring old or new data. However, this solution does not appear to scale well. There are many differences in the way implementations treat their TCP/IP stacks, and it would appear that the IDS needs to consider all such possibilities. Even if we only consider the drop points—operations where the packet is completely rejected—described in [18], there are a vast number of desynchronisation possibilities. The consequent state-space explosion in the IDS would appear unmanageable.

A subtle point was also missing in our reassembly model, namely time: both the IDS and target will time out if a packet is not completely received within a certain time.
The use of timeouts allows a different de-synchronisation attack. Consider the case where the timeout value of the IDS is smaller than the value for the target: then the attacker can send its first fragment and wait until the IDS times out before sending the remaining fragments, causing the agents to become de-synchronised. This attack reveals an interesting point: we can never be certain whether our abstractions have removed details that allow attacks. Because of this we need a proper framework to prove formally that our abstractions have not lost too much detail.
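The effect can be pictured with a tiny sketch (ours; the timeout values, 30s for the IDS and 60s for the target, are invented for illustration):

    IDS_TIMEOUT, TARGET_TIMEOUT = 30, 60

    def still_buffered(elapsed, timeout):
        # A partial packet is flushed once its timeout has expired.
        return elapsed <= timeout

    delay = 45   # attacker waits 45s between first and remaining fragments
    print(still_buffered(delay, IDS_TIMEOUT))     # False: IDS flushed it
    print(still_buffered(delay, TARGET_TIMEOUT))  # True: target kept it
    # The remaining fragments complete the attack at the target, while the
    # IDS sees only an incomplete packet and raises no alert.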
5 Conclusion

We have seen how small deviations in implementations can have a considerable impact on the security of a system. Even when the individual subcomponents of a system are secure, the overall system may still not be free of flaws. Such emergent faults can be spotted easily by testing the system as a whole against a specification.

5.1 Generalising the Retrieved Results

Whenever it is possible to create a difference between the input stream of the IDS and that of the protected system, we can successfully hide an attack. More generally, both the protected system and the IDS have state transition graphs; if we create a situation where these systems change into different states, they require different stimuli to reach the state where the target fails and the IDS raises the alert. We can distinguish between three de-synchronising possibilities:

1. De-synchronisation due to the systems behaving exactly the same, but the input streams being different; this is the method that appears in our first model.
2. De-synchronisation because the input streams are the same, but the systems behave differently under certain conditions; this is the type of flaw exploited in our second model.
3. De-synchronisation because both the input streams and the behaviour of the systems are different.

5.2 Related Work

Vulnerability identification and analysis has been a topic for some time now. One of the first attempts to build an automatic vulnerability detector was COPS. In common with other approaches (SATAN, TIGER and NESSUS [15]) it looks for already known attacks. It manages this by firing known attack patterns against a particular host. However, this kind of direct testing is not suitable for spotting unknown attacks.

In [19], Ritchey and Ammann propose a high-level approach to detect whether it is possible for an attacker to leverage the existing vulnerabilities in the network to get a higher degree of access. The configurations of the network nodes are abstracted to state machines and the attacks are represented as state transitions in these machines. In contrast, [22] uses a low-level description of a UNIX file system to spot single configuration vulnerabilities. The focus is more on finding new attacks rather
than finding a combination of attacks that enables the attacker to penetrate the system even further. The difference between our approach and Sekar's lies in the state space: they use an infinite model whereas we use approximation to restrict our model to a finite one; therefore, we can use a common model checker such as FDR. Additionally, they use invariants to express their security policy, in contrast to our use of a specification. The difference becomes clear during the examination of the counterexamples: they have to establish an intentions model to prune away all paths that violate the defined invariant but do not violate the security policy; we have no such paths.

Another approach to using model checkers for vulnerability identification is proposed in [1,2]. However, they use the model checker combined with mutation-testing techniques to generate and recognise test sets rather than testing directly. The advantages of our approach are:

– The approach finds all possible attacks, not just known ones, due to the working principle of our model checker FDR.
– Easy-to-understand counterexamples.
– Due to the modularity of our models, the workload to add new processes or to change the network is low.
– We use a finite model, which allows us to use common model checkers.
– Our testing is specification based, which keeps the effort of encoding security policies low.

5.3 Future Work

As shown so far, the CSP approach is a suitable way of identifying emergent faults. However, most of the above examples are outdated. Therefore, we will explore some new protocols and their impact on the intrusion detection area. IPv6 [16,10] is the upcoming protocol, so we will model and analyse it. On a different track, in the long term it is quite unsatisfying only to be capable of verifying IDSs that are based on signature detection. Therefore we have to solve the following problems:

– How can we classify and represent vulnerabilities and attacks? We have to find a better way than representing an attack as a bit pattern, because some IDSs are not signature based.
– How does this technique apply to industrial-scale problems? Usually model checking can only be applied to small or medium-sized problems. A multi-layer abstraction framework needs to be established to address large-scale problems. One possibility could be to close the gap between the Ritchey and Ammann [19] approach and ours. Hence it would be possible to evaluate the interactions between single modules on a low level and to convert the obtained results into a high-level model to analyse the relations between the spotted vulnerabilities.
– How can we abstract a network or a network node without losing too many details? As mentioned before, the problem with abstraction is that one may lose important detail. It would be of great use to design a technique that allows us to prove whether the lost detail was important or not.
– How can we prove that our observed scope is enough? Due to the working principle of FDR we were forced to restrict the state space. We achieved this by choosing small ranges for all variables. We want to prove that these restrictions do not hide other attacks. To do so we have to show that any attack which depends on the usage of larger values can be reduced to one that uses values within our scope. Data-independence techniques [5,20] might be useful to put this on a formal foundation.
References

1. Paul E. Ammann and Paul Black. Test generation and recognition with formal methods. 2000.
2. Paul Ammann, Wei Ding, and Daling Xu. Using a model checker to test safety properties. 2000. ISE Department, MS 4A4, George Mason University, 4400 University Drive, Fairfax, VA 22030, USA.
3. Stefan Axelsson. Research in intrusion-detection systems: A survey. page 98, 1999.
4. Uyless Black. TCP/IP and Related Protocols. Computer Communications. McGraw-Hill, 1998.
5. P.J. Broadfoot, Gavin Lowe, and A.W. Roscoe. Automating data independence. In Proceedings of ESORICS, pages 175–190, 2000.
6. E. M. Clarke and J. M. Wing. Formal methods: State of the art and future directions. ACM Computing Surveys, 28(4), Dec 1996.
7. D. Brent Chapman, Elizabeth D. Zwicky, and Simon Cooper. Building Internet Firewalls. O'Reilly, Jun 2000. ISBN: 1-56592-871-7.
8. J. Postel (ed.). RFC 791: Internet Protocol – DARPA Internet program protocol specification. Sep 1981.
9. Paul Gardiner, Michael Goldsmith, Jason Hulance, David Jackson, A.W. Roscoe, and Bryan Scattergood. FDR2 User Manual. Formal Systems (Europe) Ltd., fifth edition, 2000.
10. R. Hinden and S. Deering. RFC 2460: Internet Protocol, Version 6 (IPv6) specification. Dec 1998.
11. C. A. R. Hoare. Communicating Sequential Processes. Prentice Hall, 1985. ISBN: 0-13-153271-5.
12. Karen Kent Frederick. Network Intrusion Detection Signatures – Part 1. http://www.securityfocus.com, Dec 19, 2001.
13. Calvin Ko, George Fink, and Karl Levitt. Automated detection of vulnerabilities in privileged programs by execution monitoring. 1994. Department of Computer Science, University of California, Davis, CA 95616.
14. Calvin Ko. Execution Monitoring of Security Critical Programs in a Distributed System: A Specification-Based Approach. PhD thesis, Department of Computer Science, University of California at Davis, 1996.
15. Nessus: a remote security scanner. http://www.nessus.org/.
16. Commission of the European Communities. New generation Internet – priorities for action in migrating to the new Internet protocol IPv6. COM(2002) 96 final:15, Feb 2002.
17. Vern Paxson. Bro: A system for detecting network intruders in real-time. Computer Networks, 31:2435–2463, Dec 1999.
18. Thomas H. Ptacek and Timothy N. Newsham. Insertion, evasion, and denial of service: Eluding network intrusion detection. Secure Networks, Jan 1998.
19. R. Ritchey and P. Ammann. Using model checking to analyze network vulnerabilities. IEEE Oakland Symposium on Security and Privacy, pages 156–165, May 2000.
20. A.W. Roscoe and P.J. Broadfoot. Proving security protocols with model checkers by data independence techniques. Journal of Computer Security: Special Issue CSFW12, 7(2,3):147–190, Jul 1999.
21. A. W. Roscoe. The Theory and Practice of Concurrency. Prentice Hall, 1998. ISBN: 0-13-674409-5.
22. C.R. Ramakrishnan and R. Sekar. Model-based analysis of configuration vulnerabilities. Department of Computer Science, State University of New York, Stony Brook, NY 11794.
23. D. Reed, G. Ziemba, and P. Traina. RFC 1858: Security considerations for IP fragment filtering. Oct 1995.
24. R. Sekar, T. Bowen, and M. Segal. On preventing intrusions by process behavior monitoring. USENIX Intrusion Detection Workshop, 1999.
25. Network Flight Recorder Security. http://www.nfr.com/.
26. R. Sekar and P. Uppuluri. Synthesizing fast intrusion prevention/detection systems from high-level specifications. Master's thesis, State University of New York at Stony Brook, NY 11794.
27. Internet Security Systems. Network- vs. host-based intrusion detection. Technical report, http://www.iss.net/support/documentation/whitepapers/index.php, Oct 2, 1998.
28. Internet Security Systems. Intrusion detection systems – whitepaper. http://www.iss.net/support/documentation/whitepapers/index.php, 1999.
Revisiting Liveness Properties in the Context of Secure Systems

Felix C. Gärtner

Swiss Federal Institute of Technology (EPFL), School of Computer and Communication Sciences, Distributed Programming Laboratory, CH-1015 Lausanne, Switzerland, [email protected]
Abstract. Classifying trace-based system properties into safety properties on the one hand and liveness properties on the other has proven very useful for specifying and validating concurrent and fault-tolerant systems. We study the adequacy of these abstractions, especially the liveness property abstraction, in the context of secure systems for two different scenarios: (1) Denial-of-service attacks and (2) brute-force attacks on secret keys. We argue that in both cases the concept of a liveness property needs to be adapted. We show how this can be done and relate the resulting concepts to related work in the areas of concurrency theory and fault-tolerance.
1 Introduction
It was observed in 1977 by Lamport [31] that system properties can informally be classified into two distinct classes: safety properties and liveness properties. Generally speaking, safety properties state that “something bad never happens”, i.e., a certain bad condition will never occur in any system configuration. Mutual exclusion and partial correctness are two prominent examples of safety properties. For the former, the bad condition is that two processes are in their critical section at the same time. For the latter, the bad condition describes a termination state where the postcondition does not hold. Safety properties are a well-established concept and a lot of theory and practice has evolved around it. In contrast to safety properties, liveness properties demand that “something good eventually happens”, i.e., a certain desired condition will eventually be true for some system configuration. The most prominent example of a liveness property is termination. Liveness properties can be regarded as a first-order approximation of real-time properties. The distinction made is merely that between “finite” and “infinite” time. While safety properties are violated in finite time, liveness properties are violated in infinite time. In a later article, Lamport elaborates on the meaningfulness of a system satisfying a liveness property as follows [32]: The question of whether a real system satisfies a liveness property is meaningless; it can be answered only by observing the system for an
infinite length of time, and real systems don’t run forever. Liveness is always an approximation to the property we really care about. We want a program to terminate within 100 years, but proving that it does would require addition of distracting timing assumptions. So, we prove the weaker condition that the program eventually terminates. This doesn’t prove that the program will terminate within our lifetimes, but it does demonstrate the absence of infinite loops. In many cases, proving the absence of infinite loops is a sufficient approximation of “fast”: While we can give no guarantees on the real-time duration within which a program will terminate, computers run so fast today that a program completes its task fast enough for many practical scenarios. However, hard real-time properties [46], like “a response occurs within 2 seconds after the request”, are usually considered to be safety properties [42,35,2] if an explicit clock is added to the system model. Validating that the behavior of the modeled clock matches the behavior of a real-time clock involves a whole new range of formal machinery (like scheduling theory) and hence can be regarded as falling into the category of “distracting timing assumptions” which are avoided using the liveness property abstraction. This is the main reason why liveness properties are a widely accepted concept when modeling and analyzing the timing behavior of algorithms. Safety and liveness have been considered adequate in the area of fault-tolerant systems too. Specifying systems using safety properties directly translates to this area since safety properties make sense without change in the presence of faults. In the context of silent crash faults it was observed that liveness properties must be restricted to those parts of the system which remain alive. For example, the standard specification of the consensus problem [7,45], the basis of distributed transactions and hence a very important problem in fault-tolerant computing, involves a safety property and a liveness property:

– (Safety) If two processes choose a certain value v ∈ {commit, abort}, they choose the same value.
– (Liveness) Every process eventually chooses a value.

The safety property makes perfect sense if processes can crash. But to be implementable, the liveness property needs to be weakened into

– (Liveness) Every process that doesn’t crash eventually chooses a value.

yielding the definition of uniform consensus. In other fault settings, the specification needs to be adapted in similar ways (see for example the area of self-stabilization [22]). In this paper we turn our attention to the adequacy of liveness properties when studying secure systems. We will not attempt to define what security means in general, but rather look at two individual examples of properties which we intuitively regard as security properties and which we would like to formalize, if possible, as liveness properties. The example properties come up in the context of two different forms of attacks:
– Denial-of-service attacks [16]: In a denial-of-service attack a user is prevented from using a remote resource by, for example, flooding the network with bogus messages. In this setting we ask: What are sensible forms of liveness to specify the progress properties we require of a system in the presence of such attacks?
– Brute-force attacks on cryptographic (public) keys: In these types of attacks, an adversary tries to compromise the secrecy of a cryptographic key by trying and testing every possible solution from the key space. While this usually takes a prohibitively long period of time, some instances of such attacks are feasible. (For example, it was possible to break an instance of the Data Encryption Standard DES in less than a day [41].) But even when abstracting from concrete time instances (as is done in the domain of liveness), a brute-force attack is guaranteed to terminate. (Note that while private (symmetric) key cryptography may in some cases be resilient to brute-force attacks, public (asymmetric) key cryptography can always be attacked in this way since one key is necessarily disclosed.) In this setting we ask: What are sensible concepts in the spirit of liveness properties to model the resilience of an algorithm to brute-force attacks if an adversary can delay the progress of the algorithm for an arbitrary (but finite) amount of time?

In both cases we describe concepts that can help model the properties in question. For the denial-of-service case we end up with a concept which we call self-controlled liveness properties. These properties can be seen as the particular subset of liveness properties which the adversary cannot control. We define this concept formally and relate it to the similar concept of machine-closure [32] from concurrency theory. This work aims in the direction of a better understanding of system properties in the context of denial-of-service attacks, an open issue recently stated by Meadows [39]. In the brute-force case we describe and advocate concepts of other authors [29,14] which, we think, deserve more attention. The path in this case is to introduce concepts from complexity theory in a way which complements specifications based on safety and liveness properties. Briefly put, the additional efficiency property mandates that an adversary does not have the computational power to delay the execution “long enough” (in a complexity-theoretic sense). The main motivation of this work is to develop a formal machinery to validate security using the well-established framework of safety and liveness properties. In the first case (denial-of-service) we show that this is still possible. In the second case (brute-force attacks) we recall that neither safety nor liveness are suitable abstractions to model the desired security properties, but these properties can be formulated in a way which is compatible with the established methodology. For completeness it should be noted that in the context of secure systems there are also other system properties that can be modeled neither as safety nor as liveness properties. These properties usually run under the heading of non-interference [25] and are concerned with the absence of information flow
in multi-level security systems. For a detailed discussion of these properties and their relation to safety and liveness properties we refer to work by McLean [38]. The paper is structured as follows: We first very briefly survey the formal background of reasoning about systems in the context of fault-tolerance and security in Section 2. Sections 3 and 4 deal with the cases of denial-of-service and brute-force attacks, respectively. Section 5 concludes the paper.
2 Critical System Properties
We now briefly recall the concepts of trace-based specifications for reactive systems and the different model assumptions used in the context of fault-tolerant and secure systems.
2.1 Transition Systems and Traces
Usually, an interactive system is modeled as a state machine which moves from one state to another by means of actions. Formally this corresponds to the definition of a labeled transition system. In the black box view of systems, we wish to define the behavior of such a system in terms of the states and actions it exhibits at its visible interface. In the literature this is termed observation semantics and there are many different possibilities of defining observation semantics for concurrent systems. We will use one of the simplest semantics, called trace semantics, which amounts to defining an observation simply as a sequence of states and actions which are visible at the system interface. Formally, a trace is written

$s_1 \xrightarrow{a_1} s_2 \xrightarrow{a_2} s_3 \cdots$

and denotes that starting from state $s_1$ the system reaches state $s_2$ by executing action $a_1$, etc. Note that trace semantics can also be used to describe the behavior of concurrent systems by defining a state as being a distributed state (i.e., a vector of local states) and viewing a trace as the interleaving of the individual local traces of the concurrent processes. For example, if $s_1 \xrightarrow{a_1} s_2 \xrightarrow{a_2} s_3 \cdots$ is the trace of a process $p$ and $s_1' \xrightarrow{a_1'} s_2' \xrightarrow{a_2'} s_3' \cdots$ is the trace of process $p'$, we can model a trace of the concurrent system where $p$ and $p'$ take turns as:

$(s_1, s_1') \xrightarrow{a_1} (s_2, s_1') \xrightarrow{a_1'} (s_2, s_2') \xrightarrow{a_2} (s_3, s_2') \xrightarrow{a_2'} (s_3, s_3') \cdots$

The type of interleaving can be used to distinguish different synchrony assumptions between processes. One of the most general assumptions is that in an infinite trace both processes must take steps infinitely often. Since this rules out one process dominating the trace, this concept is often called fair interleaving.
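As a small illustration (ours, not from the paper), a strict round-robin merge of two local traces produces one fair interleaving, since neither process can dominate:

    from itertools import count, islice

    def local_trace(name):
        # An infinite local trace of one process, as (action, state) steps.
        return ((f"a_{name}{i}", f"s_{name}{i+1}") for i in count(1))

    def fair_interleaving(tp, tq):
        # Alternate strictly between p and q: both step infinitely often.
        for step_p, step_q in zip(tp, tq):
            yield step_p
            yield step_q

    for step in islice(fair_interleaving(local_trace("p"),
                                         local_trace("q")), 6):
        print(step)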
2.2 Safety and Liveness
A property is defined to be a set of traces. A trace σ satisfies a property P if σ ∈ P. If σ does not satisfy P we say that σ violates P. There are two important types of properties called safety and liveness [3]. Informally speaking, a safety property demands that “something bad never happens” [31], i.e., it rules out a set of unwanted trace prefixes. Mutual exclusion and partial correctness are two prominent examples of safety properties. A liveness property on the other hand demands that “something good will eventually happen” [31] and can be used to formalize, e.g., notions of termination. Safety and liveness properties are defined as follows. A safety property is a property S such that for each trace σ that violates S, there exists a finite prefix α of σ such that for all traces β, α · β violates S (the dot “·” denotes concatenation). A liveness property is a property L such that for all finite traces α there exists a trace β such that α · β ∈ L. The distinction between safety and liveness was motivated by the different proof techniques used to validate them [28]. In general, safety properties can be proved by an inductive argument involving an invariant over the state of the system. Liveness properties are proved using well-foundedness arguments involving a termination function. Alpern and Schneider [3] have shown that every property (defined as a set of traces) can be written as the intersection of a safety property and a liveness property.
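For reference, the two definitions can be stated symbolically (our notation; $\alpha$ ranges over finite traces, $\beta$ over continuations, and $\cdot$ is concatenation):

$S$ is a safety property iff $\forall \sigma \notin S \;\exists \text{ finite prefix } \alpha \text{ of } \sigma \;\forall \beta:\ \alpha \cdot \beta \notin S$.

$L$ is a liveness property iff $\forall \alpha \;\exists \beta:\ \alpha \cdot \beta \in L$.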
2.3 Asynchronous Systems
Systems with fair interleaving have a close relationship to asynchronous systems. The main advantage of asynchronous systems is that they can be characterized more by non-assumptions than by assumptions [44]: In asynchronous systems there is no assumed or existing bound on the relative processing speeds of processes. This means that while one process takes a single step, any other process can take an arbitrary (but finite) number of steps. In asynchronous systems where communication is through sending and receiving messages, the channels are usually also assumed to be asynchronous [23], meaning that there is no upper bound on the time it takes for the system to deliver a sent message. Because it is so simple, the asynchronous system model has been used as the basis for many investigations in distributed algorithms.
2.4 Modeling Faults and Attacks
In the context of security we need to model faults and attacks. This is the basis for validating a certain system formally. In this paper, we take the view that attacks and faults can be modeled in the same way and hence we will use the terms “fault” and “attack” synonymously. According to Rushby [42], this can be done either in a calculational or specificational way. In the calculational approach, faults are modeled as unwanted program transitions which are explicitly incorporated into the faulty program. In effect, these approaches “calculate” the
effects of faults and see whether the resulting traces still satisfy the specification. For example, work by Arora, Gouda and Kulkarni [5,6] falls into this category as does all the work on software-implemented fault-injection [27]. In the specificational approach, faults are modeled by “weakening” the interface specifications of subcomponents. This is commonly done in the classical literature on fault-tolerant distributed algorithms [30,10]. In both fault-modeling approaches, the faults have the effect of potentially adding behavior to a system, i.e., more system executions are possible if no countermeasures are taken. Hence, a system in the presence of faults is the original system which is modified to allow faulty behavior. In this paper, a specification is a property, i.e., a set of traces. A system satisfies a specification in the presence of faults if all traces of the system in the presence of faults satisfy the specification.
2.5 Fault Classifications
Given a specification consisting of a safety and a liveness property, faults can be classified according to the type of property which they directly endanger. For example, memory perturbations that can be the effect of cosmic rays in spacecraft may lead to a direct violation of the safety property. In contrast to that, (silent) crash faults of processes do not necessarily endanger the safety property of the system but rather the liveness property (e.g., a process which is required to terminate but crashes before terminating). Fault assumptions (like memory perturbation and crash) usually come with a restriction on the number of times faults of this class can happen. For example, in the context of self-stabilization [22] memory perturbations are assumed to occur only finitely many times. Similarly, for crash faults there is usually an assumed upper bound on the number of processes which are allowed to crash. The two aspects of a fault assumption are usually called local and global. While the local fault assumption enables additional component behavior, the global fault assumption restricts component behavior again. In a sense, the Byzantine fault assumption [30] can be regarded as the strongest possible combination of safety and liveness violating faulty behavior. Byzantine behavior is arbitrary behavior of at most t components in the system. Some weaker variants of the pure Byzantine fault assumption (like noncooperative Byzantine [37]) have also been defined. Sometimes their assumptions rely on the use of cryptographic primitives, like the authenticated Byzantine model [30] which is used to increase the resilience of Byzantine agreement protocols. In this context, the arbitrary behavior of the Byzantine adversary is restricted to not being able to “guess” a cryptographic key which it has no access to. In the context of secure message-passing systems, the Byzantine fault assumption has been adapted to additionally encompass the message transport system. This has become known as the Dolev-Yao attacker assumption [21]. In this model, the corrupted components (processes) together with the message system are seen as the adversary, i.e., messages (even between two uncorrupted
parties) can be arbitrarily delayed or lost. However, signatures of uncorrupted processes cannot be forged.
2.6 Discussion
Note that usually it is trivial to maintain a safety specification in the presence of faults if only “liveness affecting” faults (like crashes) may occur. Hence, satisfying both safety and liveness in the presence of faults is important. In the remainder of this paper, we will assume the Dolev-Yao model and investigate the role which liveness plays in the analysis of security protocols in this model. It can be argued that statements about systems which are analyzed when treating cryptography as a black box give only a very weak assurance regarding security. For example, so-called API-level attacks [9] sometimes make it possible to retrieve cryptographic keys by invoking the application programmer's interface (API) of an embedded device in unforeseen ways. In the area of fault-tolerance, however, abstracting away the details of cryptography can be regarded as adequate since the original motivation for introducing the Byzantine fault assumption was to model hardware components which “went haywire” and responded to inputs in a random (i.e. arbitrary) fashion [30]. The assumption that faults are random events also makes it much easier to establish measures of fault tolerance, i.e., the availability and mean time to failure [33]. In the area of security, it is much harder to calculate the coverage of an attacker assumption, i.e., the probability that it will hold in practice [40]. Therefore, while both the fault-tolerance and the security views on system and fault/attacker models are similar, security seems to mandate more pessimism than fault-tolerance.
3 Adapting Liveness in the Context of Denial of Service

3.1 Motivation
“Denial of Service” (DoS) attacks are a well-known threat to the availability of systems and these types of attacks have been widely experienced on the Internet. For example, a DoS attack in early 2000 seriously disrupted the services of some prominent Internet sites such as Amazon, eBay and Yahoo [19]. According to the CERT Coordination Center [16], a DoS attack “is characterized by an explicit attempt by attackers to prevent legitimate users of a service from using that service.” This can be performed in a multitude of ways, including flooding the network, thereby preventing legitimate network traffic. In this paper, we focus on those DoS attacks which are most common, are easy to mount, and, in a sense, cannot be avoided. We now give two examples of these types of attacks.

The SYN Flooding Attack. The first example is called TCP SYN flooding and IP spoofing [15] and exploits the handshake operation performed when establishing a TCP connection. In such a handshake, the client sends a TCP SYN
request to the server. Then the server responds by sending a SYN-ACK message to the client. The handshake is completed when the client sends an ACK message to the server. In this type of attack, however, a malicious client simply does not send the final ACK, leaving the server with a “half-open connection”. Usually, the kernel data structures for half-open connections at the server are limited and can be quickly exhausted by this attack (a toy sketch of this exhaustion follows at the end of this subsection). The result is that it is no longer possible for honest clients to establish an incoming connection to the server. The attack affects neither outgoing connections nor existing established connections. Half-open connections can easily be obtained using IP spoofing, meaning that an attacker sends a SYN message with a “forged” source IP address (i.e., the address of an arbitrary other machine). That machine receives the SYN-ACK but ignores it since it did not send the SYN. IP spoofing makes it almost impossible to detect SYN flooding attacks. In some cases the attacked system may exhaust memory, crash or be inoperative in other ways, resulting in denial of service. No general solution with current technology is known which can fully combat SYN flooding attacks.

Email Bombing and Spamming. Every site which offers incoming and outgoing email service to its users may be the source or the target of email bombing and spamming attacks [17]. Email bombing means repeatedly sending email messages with extremely large sizes to users on a certain machine. This usually quickly fills up the disk space used to store email messages on that machine. Email spamming means sending email messages to hundreds or thousands of users, wasting a lot of disk space and network bandwidth. Both can lead to a loss of network connectivity or to system crashes due to overloaded network connections or exhausted disk space. Apart from disconnecting from the Internet, there is no quick relief from email bombing and spamming. Sometimes the activation of user quotas (i.e., restrictions on the amount of disk space users may consume) can help prevent the full consequences of these types of attacks.
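Returning to the SYN flooding example, here is the promised toy sketch in Python (ours; the table capacity BACKLOG = 128 is invented for illustration):

    BACKLOG = 128
    half_open = set()

    def on_syn(src):
        # A SYN occupies a half-open connection slot until the final ACK.
        if len(half_open) >= BACKLOG:
            return "refused"        # no slot left: honest clients locked out
        half_open.add(src)
        return "syn-ack"

    def on_ack(src):
        # The final ACK completes the handshake and frees the slot.
        half_open.discard(src)
        return "established"

    for i in range(BACKLOG):        # spoofed SYNs that will never be ACKed
        on_syn(f"spoofed-{i}")
    print(on_syn("honest-client"))  # -> refused: denial of service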
3.2 The Difficulty of Defining Denial of Service
The definition of CERT [16] can be read as seeing the “absence of liveness” as the defining result of the types of DoS attacks described above. Apart from compromising the availability of the service, a successful DoS attack may wreak other forms of havoc, like a server operating system crash. These unwanted conditions can be incorporated into the safety specification of the system, and so a first approximation to formally defining DoS tolerance seems to be the following:

Definition 1 (tentative definition of DoS tolerance). Given a system with a safety specification S and a liveness specification L in the Dolev-Yao model. The system is DoS tolerant if it satisfies S in the presence of faults.

Since L is not required to be satisfied in the presence of faults, the system may lose all forms of liveness if it is attacked. This is similar to making satisfaction
of liveness dependent on the behavior of the adversary. For example, Cachin et al. [11] define the liveness requirement of a validated Byzantine agreement protocol as follows: If all honest servers have been activated on [a certain instance of agreement] and all associated messages have been delivered [by the adversary], then all honest servers have decided [. . . ]. Unfortunately, a definition in the spirit of Definition 1 alone is too weak to be useful since there are trivial implementations that tolerate DoS, namely systems that do nothing. However, a useful definition of DoS tolerance should at least contain Definition 1, since safety should be maintained at all times. For example, this makes it possible to rule out system crashes or other unpleasant consequences of excessive system load which usually are the effect of distributed DoS attacks. The weakness of Definition 1 stems from its inability to reflect the behavior of practical DoS-tolerant systems. In such systems, countermeasures are taken to prevent damage caused by high system loads. For example, the system can instruct a firewall or router to dismiss certain network traffic from malicious machines. If even this is not possible, the machine can cut off all its network connections (by shutting down the network interface). As a last resort, the system may cease operation altogether by shutting down in a safe state. From these descriptions it should be obvious that real systems that tolerate DoS do not lose all forms of liveness in the presence of faults. They are still able to make a certain (limited) amount of progress, a form of “self-controlled” liveness.
3.3 Self-controlled Liveness
For a given labelled transition system with a set of actions A, define A_p as the set of actions controlled by process p. Formally, A_p includes all actions of p which have preconditions defined only over the local state of p. This means that process p can execute the actions in A_p independently of other processes in the system.

Definition 2 (self-controlled liveness). A self-controlled liveness property for process p is a property L such that for all finite traces α there exists a trace β such that α · β ∈ L and β consists only of actions from A_p.

Using Definition 2 we can now define DoS tolerance as follows:

Definition 3 (DoS tolerance). Given a system with a safety specification S and a liveness specification L in the Dolev-Yao model, and let L_p be the largest self-controlled liveness property contained in L. The system is DoS tolerant for process p if p satisfies S and L_p in the presence of faults.
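Spelled out in symbols (our notation, writing $\beta \in A_p^{\infty}$ for a trace consisting only of actions from $A_p$), the requirement of Definition 2 is:

$\forall \alpha \text{ finite} \;\exists \beta \in A_p^{\infty}:\ \alpha \cdot \beta \in L$.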
Using the concept of self-controlled liveness, we can cover more forms of DoS tolerance than with the tentative definition above. However, an attacker may still be able to monopolize system resources by indirectly affecting system actions. For example, if an action a of process p depends on the state of a variable v and the attacker may influence v, then it may also influence whether or not action a may be executed. On the one hand, if (as a result of such influence) a cannot be executed, this may result in DoS (if a delivers part of the desired service). On the other hand, if a can be executed due to the attacker's influence, continuous execution of a may eat up much or all available system resources. For example, a common countermeasure to SYN flooding attacks is to release the kernel data structure for the requested connection after a certain period of time. Executing a timeout action depends entirely on the local state of the process, but this action, in effect, is initiated by the attacker. A definition of DoS tolerance which avoids this problem but still maintains the spirit of Definition 2 should prevent any influence of an adversary on system execution in case of an attack (much in the sense of integrity seen as the dual of secrecy by Biba [8]). However, this means that actions of the system depend on whether or not it is under attack. But determining this precisely and in a timely manner is hard in practice, if not impossible. Without such a mechanism (which in analogy to fault-tolerance [18] might be called a malevolence detector), we conjecture that Definition 3 is the closest approximation of DoS tolerance achievable.
3.4 Relation to Machine Closure
We note here that there exists an interesting relation between Definition 2 and a concept from concurrency theory. Definition 2 is very close to the notion of machine-closure [32] (sometimes also called feasibility [4] or machine-realizability [1]). A liveness property L is machine-closed for a particular system iff every finite trace of the system can be extended to a trace of the system that satisfies L. Metaphorically, this was characterized as the inability of a system “to paint itself into a corner” [47]. Self-controlled liveness properties are a specialization of machine-closure restricted to a subset of program actions. In the context of secure systems and DoS, this concept can therefore be helpful to characterize the ability to operate under the “progress restrictions” of an adversary. The relations between the different forms of liveness properties are depicted in Figure 1.
Fig. 1. Relations between all liveness properties, L; liveness properties which are machine-closed with respect to a system Σ, L_Σ; and self-controlled liveness properties for a particular process p from Σ, L_p.
3.5 Relation to Fail-Awareness
Recently, Cristian and Fetzer [20] introduced the timed asynchronous system model. This model is at heart an asynchronous system model. It contains, however, an explicit notion of real time through the assumption that processes have access to local hardware clocks and that these clocks run within a linear envelope of real time. This means that it is always possible to state a real-time bound on the maximum clock difference between each pair of clocks. Note that this does not mean that bounds exist on processing speeds or message delivery delays. On the contrary, the model explicitly assumes that communication is via message passing and there is no bound on message delivery delay. Just like in the time-free model, there is no bound on the relative processing speed of processes. Through the notion of real time provided by the hardware clocks it is possible to define real-time bounds for all services provided in a timed asynchronous system. In fact, this part of a service specification is mandatory. This means that while it is still impossible to, for example, reliably detect a process crash, it is now possible to tell whether a reply has not met its real-time deadline. This is called fail-awareness [24]. Since there are no bounds on fault occurrences, in this model it is not possible to ensure any liveness property at all. To address the question of liveness, Cristian and Fetzer made the following observation: In practice, systems alternate between long periods of stability and short periods of instability. The measurements given in their article [20], which were made in a local area network environment, show that the average distance between unstable periods is 218 seconds, while the average length of an unstable period was about 340 milliseconds (a ratio of 641:1). This observation makes it possible to formulate progress assumptions of the form: “There exists a constant c, such that infinitely often there will be a stable period of length at least c.” In other words, this means that infinitely often the system will be synchronous for at least c time. Using progress assumptions it is now possible to specify liveness properties in the following way: “Assuming that some stability predicate holds, the system will eventually perform an action.” The stability predicate S can be, e.g., “infinitely often a majority of processes is synchronous for at least 2 seconds.” Liveness properties of the form L are transformed into weaker variants of the form S ⇒ L called conditional timeliness properties [20]. In contrast to self-controlled liveness properties, conditional timeliness properties are restrictions on the global asynchrony, i.e., restrictions on the ability of the adversary to choose the scheduling of processes. Low-level timing assumptions can therefore be made explicit which are usually not expressible in the basic asynchronous system model. Self-controlled liveness properties restrict local “asynchrony”, i.e., the ability of the adversary to stop an alive process. In this sense, self-controlled liveness properties can be regarded as a specialization of conditional timeliness properties.
4 Adapting Liveness in the Context of (Public-Key) Cryptography

4.1 Motivation
In the Dolev-Yao attacker model [21] all corrupted parties and the complete message system are assumed to be under the control of the adversary. However, it is often assumed that channels are secure and authentic, meaning that messages between uncorrupted parties which are delivered by the adversary can be verified as being authentic and their message content remains secret. As discussed in Section 3, reactive protocols which are driven by messages cannot satisfy (general) liveness properties in this context since the adversary can alter, inject and schedule messages as it chooses. Interestingly, in this model it is possible to achieve liveness while losing security, as we now explain. Systems involving cryptography are naturally prone to attacks based on cryptanalysis or on brute-force calculations to retrieve a secret piece of information. Now consider a system which satisfies a particular liveness property, say, termination. Since liveness properties do not state anything about the time it takes to achieve them, the adversary can delay the termination event as long as necessary to break the cryptographic keys involved in the protocol. This is one deficiency of the liveness property abstraction which only becomes apparent in the context of secure systems. In this section we review an interesting system model of Cachin, Kursawe, and Shoup [14,29] (with extensions [12,11]) from this perspective. The model introduces complexity-theoretic means to rigorously analyze the security of a randomized Byzantine agreement protocol which is implemented using cryptography. Since in this model the cryptographic view of security prevails, the aim of this section is to explain it in terminology closer to that of the distributed systems and formal verification communities.
4.2 Turing Machines, Security Parameters, and Negligible Functions
Instead of transition systems, the model uses probabilistic interactive Turing machines (PITMs) to model individual processes. The reason for this is that, since they can read input from a dedicated input tape, PITMs have well-defined complexity measures with respect to their input. Theoretically there is no difference between a PITM and a parametrized transition system. The input of such a PITM consists of a security parameter. This concept, too, is rather unfamiliar in distributed systems. Roughly speaking, the security parameter k can be thought of as the “strength” of the underlying cryptography (e.g., the length of the secret key in bits). Attacks on the cryptosystem are assumed to require more than polynomial time in k. For example, a brute-force attack on a cryptosystem with k key bits takes $2^k$ time. The hope is that by increasing k one can easily combat the increasing processing power of new equipment. By having the PITMs read k from the input tape, the system model is parametrized in k.
Even though the complexity of a brute-force attack increases exponentially in k, there is still a non-zero chance of an attacker simply “guessing” the right key. In practice, this chance is assumed to be negligible for the most common cryptosystems. For example, the chance of guessing an RSA key decreases exponentially in the length of the key, since all known algorithms for factoring large numbers need exponential time. Hence, the probability of guessing the right key decreases faster than any polynomial. This is formalized using the concept of a negligible function in k. A function f(k) is called negligible if for all c > 0 there exists a $k_0$ such that $f(k) < 1/k^c$ for all $k > k_0$. Hence, a negligible function decreases faster than any polynomial. The adversary (which includes all corrupted processes as well as the message subsystem) is also modelled as a PITM with a time complexity bounded by a polynomial in k. The honest (i.e., uncorrupted) processes are considered to be “message driven” by the adversary, i.e., they only take one initial step (and generate a finite set of messages) and then only take steps whenever a message is delivered to them by the adversary. In this case they perform a state transition and generate a (possibly empty) set of messages which are inserted into the message subsystem again.
4.3 The Liveness Property
Similar to restricting the liveness property in the presence of crash faults to all alive processes, any liveness property of a message-driven protocol in the context of the Dolev-Yao attacker model must be dependent on the extent to which the adversary delivers messages. For example, the liveness property of consensus from the introduction of this paper would be reformulated as:

– (Liveness) If all messages associated to a particular instance of consensus have been delivered, then all uncorrupted processes eventually decide.

Any other form of liveness would be trivially impossible to implement without dedicated resources. However, as motivated above, these properties alone do not capture the intuition of not being able to delay the protocol arbitrarily.
4.4 The Efficiency Property
Because the model is used to analyze the security of a protocol using complexity theory, runs necessarily need to be finite. The basic assumption of the model therefore is that any algorithm runs in a time which is polynomial in k. Hence, the length of any trace generated by an individual process in the system is bounded by some polynomial in k. To distributed systems people this may seem unusual at first since the notion of reactive systems was introduced explicitly to model non-terminating tasks like operating systems or schedulers. However, even “non-terminating” tasks terminate in practice, and because k and the polynomial can be freely chosen, the running time can be made large enough to reflect this. Since processes are probabilistic, successive invocations of the system will usually generate different traces. For a given trace it is possible to define some
complexity measure. For example, the communication complexity of a trace is the total bit length of all messages generated by honest processes during the trace. The communication complexity of the protocol consequently is a random variable that depends on the adversary and on k. Given a particular protocol, it is possible to define a protocol statistic X which is a collection of random variables $\{X_A(k)\}$ for different adversaries A and different security parameters k. One member of this collection is obtained by measuring a particular complexity measure (like communication complexity) when running the protocol with a particular adversary A and a particular security parameter k. The protocol statistic can therefore be seen as an abstraction of the behavior of the protocol for all adversaries and all security parameters (remember that only adversaries are allowed that run in polynomial time in k). The idea now is to give a definition of what it means for a protocol to be “bounded” for all allowed adversaries and security parameters. This definition can then be instantiated for different complexity measures. Intuitively, the complexity of a “bounded” protocol should always be bounded by a polynomial, no matter how the adversary operates. Formally, a protocol statistic X is uniformly bounded if there exists a fixed polynomial p(k) such that for all adversaries A there is a negligible function $e_A(k)$ such that for all k > 0:

$\Pr[X_A(k) > p(k)] \le e_A(k)$    (1)
This means that the probability that the complexity of the protocol lies above a certain fixed polynomial is negligible. Note that this holds for all (allowed) adversaries and security parameters. Using the notion of a uniformly bounded protocol statistic it is now possible to define an additional efficiency property for a protocol.

Definition 4 (efficiency property). The communication complexity of the protocol is uniformly bounded.

If the protocol in question is not randomized, it is possible to simplify Formula (1) and also Definition 4 to state that the communication complexity (dependent on k) should be below a fixed polynomial p(k). To explain the intuition behind the definition, it is instructive to look at a protocol which does not satisfy Definition 4 in the simplified (non-probabilistic) setting. Take for example a protocol with two types of messages: a-messages and b-messages. In the protocol a process, upon receiving an a-message, sends n + 1 messages in reply to other processes which need to be processed in order to terminate (n being the number of processes in the system): n a-messages and one b-message. The adversary can now hold back the b-messages and deliver the a-messages to generate additional messages. By repeating this procedure, the adversary can generate an exponential number of “correct” protocol messages in a time linear in k. Hence, the communication complexity for this particular adversary can surmount any polynomial in k. So this protocol does not satisfy the efficiency condition.
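The blow-up can be checked with a few lines (our sketch, with n = 3 processes and ten adversary-chosen delivery rounds):

    # Each delivered a-message causes n new a-messages and one b-message;
    # the adversary withholds all b-messages and delivers only a-messages.
    n = 3
    pending_a, generated = 1, 1

    for round_ in range(10):
        delivered = pending_a
        pending_a = delivered * n         # n fresh a-messages per delivery
        generated += delivered * (n + 1)  # plus one withheld b-message each
        print(round_, generated)          # grows like n^round: exponential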
Intuitively speaking, the efficiency property ensures that the protocol terminates “fast” with respect to the extent to which the adversary delivers messages. The only assumption on the adversary is that its computing power is polynomially bounded. Hence, if a protocol satisfies efficiency then such a polynomially bounded adversary cannot delay the protocol in a superpolynomial way. But a superpolynomial delay is needed for a successful brute-force attack on the given cryptosystem. We can compare this setting metaphorically to a race with two competitors (protocol and adversary) and take the running time to be the measure of complexity. Both start running, and the protocol wins if it finishes in polynomial time. The adversary wins if the protocol takes superpolynomial time. The efficiency condition therefore ensures that the protocol always wins.
5 Conclusions
In the area of secure systems we are experiencing the development of an increasingly flexible formal machinery to help design and validate them. We have presented two concepts which we feel help to formalize and understand system properties in the presence of two different attacks: denial-of-service and brute-force attacks on cryptographic keys. We have argued that the security properties involved can be formalized in a way which builds on the well-established concepts of safety and liveness. We feel that it has many methodological advantages to keep the framework of safety and liveness at the heart of any security investigation and to find extensions in areas that fall outside of this domain. For example, distinguishing between (adversary-dependent) liveness properties and the efficiency property (as done in Section 4) allows the following: Security protocols can first be specified and analyzed in the usual context of safety and liveness. Once this is done, the efficiency of the protocol can be investigated separately. Hence, studying the resilience of protocols to brute-force attacks is compatible with the established methodology of verifying safety and liveness. This point is worth noting because other work has also developed methods to “incrementally” reason about properties that fall outside the safety/liveness domain. To the best of our knowledge, this has been done for information-flow properties in the context of non-interference [36] and real-time properties in the context of fault-tolerant algorithms [34,26,43]. The goal is to continuously extend the collection of these analysis methods to tame the complexity of system validation by compositional reasoning.

Acknowledgments. The author wishes to thank Klaus Kursawe for his insightful explanations on the motivations of his work and Fred Schneider for pointing out the weaknesses of Definition 2. Thanks also to Klaus Echtle, Heiko Mantel, Michael Waidner, Holger Vogt and Hagen Völzer for helpful discussions. This work was supported by the Deutsche Forschungsgemeinschaft as part of the Emmy Noether programme.
References

1. Martín Abadi and Leslie Lamport. Composing specifications. ACM Transactions on Programming Languages and Systems, 15(1):73–132, January 1993.
2. Martín Abadi and Leslie Lamport. An old-fashioned recipe for real time. ACM Transactions on Programming Languages and Systems, 16(5):1543–1571, September 1994.
3. Bowen Alpern and Fred B. Schneider. Defining liveness. Information Processing Letters, 21:181–185, 1985.
4. Krzysztof R. Apt, Nissim Francez, and Shmuel Katz. Appraising fairness in languages for distributed programming. Distributed Computing, 2(4):226–241, 1988.
5. Anish Arora and Mohamed Gouda. Closure and convergence: A foundation of fault-tolerant computing. IEEE Transactions on Software Engineering, 19(11):1015–1027, 1993.
6. Anish Arora and Sandeep S. Kulkarni. Component based design of multitolerant systems. IEEE Transactions on Software Engineering, 24(1):63–78, January 1998.
7. Michael Barborak, Anton Dahbura, and Miroslaw Malek. The consensus problem in fault-tolerant computing. ACM Computing Surveys, 25(2):171–220, June 1993.
8. K. J. Biba. Integrity considerations for secure computer systems. Technical Report MTR-3153 Rev. 1, The MITRE Corp., Bedford, Massachusetts, April 1977. Electronic Systems Division, U.S. Air Force Systems Command, Technical Report ESD-TR-76-372.
9. Mike Bond and Ross Anderson. API-level attacks on embedded systems. IEEE Computer, 34(10):67–75, October 2001.
10. Gabriel Bracha and Sam Toueg. Asynchronous consensus and broadcast protocols. Journal of the ACM, 32(4):824–840, October 1985.
11. Christian Cachin, Klaus Kursawe, Anna Lysyanskaya, and Reto Strobl. Asynchronous verifiable secret sharing and proactive cryptosystems. In Proceedings of the 9th ACM Conference on Computer and Communications Security (CCS-9), Washington, DC, 2002.
12. Christian Cachin, Klaus Kursawe, Frank Petzold, and Victor Shoup. Secure and efficient asynchronous broadcast protocols. In Advances in Cryptology – CRYPTO 2001, Lecture Notes in Computer Science. International Association for Cryptologic Research, Springer-Verlag, 2001. See [13] for the long version.
13. Christian Cachin, Klaus Kursawe, Frank Petzold, and Victor Shoup. Secure and efficient asynchronous broadcast protocols. Record 2001/006, Cryptology ePrint Archive, January 2001. An extended abstract was published as [12].
14. Christian Cachin, Klaus Kursawe, and Victor Shoup. Random oracles in Constantinople: Practical asynchronous Byzantine agreement using cryptography. In Proceedings of the Symposium on Principles of Distributed Computing, pages 123–132, Portland, Oregon, 2000.
15. CERT Coordination Center. CERT advisory CA-1996-21 TCP SYN flooding and IP spoofing attacks. Internet: http://www.cert.org/advisories/CA-1996-21.html, September 1996. Last revision: November 2000.
16. CERT Coordination Center. Denial of service attacks. Internet: http://www.cert.org/tech_tips/denial_of_service.html, June 2001.
17. CERT Coordination Center. Email bombing and spamming. Internet: http://www.cert.org/tech_tips/email_bombing_spamming.html, August 2002.
18. Tushar Deepak Chandra and Sam Toueg. Unreliable failure detectors for reliable distributed systems. Journal of the ACM, 43(2):225–267, March 1996.
19. CNN.com. Cyber-attacks batter web heavyweights. Internet: http://www.cnn.com/2000/TECH/computing/02/09/cyber.attacks.01/index.html, February 2000.
20. Flaviu Cristian and Christof Fetzer. The timed asynchronous distributed system model. IEEE Transactions on Parallel and Distributed Systems, 10(6), June 1999.
21. Danny Dolev and A. C. Yao. On the security of public key protocols. IEEE Transactions on Information Theory, 29(2):198–208, March 1983.
22. Shlomi Dolev. Self-Stabilization. MIT Press, 2000.
23. Cynthia Dwork, Nancy Lynch, and Larry Stockmeyer. Consensus in the presence of partial synchrony. Journal of the ACM, 35(2):288–323, April 1988.
24. Christof Fetzer and Flaviu Cristian. Fail-awareness: An approach to construct fail-safe applications. In Proceedings of the Twenty-Seventh Annual International Symposium on Fault-Tolerant Computing (FTCS '97), pages 282–291. IEEE, June 1997.
25. J. A. Goguen and J. Meseguer. Security policies and security models. In Proceedings of the 1982 Symposium on Security and Privacy (SSP '82), pages 11–20, Los Alamitos, CA, USA, April 1982. IEEE Computer Society Press.
26. Jean-François Hermant and Gérard Le Lann. Fast asynchronous uniform consensus in real-time distributed systems. IEEE Transactions on Computers, 51(8):931–944, August 2002.
27. Mei-Chen Hsueh, Timothy K. Tsai, and Ravishankar K. Iyer. Fault injection techniques and tools. IEEE Computer, 30(4):75–82, April 1997.
28. Ekkart Kindler. Safety and liveness properties: A survey. EATCS Bulletin, (53), June 1994.
29. Klaus Kursawe. Asynchronous Byzantine group communication. In Proceedings of the 21st IEEE Symposium on Reliable Distributed Systems (SRDS), Workshop on Reliable Peer-to-Peer Distributed Systems, pages 352–357, Osaka, Japan, October 2002. IEEE Computer Society Press.
30. L. Lamport, R. Shostak, and M. Pease. The Byzantine generals problem. ACM Transactions on Programming Languages and Systems, 4(3):382–401, July 1982.
31. Leslie Lamport. Proving the correctness of multiprocess programs. IEEE Transactions on Software Engineering, 3(2):125–143, March 1977.
32. Leslie Lamport. Fairness and hyperfairness. Distributed Computing, 13(4):239–245, 2000.
33. Jean-Claude Laprie, editor. Dependability: Basic Concepts and Terminology, volume 5 of Dependable Computing and Fault-Tolerant Systems. Springer-Verlag, 1992.
34. Gérard Le Lann. On real-time and non real-time distributed computing. In Proceedings of the 9th International Workshop on Distributed Algorithms (WDAG95), pages 51–70, September 1995.
35. Zhiming Liu and Mathai Joseph. Specification and verification of fault-tolerance, timing and scheduling. ACM Transactions on Programming Languages and Systems, 21(1):46–89, 1999.
36. Heiko Mantel. Possibilistic definitions of security: An assembly kit. In Proceedings of the 13th IEEE Computer Security Foundations Workshop (CSFW 2000), Cambridge, England, July 2000. IEEE Computer Society Press.
37. Asif Masum. Non-cooperative Byzantine failures: A new framework for the design of efficient fault tolerance protocols. PhD thesis, Universität-Gesamthochschule Essen, Fachbereich Mathematik und Informatik, 2000. Published by Libri Books on Demand, ISBN 3-8311-0815-3.
38. John McLean. A general theory of composition for a class of "possibilistic" properties. IEEE Transactions on Software Engineering, 22(1):53–67, January 1996. Special section: Best papers of the IEEE Symposium on Security and Privacy 1994.
39. Catherine Meadows. Open issues in formal methods for cryptographic protocol analysis. In DISCEX 2000, pages 237–250. IEEE Computer Society Press, January 2000.
40. David Powell. Failure mode assumptions and assumption coverage. In Dhiraj K. Pradhan, editor, Proceedings of the 22nd Annual International Symposium on Fault-Tolerant Computing (FTCS '92), pages 386–395, Boston, MA, July 1992. IEEE Computer Society Press.
41. RSA Data Security, Inc. RSA code-breaking contest again won by Distributed.net and Electronic Frontier Foundation (EFF). Internet: http://www.rsasecurity.com/company/news/releases/pr.asp?doc_id=462, January 1999.
42. John Rushby. Critical system properties: Survey and taxonomy. Reliability Engineering and System Safety, 43(2):189–219, 1994.
43. John Rushby. Systematic formal verification for fault-tolerant time-triggered algorithms. In Mario Dal Cin, Catherine Meadows, and William H. Sanders, editors, Dependable Computing for Critical Applications 6, volume 11 of Dependable Computing and Fault Tolerant Systems, pages 203–222, Garmisch-Partenkirchen, Germany, March 1997. IEEE Computer Society.
44. Fred B. Schneider. What good are models and what models are good? In Sape Mullender, editor, Distributed Systems, chapter 2, pages 17–26. Addison-Wesley, Reading, MA, second edition, 1993.
45. John Turek and Dennis Shasha. The many faces of consensus in distributed systems. IEEE Computer, 25(6):8–17, June 1992.
46. Paulo Veríssimo. Real-time communication. In Sape Mullender, editor, Distributed Systems, chapter 17, pages 447–490. Addison-Wesley, Reading, MA, second edition, 1993.
47. Hagen Völzer. Fairness, Randomisierung und Konspiration in verteilten Algorithmen (Fairness, randomization and conspiracy in distributed algorithms). PhD thesis, Humboldt-Universität zu Berlin, Fakultät für Informatik, December 2000.
Author Index
Boyd, C., 49
Cece, G., 33
Cohen, E., 183
Debbabi, M., 133
Desharnais, J., 133
Durante, A., 191
Fourati, M., 133
Gärtner, F.C., 221
Gollmann, D., 71
Gordon, A., 3
Gürgens, S., 97
Hall, A., 152
Halpern, J.Y., 115
Kouchnarenko, O., 33
Lowe, G., 205
Mancini, L.V., 191
Menif, E., 133
Norman, G., 81
Oehl, F., 33
Oheimb, D. von, 15
Painchaud, F., 133
Paulson, L.C., 4
Pietro, R. Di, 191
Preneel, B., 167
Pucella, R., 115
Rohrmair, G.T., 205
Rudolph, C., 97
Schneider, F.B., 1
Shmatikov, V., 81
Sinclair, D., 33
Stepney, S., 62
Tawbi, N., 133
Viswanathan, K., 49
Painchaud, F., 133 Paulson, L.C., 4 Pietro, R. Di, 191 Preneel, B., 167 Pucella, R., 115 Rohrmair, G.T., 205 Rudolph, C., 97 Schneider, F.B., 1 Shmatikov, V., 81 Sinclair, D., 33 Stepney, S., 62 Tawbi, N., 133 Viswanathan, K., 49