Lecture Notes in Computer Science Edited by G. Goos, J. Hartmanis and J. van Leeuwen
1485
Jean-Jacques Quisquater Yves Deswarte Catherine Meadows Dieter Gollmann (Eds.)
Computer Security ESORICS 98 5th European Symposium on Research in Computer Security Louvain-la-Neuve, Belgium September 16-18, 1998 Proceedings
~ Springer
Series Editors Gerhard Goos, Karlsruhe University, Germany Juris Hartmanis, Cornell University, NY, USA Jan van Leeuwen, Utrecht University, The Netherlands Volume Editors Jean-Jacques Quisquater UCL, Microelectronic Laboratory Place du Levant 3, B-1348 Louvain-la-Neuve, Belgium E-mail: quisquater@ dice.ucl.ac.be Yves Deswarte LAAS-CNRS & INRIA 7, avenue du Colonel Roche, F-31077 Toulouse cedex 4, France E-mail: yves.deswarte @laas.fr Catherine Meadows Naval Research Laboratory 4555 Overlook Ave., S.W., Washington, DC, 20375, USA E-mail: meadows @itd.nrl.navy.mil Dieter Gollmann Microsoft Research Limited St. George House, 1 Guildhall Street, Cambridge CB2 3NH, UK E-mail:
[email protected] Cataloging-in-Publication data applied for Die Deutsche Bibliothek - C1P-Einheilsaufnahme C o m p u t e r security : proceedings / ESORICS 98, 5th European Symposium on Research in Computer Secnrily, Louvain-la-Neuve, Begiam, September 16 - 18, 199~. Jean-Jacques Ouisquater ... (ed.). Berlin ; Heidelberg ; New York ; Barcelona ; Budapest ; Hong Kong ; London ; Milan ; Paris ; Singapore ; Tokyo : Springer, 1998 (Lecture notes in computer science ; Vol. 1485) ISBN 3-540-65004-0
CR Subject Classification (1991): D.4.6, E.3, C.2.0, H.2.0, K.6.5 ISSN 0302-9743 ISBN 3-540-65004-0 Springer-Verlag Berlin Heidelberg New York This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this pubhcation or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Sprmger-Verlag. Violations are liable for prosecution under the German Copyright Law. 9 Springer-Verlag Berlin Heidelberg 1998 Printed in Germany Typesetting: Camera-ready by author SPIN 10638805 06/3142 - 5 4 3 2 1 0
Printed on acid-free paper
Preface
Since 1990, ESORICS has established its reputation as the main event in research on computer security in Europe. Every two years, ESORICS gathers researchers and practitioners of computer security and gives researchers the opportunity to present the most recent advances in security theory as well as the risks related to simplistic implementations of security mechanisms. Despite possible concurrence with other international events, ESORICS 98 received 57 submissions, coming from 19 countries and 4 continents. All these papers were reviewed by at least three program committee members or other experts at their institutions. Most of the submitted papers were considered as very good, and the program committee quickly agreed on 23 papers that could be organised into consistent sessions. Unfortunately, some high quality papers had to be rejected either because they did not correspond to ESORICS scope or because they did not fit with other papers to constitute a homogeneous session. As in previous ESORICS, some ESORICS 98 sessions are dedicated to fundamental issues such as the design and specification of security policies, access control modelling and protocol analysis. But these sessions mix both theoretical papers and very practical concerns. Since mobility is a topic of increasing importance, its two main aspects will be discussed in two sessions: one on mobile systems and anonymity, the other on Java and mobile code. A session and a panel are devoted to watermarking, an important technique for the protection of intellectual rights. Finally, two sessions are dedicated to practical issues, one on intrusion detection and prevention, the other dealing with specific threats. In this session, two papers on cryptography have been included for the first time in ESORICS. While previously, we had considered that cryptography papers should be submitted to conferences dedicated to cryptography, these two papers have been accepted because security people can learn from them the risks that can be raised by naive implementation of good cryptographic algorithms. In summary, we hope that this mix between practical and theoretical issues will satisfy the practitioner's curiosity and encourage researchers to pursue their work for the progress of a secure information society.
Yves Deswarte Programme Chair
Catherine Meadows Programme Vice-Chair
VI
Organization Conference Chairs General Chair: Program Chair: Program Vice-Chair:
Jean-Jacques Quisquater (UCL, Belgium) Yves Deswarte (LAAS-CNRS & INRIA, France) Catherine Meadows (NRL, USA)
Proceedings Editor Dieter Gollmann
Microsoft Research, UK
Program Commitee Elisa Bertino Joachim Biskup Yves Deswarte G6rard Eizenberg Simon Foley Dieter Gollmann Franz-Peter Heider Jeremy Jacob Sokratis Katsikas Helmut Kurth Peter Landrock Carl Landwehr Guy Leduc Teresa Lunt Beno~t Macq Ueli Maurer Catherine Meadows Refik Molva Emilio Montolivo Roger Needham Pierre Paradinas Jean-Jacques Quisquater Pierre Rolin Peter Ryan Pierangela Samarati Einar Snekkenes Gene Spafford Stuart Stubblebine Michael Waidner
University of Milan, Italy University of Dortmund, Germany LAAS-CNRS & INRIA, France CERT-ONERA, France) Cambridge University CCSR, UK & University College Cork, Ireland Microsoft Research, UK debis, Germany University of York, UK University of the Aegean, Greece IABG, Germany Arhus University, Denmark NRL, USA University of Liege, Belgium DARPA, USA UCL, Belgium ETH Ziirich, Switzerland NRL, USA Eurecom, France Fondazione Ugo Bordoni, Italy Microsoft Research, UK Gemplus, France UCL, Belgium France Telecom, France DERA, UK University of Milan, Italy FFI, Norway Purdue University, USA AT&T, USA IBM, Switzerland
vIf
Additional
Referees
N. Asokan Marco Bucci Jan Camenisch Cecilia Catalano Bruno Crispo Francesco Gentile Irfan Ghauri Pierre Girard Luigi Giuri Martin Hirt G/inter Karjoth Jean-Louis Lanet Sergio Loureiro Renato Menicocci Markus Michels Jon Millen Alain Pannetrat Local Organisation Benolt Macq Bart Preneel Catherine Rouyer Joos Vandewalle
IBM Zfirich Research Laboratory, Switzerland Fondazione Ugo Bordoni, Italy ETH Zurich, Switzerland Fondazione Ugo Bordoni, Italy University of Cambridge, UK Fondazione Ugo Bordoni, Italy Institut Eurecom, France Gemplus, France Fondazione Ugo Bordoni, Italy ETH Ziirich, Switzerland IBM Z/irich Research Laboratory, Switzerland Gemplus, France Institut Eurecom, France Fondazione Ugo Bordoni, Italy Ubilab, UBS, Switzerland SRI International, USA Institut Eurecom, France Committee Universit@ catholique de Louvain, Belgium Katholieke Universiteit Leuven, Belgium Universit@ catholique de Louvain, Belgium Katholieke Universiteit Leuven, Belgium
Table of Contents
Fixed vs. Variable-Length Patterns for Detecting Suspicious Process Behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1
H. Debar, M. Dacier, M. Nassehi, and A. Wespi A Tool for Pro-active Defense Against the Buffer Overrun Attack . . . . . . . .
17
D. Bruschi, E. Rosti, and R. Banff A Kernelized Architecture for Multilevel Secure Application Policies . . . . . .
33
S.N. Foley Dealing with Multi-policy Security in Large Open Distributed Systems . . .
51
C. Bidan and V. Issarny A Flexible Method for Information System Security Policy Specification ..
67
R. Ortalo On the Security of Some Variants of the RSA Signature Scheme . . . . . . . . .
85
M. Michels, M. Stadler, and H.-M. Sun Side Channel Cryptanalysis of Product Ciphers . . . . . . . . . . . . . . . . . . . . . . . .
97
J. Kelsey, B. Schneier, D. Wagner, and C. Hall On the Security of Digital Tachographs
..............................
111
R. Anderson An Authorization Model and Its Formal Semantics . . . . . . . . . . . . . . . . . . . . .
127
E. Be~ino, F. Buccafurri, E. Ferrari, and P. Rullo Authorization in CORBA Security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
143
G. Karjoth Rules for Designing Multilevel Object-Oriented Databases . . . . . . . . . . . . . . .
159
F. Cuppens and A. Gabillon Byte Code Verification for Java Smart Cards Based on Model C h e c k i n g . . . 175
J. Posegga and H. Vogt Towards Formalizing the Java Security Architecture of JDK 1.2 . . . . . . . . . .
191
L. L. Kassab and S. J. Greenwald EUROMED-JAVA: Trusted Third Party Services for Securing Medical Java Applets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
A. Varvitsiotis, D. Polemi, and A. Marsh
209
MPEG PTY-Marks: Cheap Detection of Embedded Copyright Data in DVD-Video . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
221
J. P. M. G. Linnartz, and J. C. Talstra DHWM: A Scheme for Managing Watermarking Keys in the Aquarelle Multimedia Distributed System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
241
D. Augot, J.-F. Delaigle, and C. Fontaine The "Ticket" Concept for Copy Control Based on Embedded Signalling... 257
J.P.M.G. Linnartz Panel Session: Watermarking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
275
G. Eizenberg and J.-J. Quisquater Authentication and Payment in Future Mobile Systems . . . . . . . . . . . . . . . . .
277
G. Horn and B. Preneel Distributed Temporary Pseudonyms: A New Approach for Protecting Location Information in Mobile Communication Networks . . . . . . . . . . . . . .
295
D. Kesdogan, P. Reichl, and K. Junghiirtchen A Mix-Mediated Anonymity Service and Its Payment . . . . . . . . . . . . . . . . . . .
313
E. Franz and A. Jeriehow A Subjective Metric of Authentication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
329
A. J~sang A Sound Logic for Analysing Electronic Commerce Protocols . . . . . . . . . . . .
345
V. Kessler and H. Neumann Kerberos Version IV: Inductive Analysis of the Secrecy Goals . . . . . . . . . . . .
361
G. Bella and L. C. Paulson Author
Index
.................................................
377
Fixed vs. Variable-Length Patterns for D e t e c t i n g Suspicious Process Behavior Hervd Debar, Marc Dacier, Mehdi Nassehi, and Andreas Wespi IBM Research Division, Zurich Research Laboratory, CH-8803 Riischlikon, Switzerland {deb, dac ,mmn, anw}Ozurich, ibm. com
A b s t r a c t . This paper addresses the problem of creating patterns that can be used to model the normal behavior of a given process. These models can be used for intrusion detection purposes. In a previous work, we presented a novel method to generate input data sets that enable us to observe the normal behavior of a process in a secure environment. Using this method, we propose various techniques to generate either fixed-length or variable-length patterns. We show the advantages and drawbacks of each technique, based on the results of the experiments we have run on our testbed.
1
Introduction
In [1] Forrest et al. introduced a change-detection algorithm for detecting computer viruses t h a t is based on the way t h a t natural immune systems distinguish "self" from "nonself". In [2] they reported preliminary results extending this approach in the intrusion-detection area by establishing such a definition of self for Unix processes. This technique models the way an application or service running on a machine normally behaves by registering the sequences of system calls invoked. An intrusion is assumed to exercise abnormal paths in the executable code, and is detected when new sequences are observed (see also [3], [4], [5]). Like every behavior-based technique, an intrusion-detection system needs to be trained to learn what the "normal" behavior of a process is. This is usually done by recording the activity of the process running in a real environment during a given period. This procedure has a number of drawbacks: - It leaves open the possibility of obtaining false negatives [6] if not all possible behaviors have been exercised during the recording period, because new users, new applications or configuration changes introduce new usage
patterns. - It leaves open the possibility of obtaining false positives [6] if the application has been hacked during the recording period. - The observed behavior is a function of the environment in which it is running and thus prevents the distribution of an "initialized" intrusion-detection system that could immediately be plugged in and activated.
2
Herv@ Debar et al.
Another approach consists of artificially creating input data sets that exercise all normal modes of operation of the process. For example, Forrest et al. [2] use a set of 112 messages to study the behavior of the sendmail daemon. This method eliminates the risk of obtaining a false positive but not that of obtaining a false negative. Furthermore, it is an extremely complex and time-consuming task to come up with a good input d a t a set. In [7] we have shown how to use the functionality verification test (FVT) suites provided by application developers to observe all the specified behaviors of an application. Not only are we able to eliminate the risk of obtaining a false positive with this approach, but we also dramatically reduce the risk of obtaining a false negative because, by design, our input data set exercises all the normal (in the sense of having been specified by the designer of the application) modes of operation of the application under study. In this paper, we present the results of experiments we have been running with this technique. Besides the use of the F V T suites, our work differs from that of Forrest et al. in three main ways: - The algorithm used to create fixed-length patterns is slightly different and results in shorter tables of patterns. We consider in addition the opportunity of using variable-length patterns [8] and various ways of doing this. The decision to raise an alarm is based on a new paradigm that is simpler than the Hamming distance described in [2], thus allowing possible real-time countermeasure. -
-
The structure of the paper is as follows. Section 2 presents the principles of the intrusion-detection techniques we are working with. It explains how patterns are generated and used to cover sequences. Section 3 presents the experimental results obtained when applying a fixed-length pattern-generation algorithm. Section 4 does the same for a variable-length pattern-generation algorithm. These two sections also highlight advantages and disadvantages of each technique. Section 5 concludes the paper by offering a comparison of the results as well as ideas for future work. 2
Principles
of the
Approach
In our approach, UNIX processes are described by the sequence of audit events 1 that they generate, from start (fork) to finish (exit). Their normal behavior is modeled by a table of patterns which are subsequences extracted from these sequences. The detection process relies on the assumption that when an attack exploits vulnerabilities in the code, new (i.e., not included in the model) subsequences of audit events will appear. Fig. 1 describes the complete chain used to configure the detecting tool. 1 Note that Forrest et al. [1], [2] use the sequences of system calls instead of audit events. Obtaining system calls constitutes a more intrusive technique than the one we present here.
Fixed vs. Variable-Length Patterns for Suspicious Behavior Detection
3
TRAINING SYSTEM
Translation[
[Filtering
v
PROC_cre$te FILE_close FILE_close
I Pattern
i || i
?
9 . .D A A B G E A B C D D A E F G
......
DABGEABCDAEFG...
FILE_open
OFF-LINE ON-LINE
.
.
.
.
.
.
1
Exgaction
MODEL: Table of known patterns .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
INTRUSION DETECTION SYSTEM
.
.
.
.
.
~t T
[.~ ' T;anilati'onI
Filtering
]
]
Pattern
~lAwegaa~ I
-- [ Matching
V
PROCcrea$e FILE_close
riLE ol .
9 . .AABBGGEABCDDADECFG
......
ABGEABCDAECFG...
ABGE ABCD
.
.
.
:~'.~"
-> matched -> matched
~!~.'..
FILE_open SPASSWORDREAD TCPIP_eonnect
Fig. 1. Intrusion-detection system
2.1
Off-Line Treatment
The upper part of Fig. 1 represents how the model of normal behavior is created off-line. Audit events are recorded from the ftp daemon, which has been triggered by an experiment process, and translated as letters for easier handling. Such a letter in our system actually represents the combination of the audit event and the name of the process generating it, both pieces of information being provided by the Unix C2 audit trail. This recording runs through a filtering and reduction process whose purpose is explained later. When the experiment has been completed, the entire audit information is used to generate the table of patterns that constitute our model of the normal behavior of the system. The purpose of the filtering/reduction/aggregationbox is threefold (explained in more detail in the remainder of the section):
filtering to eliminate irrelevant events (processes that are not related to the services we monitor) and to sort the remaining ones by process number;
reduction to remove duplicate sequences due to processes generating exactly the same stream of audit events from start
(fork)
to finish
(exit), 2
and
aggregation to remove consecutive occurrences of the same system calls.
2 This is not used in the on-line version, because of the memory cost and algorithmic complexity of keeping all known process images.
4
Herv@ Debar et al.
Sorting prevents the introduction of arbitrary context switches into the audit stream by the operating system. Processes are sorted so that the intrusiondetection process is applied to each process individually. The events inside the process remain in the order in which they have been written to the audit trail. The reduction keeps only unique process images for model extraction. We use test suites to record the normal behavior of the ftp daemon. These test suites carry many repetitive actions and therefore result in many instances of the same process image (see [7] for a complete explanation of this issue). We remove these duplicates as they are not used by the current algorithm extracting the patterns from the reference audit data. The aggregation comes from the observation that strings of N consecutive audit events (N > 20) are quite frequent for certain events, with N exhibiting small variations. A good example is the "ftp login" session, where the ftp daemon closes several file handles inherited from inetd, the number of these closes changing without obvious cause. Therefore, we decided to simplify the audit trail in order to obtain a model containing fewer and shorter patterns. We did not observe any difference in either the false positive or false negative rates between those experiments in which the aggregation was used and those in which it was not. There is no claim of equivalence between the reduced (simplified) audit trail and the original one; the new one possibly has less semantic content. This is an experimental choice, and we would remove the aggregation part if the true/false alarm performance of the intrusion-detection system were not satisfactory. This aggregation phase would also be skipped if the intrusion-detection technique requires redundancy, e.g. neural networks. The most obvious aggregation consists of replacing identical consecutive audit events with an extra "virtual" audit event. However, doing so enriches the vocabulary (the number of registered audit events) and possibly the number of patterns, which we want to keep small. Our solution simply aggregates these identical consecutive audit events. Therefore, any audit event represents "one or more" such events (i.e., A = A+ in regular expression formalism) at the output of the aggregation box. 2.2
R e a l - T i m e Detection
The lower part of Fig. 1 shows the real-time intrusion-detection process. Audit events are again generated by the ftp daemon, and go through the same sorting and reduction mechanism in real time. Then, we apply a pattern-matching algorithm (Section 2.3) to cover the sequences on the fly. When no known pattern appropriately matches the current stream of audit events, a decision has to be taken whether to raise an alarm (see Section 2.4).
2.3
Pattern-Matching Algorithm
The pattern-matching algorithm is quite critical for the performance of the intrusion-detection system. We wish to maximize both speed and detection capabilities. Therefore, we require that patterns match exactly, i.e., they cannot be
Fixed vs. Variable-Length Patterns for Suspicious Behavior Detection
5
matched as regular expressions using wildcards. We use a two-step algorithm: the first step looks for an exact match, and the second step looks for the best partial match possible. These two steps are illustrated with the examples presented in Figs. 2 and 3.
Table
ABCD XYZD ABC GHI XYZ HF
String:
ABCABCDXYZGHI
ABC Selectable
ABCD XYZD ABC
Reminder of string:
ABCDXYZGHI
Reminder of string:
XYZGHI
ABCD ABCD XYZD ~
Reminder of string:
Gill XYZ
GHI ABCD XYZD ~
ABC Validated GHI
A B C validated: 3 consecutive patterns yield exact match
Fig. 2. Exact pattern-matching sequence
Table
ABCD XYZD ABC GHI XYZ HF
String:
ABCABCDKMWHF AP,~CD XYZD ABC
Reminder of String:
ABCDKMWHF ABC ~ length6 ABCD ~ length 7 ~ K M W
SELECTED
= Shifted ~ Shifted ~- Shifted
Uncovered sequence K M W Fig. 3. Approximate pattern-matching sequence The first step of the algorithm is illustrated in Fig. 2. A pattern that matches the beginning of the string is selected from the p a t t e r n table. If no pattern matches the beginning of the string, then the first event is counted as uncovered and removed. The algorithm is subsequently applied to the remainder of the string. Once a selectable p a t t e r n has been found, the algorithm will recursively look for other patterns t h a t exactly cover the continuation of the string up to a given depth D or to the end of the string, whichever comes first (D = 3 in Fig. 2). This means that, for a selectable pattern to be chosen, we must find D other patterns that also completely m a t c h the events following t h a t p a t t e r n in the sequence. If such a sequence of D patterns does not exist, step 2 of the algorithm is used. The rationale behind this is t h a t we want to know whether the selected pattern has
6
Herv@ Debar et al.
a positive influence on the coverage of the audit events t h a t are in its vicinity. When such a D - p a t t e r n sequence has been found, the algorithm pops the head p a t t e r n out of the sequence and goes on with the remainder of the string. If we cannot find three other patterns t h a t m a t c h the sequence after the selected pattern, we deselect it and try the next selectable p a t t e r n in the table. The fact t h a t there were selectable patterns, but none t h a t fulfill the depth requirement, triggers the second phase of the algorithm. The second step of the algorithm deals with failure cases and is illustrated in Fig. 3. The algorithm looks for the sequence of N patterns t h a t covers the largest number of audit events. This sequence is removed from the string and the algorithm goes back to its first part, looking again for a selectable pattern, and shortening the string as long as one is not found. 2.4
Intrusion
Detection
For each string, the algorithm extracts b o t h the number of groups of audit events t h a t were not covered and the length of each of these groups. The decision whether to raise an alarm is based on the length of the uncovered sequences. It is worth noting t h a t this is different from the measures used for the same purpose by Forrest et al., namely the n u m b e r and the percentage of uncovered events in a string [2]. There are three reasons for this. Processes can generate a large number of events. An attack can be hidden in the midst of a set of normal actions. Using the percentage of uncovered characters would fail to detect the attack in this case. - Small processes could be falsely flagged as anomalous because of a few uncovered events if we use the percentage of uncovered characters. - Long processes could be falsely flagged as anomalous because of m a n y isolated uncovered events if we use the amount of uncovered characters. -
During our experiments we observed t h a t the trace of the attacks was represented by a number of consecutive uncovered characters. This is consistent with the findings of Kosoresow and Hofmeyr [5], who note t h a t mismatches due to intrusions occur in fairly distinct bursts in the sequences of system calls they monitored. This means t h a t an attack cannot be carried out in fewer t h a n T consecutive characters. Our experiments led us to choose the value 7 for T. In other words, each group of more t h a n six consecutive uncovered events is considered anomalous and flagged as an attack. This amounts to defining an intrusion as any event t h a t generates a sequence of at least T consecutive events not covered by the algorithm defined above. To validate the concepts presented here, we have developed an intrusiondetection testbed [9]. We have used it to compare various strategies to build the table of patterns used in the detector. The results of these experiments are reported in the next two sections. T h e y focus on the ]tp service, which is widely used, is known to contain m a n y vulnerabilities, and provides rich possibilities for user interaction.
Fixed vs. Variable-Length Patterns for Suspicious Behavior Detection
3 3.1
7
Generating Fixed-Length Patterns Pattern Generation
Fixed-length patterns are the most immediate approach to generate the table of patterns. We have evolved four ways of obtaining these patterns. For each of them, we use a parameter L, the length of the patterns. Technique 1 is an exhaustive generation of all possible patterns by sliding a window of size L across the sequence and recording each unique L-event sequence. This technique is easy to implement, but has obvious drawbacks. First, a large number of patterns in the reference table will not be used by the matching algorithm on the reference data set. Moreover, the covering algorithm generates boundary mismatches when the length of the process image is not a multiple of L, because it uses a juxtaposing window instead of a sliding one (which would not make sense in this context). It is worth noting that this is the technique used to run the experiments described in [2], [4], and [5]. To solve the first problem, technique 2 is introduced, which shifts the string by L events, thus creating juxtaposed patterns instead of overlapping ones. In other words, the events in positions 1 to L constitute the first pattern, events in the positions L + 1 to 2L constitute the second one, and so on ... The table we obtain is thus much shorter than the table generated by technique 1, which makes the pattern-matching algorithm faster. Technique 2 drops the remainder of the sequence of events when its length is not a multiple of L, therefore the second problem mentioned remains. To solve it, we have introduced techniques 3 and 4. They are similar to technique 2 but keep the events remaining at the end of the string as a stand-alone pattern or concatenated to the last generated pattern, respectively. Techniques 3 and 4 both result in 100% coverage of the test sets. As they are the ones that were most promising during the experiments, we focus on them in the remainder of this paper. Note that some of the patterns are smaller or larger than the specified size, and these can only match at the end of the process execution, as they necessarily contain the end-of-process event.
3.2
Experimental Results
We have used three test sets in our experiments. The first test is solely based on the FVT. The second test set includes additional information on tools that are used by ftp users, but are not in the testing of the ftp daemon p e r se. The third data set is based on the recording of several hundred real user sessions. The experimental results are presented for the second test set. The reference table generated from the first set does not contain patterns for commands like tar, which are used by normal users. Using the third set breaks the rule fixed in the introduction of the paper (no user recording for training), thus this set has only been used to evaluate the false negative rates obtained with our approach. The output of the intrusion-detection system is shown in Fig. 4. Each line reports the analysis for one process related to the ftp service.
8
Herv~ Debar et al. INumber of INumber of IPercentage INumber of IAverage INumber of ILength of laudit
luncovered
levents
levents
569 97 928 735
6 878 839
8 963 168 476 366
I o f covered Ipatterns
events 99.30 100.00 99.68 99.59 100.00 99.09 99.52 25.00 99.79 93.81 99.79 99,18
188 32 308 244 2 290 278 1 320 58 158 121
Isize of Igroups Ipattern 3.01 3.03 3.00 3.00 3.00 3.00 3.00 2.00 3.00 3.02 3.01 3.00
leach group 1111 111 111 IIiiiiii iiii 6 2 II 1 IIi
Fig. 4. Output of the intrusion-detection system
Table 1 shows the results of the experiments to monitor our ftp server under various conditions, using the intrusion-detection system loaded with several reference tables. The first column identifies the experiment. The second column gives the number of audit events in the experiment. The third column indicates the two parameters corresponding to the reference table used in the intrusiondetection system, t denotes the generation technique (3 or 4) and s the size of the patterns (2, 3 and 4). The fourth column gives the overall number of uncovered characters. The fifth column gives the number of patterns used for the coverage. The sixth column gives the average size of patterns, 3 and the seventh column indicates the number of false alarms triggered by the intrusion-detection system. Table 1 shows that our intrusion-detection system has the potential to perform well without generating false alarms. Indeed, for patterns of length 2, independent of the technique used (3 or 4), the intrusion-detection system does not generate false alarms. For patterns of length 4, independent of the technique used, the intrusion-detection system generates false alarms. For patterns of length 3, the generation of false alarms is dependent on the technique used for the generation. With technique 3, which introduces more patterns but also more freedom (some patterns can be smaller than the required length of 3), the intrusion-detection system does not generate false alarms, whereas it will generate false alarms if technique 4 is used. It is also worth mentioning that if we use the third test set, which is closely based on user-behavior observations, instead of the second one, we observe no false alarms for any of the combinations (pattern size, technique). However, using user-generated data to train the intrusion-detection system breaches the requirements expressed in the introduction.
3 This value is truncated after the second decimal.
F i x e d vs. Variable-Length P a t t e r n s for Suspicious Behavior D e t e c t i o n
9
T a b l e 1. E x p e r i m e n t a l results for normal user sessions Simulation description A n o n y m o u s user sessions
N u m b e r Technique N u m b e r of N u m b e r of A v e r a g e Number of audit and uncovered p a t t e r n s size of of false events size events required p a t t e r n s alarms 6067
t--3, s=2
t=3, s=3 t = 3, s -- 4
t=4, s=2
N o r m a l user sessions not using "site exec"
19439
Normal user sessions
20135
Sequence of all available user simulations
26094
Simulation of user activity on loaded ftp server
26148
t -- 4, s = 3 t -- 4, s -- 4 t=3, s=2 t -- 3, s = 3 t = 3, s = 4 t = 4, s = 2 t -- 4, s -- 3 t -- 4, s ----4 t--3, s--2 t = 3, s -- 3 t = 3, s = 4! t = 4, s = 21 t = 4, s = 3 t = 4, s = 41 t=3, s=2 t 3, s ----3 t ----3, s = 4 t -----4, s = 2i t ----4, s = 3 t---- 4, s = 41 t----3, s = 2 1 t = 3, s -= 3 t ----3, s = 4 t = 4, s ---- 2 t = 4, s = 3 i t = 4, s = 4
2 79 180
2 82 186 9 108 257 10 116 264 9 108 257 10 116 264 11 199 444 12 213 511 11 185 437 12 196 450
3028 1994 1477 3013 1983 1455 9705 6443 4802 9685 6416 4769 10052 6675 4978 10025 6643 4936 13029 8621 6431 12985 8586 6362 13053 8652 6440 13015 8611 6380
2.00 3.00 3.99 2.01 3.02 4.04 2.00 3.00 3.99 2.01 3.01 4.02 2.00 3.00 3.99 2.01 3.01 4.03 2.00 3.00 3.99 2.01 3.01 4.02 2.00 3.00 3.99 2.01 3.01 4.03
0 0 3 0 0 3 0 0 1 0 0 1 0 0 1 0 0 1 0 0 4 0 0 4 0 0 4 0 0 4
T o e v a l u a t e t h e risk o f n o t d e t e c t i n g a n a t t a c k , 4 T a b l e 2 s h o w s t h e n u m b e r o f v a l i d s e q u e n c e s o f six a u d i t e v e n t s t h a t c a n b e c r e a t e d o u t o f t h e t a b l e (six is t h e r e b e c a u s e o u r T = 7). T h i s is t h e m o s t o b v i o u s m e a s u r e , b u t it d o e s n o t a c c o u n t for t h e full risk, as a n a t t a c k c o u l d b e t h e c o m b i n a t i o n o f l e g i t i m a t e s e q u e n c e s w i t h s h o r t illegal ones. T h e t a b l e c l e a r l y s h o w s t h a t a s m a l l i n c r e a s e in t h e l e n g t h o f t h e p a t t e r n d r a m a t i c a l l y r e d u c e s t h e d e g r e e s o f f r e e d o m in t h e t a b l e , a n d we c o n s i d e r t h i s a v e r y i m p o r t a n t r e s u l t . Also, t h e t e c h n i q u e u s e d t o 4 B y this, we denote the possibility of r u n n i n g an attack t h a t would create a new sequence of audit events w i t h o u t being detected. This is possible if t h e new sequence consists of subsequences t h a t are valid, i.e. t h a t are present in the reference table.
10
Hervd Debar et al. T a b l e 2. Statistics on fixed-length tables Technique Size of Size of and size table patterns t--3, s=2 87 {1,2}
it.- 3, s = 3
114
t=3, s=4
152
it=4,s=2
86
t-----4, s = 3 113 t----4, s----4 147
Number oflegitimate combinations oflength < T(6) 1142100 {1,2,3} 11559 {1, 2, 3, 4} 59O {2, 3} 456456 {3, 4, 5} 9905
{4, 5, 6, 7}
143
generate the patterns has a dramatic impact on the number of combinations, as small p a t t e r n s (typically of length 1, potentially generated by the use of technique 3) significantly increase the probability of covering attacks, i.e., of not detecting them. We have also carried out a number of attacks against our ftp server to validate our approach. Table 3 shows our experimentation results for nine attacks. An X in the column indicates t h a t the attack has been detected. Table 3 reveals t h a t three attacks pose a problem to the intrusion-detection system, but for very different reasons. The "put rhosts" attack consists of putting a .rhosts file in the home directory of the ftp user. This vulnerability results from a misconfiguration of the ftp service because this directory should obviously not be world writable. Actually, the "put forward" and the "put rhosts" attacks use the same vulnerability, except t h a t one of t h e m puts a .forward file, and the other one a .rhosts. The first one is detected because our attack script activates the attack by sending a mail to the ftp user, thus triggering the execution of the .forward file by the sendmail daemon. T h e second attack is not triggered, because our attack script does not log on to the attacked machine. Any login a t t e m p t would immediately
T a b l e 3. Experimental results for attack detection Attack description [t=3, s=2[t=3, s=3[t----3, s=4]t=4, s=21t--4 , s=3[t=4, s=41 put forward X X X X X X put rhost site copy X X X X X X site exec copy ftpd X X X X X X site exec copy ls X X X X X X site exec ftpbug ftpd X X X X X X tar exec copy ffpd X tar exec copy ls X X X X tar exec ftpbug ftpd X X X X X X tar exec ftpbug ls X X X X X X
Fixed vs. Variable-Length Patterns for Suspicious Behavior Detection
11
be detected, as a shell would be spawned for the ftp user, and this behavior does not belong to the reference data set. The two "tar exec copy" attacks use the option of G N U tar to specify a compression program that will be used by tar to uncompress the data before extracting it. In our case, we have developed a Trojan horse that does no decompression, but c o p i e s / b i n / s h t o / t m p / . s h as a setuid executable. This attack exploits the fact that some ftp daemons do not release their root privileges quickly enough before forking other processes. Therefore the tar and the Trojan horse both run as root and t h e / t m p / . s h is created suid root. Our Trojan horse is compiled and saved under the name of either ls or ]tpd (hence the code name of the attacks). The attacks are only detected by the more restrictive table of patterns of length 4. We observe that the patterns modeling the normal behavior of an ls process in our reference table are more restrictive than those modeling #pd, as the masquerading of our attack program under the process name "Is" is detected by (t = 4, s = 3) but not the masquerading under the process name "ftpd" with the same pair (t = 4, s = 3). This second case also highlights one of the difficulties of this intrusiondetection technique: the reference data set must be truly restrictive and exhaustive, so that the patterns can faithfully represent the information. In our case, tar is not a command that is tested by the test suites we use to generate the reference data set, as it is not considered to be strictly related to ftpd. We have added some examples of usage of the tar command to our reference data set, but forking commands is a legitimate behavior of the tar command, and both the ftpd and ls patterns cover the execution of our Trojan horse. Lowering the threshold L would improve detection using the smaller reference tables, but it would also increase the false alarm rate.
4
4.1
Generating
Variable-Length
Patterns
Rationale for Variable-Length Patterns
The fixed-length patterns approach yields interesting results, but a careful look at the sequence of audit events shows that there are very long subsequences which repeat themselves frequently. For example, more than 50% of the process images we have obtained start with the same string. This string contains 40 audit events. Therefore, having these very long sequences would better represent our data. This is intuitively correlated by the fact that the ftp daemon answers to commands given by the user, and that each command can probably be represented by a (set of) long sequence(s) of audit events. Therefore, having the capability to extract variable-length patterns from our d a t a makes sense [8]. However, we also are looking for an automated approach. Manual pattern extraction might be feasible for small-scale experiments, but is very time consuming. Therefore, we have looked for a method that will allow us to generate variable-length patterns.
12
Herv6 Debar et al.
4.2
Description of the Pattern-Generation
Technique
A block diagram of the variable-length pattern generator is shown in Fig. 5. It consists of a suffix-tree constructor, a pattern pruner, and a pattern selector. The suffix-tree constructor processes the sample of user behavior as a sequence of symbols, i.e., a string to generate its suffix tree. The well-known suffix-tree structure is a dictionary of all substrings in the string [10]. An example of a string and its corresponding suffix tree are given in Fig. 6.
SUFFIX TREE CONSTRUCTOR
] ]
- ] PATTERN Suffix - I . , PR UNER Tree
] -I I Pattern-I Tree
PATTERN SELECTOR
I [ Reduced Tree
Fig. 5. Variable-length pattern generation
Sample of normal behavior:ACABACBDBABD$ ACBDBABD$
~ACBDBABD$
B
2
B
2
C
2
C
2
~---~BDBABD$ .~ ^ 2 f BD$ 12~CBDBABD$
~
12
12
ABDS
(a) Suffix tree
(b) Pattern tree
(c) Reduced tree
Fig. 6. Pattern extraction
The suffix tree initially consists of only a root. Then all suffixes of the sample string are added one by one until the complete suffix tree is obtained. Note that a "$", representing an end delimiter, is added to the end of the string to ensure that the suffix tree contains one leaf corresponding to each suffix. Every
Fixed vs. Variable-Length Patterns for Suspicious Behavior Detection
13
node corresponds to the substring represented by the symbols on the path from the root to that node. The number next to the node represents the number of occurrences of the substring in the sample string. The pattern pruner prunes the suffix tree in order to generate a tree, referred to as a pattern tree. The pruning is done based on the pruning factor, which is a real number between 0 and 1. This factor is multiplied by the length of the sample of user behavior to obtain a lower limit on the occurrences of patterns. Every pattern with a number of occurrences less than this minimum is clipped, i.e., all its outgoing branches are removed. Fig. 6b shows the pattern tree when the suffix tree is pruned with a pruning factor of 3/12. The number of occurrences of each pattern in the sample string is indicated next to the node corresponding to that pattern. As the branching factor is increased, the suffix tree is pruned more and, hence, the pattern tree contains fewer and shorter patterns. We can use this flexibility to try different sets of patterns and choose the one that results in the most satisfactory detector performance. From the set of pruned patterns the selector removes those patterns that are not required for covering the sample of user behavior. In our example, patterns AB, AC, BA, B D in the pattern tree in Fig. 6c are sufficient to cover the sample of user behavior; patterns C, D are not needed. The selector identifies the needed patterns by processing the sample with a pattern matcher identical to that of the detector. First, it loads the matcher with the set of the pruned-pattern tree and sets all the nodes to not marked. It then performs matching on the sample. For every matched-pattern event generated by the matcher, it marks the corresponding pattern in the pruned-pattern tree. After the entire sample has been processed, the marked patterns are selected as the set of patterns to be used by the detector. Note that currently the matcher used here is different from and much simpler than the matcher of the intrusion-detection system.
4.3
Experimental Results
Using the second test set, we have built three reference tables of patterns. The first reference table contains all the possible sequences extracted from the pattern tree, 5 denoted all patterns in Table 4. The second reference table contains only the sequences going from the root to the leaf in the pattern tree, 6 denoted leaf patterns in Table 4. The last one contains only the sequences going from the root to the leaf in the reduced tree 7 (see Fig. 6c). The statistics for the reference tables are given in Table 4, according to both the kind of reference table and the pruning factor used in the tree generation. The information still has to be compared with the complete number of combinations that we can derive from all the registered pairs (application, audit event) in our intrusion-detection system, which gives 562 9 554 combinations s.
s A, AB, AC, B, BA, BD, C~D in the example in Fig. 6. 6 AB, AC, BA, BD, C, D in the example in Fig. 6. 7 AB, AC, BA, BD in the example in Fig. 6 s In our experiments, we have observed 55 different pairs (audit event, process name); two fictitious events have been added to represent the beginning and end of a process.
14
Herr@ Debar et al. Table 4. Statistics on variable-length tables Reference table pize of table Pruning factor All patterns Leaf patterns Seiected patterns Leaf patterns Selected patterns Leaf patterns Selected patterns
87 82 51 146 74 2i5 80
0.1 0.1 0.1 0.05 0.05 0.02 0.02
Number of legitimate combinations of length < T(6) > 24 * 22 ~ > 21 * 19~ > 13 * 12~
I
> > > >
17 * 165 12 9 115 16 * 14~ 12 9 115
These reference tables actually contain patterns of length 1 to 50. This approach identifies the long sequences we found by browsing the data sets. However, they are actually less restrictive that the ones obtained with fixed-length patterns, owing to the presence of very small patterns. This unexpected result is a consequence of the presence of numerous patterns of length one, along with much longer patterns, in the table. This is the reason we are now working to improve the variable-length pattern tables by introducing a lower limit for the size of the patterns. Lowering the pruning factor results in longer patterns, but the very small ones are still required to obtain a good match, thus the number of legitimate combinations is still very large. We have experimented with these three tables on the normal user behavior, and found that they also do not generate false alarms. Concerning the attacks, we obtain the same results as in the previous section, i.e., the ".rhosts" and "tar exec copy" attacks are not detected.
5
Conclusion
and
Future
Work
In this paper, we have proposed two different families of methods, fixed- vs. variable-length pattern generation, to generate tables of patterns that model the normal behavior of a process. We have also proposed a new paradigm to detect intrusions based on the length of uncovered sequences and relying on a specific pattern-matching algorithm. Based on the results of the experiments run on our testbed, we have shown that the appeal of the new paradigm for both families of patterns (fixed length and variable length) is that it minimizes the false negative alarm rate. We have explained the limitation of the fixed-length pattern approach, namely its inability to represent long meaningful substrings. We have defined a new family of methods based on a suffix-tree representation to obtain variable-length patterns. Preliminary results exhibit their ability to detect real intrusions, but a detailed analysis indicates that they are also more prone to issue false alarms. Further work will concentrate on enhancing the algorithm for variable-length generation of patterns in order to reduce the risk of obtaining false positive
Fixed vs. Variable-Length Patterns for Suspicious Behavior Detection
15
alarms while keeping the good results regarding the true positive rate. Another proposed enhancement is to associate frequencies with the patterns, to customize the intrusion-detection patterns locally by flagging the infrequent use of registered patterns.
References 1. Forrest, S., Perelson, A.S., Allen, L., Cherukuri, R.: Self-Nonself Discrimination. In: Proceedings of the 1994 IEEE Symposium on Research in SeCurity and Privacy. IEEE Computer Society Press, Los Alamitos, CA (1994) 202-212. 2. Forrest, S., Hofmeyr, S.A., Somayaji, A., Longsta.ff, T.A.: A Sense of Self for Unix Processes. In: Proceedings of the 1996 IEEE Symposium on Research in Security and Privacy. IEEE Computer Society Press, Los Alamitos, CA (1996) 120-128. 3. D'haeseleer, P., Forrest, S., Helman, P.: An Immunological Approach to Change Detection: Algorithms, Analysis, and Implications. In: Proceedings of the 1996 IEEE Symposium on Research in Security and Privacy. IEEE Computer Society Press, Los Alamitos, CA (1996) 110-119. 4. Forrest, S., Hofmeyr, S.A., Somayaji. A.: Computer Immunology. Commun. ACM 40 (1997) 88-96. 5. Kosoresow, A.P., Hofmeyr. S.A.: Intrusion Detection via System Call Traces. IEEE Software 14(5) (1997) 35-42. 6. Esmalli, M., Safavi-Naini, R., Pieprzyk, J.: Computer Intrusion Detection: A Comparative Survey. Technical Report 95-07, Center for Computer Security Research, University of Wollongong, Wollongong, NSW 2522, Australia (May 1995). 7. Debar, H., Dacier, M., Wespi, A.: Reference Audit Information Generation for Intrusion Detection Systems. In: Posch, R., Papp, G. (eds).: Information Systems Security, Proceedings of the 14th International Information Security Conference (IFIP SEC'98), Vienna, Austria, and Budapest, Hungary, Aug. 31-Sept. 4, 1998 (in press). 8. Teng, H.S., Chen, K., Lu, S. C-Y: Adaptive Real-Time Anomaly Detection Using Inductively Generated Sequential Patterns. In: Proceedings of the IEEE Symposium on Research in Security and Privacy. IEEE Computer Society Press, Los Alamitos, CA (1990) 278-284. 9. Debar, H., Dacier, M., Wespi, A., Lampart. S.: A Workbench for Intrusion Detection Systems. IBM Research Report RZ 2998, IBM Research Division, Zurich Research Laboratory, 8803 Rfischlikon, Switzerland (1998). 10. Stephen. G.A.: String Searching Algorithms. World Scientific, Singapore (1994).
A Tool for Pro-active Defense Against the Buffer Overrun Attack D. Bruschi, E. Rosti, R. Banff Dipartimento di Scienze dell'Informazione Universit~ degli Studi di Milano Via Comelico 39/41, 20135 Milano - Italy {bruschi, rose}@dsi, unimi, it, roby~psycho, usr. dsi. unirai, it
A b s t r a c t . The problem of buffer overruns, i.e., writing past the end of
an array, in C programs has been known since the early seventies as one of the possible consequences of the C language data integrity philosophy. Since the late eighties, when computer security incidents started affecting the Internet, it has been clear that buffer overruns are a powerful threat to system security as they allow ordinary users to gain superuser privileges on Unix systems. Nowadays, buffer overruns are one of the most popular exploits in the hacker scene. In this paper we present a tool for the automatic detection of buffer overrun vulnerabilities in object code. It can be applied to operating system components as well as ordinary programs. The tool is aimed at helping system administrators eliminate vulnerable programs before they are exploited. A fully working prototype for HP-UX and Linux systems is currently available. Extensions are planned for other Unix versions.
1
Introduction
C o m p u t e r security problems either derive from security unconscious design, such as the vulnerability to eavesdropping or the lack of strong authentication in the T C P / I P communication protocols suite, or from p r o g r a m m i n g and system configuration errors, which, again, m a y depend upon poor design from a security point of view. Attack techniques exploit b o t h security flaw designs and insecure p r o g r a m m i n g practice and errors. Buffer overruns are p r o g r a m m i n g errors t h a t can be exploited in order to gain superuser privileges on a system, leading to what is usually called "root compromise" [6]. Buffer overrun is the effect of writing past the end of an array, or buffer, in C jargon. It is also known as "buffer overflow," because a buffer is stuffed with more d a t a t h a n it can contain. Another popular term used for buffer overrun based attacks is "stack smashing," since p a r t of the process stack segment is overwritten. The problem of buffer overruns in C programs has been known since the definition of the C language itself. In order to improve upon the language flexibility and efficiency, d a t a integrity checks such as bounds checks on array or
18
D. Bruschi et al.
checks on pointer arithmetic are left to the programmer. Neither static nor dynamic checks are enforced at language level thus favoring insecure programming practice. A historical case of buffer overrun is the l o g i n program, whose vulnerability has been known since the late 70's. It was, however, with the Internet Worm in November 1988 that buffer overruns started being exploited for security purposes [17]. Of the three techniques the Worm used to spread to new hosts, one was overflowing the f i n g e r command line argument with an input string of 536 characters instead of 512, which was the size of the input parameter buffer. Since 1988 buffer overruns have become more and more popular on the hacker scene. In 1997, 15 out of 28 advisories 1 of the Computer Emergency Response Team Coordination Center (CERT-CC) of Carnegie Mellon University concerned buffer overrun vulnerabilities identified in commonly used programs and facilities in Unix systems [3]. Many research efforts have been spent in recent years in order to devise solutions to the buffer overrun problem [4, 8,16, 15, 9]. In general, such solutions aim at providing tools for writing programs where buffer overruns are not possible. Yet, they do not offer a solution to the problem of vulnerable existing legacy code and commercial distributions installed world wide. In this case the only defense against possible attacks is the prompt update of the software with new releases or patches. In spite of the increasing computer security awareness, it is not common to find promptly updated systems where new releases or patches are installed as soon as they become available. Being able to automatically detect programs that are vulnerable to buffer overrun attack before they become known to the hacker world would greatly improve system security. This is the perspective from which we have addressed the buffer overrun problem in this work. Specifically targeting executable programs, we have developed a tool that automatically detects buffer overrun vulnerabilities in object code. The tool searches the file system for critical programs and tests them individually against buffer overruns. For the vulnerable ones the exploit is built and executed against the program. A notice is then sent to the system administrator, reporting the found vulnerability. Our tool can be applied to operating system components as well as system utilities and ordinary programs. A fully working prototype for HP-UX and Linux systems is currently available. Extensions are planned for other Unix versions. This paper is organized as follows. Section 1 illustrates the factors that make buffer overrun attacks possible. Section 3 sketches how to craft an attack. Section 4 presents the tool we developed. In Section 6 solutions proposed in the literature are presented and discussed. Section 7 summarizes our contribution and concludes the paper.
1 Buffer overrun related advisories in 1997 only are numbers 2, 4, 5, 6, 9, 10, 11, 13, 17, 18, 19, 21, 23, 24, and 26.
A tool for pro-active defense against the buffer overrun attack 2
General
19
Background
In this section we provide an overview of the factors that make a language issue such as buffer overruns a security issue, i.e., Unix access control scheme, C language data integrity philosophy, and function execution model. We assume a 4.4 BSD system as reference. However, we keep our description general enough to apply to other Unix versions. For further details, see [11, 1, 10].
2.1
U n i x Access Control S y s t e m
The Unix operating system implements a discretionary access control system. Users are allowed to grant and revoke access rights on the objects they own. Such objects are typically files (or directories) and processes. File protection is implemented by a three-level access control mechanism. Access rights are specified for the owner of the file, for the group the owner of the file belongs to, and everyone else, i.e., the rest of the world. For each of the three categories, read, write, and execute permissions are specified. Three additional bits, called suid (set user identifier), sgid (set group identifier), and sticky bit, apply to executable files and specify processes run time privileges. In particular, the suid and sgid bits allow a process to be granted additional privileges during execution. A process privileges are those of the user and group owning the process and are inherited from the parent process at process creation time. Both the user and the group identifiers associated with a process are actually double, as Unix distinguishes between the real and the effective uid and gid. The real uid of a process, known as ruid, is the uid of the user executing the program (i.e., the owner of the process). The effective uid, known as euid, is the uid of the owner of the program. The same applies to the gid. A confined environment is created where privileged programs have their scope limited to that of the user executing them. However, it may be necessary for a process to access objects for which it has no privileges while the program owner does. A mechanism for granting temporary privileges is implemented with the suid (sgid) bit and the s e t u i d ( ) ( s e t g i d ( ) ) system call. When a setuid program (i.e., a program with the suid bit on) executes the s e t u i d ( ) system call, the process ruid changes to euid. The program owner's file system privileges are granted to the executing process for the time the ruid remains set to that of the program owner. The setuid (setgid) mechanism is particularly dangerous when the program owner is root, as the process is granted all the superuser privileges [2].
2.2
Function Call E x e c u t i o n M o d e l
When a program is executed, its process image is allocated an address space organized in three parts, text, data, and stack segments, as illustrated in Figure 1.
20
D. Bruschi et al.
A read-only static m e m o r y area starting at low addresses 2 is allocated for the text segment, which might be shared by multiple users executing the program. Immediately above the text segment, the d a t a segment is divided into initialized data (i.e., the statically allocated data) and uninitialized d a t a (i.e., pointers to dynamically allocated data, bss in Figure 1). T h e d a t a segment can be modified by and is private to each process. D a t a allocated at run time using variables in bss occupy the space above the d a t a segment, called heap. The heap grows towards bigger addresses. Starting at high addresses and growing towards lower ones, the user stack segment occupies a contiguous set of m e m o r y locations organized in Last-In-FirstOut order. The stack organization is used as it matches the natural sequence of function call activation records. W h e n a function is called, a new stack frame process in memory high addresses per-process kernel stack red zone user
area
ps string structure signal code ~x
env strings
x
argv strings env pointers
x
shell environment variables command line arguments
argv pointers J
argo
s
user stack program in secondary storage
linke~header heap
I magic number symbol table
bss
initialized data low addresses
text
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
f
__~
Fig. 1. The user memory address space.
is pushed onto the stack and d a t a saved in fixed locations by the function prologue, an operating system component that prepares the function environment 2 The allocation depicted in the figures, although common to many modern architectures, is an implementation convention. The same is true for the specific values at which each segment starts. Therefore we do not indicate any value, as it is irrelevant to our description.
A tool for pro-active defense against the buffer overrun attack
21
for execution. At function return, the previously saved environment is restored by the function epilogue. The return address is loaded in the program counter, the saved stack pointer is loaded in the stack pointer register, and the saved frame pointer is loaded in the frame pointer register. The rest of the frame is then popped and the freed area used for future calls. 2.3
C Language
The C programming language is often considered a high level assembly language because of its flexibility and efficiency [10]. These characteristics derive from lack of type checking on data structures, no boundary checking on arrays, pointer arithmetic, automatic type conversion, and unstructured control statements such as goto or break/continue for interrupting loops. By minimizing type checking and allowing type conversion, variables can be assigned any values, as long as they fit in the allocated space, and flexibility achieved. Lack of bounds checking at run time when dealing with arrays or buffers improves upon execution time as array accesses are faster. Data integrity checks are intentionally left to the programmer. Therefore, when no program level bound checks apply, it is possible to write past the end of an array without causing errors unless private areas are accessed. Furthermore, standard C libraries implementing string manipulation functions, such as s t r c p y ( ) , and i n p u t / o u t p u t operations involving strings, such as g e t s (), do not check the array size they manipulate 3. Most of the buffer overrun attacks are based on such a well known yet overlooked flaw.
3
The Attack
The setuid mechanism adopted in Unix in order to temporarily enhance the privileges granted to an executing process together with the C programming language data integrity features become critical from a security point of view when combined with the process memory execution model. Detailed descriptions of how to write an attack have appeared to which the interested reader can refer [13, 12, 14]. The general scheme of an attack is the following. A string containing the object code of the system call execve ( "/bin/sh", NULL)4,which executes the shell command interpreter/bin/sh, is passed to the program instead of one of its input parameters. The malicious string has been crafted so as to overwrite the return address of the function that operates on it, which is saved on the 3 For some of the functions in these libraries, safe replacements exist, such as strncpy(), that control the number of bytes written into arrays. However, the good programming practice that suggests the use of such replacements cannot be enforced, thus leaving chances for continuous new exploits. 4 The object code of the execveO system call can be obtained by disassembling the code of a program executing such a system call with the proper parameters.
22
D. Bruschi et al.
stack, with the string starting address. When the function terminates, it will execute the shell code instead of returning to the calling environment. The critical aspect of this scenario is" the set of privileges the process is executing with when the buffer is overwritten. 'If'the process is setuid to root and is executing with root privileges when~the buffer overrun occurs, the shell is a root shell and the machine may suffer a~root._eompromise. In case the program is setuid to a different user than root, it is still dangerous as the spawned shell may operate on the user's files. 4
Buffer
Overrun
Detector
As shown before, the risks of buffer overruns are so serious to justify pro-active defenses. Two strategies are possible. On the one side, compiler patches that aim at producing safe code where the problem may not occur can be developed. The problem is solved by removing it "ab origine," i.e., from the source code. However, the problem of buffer overruns in programs for which no source code is available is not addressed. Several solutions have been proposed that follow such an approach. They will be discussed in Section 6. On the other side, various types of operating system or library patches that aim at making the buffer overrun harmless can be developed. Ad hoc operating system patches that prevent the exploitation of the overrun at system level are a suitable solution only if the operating system source code is available. When this is not the case, there is no other solution than to identify and fix vulnerable programs individually. In this section, we present a solution along this direction. We developed a tool, Buffer Overrun Detector, for the automatic pro-active detection of buffer overrun vulnerable programs. It is a pro-active tool as vulnerable programs are searched for before their vulnerability is exploited. The tool identifies critical programs in a system (i.e., setuid programs owned by the superuser) and systematically tests them for buffer overrun. When all the conditions are met, the tool keeps trying until the program is exploited and a root shell is spawned. A list of vulnerable programs is obtained. Appropriate actions such as local development of patches, disabling programs or stripping the suid bit, can then be taken by the system administrator. The tool consists of a set of shell scripts and C programs. It can be executed as a background process after the system has been installed. The tool operates in three phases: search for critical programs, search for segmentation violation, and construction of the exploit. Portability to different architectures is limited by the characteristics of the architecture vulnerable programs are executed on. Depending upon the stack frame layout and variable allocation convention, the tool must be adapted to take such variants into consideration. The final version of the tool is planned to incorporate system specific variants, extending the baseline prototype currently available for HP-UX and Linux to Sun Solaris and SGI Irix platforms. We describe each phase in what follows.
A tool for pro-active defense against the buffer overrun attack
4.1
23
Finding Critical Programs
For the buffer overrun attack to be really effective, it should be performed against setuid to root programs. In the first phase, the list of setuid to root programs is compiled. The file system is searched for all files owned by root and for which the permission scheme indicates that the setuid bit is on. Of all the found programs, those that accept input parameters or access environment variables in an uncontrolled mannered are then identified. Command line arguments separated by distinct options are a typical way of passing input parameters to programs. For each entry of the list of setuid programs, command line options are analyzed. Information is gathered by parsing the corresponding manual page, if there is one. When manual pages are not available, e.g., for locally developed programs, the option list must be constructed manually by direct inspection. The same applies to environment variables programs might use. When the command line argument buffers are too small to contain the shell object code, environment variables are the alternative destination to inject it. A configuration file is under construction that contains all the options and relative arguments for each setuid program found on a particular system. Introducing such a file, although some adjustments may be necessary depending upon the specific system configurations, would eliminate the overhead of repeated manual page parsing.
4.2
Searching for Segmentation Violation
The second phase identifies candidates for buffer overrun attacks among the programs selected in the first phase. The program characteristic that the attack exploits is the lack of controls on the size of the input parameters or of the environment variables. Programs that do not control the size of their input cause a segmentation fault when exceedingly large values are given to input parameters or environment variables. In order to understand how segmentation violation errors caused by writing past an array can be exploited, consider the example in Figure 2(a). It depicts the user memory map for the simple program shown in the figure, when executing on a Unix like system, before the instruction ",b = a; " is executed. The program counter contains the address of the next instruction to be executed, namely the address of "*b = a; " . The frame pointer points to the beginning of the local variable area for my_function in the user stack and the stack pointer points to the top of the stack. The frame (stack) pointer saved on the stack contains the previous frame (stack) pointer value. The return address contains the address of the instruction to be executed after returning from my. f u n c t i o n ( i , &j, s t r i n g ) , i.e., instruction j = i. Note that, for alignment reasons, the array bur 1 actually occupies eight bytes instead of six. The actual parameters have also been assigned the respective values, which are shown on the right hand side of the corresponding locations. By executing the function s t r c p y (bur 1, buf), the string "buffer overflow!!!!" will be copied into bur 1. Since the function s t r c p y ()
24
D. Bruschi et al.
keeps copying characters until a null character is found, without checking the number of characters being copied, copying "buffer overflow!!!!" into b u l l will overwrite the saved frame and stack pointers and the return address, as shown in Figure 2 (b). The stack locations bur i through the return address included define what we call the "exploitable area", i.e., the area where the exploit code must be loaded. W h e n my_funct• returns, instead of the correct instruction "j = • ;",
user process in m e m o r y
high addresses
high addresses env pointers
env pointers argv pointers argc
.r
P!\O low ! verf er o buff
4 stung
*buf a
\
argc Itl\O
}
low!
void main() { char string = "buffer overflow! ! !t"; int i, j;
,
i=5; j=8; imY-functl~ i;
" J t /
*b
i t
4i
= a; strcpy(bufl, buf);
5
f
argv pointers
void my_function(int a, int *b, char *buD{ char bufl [6];
&j' string);
i ! i i
5
j *buf
,,
stack pointer frame pointer
a
/"
,,-"
/'t
fraine pointer
[
~\,
---.~ prngr~cou.ter [
remm address
.............
~
stack pointer
5
!!!\0
stack pointer
low!
frame pointer
veff er o buff
buff
bufl
8 " ~t; " rJ
*b
5
return address
ero
buff
i
,4~}
verf
stung
]
/ heap
heap
bs$
bss
initialized data
initialized data text at function call
(a)
low addresses
low addresses
text after strcpyO, at function retrun
(b)
Fig. 2. The user process execution stack before executing the instruction ,b = a (a) and after the execution of s t r c p y ( b u f l , buf) (b) that overwrites the array bull. Variable content appears on the right hand side of the corresponding memory location.
the system will try to execute the instruction pointed to by the current return address, i.e., 0x21212100, the hexadecimal value of the null terminated string "!]!". Since such an address is out of the process instruction address space, a segmentation violation occurs and execution is aborted. However, if the value were a legal address, execution would proceed, possibly terminating with an "illegal instruction" error message if the content of that address is not a legal instruction. If the return address could be modified so as to point to an executable piece of code, that code would be executed with the process current privileges.
A tool for pro-active defense against the buffer overrun attack
25
A legitimate value for the return address is the address of the array that will be overwritten. The program will jump to the array and, if the array has been overwritten with a piece of object code, the process will execute that code instead of the original program. However, when the source code is not available, the address of the buffer and the size of the exploitable area on the stack are not known and only guesses are possible. Based on the scheme just illustrated, the tool tries to overflow the command line argument buffers. For each program in the list produced in the first phase, large input parameters are tried for each option and environment variable. When a segmentation violation message is obtained, there are good chances that the program be vulnerable. Programs that cause a segmentation fault are passed to the third phase. We currently adopt a brute force approach to test all options. This way we are guaranteed not to discard vulnerable ones. Such a brute force approach is susceptible of improvement, based on heuristics for parsing the option list and their description. Non-exploitable options could be safely trimmed without risking to overlook chances of attack. Although performance is not an issue for our tool, as programs are tested off-line, analyzing the list of setuid programs of a system can be a lengthy process. Optimizing the analysis phase while guaranteeing that no vulnerable cases are discarded is desirable.
4.3
Exploiting Segmentation Violations
A segmentation violation is a necessary but not sufficient condition for successful buffer overrun attacks. The reason is that the buffer to be overwritten must be large enough to contain the malicious executable code. Furthermore, the address of the stack location of the return address of the function writing the buffer and the new value to write in it are critical for the successful completion of the attack. Determining such values for the programs selected in the second phase and exploiting them is the objective of the third phase. Since our tool operates exclusively on object code, guessing strategies have been devised. The size of the exploitable area (the buffer to be overwritten and the contiguous locations up to the return address), the location of the return address, and the new value to load in it depend upon the program being exploited and the system on which it is executed. T h e y can be easily obtained if the source code is available. Details about the system architecture the code is executed on are also necessary, e.g., the user stack start address, the stack layout (i.e., relative positions of function actual parameters, return address, stack and frame pointers, and local variables), the maximum per process stack size (i.e., the smallest memory address feasible as a stack location), memory addressing scheme, alignment convention, and system call details. Most of the above parameters are strictly implementation dependent and hardly, if not at all, available as written documentation. Some were gathered by examining other programs, namely the stack start address, the amount of information pushed onto it when a program is loaded. A test program was used to derive the exact layout of a function
26
D. Bruschi et al.
stack frame, as it is the same for all programs on a given system. Based on such information, we derive the most critical parameters we need to exploit the code. F i n d i n g t h e stack l o c a t i o n c o n t a i n i n g t h e r e t u r n a d d r e s s For each critical option or environment variable identified in the second phase, the object program is executed under the tool control. Child processes are forked by the tool, which execute the program with increasingly bigger arguments, until a segmentation violation signal is received. By handling the child signals, the parent process knows what the object program execution status is. A default buffer size (or equivalently argument size) of 1000 bytes is initially assumed (user defined values passed as command line arguments of the tool are also possible). The first time a segmentation violation signal is received, an upper bound on the size of the buffer to be overflown is found. The range determined by such an upper bound and the largest buffer size that did not cause any signal (possibly the default buffer size) is explored. The minimum argument size that causes a segmentation fault signal, min is found in a dichotomic fashion. For the exploitable area to be complete, the location of the return address must be determined. A new sequence of executions is started with argument size equal to min, increasing it by one word at a time. The return address is eventually overwritten. When this happens, the system attempts to execute the instruction pointed to by the modified return address. An "illegal instruction" signal is generated if the value written in the return address is out of the program scope. This is easily obtained by appropriately crafting the program argument value (e.g., a string of A's would be sufficient). Thus an "illegal instruction" signal means that the return address has been found. At this point the exploit can be crafted as the size of the area to be overwritten, the exploitable area, is known. The executable code for the execution of a shell command interpreter is injected at the end of a buffer of such a size, padding the initial part with NOP instructions. F i n d i n g t h e n e w v a l u e f o r t h e r e t u r n address The last piece of information for the exploit to work properly is the value to load into the return address. The critical observation that allows us to derive the initial guess for the value to toad into the return address is that all of the programs on a given system are executed within the same memory address space, as the one depicted in Figure 1. Since in many Unix systems the highest stack address is 0 x T f f f f f f f , with a simple brute force approach all the legal addresses 5 smaller than 0 x T f f f f ~ f f become potential candidates for the value of the return address. T h e object program is executed by child processes under the tool control, trying the sequence of return address value candidates. Experimental results have shown t h a t an initial guess that optimizes the search is represented by the address of the a r g v variable of the We simplify here the discussion, assuming the entire address space starting at the highest address is accessible to the user. The fact that this is not true in reality is simply an implementation detail. Test programs can be used to determine the first, i.e., highest, user accessible address.
A tool for pro-active defense against the buffer overrun attack
27
tool itself. Such a value is obtained with the simple instruction aargv. Note that, among the command line arguments of the object program, which are placed at the top of the stack when the program is executed by a child process under the tool control, is the exploit code crafted before. Since the overall size of the tool arguments and the overall size of the object program arguments is usually different, two threads of execution are spawned that search the addresses below and above the address of argv, respectively. Either one will eventually find a correct value. Such a value is appended at the end of the exploit buffer, after the executable code, so as to overwrite the return address of the function operating on the exploited buffer.
5
A Case Study
In this section we illustrate a successful execution trace of our tool on a program available in the HP-UX operating system 6. The program was found by executing phases 1 and 2 on the set of programs comprised in the standard distribution. As mentioned in the previous section, we have to determine the location of the return address of the function operating on a buffer without controlling its size, and the value to be written in the return address. The object program is executed using the default buffer size (i.e., 1000 bytes) and specifying the option to be exploited. The following command line is executed: rose 9 The c o m m a n d
it [~/tmp] >./BOD /usr/bin/XXX/YYYY
-o -ZZZZZ
n a m e and its option are sanitized for obvious reasons. -o is the
tool option that indicates that the program/usr/bin/XXX/YYYY uses the option -ZZZZZ.
After 9 executions, trying the sequence of buffer sizes 1000, 1500, 1250, 1125, 1063, 1094, 1079, 1072, 1075, a S e g m e n t a t i o n V i o l a t i o n signal is received by the tool for the value 1075, i.e., the minimum buffer size that causes the segmentation fault. The argument size is now increased by one at a time, until an I l l e g a l I n s t r u c t i o n signal is received. This occurs after 5 steps, for buffer size equal to 1080 bytes. At this point, the buffer argument is crafted such that a sequence of NOP's is terminated by the executable code for a shell command interpreter. We use the code for the system call e x e c v e ( " / b i n / s h " , NULL). The buffer argument obtained is the following: NOP NOP . . .
NOP \ x 3 4 \ x 1 6 \ x O 5 \ x O 6 \ x 9 6 \ x d 6 \ x O 5 \ x 3 4 \ x 2 0 \ x 2 0 \ x O 8 \ x O 1
\xe4\x20\xeO\xO8\xOb\x5a\xO2\x9a\xe8\x3f \xlf \xfd\xO8\x21\xO2\x80 \x34\xO2\xO l\xO2\xO8\x41kxO4\xO2kx6Okx40\xOl \x62\xb4\x5akxO I\x54 \xObkx39\xO2\x99\xOb\x18\xO2\x98\x34\x16\xO4\xbe\x20\x20\xO8\xOl \xe4\x20\xeO\xO8\x96\xd6\xO5\x34\xde\xad\xcakxf e/bin/sh 6 W e omit here all the details relative to the O.S. release, the architecture, and the
vulnerable program, for obvious reasons.
28
D. Bruschi et al.
Now the program is executed in oder to find the value to load in the return address. The return address is added at the end of the buffer, starting from the value 0x7b03a4b0, i.e., the default argv address. After 30 steps, when the return address is 0x7b03a528, a super user shell is spawned (the # sign in the output below). During the execution, the following sequence of messages is obtained: ILLEGAL INSTRUCT. addr: 0xYb03a4b4 length: 1080 WIFSIGNALED(foo) 1 ILLEGAL INSTRUCT. addr: 0xTb03a4b8 length: 1080 WlFSIGNALED(foo) 1 ILLEGAL INSTRUCT. addr: OxTbO3a4bc length: 1080 WIFSIGNALED(foo) i ILLEGAL INSTRUCT. addr: OxTbO3a528 length: 1080 WIFSIGNALED(foo) I #
The execution time of the third phase as described is 7.2s. We then reran it by specifying the tool option for the return address as 0x7b03a528 and the value 2000 as buffer size. Both the buffer size search and the return address value are significantly accelerated. The total execution time of phase three in this case is 2s. Measurements of the third phase execution times for other programs confirm that it usually requires in the order of tens of seconds.
6
Existing Solutions
Several solutions have appeared since the buffer overrun problem has become a security issue. However, none of them has completely succeeded in removing the problem from the computer scene, as the only definitive solution is writing safe code that performs all the necessary data integrity checks on input arguments and data and uses safe library functions. Unfortunately, such a practice cannot be enforced. Alternatively, strongly typed languages could be used as they do not allow liberal variable manipulation. Adopting a type safe language represents a pro-active solution whose cost is reduced flexibility and efficiency of the generated code and applicability limited to new programs, as existing one should be rewritten. Language implementation problems may, however, affect a security consciously designed language, as is the case with Java [7, 5]. An overview of the approaches followed by the various solutions is presented in the next sections. Potential drawbacks in terms of security, performance, compatibility and portability are considered.
6.1
Compiler Patches
Compiler patches are meant to add compile time bounds checks for array operations and, where possible, run time checks on pointer manipulation to ensure that legitimate storage regions only are accessed. This approach guarantees that new and old programs, for which source code is available, will generate safe code when compiled. Two major issues relative to performance and compatibility are raised by this approach. Static checks performed at compilation time may slow down compilation time of a factor of two to three times. Performance penalties
A tool for pro-active defense against the buffer overrun attack
29
introduced by run time checks are usually much more severe as they can be of the order of few tens or more, depending on the memory access characteristics of the programs. Compatibility issues derive from possible modifications of the representation of pointers needed to perform the run time checks. A patch for the gcc compiler has been developed at Imperial College [9] that performs full static checks on arrays and run time checks on pointers. The authors report twofold to threefold increase in compilation time, depending on the type of source code, and up to a thirty times increase in execution time. Another set of compiler oriented solutions introduce integrity checks on memory accesses a n d / o r function return [8, 4]. Because of the slow down imposed on execution time, software oriented solutions can be used as debugging tools. Only hardware oriented solutions such as ElectricFence can be used in production code as they introduce negligible overhead. In [8], memory accesses are instrumented to perform integrity checks on the accessed area. Execution time slow downs of up to a factor of 5 times that of optimized code were measured. Two compiler extensions are proposed in [4]: StackGuard, for efficient protection of the return address with limited security guarantee, and MemGuard, for secure protection of all critical memory accesses at the expenses of performance. StackGuard prevents buffer overruns by introducing an extra word before the return address, whose value is checked at function return with minimum overhead. If the value is modified, a buffer overrun attempt is detected and execution stops. MemGuard implements an API for protecting memory accesses. The return address cannot be modified by overwriting a nearby array as it can only be accessed via such an API. Unfortunately, because the granularity of the protection is at memory page level, the authors report that writes to unprotected words in protected pages can cost up to 1800 times the cost of unprotected words in unprotected pages. 6.2
L i b r a r y Patches
A set of architecture specific library patches have been proposed for FreeBSD [15]. Assembly coded integrity checks on the user stack of dangerous C library functions such as s t r c p y ( ) are implemented. The performance impact is less serious than with compiler oriented solutions, as the changes only impact library functions execution. On the other hand, since user defined functions are not subject to integrity checks, this solution does not guarantee program absolute safeness. Because it is coded in assembly language, its portability is limited. 6.3
O p e r a t i n g System Patches
Based on the observation that one of the conditions for buffer overrun exploits to be effective is that the stack be executable, a different approach was taken in [16]. A patch for the Linux operating system was developed to make the stack non-executable. This way, overwriting arrays with object code to which the
30
D. Bruschi et al.
program is forced to jump becomes ineffective, as such code cannot be executed 7. The main advantage of such a solution is program protection at no cost either in execution time (no overhead is introduced) or at compilation time (no recompilation is needed for the programs to execute safely). However, the kernel has to be patched as signal handling in Linux relies on executable stack, gcc relies on executable stack for trampoline functions, and functional languages need executable stack for run-time code generation. Furthermore, the solution is not portable and can be applied only if the operating system source code is available.
7
Conclusions
In this paper, we presented a tool for the automatic detection of programs vulnerable to the buffer overrun attack. Target of our tool is all the existing object code for which sources are not available, like commercial distributions of operating systems and applications. The tool is pro-active as it assists the system administrator in verifying the system security before vulnerabilities are found and exploited by hackers. It comprises a set of shell scripts and C programs which can be executed as a background process, after completing the system installation. Portability to various systems is limited only by the specific architecture features based on which the buffer overrun is exploited. A fully operational prototype is available for HP-UX and Linux platforms. Details relative to other major Unix versions are being added.
Acknowledgments The authors are grateful to Roberto Banff, who implemented the tool.
References 1. Bach, M.J.: The design of the Unix operating system. Prentice-Hall International Series, 1986. 2. Bunch, S.: The setuid feature in Unix and security. Proceedings of the lOth National Security Conference, 1987. 3. Computer Emergency Response Team Coordination Center: ftp://ftp.cert.org/pub/advisories, Carnegie Mellon University. 4. Cowan, C., Pu, C., Maier, D., Hinton, D., Walpole, J., Bakke, P., Beattie, S., Grier, A., Wagle, P., Ziang, Q.: StackGuard: automatic adaptive detection and prevention of buffer-overflow attacks. Proc. of the 7th USENIX Security Symposium, 1998. 5. Dean, D., Felten, E.W., Wallach, D.S.: Java security: from Hot Java to Netscape and beyond, Proc. of the IEEE Symposium on Security and Privacy, Oakland, 1996. 6. Garfinkel, S., Spafford, E.: Practical UNIX and Internet Security. O'Reilly and Associates, 1996. 7 This does not protect against attacks that use heap allocated or statically allocated memory to inject the executable code.
A tool for pro-active defense against the buffer overrun attack
31
7. Gosling, J., Joy, B., Steele, J.: The Java language specification. Addison-Wesley, 1996. 8. Hastings, R., Joyce, B.: Purify: fast detection of memory leaks and access errors. Proc. of the Winter USENIX Conference, 1992. 9. Jones, R., Kelly, P.: Bounds checking for C. http://www-ala.doc.ic.ac.uk/phjk/BoundsChecking.html, July 1995. 10. Kernighan, B.W., Ritchie, D.M.: The C programming language. 2nd Ed. PrenticeHall Software Series, 1988. 11. McKusick, M.K., Bostic, K., Karles, M.J., Quarterman, J.S.: The Design and Implementation of the 4.4BSD Operating System. Addison Wesley, 1996. 12. Mudge: How to write buffer overflows. http://www.10pht.com/advisories/bufero.html 1997. 13. One, A.: Smashing the stack for fun and profit. Phrack Magazine 49, Fall 1997. 14. Smith, N.P.: Stack smashing vulnerabilities in the UNIX operating system. http://millcomm.com/nate/machines/security/stack-smashing/1997. 15. Snarskii, A.: FreeBSD Stack integrity patch. ftp: / /ftp.lucky.net /pub /unix/local /libc-letter, 1997. 16. Solar Designer: Non-Executable user stack, http://www.false.com/security/linuxstack/. 17. Spafford, E.: The Internet Worm incident. Proc. of the European Software Eng. Conference, pp 203-227, 1989.
A Kernelized A r c h i t e c t u r e for Multilevel Secure A p p l i c a t i o n Policies Simon N. Foley 1,2 1 University of Cambridge CCSR, Cambridge CB2 3DS, UK. 2 Department of Computer Science, University College, Cork, Ireland. s. f o l e y @ c s , ucc. ie
A b s t r a c t . Mandatory label-based policies may be used to support a wide-range of application security requirements. Labels encode the security state of system entities and the security policy specifies how these labels may change. Building on previous results, this paper develops a model for a kernelized framework for supporting these policies. The framework provides the basis for, what is essentially, an interpreter of multilevel programs: programs that manipulate multilevel label datastructures. This enables application functionality and security concerns to be developed separately, bringing with it the advantages of a separation of concerns paradigm.
1
Introduction
Conventional multilevel secure systems place all trust in the underlying trusted computing base (TCB), regarding most of the operating system and applications software as untrusted [20]. Clark and Wilson [3] argue that security should be defined across both the operating system and applications. Under their model, a secure system may be viewed as a certified application running on top of a trusted computing base (TCB). Certifying an application is analogous to arguing (to a degree) it's correctness according to the application's security requirements; the T C B is expected to have undergone some sort of security evaluation. A problem with this approach is that it encourages an intertwining of the functional and security-critical code that makes up an application. For example, interpreting the original Clark-Wilson model, the TCB is expected to support (enforce) static segregation of duties. However it appears that dynamic segregation of duty must be implemented/intertwined within the application itself [4, 12]. Intertwining makes the application complex and reasoning more difficult since both security and functionality concerns have to be dealt with at the same level of abstraction. We argue that when developing an application the security and functionality concerns of the application must be separated, both at the conceptual level and at the implementation level. At the conceptual level, a loose coupling between concerns facilitates the verification of security, while at the implementation level we want an architecture and programming environment that will support the concerns securely.
34
Simon N. Foley
As a software engineering paradigm, techniques that support separation of concerns for synchronisation, real-time constraints, failures and others have been studied. The reader is referred to [7] for an introduction. One could view policy neutral architectures such as [13, 17] as a step in the direction of providing support for the separation of (security) concerns paradigm. We are interested in supporting separation for multilevel security concerns. In this case, an application developer designs and implements an application in terms of separate security and functionality components. Once the security concerns meet certain verification conditions, an underlying TCB is expected to ensure multilevel security. The dynamic label-based policies proposed in [4, 5] can be used to support this separation. From an object-oriented perspective, labels are used to encode security relevant characteristics of objects and provide relabel functions that specify how the labels may change. Given some functional object, we then argue that it also has an associated security object whose state corresponds to its label and has methods that define how its state (label) may change. This security label object is like a meta-object, providing security details about the information in the associated object. These label-based policies use the Bell-LaPadula model as their underlying access control model and require a special trusted label manager which mediates and interprets requests to update labels. A variety of application security policies can be encoded in terms of label-based policies, including Chinese walls, dynamic segregation of duties and group access policies [4, 5]. It is desirable to have a framework that can support a wide range of labelbased policies. In [5] a model is described that allows any subject to relabel any object label, where the relabel functions meet certain security non-interference style requirements. These secure canonical upgrade policies (SCUP) are supported securely so long as high-level changes may not be detected by low-level subjects. A framework that supports such policies in a message-filter [8] based multilevel secure object store is described in [5]. However, the framework in [5] does not consider how the label manager might be developed in practice. This paper develops a kernelized-based model of the label manager. The paper is organised as follows. Sections 2 and 3 describe the basic model for relabel policies. They are based on [5], but with some minor modifications that make kernelization possible. Sections 4, 5 and 6 give the noninterference analysis, kernelization, and correctness, respectively, of the label manager. A number of state-invariants for the kernelized manager are explored in Section 7; these would contribute towards an efficient implementation. Section 8 discusses the results and contributions of the paper. The Z notation [16] is used to provide a consistent syntax for structuring and presenting the mathematics in this paper. In using Z, it has been possible to check the mathematics using the Z/EVES tool [15]. Appendix A gives a brief overview of the Z notation used in the paper.
A Kernelized Architecture
2
35
D y n a m i c Label Policies
An information flow policy is defined in terms of a partial ordering (_ < _) of security levels (C) which have the usual multilevel interpretation.
_ FlowPolicy[ C] _<_:C++C Vu, v,w : C 9 (u < u) ^ (u<_vAv<_u~u=v) (u<_vAv<_w~u<_w)
A
Security labels (L) are datatypes that are used to encode security relevant characteristics of objects. For example, a purchase order object could have a label indicating that it has been requested, but not yet authorised. In [5] the security label is also used to specify the desired security level of an object. An object's label may change according to relabel functions which form part of the policy. _
RelabelPolicy[C, L, F] FlowPoIicy[ C] 5ca : ( F x C x L ) - - + L
Given a set of relabel function identifiers F, then ~'n(f, s, a) -- b means that an entity at level s may use function f E F to change label a to b. E x a m p l e 1 A simple mark-for-upgrade policy uses security labels to encode details about future upgrade levels for an object. An object label has a format u : - A , where u indicates it's current level and A gives a list of future upgrade levels. For example, the label (u:-[11, t]) of an unclassified (u) object indicates that it should be upgraded to top-secret ( t :-[t]) when an upgrade is requested. A requester at level r invokes the function mark to tag the label with its level (r). Function up upgrades the object's label u : - A to the next level in A. For example, the label ( u : - [ u , t ] ) becomes ( t : - [ t ] ) , when up(u, ( u : - [ u , t ] ) ) is invoked. For the purpose of illustration, Figure 1 gives a fragment of a (Haskell) functional-style prototype of this relabel policy ( p r o j e c t and I n v i s i b l e will be described later) 1. Functions are specified in an equational-style and expression [ v l v < - t a g s , w<=v] is the list of all v that are members of list t a g s and that dominate w. A To determine whether the relabel functions mark and up are secure it is necessary to first define, as part of the policy, how users at different levels view labels. This is defined in terms of label projection, where a user at level v, inspecting label a, actually sees the label b = (a r v). A canonical policy may be 1 This policy is valid only for flow policies that are total orders. A specification of a more general policy can be found in [5].
36
Simon N. Foley ) level:-[level]
data Label level = Invisible
--datatype
for label
mark(req,(u:-tags)) I (u<req) I otherwise
--level req request marks label (u:-tags) = u:-(req:tags) --hi may mark io label = u:-tags --no change in label
up(req,(u:-tags)) # req
= (u:-tags)
i otherwise where newtags
--level req request upgrades (u:-tags) = (minimum newtags):-newtags = Iv #v<-tags,u
p r o j e c t ((u : - t a g s ) , v)
- - l a b e l ( u : - t a g s ) , as seen from level v
(u:-[wIw<-tags, w<=v])
lu<=v
=
I otherwise
= Invisible
Fig. 1. Sample Relabel Policy Prototype
thought of as a relabel policy that has a view-equivalence relation a r v = b r v (in the non-interference sense) defined over its labels. By default, there is a special label that is used to represent label projections that are invisible.
_ CanonicalPolicy[C, L, F] RelabelPolicy[ C, L, F] _t_:L• C-~ L invisible : L V u : C 9 invisible r u = invisible Vu : C; f : F 9 f74(], u, invisible) = invisible E x a m p l e 2 Continuing Example 1, Figure 1 defines the projection operator for the policy. A user may view only those tags on a label that the user's level dominates. Thus, p r o j e c t ( s , u: -[u, t]) returns label (u: -[u]). If the label is not visible to the viewer, then it is invisible. For example, p r o j e c t ( u , s : - [ s , t]) returns (by label equivalence) the label I n v • /', There are a number of conditions that a canonical policy must uphold in order to be secure. These ensure that a high-level user cannot interfere, in a visible way, with low-level labels. Sections 3 and 5 describe a subsystem for managing labels that is multilevel secure if the canonical policies that it supports uphold these conditions.
i CP_CView[C, L, F] CanonicalPolicy[ C, L, F] Vv, w : C ; a : L e w
A Kernelized Architecture
37
This condition (consistent view) specifies t h a t a user may not test for differences between two labels that are viewed as the same from the projection of the user.
I CP_NWD[C, L, F] CanonicaIPolicy[ C, L, F]
Vf :F; s~v: C; a : L . s _
v ~
a r v = .r~(f,s,a)
r v
This corresponds to the unwound non-interference requirement that a high-level user may not interfere with a low-level view of a label (no write down).
CP_NR U[ C, L, F] CanonicaIPolicy[ C , L, F] VI :F; s,v: C; a : L . 9~n(I, s, a) [ v = .Tn(], s, a [ v) This corresponds to the unwound non-interference requirement that a change in a low-level view of a label may not depend on any high-level information in the label (no read up). This requirement is slightly stronger than that originally specified in [5], but was found to be necessary for the kernelization of the label manager. We have, t h e o r e m CP_NRUprev [C, L, F] V CP_NRU[C, L, F]; f : F; s, v : C; a, b : L *
a [ v = b [ v =~.T've(],s,a) [ v =.Tn(I,s, b) r v A policy that upholds these three conditions is called a secure canonical upgrade policy (SCUP).
SCUP[C, L, F] I CP_CView[C, L, F] CP_NWD [C, L, F] CP_NRU[ C, L, F] E x a m p l e 3 The policy in Figure 1 is overly simplistic and is for illustrative purposes only. The policy described in [5] is an example of a more detailed relabel policy that was designed for the message-filter based model [8]. The scheme supports the relabelling of objects such that, when upgrades are requested, objects are migrated from one (single-level) object store to another. Object migration ensuring referential integrity may be achieved by viewing migration as a multilevel garbage collection [2] problem, or by encoding proxy information [10] in security labels. See [5] for a more detailed explanation. /k 3
Label
Manager
The label manager [5] provides a trusted interface to a SCUP policy. Its trusted operations make up the TCB extension required for standard multilevel secure
38
Simon N. Foley
systems. While it was intended specifically for message-filter based multilevel OODBMS, we believe it to be sufficiently general to be applicable to other systems. Object identifiers are used to uniquely identify objects within an objectoriented database. An object identifier is given as a tuple (u, i), where identifier i uniquely identifies an object at a security level u. Thus, given ID, the set of all identifiers, define OW[C] = = ( 0 • JD)
LabelStore defines the state of the label manager. It is accessed via the operations that make up its programming interface. Each object o : OlD has an associated security label 6(o). C, L] : OID[C] -~ L ~ 6LabelStore[ The label of an object may be changed according to the relabel functions defined in a SCUP policy. This is done by invoking the Relabel operation, where a request is made at level req? to apply the relabel function rfun? to the label of object oid?. _
Relabel[C, L, F] SCUP[C, L, F] A LabelStore[ C, L] req? : C old? : OID rfun? : F if ( o l d ? E dom 6) t h e n 6' = 6 @ {oid? ~ Jrn(rfun?, req?, 6(oid?))} else 61 = 6
The Operation ViewLabel returns, as lab!, the appropriate projection of the label of object oid? when requested at level req?. Note that if the object does not exist then the label invisible is returned; this prevents a low-level user testing the existence of high-level objects.
_ ViewLabel[C, L, F] SCUP[C, L, F] ~ LabeIStore[ C, L] req? : C oid? : Oil) lab! : L i f (oid? E dom 5 A first(oid?) < req?) t h e n lab! -~ (6 oid?) r req? else lab! = invisible
A Kernelized Architecture 4
Security
39
Analysis
Since the actions of the label manager are not mediated by the security kernel we must prove that it is multilevel secure. This is done by using an unwound version of non-interference [6,14] to prove that no series of high-level requests to the manager can interfere with what a low-level, or disjoint-level, user can view. State LabelStore looks the same as state LabeIStore ~, when viewed from security level vl, if the label projections of the objects, whose levels are dominated by vl, are equal. This is formally specified as follows.
_ VEquiv[C, L, F] LabelStore[ C, L] LabelStore'[ C , L] SCUPiC, L, f ] vl: C v o: o w [ c ] I j rst(o) < vl 9 (o e dom(5) ~=~ o E dom(6')) A (o e domSN domS' ~ (f(o) [ vl = 5'(0) [ vl) The first unwinding condition requires that each operation, when requested at a high-level, cannot interfere with a low-level view of the state (No Write Down). We have t h e o r e m Relabel_NWD [C, L, F]
V Relabel[C,L,F]; vl : C 9 -~ (req? < vl) ~ VEquiv[C, L, F] t h e o r e m ViewLabel_NWD [C, L, F] V ViewLabel[C, L, F]; vl : C 9
-1 (req? <_ vl) =~ YEquiv[C, L, F] The second unwinding condition requires that the outcome of an operation, requested at a low-level, cannot be based in any way on the high-level part of the state (No Read Up). We have t h e o r e m Relabel_NRU [C, L, F] V Relabel[C, L, F]; Relabel[C, L, F][5"/5, 5'"/5']; vl : C 9 VEquiv[C, L, F][5"/5'] ~ VEquiv[C, L, F][5"'/5] t h e o r e m ViewLabel_NRU [C, L, F] V ViewLabel[C, L, F];
Viewnabel[C, L, F][~"/5, 5'"/5', lab!"/lab!]; vl : C 9 (VEquiv[C, L, F][5"/5'] A (req? <_ vl)) labl = lab!"
40
5
Simon N. Foley
A K e r n e l i z e d Label M a n a g e r
The message-filter model [8] supports multilevel security in object-oriented database systems according to the Bell and LaPadula (BLP) model [1]. A message filter mediates all message passing between objects such that information may flow according to the information flow relation. These database objects are viewed both as objects and subjects in the Bell-LaPadula model. As objects, they have state, and as subjects, they execute actions by sending messages. Implementation of the message-filter model does not rely on the construction of a special trusted OODBMS: if the message-filter lies within the TCB of a multilevel system, then the remainder of the application can be based on existing and untrusted OODBMSs. The (multilevel) persistent object store is partitioned into a collection of single-level stores (see Figure 2). The underlying security kernel, upholding the usual BLP axioms, ensures that it is not possible for an (untrusted) OODBMS to violate the multilevel policy.
~ I Message Filter
Object Stores ~ '1tOpsecret secret
]unclassified
Fig. 2. Message Filter & Single Level Stores In [5], the label manager runs as a separate trusted-subject servicing requests from objects in the the multilevel object store. Under this approach the entire label store (5) forms part of the state of the manager, and since it is multilevel, it must be protected from direct access by any untrusted object. While its high-level specification (above) has been proven to be secure, an implementation should be constructed in terms of untrusted components (kernelized), if at all possible. A refinement of the label manager is now specified which, given a suitable message filter, can be kernelized in its entirety. The basic strategy is to replicate each label for every level. A relabel broadcasts the request to every replicated label at levels that dominate the level of the request. A request to view returns the value of the label, at the level of the requester. The scheme works so long as broadcasts are implemented in a serializable and secure manner. Section 7 will outline an optimisation to the scheme: in practice it is not actually necessary to replicate at every level. Figure 2 illustrates this replication. In the abstract state the object o : OID has label 5(0) = (u:-[u, t]). In the concrete state (implementation), the label is replicated at every level whereby the replicated value gives that level's view of the label.
A Kernelized Architecture
41
Each replicated label is regarded as a single-level object (of type object label class) contained in a single-level object store. These objects, in turn, provide relabel and view-label methods.
~
LabelStoreO[ C , L] 50 : C x OID[C] -+ L CurrOIDs : ~ OID[C]
For the purposes of modelling this implementation approach, the concrete state is specified in terms of a function 50 whereby 50(u, o) gives the level u copy of o's label. The set CurrOIDs defines the current set of objects Any state may serve as an initial state as long as replicated views are consistent. _
InitialLabelStoreO[ C, L, F] SCUP[C, L, F] LabelStoreO[ C , L] V o : CurrOIDs; u, w : C l u < w .
50(u, o) = 50(w, o) I It turns out that InitiaILabelStoreO also defines a state invariant that is maintained by the concrete relabel and view-label operations. To relabel the label of object oid? a relabel request message should be sent to the replicated copy of the label (represented as 50(req?, old?)) at the level of the requester req?. The relabel function r]un? is applied to this label and the request is also broadcast to every replicated copy of the label whose level dominates req?. This is specified by concrete operation RelabelO. If these label objects are maintained in a multilevel object store then the effect of the relabelling broadcast must be atomic across the single level stores. Thomas and Sandhu [18,19] describe a message-filter based architecture that support write-up in a secure and serializable manner. This means that the broadcast (a write-up) specified in RelabelO can be supported, in theory. The SINTRA replicated database [9] also supports both replication and write-ups and therefore should be capable of supporting the relabelling manager.
_RelabelO[ C, L, F] SCUP[C, L, F] A LabeIStoreO[ C , L] req? : C old? : OID rfun? : F if (oid? e CurrOIDs) thenS~=50@(u: C I req? ~ u
* (u, oid?) ~-+ Jrn(riun?, req?,50(u, oid?)) ) e l s e 5~ = 50
CurrOIDs' = CurrOIDs
42
Simon N. Foley
r Relabel(maxk, s, o)
I~
I topsecret
[
[secret
Relabel(up, u, o)
Fig. 3. Applying Relabel Functions E x a m p l e 4 Figure 3 illustrates the effect of a secret request to mark the label of object o in Figure 2, followed by an unclassified request to upgrade the object. In this figure the migration of o from unclassified to secret, as a result of the upgrade, is also illustrated. How this migration is achieved in practice is considered in [5]. /k At any moment, the value 5o(u, o) gives the view of label o from level u. Therefore, the concrete version of ViewLabel is specified as follows.
_ ViewLabelO[C, L, F] SCUP[C, L, f] ~ LabelStoreO[ C, L] req? : C old? : OID lab!: L if (oid? E CurrOIDs A first(oid?) ~ req?) t h e n lab! = 5o(req?, oid?) else lab! = invisible Thus, to determine the label of an object oid?, the requester simply makes the request to the replicated copy of the object's label at the level of the requester. Note that LabeIStoreO specifies that an object's label is replicated at every level and not just at those levels that dominate the object's level. This may seem surprising, but it is desirable if further restrictions on SCUP policies are to be avoided. Maintaining a low-level version of a high-level object's label means that a low-level user can view :the low-level effects that low relabel requests may have on the high-level object. If this flexibility is not required then low-level views of high-level object labels could, for example, be assumed to be invisible, that is, 5(0) [ v = invsible, where first(o) ~ u. Thus, we always have 5o(u, o) = invisible for first(o) ~ u and it becomes necessary to replicate only the label of o at levels that dominate first (o). In [5], a create label operation is specified which is used to enter a label for a new object in the label store. Its specification and refinement is straightforward;
A Kernelized Architecture
43
we do not include it here for reasons of space. However, we must address, at least in broad terms, the issue of entering new object label details into the replicated label store. Recording a new object label [5] is effectively a m a t t e r of adding tuple ((req?, newid?) ~-~ lab) to 5: a requester at level req? has created a new object with identifier newid? (in the object store at level req?) and wishes to assign it label lab. With the concrete state implementation this is achieved by adding replicated entries {u : C I tea? < u 9 (u,(req?,oid?)) ~+ lab r u } that correspond to a broadcast write-up from level req?. If lab ~ u = invisible for all other levels (req? ~ u) then we are done. However, consider the case where the relabel policy permits low-level information to be encoded in what are, ostensibly, high-level labels. The message-filter will not permit a request at level req? to create a label object at a disjoint or lower level. This problem is easily solved by adapting the RelabelO implementation such that the broadcast upwards will replicate missing labels, assigning them default values that can be specified as part of the policy. 6
Correctness
of Refinement
Since the kernelized label manager is designed to be implemented in terms of untrusted components, with multilevel security enforced by the underlying TCB, it is not strictly necessary to prove that it is secure. However, it is necessary to prove that it is correct. This corresponds to proving that the behaviour of the concrete label manager is consistent with its abstract specification, that is, it is a refinement in the sense of [16]. D a t a R e f i n e m e n t In the kernelized label manger, (f0(u, o) gives the view of label o from level u. In the abstract specification this corresponds to 8(o) r u. Therefore, given any concrete state we can retrieve its abstract equivalent. The abstraction (retrieve) relation relates the concrete and abstract states.
_ Abs[C, L, F] SCUP[C, L, F]
LabeIStore[ C, L] LabelStoreO[ C, L] dom 5 -- CurrOIDs
V o : CurrOIDs; u : C * 6(0) r = 60(u, o) Initial States Theorem. Any state of LabeIStore may serve as a suitable initial abstract state. We can retrieve from any initial concrete state a valid abstract state, that is, t h e o r e m InitialStates [C, L, F]
V InitialLabeIStoreO[C, L, F] 9 3 LabeIStore[C, L] * Abs[C, L, F]
44
Simon N. Foley
Operation Refinement All operations, both abstract and concrete are total, in the sense that they are defined for all possible input values. Thus it is safe to apply a concrete operation whenever it would be safe to apply the same request to its corresponding abstract form. If the label manager is in some concrete state, related to an abstract state by Abs, and the manager moves to a new concrete state as a result of a concrete relabel, then this new concrete state must be related (by Abs) to an abstract that can be reached by an abstract relabel from the original abstract state. This is illustrated in Figure 4, where schema AbsAfter is used to retrieve the abstract after state. Operation ViewLabel has a similar correctness requirement. LabelStore
Relabel
LabelStore ' AbsAfter
LabeIStoreO
RelabelO
~* LabelStoreO'
Fig. 4. Correctness of RelabelO.
AbsA~er[C, L, F] ~ Abs[C, L, F][(i'/(f, ~ / 5 0 , CurrOIDs'/CurrOIDs]
theorem RelabelCorrect [C, L, F] V RelabelO[C, L, F]; Abs[C, L, F] 9 S 5' : OID[C] -~ L 9 AbsAfter[C, L, F] A Relabel[C, L, F]
theorem ViewLabelCorrect [C, L, F] Y ViewLabelO[C, L, F]; Abs[C, L, F] 9 35': OID[C] -~ L 9 AbsAfter[C, L, F] A ViewLabel[C, L, F]
7
Optimisation
We outline an optimisation to the realization of the concrete label manager which, wherever possible, avoids replicating labels. First, we extend the definition of SCUP with the addition of a label join operator (_ | _), where a | b gives a label representing the join of labels a and b. For consistency we assume that joining a lower-level view to a label's higher-level view makes no difference, that is, u < v ~ a r v = (a t v) | (a r u). Assuming a generalised form of the join operator, Q) A, which joins the set of labels A, then the state invariant
V LabeIStoreO[C,L]; v: C; o: OID[C] * ~o(v, o) = Q { u: c I u <_, 9 ~o(u, o) }
A Kernelized Architecture
45
follows, t h a t is, we have a consistency between a view of a label and its lower views. Under certain circumstances it may be possible to compute the view, at level vl, of object vo's label by simply joining those views that are strictly dominated by vl. Define this view-invariant property as
_ Viewlnv[C, L, F] SCUP[C, L, F] LabelStoreO[ C, L] vl: C vo : OID[C]
If this property holds for some vl and vo, the result is that it is not necessary to store a replicated version of object vo's label at level vl, as it can be computed by joining its lower-level views. Thus, when a requester at level req? stores a new label for a new object with id (req?, id?), just one entry (req?, (req?, id?)) ~+ lab) is stored (assuming that lab = lab t req?). Relabelling maintains this view-invariant for views that cannot be altered by the request, that is,
Y RelabelO[ C, L, F]; ViewInv[ C, L, F] 9 -, (req? <_ vl) ~ ViewInv[C, L, F][5~15o] Thus, if vo's label is not replicated at level vl (req? 2~ vl) before the relabelling, it is not necessary to replicate it after the operation (this trivially follows from the definition of RelabelO). However, vo must be replicated at all levels that dominate req?: this allows the relabel to be broadcast correctly. If vo is not currently replicated at level req? then its value is computed from the joins of lower-level views (that exist) and the relabel function applied. Since write-ups are allowed by the relabel policy, a relabelling may modify higher-level views differently to lower-level views. Therefore, the label computed for vo at level req? must be broadcast (with the relabel function) to all levels that dominate
req?. E x a m p l e 5 Consider the policy from Example 1. An unclassified object o has label (u: - [u] ), which is stored in the unclassified object store (no replication). When a top-secret user requests a mark, its top-secret view is computed as ( u : - [ u ] ) , function mark is applied, giving ( u : - [ u , t ] ) , which is saved in the top-secret store. A secret mark request to object o results in the computation of a secret view of the label (u: - [u] ), which is relabelled as (u: - [u, s] ) and stored. This relabel is also broadcast to the top-secret replicated label, changing it to ( u : - [ u , s , t ] ) . ~, The drawback of this scheme is that even with a modest number of relabel requests, it is likely that an object's label will end up being fully replicated. If
46
Simon N. Foley
we limit the relabel policy such that write-ups are not permitted then it is not necessary to replicate when the broadcast is complete.
_ SCUP_NWU[C, L, F] SCUP[C, L, F] s,v:C; a:L. s # v =-~ a I v = Y ~ ( f , s , a ) ~ v
Vf:F;
In our opinion, this does not seem to be overly restrictive and we have,
V SCUP_NWU[C, L, F]; RelabelO[C, L, F]; Viewlnv[C, L, F] * ( req? ~ vl) =~ Viewlnv[ C, L, F][5~/50] In this case, if a label is not replicated at level vl (req? ~ vl) before the relabelling, it is not necessary to replicate it after the operation. A relabelling operation must, if necessary, replicate the label at level req?, and then broadcast the relabel function onto those existing replicated copies. With this scheme, if a object label is not replicated at level req? then the ViewLabelO operation must compute it by joining the object's replicated labels from lower views. If the flow policy forms a lattice then it is possible to define the policy so that the label can be computed by joining labels whose lowest upper bound equals that of the label being calculated (req?).
8
Discussion and Conclusion
The label-based framework may be used to support the separation of concerns paradigm. An application may be developed, at a conceptual level, in terms of functional and security components. The security concerns are modelled in terms of objects or abstract data types: relabel functions (methods) which define how labels (state) may change. These security and functionality concerns may also be separately implemented. In addition to ensuring multilevel security of the functionality concerns, the framework ensures multilevel security and integrity of the security concerns. This framework is based on the Message-Filter model for secure multilevel secure OODBMSs, extended to incorporate a trusted label manager. Section 5 gave a model for its kernelization, relying on a secure and consistent write-up mechanism such as those described in [9, 18]. But our results are not limited to the Message-Filter model, we believe that they could be used in any multilevel secure architecture that supports write-up. The label-manager provides the basis for a multilevel program interpreter. While the policy in Figure 1 is a very simple example of a multilevel program, the framework could be used to support more elaborate security concerns such as those in [4, 5]. An interesting area for future study is to consider how this framework could be combined with recent work on compile-time information
A Kernelized Architecture
47
flow analysis of programs [21,11] so that multilevel data-structures could be supported. These are topics for future study. In one sense, the label manager is like a 'universal' trusted subject: an interpreter of trusted programs (relabel policies). A result of Section 4 is that if the relabel policy is SCUP then multilevel security (confidentiality) is ensured. However, an implication of Section 6 is that any relabel policy can be securely supported by the kernelized manger, but to maintain integrity, it is necessary to prove that the policy is SCUP. We make two interesting observations about this relationship. Firstly, to prove integrity for this system it is necessary to perform what is effectively a non-interference analysis of the policy program. Secondly, if integrity is as critical a requirement as confidentiality then, whether we kernelize or not, we must nevertheless perform the non-interference analysis. We believe that similar observations can be made about any 'trusted' subject.
Acknowledgement This work was done while I was a member of the Centre for Communications Systems Research, on leave from University College, Cork. I'd like to express my gratitude to Stewart Lee for inviting me and thank him, and the members of the Cambridge Computer Security Group, for a most enjoyable visit. Thanks also to the anonymous reviewers for their useful comments. This work was supported, in part, by a basic research grant from Forbairt (Ireland).
References 1. D. E. Bell and L. J. La Padula. Secure computer system: unified exposition and MULTICS interpretation. Report ESD-TR-75-306, The MITRE Corporation, March 1976. 2. E. Bertino, L.V. Mancini, and S. Jajodia. Collecting garbage in multilevel secure object stores. In Proceedings of the Symposium on Security and Privacy, pages 106-120, Oakland, CA, May 1994. IEEE Computer Society Press. 3. D.D. Clark and D. R. Wilson. A comparison of commercial and military computer security models. In Proceedings Symposium on Security and Privacy, pages 184194. IEEE Computer Society Press, April 1987. 4. S.N. Foley. The specification and implementation of commercial security requirements including dynamic segregation of duties. In 4th ACM Conference on Computer and Communications Security, 1997. 5. S.N. Foley. Supporting secure cannonical upgrade policies in multilevel secure object stores. In Proceedings of the 13th. Annual Computer Security Applications Conference, pages 69-80, San Diego, CA, December 1997. 6. J. A. Goguen and J. Meseguer. Unwinding and inference control. In Proceedings 1984 IEEE Symposium on Security and Privacy, pages 75-86, 1984. 7. W.L. Hursch and C.V. Lopes. Separation of concerns. Technical Report NU-CCS95-03, College of Computer Science, Northeastern University, Boston, MA 02115, USA., 1995.
48
Simon N. Foley
8. S. Jajodia and B. Kogan. Integrating an object-oriented data model with multilevel security. In IEEE Symposium on Security and Privacy, Oakland, CA, 1990. 9. M. Kang et al. Achieving database security through data replication: The SINTRA prototype. In Proceedings of the 17th National Computer Security Conference, pages 77-87, Baltimore, MD, USA, 1994. 10. M. Makpangou and M. Shapiro. The SOS object-oriented communication service. In 9th International Conference on Computer Communications, Israel, 1988. 11. A.C. Myers and B. Liskov. A decentralized model for information flow control. In 16th Annual Symposium on Operating Systems Principles. ACM, 1997. 12. M.J. Nash and K.R. Poland. Some conundrums concerning separation of duty. In Proceedings of the Symposium on Security and Privacy, pages 201-207, Oakland, CA, May 1990. IEEE Computer Society Press. 13. D. Olawsky, T. Fine, E. Schneider, and R. Spencer. Developing and using a policy neutral access control policy. In Proceedings of the New Security Paradigms Workshop. IEEE Computer Society Press, 1996. 14. J. Rushby. Noninterference, transitivity and channel-control security policies. Technical Report SRI-CSL-92-02, SRI International, Menlo Park, CA., December 1992. 15. M. Saaltink. The Z/EVES system. In ZUM'97 (lOth International Conference of Z Users), pages 72-85. Springer Verlag LNCS 1212, 1997. 16. J. M. Spivey. The Z Notation: A Reference Manual. Series in Computer Science. Prentice Hall International, second edition, 1992. 17. D.F. Sterne, G.S. Benson, and H Tajalli. Redrawing the security perimeter of a trusted system. In Proceedings of the IEEE Computer Security Foundations Workshop, pages 162-174, Franconia, NH, 1994. 18. R.K. Thomas and R.S. Sandhu. A kernelized architecture for multilevel secure object-oriented databases supporting write-up. Journal of Computer Security, 2(3):231-275, 1993. 19. R.K. Thomas and R.S. Sandhu. Supporting object-based high assurance write-up in multilevel databases for the replicated architecture. In Proceedings of European Symposium on Research in Computer Security, pages 403-428, UK, 1994. 20. U. S. Department of Defense. Trusted computer system criteria. Technical Report CSC-STD-001-83, U. S. National Computer Security Center, August 1983. 21. D. Volpano, C. Irvine, and G. Smith. A sound type system for secure flow analysis. Journal of Computer Security, 4(2/3), 1995.
A
The Z Notation
A set m a y be defined in Z using set specification in comprehension. This is of the form { D I P " E }, where D represents declarations, P is a predicate and E an expression. T h e components of { D I P 9 E } are the values taken by expression E when the variables introduced by D take all possible values t h a t make the predicate P true. For example, the set of squares of all even natural numbers is defined as { n : N I (n rood 2) = 0 9 n 2 }. W h e n there is only one variable in the declaration and the expression consists of just t h a t variable, then the expression m a y be dropped if desired. For example, the set of all even numbers m a y be written as { n : N [ (n rood 2) = 0 }. An equivalent way to define this set is as { n : N 9 (2 9 n) } - - t h e predicate m a y be dropped if it equals true.
A Kernelized Architecture
49
In Z, relations and functions are represented as sets of pairs. A (binary) relation R, declared as having type A ~ B, is a component of ?(A x B). For a 6 A and b 6 B, then the pair (a, b) is written as a ~-~ b, and a ~-+ b E R means that a is related to b under relation R. Functions are treated as special forms of relations. The Schema notation is used to structure specifications in Z. A schema such as FlowPolicy defines a collection of variables (limited to the scope of the schema), and specifies how they are related. Schema FlowPolicy[C, L, F] is defined in terms of generic types [C, L, F], which must be instantiated when the schema is used. Schemas may be defined in terms of other schemas. For example, the inclusion of FlowPolicy within RelabelPolicy is equivalent to the syntactic inclusion of the variables and predicates of FlowPolicy within RelabelPolicy. Schema predicates are useful for writing theorems: in Section 6 the Initial States Theorem is a universal quantification over all the variables of InitialLabelStoreO such that it's predicate part implies the existence a 6 such that the predicate part of Abs holds. The decorated schema LabelStore'[C, L] is LabelStore[C, L] with all variables primed. Schema variables may be renamed in the usual way: LabelStore[C, L][6'/6] an alternative way of writing LabelStore'[C, L]. The schema ALabelStore is a syntactic sugar for LabelStore A LabelStore'. It is typically used for specifying state transitions, with undecorated variables representing 'before values' and decorated (primed) variables representing 'after values'. Schema SSCUP is the schema ASCUP, but with the constraint that variable values are unchanged.
first(a, b)
Component a of ordered pair (a, b) PA The power set of A A~ B Relations between A and B A --+ B Total functions from A to B A~ B Partial functions in A --r B dom R, ran R Domain and Range of relation R f @g The functional override of f by g Fig. 5. Some Operators from the Z Toolkit
B
Theorem Proving with Z/EVES
This paper was typeset using LaTeX with the z-eves style. Thus, the LaTeX source of the paper acts as the input specification to the Z / E V E S system [15]. In addition to using the system to syntax-, type- and domain- check the specifications in this paper, Z / E V E S was used to verify the security and correctness of the label manager. The specification source, along with the Z / E V E S proof scripts for all theorems are available from the author or under the author's W W W page at
.
Dealing with Multi-policy Security in Large Open Distributed Systems Christophe Bidan 1 and Val~rie Issarny2 1 Distributed Software Engineering Section, Department of Computing, Imperial College, London SW7 2BZ, UK, c. bidan~doc, ic. ac .uk 2 IRISA / INRIA Rennes, Campus Universitairede Beaulieu, 35042 Rennes Cedex, France, Valerie. Issarny@irisa. fr
A b s t r a c t . From the security point of view, one challenge for today's distributed architectures is to support interoperation between applications relying on different possibly inconsistent security policies. This paper proposes a practical solution for dealing with the coexistence of different security policies in distributed architectures. We introduce a model for specifying security policies in terms of security domains, access control and information flow rules. Then, we identify the set of operators for combining the specifications of sub-policies and we address the validity of the resulting policy according to the security properties of the sub-policies.
1
Introduction
Object-based distributed computing architectures 1 like CORBA (Common Object Request Broker Architecture) defined by the Object Management Group (OMG) [24], and the Telecommunication Intelligent Network Architecture proposed by the TINA consortium [26] are promising approaches to support large open distributed systems. Goals of these architectures include interoperability between software components in heterogeneous distributed environments, where components may appear and disappear dynamically, as the result of individual and autonomous actions. From the security point of view, the interoperability promoted by the above architectures stresses the complexity of security policies which have to be implemented [15,10], as well as the necessity to cope with the coexistence of multiple security policies. This coexistence results from the interoperation between systems (or software components) having different security requirements, possibly at different granularity levels. For evaluating the security of open distributed systems, security oj~icers should be able to reason about the composition of the interoperating systems' 1 From the security point of view, an object-based distributed computing architecture is viewed as multiple interconnected systems [23].
52
Christophe Bidan and Val@rie Issarny
security policies. We identify two different approaches for the composition of security policies: policy interoperation and policy combination. Using policy interoperation, the composed policy should not violate the security of the subpolicies, and should guarantee their autonomy [14]. On the other hand, with policy combination, the composed policy may be inconsistent with both subpolicies, but must be secure in the given context. Hence, security combination allows to compose conflicting policies in a secure and controlled manner [3]. In the context of large open distributed architectures, we assert that policy combination is better suited than policy interoperation. In this paper, we introduce a model that forms the basis for dealing with security in heterogeneous distributed architectures, allowing to specify and to combine a wide variety of security policies. In particular, the proposed model is useful to security officers by supplying a practical set of combination operators. This paper is structured as follows: the next section presents related work and gives our standpoint concerning policy composition. Section 3 presents our model for the specification and combination of access control policies 2. In particular, it gives the notion of complete and sound combination. Section 4 extends this model so as to deal with combination of information flow policies. Finally, we draw some conclusions in section 5 and present current status and future work. 2
Topics
for Policy
Combination
Various models have been proposed in order to reason about security policies (e.g., see an overview of security models by Landwehr [18], the Bell-LaPadula model [4], the information flow model from Bieber [8], the Clark-Wilson model for commercial security constraints [9], and the McLean model [20]). These models are introduced to check whether a policy verifies given security properties like the nondeducibility property [25] or the noninterference property [13]. However, these models do not deal with the composition of policies. Concerning the definition of a security policy resulting from the composition of policies, we identify two different approaches: - policy interoperation which infers the composed policy based on the security properties of the sub-policies, and - policy combination which specifies the composed policy based on the specifications of the sub-policies. The former approach relies on the principles of autonomy and of security of sub-policies [14]: during the interoperation of two different security policies, any access authorized within an individual policy must also be authorized within the composed policy; and dually, any access denied within an individual policy must also be denied within the composed policy. Abadi et al. [1] propose a general 2 In this paper, an access control policy is defined as a security policy that only checks the accesses between entities, without verifying the information flow resulting from access operations.
Dealing with Multi-Policy Security in Large Open Distributed Systems
53
solution to the inference of execution properties of interoperating components from the components' specification, which can be adapted to security policies. More recently, McLean [21] has presented an approach which allows to reason about security properties of composed information flow policies. Finally, Gong et al. discuss computational issues concerning the general secure interoperation problem [14]. The above proposals allow to verify that interoperation preserves the subpolicies' security properties under certain conditions. However, the undertaken approach to the definition of policy composition is restrictive; the security properties of the composed policy must be the same as those of the sub-policies (the principles of autonomy and security must be guaranteed). In consequence, the sub-policies have to be compatible. In large open distributed system, this is not always the case, since the properties of a policy resulting from the composition of two incompatible sub-policies can be an extension or a restriction of those sub-policies. We illustrate this through two examples. In both example, we consider a file system where we have two sets of users, A and C, and two sets of files, B and D. We say that an user u is authorized to access the file f if and only if u is authorized to access f ' s data. We specify only authorized accesses, and all accesses that are not specifically authorized, are denied. We define the security policy P l (resp. P2) managing A users and B files (resp. C users and D files): P l (resp. 7~2) authorizes users in A (resp. in C) to access files in B (resp. in D). First, let us combine 7~1 with P2 to build the following composed policy: the users belonging to both sets A and C (denoted as AAC) are authorized to access the files in B N D, users in A but not in C (denoted as A - C) are authorized to access the files in B - D, and finally, users in C - A are authorized to access the files in D - B. As a second example, we combine P l and P2 so as to obtain the following policy: all the users belonging to sets A or C (denoted as A U C) are authorized to access all the files in B U D. Notice that this policy may be viewed as the federation of the sub-policies. The composed policy of the first example is a restriction of the participating sub-policies, whereas the composed policy in the second example is an extension of the participating sub-policies (for instance, A users are authorized to access D files). However, these two composed policies are secure in the context in which they are specified, nevertheless they cannot be obtained by policy interoperation. The specification of a policy based on the specifications of its sub-policies allows to combine possibly inconsistent policies. In particular, a combined policy does not necessarily guarantee all the properties of the sub-policies. However, notice that the combined policy may keep the principles of autonomy and secrecy of the sub-policies. McLean [19] seems to be the first to propose a formal approach including combination operators: he introduces an algebra of security which enables to reason
54
Christophe Bidan and Valdrie Issarny
about the problem of policy conflict. However, even though this approach permits to detect conflicts between policies, it does not propose a method to resolve the conflicts and to construct a security policy from inconsistent sub-policies. Hosmer [16] has introduced the notion of metapolicies, or "policies about policies", an informal framework for combining security policies. Following Hosmer's work, Bell formalizes the combination of two sub-policies with a function (policy combiner), and introduces the notion of policy attenuation to allow the composition of conflicting security policies [3]. Our work is placed in the framework of policy combination, and it is close to Bell's approach. More specifically, our model is a practical alternative solution to Bell's proposal: we clearly identify the set of combination operators enabling security officers to combine security policies in a controlled and secure way. Briefly, our model allows to specify the general behavior of sub-policies. Based on these specifications, we introduce combination operators for combining policy specifications. Finally, we verify that the behavior of the composed policy is secure and conforms with sub-policies specifications. The following sections introduce our model to deal with access control policies and flow policies.
3
Specifying Access Control Policies
In this section, we concentrate on the following type of access control: is an entity el authorized to access an entity e2 in the context of a given policy, regardless of the information flow that is implied by this access ? Our solution to policy composition relies on the definition of a set of combination operators. A policy is specified either as an elementary policy or a combined policy built from existing sub-policies through the use of combination operators. The next sub-section addresses the specification of access control policies and is followed by the definition of the operators allowing to combine policy specifications. Finally, we address the completeness and soundness of policy specifications. 3.1
Elementary
access control policy
An access control policy defines a set of rules that specify for each pair of object and subject, whether the subject is allowed to access the object's state or not 3. Since in an open distributed architecture, an access control policy cannot be defined for the whole set of the system's entities, we take an approach which allows to specify the sensitive entities for each policy. In other words, an access control policy is defined not solely in terms of its access control rules but also in terms of its access control domain [22], that determines the system's objects and subjects to which the policy applies. The access control domain of a policy is subdivided into the set of the system's entities called object domain, that contains sensitive information for this policy, 3 We define the subjects as active entities, and the objects as passive or active entities [11].
Dealing with Multi-Policy Security in Large Open Distributed Systems
55
and the set of the system's entities called subject domain, that gives the subjects belonging to the policy domain. These two sets are defined formally in terms of security classifications 4. Given a security property P (i.e. a predicate) defined on the set $ of the system's entities, the set of entities verifying this property is termed the P classification, noted Clp:
CIp = {r e C I P(c) ~ true} = {~ e ~ [ P} Given the definition of classification, an access control domain is defined as a set of classifications: Dora = {CI,}~=I, where, for each i = 1, ..., n, Cli is the classification defined by a security property 9
n
P~. Each access control rule A of a given policy defines the set of subject and object couples that verify a given security property Px (called access control predicate), and an access control operator. The access control operator is noted ,-~> for authorization, ~> for denial, and the 4-> operator will be used to denote ,~> or ~>5. Formally, an access control rule )~ takes the form: = {(s e Cll),(o e Cl2) [ Px(s,o) ;s+>o}. In other words, we say that a pair (s, o) belongs to )~ if the access control predicate Px(s, o) holds ; then, the access control operator of )~ define whether s is authorized or not to access o. In the remainder, f o r / ) being an access control policy, its domain is noted DomT~, the notation Dome, (resp. Dome,) is used to designate the sub-domain of DomT) that defines 7)'s objects (resp. subjects). Finally, R~, is the set of access control rules of the policy 7~. E x a m p l e . We now illustrate our model for policy specification with the example of a file system expressed as a distributed architecture. Let ~'8 be the set of software components accessing files. Let further R E A D C :F$ and W R I T E C ,%'8 be respectively the set of software components allowed to read and to write files. We also consider two sets of users, A and C, and two sets of files, B and D. We introduce the predicate Owner(sc, U) which, given a software component sc, holds if the owner of sc (i.e. the user executing sc) belongs to the set U of users (i.e. A or C). In the following, we do not specify reflexive accesses. Given the above definitions, we define the/)1 and 7)2 access control policies introduced in w as follows: -
The subject domain of the policy Pl is the set of components R E A D (Dom~ = R E A D ) , and :Pl's object domain is the set of files B (Dom~ = B). The policy :Pl has a single access control rule: )h = {(s E READ), (0 E B) I Owner(s,A);s,,~>o}, that authorizes subjects of A to access objects of B.
4 We use the term classification for both object classification and subject clearance. 5 We consider that an access corresponds to the execution of an operation (e.g. read access corresponds read method). This enables us to use a single access operator.
56
Christophe Bidan and Val~rie Issarny
- The subject domain of the policy P2 is the set of components W R I T E (Dom~ = W R I T E ) , and :P2's object domain is the set of files D (Dom~ = D). The policy P2 has a single access control rule: )~2 = {(s E W R I T E ) , (o e D) I Owner(s, C); s,,~>o}, that authorizes subjects of C to access objects of D. 3.2
C o m b i n i n g access c o n t r o l policies
A combined access control policy is built by combining two sub-policies. The domain and rules of the combined policy are defined in terms of the combination of its sub-policies' domains and rules. C o m b i n i n g access c o n t r o l d o m a i n s Given two classifications, four types of combination can be implemented: the union, intersection, product of classifications, and the denial of classification combination. Let Cll = {~ E E I P1} and C12 = {E E E I P2} be two classifications. We formally define their union and intersection as: Cll U Cl2 = {~ E ~ I P1 V P2} and Cll A Cl2 = {~ E E I P1 A P2} respectively. Concerning the product of Cll and el2 (noted Cll t~ Cl2), three separate classifications are computed: the classifications defined by the access control predicates P1 A -~P2, -~P1 A P2 and P1 A P2 respectively. Finally, to forbid the combination of two classifications, we introduce the operator /2 defined by: Cll )5 Cl2 = 0. The U operator allows to extend the sub-classifications. On the other hand, the classification obtained by A operator is a restriction of the sub-classifications. Finally, the +~ operator permits to distinguish the entities belonging to a single sub-classification from the entities belonging to both sub-classifications. The formal definition for combining domains directly follows: it amounts to specifying the type of combination for each pair of classifications of the subpolicies' domains. Let Dotal = (Cl~}'~=1 and Dora2 = (Clj}~= 1 be two access control domains. Let A = {Si,j E {U, N, ~, ~ ) , i = 1, ..., n, j = 1, ..., m} be the set specifying the operator for combining the classifications Cli of Dotal and Clj of Dora2. The combined access control domain D a m resulting from the combination of Dotal with Dora2 with respect to A is given by: Dora = {V i = 1,...,n,V j = 1,...,m, Clij = Cli ~i,j Clj}. We bring to the reader's attention that a combined domain can itself be combined with another domain. C o m b i n i n g access c o n t r o l rules. Because both ,~> and ~> access operators exist, the combination of access control rules raises the problem of conflicts among rules. Thus, we distinguish between combination of non-conflicting rules and combination of conflicting rules.
Combination of non-conflicting rules. Let us first consider the combination of non-conflicting rules. Given two non-conflicting access control rules, their combination results from the combination of their classifications and of their access control predicate.
Dealing with Multi-Policy Security in Large Open Distributed Systems Let A1 and both or denim ~1 = {(S 9 A2 = {(s 9
57
)~2 be two access control rules specifying either authorization for of access for both: Cll),(O 9 Cl'l) ] Pl(s,o) ; s+>o} and C12),(o 9 Cl~2) I P2(s,o) ;s+>o}.
Let us first suppose that Cll = Cl2 and Cl'1 = Cl'2. Then, combining A1 and )~2 consists of combining the access control predicates P1 and P2. With respect
to existing work (e.g. [2]), we identify two types of combination: the logical and (A) and the logical or (V) between P1 and P2. Then, the A access control rule resulting from the combination of A1 and As takes one of the following forms: A = {(s 9 Cll),(o 9 Cl~) I (Pl(s,o) YP2(s,o)) ; s+>o}, or A = {(s 9 C l l ) , ( o 9 Cl'~)l(P~(s,o)/\P2(s,o)); s+>o}. Informally, the logical or operator allows to combine two access control rules in such a way that the resulting access control rule preserves the authorized accesses of the sub-rules (principle of autonomy). On the other hand, the access control rule resulting from the logical and operator authorizes (resp. denies) access if both sub-rules authorize (resp. deny) the access. Let us now suppose that Cll ~ Cl2 and Cl~ ~ Cl~. Then, the set of access control rules resulting from the combination of )h and )~2 depends not only on the operator for combining access control predicates, but also on the operators for combining classifications. Let 61,2 (resp. 5~,2) be the operator defined for the combination of the Cll and C12 (resp. Cl~ and Cl~) classifications. Then, the set A of access control rules takes one of the following forms, depending on the operators used for combining classifications: if 51, 2 ---~ /2 or 5~,2 = /J, then A = 0 (the empty set), i.e. the combination is forbidden. if 61,2 E {U, M} and 5~,2 E {U, M}, then A consists of a single access control rule: )~ = {(s E ell 51,2 el2), (0 E Cl[ 5'1,2 Cl~2) I (Pl(S,O) op P2(s,o)) ; s+>o}, where o~ E {V, A}. - if 51,2 = t~, we only combine the access control predicates for subjects belonging to Cll MCl2, and we keep unchanged the rules for the other entities. In other words, A consists of three separate access control rules: = {(8 E ( e l l - Cl2)),(o e (Cl i 5'1,2 Cl~))lPl(S,O ) ;8+>0}, )r = {(s E (el2 - Cll)),(o E (Cl~ 5~,2 elL) ) I P2(s,o) ; s+>o}, and )~H = {(8 9 (CllNel2)) , (0 9 (Vi i 6'1,2 Cl~)) I (P1 (8, o) op P2(s, o)) ; s+>o}, where o~ 9 {V, A}. If 5 ~ ~ = ~, the set A of access control 1,2 = t~ or 51,2 = 5 1,2 rules is equivalent to the one computed for 51,2 = ~. -
-
Coupling the operators for combining classifications and the ones for combining access control predicates, allows either to restrict or to extend the access control rules of the sub-policies. In particular, using 51,2 = 5[,2 = M and the A operator for combining access control predicates, the access control rule )~ resulting from the composition of A1 and )~2, verifies that: (s,o) E )~ ~ (s,o) E A1
58
Christophe Bidan and Val~rie Issarny
and (s, o) E As, i.e. the access is authorized (resp. denied) if and only if it is authorized (resp. denied) by both A1 and As (A is more specialized than A1 and A2 in the sense that it provides finest classifications). In this case, the composed policy keeps the principles of autonomy and secrecy of the sub-policies [14]: the combined policy is identical to the one obtained with policy interoperation.
Combination of conflicting rules. Let us now examine the combination of conflicting rules. Let A1 and A2 be two access control rules specifying respectively access authorization and access denial: )~1 ---- {(8 e Cll),(o E Cl~) [ Pl(8,0) ;8,,~>0} and As = {(s e Cl2), (0 e Cl~) [ P2(s,o) ;sr Let 51,2 (resp. 5~,2) be the operator defined for the combination of the classifications Cll and Cl2 (resp. Cl~ and Cl~). Let further C11,2 = Cll 51,2 Cl2 and Cl~, 2 = Cl[ 5'1,2 Cl~. The A+ and A - operators allow to combine two conflicting rules: A1 A+ As and A1 A- A2. T h e result of the combination of A1 and A2 consists of three separate rules A, AI and A" defined by: -- {(s e vii,s), (o e cll,s) I (fl(s,o)A-nPs(s,o)) ;s ,'-) o} = {(s e Cll,s),(o e ell,s) I (-,Pl(s,o) APs(s,o)) ;s o} )~1! ---- { ( 8 E ell,2), (o ~ Clll,2) I (Pl(s,o) A Ps(s,o)) ;sac o} with (ac = ,~>) for the operator A+, and (ac = ~>) for the operator A - . The A and A~ rules are coherent with both sub-rules. On the other hand, the A" rule is clearly inconsistent with either A1 or As. The A+ operator favors access authorization whereas the A- operator favors access denial. Notice that the A+ and A - operators are close to the notion of strong and weak authorizations introduced in database systems to support multiple access control policies [5]. As with the composition of non-conflicting rules, coupling the operators for combining classifications and the ones for combining access control rules allows either to restrict or to extend the sub-rules. Finally, the combination of two sets of access control rules R1 -- {A1}i= i n 1 and J m 1 amounts to specifying, for each pair of rules ()~1, R2 = {As}~= i A2), J the operators for combining their classifications and their access control predicate. Let us remark that the proposed combination model applies recursively: a combined policy can be combined with another policy. E x a m p l e . We now illustrate the use of combination operators with the example of the file system given in w Given the component classifications R E A D and W R I T E , the two sets of users A and C, and the file classifications B and D, we define the following access control policy P: the (A - C) users are authorized to read files in (B - D), (A Cl C) users can read and write files in (B Cl D), and (C - A) users are authorized to write files in (D - B). Using the bJ operator for combining classifications, we are able to differentiate components that only allow to read files (R_ONLY = (READ - W R I T E ) ) , components that only allow to write files ( W _ O N L Y = ( W R I T E - READ)), and components that allow to read and to write files (R_and_W = ( R E A D gl
Dealing with Multi-Policy Security in Large Open Distributed Systems
59
W R I T E ) ) . The operator t~ further allows to differentiate the following sets of files: B - D, B M D and D - B. Finally, using the operator A for combining access control rules, we are able to get the specifications of 7): = {(s E R _ O N L Y ) , (o e B - D) I Owner(s, A) ; s,~>o), ~' = ((s E W _ O N L Y ) , (o E D - B) I Owner(s, C) ; s,,~>o}, and )~" = ((s e R_and_W), (o E (BMD)) I (Owner(s, A) A O w n e r ( s , C)) ;s,~>o}. 3.3
Complete and sound access control policy
We have proposed a model for the specification of access control policies as well as a set of combination operators allowing to compose these specifications. Now, we define the notion of secure access control policy in the context of our model: the policy is secure if and only if its specification is complete and sound. Let us precisely define the completeness and soundness of policy specifications. D e f i n i t i o n 1. ( C o m p l e t e policy) An (access control) policy 7) is complete (from the standpoint of its specification) if and only if: (1) there exists an access control rule for every object and subject of 7~ 's domain, and (2) in the case that 7) is a combined (access control) policy, its sub-policies are also complete. In an open distributed architecture, an access control policy cannot be defined for the whole set of the system's entities. The above definition states that a complete access control policy ensures access control for any entity belonging to the policy's object domain, and for entity belonging to the policy's subject domain 6. D e f i n i t i o n 2. ( S o u n d p o l i c y ) An (access control) policy 7) is sound (from the standpoint of its specification) if and only if: (1) given a subject s and an object o belonging to the policy's domains, all the access control rules specifying whether s can access o or not, have the same access control operator, and (2) in the case that 7) is a combined (access control) policy, its sub-policies are sound. Given two sound sub-policies, if both specify only access authorization (or only denial) then for all combination operators, the combined policy is sound. On the opposite, when at least one of them specifies both authorized and denied accesses, the soundness of the combined policy may not be guaranteed. However, given two complete and sound access control policies, we have the following result which guarantees that there exists a secure policy resulting from the combination of these policies. 1. E x i s t e n c e o f s e c u r e c o m b i n e d a c c e s s c o n t r o l p o l i c y Let 7)1 and 7)2 be two secure access control policies. There exists a set of operators for combining classifications and for combining access control rules such that, if 7) is the policy resulting from the combination of 7)1 and 7)2 according to these sets, then 7) is secure.
Theorem
6 In general, the completeness of a given policy can be guaranteed by specifying default access control rules.
60
Christophe Bidan and Val~rie Issarny
Proof: Let Dotal = {Cli}~.=l (resp. Dom2 = {Clj}~=l) be 791's (resp. 792's) access control domains. Let A = {Ji,j = N,i = 1, ...,n,j = 1, ...,m} be the set of operator for combining classifications. Let R1 = {A~}n=l (resp. R2 = {A~}~=I) be 791's (resp. 792's) sets of access control rules. Let ~ = {ai,j E { V, A+, A- }, 1, ..., n, j = 1, ..., m} be the set of operator for combining access control predicates, with: (1) c~i,j = V if A~ and A~ are non-conflicting rules; (2) ai,j = A+ if A~ and A~ are conflicting rules and A~ is an access authorization rule; and (3) c~i,j = A- if A~ and AS are conflicting rules and A] is an access denial rule. It is trivial to show that the policy 79 resulting from the composition of 791 and 792 according to the sets A and ~ is a complete and sound policy (i.e. 79 is secure). [] E x a m p l e . Let us now address the completeness and soundness of policies specified in the file system example. The policies 791 and 792 are not complete. For instance, there are not access control rules concerning software components not belonging to subject's classifications and which require to access B or D files. If we specify default rules which deny all accesses between entities in 791 (resp. 792) security domain and the other entities, 791 (resp. 792) becomes complete. By giving priority to the access control rules over the default rules, the soundness of the policies 791 and 792 is also verified. Finally, by using the same default rules, the combined policy 79 is secure.
4
Extension for Information Flow Policies
In this section, we extend the proposed model for specifying and combining access control policies to introduce the notion of information flow. Then, we address completeness and soundness of the resulting specifications, hence defining secure information flow policy according to our model. 4.1
Information flow p o l i c y
The previous model allows to describe access control policies but does not integrate any notion of information flow. Therefore, it does not meet the specification of information flow. In order to deal with flow policies, we extend the previous model as follows: (1) the specifications of access control rules are enriched with the notion of information flow, and (2) rules that control the transitive information flow (by denying some information flows) are introduced. I n p u t a n d o u t p u t flows. For specifying the information flow that is associated to an access control rule, let us distinguish the access control rules for input flow (e.g. the read accesses) from the access control rules with output flow (e.g. the write accesses). We extend our definition of access control rules given in w by specifying the flow associated to the access operation thanks to the operators <--, and ,-~>.
Dealing with Multi-Policy Security in Large Open Distributed Systems
61
Definition 3. ( R u l e s f o r input and o u t p u t flow) An access control rule for input flow takes the form: ~ = {(s e Cl), (0 E Cl') I P(s, o) ;s<--~o). An access control rule for output flow takes the form: ~ = {(s e C1), (o e
Cl') l P(s ,o) ;s,~>o}. Given two entities el and e2, the expression el <,--e2 specifies that el is authorized to access e2, and this access produce a flow from el to e2. Dually, the expression el-,,>e2 specifies that el is authorized to access e2, and this access produce a flow from e2 to el. Note that we only specify the information flow for the authorized accesses.
Information flow r u l e s . The information flow resulting to accesses is transitive: let el, e2 and e3 be system entities, if e l ~ > e 2 and e2,-.~>e3 then el r,~>s3. In order to specify information flow policies, we specify the information flows which are denied, i.e. the information flow rules. Definition 4. ( I n f o r m a t i o n flow r u l e ) An information flow rule is defined as r : {(el e e l ) , (e2 E Cl') I P ( Q , e 2 ) ;el ~ e2}. The rule r specifies that the information flow from C! entities to Cl' entities verifying the predicate P is denied. Then, an information flow policy is defined as follows:
Definition 5. ( I n f o r m a t i o n flow p o l i c y ) An information flow policy is an
access control policy for which the access control rules are enriched according to the definition 3, and which defines a set of information flow rules. E x a m p l e . In order to illustrate our model, we describe a simple multilevel security policy [4], called ~ s - We consider a distributed military system consisting of a set of software components (i.e. programs representing users), and a set of files that the components manipulate. Suppose that we have two security levels for both files and components: Secret and Unclassified 7 such that Secret > Unclassified (i.e. the Secret information is more sensible than the Unclassified information). A software component can use a set of methods (e.g. read, write, execute, etc) to access files. In the following, we only consider the read and write methods. We define the R E A D (resp. W R I T E ) classification to be the set of software components allowed to read (resp. write) files. We introduce the predicate (cl > e2) which, given two entities el and c2, holds if and only if the security level of el dominates, i.e. is greater than or equal to the security level of e2. We define the following information flow policy ~s: - The subject domain consists of the R E A D and W R I T E classifications, and the object domain consists of files belonging to Secret or Unclassified classifications. 7 The Secret and Unclassified terms are used for denoting both the security level, the classifications and the predicate.
62
Christophe Bidan and Val~rie Issarny
- A component belonging to R E A D is authorized to access a file if and only if its security level dominates the one of the file, and the flow resulting to such access is an input flow: A1 = {(s E R E A D ) , (0 E (Secret U Unclassified)) [ s > 0 ;s<~,o}. - A component belonging to W R I T E is authorized to access an file if and only if the file security level dominates the one of the component, and the flow resulting to such access is an output flow: A2 = {(s E W R I T E ) , (0 e (Secret [J Unclassified)) [ s >_ 0 ;s,~>o}. - All the accesses not authorized previously are denied through a default rule. - Unclassified entities are not authorized to access Secret information: r = {(el E R E A D ) , (e2 E (SecretUUnclassified)) [ (~2 _> el) ;el ~/+ e2}. 4.2
Combined
information
flow policy
Let us now address the combination of information flow policies. Given two information flow policies, we distinguish three steps for the combination of these policies: (1) the combination of these policies according to the operators for combining access control policies (see w (2) the combination of the <~, and ,~> operators in order to specify the information flow of each combined access control rule, and (3) the combination of information flow rules. The first step allows to obtain the access control rules of the combined flow policy. During this step, the information flow sub-policies are viewed as access control policies, and the combination is carried out from the standpoint of combination of access control policies. The second step permits to specify the information flow within the combined access control policy obtained at the first step. Informally, the combination of two rules for input flow (resp. rules with output flow) is a rule for input flow (resp. rule with output flow). In the other case, the combined rule is a rule for both input and output flow (<~> operator). Finally, we compute the combined information flow rules by composing the information flow rules of the sub-policies, in a way similar to the combination of non-conflicting rules (see w Informally, the logical or operator allows to combine two information flow rules so as to generate a unique rule that provides the information flow control of the sub-policies. On the other hand, the information flow rule resulting from the logical and operator denies the information flow if and only if both sub-rules deny the flow. E x a m p l e . Let us combine the information flow policy 5us introduced in w with another policy having different classifications. Suppose that we have two classifications in terms of military domains for both files and software components (e.g. users): Nuclear and Conventional classifications such as Nuclear > Conventional. Similarly, we define the information flow policy 3rD, except for the set of information flow rules that we define as an empty set. By using the M operator for combining classifications, and the A operator for combining access control predicates, we obtain the policy 9r depicted in figure 1. For simplifying the figure, we do not distinguish components and files, and
Dealing with Multi-Policy Security in Large Open Distributed Systems
63
we do not show neither the denial accesses nor the implied transitive accesses. The specification of the information flow in the combined access control rules are immediate according to the sub-policies' specifications (i.e. the read access produces input flow whereas the write access generates output flow). The information flow rule of the combined policy is the one of the policy :Fs: r = {(Cl e READ), (e2 e (Secret U Unclassified)) I (e2 > el) ;cl ~-~ c2}.
t'Z, ~
Secret r Nuclear
[~
Unclas~ified n Nuclear Read access
~,r rt ]] - - -I~
Secret n Conventional Unclass~fied ~ Conventlonal Write access
Fig. 1. Combined multilevel policies
4.3
Complete and sound information flow policy
Given the specifications of information flow policies in our model, we now focus on the notion of secure information flow policy by addressing the completeness and soundness properties. In our model, we only specify the information flows which are denied. In other words, any information flows that are not denied are authorized, and the problem of conflicts among information flow rules does not exists. Based on this remark, an information flow policy /F is complete (from the standpoint of its specification) if and only if ~ is complete as an access control policy (see definition 1). In the same way, an information flow policy ~- is sound (from the standpoint of its specification) if and only if 9~ is sound as an access control policy (see definition 2). A secure information flow policy is a policy for which the specifications in our model are complete and sound. In the case of a combined policy, the secure property implies that the policy's specifications are coherent with the specifications of the sub-policies according to the combination operators that are used. Like in w given two information flow policies, there exists operators for combining classifications and rules such that the combined policy is secure. Notice that a policy being secure according to our model does not imply that the policy have no covert channels. For example, the write access is an access having output flow, but, when the file does not exist, the information flow induced by the write method is an input flow: after the operation, the user
64
Christophe Bidan and Val@rie Issarny
have information concerning the file's existence. Such covert channels do not invalidate the secure property. However, our secure property ensure that the policy's specifications are defined without ambiguity for any entity belonging to the policy's domain. E x a m p l e . Let us address the completeness and the soundness of the multilevel policies iTs and ~-D. Let us consider that the classifications in terms of security level or in terms of military domains are both complete classifications, i.e. any entity of the system has a (unique) security level and belongs to a (unique) military domains. Then the flow policies iTs and ,~FD becomes complete and sound access control policies. In the same way, the combined flow policy iT is complete and sound as an access control policy. Finally, the soundness of the policy iT is trivial, and iT is a secure flow policy.
5
Conclusion
In this paper, we have proposed a practical solution for reasoning about the coexistence of different security policies. Our approach relies on the specification of security policies and on the definition of operators for combining these specification. We have also addressed the completeness and the soundness of the (combined) security policy according to its specification. Our model can be viewed as an implementation design of the Bell's model. We have introduced a set of combination operators enabling security officers to specify and to combine security policies in a controlled and secure manner. In addition, we assert that the dissociation between the combination domains and the combination rules permits to get a simplified and growing vision of security policies in open distributed systems. Finally, the distinction between access operations and the generated information flows gives to our model a useful approach, information flow policies being considered as an extension of access control policies. We have illustrated our approach through two different examples of file systems in an distributed architecture. In particular, the multilevel security policy example enables us to demonstrate the strength for specifying security policies and the ease for combining security policies. Current and future work
In open distributed systems, the coexistence of applications which have different security constraints results from rich and complex interactions among software components [15]. Integrating our model in the software architecture paradigm provides an appealing solution to the sound development of applications with security requirements [7]. More specifically, such a solution allows to specify the security requirements, and to build systematically, from existing components, the reference monitor that meet these requirements. One application area for our results can be the World-Wide-Web (Www): we are currently examining
Dealing with Multi-Policy Security in Large Open Distributed Systems
65
the automatic selection of the components for ensuring information exchanges in a secure way. We also consider the use of this solution for reasoning about Java-applets, in order to customize the Java Virtual Machine in a controlled and secure way. From the implementation standpoint, we are currently integrating our model in the Aster configuration-based distributed programming environment [6, 17]. The resulting framework allows the easy development of applications consisting of heterogeneous software components, running on heterogeneous hardware platforms, and having multiple security requirements. From the theoretical perspective, we are interested with the extension of our model in order to specify security policies with dynamic relabeling [12].
References 1. M. Abadi and L. Lamport. Composing Specification. Technical Report 66, Digital Systems Research Center, Oct. 1990. 2. M. Abrams, L. LaPadula, K. Eggers, and I. Olson. A Generalized Framework for Access Control: an Informal Description. In Proceedings of the 13th National Computer Security Conference, pages 134-143, Oct. 1990. 3. D. E. Bell. Modeling the Multipolicy Machine. In Proceedings of the New Security Paradigm Workshop, pages 2-9, Aug. 1994. 4. D. E. Bell and L. J. LaPadula. Secure Computer Systems: Unified Exposition and Multics Interpretation. Technical Report MTR-2997 Rev. 1, MITRE Corporation, Bedford, Mass, 1976. 5. E. Bertino, S. Jajodia, and P. Samarati. Supporting Multiple Access Control Policies in Database Systems. In Proceedings of the IEEE Symposium on Security and Privacy, pages 94-107, May 1996. 6. C. Bidan and V. Issarny. A Configuration-based Environment for Dealing with Multiple Security Policies in Open Distributed Systems. In Proceedings of the 2nd European Research Seminar on Advances in Distributed Systems, Mar. 1997. URL: htZp ://www. irisa, fr/solidor/work/aster. 7. C. Bidan and V. Issarny. Security Benefits from Software Architecture. In Proceed-
8. 9. 10. 11. 12. 13.
ings 2rid International Conference on Coordination Models and Languages, pages 64-80, Sept. 1997. URL: http://w~w, i r i s a , f r / s o l i d o r / w o r k / a s t e r . P. Bieber and F. Cuppens. A logical view of secure dependencies. Journal of Computer Security, 1(1):99-129, 1992. D. Clark and D. Wilson. A Comparison of Commercial and Military Computer Security Policies. In I. C. Society, editor, Proceedings of the IEEE Symposium on Security and Privacy, 1987. R. Deng, S. Bhonsle, W. Wang, and A. Lazar. Integrating Security in CORBA Based Object Architectures. In Proceedings of the IEEE Symposium on Security and Privacy, pages 50-61, May 1995. Department of Defense Standard. Trusted Computer System Evaluation Criteria. Technical Report DoD 5200.28-STD, Dec. 1985. S. Foley, L. Gong, and X. Qian. A Security Model of Dynamic Labeling Providing a Tiered Approach to Verification. In Proceedings of the IEEE Symposium on Security and Privacy, pages 142-153, May 1996. J. Goguen and J. Meseguer. Security Policies and Security Models. In Proceedings of the IEEE Symposium on Security and Privacy, pages 11-20, May 1982.
66
Christophe Bidan and Val~rie Issarny
14. L. Gong and X. Qian. Computational issue in secure interoperation. IEEE Transactions on Software Engineering, 22(1):43-52, Jan. 1996. 15. OMG Security Working Group. White Paper on Security. TC Document 94.4.16, OMG, Apr. 1994. Available by ftp at ftp.omg.org:/pub/docs. 16. H. Hosmer. Metapolicies II. In Proceedings of the 15th National Computer Security Conference, pages 369-378, 1992. 17. V. Issarny, C. Bidan, and T. Saridakis. Achieving Midleware Customization in a Configuration-Based Development Evironment: Experience with the Aster Prototype. In Proceedings of the 4th International Conference on Configurable Distributed Systems, 1998. URL: http ://www. irisa, fr/solidor/work/aster. 18. C. E. Landwehr. Formal models for computer security. ACM Computing Surveys, 13(3):247-278, Nov. 1981. 19. J. McLean. The Algebra of Security. In Proceedings of the 1988 IEEE Computer Society Symposium on Security and Privacy, pages 2-7, Apr. 1988. 20. J. McLean. Security Models and Information Flow. In Proceedings of the IEEE Symposium on Security and Privacy, pages 180-187, May 1990. 21. J. McLean. A general theory of composition for a class of possibilistic properties. IEEE Transactions on Software Engineering, 22(1):53-67, Jan. 1996. 22. J. D. Moffett, M. D. Sloman, and K. Twidle. Specifying Discretionary Access Control Policy for Distributed Systems. Computer Communications, 13(9):571580, Nov. 1990. 23. National Computer Security Center. Trusted Network Interpretation of the TCSEC. Technical Report NCSC-TG-005, July 1987. 24. OMG. The Common Object Request Broker: Architecture and Specification Revision 2.0. Technical report, OMG Document, 1995. 25. D. Sutherland. A Model of Information. In Proceedings of the 9th National Computer Security Conference, pages 2-12, Sept. 1986. 26. TINA-C. TINA Object Definition Language (TINA-ODL) Manual - Version 1.3. Technical Report TR_NM.002_l.3_95, TINA-C Document, 1995.
A Flexible Method for Information System Security Policy Specification Rodolphe Ortalo LAAS-CNRS 7, avenue du Colonel Roche, 31077 Toulouse cedex 4, France [email protected]
Abstract. This paper presents a method for the specification of the security of information systems. The proposed approach provides a flexible and expressive specification method, corresponding to the specific needs of organizations. First, we outline the overall guidelines of the security policy definition process, and the different consistency issues associated to the description of the security requirements of an organization information system. The specification language used is based on a convenient extension of deontic logic. The formalism and its extensions are defined briefly. To illustrate the use of this formalism, the paper presents how the method applies to the description of the security requirements of a real organization : a medium-size bank agency. Keywords: security policy specification, information systems, deontic logic.
1
Introduction
This paper presents a practical framework for the specification of the security policy of an organization such as an industrial company or a public service. These organizations exhibit security needs, but they also have other requirements that induce several specificities with respect to security management. First, the information system may exist prior to the explicit definition of its security policy. Furthermore, it is important to provide an adaptable and structured methodology corresponding to the need for abstraction and progressive refinement in the security policy specification process. Finally, the information system may be imperfect, in the sense that the desired security properties may differ from the actual observed properties. From the security point of view, these contradictions appear intolerable, but they often result from a trade-off between the main requirements of the organization and the enforcement of security measures. In fact, the study of practical regulations often reveals discrepancies between the actual situation and the desired behavior, and the security policy would probably better cope with them rather than ignore them. Consequently, in the context of an organization, most of the classical security models are often unusable. The problem is that they impose constraints on the design and the behavior of the system that may not be acceptable in a situation J.-J. Quisquater et al. (Eds.): ESORICS’98, LNCS 1485, pp. 67–85, 1998. c Springer-Verlag Berlin Heidelberg 1998
68
Rodolphe Ortalo
where security is not a primary requirement. For example, the mandatory rules of a multi-level security policy [1] imply to design the information system accordingly, and assume permanent application of the rules. Moreover, in the case where the system already exists, the security policy could not be applied until it has been re-designed. To face such situations, the specification of a security policy must provide more flexibility and expressiveness. In this paper, we use an extended deontic logic language to specify security policies. Several authors already outlined the main features of deontic logic for specifying security properties [9,12,4], and this formalism seems to be more adequate for the specification of the security policy of an organization than usual security models. The specification method we propose is based on an original syntactical extension of the standard deontic logic language that allows a structured and convenient representation of the elements of the policy. Given the imperfections of real information system policies, several types of security policy inconsistencies may be encountered, corresponding to the elements of the security policy that conflict with each other. By detailing these cases, we identify some inconsistencies corresponding to a trade-off between functionality and security in the organization, that may be tolerated in the specification. Furthermore, we argue that the formalism also offers the opportunity to complement the security policy with a description of the vulnerabilities of the information system. However, it is highly desirable to completely describe the insecure states corresponding to inconsistencies in order to have a chance to trust the system. The structure of the paper is the following : Section 2 presents the various elements of an organization security policy and the potential conflicts that it may contain. Section 3 provides the definition of the specification language used in this paper. Section 4 details the application of the method to an information system. Section 5 presents a practical application example, addressing the security needs of a bank agency. Finally, we conclude in Section 6.
2 2.1
Presentation of the Approach Security Policy Components
The security policy of the system corresponds to the description of its security needs. According to [10, Section 2.9], the security policy is “the set of laws, rules and practices that regulate how sensitive information and other resources are managed, protected and distributed within a specific system”. In this paper, we consider that a security policy defines : – the security objectives, i.e. the confidentiality, integrity and availability properties expected of the system ; – and the security rules which are imposed on the mechanisms which can modify the security state of the system, in order to guarantee the security properties.
A Flexible Method for Information System Security Policy Specification
69
The components of the security policy refer to specific entities of the system (e.g. specific subjects or objects, available operations, or an organization chart). First, these description elements are added to the policy. In an organization for instance, the names of the various individuals are mentioned. Additional description rules may also be added to the specification in order to reflect the functional behavior of the system when this behavior may impose constraints on the security-related aspects of the system. In an organization, such functional rules represent the various information flows existing in the organization and the way they interact with each other. The description of these operational rules brings information that help the security administrator to adapt the security policy to the actual operation of the organization. The description of the system operation is limited to the basic elements needed to formulate the security mechanisms and objectives, and to the operational rules that describe the portions of the information system relevant to security. Therefore, this description should remain minimal, and additional elements or rules should be introduced only when needed. Second, the security policy contains the security rules to be enforced by the system. These rules reflect the way in which the security state of the system evolves. In an organization, such rules define, for example, who can assign a new position to an individual, or on which condition someone may delegate some of his privileges to another person. Finally, the security policy defines several security objectives that specify the desired security properties for this system. The set of all these properties defines the states of the system that are acceptable with respect to security. A security state of the system that satisfies all the security objectives of the system security policy is a secure state. An insecure state is a state where these properties are not satisfied. 2.2
Refinement
The first version of the security objectives introduced in the security policy will probably be rather generic. Therefore, it is very probable that these objectives will be too restrictive and easily invalidated by the operation of the organization. To resolve this problem, the introduction of new elements may be necessary to limit the scope of an objective. Similarly, it may be necessary to introduce new description rules, or accurately detail the various steps of a given task in order to focus on a desired security property. Therefore, the description of the organization will undergo several refinement steps, enabling to clarify the security objectives and to improve the whole policy. Furthermore, the verification of the policy can benefit from an incremental approach, as we will see in Sect. 4.2. 2.3
Security Policy Consistency
The security policy is consistent if, starting from a secure state one cannot reach an insecure state without violating the security rules. Given the different security policy components, we can distinguish several types of potential inconsistency :
70
Rodolphe Ortalo
– First, the description of the system may be inconsistent due to conflicting functional rules. However, this is not a security-related specification problem. – Several security objectives may be contradictive. In this case, no secure state exists and it is impossible to enforce the policy, therefore the security objectives should be updated (e.g. by reducing their scope, or adding priorities). – Some of the security rules of the specification may contradict each other. For example, an individual may be able to reach a situation in which an obligation and an interdiction hold simultaneously with respect to the same action (e.g. due to conflicting roles). The modification of the security mechanisms may prevent such conflicts. However, it is also possible that the behavior of the system eliminates the situations in which these conflicts can occur. If so, description rules could be added to the system description to describe the functioning more accurately. – The verification of each security objective, starting with the system security rules, may reveal an inconsistency. In this case, security rules fail to enforce the objectives. Such inconsistency means that either security objectives should be revised, or it is needed to introduce new security rules to enforce the expected properties. – Finally, it is possible that the functional rules conflict with the security objectives. This case identifies weaknesses in the organization operation showing that it is possible to bypass the security rules. To face this problem, two directions are possible : it is possible to modify the organization operation (and therefore decide to adapt the organization to the envisaged security mechanisms), or to consider new security mechanisms which are compatible with the actual operation (i.e. adapt the security policy to the information system). We can identify the mandatory consistency requirements of a security policy specification. First, it seems necessary that the consistency of the system description be obtained. If the description of the system functioning is incorrect, adding security requirements to the policy is a nonsense. Second, if several security objectives are contradictive, i.e. if the enforcement of one objective implies the failure of another, the information system is always in an insecure state. Therefore, consistency of the set of all security objectives is necessary : if two security objectives contradict, a choice must be made. Other inconsistency types reflect problems such as incomplete security mechanisms that fail to enforce the security objectives, over restrictive security rules that may lead to opposed permissions and interdictions in specific situations, or dangerous functioning of the system that may allow to bypass the security mechanisms. In practice, some of the security policies defined in organizations exhibit such inconsistencies [5]. Furthermore, a situation in which such problems occur may be acceptable if it corresponds to a legitimate trade-off in the organization. Exhaustive description of the insecure states is necessary to trust the system, but security issues may not need to be totally reconsidered. The purpose of the introduction of a description of the system behavior in our approach to security policy specification is precisely to allow such precise insecure states
A Flexible Method for Information System Security Policy Specification
71
identification for some classes of inconsistent policies. Of course, improvements may also be proposed, but the corresponding mechanisms may not be possible to be enforced in the organization if they perturb its normal operation, and we think that the security policy specification should take this fact into account and cope with some inconsistencies. This point of view can be extended. We can envisage to complement the system security policy by a description of the potential vulnerabilities of the system. Vulnerabilities may be due to intrinsic weaknesses of the used security mechanisms (such as cryptographic keys length), or to flaws identified after the system has started its operational life [10, Para. 3.25-28], [11, Sect. 6.C]. Such vulnerabilities compromise theoretically the security of the system, but may be very difficult or very long to exploit. The identification of the insecure states due to vulnerabilities allows to assess their impact on the security objectives. This assessment should help the information system security administrator to propose relevant changes given the known vulnerabilities of the system.
3
Specification Language
In order to handle such cases, we need a security policy specification formalism flexible enough to separate the functional description of the system and the security needs description, and which furnishes methods for exploiting inconsistent specifications due to the integration of vulnerabilities in the description. Modal logic, or more precisely deontic logic, is a formal framework that seems to be expressive and adaptable enough to specify security policies of organizations [9,12,4]. Deontic logic is a branch of modal logic to reason about normative versus non-normative behavior by means of modal operators such as ought, permitted and forbidden [15, Preface]. Furthermore, by using these special operators, deontic logic provides a mean to specify that some behaviors of the system are non-normative (illegal) but nevertheless possible. We present briefly in this section the deontic logic language that we selected. A general presentation of modal logic can be found in [3]. 3.1
Deontic Logic Language
Like any modal logic, deontic logic is an extension of usual propositional logic by one or several deontic operators. More precisely, if Φ is a set of atomic propositions a, b, c, ..., if ¬, ∨, ∧, ⇒ and ⇔ are the usual boolean operators, and if O, P and F are the three modal operators of deontic logic, the deontic language denoted LO (Φ), or simply LO when it raises no ambiguity, is the set of formulas (or sentences) built with the following rules : – if p ∈ Φ, p is a formula ; – if p and q are formulas, ¬p, p ∨ q, p ∧ q, p ⇒ q, and p ⇔ q are formulas ; – if p is a formula, Op, Pp and Fp are formulas.
72
Rodolphe Ortalo
A modal formula is a formula of LO which contains at least one operator O, P or F. A non-modal formula is a formula which contains none. In natural language, formulas Op, Pp and Fp are read, respectively, “It is obligatory that p”, “It is permitted that p”, and “It is forbidden that p”. Therefore, operFp = O¬p (1) ators O, P and F denote respectively the noPp = ¬O¬p (2) tions of obligation, permission and interdiction. Op ⇒ Pp D Conforming to the definitions of necessity (operator ✷) and possibility (operator ✸) in classi- O (p ⇒ q) ⇒ (Op ⇒ Oq) K p cal modal logic, one can state relations between RN Op these various operators. First, F can be defined from O by (1). P is the operator dual of O. Therefore, one can also consider the relation (2) between operators O and P, similar to the relation between necessity and possibility in classical modal logic. Finally, standard deontic logic [3, Chap. 5], also contains axiom D which relates obligation and permission. A modal logic system is an extension of propositional logic, therefore it contains all the axioms and inference rules of propositional logic. Thus, LO contains all tautologies and is closed under modus ponens. We consider in this work a normal modal logic system, i.e. a system that contains axiom K and the inference rule RN (“Rule of Necessitation”). In this paper, we want to use deontic logic for the practical representation of security properties. Even though it may have a direct impact on our work, studying deontic logic in itself, or as a formal language for reasoning about legal issues and normative systems (or security), is out of the scope of this presentation. Therefore, we use standard deontic logic, i.e. the KD system of classical modal logic. Especially we accept (2) as a suitable definition of permission. However, these simplifications are done in order to clarify the security policy specification method, and they do not prevent further extensions of the logical language used, such as suggested in [3, Chap. 10]. Finally, we recall the semantics associated to a normal modal logic language like LO . It is a Kripke’s semantics, or possible world semantics [13]. A Kripke model M for a modal logic system is defined by a triple W, R, V where W is a set of possible worlds w, R is a binary relation over W called the accessibility relation, and V : W × Φ → {true, false} is a function furnishing in each world w ∈ W the truth value V (w, p) of the atomic proposition p. The fact that proposition p is true in a world w of model M is noted |=M w p. This truth value is defined in the following way1 , which extends usual propositional calculus : if p ∈ Φ, |=M w p if and only if V (w, p) = true |=M ¬p if and only if we do not have |=M w w p M M |=w (p ∧ q) if and only if |=M w p or |=w q M |=M w Op if and only if ∀w ∈ M/wRw , |=w p
– – – –
The first three statements conform to the classical truth value definition of a formula in propositional logic. From the last definition, we see that in the 1
Using ¬ and ∨ as the fundamental boolean operators.
A Flexible Method for Information System Security Policy Specification
73
Kripke’s semantics the formula “It is obligatory that p” is true in a world w if and only if p is true in every world w which w is in relation with. When w is in relation with w , we say that w is directly accessible from w, or equivalently that w is a possible world for w . The definition of the truth value of Pp is easily deduced from the one of Op using (2) : M – |=M w Pp if and only if ∃w ∈ W/wRw , |=w p
3.2
Language Extensions
The language LO , or LO (Φ), is based on a set of atomic propositions Φ. In order to ease the specification task, it is desirable to introduce constructs allowing a structured and hierarchical description of the various formulas of LO . Therefore, we propose to use a syntactical extension of this language, denoted LO . Let Π be a set of sets A, B, C, . . . , ordered by the partial order relation of inclusion ⊆. The language LO (Π), or simply LO when it raises no ambiguity, is syntactically identical to the language LO defined in Sect. 3.1. This extension is purely syntactical. Its interest resides in the possibility of structuring the atomic propositions of the language through a hierarchical description of a set by its subsets. In order to ease the construction of the various sets E elements of Π, we allow the use of the operator × (set product) to define the elements of Π, with the natural extension of the relation of inclusion.2 The deontic language LO (ΦΠ ) associated to the extended language LO (Π) is built considering the set ΦΠ defined as ΦΠ = {E ∈ Π/ (∀E ∈ Π, E ⊂ E)}. Therefore, ΦΠ is the set of all the terminal sets of Π, i.e. the sets that do not contain any other sets and form the basis of the hierarchy defined by relation ⊆. A sentence f of LO (Π) is associated to a set Σf of formulas of LO (ΦΠ ) defined recursively by (3), Σf |E←E (3) Σf = where f |x←y is the formula built by substituting E ⊆E for all the occurrences of symbol x the symbol y in formula f . Therefore, a formula f of LO (Π) is a short notation for several formulas of LO (ΦΠ ). These formulas are those obtained by replacing, in f , every occurrence of a non-terminal set of Π by its content until we obtain formulas mentioning only elements of ΦΠ . Using Set Constructors. Similar notations allowing the direct use of sets within formulas have already been applied to logical languages developed for database systems [14]. These extensions aim at facilitating the specification task. Sets allow for a progressive and hierarchical description of the basic elements of the security policy, and these sets can be used directly in the formulas of the extended language. To illustrate the use of this notation, we consider now several formulas written in LO (Π) and the equivalent set of formulas in LO (ΦΠ ).3 E × E = {(e, e ) /e ∈ E, e ∈ E }, and E × E ⊆ F × F ⇔ E ⊆ F ∧ E ⊆ F . In these examples, sets A, B, ‘Administrators’, ‘Software types’ and ‘Departments’ refer to sets containing atomic propositions only. ‘Install’ is itself an atomic proposition, i.e. an empty set. 2 3
74
Rodolphe Ortalo
A simple equation of the extended language LO , e.g. P (A ∨ B), should be represented by a whole set of formulas in LO , here : {∀p ∈ A, ∀q ∈ B, P (p ∨ q)}. The set product operator × of LO (Π) allows the definition of structured sets as elements of the language. This eases the description of various items starting with some of their typical features. Hence, formula (4) states directly a set of permissions associated to software installation tasks. These tasks are described by the action symbol ‘Install’, the various people allowed to perform the installation gathered in the set of the system administrators, and the various target software. Pieces of software are themselves described through two characteristics: their type (word processor, spreadsheet, etc.), and the organization department that owns them. The description of all these elements in the original language LO would necessitate the definition of many elements, sharing common typical characteristics. Formula (5) shows such a set. As can be seen, in a conventional notation, set product corresponds to several indices. P (Install × Administrators × (Software types × Departments)) P Installa,Softwaret,d a∈Administrators,t∈Software types,d∈Departments
(4) (5)
Graphical Representation. These extensions also make the specification language suitable for a graphical presentation that can be manipulated with classical user interaction techniques (drag and drop, etc.). This graphical representation represents the sets hierarchy leading to the definition of the atoms of the language using hierarchical lists. The operators of the language are also represented using graphical artifacts called slot-boxes. The purpose of this representation is to provide a simple and convenient method to define and manipulate the security policy specification elements. Figure 1 presents an example of this representation for defining the content of the sets mentioned in (4). Departments Management Production Sales
Software types Word processors LATEX, . . . Compilers Spreadsheets
Administrators A1 A2 A3
Fig. 1. Example of sets graphical representation
This graphical approach has been implemented within a tool enabling to define and modify easily a security policy. This editor is based on the Amulet graphical library [16] which was very convenient for this realization. In the remaining of this paper, we will adopt this graphical representation for sets, but for space reasons, we will use a conventional linear notation for formulas involving operators.
4
Application to Information Systems
Security needs justifying the elaboration of a security policy appear in many organizations. For example, we can mention banks (transactions integrity),
A Flexible Method for Information System Security Policy Specification
75
telecommunication companies (system availability), or medical and public-health services (legal confidentiality obligations). However, all these organizations also have other priorities distinct from security. In this section, we present how the deontic logic language presented previously can be applied to the specification of the security policies of such organizations, and the various possible situations associated to the specification verification. 4.1
Guidelines
In the context of an organization, a security administrator would be in charge of the security policy definition. While this administrator would probably have a good knowledge of the target organization, it is probable that he would not be familiar with a modal logic language. Therefore, we detail several steps of this formalization process. Reuse of available information. Most of the description elements of the specification appear in documents describing the organization and its operation. These elements are related to physical entities, individuals, or a functional decomposition of the organization. They refer also to specific attributes (such as confidentiality levels). Many of these items appear in description documents already available in the organization, such as an organization chart, some tasks definition documentation, handbooks, norms if applicable, etc. With respect to security issues, other documents help the system administrator. For instance, organization charts also mention the individuals situation in the hierarchy and their privileges. The policy can also be based on existing regulations, documents supporting the evaluation of the organization according to a set of security criteria [10], a previous risk analysis, laws if applicable to the organization, etc. When the security policy specification target is a system to be designed (and not a system in operation), the security designer should look for equivalent information existing in the design documents. System description. In the deontic logic language presented previously, the description of the organization is performed through the introduction of atomic propositions representing the basic elements, and non-modal formulas (built with the operators of usual propositional logic) representing the aspects of the functioning that are relevant for security. For example, consider the process of internal purchase in an organization : an employee formulates a purchasing demand, which must probably be validated by a manager before going to a specialized department that performs the operation. In order to describe these different steps, first one needs to introduce several atoms such as ‘demand’, ‘manager authorization’, ‘order’ and ‘delivery’. Then, one can state the causal dependencies between these steps and the condition for an order to be actually sent by the purchase department. This is represented by logical formulas like : demand ∧ manager authorization ⇒ order and order ⇒ delivery. Security rules. The security mechanisms existing in the system are added to the description of its operation. A security mechanism describes the link between the permissions (or interdictions) and the state of the system (or, more precisely, only a subset of the whole state of the system : its security state,
76
Rodolphe Ortalo
made of security attributes). Therefore, it may be represented by a modal formula stating a permission conditioned by some non-modal formula, such as : directory writable ⇒ P (rename the file). Integration of roles. The main asset of role-based policies resides in the opportunity that they offer to specify constraints on abstract elements (the roles) instead of enumerating several analogous constraints for all the concerned subjects [19]. Formula (4) presented previously shows that the possibility to use sets directly in a formula in LO allows to define and use easily user groups. To support roles, we enumerate not only the various roles, users and permissions, but also the desired pairs (role, user), and use modified authorization rules that constrain users permissions according to user roles. For example, we may modify (4) to : P(Install × Agent × Software) ⇒ Administrator × System administrator. In this approach, roles appear as intermediate abstract entities between the permissions in the system and the actual users. This extra level improves the flexibility of the specification, as roles are defined independently of the elements they relate. This flexibility is achieved at the cost of an extra specification work to associate users to roles. As shown in [17], several variants of this method can be used to introduce roles in a deontic logic specification. Security objectives. Finally, we use the basic elements defined previously to describe the security objectives of the information system. If a regulation exists in the organization, it will be transposed in the security policy specification. The deontic logic language allows an easy definition of the security objectives, through the use of the obligation, permission and interdiction operators that allows a natural wording of the desired security properties. Therefore, a security objective is usually represented by one modal formula, such as Op or F (p ∧ ¬q). 4.2
Exploiting the Specification
An asset of a formal language resides of course in the availability of formal calculus methods, useful to reason about the security policy. Several directions may be followed to address this issue. See [8,6] for a general presentation of modal logic proof theory, and [4] for an application to the verification of security properties. Provided that this calculus exists, several verifications can be performed on the specification corresponding to the various inconsistency types identified in Sect. 2.3. With respect to validation problems, the incremental process presented in Sect. 2.2 exhibits new advantages. Non-modal formulas, such as p, p∨q, p∧q ⇒ r, corresponds to the description of the system components and functioning. Such formulas must form a consistent basis for the security policy. Therefore, validation of all the non-modal formulas should be done separately, prior to further verification of the security-related formulas. As these formulas only form a partial description of the system and are propositional logic formulas, their number is limited and we do not expect their validation to be prohibitive. When inconsistency comes from the combination of two security objectives (this can be checked directly by verifying the conjunction of the two corresponding modal formulas in the chosen modal system), a choice must be made. This
A Flexible Method for Information System Security Policy Specification
77
situation may be difficult to manage if it is attempted to verify the whole security policy directly in one step. However, if the specification process follows the incremental method suggested, the verification of the security objectives compatibility should be done separately4 . We can also check that the security policy specification exhibit some properties, such as the fact that there are no conflicting obligation and interdiction. This is represented by ¬ (Op ∧ Fp), which must be true in the specification. Inconsistency types potentially related to an organization trade-off appears mainly when considering the Kripke’s semantics of the deontic logic security policy specification. An inconsistency appearing between formulas describing security rules, such as p ∧ q ⇒ Fr or p ⇒ Pq, means that the semantic model M of the specification contains a world w (corresponding to a system state) in which M several permissions or obligations conflict (e.g. both |=M w Pp and |=w Fp hold for some p and w). Similarly, an inconsistency appearing after the addition of a security objective to the specification shows that there exists worlds in the model where the modal property corresponding to the objective is not true. In each case, a model M invalidating the specification identifies the various intermediary states leading to an insecure state as well as the possible transitions between them, as it provides the truth values V (w, p) of the atomic propositions in each world, as well as the relation R between the worlds. The verification process may exhibit a counter-model M invalidating the specification. In this case, the counter-model may be used to identify the various rules (either functional rules or security rules) involved, as well as the conditions on the state of the system that allow this failure. With this information, modifications may be proposed. If the functioning of the system never allows such situations (e.g. one person simultaneously associated to all the roles in the system), additional formulas may be incorporated to the security policy to rule out such states. If a succession of applications of security rules allows to defeat the proposed objective, either the rules may be updated or the objective revised. Finally, if the behavior of the system allows to bypass several security rules and defeat the security objective, modifications of the operation of the system may be proposed in order to improve its security. In each case, the trade off between security issues and operational issues may be difficult to manage, but the information extracted from the counter-model M helps the system administrator to determine the appropriate solution. The interest of the information that can be extracted from such a counter-model may favor the use of a semantics-based deduction method for the verification of a deontic logic security policy specification, such as those presented in [7,2]. 4
It should be noted that, with the language presented in Sect. 3, it is also possible that global inconsistencies, similar to those corresponding to conflicting objectives, result from a combination of propositional formulas (describing the functioning) and formulas describing security rules. This specific case corresponds to the fact that in a specific system state some security rules lead to a contradiction, and, simultaneously, the functioning of the system always leads to this state. This case should be treated like conflicting objectives and be eliminated.
78
Rodolphe Ortalo
We see that the specification, even when inconsistent, can be exploited usefully. In fact, we argue that the ability to analyze precisely and exhaustively the various possible evolutions of the state of the system that contradict a security objective is very important in the context of an organization information system. In this context, the verification of the whole specification is very likely to fail due to contradictory objectives, or incomplete or inadequate security rules. Furthermore, it may be acceptable to tolerate inconsistencies, if all the evolutions of the information system that may lead to an insecure state are very improbable or very difficult to exploit. This assessment can rely, for example, on an evaluation method [18]. However, such situations cannot be understood if a precise description of the security needs does not provide the information needed to qualify the insecure states. 4.3
Description of Vulnerabilities
The approach finally offers the opportunity to include in the security specification additional components potentially leading to inconsistencies, representing the dp = Pp (6) vulnerabilities of the information system. In (6), we dep ∧ dq ⇒ ✸dr (7) fine a right dp with respect to a proposition p as the p ∧ Pq ⇒ ✸Pr (8) permission of p. We propose to define a vulnerability as the fact that, given a right and a specific state of the system, it is possible to obtain a new right (possibly not legitimate). We describe this as the property that “if a non-modal (propositional) property p and a right dq are true in a world w, then there exist another world w in relation with w such that another right dr is true in w .” Therefore, we represent a vulnerability by a rule like (7). Given the definition of a right, a vulnerability is of the form of (8). This definition allows the description of general vulnerabilities and, on the contrary of [18], makes a clear distinction between the normative aspects of the specification (captured by the deontic operator P) and the opportunities offered to a potential attacker (captured by the ontic operator ✸).
5
Example
To illustrate the security policy specification method, we present an example of its application. The target organization is a bank agency, counting 35 employees, and located in a rural area. The security policy presented in this section was built on the basis of several internal documents describing the organization. The analysis of these documents has been completed by a study in the field which lasted several days. The bank help manual identifies more than a hundred different operations potentially performed in the organization, from foreign currency deposit to account freezing following a legal injunction. Among all these operations, only a limited number are integrated in the following specification. We chose to consider operations involving loan agreement, and bearer bonds. These tasks are interesting as the first relies on a delegation mechanism, and the other is considered security-critical due to anonymity.
A Flexible Method for Information System Security Policy Specification
5.1
79
Formalization of the Organization Description Documents
The initial elements of the specification are presented in Fig. 2. We note the various individuals mentioned in the policy, several actions, and also a set of roles. The set ‘Individuals’ is further split into two subsets : ‘Customers’ is left unspecified and designates any potential customer of the bank, ‘Agents’ enumerates all the employees of the organization (they are identified by numbers from 1 to 35 in this paper). The roles used in this specification are related to the functions existing in the organization. All the official functions mentioned in the organigram are gathered in the set ‘Employment’ (defined later) which is therefore a subset of all roles. However, our practical study allowed us to identify a separate unofficial role, ‘Cashier’, necessary to formulate the security rules of the organization. Astonishingly, it is not explicitly associated to a function in the organigram. Finally, the set ‘Actions’ contains the various tasks included in the specification. Only three general types of action are considered in this example : common operation performance, loan agreement, and physical access. Agents A1 A2 ··· A35
Actions Do × Operations Grant a loan × Loans Access × Physical elements Individuals Customers Agents
Miscellaneous Amount Balance Physical elements Movements Folders Movement types Account Complaint
Roles Employment Cashier
Roles mapping Agents mapping A16 × Cashier
Fig. 2. Security policy specification root
The various organization description elements are given in Fig. 3. The set ‘Employment’ describes the employee functions (corresponding to roles). It corresponds to a position description document available in the bank. This document identifies several abstract positions that gather different positions actually held by the employees (e.g. ‘Leader’ groups positions such as agency leader, office manager, etc.). The roles introduced in Fig. 2 allow to define generic description and security rules independently of the actual agents of the organization. To obtain the actual description we need to specify also a mapping between individuals and roles. The mapping of agents to roles, ‘Agents mapping’, obtained using an organigram of the organization, is defined in Fig. 3. We note in this figure that one agent may be associated to several roles, e.g. A7 is a manager as well as a service agent. A16, the agent playing the role of the cashier is identified directly in Fig. 2. The various operations performed in the organization, presented in the set ‘Operations’, are also organized in several subsets. The description is limited to the higher levels and the detailed list of all operations has not been included. In this figure, we also introduce several secondary elements. Finally, we include in these description elements a simple representation of the data associated to a bank account. In order to describe the functioning of the organization, we also need to include description rules. As an example, we show in Fig. 3 how accounts and money movements can be represented. These rules state for example that an usual operation necessitates physical access to one of the bank terminals, and
80
Rodolphe Ortalo
that such an operation results either in a debit or credit money movement on an account. Of course, this is a simplification of the real functioning. A specific description rule represents the fact that a debit operation, when performed on a negative balance account, follows an exceptional path. The proposition ‘Freezing’ is a label identifying the exception. Here, this label does not mean that the operation cannot be performed.5 Employment Agency chief executives Agency manager Agency manager assistant Leader Market leader Position leader Office manager Agency leader Position manager Service agent Private cust. service agent Farmer service agent Professional service agent Collection assistant Highly skilled clerk All market comm. assistant Clientele adviser Customer clerk Skilled clerk Secretary
Operations Usual operations Loan operations Stocks and shares Bearer bonds Non-banking operations Insurance Flows management Main cash desk ATM Foreign currencies Internal operations Forcing control Litigations Customer incidents Cash desk incidents Agency management Staff management Movement types To Credit To Debit
Position Positive balance Negative balance Accounts ⊆ Customers × Position
A1 × Agency manager A2 × Agency manager assistant A3 × Secretary A4 × Secretary A5 × Agency leader A5 × Private cust. service agent A6 × Agency leader A6 × Professional service agent A6 × Farmer service agent A7 × Position manager A7 × Professional service agent A8 × All markets comm. assistant A9 × Professional service agent A10 × Professional service agent
A11 A12 A13 A14 A15 A16 A17 A18 A19 A20 A21 A22 A23 A24
× × × × × × × × × × × × × ×
Folders Loans folders Private cust. loans Consumer loans Professional loans Farmer loans Agency manager delegation auth. for A5 auth. for A7 ··· Amounts < $250, 000 ··· < $20, 000 < $8, 000 Physical elements Main safe Terminal ATM
Movements ⊆ Movement types × Account × Amount Agents mapping Professional service agent Farmer service agent Farmer service agent Farmer service agent Skilled clerk Skilled clerk Skilled clerk Skilled clerk Private cust. service agent Private cust. service agent Clientele adviser Skilled clerk Clientele adviser Clientele adviser
A25 A26 A27 A28 A29 A29 A30 A31 A32 A33 A34 A35
× × × × × × × × × × × ×
Skilled clerk Private cust. service Private cust. service Private cust. service Private cust. service Position manager Private cust. service Private cust. service Private cust. service Private cust. service Private cust. service Clientele adviser
agent agent agent agent agent agent agent agent agent
Description rules Agents × Actions ⇒ Actions Agents × (Do × Operations) ⇐ Agents × (Access × Terminal) Do × Loan operations ⇒ Loan folders Do × Usual operations ⇒ To Debit × Account × Amount ∨ To Credit × Account × Amount Customers × Negative balance ⇒ Freezing × (To Debit × Account × Amount) Grant a loan × Loans ⇒ To Credit × Account × Amount
Fig. 3. Security policy description elements 5 Debit operations on a negative balance account can be eliminated by a formula such as : Customers × Negative balance ⇒ ¬ (To Debit × Account × Amount). We do not include such a rule in our specification because it does not correspond to the actual functioning of the bank. Agents treat such money movements with specific care, but they can perform them.
A Flexible Method for Information System Security Policy Specification
5.2
81
Description of the Security Needs Security rules P(Agents × (Do × Usual operations)) ⇒ (Agents × Employment) P(Agents × (Do × Loan operations)) ⇒ (Agents × Service agent) P(Agents × (Access × Main safe)) ⇒ (Agents × Cashier) ∨ (Agents × Agency manager) Agents × Actions ∧ Freezing × Actions ⇒ O(Forcing × Agents × Actions) P(Agency manager delegation) Delegation rule Delegation rule Agency manager delegation ∧ (Agents × Employment) ∧ Loan authorizations ⇒ P( P(Agents × (Grant a loan × Loan)) ) Loan authorizations Professional market Service agent × (Professional loans× < $70, 000) Leader × (Professional loans× < $90, 000) Agency chief executives × (Professional loans× < $110, 000) Farmer market Service agent × (Farmer loans× < $70, 000) Leader × (Farmer loans× < $90, 000) Agency chief executives × (Farmer loans× < $110, 000) Private cust. market Service agent × (Private cust. loans× < $40, 000) Leader × (Private cust. loans× < $90, 000) Agency chief executives × (Private cust. loans× < $140, 000) Consumer projects Service agent × (Consumer loans× < $8, 000) Leader × (Consumer loans× < $18, 000) Agency chief executives × (Consumer loans× < $28, 000) Company market Leader × (Consumer loans× < $120, 000) Agency chief executives × (Consumer loans× < $240, 000) Security objectives F(Customers × (Do × Usual operations)) F(Agent × Grant a loan × Loans ∧ Agents × Employment ∧¬Loan authorizations ∧ Loans × ¬( < $100, 000)) F(Customers × Complaint)
Fig. 4. Security requirements description
The rules describing the security needs of the organization are shown in Fig. 4. Among these rules, we focus on a specific aspect involving delegation of privileges : loan agreement delegation. Chief executives are allowed, in this organization, to delegate some of their privileges to other employees, in the limit fixed by several conditions. The rule itself is shown in the table entitled ‘Delegation rule’, and the second table ‘Loan authorizations’ enumerates the loans categories and maximal amounts that may restrict the delegation rights of the agency manager. Note that the abstract positions identified formerly in Fig. 3 (e.g. ‘Leader’) are used to define the various authorizations, illustrating the interest to describe hierarchically the set ‘Employment’. The mechanism of loan agreement delegation that exists in this bank is a typical example of a situation in which a deontic logic specification language may be most useful. The general bank rules state that an agency manager can delegate some of his privileges (with respect to loan decisions) to other employees. These rules define several limits that the agency manager must respect in his delegation decisions, but nothing more. Therefore, the manager defines the actual permissions of the agents. We represent this functioning in the specification using two nested P operators meaning that permission is granted (to the agency manager) to grant permission to one of his
82
Rodolphe Ortalo
agents to accept a loan. However, even if it is authorized, the manager may not delegate full permissions to an agent, e.g. if the latter does not need them as he works on a specific market. Finally, Fig. 4 presents three simple security objectives. The first one is related to customers, but the other two apply to the agents of the organization. The second one aims at enforcing the delegation authorizations with respect to loan agreement, and the last one at protecting customers from abuses perpetrated by employees. 5.3 Integration of Vulnerabilities As an example, we consider now two vulnerabilities of the bank agency. These vulnerabilities are presented in Fig. 5. The first one is related to trust relationships between the various agents of the organization. We also consider that a customer may trust the agent in charge of his business. Such trust may allow a malicious agent of the organization to misappropriate funds of this customer. However, in our case, we think that this vulnerability arise only for specific operations, more precisely anonymous stocks and shares operations. Vulnerabilities Agents × Trusts × Other agent ∧ P (Agents × Actions) ⇒ ✸P (Other agent × Actions) Customers × Trusts × Agents ∧ (Agents × Bearer bonds) ⇒ ✸ (¬ (Customers × Complaint))
Fig. 5. Vulnerabilities
The first vulnerability shown in Fig. 5 conflicts with the second security objective shown in Fig. 4. A study in the field allowed us to check in the real organization the various conditions under which a given vulnerability may appear (i.e. to obtain all the true propositions of the form Agents × Trusts × Other agent). Given this information, we identified all the possible scenarios allowing an agent of the organization to defeat the second or third security objective. We observed that trust relationships were numerous in this organization between the employees (174 trust relationships are very strong), and several agents indeed could defeat the loan agreement authorization limits. As it is surely impossible (and probably ill-fated) to regulate the everyday life and the trust relationships existing among the agents, these results do not allow one to influence directly the operation of the organization. However, comparison of these results with the situation of each agent (with respect to his seniority, position or skills) could reveal anomalies in privileges distribution in the organization. In our case, no abnormal tendency arose from the results analysis. In the specification considered in this example, the second vulnerability conflicts directly with the third security objective. We studied an alternate functioning of the target organization involving validation of all bearer bonds operations by one specific employee before completion. In this case, only a malicious agent who abuses one of his customer’s trust and the agent in charge of the validation is able to infringe the objective. This means that, given the revised functioning, both vulnerabilities shown in Fig. 5 may conflict with the third security objective of the organization. However, detailed analysis of the possible insecure states shows that they are much less numerous given the revised functioning than with the original one. Therefore, such functioning appears acceptable.
A Flexible Method for Information System Security Policy Specification
6
83
Conclusion
The information systems addressed in our study are commonly found in organizations such as banks, industrial companies, etc. Our approach focuses on organizations where security is not the only requirement even though security issues are important. For these organizations, the definition of a security policy necessitates a pragmatic approach, taking into account the functioning of the target information system. Most of the classical security policies define security properties expected of the system independently of a description of the system itself. But such policies also impose strong constraints on the operation of the target system or organization, which may not be acceptable in the context of a conventional information system. In our case, on the contrary, the security policy definition integrates a partial description of the information system. This allows to reuse available information, and to adapt the definition of the security objectives and rules to the target organization structure and behavior. In practice, security rules and objectives may be more detailed and fit better the actual operation of the organization. Based on a formal language, deontic logic, the method offers the opportunity to perform the verification of the policy. But we identified several types of inconsistency, and some of them correspond to admissible trade-off between security and other requirements of the organization, such as flexibility or efficiency. In these cases, the policy fails to be consistent. However, the verification of the security policy should provide a mean to identify exhaustively all the potential insecure states of the information system and could be a valuable help to the security administrator who manages such trade-off. Furthermore, a suitable description of the information system vulnerabilities can be proposed using the proposed specification language. This description fruitfully complements the security policy as the formalism offers the opportunity to identify precisely the impact of the vulnerabilities considered on the security of the organization, using verification mechanisms similar to those associated to consistency checking. This information will facilitate the task of the administrator to evaluate the gravity of the organization vulnerabilities, and to propose modifications of the functioning or of security rules in order to improve security.
Acknowledgments The author is grateful to the Cr´edit Agricole, and more especially to the CRCAM Quercy-Rouergue for their collaboration, as well as to all of the staff of the agency of Villefranche de Rouergue for their pleasant welcome. This work has been partially supported by UAP Assurances, and European ESPRIT Project 20072 “Design for Validation” (DeVa). Finally, I would like to thank Claudia Almeida, Marie Borrel, Fr´ed´eric Cuppens, Yves Deswarte and the anonymous referees for their help on this work.
84
Rodolphe Ortalo
References 1. D. E. Bell, L. J. LaPadula, Secure Computer Systems: Unified Exposition and Multics Interpretation, The MITRE Corporation, Report ESD-TR-73-306, 1975. 68 2. L. Catach, “TABLEAUX: A General Theorem Prover for Modal Logics”, Journal of Automated Reasoning, vol. 7, pp. 489-510, 1991. 77 3. B. F. Chellas, Modal Logic: An Introduction, 295 p., ISBN 0-521-29515-7, Cambridge University Press, 1980. 71, 72 4. L. Cholvy, F. Cuppens, “Analyzing Consistency of Security Policies”, in IEEE Symposium on Security and Privacy, Oakland, California, May 4-7, pp. 103-112, ISBN 0-8186-7828-3, IEEE Computer Society Press, 1997. 68, 71, 76 5. F. Cuppens, C. Saurel, “Specifying a Security Policy: A Case Study”, in 9th IEEE Computer Security Foundations Workshop, Kenmare, Ireland, June 10-12, pp. 123134, ISBN 0-8186-7522-5, IEEE Computer Society Press, 1996. 70 6. L. Fari˜ nas del Cerro, A. Herzig, “Modal Deduction with Applications in Epistemic and Temporal Logic”, in Handbook of Logic in Artificial Intelligence and Logic Programming, Epistemic and Temporal Reasoning (D. M. Gabbay, C. J. Hogger, J. A. Robinson, Eds.), vol. 4/5, pp. 499-594, ISBN 0-19-853791-3, Oxford Science Publications, 1995. 76 7. M. Fitting, “First-Order Modal Tableaux”, Journal of Automated Reasoning, vol. 4, no. 2, pp. 191-213, 1988. 77 8. M. Fitting, “Basic Modal Logic”, in Handbook of Logic in Artificial Intelligence and Logic Programming, Logical Foundations (D. M. Gabbay, C. J. Hogger, J. A. Robinson, Eds.), vol. 1/5, pp. 365-448, ISBN 0-19-853745-X, Oxford Science Publications, 1993. 76 9. J. Glasgow, G. McEwen, P. Panangaden, “A Logic for Reasoning About Security”, in Computer Security Foundations Workshop, Franconia, pp. 2-13, IEEE Computer Society Press, 1990. 68, 71 10. ITSEC, Information Technology Security Evaluation Criteria, v1.2, 163 p., ISBN 92-826-3004-8, Office for Official Publications of the European Communities, Luxembourg, 1991. 68, 71, 75 11. ITSEM, Information Technology Security Evaluation Manual , v1.0, 262 p., ISBN 92-826-7087-2, Office for Official Publications of the European Communities, Luxembourg, 1993. 71 12. A. J. I. Jones, M. Sergot, “Formal Specification of Security Requirements using the Theory of Normative Positions”, in Second European Symposium On Research In Computer Security (ESORICS 92), (Y. Deswarte, G. Eizenberg, J.-J. Quisquater, Eds.), Toulouse, France, November 23-25, LNCS, 648, pp. 103-121, ISBN 3-540-56246-X & 0-387-56246-X, Springer-Verlag, 1992. 68, 71 13. S. A. Kripke, “Semantical Considerations in Modal Logic”, Acta Philosophica Fennica, vol. 16, pp. 83-94, 1963. 72 14. G. Kuper, “Logic Programming with Sets”, in 6th ACM Conference on Principles of Database Systems (PODS), San Diego, California, USA, March 23-25, pp. 11-20, ISBN 0-89791-223-3, ACM Press, 1987. 73 15. J.-J. C. Meyer, R. J. Wieringa (Eds.), Deontic Logic in Computer Science, 317 p., ISBN 0-471-93743-6, Jon Wiley & Sons, 1993. 71 16. B. A. Myers, R. G. McDaniel, R. C. Miller, A. S. Ferrency, A. Faulring, B. D. Kyle, A. Mickish, A. Klimovitski, P. Doane, “The Amulet Environment: New Models for Effective User Interface Software Development”, IEEE Transations on Software Engineering, vol. 23, no. 6, pp. 347-365, June, 1997. 74
A Flexible Method for Information System Security Policy Specification
85
17. R. Ortalo, Using Role-Based Abstractions for Security Policy Specification with Deontic Logic, 20 p., LAAS-CNRS, Report 97216, June, 1997. 76 18. R. Ortalo, Y. Deswarte, “Quantitative Evaluation of Information System Security”, in 14th IFIP International Information Security Conference (IFIP/SEC’98), August 31-September 4, Vienna-Budapest, Austria-Hungary, Chapman & Hall, 1998. (to appear) 78 19. R. S. Sandhu, E. J. Coyne, H. L. Feinstein, C. E. Youman, “Role-Based Access Control Models”, IEEE Computer , vol. 29, no. 2, pp. 38-47, February, 1996. 76
A Flexible Method for Information System Security Policy Specification Rodolphe Ortalo ortalo@laas, fr
LAAS-CNRS 7, avenue du Colonel Roche 31077 Toulouse cedex 4 France
A b s t r a c t . This paper presents a method for the specification of the security of information systems. The proposed approach provides a flexible and expressive specification method, corresponding to the specific needs of organizations. First, we outline the overall guidelines of the security policy definition process, and the different consistency issues associated to the description of the security requirements of an organization information system. The specification language used is based on a convenient extension of deontic logic. The formalism and its extensions are defined briefly. To illustrate the use of this formalism, the paper presents how the method applies to the description of the security requirements of a real organization : a medium-size bank agency. Keywords: security policy specification, information systems, deontic logic.
1
Introduction
This paper presents a practical framework for the specification of the security policy of an organization such as an industrial company or a public service. These organizations exhibit security needs, but they also have other requirements that induce several specificities with respect to security management. First, the information system may exist prior to the explicit definition of its security policy. Furthermore, it is important to provide an adaptable and structured methodology corresponding to the need for abstraction and progressive refinement in the security policy specification process. Finally, the information system may be imperfect, in the sense that the desired security properties may differ from the actual observed properties. From the security point of view, these contradictions appear intolerable, but they often result from a trade-off between the main requirements of the organization and the enforcement of security measures. In fact, the study of practical regulations often reveals discrepancies between the actual situation and the desired behavior, and the security policy would probably better cope with them rather t h a n ignore them. Consequently, in the context of an organization, most of the classical security models are often unusable. The problem is that they impose constraints on the design and the behavior of the system that may not be acceptable in a situation where security is not a primary requirement. For example, the mandatory rules
68
Rodolphe Ortalo
of a multi-level security policy [1] imply to design the information system accordingly, and assume permanent application of the rules. Moreover, in the case where the system already exists, the security policy could not be applied until it has been re-designed. To face such situations, the specification of a security policy must provide more flexibility and expressiveness. In this paper, we use an extended deontic logic language to specify security policies. Several authors already outlined the main features of deontic logic for specifying security properties [9, 12, 4], and this formalism seems to be more adequate for the specification of the security policy of an organization than usual security models. The specification method we propose is based on an original syntactical extension of the standard deontic logic language that allows a structured and convenient representation of the elements of the policy. Given the imperfections of real information system policies, several types of security policy inconsistencies may be encountered, corresponding to the elements of the security policy that conflict with each other. By detailing these cases, we identify some inconsistencies corresponding to a trade-off between functionality and security in the organization, that may be tolerated in the specification. Furthermore, we argue that the formalism also offers the opportunity to complement the security policy with a description of the vulnerabilities of the information system. However, it is highly desirable to completely describe the insecure states corresponding to inconsistencies in order to have a chance to trust the system. The structure of the paper is the following: Section 2 presents the various elements of an organization security policy and the potential conflicts that it may contain. Section 3 provides the definition of the specification language used in this paper. Section 4 details the application of the method to an information system. Section 5 presents a practical application example, addressing the security needs of a bank agency. Finally, we conclude in Section 6. 2
Presentation
2.1
of
the
Approach
S e c u r i t y Policy C o m p o n e n t s
The security policy of the system corresponds to the description of its security needs. According to [10, Section 2.9], the s e c u r i t y policy is "the set o/laws, ~ules and practices that regulate how sensitive in/ormation and other resources are managed, protected and distributed within a specific system". In this paper, we consider that a security policy defines : -
-
the s e c u r i t y objectives, i.e. the confidentiality, integrity and availability properties expected of the system ; and the s e c u r i t y rules which are imposed on the mechanisms which can modify the security state of the system, in order to guarantee the security properties.
The components of the security policy refer to specific entities of the system (e.g. specific subjects or objects, available operations, or an organization chart).
A Flexible Method for Information System Security Policy Specification
69
First, these description elements are added to the policy. In an organization for instance, the names of the various individuals are mentioned. Additional description rules may also be added to the specification in order to reflect the functional behavior of the system when this behavior may impose constraints on the security-related aspects of the system. In an organization, such functional rules represent the various information flows existing in the organization and the way they interact with each other. The description of these operational rules brings information that help the security administrator to adapt the security policy to the actual operation of the organization. The description of the system operation is limited to the basic elements needed to formulate the security mechanisms and objectives, and to the operational rules that describe the portions of the information system relevant to security. Therefore, this description should remain minimal, and additional elements or rules should be introduced only when needed. Second, the security policy contains the security rules to be enforced by the system. These rules reflect the way in which the security state of the system evolves. In an organization, such rules define, for example, who can assign a new position to an individual, or on which condition someone may delegate some of his privileges to another person. Finally, the security policy defines several security objectives that specify the desired security properties for this system. The set of all these properties defines the states of the system that are acceptable with respect to security. A security state of the system that satisfies all the security objectives of the system security policy is a secure state. An insecure state is a state where these properties are not satisfied. 2.2
Refinement
The first version of the security objectives introduced in the security policy will probably be rather generic. Therefore, it is very probable that these objectives will be too restrictive and easily invalidated by the operation of the organization. To resolve this problem, the introduction of new elements may be necessary to limit the scope of an objective. Similarly, it may be necessary to introduce new description rules, or accurately detail the various steps of a given task in order to focus on a desired security property. Therefore, the description of the organization will undergo several refinement steps, enabling to clarify the security objectives and to improve the whole policy. Furthermore, the verification of the policy can benefit from an incremental approach, as we will see in Sect. 4.2.
2.3
Security Policy Consistency
The security policy is c o n s i s t e n t if, starting from a secure state one cannot reach an insecure state without violating the security rules. Given the different security policy components, we can distinguish several types of potential inconsistency : - First, the description of the system may be inconsistent due to conflicting functional rules. However, this is not a security-related specification problem.
70
Rodolphe Ortalo
- Several security objectives may be contradictive. In this case, no secure state exists and it is impossible to enforce the policy, therefore the security objectives should be updated (e.g. by reducing their scope, or adding priorities). - Some of the security rules of the specification may contradict each other. For example, an individual may be able to reach a situation in which an obligation and an interdiction hold simultaneously with respect to the same action (e.g. due to conflicting roles). The modification of the security mechanisms may prevent such conflicts. However, it is also possible that the behavior of the system eliminates the situations in which these conflicts can occur. If so, description rules could be added to the system description to describe the functioning more accurately. The verification of each security objective, starting with the system security rules, may reveal an inconsistency. In this case, security rules fail to enforce the objectives. Such inconsistency means that either security objectives should be revised, or it is needed to introduce new security rules to enforce the expected properties. - Finally, it is possible that the functional rules conflict with the security objectives. This case identifies weaknesses in the organization operation showing that it is possible to bypass the security rules. To face this problem, two directions are possible: it is possible to modify the organization operation (and therefore decide to adapt the organization to the envisaged security mechanisms), or to consider new security mechanisms which are compatible with the actual operation (i.e. adapt the security policy to the information system). -
We can identify the mandatory consistency requirements of a security policy specification. First, it seems necessary that the consistency of the system description be obtained. If the description of the system functioning is incorrect, adding security requirements to the policy is a nonsense. Second, if several security objectives are contradictive, i.e. if the enforcement of one objective implies the failure of another, the information system is always in an insecure state. Therefore, consistency of the set of all security objectives is necessary: if two security objectives contradict, a choice must be made. Other inconsistency types reflect problems such as incomplete security mechanisms that fail to enforce the security objectives, over restrictive security rules that may lead to opposed permissions and interdictions in specific situations, or dangerous functioning of the system that may allow to bypass the security mechanisms. In practice, some of the security policies defined in organizations exhibit such inconsistencies [5]. Furthermore, a situation in which such problems occur may be acceptable if it corresponds to a legitimate trade-off in the organization. Exhaustive description of the insecure states is necessary to trust the system, but security issues may not need to be totally reconsidered. The purpose of the introduction of a description of the system behavior in our approach to security policy specification is precisely to allow such precise insecure states identification for some classes of inconsistent policies. Of course, improvements may also be proposed, but the corresponding mechanisms may not be possible
A Flexible Method for Information System Security Policy Specification
71
to be enforced in the organization if they perturb its normal operation, and we think that the security policy specification should take this fact into account and cope with some inconsistencies. This point of view can be extended. We can envisage to complement the system security policy by a description of the potential vulnerabilities of the system. Vulnerabilities may be due to intrinsic weaknesses of the used security mechanisms (such as cryptographic keys length), or to flaws identified after the system has started its operational life [10, Para. 3.25-28], [11, Sect. 6.C]. Such vulnerabilities compromise theoretically the security of the system, but may be very difficult or very long to exploit. The identification of the insecure states due to vulnerabilities allows to assess their impact on the security objectives. This assessment should help the information system security administrator to propose relevant changes given the known vulnerabilities of the system. 3
Specification
Language
In order to handle such cases, we need a security policy specification formalism flexible enough to separate the functional description of the system and the security needs description, and which furnishes methods for exploiting inconsistent specifications due to the integration of vulnerabilities in the description. Modal logic, or more precisely deontic logic, is a formal framework that seems to be expressive and adaptable enough to specify security policies of organizations [9, 12, 4]. Deontic logic is a branch of modal logic to reason about normative versus non-normative behavior by means of modal operators such as ought, permitted and forbidden [15, Preface]. Furthermore, by using these special operators, deontic logic provides a mean to specify that some behaviors of the system are non-normative (illegal) but nevertheless possible. We present briefly in this section the deontic logic language that we selected. A general presentation of modal logic can be found in [3].
3.1
Deontic Logic Language
Like any modal logic, deontic logic is an extension of usual propositional logic by one or several deontic operators. More precisely, if (I) is a set of atomic propositions a, b, c, ..., if --, V, A, ~ and r are the usual boolean operators, and if O, P and F are the three modal operators of deontic logic, the deontic language denoted Lo ((I,), or simply Lo when it raises no ambiguity, is the set of formulas (or sentences) built with the following rules : - i f p E (I), p is a formula; - if p and q are formulas, -,p, p V q, p A q, p :=~ q, and p r q are formulas ; - if p is a formula, Op, P p and Fp are formulas. A modal formula is a formula of L o which contains at least one operator O, P or F. A non-modal formula is a formula which contains none. In natural language, formulas Op, P p and Fp are read, respectively, "It is
72
Rodolphe Ortalo
obligatory that p", "It is permitted that p", and "It is forbidden that p". Therefore, operators O, Fp = O~p (1) P and F denote respectively the notions of obliP p = -,O-~p (2) gation, permission and interdiction. Conforming Op ~ Pp D to the definitions of necessity (operator O) and possibility (operator <~) in classical modal logic, O (p =~ q) =~ COp =~ Oq) K one can state relations between these various opP-PRN Op erators. First, F can be defined from O by (1). P is the operator dual of O. Therefore, one can also consider the relation (2) between operators O and P, similar to the relation between necessity and possibility in classical modal logic. Finally, standard deontic logic [3, Chap. 5], also contains axiom D which relates obligation and permission. A modal logic system is an extension of propositional logic, therefore it contains all the axioms and inference rules of propositional logic. Thus, Lo contains all tautologies and is closed under modus ponens. We consider in this work a normal modal logic system, i.e. a system that contains axiom K and the inference rule RN ("Rule of Necessitation"). In this paper, we want to use deontic logic for the practical representation of security properties. Even though it may have a direct impact on our work, studying deontic logic in itself, or as a formal language for reasoning about legal issues and normative systems (or security), is out of the scope of this presentation. Therefore, we use standard deontic logic, i.e. the KD system of classical modal logic. Especially we accept (2) as a suitable definition of permission. However, these simplifications are done in order to clarify the security policy specification method, and they do not prevent further extensions of the logical language used, such as suggested in [3, Chap. 10]. Finally, we recall the semantics associated to a normal modal logic language like Lo. It is a Kripke's semantics, or possible world semantics [13]. A Kripke model M for a modal logic system is defined by a triple (W, R, V) where W is set of possible worlds w, R is a binary relation over W called the accessibility relation, and V : W x ~ --+ {true, false} is a function furnishing in each world w E W the truth value V (w,p) of the atomic proposition p. The fact that proposition p is true in a world w of model M is noted ~M p. This truth value is defined in the following way 1, which extends usual propositional calculus: _ _ _
if p E @, ~M p if and only if V (w, p) -- true ~M _,p if and only if we do not have ~M p ~ M ( p A q ) if and only if ~M p or ~M q ~M Op if and only if Vw' E M / w R w ' , ~wM p
The first three statements conform to the classical truth value definition of a formula in propositional logic. From the last definition, we see that in the Kripke's semantics the formula "It is obligatory that p" is true in a world w if and only if p is true in every world w ~ which w is in relation with. When w is 1Using -, and V as the fundamental boolean operators.
A Flexible Method for Information System Security Policy Specification
73
in relation with w ~, we say that w ~ is directly accessible from w, or equivalently that w ~is a possible world for w . The definition of the truth value of Pp is easily deduced from the one of Op using (2) : _ ~M p p if and only if 3w' E W / w R w ' , ~ wM p
3.2
Language Extensions
The language Lo, or Lo (~), is based on a set of atomic propositions (b. In order to ease the specification task, it is desirable to introduce constructs allowing a structured and hierarchical description of the various formulas of Lo. Therefore, we propose to use a syntactical extension of this language, denoted ~oLet 11 be a set of sets A, B, C , . . . , ordered by the partial order relation of inclusion _. The language 1:o (II), or simply s when it raises no ambiguity, is syntactically identical to the language Lo defined in Sect. 3.1. This extension is purely syntactical. Its interest resides in the possibility of structuring the atomic propositions of the language through a hierarchical description of a set by its subsets. In order to ease the construction of the various sets E elements of II, we allow the use of the operator x (set product) to define the elements of II, with the natural extension of the relation of inclusion.2 The deontic language Lo (~II) associated to the extended language ~ o (II) is built considering the set ~n defined as ~n = {E E I I / (rE ~ E H, E' g: E)}. Therefore, ~l~ is the set of all the terminal sets of H, i.e. the sets that do not contain any other sets and form the basis of the hierarchy defined by relation C_. A sentence f of ~ o (H) is associated to a set Zf of formulas of Lo (r defined recursively by (3), ~ f = U ~fl~_~, (3) where f I*~-v is the formula built by substituting E'CE for all the occurrences of symbol x the symbol y in formula.f. Therefore, a formula f of s (H) is a short notation for several formulas of Lo (~n). These formulas are those obtained by replacing, in f, every occurrence of a non-terminal set of II by its content until we obtain formulas mentioning only elements of ~n. U s i n g Set C o n s t r u c t o r s . Similar notations allowing the direct use of sets within formulas have already been applied to logical languages developed for database systems [14]. These extensions aim at facilitating the specification task. Sets allow for a progressive and hierarchical description of the basic elements of the security policy, and these sets can be used directly in the formulas of the extended language. To illustrate the use of this notation, we consider now several formulas written in s (l-I) and the equivalent set of formulas in Lo (~ii). 3 A simple equation of the extended language L:o, e.g. P (A V B), should be represented by a whole set of formulas in Lo, here: {Vp E A, Vq E B, P (p V q)}. 2E x E' = {(e,e') /e E E,e' E E'}, and E x E' C F x F' r E C_F A E' C_ F'. 3In these examples, sets A, B, 'Administrators', 'Software types' and 'Departments' refer to sets containing atomic propositions only. 'Install' is itself an atomic proposition, i.e. an empty set.
74
Rodolphe Ortalo
The set product operator • of/~o (H) allows the definition of structured sets as elements of the language. This eases the description of various items starting with some of their typical features. Hence, formula (4) states directly a set of permissions associated to software installation tasks. These tasks are described by the action symbol 'Install', the various people allowed to perform the installation gathered in the set of the system administrators, and the various target software. Pieces of software are themselves described through two characteristics: their type (word processor, spreadsheet, etc.), and the organization department that owns them. The description of all these elements in the original language Lo would necessitate the definition of many elements, sharing common typical characteristics. Formula (5) shows such a set. As can be seen, in a conventional notation, set product corresponds to several indices.
P (Install • Administrators
• (Software types • Departments))
{P (Installa,software,.d) }aEAdministrators,tESoftware types,dEDepartments
(4) (5)
G r a p h i c a l R e p r e s e n t a t i o n . These extensions also make the specification language suitable for a graphical presentation that can be manipulated with classical user interaction techniques (drag and drop, etc.). This graphical representation represents the sets hierarchy leading to the definition of the atoms of the language using hierarchical lists. The operators of the language are also represented using graphical artifacts called slot-boxes. The purpose of this representation is to provide a simple and convenient method to define and manipulate the security policy specification elements. Figure 1 presents an example of this representation for defining the content of the sets mentioned in (4). Departments[ Management[ Production Sales
Software types Word processors I$TEx, ... Compilers Spreadsheets
[Administrators [ A1 [A2 A3
Fig. 1. Example of sets graphical representation This graphical approach has been implemented within a tool enabling to define and modify easily a security policy. This editor is based on the Amulet graphical library [16] which was very convenient for this realization. In the remaining of this paper, we will adopt this graphical representation for sets, but for space reasons, we will use a conventional linear notation for formulas involving operators. 4
Application
to Information
Systems
Security needs justifying the elaboration of a security policy appear in many organizations. For example, we can mention banks (transactions integrity), telecommunication companies (system availability), or medical and public-health services (legal confidentiality obligations). However, all these organizations also have other priorities distinct from security. In this section, we present how the
A Flexible Method for Information System Security Policy Specification
75
deontic logic language presented previously can be applied to the specification of the security policies of such organizations, and the various possible situations associated to the specification verification. 4.1
Guidelines
In the context of an organization, a security administrator would be in charge of the security policy definition. While this administrator would probably have a good knowledge of the target organization, it is probable that he would not be familiar with a modal logic language. Therefore, we detail several steps of this formalization process. R e u s e of available i n f o r m a t i o n . Most of the description elements of the specification appear in documents describing the organization and its operation. These elements are related to physical entities, individuals, or a functional decomposition of the organization. They refer also to specific attributes (such as confidentiality levels). Many of these items appear in description documents already available in the organization, such as an organization chart, some tasks definition documentation, handbooks, norms if applicable, etc. With respect to security issues, other documents help the system administrator. For instance, organization charts also mention the individuals situation in the hierarchy and their privileges. The policy can also be based on existing regulations, documents supporting the evaluation of the organization according to a set of security criteria [10], a previous risk analysis, laws if applicable to the organization, etc. When the security policy specification target is a system to be designed (and not a system in operation), the security designer should look for equivalent information existing in the design documents. S y s t e m d e s c r i p t i o n . In the deontic logic language presented previously, the description of the organization is performed through the introduction of atomic propositions representing the basic elements, and non-modal formulas (built with the operators of usual propositional logic) representing the aspects of the functioning that are relevant for security. For example, consider the process of internal purchase in an organization: an employee formulates a purchasing demand, which must probably be validated by a manager before going to a specialized department that performs the operation. In order to describe these different steps, first one needs to introduce several atoms such as 'demand', 'manager authorization', 'order' and 'delivery'. Then, one can state the causal dependencies between these steps and the condition for an order to be actually sent by the purchase department. This is represented by logical formulas like: demand A manager authorization ~ order and order ~ delivery. S e c u r i t y rules. The security mechanisms existing in the system are added to the description of its operation. A security mechanism describes the link between the permissions (or interdictions) and the state of the system (or, more precisely, only a subset of the whole state of the system: its security state, made of security attributes). Therefore, it may be represented by a modal formula stating a permission conditioned by some non-modal formula, such as: directory writable =~ P (rename the file).
76
Rodolphe Ortalo
Integration of roles. The main asset of role-based policies resides in the opportunity that they offer to specify constraints on abstract elements (the roles) instead of enumerating several analogous constraints for all the concerned subjects [19]. Formula (4) presented previously shows that the possibility to use sets directly in a formula i n / : o allows to define and use easily user groups. To support roles, we enumerate not only the various roles, users and permissions, but also the desired pairs (role, user), and use modified authorization rules that constrain users permissions according to user roles. For example, we may modify (4) to: P(Install x Agent • Software) =~ Administrator x System administrator. In this approach, roles appear as intermediate abstract entities between the permissions in the system and the actual users. This extra level improves the flexibility of the specification, as roles are defined independently of the elements they relate. This flexibility is achieved at the cost of an extra specification work to associate users to roles. As shown in [17], several variants of this method can be used to introduce roles in a deontic logic specification. S e c u r i t y objectives. Finally, we use the basic elements defined previously to describe the security objectives of the information system. If a regulation exists in the organization, it will be transposed in the security policy specification. The deontic logic language allows an easy definition of the security objectives, through the use of the obligation, permission and interdiction operators that allows a natural wording of the desired security properties. Therefore, a security objective is usually represented by one modal formula, such as Op or F (p A -~q).
4.2
Exploiting the Specification
An asset of a formal language resides of course in the availability of formal calculus methods, useful to reason about the security policy. Several directions may be followed to address this issue. See [8, 6] for a general presentation of modal logic proof theory, and [4] for an application to the verification of security properties. Provided that this calculus exists, several verifications can be performed on the specification corresponding to the various inconsistency types identified in Sect. 2.3. With respect to validation problems, the incremental process presented in Sect. 2.2 exhibits new advantages. Non-modal formulas, such as p, pVq, pAq :*z r, corresponds to the description of the system components and functioning. Such formulas must form a consistent basis for the security policy. Therefore, validation of all the non-modal formulas should be done separately, prior to further verification of the security-related formulas. As these formulas only form a partial description of the system and are propositional logic formulas, their number is limited and we do not expect their validation to be prohibitive. When inconsistency comes from the combination of two security objectives (this can be checked directly by verifying the conjunction of the two corresponding modal formulas in the chosen modal system), a choice must be made. This situation may be difficult to manage if it is attempted to verify the whole security policy directly in one step. However, if the specification process follows the
A Flexible Method for Information System Security Policy Specification
77
incremental method suggested, the verification of the security objectives compatibility should be done separately 4. We can also check that the security policy specification exhibit some properties, such as the fact that there are no conflicting obligation and interdiction. This is represented by -- (Op A Fp), which must be true in the specification. Inconsistency types potentially related to an organization trade-off appears mainly when considering the Kripke's semantics of the deontic logic security policy specification. An inconsistency appearing between formulas describing security rules, such as p A q =~ F r or p =~ Pq, means that the semantic model M of the specification contains a world w (corresponding to a system state) in which several permissions or obligations conflict (e.g. both ~M p p and ~M F p hold for some p and w). Similarly, an inconsistency appearing after the addition of a security objective to the specification shows that there exists worlds in the model where the modal property corresponding to the objective is not true. In each case, a model M invalidating the specification identifies the various intermediary states leading to an insecure state as well as the possible transitions between them, as it provides the t r u t h values V (w, p) of the atomic propositions in each world, as well as the relation R between the worlds. The verification process may exhibit a counter-model M invalidating the specification. In this case, the counter-model may be used to identify the various rules (either functional rules or security rules) involved, as well as the conditions on the state of the system that allow this failure. With this information, modifications may be proposed. If the functioning of the system never allows such situations (e.g. one person simultaneously associated to all the roles in the system), additional formulas may be incorporated to the security policy to rule out such states. If a succession of applications of security rules allows to defeat the proposed objective, either the rules may be updated or the objective revised. Finally, if the behavior of the system allows to bypass several security rules and defeat the security objective, modifications of the operation of the system may be proposed in order to improve its security. In each case, the trade off between security issues and operational issues may be difficult to manage, but the information extracted from the counter-model M helps the system administrator to determine the appropriate solution. The interest of the information that can be extracted from such a counter-model may favor the use of a semantics-based deduction m e t h o d for the verification of a deontic logic security policy specification, such as those presented in [7, 2]. We see t h a t the specification, even when inconsistent, can be exploited usefully. In fact, we argue that the ability to analyze precisely and exhaustively the various possible evolutions of the state of the system that contradict a security 4It should be noted that, with the language presented in Sect. 3, it is also possible that global inconsistencies, similar to those corresponding to conflicting objectives, result from a combination of propositional formulas (describing the functioning) and formulas describing security rules. This specific case corresponds to the fact that in a specific system state some security rules lead to a contradiction, and, simultaneously, the functioning of the system always leads to this state. This case should be treated like conflicting objectives and be eliminated.
78
Rodolphe Ortalo
objective is very important in the context of an organization information system. In this context, the verification of the whole specification is very likely to fail due to contradictory objectives, or incomplete or inadequate security rules. Furthermore, it may be acceptable to tolerate inconsistencies, if all the evolutions of the information system that may lead to an insecure state are very improbable or very difficult to exploit. This assessment can rely, for example, on an evaluation method [18]. However, such situations cannot be understood if a precise description of the security needs does not provide the information needed to qualify the insecure states. 4.3
Description of Vulnerabilities
The approach finally offers the opportunity to include in the security specification additional components podp = Pp (6) tentially leading to inconsistencies, representing the vulnerabilities of the information system. In (6), we dep A dq ::~ Odr (7) fine a right dp with respect to a proposition p as the p A Pq ~ O P r (8) permission of p. We propose to define a vulnerability as the fact that, given a right and a specific state of the system, it is possible to obtain a new right (possibly not legitimate). We describe this as the property that "if a non-modal (propositional) property p and a right dq are true in a world w, then there exist another world w ~ in relation with w such that another right dr is true in wl. '' Therefore, we represent a vulnerability by a rule like (7). Given the definition of a right, a vulnerability is of the form of (8). This definition allows the description of general vulnerabilities and, on the contrary of [18], makes a clear distinction between the normative aspects of the specification (captured by the deontic operator P) and the opportunities offered to a potential attacker (captured by the ontic operator (>).
5
Example
To illustrate the security policy specification method, we present an example of its application. The target organization is a bank agency, counting 35 employees, and located in a rural area. The security policy presented in this section was built on the basis of several internal documents describing the organization. The analysis of these documents has been completed by a study in the field which lasted several days. The bank help manual identifies more than a hundred different operations potentially performed in the organization, from foreign currency deposit to account freezing following a legal injunction. Among all these operations, only a limited number are integrated in the following specification. We chose to consider operations involving loan agreement, and bearer bonds. These tasks are interesting as the first relies on a delegation mechanism, and the other is considered security-critical due to anonymity.
A Flexible Method for Information System Security Policy Specification
5.1
79
Formalization of the Organization Description Documents
The initial elements of the specification are presented in Fig. 2. We note the various individuals mentioned in the policy, several actions, and also a set of roles. The set 'Individuals' is further split into two subsets: 'Customers' is left unspecified and designates any potential customer of the bank, 'Agents' enumerates all the employees of the organization (they are identified by numbers from 1 to 35 in this paper). The roles used in this specification are related to the functions existing in the organization. All the official functions mentioned in the organigram are gathered in the set 'Employment' (defined later) which is therefore a subset of all roles. However, our practical study allowed us to identify a separate unofficial role, 'Cashier', necessary to formulate the security rules of the organization. Astonishingly, it is not explicitly associated to a function in the organigram. Finally, the set 'Actions' contains the various tasks included in the specification. Only three general types of action are considered in this example : common operation performance, loan agreement, and physical access. Actions Do • Operations Grant a loan • Loans Access x Physical elements
A m o u n t Miscellaneous I Balance I Physical elements Movements I IF~ Movement types [Account Complaint [ Roles mapping I
Individuals[
Roles
Customers Agents
Employment Cashier
Agents mapping A16 • Cashier
Fig. 2. Security policy specification root The various organization description elements are given in Fig. 3. The set 'Employment' describes the employee functions (corresponding to roles). It corresponds to a position description document available in the bank. This document identifies several abstract positions that gather different positions actually held by the employees (e.g. 'Leader' groups positions such as agency leader, office manager, etc.). The roles introduced in Fig. 2 allow to define generic description and security rules independently of the actual agents of the organization. To obtain the actual description we need to specify also a mapping between individuals and roles. The mapping of agents to roles, 'Agents mapping', obtained using an organigram of the organization, is defined in Fig. 3. We note in this figure that one agent may be associated to several roles, e.g. A7 is a manager as well as a service agent. A16, the agent playing the role of the cashier is identified directly in Fig. 2. The various operations performed in the organization, presented in the set 'Operations', are also organized in several subsets. The description is limited to the higher levels and the detailed list of all operations has not been included. In this figure, we also introduce several secondary elements. Finally, we include in these description elements a simple representation of the data associated to a bank account. In order to describe the functioning of the organization, we also need to include description rules. As an example, we show in Fig. 3 how accounts and money movements can be represented. These rules state for example that an usual operation necessitates physical access to one of the bank terminals, and
80
Rodolphe Ortalo
t h a t such an operation results either in a debit or credit m o n e y movement on an account. Of course, this is a simplification of the real functioning. A specific description rule represents the fact t h a t a debit operation, when performed on a negative balance account, follows an exceptional path. The proposition 'Freezing' is a label identifying the exception. Here, this label does not mean t h a t the operation cannot be performed. 5 Employment Agency chief executives Agency manager Agency manager assistant Leader Market leader Position leader Office m a n a g e r Agency leader Position manager Service agent P r i v a t e cust. service agent F a r m e r service a g e n t Professional service a g e n t Collection assistant H i g h l y skilled c l e r k All m a r k e t c o m m . a s s i s t a n t Clientele adviser Customer clerk Skilled c l e r k Secretary Position ] Positive balance Negative balance [Accounts C Customers x Position t
A1 x A g e n c y m a n a g e r A2 x A g e n c y m a n a g e r a s s i s t a n t A3 x S e c r e t a r y A4 x S e c r e t a r y A5 x A g e n c y l e a d e r A5 x P r i v a t e cust. service agent A6 x A g e n c y l e a d e r A6 x Professional service agent A6 x F a r m e r service a g e n t A7 x Position manager A 7 x Professional service a g e n t A 8 x All m a r k e t s comm. assistant A 9 x Professional service a g e n t A 1 0 x Professional service agent
All A12 A13 A14 A15 A16 A17 A18 A19 A20 A21 A22 A23 A24
Operations Usual operations Loan operations Stocks and shares Bearer bonds Non-banking operations Insurance Flows management Main cash desk ATM Foreign currencies Internal operations Forcing control Litigations Customer incidents Cash desk incidents Agency management Staff management Movement types] To C r e d i t To D e b i t
Folders Loans folders P r i v a t e c u s t . loans Consumer loans Professional loans Farmer loans Agency manager delegation a u t h . for A 5 a u t h . for A 7 Amounts < $250,000 < $20,000 < $8,000 Physical elements Main safe Terminal ATM
M o v e m e n t s C_ M o v e m e n t t y p e s x A c c o u n t x A m o u n t
I • x x x x x x x x x x x x x
Agents mapping Professional service agent Farmer service agent Farmer service agent Farmer service agent Skilled c l e r k Skilled c l e r k Skilled c l e r k Skilled c l e r k P r i v a t e cust. service agent P r i v a t e cust. service agent Clientele adviser Skilled c l e r k Clientele adviser Clientele adviser Description rules
A25 A26 A27 A28 A29 A29 A30 A31 A32 A33 A34 A35
x • x x • x x x x x x x
Skilled c l e r k P r i v a t e cust. service P r i v a t e cust. service P r i v a t e cust. service P r i v a t e cust. service Position manager P r i v a t e cust. service P r i v a t e cust. service P r i v a t e cust. service P r i v a t e cust. service P r i v a t e cust. service Clientele adviser
agent agent agent agent agent agent agent agent agent
A g e n t s x A c t i o n s ::~ A c t i o n s A g e n t s x (Do x O p e r a t i o n s ) ~= A g e n t s x ( A c c e s s x T e r m i n a l ) Do • L o a n o p e r a t i o n s =~ L o a n f o l d e r s Do • U s u a l o p e r a t i o n s =~ To D e b i t x A c c o u n t x A m o u n t V To C r e d i t x A c c o u n t x A m o u n t C u s t o m e r s x N e g a t i v e b a l a n c e =~ F r e e z i n g x (To D e b i t x A c c o u n t x A m o u n t ) G r a n t a l o a n • L o a n s :=~ To C r e d i t x A c c o u n t x A m o u n t
F i g . 3. Security p o l i c y description e l e m e n t s
~Debit operations on a negative balance account can be eliminated by a formula such as: Customers x Negative balance ~ -~ (To Debit x Account x Amount). We do not include such a rule in our specification because it does not correspond to the actual functioning of the bank. Agents treat such money movements with specific care, but they can perform them.
A Flexible Method for Information System Security Policy Specification 5.2
81
D e s c r i p t i o n of the Security N e e d s Security rules [ P ( A g e n t s x (Do x Usual operations)) =~ (Agents • Employment) P ( A g e n t s • (Do • Loan operations)) ==~ (Agents • Service agent) P ( A g e n t s x (Access x Main safe)) =:~ (Agents x Cashier) V (Agents x A g e n c y manager) Agents • Actions A Freezing • Actions =*- O(Forcing x Agents x Actions) P ( A g e n c y manager delegation) Delegation rule Agency
Delegation rule I manager delegation A (Agents • E m p l o y m e n t ) A Loan authorizations I
=*- P( P(hgents x (Grant a loan x Loan)) ) Loan authorizations
[
Professional market Service agent • (Professional l o a n s • <~ $70,000) Leader x (Professional l o a n s x < $90, 000) A g e n c y chief executives x (Professional loans• <2 $110,000) Farmer market Service agent • (Farmer loans• <: $70,000) Leader x (Farmer l o a n s x < $90,000) A g e n c y chief executives • (Farmer loans• < $110,000) Private cust. market Service agent x (Private cust. l o a n s • < $40,000) Leader x (Private cust. l o a n s x < $90,000) A g e n c y chief executives • (Private cust. loans• < $140,000) Consumer projects Service agent x (Consumer l o a n s x < $8,000) Leader • (Consumer loans• < $18,000) A g e n c y chief executives x (Consumer l o a n s • <~ $28,000) Company market Leader • (Consumer loans)< < $120,000) A g e n c y chief executives x (Consumer loans• < $240,000) objectives • Employment F(Customers x (Do Security • Usual operations)) F ( A g e n t x Grant a loan x Loans A Agents "( < $100,000)) A~Loan authorizations A Loans x F(Customers x Complaint)
Fig. 4. Security requirements description The rules describing the security needs of the organization are shown in Fig. 4. Among these rules, we focus on a specific aspect involving delegation of privileges: loan agreement delegation. Chief executives are allowed, in this organization, to delegate some of their privileges to other employees, in the limit fixed by several conditions. The rule itself is shown in the table entitled 'Delegation rule', and the second table 'Loan authorizations' enumerates the loans categories and maximal amounts that may restrict the delegation rights of the agency manager. Note that the abstract positions identified formerly in Fig. 3 (e.g. 'Leader') are used to define the various authorizations, illustrating the interest to describe hierarchically the set 'Employment'. The mechanism of loan agreement delegation that exists in this bank is a typical example of a situation in which a deontic logic specification language may be most useful. The general bank rules state that an agency manager can delegate some of his privileges (with respect to loan decisions) to other employees. These rules define several limits that the agency manager must respect in his delegation decisions, but nothing more. Therefore, the manager defines the actual permissions of the agents. We represent this functioning in the specification using two nested P operators meaning that permission is granted (to the agency manager) to grant permission to one of his agents to accept a loan. However, even if it is authorized, the manager may not
82
Rodolphe Ortalo
delegate full permissions to an agent, e.g. if the latter does not need them as he works on a specific market. Finally, Fig. 4 presents three simple security objectives. The first one is related to customers, but the other two apply to the agents of the organization. The second one aims at enforcing the delegation authorizations with respect to loan agreement, and the last one at protecting customers from abuses perpetrated by employees.
5.3
Integration o f Vulnerabilities
As an example, we consider now two vulnerabilities of the bank agency. These vulnerabilities are presented in Fig. 5. The first one is related to trust relationships between the various agents of the organization. We also consider that a customer may trust the agent in charge of his business. Such trust may allow a malicious agent of the organization to misappropriate funds of this customer. However, in our case, we think that this vulnerability arise only for specific operations, more precisely anonymous stocks and shares operations. Vulnerabilities I Agents x Trusts x Other agent A P (Agents x Actions) ::~ O1~ (Other agent x Actions) Customers x T r u s t s x Agents A (Agents x Bearer bonds) ::~ O (9 (Customers x Complaint))
Fig. 5. Vulnerabilities The first vulnerability shown in Fig. 5 conflicts with the second security objective shown in Fig. 4. A study in the field allowed us to check in the real organization the various conditions under which a given vulnerability may appear (i.e. to obtain all the true propositions of the form Agents x Trusts x Other agent). Given this information, we identified all the possible scenarios allowing an agent of the organization to defeat the second or third security objective. We observed that trust relationships were numerous in this organization between the employees (174 trust relationships are very strong), and several agents indeed could defeat the loan agreement authorization limits. As it is surely impossible (and probably ill-fated) to regulate the everyday life and the trust relationships existing among the agents, these results do not allow one to influence directly the operation of the organization. However, comparison of these results with the situation of each agent (with respect to his seniority, position or skills) could reveal anomalies in privileges distribution in the organization. In our case, no abnormal tendency arose from the results analysis. In the specification considered in this example, the second vulnerability conflicts directly with the third security objective. We studied an alternate functioning of the target organization involving validation of all bearer bonds operations by one specific employee before completion. In this case, only a malicious agent who abuses one of his customer's trust and the agent in charge of the validation is able to infringe the objective. This means that, given the revised functioning, both vulnerabilities shown in Fig. 5 may conflict with the third security objective of the organization. However, detailed analysis of the possible insecure states shows that they are much less numerous given the revised functioning than with the original one. Therefore, such functioning appears acceptable.
A Flexible Method for Information System Security Policy Specification
6
83
Conclusion
The information systems addressed in our study are commonly found in organizations such as banks, industrial companies, etc. Our approach focuses on organizations where security is not the only requirement even though security issues are important. For these organizations, the definition of a security policy necessitates a pragmatic approach, taking into account the functioning of the target information system. Most of the classical security policies define security properties expected of the system independently of a description of the system itself. But such policies also impose strong constraints on the operation of the target system or organization, which may not be acceptable in the context of a conventional information system. In our case, on the contrary, the security policy definition integrates a partial description of the information system. This allows to reuse available information, and to adapt the definition of the security objectives and rules to the target organization structure and behavior. In practice, security rules and objectives may be more detailed and fit better the actual operation of the organization. Based on a formal language, deontic logic, the method offers the opportunity to perform the verification of the policy. But we identified several types of inconsistency, and some of them correspond to admissible trade-off between security and other requirements of the organization, such as flexibility or efficiency. In these cases, the policy fails to be consistent. However, the verification of the security policy should provide a mean to identify exhaustively all the potential insecure states of the information system and could be a valuable help to the security administrator who manages such trade-off. Furthermore, a suitable description of the information system vulnerabilities can be proposed using the proposed specification language. This description fruitfully complements the security policy as the formalism offers the opportunity to identify precisely the impact of the vulnerabilities considered on the security of the organization, using verification mechanisms similar to those associated to consistency checking. This information will facilitate the task of the administrator to evaluate the gravity of the organization vulnerabilities, and to propose modifications of the functioning or of security rules in order to improve security.
Acknowledgments The author is grateful to the Credit Agricole, and more especially to the CRCAM Quercy-Rouergue for their collaboration, as well as to all of the staff of the agency of Villefranche de Rouergue for their pleasant welcome. This work has been partially supported by UAP Assurances, and European ESPRIT Project 20072 "Design for Validation" (DeVa). Finally, I would like to thank Claudia Almeida, Marie Borrel, Frederic Cuppens, Yves Deswarte and the anonymous referees for their help on this work.
84
Rodolphe Ortalo
References 1. D.E. Bell, L. J. LaPadula, Secure Computer Systems: Unified Exposition and Multics Interpretation, The MITRE Corporation, Report ESD-TR-73-306, 1975. 2. L. Catach, "TABLEAUX: A General Theorem Prover for Modal Logics", Journal of Automated Reasoning, vol. 7, pp. 489-510, 1991. 3. B.F. Chellas, Modal Logic: An Introduction, 295p., ISBN 0-521-29515-7, Cambridge University Press, 1980. 4. L. Cholvy, F. Cuppens, "Analyzing Consistency of Security Policies", in IEEE Symposium on Security and Privacy, Oakland, California, May 4-7, pp. 103-112, ISBN 0-8186-7828-3, IEEE Computer Society Press, 1997. 5. F. Cuppens, C. Sanrel, "Specifying a Security Policy: A Case Study", in 9th IEEE Computer Security Foundations Workshop, Kenmare, Ireland, June 10-12, pp. 123134, ISBN 0-8186-7522-5, IEEE Computer Society Press, 1996. 6. L. Farifias del Cerro, A. Herzig, "Modal Deduction with Applications in Epistemic and Temporal Logic", in Handbook of Logic in Artificial Intelligence and Logic Programming, Epistemic and Temporal Reasoning (D. M. Gabbay, C.J. Hogger, J.A. Robinson, Eds.), vol. 4/5, pp. 499-594, ISBN 0-19-853791-3, Oxford Science Publications, 1995. 7. M. Fitting, "First-Order Modal Tableaux", Journal of Automated Reasoning, vol. 4, no. 2, pp. 191-213, 1988. 8. M. Fitting, "Basic Modal Logic", in Handbook of Logic in Artificial Intelligence and Logic Programming, Logical Foundations (D. M. Gabbay, C.J. Hogger, J.A. Robinson, Eds.), vol. 1/5, pp. 365-448, ISBN 0-19-853745-X, Oxford Science Publications, 1993. 9. J. Glasgow, G. McEwen, P. Panangaden, "A Logic for Reasoning About Security", in Computer Security Foundations Workshop, Franconia, pp. 2-13, IEEE Computer Society Press, 1990. 10. ITSEC, Information Technology Security Evaluation Criteria, vl.2, 163p., ISBN 92-826-3004-8, Office for Official Publications of the European Communities, Luxembourg, 1991. 11. ITSEM, Information Technology Security Evaluation Manual, vl.0, 262p., ISBN 92-826-7087-2, Office for Official Publications of the European Communities, Luxembourg, 1993. 12. A.J.I. Jones, M. Sergot, "Formal Specification of Security Requirements using the Theory of Normative Positions", in Second European Symposium On Research In Computer Security (ESORICS 92), (Y. Deswarte, G. Eizenberg, J.-J. Quisquater, Eds.), Toulouse, France, November 23-25, LNCS, 648, pp. 103-121, ISBN 3-540-56246-X & 0-387-56246-X, Springer-Verlag, 1992. 13. S.A. Kripke, "Semantical Considerations in Modal Logic", Acta Philosophica Fennica, vol. 16, pp. 83-94, 1963. 14. G. Kuper, "Logic Programming with Sets", in 6th ACM Conference on Principles of Database Systems (PODS), San Diego, California, USA, March 23-25, pp. 11-20, ISBN 0-89791-223-3, ACM Press, 1987. 15. J.-J. C. Meyer, R.J. Wieringa (Eds.), Deontic Logic in Computer Science, 317p., ISBN 0-471-93743-6, Jon Wiley & Sons, 1993. 16. B.A. Myers, R. G. McDaniel, R. C. Miller, A. S. Ferrency, A. Faulring, B. D. Kyle, A. Mickish, A. Klimovitski, P. Doane, "The Amulet Environment: New Models for Effective User Interface Software Development", IEEE Transations on Software Engineering, vol. 23, no. 6, pp. 347-365, June, 1997. 17. R. Ortalo, Using Role-Based Abstractions for Security Policy Specification with Deontic Logic, 20p., LAAS-CNRS, Report 97216, June, 1997. 18. R. Ortalo, Y. Deswarte, "Quantitative Evaluation of Information System Security", in 14th IFIP International Information Security Conference (IFIP/SEC'98), August 31-September 4, Vienna-Budapest, Austria-Hungary, Chapman&Hall, 1998. (to appear) 19. R.S. Sandhu, E. J. Coyne, H. L. Feinstein, C. E. Youman, "Role-Based Access Control Models", IEEE Computer, vol. 29, no. 2, pp. 38-47, February, 1996.
On the Security of Some Variants of the RSA Signature Scheme Markus Michels 1, Markus Stadler 1, and Hung-Min Sun 2 1 Ubilab, UBS Bahnhofstr. 45 8021 Zurich, Switzerland {Markus. Michels, Markus. Stadler}@ubs. c o m 2 Department of Information Management ChaoYang University of Technology Wufeng, Taichung County Taiwan 413 hmsun@dscs, csie. nctu. edu. tw
Abstract. We describe adaptive attacks on several variants of the RSA signature scheme by de Jonge and Chaum. Moreover, we show how to break Boyd's scheme with an adaptive, a directed and a known signature attack. The feasibility of the adaptive attack on Boyd's scheme is illustrated by a concrete example. Keywords: cryptanalysis, digital signature schemes, RSA variants.
1
Introduction
Since the invention of the RSA signature scheme [20] in 1978, many variants have been proposed in order to improve the security and the efficiency of the scheme. However, these attempts have not always been successful. For instance, Rabin [19] and Williams [22] suggested schemes whose security can be proved equivalent to factoring the modulus as long as the attacker is assumed to be passive, but which are vulnerable to so-called adaptive attacks (see [11] for a formal description of attacker models on signature schemes). Proposals to increase the efficiency, such as to use the same modulus for all users, have shown to be vulnerable ([8, 12] respectively) as well. Finally, it was suggested to use other group structures, such as elliptic curves or Lucas functions [14, 9, 17], but as was discovered recently, these variants do not provide better security than the basic scheme [1-3]. In 1986 de Jonge and Chaum suggested a generalization of the RSA signature scheme [13] and discussed several of its instances. In particular, they showed that many special cases are vulnerable to adaptive attacks [7] that make use of the multiplicative property of the RSA scheme. They came up with a specific variant they claimed to be resistant against such attacks. Recently, Boyd proposed another special case of this generalized scheme [4] that works in a relatively small subgroup of z~* and is therefore quite efficient.
86
Markus Michels et al.
Our Contribution: In this paper we will first point out that de Jonge-Chaum signature scheme is vulnerable to adaptive attacks as well. We then describe in detail an adaptive attack on Boyd's scheme and give an analysis of its runtime and its success probability. For a concrete realization of the scheme, with RIPEMD-160 [10] used as hash-function, we show a successful attack: if the attacker manages to obtain from a user the signatures on the following six messages (for instance as payment for an electronic service),
"Check "Check "Check "Check "Check "Check
#00000000167658: #00000000347533: #00000003719673: #00000007303360: #00000014393409: #00002271656103:
value value value value value value
$1.00" $1.00" $1.00" $1.00" $1.00" $1.00"
he can also compute a valid signature (with respect to this user's public key) on the message "Check #00002228615476: value $1'000'000.00" . This attack works because the 160 bit hash-values produced by RIPEMD-160 are quite likely to be the product of relatively short factors (in this example less than 32 bits). This attack is then extended to a directed and a knownsignature attack. Finally we outline how these ideas could be used to attack a naive implementation of the RSA blind signature scheme. Organization of the Paper: In Section 2 we review the de Jonge-Chaum scheme and describe our attack. In Section 3, we focus on Boyd's scheme, describe our cryptanalysis and the concrete attack. Section 4 concludes the paper with a discussion of the potential vulnerability of a naive implementation of RSA blind signature scheme.
2 2.1
An Adaptive
Attack
on the de Jonge-Chaum
Scheme
R e v i e w o f t h e de Jonge-Chaum Scheme
In [13], de Jonge and Chaum consider several variations of the basic RSA signature scheme with the goal to avoid adaptive attacks. These variations can be characterized by the verification equation a $1(m'n) = ]2(m,n) (mod n) , where n is the signer's public modulus, m is the message, a is the signature, and fl, f2 : {0, 1}* • ~ --~ 2~ are efficiently computable functions mapping (m, n) to an integer. For example, the choices f l ( m , n ) = e and
f 2 ( m , n ) -- m
On the Security of some Variants of the RSA Signature Scheme
87
for an integer constant e with gcd(e, ~(n)) = 1 result in the standard RSA signature scheme (without hashing). Since the signer knows the factorization of n he can compute the f l (m, n)-th root of f 2 ( m , n ) (if it exists) which is signature a on m. For several choices of f l and f2 de Jonge and C h a u m show t h a t adaptive attacks on the resulting signature scheme are possible and finally come up with a scheme they think is secure in this respect, i.e. the variant with the verification equation a 2m+1 - m (mod n) . T h e y claim explicitly that this scheme is secure for messages of arbitrary length and t h a t messages need not contain redundancy. We will show in the next subsection t h a t under these assumptions multiplicative attacks are possible.
2.2
The Adaptive Attack
An adaptive attack on the above scheme works as follows. Assume the attacker wants to construct a signature on a message m by having the signer sign two other messages m l ( 5 m) and m2 ( 5 m). He constructs m l and m2 as follows. - choose an arbitrary integer u > 0 and compute (2m + 1). (2u + 1) - 1 2 - compute integers k and j satisfying the equation ml
:~
2mml
+ 2kn § 1 = j .
(2m + 1) .
This can be done by first computing -2mml
k -
-
1
2n
ml
---
- 1
2n
(mod 2m + 1)
(if gcd(n, 2m + 1) = 1) and then computing j accordingly. - finally compute m2 :--
mml
+ kn.
The attacker then has the signer to issue signatures al and a2 on the messages m l and m2, respectively. With the values u, k, and j he is now able to construct a signature on m by computing O"
.--
~- u-+ l
"
This is a valid signature on m since o.J.(2mq-1) 2
--
O.~2u+l).(2m+l)
--
o - 2 m + l ___
--
m2
----m
0 "2mml'k2kn§
t~l--2ml+1
--
(7 2 m 2 " b 1
--
ml
(mod n) .
ml
Note t h a t this attack does not work anymore if messages must have redundancy or if the length of messages is limited.
88 2.3
Markus Michels et al. An Attack on the Generalized Scheme
We can adapt the attack to the general scheme with verification equation afl (m,n) _ f2(m, n) (mod n) if fl, f2 satisfy certain conditions, e.g. an efficient function f~ must exist such that f 2 ( S 2 ( m , n ) , n ) = m. Here the attacker constructs m l and m2 as follows. - He finds integers u, ml with u r 0 such that fl (ml, n) : = fl (m, n)u, - computes integers k and j satisfying the equation fl (S2 (f2 (m, n)f2 (ml, n) + kn, n), n) = j . fl (m, n) and - finally computes ms := ] 2 ( f 2 ( m , n ) f 2 ( m l , n ) + kn, n). As before, the attacker then has the signer to issue signatures al and a2 on the messages ml and ms, respectively. Hence a signature on m can be calculated by a := ~a~" This is a valid signature on m since
aS1
(m,n)
--
--
a~ "$'(m,n) --
a(~(m~,n)
-- f 2 ( m l , n )
_ f2(m2, n) _ f2(m, n) (mod n). - f2(ml, n) -3
Attacking
Boyd's
Scheme
In this section we will focus on a signature scheme which was recently suggested by Boyd [4].
3.1
Review of Boyd's Scheme
Boyd's scheme works a follows:
- Key generation: Each user chooses two large primes p and q such that both p - 1 and q - 1 have a large prime factor and then computes n := pq. He further chooses a prime r ~ 216~ with rl( p - 1) and a generator g of order r in 2~*. He keeps p, q,and r secret and publishes g and n. Moreover, a collision resistant hash function h is chosen. Remark: It was pointed out by Meijer [4] that n can be factored if r X ( q - 1), as g (mod q) = 1 and thus gcd(g - 1,n) = q. Moreover, it was shown by Mao [16] that n - 1 is a multiple of r if r divides both p - 1 and q - 1 and therefore computing RSA-roots in the subgroup of order r is easy. We assume in the following that r is chosen such that given the public parameters it is infeasible to compute a multiple of r. Hence r must be composite and of suitable size.
On the Security of some Variants of the RSA Signature Scheme
89
Signature generation: To sign a message m (with h ( m ) # 0 (mod r))) the signer computes d :-- h ( m ) -1 (mod r) and s := gd (mod n). Then s is a signature on the message m. - Signature verification: A signature s on rn can be verified by checking whether s h(m) = g (mod n) holds. -
It is suggested to use a 1024-bit modulus n and a hash function h t h a t outputs 160-bit values. W i t h these choices of the security the efficiency of the scheme compares favorably with the RSA signature scheme or the DSS scheme.
3.2
Basic Observations
In the above scheme, signatures are just different roots of g, where the message determines which root it is. Our adaptive attack will make use of the following simple facts: 1. If s is an (a 9 b)-th root of g, i.e. s ab - g (rood n), then s a (mod n) and s b (mod n) are b-th and a-th root of g, respectively (this was already observed in [13]). 2. If s~ and sb are a-th and b-th roots of g, respectively, and if gcd(a, b) = 1, then it is possible to find an (a. b)-th root of g by first computing integers u and v satisfying ua + vb = 1 (using extended Euclid's algorithm) and then ab ~ 8 aa b v oo abb u ~-- gbv+au ~-- g (mod n). computing 8ab : S av8 bu (mod n) as 8ab
3.3
A n A d a p t i v e Attack
The goal of the adaptive attack is to forge a signature on a message m by having the signer issue signatures on some messages m~ ~ m for i = 1 , . . . , k chosen by the attacker. T h e idea is to choose m such t h a t h ( m ) = gl 9g2 " . . . " gk
where g l , . . . , g k are relatively small integers t h a t are pair-wise coprime. Then k messages m l , . . . , m k are chosen such t h a t gi divides h ( m i ) for i --- 1 , . . . , k. Using the signatures on the messages m l , . . . , ma the attacker can derive gl-th, . . . , g k - t h roots of g (fact 1) and then compute the (gl 9g2 9 ... 9gk)-th root of g (fact 2) which is the signature on m. More precisely, the adaptive attack works as follows. - Determining m:
In order to forge a signature on a message m ~, the attacker calculates a similar message m (i.e. the semantic content of m and m r is the same) such t h a t h ( m ) = 1rlkli=l gi for pair-wise coprime numbers gi _< 2 b for 1 < i < k. - Finding r n l , . . . , m k :
Let L := {} and i = 1. Repeat the following step unless i = k + 1: The attacker picks a r a n d o m message m i and computes h ( m i ) . If g c d ( h ( m i ) , gi) 1, he computes L := L U { m i } and i := i + 1.
90 -
Maxkus Michels et al.
Obtaining signatures: Ask the signer for signatures s l , . . . , Sk to all messages in L = { m l , . . . , mk) and combine these signatures using facts 1 and 2 to obtain a signature s on
h(m). We have to estimate the probability that an arbitrary h(m) consists of pair-wise coprime ( b - 1)-bit factors only and to determine c such that the probability that each of these factors is a factor in one of the hash values h(ml), ..., h(mc) for arbitrary messages m l , . . . , me is close to 1.
Smoothness of a Hash Value We can try to find a lower bound for the probability that an arbitrary output of the hash function h can be fully factored into pair-wise coprime factors smaller than 25 (provided the hash function h is a truly random function). Let Ih(x)[ = Ib for a certain l > 2 and arbitrary messages x. Some "good cases" can be described as follows: I--1
1. h(m) = z YIi=l pi, where 0 < z < 2b-t, the pi's are prime and pairwise distinct, 2b-t < Pi < 25 for 1 < i < l - 1 and 0 < t < b. 2. h(m) = 1-It,=l pi, where the p{'s are prime and pairwise distinct, 2 b-t < Pi < 2b for l < i < l. 3. For 2 < j < l 1: h(m) = l--[i=1 t pi, where the pi's are prime and pairwise distinct, 0 < Pi < 2 b-t for 1 < i < j and 2 b-t < Pi < 25 for j + 1 < i < I. -
2b
2b
(b-t)2b-b2
b-t
With ~r(2b) ~ ~ ~ ~-~ and 5 = 0.7b(b-t) number of the above listed good cases r is
2b-tsl-1 r
I--1
5t
(l-1)----'---~ + ~
+
an approximation for the
2b--t )
~
E
J
t-j ~
-~:~
j=2
For concrete values b, l and t = 3 (to get an optimal result) the following lower bounds for the probability P r ( = r bt) that an arbitrary h(m) has only pair-wise coprime factors smaller than 25 can be derived.
b
1
20 23 27 32 40
8 7 6 5 4
160 161 162 160 160
42.4 36.8 31.1 25.3 19.6
30
8
6
]i 240
46.5
40
1240
33.9
320
49.4
40
Ih(m)l -log2(Pr)
i 1:k 010
On the Security of some Variants of the RSA Signature Scheme
91
Finding Signatures with Suitable Factors We would like to find a value c t h a t the probability t h a t each element of L = {gl,..., gk} is a factor of one element in M = { h ( m l ) , . . . , h(mc)} is close to 1, i.e. find c and p := Pr(3(jl,...,jh): Vi: gdh(mj,)l[1 < i < k, 1 <_ji <_c) such t h a t p ~ 1. We define p~ := Pr(3(jl,... ,Jk) : Vi: g~lh(rnj,)ll 1 < i < k , 1 ~_ji ~_C,ja ~ j v , 1 <_a pt. For a given gj E L
Pr(Si: gj[h(mi)[[1 < i < c) > 1 - (1 - 2-b) c holds. Let us focus on the following experiment: -
MI=M.
- For i = 2 , . . . , k: Mi = Mi-1 - {d}, where d E Mi-1 and gi-1 [d ifgi-1 divides at least one element in Mi-1. Otherwise d is uniformly chosen in Mi-1. For i - 1 , . . . , k: The event Xi is t h a t gi divides at least one element in Mi. -
The probability t h a t all X1,..., Xk hold is a lower bound for p~. Let us show that Pr(X1,...,Xk) > Pr(X1)...Pr(Xk) holds. We try to find a lower bound for the probability Pr(X2[X1). Assume t h a t gl does not divide any element in M1. Hence the probability that g2 divides at least one of the remaining c - 1 elements is higher than if these elements were chosen at random. Assume t h a t gl divides at least one element in M1. T h e probability that gl divides at least one element in 11/12is slightly smaller t h a n if these elements were chosen at random. Hence the probability that g2 divides at least one element in 11//2 is slightly higher t h a n if these elements were chosen at random. Hence Pr(X2[X1) > Pr(X2) and in a similar way Pr(Xj[X1,..., Xj-1) > Pr(Xj) for j -- 3 , . . . , k can be shown. As a result we have k pl > H 1 -
k (1 - 2 - b ) c + l - i > 1 - Z ( 1 - 2-b)C+l-i=
i----1
i----1
k ( 2b ~z 1-(1-2-5)
c+1~\25_1]
i=1 1 - ( 1 - 2-b) c+l f 2b(k+l)
(22--~bl)kq-1 _ (22-~bl) =1-(1-2-5)
c+1
2b
~
- 1
2 b) = 1 - 2b(1 - 2-b) c+l ((1 - 2-b) -k
1)
We get the following values for c such t h a t p > p~ > 0.99 using the above derived lower bound for p'. Note t h a t ! < k < bl.
92
Markus Michels et al.
b 20 ! 20 27 27 32 32 40 40
k
log 2 c
8 16 6 12 5 I0 4 8
23 23 30 30 35 35 43 43
As a result we can conclude that the choice of I h ( m ) l = 160 cannot be considered as secure. Practical Results
As was already mentioned in the introduction, we have found a concrete adaptive attack for a scheme using the RIPEMD-160 hash-function [10]. RIPEMD outputs 160 bit strings which were directly interpreted as integers (where the leftmost bit is interpreted as the most significant bit). The attack was implemented in C using the LIP long integer package [15] and executed on a SUN Enterprise Server. First, we tried to find a "target" message m of the form "Check # x x x x x x x x x x x x x x : value $1'000'000.00" where "xxxxxxxxxxxxxx" is a 14-digit decimal integer that served as a counter. Since factoring the resulting hash-value is quite time consuming, only those hash-values were factored whose most significant 20 bits were all zero and which therefore are more likely to contain only short factors. Factoring was done in two steps, first trial division with all prime factors smaller than 22~ and then Pollard's rho method [18] with 215 steps in the main loop (using LIP's z p o l l a r d r h o function [15]) to find the remaining (small) prime factors. On a SUN Enterprise Server this program did run for approximately one day to find the following result. RIPEMD160("Check #00002228615476:value $1'000'000.00") = = 591958810961311141109102519582266871126124 =
230589.581003.5743124.30989939.44307239.560313293.
Note that the six factors are pair-wise coprime but not necessarily prime. In the next step we were searching six messages of the form "Check # x x x x x x x x x x x x x x : value $1.00"
On the Security of some Variants of the RSA Signature Scheme
93
whose RIPEMD-160 hash value is divisible by the six factors of the target message. Using simple trial division, this took us less than a day on the SUN Enterprise Server. The six messages are: RIPEMD16o ("Check #00000000167658: value $1.00") = = 230589. 3142167595787040010194856861101477257344816 RIPEMD16o ("Check #00000000347533: value $1.0ff') = = 581003-492810602885113654853688380970299229084257 RIPEMD16o ("Check #00000003719673: value $1.00") = = 5743124. 93987921877576599808444485864739254202422 RIPEMDz6o("Check #00000007303360: value $1.00") = = 44307239. 28587659353264827273984278108092714061121 RIPEMD16o ("Check #00000014393409: value $1.0ff') = = 30989939. 21093763551230032693705837318819942077090 RIPEMD16o("Check #00002271656103: value $1.00") = = 560313293. 2314618093965960393631745332253120816552
3.4
A Directed Attack
We can slightly modify the adaptive attack in order to get an attack for the weaker model of a directed attack. Here the attacker is just allowed to get signatures on all messages in a set A chosen by himself once. The aim of the attack is to find a signature to messages which are not in A. - The attacker chooses a basis B that contains the maximal prime powers that are smaller than a certain number, say 2 5. We denote B -- {gz,'" ", gk}, whereg~ =pe~, ei E ~>o,piisprimeandp e" _< 2 b _ o , 0 < fi <_ ei and wi 6 {0, 1} for all i = 1 , . . . , k.
94
Markus Michels et al.
- Then using the known signatures to all elements in B and by applying the facts 1 and 2 mentioned in subsection 3.2, the attacker can compute s such that 8 h(m) : g (mod n).
Analysis We can apply the analysis of the adaptive attack. The probability that an arbitrary hash value h(m) has only pair-wise coprime prime power factors smaller than 2b with given b is the same as calculated above. We can also modify the analysis of finding signatures with suitable factors. As k ~ ~2 b ~ -1 -. 4V2 . 2- -b now, we get the following values for the number of hash values c such that p > p~ > 0.99 using the lower bound for pt.
b
log 2 c
20 27 32 40
24 32 37 45
Hence we can conclude again that the choice of lh(m)l = 160 is to small to make this attack infeasible. It should be noted that the number of signatures needed for this attack is at most the size of the basis k ~ 1 . 4 25. 2 b 3.5
Known-Signature Attack
In an even weaker attack model, the known-signature attack, the attacker may only collect signature-message pairs from its victim. The attack described above can be modified to work in this model as well with overwhelming probability, if the attacker collects about c ( ~ 25+5 according to the table above) signaturemessage pairs provided they are uniformly chosen. However, the computational effort compared to the directed attack increases.
4
Remarks Signature
on a Naive Scheme
Implementation
of the
RSA
Blind
Let us finally by point out, how the attack described in the last section can be adapted to a naive implementation of the RSA blind signature scheme [6]. We assume that this attack is generally known, but to the best of the authors knowledge there is no description of it in the literature.
On the Security of some Variants of the RSA Signature Scheme
95
Let the verification equation of the RSA implementation be given by a e - h(m)
(rood n),
where n is the modulus, e is the public exponent, a is the signature, h is a hash function t h a t outputs at most b-bit values and b is relatively small, say b -- 160. Then a similar attack as described in the section 3.4 works as follows: The attacker chooses a basis B that contains the all primes t h a t are smaller than a certain number, say 2 b. We denote B = { P l , " " ,Pk}. Then he asks the signer to sign all p l , . . . , Pk using the blind signature scheme (note t h a t the signer cannot learn the unblinded messages and hence cannot check t h a t these messages are not hash values) and gets the signatures a l , . . . , ak. Now the attacker can forge signatures on messages h ( m ) that can be totally factored over B, i.e. h ( m ) = yi ~=1 k Piw, for integers w, > 0, 1 < i < k. He can easily compute the signature a k a,w, (mod n). on m by a := 1-Ii=l The attack can be easily prevented by using an increased value b or by suitable padding, e.g. m ' := al]]...]tat with al := m and a, := h ( a l l l . . . ] ] a i _ l ) for 1 < i _< t and ]m'] = In] instead of h ( m ) . This method of padding is used in the RSA based ecash T M system [21], possibly to countermeasure the above mentioned attack as well.
5
Acknowledgments
The first two authors were supported by the Swiss National Foundation ( S N F / S P P project no. 5003-045293).
References 1. D. Bleichenbacher, "On the Security of the KMOV Public Key cryptosystem", LNCS 1294, Proc. Crypto'97, Springer-Verlag, (1997), pp. 235-248. 2. D. Bleichenbacher, W. Bosma, A. Lenstra, "Some remarks on Lucas-based cryptosystems', LNCS 963, Proc. Crypto'95, Springer-Verlag, (1997), pp. 386-396. 3. D. Bleichenbacher, M. Joye, J.-J. Quisquater, "A new and optimal chosen-message attack on RSA-type cryptosystems', LNCS 1334, Proc. Information and Communications Security- ICICS'97, Springer-Verlag, (1997), pp. 302-313. 4. C. Boyd, "Digital Signature and Public Key Cryptosystem in a Prime Order Subgroup of ~ " , LNCS 1334, Proc. ICICS'97, Springer-Verlag, (1997). 5. Communicated by Colin Boyd. 6. D. Chaum, "Security without identification: transaction systems to make big brother obsolete", Communications of the ACM, 28, 10, October, (1985), pp. 1030-1044. 7. G.I. Davida, "Chosen signature cryptoanalysis on the RSA (MIT) public key cryptosystem", Tech. Rep. TR-CS-82-2, Department of Electrical Engineering and Computer Science, University of Wisconsin, Milwaukee, (1982). 8. J.M. DeLaurentis, "A further weakness in the common modulus protocol for the RSA cryptoalgorithm', Cryptologia, No.8, Vol. 3, (1984), pp. 253-259. 9. N. Demytko, "A new elliptic curve based analogue of RSA", LNCS 768, Proc. Eurocrypt'93, Springer Verlag, (1994), pp. 40-49.
96
Markus Michels et al.
10. H. Dobbertin, A. Bosselaers, B.Preneel, "RIPEMD-160, a strengthened version of RIPEMD", LNCS 1039, Proc. Fast Software Encryption, Springer-Verlag, (1996), pp. 71-82. 11. S. Goldwasser, S. Micali, R. Rivest, "A digital signature scheme secure against adaptive chosen message attacks", SIAM Journal on Computing, Vol. 17, No. 2, (1988), pp. 281 - 308. 12. J. Hastad, "On using RSA with low exponent in a public key network", LNCS 218, Proc. Crypto '85, Springer Verlag, (1986), pp. 404-408. 13. W. de Jonge, D. Chaum, "Some Variations on RSA Signatures & their Security", LNCS 263, Proc. Crypto '86, Springer Verlag, (1987), pp. 49-59. 14. K. Kurusawa, U. Maurer, T.Okamoto, S.Vanstone, "New public key schemes based on the ring ~ n " , LNCS 576, Proc. Crypto'91, Springer Verlag, (1992). 15. A. Lenstra, P. Leyland: LIP - Long Integer Package, 1995. 16. W.Mao, "Cryptanalysis in Prime Order Subgroups in ~ " , manuscript, (1998), available at http://grouper.ieee.org/groups/1363/contrib.html. 17. W.B. Miiller, W. NSbauer, "Some remarks on public key cryptosystems", Studia Sci. Math. Hung., Vol. 16, (1981), pp. 71-76. 18. J. Pollard, "A Monte Carlo Method for Factorization", BIT, Vol. 15, (1975), pp. 331-334. 19. M.O. Rabin, "Digitalized signatures and public-key functions as intractable as factorization", MIT/LCS/TR-212, MIT Lab. for Computer Science, Cambridge, Mass., (1979). 20. R.L. Rivest, A. Shamir, L. Adleman, "A method for obtaining digital signatures and public-key cryptosystems', Communications of the ACM, Vol.21, (1978), pp. 120-126. 21. B. Schoenmakers, "Basic Security of the ecash T M Payment System", LNCS, Proc. Computer Security and Industrial Cryptography: State of the Art and Evolution, Springer-Verlag, (1997). 22. H.C. Williams, "A modification of the RSA Public-Key Cryptosystem", IEEE Trans. in Inform. Theory, IT-26, No. 6, (1980), pp. 726-729.
Side Channel Cryptanalysis of Product Ciphers John Kelsey 1, Bruce Schneier 1, David Wagner 2, and Chris Hall 1 1 Counterpane Systems {hall ,kelsey, schneier}Ocounterpane, c o m 101 E. Minnehaha Pkwy Minneapolis, MN 55419 (612) 823-1098 2 U.C. at Berkeley daw0cs, berkeley, edu Soda Hall
Berkeley, CA 94720-1776
Abstract. Building on the work of Kocher [Koc96], we introduce the notion of side-channel cryptanalysis: cryptanalysis using implementation data. We discuss the notion of side-channel attacks and the vulnerabilities they introduce, demonstrate side-channel attacks against three product ciphers--timing attack against IDEA, processor-flag attack against RC5, and Hamming weight attack against DES--and then generalize our research to other cryptosystems.
Keywords: side channels, cryptanalysis, timing attacks, product ciphers.
1
Introduction
Any cryptographic primitive, such as a block cipher or a digital signature algorithm, can be thought of in two very different ways. It can be viewed as a mathematical object; typically, a function taking an input between 0 and 2 g - 1, and providing an o u t p u t between 0 and 2 M - 1. Alternatively, it can be viewed as a concrete implementation of that mathematical object. Traditionally, cryptanalysis has been directed solely against the mathematical object, and the resultant attacks necessarily apply to any concrete implementation. The statistical attacks against block ciphers--differential [BS91,BS93] and linear [Mat93] cryptanalysis--are example of this; these attacks will work against DES regardless of which implementation of DES is being attacked. Many of these attacks are more theoretical than operational, and are sometimes called "certificational weaknesses" to indicate that they may not work practical implementations. A known-plaintext attack that uses 240 plaintexts, for example, would require an operational cryptanalyst to obtain just under nine terrabytes of data (assuming a 64-bit block cipher) encrypted in a single
98
John Kelsey et al.
key. While no cryptographer would seriously recommend such an algorithm for operational deployment, is wouldn't be too hard to build a system that denied an attacker access to this magnitude of plaintext. In the last few years, new kinds of cryptanalytic attack have begun to appear in the literature: attacks that target specific implementation details. Both timing attacks [Koc96] and differential fault analysis [BDL97,BS97] make assumptions about the implementation, and use additional information garnered from attacking certain implementations. Failure analysis [HGS97,Be196] assumes a one-bit feedback from the implementation--was the message successfully decrypted--in order to break the underlying cryptographic primitive. Related-key cryptanalysis [Bih94,KSW96,KSW97] also makes assumptions about the implementation, in this case about related keys used to encrypt different texts. These attacks don't necessarily generalize--a fault-analysis attack just isn't possible against an implementation that doesn't permit an attacker to create and exploit the required faults--but can be much more powerful. For example, differential fault analysis of DES requires between 50 and 200 ciphertext blocks (no plaintext) to recover a key [BS97]. In this paper, we consider the general class of side-channel attacks against product ciphers. A side-channel attack occurs when an attacker is able to use some additional information leaked from the implementation of a cryptographic function to cryptanalyze the function. Clearly, given enough side-channel information, it is trivial to break a cipher. An attacker who can, for example, learn every input into every S-box in every one of DES's rounds can trivially calculate the key. Our research attempts to show how little side channel information is required to break product ciphers. In real-world systems, attackers cheat; prudent engineers of secure systems anticipate this and adapt to it. Exploiting weaknesses in implementations-either by monitoring some "side-channel" of information out of the mechanism implementing the cryptographic primitive (such as timing or power consumption), or by altering some internal data inside that mechanism--may feel like cheating, but that just makes their effects more devistating. 1.1
Side Channels and P r o d u c t Ciphers
Symmetric encryption is most often done with product block ciphers, such as DES [DES81], IDEA [LMM91], and Blowfish [Sch94]. To understand why side channels so often provide devastating attacks against product ciphers, it is necessary to digress a bit into the theory of block cipher design. A block cipher is a cipher that encrypts whole blocks of plaintext at a time; all the abovementioned block ciphers operate on blocks of 64 bits at a time. A product cipher (sometimes called an iterative block cipher) is a block cipher made by iterating a fairly simple round function many times, each time with its own key. Thus, a four-round product cipher would look like:
EK(X) = RK~ (RK2(ngl (nKo (X)))),
Side Channel Cryptanalysis of Product Ciphers
99
where/(0..3 are functions of K. Each round function, R K 0 , is actually a weak block cipher, in the sense that if an attacker knows much about the plaintext (such as simple plaintext statistics), he can quickly break the round function and recover its round key. However, if the same attacker is faced with a sequence of random numbers, each encrypted with the round function, he has no way to mount such an attack. Nearly all attacks on product ciphers work by learning some way to distinguish the output of all but the last rounds from a random permutation. In a linear attack [Mat93], a subset of bits in the input to the cipher and to the last round don't quite have a balanced parity; in a differential attack [BS91,BS93], the relationship between a pair of inputs to the last round isn't quite random. Partitioning attacks, higher-order differential attacks, differential-linear attacks, and related-key attacks all fit into this pattern. A strong product cipher will have as the input to its last round a random permutation of the input to the cipher: something an attacker can't distinguish from a random number. Side channel information, even a tiny fraction of a bit of it per ciphertext output, can provide another way of distinguishing the input of the last round from a random number. If the attacker can learn the parity of the input to the last round, or its Hamming weight, or the carry-flag resulting from the last round's addition operation, then he can usually break the cipher fairly quickly. 2
Timing
Attacks
Against
IDEA
IDEA [LMM91] is a product block cipher designed by Lai, Massey, and Murphy, to resist all forms of cryptanalysis publicly known at that time. Except for some special "weak-key" conditions, the full 8-round IDEA is apparently very strong against the standard differential, linear, and related- and chosen-key attacks. However, IDEA can be cryptanalyzed with a piece of side-channel information: whether one of the inputs into one of the multiplications is zero. This side channel can be obtained in several ways. Since the multiplication is done modulo 216, a zero operand is treated as a special case. Some implementations bypass the multiplication completely and simply patch in the correct value. Timing 1, power consumption, radiation, etc., will all be different if one input is a zero. For the rest of this discussion, we assume that this information is received through timing measurements. However, any side channel that can yield the same information is just as effective. 2.1
The Basic Attack
The basic mechanism used in all of these attacks is to search for one or two 16bit values visible to the cryptanalyst which indicate a zero input into a multiply somewhere in the cipher. 1 This observation was borne out experimentally using the PGP 2.3 implementation of IDEA, on a 486SX-33.
100
John Kelsey et al.
The time taken to encrypt one block with IDEA may be broken into three parts: 1. Operations which take approximately constant time. These may be assumed to include all operations except the multiplications modulo 216 + 1. 2. All the multiplications except for the one being targeted. This is probably approximately normally distributed. 3. The multiplication being targeted. If this has a zero input, it will invariably take less time than if it has a nonzero input. The time taken by the whole encryption process is the sum of these three times. If the multiplication being targeted has a zero input, then the mean of this time will be slightly lower than if the multiplication being targeted does not have a zero input. If we can reliably detect this difference, then we can detect whether or not the multiplication operation being targeted is getting zero inputs. Timings can be acquired in two simple ways: Either the cryptanalyst makes extremely precise timings of each encryption, or he measures total time to encrypt many similar plaintext blocks at a time. In either case, however, he must then statistically test the hypothesis that one of these sets of times is significantly lower than the others.
A Ciphertext-Only Attack Requiring Precise Block Encryption Timings With precise timing of each encryption, we can apparently recover the key using ciphertext only. Our attack works as follows: 1. Record precise timings for n encryptions. Also store the resulting plaintexts. Let To..n-1 be the timings, and Co..n-1 be the ciphertext blocks. 2. Group the ciphertexts and timings into 216 subsets, based on the low-order 16 bits of the output. 3. Test the average times of each group against the average times of all the groups statistically, to find whether one of the sets has (with some acceptably high probability) a lower average than the other sets. If so, then the inputs to the last multiply of the output transformation must have been 0 for all inputs in that set. Solve for the last multiplicative subkey. If there is no difference, then either we've chosen some parameters (i.e., n) wrong, or the subkey is a 0. 4. Repeat steps 2-3, above, for the high-order 16 bits and the first multiplicative subkey of the output transformation. We now have 32 bits of expanded key. 5. We now attack the second additive subkey in the output transformation. For each possible value of this subkey, we look at which ciphertexts lead us to a zero value going into the first multiplication of the last round's MA box. For one of these subkey guesses, the average timing should be less than for all the other subkey guesses. This reveals the right subkey. If there is no difference, then either we've chosen some parameters wrong, or the first subkey in the MA-box is zero. We have now recovered 48 bits of expanded key.
Side Channel Cryptanalysis of Product Ciphers
101
6. We now attack the first additive subkey in the output transformation, and the first subkey in the MA-box. We do this as follows: (a) Break the ciphertexts and timings up into 216 subsets based on the value of the leftmost (first) input to the MA-box. (b) For each possible subkey value for the first additive subkey of the output transformation, break each subset up into 216 sub-subsets, based on what the value of the second MA-box input would be if this were the right subkey. (c) For the right subkey, each subset will have one sub-subset which has a smaller timing value than all the other sub-subsets in that subset. We have now found 64 bits of subkey. (d) We now choose any three of these sub-subsets, and use them to solve for the first multiplicative subkey of the MA-box. We have now found 80 bits of subkey, and can brute-force the remaining 48. (There are also ways to continue this attack.) This kind of attack might be practical for recovering the key from a tamperresistant box which always encrypts under the same IDEA key. The cryptanalyst does not need to know anything about the plaintext for this attack, but must always know precisely when the encryption started and when it ended. Chaining modes have no real impact on this attack.
2.2
A n A d a p t i v e C h o s e n P l a i n t e x t T i m i n g Attack
An attack very similar to the one described above is possible, even if the cryptanalyst doesn't have the ability to measure each encryption time precisely. In this case, we require the ability to choose plaintext batches to send through the encryption algorithm, along with the ability to time these batches. The attack works as follows: 1. Choose 216 batches of plaintexts large enough that changing the input of an average of one multiply instruction per encryption to zero will be detectable in the timing, with high probability. (We need to specify numbers for the timing statistics here before we can know how many plaintexts must be in each batch.) Every plaintext has the same X0. Each batch of plaintexts has the same X2. 2. Determine the time required for each batch to be encrypted. 3. The batch that took the least time to encrypt indicates one equation involving some key material."
(x0 e z0) = (x2 [] &) . Generate another set of 216 plaintexts to be encrypted, this time using a different value for X0, X~. Follow the steps above, to get a second equation:
( x ; e z0) = (x2 [] &)
102
John Kelsey et al.
In most cases, this should allow the cryptanalyst to recover values Z0 and Z2. (If Z4 = 0, then this attack will simply not won't be a significant difference in the times.) 5. Generate another set of 216 batches of plaintexts, this time constant by use of fixed values for Xo and X2, and also holding For each different batch, X1 must take on a different value. 2.3
the two key work--there holding (A) X3 constant.
Generalizations
We detailed this attack assuming that the cryptanalyst was able to determine when a multiply by zero occurred by watching the relative encryption time. This is not the only side channel that can yield this information; radiation and power consumption can also leak this multiply-by-zero condition. Moreover, radiation and power consumption can generally determine the exact round where the multiplication by zero occurred, greatly simplifying the attack.
3
A Processor-Flag
Attack
Against
RC5
There may be cases when we can learn the internal flags states of processors. For example, we may be able to determine the state of the carry flag after each half-round of RC5. Recall that an RC5 round looks like
Ri ~- (Rz-1 9 Lz-1) ~L'-I + S2z-1 Li = (L~-I | Ri) <
S2r-{-1
before we obtain the resulting ciphertext (Lr+l, Rr+l) (this last transformation makes the encryption and decryption processes identical). RC5 is a variable-width cipher usually the width is set so that Ri (resp. Li) fits into one register. Therefore the addition performed may cause the carry flag to be set and we denote C2i-1 (resp. C2i) the value of the carry flags after the addition computing R~ (resp. L/). We will show that with high probability, an attacker can reconstruct $2~+1 given several ciphertext pairs (L~+Ij, R~+I,j) and the corresponding carry flags C 2 r - l j . Once an attacker determines $2~+1, they can strip off the last half round and reapply our attack (reversing left and right). As we will see in the next sections, each of the two possible values of msb(SK~r_1)l lead to exactly one value of SK2~+1. Hence an attacker could simply guess the most significant bit of SK2~-1 for all i < r, solve for each of the subkeys, and
Side Channel Cryptanalysis of Product Ciphers
103
perform a few trial decryptions. If r is small (say 32), this is certainly feasible. However, in the event that r is too big, we must attempt to find m s b ( S K ~ r - 1 ) l by other means. The next section gives an algorithm for doing exactly that. 3.1
Determining
msb(SK2r-1)l
Determining the most significant bit of S K 2 r - 1 is rather simple. Consider three ciphertext pairs (Lr+l,j, Rr+l,j), j = 1, 2, 3 and their corresponding carry flags C2~-1,j. Then we simply take the most significant bit of SK2r-1 to be the majority value of the carry flags C2r-l,j. The analysis which shows that three pairs suffices follows. First we must consider a simplified experiment. Let K be a randomly chosen n-bit value and consider the experiment in which we repeatedly chose a random n-bit value A and determine the value C of the carry flag for A + K . Then P [ C = I l K = X] = P [ A + X > 2 n] = X / 2 n.
Furthermore, we can generalize this where we only consider the l most significant bits of K . Then 2 ~ - I --1
P[C=
llmsb(K)l = X]=
E
P [ C = IIK = X . 2 n-t + i]. P[lsb(K),~_l = i]
i=0 2 ~ - l --1
=E
X
92 n-l + i 22n-I
z=0
= X2 -I + 2 -/-1
_ 2-n-1.
Therefore, in order to estimate msb(K)l we simply need to perform sufficiently many trials so the X such that #' _ t ( X 2 -t + 2 -I-1 _ 2 - n - 1 )
is minimal satisfies msb(K)l = X with high probability, where #' is the observed number of carries and t is the number of trials. Let Pl (X) denote the probability of a carry when msb(K)l = X. Then p l ( X ) = X 2 -I + 2 - I - 1 - 2 - n - 1
To approximate the number of trials, we assume that the distribution of #~ is normal. After t trials with #~ observed carries, we have P [tp,(X) - 3 x / t p t ( X ) ( 1 - p t ( X ) ) < it' < tpt(X) + 3 x / t p , ( X ) ( 1 - pt(X))] ~ 0.997.
Hence we want to choose t such that p l ( X ) - p l ( X - 1) > 3 / 2 V / P l ( X ) ( 1 - p t ( X ) ) / t
104
John Kelsey et al.
for all X > 1. This gives
p,(X) - p , ( X - 1) > 3/2V/P,(X)(1 - p , ( X ) ) / t r 2-' > 3/2X/p,(X)(1 - p z ( X ) ) / t 2 -2' > (9/4)p,(X)(1 - p,(X))/t ~=> t > 22t-29p,(X)(1 - p , ( X ) ) which will be roughly maximal when X = 21-1 and hence
t > 22t-29p/(2/-1)(1 - pt(2t-1)) = 22t-29(2-2 _ 2-2/-2 + 2 - ' - n - 1 _ 2-2n-2) = 9(22l-4 _ 2-4 + 2 z - n - a _ 22t-2n-4). We are interested in the case where 1 = 1 and hence need t > 9(2 -2
-
2 - 4 -+- 2 - n - 2
-
2 -2n-2)
pairs. For n = 32 this gives t ~ 1.69. Therefore we can take three ciphertext pairs and let m s b ( S K 2 ~ _ l ) l be the majority value of the carry flags C2~-1,a.
3.2
Determining
lsb(SK2~+l)l
Once we have determined the most significant bit of SK2~-1, we can use it to determine the least significant bit of SK2~+1. The algorithm is rather simple: we choose plaintext/ciphertext pairs (L~+I,j, Rr+l,j) such t h a t L~+I,j = 1 (mod 32) and C2r-1 ~ m s b ( S K 2 ~ - l ) l . The former condition ensures t h a t if we know the most significant bit of R~,a , then we will know the least significant bit of Rr+l,j [] $2r+1. The latter condition ensures that we know the most significant bit of Rr,a. After all, if m s b ( S K 2 ~ - l ) l = 1 and C2~-1 = 0 we know t h a t msb(R~,a)l = 1. Similarly, if msb(SK2r-1)l = 0 and C2r-1 = 1 we know t h a t msb(Rr,j)l = 0. Hence m s b ( R r , , ) l = msb(SK2~_~)l.
(1)
From the above and the formula for computing R r + l o it is easy to see t h a t lsb(Rr+l,j • $2r+1)1 = m s b ( R r , j ) l |
msb(Lr,j)l
or
lsb(S2r+l)l = ( m s b ( S K 2 r - 1 ) l |
m s b ( L r + l , j ) l ) [ ] lsb(Rr+l,j)l.
Note t h a t we only need one ciphertext pair to determine lsb(SK2r+l)l.
Side Channel Cryptanalysis of Product Ciphers
3.3
105
Determining Remaining Bits of SK2,.+1
Determining the remaining bits of SK2r+I comes from generalizing the attack of the previous section. Here we assume that we know the j least significant bits of SK2r+I and wish to determine the j + 1st bit. Again we choose a ciphertext pair (Lr+l,Rr+l) such that C2r-1 ~ m s b ( S K 2 ~ - l ) l . However, this time we choose the pair so that Lr+l -= j + 1 (mod 32). The former condition ensures that we know the most significant bit of R~ by the same argument as in the previous section. The latter condition ensures that we know the j + 1st bit of R,.+I [] $2~+1. Since we know lsb(S2r+l)3, we can compute lsb(R~+l [] S2r+l)j. However, we also know the j + 1st bit of the difference as it is a function of R~+I [] lsb(S2~+l)j, msb(R~)l and msb(Lr)l. Hence lsb((R~ @ Lr)<
msb(Lr)l))
Of course Lr : Lr+l so this formula allows us to determine the j + 1st bit of $2~+1 directly. Repeatedly applying this algorithm for j = 2 , . . . , 32 we recover the remaining bits of $2r+1.
3.4
Adaptive Chosen-Plaintext Attack
We can recover $1 much more directly using a chosen-plaintext attack which requires n chosen texts. Once we determine $1 we can peel off the first halfround and repeat the attack for the remaining subkeys. The attack is a very simple binary search on $1. The crucial observation is that C1 = 1 iff (R0 | L0) < 2n. Therefore we perform a simple binary search on 2 n [] SI: 61 ---- 1 iffR0>2 n[]S1. 3.5
Generalizations
This is not the only side channel that can be used to cryptanalyze RC5; other side-channel information can also be used to break the cipher. For example, the timing may be different depending on the number of bits rotated. This information can be used to recover the RC5 key [Koc98]. Or the power consumed by the rotate mechanism might be different depending on the Hamming weight of the bits rotated. 4
A Hamming-Weight
Attack Against DES
Next we introduce Hamming-weight cryptanalysis. This assumes we have a side channel that gives information on the Hamming weight of intermediate encryption values. We claim that this model is not implausible: for instance, in some
106
John Kelsey et al.
hardware implementations total power consumption is correlated to the total Hamming weight of all intermediate values, and in software timing data may leak information about the Hamming weight of internal variables when certain operations (e.g. integer multiplication) are used. We show here that Hamming weight side channels can enable powerful ciphertextonly attacks in many cases. We will concentrate throughout on cryptanalysis of DES for concreteness, but it should become clear t h a t these attacks will apply directly to any product cipher. Let's start first with a very simple scenario. Assume that we are provided with a side channel that discloses the Hamming weight of the block after 15 rounds of encryption (i.e. the input to the 16th and final round). Then attacks to recover the last-round subkey abound. For instance, we could simply guess the 48-bit last-round subkey, and verify correct guesses on a few known ciphertexts. This attack will require about 4 8 / l o g 2 ~ ~ 15 ciphertexts and offiine work equivalent to 244 trial encryptions; the computational complexity could be reduced to about 222 with meet-in-the-middle techniques. For reasons that will become clear later, we wish to describe a different statistical attack which uses the Hamming weight of the input to the last round. Let (R, L @ F ( L ) ) be the last-round input corresponding to the ith ciphertext (L, R) where F ( L ) is the output of the Feistel F function under the last-round subkey. We write W = wt F ( L ) for the Hamming weight of the output of the last-round Feistel function; W is easily determined from the side channel. Also, we write S for the Hamming weight of the 4-bit output of the first S-box in the computation of F ( L ) , and N for the total Hamming weight of the other S-box outputs, so that W = S + N. Then guessing the 6 key bits entering the first S-box (in the 16th round) gives us S if the guess was correct, or (say) T otherwise (where T is a random variable with the same distribution as S, but T is independent of S). The core idea of the statistical attack is that S is strongly correlated with the known values W = S + N, but T is not, so we can detect correct guesses at those 6 key bits by the statistically-significant correlation that results. The statistical attack thus proceeds as follows. Suppose we have n ciphertexts where the side channel discloses W [ 1 ] , . . . , W [ n ] . Guess the 6 key bits k entering the first S-box and compute the Hamming weight of that S-box's output, calling it (say) Uk[1],...,Uk[n]. Calculate a measure of correlation 2 n Ck = ~ i = 1 Uk[i]W[i]/n, which is an estimate at the expected value E U k W of Uk 9 W . When k is correct, we get Uk[i] = Sill, and thus ck is an estimate of E S W = E S ( S + N ) = E S 2 + E S E N ; when k is false, Ck is an estimate of E T W = (ms) 2 + E S E N . Therefore, the counter Ck is expected to be noticeably larger when k is correct. If n is sufficiently large, we should be able to pick out the correct value of k from the largest of the 64 counters.
2 It is in fact equivalent in power to calculating the observed correlation coefficient p, or the covariance for that matter.
Side Channel Cryptanalysis of Product Ciphers
107
Of course, once we have determined the correct values of the 6 key bits entering the first S-box, we can repeat the attack for each of the other S-boxes in turn, until we have recovered the entire last-round subkey. In practice one could make do with significantly fewer ciphertexts by treating several S-boxes at once; here we focus on giving the essence of the central ideas of the attack, with an eye towards applying t h e m to a wider class of Hamming-weight-based side channels. We now analyze the complexity of this algorithm. The r a n d o m variables S, T are independent symmetric binomial random variables with m e a n 2; also N is a symmetric binomial with m e a n 14. We see t h a t E ( S W - T W ) = E S 2 (ES) 2 = 1. Also, a bit of c o m p u t a t i o n shows t h a t Vat ( S W - T W ) = 232. Letting k be a (generic) incorrect guess and K be the correct guess, we see that CK -- Ck will be approximately Gaussian with mean 1 and standard deviation 2 3 / v ~ when n is not too small. We expect success when cK is larger than all the incorrect ck counter values, and we can roughly estimate the probability of this as Prob(cK > Ok) 63 : ~ ( v ~ / 2 3 ) 63. Therefore, with n = (3 9 23) 2 ~ 4800 ciphertexts, we expect to succeed with probability about (1 - .0013) 63 ~ .92. To recap, we have expressed the known H a m m i n g weight W as a sum W = S + N of "signal" S and "noise" N. Each guess at the 6 key bits entering the first S-box gives us a list of candidate observations for S. This reduces the problem of recognizing correct guesses to the problem of filtering out the noise from a noisy signal; this, in turn, is m a d e possible because S is strongly correlated with S+N.
This technique can be generalized to scenarios with just a b o u t any Hammingweight-based side channel imaginable. For instance, imagine a side channel which gives us W, the total weight of all of the DES F function outputs s u m m e d over all 16 rounds. This is a good deal more plausible than our previous model, where we obtained the H a m m i n g weight of only the last-round F function output; however, it m a y not be immediately obvious how to use such a side channel to break DES. We note t h a t the statistical techniques developed above will apply immediately. As before, we let S stand for the total H a m m i n g weight of the output of the first S-box in the last round. The total weight W can be expressed as W = S + N , where N is a sum of 127 other S-box outputs. Thus N can be treated as a symmetric binomial r a n d o m variable of mean 254 which is independent of S. The attack proceeds as before, though we will of course need significantly more known ciphertexts. We estimate t h a t n = 220 known ciphertexts and offiine work comparable to 219 trial encryptions suffices to recover the entire DES key with very high probability. This figure could be reduced to about 2 is known ciphertexts and 237 offiine work by simultaneously guessing 24 key bits entering 4 S-boxes. In some environments, we m a y not be able to obtain the total H a m m i n g weight of all the F function outputs, but we may be able to obtain a "noisy" version of it. (As an example, imagine a hardware implementation where the power consumption reveals the total H a m m i n g weight of all intermediate values
108
John Kelsey et al.
computed during a single DES encryption, or even something correlated to this weight. This would yield a side channel which is correlated to the total Hamming weight of all F function outputs.) In these situations, it is clear that we can filter out the noise as before: the same statistical technique still works, though more ciphertexts will be needed to compensate for the stronger "noise" contribution to the side channel value. So far we have concentrated on ciphertext-only attacks. In fact, by symmetry, one can just as easily apply the attacks to scenarios where only the plaintext is known, and not the ciphertext! We are not aware of any other cryptanalytic technique where this is possible. In summary, we see that even a side channel which is only mildly correlated to values of interest (such as the input to the last round) can be used to mount a powerful statistical attack against just about any product cipher. These attacks require only known-ciphertexts, and so should be quite practical to mount in practice. This shows that cryptosystem implementors must take great care to avoid even the slightest correlation between observable outputs and internal values. 5
Conclusions
and Further
Directions
The purpose of this paper was to demonstrate the power of side-channel cryptanalysis against product ciphers. Our attacks are by no means exhaustive; the algorithms discussed have other possible side channels and other attacks are possible given other side channel information. And other product ciphers are vulnerable to similar attacks. We believe attacks based on cache hit ratio in large S-box ciphers like Blowfish [Sch94], CAST [Ada97], and Khufu [Mer91] are possible. Stream ciphers with irregular clocking features can be especially vulnerable to side-channel attacks. A5 [XHW94], PIKE [And95], Gollman cascades [CG88], shrinking and self-shrinking generators [CKM94,MS94], and any of the family of alternating stop-and-go generators [Sch96] are easily breakable if an attacker knows how many LFSRs clocked for each output. Implementations of RC4 [Sch96] that have different side-channel characteristics when i = j are also vulnerable. Public-key algorithms are vulnerable. Kocher showed how to mount at timing attack against several public-key algorithms [Koc96]. These results can be generalized to any primitive that uses multiplication modulo large numbers for security, e.g. stream ciphers like Blum-Blum-Shub [BBS86] and hash functions like IBC Hash. The open questions lead in two different directions: how to obtain more detailed side-channel information, and how little side-channel information is required to break a cryptographic primitive. With regards to the former, we leave that to the electronic engineers. We have collected side-channel information using both timing and power channels, and have often been surprised by how easy it
Side Channel Cryptanalysis of Product Ciphers
109
is. We can only speculate how much more information a well-equipped hardware laboratory can collect. As to the second question, the surprising result from this research is that the amount of side-channel information necessary to break a product cipher is very small. Ciphers have not been designed with side channels in mind; hence, they are often very vulnerable to analysis using them. It is our belief that most operational cryptanalysis makes use of side-channel information. Sound as a side-channel--listening to the rotation of electromechanical rotor machines--was alluded to in Kahn [Kah67]. Van Eck radiation-another side channel--has been demonstrated as a way to get plaintext [vEc85]. And Peter Wright discussed data leaking onto a transmission line as a side channel used to break a French cryptographic device [Wri87]. Using side channels to break cryptographic primitives is such a powerful notion that it is reasonable to expect intelligence organizations to have built on the successes alluded to in the previous paragraph. By continuing to research both the collection of side-channel data and the vulnerabilities to specific primitives to side-channel data, it is our hope that we can begin to build mathematical algorithms that are more resistant to side-channel cryptanalysis as well as implementations that leak less side-channel data.
References [Ada97]
land95] [Bel96] [DES81] [SS91] [BS93] [Sih94] [BS97]
[BBS86] [BDL97]
C. Adams, "Constructing Symmetric Ciphers Using the CAST Design Procedure", Designs, Codes and Cryptography, v.12, n.3, Nov 1997, pp. 71-104. R. Anderson, "On Fibonacci Keystream Generators," Fast Software Encryption, 2nd International Workshop Proceedings, Springer-Verlag, 1995, pp. 346-352. S. Bellovin, "Problem Areas for the IP Security Protocols," Proceedings of the Sixth Usenix Unix Security Symposium, Jul 1996, pp. 1-16. ANSI X3.92, "American National Standard for Data Encryption Algorithm (DEA)," American National Standards Institute, 1981. E. Biham and A. Shamir, "Differential Cryptanalysis of DES-like Cryptosystems," Journal of Cryptology, Vol. 4, No. 1, 1991, pp. 3-72. E. Biham and A. Shamir, Differential Cryptanalys~s of the Data Eneryption Standard, Springer-Verlag, 1993. E. Biham, "New Types of Cryptanalytic Attacks Using Related Keys," Journal of Cryptology, v. 7, n. 4, 1994, pp. 229-246. E. Biham and A. Shamir, "Differential Fault Analysis of Secret Key Cryptosystems," Advances in Cryptology--CRYPTO '97 Proceedings, Springer-Verlag, 1997, pp. 513-525. L. Blum, M. Blum, and M. Shub, "A Simple Unpredictable PseudoRandom Number Generator," SIAM Journal of Computing, v. 15, n. 2, 1986, pp. 364-383. D. Boneh, R.A. Demillo, R.J. Lipton, "On the Importance of Checking Cryptographic Protocols for Faults," Advances in Cryptology-EUROCRYPT '97 Proceedings, Springer-Verlag, 1997, pp. 37-51.
110
John Kelsey et al.
[CG88]
[CKM94]
[HGS97] [Kah67] [KSW96]
[KSW97]
[Koc96]
[Koc98] [LM90]
[LMM91]
[Mat93] [MS94]
[Mer91] [Sch94]
[Sch96]
[vEc85] [Wri87] [XHW94]
W.G. Chambers and D. Gollmann, "Generatirs for Sequences with NearMaximal Linear Equivalence," IEE Proceedings, v. 135, pt. E, n. 1, Jan 1988, pp. 331-343. D. Coppersmith, H. Krawczyk, and Y. Mansour, "The Shinking Generator," Advances in Cryptology--CRYPTO '93 Proceedings, SpringerVerlag, 1994, pp. 22-39. C. Hall, I. Goldberg, B. Schneier, "Reaction Attacks Against Several Public-Key Cryptosystems," 1998, in preparation. D. Kahn, The Codebreakers, The MacMillan Company, 1967. J. Kelsey, B. SchneieL and D. Wagner, "Key-Schedule Cryptanalysis of IDEA, G-DES, GOST, SAFER, and Triple-DES," Advances in Cryptology - - CRYPTO '96 Proceedings, Springer-Verlag, 1996, pp. 237-251. J. Kelsey, B. Schneier, and D. Wagner, "Related-Key Cryptanalysis of 3-WAY, Biham-DES, CAST, DES-X, NewDES, RC2, and TEA," Information and Communications Secumty, First International Conference Proceedings, Springer-Verlag, 1997, pp. 203-207. P. Kocher, "Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS, and Other Systems," Advances in Cryptology--CRYPTO '96 Proceedings, Springer-Verlag, 1996, pp. 104-113. P. Kocher, personal communication, 1998. X. Lai, J.L. Massey, "A Proposal for a New Block Encryption Standard," Advances in Cryptology--EUROCRYPT '90 Proceedings, Springer-Verlag, pp. 389-404. X. Lai, J.L. Massey, and S. Murphy, "Markov Ciphers and Differential Cryptanalysis," Advances in Cryptology--EUROCRYPT '91 Proceedings, Springer-Verlag, pp. 17-38. M. Matsui, "Linear Cryptanalysis Method for DES Cipher," Advances in Cryptology--EUROCRYPT '93 Proceedings, Springer-Verlag, 1994, pp. 386-397. W. Meier and O. Steffelbach, "The Self-Shrinking Generator," Communications and Cryptography: Two Sides of One Tapestry, R.E. Blahut et al, eds., Kluwer Academic Publishers, 1994, pp. 287-295. R. Merkle, "A Fast Software Encryption Function," Advances m Cryptology--CRYPTO '90 Proceedzngs, Springer-Verlag, 1991, pp. 476501. B. Schneier, "Description of a New Variable-Length Key, 64-Bit Block Cipher (Blowfish)," Fast Software Encryption, Cambridge Security Workshop Proceedings, Springer-Verlag, 1994, pp. 191-204. B. Schneier, Applied Cryptography, 2nd Edition, John Wiley ~z Sons, 1996. W. van Eck, "Electromagnetic Radiation from Video Display Units: An Eavesdropping Risk," Computers 8_4Security, v. 4, 1985, pp. 269-286. P. Wright, Spycatcher, Viking Penguin Inc., 1987. S.B. Xu, D.K. He, and X.M. Wang, "An Impelementation of the GSM General Data Encryption Algorithm A5," CHIANCRYPT '94, 11-15 Nov 1994, pp. 287-291.
On the Security of Digital Tachographs * Ross Anderson Cambridge University Computer Laboratory Pembroke Street, Cambridge CB2 3QG r j al4@cl, cam. ac. u k
A b s t r a c t . Tachographs are used in most heavy vehicles in Europe to control drivers' hours, and for secondary purposes ranging from accident investigation and environmental enforcement to the prevention of fraud. Their effectiveness is under threat from increasing levels of sophisticated fraud and manipulation. We examine this in the context of recent EU proposals to move to smartcard-based tachograph systems, which are aimed at cutting fraud and improving the level of enforcement generally. We conclude that the proposed new regime will be extremely vulnerable to the wholesale forgery of smartcards and to system-level manipulation, which in turn could lead to a large-scale breakdown in control. We present this as an interesting and important problem to the security community.
K e y w o r d s : smartcard, tampering, fraud, tachographs
1
Introduction
Vehicle accidents where the most likely cause was the driver falling asleep at the wheel account for a b o u t 16% of the total on all UK roads, and about 23~0 on motorways [11]. Death or serious injury is significantly more likely than in other types of accident, p r o b a b l y because of the greater speed on impact [26]. Consistent figures have been reported from the USA, Germany, Israel and Sweden [11]. By comparison, vehicle accidents where alcohol is a significant contributory factor range from 3.1% in the UK through 4.9% in Sweden and 9.5% in G e r m a n y to 14.1% in Finland [28]. Heavy commercial vehicles are a particular problem as they can do much more damage in a crash. Although they are involved in only 6% of serious accidents, these include 16% of fatal accidents. In Europe, the principal control on the hours worked by heavy vehicle drivers is the tachograph - - a device fitted behind the speedometer which logs the vehicle's speed, distance and mode of work on a waxed p a p e r chart, in the centre of which the driver must write his name, starting location, vehicle number, * This paper reports research funded by the Department of the Environment, Transport and the Regions during 1997-98.
112
Ross Anderson
date and odometer reading before commencing his journey. Tachograph use is mandatory for most heavy trucks in the European Union, and about half the bus and coach fleet. Drivers must carry with them their charts for the current week and the last driving day of the previous week, after which they must be stored at the employer's premises for one year. Although the system was introduced to control drivers' working hours, it has since acquired a number of other uses. The police find tachograph charts helpful in investigating accidents and other offences, such as unlicensed toxic waste dumping, while many operators use them to prevent theft of fuel by drivers (which can amount to 5,000 ECU per vehicle per year otherwise) [24]. Almost a third of tachograph charts are already scanned by fleet operators or bureaux and fed into fleet management systems for this purpose; these operators would welcome digital tachographs. However, safety remains the main purpose of the system. About 200 fatalities a year result from sleep-related accidents involving heavy vehicles in Britain; perhaps half of these might be prevented by draconian enforcement of driving hours regulations, while the total might be doubled by a breakdown in control. In countries with lax enforcement, the prospective gains should be higher and the potential losses lower. In total, a uniformly high level of control might save the EU a thousand lives a year and a sum in the low billions of ECUs, while a breakdown might have additional human and economic costs on the same scale.
2
Tachograph
Tampering
and
Fraud
There is therefore great concern at a growing wave of tachograph fraud and tampering, by which both drivers' hours and speed regulations are flouted on a large scale. Most of this fraud is motivated by economic pressures on vehicle operators, and might be reduced by better tachograph systems. Good security engineering requires a detailed understanding of threats. We therefore list here the main techniques used at present for tachograph fraud and tampering. Our figures for their relative prevalence come from a recent survey by the UK Vehicle Inspectorate of convictions for tachograph offences [9]: the sample size was 1060, made up of 854 convictions of drivers and 206 convictions of operators. 1. Most frauds do not involve tampering but exploit procedural weaknesses. These accounted for 68% of driver and 71% of operator convictions. - A very common fraud is 'ghosting' - - manipulating tachograph charts so that there appear to be more drivers than there actually are. For example, a company with premises in Dundee and Southampton should have four drivers in order to operate one vehicle per day in each direction; the distance is about 800km and the journey takes about 10 hours which is illegal for a single driver to do without an overnight rest.
On the Security of Digital Tachographs
-
-
113
The standard fiddle is to have two drivers, who meet en route at Penrith, change trucks, and insert new paper charts into the tachographs. The driver from Southampton now returns there with the vehicle from Dundee. When stopped and asked for his charts, he would show the current chart from Penrith to Southampton, the previous day's for the leg from Southampton to Penrith, the day before's for Penrith to Southampton, and so on. In this way he would give the false impression that he spent every other night in Penrith and was thus legal. This practice, of swapping vehicles halfway through the working day, is widespread [21]. Much casual deceit involves very simple manipulation, such as altering the clock to simulate a rest period, inserting a fresh tachograph chart and hiding the old one, forging a chart by hand (perhaps with the help of compasses or bottle tops) or even driving with no chart at all and hoping to produce an old one by sleight-of-hand if stopped [29]. These tricks are likely to be detected if the vehicle is stopped by an alert officer; they are often used when a normally honest driver is delayed and is under pressure to deliver a load rather than take an unscheduled break. More sophisticated procedural frauds include 'forgetting' to write the date on the chart centre field, representing a hitch-hiker as a co-driver, using a chart for a 140 k m / h tachograph in a 125 k m / h device, filling completely fictitious centre field details, and representing the start point of the journey as an obscure village [21]. They often involve collusion with the operator, as with ghosting: often, when the operator is ordered to produce charts and supporting documents such as pay records, ferry tickets etc., his office will conveniently burn down.
2. The next largest category of fraud involves tampering with the supply to the tachograph instrument, including interference with the power and impulse supply, cables and seals. Offences involving electronic tachographs amounted for 23% of driver convictions and 18% of operator convictions; offences involving the older mechanical instruments added another 2% and 3% respectively. -
-
Such frauds often involve collusion with fitters. Electronic tachographs get their input from a sensor in the gearbox, which sends electrical impulses as the prop shaft rotates, and a common attack is to unscrew the sensor about 2mm. This causes the impulses to cease, as if the vehicle were stationary. To prevent this, gearbox sensors are fixed in place with a wire and lead seal. Fitters are bribed to wrap the wire anticlockwise rather than clockwise, which causes it to loosen rather than break when the sensor is unscrewed. The fact that seals are kept by workshops rather than by individual fitters complicates prosecution. Some determined offenders fit a switch into the cable so that the input can be drawn either from the real gearbox sensor or from an additional sensor that is mounted under the driver's seat rather than being rotated by the gearbox [21]. At least one operator has had all its vehicles wired to interrupt impulses on demand.
114
Ross Anderson
- However the two most common techniques are very simple: to insert an earthing wire into the cable, thus shorting out the impulses, and to replace the tachograph fuse with a blown one. The incidence of the former attack has been reduced recently by a switch to armoured cables 1 but blown fuses continue to provide a plausible excuse [21]. 3. The third category of fraud is tampering with the tachograph head itself (the unit mounted behind the speedometer). 4% of driver offences, and 5% of operator offences, are ascribed to this kind of abuse of the older mechanical tachographs, with a further 1% and 2% respectively booked for attacks on the newer electronic devices. Most tachograph head tampering involves miscalibration, which is often performed with insider assistance. Tachograph heads contain potentiometers or switches with which the radius of the road wheels can be set, thus translating impulses into distance and impulse rates into speed. Corrupt fitters often set these to indicate some 90% of the actual road speed. Drivers may also break seals and change the calibration directly [22], and steam cleaning often destroys seals anyway thus providing a good excuse. In some cases, drivers have broken seals and repaired them invisibly [10, 21]. The current seals are easy for skilled persons to defeat, and their knowledge is spreading rapidly [14]. Work needs to be done urgently in this area. The remaining attacks tend to exploit vulnerabilities discovered by chance, knowledge of which spreads more rapidly among drivers than among policemen. They include: - in many devices, one can bend the styli that write on the chart and thus falsify the recorded speed [22]; - one device can be caused to register zero speed by inserting a piece of wire into the tachograph head. This shorts the circuitry to earth without breaking the seal. In another device, an earth wire can similarly be inserted into a sealed cable joint. These wires can be pulled out in a second if the vehicle is stopped by police [22]; - the earlier electronic tachographs had the power for the impulse generators and head motors wired through the ignition circuit to minimise battery drain, while the clock and lighting circuits come directly from the battery. However diesel engines once started can run with the ignition off, in which case the device is frozen. The speed stylus remains roughly where it was, while other traces will indicate a vehicle at rest; - in the following generation of electronic tachographs, the clock and the motors operated at different voltages, and the device could be frozen by reducing the supply voltage. In one model, the overspeed warning light helpfully came on at just the right voltage [21]; with some models, one could wire up a flasher unit to interrupt the supply voltage and thus reduce the rate at which the chart table turned; when features to detect supply voltage interruption were introduced 2, not all of them were well engineered. In one device, an alarm causes the -
-
1 Commission Regulation 2470/95 2 as a result of Commission Regulation 3314/90
On the Security of Digital Tachographs
115
speed stylus to strike downwards, which has no effect if the vehicle is stationary when power is resumed. In another device, the alarm can be suppressed by connecting two generally unused terminals together [21]; - with one model, the driver can press down on the centre of the speedometer plastic and prevent the needle from moving when he accelerates from (say) 30 to 60 km/h. As the assembly is driven by a stepper motor, both the speedometer and the underlying tachograph will register 30km/h less than the true speed until this speed next drops below 30km/h.
Fig. 1. A tachograph with an interruptor controlled using a radio key fob.
4. The state of the tampering art is a radio controlled interruptor device depicted in figure 1. This is a red plastic cylinder bearing the inscription 'Voltage Regulator - - Made in Japan' but which is actually inserted into the tachograph cable and controlled by the driver from the cab using the remote control key fob. A first press causes the indicated speed to drop by 10%, a second press cuts it by a further 10%, a third press causes it to fall to zero, and a fourth causes the device to become inactive so that the tachograph and speed limiter return at once to proper operation. Such devices are extremely hard to find as they can be hidden at many different places in the truck's cable harness (miniature versions are even found inside the tachograph head itself). Police officers who stop a speeding truck equipped with such a device, and cannot find it, have difficulty getting a conviction
116
Ross Anderson as the sealed and apparently correctly calibrated tachograph contradicts the evidence from their radar or camera. Only six convictions - - five driver and one operator - - fall into the category of 'Radio controlled devices etc' but intelligence suggests that their use is becoming widespread; the lack of convictions is due to the difficulty of detection.
As with many of the so-called 'victimless' crimes that are usually detected only by enforcement action, hard figures on the extent of tachograph manipulation are not readily available. In any case the extent and methods of fraud vary by country. In countries with poor enforcement, drivers may simply not use charts at all (industry sources named two member states with negligible chart sales). However, the consensus of informed people in the UK is t h a t maybe 5-10% of drivers are persistent offenders and about half offend occasionally. The persistent offenders tend to be associated with certain operators, some 20-40% of whose vehicles turn out to have signs of past tampering when examined closely at inspection stations. These signs include seal defects, calibration faults and suspiciously loose wiring [30]. The significance of this is that well funded attempts to defeat future t a m p e r proofing systems can be anticipated; at least one of the suspect operators has over 500 trucks. Tachograph tampering brings secondary safety hazards, in that most heavy vehicles must now be fitted with speed limiters, which are usually driven from the tachograph head. So drivers who wish to exceed the speed limit may tamper with the tachograph, and drivers who tamper are tempted to speed as well. Speed limiter tampering is even more widespread than tachograph tampering (as drivers can also attack the limiter, and the cable between it and the tacho head, in various ways). Unfortunately, the introduction of limiters has caused some truck makers to de-rate tyres and other systems, making speeding significantly more hazardous. One might think that as vehicle systems become more integrated, using standards such as CANBUS, tampering would decrease because unauthorised modifications to systems can have a side-effect on safety. Experience shows that this hope is vain. One truck ran its antilock braking system from the same circuit as the tachograph, so that a driver who replaced the fuse with a blown one would disable his ABS. Many drivers still replaced the fuse. It might also be thought that the situation has been improving over time, with more recent tachographs being more resistant to tampering. This is not the case; the move from mechanical to electronic equipment led to a tenfold rise in convictions for tampering with the supply. Older devices used rotating wire cables and an a t t e m p t to physically jam the odometer caused gear teeth to strip or the cable to shear; the move to electrical impulses was a large setback for enforcers as it enabled odometers to be jammed (which in turn made other frauds easier), while cable earthing and then interruptors opened up a whole new
On the Security of Digital Tachographs
117
set of attacks. Attempted improvements just made the frauds more complex and difficult to detect [21]. It is now generally agreed that the cable from the sensor to the tacho head should be protected by cryptography. But there is still no agreement about how, and it is unclear how unique crypto keys can be loaded safely into the sensor and the head when the fitters are in the pay of the attacker. (In fact, one supplier has proposed loading the same crypto key into every sensor unit.) A further set of problems comes from the EU member states' different operational models. For example, in the UK much of the enforcement is done by roadside checks, while in the Netherlands it is mostly at operator premises. In addition, a variety of computer systems have been developed to support enforcement, and there are also a number of fleet management systems which are used, as noted above, to prevent theft of fuel by drivers. The actual tachograph is thus a component of many different enforcement and management systems. 3
The
Tachosmart
Projects
This is the background to the emerging consensus in Europe for action at the Community level to improve the dependability of drivers' hours recording [7, 29]. The EU tachograph market is dominated by two companies, VDO Kiezle of Germany and TVI of Britain, with about 75% and 25% respectively of an equipment market worth about 100 MECU per annum. Sales made by analysis bureaux, fitters, calibrators, suppliers of consumables and spare cables, etc., make up the total tachograph market value to some 300-400 MECU per annum - - about an order of magnitude less then the annual economic cost of sleep related accidents. So a change to the system can be motivated by enforcement considerations, and the EU has funded a 'Tachosmart' project to develop a more tamper resistant electronic replacement for the current chart-based systems. There have been three phases so far, and Tachosmart 3 resulted in prototype devices built by manufacturers in five different member states [27]. The prototypes use a smartcard rather than a chart to personalise the equipment. This introduces a significant memory constraint. The cards used in the prototype have 8K E E P R O M of which a little over 5K is available for driver record keeping; production cards are planned to have twice as much memory. However, paper charts keep speed history with a resolution, under microscopic examination, of 1 second - - equivalent to a capacity of several tens of kilobytes per chart. Recording 28 days' (or even 14 days') driving activity on a commodity smartcard is thus out of the question. Thus a smartcard based tachograph must either have auxiliary memory, store a much compressed record, or both.
118
3.1
Ross Anderson
T h e p r o p o s e d solution
The proposed solution is that a memory in the vehicle unit will retain 365 days' drivers' hours data, plus a speed history for the last 24 hours. The driver card will have memory for 28 days' driving hours, but no speed history. There will also be cards for the vehicle operator, vehicle inspectors, calibration stations and fitters which will give differing kinds of access to the data in the vehicle unit. The vehicle operator, for example, will be able to download complete data for integration into a fleet management system. Some of the interfaces, including those between the various types of card and the vehicle units, will be standardised across the EU. In the prototype systems, the impulses are generated as before in a gearbox sensor, and passed to the vehicle unit (the means of protecting the impulses from tampering are left to the manufacturer). T h e y are converted in the vehicle unit into a speed history which is retained in memory for 24 hours and may be downloaded using an operator or inspector card. The vehicle unit can also store more alarms than the card; typically 10-25 of each type. It will normally have one or two cards in it, and in normal operation these will be drivers' cards. It will then associate driver's and if appropriate co-driver's hours to them in its internal log. A much compressed history of drivers and hours will be kept for 365 days. Cards (of whatever type) are authenticated to the tachograph by a bidirectional challenge-response using keys that are common throughout the system; data exchanges between tachographs and the cards carried by drivers and inspectors are claimed by the security specification to employ digital signatures [8]. However, the specification of the card contents does not contain enough room for signature keys [15, 16]. We conclude that, as often in the smartcard industry, 'digital signature' or 'data signature' actually means a message authentication code computed using a common secret. This leaves it unclear how signatures are to be verified in the software of external systems, and how the system will interact with German digital signature law. Key material is loaded into drivers' cards by a national card issuing authority. The vehicle units are manufactured with embedded secret keys that are common throughout the system, and every time they are calibrated, new signature keys and certificates are loaded from a calibration card using the common secret for authentication. The protocols used are not specified in sufficient detail for close analysis. In theory, the driver's card is retained in the instrument during driving (though ensuring this is harder than it looks), and at the end of the trip it is updated with a signed record of working hours plus the last three alarms (overspeeds, tampering events, etc) if any. The card can retain up to 28 days' records, depending on driving conditions. At any time, an inspector can request a print-out from the vehicle unit in one of two ways. If he has a control card, he can obtain a signed copy of the contents
On the Security of Digital Tachographs
119
of the tachograph memory transmitted to a lap-top over a serial link [17]; if not, the driver can print out appropriate data from the vehicle unit's memory and hand it to the inspector.
3.2
Anticipated problems
The simultaneous operation of vehicle fleets with paper charts and driver cards is expected to lead to serious enforcement problems. In cases involving ghosting, it is predicted that the operator will have one new vehicle that uses a card and an older one that uses charts. (This is expected to raise the price of older vehicles and depress new vehicle sales - - an effect already seen with previous changes in regulations, not just for tachographs but also for speed limiters and emissions.) Conscientious operators will buy systems that will enable downloaded digital tachograph data to be merged with data from scanned analogue charts, fuel purchases, drivers' overtime etc. But, as noted above, there is no EU requirement for such systems. Some countries are considering making them mandatory at the national level, with controls mirroring those proposed for cryptographic systems in general [4] in that large companies would be trusted to run their own systems while the smaller operators would have to use an approved bureau service. This would extend the current system whereby the workshops of large, trusted operators may be approved as tachograph fitters while small operators must use third parties. In countries where downloading is not mandated, the theory is that drivers would take paper printouts from any digital tachographs they used and keep them along with the paper charts from any analogue devices. This would make enforcement harder; the printouts from the Tachosmart prototype systems are much easier to forge than charts are. It is also unclear how operators could discharge their legal duty to maintain records of drivers' hours [7], as the vehicle units containing this information could be distributed all over Europe and beyond. There will be serious consequences for competition and the internal market. From July 1998, the EU will have unrestricted cabotage - - drivers from one member state will be able to carry goods in another state. The current wide variation in tachograph enforcement does not pose a competitive issue so long as the primary control is carried in the vehicle. However, if Tachosmart moves the primary control to the computer system of the vehicle operator or his bureau, and this control is only effective in some countries, then drivers from other countries will have a competitive advantage which will exert downward pressure on the quality of control everywhere. This will be further complicated by the fact that there are vehicle operators with depots in (say) Belgium and Holland, and who operate a given vehicle out of both of them depending on the day of the week; there are also operators who share vehicles. Even more serious problems may come from non-EU vehicles. The proposals are silent on how the control factors, from card issue onwards, will be managed in that case.
120
Ross Anderson
The second set of anticipated problems comes from the loss of detailed redundant data. At present, fraud and accident investigation depends on comparing the speed and distance traces against the claimed journey end points and against other documents such as delivery notes and ferry tickets. In the digital system, speed history will be retained in the vehicle unit for only 24 hours, and if not downloaded will be lost; while journey end points will only be logged at the granularity of a region, as a keyboard to enter the names of towns would be expensive. It is not clear how the inspector is to do his job in a digital environment. The detailed, redundant data on the chart often provides inspectors with grounds for suspicion to justify a detailed examination. However digital systems either indicate a violation or do not. Thus while the older analogue systems degrade gracefully under attack, digital systems fail abruptly. This means that a higher level of tamper resistance is needed in the digital environment, to which we will return below. As for the use of additional inputs, the draft regulations permit countries to require inputs from GPS for domestic vehicle, if they wish ([7] article 15 clause 5a). However the Tachosmart standards do not support this - - there appears to be no provision for a GPS interface, despite the regulations permitting individual member states to require the use of GPS. If GPS units are not connected to the tachograph but carried as separate equipment, then many of their potential benefits will be lost. The third problem set concerns reliability. Smartcards started out as bank cards, designed to be used several times a week in cash machines; when they are used heavily such as in building access control and transport ticketing - - and especially in grimy environments - - common models experience failure rates of up to 7 ~ per annum. This is not compatible with the expected 5 year card lifetime and 10,000 hour mean-time-to-failure [15], and would be onerous on companies and workshops given that cards are to be issued centrally with replacement taking as much as two weeks [18]. This will facilitate simple frauds by drivers and operators. By destroying a card (e.g. by applying mains electricity) a driver can eliminate a problematic record, and under the regulations he will be allowed to drive for 15 days without a card. Such card-destruction attacks have been perpetrated on bank smartcard systems in the UK, France and elsewhere; by forcing the system back on less secure stand-in procedures many avenues of abuse are opened up. One UK bank has had to open a specialist laboratory to examine failed smartcards presented by customers. Other reliability issues impinging on security include the failure rate of printers in a commercial vehicle environment. Many of these objections can be overcome by changes to the details of the specification. However there is a much more fundamental problem with the Tachosmart concept and that is the belief that smartcards are tamper-proof.
On the Security of Digital Tachographs 4
Smartcard
Security
121
Issues
The draft regulations state that 'the total system, including the connections to the speed and distance sensor, must be tamperproof' ([7] p 27). This cannot be achieved given the card technology currently available in Europe. For years, smartcard vendors claimed that their products were tamperproof, or as nearly so as made no difference. In the last few years, this claim has been demolished by a large number of attacks on pay-TV and on other systems. The state of the tampering art is constantly evolving. A number of historical attacks are described in [2], together with the techniques used by pirates in 199496. The state of the art in early 1998 is somewhat more advanced; professional pirates now use microscopes fitted with lasers and microprobes to extract card data quite rapidly. One technique is to fit a probe to the line that controls the instruction latch, and use this to prevent new instructions being loaded from the bus. Now when the card is clocked there will be no jump instructions, and all the words in memory will appear on the bus in sequential order. A second microprobe is then used to recover the memory contents from one bus line at a time. The laser is used to remove the passivation layer from the card surface over the feature to be probed; this avoids removing the whole layer, which may set off an alarm in some card designs. It also creates a depression in which the microprobe will lie stably. The cost of the laboratory equipment needed to perform this attack is about 150,000 ECU, though second-hand equipment is much cheaper. The equipment is also available at many university laboratories, and at least one EU university teaches chipcard breaking techniques to undergraduates as part of their course work [3]. The effect of attacks, carried out both by students and by professional pirates, has been to force pay-TV operators such as BSkyB to change their entire card base about once a year, moving each time to a new technology of card with ever more expensive tamper resistance mechanisms. To date, the pirates have always managed to keep up; an example of the Sky series 9 card, opened with hand tools and yet still functional, can be seen in [2]. The arms race between attackers and defenders is expensive; for example, the technology supplier to BSkyB has seen its research department grow from 60 to 200 people over the past four years, while at a recent trial of a card forger, BSkyB claimed that forgery cost them s [13]. Yet there is still no real breakthrough in tamper prevention, and senior scientists at some large semiconductor makers expect none, for reasons discussed in [2]. Attacks have mostly taken place on systems where universal secrets are stored in cards, and where the compromise of these secrets can enable cards to be forged and sold for mass use (such as pay-TV). T h e y have not been widespread on cards that store only individual secrets and where forgery can bring only limited benefits (such as the SIM cards used in mobile phones).
122
Ross Anderson
The system prototyped in the Tachosmart 3 project falls squarely into the former category. The workshop, operator and inspector cards contain only common secrets and so we expect that they will be rapidly duplicated and sold through the channels currently used to market interruptors (such as ferry terminals and transport cafes). The workshop card will enable alterations to be made to tachograph parameters. The calibration cards, if required after such changes (the specification is vague), contain a list of keys plus a common secret for authentication. These cards will also be duplicated as they will allow speed limits to be changed. The workshop card can clear the tachograph memory and the calibration card can then be used to re-load keys. If a forged calibration card were used for initialisation of the device in the first place, then the original keys could be re-loaded thus causing an embarrassing record to vanish without trace. If, as appears in [15, 16], the 'data signatures' do not really use asymmetric cryptography but message authentication codes (MACs) computed with shared secrets, then once these common secrets become known, the system will be completely broken. Pirates will be able to manufacture any type of card and forge any type of record. We will now look in more detail at the likely effects of attacks under various assumptions.
5
What
Will Go Wrong
If we now work down the list of existing frauds, we can get a rough idea of the effect of introducing the proposed technology in its present form. 1. The initial attraction of the smartcard-based digital tachograph was that it might stop ghosting which together with other chart manipulations amounts for most UK convictions. However, this would only be the case if (a) digital devices were tamperproof (b) cards could not be forged or duplicated and (c) the entire EU vehicle park were retrofitted with digital devices. None of these is likely to be the case. Retrofitting a vehicle with a tachograph other than of the original type is expensive, as it involves custom engineering; one vendor reported 19 replacement sales last year against 20,000 units repaired. Yet in the absence of retrofit, ghosting will be easy even without system penetration; the operator will just use one digital and one analogue truck. Attacks involving simple manipulation will become much easier as drivers will be able to cause tachograph malfunctions in many ways (mains electricity in the card, electrical contact failure, sand in the printer, ...). The digital tachograph is much like a burglar alarm in that the attacker only has to destroy confidence in it, by making it appear to be unreliable, in order to defeat it. The lessons from attacks on digital burglar alarms [23] should be studied carefully.
On the Security of Digital Tachographs
123
Frauds in which the operator colludes with drivers and covers up for them by destroying records in the event of an investigation will become more common if record destruction involves only a 'disk crash' on a PC rather than burning down part of the premises as at present. In addition, we expect that the larger rogue operators will invest in manipulating cards. If operators are allowed to keep records on paper, the printouts from vehicle units will be easy to forge, and this may be the method of choice for one-man operators where downloading is not compulsory. 2. Attacks on the sensor seal are not tackled by the proposals, and although new systems are supposed to protect the signalling, this is left to individual vendors with no standards being set. The one unit we examined sent conventional unprotected pulses plus a second channel of encrypted pulses. The latter is used by the tachograph, but only the former is available to the speed limiter. So an interruptor could still be fitted between the tacho head and the limiter. The simultaneous entry into the market of several new vendors with no experience of the industry will be an aggravating factor. Reliability generally will fall, as will inspector's level of experience with the equipment in use. So old tricks (such as replacing the tachograph fuse with a blown one) will gain a new lease of life. 3. As noted above it seems that tachograph calibration may be tampered with given workshop cards, which contain only shared secrets and so should be widely forged. At present, intentional miscalibration can be proved if the setting within the instrument differs from that recorded on the plate, and the seal is intact; it is unclear what evidence could be extracted from the new digital system. More generally, attacks on computer security systems involve the opportunistic exploitation of implementation defects, and these are more prevalent in new systems [1]. The introduction of digital tachographs will be no different: vehicle inspectors report increased tampering attempts whenever new technologies are introduced. 4. Finally, there will be a number of new avenues of attack which will appeal to the underground workshops that currently make their money from interruptors. One obvious target is the vehicle unit itself. The proposed regulations are silent on the anti-tampering mechanisms and standard of testing required here; but the move from an analogue tachograph to a digital one raises the possibility that tachographs' program code could be modified, or circuit boards substituted, or totally bogus devices manufactured which would conform with the regulations in all externally visible ways but contain extra features for the driver or operator. Criminals have already used altered or completely bogus cash machines and point-of-sale terminals in bank card fraud; bogus digital tachographs are to be expected. The hope has been expressed that testing under the ITSEC programme will ensure that the equipment is fit for purpose. This may be the case eventually, but at present the institutions participating in ITSEC are oriented to evaluating
124
Ross Anderson
military computer systems for NATO use. An early attempt to produce ITSEC testing criteria for smartcard-based systems yielded a document which emphasises the secrecy of the chip design [25] - - a relatively pointless goal given that attackers can buy microscopes. Much more convincing test criteria have been developed by VISA [31].
6
Conclusions
It is often thought that introducing advanced technology such as smartcards and cryptography will make systems more secure. The Tachosmart proposals provide an interesting counterexample; there is consensus among experienced enforcement officers that they will make the problem worse (as was the case with the earlier move from mechanical to analogue electronic instruments). The practical issues raised are important and urgent. Even with a good design, the costs of a changeover will be significant; the recent introduction of armoured tachograph cable cost many millions of ECU. Yet the design currently proposed does not address the most common abuses, and those that it does try to tackle are not dealt with effectively. Its introduction would ensure that much fraud will become almost impossible to detect, and that the regulations will fall into disrepute. The cost of such a failure could amount to thousands of additional traffic fatalities, and billions of ECU, before control could be restored. We therefore present the problem of how to design a serviceable digital tachograph as a worthy problem to the research community.
Acknowledgements: A number of people provided valuable background information including Peter Dean, Department of Transport; Jean-Philippe Leli~vre, ERTICO; Gregory Clough, UCL; John Martin, Hampshire Constabulary; Gary Geldart, Vehicle Inspectorate; Dick Edmonds, Phil Needham and Nick Rendle of Lucas Kienzle; and Frank Clish of TVI. Roger Needham of Microsoft Research and Markus Kuhn of Cambridge University provided a sounding board as this analysis was developed. References 1. RJ Anderson, "Why Cryptosystems Fail", in Proceedings of the 1st ACM Conference on Computer and Communications Security (November 1993) pp 215-227 2. RJ Anderson, MG Kuhn, "Tamper Resistance - - a Cautionary Note", in The Second USENIX Workshop on Electronic Commerce Proceedings (Nov 1996) pp 1-11 3. E Bovenlander, invited talk on smartcard security, Eurocrypt 97 4. 'Licensing of Trusted Third Parties for the Provision of Encryption Services', Department of Trade and Industry, March 1997 5. EU Regulation 3820/85 (drivers' rules)
On the Security of Digital Tachographs
125
6. 'Council Regulatwn (EEC) no 3821/85 of 20 December 1985 on recording equzpment in road transport' No L 370/8, 31/12/85 7. 'Draft Council Regulation Amending Councd Regulation (EEC) no 3821/85 and Couned Dzreetive 88/599/EEC on Recording Equzpment in Road Transport', EU, The Council, Interinstitutional file no 94/0187 (SYN), 2 July 1997 8. Euraxiat - - Security Analysis, March 1997 9. "False Records: Analogue Tachographs", G Geldart, Vehicle Inspectorate, 22nd January 1998 10. Hampshire police training video 11. "Sleep related vehicle accidents", JA Home, LA Reyner, in British Medical Journal v 310 (4/3/95) pp 565-567 12. "Falling Asleep at the Wheel", J Horne, L Reyner, preprint, 29/2/96; distributed at Falling Asleep at the Wheel, Loughborough University 18-19/11/96 13. '~Ex-radio chief 'masterminded' TV cards scam"Irish Independent 17/2/98 14. "Vulnerability Assessment of Security Seals", RG Johnson, ARE Garcia, in Journal of Security Administration v 20 no 1 (June 97) pp 15-27 15. 'Driver Card 16kByte version 00.01.01', 24/11/97 16. 'Workshop, Control and Company Card version 00.01.01', 24/11/97 17. 'External Interface version 00.01.01', 25/11/97 18. 'Card Issuing version 00.02.00', 25/11/97 19. O Kocar, "Hardwaresicherheit von Mikrochips in Chipkarten", in Datenschutz und Datensicherheit v 20 no 7 (July 96) pp 421-424 20. "Falling Asleep at the Wheel", JS Martin, report on conference of the same name, Loughborough University 18-19/11/96 21. "Principal frauds used on the current tachographs", J Martin, UK response to ERTICO consultation document on tachograph falsification 22. 'Princ~pales fraudes partant atteinte d l'mtdgrit4 du chronotachygraphe', Minist~re de l'Equipement, des Transports et du Tourisme, Direction des Transports Terrestres, Sous-direction des transports routiers, bureau R3 (France) 23. "Denial of Service: An Example", RM Needham, in Communications of the ACM v 37 no 11 (Nov 94) pp 42-46 24. "Spying tags could stop fuel fraud", N Nuttall, in The Times (Interface Section) 28/1/98 p 7 25. 'Common Criteria for I T Security Evaluation -- Smartcard Integrated Circuit Protection Profile ', registered at the French Certification Body under the number PP/9704; Motorola, Philips, Siemens, SGS-Thomson, Texas Instruments, October 1997. 26. "Driver sleepiness as a causal factor in accidents on the M180/A180 in South Humberside: A Preliminary Analysis", L Reyner, at Falling Asleep at the Wheel, Loughborough University 18-19/11/96 27. 'Tachosmart 3 Digital Tachograph -- Final report', Thomson CSF, 1996 28. "Ministers unveil plans to toughen drink-drive laws", in The Times 2/2/98 p 2 29. "Commission wants black box, smart cards to enforce road safety", A Tortes, Reuters news item 0804 (2/9/94) 30. 'A Report on the Fztment of Tachograph Interruptors', 10/8/95, Vehicle Inspectorate, Wighill Lane, Walton, Wetherby, West Yorkshire 31. 'Integrated Circuit Chip Card -- Security Guidelines Summary for IC Chip Design, Operating System and Application Deszgn, Implementation Verification' v
2.1, 4/11/97
A n A u t h o r i z a t i o n M o d e l and Its Formal S e m a n t i c s E l i s a B e r t i n o , 1 F r a n c e s c o Buccafurri, 2
E l e n a F e r r a r i , 1 P a s q u a l e Rullo 3
1 Dipartimento di Scienze dell'Informazione Universitk di Milano - Milano, Italy {bertino, f errarie}9 unimi, it 2 Dipartimento D I M E T Universitk di Reggio Calabria, Reggio Calabria, Italy bucca9 ing. unirc, it 3 Dipartimento di Matematica Universitk della Calabria, Arcavacata di Rende, Rende, Italy rullo9 ing. unicr, it
A b s t r a c t . This paper describes an advanced authorization mechanism based on a logic formalism. The model supports both positive and negative authorizations. It also supports derivation rules by which an authorization can be granted on the basis of the presence or absence of other authorizations. Subjects, objects and authorization types are organized into hierarchies, supporting a more adequate representation of their semantics. From the authorizations explicitly specified, additional authorizations are automatically derived by the system based on those hierarchies. The combination of all the above features results in a powerful yet flexible access control mechanism. The specification language of the system is an extension of Ordered Logic with ordered domains. This is an elegant yet powerful formalism whereby the basic concepts of the authorization model can be naturally formalized. Its semantics is based on the notion of stable model and assigns, to a given set of authorization rules, a multiplicity of (stable) models, each representing a possible way of assigning access authorizations. This form of non-determinism entails an innovative approach to enforce access control: when an access request is issued, the appropriate model is chosen on the basis of the accesses currently under execution in the system.
1
Introduction
T h e i n t r o d u c t i o n of an access c o n t r o l s y s t e m w i t h i n a n y o r g a n i z a t i o n entails two m a i n tasks. T h e first is t h e i d e n t i f i c a t i o n a n d specification of s u i t a b l e access c o n t r o l policies. A n access c o n t r o l p o l i c y e s t a b l i s h e s for each s u b j e c t t h e a c t i o n s h e / s h e c a n p e r f o r m on which o b j e c t w i t h i n t h e s y s t e m u n d e r which c i r c u m s t a n c e s . T h e second t a s k is t h e d e v e l o p m e n t of a s u i t a b l e access control m e c h a n i s m i m p l e m e n t i n g t h e s t a t e d policies. M a n y a d v a n c e d a p p l i c a t i o n s ,
128
Elisa Bertino et al.
such as workfiow systems, and computer-supported cooperative work, have articulated and rich access control requirements. These requirements cannot be adequately supported by current access control mechanisms which are tailored to few, specific policies. In most cases, either the organization is forced to adopt the specific policy built-in into the access control mechanism at hand, or access control policies must be implemented as application programs. Both situations are clearly unacceptable. A possible solution to this problem is the development of advanced access control mechanisms, which extend the expressive power of access control mechanisms currently available in commercial DBMSs and research prototypes. Extending the expressive power of existing authorization models has, as its counterpart, an increase of the complexity of the model. A method to manage such increased complexity is to provide a formal semantic foundation for the model. In this paper we present a step in this direction by proposing an authorization mechanism based on a logic formalism. The model is based on the closed policy with negative authorizations and the enforcement of the "strongest authorization takes precedence" principle. Our model also provides the possibility of specifying derivation rules by which new authorizations can be derived on the basis of the presence or absence of other authorizations. Authorizations can be granted to a single user as well as to a group or to a role. Subjects, objects, and authorization types are organized into hierarchies, supporting a more adequate representation of their semantics. Authorizations automatically propagate along these hierarchies according to a set of propagation rules. The combination of all the above features results in a powerful yet flexible authorization model.
The authorization specification language of the system is an extension of Ordered Logic (OL) [5, 6, 13] with ordered domains. This is an elegant yet powerful formalism whereby the basic concepts of the authorization model can be directly represented. Its semantics is an extension of the stable model semantics of logic programs [10, 20] to deal with hierarchies, ordered domains, and true negation. According to this semantics, more stable models can be assigned to a given set of authorization rules, each representing a consistent set of assignments of access authorizations. This form of non-determinism entails an innovative approach to enforce access control: when an access request is issued, the set of authorization rules against which the access request must be verified is chosen on the basis of the accesses currently under execution in the system. The development of flexible authorization models has been addressed in other papers [2-4, 9, 11, 19]. Most of the previous proposals, however, have one or more of the following shortcomings: (i) only group [4, 9, 11] or role hierarchies [19] are supported; (ii) only propagation rules [9,19] or derivation rules [2, 3, 11] are supported; (iii) very limited form of derivation rules are supported [2, 3, 9] (iv) very restrictive conflict resolution policies with no exceptions are adopted [2-4]; (vi) object hierarchies are not supported [2-4, 11]; (vii) access mode hierarchies are not supported [2-4, 9, 11]. By contrast, our model integrates most of the features of the abovementioned models into a common framework and provides a semantic foundation for them. Additionally, our model has an articulated conflict
An Authorization Model and Its Formal Semantics
129
resolution policy, to deal with authorization conflicts and exceptions, and an innovative approach to enforce access control. From the side of logic formalisms for security specifications, Jajodia et al. [12] propose a logic language for expressing authorization rules and show how this language can express most of the access control policies proposed so far. Programs that can be written in this language are a subset of stratified Datalog programs and therefore they are able to express only a limited set of authorization specifications. By contrast, in this paper we propose a very general language to express authorization specifications without syntactic restrictions (like stratification). Hence, we do not restrict ourselves to the consideration of programs having a unique model rather we allow a multiplicity of models to be associated with a given program. To deal with this multiplicity, we have developed a strategy that, each time an access request is submitted to the system, allow us to select the appropriate model against which the access request must be checked. A logic language, based on modal logic, has been proposed by Abadi et al. in [1]. However, their logic is mainly devoted to express concepts such as roles, delegation of authorizations, or the operation of certain protocols. A general logic language for expressing authorization rules has also been proposed by Woo and Lam in [22]. Although their language is very expressive, it suffers from several drawbacks. The most important is that it is not always possible to decide whether an access request can be authorized or not, because of conflicting authorizations, and no mechanisms are provided to deal with such inconsistencies. The remainder of this paper is organized as follows. Section 2 presents the basic elements of our authorization model. Section 3 provides the logic framework to express the various components of our model. Section 4 deals with access control. Finally, Section 5 concludes the paper and outlines future work.
2
O v e r v i e w of t h e A u t h o r i z a t i o n M o d e l
In this section we illustrate our authorization model. 2.1
Subjects, Objects and Privileges
Our model relies on three basic components. The first component is a set of subjects S to which authorizations are granted. Subjects can be single users (i.e., elements of a set U), groups (i.e., elements of a set G) or roles (i.e., elements of a set R). Roles are named collection of privileges and represent organizational agents intended to perform certain job functions within an organization. Users in turn are assigned appropriate roles based on their qualification. To enable users to successfully execute their tasks, each role has some authorizations associated with it. A user can be authorized to play several roles. When a user takes on a role, he/she is allowed to exercise all the privileges associated with the role. Moreover, a role may be played by several users. 1 Usually, roles within an organization are hierarchically organized. For 1 We assume that there is some mechanism in place that associates users with roles.
130
Elisa Bertino et al.
Manag~
./--,,. Top
Administrative
Teclmlcal
Employee
Fig. 1. An example of role hierarchy
instance the role Manager is considered at higher level than the role Accountant. To reflect this situation we assume that roles are organized according to a partial order, denoted as -~R. Therefore, the set R together with the -~R relation forms a hierarchy, 2 referred to as role hierarchy. Such hierarchy reflects the organizational position of roles within a given organization. Let ri, rj E R be roles. We say that ri dominates rj in the hierarchy (rk -~R rj), if r~ precedes rj in the ordering. An example of role hierarchy is illustrated in Figure 1. Groups are sets of users a n d / o r roles. A user or role may belong to several groups. A partial order is defined on the set G of groups, denoted as -~c, which represents the group-subgroup relationship. Given two groups gi and gj, gi -~G gj if and only if g3 is a subgroup of gk. The set G and the -4c relation form a hierarchy, referred to as group hierarchy. The second component of our authorization model is a set of objects O, denoting the resources to be protected. On the set of objects O a partial order is defined, reflecting the way objects are organized in terms of other objects. Given two objects ok and oj, we say that oj is a component of object ok, denoted ok -~o 03, if oi precedes oj in the ordering. The set O and the -~o relation form a hierarchy, referred to as object hierarchy. An example of object hierarchy is illustrated in Figure 2.
mot
l~mployee_Info
Public_hffo
Empl ~oE y m~yee~loy._ EvaluaIon Salary_InfoPersonal_Data
Program S e P i ~ C_programs programs Cobol Assembler_ I~ograms
Fig. 2. An example of object hierarchy 2 By hierarchy we mean a poset (S, ~), where S is a set and -~ is a partial order over S.
An Authorization Model and Its Formal Semantics
131
The last component of our model is a set of privileges P, denoting the access modes subjects can exercise on the objects in the system. In real situations, interactions exist among privileges. For instance, it is reasonable to assume t h a t the w r i t e privilege is stronger than the r e a d privilege. For this reason, also the set of privileges P is organized into a hierarchy, called privilege hierarchy, by means of a partial order -~p. We say t h a t a privilege pj is a sub-privilege of privilege Pi if and only if the relation p~ -~p pj holds. This means t h a t privilege pj is subsumed by privilege pi.
2.2
Positive and N e g a t i v e Authorizations
Authorizations can be either positive or negative. A positive authorization states t h a t a subject is authorized for a given privilege on a given object, whereas a negative authorization states t h a t a subject must be denied access to a given object. In the following we use the notation (s, o,p, g) to denote positive authorizations, and the notation -~(s, o,p, g) to represent negative authorizations, where s C S, o 6 O, p 6 P and g C U U R (a description of the formal model is reported in Section 3). Tuple (s, o,p,g) states t h a t subject g authorizes subject s to exercise privilege p on object o, whereas -~(s,o,p,g) states t h a t g prevents s from exercising privilege p on o. Note t h a t roles can appear as grantor in an authorization. The reason is that, in our model, a user can connect to the system either as a role or as a single user (we refer to the role with which the user connects as the user's active role).3 W h e n a user logs into the system as a role, all the actions he/she performs (such as granting or revoking authorizations) are considered as if they are performed directly by the user's active role rather t h a n by the user itself.
Example 1. Authorization (Employee, r e a d , P u b l i c _ I n f o, Bob), granted by Bob, authorizes the role Employee to read the object P u b l i c _ I n f o , whereas authorization -~(Employee, read,Employee_Evaluation, Manager), granted by the role Manager, prevents the role Employee from reading Employee_Evaluation. 2.3
Derivation Rules
Beside explicit authorizations, the model supports the specification of derivation rules stating the permission or denial for subjects to exercise privileges on objects on the basis of the presence or absence of other permissions or denials. Derivation rules can be used to derive b o t h positive and negative authorizations, and represent a compact way to specify a set of authorizations.
Example 2. The following are examples of derivation rules t h a t can be expressed in our model: 1. User Ann is denied to exercise the e x e c u t e privilege on C_Programs if the role Programmer has an authorization to w r i t e it. 3 For simplicity, we make the assumption that a user is allowed to take on only one role in a session
132
Elisa Bertino et al.
2. Users Ann, Bob, and John must have exactly the same authorizations on all the objects. 3. The roles A d m i n i s t r a t i v e Manager and T e c h n i c a l Manager must have mutually exclusive write authorizations on object Employee_Evaluation. The language to express derivation rules will be illustrated in Section 3. As it will be clear in that section, the presence of negation in derivation rules (whereby new authorizations are derived based on the absence of others) may cause the existence of several sets of access authorizations. Consider, for instance, the third rule in the above example, where the authorization to write object Employee_Evaluation can be derived for the role Administrative Manager provided that the role Technical Manager is not authorized to write the same object, and vice versa. Clearly, two sets of assignments can be derived in this case, one in which only Administrative Manager has the permission to write Employee_Evaluation and the other in which only Technical Manager has this permission. 2.4
Derivation of Authorizations a l o n g R o l e , G r o u p , O b j e c t , and Privilege Hierarchies
Our model supports two different types of implicit authorizations: the first, discussed in the previous section, consists of authorizations derived from the derivation rules specified by the user. The second derives from the role, group, object and privilege hierarchies. Authorizations automatically propagate along these hierarchies according to a set of propagation rules. These propagation rules apply to both explicit authorizations and authorizations implicitly given through derivation rules. As far as the role hierarchy is concerned, we consider the following propagation rules: 1) a positive authorization given to a role r propagates to all roles which precede r in the role hierarchy (that is, to all roles r' such that r' -~R r); 2) a negative authorization given to a role r propagates to all the roles following r in the role hierarchy (that is, to all roles r' such that r -
Example 3. Consider the role hierarchy in Figure 1. From the explicit authorizations: (Administrative Manager, write, Employee_Personal_Data, Bob), (Administrat • Manager, execute, Program_Repository, John) the following authorizations are derived: (Top Manager,write,Employee_Personal_Data,Bob), -I(Secretary, execute, Program_Repos it ory, John), -~(Accountant, execute, Program_Repos it ory, John), -~(Employee, execute, Program_Repository, John).
An Authorization Model and Its Formal Semantics
133
Moreover, we assume that an authorization, either positive or negative, given to a group propagates to all the members of the group. Thus, if a user logs into the system as a role he/she has all the authorizations explicitly given to the role (or derived through propagation a n d / o r derivation rules) and all the authorizations of the groups to which the role belongs to; if he/she connects as a single user, then he/she has all his/her personal authorizations and all the authorizations of the groups to which he/she belongs to. Similarly to the role hierarchy, authorizations propagate along the privilege hierarchy according to the following rules: 1) a positive authorization for privilege p on object o implies a positive authorization on o for all the privileges following p in the privilege hierarchy; 2) a negative authorization for p on o implies a negative authorization on o for all the privileges preceding p in the privilege hierarchy.
Example ~,. Consider the authorizations of Example 3 and suppose that w r i t e -~p read. The following additional authorizations are derived:
(Administrative Manager, read, Employee_Personal_Data, Bob), (Top Manager, read, Employee_Personal_Data, Bob). Finally, authorizationsgranted on a given object propagates to all the objects which are direct or indirect components of it.
Example 5. Consider the object hierarchy illustrated in Figure 2. From the authorizations of Examples 3 and 4 we derive a negative authorization for Administrative Manager, Secretary, Accountant, and Employee to execute C_Programs, Cobol_Programs, and Assembler_Programs. 2.5
Conflict Resolution Policy
In our model conflicts may arise due to the simultaneous presence of a positive and a negative authorization with the same subject, object and privilege. We do not consider the simultaneous presence of conflicting authorizations as an inconsistency, rather we define a conflict resolution policy which is based on the notion of strongest authorization. The conflict resolution policy enforced by our model keeps into account:
- the grantors of the conflicting authorizations, that is, their relative positions in the role hierarchy; the authorization granted by the higher level grantor is considered as prevailing. Clearly this conflict resolution mechanism applies when both the conflicting authorizations are granted by a role; - the object hierarchy, in that when conflicts are not solved by the role hierarchy, the authorization specified at a lower level in the object hierarchy is considered as dominating; the sign of the authorizations, since when conflicts cannot be solved by considering the role and/or object hierarchy, we consider as prevailing negative authorizations. -
Example 6. Consider the following authorizations:
134
Elisa Bertino et al.
(Employee,write,Employee_Info,Top Manager), ~(Employee,write,Employee_Info,Administrative Manager), ~(Consultant,execute,Program_Repository,Technieal Manager), (Consultant,execute,Program_Repository,Tom). -
-
2.6
(Employee,write,Employee_Info,Top Manager) prevails over (Employee,write,Employee_Info,Administrative Manager) since Top Manager -
The authorization model we have defined so far is characterized by a high degree of flexibility. However, this flexibility implies some loss of control from the owner of the object. As an example, let us suppose that a user, say Bob, when playing the role Employee, writes some information into one of his objects, say ol, and that he does not want to disclose this information to Alice. He therefore grants a negative authorization for the r e a d privilege to A l i c e on object ol. However, the negative authorization issued by Bob does not always ensure that A l i c e is forbidden to r e a d object ol. For instance, if a user playing the role Manager grants A l i c e a positive authorization for the r e a d privilege on object ol, this latter authorization overrides the authorization granted by Bob. To overcome this drawback we introduce the concept of strong authorizations, that is, authorizations not admitting exceptions with respect to the role hierarchy. Strong authorizations can be specified either explicitly or through derivation rules. The basic idea is that: i) strong authorizations cannot be overridden by weak authorizations; ii) conflicts among strong authorizations are solved in favor of the authorization specified at the lower level in the object hierarchy. If the objects on which the authorizations are specified are not related by the -
An Authorization Model and Its Formal Semantics 3
Formal
135
Model
In this section we present the language whereby authorizations are specified in our system and give its formal semantics. The language is based on an extension of Ordered Logic (OL) [5, 6, 13] by ordered domains.
3.1
The Authorization Specification Language
An authorization program encodes the rules whereby authorizations are granted within a given system. The components of such a system, namely, subjects, objects and privileges, along with their relationships, are modeled by the notion of program domain, that we next formMly define.
Definition 1. ( P r o g r a m D o m a i n ) A (program) domain consists of the following components: 1. A countable set S of labels, called subject identifiers. This set is partitioned into three subsets, namely, G (groups), R (roles) and U (users). On G the partial order relation ~G is defined. The set R contains a built-in role denoted -[-R. The set R U U is partially ordered by _R in such a way that 1-R --~R a, for all a E R U U and, further, a --R b, with a r -I-R, implies a, b E R (the element -[-R is used to model strong authorizations). We denote by -~R the reflexive reduction of ___R.4 Yhrther, the following functions are given: User_Groups : G --* 2 U that, given a group g, returns the users members of g; and Role_Groups : G --* 2 R that, given a group g, returns the roles members of g. 2. A countable set O of labels, called object identifiers. 0 is partially ordered by ~-o and the poset (O, -~o) has a top element denoted T o , that is, an element such that T o ~ o o, for all o E O. This element is used to define properties that must hold on all the elements of O. The partial order ~-o models a part-of relation among objects. We denote by -~o the reflexive reduction of ~o. 3. A countable set P of labels, called privilege identifiers. P is partially ordered by _ p . We denote by -~p the reflexive reduction of _ p . From now on, we assume that a domain D has been fixed. Further, we assume t h a t the following sets are given: 1) A s e t / - / o f predicate symbols of two types: built-in and user-defined. There is a unique built-in predicate symbol, namely, auth, of arity 3, which has type S • P • (RU U). User-defined predicate symbols are untyped and have a fixed arity; 2) a set A of variable symbols. We next define our language based on the fixed domain D (from which constants are taken) and the above defined s e t s / I and A (thus, henceforth, every notion is implicitly defined on D, H and A). A term is either a constant (of D) or a variable (in A). An atom is of the form: p(tl, ..., tn), where p is a predicate symbol (in/-/), n is the arity of p and t l , . . . , t n are terms. I f p is built-in (i.e., p =- auth) and its type is T1 x T2 • ~-3, 4 Given a partial order R, the reflexive reduction of R is ((a, b) e Rla ~ b).
136
Elisa Bertino et al.
then, for each t~ which is a constant, ti E Ti holds. A simple literal is either a positive literal Q or a negative literal -~Q, where Q is an atom and -~ is the true negation symbol. A referential literal is of the form o.L, where o E O and L is a simple literal. A literal is either a simple or a referential literal. Two simple (resp. referential) literals are complementary if they are of the form Q and -~Q (resp. o.Q and o.-~Q). Given a literal L, we denote its complementary literal by -~.L. Two literals are conflicting if they are of the form auth(s,p,g) and -~auth(s',p',g') (resp. o.auth(s,p,g) and o.-~auth(s',p',g')), where s -- s' and p -- pP. A clause r is an expression of the form:
H ~ A1, ...,AN,not B1, ...,not Bm
n >_0, m > 0
where H , A1, ..., A~ and B1, ..., Bm are literals and not is the negation by failure symbol [15]. H is the head of r, whereas A1, ..., AN, not B1, ...,not Bm, n >_ 0, m > 0, is the body of r. Note that the head H may be a negative literal. We often denote the head literal of r by Head(r) and the body of r by Body(r). D e f i n i t i o n 2. ( P r o g r a m R u l e ) A (program) rule is a pair (o, r) where o E O and r is a clause such that Head(r) is a simple literal. A program rule (o, r) whose head predicate symbol is auth is called authorization rule. If the body of r is the empty conjunction, then (o, r) is an explicit authorization; otherwise, it is a derivation rule (whereby implicit authorizations are derived). An authorization rule (o, r), where Head(r) is of the form, say, auth(s, p, g) (resp. -~auth(s, p, g)), is used to express a positive (resp. negative) authorization for privilege p granted by g to subject s on object o.
Example 7. The rule: ( P u b l i c _ I n f o , -~auth(Employee, read, Bob) ~-) encodes an explicit negative authorization to r e a d P u b l i c _ I n f o granted by Bob to the role Employee. With the rules: rl: (C_Programs, auth(hmy, read, Tom) ~- not others_subj_has_write_auth) r2 : (others_subj_has_write_auth ~- C_Programs.auth(X, w r i t e , Y), X ~ Amy) Tom authorizes Amy to r e a d object C_Programs provided that nobody else has an authorization to w r i t e it. D e f i n i t i o n 3. ( A u t h o r i z a t i o n P r o g r a m ) An (authorization) program P is a finite set of program rules. We call each maximal subset of rules in 1) having the same object identifier a component of 7). For simplicity, and without loss of generality, we assume that user-defined predicates are local to components.
Example 8. Consider the program 7~ consisting of the following rules: r1: ( Employee_Evaluation, auth(Administrative Manager,write,Top Manager) *- not auth (Technical Manager,write,Top Manager)) r2: ( Employee_Evaluation, auth(Technical Manager,write,Top Manager) +- not auth (Administrative Manager,write,Top Manager) >
An Authorization Model and Its Formal Semantics
137
r3: (Program_Repository, a u t h ( B o b , r e a d , A n n ) +-- ) r4: ( Employee_Salary_Info, a u t h ( A c c o u n t a n t , w r i t e , T o p Manager) ~- not E m p l o y e e _ E v a l u a t i o n . a u t h ( A d m i n i s t r a t i v e M a n a g e r , w r i t e , X ) ) Rules rl and r 2 express that the role Top Manager prevents roles A d m i n i s t r a t i v e Manager and T e c h n i c a l Manager to simultaneously have the w r i t e privilege on Employee_Evaluation. 3.2
Stable Model Semantics
Throughout this section, we assume that an authorization program 7) is given. The Base Bp of 7) is the set of all ground (both base and referential) literals constructible from the predicate symbols of the language and the constants occurring in the program. An interpretation (for 7)) is any subset I of B~. An interpretation is consistent if no conflicting literals occur in it. Given an interpretation I, a ground (either simple or referential) literal L is true (resp. false) wrt I if L E I (resp. -~.L E I). Let r be a ground rule and I an interpretation; the body of r is true wrt I if every A~, 1 < i < n, is true wrt I and every Bj, 1 < j < m, is not true w . r . t . I . The rule r is true wrt I if either the head of r is true wrt I or its body is not true wrt I. We denote by 7)g the set of all ground rules obtained from each rule (o, r} C 7) by replacing each variable appearing in r by each constant appearing in the program. Further, given an object o E O and a rule r, let o.r denote the rule obtained from r by replacing each simple literal L by o.L. Next we define the notion of ground version of 7) where authorization propagation along hierarchies is made explicit. D e f i n i t i o n 4. ( G r o u n d V e r s i o n o f a n A u t h o r i z a t i o n P r o g r a m ) Let 7)~ = UoeO{(5, o.r) I 5 ~ o o and (5, r) 6 7)g}(the closure of 7)g w.r.t, the object hierarchy). The ground version of 7), denoted G(7)), is inductively defined as follows: Base: if (5, o.r) E 7)g then (5, o.r) 6 G(7)); - Induction: if (5, o.r} C G(7)), where r is an authorization rule, then: -
1. if Head(r) is of the form auth(s,p, g), then G(7)) = G(7)) U Z1 U r 2 U Za, where: - Z l = {(5, o.r')lr' = a u t h ( s l , p , g ) ~- a u t h ( s , p , g ) , s e G and (sl 6 User_Groups(s) U Role_Groups(s) or s ~_c sl)}, that is, a positive authorization granted to a group propagates to all subgroups, users and roles members of that group. - ~2 = {(5, o.r')lr' = a u t h ( s l , p , g ) +-- a u t h ( s , p , g ) , s 6 R and sl -
138
Elisa Bertino et al. = {(6, o . r ' ) [ r ! : ~auth(si,p,g) +-- ~auth(s,p,g), s E G and (si E User_Groups(s) URole_Groups(s) or s -
--81
authorization granted to a group propagates to all subgroups, users and roles members of that group.
- 82 = {(5, o.r')[r'= -~auth(sl,p,g) +-- -~auth(s,p,g),s 9 R and s -
9 G(7~) be a referential rule. We say that ri is defeated in I if there exists r2 = (o2, o'.r') 9 G(P) such that ri and r2 are conflicting rules, the body of r2 is true in I and (at least) one of the following conditions holds (next g and g' are the grantors appearing in Head(r) and Head(r'), respectively):
- g' -
An Authorization Model and Its Formal Semantics
139
P r o p o s i t i o n 31 Let M be a model ofT). Then, M is a consistent interpretation. To define stable models, we need the following preliminary definition. D e f i n i t i o n 8. ( R e d u c t i o n o f a n A u t h o r i z a t i o n P r o g r a m ) Given an interpretation I for 7), the reduction of 7) w.r.t. I, denoted Gx(7)), is the following set of ground rules: GI(7)) -- {o.rl3(6, o.rl 9 G(7)) not defeated in I and Body(r) is
true w.r.t. I}. Note that G I ( P ) can be regarded as a Datalog program (with negation by failure) [14, 21] simply by considering each referential literal of the form, say, o.p(tl,..., tn) (resp. o.-~p(tl, ..., t~)), as a simple literal with predicate symbol o.p (resp. o.-~p). Thus, given a set X of rules, the immediate consequence operator Tx is defined in the usual way:
Tx : 2s~--'2 s~ I ~-~{L 9 Bp] 3r 9 X s.t. Head(r) = L, and Body(r) is true wrt I}. D e f i n i t i o n 9. ( S t a b l e M o d e l ) Let M be a model of 7). M is a stable model of P if M = T~M(p))(~ ).
Example 9. Consider the program 7) of Example 8 and let: M 1 -- {Employee_Evaluat ion.auth (Administrative Manager, write, Top Manager), Program_Reposit ory.auth (Bob, read, Ann) } be an interpretation for 7). GMI (7)) = {rl,r3}. Clearly, M I = TI~MI (9))(@) and, thus, M I is a stable model of 7). It can be easily shown that:
M2 = {Employee_Evaluation.auth(Technical Manager,write,Top Manager), Employee_Salary_Inf e.auth(Accountant, write, Top Manager), Pr ogram_Reposit ory.auth (Bob, read, Ann) } is also a stable model of P. Thus, in general, a program is assigned with a number (possibly zero) of stable models. Each stable model represents a consistent way of assigning access authorizations to users.
Example 10. Consider Example 9 above. According to the two stable models associated with 7), it is possible to simultaneously authorize A d m i n i s t r a t i v e Manager to write Employee_Evaluation and Bob to read Program_ Repository and the same happens if we consider Technical Manager instead of Admnistrative Manager. By contrast, it is not possible to simultaneously authorize Administrative Manager to write Employee_Evaluation and Accountant to write Employee_Salary_Info.
140
4
Elisa Bertino et al.
Access Control
An access request is represented as a triple (s,p, o), with s E U U R, p E P, and o E O. Tuple (s, p, o) states that subject s requires to exercise privilege p on object o. L e t / ' p be the set of stable models of 7). Each model belonging t o / ' p represents a consistent way of assigning access authorizations. The problem then arises of what model to select, among those i n / ' p , in order to verify whether an access request can be authorized. In our access control policy, the choice of the model against which access requests must be verified is driven by the privileges the subjects are exercising on the objects in the system when the access request is issued. Given an access request a, the access control mechanism selects from Fp a stable model which: i) satisfies a and ii) is compatible with all the current accesses. If such a model exists, the access is authorized; otherwise, it is denied. Clearly, enforcing this policy requires the system to maintain information on the set of accesses that are currently in execution. An algorithm enforcing access control is illustrated in Figure 3. The algorithm makes use of two sets, namely Current_Access and Current_Model. Current_Access is the set of current accesses: (s, p, o) E Current_Access if and only if subject s is currently exercising privilege p on object o. Current_Model is the model of 7) currently chosen as valid. Moreover we make use of function Hg that, given a model M, returns the set of access requests that can be authorized according to that model. Formally, fig(M) = {(s,p, o) l o.auth(s,p,g ) E M}.
Algorithm 41 Access Control Algorithm INPUT: OUTPUT:
1) An access request (s,p, o), 2) An a u t h o r i z a t i o n p r o g r a m 1) A U T H O R I Z E if the access request can b e authorized, 2) R E J E C T , otherwise.
METHOD: 1. Let C u r r e n t _ A c c e s s b e the set of c u r r e n t accesses 2. if (s, p, o) E YIg (Current_Model) t h e n C u r r e n t _ A c c e s s : = C u r r e n t _ A c c e s s U {(s, p, o)} return AUTHORIZE endif 3. i f 3 M ~ F p such t h a t ( ( ( s , p , o ) } U C u r r e n t _ A c c e s s ) C H g ( M ) t h e n
C u r r e n t _ M o d e l :~- M C u r r e n t _ A c c e s s :-~ C u r r e n t _ A c c e s s return AUTHORIZE
U
{ (S, p, o)}
endif 4. r e t u r n R E J E C T
Pig. 3. Access Control Algorithm
Example 11. Consider the program P of Example 9. Assume that Current_Model = M1 and Current_Access = { ( B o b , r e a d , P r o g r a m _ R e p o s i t o r y ) } . Let
An Authorization Model and Its Formal Semantics
141
(Technical Manager,write,Employee_Evaluation) be a new access request. (Technical Manager,write,Employee_Evaluation) (L 1-1g(M1). However, (Technical Manager, writ e, Employee_Evaluat ion) E Hg(M2) and Current_Access C Bg(M2), then the request is authorized and inserted into Current_Access. Current_Model is set equal to M2. Suppose now that the access request (Accountant , w r i t e ,Employee_Salary_Info) is submitted to the system. This request is denied since there does not exist a stable model M of 7~ such that ( A c c o n n t a n t , w r i t e , E m p l o y e e _ S a l a r y _ I n f o ) E FIg(M) and Current_Access C_ FIg(M). A few concluding remarks on the complexity of the described algorithm. It is well known in logic programming that computing stable model semantics is an exponential task (unless P = N P ) [16]. In particular, deciding whether a program admits a stable model is NP-complete. In principle, this is not a drawback, as it means that it is possible to express some NP-complete problems. Without any form of syntactic restrictions (no stratified negation [21], no limitation on the structure of both hierarchies and conflicts, etc.), the language offers a general tool whereby the user is free to model very complex application domains. On the other hand, a number of efficient techniques have recently been proposed to drastically improve the computation of stable models [20, 7, 17, 8]. As a result, the approach becomes feasible for many problems of practical interest. We argue that (real) security applications fall into this category of problems, as it is reasonable to assume that the sources of complexity (i.e., unsolvable conflicts, unstratified negation by failure [21]) are, in general, limited.
5
Conclusions
In this paper we have presented an advanced authorization mechanism based o n a logic formalism. The model provides several notable features such as positive/negative and weak/strong authorizations, derivation and propagation rules, and the support for both roles and groups. Moreover, we have proposed an innovative approach for both access control and authorization conflicts resolution. We plan to extend this work along several directions. First we are developing more articulated resolution policies, taking into account other aspects, such as the group hierarchy. We also plan to extend the current model to incorporate temporal features [2, 3] and to allow a user to take on multiple roles in a section. A third direction concerns implementation issues.
References 1. Abadi, M., Burrows, M., Lampson, B.W., Plotkin, G. A Calculus for Access Control in Distributed Systems. ACM Trans. on Programming Languages and Systems, 15(4):706-734, 1993. 2. Bertino, E., Bettini, C., Ferrari, E., Samarati, P. A Temporal Access Control Mechanism for Database Systems. IEEE TKDE, 8(1):67-80, 1996.
142
Elisa Bertino et al.
3. Bertino, E., Bettini, C., Ferrari, E., Samarati, P. An Access Control Mechanism Supporting Periodicity Constraints and Temporal Reasoning. ACM TODS, to appear. 4. Bertino, E., Jajodia, S., Samarati, P. Supporting Multiple Access Control Policies in Database Systems. Proc. of the IEEE Symposium on Research in Security and Privacy, Oakland (CA), 1996. 5. Buccafurri, F., Leone, N., Rullo, P. Stable Models and their Computation for Logic Programming with Inheritance and True Negation. Journal of Logic Programming, 27(1):5-43, 1996. 6. Buccafurri, F., Leone, N., Scarcello, F. On the Expressive Power of Ordered Logic. A I Communications, 9:14-13, 1996. 7. W. Chen, D.S. Warren. Computing of Stable Models and its Integration with Logical Query Processing. IEEE TKDE, 17:279-300, 1995. 8. Eiter, T., Leone, N., Mateis, C., Pfeifer, C., Scarcello, F., A Deductive System for Nonmonotonic Reasoning, Proc. of the ~th Int. Conf. on Logic Programming and Nonmonotonic Reasoning (LPNMR '97), LNAI 1265, Berlin, 1997. 9. E. Fernandez, E.B. Gudes and H. Song. A Model for Evaluation and Administration of Security in Object-Oriented Databases. IEEE TKDE, 6:275-292, 1994. 10. Celfond, M., Lifschitz, V. The Stable Model Semantics for Logic Programming. Proc. 5th Int. Conf. on Logic Programming, pp. 1070-1080, 1988. 11. Jajodia, S., Samarati, P., Subrahmanian, V.S., Bertino, E. A Unified Framework for Enforcing Multiple Access Control Policies. Proc. of ACM-SIGMOD, 1997. 12. Jajodia, S., Samarati, P. Subrahmanian, V.S. A Logical Language for Expressing Authorizations. Proc. IEEE Symposium on Research in Security and Privacy, Oakland (CA), pp. 31-42, 1997. 13. Laenens, E., Saccs D., Vermeir, D. Extending Logic Programming. In Proc. of ACM-SIGMOD, 1990. 14. Lifschitz, V. On the Declarative Semantics of Logic Programs with Negation. Foundation of Deductive Database and Logic Programming, pp. 89-148, 1997. 15. Lloyd, J. W. Foundations of Logic Programming, Springer-Verlag, 1987. 16. Marek, W., Truszczyfiski, M., Computing Intersection of Autoepistemic expansions, Proc. of the 1st Int. Workshop on Logic Programming and Non Monotonic Reasoning, pp. 37-50, 191. 17. Niemel~, I., Simons, P., Efficient Implementation of the Well-founded and Stable Model Semantics. Proc. of the 1996 Joint Int. Conf. and Symposium on Logic Programming, pp. 289-303, Bonn, Germany, 1996. 18. Oracle Corporation. Oracle8 Server Concepts, 1997. 19. Rabitti, F., Bertino, E., Kim, E., Woelk, D. A Model of Authorization for NextGeneration Database Systems. ACM TODS, 16(1):88-131, 1991. 20. Subrahmanian, V.S., Nau, D. and Vago, C. WFS § Branch and Bound = Stable Models. IEEE TKDE, 7(3):362-377, 1995. 21. Ullman, J.D. Principles of Database and Knowledge-Base Systems, Vol. 1 and 2, Computer Science Press, 1989. 22. Woo, T.Y.C, Lain, S.S. Authorizations in Distributed Systems: A New Approach. Journal of Computer Security, 2(2 ~ 3):107-136, 1993.
Authorization in C O R B A Security GfinterKarjoth IBM ResearchDivision, Zurich Research Laboratory
gkaOzurich.ibm.com
A b s t r a c t . Integration of security and object-oriented techniques is critical for the successful deployment of distributed object systems. In December of 1995, the Object Management Group published a security service specification called CORBA Security to handle security in object systems that conform to the Object Management Architecture. This paper provides a rigorous definition of the authorization part of CORBA Security. Its semantics is given in terms of an access control matrix. The dependencies among the authorization elements are analyzed and possible interpretations for access control decision functions are given. The expressivity of the authorization model to define a wide range of policies, in particular mandatory access control, is discussed. Keywords: authorization, object access control, CORBA, distributed object systems
1
Introduction
The Object Management Group (OMG) is a consortium of more than 700 software vendors, software developers, and end users. It has developed an object model and a reference architecture for the construction of applications that interwork across diverse architectures and infrastructures using object-oriented techniques. The Object Management Architecture [10] defines the necessary facilities, including at its core the Object Request Broker (ORB), a mechanism that provides transparency of object location, activation, and communication. The Common Object Request Broker Architecture (CORBA) specification describes the interfaces and services that must be provided by compliant ORBs. Object Services to be shared by applications across all domains provide the basic functions for realizing distributed object solutions. Naming, events, persistence, transaction, and concurrency control are examples of object services. Security is regarded as a service as well. In their Object Services R F P 3 [9], which includes the OMG White Paper on Security [8] as an annex, OMG defines the requirements for security in object systems that conform to the Object Management Architecture. In December of 1995, OMG issued the CORBA Security specification [12], which defines the core security facilities and interfaces required to ensure a reasonable level of security of a CORBA-compliant system as a whole. The CORBA
144
Gfinter Karjoth
Security specification includes a security model and architecture, the security facilities and interfaces available to application developers, administrators, and object system implementors, and a description of how security services affect the CORBA2 interoperability protocol [11]. The reference model describes how and where a secure system enforces security policies. The security architecture implements the security model. As security services have implications on the ORB as well as on most Object Services, Common Facilities, and object implementations, the CORBA Security specification also includes rules to express conformance to CORBA Security. CORBA Security provides services for authentication, delegation, authorization, quality of protection including privacy, non-repudiation, and audit. This visible security functionality uses other security functionality such as cryptography, which supports many of the above functions but is not visible outside the security services. Thus, to achieve independence of security technology, CORBA Security does not define cryptographic interfaces, nor does it propose any direct use of cryptography by application objects. Typically, distributed object systems support many more objects, generally several orders of magnitude more, than do language-based object systems [5]. For scalability, CORBA Security authorization provides the notions of privilege attributes (user groups, roles, etc.) and object groups to reduce the cardinality of the sets of subjects and objects and thus the size of the access matrix. Operations of an object's interface can also be grouped by assigning appropriate rights required for their invocation. This assignment partitions an object interface with different users/groups being authorized to different partitions. CORBA Security treats authorization as an independent semantic concept separate from its implementation in system-specific mechanisms. It is therefore necessary that the language representing the authorization have a formal semantics so that the meaning of an authorization statement can be precisely determined. This way, the security administrator as well as the CORBA Security implementor have the same understanding. This is particularly important as vendors of CORBA products begin to include implementations of the security specification. For example, the security service integrated with DSOM (IBM's Distributed System Object Model) was developed in accordance with OMG requirements and guidelines [2]. This paper provides a formal definition of authorization in CORBA Security as defined in document [12]. The semantics is given in terms of a protection state, the set of rights held by each individual subject, and an authorization scheme; i.e. a set of rules that lets the protection state evolve by the autonomous activities of the subjects. The dependencies among the authorization elements are analyzed, and possible interpretations for access control decision functions are given. The remainder of this paper is organized as follows. Section 2 provides a brief description of CORBA Security authorization. Section 3 introduces the formal model on which CORBA Security authorization is defined. The specification of CORBA Security authorization is developed in Section 4. It describes
Authorization in CORBA Security
145
the elements of CORBA Security authorization, defines possible access decision functions, and elaborates on the semantics of operations in the CORBA Security administration interface. Section 5 presents a mandatory access control policy expressed within the CORBA Security model. Finally, Section 6 contains a discussion of open issues and presents conclusions. 2
Access Control
in CORBA
Security
CORBA Security manages security policies based on the security attributes of resources and principals. It provides abstractions to reduce the size of access control information but also allows fine-grained access to individual operations rather than to the object as a whole. CORBA Security interfaces are objectoriented and shall allow polymorphic implementations of their constituent objects based on different underlying mechanisms. Reference [12] defines a particular access policy type based on privilege and control attributes and an associated management interface. All access decisions on object invocation are made via a standard access decision interface, no matter which objects define the policy and implement the access decision. Principals are users or processes accountable for the actions associated with some user. Depending on a given security policy, a principal may have certain privilege attributes which can be used for access control such as access identities, roles, groups, security clearance, or capabilities. At any time, the principal may choose to use only a subset of the privilege attributes it is permitted to use in order to establish its rights to access objects. On the other hand, control attributes are associated with the target. Examples are ACLs and information used in label-based schemes. The access control model, depicted in Fig. 1, allows access control policies to be defined at two levels: at object invocation and within the application. The access decision function bases its decision on the current privilege attributes of the principal, the operation to be performed, and the control attributes of the target object. Although applications can enforce their own access policy, no standard application access policy is defined. Therefore, we deal only with object invocation access policies and focus on access policy controlled at the target side, the most interesting case. A Security Policy Domain is the set of objects to which common security policies apply. The domain itself is not an object. Security policy domains permit application of security policy information to security-unaware objects without requiring changes to their interfaces (by associating the security policy management interfaces with the domain rather than with the objects to which the policy is applied). Security policy domains provide leverage for dealing with the problem of scale in security policy management (domain granularity rather than object instance granularity). The CORBA Security reference model allows objects (and domains) to be members of multiple domains. The policies that apply to an object are those of all its enclosing domains. However, document [12] does not specify policy com-
146
Giinter Karjoth a.,,c.t,oo
ccess decisions
]
~
~ / request
~Target application-]
j~ ~accessdecisions J request
I
\\" \
/ / /
/
/
)
ORB \
] /
i
,/
I /
[/
IClient-sideinvocation access decisions "~
!Target-sider--invocation accessde:i :---ii~ ....................................................................................
Fig. 1. The Access Control Model
position rules, nor are problems of consistency, complexity, and administration addressed. A Domain Access Policy grants a set of principals a specified set of rights to perform operations on all objects in the domain. A representation of a particular access policy is given in Table 1. This table resembles the traditional access control list and defines the rights of subjects on objects in a given domain. Table 1. Domain Access Policy (Granted Rights) Privilege Attribute Delegation State Granted Rights corba: gsac(:ess..id:a 1 initiator ac(:ess_id:al delegate group:gl
initiator
group:g2
initiator
other: -u-m-s corba: g-other: corba: g-other: -u---s corba: gsother:
Compared with the structure of a usual access control list, we notice three differences. First, to simplify administration, a Domain Access Policy aggregates principals for access control by using their privilege attributes ~ subject entries. User identities are considered to be a special case of privilege attributes. Second, a Domain Access Policy allows different rights to be granted depending
Authorization in CORBA Security
147
on whether a privilege attribute is used by a delegate or by an initiator. Thus, the "subject" in the traditional sense of access control is the combination of a set of privilege attributes and a delegation state. Third, to cope with different needs to express authorization, rights are classified into sets of "access control types", named rights .families. However, CORBA Security defines only one RightsFamily 1, the "corba" family, containing the three rights g (get), s (set), and m (manage). Rights are always classified by the name of the RightsFamily to which they belong. CORBA Security implementations are free to define and provide additional rights families, although doing so is discouraged in order to keep the complexity of the administrator's model low. To give an example of another rights family, we let RightsFamily other = {g, u, o,m, t , s} contain the rights g (guest), u (user), o (operator), m (administrator), t (auditor), and s (super). For example, Table 1 grants user al the rights g and s of rights family c o r b a and the rights u, m, and s of rights family o t h e r when in delegation state "initiator". However, when the access request of user al is performed in delegation state "delegate", it only possesses the right c o r b a : g . An AccessPolicy object grants rights to privilege attributes. In contrast, a RequiredRights object defines that for the invocation of each operation in the interface of a secure object, some set of rights is required. An example of a RequiredRights object is shown in Table 2 in the form of a table. This table defines the rights required to gain access to each specific operation of an object. There is also a mechanism to specify whether a user needs all the rights in an operation's required rights entry to execute that operation (AND semantics) or whether it is sufficient to match any right within the entry (OR semantics). Table 2. Required Rights Rights Required Rights Combinatol Operation Interface corba: - s ~1 /at el corba: gsany m2 corba: gsall ml c2 other: ---m-s all m2 other: g-om-s any m3 C3
Required rights are assigned on an "interface" basis and not on an "instance" basis. Hence, no change to existing object interfaces is necessary to assign required rights. But also all instances of an interface, therefore, will always have the same required rights. The AccessDecision object determines the validity of invocation requests that rely on the privilege and control attributes provided by AccessPolicy and RequiredRights objects. There are no explicit rules on how to calculate the ac1 Objects/interfaces occurring in [12] are denoted by capitalized names.
148
GfinterKarjoth
cess decision. Derived from the examples given in document [12], the following (partial) rule can be defined: A requestor will be granted authority to execute operation m of object o if the credential associated with the request contains a privilege attribute t h a t grants a right into a domain containing object o and if the same right is also included in the RequiredRights table of operation m. In other words, operation m will be executed if the requestors's rights match at least one of the rights specified in the operation's entry of the RequiredRights table. For example, a user who is granted any right in an operation's RequiredRights table entry is authorized to invoke the operation. Table 3. Groupings User Privilege A t t r i b u t e Ul ~2
u3
lalTgl a2, a3
g2 (a) User Privileges
Object Domain ol, o2, 05, os, o12 dl
O1~O8
08, 09
02,05
C2
O12
C3
d2
(b) Object Domains
O b j e c t s Interface iCl
(c) Object Interfaces
Let Table 1 define the access policy of domain dl and let Table 2 prescribe the required rights for interfaces cl, c2, and c3. In the context of the grouping provided in Table 3, C O R B A Security authorization gives access granted by the AccessPolicy of domain dl in at least the following cases: - ul can execute operations m l and m2 of objects ol,os as an initiator but may execute only operation m2 as a delegate. - ul can execute operations m l and m2 of objects 02,05 as an initiator but may execute no operations of 02 and 05 as a delegate. - members of group g2 can execute operation m2 of objects 02, 05 as an initiator but m a y execute no operations of 02 and 05 as a delegate. In C O R B A Security [12, p. 15-130], it is left open how an AccessPolicy object combines rights granted by different Privilege Attribute entries, in the case t h a t a subject has more t h a n one privilege attribute to which the AccessPolicy grants rights. As mentioned above, there are no composition rules for domain hierarchies either. This undefinedness leads to different interpretations and, therefore, to different access control decision functions (see Section 4.2). With respect to the above scenario, for example, it is not clear whether Ul as a delegate m a y execute operation m 3 of object o12 if the request carries privilege attributes al and gl. As privilege attribute al does not grant access, a POSIXbased access decision function would deny access. Thus, to conform to POSIX, the access decision function has to implement an evaluation order on privilege attribute types or on overlapping domains. This would also enable one to realize some kind of exception handling policy, e.g. "all m e m b e r s of group X except user u". C O R B A Security authorization enables the formulation of access control policies where a role is combined with a group membership, e.g. "Only somebody
Authorization in CORBA Security
149
who is a manager and a member of project-X is allowed to change project X's time schedule".
3
A F o r m a l M o d e l of A u t h o r i z a t i o n
We use Lampson's access matrix model as defined by Harrison, Ruzzo, and Ullman [4] but in a less restricted form. The model makes use of two concepts, the protection system and its state, the latter represented by an access matrix, in which a set of subjects (e.g. users) is granted rights to perform operations on a set of objects. A protection system consists of a finite collection of commands to change the protection state. The execution of a command is controlled by a boolean condition on the current contents of the access matrix. For each command, the authority to execute as well as its effect on the protection state have to be specified. The scheme is defined by the security administrator when the system is first set up and thereafter remains fixed. D e f i n i t i o n 1. A protection system consists of a finite set T~ of generic rights
and a finite set of commands of the form: c o m m a n d c~(X1, Z 2 , . . . , Zk) if rl e [X81, Xol] A r2 e [Zs2, Xo2] A ... t h e n opl, op2, ... end Here, a is a name and X 1 , . . . , Xk are formal parameters, each Xi being either a subject or an object. Subscripts s, si, o, and oi are integers in the range 1 . . . k indicating whether a parameter is a subject or an object. Each ri is a right and each opi is one of the primitive operations e n t e r r i n t o [Xs, Xo] d e l e t e r f r o m [Xs, Xo] c r e a t e s u b j e c t Xs c r e a t e o b j e c t Xo d e s t r o y s u b j e c t Xs d e s t r o y o b j e c t Xo
whereby each s and o is an integer in the range 1 . . . k, k being the number of parameters in the command, Xs denoting a subject, and Xo denoting an object. D e f i n i t i o n 2. A state of a protection system is a triple (S,O, M I where S is the set of subjects, 0 is the set of objects and M is an access matrix with a row for every subject in S and a column for every object in O. M[s, of is a subset of rights to object o possessed by subject s. The set of commands lets the state evolve by the autonomous activity of the subjects. Each command may specify a test for the presence of certain rights in certain positions of the matrix verifying that the action to be performed by the command is authorized. Each operation defines some modification to be made to the access matrix.
150 4
Giinter Karjoth The
Authorization Specification
In this section, we develop a specification that defines the semantics of authorization in the CORBA Security model as described in Section 2.
4.1
Structure
Let 7~ be the set of rights, the elements of which are denoted by r. RightsFamilies subdivide ~ into sets of rights types. Not all subsets of T~ are legal rights types. We require that ~ contain at least the rights type corba = {g, s,m}. The distinguished symbol ~/shall be a member of 7~. Let (9 denote the set of objects and let 7) denote the set of domains. We assume a relation domain: (9 • 7) that assigns objects to domains. We denote the set of object interfaces by C and require that an object has at least one interface. The relation interface: (9 x g gives the interfaces to which an object belongs. To reduce the cardinality of the set of operation names, we number the operations of each interface: m l , m 2 , . . . . Let A/[ denote this set, whose cardinality is determined by the interface with the largest number of operations. If Af is the set of all operations, function name : X • g --~ A/[ returns the index of a operation with respect to a given interface. By means of (m, c), where m is the index of the operation and c is the name of the interface to which the operation belongs, we uniquely identify an operation. Let P denote the set of privilege attributes, possibly subdivided, for example, into the set of access identifiers, user groups, and roles. A privilege attribute type such as group or role may also possess an internal structure, e.g. a hierarchical order that could be employed in the access decision function. Let b / d e n o t e the set of principals or users, the active entities in the system. The relationship between users and privilege attribute types is defined outside of the CORBA Security specification. For example, if there is a unique access identifier for each user, the set of users and the set of access identifiers may coincide. To give an example of the use of privilege attributes, we consider two kinds of privilege attribute types, the set of access identifiers ,4 and the set of user groups G. We require that relation accId: .4 x/4 uniquely identifies the user associated with each access identifier, i.e. there is no access identifier shared by two users: v,:, e ..4,
e/4:
e ace!dA
e aecId)
=
(1)
As the CORBA Security specification imposes no structure on a usrGrp: ~ • /4 relation, e.g. groups may be members of other groups, we model privilege attribute types as fiat structures. To distinguish between privileges used in different delegation states, we assume a set 75 of privileges used in state d e l e g a t e . Correspondingly, we call p the privilege p used in state d e l e g a t e . Finally, the authorization units of CORBA Security is the set $ = 7) U 75. With the above description of the elements of the CORBA Security model we now can give a formal definition of its structure.
Authorization in CORBA Security
151
Definition 3. A C O R B A Security state parameterized by a set T~ of rights is a triple <S, O, M> where S = S U (9 U .~4 is the set o/subjects, 0 = b[ tJ 7) U C is the set of objects and M is an access matrix. The access matrix combines all authorization information and thus provides a uniform framework. Subjects and objects are (implicitly) typed and divide the matrix into partitions, some of which are empty. Consistency checks become easier. It is also interesting to note that in the CORBA Security access matrix target objects occur as subject entries and users occur as object entries. This shows clearly that authorization in CORBA Security refers to privilege attributes (security subjects) and domains (security objects). Table 4. The access matrix of CORBA Security Authorization Objects Subjects ~ ai
I gs . . . .
~/
I
a-7
V
g2
r
O1 02 05
7
08
----V---
09
7
O12 ml
iiiiiiiiiiii~iiiiiiiiiiiiiiiiiiiiiii~. iiii
~i~iiii~i~ ~iiii~i~iiiiiiiiiiiii~ii~i~iiiiil
m2 /rt3
N
la:-s-
I:gs-
~:gs&:---m-s l:g-om-s
Table 4 illustrates the structure of a CORBA Security state. In the matrix, partition Domains = M[(9,7)] defines relation domain, partition Jnter/aces = M[(9, C/defines relation inter/ace, partition Access Identifiers = M[.4, L/] defines relation accId, and partition User Groups = M[0,L/] defines relation usrGrp. Thus, we can also give a concrete definition of the four relations as follows: accld(u) = {a I ~ / e [a, u]} usrVrp(u) = {gl ~ / e [g,u]} domain(o) = {a I ~/E [o, d]} inter/ace(o) = {c I ~ / 6 [o, c]}
152
Giinter Karjoth
Finally, partition R R = M [ H , g] defines the rights required to invoke an operation, and partition G R = M[8, :D] defines the set of rights granted to privilege attributes. Table 4 combines the information expressed in Tables 1, 2, and 3. In addition, it defines an access policy for domain d2. For space limitations, symbol '&' shall denote the 'All' rights combinator and symbol '1' shall denote the 'Any' rights combinator. Moreover, rights are not prefixed by the name of their RightsFamily. To ensure that an access decision function (see below) always returns a valid result, a CORBA Security state has to satisfy certain constraints. D e f i n i t i o n 4. A well-formed C O R B A Security state is a state (S, O, M ) in which the access matrix satisfies the following condition:
VoeO3dEl)
: dEdomain(o)
The above condition expresses that each access-controlled object in a CORBA system must be a member of at least one domain [12, p. 15-130]. 4.2
Access C o n t r o l D e c i s i o n
In this subsection, we define how access decisions are taken. In Fig. 2, links relate the elements of the access control policy as expressed in Tables 1 and 2 (CORBA Security tables) and in Table 4 (access matrix). In Fig. 2, the structure of privilege attributes and domains is flattened, e.g. a domain refers to all objects that transitively belong to it.
User
Privilege Attribute
Domain
Object
Interface
Method
01
............. .......................................o2 ...........
~ , ~
mz
~
ml
~2
~3
,,, 09 .....' o12 ~ ' ~ ' ~
~: ~ m 2 c3 ~ S m 3
Fig. 2. Associations between the CORBA Security authorization elements
If we assume that an object invocation request carries all privilege attributes of the initiating user, the task of an Access Decision Function could then be
Authorization in CORBA Security
153
described as follows: check whether the requesting user is connected with the demanded operation via the object on which the operation should be executed. In the case of an OR semantics, the existence of one path labeled twice with the same right (neglecting v0 would be sufficient to grant the request. However, as CORBA Security distinguishes whether a principal acts as the initiator of or as a delegate within a chain of calls, different rights are granted to the principal associated with the object invocation request. Thus, the above description of the CORBA access discipline is not complete. Dependent on the delegation options in use, an object invocation request may carry the privileges of several principals. In the following, we restrict our formalization to the case where only the privileges of the initiator of the call chain are carried with the request. Let function PA(u) denote the set of privilege attributes assigned to user u, implicitly determining whether the request was issued in i n i t i a t o r or d e l e g a t e state.
Disjunctive Semantics ( A n y R i g h t ) CORBA Security grants access to user u for an object o by operation m if there exists a right r which occurs in a cell belonging to a privilege attribute p and a domain d and the same right occurs in a cell belonging to operation m and the interface of object o. The privilege attribute p must be included in the user's request and can be an access identifier or a group membership. Object o must be a member of domain d. D e f i n i t i o n 5. A n OR-based access decision function permits a request of user u to execute operation m on object o i] it holds that: 3p, c , d , r : p E S , d E I),c E C,r E T~ : p E PA(u) A d E domain(o) A c E interface(o) A r E [p, d] A r E [m, c]
(2)
Note that the above formula requires the existence of at least one right in both the GrantedRights table entry and the Required Rights table entry in order to allow access, i.e. if the RequiredRights table entry is empty (] : - - - ) , access is denied to all users. As all sets in Formula 2 are finite, we can completely express the decision function in propositional logic: V ~ / E ~ p , u ] A ( V ~/E [ o , d ] A ( V x / E [ o , c ] A ( V r pES
dET)
cEC
E ~,a~Ar E [m,e]))) (3)
rET~
In Formula 3, disjunction is over privilege attributes, interfaces, domains, and rights, making the underlying "or" semantics explicit. A user receives the union of the rights of the privilege attributes she owns, an operation receives the union of the rights of the domains to which it belongs, and a RequiredRights table entry ORs all its rights. C o n j u n c t i v e S e m a n t i c s ( A l l R i g h t s ) If the access permission depends on all rights set in the RequiredRights table entry, the access decision function must
154
Gfinter Karjoth
compute the AND semantics as follows:
V v pES
(V
[o,d]^(V
dET)
[o,c]^(h r
cEC
r
[,,,d])))
rET:~
(4) Note that the above formula permits any user to execute an operation if the operation's RequiredRights table entry is empty ( ~ : - - - ) . However, it requires further that there be one domain d that grants all the necessary rights to requester u. For example, if the RequiredRights table entry for a certain operation requires the possession of rights g and s, the request of a principal being granted right g for domain dl and right s for domain d2 would be rejected even if the interface is contained in both domains. To allow for the most liberal case where (granted) rights can be cumulated through different privilege attributes (access identifier/group) and through different domains, the access control function has to be adapted as follows:
V , / e [o, c] ^ A , e ira, c] cEC
rET~
V x/E [p,u]A ( V x / e [o,d]Ar e [p,d]) pES
(5)
dED
Although it might be debatable whether it is of practical use to combine access rights gained through different means, e.g. user group membership and role association, we employ this particular access policy in order to be able to implement the access policy expressed at the end of Section 2. D e f i n i t i o n 6. A n A N D - b a s e d access decision function permits a request of user u to execute operation m on object o if it holds that:
3c:cEC:
VrE~:
cEinterface(o) A r E [m,c]
3p, d : p E $, d E 7:) : p E Ph(u) ^ d e domain(o) A r E [p, d] Compared with OR semantics, computation complexity is greater as one has to check for each right in the RequiredRights table whether the same right occurred in the GrantedRights table. However, if CORBA Security authorization is to be implemented using DCE, an AND semantics fits well. Otherwise, the CORBA Security authorization oracle would have to call the DCE access control function, e.g. dce_acl_is_client_authorized, for each right in the GrantedRights table entry until a match is found. 4.3
Administration Interface
For security administration, CORBA Security defines a basic interface for the maintenance of Domain Access Policies and Required Rights. However, there are no interfaces to manage domain membership and the structure of domains. The management of privilege attributes is left open, too. To complete the definition of CORBA Security authorization, we briefly discuss how administration operations could be given a semantics as well.
Authorization in CORBA Security
155
The execution of any CORBA Security administration operation modifies the state of the access matrix. Thus, they have to be modelled as commands of the protection system (see Def. 1). On the other hand, these operations are subject to access control as well and should occur as operations listed in the subject dimension of the access matrix. However, in their most abstract form, not prescribing any particular implementation, the administration operations would merely occur as commands. In CORBA Security authorization, ownership is separated from administration, i.e. authorization data itself is not protected by special rights auch as the control right in DCE. To show how administration operations can be managed in this paradigm, we assume that there are special protection objects, one for each cell of the access matrix. However, we leave it open how these protection objects are grouped into domains, one for each set of 'cell objects' controlled by the same access policy. For example, the domain PGR could define the access policy for the administration of partition GR if the power of granting rights is controlled by only one access policy. As an example of operations that change the contents of the matrix, the command below grants right g (to perform get operations on objects that are members of domain d) to the privilege attribute named p: c o m m a n d grantRightg(sA,p,d) if mE[sA,PGR] t h e n e n t e r g into [p, d] end
In the above command, the security administrator 8A is authorized to perform this command if she, for example, possesses right mon domain PGR. However, for this to work, we have to make the implicit assumption that the first parameter of a command always names the requestor of the operation. 5
Mandatory
Access Control
Policies in CORBA
Security
In this section we implement a label-based access control policy to demonstrate the expressivity of CORBA Security authorization. First, we introduce another rights type called labelRights. Assuming three security levels, this rights type consists of the individual rights rl, r2, r3, Wl, w2, and w3. For example, right r3 expresses a higher security level than r2 does. Security labels are attached to subjects and objects. A label on subjects is called a security clearance. A label on objects is called a security classification. We model security clearances by creating a domain called LABEL and grant each subject label rights according to her clearance. For example, if subject s has clearance 2 she gets rights rl, r2, w2, and w3, denoting that she can read all objects of classification 2 or lower, and that she can write on all objects with the same classification or higher. As access control granularity in CORBA Security is on the operation level, we have to determine the classification level of each operation. For all MAC schemes
156
Giinter Karjoth
it is assumed that each operation has either read or write characteristics. These levels are written into the RequiredRights table. All objects are members of domain L A B E L . Let function R R ( m ) give the set of required rights to invoke operation m = (n, c). Assume the above subject s wants to invoke operation m l , where R R ( m l ) = {rl}. Access is allowed as the right rl is granted to subject s as well as being included in the RequiredRights table of operation ml. Assume further that we have R R ( m 2 ) = {wl) and R R ( m 3 ) = {w3}. Then the same subject s is allowed to invoke operation m3 but is not allowed to invoke operation m2 (no write-down, *-property). Note that this scheme works with both OR and AND semantics. Table 5 gives an example of a CORBA Security matrix expressing a mandatory access control policy. Note that there is only one domain named L A B E L . For example, operation ml of interface c2 can be executed by subject a2 as this subject has clearance level 2 and therefore possesses label right r2. Subject al with clearance level 1 is not allowed to execute this operation but may execute operation m2 of interface c4 (write-up).
Table 5. Mandatory Access Control in CORBA Security Objects Subjects u l u2 u3 al
~/ x/
a3
os
LABEL
cl c~ c3 c4
rl,wl,w2,w3 x/ r l , r 2 , r 3 , w 3
:~
~:
~/
,/
ml m2
Administration can be simplified if there were rights groups as follows: rights group R2 would include rights rl and r2; rights group R3 would include rights rl, r2, and r3. Similarly, rights group W l would include rights wl, w2, and w3 and rights group W2 rights w2 and w3. This means that an administrator would give a subject with clearance level 2 the rights groups R2 and W2 instead of the rights rl, r2, w2, and w3. A DAC policy augmented with m a n d a t o r y access control to enforce information flow policies could be implemented as well. There is a (simple) solution for an AND-based access decision function and a more complex solution for an OR-based access decision function using the technique illustrated in [14].
Authorization in CORBA Security 6
157
Conclusion
We presented a formalization of the authorization scheme in CORBA Security. By providing an access matrix, we defined the structure and the consistency requirements of an CORBA Security authorization state. The definition of associated access decision functions showed clearly the underlying union semantics of CORBA Security authorization. A set of commands defined the semantics of the CORBA Security interface. CORBA Security access policy enforcement is independent of a particular policy. We showed that multilevel operations can be implemented as requested in [3]. The implementation of compartmented-mode operations, however, would require extensions. The current definition of CORBA Security authorization does not address the specification of default access rights for newly created objects. Although CORBA Security authorization has been specified in an object-oriented paradigm, the current document [12] gives no advice on how to administer object-oriented data systems beneficially [15]. Some application domains need a finer authorization granularity, for example different access policies may apply to different objects of the same type. A work-around would be to introduce either different interfaces for the same class or have different policy settings for the same interface in different domains. However, access policies based on object method parameters values (input and/or output) can only be implemented in security-aware applications and therefore would suffer from interoperability problems. Compared with other authorization models [1, 6, 7], in particular those designed for object-oriented database systems, CORBA Security authorization lacks the notions of composite and compound subjects, hierarchies (of groups, roles, domains, rights), and negative rights. These models allow one to derive (implicit) authorizations from the relationships among subjects, objects (and in particular along the interface inheritance hierarchy), and rights, mostly by providing explicit derivation rules. A rich structure is therefore beneficial. However, a tradeoff is always necessary between the complexity of the derivation system and the understandability for a human being. To foster analysis of secure CORBA systems, it would be interesting to see how much of our access matrix model could be mapped on Sandhu's Typed Access Matrix model [13]. Accompanied by a logical language, consistency requirements could be axiomatized. In Definition 4, we formalized the simple consistency property of CORBA Security that each access-controlled object in a CORBA system must be a member of at least one domain. Thus, to check whether a given CORBA Security state is well-formed, it is sufficient to show that there is no empty row in partition Domains. More complex consistency properties could be defined, for example that a user request in delegate mode should have fewer privileges than in initiator mode. The axiomatization would also be the basis for identifying conflicts in access control policies when they are composed and then share common elements.
158
Giinter Karjoth
Acknowledgments The author thanks Messaoud Benantar, Bob Blakley, and Tony Nadalin for their patience in answering numerous questions concerning details of the CORBA Security specification.
References [1] E. Bertino, S. Jajodia, and P. Samarati. Access controls in object-oriented database systems: Some approaches and issues. In N. Adam and B. Bhargava, editors, Advanced Database Concepts and Research Issues, Lecture Notes in Computer Science #759, Chapter 2, Springer Verlag, 1993. [2] M. Benantar, G. Blakley, and A. Nadalin. Approach to object security in Distributed SOM. IBM Systems Journal, 35(2):192-203, 1996. [3] S.L. Chapin, W.R. Herndon, L. Notargiacomo, M.L. Katz, and T.J. Mowbray. Security for the common object request broker architecture (CORBA). In 10th Annual Computer Security Applications Conference, pages 21-30, Computer Society Press, 1994. [4] M. Harrison, W. Ruzzo, and J. Ullman. Protection in operating systems. Communications of the ACM, 19(8):461-471, 1976. [5] B. Hartman, D. Nessett, and N. Yialelis. Scalability of Security in Distributed Object Systems: Panel Session. In The Internet Society 1996 Symposium On Network And Distributed System Security (NDSS '96), pages 40-41, Computer Society Press, 1996. [6] S. Jajodia, P. Samarati, V.S. Subrahmanian, and E. Bertino. A Unified Framework for Enforcing Multiple Access Control. In A C M SIGMOD Int'l. Conf. on Management of Data, SIGMOD Record (ACM Special Interest Group on Management of Data), 26(2), pages 474-485, 1997. [7] D. Jonscher and K.R. Dittrich. Argos - A Configurable Access Control System for Interoperable Environments. In D.L. Spooner, S. Demurjian, and J. Dobson, editors, Database Security, IX: Status and Prospects, Chapman ~z Hall, 1996. [8] Object Management Group. OMG White Paper on Security. Document 94.4.16, OMG Security Working Group, April 1994. [9] Object Management Group. Object Services RFP 3. Document 94-7-1, 1994. [10] Object Management Group. Object Management Architecture Guide. Revision 3.0, June 1995. John Wiley & Sons. [11] Object Management Group. The Common Object Request Broker: Architecture and Specification. Revision 2.0, July 1995. [12] Object Management Group. Security Service Specification. In CORBAservices: Common Ob3ect Services Specification, Chapter 15, November 1997. [13] R.S. Sandhu. The typed access matrix model. In IEEE Symposium on Research in Security and Pmvacy, pages 122-136, Computer Society Press, 1992. [14] R.S. Sandhu and S. Ganta. On the minimality of testing for rights in transformation models. In IEEE Symposium on Research in Security and Privacy, pages 230-241, Computer Society Press, 1994. [15] D.L. Spooner. The impact of inheritance on security in object-oriented database systems. In C.E. Landwehr, editor, DATABASE S E C U R I T Y II: Status and Prospects, pages 141-150. Elsevier Science Publishers B.V. (North-Holland), 1989.
Rules for Designing Multilevel Object-Oriented Databases Frederic Cuppens 1 and Alban Gabillon 2 10NERA/CERT 2 Avenue E. Belin 31055, Toulouse Cedex, France Fax: 05 62 25 25 93 cuppens@cert, f r 2 Eastern Mediterranean University Famagousta Mersin 10, Turkey Fax: (00) 90 392 366 12 17 alban@compenet, compe, emu. edu. tr
A b s t r a c t . When implementing a multilevel security policy for ObjectOriented Databases (OODBs), several aspects have to be investigated. One of these aspect is the design of multilevel OODBs. In an OODB, data are organized in a complex structure built using different constructs (classes, objects, attributes, links ... ). Therefore, a first problem is to determine what constructs of the object-oriented model should be associated with a security level. A second problem is then to define semantics for each assignment of a security level to an object-oriented construct. While assigning the security levels, we have also to be careful with the inference problems which may occur due to the integrity constraints inherent in the object-oriented paradigm. Therefore, a last purpose of this paper is to define a set of general rules to cope with this problem.
1
Introduction
In a multilevel security policy, every subject is associated with a clearance level and every piece of information is associated with a classification level. A subject is a user or a process. Every user is assigned with his own clearance level whereas every active process running on behalf of a user is associated with a security level dominated by (i.e. is lower than or equal to) the clearance level of this user. Classifications and clearances are both taken out a set of security levels. For instance, the sensitivity levels Secret (S), Confidential (C) and Unclassified (U) may be used as security levels. Most of the existing multilevel security policies are based on the two following requirements of the Bell & LaPadula model [BL75]: N o R e a d U p : Subjects are only permitted to read data whose classification is dominated by their clearance.
160
Frederic Cuppens and Alban Gabillon N o W r i t e D o w n : Subjects are only permitted to write data whose classification dominates (i.e. is greater than or equal to) their clearance.
The "No Write Down" restriction is necessary to prevent a malicious agent (generally called a Trojan Horse) introduced in a high level process, from writing high classified data into a low classified file accessible by a subject cleared at a low level. When implementing a multilevel security policy for OODBs, several aspects have to be investigated. One of these aspect is the design of multilevel OODBs. Designing multilevel OODBs involves the three following issues: 9 Defining the information granularity. In the context of OODBs, defining the information granularity means determining what constructs of the object paradigm have to be associated with security levels. Should the objects be protected? What about the classes? the attributes? the inheritance hierarchy ?
9 Providing semantics for each association between a security level and a construct. As an example, what is the meaning of assigning a security level to an object? Does this association protects the object content (its properties) or the object itself (its existence) ? 9 Analyzing the effects of the classification of a particular construct on the classifications of other constructs. The knowledge of a low classified data may disclose the knowledge of another higher classified data. Thus, in order to control these unauthorized inferences, assignment of security levels must be carefully performed according to some inference control mechanisms. For example, the knowledge that a is an attribute of class c discloses the knowledge that c is a class. Therefore, the security level assigned to the fact "c is a class" must be dominated by the security level assigned the fact "a is an attribute of class c". The objective of this paper is to define a set of general principles addressing these three problems. These general principles are stated as rules that must be applied when designing multilevel OODBs. These rules are stated using an easyto-read presentation, i.e. without using any formal language. However, we argue that these rules are sound and complete and the interested reader is referred to [GAB95] for a proof of this result. Due to space limitation, this proof cannot be included in this paper. Note that [OS94] is another attempt to derive such a set of rules. However, our methodology leads to a much more complete set. Note that the objective of this paper is not to show how a conceptual multilevel OODB may be implemented. For this issue see the MultiView model [BCGY94][CG96]. The remainder of this paper is organized as follows. In section 2, we define the different kind of data that deserve to be protected in an OODB schema. In section 3 we do the same investigations but for objects. In section 4, we consider the particular case of methods. Finally, section 5 concludes this paper.
Rules for Designing Multilevel Object-Oriented Databases 2
Schema
161
Protection
T h e purpose of this section is to determine the semantics of every association between each of the following schema construct and a security level: 9 Class 9 Class attribute 9 Inheritance link We also state the inference control rules t h a t must be enforced when assigning security levels to these constructs. Note t h a t methods are also elements of the D a t a b a s e schema but since methods represent the dynamic aspect of the object-oriented model, we prefer to consider t h e m separately in section 4. 2.1
Classes
A class is an element of the D a t a b a s e schema. A class groups methods and attributes. Assigning a security level 1 to a class c could have two meanings: 9 The security level 1 protects the content of class c. In this case an insufficiently cleared user (i.e. a user cleared at level l ~ with 1~ < l is not permitted to know the list of properties included in class c. 9 The security level l protects the existence of the class. In this case an insufficiently cleared user is not authorized to know t h a t class c belongs to the Database. We consider t h a t an object represents a complex real world entity containing different pieces of information, each of t h e m having its own sensitivity. Since a class groups objects sharing the same properties, we argue t h a t each class property (attribute or method) m a y be separately classified. Thus, it would be restrictive to assign a security level to a class in order to globally classify its content because it would prevent us from separately classifying each class property. On the other hand, in some applications, assigning a security level to a class in order to protect its existence represents a very important functionality. So, we state the following rule:
Rule 1 Each class is associated with a security level. Assigning a security level to class c protects the existence of class c, i.e. the fact t h a t c is a class of the Database. In Figure 1, two classes are represented: the classes A I R C R A F T and H Y P E R S O N I C _ A I R C R A F T . We assume we have three security levels only: the Secret security level denoted by the letter S, the Confidential security level denoted by the letter C and the Unclassified security level denoted by the letter U. Secret
162
Pr@d@ricCuppens and Alban Gabillon
information is in b o l d , confidential information is in italic and unclassified information is in regular font. The existence of class A I R C R A F T is unclassified. Every user has the right to know t h a t this class belongs to the Database. By contrast, the class H Y P E R S O N I C _ A I R C R A F T is a confidential class. Only secret and confidential users can see it.
2.2
Class Attributes
Assigning a security level I to an attribute a of type t in a class c could have two meanings 1: 9 The security level l protects the fact that a t t r i b u t e a of class c is of type t. 9 The security level l protects the existence of attribute a, i.e. the fact that attribute a belongs to class c. In our opinion, having the possibility to protect the fact that a given attribute belongs to a given class is a very important functionality. In Figure 1, the fact that the Weapons attribute belongs to class A I R C R A F T is not protected but the fact that the Nuclear_Bomb attribute belongs to class A I R C R A F T is protected at the secret level. Protecting the fact that a given class attribute is of a given type is perfectly acceptable although we think this does not represent a crucial functionality. For instance, in Figure 1, we could associate the fact <> with a confidential (or secret) security level. This would allow us to hide the fact t h a t an aircraft may have several weapons. However adopting this functionality would lead us to manage two distinct security levels for each class attribute: one security level to protect the existence of the class attribute and one security level to protect its type. For the sake of simplicity, we prefer to associate each class attribute with a single security level protecting the existence of the attribute and, in the same time, implicitly protecting its type. So, we state the following rule: Rule 2 Each class attribute is associated with a security level. Assigning a security level to an attribute a of a class c protects its existence, i.e. the fact t h a t a is an attribute of c.
The knowledge << a is an attribute of class c >> discloses the fact t h a t c is a class. In Figure 1, if the fact << Speed is an attribute of class H Y P E R S O N I C _ A I R C R A F T >> were unclassified it would disclose to unclassified users the confidential fact << H Y P E R S O N I C _ A I R C R A F T is a class of the D a t a b a s e >>. In order to 1 One might consider a third meaning: The security level I is viewed as a constraint limiting the security level that may be assigned to the value of the attribute a in every instance of the class c. This meaning is used by [KTT89] in the SODA model but with the following generalization: Each class attribute is actually associated with an interval of security levels. The security level assigned to the value of an attribute of a given instance is constrained to be chosen from the interval associated with the corresponding class attribute.
Rules for Designing Multilevel Object-Oriented Databases
163
avoid such a disclosure of confidential information every attribute of class HYPERSONIC_AIRCRAFT must be at least confidential. Generalization of this principle leads to the following inference control rule:
Rule 3 The security level assigned to an attribute a of a class c must dominate the security level assigned to c.
f
AIRCRAFT (U) ~'~ f Attributes Name (U): string isa (c) Speed (U): integer Range (U): integer Weapons (U): set of strings N u c l e a r _ B o m b (S): b o o l e a n Methods Automatic_Pilot (U) , D r o p _ N u c l e a r _ B o m b (S)
HYPERSONIC_AIRCRAFT (C)~ Attributes
Name (C): string Speed (C): integer Range (C): integer Weapons (C): set o] stmngs N u c l e a r _ B o m b (S): boolean
Boosters (C): integer Methods
J Instance (U)
nstance (u)
/ Instance (S) !
o t (u)
o2 (u)
Name (U): JAGUAR (U) Speed (U): Mach 1.5 (U) Range (U): 1000 km (U) Weapons (V): {{Gun, Bomb} (V) {Gun, B o m b , Rocket} (S)} Nuclear_Bomb (S): F A L S E (S)
Name (U): MIRAGE 4000 (U) Speed (V): {Mach 2.5 (V), Mach 6 (S)} Range (U): 1000 km (C) Weapons (U): {Gun, Rocket,Bomb} (U) N u c l e a r _ B o m b (S): T R U E (S) B o o s t e r s (S): 2 (S)
Fig. 1. Example of a multilevel object-oriented Database
2.3
Inheritance Links
Classes inherits properties through inheritance links. In Figure 1, all properties of class AIRCRAFT are inherited by class HYPERSONIC_AIRCRAFT. Assigning a security level to an inheritance link between a class c ~ and a class c protects the existence of this link, i.e. it protects the fact that c t inherits c. So we state the following rule:
Rule
Each inheritance link is associated with a security level. Assigning a security level to an inheritance link between a class c and a class d protects the fact that c ~ inherits from c.
164
Fr@d@ricCuppens and Alban Gabillon
The knowledge << class c ~ inherits from class c ~ discloses the fact that c and c t are classes. In Figure 1, if the link between A I R C R A F T and H Y P E R S O N I C _ A I R C R A F T were unclassified, it would disclose to unclassified users the fact that H Y P E R S O N I C _ A I R C R A F T is a class of the Database. In our example, the purpose of the confidential security level assigned to the inheritance link between class H Y P E R S O N I C _ A I R C R A F T and class A I R C R A F T is to avoid the disclosure of the existence of confidential class H Y P E R S O N I C _ A I R C R A F T . Generalization of this principle leads to the following inference control rule: Rule 5 The security level assigned to an inheritance link between two classes c and c / must dominate the least upper bound of the security level assigned to c and the security level assigned to cq The knowledge << a is an attribute of class c and class c t inherits from c >> discloses the fact <~ a is an attribute of class c I >> (since a is inherited by c~). In Figure 1, the existence of class attribute Nuclear_Bomb in class H Y P E R S O N I C _ A I R C R A F T is secret whereas the existence of the inheritance link between A I R C R A F T and H Y P E R S O N I C _ A I R C R A F T is confidential. If attribute Nuclear_Bomb in class A I R C R A F T were confidential, it would disclose to confidential users t h a t Nuclear_Bomb is an attribute of H Y P E R S O N I C _ A I R C R A F T . So in our example, either the security level assigned to attribute Nuclear_Bomb in class A I R C R A F T must be at least secret or the security level assigned to the inheritance link between A I R C R A F T and H Y P E R S O N I C _ A I R C R A F T must be at least secret. Generalization of this principle leads to the following inference control rule: Rule 6 The least upper bound of the security level assigned to an attribute a of a class c and the security level assigned to an inheritance link between a class c I and c must dominate the security level assigned to a in class c I. 3
Instances
Protection
Defining the information granularity for objects is subject to a large debate. Basically, there are two possible approaches: 1. The Single Level Object approach [ML92][JKg0]. In this approach, every object is assigned with a security level. This security level protects all the information encapsulated in this object. Security models based on this approach are easy to implement. Indeed, the logical multilevel OODB can be defined as a set of physical single level Databases. Each single level Database of level l contains all the objects of level 1. M a n d a t o r y access controls based on the Bell & L a P a d u l a model can be easily interpreted. However, in this approach, the expressive power of the multilevel policy is poor. 2. The Multilevel Object approach [KTT89][VB91]. In this approach, every attribute of every object is assigned with a security level. This security level
Rules for Designing Multilevel Object-Oriented Databases
165
protects the value of this attribute. Security models based on this approach provide the multilevel policy with higher expressive power. In our opinion, the single level object approach is too restrictive. An object may represent a complex real world entity and may contain data of different sensitivities. So, in this article, we adopt the Multilevel Object approach. Note that directly implementing multilevel objects is impractical since it would lead to manage one physical multilevel Database. Such an implementation is not realistic since it is very difficult to define secure operations on a single multilevel file. This explains why some models using multilevel objects propose a technique to decompose a multilevel object into several single level objects [BJ93][JK90][CG96]. However this implementation issue is not the purpose of this paper. The objective of this section is to determine the semantics of every association between each of the following constructs and a security level: 9 9 9 9
Object Inheritance link Object attribute Attribute value
We also state the inference control rules that must be enforced when assigning security levels to these constructs. 3.1
Objects
We consider that every property of an object must be separately classified. Thus, assigning a security level to an object in order to protect its content is useless. However as for classes, it is useful to assign a security level to each object in order to protect its existence in the Database. So, we state the following Rule: Rule 7 Each object is assigned with a security level. Assigning a security level to an object o protects the fact that o is an object of the Database. In Figure 1, there are two objects O1 and 0 2 whose existences are unclassified. 3.2
I n s t a n c e Links
Objects inherit attributes through instance links. Assigning a security level to an instance link between an object o and a class c protects the existence of this link, i.e. the fact that object o is an instance of class c. So we state the following rule:
166
Fr@d@ricCuppens and Alban Gabillon
Rule 8 Each instance link is assigned with a security level. Assigning a security level to an instance link between an object o and a class c protects the fact that o is an instance of c. In Figure 1, the fact that O1 is an aircraft is unclassified. On the other hand, the fact that 0 2 is an aircraft is unclassified whereas the fact that it is an hypersonic aircraft is secret. This means that unclassified and confidential users should believe that 0 2 is a classic aircraft and should ignore that it is actually an hypersonic aircraft. The knowledge << object o is an instance of class c >> discloses the fact that o is an object and c is a class. In Figure 1, if the link between 0 2 and H Y P E R S O N I C _ A I R C R A F T were unclassified, it would disclose to unclassified users the fact << H Y P E R S O N I C _ A I R C R A F T is a class of the Database >>. In order to avoid such a disclosure, the security level assigned to the instance link between 0 2 and H Y P E R S O N I C _ A I R C R A F T must be at least confidential. Generalization of this principle leads us to the following inference control rule:
Rule 9 The security level assigned to the instance link between an object o and a class c must dominate the least upper bound of the security level assigned to o and the security level assigned to c. One of the integrity constraints inherent in the object paradigm is the following: an object must be instance of at least one class. Therefore, the knowledge << o is an object >> discloses the fact << there exists at least one class c such that o is instance of c >>. In Figure 1, if the instance link between 0 2 and A I R C R A F T were confidential, unclassified users would not be able to see any parent class for object 02. Hence, these unclassified users would be able to determine the existence of a protected instance link involving 02. In order to avoid such a disclosure, there must be at least one unclassified instance link involving object 02. Generalization of this principle leads to the following inference control rule2:
Rule 10 If an object o is associated with a security level l, then o must be instance of at least one class c such that the instance link between o and c is classified at level l. Many authors [KTT89][Lun90][ML92] add the following more restrictive rule: If an object o is an instance of a class c, then the security level of o must dominate the security level of c. 2 From this reasoning, we can derive that "If the object o is associated with the security level l, then the object o must be instance of at least one class c such that the instance link between the object o and the class c is classified at a level dominated b y / ' . However, due to Rule 8, the security level assigned to the instance link between the object o and the class c must dominate the level l. Combination of these results leads to Rule 10.
Rules for Designing Multilevel Object-Oriented Databases
167
This rule is unnecessarily restrictive if one considers that (a) an instance link m a y be explicitly classified and (b) an object may be an instance of several different classes. In particular, this rule would not enable us to consider that 0 2 is an instance of class A I R C R A F T (at the unclassified level) but also an instance of class H Y P E R S O N I C _ A I R C R A F T (at the secret level).
3.3
O b j e c t Attributes
Assigning a security level l to an a t t r i b u t e a of an object o could have two meanings: 9 The security level l protects the value of attribute a in object o. 9 The security level 1 protects the existence of attribute a in object o. Having the possibility to protect the value of an attribute is obviously an imp o r t a n t requirement. Therefore, protection of attribute values will be separately considered in section 3.4. However, having the possibility to protect the fact that a given attribute belongs to a given object is also an i m p o r t a n t functionality. For example, in Figure 1, the existence of Nuclear_Bomb attribute in object O1 is secret. We state the following Rule:
Rule 11 Each object attribute is associated with a security level. Assigning a security level to an a t t r i b u t e a of an object o protects the fact t h a t a is an attribute of o. T h e knowledge << a is an attribute of object o >> discloses the fact t h a t o is an object. So we state the following inference control rule:
Rule 12 The security level assigned to an attribute a of an object o must dominate the security level assigned to o. T h e knowledge << a is an a t t r i b u t e of class c and o is an instance of class c >> discloses the fact <~ a is an attribute of object o >> (since a is inherited by o). In Figure 1, the existence of attribute Boosters in class H Y P E R S O N I C _ A I R C R A F T is confidential whereas the existence of this attribute in object 0 2 is secret. If the instance link between 0 2 and H Y P E R S O N I C _ A I R C R A F T were confidential, it would disclose to confidential users t h a t Boosters is an attribute of 02. So in our example, either the security level assigned to attribute Boosters in class HYP E R S O N I C _ A I R C R A F T must be at least secret or the security level assigned to the instance link between 0 2 and H Y P E R S O N I C _ A I R C R A F T must be at least secret. Generalization of this principle leads to the following inference control rule:
168
Fr@d~ricCuppens and Alban Gabillon
Rule 13 T h e least u p p e r b o u n d of the security level assigned to an a t t r i b u t e a in a class c and the security level assigned to an instance link between an object o and c must d o m i n a t e the security level assigned to a in object o. Objects inherit attributes from classes. Thus, the knowledge << a is an a t t r i b u t e of object o >> discloses the fact t h a t there is at least one class c such t h a t o is an instance of c and a is an a t t r i b u t e of c. In Figure 1, if a t t r i b u t e Nuclear_Bomb of O1 were confidential, it would disclose to confidential users t h a t O1 is instance of at least one class having a secret Nuclear_Bomb attribute. In order to avoid such a disclosure, Nuclear_Bomb a t t r i b u t e of O1 m u s t be secret. Generalization of this principle leads to the following inference control rule:
Rule 14 If an a t t r i b u t e a of an object o is classified at level l, then o m u s t be instance of at least one class c such t h a t a is an attribute of c a n d 1 is equal 3 to the least upper b o u n d of the security level assigned t o a in class c and the security level assigned to the instance link between o and c.
3.4
Attributes Values
Assigning a security level to an a t t r i b u t e value v of an attribute a of an object o protects the fact t h a t the value of the a t t r i b u t e a of object o is equal to v. For instance, in Figure 1, the fact t h a t the n a m e of O1 is J A G U A R is an unclassified information. T h e fact t h a t the weapons of 0 2 are {Gun, Bomb, Rocket} is also an unclassified information. T h e fact t h a t the range of object 0 2 is 3000 km is a confidential information and so on . .. So we state the following Rule.
Rule 15 E a c h a t t r i b u t e value is associated with a security level. Assigning a security level to a value v of an a t t r i b u t e a of an object o protects the fact t h a t the value of a in object o is equal to v. T h e knowledge << v is the value of a t t r i b u t e a of object o >> discloses the fact ~< a is an a t t r i b u t e of object o >>. So we state the following inference control rule:
Rule 16 T h e security level assigned to a value v of an attribute a of an object o must d o m i n a t e the security level assigned to a in object o. One of the integrity constraints of the object p a r a d i g m is the following: an object a t t r i b u t e must have a value 4. Therefore, the knowledge << a is an a t t r i b u t e of object o >> discloses the fact << there exists a value v for attribute a of object o >>. In Figure 1, the R a n g e a t t r i b u t e of 0 2 is unclassified whereas the value 3000 3 Due to Rule 13, it cannot be greater. a This value may be a NULL value.
Rules for Designing Multilevel Object-Oriented Databases
169
of this attribute is confidential. Unclassified users can see the Range attribute of 02, but they cannot see any value for this attribute. Therefore, these users can guess the existence of a high classified value for the Range attribute of 02. Basically, there are two solutions to avoid these kind of disclosure: 9 Stating a rule enforcing t h a t << the security level associated to the value v of an attribute a of an object o must be equal to the security level associated to a in object o >>. Clearly, this Rule is too restrictive. It would considerably downgrade the expressive power of the security policy since it leads to assign a single security level protecting b o t h the existence of the object attribute and its value. We reject it without any hesitation. 9 Allowing object attributes to be polyinstantiated [DLSSH88]. Polyinstantiation means that an object attribute m a y be associated with different values. These values are distinguished by their classification levels. This functionality is a means to implement cover stories. A cover story is a low classified d a t a deliberately introduced to hide the existence of another higher classified data. In Figure 1, the value Mach 2.5 of the attribute Speed of 0 2 is typically a cover story. This value is a lie provided to any unclassified user who wants to know the speed of the Mirage 4000. Since unclassified users are provided with this lie, they cannot guess the existence of a secret speed value for the Mirage 4000. Cover stories are also used to partially hide the truth. The value {Gun, Bomb} of the Weapons attribute of O1 is not really a lie. It rather represents an incomplete information since the J a g u a r is also equipped with rockets, as indicated in the secret value {Gun, Bomb, Rockets} 5. We state the following rule: Rule 17 An object a t t r i b u t e of level l m a y be associated with different values distinctly classified with a security level dominating the level l. There is no particular rule to decide whether an object attribute must be polyinstantiated or not. The decision to disclose or not to low cleared users the existence of a high classified d a t a is up to the D a t a b a s e Designer. In the particular case of the Speed attribute of object 0 2 , the decision is not to disclose the existence of the secret value Mach 6, leading to the insertion of the cover story Mach 2.5. On the other hand, in the case of the Range attribute of object 02, the D a t a b a s e Designer decided t h a t it was not i m p o r t a n t to hide to unclassified users the fact that the range value of the M I R A G E 4000 was a protected information. Thus, the Database Designer did not insert a cover story for this attribute at the unclassified level.
5 A similar comment may be done for the object 02 which is an instance of AIRCRAFT at the unclassified level. This represents an incomplete information since 02 is an instance of HYPERSONIC.AIRCRAFT but this is secret information.
170
Fr@d@ricCuppens and Alban Gabillon
The value of an object attribute may be another object. In this case, such a value represents an association between two objects. Thus, the knowledge ~< the value of attribute a of object o is object o~>> discloses the fact << o~ is an object >>. So we state the following inference control rule:
Rule 18 If a value v of an attribute a of an object o is equal to an object o~ then the security level assigned to v must dominate the security level assigned to oq In Figure 1, for the sake of simplicity, we did not represent such an association between two objects. So, in order to illustrate Rule 18, let us assume that we modify figure 1 by creating a third class MISSION and let us consider that 0 3 is a secret instance of this class. Let us now add an attribute Used_in to the class A I R C R A F T to represent the fact that a given aircraft is used in a given mission and, let us assume that aircraft O1 is used in mission 03. This means that the value of attribute Used_in of object O1 is equal to 03. In this case Rule 18 states that the security level assigned to this value 0 3 must be at least secret since we assumed the existence of mission 0 3 to be secret.
4
Methods
A method is a program implementing an operation. A method is attached to a class. It is associated with a signature and a code. Assigning a security level l to a method m of a class c with signature s and code p could have four meanings: * The security level l protects the existence of method m , i.e. the fact that method m belongs to class c. . The security level 1 protects the signature of the method, i.e. the fact that the signature of method m of class c is s. . The security level l protects the code of the method, i.e. the fact that the code of method m of class c is p. . The security level l protects the execution of method m of class c. Let us examine these four possibilities. A method is a class property. Having the possibility to protect the fact that a given method belongs to a given class is an important functionality. In Figure 1, we consider that the security levels assigned to the methods correspond to this first meaning. Therefore, the fact that the Automatic_Pilot method belongs to class A I R C R A F T is not protected but the fact that the Drop_Nuclear_Bomb method belongs to class A I R C R A F T is protected at the secret level. So we state the following rule:
Rule 19 Each method is associated with a security level. Assigning a security level to a method m of a class c protects its existence, i.e. the fact that m is a method of c,
Rules for Designing Multilevel Object-Oriented Databases
171
The knowledge << m is a method of class c >> discloses the fact that c is a class. In Figure 1, if the fact << Automatic_Pilot is a method of class H Y P E R S O N I C _ A I R C R A F T >> were unclassified, it would disclose to unclassified users the confidential fact <~ H Y P E R S O N I C _ A I R C R A F T is a class of the Database >>. In order to avoid such a disclosure of confidential information every method of the class H Y P E R S O N I C _ A I R C R A F T must be at least confidential. Generalization of this principle leads to the following inference control rule:
Rule 20 The security level assigned to a method m of a class c must dominate the security level assigned to c.
Rule 21 The least upper bound of the security level assigned to a method m in a class c and the security level assigned to an inheritance link between a class c~ and e must dominate the security level assigned to m in class C t .6
Let us now turn to the second possibility. Protecting the code of a method may be an important requirement. Since a method is attached to a class, the code of a method discloses much information about the behavior of the instances of the class. Thus, the code of a method may be associated with its own security level. Notice however that classifying the code of methods is not illustrated in Figure 1 (the code of methods is not presented in Figure 1, anyway). So we state the following rule:
Rule 22 The code of a method is associated with a security level. Assigning a security level to the code p of a method m of a class c protects the fact that the code of m in class c is p. The knowledge << the code of method m of class c is p >> discloses the fact << m is a method of class c >>. So we state the following inference control rule:
Rule 23 The security level assigned to the code p of a method m of a class c must dominate the security level assigned to m in class c. Observing the code of a method may disclose the existence of some objectoriented constructs referenced in the statements of this code. So we state the following inference control rule:
Rule 24 The security level assigned to the code p of a method m of a class c must dominate the least upper bound of security levels protecting the existences of object-oriented constructs referenced in p. For instance, it is likely that method Drop_Nuclear_Bomb references attribute Nuclear_Bomb. Therefore, Rule 24 requires that the security level assigned to the Note that attributes can be seen as a special case of methods. Under this view, rules 2,3 and 6 are subsumed by the rules 19,20 and 21.
172
Frederic Cuppens and Alban Gabillon
code of method Drop_Nuclear_Bomb must dominate the security level assigned to the existence of attribute Nuclear_Bomb in class AIRCRAFT. The third possibility is to protect the signature of a method. This is perfectly acceptable but, as for attributes type, we think that this does not represent a crucial requirement. So we do not consider this possibility. Finally, the last possibility is to assign a security level to a method in order to protect its execution. This would mean that an insufficiently cleared user would not be able to execute the method. We claim that this kind of protection is not relevant to a confidentiality problem. More precisely, from the confidentiality point of view, the only restriction upon the execution of a method is stated by the following rule: Rule 25 A method rn of a class c may be executed by any subject who can see this method, i.e. any subject whose clearance dominates the security level assigned to this method.
Of course, additional security requirements may be specified so that some subjects are authorized to execute a given method and other subjects are not authorized to do so. For instance, depending on its role in the Database system, a given subject may be authorized to execute the method Automatic_Pilot whereas another subject may be not authorized to do so. Our claim is that it is not the purpose of a multilevel security policy to define these security requirements but instead it is the purpose of a Discretionary Access Control policy (DAC, see for instance [PHD88]) or a Role Based Access Control policy (RBAC, see for instance [DHDT95]). Thus we shall not consider that a security level may be used to protect the execution of a method. After starting execution, a method becomes a subject associated with a clearance level I inherited from the user on behalf of whom it is running. If this executing method tries to access data associated with a security level strictly dominating level l, then this method will normally fail due to insufficient clearance.
5
Conclusion
Our objective in this paper was to define some general principles for designing multilevel OODBs. We first determine the constructs of the object paradigm that should be associated with security levels. Then, we define semantics for each association between a security level and a construct. Finally, we state some inference control rules that must be enforced when assigning security levels. In [CG96], we provide a formal method allowing us to derive from the integrity constraints of the object paradigm, all of the inference control rules presented in this paper. In [GAB95], we show that the set of inference control rules derived by
Rules for Designing Multilevel Object-Oriented Databases
173
this formal method is sound and complete according to the integrity constraints of the object paradigm. These rules do not provide us with a complete methodology for designing secure Database applications. In M O M T (Multilevel Object Modelling Technique) [MST96], designing such an application involves the three following steps: 1. Analyzing the requirements to detect potential security vulnerabilities. 2. Designing the multilevel Database. 3. Designing the modules of the automated system. Clearly, Rules 1 to 25 stated in this paper should be used in second step of this methodology. To our knowledge, we are not aware of any proposal considering a complete list of rules such as the one we define in this paper. For example, only a small subset of rules 1 to 25 are effectively taken into consideration in MOMT. Moreover, other proposals do not always provide clear semantics for the association between an object-oriented construct and a security level. An issue not addressed in this paper is implementation of a conceptual multilevel Database. As an example, implementing as a single multilevel file the Database represented in Figure 1 is not realistic. For security reasons, classified data of a n-level Database should be distributed over a set of n single level Databases. See MultiView model [BCGY94][CG96] for a discussion and for a presentation of the implementation principles.
References [BCGY94] N. Boulahia-Cuppens, F. Cuppens, A. Gabillon and K. Yazdanian. Decomposition of Multilevel Objects in an Object-Oriented Database. In European Symposium on Research in Computer Security. UK 1994. Springer Verlag. [BJ93] E. Bertino, S. Jajodia. Modelling Multilevel Entities Using Single Level Objects. Proc. of the 3th international conference on Deductive and Object-Oriented Databases (DOOD93). [BL75] D. Bell and L. LaPadula. Secure Computer Systems: Unified Exposition and Multics Interpretation. Technical Report ESD-TR-75-306, MTR 2997, MITRE, Bedford, Mass. 1975. [cG96] F. Cuppens and A. Gabillon. A Logical Approach to Model a Multilevel Object-Oriented Database. Proc. of the 10th Annual IFIP WG 11.3 Working Conference on Database Security. Como Italy. 1996. [DHDT95] S. Demurjian, M. Hu, T. Dagget and T. Ting. User-Role Based Security Enforcement Mechanisms for Object-Oriented Systems and Applications. Proc. IFIP WG 11.3 Workshop on Database Security. Rensselaer USA. 1995. [DLSSH88] D. Denning, T. Lunt, R. Schell, W. Shockley and M. Heckman. The Sea View Security Model. Proc. of the 1988 IEEE Symposium on Research in Security and Privacy. Oakland. 1988.
174
Frederic Cuppens and Alban Gabillon
[GAB95] [JK90]
[KTT89] [LUN90]
[ML921
[MST96]
[os94] [PHD88]
[VB91]
A. Gabillon. S~curite multi-niveaux dans les bases de donn~es tt objets. Ph.D. dissertation. ENSAE 1995. S. Jajodia and B. Kogan. Integrating an Object-Oriented Data Model with Multilevel Security. Proc. of the 1990 IEEE Symposium on Security and Privacy. Oakland. 1990. T. Keefe, W. Tsai and B. Thuraisingham. SODA: A Secure ObjectOriented Database System. Computer and Security, 8(6), 1989. T.F. Lunt. Multilevel Security for Object-Oriented Database Systems. In D.L. Spooner and C. Landwehr editors. Database Security III: Status and Prospects. North-Holland 1990. Result of the IFIP ~vVG11.3 Workshop on Database Security. J.K. Millen and T.F. Lunt. Security for Object-Oriented Database Systems. Proc. of the 1992 IEEE Symposium on Research in Security and Privacy. Oakland. 1992. D. Marks, P. Sell and B. Thuraisingham. MOMT: A Multilevel Object Modelling Technique for designing secure database applications. Journal of Object-Oriented Programming. July-August 1996. M. Olivier and S. Von Solms. A Taxonomy for Secure Object-Oriented Databases. ACM Transactions on Database Systems. Vol 19 (1). March 1994. H. Pfefferle, M. Hartig, K. Dittrich. Discretionary Access Control in Structurally Object-Oriented Database Systems. Proc. IFIP WG 11.3 Workshop on Database Security. 1988. V. Varadharajan and S. Black. Multilevel Security in a Distributed ObjectOriented System. Computer and Security, Vol 10. 1991.
B y t e C o d e Verification for Java Smart Cards B a s e d on M o d e l Checking* J o a c h i m P o s e g g a a n d Harald Vogt Deutsche Telekom AG Technologiezentrum IT Security D-64307 Darmstadt Tel. +49 6151 83-7881, Fax-4090
(poseggalvogth}@tzd. telekom, de
A b s t r a c t . The paper presents a novel approach to Java byte code verification: The verification process is performed "offline" on a network server, instead of incorporating it in the client. Furthermore, the most critical part of the verification process is based upon a formal model and uses a model checker for checking the verification conditions. The result of the verification process can be securely communicated to the runtime platform with cryptographic means. The major advantages of our approach are twofold: on the one hand, it offers a higher degree of security, since the verification process is based on a formal framework. Secondly, it saves resources on the client's side, since the process of byte code verification can be replaced by a simple check of a digital signature. This paper concentrates on Java smart cards, where resource limitations inhibit fully-fiedged byte code verification within the client, but the demand for security is very high. However, our approach can also be applied to other variants of Java.
1
Introduction
J a v a [GS96] is an instance of down-loadable code t h a t m a n y people see as a p a r a d i g m shift in c o m p u t e r science. J a v a is a network-oriented a p p r o a c h with two m a j o r advantages: Firstly, J a v a is s u p p o s e d t o be plat]orm-independent. This is achieved by compiling J a v a applets or J a v a applications into Java b y t e code [LY96], a stackoriented machine l a n g u a g e t h a t is interpreted on the specific platform where J a v a p r o g r a m s are t o be executed (cf Figure 1). T h e interpreter for this byte code is called the Java Virtual Machine. Secondly, J a v a applets offer the principle of downloading executable code over a network on demand; this is a significant gain in flexibility, since it allows one * The opinions expressed in this paper are solely those of the authors and do not necessarily reflect the views of Deutsche Telekom AG.
176
Joachim Posegga and Harald Vogt
to configure the function of clients in a network as needed; the expensive process of installing software on clients becomes superfluous. The technical basis for both these specifics of Java have been known in computer science since decades. However, the overall design of the Java scenario is very promising, and the technology became widely accepted and available on various platforms. This makes Java interesting. A potential hindrance for the wide-spread use of Java are security concerns. The problem is, essentially, that down-loading executable code from an open network can be dangerous, since it is hard to ensure that such an executable actually does not do any harm to the local system. Java's answer to this problem is, in essence, a type-safe language, a byte code verifier that checks certain safety properties of the transferred executable code, and a sandbox model that restricts the runtime environment of the down-loaded code. The Java byte code verifier Fig. 1. Java's Architecture plays a crucial role in this architecture (cf Figure 1): implemented within the user's Java platform, it ensures that the byte code to be executed meets certain properties like type-safety. The sandbox that runs Java applets takes these properties for granted, mainly for reasons of efficiency. Without this verification step, malicious byte code (e.g. code that has been hand-coded, or been manipulated during the transfer) can crash the sandbox and easily take over complete control of the underlying machine. In this paper we describe a novel security architecture for Java, where the process of byte code verification is carried out using a model checker; this process is assumed to be implemented "offiine", i.e. on a server in a network and not within the client. The result of such a verification step can be communicated to a client, e.g. by applying digital signatures to Java byte code programs (applets). Our approach has two major advantages: firstly, it offers a strong formal basis for an operation that is crucial to security. This helps avoiding bugs in the byte code verifier that can cause security holes. Secondly, the principle of carrying out the byte code verification offiine offers much more flexibility than the current apFig. 2. Byte Code Ver- proach, since it does not suffer from resource restricification tions usually found within the clients, both for size and speed of the Java runtime platform. The major motivation for our research were the specifics of Java smart cards [Sun97a,Sun97b,KP98], where byte code verification is particularly problematic
Byte Code Verification for Java Smart Cards
177
because of the extreme space and runtime limitations of smart cards processors. The concept of offiine verification fits perfectly into this scenario: it saves resources on the card, and formal approaches like model checking can increase security, which is highly desirable for smart card applications. Whilst the advantages of our approach are most obvious in the case of smart cards, the approach itself can also be carried forward to the general framework of Java. The paper is organised as follows. In the next Section (2) we discuss the role of byte code verification within Java's security architecture. Section 3 outlines the idea behind performing byte code verification offiine and compares it with the current architecture. In Section 4 we introduce the use of model checking for the semantic checks of Java's byte code verification, Section 5 gives a detailed example. Finally, we discuss related research in Section 6 and draw conclusions from our work in 7. We assume the reader to be familiar with the basics of model checking [EC82], some familiarity with Java's security architecture [Ye195,LY96] and Java byte code [LY96] is also helpful for the understanding of this paper. 2
Java
Byte
Code
Verification
The Java runtime system loads classes from class files, which are often fetched over a network on demand. For each newly loaded class, its class file is checked by the class file verifier, part of which is the byte code verifier. The checks performed can be roughly divided into mainly syntactic and mainly semantic checks. Examples of syntactic checks are verifying the correct syntactic format of the class file, the length of attributes, the correct declaration of the class, its fields and methods, etc. Since these syntactic checks are comparably simple and not of particular interest for the sequel of this paper, we will not elaborate on them. We will concentrate on the semantic checks of the verifier, in particular, we will show in Section 4 how these can be implemented using a model checker.
2.1
Semantic Checks o f the Java B y t e C o d e Verifier
The byte code verifier ensures that the byte code itself, which implements Java methods, is "legal", i.e. it follows certain rules. A detailed, though informal description of what this is supposed to mean can be found in [Ye195,LY96]. 1 Essentially, the idea behind byte code verification is to ensure that the byte code has been generated by a conformant compiler, thus observing several implicit rules. The Java interpreter within the client takes conformance of the byte code 1 From a rigorous point of view one can argue that the byte code verifier does in fact not perform a verification step: verification as an isolated notion does not make sense, instead one can only verify something against a specification. If such a specification is not explicitly given, the notion of "verification" is at least from a formal point of view misleading.
178
Joachim Posegga and Harald Vogt
to these rules for granted, mainly for reasons of efficiency. Therefore, byte code verification is essential for Java security since it is not obvious that the byte code always conforms to these rules: it could have been manipulated or an attacker could even have written malicious byte code by hand for crashing the Java interpreter. The semantic checks in the byte code verification process form the most complex part of the class file verification. In particular, a data flow analysis of each method is carried out, which is is closely related to the Java type system: Starting with the first instruction of a method, the effect of that instruction to the operand stack and to local variables is computed. This computation does not consider particular values of fields or variables but their type information, only. The state reached thereby is taken as a precondition for the following instructions. Subsequently, such a simulation is applied to all instructions of a method. For secure execution, several conditions must hold for each instruction, depending on the type of instruction: - For each instruction, an appropriate number of parameters must be on the operand stack, local variables must have appropriate types, and there must be enough room on the stack for storing the results of the instruction. If an instruction can be reached by different execution paths, the values of the operand stack and the local variables must have compatible types in all these paths. - Each method of an object that is called must have been initialised by certain initialisation methods, and the type returned by a method must match its declaration. Jumps to invalid instructions are forbidden, and execution must not go beyond the last byte of the code. Additionally, there are certain restrictions on how the instructions for subroutines may be used. Subroutines are (among other things) used for implementing t r y / f i n a l l y constructs in the Java language [LY96]. In summary, a byte code program that meets these criteria is supposed to be safe and not capable of "breaking" the virtual machine. A program which fails to meet one of these criteria is refused since its safe execution can not be guaranteed, and it might lead to gaining unauthorised access to resources on the client by crashing the Java interpreter. 2.2
Problems of
the
Current
Approach
We see several disadvantages of the byte code verifier and the current security architecture of Java: Lack o f f o r m a l i t y . Current implementations of byte code verification lack formality, in the sense that there is no formal model of what the byte code verifier exactly performs, and what exact properties a successfully verified applet actually has.
Byte Code Verification for Java Smart Cards
179
This lack of formality makes it difficult to avoid bugs in a verifier's implementation, as the history of Java bugs clearly demonstrates [DFW96,Pos98]. W a s t e o f s p a c e o n t h e c l i e n t ' s side. The byte code verifier is a program of considerable size, which needs to be installed on the client. This might be acceptable on a PC, but other Java-enhanced devices like PDAs, Java-based phones or Java smart cards have much tighter space constraints. W a s t e o f clienCs C P U t i m e . The process of byte code verification consumes a significant amount of CPU time, which is again a problem in PDAs, phones, or smart cards that use comparatively slow processors. Furthermore, the verifier is called each time an applet is loaded, thus resulting in multiple checks of the same byte code. Altogether, we found that the current security architecture of the Java platform has a number of deficiencies, in particular for application areas like Java smart cards. On such a card, only a few KBytes are available for the complete Java platform including applications, and it will take at least some years until progress in chip manufacturing will allow for implementing an explicit byte code verification on the chip itself. We therefore considered an alternative architecture, where the byte code verification proceeds offiine on a server in a network. This also allows to apply a more formal framework for byte code verification which can significantly strengthen Java security.
3
Oftiine
Byte
Code
Verification
Offiine byte code verification is carried out on a server in a network, rather than on the client itself. Successfully verified byte code programs are signed with the verifier's digital signature and passed to the client for execution. Figure 3 depicts this principle. A client loading byte code of an applet just has to check the digital signature of the code to make sure that the byte code has been checked by a verifier it trusts. Note that the verification server in the network does not need to check an applet more than once: it can behave like a proxy server and store results for a while or even check applets in advance. We will first discuss the general advantages Fig. 3. Offiine Byte Code Verification of offiine byte code verification, and turn to our particular approach using model checking in Section 4. Offiine byte code verification has a number of significant advantages:
180
Joachim Posegga and Harald Vogt
- It allows the use of (nearly) unlimited resources in the process of byte code verification, because the resource limitations on a client platform (both in time and space) are circumvented. Offiine verification is only limited by the amount of time a user is willing to wait when loading applets. In m a n y application areas it is even acceptable that applets are made known to the verifier before being used, so offiine byte code verification may indeed spend several minutes or even hours or days for analysing the byte code of an applet and invest as m a n y resources as are available on a (powerful) server. In particular, this allows one to use a formal framework for byte code verification, like the one we propose in Section 4. - Offfine byte code verification can be customised to special needs without having access to the runtime platform. So far the byte code verification in the standard J a v a environments has come as a fixed "bundle" of tests applied to the byte code. These tests might or might not suit the needs of users. For instance, one could think to include approaches like proof-carrying code [NL96] or even apply semi-automatic approaches, like a functional verification of the byte code. It saves resources at the client's platform since a client that loads an applet just has to check the digital signature of the incoming code. - It offers potentially more security t h a n the current architecture: The current architecture depends on verification engines t h a t are distributed and individually used. Several bugs in commercial products have shown t h a t it is occasionally necessary to replace these components, which is expensive in time and money for both the manufacturer and the customer. Further, it is unlikely t h a t all customers are aware of security issues and thus use the latest versions or bugfixes. In contrast to this, a central component can easily be replaced as errors show up. Overall security in the network is no longer defined by the weakest client, but depends only on one single well-maintained, well-implemented, well-protected component. -
We see also some disadvantages of oifline byte code verification; in particular: It requires a trusted service in the network since otherwise the result of the verification process would be arbitrary. - There might be problems with scaleability if such a verification service is implemented naively: If plenty of different applets and clients are involved, a verification server might run into performance problems and form a bottleneck for the clients. However, for application areas where only a small number of different applets are used, this drawback is negligible, and it vanishes completely if these applets can be made known to the server in advance.
-
Overall, we consider the advantages of offiine byte code verification to clearly outweigh its disadvantages for most application areas. In particular, such an approach allows the integration of formal methods into the process of byte code verification, as described in the sequel.
Byte Code Verification for Java Smart Cards 4
Model
Checking
of Java
Byte
181
Code
Given a description of a state transition system and a temporal logic formula, a model checker [EC82] decides whether the formula holds in the system. This works by completely examining the system's state space, which therefore needs to be finite. A byte code program can be viewed as a description of a state transition system performing transitions on states of the JVM. The state space of the JVM is of course potentially infinite, therefore model checking is not directly applicable to such a system, and we need to derive a finite abstraction, first (cf. [CGL94]). On the abstract model, it is possible to check properties which are formulated in a temporal logic like CTL. In such a logic, it is possible to express system properties involving information on states (e.g. typing of variables) and the system's dynamic behaviour. The properties which are to be checked by the process of byte code verification as defined in [LY96] can be expressed in CTL. Thus, by feeding a model checker with the model description and the respective CTL formulas, the model checker acts as a byte code verifier. 4.1
Building an Abstract Model
Byte code verification is applied to each method separately, therefore only the information relevant to a single method has to be included in the abstract model. Thus, a state of the abstract model essentially consists of a program counter, an operand stack, and the local and instance variables. Additional state variables are added for the bookkeeping of subroutine information and exceptions. References to the constant pool, which contains data used class-wide, can be pre-evaluated as these data items do not change. Certain instructions can throw exceptions. This means that an exception object is placed on the operand stack and execution continues at a specific point which is the entry of an exception handler. If no handler is defined explicitly, a default handler in the JVM is invoked: it takes the exception object from the stack, terminates the current method, returns to the calling method and rethrows the exception there. This recursive procedure ends finally with terminating the program. In contrast, if there exists an explicitly defined exception handler, execution simply continues with its code. This behaviour is coded into the transition relation, where the default handler is modelled by a transition to the final state with the exception flag set to true. Since the properties of the byte code related to its security deal mainly with type safety, the abstract model can be essentially restricted to type information. This allows the abstract model to be finite because - a method compiled from a Java program uses only an operand stack of restricted size, - the number of local variables is finite, - there are only finitely many instructions in a method,
182
Joachim Posegga and Harald Vogt
- the number of types used in a method is finite. As we want the properties checked on the abstracted system to hold also on the original JVM, a homomorphic mapping between the JVM and the abstract model is established. A formal treatment requires, of course, a formal model of the original JVM (a concrete model); there are several approaches to achieve this, e.g. [Coh96,Qia97]. The mapping ensures the following property: if ~ holds in the abstract model, then it holds also in the concrete one
It can be shown that this holds for CTL formulas ~ quantified over all execution paths (see e.g. [CGL94]), and we will see that the relevant byte code properties can be expressed as formulas of that class. To make the description of the transition system complete, the set of initial states is defined by setting the state variables to their initial values: the operand stack is empty and the program counter points to the first instruction; the values of the local variables are determined by the method's signature, which specifies the types of the parameters, the initialization of the remaining state variables is straighforward. This describes the idea behind constructing a finite state transition system from a byte code program. Since our research is motivated by Java smart cards, we will not consider certain features of the Java Virtual Machine (JVM) which are not supported by Java for smart cards (see [Sun97b] for details). This includes multi-dimensional arrays, multi-threading, and dynamic class loading.
4.2
Formal Security Properties
The security properties described informally in Section 2.1 have to be formulated in an appropriate logic (such as CTL) if a model checker is supposed to reason about them. These formulae can then be easily validated against the informal description as they are high-level and understandable by human readers. Another way of obtaining the formulae is to extract them from a formal description of a defensive JVM. A defensive JVM performs all checks at runtime needed for the secure execution of a program. Such a model of the JVM is partially described in [Coh96]. The conditions to be checked by the model checker could be formally derived from such a description. Most properties are local to a specific program point. T h a t means: Before an instruction can be safely executed, the state of the JVM has to meet a certain condition. This is expressed by the CTL formula AG pc = n - + r
where pc denotes the program counter, n a program point (the address of an instruction) and r is a propositional condition on the JVM state. In plain words: Whenever program point n is reached during execution, the condition r must hold.
Byte Code Verification for Java Smart Cards
183
To make sure that a local variable i contains a value of a specific type T, r is of the form l o c i = T ( l o c is the array of local variables). The formulm for checking the size and typing of the operand stack are similar. Some conditions can be checked for all program states in general, like that there are not stack overflows. This is achieved by specifying the formula
AG
size(stack) ~_ max_stack
which imposes on all states the condition that the stack does not exceed its maximum size.
public class Purse { int cash; / / money stored // in the purse int ta; // transaction ... // counter void notifyBank() throws Exception { ... } boolean debit(int amount) { try { if (this.cash < amount) { notifyBank(); return false; } else { this.cash -= amount; return true; } } finally { this.ta++; } } } 0 I 4 5 8 9
aload_O getfield #7 iload_l if_icmpge 19 aload_O invokevirtual #8 <notifyBankO>
12 iconst_O 13 istore_2 14 j s r 42 17 18 19 20 21 24 25 26 29 30 31 34 35 36 37 40 41 42 44 45 46 49 50 51
iload_2 ireturn aload_O dup getfield iload_l isub putfield iconst_l istore_2 jsr 42 iload_2 ireturn astore_3 jsr 42 aload_3 athrow astore 4 aload_O dup getfield iconst_l iadd putfield
#7
#7
#9
#9
54 r e t 4 Exception table : from to t a r g e t type 0 36 36 any
Fig. 4. Implementation of a Simple Electronic Purse: Source Code and Byte Code.
184 5
Joachim Posegga and Haxald Vogt Example:
A Simple
Electronic
Purse
As an example for performing byte code verification through a model checker, we consider the method d e b i t of class P u r s e (see figure 4), implementing a simple electronic purse running on a Java smart card; this example was implemented for the Cyberflex Java Smart Card [Sch97]. The m e t h o d d e b i t is invoked on a P u r s e object whenever money is withdrawn, d e b i t takes the amount of money as an argument and returns true if the transaction was successful and false in case the value stored in the purse would become negative. In the latter case, a method for notifying the bank is invoked. In any case, the transaction counter is incremented. The respective code is enclosed in a f i n a l l y clause, so it is performed even if an exception is raised from calling notifyBank. We first take a look at the state transition system built from the compiled method d e b i t and how it reflects the semantics of the byte code instructions. Then we consider the formulse describing the properties established during byte code verification. 5.1
A Transition
System
f o r debit
Figure 4 shows the compiled method d e b i t (as created by javac from Javasoft). From this piece of byte code, a state transition system is built and given to a model checker, together with conditions to check. Figure 5 shows the corresponding input for the model checker SMV. This description was created manually, but it can be easily computed automatically with an appropriate tool. A state of the transition system consists of the following state variables: - a program counter pc ranging over the instruction addresses and an additional final state; - an operand stack, implemented by a stack pointer sp and stack locations s t 0 through s t 2 (the maximum height of the stack is 3 as recorded in the class file); - local variables l o c 0 through loc4; - a stack for active subroutines r s t a c k ; - an e x c e p t i o n flag which indicates whether the method ends because of an exception; - object variables field7 and field9. Other state types are possible, and it is desirable to restrict the state space as much as possible. A candidate for a more efficient representation is the operand stack. Here, the currently unused stack locations unnecessarily expand the state space. Some simplifications were made for building the model: The only primitive data type is integer (booleans are treated as integers, and no other primitive types are used in the program). Furthermore, the access conditions for object variables (fields) and methods which could be violated by the g e t f i e l d , p u t f i e l d and i n v o k e v i r t u a l instructions are not handled.
Byte Code Verification for Java Smart Cards
185
The types for stack locations and local variables are similar: They range over a set of reference values (null, this, exception, and arbitrary reference), a primitive type (integer) and return addresses (used by the r e t instruction). The additional value UNDEF for local variables indicates that a variable has not yet been initialised and therefore cannot be used. The object variables are of type integer. They are strictly typed and may not hold arbitrary values like local variables do. Note that the stack pointer ranges over values from 0 to 4, where 4 stands for "invalid". The formube describing the transitions are partial in the sense that for invalid values, the value of sp in the next state is not determined. This leads to the existence of execution paths, where sp has a value of 4. However, by the specification SPEC AG !sp=4 we can assure that these paths are not reachable and therefore the stack never exceeds its maximum (or minimum) size.
Initial state In the initial state, the stack is empty and the program counter points to the first instruction of the program. The local variable 0 contains the reference to the object for which the method was called, l o c l holds the method parameter which is an integer value. The other local variables are marked UNDEF and are therefore unusable for any reading instruction. There are no active subroutines, no exception has been raised, and the object variables are assumed to contain valid values (see the INIT lines in Figure 5). State transitions The transitions are described by formulm using the variable values of the current and the following states. The description reflects the structure of the byte code program. Generally, the actual transition formula is guarded by a formula pc=Lnn, indicating that this transition should be made when program point Lnn is reached (see the TRANS statements in Figure 5). A transition manipulates the stack, changes some variables or raises an exception. This is expressed by describing which values the state variables will have in the next state, depending on their values in the current state. It must be stated explicitly if state variables do not change their values. Otherwise, the model checker chooses an arbitrary value (as for unused stack locations). Exception handling The Java method considered here performs no explicit exception handling, i.e.: all exceptions raised while executing the method are handed over to the caller of the method. However, the f i n a l l y clause must be executed even if an exception occurs. Therefore the compiler added an exception handler (ranging from address 36 to 41 in Figure 4) which calls the f i n a l l y clause and then re-throws the exception, i.e. hands it upwards. The instructions in our example which could possibly raise exceptions are invokevirtual, getfield and putfield. There are several getfield instructions in the method which raise exceptions if their respective object reference is null. As an example, look at the instruction at address I which is covered by the exception handler, so in case of a null reference a jump to the code of the exception handler is performed. Thus, for address I the model description includes a transition to address 36 guarded by the appropriate condition.
186
Joachim Posegga and Harald Vogt
In contrast, if the instruction at address 46 raises an exception, the m e t h o d is immediately interrupted and control is returned to the caller of the method. This is reflected in the abstracted model description by a transition to the final state with the exception flag set. We do not model the transfer of control here, as we consider only one method at a time.
Subroutine calls There is one subroutine in the method, implementing the code enclosed in the f • clause. This code can be called from 3 different locations (the j s r instructions at addresses 14, 31, 37). Although not obvious, the variable r s t a c k holds a stack where labels for all active subroutines are pushed on. As we have only one here, there are two possible values for r s t a c k : EMPTY if the subroutine is not active and JSR42 in the other case. (The label JSR42 indicates t h a t the subroutine starts at address 42.) Whenever a j s r instruction is executed, the subroutine must not already be active, as the J a v a byte code specification forbids recursive calls to subroutines. Therefore, the condition rstack=EMPTY has to be true for each program point with a j s r instruction.
5.2
B y t e Code Properties
The properties sketched below describe the type safety of the program. T h e y ensure t h a t all operations have operands of suitable types, and that there are no stack underflows or overflows. T h e y also ensure t h a t subroutines are not recursively called. SPEC AG (pc=LO->(sp<3 ~(IocO=NULL I locO=THlS J locO=EXC I locO=hEF))) SPEC AG (pc=Ll ->(sp>O & (((sp=l -> NULL=stO) ~ (sp=2-> NULL=stl) ~ (sp=3-> NULL=st2)) [ ((sp=l -> THIS=stO) ~ (sp=2-> THIS=stl) ~ (sp=3-> THIS=st2)))))
[omitted] SPEC AG (pc=L54-> (Ioc4=L17 [ ioc4=L34 1 ioc4=L40)) SPEC AG (!sp=4)
5.3
Results
The model description shown can be fed into the model checker SMV [CGL95]. It reports a total state space of 8.5 9 10 l~ states, of which 8531 are reachable. All properties are checked and reported to be valid in a total amount of time of 0.86 seconds (on a Sun SparcStation 20). Several points that could be verified by the model checker were already preevaluated while building the model. For example, the model checker could verify t h a t j u m p s go only to valid instructions. The model shown here incorporates this condition by having only valid instruction addresses in the range of the p r o g r a m counter.
Byte Code Verification for Java Smart Cards 6
Related
187
Work
The principle of offiine byte code verification has, to our knowlege, first been described within the context of the Kimera project [SGB98]; the separation was proposed mainly for reasons of efficiency. The project also addresses the security of Java runtime environments, by providing a more carefully tested and "cleaner" implementation of the byte code verifier. Kimera does not attempt to put the verification step on rigorous approaches like formal methods. Closely related work towards a formal treatment of Java is currently being undertaken at several places: Attempts to define a formal semantics for subsets of the Java language are pursued by [NvO98,DE97,Sym97]; this work chiefly aims at proving that the Java type system is sound and uses type-theoretic approaches. These approaches do not consider Java byte code and are therefore not directly applicable to our approach. An approach to formally define the semantics of the complete Java language using Abstract State Machines is described in [BS98]. Their formulation defines the mapping from the Java language to Java byte code in several layers of abstraction. This work is related to our approach since ASM can be seen as state transition systems. Although [BS98] do not consider proving properties of concrete byte code programs, the results could eventually contribute to mapping byte code programs into the finite abstractions used in our approach. A definition of the semantics of the Java Virtual Machine is described in [Qia97]. The approach considers a subset of the JVM instructions and aims at proving the run-time type correctness of byte code programs from their static typing. It yields an operational semantics of Java byte code and a static type inference system. A similar approach is taken in [SA98]; the paper concentrates on the t r y / f i n a l l y construct in the Java language and the Java byte code instructions relevant to this. The difference to our approach is that [SA98] uses a language more expressive than the finite state systems we consider; therefore, a more detailed description of the semantics of byte code verification is possible, at the price of a more complex modelling. The approach goes into a very similar direction to our work, but it concentrates on describing the byte code verification, whilst we focus on carrying it out with declarative means. Our approach differs -on the meta level- from proof carrying [NL96,FL97] code in that we replace the process of checking the proof by verifying a digital signature. Rather than requiring a proof checker on the client side, we assume safe distribution of keys, trust in a network node and signature verification on the client side. However, it is foreseeable that a public key infrastructure will be widely available in the future, so very little additional overhead is brought into the client by our procedure. Technically, we use a less expressive (because decidable) language for expressing properties than proof carrying code. This allows us to perform proofs fully automatically, while with proof carrying code, the question of where the proofs actually come from, still needs to be sorted out in detail.
188
7
Joachim Posegga and Harald Vogt
Conclusion
We described an approach to Java byte code verification, where the verifier is not integrated into the Java run time environment, but proceeds offiine. The process is implemented by using a model checker on an explicit, finite model of a byte code program. The checker verifies that the required security conditions, which are also given explicitly, hold for that model and are therefore fulfilled by the original byte code program. The proposed method of formal byte code verification helps achieving the highest level of security in execution of Java applets. This is done by replacing a crucial part of the Java security architecture, the byte code verifier, by a tool based on formal methods that avoids the pitfalls of conventional implementations. To summarise, the advantages of our approach are: M o r e f l e x i b i l i t y a n d e x t e n s i b i l i t y . In the sketched framework, we deal with high-level descriptions of system models and safety properties. The described abstract model of the JVM is not the only one possible, and the model and the properties to be checked can be adapted to individual needs. A c h i e v i n g f o r m a l c o r r e c t n e s s . The highest achievable level of correctness is a formal proof. The correctness of an implementation of the byte code verifier is almost impossible to prove, and the benefit is questionable, as a little change invalidates the whole proof. In our approach, we shift the proof condition from the implementation level to the level of validation, of both the model description and the formulm. But as these descriptions are high-level, the task becomes relatively easy. Furthermore, as errors show up, they can be corrected quickly and safely. Of course, there is still no guarantee that an actual implementation of a JVM will execute a verified byte code program correctly. To achieve this, we need a formally verified implementation of the JVM and the whole environment. For smart cards, even this task seems tractable. We believe that through model checking, a formal treatment of byte code can be achieved in a simple and effective way. Understanding the formalisms needed for this kind of byte code verification is not more complicated than implementing a byte code verifier, and the gain is a formal, highly trustable verifier.
References E. BSrger and W. Schulte. Programmer Friendly Modular Definition of the Semantics of Java. In J. Alves-Foss, editor, Formal Syntax and Semantics of Java, LNCS. Springer, 1998. [CGL94] E. Clarke, D. Grumberg, and D. Long. Model Checking and Abstraction. ACM Trans. on Prog. Languages and Systems, 16(5):1512-1542, 1994. [CGL95] E. Clarke, O. Grumberg, and D. Long. Verification tools for finite-state concurrent systems. In A Decade of Concurrency - Reflections and Perspectives, volume 803 of Lecture Notes in Computer Science. Springer Verlag, 1995. [BS98]
Byte Code Verification for Java Smart Cards
189
[Coh96] Richard M. Cohen. The Defensive Java Virtual Machine Specification Version, Alpha 1 Release. Technical report, Computational Logic, Inc;
http: / /www.cli.com /software / djvm /html-O.5 / djvm-report.html,
1996.
[DE97]
Sophia Drossopoulou and Susan Eisenbach. Proving the Soundness of the Java Type System. Working Paper, Imperial College, Dept. of Computing, London, UK, Feb. 1997. [DFW96] Drew Dean, Edward W. Felten, and Dan S. Wallach. Java security: From Hot Java to Netscape. In IEEE Symposium on Security and Privacy, Oakland, CA, May 1996. IEEE. http://www.cs.princeton.edu/sip/pub/secure96.html. [EC82] E. Allen Emerson and Edmund M. Clarke. Using Branching Time Temporal Logic to Synthesize Synchronization Skeletons. Science of Computer Programming, 2(3):241-266, December 1982. [FL97] J. Feigenbaum and P. Lee. Trust Management and Proof-Carrying Code for Mobile Code Security. In DARPA Workshop on Foundations of Mobile Code Security, Monterey, CA, 26-28 March 1997. [GS96] J. Gosling and G. Steele. The Java Language Specification. Addison-Wesley, 1996. [KP98] Matthias Kaiserswerth and Joachim Posegga. Java Chipkarten. InformatikSpektrum, 21(1):27-28, 1998. [LY96] T. Lindholm and F. Yellin. The Java Virtual Machine Specification. AddisonWesley, 1996. [NL96] G. Necula and P. Lee. Proof-Carrying Code. Technical Report CMU-CS-96165, Carnegie Mellon University, School of Computer Science, Pittsburg, PA, September 1996. [NvO98] Tobias Nipkow and David von Oheimb. Javaeigh~ is Type-Safe - - Definitely. In Proc. 25th A CM Syrup. Principles of Programming Languages. ACM Press, 1998. [Pos98] Joachim Posegga. Die Sicherheitsaspekte von Java. Informatik-Spektrum, 21(1):16-22, 1998. [Qia97] Z. Qian. A Formal Specification of Java Virtual Machine Instructions. (unpublished manuscript), 1997. [SA98] Raymie Stata and Martin Abadi. A Type System for Java Bytecode Subroutines. In Proc. 25th ACM Syrup. Principles of Programming Languages. ACM Press, 1998. [Sch97] Schlumberger, Inc. Cyberflex 2.0 Multi 8K Programmer's Guide.
http://www.cyberflex.austin.et.slb.com/cyberflex/docs/docs-page3.htm,
1997.
[SGB98] Emin Grin Sirer, Arthur J. Gregory, and Brian N. Bershad. Kimera: A Java System Architecture. http://kimera.cs.washington.edu/, 1998. [Sun97a] Sun Microsystems, Inc. Java Card 2.0 Application Programming Interfaces, Revision 1.0 Final. http://java.sun.com:80/products/javacard/, October 13 1997. [Sun97b] Sun Microsystems, Inc. Java Card 2.0 Language Subset and Virtual Machine Specification, Revision 1.0 Final. http://java.sun.com:80/products/javacard/, October 13 1997. [Sym97] Don Syme. Proving Java Type Soundness. Technical report, University of Cambridge Computer Laboratory, 1997. [Yel95] Frank Yellin. Low Level Security in Java. In WWW$, 1995.
http :/ / www.w3.org/ pub/ Conferences/WWW 4/ Papers/19 7/ 40.html.
190
Joachim Posegga and Harald Vogt
MODULE main V A R pc : {LO,LI,L4 . . . . . L51,L54,FINI}; sp : { 0 , 1 , 2 , 3 , 4 } ; stO,stl,st2 : {NULL,THIS,EXC,REF,INT,LI7,L34,L40};
Ioc0..4 : {UNDEF,NULL,THIS,EXC,REF,INT,L17,L34,L40}; rstack : {EMPTY, JSR42}; exception : {0,i}; field7 : {UNDEF,INT}; field9 : {~DEF,I~Tr}; INIT pc=LO ~ sp=O ~ locO=THIS & locl=INT ~ loc2=UNDEF ~ loc3=UNDEF INIT Ioc4=UNDEF ~ rstack=EMPTu & excep~ion=O ~ fieldT=INT ~ fieldg=INT -- aload_O TRANS pc=LO ->((sp=O ->(next(stO)=locO & next(sp)=l)) (sp=l ->(next(stO)=stO ~ next(stl)=locO ~ next(sp)=2)) & (sp=2 ->(next(stO)=stO ~ next(stl)=stl $ next(st2)=locO ~ next(sp)=3))) & (next(locO)=locO & next(locl)=locl & next(loc2)=loc2 & next(loc3)=loc3 next(loc4)=loc4 & next(rstack)=rstack ~ next(fieldT)=field7 next(field9)=field9 & next(exception)=exception) ~ next(pc)=L1 -- getfield #7 TRANS (pc=L1 ~ I((sp=l ->NULL=stO) & (sp=2 ->NULL=st1) & (sp=3 ->NULL=st2))) ->((sp=l ->nex~(sp)=l & next(stO)=fieldT) (sp=2 ->next(sp)=2 & next(stO)=stO ~ next(stl)=field7) & (sp=3 ->next(sp)=3 & next(stO)=stO ~ next(stl)=stl & next(st2)=fieldT)) (next(locO)=locO ~ nex~(locl)=1ocl & next(loc2)=loc2 & next(loc3)=loc3 next(loc4)=loc4 & next(rstack)=rstack & next(fieldT)=field7 next(fieldg)=field9 R next(exception)=exception) & next(pc)=L4 TRANS (pc=L1 & ((sp=l ->NULL=s~O) & (sp=2 ->NULL=stl) ~ (sp=3 ->NULL=st2))) ->next(pc)=L36 (next(locO)=locO ~ next(locl)=1ocl & next(loc2)=loc2 ~ next(loc3)=loc3 & next(loc4)=ioc4 & next(rstack)=rs~ack & next(fieldT)=field7 & next(fieldg)=field9 & next(exception)=exception) ~ next(sp)=l ~ next(stO)=EXC
[omitted] -- ret 4 .-TRANS pc=L54 ->((sp=O ->(next(sp)=sp)) R (sp=l ->(next(sp)=sp ~ next(stO)=stO)) (sp=2 ->(next(sp)=sp ~ next(stO)=stO ~ next(stl)=stl)) & (sp=3 ->(next(sp)=sp & next(stO)=stO & nexr ~ next(st2)=st2))) (next(locO)=lor R next(1ocl)=locl & next(1oc2)=loc2 & next(loc3)=1oc3 & next(loc4)=U~DEF & next(re~ack)=EMPTY ~ next(field7)=field7 & next(fieldg)=field9 & next(exception)=exception) & next(pc)=loc4 -- final state TITANS pc=FINI ->((sp=O ->(next(sp)=sp)) ~ (sp=l ->(nex~(sp)=sp ~ nex~(stO)=stO)) (sp=2 ->(nex~(sp)=sp ~ nex~(stO)=s~O & next(stl)=stl)) & (sp=3 ->(next(sp)=sp ~ next(stO)=stO ~ nex~(stl)=stl ~ next(st2)=st2))) & (next(1ocO)=locO = next(locl)=locl & next(loc2)=loc2 ~ next(loc3)=loc3 & nex~(Ioc4)=lor ~ next(rs~ack)=rstack & next(fieldT)=field7 next(field9)=field9 ~ next(exception)=exception) & nex~(pc)=FINI
F i g . 5. Sketch of the d e b i t Method as a State Transition System.
Towards Formalizing the Java Security Architecture of J D K 1.2 Lora L. Kassab 1 and Steven J. Greenwald 2 1 Naval Research Laboratory Center for High Assurance Computer Systems Washington, D.C. 20375, USA kassab~itd.nrl, navy.mil 2 Independent Consultant 2521 NE 135th Street North Miami, Florida 33181, USA sjg6@gate, net
A b s t r a c t . The Java security architecture in the Java Development Kit 1.2 expands the current Java sandbox model, allowing finer-grained, configurable access control for Java code. This new security architecture permits more precise, yet flexible, protection for both remote code (loaded across a network connection) and local code (residing on the same machine running the Java Virtual Machine) developed using the Java programming language. Our formal model and analysis is intended to: (1) allow designers and implementors to understand and correctly use the protection provided by these security controls, and (2) provide guidance to a JVM implementor wishing to support these security controls. Access control decisions in Java are made based on the current execution context using stack introspection. To model this, we employ a state-based model that uses multiple access control matrices to model the security controls in JDK 1.2. We also present a safety analysis and discuss the effects of static and dynamic security policies for a given Java Virtual Machine.
1
Introduction
T h e rise of J a v a as a p r o g r a m m i n g language has important implications if employed for high assurance systems. J a v a can be used to implement systems t h a t have high assurance requirements and must perform critical functions correctly. T h e security controls supported by the J a v a p r o g r a m m i n g language and the J a v a Virtual Machine (JVM) can be a significant benefit in developing high assurance systems in Java, but only if: (1) designers and implementors understand the protection those controls can provide and develop a system t h a t uses the controls appropriately to meet system requirements, and (2) the J V M on which the system runs implements the controls correctly, without providing loopholes or bypasses. This paper addresses the first of these two concerns by providing a model for the security controls implemented in the Java Developer's Kit (JDK) 1.2.
192
Lora L. Kassab and Steven J. Greenwald
The purpose of this model is to provide: (1) designers and implementors with a precise understanding of how these controls work so that they can design systems that use the controls correctly, and (2) guidance to a Java virtual machine implementor wishing to support these security controls. The need for a precise understanding of the security controls is highlighted by the fact that the new controls require complex changes to the functionality of the virtual machine. As noted before in [18], changes to the JVM can destabilize the entire system. The issue of whether a given implementation of the JVM on a given platform implements the controls correctly is important but beyond the scope of this paper. One interesting opportunity created from having many JVMs on different platforms is that the same application could be executed on a variety of underlying hardware and software platforms and their results compared as a way of reducing vulnerability to hardware flaws, software flaws, and perhaps even malicious attacks or misuse [14]. Even with such security concerns, Java's still-growing popularity has several roots. Language features such as type-safety, automatic memory management and range checking on strings and arrays are examples of how Java reduces the chance of some common kinds of programming errors, e.g., buffer overflow. (This is very important for safety-critical software as well.) Further, the availability of JVMs for a wide variety of platforms brings the possibility of "write once run anywhere" to application developers. The inclusion of applet tags in HTML together with the incorporation of the JVM into web browsers permits web servers to provide downloadable executable content (Java applets) transparently, moving appropriate computing tasks from the server to the client domain. Java's developers recognized the concerns that users might have about permitting arbitrary applets to execute on their machines and initially provided a simple "security model" for applet execution [20]. In this model, downloaded applets would be confined by the JVM to executing within a "sandbox" that would prevent them from, for example, altering the client's file system, or communicating with any network sites other than the site from which they were downloaded. Java applications (local programs not downloaded via a web browser) would execute as "trusted", without sandbox constraints. Predictably, some initial JVM implementations didn't implement the "sandbox" correctly [4, 6], and when they did, developers sometimes found the sandbox model too restrictive. Further, although the notion of a sandbox is simple, its detailed implications for security enforcement are not. Java applications, on the other hand, were (like typical programs run on clients) subject only to the user's constraints, and the idea that some additional constraints (perhaps less stringent than those imposed on applets) might be imposed on applications has appeal to users and developers alike. Consequently, Javasoft has developed more flexible security controls in JDK 1.2 [8]. With this new architecture, the earlier constraints on applets can be relaxed (depending on the source of the applet and any digital signatures that have been used to sign it) and local code can now be subjected to the same security controls as applets. Users can configure policies that their JVM must
Towards Formalizing the Java Security Architecture of JDK 1.2
193
enforce on applications as well as applets [9]. Thus, the security controls now allow finer-grained access controls for applets as well as applications. This results in a security architecture that allows multiple sandboxes with varying access permissions to co-exist. But with flexibility comes complexity and thus error-proneness. The reason for developing this formal security model is to answer precise, detailed questions about how the security controls are intended to behave and the implications of how each JVM implements the security controls. These questions often reveal subtle, undocumented, and sometimes unexpected behaviors that surface in an implementation of the JVM. As we describe the formal model, we will identify such aspects of JDK 1.2's current security controls and we will discuss the effects of static and dynamic security policies for a given JVM.
2
Overview
of JDK
1.2 S e c u r i t y
Controls
The security architecture in JDK 1.2 is based on protection domains that represent units of protection within the Java runtime environment. Protection domains are defined according to: (1) the location of where the Java code originated (the URL codebase), and (2) a set of cryptographic keys corresponding to the private keys that signed the code. Every .class file1 belongs to a single protection domain and is granted permissions according to its domain. Thus, a protection domain is scoped by the set of classes and objects (in the object-oriented sense) that are currently directly accessible by a principal, where a principal is an entity in the computer system to which permissions are granted [8]. Security policy files are used to specify the overall system security policy. These files contain a sequence of permission entries specifying which protection domains should be created and what permissions to grant to each protection domain. For each user running a JVM, there is a system security policy file and optionally a user security policy j~le that can be added to it. Since there are no negative permissions, the composition of the system and user policy file is simply the union of the two files to specify the policy "in-effect" for a user's JVM. 2 If neither policy file is present, then the default security policy is the original sandbox policy. An example of a permission entry in a security policy file is: grant codeBase "http://www.itd.nrl.navy.mil" SignedBy "abcdefg" { permission java.io.FilePermission "/home/foo/bar", "read", "write";
}; The above example indicates that code originating from the URL http://www.itd.nrl.navy.mil/signed with the key "abcdefg" has the permission to read and write to the system resource (in this case a file) "/home/foo/bar."
1 A .class file is the bytecode corresponding to the source code for a Java objectoriented class. To date, a JVM is implicitly owned by one user only.
194
Lora L. Kassab and Steven J. Greenwald
Access control decisions are not based solely on the contents of the policy file(s). Thus, an entry in the security policy file does not necessarily warrant access to a system resource, because access control decisions are also made depending on the execution context. The execution context includes checking all protection domains and their associated permissions "involved" in the request before granting any permissions. This design prevents more restricted protection domains from acquiring permissions not indicated in the security policy files. This mechanism is known as extended stack introspection and has been implemented by JavaSoft, Netscape, Microsoft and other vendors. Each implementation varies [18, 16, 15, 8], but the core design is similar and is the basis of our model. Before describing the components of our model, figure 1 illustrates how the JDK 1.2 security architecture maps into our model (defined in Sections 3 and 4). On the left-hand side of the figure, we show four protection domains that may co-exist, which have varying permission boundaries. We represent the security policy for this JVM with a VM policy matrix as indicated on the right-hand side of this figure. This VM policy matrix specifies the security policy, which is constructed from the system policy file (and possibly a user policy file based on the user starting the JVM) for the four protection domains currently executing on this JVM. Using this VM Policy Matrix, we define a domain matrix for each thread of execution on a JVM. These domain matrices correspond to the multiple sandboxes with varying permissions that may co-exist in JDK 1.2. 3
Threads
(1)
Local code
Sandbox
(2) VM Policy Matrix
Applet
Applet
[
JVM
(n-l) (n)
Resources
Fig. 1. JDK 1.2 Security Architecture Mapping
In the following sections, we define our Java Access Model (JAM), a statebased model that uses access control matrices to model protection states. There are a few noteworthy distinctions between JAM and other access matrix models. First, even though JAM relies heavily on the use of matrices, it does not 3 The permissions will vary based on the protection domains (and their associated permissions) that are present in a domain matrix.
Towards Formalizing the Java Security Architecture of JDK 1.2
195
have any individual matrix cell operations per se, which eliminates the ability to add, remove or transfer access rights between principles. Second, access is not accomplished by a lookup of a single matrix cell, as in many of the traditional techniques [13]. Instead, a special columnar operator is used to determine access. Finally, more than one matrix (VM Policy Matrix and domain matrices) is necessary to model the security controls for a JVM. These differences rule out many of the characteristics of other access control matrix models such as BellLaPadula (BLP) [2], Harrison, Ruzzo, and Ullman (HRU) [11], and Sandhu's TAM model [17]. The next section defines our formal representation of the JDK 1.2 security architecture.
3
T h e Virtual M a c h i n e Policy M a t r i x
The virtual machine policy matrix is a (source by target) matrix of allowable actions. We also show how a given virtual machine policy matrix defines any possible domain matrix that may exist for a JVM. Every component in the model is finite. The definitions of the virtual machine policy matrix follow. D e f i n i t i o n 1. R is a set of resources. Elements of R model anything available through the operating system, e.g., files, devices or network connections. D e f i n i t i o n 2. A is a set of actions. A includes the permissions provided by the underlying operating system (e.g., read, write, execute). D e f i n i t i o n 3. S is a set of sources. A source, is a pair (n, k), where n is a URL that names the location of a collection of classes and k is a set of public keys that are associated with the signers of the classes in n. If n is the wildcard ~, this represents any URL. Likewise, if k is the string "*", then this means the corresponding collection of classes are not signed. Similar to subjects in traditional access control matrix models, sources denote the rows in an access control matrix. The elements of S are not necessarily disjoint. That is, there may be sources in S, where the collection of classes and the corresponding keys (if any) overlap. When this happens We say, one source is a prefix of another. Source sl = (nl, kl) is a prefix of source s2 = (n2, k2) /~ nl -~ n2 and kl = k2, where ~ denotes the URL prefix operator. For example, the source (http://www.itd.nrl.navy.mil/, "abcdefg") is a prefix of source, (http://www.itd.nrl.navy.mil/people/, "abcdefg"), because the keys match and the URL that names a collection of classes in the first source is a prefix of the URL http://www.itd.nrl.navy.mil/people/. D e f i n i t i o n 4. T is a set of targets. Targets are strings that name sets of resources. Similar to targets in traditional access control matrix models, targets denote the columns in an access control matrix. However, in a V M policy matrix, it is not always the case that every source is also a target as in other access control matrix models. In a V M policy matrix, it is possible that more than one target refers to the same set of resources. For example, the use of symbolic links in UNIX causes more that one path name to refer to the same physical file.
196
Lora L. Kassab and Steven J. Greenwald
D e f i n i t i o n 5. The virtual machine policy matrix P M is an ISI • IT] matrix,
where PM[s, t] C_ A gives the allowable actions for target t, using the collection of classes loaded from source s. The policy matrix P M is a static representation of the system security policy file (with the possible addition of a user security policy file), where each row represents a source and every column is a target. The virtual machine policy matrix is defined at Java runtime start-up, thus defining the security policy for a given J V M until the J V M terminates. Figure 2 is a simple example of a policy matrix. Recall that sources are composed of a URL (or * for any URL) and the set of associated public keys (or the wildcard string, "*", if not signed). An empty cell in the matrix implies that the source has no permissions for the corresponding target. In figure 2, the last entry has a wildcard, *, to specify code loaded by the J V M from any URL receives the following permissions. Thus, all code may read files in the directory "/tmp', so this expands the boundary of the original sandbox.
/
SOURCES
TARGETS
\
URL
KEY
"*.com:80 .
.t 1 . "/trap" .
*.mil"
accept,
http://www.ivy.com/
connect
http://www.ivy, comJenglish/
IB*ll
http://www.topiaries.ivy/
"abcdef12345"
http://www.rhubrum.lilie s/
"*"
http://www.hibiscus.flowers/
"qwertyasdf456"
*
,,,,
execute connect execute
execute
write
accept
read
accept
write
read
(where target t 1 is java.lang.RuntimePermission.createClassLoader)
Fig. 2. Virtual Machine Policy Matrix
D e f i n i t i o n 6. D M is a set of domain matrices. Each element dm E D M is
an ]Sdm[ • [TI matrix corresponding to a thread of execution on a JVM, where Sdm C S and T is the set of targets. Each row in the domain matrix is similar to a capabilities list, which for Java is called a "protection domain". The actions in dm[s, t] are the union of all actions in virtual machine matrix element PM[s', t] for every source s' E S that is a prefix of s. Thus, for every source s' E S that is a prefix of s E Sdm, the permissions granted to s' for each target t E T are granted to s for t.
Towards Formalizing the Java Security Architecture of JDK 1.2
197
Figure 3 is an example of a domain matrix (with a single row) which is based on the policy matrix in Figure 2. To illustrate how the domain matrix was created, suppose the class http://www.ivy.com/english/green//oo.class were loaded by a J V M with the policy depicted in Figure 2 in effect. Based on the protection domain of this class, its set of permissions would be the union o] the two sources in the policy matrix that are prefixes of this class, namely (http://www.ivy.com/, "*"), (http://www.ivy.com/english/, "~"), and the last entry in the matrix which applies to any URL. Note that the code loaded from this source was not required to be signed (as indicated by "*").
"*.corn:80" http ://www.ivy.condenglish/
t1
ca~:Petct] execute
"/trap. . . . *.mil" read, write
accept
Fig. 3. Domain Matrix
A domain matrix (with multiple rows) can be viewed as stack (LIFO order), where the last row added is at the bottom of the matrix. We decided, however, to use a matrix representation for consistency with the P M representation. With the matrix form we do not lose any generality as we can manipulate it in a stacklike way when necessary. Also, the matrix allows us to determine permissions (defined later) with greater ease than a stack.
4
Java A c c e s s M o d e l
T h e virtual machine policy matrix defined above applies to a single JVM. In this section, we define the J a v a Access Model (JAM), which is the state-based model for a single virtual machine policy matrix. 4 Our J a v a Access Model is defined as J A M = (PS, C, f, U), where P S is the set of protection states, C is the set of commands (inputs) t h a t cause J A M to transition from one protection state to another, f : P S x C --+ P S is the transition function, and U is a set of principals. We use the sequence of input commands c E C* to distinguish the various entities t h a t are found in a protection state. Each entity is given the subscript of the c o m m a n d t h a t created it. For example, if c o m m a n d i in the input sequence creates a domain matrix, then the domain matrix is labeled dmi. T h e following four definitions define the components of JAM. 4 JAM can be modified to allow more than one JVM to concurrently execute on a single computer. However, the focus of this paper is on modeling the security controls of a single JVM.
198
Lora L. Kassab and Steven J. Greenwald
D e f i n i t i o n 7. U is a set of principals. It models the entities to which authoriza-
tions are granted (and as a result, accountability). Today, a J V M is implicitly owned by a single user causing U to be a singleton set. 5 D e f i n i t i o n 8. A protection state ps is a sextet (V, II, S, R, T, A), where
9 V is a triple u, P M , D M , where u is a principal, P M is a virtual machine policy matrix, and D M is a set of domain matrices. 9 I I : u -+ P M is the policy function that represents the effect of Java policy files; it maps a principal u to a virtual machine access matrix P M . Thus, this function returns the matrix based on the system policy file. (If a user policy file exists, the union of system and user policy file is used. This is easily accomplished since there are no negative permission entries in either
file.) 9 S, R, T, A are the sets that are used to define the policy and domain matrices in V . D e f i n i t i o n 9. The protection state ps E P S of J A M changes by one of the ]oUowing commands in C.
The command init(u) creates a new triple, V = (u, P M , n M i ) , where u is a principal in U, P M =- II(u) and D M i is an empty set of domain matrices (no threads of execution). This command models a principal starting a JVM. d e s t r o y ( P M ) : Remove the triple V = (u, P M , D M i ) from the current protection state ps. This command models the termination of a JVM. s t a r t ( d i n , s): Add a new domain matrix drn to the set of domain matrices D M j for the triple (u, P M , D M j ) . This command models the creation of a new thread executing the class(es) from source s, where the domain matrix drn consists of all targets in T and the set of sources Sdm for this domain matrix is the singleton set s. If start(din, s) is the i th command in the input sequence c, then the new domain matrix is drni, where i > j . The actions in din[s, t] for each t E T are obtained by the union of P U [ s ' , t] for every s I E S that is a prefix of s. s t o p ( d i n ) : Remove the domain matrix dm from the set of domain matrices D M . This command is less interesting as it simply removes permissions for a thread. This command, however, must also remove any privileges that may have been associated with dm (the "privileged" mechanism is discussed in Section 4.1). e n t e r ( s , dm): Add the source s E S to the set Sdm for domain m a t r i x d m E D M . This command adds a new row to the domain matrix drn. This models a thread of execution entering the protection domain s. As in the command start, the actions in drn[s, t] for each t E T are obtained by the union of P M [ s ~, t] for every s' E S that is a prefix of s.
9 init(u):
9 9
9
9
This set would not be a singleton if multiple JVMs are modeled or if protection domains are extended to include the notion of "running-on-behalf" of a principal as presented in [1, 8]. Thus, we maintain this set for clarity and extensibility.
Towards Formalizing the Java Security Architecture of JDK 1.2
199
9 e x i t ( s , d m ) : Remove source s from the set Sgm of domain matrix din. This command removes a row from the domain matrix din. This command models a thread of execution exiting the protection domain s. This command is most interesting as the removal of a row in a domain matrix has the potential to increase the permissions granted to a thread, which is atypical of commands in traditional access control models. That is, the removal of a row will never decrease the permissions granted to a thread of execution on a JVM. D e f i n i t i o n 10. P E R M I T : din, t -4 a is the permit function which returns the allowable set of actions a E A that a thread may perform on a target t. For every s E Sdm, the permit function takes the intersection of din[s, t] for target t as follows. P E R M I T ( d i n , t) = N dm[s,t] sESa,n
The resulting set is the ej~ective set of actions that a thread may perform. It is possible that a given s does not have any actions for a target ti, which is denoted by an empty cell in the matrix. In this situation, the intersection would return the empty set, since there exists at least one source s E Sdm where dm[s,t~] is an empty cell.
"*.corn:80" http://www.topiaries.ivy/
"abcdef12345"
http://www.rhubrum.lilies/
"*"
http://www.ivy.com/english/
,,.,,
t1
connect i execute
"/tmp . . . . *.mir' read
~execute
read, write
accept, connect execute
read, write
accept
accept
Fig. 4. Domain Matrix
Given the domain matrix illustrated in Figure 4 and a target t~, the P E R M I T function would take the intersection of the permissions for that target for every source in the domain matrix. In Figure 5, the intersection of the permissions for all targets has been calculated. For example, using the dm in Figure 4 and the target "/tmp", the P E R M I T function would return the permission read. 4.1
The "Privileged"
As stated previously, Java access control decisions are made based on the current execution context via stack introspection. This is not always a "complete" introspection, as blocks of Java code may declare that they are "privileged" by using the beginPrivileged 0 and endPrivileged 0 methods of the JDK class java.security.AccessController. The term privileged is not equated with the usual security notion of trusted code. Rather, "privileged" in this context means that
200
N
Lora L. Kassab and Steven J. Greenwald ( "*.corn:80"
connect )
( t 1 execute )
( 'Ttmp"
read )
( "*.rail"
accept )
( "*.corn:S0"
)
( t 1 execute )
( "/trap"
read, write )
( "*.rail"
)
( "*.corn:80"
accept, connect ) ( t 1 execute )
( "/trap"
read, write )
( "*.rail"
accept )
( "/trap"
read )
( "*.mir'
)
( "*.corn:80"
)
( t 1 execute )
Fig. 5.
PERMIT
function example
the block of code is responsible for requesting access to its resources, but the block of code can never gain more permissions t h a n indicated in the system policy file (and user policy file if one exists). For example, a block of code may use this privileged construct when accessing system files, such as fonts or libraries, when other less permissive are likely to be on the execution stack. We model the privileged construct through the commands enter(s, dm) and exit(s, dm), where the trace of these commands imposes an execution ordering of the protection domains traversed by a thread of execution. T h a t is, the enter c o m m a n d appends rows to a domain matrix and exit removes rows in LIFO order. 6 Therefore the top-most source in the domain matrix is older than the sources below it in the matrix, which is reflected in the sequence of commands on the domain matrix. The use of "privileged" circumvents the need to check all sources in the domain matrix by only checking the sources that were entered after the "privileged" source (the entries below the privileged entry in the domain matrix).
"*.corn:80" http'J/www.topiaries.ivy/
"abcdef12345"
http://www.rhubrum.lilies/
"*"
http://www.ivy.com/englisld
"*"
tl
connect execute
"/trap . . . . *.mir' read
accept
--~
connect:
PRIVILEGED
~te
Fig. 6. Domain matrix with a privileged protection domain
In Figure 6, code obtained from the source (http://www.rhubrum.lilies/, "*") executed the method beginPrivilegedO. Now, only the more recent protection domains need to be checked as indicated by the shaded portion of the figure before granting permission to any resources. We define the P E R M I T function under "privileged" as the intersection of dm[s, t] over every source s added by a c o m m a n d input that is greater than or equal to the c o m m a n d input that added 6 Recall that a domain matrix can be viewed as stack, where the most recently added row is at the bottom of the matrix (as indicated by the arrow in Figure 6).
Towards Formalizing the Java Security Architecture of JDK 1.2
201
the privileged source as follows:
PERMIT(din, t) =
~
dm[s, t]
c~ ( s ) > cj ( p r i v i l e g e d )
Since the code is executing as privileged, the results of P E R M I T may not be the same. As indicated in Figure 7, the permissions allowed are calculated given the context of a thread of execution. Now, if the P E R M I T function is applied to the target "/tmp" and the domain matrix in Figure 3, the permissions granted are both read and write (as opposed to just read as in Figure 5).
Cl
( "*.corn:80"
)
( "*.corn:80"
( "*.corn:80"
( t I execute )
( "/trap" read, write )
( "*.mir'
)
accept, connect ) ( t 1 execute )
( "/trap"
read, write )
( "*.mir'
accept )
)
( "/trap" read, write )
( "*.mir'
)
( t I execute )
Fig. 7. Permissions granted with a privileged protection domain
It is important to note that the privileged mechanism is scoped within a single thread. Thus, privileges cannot leak to "untrusted" code. Although this approach allows straightforward auditing of system code to verify that privileges are enabled only for a limited time, it is still possible for a developer to "forget" to call endPrivileged. The potential danger in this scenario depends on whether the implementation automatically removes the privilege once the code that called beginPrivilegedexits. Although we do not provide this comparison in this paper, we will mention that Netscape, Microsoft, and JavaSoft vary in their implementation of the privileged mechanism [18, 9]. 5
Safety Analysis
of JAM
We previously stated that users are not limited in what is entered into policy files. Consequently, policy matrices (PM's) are populated based on the "arbitrary" construction of policy files, which precludes our defining a secure initial state. These policy matrices are then used to form domain matrices (dm's), which are used to grant or deny permissions (via the P E R M I T function). Therefore our safety analysis focuses on the problem of whether a particular dm can be generated given a particular PM. We term this safety problem the Arbitrary Policy Matrix problem (APM) and formulate it as a decision problem to facilitate our analysis. The two theorems that follow show that APM is in NP and that APM is actually an NP-complete problem. This fact is significant for the following reasons. It demonstrates how a very interesting result can be obtained from a simple manipulation of the model, thus demonstrating the expressiveness of JAM. Also,
202
Lora L. Kassab and Steven J. Greenwald
it should be of great importance to developers Working with J D K 1.2, since the class of NP-complete problems are generally considered to be "intractable."
5.1
The Arbitrary Policy Matrix P r o b l e m
A P M can be formulated quite simply: given an initial protection state pso and a final protection state psf, is it possible for a particular domain matrix dm to appear in any future state ps~ such t h a t 1 < i < f ? We would like to be able to answer this question because there might be dm's t h a t are undesirable and it would be beneficial to be able to determine if such dm's would be possible for an arbitrary policy matrix P M .
T h e o r e m 1. (1.) A P M E NP. Proof. For each row in the dm, the worst case requires checking all possible combinations of the unions of all the rows in P M . There is an exponential number of possible rows, bounded by 2 S - r where S is a set of sources as defined in Section 3 and 4. Therefore, the problem can be posed as a decision problem, where an instance of a dm is guessed by a nondeterministic algorithm, and then compared to the dm in question. Note t h a t the actual comparison is done in polynomial time mn where m is the number of rows and n is the number of columns, being a simple comparison of two matrices. 7 In order to solve this problem, a nondeterministic algorithm need only guess a particular dm from the exponential number of possible dm's that m a y be created (from any particular P M ) in the states of the machine. We then need to compare the guessed dm with the particular dm of interest. As mentioned above, the comparison of the two matrices takes place in polynomial time, and the number of states is linear (and therefore also polynomial). Therefore, the problem is in NP. []
T h e o r e m 2. (2.) A P M is NP-complete. Proof. We polynomially reduce the well-known NP-complete Satisfiability (SAT) problem [3] used in Cook's theorem to APM. We briefly review SAT. We have a set B of Boolean variables, and a collection C of clauses over B. The decision problem being: is there a satisfying t r u t h assignment for C? The following example is p a r a p h r a s e d from Garey and Johnson [7] (we have changed some of the set names to conform to our notation). Let B be a set of Boolean variables. A truth assignment for B is a function t : B --+ {T, F}. If b is a variable in B, then b and b are literals over B. A clause is a set of literals over B representing the disjunction of those literals and is satisfied by a t r u t h assignment iff at least one 7 If multiple source prefixes exist in the rows of the policy matrix, then we can collapse the policy matrix by replacing the prefix sources with a single row that is the union of the prefixes. This will yield the most permissive set of possible dm's without loss of generality.
Towards Formalizing the Java Security Architecture of JDK 1.2
203
of its members is true under that assignment. A collection C of clauses over B is satisfiable iff there exists some t r u t h assignment for B that simultaneously satisfies all the clauses in C. For example, B -- {51, b2} and C = {{bl, b2}, (bl, b2}} provide an instance of SAT for which the answer is "yes." A satisfying truth assignment is given by t(bl) = t(b2) = T. Alternatively, replacing C by C' = {{bl, b2}, {bl, b2}, (bl}} yields an instance for which the answer is "no." C ~ is not satisfiable. We need to reduce, in polynomial time, an arbitrary instance of a SAT problem to APM. We restrict ourselves to two actions in JAM, such that A = {T, F} corresponding to Boolean T R U E and FALSE respectively. Let m = ICI and n = IBI. We then construct a P M that has m ~ = 2 '~ rows, and n columns such that each column corresponds to a unique variable in B. This gives us a P M of size mJn. The ceils of the P M are constructed in such a way t h a t we can generate d m ' s with m rows and n columns where each d m cell contains only one action, T or F (this means that the P M will contain every possible combination of unique rows with cells composed of T's and F's). We are now ready to begin the reduction of the problem instance. The particular d m that we want to compare against is termed d m t. d m ~ has m rows and n columns. Each row in d m ~ corresponds to a particular clause in our instance of SAT, and each column corresponds to a particular variable in B. We use the function g : C -+ d m ~ to fill in some of the cells of drn ~. Function g works the following way. If a particular literal in a clause in C is not negated (i.e., there is no bar over the literal), then we change the corresponding cell in d m ~ to T. If a particular literal in a clause in C is negated (i.e., there is a bar over the literal), then we change the corresponding cell in d m ~ to F (note that it is likely that some of the cells of d m I will be empty). It takes polynomial time m n to generate d m ~. At this point the reduction of the problem instance is finished. We now use a nondeterministic algorithm to generate possible solutions to APM by generating one d m out of a possible 2 mn unique d m ' s that are the same size as d m ~, where each cell contains either the action T or the action F (henceforth a d m guessed by our nondeterministic algorithm will be referred to as just "din"). At this point, each row in the constructed d m is analogous to applying the SAT t r u t h function t to the Boolean variables in B, and placing the results in each cell, where T is equivalent to a Boolean true and F is equivalent to a Boolean false. For example if B = {bl, b2, b3, b4, bs, be} then we would have a d m that might contain T, T, F, F, T, F respectively in each of the 6 cells in one of its rows (this being a particular guess we are checking). It takes time m n to "guess" a solution d m giving a total time of 2 m n at this point. We now do the actual comparison of d m against d m ~ to check for a satisfying truth assignment. We decompose the guessed d m into m one-row matrices dm~, 1 < i < m , taking time m n for a total time of 3ran at this point. We then construct a t e m p o r a r y matrix, d m h that is the same size as d m ~ by applying a new function h : dm'[s, t], dm~[t] ~-+ dmh[s, t] to each cell in dmh. Function h
204
Lora L. Kassab and Steven J. Greenwald
works as follows. If cell dm'[s, t] is empty, then h inserts F into dmh[s, t]. If the cell contains T, h copies the action from dm~[t] to dmh[s, t]. If the cell contains F , then h places the compliment of dm~ [t] into dmh [s, t]. This takes time mn giving a total time of 4ran at this point. To solve, we do a Boolean OR operation on each of the m rows in dmh (taking time mn for a total time of 5mn), followed by a Boolean AND operation of the m results taking time m for a total time of 5m2n. To finish the comparison we do the above for each dm~ (since there are a total of m one-row matrices, this takes time m for a total time of 5m3n), and do a Boolean AND operation on the m results s taking time m. Therefore, the total comparison time of dm against din' is the polynomial 5m4n. To solve SAT, if our final result is true then dm matches dm', and we have an instance of SAT that is a satisfying t r u t h assignment. If all of the exponential number of dm's do not satisfy, then our instance of SAT does not have a satisfying truth assignment. Any instance of SAT which satisfies, will also satisfy our reduction. Any instance of SAT that does not satisfy, will not satisfy our reduction. Therefore, APM is NP-complete. [] 6
Implications
of Dynamically
Changing
Security
Policies
Currently, a JVM enforces a single static security policy based on the system security policy file(s) read during initialization. Our safety analysis in section 5 proved that the problem of whether a particular dm can appear given a particular P M is an NP-complete problem. The fact that APM is NP-complete shows that the consequences of a given policy are quite difficult for users to evaluate. Even so, JavaSoft is presently working on a mechanism that allows changes to a JVM security policy after initialization (i.e., dynamically changing P M ' s ) via a secure mechanism [8]. The improved flexibility of dynamically changing security policies is not without penalty due to the increase in complexity of JVM implementations. Flawed implementations, regardless of how they became erroneous, are far more detrimental to security than the benefits provided by this flexibility. Using our model, we can explore the implications for currently executing Java programs with dynamically changing security policies. One possible way to dynamically change the policy may require certain preconditions to be satisfied before the policy change may occur. 9 The simplest requirement would be that all executing threads are executing within the confines of the original sandbox before the new security policy becomes effective. If domain matrices are utilizing resources "outside" of the original sandbox and the change in the security policy must be effective immediately, then there are a few alternatives. The easiest and most secure implementation would require that all s Technically, the final Boolean AND could be a Boolean OR and still find satisfying truth assignments. However, the AND is necessary in order to retain the structure of the matrix comparison between dm and drn'. 9 This is a common object-oriented technique known as design by contract.
Towards Formalizing the Java Security Architecture of JDK 1.2
205
of these threads be terminated (via the stop(dm) command in our model) before the update occurs. Although this may seem rigid, co-existing security policies (multiple PMs) for the same principal will complicate the implementation of security controls in a JVM and is subject to luring attacks by allowing more permissive threads to linger through changes to the security policy for a JVM. A less rigid implementation for dynamically changing the security policy for a JVM may depend on the specific changes to the policy files. If a modification to the policy files makes any protection domain(s) more permissive (only adds actions for one or more targets), then each dm that traversed any of these protection domain(s) can simply be granted these new permissions. If implemented correctly, this does not pose a problem. If a modification to the policy files makes any protection domain(s) less permissive (removes actions for any target), then the update is more complicated. For each thread of execution (dm) that traverses an affected protection domain, either the update does not occur until the protection domain is no longer accessible by the thread, or the thread of execution is simply terminated. A naive implementation that allows a dm to be updated with fewer (or less permissive) actions may result in the executing applet/application not completing its computation. For instance, suppose a currently executing thread has received permission to write to a given file and the new policy disallows this permission (while the thread is writing to the file). Then, subsequent resource access by that thread will either throw a java.lang.SecurityException or incorrectly allow the thread to continue writing to the file. The design for supporting dynamically changing policies has not been defined yet, but this mechanism is intended to be included in future implementations. Using JAM, we can more easily analyze how this mechanism is best implemented and how threads that persist over changing policies may be dangerous if not handled properly. Clearly, the implementation of this capability to dynamically change a security policy for a JVM is complex (and thus error-prone), and as we have seen in the past, this is typically an avenue for discovering loopholes and bypasses to the security controls. 7
Conclusion
The security controls in JDK 1.2 have many desirable features. Stack introspection is not vulnerable to tampering or direct program access even though stack introspection can have high runtime c o s t s , l ~ Further, extended stack introspection also offers good backward compatibility with existing Java applets [18] by defaulting to the original sandbox policy if no other security policy is specified. Extended stack introspection is not a panacea, however. It requires complex changes to the virtual machine. More specifically, each class needing protection must explicitly consult the security system to see whether the class was invoked by an authorized party. This check adds at least one line of code to each class. lo In the worst case, the depth of the stack is traversed before permissions are granted as modeled by the PERMIT function.
206
Lora L. Kassab and Steven J. Greenwald
As we previously noted, changes to the JVM could destabilize the whole system. This adds risks because the major commercial browser vendors (Microsoft and Netscape) have diverged in their implementation of the JVM. Inevitably the flexibility afforded by the JDK 1.2 security controls introduces complexity and more opportunity for error-prone JVM implementations. Our formal security model provides a more rigorous and unambiguous specification of the intended behavior of the security control design of JDK 1.2. By formalizing definitions of the security controls, we were able to: (1) identify potential points of variance in JVM implementations (different stack introspection algorithms and the privileged mechanism), and (2) provide a means to compare the actual behavior of the security controls in different JVM implementations. Our safety analysis focused on whether a particular dm can appear in future states because it is not possible to limit what can appear in a policy file. Our safety analysis showed that APM is NP-complete (cf. [10, 11, 17]). Therefore, if expressing security by APM is of interest to designers and implementors, then any particular policy file configuration can be inspected (not necessarily efficiently) using methods appropriate for NP-complete problems. In response to the complexity of static policy files, we used our model to explore the impact of dynamically changing policy files. As additional features of the Java security architecture in JDK are announced, we plan to reuse our model to analyze the ramifications of those features.
Acknowledgements The authors would like to thank John McDermott, Carl Landwehr, Catherine Meadows, and Richard E. Newman for their contributions to this paper.
References 1. Balfanz, D. and Gong, L.: Experience with Secure Multi-Processing in Java, In Proceedings of the International Conference on Distributed Computing Systems, Amsterdam, Netherlands, May 1998. 2. Bell, D. E., and LaPadula, L. J.: Secure Computer System: Unified Exposition and Multics Interpretation, Technical Report MTR-2997, MITRE, Bedford, Mass., March 1976. 3. Cook, S. A.: The Complexity of Theorem-Proving Procedures, In Proceedings of the Third Annual ACM Symposium on the Theory of Computing, ACM, Pages 151-158, 1971. 4. Dean, D., Felton, E. W., and Wallach, D. S.: Java Security: From Hot Java to Netscape and Beyond, In Proceedings of the 1996 IEEE Symposium on Security and Privacy, pages 190-200, May 1996. 5. Denning, D.: Cryptography and Data Security. Addison-Wesley, 1982. 6. McGraw, G. and Felton, E. W.: Java Security: Hostile Applets, Holes, and Antidotes. John Wiley & Sons, 1997. 7. Garey, M. R., and Johnson, D. S.: Computers and Intractability, Bell Telephone Laboratories, Inc., Murray Hill, New Jersey, Page 259, 1979.
Towards Formalizing the Java Security Architecture of JDK 1.2
207
8. Gong, L.: JDK 1.2 Security Architecture. Sun Microsystems, Inc., Palo Alto, California, March 1998. 9. Gong, L., Mueller, M., Prafullchandra, H., and Schemers, R.: Going Beyond the Sandbox: An Overview of the New Security Architecture in the Java Development Kit 1.2, In Proceedings of the USENIX Symposium on Internet Technologies and Systems, Monterey, California, December 1997. 10. Harrison, M. H. and Ruzzo, W. L.: Monotonic Protection Systems, Foundations of Secure Computation, Academic Press, New York, 1978, 337-365. 11. Harrison, M., Ruzzo, W., and Ullman, J.: Protection in Operating Systems, Communications of the ACM, 19(8), Pages 461-471, August 1976. 12. Lindholm, T. and Yellin, F.: The Java Virtual Machine Specification. AddisonWesley, 1996. 13. Lampson, B. W.: Protection, In Proceedings of the 5th Princeton Symposium on Information Sciences and Systems, Princeton, New Jersey, March 1971, Pages 437-443. 14. McDermott, J. and Gelinas, R.: Prototyping a Java-based Defense Against Storage Spoofing, Submitted for publication. 15. Accessible from http://www.microsoft.com. 16. http://developer.netscape.com/library/documentation/signedobj. 17. Sandhu, R. S.: The Typed Access Matrix Model, In Proceedings of the IEEE Symposium on Research in Security and Privacy, IEEE Computer Society, Oakland, California, Pages 122-136, 1992. 18. Wallach, D. S., Balfanz, D., Dean, D., and Felton, E. W.: Extensible Security Architectures for Java, In Proceedings fo the 16th ACM Symposium on Operating Systems Principles, October 1997. 19. Weissman, C.: Security Controls in the ADEPT-50 Time-Sharing System, In Proceedings of the Fall Joint Computer Conference, Vol 35, Pages 119-133, 1969. 20. Yellin, F.: Low Level Security in Java, In Proceedings of the 4 ~h International World Wide Web Conference, Boston, Massachusetts, December 1995.
E U R O M E D - J A V A : Trusted Third Party Services for Securing Medical Java Applets Angelos Varvitsiotis, Despina Polemi*, and Andy Marsh** Institute of Communications and Computer Systems (ICCS) National Technical University of Athens (NTUA) Heroon Polytechniou 9, Zografou Athens, Greece The authors can be reached via e-mailatthe ~llowing addresses:
[email protected]~ua.gr polemi~softlab.ece.ntua.gr [email protected]
A b s t r a c t . EUROMED, a DG III project 1, aims to create the foundation of telemedical information society. EUROMED-ETS, an INFOSEC project, provided secure communications among EUROMED participants by establishing Trusted Third Party Services (TTPs) over the Web. Java technology plays an important role in EUROMED. In this paper, the threats that Java technology introduces to EUROMED are explored and security countermeasures are proposed, utilizing the TTP infrastructure. Keywords: Telemedical Applications, Web, Trusted Third Party Services, Java, EUROMED, EUROMED-ETS.
1
Introduction
E U R O M E D is a three-year European Commission DG I I I / B project [13][14] [22], which began in J a n u a r y 1996. ICCS-NTUA is leading the project and the partners are: Infoproject (GR), University of Calabria (I), University of Athens (GR), AIS (IT), University of Joensuu (SF), University of Amsterdam (NL), University Hospital Leiden (NL) and University " P U B " Bucharest (RO). EUROMED aims to create the foundation of a telemedical information society. Patients' health records will be distributed in medical cyberspace, and doctors will be able to interact with these records by clicking on H T M L pages. EUROMED supports three hierarchical infrastructures, namely: * Project Manager of EUROMED-ETS ** Project Manager of EUROMED 1 This paper represents the views of the authors and not those of the European Commission.
210
AngelosVarvitsiotis et al.
9 The HCN (Hierarchical Communications Network) using satellite, Internet and telecommunications networks (e.g. ISDN, ATM), which connects dispersed isolated regions. 9 The HCF (Hierarchical Computing Facilities) infrastructure which includes a range of High-Performance Computing (HPC) platforms, modern workstations and PCs, provides heterogeneous computing facilities to every node in the Hierarchical Communication Network. 9 The HMF (Hierarchical Medical Facilities) infrastructure, which consists of specialized clinics, general hospitals and local doctors that can collaborate and facilitate a uniform level of medical practices. EUROMED is based on WWW to establish communication between the participating sites. Its network consists of a number of Internet sites that stores medical data about a number of patients, as well as image processing and archive applications. A physician seeking information to reach diagnosis for a given patient, searches the network with a web browser and collects the available data for the specific patient. Medical data is assumed to be images, two- or three-dimensional, coming from different modalities (X-rays, mammography, CT, MR), biosignals ECGs-, results of biochemical examinations and text reports. The available data for any given patient does not necessarily reside on one site; instead, each data item is archived on the site where it was first produced. The data is annotated with pointers that refer to one another and are summarized on an HTML page, unique to each patient.
I
[
HIS
\
[
www ]
(
~
Intranet
:
',
WWWlnternet HPCN Platform Hospital Y
Fig. 1. EUROMED platform
[
~"
EUROMED-JAVA
21t
The physician inputs the data of the referred patient into the appropriate application and processes it. An application is any program that is distributed to the EUROMED sites, either on physical media, or over the Internet. Java [10][11] was chosen by EUROMED as the standard technology for the distribution of common applications over the Internet. E U R O M E D - E T S is an INFOSEC project [23] which complements EUROMED by concentrating on, and tackling, several issues of security in a telemedical information society [2][3][4]. ICCS-NTUA was leading the project and the partners were: University of Aegean (GR), University Hospital of Magdeburg (D), University of Calabria (I), Infoproject (GR) and National Health Service (U.K.). In EUROMED-ETS, we concentrated on the establishment of TTPs [6][5], for ensuring that all health actors can communicate in a secure way over the WWW. TTPs secure the highly sensitive medical information exchange. In this project all aspects (operational, technical, functional, organizational, regulatory, and legal) of TTPs have been investigated for tetemedical applications over the WWW. SSL [19] was adopted as the security solution after considering EUROMED's needs. Among the services offered by the established TTPs were: registration, name authentication, certification, key and directory management. During the pilot phase of the project a network of four TTPs was established. EUROMED provided the pilot platform in order to demonstrate the proposed security solution. EUROMED-ETS concentrated on the first and last hierarchy (HCN and HMF) of the EUROMED platform in order to ensure the secure communication of EUROMED users over the Internet. For the proper operation of EUROMED, a complete security solution for all three hierarchies must be provided. Only then EUROMED will reach its true potential and all its participants (among them, 80 Hospitals throughout Europe) will start utilizing its true benefits. In order to achieve this goal we need to provide measures for securing the medical Java applets distributed over the EUROMED network. In this paper we examine the role of Java technology in EUROMED, we investigate the threats that Java technology brings to EUROMED, we explore the impact of TTPs in improving Java security and we propose security countermeasures based on the TTP technology. This paper is organized as follows: Section 2 concentrates on the role of Java technology in EUROMED and addresses the threats that this technology brings to EUROMED. Available security measures are reviewed and security problems that remain unresolved are described. Section 3 provides countermeasures for overcoming the threats that Java puts to EUROMED's security. Conclusions and topics for further research are included in Section 4. 2
Java in EUROMED
- Security
Issues
EUROMED has standardized the use of the WWW for telemedical applications. It has proposed HTML, VRML, JavaScript and Java as the standards on which different sites can communicate and interact in order to materialize the
212
Angelos Varvitsiotis et al.
concept of Telemedicine, remote diagnosis and the building of a virtual environment in which physicians interact with each other and with patients' data. Medical data, scattered in different sites even in the case of the same patient, are processed by Java applications, residing in different Institutions and Hospitals. EUROMED even offers the possibility for Hospitals to use HPC installations for CPU-intensive applications. EUROMED is based on a network of HTML pages that allows users to access medical data, input them to Java applications, invoked from other H T M L pages, and archive the results by updating links to the old H T M L pages. Java is selected as a key language in EUROMED for the development of medical applications because of its suitability for building Web applications and its platform- and OS-independence properties. The characteristic of Java to generate platform-independent byte code is crucial for EUROMED since different Hospitals, Institutions and Research Centers that participate in the network are bound to use different computer systems, ranging from PCs, to workstations and HPC systems. Java is used to create applets with medical information that are embedded in HTML pages. EUROMED Java applets perform operations like: image display, image segmentation, multimodal image registration, fusion and anatomical object visualization. A simple mouse click on the browser invokes the execution of the applet on the physician's computer. Summarizing, Java was chosen by EUROMED as the key technology for a variety of functions, including: 9 9 9 9 9
visualization of medical data annotation of visual data (e.g., medical images) remote database access update of medical records cooperation with services available by HPC centers for CPU-intensive applications 9 assistance in reaching diagnosis. Applications such as the above are built either as Java applets that are downloaded and executed by the user's browser, or as Java applications, that need to access resources of the user's computer and operating system in order to perform certain operations. We concentrate on the applets case and the security threats that the model of Java applets introduces in EUROMED. In order to discuss these threats, we categorize them as follows: 9 downloading security threats: this category includes the security threats involved in downloading an applet or any mobile code application [7][12]. The site hosting the applet may be impersonated [1] by another, malicious site. Alternatively, the executable byte code may be altered, either as stored on the hosting site, or during transmission across the network. Maliciously written applets, once loaded instead of legitimate ones, may steal or alter precious information, orchestrate denial-of-service attacks against the hosting server, or consume resources of the client's computer. In EUROMED, an additional severe threat is that malicious applets may cause erroneous diagnoses by presenting false data to the physician.
EUROMED-JAVA
213
9 execution security threats: this category includes the threats t h a t the exe-
cution of an applet may involve against the user's computer and the local resources [15]. Typically, a malicious applet may attempt to read or write a file in the local disk, reveal sensitive information about local resources to unauthorized parties, or exhaust local resources such as CPU, memory and network bandwidth. 9 c o m m u n i c a t i o n security threats: EUROMED applets need to communicate with servers over the Internet to read and update medical information. The confidentiality and the integrity of stored information and data in transit [1] are threatened if not appropriately secured. For each of the abovementioned threats, countermeasures have been proposed by the designers of Java [17][18]. In the sequel, we shall assume that H T T P communication between a browser and a server can be protected by authentication for both the server and the client, encryption and access control, as supported by widely-accepted methods and protocols (e.g., SSL [19], S H T T P [20]). Therefore, as far as the applet downloading threats are concerned, the following countermeasures have been proposed: 9 The server that hosts an applet must be digitally certified so t h a t no other server can impersonate as the legitimate one. 9 The server can request user authentication, so that only authenticated users are able to download applets. 9 The server can implement access control, so that only authorized users are able to download applets. 9 The communication between the browser and the server must be encrypted using strong encryption [26][27] and authenticated using an appropriate MAC algorithm [21][24][25], so t h a t the confidentiality and integrity of the applet code can be guaranteed to a satisfactory degree. 9 The code of the applet itself must be digitally signed by the applet's author, so that the user can avoid execution of applets provided by untrusted sources. To add a level of trust to the code itself, the designers of Java have taken some measures against execution threats. When loaded by browsers, Java applets execute in a confined environment, as controlled by a Security Manager. The security manager will only allow a restricted subset of operations normally available to native Java programs. Such restrictions are: applets cannot open, read, rename or write system files, cannot examine most of the system- and user-related variables available to programs and cannot establish network connections to any other but their hosting server. This is too restrictive for E U R O M E D where reading medical data, processing them, acquiring new data and saving the results either locally or remotely, is essential. These operations presuppose that the Security Manager does not apply the abovementioned restrictions so that applets can perform useful operations (e.g. access and manipulate local files or establish arbitrary network connections). For overcoming this problem, these restrictions should be uplifted in well-defined cases. This will offer to appropriately certified applets the necessary
214
Angelos Va~vitsiotis et al.
flexibility in order to perform efficiently by e.g. storing intermediate computation results on local t e m p o r a r y files, at the same time offering trust to the compliance of these applets to issues and directives regarding privacy of medical information and personal d a t a [30][31][33][4][34]. Furthermore, execution threats cannot be entirely handled by means of the above restrictions. Several problems regarding the execution of applets remain unresolved, including: 9 code modularity: in the event of a complex, multi-component application
where trusted code relies on some untrusted component, the security of the overall application is questionable. 9 multithreading: the freedom given to applets to fork large numbers of threads as well as the possibility to alter the default thread behavior have often been the causes for resource starvation and denial of service problems. 9 i m p l e m e n t a t i o n opaqueness: although specifications for VMs and foundations of basic classes are open, their implementations in browsers are not, leaving space for possible security breaches through poor design or implementation choices. As far as the c o m m u n i c a t i o n s threats are concerned, the measures proposed are based on recently announced and made available Java Application Interfaces, such as a foundation of classes called the Java Cryptography Enhancement toolkit (JCE) and SSL, which is now part of the basic Java Development Toolkit (JDK) [18]. J C E contains implementations of widely accepted digital signature algorithms such as DSA and RSA, encryption algorithms, such as DES [26] and RC4 [28], and message authentication algorithms, such as MD5 [21]. It also contains classes for parsing X.509vl certificates, and the X.509v3 functionality is becoming available. With the use of the JCE classes, Java applets can: 9 Establish network connections with one- or two-side authentication, by verifying certificates presented by the other end of the connection. 9 Transmit messages using encryption and appropriate MAC codes to guarantee communications confidentiality and integrity. 9 Enable access control based on user authentication for critical tasks (e.g., some users may be allowed to view medical information without the right to alter it). 9 Provide other functions, such as patient anonymity, where required by the context or the medical application. Using the SSL API, applets can have secure access to a weMth of applications in a standardized manner. At the same time, SSL provides all of the above functions in the context of a network session, which is particularly useful for the applications in hand (e.g., interactions with medical databases and use of HPC facilities over the Internet). However, there are still problems that need to be resolved, including: 9 The degree to which signed applets should be given unlimited access to local resources.
EUROMED-JAVA
215
9 The lack of a trust chain between trusting an applet author's intentions and trusting the applet code itself. 9 The poor security features of the confined execution environment regarding system resource management (number of threads, alteration of default thread termination behavior). 9 The possibly excessive strictness of the confined environment. 9 The lack of a mechanism for user-configurable, fine-grain tuning of allowed applet access to system resources. 9 The lack of a taxonomy for the certification of code other than the name spaces provided by manufacturers. 9 The opaqueness of the classes provided by browser manufacturers. 9 The premature state or lack of components such as Directory lookup modules [35]. In EUROMED-ETS, we demonstrated the usefulness of T T P s in providing a source of trust for the complex, multi-server, multi-hierarchy, multinational, multi-user, geographically dispersed infrastructures employed by EUROMED. In the next section, we discuss how T T P s can be used for solving the abovementioned problems and address the security threats imposed by the EUROMED applets. 3
TTPs
for Securing
EUROMED
Applets
The establishment of Trusted Third Party Services (TTPs) was proposed and demonstrated by pilot operation during EUROMED-ETS in order to secure HTTP-based transactions. Several security services were provided by EUROMED-ETS in order to establish secure communications among EUROMED participants. However, EUROMED-ETS did not provide countermeasures for the abovementioned threats brought by the use of Java technology. In this section, we propose additional mechanisms to be offered by T T Ps and manufacturers in order to secure the medical Java applets distributed over the EUROMED network. In order to ensure the authenticity and integrity of applets we propose extending the Directory service of T T P s to support directory services applet certification. Execution threats can be overcome with source code certification and varying levels of trust. Secure data transport among applets and servers can be provided either by secure applet communications or by proxying services offered by TTPs. Access to servers that are contacted by applets is secured with access control. Finally, privacy can be enhanced with one-time keys for anonymous services. These mechanisms are explained in the sequel.
Directory Services for Applet Certification: Java classes provided by manufacturers are normally signed by the manufacturers themselves using their private key. During EUROMED-ETS, we used the Directory as a publicly accessible trusted infrastructure for certificate lookup via standardized access protocols [36]. In order to offer a similar service for the provision of a unique source of
216
Angelos Varvitsiotis et al.
trust to applet certificates, the functionality of the E U R O M E D - E T S Directory should be extended to hold certificates of trusted applet manufacturers. This will enable users to install certificates from a trusted source, using a secure connection (LDAP over SSL) for the Certificate lookup. Besides certificates, the Directory can offer a storage space for Java objects [35]. This is a valid alternative that can also be considered by EUROMED T T P s .
Source Code Certification: T T P s can offer a more important service than just storing manufacturer certificates. If T T P s are allowed, on a confidential basis, to examine the source code of an applet, they will be able to certify it themselves as conforming to specifications set forth by EUROMED. Wherever possible, code should be examined with regard to execution properties. Examples of simple properties are: all network connections initiated will use encryption; the default thread termination methods will not be altered; the number of threats forked will be bounded; and so forth. More subtle properties can require more careful inspection a n d / o r object classes specially crafted to this end. Examples of such properties include: sensitive user data will stored, if need be, appropriately encrypted; network connections will only be initiated to addresses learnt through the hosting server; and others. Once satisfying a predefined set of properties to the best of the T T P site personnel's knowledge, the applet code can be signed by the TTP. If multiple levels of trust are used as proposed below, the level of trust will correspond to the specific properties that are asserted. Varying Levels o f Trust: Browsers must be updated in order to allow a userconfigurable set of operations to each applet. This involves having the applet Security Manager maintain and consult a list indexed, for example, by the known certificates of code signers. For each certificate, the list must contain the set of allowed sensitive operations. Using this capability in conjunction with Source Code Certification as discussed above, applets can be signed with different keys, depending on the operations that they need in order to operate correctly a n d / o r efficiently. Varying levels of trust will be supported, using a different key for each level. Levels will range from "completely untrusted", prevented from execution, to "fully trusted" which can be given full access to local resources. Intermediate levels may be defined e.g., to allow network connections to arbitrary addresses learnt through an existing secure connection, to allow access to a limited portion of the local file system, etc. In this paradigm, T T P s will be responsible for the certification of applets, by signing them with the key for the appropriate security level. Secure Applet Communications: The certification service of T T P s should be extended to cover the needs of secure communication among applets and available services (e.g. database servers, image servers, etc.). Such connections should base their communications security on a standardized underlying mechanism, and recent Java developments show that SSL is the mechanism of choice.
EUROMED-JAVA
217
Thus, certificates should be issued by T T P sites and be installed on the sites t h a t host services t h a t applets use.
A c c e s s C o n t r o l M a n a g e m e n t : Communication between applets and servers should be two-way authenticated and access-controlled. Access control should be based on the user certificate presented by the client applet and, as suggested in E U R O M E D - E T S , it should be organized on a group membership basis to allow flexible access control policies and m a x i m u m manageability. This requirement has two interesting implications, one for each communicating side. Servers, one the one hand, must be able to verify a large number of certificates, possibly signed by various authorities, as well as to check the Certificate Revocation Lists provided by these authorities. T h e suggested solution for this problem in E U R O M E D - E T S was to consult the Directory via a secure connection (LDAP over SSL) to verify the validity of user's certificates as well as to check for the latest version of the CRL. This solution should be generalized to cover servers such as the database and H P C servers needed by E U R O M E D applications. The second implication is that applets, in order to present the user's certificate to a server, should have the means to access and use the user's private key, usually stored on a smart card or on magnetic media. For this reason, an exception should be crafted in the SSL component of J a v a so that applets are allowed to read key d a t a from the local device holding the user's key pair. Alternatively, the browsers should provide the required functionality to either import certificates stored in standardized formats [29] on magnetic media, or support access to tokens through the J a v a C a r d A P I [18].
O n e - t i m e k e y s for a n o n y m o u s services: Due to the confidential nature of E U R O M E D , users m a y sometimes be granted access to medical d a t a such as an X-ray image, without any other access (e.g., without access to the patient's medical record and other personal data). A valid scenario would be to have an external physician examine and a n n o t a t e an X-ray image of a patient and store back the annotated image. In order to allow such applications to interact securely with servers, one-time keys should be used. One-time keys m a y be issued by T T P s on request from both the server and the user applet wishing to provide to one another one-time access to a resource for read or update purposes. Secure connections should be used for transferring components of one-time keys.
O t h e r C o m m u n i c a t i o n services: If t h e restriction that applets can only communicate with the hosting server is not uplifted, T T P s can serve as communication intermediates among client applets and various services. If T T P sites host applets that need to communicate to several servers to complete a task (e.g., to an H P C site for a computation-intensive task and to one or more d a t a b a s e servers for d a t a retrieval and archiving), a T T P site m a y offer transparent connection redirection or proxy services.
218
4
Angelos Varvitsiotis et al.
Conclusions and Further Research
The use of Java technology in E U R O M E D introduces new security problems. The extension of the T T P services offered by E U R O M E D - E T S can provide effective solutions to these problems. This will allow Java applets to be transported securely to the physicians' workstations and executed in a trustful environment that can be fine-grain controlled and configured. The interaction of the proposed measures with emerging technologies, such as biometrics- or Java-enabled smart cards should be explored, in order to enhance access control, authentication mechanisms and privacy in the Healthcare sector.
References [1] [2] [3]
[4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16]
Ahuja, V.: Network &: Internet Security. Academic Press, NY 1996. Barber, B., Bakker A.R. and S. Bengtsson (eds.): Caring for Health Information: Safety, Security and Secrecy. Amsterdam: Elsevier Science, 1994. Blobel, B.: "Towards Security in Medical Tetematics: Legal and Technical Aspects," Open Information Systems and Data Security in Medicine. Barber B., Treacher A. and K. Louwerse (eds). pp.168-182. IOS Press, Amsterdam, Washington, Tokyo, 1996. Council of Europe Recommendation R(97)5: On The Protection of Medical Data. Council of Europe, Strasbourg, 13 February 1997. UK Dept. of Trade and Industry ref. URN 97/669: Licensing of Trusted Third Parties for the Provision of Encryption Services. London, March 1997. Menezes, van Oorschot and Vanstone.: Handbook of Applied Cryptography. CRC Press, 1996. Rothermel, K. and R. Popescu-Zeletin (eds).: Mobile Agents'g7 - Proc. 1st International Workshop. LNCS 1219, Springer-Verlag, April 1997. Schneier, B.: Applied Cryptography, Protocols, Algorithms and Source Code in C. J. Wiley and Sons Inc, 2nd Ed, 1996. Camp L.J., Sirbu M.: "Critical Issues in Internet Commerce," IEEE Communications Magazine. pp.58-62. IEEE Press, 1997. Yourdon E.: "Java, the Web and Software Development," IEEE COMPUTER Magazine. pp.25-30, 1996. Hamilton M.: "Java and the Shift to Net-Centric Computing,". IEEE COMPUTER Magazine. pp.31-39, 1996. Vigna G. (ed): Proc. Mobile Agents and Security. LNCS, Springer-Verlag, 1998 (forthcomming). Marsh A., Delibasis K., Mouravlianski N. and C. Michael: "EUROMED - A WWW-based multi-mediaTelemedical information system," subm. in Transactions on Information Technology in Biomedicine. Marsh A.: "EUROMED - A WWW-based multi-media medical information system," Proc. 19th Annual Intl. Conf. IEEE Engineering in Medicine and Biology Society. IEEE-EMBS, Chicago, 1997. McGraw G. and Ed Felton: Java Security: Hostile Applets, Holes, and Antidotes. J.Wiley, ISBN 0-471-17842-X. Venners, B.: "Java security: How to install the security manager and customize your security policy,"
EUROMED-JAVA
219
[17] Sun Microsystems Inc.: "Secure Computing with Java: Now and the Future," (White Paper) Java One 1997 Conference.
html>
[18] Sun Microsystems Inc.: "Security-related Java APIs," < h t t p : / / j ava. sun. com/security/> [19] Freier, P., Karlton and P. Kocher: "The SSL Protocol Version 3.0," Internet Engineering Task Force: Internet Draft.
[20] Rescorla, A. and Schiffman: "The Secure HyperText Transfer Protocol," Internet Engineering Task Force: Internet Draft. [21] Rivest, R.: "The MD5 Message-Digest Algorithm," MIT Laboratory for Computer Science and RSA Data Security, Inc., April 1992. Internet Engineering Task Force: Request For Comments RFC1321.
[22] EUROMED, ISIS '95, DG III programme, 1995-1998.
[23] EUROMED-ETS: Trusted Third Party Services for Health Care in Europe. INFOSEC programme, DG XIII, 1997.
[24] NIST, FIPS PUB 180-1: Secure Hash Standard. National Institute of Standards and Technology, U.S. Dept. of Commerce, April 1995. [25] Krawczyk, H., Bellcare M. and R. Canetti: "HMAC: Keyed-Hashing for Message Authentication," Internet Engineering Task Force: Request for Comments.
[26] ANSI X3.106: American National Standard for Information Systems Data Link Encryption. American National Standards Institute, 1983. [27] Tuchman, W.: "Hellman Presents no Shortcut Solutions to DES," IEEE Spectrum. 16:8, July 1979. [28] Thayer, R. and K. Kaukonen: "A Stream Cipher Encryption Algorithm," Internet Engineering Task Force: Internet Draft, July 1997.
org/internet-drafts/draft-kaukonen-cipher-arcfour-Ol,
tx
[29] RSA Laboratories: "PKCS #12: Personal Information Exchange Syntax Standard," (version 1.0 Draft), April 1997. [30] CEC COM(90) 314 final SYN 287: "On the Protection of Individuals in Relation to the Processing of Personal Data," Commission of the European Communities, Brussels, September 1990. [31] CEC COM(90) 314 final SYN 288: "On the Protection of Personal Data and Privacy in the Context of Public Digital Telecommunication Networks," Commission of the European Communities, Brussels, September 1990. [32] CE R(81)l: Recommendation R(81)1 on Automated Medical Data Banks, Council of Europe Convention 108, January 1981, ISBN 92-871-0022-5. [33] EU 95/46/EC: On the Protection of Individuals with regards to the Processing of Personal Data and on the Free Movement of Such Data, European Union Directive, OJ L281/31-50, October 1995. [34] Simitis, S.: "Reviewing Privacy in an Information Society," Univ. Pennsylvania Law Review, V.135, pp.707-746, March 1987. [35] Sun Microsystems Inc.: "Java Naming and Directory Interface,"
220
Angelos Varvitsiotis et al. Wahl, M., Howes, T. and S. Kille: "Lightweight Directory Access Protocol (v3)," Internet Engineering Task Force: Request For Comments RFC2251.
MPEG PTY-Marks: Cheap Detection of Embedded Copyright Data in DVD-Video J. P. M. G. Linnartz and J. C. Talstra Philips Research, WY8; Prof. Holstlaan 4 5656 AA Eindhoven; Netherlands
A b s t r a c t . In this paper we propose a method to watermark digital video content in such a way that detection in consumer electronics equipment is possible with very little hardware (a few thousand gates). The method proposes to modify the MPEG encoding procedure to choose the so-called Picture Type of video-frames not from a regular sequence but according to a message one would like to transmit. Removal of this embedded message, the PTY-Mark, from the resulting MPEG-stream without jeopardizing video quality is only possible after a complete MPEG decoding and re-encoding cycle. We investigate the modifications to current MPEG encoders which are necessary to accommodate these PTYMarks. Based on tests we comment on their feasibility. Detection of watermarks without secrets is very reminiscent of "public-key" cryptography. We discuss this relationship by contrasting PTY-marks with pixelwatermarking. Keywords: Watermarking, copy-protection, MPEG, DVD-video.
1
Introduction
The last few years have seen enormous expansion of the number of multimedia storage options, from ordinary CD to hot newcomers like DVD-R. The common denominator of all these new systems is t h a t they store and disseminate digital information, be it text, pictures, audio, video or software. This digital nature poses a very realistic threat to those who provide proprietary or copyrighted content. The need for those providers to protect their legal rights has sparked a flurry of research activity in the field of copyright protection, thereby coming a long way from the days of analog scrambling of premium cable- channels. The history of anti-copy measures has taught us that as no protection scheme is absolutely secure, the relevant question becomes really one of what level of attacks can be subverted at what price. This question is directly related to what kind of attacks one expects a consumer-device to withstand given the realities of the marketplace. Therefore a few words a b o u t levels of piracy and how one would like to guard against them. On the one hand casual home copying can be effectively stopped by fairly simple technical measures. On the other hand, large scale pirates have ample technical means to circumvent any protection. Because the number of these large operations is limited, they can be challenged in court except in countries where the authorities are either non-co-operative
222
J . P . M . G . Linnartz and J. C. Talstra
or insufficiently in control. The category in between, viz. the small-scale pirates running cottage or garage factories, may be too small to attack through legal actions. Meanwhile these pirates often have sufficient facilities for tampering with recording devices, to overcome conditional record protection measures of their own equipment However, pirates have no access to the devices installed in the homes of their potential customers. This suggests that the best measure against small-scale piracy is playback control, at the expense of (a small amount of) additional logic in consumer equipment. This playback control is preferably conducted in a simple disc drive or other storage device which usually has no facilities to process and interpret the stored "bits and bytes". This shift from record- to playback-control of the anti-copy paradigm is generally regarded as a technological challenge due to the "dumbness" of such drives. Imposing playback-control implies that the world of playback equipment gets divided into compliant and non-compliant devices. Given the fact that it will most likely be impossible to root out the members of the second category, the strategy of an anti-copy mechanism should be to keep protected content from being multiplied in the non-compliant world and hurting sales by re-entering the compliant world. Encryption, as for instance applied to disc sectors, only addresses part of the issue of illegal copying. This applies in particular to the new digital "content scrambling system" or CSS, for DVD-video disks. At some point the encrypted content is read from the disc and becomes available in the clear, either after (legal) decryption, or illegally, after the cryptographic algorithm has been cracked or its key obtained. This content needs further protection against copying and mass-multiplication without loss of quality, an issue that is of particular concern to music- and movie-studios. One of the solutions which has been pursued in relatively recent years, is that of pixel watermarking. It is possible to mark an image, a video-clip or soundbite in such a way that marked and unmarked pieces differ in a mean squared sense and are very distinguishable as such by electronic hardware--yet at the same time this deviation cannot be perceived by the human sensory system[l16]. Embedded signaling in the form of watermarks is much like an electronic "tattoo" in that it ensures that marks are not lost in typical operations, including format conversions. Although the principle of pixel-watermarking seems to put an elegant end to copyright issues, it has a few serious drawbacks. First of all, a provider who wishes to assert his/her ownership of still-pictures may employ (in principle) arbitrary resources (time and computing power) to detect the watermark that (s)he inserted. On the other hand, for video playback control this ownership has to be determined by a rather limited proxy, viz. consumer-equipment that decides whether playing/recording for a particular medium is allowed. This verdict should be reached on the fly (say every 10 seconds) and cheaply i.e. with very few gates. Secondly there is the issue of security. With the advent of the personal computer on the film/music scene, care has to be taken that the relatively open busstructure which is absent in a consumer recorder does not become the Achilles
MPEG PTY-Marks: Detection of Copyright Data in DVD-Video
223
Heel of a copy-protection system. One can imagine for instance that in a computer, a DVD-drive sends (encrypted) MPEG encoded video-material over the PCI bus to an MPEG decoder video card. That drive learns over the same PCI bus from the card, whether playback should be ceased or not, depending on the state of the watermark in the baseband video-content, a situation that would be vulnerable to a "man in the middle attack". Forestalling this situation by attempting pixel watermark detection already in the drive, would require this relatively dumb playback device to have a partial MPEG- decoder on board! Current estimates of the complexity of such a pixel-watermark detector start around 50,000 gates. The third undesirable feature of pixel-watermarking is that present methods rely on a pseudo-random number sequence that is embedded in images. The detection of this sequence plays the role of a secret key. An important distinguishing characteristic of watermarks is the level of restriction placed on the ability to read a watermark. For example, in many cases, it is desirable to embed information in audio, image or video content such that this information is readable by any recipient. In an application such as transferring copyright ownership information by watermarking news photographs, any and all receiving users should be capable of reading the embedded information. This has been called "public" watermarking, drawing analogy with public key cryptography. However, this nomenclature is misleading. All currently known watermarking algorithms fall into the category of "secret key" algorithms, in the sense that any expert who knows the algorithm and the key also has all the necessary tools to remove that watermark. In the parlance of cryptography this would be called "bringing the content into the open" (allowing it to be copied). The detectors embedded for instance in CE products, therefore have to store this secret in a relatively tamper resistant environment. To the best of our knowledge, no equivalent to public key encryption is currently available for watermarking that would allow public dissemination of a method and key to detect the watermark, without inherently revealing how the watermark can be removed. In (hypothetical) public watermarking, the embedding algorithm is private. i.e., only known to copyright owners, the detection algorithm is public knowledge. Lacking such systems, typically a secret key algorithm is placed in a tamperresistant box. It has been shown, however, that even if one assumes that this box is perfectly tamper-proof, it can efficiently be misused as an oracle to reveal the secrets of the watermark[17]. To deal with these three problems and fend off the various attacks associated with them, we would need another "mark", as closely intertwined, with the content as the pixel-domain watermark, but one that can already be detected in the drive. This mark does not need to survive outside the digital domain, or even after MPEG encoding, as beyond that point, the pixel-domain watermark takes over. In this paper we will describe how a public watermarking scheme could be designed around MPEG-type compression methods. The asymmetry of embedding and detecting this public watermark relies on the difference in complexity of
224
J . P . M . G . Linnartz and J. C. Talstra
M P E G compression versus M P E G decompression. We propose to use the redundancy in the choice of encoding video into M P E G Groups Of Pictures (GOPs) as a carrier for this watermark. Section 2 will explain the M P E G - w a t e r m a r k i n g principle. Section 3 will discuss the feasibility modifying existing encoders to a c c o m m o d a t e this watermark, using d a t a from trial M P E G - e n c o d i n g sessions on various public- domain M P E G 1 / M P E G - 2 encoders. Section 4 will conclude with a comparison of subject of this p a p e r with other similar existing methods and a future outlook.
2
PTY-Marks
As is well known the M P E G - 1 and MPEG-21 standards define three distinct ways in which a frame in a video stream can be MPEG-encoded, viz. as I-,Band P-Picture Types ( P T Y for short). A frame encoded as a n / - p i c t u r e type, is autonomous: it is essentially encoded as a J P E G picture, exploiting only spatial redundancy. B and P frames were introduced to make better use of temporal redundancy: coding a frame as a P - p i c t u r e type, one only describes differences with respect to certain previous frames (of either I - or P-type). Maximal compression efficiency is achieved with B-picture types which code roughly the difference between a given frame and the interpolation between the preceding and succeeding I - or P-frame. A sequence of frames starting with a n / - t y p e and up to, but not including the next I - frame is called a Group Of Pictures, G O P for short[18]2..
'
I-B-B-B-B-P-P-B
!
'~
GOP
pvl' ~
Fig. 1. GOP structure and examples of references of B and P frames.
As illustrated in Figure 1, B frames refer not only to the previous I or P frame, but also to the nearest I or P frame in future. Note t h a t P and B frames cannot be decoded properly if their references are not available. High coding efficiency is achieved by inserting as m a n y P - and B-picture types as possible. To code a given frame as an interpolation of two others, implies 1 Strictly speaking, in MPEG-2 the notion of GOP has been replaced by that of sequence as an autonomous self- referential group of frames/fields. For this paper we will stick to the MPEG-1 nomenclature, but this trivially extends to MPEG-2 2 Note: the MPEG standards and refinements thereof for DVD impose only mild constraints on the choice of whether to encode a given frame as a I,B or P-picture type: (i) the distance between consecutive I ' s in the resulting MPEG- stream typically does not exceed 0.6 seconds (15 frames for PAL, 18 for NTSC), and, in early versions, (ii) consecutive P frames should not be more than 3 frames apart.
MPEG PTY-Marks: Detection of Copyright Data in DVD-Video
225
that the encoder needs to do a (potentially) vast search in this frame for features like a moving car that might occur at another place in previous/later frames. The object that connects this feature to its incarnation in a previous/later frame is called a motion vector, and the procedure of finding it carries the n a m e / o r ward/backward motion prediction. It is particularly in this motion estimation and selection of the best reference location that M P E G encoders differ from manufacturer to manufacturer. This part of the encoding is seen as the most difficult and computationally intensive task where consumer encoders will lag in performance, compared to professional products. In principle the degree of freedom of choosing the picture type could be exploited to transmit a low bitrate data-stream, containing e.g. copyright information. We call this deliberate manipulation of picture types a PTY-watermark of P T Y - M a r k for short. Of course one might use one of the dedicated user data areas as accommodated by the M P E G - s y n t a x as a copyright channel, but this opens up serious hacking opportunities for potential software-pirates. Conversely, removal of the copyright information embedded in a deliberately chosen sequence of picture types, requires a complete decoding/encoding cycle of the M P E G stream. It is expected that in the upcoming few years, the cost of this cycle (decoding+encoding while maintaining video quality) will remain prohibitive, financially as well as computationally. At the same time, decoding this P T Y mark should be possible at little cost (a few thousand gates) as it involves just parsing the MPEG-stream and referring to a look-up table to decode GOP-structures into characters. The PTu is very much like public-key watermarking, and this allows a detector to become part of a "dumb" device such as a DVD-ROM/RAM player in a PC.
B O X 1: Vulnerability to A t t a c k s An attacker who is familiar with the M P E G standard can attempt to undo a P T Y - m a r k by rewriting a P frame into a B-frame. This is possible without redoing any motion estimation for that frame. As shown in Figure 2, the new frame could use references to a future frame, but does not need to use these. While the modified frame will be decompressed correctly, artifacts will occur in other frames of that GOP, see fig. 2. Neighboring frames will then have incorrect references. The artifacts typically become more severe for frames later in a GOP. Figure 3 gives an example of such an incorrectly decoded frame. Erroneous references create blocks of 8 by 8 pixels with substantial luminance and chrominance errors, particularly in areas with motion. These artifacts are clearly visible in Figure 4. To correct the artifacts caused by these attacks, the effort needed is comparable to M P E G encoding from scratch. This suggests that the watermarking method can be used in applications
226
J.P.M.G.
Linnartz and J. C. Talstra
I-B-B-B-B-P-P-B~., ',eference not used
I-B-B-B-B-B-P-BF i g . 2. Original marked G O P structure (above) and attacked sequence (below). One P frame is now written as a B frame by an attacker. The B frame allows references to the next P frame but a hacker would not use these, to avoid having to do motion estimation.
I-B-B-B-B-P-P-B-
I
i
F i g . 3. Originally marked G O P structure (above) and attacked sequence (below). One P frame now is written as a B frame. Frames marked with grey circles have incorrect references and will show severe artefacts.
w h e r e c o n v e r s i o n to u n c o m p r e s s e d d i g i t a l or a n a l o g would have c i r c u m v e n t e d t h e c o p y - p r o t e c t i o n m e t h o d anyway.
3
Implementation
Issues
To i n t e g r a t e t h e P T Y - m a r k into a c r y p t o g r a p h i c a l l y secure c o p y r i g h t m a n a g e m e n t s y s t e m , we would like to a l l o c a t e 64 b i t s to it. T h e M P A A (the c o n s o r t i u m of H o l l y w o o d M o v i e Studios) has issued a g u i d e l i n e t h a t w a t e r m a r k - d e t e c t o r s in D V D - p l a y e r s (of either s t a n d - a l o n e o r P C t y p e ) should d e t e c t presence of a w a t e r m a r k , once e v e r y 10 seconds.
MPEG PTY-Marks: Detection of Copyright Data in DVD-Video
227
Fig. 4. A) Sample of an original video frame, and B) Artifacts caused by an intentionally modified picture type.
228
J . P . M . G . Linnartz and J. C. Talstra
The choice of P T Y - m a r k s as a subliminal channel only makes sense if G O P structures which have a "meaning" as a symbol transmitted across this channel are not being generated randomly by existing and currently envisioned M P E G encoders. A limited survey of extant DVD-video material (see Appendix) and present M P E G encoding practices yielded the following: 1. With the maximal G O P length of 0.6 seconds, we have at least 15 G O P s / 1 0 seconds, and therefore for a worst case 10 sec. slot every G O P should encode 6 bits on average. 2. Right now, and probably in the few years to come, the G O P - s t r u c t u r e of choice for DVD but also commercial digital broadcast is: IB...BPB...BPB... - I B n ( B n P ) m-1. Typically n -- 1,2 and m = 4 for professional equipment, and n = 0 for consumer-grade hard/softwareencoders, and in either case usually fixed for the duration of a presentation. The conclusion that we draw from this is that the number and distribution of B-frames in a potential P T Y - m a r k should not change too much with respect to a "normal" G O P to maintain coding complexity at a reasonable level 3, yet at the same time it should be descernible from t h a t same standard GOP. The number of P - f r a m e s should stay approximately the same because a coded P - f r a m e requires on average twice as m a n y bits as a B-frame, and with fixed coding rate, e x t r a P ' s decrease the SNR. 3. M P E G encoders m a y optimize the G O P structure a little further than the conventional sequences listed in item 1.: during scene changes, there is little temporal redundancy. To deal with this, an intelligent encoder encodes a B - f r a m e which bears little resemblance to its neighbors as a P - f r a m e . When this doesn't help, a new G O P is forced by coding as it an I frame. In such cases we see more esoteric G O P structures such as I P P P , IBBPBBPBBPPBB, or consecutive single I ' s . 4. Occasionally we see G O P s with a more constant structure such as those containing n > 4 consecutive P ' s , representing freeze-frames without motion. In the next sections we will give a particular implementation of a P T Y - m a r k alphabet and discuss possible improvements.
3.1
P r o p o s a l for a P T Y A l p h a b e t
The material of this and the next section is the subject of current research at Philips NatLab as Philips' contribution [19] to the standardization subcommittee for DVD-video copy-protection through watermarking, the D H S G - C P T W G . The DataHiding SubGroup of the Copy Protection Technical Working Group, is an industry forum with participants drawn from content providers, consumer electronics and I T industries. 3 Besides, if a string of B's becomes too long, the reference frame that they draw their motion estimation from, is too far in the past. Therefore correlation is bad and coding efficiency goes down dramatically.
MPEG PTY-Marks: Detection of Copyright Data in DVD-Video
229
If within a G O P we denote a P - f r a m e as the bit "0" and a B-frame as the bit "1", every G O P has a one-to-one relationship to a binary sequence, e.g. I B B P B B P B B P B B =_ 11011011011. Taking into account the requirements in the previous paragraphs, a P T Y - a l p h a b e t was constructed as a Hammingcode with the following properties (numbers refer to the 4 items in the previous section): - To a c c o m m o d a t e item 1: ( ~ 6 b i t s / G O P ) , all valid P T Y - m a r k e d G O P s should fall into 1 of 26 = 64 groups. - To a c c o m m o d a t e item 2, the G O P - l e n g t h is fixed to 12 (11 P s a n d / o r Bs), and the number of B-frames should be close 6, i.e. every group has a representative or "code-word" with six " l ' s . - To deal with item 3 (scene-changes and r a n d o m GOPs), we impose t h a t each group is created from its code-word by flipping at most 1 bit. and thus code-words must have H a m m i n g distance 4. - Regarding item 1: we have to eliminate all words with a H a m m i n g distance < 4 away from the "standard" G O P s "11011011011" and "10101010101", which e m a n a t e typically from standard encoders, and thus represent unmarked material. This yields an alphabet of 62 code-words in all, see table below: 4 Theoretically, imposing picture types m a y have a minor effect on signalto-quantization-noise ratio of the M P E G encoded image. Experiments however showed t h a t there is no deterioration of the image quality, neither observable by the h u m a n eye, nor measurable with statistical significance. We compressed video at predetermined rates and decompressed it. This result was subtracted from the original image and the rms error was computed. This measurement was made for marked and unmarked video, see fig. 3.1. The standard M P E G encoding method (diamonds) does not give significantly different SNR values t h a n the method t h a t embedded P T Y marks (squares). The average difference in error is close to 0 dB (crosses). However, locally, i.e., in particular frames, the SNR m a y be different. This very much depends on the relative alignment of scene changes with B and P picture types. There is no systematic tendency of the marked G O P s being worse t h a t normal M P E G encoded GOPs. In some frames, the marked sequence has better SNR t h a t a typical sequence. 3.2 False
Improved PTY-alphabets Positives
From studying various commercially released movies it appears t h a t in the vast majority of cases the GOP-structures t h a t are being used in the code above do not appear. There are, however, a few false positives: i.e. G O P s which have 4 Note that in the table code-word GOPs are represented by their so-called codingorder, not their display-order, because a PTY-detector would receive frames in the former fashion.
230
J . P . M . G . Linnartz and J. C. Talstra
code-word 11011100010 00101101110 11100010110 01100111010 00010101111 ii010101001
11101001010 11011010100 00111100011 01111010010 10110101010 11100100011 11111000001 01100101101 10111000110 11000101110
char. code-word 1 01101110001 5 00011010111 9{01110001011 13{ 10001011110 17100111110100 21 i00111001101 25 10100111100 29 11101100100 33 01110010101 37 01001101011 41 00100110111 45 11001111000 49 01101011100 53 11110110000 57 01111101000 61 01111111111
char.:code-word 210111011000 6 10000111011 10 10110100101 14:11110001100 18!10101101001 22 01000011111 26 10001100111 30 10010110110 34 01011001110 38 11010000111 42 11100011001 46}01011100101
50i 10101110010 54i00110011110 5810010011101 62
char. code-word
3 01010111100 cha~ 8] 7 11001001101 11 10110010011 121 15 11000110101 16] 19 01101000111 201 24 I 23 01110100110 27 00011111010 28 31 10100001111 32 35 00001111101 36 39 10011110001 40 43 00110111001 44 47 10011101100 48 52 51 01001110110 55 00101011011 56 60 59 01010110011
T a b l e 1. List of the code-words (PTY-marked GOPS) and the characters that they represent.
a meaning in the P T Y - w a t e r m a r k sense, but belong to an unmarked piece of video, viz.: I B B P B B P B B P P . To avoid these type of harmful detections, we have a n u m b e r of options: - P u t a higher layer of error detecting code on top of the subliminal P T Y channel e.g. by adding a CRC to its bit-content. - From a copy-protection point of view, a w a t e r m a r k is only valid if its bitcontent is compatible with the bit-content of another mark, viz. that of the physical medium t h a t carries the M P E G - s t r e a m . E.g. for DVD-copy protection, various disk-marks have been proposed t h a t don't travel along to a new disk when a copy is made: for instance e x t r a subliminal bits are hidden in the redundancy in the error-correcting code layer on disk, or the E F M + (Eight-To-Fourteen Plus) channel-coding, or allowing the spiral track to be modulated sinusoidally (so-called wobble). T h e purpose of this physical m a r k is to verify t h a t copyrighted content is on original media, and not on (forged) ROM-disks or even RAM-disks.
False N e g a t i v e s Another issue is t h a t of false negatives: i.e. a bit of content should be watermarked, but a detector cannot find P T Y - m a r k s in it as the MPEG-encoder was unable to put in P T Y - m a r k e d GOPs. This m a y h a p p e n for instance: - at scene changes. One practical comment might be that scene changes represent little commercial value and might therefore go unprotected by temporarily suspending PTY-marking.
MPEG PTY-Marks: Detection of Copyright Data in DVD-Video SO
,
,
I
i
L
.B.r,p
SO
SNR [dB]
~
'
t
~ .............. J
J
"0i,,,nr'--
"~/.k
,~
~
_
231
"diff-snr" "diff-snr"
-~n
~.
41)
~.~k~-~,o-~a-~
~ ~t.~
~-o
30
20
10 I
~,: i
-10 200
205
210
,
,
i 215
~:
""
,j
i
i
i
i
220
225
230
235
l 240
i 245
__ 250
Frames [#] Fig. 5. Mean squared difference between the original video sequence and an MPEG compressed/decompressed sequence vs. frame-number for PTY-marked (squares) and %tandard"-GOP MPEG material (diamonds). The crosses show the difference between the two. - due to the fact t h a t a P T Y code-word in table above, m a y contain approximately " s t a n d a r d " n u m b e r s of P - and B-frames, b u t with sometimes very non-uniform distributions, e.g. long strings of Bs, which would might locally lead to b a d compression efficiency. To deal with t h e last issue: one way to stray not quite as far from s t a n d a r d G O P s is to indicate the presence of P T Y w a t e r m a r k e d content by an alternating sequence of G O P s : I ( B n - 1 p 2 ) m ' . . . I ( B ' ~ P ) m' " -; for n = 2, m ~ = 2 we would get
IBPPBPP... IBBPBBP... IBPPBPP... IBBPBBP..
.
IBPPBPP..
.
(1)
m and n would be chosen the same as for the case where no w a t e r m a r k is embedded. T h e " . . . " , e.g. B B P B B P can be left up to the e n c o d e r to optimize.
232
J . P . M . G . Linnartz and J. C. Talstra
The most obvious choice for ". 9-" seems to be "-. ."= ( B n P ) m - r e ' , the "nowatermark" ending resulting in a G O P of the original length. A conservative choice would be m' = m / 2 , half of the original G O P is sacrificed to the syncwatermark. This degree of freedom at the end of the marked G O P s should allow the encoder to ensure a video bit-rate/quality within specs. The detection of this "Watermark-Present'-sequence by the detector yields synchronization on the bit level. To allow us to achieve the same on a potential symbol level, we let the synchronizing sequence "count-down'. E.g. for n = 2, m' = 2 we run through the following sequence of 6 GOPs: IBBPBBP
..
IBPBBPB
..
IPBBPBB
..
IBPPBPP
..
IPBPPBP
..
IPPBPPB
..
(2)
Note that the encoding complexity of this sequence of GOPs is no different than the sequence in (1), and still avoids 3 or more consecutive Bs. Accumulating such a sync-sequence with a statistically relevant length (say a threshold of 10 syncGOPs in 10 sec) would then indicate a watermark present. For a few movies ("Four Weddings and a Funeral" and "When We Were Kings" (see Appendix) we analyzed the GOP-structure. Approximately 9% of the GOPs in these movies deviate significantly from their encoder-default IBBPBBP... and might be difficult to watermark. However if we look at the distribution of these 'Bad' GOPs, we see that they tend to bunch together in clusters, such that in a 10 second window, they regularly achieve a density of 20-35% (see Figure 3.2). We define a " Bad" GOP for these movies to be one that doesn't start with the 7 frames I B B P B B P . In Figure 3.2 we have integrated this data to show the relative importance of 10 sec. slots with a given percentage of "Bad" (i.e. difficult to watermark) GOPs. Beyond GOPs indicating the presence of the P T Y watermark, we also would like to embed bit-content. In NTSC with on average 26 G O P s / 1 0 seconds (possible low of 17 G O P s / 1 0 sec) we would have to encode on average approximately 3 b i t s / G O P (but more if we want to introduce error correction). The following 3 options are currently under consideration: - For embedders that typically produce GOPs with m > 4: embed 3 bits via G O P --= I A 1 A 2 A 3 B B P ; A i is either B B P (binary 0) or B P B (binary 1). Generalization to other values of n are trivial. - After the sync-sequence (1) or (2), the GOP length is allowed to vary by -1, 0, or +1 frames, by adding a P-frame or taking out a B from standard GOP I B B P B B P B B P B B . Valid GOPs are those having P-frames spaced not more than n frames apart (as usual). Assuming that the encoder default is n = 3, m = 5, we obtain 35 symbols, i.e. about 5 bits.
MPEG PTY-Marks: Detection of Copyright Data in DVD-Video 100
i
233
l i t 'Four Weddings and a Funeral' - 'When We Were Kings'
80
40
20
0 0
1O0
200
300
Time (in units of
400 10 seconds)
500
600
700
Fig. 6. The percentage of "Bad" GOPs in a 10 sec. slot as a function of the position of the slot in the movie. The data has been compiled from two movies (see Appendix). A "Bad" GOP is one that doesn't start with I B B P B B P . That sequence is the encoderdefault for these two movies6.
3.3
Complexity/SNR
Analysis of Proposed
GOPs
Ill this section we present data comparing the various non-orthodox GOP-structures on the issues of coding-difficulty and picture quality. Throughout this section the bitrate is held constant. We make this comparison based on the time it took various software M P E G encoders to encode pieces of test material with a particular " P T Y - m a r k e d " GOP, and looking at the resulting SNR of the coded M P E G material. For video compression we used a public domain software M P E G - 1 encoder compressing at predetermined rates from Berkeley [3]. The SNR was measured by decompressing the compressed video; this result was subtracted from the original image and the error was c o m p u t e d in the usual mean-squared sense and was normalized to 100 for the orthodox I B B P B B P . . . . G O P structure. We took the encoding rate (in the terms of the number of processed frames per second of CPU-time) as a rough measure for the coding complexity introduced by forcing a particular G O P upon a video sequence. Comparison of the coding complexity and SNR of various G O P structures for a number of movie-clips and for various encoders can be found in table 3.3 below. From this table it appears t h a t P T Y - w a t e r m a r k i n g does not create substantial visual artifacts to the image. Although theoretically, the m e t h o d may have a minor effect on signal-to-quantization-noise ratio of the M P E G encoded image, the data do not bear this out. Also to the h u m a n eye, the difference between P T Y - m a r k e d sequences and the canonical sequences are not observable.
234 0.9
J. P. M. G. Linnartz and J. C. Talstra i
i
i
i
I
i
i
i
i
Four Weddinos and a Funeral 1
W h e n W e W e r e Kings
08
07 O (9 0.e
~
0,~
r '5
~ o "6 0 .
g
)
01
.
b0
" J ' =10 "
" ~ "20
.
.
.
" - ' "3 0 . . . 40 . . . 50.
'o
6 % 'Bad' G O P s in 10 sec. slot
7'0
. . . .
80
90
10O
Fig. 7. The relative importance of 10 sec. time-slots with "Bad GOPs" as a function of the percentage of those Bad GOPs in the time slot. Based on the data in Figure 3.2.
4
Conclusion
PTY-Marks as advocated above, are based on the asymmetry in complexity between encoding a frame as a particular picture type vs. detecting that picture type. There have been attempts similar to ours to exploit this asymmetry for embedding information in particular, by carefully choosing the motion-estimation vectors [22]. Lacy et al. [23] have suggested a similar method that embeds a watermark in MPEG-audio. This mark can also only be removed by a complete decompression/compression cycle. According to their method a subliminal copyright messages is embedded in the LSB of the so-called "scale factors" of the scalefactor bands, in the MPEG-audio standard. The actual frequency components for which a scale factor determines the quantization accuracy, are compensated accordingly. This is done to ensure that when a scale-factor's LSB is altered from default, the change to the frequency components compensates this in a way such that the resulting decompressed signal differs from the unmanipulated one in
MPEG PTY-Marks: Detection of Copyright Data in DVD-Video
COP Structure
235
avg. S N R ~eak SNRI Coding efficiency
(l~ame/8, IBBPBBPBBPBB IBPPBPPBBPBB IBBPBBPBBPBBP IBBPBBPBBPB IBPPBPPBPPBPP IBBPBBPBBPBBPBB IBBPBBPBBPBBPB IBBPBPBBPBBPBB IBBPBPBBPBBPBPBB IBBPBBPBBPBBPBBPBB IBBPBBPBBPBBPBBPB IBBPBPBBPBBPBBPBB IBBPPBBPBBBPBBPBPB
(N=12 standard GOP) (synch GOP) (GOP 1 longer) (GOP 1 shorter) (extreme synch GOP) (N=15 std GOP) (GOP 1 shorter) (GOP 1 shorter) (GOP 1 longer) (N=18 std GOP) (GOP 1 shorter) (COP 1 shorter) (random COP)
100.0 100.2 10{}.4 99.5! 99.8 99.4 99.2 99.3 99.7 99.3 99.2 99.6 99.0
158.1 155.8 159.4 157.1 159.0 158.5 157.1 158.1 159.0 159.4 159.0 159.0 159.0
CPU-time) 0.381061 0.442968 0.395922 0.390396 0.532198 0.378430 0.384966 0.386660 0.392643 0.378108 0.383374 0.386100 0.393895
Table 2. PTY-marked COPS, their coding complexity and SNR.
a manner that lies below the perceptual threshold 7. This marking takes a very small toll on the bandwidth (order .2% - .4%). Trying to remove this type of watermark by, say, randomizing the LSB of the scale factors, generates by definition quantization errors which lie above the perceptive threshold for hearing, unless the compression coefficients are manipulated along. This latter case for audio corresponds to a complete decompression/compression cycle, just like for P T Y - m a r k removal (see Box). This method can be trivially extended to MPEG-video by manipulating the LSB of the scalefactors that set the quantization accuracy for DCT coefficients in a macroblock. Contrary to PTY-marks, to remove the subliminal message in the video, scalefactors only requires a partial M P E G encode/decode, as motion-estimation does not have to be performed again. Another obvious extension of these PTY-marks can be made using MPEG4: in that compression standard, temporal redundancy in video is not reduced on the level of macroblocks, but rather by dissecting frames into a collection of Video Objects (VOs, rather like "sprites" from the gone days of home computers) and coding their motion in front of a stationary background. These objects in turn can be built from various Graphics Primitives, or alternatively described in terms of an (MPEG-1 compressed) bitmap or mesh and even class objects such as facial expression. We might say that MPEG-1, -2 video is a special case of M P E G - 4 where the Graphics Primitive is always rectangular (viz. the 16 • 16 pel Macroblock). Obviously compared to MPEG-1 and -2 the MPEG-4 improved compression comes at the cost of an encoding procedure that will be quite a bit more complex as the "search-space" is vastly greater. This complexity may E.g. imagine a frequency component in a scale-factor band with scale-factor f has value D which after quantization becomes [D/fJ 9 f. Then after marking f changes to f' (f, f ' may only differ in LSB) and D is quantized to [D/f'] * f'.
236
J . P . M . G . Linnartz and J. C. Talstra
again be exploited for the purpose of embedding a watermark. As an example, one might imagine tying a "never-copy" status to encoding VOs using a mesh or a particular subset of graphics primitives; non- copyrighted material would be encoded using primitives not from that subset. In summary we showed how one might exploit the complexity of an M P E G encoder to embed a watermark for digital video. This watermark can be detected with very minimal means (e.g. in a drive) forestalling complicated hardware to set up a cryptographically secure link to M P E G decoding logic. An interesting aspect of this P T Y - m a r k is that, to some extent, it has public-key properties. Detecting the watermark is trivially simple (and cheap in terms of hardware requirements), whereas embedding and modifying it requires a substantial effort, namely that of M P E G encoding. Of course, the disadvantage of P T Y marks is their inability to survive analogue transmission. As the barrier of M P E G encoding may not sufficiently protect carriers of copyrighted video material (such as DVD) by itself, the scheme is used as an "accelerator" for recognizing copyrighted content. The method illustrates how public-key watermarking may in future be achieved if compression and representation standards anticipate and accommodate this feature.
Acknowledgement The authors would like to thank Bas v / d Heuvel, Erwin Kragt, Joost Smolders, Ludo Tolhuizen and Emanuel Frimout for discussions, inspiration and help.
References 1. L. F. Turner, "Digital data security system." Patent IPN WO 89/08915, 1989. 2. J.T. Brassil, N.F. Maxemchuk, S. Low, and L. O'Gorman, "Electronic marking and identification techniques to discourage document copying" , Proceedings of IEEE Infocom 94. Toronto, June 1994, pp. 1278-1287 3. Caronni G.: "Assuring Ownership Rights for Digital Images", Proceedings of Reliable IT Systems, VIS '95, Vieweg Publishing Company, Germany, 1995 4. K. Tanaka, Y. Nakamura, and K. Matsui, "Embedding secret information into a dithered multi-level image", in Proc, 1990 IEEE Military Communications Conference, pp. 216-220, 1990. 5. E. Koch, J. Rindfrey, and J. Zhao: "Copyright protection for multimedia data", in Proc. of the Int. Conf. on Digital Media and Electronic Publishing, 1994. 6. E. Koch, J. Zhao: "Towards Robust and Hidden Image Copyright Labeling", Proceedings IEEE Workshop on Nonlinear Signal and Image Processing, Neos Marmaras, June, 1995, pp. 452-455. 7. M. D. Swanson, B. Zhu, and A. H. Tewfik, "Transparent robust image watermarking", in IEEE Int. Conf. onImage Processing, 1996 8. G. B. Rhoads, "Indentification/anthentication coding method and apparatus.", World Intellectual Property Organization, WIPO WO 95/14289, 1995. 9. W. Bender, D. Gruhl, N. Morimoto, "Techniques for Data Hiding", Proceedings of the SPIE, 2420:40, San Jose CA, USA, February 1995
MPEG PTY-Marks: Detection of Copyright Data in DVD-Video
237
10. O. Paatelma and R. H. Borland, "Method and apparatus for manipulating digital data works", WIPO Patent WO 95/20291, 1995. 11. L. Holt, B. G. Maufe, and A. Wiener, "Encoded marking of a recording signal." UK Patent GB 2196167A, 1988. 12. I. Cox, J. Kilian, F. T. Leighton, and T. Shamoon, "Secure spread spectrum watermarking for images, audio and video", in IEEE Int. Conference on Image Processing, vol. 3, pp. 243-246, 1996. 13. I. Cox, J. Kilian, F. T. Leighton, and T. Shamoon, "A secure, robust watermark for multimedia", in Information Hiding: First Int. Workshop Proc., R. Anderson, ed., vol. 1174 of Lecture Notes in Computer Science, pp. 185- 206, Springer-Verlag, 1996 14. C. I. Podilchuk and W. Zeng, "Image-Adaptive Watermarking Using Visual Models", IEEE Trans. on Selected Areas of Communications vol 16, pp 525-539, 1998. 15. J. J. K. O. Ruanaidh, W. J. Dowling, and F. Boland, "Phase watermarking of digital images", in IEEE Int. Conf. on Image Processing, 1996. 16. R. D. Preuss, S. E. Roukos, A. W. F. Huggins, H. Gish, M. A. Bergamo, P. M. Peterson, and D. A. G, "Embedded signalling", US Patent 5,319,735, 1994. J.P.M.G. Linnartz and A.C.C. Kalker and G.F. Depovere, "Modelling the false-alarm and missed detection rate for electronic watermarks", Proc. of Second International Information Hiding Workshop, Portland, OR, 1998. 17. T. Kalker, "Watermark estimation through detector observations", Proc. of the IEEE Benelux Signal Proc. Symp., Leuven, pp 119-122, 1995. I. J. Cox and J.M.P.G. Linnartz: "Public watermarks and resistance to tampering", ICIP 97. M. J. J. Maes, "Twin Peaks: the Histogram Attack to Fixed Depth Image Watermarks", Proc. of Second International Information Hiding Workshop, Portland, OR, 1998 J.-P. Linnartz and M. van Dijk, "Analysis of the Sensitivity Attack against Electronic Watermarks in Images", ibid.. 18. J. L. Mitchell, W. B. Pennebaker, C. E. Fogg and D. J. LeGall, "MPEG Video Compression Standard", Chapman and Hall, New York, NY, 1997. 19. Linnartz, J.-P., Kalker, T., Depovere, G.: "Response to Call for Proposals of the Data-Hiding SubGroup of the CPTWG" Sept. 1997. 20. Linnartz, J.-P., Heuvel, S.A.F.A v/d, Smolders, J., unpublished, 1997. 21. MPEG-1 Video Encoder in Software (V1.5), Rowe, L.A et al, Computer Science Division-EECS, UC Berkeley, 1995, http://bmrc.berkeley.edu/projects/mpeg. 22. Jordan, F., Kutter, M., Ebrahimi, T.: "Proposal of a Watermarking technique to Hide/Retrieve Copyright Data in Video", Technical Report M2281, ISO/IEC J T C 1 / S C 2 9 / W G l l MPEG-4 meeting, Stockholm, Sweden, July 1997. 23. J. Lacy, S. R. Quackenbush, A. Reibman and J. H. Snyder, "Intellectual Property Protection Systems and Digital Watermarking", Proc. of Second International Information Hiding Workshop, Portland, OR, 1998.
Appendix: Statistics of GOP Structure for Various DVD-Pictures Four Weddings and a Funeral (Polygram)
When We Were Kings (Polygram)
238
J.P.M.G.
Linnartz and J. C. Talstra
GOP IBBBBPBBPBB IBBPBBPBBPBBBB IPPP IPPPPPPP IPPPPPPPPPPPPPP IBBPBBPBBPBBPBBPPP IBBPBBPPBB IPPPPPPPPPPPBB IBBPBBPBBPBBPBBPP IPPPPPPPPPPP IBBPBBPBBPBBPBBP IBBPBBPBBPBBPBBPPPBB IBBPBBPBBPBBP IBBPBBPBBPBBPBBPBB IBBPBBPBBPBBPBBPPBB IBBPBBPBBPBBPP IBBPBBPBBPBBPPP IBBPBBPBB IBBPBBPPPBB IBBPPP IBBPPBB IBBPBBPBBPBBPPPBB IBBPBB IBBPBBP IBBPBBPP IBBPBBPPP IPPBB IBBP IBBPPPBB IBBPBBPBBPBBPPBB IBBPBBPBBPP IPP IBBPP IBBPBBPBBPPP IBBPBBPBBP IPBB IBBPBBPBBPPBB IP IBBPBBPBBPPPBB IBBPBBPBBPBBPBB IBBPBBPBBPBB
# of occurrences 1 1 1 1 1 2 3 5 6 9 12 14 18 18 26 27 38 51 87 102 104 108 110 113 120 122 122 123 125 125 131 134 135 147 150 158 200 213 248 765 1385 11708
T a b l e 3. "Four Weddings and a Funeral": GOPs Ordered by Frequency
MPEG PTY-Marks: Detection of Copyright Data in DVD-Video
GOP
# of occurrences
I IP IPP IPPP IPPPPPPP IPPPPPPPPPPP IPPPPPPPPPPPPPP IPPPPPPPPPPPBB IPBB IPPBB IBBBBPBBPBB IBBPPP IBBPP IBBP IBBPPBB IBBPPPBB IBBPBB IBBPBBPPPBB IBBPBBPPP IBBPBBPPBB IBBPBBPP IBBPBBP IBBPBBPBB IBBPBBPBBPPP IBBPBBPBBPP IBBPBBPBBP IBBPBBPBBPPBB IBBPBBPBBPPPBB IBBPBBPBBPPBB IBBPBBPBBPBBBB IBBPBBPBBPBBPPBB IBBPBBPBBPBBPPPBB IBBPBBPBBPBBPPP IBBPBBPBBPBBPP IBBPBBPBBPBBP IBBPBBPBBPBBPBBPPPBB IBBPBBPBBPBBPBBPPP IBBPBBPBBPBBPBBPP IBBPBBPBBPBBPBBPPBB IBBPBBPBBPBBPBBP IBBPBBPBBPBBPBBPBB IBBPBBPBBPBBPBB IBBPBBPBBPBB Total:
38 248 135 1 1 9 1 5 200 123 1 104 147 125 108 125 113 $1482 "BAD" GOPs 102 122 3 122 120 87 150 134 158 213 765 213 1 131 110 51 27 18 14 2 6 26 12 18 1385 11708 16969
239
T a b l e 4. "Four Weddings and a Funeral": GOPs Ordered by deviation from typical GOPs with n - - 2, m - - 4
240
J.P.M.G.
Linnartz and J. C. Talstra
GOP
# of occurrences
IBBBBPBBPBB IBBPBBPB IBBPBBPBBP IBBPBBPBBPB I IBPBPBPB IBBPBBPBB IBBPBB IPPP IBB IBBPBBPBBPBB
1 1 1 3 24 30 96 106 127 135 12512
T a b l e 5. "When we were Kings": G O P s ordered by frequency.
GOP
# of occurrences
I 24 IPPP 127 IBPBPBPB 30 IBBBBPBBPBB 1 IBB 135 IBBPBB 106 t 423 "BAD" GOPs IBBPBBPB 1 IBBPBBPBB 96 IBBPBBPBBP 1 IBBPBBPBBPB 3 IBBPBBPBBPBB 12512 T o t~ : 13036 T a b l e 6. "When we were Kings": Ordered by deviation from typical GOPs with n = 2, m - - 4 .
DHWM: A Scheme for Managing Watermarking Keys in the Aquarelle Multimedia Distributed System
Daniel Augot l, Jean-Franqois Delaigle 2, and Caroline Fontaine 1 1 INRIA Rocquencourt Domaine de Voluceau, B.P. 105 F-78153 Le Chesnay Cedex, France TELE, Laboratoire de t~l~d~tection et t~l~communication de l'Universit~ catholique de Louvain Belgium, Louvain-La-Neuve
Abstract. This paper presents secure architecture and protocols for managing Intellectual Property Rights in distributed content databases in a close environment. The implementation of this architecture is currently being realized in the European project AQUARELLE. Registered users will access on the Internet to high value content through secure servers. The main interest of this paper is protocols and architecture developed for using watermarking technologies, with a clever and efficient key management based on the Diffie-Hellman (DH) protocol and Trusted Third Parties (TTP). This paper presents a short survey of watermarking technologies. Next Aquarelle background is specified, along with the chosen watermarking algorithm, which is convenient for the project. Next the DHWM key exchange is presented, based on the simple idea that watermarking and verification can be separated. This scheme uses the Diffie-Hellman keyexchange protocol. Next some hints on the implementation of the scheme and on its correctness are given. Keywords: I P R protection, watermarking, Key exchange, Aquarelle, multimedia distributed system.
1
Introduction
W a t e r m a r k i n g (or fingerprinting or stamping) digital images is a new topic in the security domain. Roughly, this technology consists in hiding an invisible and robust m a r k into an image. This information should be sufficient to identify the copyright owner of the image: watermarked images can be traced to find their originator or their owner. This is clearly recognized of importance in the context of World Wild Web publishing. The given state of-the-art does not allow copyright owners to protect their images after diffusion, and m a n y services are blocked in their development because of the ease of reproduction of digital data. Watermarking, or embedding is a very new and complex technology, which hardly seems by now scalable to the whole Internet. Indeed, the technology is
242
Daniel Augot et al.
not as strong as classical cryptology, and reasonable attacks can be attempted with success. This paper describes the solution devised for a closed environment designed for the access of the European Cultural Database. This work is done for the Aquarelle European Project. In this project, multimedia data is more clearly defined, and users are also well known and identified. In that context, a reasonable solution can be tailored. A trusted third party TTP is introduced. Its role is to check the watermarked images. A first scheme is presented, and our scheme, named DHWM, improves that scheme by making use of the Diffie-Hellman protocol. The paper is structured as follows: the second section briefly surveys techniques relatives to watermarking, and fixes some terminology. Third section presents Aquarelle and its context, giving user requirements and technicals constraints. In section four, the algorithm from UCL is sketched. Section five introduces the DHWM functional model, using the Diffie-Hellman protocol. Section six presents how the scheme is implemented in the Aquarelle prototype.
2
A short survey and a terminology
Many authors have proposed as many algorithms for "marking", "fingerprinting", "data hiding", "steganography", "label embeddings", "watermarking" etc. We will first set a terminology with the most common concepts. We will then survey the functionalities claimed by different propositions, since the objectives are quite different depending on the authors. We will not discuss the technicals properties of the algorithms (robustness, invisibility, speed ... ) but only their aims and objectives.
2.1
M e t h o d s f r o m classical cryptography
It appears that the classical uses of cryptographic techniques are not able to prevent fraudulent use of the images: e n c r y p t i o n encryption can only protects the images during their transmission. With encryption, an eavesdropper does not have access to the on-line image when it is transfered. But when the user has deciphered the image, then this image does not have any copyright protection anymore. s i g n a t u r e the owner of the image may electronically sign the image (with a hash function and a signature algorithm), but since the signature is added as a suffix to the image, it can easily be removed by anyone who gets the image. It appears that additional tools are needed. Watermarking is one of these, it will be explained in next sections.
DHWM: managing watermarking keys in Aquarelle 2.2
243
Terminology
is a very generic concept that consists in hiding messages in a way that eavesdroppers or any monitors do not even know that there is a communication and a message is being sent. Many data biding techniques were invented for this purpose. Those techniques inspired the development of watermarking algorithms for copyright protection [9]. w a t e r m a r k i n g is the robust embedding of a copyright information (e.g. time and date, copyright identifiers or simply a correlation pattern) into a content. This content may be a text [12] [1], an audio content [16], but most of the time watermarking is applied to still or moving images. This paper is focusing on images watermarking for both still pictures and motion pictures applications. In the current state-of-the-art, watermarking uses symmetric keys, in the sense that a secret key is used to hide data in a robust way and the same key is used to retrieve the data. f i n g e r p r i n t i n g consists in uniquely marking and registering each copy of the data. This marking allows a distributor to detect any unauthorized copy and trace it back to the user. Fingerprinting englobes most often data hiding techniques and cryptographic protocols. Fingerprints have to resist to collusions attacks. It must be very difficult for a set of users to collaborate together and alter fingerprints by merging their copies. Data hiding techniques for fingerprinting can be for instance watermarking techniques but data can also be physically hidden in the media that support the data [14]. steganography
2.3
Watermarking
technologies
In this section, we make a short overview of most popular watermarking methods from different universities and companies. In the following "survey", we describe roughly the methods, but we mainly focus on the functional aspects of the methods. Section 4 will describe the watermarking algorithm that has been chosen in the Aquarelle project. - In [4], a "visible watermark" is introduced. It clearly identifies the ownership,
and allows all image details to be seen through it. It is robust enough such that any attempt to remove it alters the image. However, the main drawback is of course that it reduces the quality of the picture. - The algorithm introduced in [15] is envisaged to have application in image tagging, copyright enforcement, counterfeit access and controlled access, although the authors do not explain how to use their algorithm to perform these functionalities. - In [1], electronic marking is applied to textual document, by word or line shifting. An indiscernible codeword is added to the document, and it identifies the registered user to whom the document has been delivered. This can be applied on non-ascii text representation, and not to images. - In [10, 17], the following requirements for an invisible copyright label are defined:
244
-
-
-
-
-
3 3.1
Daniel Augot et al. 1. The image must contain a label or code, which marks it as the property of the copyright holder. 2. The image data must contain a user code, which verifies that the user is in legal possession of the data. 3. The image data is labeled in a manner which allows its distribution to be tracked. Unfortunately, the invisibility of the watermark is not totally guaranteed. The watermark is embedded in chosen DCT coefficients of 8x8 blocks. In order to be resistant against compression, the chosen DCT coefficients have to be quantified, which means that marked blocks are altered. This feature can be damageable for high quality pictures such as museum images. The authors of [5] consider watermarking using spread spectrum [8]. Their mark identifies ownership and the user who got the image. This method has good robustness properties, however, both the original image and the marked one are needed to check the mark. In [2], an algorithm for invisibly marking an image is introduced. Here again, it is needed to have both the original image and the watermarked one to check the mark. In [16], the authors focus on the notion of data hiding. They envisage the application to the problem of copyright proving and to the content integrity, but the paper does not describe the functional aspects clearly enough. The paper [3] is cryptology oriented and makes abstraction of the marking algorithm. The authors discuss the main problem of the fingerprinting technique. In [14], the author describes a global scheme to trace people who abuse broadcast encryption schemes and introduce the interest of fingerprinting. Nevertheless, this paper remains theoretical. Finally, there exist now quite a few companies involved in the area of watermarking, such as DIGIMARC, Signum Technologies, R3S, Mediasec Technologies or CRL. Some bigger companies are also currently developing watermarking techniques, such as IBM, AT&T, SONY, NTT, Matsushita, NEC, Philips. It is quite difficult to collect information about their technologies. Aquarelle
framework
AquareUe aims and technical objectives
Documentation - in a broad sense - is becoming one of the major productions of museums and cultural organizations. To organize exhibitions and produce information products as books or CD-ROM, cultural organizations, museums, libraries, photo-agencies, research laboratories or publishers, have to share information. The aim of Aquarelle is to present a global system for accessing this information. The Aquarelle users are museum curators, urban planners, commercial publishers and researchers. Aquarelle will provide the user with tools for searching
DHWM: managing watermarking keys in Aquaxelle
245
information and browsing in the folders available on the network of connected servers. The main technical objectives of the Aquarelle project are the following: to develop a unified resource discovery system for the cultural heritage information available in archive and folder databases; to provide facilities supporting information access through hypertext navigation as well as information retrieval by querying. The archive server databases contain the images that we want to protect. We consider that organizations running archive servers either own the images they contain, or distribute images belonging to another entity. In both cases, we consider that the organization running an archive server is willing to protect the images which are on the archive. Serious threats result from the facilities of Aquarelle, and more generally of Internet-based information systems. The ease of getting copyrighted images through the system poses challenges for traditional intellectual property regimes. In the Aquarelle system, all users are registered, and have a legal access to the digital resources. So the main danger is further dissemination of the images by legal users, be it voluntary or not. The Aquarelle architecture is quite intricated and sophisticated. For our purpose, we only present the following simplification of the architecture in figure. The front-end user uses a standard browser or the Aquarelle advanced browser. He connects to the "User Client Server WEB", which provides front pages and various cgi-bin. The "access server" is the key entrance to the Aquarelle system. In that place, users are registered, and data is transmitted through that node. Through the Z39.50 protocol, data may come directly from the archive servers, or from the "folder server", where folders contain meta-data, and may be published by publishers or cultural organizations.
3.2
Users r e q u i r e m e n t s and s y s t e m constraints
Cultural partners were concerned with the possibility that their images could be re-used in an unauthorized way. For instance a pirate can make fraudulent copies of images for selling cheap CD-ROMs. Their main concern was to able to prove, if presented with such fraudulent copies, their ownership of images. Such a possibility is a detering threat to potential cheaters. Because of such an objective, they wanted the watermarking system to present a high level of efficiency for protection. This means that the embedded mark must be very robust. Basic cryptographic commandments imply that the algorithm must be parameterized with some key. Following the strong requirements for cultural partners, a unique key will be used for each image. Now we briefly describe the Aquarelle architecture. Users connect to the system through an Aquarelle Access Server. Users uses their user terminal to connect on access servers, where they can log in. Once logged they are able to formulate their query. This query is broadcasted to folders and archive servers, and a result set is presented to the user. Connection between users and Access Servers is performed through the H T T P protocol. Connection between Access
246
Daniel Augot et al.
Servers and core data servers are performed using the Z39.50 protocol. The login and password are managed by the Access Server. For the Z39.50 connection between the Access Server and the core data Servers, a login and user password is provided. This enables to authenticate the Access Servers with respect to the core data servers. T h e r e is no u s e r aut h e n t i c a t i o n at this level. This a simple security mechanism to ensure that only authorized connections have occured to the core data servers. Since users are unknown at the archive server level, there is no possibility for fingerprinting images here. Fingerprinting can not beat the access server level, since information cannot be cross-compared between users queries and delivered information. Furthermore access servers are seen as being too busy at logging users, managing connections, formulating queries, collecting and assembling results sets. They are not able to perform an expensive on-line operation as fingerprinting. It is also easier to implement off-line watermarking at the archive server using an easy to manage software piece. 4
4.1
Properties
of the
algorithm
Rough description
The watermarking technique used here is based on Human Visual System Model that guarantees that the watermarked picture has the same quality as the original. The watermark is a correlation pattern, which has strong correlation properties. The technique is additive since this watermark is added to the original to ensure protection. W h a t is a d d e d t o t h e i m a g e ? Basically, the watermark is composed of 16 preprocessed replications of one basic correlation pattern. This pattern, that allows to identify copyright ownership, is an image of the same size as the original picture. The 16 replications are modulated and preprocessed before composing the watermark that will be added to the original. This process will be described more in detail in Section 4.1. The basic pattern is composed of black and white rectangle of pixels. Each rectangle stands for one bit. Those bits form MLS sequences. MLS sequences [11] are binary sequences having very good correlation properties, since MLS sequences are nearly orthogonal to their shifted versions. This feature is taken into account during the retrieval process. The pattern is illustrated in Figure 1. An additional security feature was added, the sequences bits are pseudorandomly mixed before being mapped into the rectangles, with the use of a secret key. H o w is it a d d e d ? The 16 parts of the watermark are generated independently. Each part consists of a modulation of a basic pattern at a secret frequency and a secret orientation, determined by a secret key and a pseudo-random generator. For this frequency and this orientation, a perceptual mask is computed. Figure 2
DHWM: managing watermarking keys in Aquarelle
247
Fig, 1. Example of one basic pattern
shows one example of Lena filtered with a perceptual mask. It serves to adjust the level of the modulated pattern to have it invisible when added to the image. The white regions correspond to areas where the activity is high at the corresponding frequeny.
Fig. 2. Energy of Lena in middle frequencies
Figure 3 shows the original, the watermark and the watermarked image, which is the addition of these first two. Finally, Figure 4 is the global scheme summarizing all these operations.
248
Daniel Augot et al.
(a)
(b)
(c)
Fig. 3. (a) is the original image, (b) is the watermark, (c) is the watermarked image
How is it decoded? The retrieval procedure is simple. Each part of the watermark is extracted from the watermarked image, by demodulation and filtering, before being added together. The result is an image which is very correlated with the basic pattern if the watermark was present. Autocorrelations are compared to cross-correlations (correlations with shifted MLS) to determine whether an image has been watermarked. 4.2
Robustness
The robustness is provided by the use of MLS and the perceptual mask that allows to embed at a higher level in high activity regions of the image.
compression When the image is compressed at JPEG 10% the watermark can still be recovered, though the quality of the compressed image is very bad. filtering The watermark is still recovered after low-pass filtering (e.g. blurring
7x7). printing The watermark is still recovered after half-tone printing. After redigitizing by scanning, we still recover the watermark. cropping and scanning In this case we need to know the size of the original to retrieve the watermark. 4.3
Invisibility
The quality of the watermarked image is the same as the original image, thanks to the use of perceptual masking.
4.4
Functional aspects
The algorithm, as described in the previous subsections, is well suitable for our purpose.
DHWM: managing watermarking keys in Aquarelle
249
.3 C9
CD
o~
250
Daniel Augot et al.
- It is secured by a secret key. It is optimized for high quality still pictures. - The decoding procedure is a Yes or No decision that determines whether the image is watermarked or not. - The watermark resists perfectly to classical image processing in image editing and distribution. -
5
Functional
5.1
models
The trusted third party
We think that any watermarking algorithm must be public, but parameterized by some key. It does not behave under a static way, since all images would then be embedded along the same scheme (even if the algorithm is adaptative to the image). Should the algorithm be delivered in compiled form, then reverse engineering applies to discover the principle of the algorithm. So we draw the conclusion that any watermarking algorithm must be parameterized with some key. In such a way the algorithm can be made public, and all the secrecy resides in the key. So we consider keyed embedding algorithms such that the knowledge of the embedding key K is needed to verify the watermark. Such an algorithm can offer two modes of operation for verification: 1. The owner reveals the key K to a verifier. The verifier runs the decoding algorithm to check that the image has been marked with the key K. 2. The owner does not reveal the key K , and runs the algorithm for himself. In the first case, the incrusted image is not reusable, since the key K has been shown, and anyone knowing K is able to remove the mark. In the second case, the owner may be a lier, since from an outside point of view, it seems only that the owner is running a black box which outputs YES. He can not be trusted. We solve these issues by introducing a Trusted Third Party, the TTP, which plays the following role: - The T T P knows the secret key K. - The T T P will never reveal the key K . The T T P runs the decoding algorithm, outputs the answer and never lies. -
Furthermore the T T P is highly secure, from many points of view (see Section 6.2). The secret K can not be violated, and there can be no impersonnification of the T T P .
Note It is important to note that the T T P introduced here is not a registration authority of copyright-ownership. The T T P will trust the copyright-owners who wish to use its services, and will not check whether the image belongs or does not belong to the copyright-owner using its services. We shall see that the T T P can defeat image-owners trying to cheat and to use its services for watermarking already protected images.
DHWM: managing watermarking keys in Aquarelle 5.2
251
Entities
In Aquarelle we will consider the following entities: TTP CO CO-ID IM IM-ID D IM* IM** K-IM User
the Trusted Third Party the owner of the copyright of IM. a string which is the unique image identifier of CO the original image a string which is the unique image identifier of IM the date the watermarked image the watermarked image, eventually modified by some hacker the secret used to perform the embedding for that particular image a sample user of the Aquarelle system
From the Aquarelle point of view, we see the CO as an archive-server manager. 5.3
A first f u n c t i o n a l m o d e l
We present a first functional model for clarity purpose. It is NOT the one which is implemented, but it is useful to understand the next one and its advantages. The protocol for watermarking using the above algorithm runs in 3 phases: 1. The Copyright-Owner sends IM, IM-ID and CO-ID to the T T P 2. The T T P generates a random key K-IM, watermarks the image with K-IM, and securely keeps IM-ID,CO-ID,D,K-IM in a table. 3. The T T P sends the watermarked image IM* back to the Copyright-Owner, along with CO-ID, IM-ID. The Copyright-Owner may now deliver the watermarked image IM* through the Aquarelle system. The verification phase is as follows: 1. A user submits an image IM**, IM-ID and CO-ID. 2. The T T P replies YES or NO. The date field in the database of secret keys is introduced to prevent the following scenario. An image-owner CO1 wants to cheat: he picks an image that has been marked by CO at date D, and submits it to the T T P with the identifiers CO1 and IM-ID1 for watermarking. Both CO and CO1 are able to have their watermark checked by the T T P. But since CO1 submitted the image after CO, then the date field D1 related to CO1, IM-ID1 is bigger than the date D from the original query, the fraud can be detected. But the above watermarking protocol has the two following disadvantages. First the image must be transmitted over a secure line for the first phase, since an eavesdropper may steal the unmarked image, which has no protection at that time. Secure line may mean encryption, which is a difficult issue because of
252
Daniel Augot et al.
various regulations on that topic in European countries (notably France). The second disadvantage is that there are two exchanges of images between the CO and the TTP, which makes a large amount of data to be transmitted. The improved watermarking protocol presented below solves these two problems.
5.4
Using the Diffie-Hellman protocol
The improved model uses the Diffie-Hellman key-exchange protocol [6]: it enables two persons to share a common secret, without any secure communication; it gets its security from the difficulty of calculating discrete logarithms in a finite field. The protocol for watermarking runs in 3 phases (see figure 5). 1. The Copyright-Owner and the TTP share a common secret key K-IM using the Diffie-Hellman protocol (each of them sends to the other his DiffieHellman half public key, say KA for the CO and KB for the TTP). 2. The TTP securely keeps I IM-ID,CO-ID,D,K-IM I secret. 3. The CO marks the image with the key K-IM.
Computes K-IM
Computes K-IM
Stores IM-ID,CO-ID,D,K-IM Marks the image
KA,IM-ID,CO-ID
V6 -]
*l
Ks IM*
Fig. 5. A scheme for watermarking images with the Diffie-Hellman protocol
This protocol is an improvement of the previous one since no images are exchanged between the CO and the TTP, so there is no need for a secure communication. Second the data exchanged between the CO and the TTP is very small, a few thousands bits, say. We name this protocol the DHWM protocol, standing for "Diffie-Hellman protocol for Water-Marking'.
DHWM: managing watermarking keys in Aquaretle 6 6.1
253
Integration into the Aquarelle prototype Consequences on the AquareUe architecture
The T T P is implemented here as an H T T P server, and the protocols between the Aquarelle's users and the T T P are built on the H T T P / 1 . 1 protocol. We note that, ideally, an image is watermarked once for its whole lifetime, and that the process can be done "off-line". So the T T P running the DHWM protocol suite may not be a part of the architecture, and since images are stored on the archive servers, the DHWM protocol needs only to be run between the T T P and the CO of the archive servers. The watermarking algorithm from UCL focuses on the invisibility of the mark, and, from that point of view, is quite performant. This means that high quality images will not lose their quality after being watermarked. Nevertheless, we think that the COs running the archive servers wish to keep an unmarked copy of their image. In that case, this means that there is a duplication of images: watermarked images to be retrieved through the Aquarelle links and unwatermarked originals that the COs wish to keep by themselves, such that they remain unreachable through the network.
6.2
Security considerations
There remains three main issues to address in the DWHM protocol.
Random numbers Running the Diffie-Hellman protocol, we need random numbers for generating the Diffie-Hellman half public keys. We chose here to use a self-shrinking generator [13]. Known attacks against this kind of pseudorandom generator only apply when the opponent is able to look at a very long string of bits. Here, we only need small pseudo-random strings for our protocol, and we are then protected against the attacks on our generator. Moreover, it is very fast, and this is an important point since the T T P could perform the random numbers generator very often. The weak point is that the state of the machine must be stored in a file, and attacks on this file may be considered. So this file (Dn_seed in our implementation) must be protected.
Security of the TTP database It is more obvious that the secrets maintained by the T T P must not be discovered by anyone. A cryptographic solution may consist in encrypting the IM-ID field in the database with a key only known by the TTP, but since the T T P acts automatically, this key must be stored somewhere. So the problem of protecting the file where the key is stored still remains. We believe that this problem is more related to computer security than to cryptology. In our implementation, the file is simply protected by usual Unix rights, and only the H T T P server is able to read this file. We leave the problem to computer security specialists, and suggest to use specialized software for this issue. We also suggest to limit the Internet Protocols that are used by the TTP.
254
Daniel Augot et al.
A u t h e n t i c a t i o n While the Diffie-Hellman protocol is designed to be protected against an eavesdropper (i.e. a passive attack), it does not offer protection against active attacks. We mainly think of authentication: it is a main issue that the CO and the T T P can be ABSOLUTELY assured of each other identity when they run the DHWM protocol to share a common secret key. Since the protocol is built onto the H T T P / 1 . 1 protocol, any security tool or software for authentication for the World Wide Web is convenient here. We think that the COs must be registered by the T T P , and the T T P must not be subject to an impersonnification attack. This issue is not covered by the D H W M scheme. Many authentication schemes are proposed for the WEB and for the H T T P protocol. H T T P / 1 . 1 provides a better authentication tool than H T T P / 1 . 0 (login,password). SSL protocol also enables authentication, with heavier tools. Nevertheless, the Diffie-Hellman protocol can integrate authentication, using a three-round protocol instead of a two-round protocol [7]. This protocol combines Diffie-Hellmazl key exchange and authentication, and is the basis of the Photuris protocol for IP security.
7
Conclusion
In a system like Aquarelle, where images are distributed to registered users, we protect the images with a watermarking solution. The chosen algorithm from the catholic University of Louvain has very good properties with respect to robustness, invisibility, resistance to J P E G compression. It offers a low functionality, since it actually embeds a single bit of information in an image. Using the DHWM scheme, this small amount of information is turned into a copyright protection system, using a Trusted Third Party. A simple implementation using an H T T P server has been made. This enables to use any emerging security tool for the WEB to improve the security of the model.
References 1. J. Brassil arts S. Low, N. Maxemchuck, and L. O'Gorman. Electronic marking and identification techniques to discourage document copying. In IEEE INFOCOM'94 - Networking for global communications, 1994. 2. F.M. Boland, J.J.K. O' Ruanaidh, and C. Dautzenberg. Watermarking digital images for copyright protection. Image Processing and its Applications, pages 326-330, 1995. July. 3. D. Boneh and J. Shaw. Collusion-secure fingerprinting for digital data. In D. Coppersmith, editor, CRYPTO 95, number 963 in Lecture Notes in Computer Science, pages 452-564. Springer, 1995. 4. G. W. Braudaway, K. A. Magerlein, and F. Mintzer. Protecting publicly-available images with a visible image watermark. Technical Report RC 20336 (89918) 1/15/96, IBM Research Division, 1996.
DHWM: managing watermarking keys in Aquarelle
255
5. I. J. Cox, J. Killian, T. Leighton, and T. Shamoon. Secure spread spectrum watermarking for multimedia. Technical Report 95-10, NEC Research Institute, 1995. 6. W. Diffie and M.E. Hellman. New directions in cryptography. IEEE Transactions on Information Theory, IT-22(6):644-654, 1976. 7. W. Diffie, P. C. van Oorshot, and M. J Wiener. Authentication and authenticated key exchanges. Designs, Codes and Cryptography, 2:107-125, 1992. 8. R. C. Dixon. Spread Spectrum with Commercial Applications. Wiley, third edition edition, 1994. 9. David Kahn. The history of cryptography. In R. Anderson, editor, Information Hiding, number 1174 in Lecture Notes in Computer Science, pages 1-5. Springer, 1996. 10. E. Koch and J. Zhao. Towards robust and hidden image copyright labeling. In Proe of 1995 IEEE Workshop on Nonlinear Signal and Image Processing, 1995. 11. R. Lidl and H. Niederreiter. Finite Fields. Number 20 in Encyclopedia of Mathematics and its Applications. Addison-Wesley, first edition, 1983. 12. K. Matsui and K. Tanaka. Video-Stenography: How to embed a Signature in a Picture. IMA Intellectual Property Proceedings, 1(1):187-205, January 1994. 13. W. Meier and O. Staffelbach. The self shrinking generator. In R.E. Blahut et al, editor, Communications and Cryptography: Two sides of One Tapestry, pages 287295. Kluwer Academic Publishers, 1994. 14. B. Pfitsmann and M. Schunter. Asymmetric fingerprinting. In U. Maurer, editor, EUROCRYPT 96, number 1070 in Lecture Notes in Computer Science, pages 8495. Springer, 1996. 15. R.G. van Schyndel, A.Z. Tirkel, and C.F. Osborne. Towards a robust digital watermark. In S.Z. Li, D.P. Mital, E.K. Teoh, and H.Wan, editors, ACCV'95, Second Asian Conference on Computer Vision, number 1035 in Lecture Notes in Computer Science, pages 504-508. Springer, 1995. 16. N. Morimo W. Bender, D. Gruhl. Techniques for data hiding. In Proceedings of the SPIE, February 1995. San Jose CA. 17. J. Zhao and E. Koch. Embedding robust labels into images for copyright protection. In Peter Paul Klaus Brunnstein, editor, KnowRight 1995, Schriftenreihe der sterreichischen Computer Gesellschaft, Band 82, pages 242-251. Oldenbourg Verlag, 1995.
The "Ticket" Concept for Copy Control Based on Embedded Signalling J.P.M.G. Linnartz Philips Research, WY8, Holstlaan 4 5656 AA Eindhoven, The Netherlands linnartz@na~lab, research, philips, c o m
A b s t r a c t . This application-oriented paper discusses the use of electronic watermarks (also called embedded signaling) for copy control. Playback Control and Copy-Once are described. The 'ticket concept' is presented to provide these functionalities. Although the ticket shows similarities with a digital signature, there are essential differences. For instance, the ticket allows typical modifications of the content, which are common practice in transmission, storage and presentation of video. The concept is part of a proposal currently under investigation for standardization of DVD/CPTWG copy control. This paper also compares the ticket concept with other solutions, such as embedding a secondary mark at the recorder and using a signature scheme.
K e y words: Protection of IPR, watermarking, embedded signaling, copy control, copy once, playback control, Copy Generation Management System (CGMS), Digital Versatile Disc (DVD).
1
Introduction
While digitM multimedia technology opens many opportunities for new applications and services, content owners are afraid to loose revenues as copies of digital content can be generated rapidly, perfectly, at large scale and without limitations to the number of generations of copies. As copy management involves demanding and conflicting requirements, the issue has come on the "critical time path" of the market introduction of several digital products, including the Digital Video Broadcasting (DVB), Digital Versatile Disc (DVD), the IEEE 1394 digital interface and improved digital audio carriers (e.g. Super-Audio CD). The image quality of digital video disks provides a significant improvement over the quality of existing home video equipment, such as VHS recorders. For content providers, there is a greater risk of illegal copying using perfect digital reproduction. The problem of protection against illegal copying has been recognised decades ago, but adequate solutions have not yet been found. The first aim of copy protection schemes is to prevent illegal copies from being made. Failing that, the aim is to reduce the value of illegal copies, either by reducing their quality (hopefully to the point of being unwatchable) or by restricting their use. Copy protection
258
J.P.M.G. Linnartz
critically determines the viability of many business concepts in the Information Society, and it is receiving increasing pressure to find better solutions. Tools for copy protection in the digital world are sought in two directions: cryptography and embedded signalling. The old cryptographic paradigm in which Alice communicates with Bob and is wiretapped by Eve, does not hold for copy protection. Rather Alice wants to sell information to an unreliable Bob, but meanwhile she wants to restrict the use of that content. At the same time, Bob must be able to use and copy any similar works of art that he created himself, without any restrictions. It can easily be understood that encryption, as for instance applied to DVD disc sectors, only addresses part of the issue of illegal copying. It avoids that the user has direct access to valuable, highly compressed digital content. Instead, the user (or more precisely his electronic device) must make use of patented decryption algorithms. Conformance to copy protection measures is enforced in licensing contracts. Often such contracts state that a playback device may only offer the content in analogue form to the end user. Watermarking is a another technique, not only useful for active protection of intellectual property against illegal copying but also for new multimedia trading mechanisms. Yet there are several technical issues to be resolved. In the process of determining a watermarking standard for enhanced DVD copy protection, the "copy-once" and "conditional playback" became important requirements. This has lead to the development of methods to signal and dynamically modify the copy state of watermarked content, which we cover in this paper. The outline of this paper is as follows: Section 2 discusses the DVD embedded signalling standardisation requirements. Section 3 addresses play control and copy generation management in a broader context. Section 4 covers the proposed 'ticket' solution, and applies it to play control and generation control, in Section 4.1 and 4.2, respectively. Two other solutions have been proposed for the DVD copy-once requirement, namely the secondary watermark, and the digital signature, which are discussed in Section 5 and 6, respectively. Section 7 covers some generic attacks and potential weaknesses of these systems and Section 8 compares the solutions. Section 9 concludes this paper. 2
DVD
Copy
Control
1
DVD video material is encrypted on disc and the decryption keys are stored in a manner such that an ordinary copier who does not have access to the internal hardware of the disc drive can neither read or write these. Moreover, the decryption, MPEG decoding and D/A conversion are conducted in a more or less tamper resistant environment. This is easier to achieve in a consumer electronic device that is closed black box, but more difficult for personal computers. 1 This section relies heavily on an earlier paper by Cox and Linnartz [1]
The "ticket" concept for copy control based on embedded signalling
259
Hence an attacker has difficulty in getting the digital plaintext. Moreover, if he has access to the ciphertext only, he cannot make an encrypted copy that also has the appropriate keys. A digital copy of the encrypted content will not play unless the keys are also copied. To ensure exportability, the key length has been restricted, which exposes the system to cryptanalytic attacks. An important aspect of this approach is that its places a group of 'compliant devices' in the market, which internally adhere to copyright rules and externally communicate over protected links (e.g. using the IEEE 1394 interface). Lacking globally uniform copyright laws, licensing of cryptographic technology is essential to enforce the compliance to copy control rules. Hence, we observe that cryptography is used more as a tool to bind manufactures to copyright rules than as a copy protection mechanism by itself.
Another possible weakness is the analogue signal. Content must eventually be converted into an analog form to present it to a human viewer. Neither cryptography nor licensing contracts protect these analog signals. The in-theclear video signal is available at a variety of interfaces, including the NTSC or the RGB signal output. To prevent analog copying, DVD players are equipped with an analog protection system (APS). This is a proprietary technology developed by Macrovision that modifies the generated NTSC signal such that most VHS video recorders cannot record a high quality copy despite the fact that the same signal does not affect the TV display. Unfortunately, this system does not protect RGB signals, which are common to PC's and in the Europe SCART connectors, from analog recording and is therefore easily circumventable. Thus, copyrighted video material will find its way into the analog domain.
Considering the level protection to digital content, the most likely method by which a causal copier attempts to make a digital copy is through the digitization of an analog signal. Neither encryption nor the APS signaling prevent playback or recording of this illegal copy, unless A/D video grabbers are equipped with APS detectors, to voluntarily disallow digitization of APS formatted video signals. To provide enhanced protection, a watermark is inserted into the copy-restricted video sequence. It is intended to prevent illegal copying by telling a compliant device not to copy it. Hence, the watermark should survive MPEG-2 compression and digital-to-analog-to-digital conversions, i.e. if the video fidelity remains high, then the watermark should remain detectable. It can also reduce the value of illegal copies by preventing them from being played on compliant devices. This means that consumers will have a choice between a) compliant devices, which can play legal, commercially released discs that were encrypted, but cannot play pirated discs, and b) non-compliant devices, which can play pirated material, but cannot play encrypted discs.
260
J.P.M.G. Linnartz
According to requirements formulated by the DVD Copy Protection Technical Working Group, all possible (video)-content should fall into one of four categories: - Free C o p y (no copy restrictions whatsoever, e.g. home productions) C o p y N e v e r (no copies allowed; e.g. for DVD titles or for films rented from a video store). Despite the legislation in many countries that a customer is allowed to create backups for personal use, the content industry currently intends to predominantly publish content as "never copy". One C o p y Allowed (one generation of copies may be made). Users will be able to make a digital copy, but the system should prevent copies of this copy (or subsequent generations of copies) being made. Applications are primarily for Time Shift Viewing, that is, recording a feature film to be watched at a later time. There is no limit to the number of first generation copies that can be made from the original content. However, this original typically is only available once on the digital output of a Pay TV Set Top Box. Making a backup of purchased content is another legal application, at least in those countries where a Copy Never status is unlawful. According to several court cases, the user implicitly buys the right to copy if royalty fees are being levied on blank recording media. - C o p y No M o r e the copy-state of a recording after being copied a first generation. -
-
Copyright data describing the restrictions on that video's usage should not only signalling of the copy state, but also it should trigger the APS system. Instruction bits for usage of the analog protection system (APS) are 00 Don't use APS 01 Use type 1 APS 10 Use type 2 APS 11 Use type 3 APS In the past the Copy Generation Management System (CGMS) was introduced to provide copy-once functionality. In order to implement the copy once functionality of CGMS, it will probably be necessary to have one or more additional bits in the watermark that can easily be changed by consumer recorders. Ignoring certain extensions, the CGMS bits are 00 Video may be copied freely 01 not used 10 Video may be copied once 11 Video may never be copied A well known weakness of CGMS is that hackers can easily flip a copy state bit to grant themselves the possibility to copy. Even worse, cheap black-box devices can be sold on the market that set the copy state of any content to "free copyable", though these have recently been outlawed in the U.S.
The "ticket" concept for copy control based on embedded signalling
261
Embedded signalling should strengthen APS trigger data and generation management. The Copy Protection Technical Working Group (CPTWG) initiated the Data Hiding SubGroup (DHSG), which released a call for proposals in May 1997 [3]. The requirements placed on watermarking algorithms for the above application differ from those for other applications that are currently in the market, such as identification of ownership. The application of watermarking for copy protection requires a low bit-rate and allows the use of many frames for watermark detection. However, since watermark detectors must be built into millions of low-cost, consumer devices, and since these detectors must work at video rates, there is a very strong requirement that the detector be extremely simple and cheap. Furthermore, since the DVD standard employs MPEG coding, the watermarking method must work well with MPEG. These last two requirements are challenging design specifications. The requirements for the playback and copy generation system are - Detectable in the baseband and/or compressed video - Detector should be very inexpensive both in terms of gate count (hardware) or MIPS (software). Typically, the detector should be implementable in "a few thousand gates", although common belief is that "a few tens of thousands of gates" are needed to satisfy the basic requirements. Extremely low false positive rate. Consumer equipment may not fail to work because of an erroneously trigger watermark detector. Detection in 10 seconds or faster. - No visible artifacts, i.e. very high image fidelity - Tamper resistant, i.e. it should not be easily circumvented or removed. - The watermark should survive color representation conversion from YUV to RGB - Low data rate - Support of generation control -
-
As some of the above requirements appeared a challenge to designers, other requirements were relaxed, in particular those addressing robustness to image transformations. Nonetheless, the watermark should also survive: - Compression Decompression Digital-to-Analog - Analog-to-Digital - Standards conversion, e.g. analog video recorder (VHS) -
-
Several solutions have been submitted in response to the DVD Call for Proposals [3], including systems designed by Hitachi, HP, IBM, Macrovision/Digimarc, NEC, SONY, Philips and Pioneer. While the original CFP only recognised the embedding of a secondary watermark as the technical solution for the generation control, other solutions surfaced. Ignoring some details that address the
262
J.P.M.G. Linnartz
coexistence of encryption and watermarking, we distinguish three categories of approaches: 1. Embedding a secondary watermark when the copy is made. The NEC proposal follows this approach. 2. The video that may be copied contains a permission ticket, which is stripped by the recorder. The Philips proposal follows this approach. 3. The video that may be copied contains a signature-type of permission "token", which is stripped by the recorder. The signature solution resembles the ticket concept in that both add a signal to the content that can be removed easily. The IBM proposal follows this approach. In the following sections, we present the design rationale behind the ticket concept and compare it with the other solutions. 3
Record
and
Play
Control
Formally speaking, this copy protection approach relies on a form of publickey watermarking, i.e., user devices must be able to read the watermark, but this should not reveal how a watermark can be erased. Such schemes have not yet been found [1,2] and most currently proposed systems do formally not satisfy this requirement. Current thoughts (as for instance expressed in D V D / C P T W G / D H S G standardisation) are that the watermark detector is embedded as a tamper resistant element of the electronic chip which controls the record and playback engine. In the remainder of this article, we assume that an appropriate watermarking scheme exists such that consumer devices can verify the watermark, but not erase the watermark from the content. Watermarks can only be embedded by agencies that have access to the embedding (algorithm and its) keys. However, Linnartz et al. have shown that such solution potentially is vulnerable to a sensitivity attack [6], in which the tamper-proof detector box is used as an oracle that reveals up to one bit of information about the watermark secret per experiment. The most basic and most common approach is record control. The recording device detects the presence of a watermark and inhibits copying this content. Record control prevents a casual consumer from copying copy-protected, i.e., watermarked material onto a recording device. A pirate who is interested in illegal copying may not only attempt to tamper with the watermarked image, but can also attempt to circumvent the copy control mechanism while leaving the watermarked content unchanged. The most trivial attack is to tamper with the output of the watermark detector and modify it in such a way that the copy control mechanism always sees a "no watermark" situation, even if a watermark is present in the content (Figure 3). Since hackers and pirates more easily can modify (their own!) recorders but not their customers' players, playback control is a mechanism that detects watermarks
The "ticket" concept for copy control based on embedded signalling
263
Fig. 1. Fundamental weakness of Record Control.
during the playback of discs. The resulting tape or disc can be recognized as an illegal copy. An essential strengthening of the system is to prohibit playback of content if it appears to be an illegal copy. In its simplest from, watermark content is played back only if it comes from original media (e.g. stamped 'silver' discs), but playback is rejected if the watermarked content is played from recordable media (e.g. 'golden' discs). The player recognises the copy state of the content, e.g. by detecting the watermark and comparing this with a physical mark on the disc. Only if the carrier's physical properties correctly match with the watermark, the device is authorised to play. Small-scale pirates typically do not run their own ROM disc-pressing equipment, but they have access to commercial pressing facilities. In particular, a protection mechanism that requires a physical modification to the disc press machine effectively hinders many small-scale pirates who do not own the press plant themselves. Enhanced protection can be provided against an attacker who only has access to consumer or unmodified professional equipment (but not to dedicated reverse engineering tools) by making the bit contents of the physical mark inaccessible. Moreover, knowledge of the bit contents should not allow the attacker to press copies of a discs on equipment operated by an commercial company with unmodified pressing machines.
4
Ticket Concept in Play Control
In abstract terms, the ticket method (which is the prime focus of this paper) addresses a m e t h o d to associate a message (which we shall call the ticket) to a watermarked piece of content (say, an image or motion picture), such that - the recipient of the image can detect with high reliability whether or not such an associated ticket file has been issued by the copyright owner - if the recipient has access to a ticket that is presented as being associated with an image, this recipient can verify its integrity and authenticity. T h a t is, he can verify with high reliability whether this ticket is the correct one, that truly belong to this particular image and has been issued by the content owner, and
264
J.P.M.G. Linnartz
- the above two properties still hold after typical signal processing operations (e.g. M P E G compression) have been performed on the content. The third requirement is at odds with the property of 'hard' (see Section 6).
digital signatures
To provide generation management, we modify the associated data in a computationally irreversible manner. Our ticket concept can be viewed as a cryptographically secured Copy Generation Management System (CGMS). (See Section 4.2). Moreover, as explained in the next section, it can also strengthen the play control protection.
4.1
Ticket Concept in Play Control
In its basic implementation, play control allows playback of watermarked content from pressed (silver) discs, but not from recordable or rewritable (silver or green) discs. As we argued before a stronger, say cryptographic mechanism is needed to relate the content to the physical carrier.
Fig. 2. Basic elements of play control: if a player detects watermarks, it checks the presence of an appropriate authorisation mark.
Fig. 3. Relation between copyright data, random seed U, Physical mark P and watermark W. F is a cryptographic one-way (hash) function.
In the ticket concept, the physical mark carries bit string P which is related to the watermark W according to
W = F(P),
The "ticket" concept for copy control based on embedded signalling
265
where y = F(x) denotes a cryptographic one-way function. The bit content of the physical m a r k is embedded on the disc but cannot be read or recovered externally of the drive. During mastering, the physical m a r k and w a t e r m a r k are created from a seed U, according to
P = F(U)
and
W=
F(F(U)),
The predominant requirement of the one-way function F 0 is that it is computationally unfeasible to compute the inverse. C o m p u t i n g an inverse means finding which particular x0 leads to a given y0 with Yo = F(xo). With unfeasible we m e a n t h a t the most efficient method to find such x0 is to exhaustively search all possible bit combinations of x0 and to compute and verify F(xo) for each a t t e m p t . The one-way function is calculated within the drive and recorder, so simple hardware solutions are preferred. From a security point of view, the one-way function itself m a y not need to be kept secret. No other system secret is present in the consumer products, except the secret behind the watermark detection process. One suitable realisation of a physical m a r k is the wobble groove in optical discs. T h e basic principle of this concept is covered in the Orange Book standard for C D - R [4]. It is a superimposed small, periodic variation on the normally linear variation of the radius of the spiral with d a t a on the disc. The wobble bit content cannot be recovered from the output of the disc drive, but it can be detected from the control circuit which stabilises the optical pick-up above the track. The wobble is too fast to be tracked by the servo motor, or its control current, but the wobble signal is present in the feedback loop t h a t positions the pickup. Wobble marks can be inserted on pressed discs, but custom writing these on a recordable disc with normal writers is physically impossible. The other requirements are satisfied by performing a cryptographic one-way function in the control hardware.
4.2
Ticket Concept in Generation Control
To allow one-copy from disc, a special form of playback occurs. The drive must pass a copy ticket to its output. This ticket must allow a recorder to copy the content and a next player to play the copied content. After these transitions, no further copies must be possible. Requirements are - It should be difficult to retrieve the one-copy-allowed value of that mark from copy-no-more content. - T h e consumer equipment should not carry a secret which reveals how tickets can be generated for existing watermarked content. Only the content owner can generate a ticket. Users only can 'clip' the ticket. - Content m a y undergo transformations, during which the copy state must be preserved.
266
J.P.M.G. Linnartz
As illustrated in Figure 3, the ticket changes state during every passage through a playback and recording device. In other words, that ticket behaves as a counter that gets decremented every time it goes through a player or recorder and permits operation of this device as long as this counter is greater than zero. We implement this state changing process in such a way that it becomes (computationally) impossible for attackers to 'increment' the counter again. We may do this using the cryptographic one-way function from the previous section: initially T is some multi-bit number, and during every passage, we apply F(.) to this number and call the result the new ticket. The ticket is a volatile piece of data which can be stored and transferred either in embedded or associated form. This distinction is particularly relevant for internal processing of the signal on platforms such as the PC, where only embedded signals will be retained and associated data can easily be lost. In storage, the ticket will typically be associated as a physical marker of the storage medium. T h a t is, the ticket is stored at locations t h a t are inaccessible by normal hardware products, thus separated from the content. The wobble is a typical example, suitable for professionally released content on stamped discs. For recordable media, potential methods to carry the ticket are by introducing intentional bit errors in a predetermined manner, or to modulate E F M P modulation codes which determine the relation between user bits and pit and lands on the disc [5]. In transit, the ticket is embedded into the signal. Examples are MPEG user_data bits, or data in the blanking intervals of the PAL and NTSC television standard. For a never-copy signal, T specifies that only playback is allowed, i.e., the ticket reduces to the physical mark as discussed in Section 4.1. A "one-copy" signal on a professionally-produced disc carries a ticket for 3 passages: playback, recording, followed by another playback. When in transit from a player or a Set Top Box to a recorder, such the video signal carries a ticket for 2 passages. Ticket T in the stream is replaced by T ~ = F ( T ) during each recording or playback operation, whereby F(.) is a publicly known cryptographic one-way function. Neither the player nor the recorder pass T transparently. The system does not rely on a global secret. Playback Playback will only take place if one of the following conditions is met: - The disc contains a ticket in the form of a physical mark P reserved for stamped media. The content on the disc contains a watermark W. The player further checks the validation ticket. One of the following conditions must be satisfied:
The "ticket" concept for copy control based on embedded signalling
267
Fig. 4. The ticket is clipped (cryptographically modified) during each playback or recorder passage. "Never-copy": The relation W = F ( P ) is satisfied. "One-copy": T is present and W = F ( F ( F ( T ) ) ) is satisfied. In this case, F ( T ) is made available at the output of the drive. - The disc contains a physical m a r k P reserved for recordable media. The content contains a valid W w a t e r m a r k used for professional recording. The validated one-copy T is present, and W = F ( T ) . - The disc contains a physical m a r k P reserved for recordable media. The content is identified as a home recording of a user's personal creation by checking the absence of a watermark.
Recording Recording of copyrighted content is allowed only if the w a t e r m a r k in the stream matches W = F ( F ( T ) ) . T h e recorder always passes the copy control ticket through the one-way function before transferring it to disc. If an attacker manages to modify his recorder and record video even if the appropriate T is not present, a normal player will reject to playback the disc.
Professional Publishing A professional title is produced by initially generating a seed U. From this seed, the following variables are computed: P = F ( U ) , and T = F ( F ( U ) ) ) . The one-way function F and variable P is specified such t h a t P also contains an identifier for the publisher or a serial number of the mastering machine. If a pirate publisher a t t e m p t s to write a particular P, in order to make a bit-exact copy of a copyright disc, that pirate m u s t know U (which he cannot learn from the store bought product) or t a m p e r with his DVD press. Summary
o f Authorisation
The control of the ticket and physical m a r k is summarised in Figure 4. The content starts with generating a seed. FYom this seed physical marks, tickets and w a t e r m a r k s are obtained. Black one-way functions are to be verified. Grey one-way functions have been applied in the past, and cannot be undone.
268
J.P.M.G. Linnartz
Fig. 5. Relation between Physical Mark, Watermark and corresponding authorisation Ticket on a medium indicated in the first column.
5
The Secondary
Watermark
Solution
In the concept formulated in the Call for Proposals of the DVD CPTWG [3], copy generation control was suggested to be performed by embedding a secondary watermark in the recorder. A recorder would accept to record content with a primary watermark only, but would not accept content that has both a primary and a secondary watermark. This approach has the disadvantage that consumer recorder must be able to embed a watermark. This implies that content must made accessible in a form that allows embedding. Typically at least partial MPEG decoding would be required, even if MPEG decoding would neither be required for watermark detection or for the recording function. Reliable embedding, thus with sufficiently strong watermark energy presumably must the adapted to image properties so it requires evaluation of the video using a perceptual model. Maes [8] argued that fixed depth watermarks are also very sensitive to "Twin Peak Histogram Attacks". Section 7 will address some other security aspects associated with this scenario.
6
The Signature
Solution
In another context (e.g. [7]) it has been proposed to make it mandatory that any digital content, watermarked or not, should be accompanied by a digital signature from an authorized agent. A problem with such a copy control concept is the potential leak that occurs because private users must be able to create, store and copy their own personal works of art. Such works must then also be signed. Hence, any user can sign content and can attempt to sign and create illegal copy permissions for copyrighted content. In this paper we address scenarios in which signatures are required only if a watermark is present. In this solution, content is watermarked if copy restrictions apply. Moreover, a hash function is performed over the digital representation of copy-once content
The "ticket" concept for copy control based on embedded signalling
269
and this hash is signed, i.e., encrypted 2 by the content owner.Consumer devices are allowed to copy the watermarked content only if this signature is present. When such devices make a copy, the signature is removed. Two enhancements of this proposal exist 1. Soft signatures: Since signal processing ruins the validity of the signature, the recipient would not be able to verify the integrity of copy state of processed video. In, practice, such modifications often occur, for instance during conversion from one format to another (e.g. digital to analogue, U.S. NTSC into European PAL television standard), transmission using lossy compression such as J P E G or MPEG. It would be inappropriate if the user looses permission to copy due to legitimate processing. Hence, the integrity of the ticket must be verified, even if the image undergoes permissible modifications. Therefor, the hash function is not performed over the digital M P E G representation directly, but over a set of extracted features of the images. These features are chosen such that these are unlikely to change in typical signal processing operations. 2. Countermeasure to replay of the ticket by intentional modification of the content: An attacker can grab the ticket and store it separately and use this to make a second generation copy. To invalidate the ticket and to avoid that the ticket can be misused to restore the one-copy-allowed state, the video can be modified intentionally by a recorder such that the feature extraction mentioned in the previous paragraph yields a different result.
7
Weaknesses and Potential Attacks
Watermarking Weaknesses All solutions addressed here have one common vulnerability, namely that the copy protection is lost if the attacker can tamper with the primary watermark. The attacker can attempt to transform (scale, tilt) the image such that the watermark detector is not triggered. Moreover, he can try to find the watermark secret and erase the watermark pattern. The presence of a watermark detector in every consumer device can be exploited by the attacker to estimate the watermark pattern using the sensitivity attack. See [1, 2] for an overview of attacks and vulnerabilities. In the case of the secondary marking method, an additional weakness occurs because the consumer has access to a watermark embedder, though possibly only in tamperproof encapsulation. Thus, attackers can experiment by the marking random inputs. Next, we will argue that the embedder must have properties that dictate substantial linearity. Thus there is a conflict between robustness and security requirements, which may weaken the security of the secondary marking method. 2 To satisfy complexity requirements, a symmetric encryption algorithm was proposed to the CPTWG.
270
J.P.M.G. Linnartz
Mathematically, given an image I and a watermark W, the watermarked image I' is formed by I' = I + f ( I , W) such that the perceptual differences between I and I ' are tess than "just noticeable", i.e. the watermarked image is constrained to be visually identical (or very similar) to the original unwatermarked image. In theory, the function f may be arbitrary, but in practice robustness requirements pose constraints on how f can be chosen. One requirement is that watermarking has to be robust to random noise addition. Therefore many watermark designers opt for a scheme in which image I will result in approximately the same watermark as a slightly altered image I + e. In such cases f ( I , W) ..~ ( f ( I + c, W). If the recorder changes the watermark status from "one-copy allowed" to "no more copies allowed" by embedding a watermark, the attacker has access to the content before and after this marking. T h a t is, he can create a difference image f ( I , W), by subtracting the unmarked original from the marked content. An obvious attack is to pre-distort the original to undo the mark addition in the embedder. T h a t is, the attacker computes I - f ( I , W) and hopes that after embedding of the watermark, the recorder stores I -- f ( I , W) + f ( I - f ( I , W), W) which is likely to approximate I - f (I, W) + f (I + c, W) .~ I because watermarks are small modifications themselves, thus f ( I , W) ..~ f ( I - f ( I , W), W). Although some countermeasure exist, many schemes are vulnerable to this attack or sophistications of it. Detection of the watermark W is often achieved by correlating the watermarked image with a locally stored reference copy of the watermark. Such correlator is vulnerable to the pre-subtraction attack discussed here. Note that the attacker can not only create a copy that plays on his own or his customers recorders, he can also sell generic circumvention devices.
Scrambled Recording Attack When evaluating the security of the various solutions, it is relevant to consider attacks that appear so fundamental that these are unlikely to be solved by any system. Most importantly it appears unlikely that protective measures can be found to avoid that a hacker can build a storage device that stores 'random bits'. For instance, a hacker can take his copyrighted video sequence and create a copy for his personal use by weakly encrypting all bits. The resulting sequence of 'random bits' can be stored on his device. The watermark detection process is designed to detect the watermark when the video is perceptually meaningful. Thus, a user may apply a weak form of scrambling to copy protected video, e.g. inverting the pixel intensities, prior to recording. The scrambled video is unwatchable and the recorder will fail to detect a watermark and consequently allow a copy to be made. Of course, on playback, the video signal will be scrambled, but the user may then simple invert or descramble the video in order to watch a perfect and illegal copy of a video. Simple scrambling and descrambling
The "ticket" concept for copy control based on embedded signalling
271
hardware would be very inexpensive and manufacturers might argue that the devices serve a legitimate purpose in protecting a user's privacy. Similarly, digital MPEG can easily be converted into a file of seemingly random bits. One way to avoid such circumvention for digital recording is to only allow the recording of content in a recognized file format. Of course this would severely limit the functionality of the storage device.
SteganographicRecording Moreover, it would not make sense to outlaws format non-cognizant bit-copiers because a more subtle circumvention of the copy control mechanism can be used. This attack exploits the technique of steganography (or data hiding) to bypass the watermark detector in the recorder. The method of attack is similar to the argument that laws against the private use of cryptographic encryption can be evaded by steganography. The copyrighted work is hidden in an innocent-looking file of a known and recognized format. For instance the digital MPEG video representation allows the user to embed additional user_data without any significant limitation. Stuff bits may be misused by a pirate to embed a complete stolen MPEG video film as user_data of another video sequence. During playback, the playback platform, e.g. the PC must perform a few additional functions, but this does not need to cause significant performance problems. Thus, any play-control system can be circumvented by a pirate who can insert functionality (e.g. de-scrambling) between the drive and the MPEG decoder, or between de MPEG decoder and the display. Hence, in the evaluation of the various solutions for generation control, we must conclude that systems perform equally in this regard. This makes replay attacks of the ticker or signature less interesting, as the circumvention devices is more sophisticated than a generic black-box, because it needs to store data for every piece of content that is copied. 8
Comparison
Cryptographically, signatures (or "tokens") and tickets behave essentially different their vulnerability to collisions. Collisions can not be exploited to attack the ticket. The philosophy of the ticket purely rests in the difficulty to modify the watermark The attacker has to find a ticket value that matches with this watermark. The one-way function is designed such that finding any input signal for a given output (i.e., an indelible watermark) is not feasible with reasonable amount of hardware and within limited time. This effort would allow him to copy one title. It is critically important to observe that we use the ticket as input to our one-way or hash operation, and that the watermark challenges the output of the one-way function (Figure 6). Through this design the ticket is invulnerable to potential collisions of the one-way function. This concept is essentially different
272
J.P.M.G. Linnartz
Philips: Ticket ~ raM:
Token ~
Content(Watermark) Content(bitrepresentation)
Fig. 6. The conceptual difference between signature and ticket solution rests partly in direction of one-way operation, and in whether the bit representation or the watermark is considered.
from a proposal based on signatures, in which the content is used as input and the authorisation token is a result of the output of a cryptographic function. In such scenarios, an attacker can attempt to modify the content slightly, in the hope that he finds it matching with a particular token.
We summarize relevant differences between the approaches in Table 1. Ticket Secondary mark Hard Signature Soft Signature Yes iPartially, highly No Reliability: Yes depends on type permission to of transformation copy remains and design of after processing feature extraction the video Yes, but recorder Yes Yes Vulnerable t o N/A can distort image replay attacks a little to render of associated tag invalid tag No, provided the No No Vulnerable to Yes soft signature comparison of is cryptographiinput and outcally strong iput of recorder none signature signature secondary Embedded secverification verification rets in recorder watermark keys, feature besides water- detector, water- ikeys extractor mark detection mark embedder each prosignature signature System secrets vider keeps generation ~eneration distributed his own method method among content secret seed. providers, beNo global sides primary secrets watermark Yes, pirate no no Protection no must modify against piracy mastering via ROM machines media
The "ticket" concept for copy control based on embedded signalling
Vulnerable to scrambled recording attack Vulnerable to black-box that modifies state back to copy once
273
Secondary mark Hard Signature Soft Signature Ticket ies yes yes yes
Box must strip iCounterfeit ticket can be Counterfeit secondary mark, generated by any malicious ticketing box designer content provider who requires inversion must know has a licence to release secondary wa- content of one-way termark secret function Perceptual Secondary em- none Might occur none artefacts bedder must if image is work with limimodified to ted complexity, avoid replay artefacts are of signature more difficult to avoid than with primary mark Complexity in secondary hash function, Image feature one-way CE product, watermark encryption extractor, function in addition to detector function hash function, primary water- and embedder encryption mark detector function 2 bits 1 bit Required 1 bit 40-64 bits, watermark but may be payload signalled at low rate Ability to record record record Both support both control control control record and only only only play control Table 1. Comparison of various solutions for generation control
9
Concluding
Remarks
The introduction of the Digital Versatile Disc (DVD) has initiated a substantial effort in enhanced copy control mechanisms. Presumably it will be the first massmarket use of embedded signaling and electronic watermarking. This paper has covered some system concept aspects that occur if watermarks are used for copy control. The ticket concept for record, playback and generation control has been presented. The basic assumption is that the watermark remains fixed throughout the entire chain for transferring content. In each step an authorisation ticket (or
274
J.P.M.G. Linnartz
physical mark) has to be present. A cryptographic counter value is decremented every time the ticket is clipped. An implementational difference between the ticket and signatures is the 'direction' of the cryptographic one-way function. In the ticket concept, random seed data is used as input, the content properties (i.e., the watermark values) are output. In concepts based on digital signatures (as proposed by others), the content is used as input, while the permission item (i.e., the signature) is an output of the one-way function.
References 1. Cox and J.P.M.G. Linnartz, "Public Watermarks and resistance to tampering", IEEE Journ. of Sel. Areas in Comm., Vol.16, No.4, May 1998, pp. 587-593 2. I.J. Cox and J.P.M.G. Linnartz, "Public Watermarks and resistance to tampering", ICIP 97, Santa Barbara, CA, October 1997 3. DVD Copy Protection Technical Working Group (CPTWG), DataHiding SubGroup (DHSG), Call for Proposals for Embedded Signaling, Burbank, CA, May 1997. 4. Orange Book, CD R, CD-RW and CD-MO standard, Coordination Office Optical & Magnetic Media, Philips Consumer Electronics, SWA-1, Eindhoven. 5. J. Hogan, "Exploiting Modulation Code Redundancy", invited paper in Proc. Optical Data Storage, Tucson, AR, April 1997, pp. 88-94 6. J.P.M.G. Linnartz and M. van Dijk, "Analysis of the sensitivity attack against electronic watermarks in images", Workshop on Information Hiding, Portland, OR, 15-17 April, 1998. 7. G. Itkis, "Copyright Protection, Analysis and Authorization Chains", Response to DAVIC CFP 7, London, May 1997. 8. Maes, "Twin Peaks: The histogram attack on fixed depth image watermarks", Preliminary Proceeding of the 2nd International Information Hiding Workshop, Portland OR, April 1998
Panel Session: Watermarking G4rard Eizenberg 1 and Jean-Jacques Quisquater 2 1 ONERA/CERT B.P. 4025 - 31055 Toulouse Cedex 4, France Gerard. EizenbergOcert. fr 2 Universit4 catholique de Louvain 1348 Louvain-la-Neuve Belgium Quisquater@dice. ucl. ac. be
"Watermarking", "Tattooing", and "Fingerprinting" are praised protection techniques in the world of Intellectual Property Protection of electronic documents. However, most of the important characteristics of these techniques remain fuzzy in a large part of the current literature, including the definitions, the considered threats, the precise security requirements to which they are supposed to bring answers, the assumptions about the reception equipment, and so on. Here is a non-exhaustive list of questions on which the panelists will be invited to contribute. The list is long and can receive neither complete nor definitive answers in a panel. However, if important clarifications and appropriate references are brought, the exercise will be fruitful indeed. In what follows - as well as in the title - "watermarking" stands for a generic concept covering all the related techniques. 1. W h a t are the security requirements? How can they be classified? What are the assumptions about the threats to counter, about the reception equipment? 2. Which requirements are most likely to conflict? 3. What are the definitions of the various watermarking techniques and of the related concepts? 4. For each security requirement, are there alternative solutions to watermarking? 5. For each type of document, e.g. text, image, music, . . . , and for each security requirement, do taxonomies of technical solutions exist? 6. What are the known residual vulnerabilities of the existing techniques? 7. What are the main open research problems? 8. Which watermarking techniques are interoperable? 9. How to evaluate the existing techniques? Would it be relevant to define Protection Profiles? The panelists will be experienced actors in the domain. An active participation of the attendees is expected.
Authentication and Payment in Future Mobile Systems Giinther Horn 1 and Bart Preneel 2. 1 Siemens AG, Corporate Technology, D-81730 Miinchen, Germany,
(guenther.hornOmchp, siemens, de) 2 Katholieke Universiteit Leuven, Dept. Electrical Engineering-ESAT, Kardinaal Mercierlaan 94, B-3001 Heverlee, Belgium
(bart.preneel~esat,
kuleuven, ac. be)
Abstract. This article presents an efficient public-key protocol for mutual authentication and key exchange designed for third generation mobile communications systems. The paper also demonstrates how a micropayment scheme can be integrated into the authentication protocol; this payment protocol allows for the provision of incontestable charging. The problem of establishing authenticated public keys through crosscertification is addressed.
1 1.1
Introduction T h e future of mobile s y s t e m s
Mobile communications is one of the fastest growing sectors of the IT industry. For example, in Europe the number of mobile users was 22 million in 1995 and is estimated to reach more than 110 million by the year 2000. While current second generation systems such as GSM (cellular, [ETS1]) and D E C T (cordless, [ETS2]) will continue to play an important role, a new third generation system, the UMTS (Universal Mobile Telecommunications System) is shortly to be introduced in Europe, with commercial UMTS services expected to commence by 2002 [UMTS,ETS3,ITU2]. UMTS will provide a wider spectrum of services than today's systems, ranging from simple voice telephony to high speed, high quality multimedia services, regardless of physical location of the user, using radio frequency access to a convergent network of fixed, cellular and satellite components. 1.2
M o b i l e s y s t e m s security
As for second generation systems, the most fundamental security requirement for UMTS is to ensure that the level of security is at least as high as that in existing wired telecommunications networks. The necessary security features to meet * F.W.O. postdoctoral researcher, sponsored by the Fund for Scientific Research, Flanders (Belgium).
278
Giinther Horn and Bart Preneel
this requirement include: confidentiality on the air interface (which is much more vulnerable to eavesdropping than a wired interface), anonymity of the user and, most importantly, authentication of the user to the network in order to prevent fraudulent use of the system. While these features are already provided in existing second generation systems, they need to be further developed and enhanced by the incorporation of additional features in UMTS. The most important of the new security features in UMTS is that the user must also authenticate the network in order to prevent an intruder from masquerading as a network operator or service provider. This requirement is motivated by the observation that a user may want to make sure that he is connected to the network of an operator whom he trusts. This becomes increasingly important as the number and variety of competing public and private network operators and service providers grows larger. The resulting potential network complexity also suggests that new techniques of managing the cryptographic keys necessary for the provision of these security features may be required. This is a very natural setting for the application of public key cryptographic techniques. Although applied successfully in other areas, public key cryptography has not previously been used in mobile communication environments due to performance constraints. It was not deemed suitable for second generation systems because of the resulting length of messages and the necessary computational loads. To overcome these problems a new protocol was developed for authentication between user and network; it was particularly designed to fit the performance constraints of mobile networks. The protocol is described in Sect. 2 below. Its design exploits the advances in two fields: crypto-controller smart cards (which have a co-processor which efficiently supports public-key cryptographic mechanisms) and elliptic-curve cryptosystems (which permit the use of smaller cryptographic parameters). The new protocol was successfully implemented and tested in the collaborative research project "ASPECT" (Advanced Security for Personal Communications Technologies) which is funded by the European Union under the "ACTS" programme. An extended version of the protocol to include on-line TTPs is presented in Sect. 4. The choice of cryptographic algorithms and certificate formats is discussed in Sect. 5. Another crucial issue in the security of mobile systems is that users may want to protect themselves against incorrect bills. Therefore it may be necessary to provide undeniable evidence that claims related to user charges are correct. The ASPECT project demonstrated the feasibility of a solution to the problem of securely billing for the provision of a value added service using the aforementioned authentication protocol and a suitable micropayment scheme. The payment protocol and its integration with the authentication protocol are described in Sect. 3 below.
2
Authentication
and Initialisation
of Payment
Protocol
In the following, U denotes the user, represented by his User Identity Module, and V denotes the UMTS value added service provider, or, in short, VASP. The protocol presented in this section, however, is also proposed for use between a
Authentication and Payment in Future Mobile Systems
279
user and a mobile network operator during call set-up. The design principles for the protocol can be found in Sect. 2.2.
2.1
Protocol goals
The goals to be achieved by the end of a successful protocol run are: 1. mutual explicit authentication of U and V; 2. agreement between U and V on a secret session key K with mutual implicit key authentication; 3. mutual key confirmation between U and V; 4. mutual assurance of key freshness (mutual key control); 5. non-repudiation of origin by U for relevant data sent from U to V; 6. confidentiality of relevant data sent by U to V. The non-repudiation feature is motivated by the requirement for incontestable charging.
2.2
Principles for the selection o f security mechanisms
One principle in the design of the protocol was to shift as much computational effort as possible from the user side to the network side because it is assumed that the user will be represented by a smart card which has limited computational capabilities. Another principle was to allow for messages that are as short as possible. A way to arrive at shorter messages is the use of elliptic-curve cryptosystems [Mene]. While the protocol does in no way mandate the use of these cryptosystems, it is designed in such a way that their advantages can be best exploited. Another way to arrive at shorter messages is the use of a streamlined certificate format which provides certificates much shorter than X.509 certificates. (For more details see Sect. 4.2.) The choice of the security mechanisms was guided by the following considerations: non-repudiation of data sent by the user requires a digital signature system on the user side. This signature system is then also used for authentication of the user for efficiency reasons. For session key establishment, a key agreement scheme (similar to the E1Gamal scheme [E1Ga]) with implicit key authentication [ISO3] of the network was chosen because then entity authentication of the network can be obtained with little extra cost. (See discussion in Sec. 2.5 below.) The protocols were chosen in such a way that their description is independent of the choice of the signature systems used by the user and the certification authority, respectively. They are also independent of the choice of the finite group in which the exponentiations required in the key agreement scheme are computed. 2.3
Prerequisites
Cryptographic functions We assume that the following cryptographic functions can be executed by any participant:
280
Giinther Horn and Bart Preneel
- A symmetric encryption function where {M}K denotes the encryption of message M with key K . We assume t h a t the cryptographic algorithm is resistant against known cryptanalytic attacks such as code-book attacks and chosen plaintext attacks. A (pseudo-) random number generator. - Functions h l , h2 and h3 which are specified below. Multiplications in a finite group G with generator g, (e.g., the multiplicative group of a finite field or a subgroup of an elliptic curve), in which the Discrete Logarithm Problem is hard.
Further
prerequisites
- The identity idV of V is assumed to be known to U at the start of the protocol. - V has long-term secret and public key agreement keys v and gV respectively, where g is as above. U possesses an asymmetric signature system with secret signature transformation Sig v. In case of a signature with appendix, S i g v 0 denotes only the appendix. - There is a valid certificate certU (issued by a certification authority CA U with identity idCA U) on the public key of the asymmetric signature system of U, available at U. - There is a valid certificate certV (issued by a certification authority CAV with identity idCAV) on the public key agreement key gV of V, available at V. - U possesses the public key necessary to verify certificates issued by CAV. - V possesses the public key necessary to verify certificates issued by CA U.
Functions
h l , h2, h3
Here, we explicitly list the requirements on the functions hl, h2 and h3. T h e following definitions are useful here: definitions 1, 2 and 3 are well-known [MvOV, p. 323ff.], while definitions 4 and 5 are weak forms of the MAC-property and of pseudo-randomness which we believe are sufficient in our context. Concatenation is indicated by tl. 1. A function h is preimage resistant (one-way), if for essentially all outputs y = h(x) it is computationally infeasible to find any input x' with y = h(x') (x ~ m a y or may not be equal to x). 2. A function h is partial-preimage resistant (local one-way), if for essentially all outputs, if part of the input is known it is still hard to find the remainder, i.e. it is not easier than brute-force. 3. A function h is collision resistant (strong collision resistant), if it is compurationally infeasible to find two inputs x' # x which are m a p p e d to the same value y = h(x) = h(x~).
Authentication and Payment in Future Mobile Systems
281
4. A function h is weakly computation resistant (weak MAC-property) if it is computationally infeasible to find a pair (x, h(Kllx)) without knowing R', provided that no other pair (x', h(KIIx')) is known (here the value of x can be chosen by the opponent). 5. A function h is a weakly pseudo-random function if, for secret random key K not used before and for known random x, the output h(KIIx ) is indistinguishable from a random output. NOTE ON THE DEFINITIONS. There are some dependencies between these properties, e.g., a partial-preimage resistant function is especially preimage resistant. Note also that our definition 4 is indeed much weaker than the usual MACproperty (cf. [MvOV, p. 325]) because we assume for the weak MAC-property that an attacker does not even have one valid example (message, MAC) available against which he could test attempted forgeries. Definition 4 is not strictly weaker than definition 5 since definition 4 allows the opponent to choose the value of x. Definition 5 can be made operational in the following way: h is a publicly known function. Given parties Alice and Eve, Alice chooses random parameters Ki, xi, Ri (i = 1, 2,...) and computes Hi := h(Killxi). Alice makes xi, Ri and Hi known to Eve and keeps Ki secret. Eve tries to break the function h by guessing which of the two parameters H~ and Ri is computed from Ki and xi by applying h. If Eve has only about a 50% chance to make the correct guess then h fulfils definition 5. Note that this does not imply that such a function h (in practice a hash function) can be used to define a pseudo-random function because the opponent has no control over the input xi and can only observe a single input-output pair (xi, H0 for each choice of K~. Nothing is implied by definition 5 about repeated applications of h with the same value for Ki. We require that the functions hl, h2 and h3 are hash functions (i.e. they map inputs of arbitrary finite lengths to fixed length outputs) which are easy to compute and that
1. hl is a partial-preimage resistant, weakly computation resistant and weakly pseudo-random function. 2. h2 is a partial-preimage resistant, weakly computation resistant function. 3. h3 is collision resistant. There are a number of practical cryptographic functions which are assumed to be at least collision resistant, e.g., SHA-1 [FIPS,ISO2], RIPEMD-160 [DoBP,ISO2] and ISO/IEC 10118-2 [ISO2]. It is common practice in cryptographic applications to assume that these hash functions are also preimage and partial-preimage resistant, though this issue needs further investigation. Due to [MvOV, p. 331] hash functions should not be used as pseudo-random functions unless the randomness requirements are "clearly understood". This does not contradict our assumption that a practical hash function fulfils our definition 5, (see corresponding note following definition 5). Let h be one of these well-known hash functions. Then we claim that the functions 1. hl(x) := h(x)
282
Gfinther Horn and Bart Preneel
2. h2(x) := trunc(n, h(x)) where trunc(n, y) returns the n least significant bits of y (n being significantly shorter than the length of the hash value, but long enough to prevent successful guessing). 3. h3(x) := h(x) satisfy our requirements.
2.4
Protocol description
USER U
VASP V
g~ll idCAV )
rllh2(KIIrll idV)l I ch_data tl TVII certV {Sigy(h3(g~llgVHrl] idV II ch-datall TVIIc~TII IV) )II certU H(~THIV} K )
Fig. 1. The authentication protocol
At the start of the protocol (see Fig. 1), U generates a random number u and then computes g~ which he sends to V, together with the identity idCAV (see remark on certificate verification keys in Sect. 2.5). On receipt of the first message, V does not know with whom he is communicating. V generates a random number r, computes (g~)~ and then a session key K := hl((g~) v lit). He demonstrates knowledge of the session key K by computing the hash value h2(KIM I idV) which he sends to U, together with r, his certificate certV and additional data needed as input to the payment scheme (a time-stamp T V and charging-relevant data ch_data). On receipt of the second message, U computes the key K = hl((gV)~llr ). He then checks the hash value h2(KIIrll idV) and he thus knows that V actually has the session key K. U generates random numbers I V and a0, computes aT = TV(aO) and signs the hash value of the concatenation g~llg~llrll idY II ch_data T V IIO~TII IV. Here, FIV, T, So, O~T and IVhave significance only for the payment scheme, not for the protocol goals stated above, and are therefore explained in Sect. 3 below. U concatenates the signed hash with his certificate certU and with OlT and IV. U then encrypts the concatenated parameters with K. On receipt of the third message, V first deciphers the message elements using K. He retrieves the certificate certU, and after the verification of the certificate, he can verify U's signature. V stores the signature and the corresponding message for later use in the payment scheme (see Sect. 3 below).
Authentication and Payment in Future Mobile Systems 2.5
283
Discussion
General K e y confirmation and authentication of V: the inclusion of h2(KI]rll idV) in the second message gives key confirmation from V to U and hence, together with key freshness (cf. remark 3 below) and implicit key authentication of V, also entity authentication of V (see, for example, [RuvO]). Note that explicit key confirmation demands that the session key is used in the key distribution itself1. K e y confirmation and authentication o f U" the encryption of the certificate certU with K in the third message provides key confirmation. The inclusion of gullgVllr in the signed part of the third message provides implicit key authentication from U to V as it confirms the origin of g~ and links this value to gV and r. The inclusion in the signed part of the third message of the random number r generated by V provides entity authentication of U. R e p l a y of old keys a n d key freshness: the key K is constructed using the random number r generated by V in order to prevent an old key K being forced on V by U. This, together with the use of the random number u generated by U, guarantees key freshness to both sides. Non-repudiation: U's signature provides non-repudiation of the signed data. Note also that in order to achieve non-repudiation it has to be ensured by other means that the certificate certU of U is valid. Furthermore, in order to achieve non-repudiation, it may be required in addition that V submits Sigv(...) to a trusted time-keeper who signs Sigv(...) together with a timestamp and returns it to V. Otherwise, U could repudiate a signature claiming that the signature was generated by an impostor after the certificate had been revoked. Whether this additional measure should be implemented, however, depends on the security policy and on a trade-off between a higher security level and additional effort. Confidentiality of the data in the third message: it is ensured by encrypting the data with the symmetric encryption function using K. E n c r y p t i o n of the signature: the signature is encrypted for two reasons. Firstly, in order to guarantee that the signer knows the session key K. Secondly, in order to protect the user's identity; if the signature would not be encrypted, an attacker would be able to detect the identity of U by verifying the signature. This might be possible in a scenario where the attacker has access to the public keys of the users and he assumes that the originator of the signature is one of a small group of users (here 'small' depends on the time needed to verify a signature). Inclusion of idV in the second message: this prevents so-called source-substitution attacks (as described in [MvOV, remark 12.54]). (These can also be 1 Some protocols do not use the session key in the key distribution protocol itself in order to avoid leaking information on the session key. Such leaking of information on the session key is minimized in our protocol by the inclusion of random numbers in the encrypted messages and by the assumptions on the symmetric encryption function.
284
Giinther Horn and Bart Preneel
avoided by making sure that the TTP checks that the user is in possession of the corresponding private key before it issues a certificate on the public key. The inclusion of idV is simpler and prevents the need to detail the duties of a TTP in this context.) Inclusion o f idV in the third message: idV must be included in order to indicate the intended recipient of the signature. This is related to the use of the signature in the payment scheme: the signature, together with payment tokens sent in the payment protocol (see below), serves as a proof of payment. So, anyone who intercepts the signature and the tokens and presents them to the broker of the payment scheme could collect money fraudulently. Certificate verification keys: since U has limited space for storing public keys, U will not be able to verify certificates issued by an arbitrary certification authority. Therefore, in the first message, U tells V the identity idCAV of a certification authority CAV (certification authorities CAV1, CAV2,... ) whose certificates he is able to verify. We assume that V has got a certificate issued by this authority (one of these authorities). If this is not the case then an extended version of the protocol has to be run to obtain such a certificate (see Sect. 4.1). Identification of m u l t i p l e users: when the protocol is run concurrently by many users it is necessary to identify which message belongs to which user. This problem is assumed to be taken care of by the underlying communication system and, hence, need not be addressed by the security protocol. Temporary channel identifiers can be used to provide user anonymity. Possible attacks and the choice of the properties of h l , h2 and h3 The weak pseudo-randomness property makes hl suitable for key derivation in our context. Note that only one session key K is derived from the master key guy. ~rthermore, the proposed protocol avoids the following attacks: an attacker Eve may try to forge the second message in three ways by attacking the function hl or the function h2 or the concatenated function H (see 3. below): 1. Eve may try to find a valid pair (r, K) = (r, hl(gUVHr)). If she could do that she could compute h2(Kllrll idV) from this. This is impossible by the weak MAC-property of hl because she does not know gUV. 2. Eve may try to find a valid pair (r', h2(KIIr'll idV)), possibly after having seen the valid pair (r, h2(KIN ] idV)) generated by V. But this would not help Eve because when U tries to verify the forged second message (r', h2( KIIr'll idV) ), U computes K' = hl (g~Vllr' ) which by the weak pseudorandomness of hl is almost certainly different from K for r r r ~. U's verification would be successful only if h2(K'llr'll idV) = h2(KIIr'll idV), but then Eve would have generated a valid pair (r', h2(K'llr'll idV)) without knowing K ~and without having seen another valid pair involving K ~, in contradiction to the weak MAC-property of h2. 3. Eve may attack the concatenated function H(L,r) := h2(hl(LIIr)llrll idV) directly and may try to find a valid pair (r', H (gUy, r')), possibly after having
Authentication and Payment in Future Mobile Systems
285
seen the valid pair (r, H(g uv, r)), generated by V. However, Eve would then also have generated a valid pair (r', h2(K'llr'll idV)) where g ' = hl(gUVllr'), and K I ~ K almost certainly by the weak pseudo-randomness of hl, hence this contradicts the weak MAC-property of h2.
3 3.1
The Payment
Protocol
General
There is a recognised need for so-called micropayment systems, that are suitable for the efficient payment of small, frequently recurring, variable amounts. Applications range from electronic publishing to metering, telecommunications and information services and video-on-demand. A series of payments should be made to the same vendor over a period of time so that the vendor can aggregate the individual payments and spread the cost for clearing them with the broker over a larger number of payments. The micropayment system presented here is based on Pedersen's tick payment protocol. For a detailed discussion, the reader is referred to [Pede], however we summarise its main features. The novelty is not the payment protocol itself, but the way in which it is integrated with the authentication protocol proposed for the mobile system UMTS (cf. Sect. 1) and the payment scenario for basic and value added services in UMTS. The cryptographic mechanism employed in the tick payment protocol is based on Lamport's password scheme [Lamp]. Later Pedersen, Rivest and Shamir [RiSh] and Anderson et al. [AnMS] independently proposed the same mechanism for the payment of small amounts. Hauser et al. [HaSW], who were aware of Pedersen's work, proposed to integrate the mechanism with IBM's iKP (Internet Keyed Payment Systems) protocol for credit-card based electronic transactions. An extension of the approach can be found in [JuYu]. The proposed system has two phases: an initialisation phase in which the payer (in our scenario the user of a value added UMTS service) commits to initial values of the payment scheme by a digital signature, and an actual payment phase in which payments are made to the payee (in our scenario a UMTS value added service provider or VASP) by successively releasing preimages of an initial value aT under a one-way function F. Initialisation is performed as part of the authentication protocol described in the previous section. The meaning of the parameters signed in the third message of that protocol and the detailed working of the tick payment protocol performed in the payment phase are described in the following. 3.2
Goals
F r o m the payer's p o i n t of v i e w 1. a payment in his name can be made only by him; 2. the amount of the payment is exactly what the payer has specified; 3. only the payee specified by the payer can receive the payment.
286
Giinther Horn and Bart Preneel
From the payee's point of view: 4. a payee can verify the correctness of a payment; 5. the payer cannot deny having made a verified payment; 6. the payee can be certain of being credited for verified payments by the broker.
From the broker's point of view: 7. the broker can verify the correctness of a payment. 3.3
Prerequisites
For the t i c k payment protocol, we have the following prerequisites: - There is a public system parameter T which gives the maximum number of ticks (the currency unit of the payment system, reminiscent of phone ticks) to which the user can commit himself by one signature. - There is a public family F of length-preserving one-way functions FIV : {0, 1} n --~ {0, 1} n, where n is a public system parameter and IVis an initialisation vector. (To be more precise, the functions FIV need to be one-way on T - t h iterates, cf. [Pede].) Both F and T have to be chosen with care in order to avoid certain attacks (see discussion in Sect. 3.5). 3.4
Tick payment protocol description USER U
VASP V (
d
Fig. 2. The tick payment protocol
In the "authentication and initialisation of payments" protocol, the user commits to the parameters idV, ch_data, TV, o~T and IV. Parameter idV is the identity of the payee (the VASP V), ch_data gives the conditions under which the payment is made (such as applicable tariff etc.), T V is a time-stamp generated by V, giving date and time of day, IVis an initialisation vector by which the user selects a particular function FIV from the family F of one-way functions, ao is a random number selected by the user and a T ---- FTv(O~0) is the initial value for the tick payments. All payments made are multiples of one "tick". Whenever V requests a payment from U, he sends a message to U with the requested number d of unit payments (ticks) to be made. U responds by releasing preimages of O~T ---- F~v(O~o) under the function FIV. For the payment of the first dl ticks,
Authentication and Payment in Future Mobile Systems
287
the user sends the payment token OlT_dl : FTvdl(o~0). On receipt, V checks if F]~(C~T-dl) = aT. For the payment of the next d2 ticks, the user sends the ~ payment token O l T _ d l _ d 2 = IT~~VT - d l - d 2 (o~0), V checks i f F i ~ ( ~ T _ d l _ d 2 ) = O~T-dl , and so on. Only the last value received from the user has to be stored by V, together with the signature in the third message of the "authentication and initialisation of payments" protocol. If the total number of requested ticks exceeds the maximum amount T to which the user can commit himself by one signature, then either the "authentication and initialisation of payments" protocol has to be run again or, preferably, a simplified re-initialisation protocol is run which has only two messages and does not repeat mutual authentication. This re-initialisation protocol is not presented in this contribution for lack of space. V can aggregate the payments from a single user, and clear the aggregated payments with the broker of the payment system, by presenting the signature by which the user committed to the initial values of the payment process and the last tick payment a T - d made by the user. The VASP V is then credited d ticks by the broker.
3.5
Discussion
A c h i e v e m e n t of goals 1. A payment in the user's name can be made only by him because he commits to the starting value C~Tby his signature, and only he can know the preimages of aT under the one-way function F I V , provided the one-way function is appropriately selected. For the same reason, the payer cannot deny having made a verified payment. 2. The amount of the payment is exactly what the payer has specified: again, this depends on the appropriate choice of the one-way function, see the discussion in [Pede]. 3. The identity i d V of the payee is included in the user's signature, therefore only the payee specified by the payer can receive the payment. 4. The payee verifies the correctness of a payment by verifying the signature and verifying that the d-th iterate of the one-way function F I V applied to the payment token received last equals the payment token received before (the starting value C~T respectively), where d is the amount due. 5. The payee can be certain of being credited for verified payments by the broker because payee and broker, as well as any arbiter, can verify payments (provided of course that an agreement exists that the broker honours verifiable payments). The broker can verify the correctness of a payment in the same way as the payee.
Further issues 1. Re-use of starting value O~T:a payer could use the same starting value aT in two different sessions, either with the same payee or with a different payee.
288
2.
3.
4.
5.
4
Giinther Horn and Baxt Preneel This is to be strictly avoided by the payer as it would be to his disadvantage: assume that he pays for a total of dl ticks in the first session and a total of d2 ticks in the second. The payees could then for both instances claim max(dl, d2) from the broker (provided they could intercept the communication), and the broker would have no way to check. Re-use of initialisation vector: the purpose of using a whole family of oneway functions instead of just one function is to make it much harder for a fraudulent payee to invert the one-way function by brute force with the help of pre-computations. Therefore, a user should avoid re-using IV. Check for double submission: note that the signed commitment contains idV and r, hence the commitment is different for each session and payee. The broker checks for double submission of the same payment by comparing idV and the nonce r in the signed commitment with previous commitments. If there is a match the broker refuses to honour the payment. No checking for double spending is required of the payee V. (Note that the checking by the broker could also be done using TV instead of r, provided that the granularity of TV is fine enough.) Collision resistance is not required for the function FIV: it does not help a fraudulent payee if he is able to generate collisions for FIV because the starting value olT is given to him by the payer. The maximum amount T cannot be arbitrarily large. Rather, F and T must be chosen such that the probability that FIvT(ao) lies on a cycle is small for all IV. If FIvJ(ao) lies on a cycle with length c and j + c < T, then payments for T - j and T - j - c ticks cannot be distinguished as FIVj (s0) = FlyJ+c(c~o). Bounds for T as well as other results on the tick payment scheme will be presented in a forthcoming paper [KnPr]. TTP
Related
Issues
The protocol in Sect. 2 assumes that the user and the VASP possess the public keys necessary to verify each other's certificates, or that they possess an authenticated copy of each other's public keys. However, if this is not the case, the parties might have to contact an on-line T T P during the protocol in order to obtain a cross-certificate. Moreover, contacting an on-line T T P also allows for checking whether the certificate of the other party has been revoked. First the protocol with on-line T T P is described, and next the different options for crosscertificates are sketched. Due to space constraints, the protocol is only briefly described. 4.1
Protocol with on-line TTP
The protocol in Fig. 3 differs from the protocol in Sect. 2 in the following aspects: U also sends to V the identity idTTP of his T T P , together with his own identity idU encrypted with the key L := gUW where gW is the public key agreement key of the T T P . On receipt of the first message, V contacts the T T P of the
Authentication and Payment in Future Mobile Systems VASP V
USER U
289 TTP
9~']] idTTP]] { idU} L
(
TTII CertChain(U, V)
II { CartChain(V, U) } L [I CertChain(V, T) (
11{ SigT( h3(9~'ll cidU I1 cidV 11TT))}L
rllh2( KIIrll idV)l I ch_data IITTII CertChain(U, V) I
<
I1{ SigT( h3(9~ll cidU I1cidV IITT))}L
{Sigu(h3(g~llg v Ilrll idVII ch_data I]TTII(~TII IV)) }Ic~THIV]]L}K }
Fig. 3. The authentication protocol with on-line TTP
user and forwards the information sent by the user together with his certificate certV. It is assumed that the user's identity is sufficient for the TTP to retrieve the appropriate certificate. The TTP decrypts {idU}L and verifies whether the certificate of U has been revoked; it might also perform the same verification for the certificate of V (which might involve contacting his TTP, if V has a different TTP). If the certificates are still valid, the TTP generates a time-stamp TT, and sends to V a cross-certificate chain (see Sect. 4.2 below); the part that may reveal U's identity is encrypted using L. Together with the time-stamp, the TTP signs the unique certificate identifiers cidU and cidV as well as g~. The TTP Sends TT, the (partly encrypted) cross-certificate chain and the encrypted signature to V, who then forwards the encrypted signature and the cross-certificate for his public key to U. U can verify the freshness of the signature by the TTP since it also includes gU (note that U may not have a reliable clock). It is assumed that the user knows the unique identifier number of the certificate on his own public key and on that of V so that he can verify the received signature. If the signature is correct, the user sends the final message to V, which also includes the key L encrypted under the key K. Using the key L, V can decrypt the answer received from the TTP in the third step and verify the signature. It is assumed that V has a reliable clock, so he can verify TT. NOTE: this protocol has the property that V can only verify the signature of the TTP after the last step, which results in some additional delay. If this is not acceptable, the user can add in the first message a MAC computed with a
290
Giinther Horn and Bart Preneel
key derived from L on idU and idV (as an opponent will only see a single (text, MAC) pair for a given key, the required MAC property is weak - but slightly stronger than the weak MAC property discussed in Sect. 2.3). In that case the encryption in the fourth message can use the key K and there is no need to send L in the fifth message. However, this latter solution does not provide protection on the interface between V and T T P. 4.2
Cross-certificates and TTP scenarios
The protocol in Sect. 4.1 uses cross-certificate chains. Such chains are required when parties in the protocol do not have the same TTP, or when the parties do not have on-line access to their T T P . Here CertChain(X, Y ) consists of a sequence of certificates, Co, 0 , . . . , Ca, where the signer of certificate Co is the Certification Authority (CA) of entity X, the subject of ci is equal to the signer of Ci+l (0 < i < n), and the subject of certificate Cn is entity Y. Such a certificate is verified starting with co (using the public key of the CA of entity X); this guarantees the public key required to verify O, etc. The verification is completed after verification of ca. In order to speed up the verification process and reduce the communication overhead, the CA of entity X might also verify the complete chain, and then create a new certificate for entity Y. However, this provides slightly different guarantees to the entity verifying the cross-certificate. For the protocol of Sect. 4.1, a CA structure where the CA of a user (the T T P) and the CA of a VASP V always cross-certify one another, will provide short certificate chains. Under this assumption - - which is, of course, not necessary for the correct functioning of the protocol - - we have:
- CertChain(U, V) consists of a cross-certificate signed by the user's TTP, and cert V. - CertChain(V, U) consists of a cross-certificate signed by V's CA, and certU. - CertChain(V, T) consists of a cross-certificate signed by V's CA, and a certificate on the signature key used in SigT by the user's T T P. Of course the protocol can be simplified if U and V have the same T T P. Moreover, one can also consider the case where V contacts his own TTP, rather than that of the user. In that case the user should send its certificate certU over the air interface in the first protocol step, reducing the efficiency of the protocol and compromising user identity confidentiality. 5
Choices Formats
for Cryptographic
Algorithms
and
Certificate
The protocol and cryptographic mechanisms were chosen in such a way that they are particularly suited to the low bandwidth and low computational capabilities on the user's smart card. The payment protocol itself is very lightweight; the elementary payment operation does not require any public-key operation. The authentication protocol is (essentially) identical to one proposed to ETSI for
Authentication and Payment in Future Mobile Systems
291
UMTS user-to-network authentication lETS4]. The cross-certification approach is chosen in order to minimise communication overheads. 5.1
Cryptographic algorithms
The use of elliptic-curve cryptography allows for much shorter signatures and keys; some additional storage is required for the system parameters, but this can be minimised by selecting common system parameters. Current estimates [Wien] indicate that elliptic curves with a 170-bit subgroup order (which typically corresponds to an elliptic curve over a group with 171... 180 bits) offer a security level comparable to 1024-bit RSA. The signature scheme used is the AMV scheme of ISO/IEC FCD 14888-3 [ISO4], but the construction of the RSA based signature scheme of ISO/IEC 9796-2 [ISO1] has also been planned as an option for the signature on certificates. The hash function RIPEMD-128 can be used for hl and h2 in the authentication protocol; this hash function offers a performance in between MD5 (the security of which is questionable) and SHA-1 and RIPEMD-160. For collision resistance, 128 bits is on the low side, but for the specific needs of hl and h2 in the authentication protocol it certainly provides a high security level. For h3 RIPEMD-160 has been selected. For the tick payment protocol, RIPEMD-160 has been selected, restricted to an output of 64 bits.
5.2
Certificate format
A special certificate format has been designed that minimises the storage space on the smart card and the bandwidth on the air interface. The size of a public-key certificate is less than 200 bytes, which should be compared to about 1 Kbyte for a typical X.509 v.1 certificate [ITU1] (and certificates proposed within IETF). The certificate allows for all necessary information: version number, serial number, issuer identifier, four validity dates (begin and end of validity and two optional dates for usage of the private key), subject identifier and public key information (algorithm type identifier and a public key value). Other optional fields include key usage, cross certificate attributes and certificate path attributes. Similar ad hoc certificate formats are being used in the financial sector (e.g., for the EMV specifications by Europay, Mastercard and VISA). 6
Related
Work
There is a vast literature on authentication and key agreement protocols. For overviews, the reader is referred to [MvOV] or [RuvO]. A protocol related to the one presented in Sect. 2 is the well-known Station-to-Station protocol with its variations [DvOW]. As opposed to our protocol it has longer messages and higher computational requirements, irrespective of the choice of the signature systems used by the user and the certification authority, respectively, and the choice of the finite group in which the exponentiations required in the key agreement scheme are computed. Proposals for authentication protocols specifically designed for
292
Giinther Horn and Bart Preneel
mobile systems were made in [AzDi,BeYa,LiHa] (see also Chapter 12 in [MvOV] for additional references). However, they are either too inefficient for use in UMTS, do not achieve all the protocol goals stated in Sect. 2.1 or do not allow the integration of a payment system as described in Sect. 3 of this contribution. An overview of the ASPECT trial protocol and some trial details are presented in [MaPM].
7
Conclusion
The protocols presented in this paper provide an efficient way to achieve mutual authentication, key establishment and incontestable charging in a mobile environment. The protocol satisfies the needs for UMTS: it requires a low computational load on the user's side and requires only a limited amount of communication. Moreover, it can be extended to a large scale system with multiple T T P s through cross-certification. Acknowledgements We would like to thank the following colleagues for their help and advice: Peter Howard, Volker Kessler, Lars Knudsen, Keith Martin, Chris Mitchell, Klaus Miiller and Konstantinos Rantos. We would also like to thank all the partners in the ASPECT project for their good cooperation and the anonymous referees for their helpful comments.
References [AnMS] R. Anderson, H. Manifavas, C. Sutherland. "A practical electronic cash system." Available from http://www, cl. cam. ac. u k / u s e r s / r j a14/ [AzDi] A. Aziz, W. Diffie, "Privacy and Authentication for Wireless Local Area Networks," IEEE Personal Communications, 1st Q 1994, pp. 25-31. [BeYa] M.J. Beller, Y. Yacobi, "Authentication and key agreement protocol for PCS," Joint experts meeting on privacy and authentication for PCS, P~A JEM/93012, Nov. 8, 1993. [DiHe] W. Diffie, M.E. Hellman, "New directions in cryptography," IEEE Trans. on Information Theory, Vol. IT-22, No. 6, 1976, pp. 644-654. [DvOW] W. Diffie, P. van Oorschot, M. Wiener, "Authentication and authenticated key exchanges," Design, Codes and Cryptography, Vol. 2, 1992, pp. 107-125. [DoBP] H. Dobbertin, A. Bosselaers, B. Preneel, "RIPEMD-160: a strengthened version of RIPEMD," Fast Software Encryptzon, LNCS 1039, D. Gollmann, Ed., Springer-Verlag, 1996, pp. 71-82. [E1Ga] T. E1Gamal, "A public key cryptosystem and a signature scheme based on discrete logarithms," IEEE Trans. on Information Theory, Vol. IT-31, No. 4, 1985, pp. 469-472. lETS1] ETSI ETS GSM 02.09, "European Digital Cellular Telecommunications System (Phase 2); Security Aspects," Version 4.2.4, September 1994. lETS2] ETSI ETS 300175-7, "DECT Common Interface, Part 7: Security Features," October 1992.
Authentication and Payment in Future Mobile Systems
293
[ETS3] ETSI ETR 33.20, "Security Principles for the Universal Mobile Telecommunications System (UMTS)," Draft 1, 1997. lETS4] ETSI SMG SG DOC 73/95, "A public key based protocol for UMTS providing mutual authentication and key agreement." [FIBS] FIBS 180-1, "Secure Hash Standard," Federal Information Processing Standard (FIBS), Publication 180-1, National Institute of Standards and Technology, US Department of Commerce, Washington D.C., April 1995. [HaSW] R. Hauser, M. Steiner, M. Waidner, "Micro-payments based on iKP," Presented at SECURICOM 96. Available from http ://www. zurich, ibm. com. [ISO1] ISO/IEC 9796-2, "Information technology - Security techniques - Digital signature schemes giving message recovery, Part 2: Mechanisms using a hashfunction," 1997. [Iso2] ISO/IEC 10118, "Information technology - Security techniques - Hashfunctions, Part 1: General," 1994, "Part 2: Hash-functions using an n-bit block cipher algorithm," 1994, "Part 3: Dedicated hash-functions," 1998. [iso3] ISO/IEC 11770-3, "Information technology - Security techniques - Key management, Part 3: Mechanisms using asymmetric techniques," 1997. [iso4] ISO/IEC FCD 14888-3, "Information technology - Security techniques - Digital signature with appendix, Part 3: Certificate-based mechanisms", 1997. [ITUI] ITU-T Recommendations X.509, "Authentication Framework," Geneva 1989. [IWU2] ITU, "Security Principles for Future Public Land Mobile Telecommunication Systems," Rec. ITU-R M. 1078. [JuYu] C.S. Jutla, M. Yung, "Paytree: amortised-signature for flexible micropaymerits," Proc. of Second USENIX Association Workshop on Electronic Commerce, November 1996, pp. 213-221. [KnPr] L.R. Knudsen, B. Preneel, "One-way functions for tick payments," in preparation. [Lamp] L. Lamport, "Password authentication with insecure communication," Communications of the ACM, Vol. 24, 1981, pp. 770-772. [LiHa] H. Lin, L. Ham, "Authentication in Wireless Communications," Proc. GLOBECOM 1993. [MaPM] K.M. Martin, B. Preneel, C.J. Mitchell, H.J. Hitz, G. Horn, A. Poliakova, P. Howard, "Secure billing for mobile information services in UMTS," Proc. IS~N'98, to appear. [Mene] A. Menezes, "Elliptic Curve Public Key Cryptosystems," Kluwer Academic Publishers, Boston, 1993. [MvOV] A. Menezes, P. van Oorschot, S. Vanstone, "Handbook of Applied Cryptography," CRC Press, Boca Raton, 1997. [Pede] T. Pedersen, "Electronic payments of small amounts," Security Protocols, LCNS 1361, M. Lomas, Ed., Springer-Verlag, 1997, pp. 59-68. See also DAIMI PB-495, Computer Science Department, Aarhus University, August 1995. IRiSh] R.L. Rivest, A. Shamir, "PayWord and MicroMint: two simple micropayment schemes," Security Protocols, LCNS 1361, M. Lomas, Ed., Springer-Verlag, 1997, pp. 69-87. [RuvO] R. Rueppel, P. van Oorschot, "Modern key agreement techniques," Computer Comm., Vol. 17, No. 7, July 1994. [UMTS] UMTS Forum, "A regulatory framework for UMTS," Report no. 1, 1997. [Wien] M. Wiener, "Performance comparisons of public-key cryptosystems," Presented at the 1998 RSA Data Security Conference, San Francisco, January 12-16, 1998.
Distributed Temporary Pseudonyms: A N e w Approach for Protecting Location Information in Mobile Communication Networks Dogan Kesdogan*, Peter Reichl, and Klaus Jungh~rtchen Department of Computer Science 9 Informatik IV (Communication Systems) 9 Aachen University of Technology 9 D-52056 Aachen 9 Germany ~dogan Jpeter ~klausj }@i4. inf ormat s rwt h-aachen, de
A b s t r a c t . One of the major security aspects in mobile communication networks concerns information about the localization of the (mobile) network user. This information may be protected by establishing a trusted third party that is responsible for creating suitable pseudonyms for the user identity. Distributing the maintenance of pseudonyms among n independent trusted parties allows to increase further the security of location information. In this paper, a method is proposed that guarantees security as long as at least one of the n parties may definitely be trusted whereas the other parties may turn out to be corrupt. The pseudonym collision probability is derived analytically before a detailed OPNET simulation evaluates the cost of the new approach compared to standard GSM.
1
Introduction
Mobile networks allow users to exchange information or use services independently of their current location. Offering these new possibilities causes a couple of security problems that are significantly more complicated t h a n in the case of fixed networks. Besides the transmission on the air interface, those questions are especially related to the growing intelligence of the networks and the mobility behaviour of the individual users. This p a p e r deals especially with the issue of protecting information about the user location against attacks from inside the mobile network, especially in the case of GSM networks. GSM divides the supply area hierarchically as follows [19]: The total supply area consists of several MSC areas (each of t h e m managed by a Mobile Switching Center MSC) which are sub-divided into several Location Areas LA (each of t h e m m a n a g e d by a Base Station Controler BSC). Each Location Area finally is constituted of several cells, where a respective Base Transceiver Station B T S emits and receives radio waves within a given frequency spectrum. Within this framework, the location m a n a g e m e n t is distributed: a central database, the * The work of Dogan Kesdogan was supported by the Gottlieb Daimler- and Karl Benz-Foundation
296
Dogan Kesdogan et al.
Home Location Register HLR contains for each mobile user the respective current MSC area, whereas for each MSC area a local data-base, the Visited Location Register VLR, contains the current LA in the respective MSC area. User mobility is managed by the location management functions, especially procedures for mobile originated (MO) calls setup, mobile terminated (MT) calls setup and Location Update (LU). The latter one is an operation performing the update of the user location information in the network databases (HLR and VLRs). It is initiated by the Mobile Station (MS) while leaving one LA and entering a new one. If both LAs belong to the same MSC, the procedure is handled locally, otherwise the MSC of the new LA forwards the LU request to the HLR which thereupon cancels the record in the VLR of the old MSC and confirms the insertion of the new record in the VLR of the new MSC which in turn acknowledges the MS request. If an M T call from the Public Switched Telephone Network P S T N to a GSM subscriber has to be established, the call is routed to the so-called Gateway MSC (GMSC) which requests routing information from the HLR and the respective VLR in order to route the call to the MSC in whose area the subscriber's MS currently is roaming. The MSC then manages the establishment of a radio connection between the MS and the BTS of the cell where the user currently is registered. There have already been some proposals about how to protect location information in mobile networks (e.g. [8]), especially by using so-called MiXes as introduced by Chaum [2]. Unfortunately, the concept of MIXes has some serious drawbacks, as it implies major changes within the signalling structure as well as the architecture of the mobile network. Moreover, the encryption for the MIX has to exceed 512 bit which adds further load on the air interface. Hence, in this paper a feasible alternative is investigated which has been developed with respect to the following demands: 9 protection of the user location information as long as she moves with her mobile switched on but does not receive an incoming call; 9 implementation without major changes of the GSM standard; 9 sparing use of sensitive data by moving the responsibility for them to the user or to one or several parties the user may trust; 9 no restrictions for real-time communication. To this end, section 2 introduces and discusses the concepts of 'Temporary Pseudonyms' and 'Distributed Temporary Pseudonyms'. The optimal pseudonym length may be derived analytically as demonstrated in section 3. Some major results of a detailed O P N E T simulation are presented in section 4, before section 5 concludes the paper with some final remarks. Note that the companion paper [20] is focussed towards the performance evaluation aspects of the proposed method and contains a lot of simulation results in far more detail.
Distributed Temporary Pseudonyms 2
The Method
of Distributed
Temporary
297
Pseudonyms
The basic idea of 'Temporary Pseudonyms' (TP) has been proposed first in [14, 16]. After introducing, refining and discussing this concept in section 2.1, the following section 2.2 presents an important extension that allows to distribute the pseudonym information among several trusted parties without significantly increasing the signalling load. These concepts have been implemented and evaluated by extensive simulations which finally will be demonstrated in section 4. 2.1
The TP Method
Temporary Pseudonyms. The method of Temporary Pseudonyms (TP method) is based on the concept of trusted parties where e.g. a Home Personal Computer (HPC) confidentially stores sensitive data (authentication keys, location information etc.) or even handles the complete location management (thus replacing the VLR) [7]. But in our context the trusted region is no longer used for saving the actual location information of the user. Instead, the basic idea of the TP method consists of protecting a mobile user's location information (within a mobile network) by protecting her identity. To this end, the user is assigned a pseudonym (Pseudo Mobile Subscriber Identity PMSI). As long as the user is registered under a pseudonym, the network provider may know that a user under a certain pseudonym currently is at a certain place, but he is not able to link the user's real identity with her present location. There are different options for choosing a suitable trusted region. In the case of the TP method, the trusted region consists of a HTD (Home Trusted Device) that is completely under the control of the user. The pseudonym is generated in the MS and the HTD in a time-synchronous manner. To this end, in both stations two identically parametrized pseudo-random generators PRG generate a random number that is between 50 and 100 bits long. The seed for initializing the PRG has previously been exchanged between MS and HTD. The resulting random number is used as an implicit address, i.e. works as the 'identity' of the user, under which she registers in the databases (HLR and VLR) of the mobile network. Within the procedures in the GSM network, the implicit address plays the role of the IMSI (for a detailed discussion of implicit addressing see [6, 13]). If a mobile terminated call arrives at the network, the HTD of the user which may be reached via the ISDN fixed network - is asked for the currently used pseudonym. With this information the network provider is able to find the location information and to set up the call as usual. Note that the user remains anonymous as long as no call arrives for him from outside. Moreover, mobile originated calls may happen completely anonymously as the user location need not to be determined, if the user uses prepaid phone cards. In order to guarantee the encryption of the data sent via the air interface during an MO call, it turns out to be necessary that the MS itself informs the network provider about the (symmetric) key that is used for communication (as an unambiguous secret key
298
Dogan Kesdogan et al.
would allow to deanonymize the user). This last step may take place using an asymmetric encryption scheme (RSA). The user encrypts the key to be used with the public key of the network provider.
Fig. 2.1. Generation and Insertion of a New Pseudonym
Changing the pseudonym takes place time-synchronously between MS and HTD (cf. fig. 2.1). The most efficient way might be changing the pseudonym exactly while changing the location area, i.e. by registering in the new location area under a different pseudonym (without cancelling the old one, because otherwise the two pseudonyms could be linked). This idea hides any information about a direction of the pseudonym, but on the other hand the user may step out of her anonymity (as e.g. a constant driving speed yields constant pseudonym changes and thus linkability). Moreover, it is not possible to directly inform the HTD about the pseudonym update as every message between the user and her HTD lets the user's identity be combined with her location area. Therefore, the time between two pseudonym changes is drawn from an exponential distribution and thus memoryless. Additionally, this approach prevents situations in which all users have to update their pseudonyms at the same time, which easily could exceed the channel capacity of the mobile network. Discussion of the TP Method 9 P a s s i v e A t t a c k s . The security of the T P method is based on the pseudonym
and the HTD of the user. As long as the user has her MS switched on but does not communicate, this method hides her location information against the passive attack of an insider, i.e. the attempt to get location information about the user by observing the network data bases (HLR and VLR). As soon as the user receives a call, her location (i.e. the LAI) is revealed as her HTD reveals her current pseudonym to the network provider who then is able to link user identity and location. Hence, this passive attack cannot be
Distributed Temporary Pseudonyms
299
prevented by the TP method. If the insider observes this location information over long time, he might be able to generate a movement profile for the user. 9 Active Attacks. Active attacks of the network provider, i.e. attempts to find out the user location by periodically asking her HTD, may be recognized because all requests are logged at the HTD. Hence, if there are many more requests at the HTD than actual calls, this points towards an active attack. Note that the user may recognize this attack but is not able to prevent it. One solution might be to add the functionality of a reachability manager to the HTD, i.e. it could decide for each request whether the importance of the request justifies revealing the pseudonym. 9 Real-time C o m m u n i c a t i o n . The most important advantage of the TP method compared to other methods for securing the location information consists of its easy and efficient implementation within existing mobile networks and the small additional signaling overhead. Moreover, the possibility of realtime communication is not restricted. 9 Availability. The major availability problem consists of the possibility of a HTD breakdown. In this case, its availability could only be restored by the user herself which might be difficult in case of her absence. Moreover, if the user is not informed separately about the breakdown, she will not notice it until another user who wanted to call her sends her an extra notification (or she tries to call herself). TP Method with a Trusted Party A rather straightforward extension of the TP concept prevents the availability problem by putting the pseudonym generation into the hands of a third party (e.g. an outside company) which the user may trust. In this case, it is possible to exchange direct messages between MS and the trusted device (TD). As the TD administrates a lot of users, a message from the MS may reveal the relation between the pseudonym (MS) and the group of users administrated by the TD, but not the identity of the user herself. Hence, changes of the pseudonym change rates may be sent directly to the TD, moreover the availability of the TD may be increased compared to the HTD approach. The remarks about active and passive attacks remain the same as in the case of HTD. Note especially that a collaboration of the (unconditionally trusted) TD with the network provider is sufficient for revealing the location information to the insider. 2.2
The DTP Method
In contrast to the TP method, DTP distributes the generation of the pseudonym among n different trusted parties. Each party i holds for each user the entries: 9 9 9 9
M S I S D N , the phone number of the mobile user; pseudoi, the seed for the pseudo-random generator of the pseudonyms; A t i , the time until the next pseudonym change takes place at party i; pmsii, the part of the pseudonym P M S I generated by party i.
300
Dogan Kesdogan et al.
At certain points in time, each trusted party generates its new partial pseudonym pmsi~ (1 < i < n), where no party knows any of the pseudonyms of the other parties. At the same time, the MS registers at the databases of the mobile network under the (total) pseudonym P M S I = pmsi 1 G pmsi2 |
@ pmsin,
(2.1)
where @ denotes the XOR function. As soon as a call arrives at the gateway (GMSC), the n trusted parties are asked for their 'partial pseudonyms'. The answers then are XORed bit by bit yielding the actual pseudonym PMSI of the user as shown in fig. 2.2. The idea of a bitwise superposition of messages follows the concept of DC networks as introduced in [3], but instead of guaranteeing the sender anonymity, the aim of our approach is to distribute the information about the user pseudonym over n independent parties such that the information is safe as long as at least one of them works indeed correctly instead of collaborating with the network provider.
Fig. 2.2. Signalling of Distributed Temporary Pseudonyms
Finally the user may be found as usual over her pseudonym, and the call may be set up. Note that as in the case of the T P method, the user is deanonymized as soon as a mobile terminated call arrives, and her location may be found by the insider. If the time for the next pseudonym change would be determined unambiguously for each user and told to each of the trusted parties, a corrupt 'trusted' party might be able to inform the insider beforehand about the moments of changes. In this case, the insider only has to observe in which location area there is a pseudonym change at that time in order to know the relation between identity and location. Hence, it is necessary to either hide the time of the next pseudonym change within a number of synchronous changes of partial pseudonyms or to weaken the knowledge of a trusted party i about the fact that changing its partial pseudonym pmsi, indeed yields a change of the total
Distributed Temporary Pseudonyms
301
pseudonym P M S I in all of the databases. These deliberations yield the concepts of 'change rate classes' and 'synchronized pseudonyms', respectively. C h a n g e R a t e Classes. In order to maintain large sizes of the anonymity sets it is optimal if all users change at the same time, as in this case the information about the time of the next pseudonym change is of no value at all. But this obviously is impossible because of the limited channel capacity. Hence, the networks proposes j possible 'change rate classes' (e.g. approximately 20 sec, 1 min, 20 min etc.). The j classes should be chosen such that the users of a network are distributed as uniformly as possible over the classes. All members of a certain class change their pseudonyms at the same time. The MSs determine themselves (e.g. using the approximate rate at which they change location areas) the class they wish to belong to. Thus in the case of deanonymization of a user, little is revealed about her prior movement path. The less classes there are and the more members one class possesses, the larger is the 'anonymity set' (cf. [15]) of the respective user (but the larger is the signalling amount at change time, too). An entry change (e.g. the change to a new class) at a trusted party then is performed e.g. while the user is deanonymized.
Synchronized Pseudonyms. Here, the trusted parties are subdivided into two parts. Each party is synchronized with exactly one other party (without knowing it), i.e. their initialization of the PRG and the pseudonym change times are identical (w.l.o.g. assume that for each pair, one party is between number 1 and n/2, whereas the companion is between n / 2 + 1 and n). The only one to know which parties are synchronized and which initializations the pseudorandom generators of the single parties possess is the user. The idea is to prevent a party from being able to predict that the actualization of its partial pseudonym yields indeed a change of the total pseudonym. This is reached because the partial pseudonyms of two parties may cancel each other out because of the XOR function. Each trusted party i possesses the following entries per user: 9 9 9 9 9 9 9 9
MSISDN, the phone number of the mobile user; pseudoi, the seed for the PRG of the pseudonyms; updatei, the seed for the PRG of update times; uniformi, the seed for the PRG of uniformly distributed random numbers; Pi, the probability that a generated partial pseudonym will indeed be used; uni,, a value randomly drawn from a uniform distribution; Ati, the time until the next pseudonym change takes place at party i; pmsii, the part of the pseudonym P M S I generated by party i.
The concept of synchronized pseudonyms yields that exactly two parties (i and n / 2 + i, say) have identical entries. At certain moments, each trusted party generates its new partial pseudonym pmsii (no party knowing the pseudonym of any
302
Dogan Kesdogan et al.
other one nor the moment any other party generates a new partial pseudonym). Additionally, each party determines whether pmsii will be used by comparing the random number uni, with the limit p~: If the random number drawn from a [O;l[-uniform distribution is smaller than Pi the new partial pseudonym is used (for the sake of simplicity let us assume that all p, -- p are identical). Note that each time one pseudonym is generated, a different trusted party generates the same partial pseudonym with probability p. As the total pseudonym is only changed if exactly one of the two new partial pseudonyms is used, with probability p2 + (1 - p ) 2 the two parties behave identically (i.e. the newly generated pseudonym has no effect), and with probability 2p(1 - p ) the user registers under a new total pseudonym.
Now, the expected time At between two changes of the total pseudonym can be calculated as At := min {At1, A t 2 , . . . , Atn/2} (2.2) with Ati = PRGupdate, 1 < i < n, and At, :=
if (uni < p ^ unit+n~2 < p) or (uni >_ p ^ uni +n/2 > p)
(2.3)
where (2.3) describes the case that the total pseudonym does not change. Assuming that each trusted party i, 1 < i < n, draws Ati from an exponential distribution with parameter A, a straightforward calculation shows that for identical acceptance probabilities p~ = p, At can be characterized as drawn from an exponential distribution with parameter
n/2 ~(2-p-(1
- p ) ) " A = p" (1 - p ) . n . A .
(2.4)
i=1
Hence, the time until the next change of the total pseudonym is still memoryless, i.e. an attacker is not able to predict it. Choosing p very high (or low) implies that each trusted party has only small knowledge about the next total pseudonymupdate which makes it even harder for the attacker to explore it. Finally it should be noted that the idea of synchronized pseudonyms requires a change in the attacker model, as at least one synchronized pair out of the n parties is required to work correctly. 3 3.1
Analysis of Pseudonym
Collision Probability
Length of the Pseudonym
Within the TP method, the pseudonyms consist of random numbers (generated in the HTD and the MS) which should be different for all users. In order to hide as much user movement information as possible from the attacker, the pseudonym changes take place according to a Poisson stream whose parameter corresponds
Distributed Temporary Pseudonyms
303
to the location area change rate. In analogy to [9], assume A pseudonyms are already registered in the HLR of the network operator (with A _> N, the number of users). To calculate the average time Tcoll until two pseudonyms coincide if this procedure is repeated independently, let An be the rate of pseudonym changes (identical for all users). Hence the pseudonym changes of all N users form a Poisson stream of rate A = N 9An with interarrival time 1/A. If the pseudonyms are B bits long, the probability of choosing an already existing PMSIis A/2 B, hence the random variable X, which describes the number of pseudonym changes until the first collision, is distributed geometrically with distribution function
A ( A ~ k-1 A P { X = k} = ~-~ . 1 - ~--~] and mean value E(X) = ~-~. Therefore, the mean time between two collisions may be calculated as
E(X) Tc~ -
A
2B
- A. N . An"
(3.1)
According to (3.1) it turns out that (in the case of 10 million active users) for pseudonym lengths of 100 bits the expected time between two collisions is around 20 billion years which seems to be sufficient for our purposes. Note on the other hand that a length of 50 bits yields a time of only 7 days whereas 70 bits yield around 16 years. Therefore, in the OPNET simulation as described in section 4, a pseudonym length of 100 bits has been used throughout. 3.2
A v e r a g e N u m b e r o f P s e u d o n y m s in t h e H L R
Assume the interarrival times of pseudonym changes of each user to be exponentially distributed with mean l/An. Pseudonyms are created independently of each other, hence the total rate of pseudonym changes in the network is A -- N 9An and the interarrival time between each two changes is exponentially distributed with rate 1/A. In order to prevent the HLR to be overloaded by pseudonym entries, each pseudonym is assigned a 'time to live' parameter TTL. Thus, the total time a pseudonym entry is stored corresponds to the sum of the time until the next pseudonym change and the TTL. In order to prevent an attacker from drawing any conclusions from the moment of a pseudonym deletion and the moment of storing a pseudonym (which might allow him to link both events and hence to link both pseudonyms), the T T L parameters are drawn from an exponentim distribution with parameter # TTL" Therefore, the probability of deleting a pseudonym is independent from the moment of registering it due to the memoryless property of the exponential distribution.
304
Dogan Kesdogan et al. Database (HLR or VLR) M/M/c~
~i = A~
M/M/cx~
#2 = # T T L
Fig. 3.1. Tandem Network
The HLR may be modelled as 'tandem network' (fig. 3.1) consisting of two subsequent M / M / c ~ - q u e u e i n g systems. The assumption of infinitely many servers corresponds to all pseudonyms held independently from one another in the HLR until their deletion. The first server delays all entries until a new pseudonym is generated, i.e. for each user with service rate #1 = An. The second server delays the entries by the time to live, hence its rate is p2 : # TTL" As the queueing systems of a tandem network with these properties may be viewed independently from one another, the arrival rate at the two queuing systems of the tandem network equals N 9)~n. The average number of customers in an M / M / o o - s y s t e m equals p = A/it [17]. Hence, the average number of pseudonyms in the HLR may be calculated as
A = P t o t a l = pl + p2 -
A~-- + p T ~
#TTL/
"
If the TTL of a pseudonym is drawn from an exponential distribution with the same parameter as the interarrival times of pseudonym changes, the average number of HLR entries equals A = Ptotal = 2. N , i.e. twice the number of users. 4 4.1
Simulation
Results
Preliminaries
The TP method has been integrated in microscopic detail into an existing GSM model for the simulation platform OPNET [5]. The model consists of the OPNET network layer for building the network topology, the OPNET node layer for modelling the data flow within network components, and the OPNET process layer, in which the behaviour of the components is characterized by finite automatas. In this model, the main GSM hardware and signalling procedures are simulated (see fig. 4.1). Included are mobile stations (MS), base transceiver stations (BTS), base station controlers (BSC), mobile switching centers (MSC), visited location registers (VLR), home location registers (HLR), gateway mobile switching centers (GMSC) and the public switched telephone network (PSTN).
Distributed Temporary Pseudonyms
305
Signalling has been modelled close to the GSM standard down to layer 3 of the protocol model. In addition to [5], packets sent over the SS7 network are piggybacked by appropriate packets of the lower layers TCAP, SCCP and MTP3. The simulation comprises the procedures Mobile Originated Call, Mobile Terminated Call, (Inter-BSC, Inter-MSC) Location Update, (Inter-BSC, InterMSC) Handover and Paging. Service handling and authentication are not integrated, in contrast to System Information and Measurement messages, which are evaluated for deciding on the initiation of location updates and handovers, respectively. The simulation area consists of 12 cells, 4 location areas, 2 MSC/VLRs and 21 mobile users. Each location area consists of 3 cells. Larger networks with up to 40 cells, 4 MSCs and 50 mobile users have been tested, the restriction to the smaller one is due to simulation duration and consistency of the results. The diameter of the cells depends on the mobility scenario and has been chosen to be 1.5 km for a city scenario, 15 km for countryside, 10 k m for 'Autobahn' (i.e. German highway) and 9 km for a mixed scenario.
Fig. 4.1. The OPNET model for the simulated GSM network
The mobility model for the MS has been implemented according to results fl'om [4], who measured delays in bus tramc and found out that these delays
306
Dogan Kesdogan et al.
could be modeled by an Erlang-4-distribution. The bus delays mainly occur due to the individual traffic sharing the roads with the buses. Though based on slightly different driving conditions (e.g. time schedule at every bus stop), the results can be transfered to individual traffic. Hence, the mean delay may be calculated as 'perturbation' of the usual mean driving time according to t m e a n = t m i n + X E r 1 9 t m i n = (1 + X E r l ) 9 t m i n
(4.1)
where X E r 1 is drawn from an Erlang-4 distribution. This yields a mean simulated speed of Vmax (4.2) Vmean -
1 + X Er l
This leads to the following mobility algorithm that is executed for every mobile node during the simulation: 1. Mobile users are distributed randomly across the simulation area. 2. Each mobile user draws a random destination from a uniform distribution. 3. Each mobile node has parameters indicating the real mean speed and the maximum speed. Thus the delay for part of the way and the simulated speed may be calculated. 4. At certain times each mobile node decides whether it performs a 90 ~ turn towards the destination or not. The distance between these points and the turn probability are given as parameters and depend on the simulation scenario; they have been calculated based on typical respective scenarios in the region of Aachen. 5. The mobile node moves according to its calculated speed and direction. 6. If the destination is reached, a new destination is chosen according to step 2, and the algorithm continues from step 3 onwards. Note that the mean speed of mobile nodes has been assumed to be 10 k m / h in the city, 65 k m / h in the countryside, 100 k m / h in the 'Autobahn' and 58 k m / h in the mixed scenario (cf. [18]). Measurements in real mobile networks [1] state that during the busy hour approximately 2/3 of the calls are mobile originated whereas 1/3 are mobile terminated calls. Measured mean call setup rates are around 0.45 calls/hour for mobile originated calls and 0.225 calls/hour for mobile terminated ones, yielding a global call interarrival time of about 5333 seconds. Fhrthermore, the mean duration of mobile terminated calls has been measured to be 115 seconds in contrast to 105 seconds for outgoing calls. According to [10, 11], call holding times as well as interarrival times may be drawn from exponential distributions.
4.2
Variation o f the Mobility Behaviour
In this section it is demonstrated how changes in the mobility behaviour effect the load of the mobile network. To this end, the three scenarios 'city', 'countryside' and 'Autobahn' are explored, using typical simulation parameters in each case.
Distributed Temporary Pseudonyms
Fig. 4.2 Total Network Load in the three Scenarios
307
Fig. 4.3 Mean Duration of the Procedures for Location Update and Pseudonym Change
Fig. 4.2 shows the total signalling load generated by one user in the mobile network. On the one hand, the network load increases proportionally to the mobility environment of the user. A user in the countryside scenario with large location areas and medium speed causes less load than a user who is fast or moves in areas with small cell sizes. This demonstrates the influence of the rate of location area updates. On the other hand, the difference between the signalization in the GSM network with and without the TP method appears to increase proportionally to the mobility behaviour as well, being 75% in the countryside, 80% in the city and 96% in the Autobahn scenario. The mean duration of the procedures for location update and pseudonym change is demonstrated in fig. 4.3. On the air interface the location update procedure uses the TMSI for either method. Thus the length of the PMSI here is of no influence at all. Within the SS7 network PCM30-connections are used. The 64 kbit/s speed of their signalling channel is high enough to prevent the location update delay from rising heavily. The PMSI Insertion procedure has to send one pseudonym across the air interface. Due to its length and the low rate of the dedicated signalling channel the duration of the procedure increases by 67%. Nevertheless it is not as time-critical as a handover procedure. Where does the high signalling load come from? The responsible factor turns out to be the share of the location update procedure, which is extended for the PMSI insertion procedure, in the global load. Fig. 4.4 shows that in the chosen simulation environment and with the T P method being implemented, location updates and pseudonym insertions amount to 74% of the global load, whereas the low impact of the call procedures results from the call rates being adjusted to the values given in [1]. Fig. 4.5 finally shows the call setup time for incoming and outgoing calls. The call setup times for both kinds of calls coincide as long as no (public key encrypted) session key is sent from the mobile to the network. Conveying a session key causes a delay of 0.66 seconds per direction if sent over a dedicated
308
Dogan Kesdogan et al.
Fig. 4.4. Distribution of the global load w.r.t, the different GSM procedures (in for the case of the city scenario
bit/s)
Fig. 4.5. Duration of the Call Setup without (left) and with (right) encrypting a session key for Mobile Originated calls
signalling channel with netto bit rate of 782 bit/s. T h e pseudonym has to be sent twice with an initial message in order to prevent two users from accessing the same traffic channel: First a random number is sent from the user to the network which afterwards is contained in the answer of the network to indicate that the user is now owner of the channel which is assigned by the network. Then the initial message is sent 'piggybacked' to a layer 2 L A P D m SABM-message which is answered by an exact copy of it within an L A P D m UA-message (see [19]). If the mobile does not receive an exact copy of its initial message, it has to clear the connection.
4.3
Variation of P s e u d o n y m Change Rate
This section explores the influence of the pseudonym change rate on the load of the GSM network. We have investigated p s e u d o n y m change intervals from two minutes up to one hour. The mobility and call model is the same as in the previous section. Fig. 4.6 left shows t h a t the shorter the interval between two pseudonym changes, the larger the increase in the global load and vice versa. Note t h a t the load m a y be influenced to a much larger extent than e.g.
Distributed Temporary Pseudonyms
309
by the call rate (or PMSI length, as further experiments have shown). Fig. 4.6 right demonstrates how the PMSI insertion procedure becomes more and more responsible for the total network load: At rate 600 changes/hour the insertion causes 47% whereas at rate 120 changes/hour this share is already approximately 81%. The reason for this lies in the 'length' of the PMSI insertion procedure (i.e. the sum of lengths of all respective messages) which is many times the length of the PMSI. Hence it can be concluded that increasing the PMSI exchange rate has much more effect than prolonging the PMSI.
Fig. 4.6. Global Load for Different PMSI Change Rates (left) and Load Caused by the PMSI Insertion Procedure (right)
Fig. 4.7 shows the distribution of the total load w.r.t, some interfaces. Obviously, the air interface is affected most by the signalling of a user. As the dedicated signalling channels at the air interface are assigned exclusively to the users, whereas the signalling messages of all users use the ressources of the same SS7 network, the load in the SS7 network must be viewed in relation to the user number. A load increase of 0.1 bit/s (as caused by increasing the pseudonym change rate from 2 to 6 changes/hour) increases the total load in the SS7 network for a user number of 1 million by approximately 98 Kbit/s.
Fig. 4.7. Load at Different Air Interfaces for Different Pseudonym Change Rates
310
Dogan Kesdogan et al.
A lot more OPNET simulations were conducted, e.g. investigating the influence of varying the pseudonym length or call rate. The results of these simulations are reported in great detail in [12, 20]. 4.4
Application of the Results
From the results demonstrated in the previous sections it may be concluded that especially the pseudonym changes have major influence on the network load. The T P method as investigated so far is fixed to the location update procedure in order to guarantee the reachability of the user under all circumstances. It is complicated to perform the change of pseudonyms exactly at the boundary of the location area because the user's HTD does not easily know the exact moment. But it seems worth considering to perform the change of location areas in direct conjunction with the change of pseudonyms as - compared to the TP method - this might save the expense of the location update procedure. In this case it must be noted at which rate incoming call attempts will fail as the user cannot be reached. A second point for improvements is the sending of the transmission key in the case of outgoing calls. As explained above, an exact copy of the initial call setup message is sent back to the user in order to prevent several users from using the same dedicated channel. This sending doubles the cost for introducing the transmission key. The following simulation investigates the effect of the above improvements. Fig. 4.8 demonstrates that the total load may be reduced by 17% compared to the TP method, whereas the load at the air interface decreases by 22%. This may be explained by the fact that the 512 bit transmission key is only sent once per initial call setup message. As the pseudonym change is performed exactly as with the original TP Method, the improvement at the HLR interfaces comes from avoiding the inter-MSC-location-update procedure. Sending the transmission key only once reduces the time for setting up a call by approximately one second (fig. 4.8 bottom right). Note that the enhanced T P method does not guarantee the strict reachability of the user as the location updates are replaced by pseudonym changes: if the time between two pseudonym changes becomes too long, possible location area changes are not communicated to the network data bases. Therefore, in our simulation about 17% of call attempts were not able to be performed successfully. This value decreases with the user mobility. As a consequence, either the pseudonym change rate must increase (according to the mobility behaviour of the users) or another paging strategy must be introduced, e.g. the method of dynamical location areas.
Distributed Temporary Pseudonyms
311
Fig. 4.8. Global load, load on the air interface (top row) and the HLR Interface and call setup times (bottom row) for the different methods
5
Concluding Remarks
This paper has dealt with the question of how to protect information about the user location against insider attackers in a GSM network. It has been demonstrated that the T P and D T P methods yield feasible solutions to this problem by hiding the relation between user identity and location using pseudonyms whose unavoidable changes can be protected from being linked together by an attacker. However, even with this approach the mobile user will be deanomyzed as soon as she accepts any incoming call. Current and future research deals with this issue, especially with the question whether it is possible at all to conduct completely anonymous calls.
References I. Brass, V.; Fuhrmann, W. F.: Traffic Engineering Experience from Operating Cellular Networks. IEEE Communication Magazine, August 1997, 66-71. 2. Chaum, D.: Untraceable Electronic Mail, Return Addresses and Digital Pseudonyms. Communications of the ACM (24) 2, 1981, 84-88. 3. Chaum, D.: The Dining Cryptographers Problem: Unconditional Sender and Recipient Untraceability. Journal of Cryptology vol.1 no. 1, 1988, 65-75. 4. Dauber, F.-J.: QoS Parameters of Bus Traffic with respect to Time-Tables and Subsequent Variability. Ph.D. Thesis (in German). Aachen University of Technology, July 1986. 5. Institute Eurecom, Delta Partners: A GSM OPNET Model (incl. documentation). (~) 1994-95.
312
Dogan Kesdogan et al.
6. Farber, D. J.; Larson, K. C.: Network Security via Dynamic Process Renaming. Proceedings of the 4th Data Communications Symposion, Quebec (Canada), October 1975, 8/13-8/18. 7. Federrath, H.; Jerichow, A.; Kesdogan, D.; Pfitzmann, A.: Security in Public Mobile Communication Networks. Proc. of the IFIP TC 6 International Workshop on Personal Wireless Communications. Augustinus (Aachen) 1995, 105-116 8. Federrath, H.; Jerichow, A.; Pfitzmann, A.: MIXes in Mobile Communication Systems: Location Management with Privacy. Proceedings of the Workshop on Information Hiding, Cambridge (UK), May 1997. 9. Federrath, H.: Trusted Mobility Management in Telecommunication Networks. Ph.D. Thesis. TU Dresden 1997 (in German). 10. Fuhrmann, W.; Brass, V.: Performance Aspects of the GSM Radio Subsystem. Proceedings of the IEEE, vol. 82 no. 9, 1994, 1449-1466. 11. Gu~rin, Roch A.: Channel Occupancy Time Distribution in a Cellular Radio System. IEEE Transactions on Vehicular Technology, vol. VT-35 no. 3, 1987, 89-99. 12. Junghiirtchen, K.: Simulative Investigation of Methods for Protecting Location Information in Mobile Networks. Diploma Thesis (in German). Aachen University of Technology, November 1997. 13. Karger, P.A.: Non-Discretionary Access Control for Decentralized Computing Systems. M.Sc. Thesis, Techn. Report MIT/LCS/TR-179, MIT 1975. 14. Kesdogan, D.; Fouletier, X.: Secure Location Information Management in Cellular Radio Systems. Proceedings of the IEEE Wireless Communication Systems Symposion WCSS'95, Long Island, 1995, 35-46. 15. Kesdogan, D.; Egner, J.; Biischkes, R.: Stop-And-Go-MIXes Providing Probabilistic Anonymity in an Open System. Workshop on Information Hiding. Oregon, April 1998 (to be published in Springer LNCS). 16. Kesdogan, D.; Federrath, H.; Jerichow, A.; Pfitzmann, A.: Location Management Strategies increasing Privacy in Mobile Communication Systems. Proceedings of the 12th IFIP International Information Security Conference SEC'96, May 1996 (Chapman & Hall). 17. King, P.J.B.: Computer and Communication Systems Performance Modelling. Prentice Hall 1990. 18. Lyberopoulos, G.L.; Markoulidakis, J.G.; Polymeros, D.F. et al.: Intelligent Paging Strategies for Third Generation Mobile Telecommunication Systems. IEEE Transactions on Vehicular Technology, vol. 44 no. 3, August 1995, 543 - 553. 19. Mouly, M.; Pautet, M.B.: The GSM System for Mobile Communication. Published by the authors, 4, rue Elis~e Reclus, F-91120 Palaiseau, France. 20. Reichl, P.; Kesdogan, D.; Jungh~irtchen, K.; Schuba, M.: Simulative Performance Evaluation of the Temporary Pseudonym Method for Protecting Location Information in GSM Networks. Proceedings of the 10th International Conference for Computer Performance Evaluation TOOLS'98. Palma de Mallorca, September 1998 (to appear in Springer LNCS).
A Mix-Mediated Anonymity Service and Its Payment*
Elke Franz and Anja Jerichow Dresden University of Technology Department of Theoretical Computer Science D-01062 Dresden
{efl,jerichow}r
A b s t r a c t . One measure to provide anonymity for the users of a communication network axe mixes whose usage was proposed in several applications recently. However, in practice such concepts axe not widely used. One reason may be that the payment for providers, who commercially offer such mix-mediated anonymity service, has not been considered yet. We present detailed protocols for payment schemes that allow anonymous, secure payment of a mix-mediated anonymity service. The schemes aim to achieve confidentiality and integrity for all the user, the provider and the bank.
1
Introduction
In the last decade, the discussion about anonymity and privacy has become more and more important. One requirement that should be supported in secure communication systems is anonymous communication, i.e. to keep secret who is in contact with whom at which time and, maybe, from which location. The subject matter of this paper is "mixes" - one anonymity concept. Firstly introduced by Chaum in 1981 for anonymous e-mail transfer [1], the usage of mixes was proposed in several other applications recently. [2], [3], [4], [5], [6], and [7], for example, suggest their usage for providing anonymity in telecommunication networks, mobile communication systems and the internet. By mixes, sender and recipient anonymity as well as the protection of the communication relation can be achieved. T h a t means, sender and recipient of a message cannot be correlated. Though the mix concept allows that lines are tapped; and all but one of the mixes passed may be corrupted. A mix is called "corrupted" if it tells an attacker the correlation of an incoming and the corresponding forwarded message. We distinguish between two networks that must be operated independently: one providing the communication service A/cs and the other providing the anonymity service A/As. To achieve anonymity, we extend A/cs by A/~4s. Thus, if Parts of this work were supported by the German Science Foundation (DFG) and the Gottlieb Daimler- and Karl Benz-Foundation.
314
Elke Franz and Anja Jerichow
Alice (.4) wants to communicate with Bob (B), she includes routing information for the usage of HAs in her message ml and sends it to N'c8, which then forwards it among others to N'As. After processing the message, HAs sends the resulting message back to N'c8 together with other messages. Thus, the provider of N'cs cannot correlate .4 and • (assuming there is enough traffic in the anonymity network). Moreover, if the attacking model of mixes holds, i.e. all lines can be tapped but at least one mix is trustworthy, then even the mix providers of AlAs are not able to correlate anything.
m2 .........
13 ~
?
~ - N c s ~--
ml
,..Ig3
?.,?~,?e
~
AlAs
For both networks, payment has to be supplied. The payment of communication services is generally supported nowadays. In this paper, we focus therefore on the additional needed payment for the mix-mediated anonymity network that must be operated by different providers as of the communication network. To solve this problem for mixes is even more tricky, since one important requirement on mixes is their independent construction as well as their availability and operation by independent providers. Thus, this paper give answers to the following question: How can a user pay for the usage of an anonymity service without revealing his anonymity against all (including the providers of the mixes in N.as)? First ideas to achieve payment of anonymous communication were discussed in [8]. We revise some of these and present further results. Section 2 continues the short overview about the mix network and discusses suitable payment schemes for this anonymity service. Section 3 describes an interactive micro-payment scheme that applies "tick payment" to transfer small amounts of payment. Several payment protocols are described in general. The two most efficient approaches will be applied to the mix concept. Section 4 shows the protocols in detail. Some concerns on attacks and efficiency are discussed in Section 5. The protocols between users, bank and providers to allow secure, anonymous payment are outlined in Section 6. Finally, Section 7 concludes and gives an outlook. 2
2.1
Suitable Payment Schemes for a Mix-Mediated Anonymity
Service
Requirements on Mixes
A mix is a network node with cryptographic facilities that hides the relations between communicating users. Mixes can be linked to mix chains. Due to the functionality of a mix, an attacker is not able to trace messages through the mix
A Mix-Mediated Anonymity Service and its Payment
315
network: The mix changes the appearance of the messages by using a suitable cryptosystem. Correlation by the message length is avoided if a length-preserving scheme is used for encryption. Time correlation is avoided as the mix collects all messages in its buffer and re-orders them before they are forwarded. This way, incoming and forwarded messages of a mix are not linkable. For the purpose of this paper we assume the existence of Arcs. For the anonymity network AlAs the following assumptions are made: The mixes f14~ of a mix chain .~PICk, that a message passes, have independent providers PM.. That means
Vk ~ 3i,j 9 ( M , , M 3 6 MCk) ~ (/)M, # P ~ , )
(1)
whereby a mix chain ,44C comprises m mixes with (~) being the concatenation of all used mixes. m
Vk 9 MC~ ~ N a s A MC~ = ( ~ M~
(2)
A message consists of several message blocks j. For sending it through the mix chain each message block Nlj has to be prepared according to Formula 3.
N(m+l)j := Cm+l(Nj) Nij :---- ci (rij, Ai+I, i(i+l)j)
(i -= 1,..., m)
(3)
where Nj is the message block j for the recipient whose address is Am+l, and Nij is the message block j for mix AJi with address Ai that is encrypted with the mix' public key ci. One can visualize this encryption process as follows: If Alice wants to send a postcard to Bob, she encloses it in an envelope such that only Bob is able to read the content. Additionally she encloses this envelope successively into further envelopes, whereby everyone carries the address of another post office, i.e. a mix address. Like that none of the post offices is in the position to link Alice and Bob. Let us come back to the subject. Each covering, which is put around the original message must contain random bits rij. Otherwise an eavesdropper could easily correlate messages as a mix works deterministic. He needs to encrypt only the output with the mix' public key and, following, to compare the result with all messages that arrived before at this mix. Note: We suggest the usage of a length-preserving scheme. That means all blocks have the same size. More details on this scheme can be found in [1]. Additionally, [9] and [10] give a good introduction to mixes.
2.2
Discussion o f Suitable P a y m e n t Schemes
Each mix provider must be paid for passing messages. There are two ways: (a) A mix provider is directly paid. (b) Only one mix provider of the mix chain is paid. In the latter, there must exist a general agreement (like for international
316
Elke Franz and Anja Jerichow
telephone calls). For example the money received could be distributed among all mixes according to the number of messages each mix has processed. There are several ways of payment: a message can be pre-paid, post-paid or paid immediately (pay-now). Note: In the following section, we focus only on the protocols between user and mix provider. The charging process, which involves the bank, will be discussed in Section 6. The problem with these schemes is that either the user, who paid in advance, has to trust that the mix will really process his message, or the mix, that already processed the message, has to trust that the user will subsequently pay for the service. And last but not least, if an anonymous pay-now system for on-line payment is used, then real-time communication cannot be guaranteed at all (due to the calculation time for on-line verifiction of the payment). Since we want to keep the advantage of one party over the other as small as possible, we propose an interactive micro-payment scheme that will be discussed in the following section. 3 3.1
The
Interactive
Micro-payment
Scheme
The Basic Idea
Instead of paying the full service 9 a message is split in small blocks, and 9 each block is paid when processed. By "paying when processed" we understand that the merchant receives the appropriate amount and immediately performs the service (to allow real-time communication). Section 6 will describe why the merchant can trust that encashment of the received money is possible later. Using this scheme, neither merchant nor customer is much ahead of the other party. The scheme is a micro-payment scheme since only small amounts of money, in the following called ticks, will be transferred. It is interactive because payment will be included in the actual message and dropped at the mix according to the payment strategy. The interactive micro-payment scheme combines two approaches: the concept of mixes and tick payment. 3.2
Tick Payment
Originally, ticks were used for authentication [11]. In 1995, Pederson investigated tick payment [12] as an efficient way of payment. We apply this principle to the concept of mixes. Since a tick means that only some additional bits are added to the message block, it want be much of a burden to the performance of mixes. The principle of tick payment is based on a one-way function. This oneway function is a length-preserving permutation f but, nevertheless, called hash function in the following. At the beginning a random number a is chosen. The hash function is applied n times to a. These results in hash values
A Mix-Mediated Anonymity Service and its Payment fl= f(a),
f2 = f(fl),
...,
fn = f(fn-l) : f(f(f(...f(a))))
317 (4)
where each hash value fn-~ is named a tick t~ (with i : 0 , . . . , n - 1). Ticks can be easily used for paying in small amounts. For example between a customer, i.e. a user, and a merchant, i.e. a mix, to pay for anonymous transferred information. 3.3
G e n e r a t i o n o f a Cheque
The user creates a digital cheque that includes f ~ generated according to Formula 4. Other parameters a cheque should contain are a cheque identifier to recognize the ticks that belong to this cheque, the amount-per-tick value, the address of the user's bank for encashment of the cheque, the recipient of the cheque (i.e. the mix' address), and the signature of the user. This cheque is transferred to and stored at a mix that was chosen by the user. Now, the user sends a tick with each message block. Note, at most n message blocks can be sent and paid.
user g e n e r a t e cheque sends first block
mix send
cheque(/n)
send
m l ~1
>
s t o r e fn
9
if
f(tl) = fn then f o r w a r d ml and store/n-l(= tl)
As a general rule, the mix stores a hash value f~-3+1. If it receives the next tick tj, it applies the publicly known hash function f and compares this result with the stored hash value. If they are equal, the mix has received the correct payment. sends following blocks
send
m j ,tj
,
i f f(tj) = fr~-j+l t h e n forward m j and store
fn-~(= tj)
The mix provider can encash the cheque whenever it is wished. He only has to send the cheque and the last tick received to the bank. Since the bank has supplied the money for generating the value of this cheque, encashment will be no problem (see Section 6 for details on that).
3.4
Different Approaches
There are two aspects to consider: 1. When does the mix provider receive payment, i.e. before or after performing his service (the delivery of a message block)? 2. When is the tick sent, i.e. has the sender or the recipient triggered the payment?
318
Elke Franz and Anja Jerichow
D e l i v e r y a f t e r P a y m e n t . If the sender triggers the payment (Pl.a), she includes the digital cheque ch and the first tick tl in the first block of the message. Protocol PI: (a )
r
m l ,ch,tl
> A/'~s
rnl
- g
(b)
A m~,~h,~.(t~) ., N ~ s
c.(t~) <
tl
,g
ml m j ,t o
rnj
m j ,CB (t~ )
CB (to) ti ms
The mix network stores the cheque and forwards the first message block rnl. The following blocks include only the appropriate tick for each message block m j . Thus the anonymity service is immediately paid for, further steps are not necessary. If the recipient triggers the payment (Pl.b), the first block of the message includes the cheque and the first tick. All ticks must be encrypted with the public key CB of the recipient such that the mix is not able to get the payment before the recipient has sent back a decrypted tick. The mix stores the cheque and keeps the message block. Then it sends the encrypted tick to the recipient of the message to indicate that a message will be sent after receiving the decrypted tick. Hence, the recipient decrypts the tick and sends it back to the mix network. After receiving this tick, the mix will send the message block. The following blocks are handled in exactly the same way. However, if the recipient triggers the payment, two additional steps for the transfer of each message block are needed. P a y m e n t a f t e r Delivery. If the sender triggers the payment (P2.a), she sends the first message block together with the cheque. The mix network stores the cheque and sends the message block to the recipient. After that the recipient sends an ok to the sender of the message. Hence, the sender can be sure that her message block has arrived, i.e. she has a guarantee of message delivery. She then sends the tick to the mix network. For the following blocks, payment is triggered by the sender in the same way. Protocol P2: - A;~s
ok
- g
(b)
A
ml~ch,cB(tl)
ml~cB(tl)
NAs
,/3
t~
tl ms ok
m j ,CB (t j) ,<
tj
t3
To improve efficiency, tj could always be sent together with the following message block m j+l. Nevertheless, using this approach the delay by the mix network slows down the data rate.
A Mix-Mediated Anonymity Service and its Payment
319
If the recipient triggers the payment (P2.b), the ticks must be encrypted as well. The cheque is included in the first message block. After receiving a block, the recipient sends back the decrypted tick to the mix network. In practice this happens by encrypting it again. The sender does not get an acknowledgement whether the recipient has got the message blocks. But the mixes will be interested in sending the messages to the recipient to get their payment. This protocol is more efficient than (P2a). 3.5
Remarks
In Section 3.4 we pointed out which measure is a burden to the performance of the system. In summary, the following should be preferred: If the main goal is to achieve efficiency, one should use the protocol for "delivery after payment" where the sender triggers the payment. If one wants to assure that the addressee has really received his message, then the protocol for payment after delivery should be considered. This protocol will still allow an efficient implementation when the recipient triggers the payment. In Section 2.2, we mentioned two ways of payment. We will now discuss which is applicable for the preferred payment schemes. In case (a) of Sect.2.2, if each provider is directly paid, a digital cheque must be included for each mix of the chain. The problem with that: If a mix continuously receives ticks, it can correlate these ticks and identify the communication relation. Thus, this protocol should only be applied for special services where all users of a mix chain simultaneously send messages (and ticks). I.e. they behave in the same manner and, therefore, belong to the same anonymity group. But this cannot be demanded for all kinds of services. One example where this can be applied are connection-oriented services. For example communication with Real-Time Mixes [13] via ISDN: All users of a local exchange belong to one anonymity group and synchronously establish channels for data transfer (sending dummy traffic if no real data are to be transmitted). However, if all participants have to behave in the same manner, then more efficient measures for payment can be applied such as a flat-rate scheme. Thus, we will focus on case (b) of Sect.2.2 where the problem described above is not really a problem as long as the mix that receives the cheque is either the first or the last mix of the chain. This is due to the fact that, according to the mix scheme, these mixes may know sender or recipient anyway. In the following section we will describe detailed protocols for "delivery after payment" with the first mix receiving the cheque and "payment after delivery" with the last mix receiving the cheque. 4
4.1
Detailed
Protocols
Delivery after Payment
For protocols with delivery after payment, the first mix is used for accepting the payment that the sender will trigger. Since it is not necessary to pass the cheque
320
Elke Franz and Anja Jerichow
and the ticks through the whole mix network, the extending of the message length can be limited: Only the message N U prepared for the first mix must be extended for each message block j. After passing this mix, the additional data are removed. The sender generates her message as follows: first message block N ( m + l ) l :~-~ Nil :--
Cm+l(N1)
Nil
:=
cl(rll,A2,N21, cha, tl)
N(m+l)j := Nij := Nlj :=
Cm+l(Nj) ci (rij, Ai+l, N(i+I)j) ci(rlj,A2,N2j, ida, t3) )
c~(ril, A i + l , N ( i + I ) I )
(5)
following message blocks
(6)
She then sends the first block N l l it to the first mix according to Formula 5. A
Cl (rll ,AM 2 ,Y21, chA, tl )
~
M1
N21
>
...
N(m+l)l
>
B
The cheque chA contains an identifier ida as well as additional parameters (compare with Section 3.3). The mix stores cheque and tick and, at the same time, performs its service, i.e. forwards N21. Thus, tl is the payment for the first block. According to Formula 6, every following message block also contains a tick. Moreover, it contains the identifier idA to assign tj to the appropriate cheque.
A
Cl (rlj ,AM 2 ,N~j, idA, tj )
~
M1
N2~
>
...
N(ra +x)j
)
B
The mix can check the correctness of the tick as described in Section 3.2 while passing the processed message to the next address. Thus, the mix can be sure that it will get the payment for its service. Note: Another possibility to assign a tick to the cheque is that the mix could calculate the predecessor of every received tick by applying the hash function to it. Then it could search for the appropriate cheque according to the calculated value (compare 3.3).
4.2
Payment after Delivery
As motivated in Section 3.5, to improve the efficiency the recipient should trigger the payment for payment-after-delivery protocols and, therefore, the last mix is used for handling the payment.
A Mix-Mediated Anonymity Service and its Payment
321
The sender generates his message as follows: Cm+l (N1, tl)
N(m+l)l := Nml := Nil
am (rml, Am+l, N(m+l)l, c h A ) c i ( r i l , A~+I, N(i+1)1)
:=
(7) N ( m + l ) j :...~ Nmj :=
Nij
:=
cm-t-l(Nj, idA, t j ) Ci ( r m j , Am+i, N ( m + l ) j ) c i ( r i j , A~+I, g ( i + l ) j )
Though the sender does not get an acknowledge message from the recipient, the guarantee of message receipt is indirectly achieved: The mix network will send t h e message blocks to the recipient to get the payment for its service in exchange. Here, the user is ahead of the mix chain provider. A
Nll
9 99
cm (rml ,Am+l ,N(m+l)l, chA) J~m- 1 ~
Mm
Cm-~l(N1, tl)
~
B
Nevertheless, this advantage is kept small: The next message block is not sent before the mix has received and successfully checked the tick for the previous message block. Therefore, the provider can only be defrauded of one tick. cm (rm5 ,Am+l ,N(m+l)j)
N13
A
" ...
Mm-1
>
Mm
Cm+l (N 3 , idA, tj) < )~ cm(idA,tj)
B
Note: In this protocol, we use identifiers such that the mix knows what cheques the ticks are referring to. The recipient must encrypt tick and identifier, too, when triggering the payment. Otherwise, an eavesdropper could easily link the payment data.
5
Discussion on Attacks and Efficiency
5.1
Discussion
of Possible
Attacks
This section discusses the possibilities of attacks on anonymity. There are two questions we want to answer: -
Are there possibilities for attacks by others to prevent either the correct payment or the service? Are there possibilities for attacks by one of the participating parties to defraud the other party?
To the first concern, the mix could deny its service. Denial-of-service attacks are a problem in general. For example an attacker could try to jam the protocol by intercepting either messages or ticks. To recognize this, additional measures are needed such that the users can check the correct work of the mixes and stop the transmission of ticks if necessary. For example, mixes could make a commitment of the processing of all messages arriving in future, which then is
322
Elke Franz and Anja Jerichow
checkable by the users (see [14] for a detailed discussion on the approach to commitment schemes). To keep the anonymity of the users, the cheques only include pseudonyms instead of real user information. A pseudonym is a unique name and cannot be connected to the real user identity. But there is one remaining threat for the users of an anonymity service where cheque payment is applied: The cheques are used for many times. So at least the providers are able to collect information about the users even though only under pseudonyms. However, with additional information beyond the anonymity system, the providers could be able to derive information about the real users. We minimized this problem allowing payment transactions only by the first or the last mix of a chain. Here are two more possibilities for avoiding this problem. At first, the cheques could include not too large amounts. Thus the mixes can only collect information about the limited number of user transactions. It is the responsibility of the users to limit this information. The other possibility is that a person uses several cheques. A mix provider does then not know what cheques, represented by different pseudonyms, belong to the same person. Thus, linking users and the information yielded by the cheques becomes more difficult. However, this variant is limited by the capability of the mixes: If the mix system is to serve a large number of users, the number of possible cheques for each user is inversely proportional. Last but not least, charging protocols are needed to assure a correct payment. The usage of tamper proof hardware is the approach that we prefer. Section 6 describes in detail how the forging of cheques can be prevented by this. The discussed problems show that the only usage of anonymous payment protocols is not sufficient to allow anonymous payment. We must also consider the boundary conditions of the system that provides anonymity. 5.2
E x t e n t i o n o f the Message Length by Payment Data
Depending on who triggers the payment, payment data are included in either the outer- or the innermost shell of a message block (compare with Formulae 5, 6 and 7 in Section 4). The traffic load is not really extended if length-preserving schemes are used. Instead, payment data shorten the available space for real user data. Due to this fact an additional message block may be needed. Since a tick is just a small addition to the protocol, this increasing of the traffic load may be negligible.
5.3
Using Hybrid Encrypting Schemes to Improve the Efficiency
The introduced protocols are based on asymmetric encryption schemes. For improving the efficiency, the usage of hybrid schemes is to be preferred. Such mix schemes that are also length preserving schemes use symmetric encryption for the main part of the message, which allows much faster en- and decryption. The message is split into the real message and a header that includes address information. The mix handles these two parts separately.
A Mix-Mediated Anonymity Service and its Payment
323
The user has to prepare the address part first. Afterwards he encrypts the message with the generated secret keys. I.e. the first block for each mix is encrypted with its public key. It contains a secret key kij for a symmetric cryptosystem that is then used to encrypt the remaining part of the message. Thus, only one block must be encrypted or decrypted with an asymmetric cryptosystem. All other blocks, the remaining part of the header and the real message are then processed using the symmetric cryptosystem. Formula 8 shows the generation of the address part.
Rcm+l~j := [e~] R,j
:= [c~(kij, Ai+l), k~j(Rr
(i = m , . . . , 1)
(s)
In summary, hybrid cryptosystems improve the efficiency of our protocols.
6 6.1
Charging: P r o t o c o l s b e t w e e n U s e r s , N e t w o r k and B a n k General
Former sections discuss the protocols between users and service providers. They are necessary to transmit the payment data to the providers. In this section we also consider the protocol between provider and bank for the encashment of cheques according to [15]. In general, there are two possible protocols for encashment at the bank: online and off-line. If an on-line protocol is used, the mix does not perform its service, i.e. the transmission of the appropriate message, before it has received the credit entry from the bank. Thus, it can be sure that it really will get the payment from the users. Also, double-spending detection is possible that way. But there is an important disadvantage: The encashment request of the mix for all message blocks slows down the message transfer. This is for real-time communication not applicable. The other possibility is to use an off-line protocol. The mix immediately performs its service, i.e. it sends the message to the next mix node or the recipient. Afterwards it sends an encashment request to the bank. Thus, this protocol does not slow down the message transfer. That is why we have chosen this variant for payment in mix-mediated anonymity communications. But another problem occurs: The mix cannot be sure whether it really will get the payment for its service. To avoid this, we investigated the usage of tamper proof hardware, which will be discussed in the following section. This device also prevents double spending. Therefore, an on-line double spending detection is unnecessary.
6.2
U s i n g a Tamper P r o o f Hardware D e v i c e
To generate the necessary cheques for the providers, every user owns a tamper proof hardware device (TPH). For details about building and using a T PH we refer to [16]. The task of the T PH is to achieve trustworthiness for all in the payment process, i.e. the users, the bank and the providers.
324
Elke Franz and Anja Jerichow
That means: - for the bank: 9 It is not possible that users overdraw their accounts while creating cheques. 9 The bank is able to recognize cheques that one of its customers created with a TPH. Only these cheques will be accepted. for the users: 9 The billing on their accounts will be correctly performed. 9 The made out cheques are valid. 9 The user anonymity is kept against both the provider and the bank. - for the providers: 9 They can recognize whether the cheques are valid. 9 They know that cheques, generated with a TPH, will be accepted by the bank and that they will get the correct amount for them. 9 The cheques are not usable by others. -
The requirements on the TPH
can be derived from these points:
- The tamper proof and the correct functionality of the TPH must be guaranteed. To keep user anonymity, the identity of the users must not be recognizable from the cheques (untraceability). Moreover, the non-correlation of different cheques must be guaranteed. - To avoid the usage of forged cheques, there must be ways to decide whether a cheque was created with a TPH. This must be possible for both the bank and the provider. - The bank must be able to test the correct identity of the provider who wants to cash a cheque. -
6.3
Sketching the Charging Process
Every user owns two accounts. One is the real user account at the bank, the other one is an account t h a t is managed by the T P H . The former is designated as bank account, the latter as T P H account. The T P H account is used to make out cheques. For enabling the T P H to generate cheques, it must be loaded. Loading means to transfer money from the bank account to the T P H account. The T P H is used for communication between the user and the bank as well as between the user and the provider. Due to the design of the T P H , i.e. it consists of a user-managed and a b a n k - m a n a g e d part to perform user and bank transactions, bank and user are t a m p e r proved against each other (compare with the wallet-observer concept in [17]). Thus, if the necessary functions for the charging process are performed by the T P H , all parties can be sure about the correct execution. First, the user must load the T P H . This happens to be an on-line process between bank and user, but, in fact, it is independent from using the mixmediated service. Loading the T P H means to withdraw m o n e y from the real
A Mix-Mediated Anonymity Service and its Payment
325
user account. Thus, the user selects the amount that he wants to withdraw. If there is enough money on the bank account, the amount is transferred to the T P H where it is deposited to the T PH account. Now the cheques can be created off-line. If the TP H account is empty, the T PH must be loaded again. Second, if the user wants to send a cheque, it has to be generated. Thus, the T P H makes out this cheque as described in Section 3.2. The maximum amount of the cheque can be calculated from the number of ticks and the amount per tick. Therefore, the T PH is also responsible for generating the ticks. Now the cheque can be used for payment of anonymous communication. For enabling the providers and the bank to test the correctness of a cheque, the included data must be signed by the bank. This procedure is also performed off-line in the TP H as the correct withdrawal from the bank account is already done and the TP H allows only cheques in the range of the balance of the T PH account. As mentioned in Section 6.2, user anonymity must also be kept against the bank. The TP H is responsible that the bank does not have a possibility to trace the cheques. For this reason it uses blind signatures [18]. I.e. the cheques will be blinded before they are signed by the bank transaction part of the TPH. The sign key of the bank is stored in this part of the TPH. Therefore, the signing can also be performed off-line. Then the cheques are sent to the providers. They can test their correctness by checking the bank signature. If the providers have also received the ticks for the cheques, they can cash them at the bank. On the other hand, the bank must test that the provider who sent the cheque was the intended recipient of the cheque. This can be easily performed: The cheque could include the public key of the provider as sign for the recipient. The bank could use a challenge-response protocol to check the identity of the providers: It sends a random number to the provider. The provider must encrypt this number with its private key. The result is sent back to the bank that tries to decrypt it with the public key included in the cheque. If the bank has also successfully tested its own signature included in the cheque, it can be sure that this cheque is correct, and the money is paid to the provider. Thus, both the bank and the providers can trust the payment anonymous performed by the users.
7
Summary
and
Outlook
In this paper, we introduced different protocols for payment of an anonymity service provided by mixes. To keep the advantage of one party over the other as small as possible, we suggested the usage of an interactive micro-payment scheme that combines the anonymity scheme "mixes" and tick payment. There exits no implementation of this approach but the application to existing systems like onion routing [19] and mixmaster [2] should be feasible. In our protocols, we assumed that the sender has to pay for the anonymity service and, therefore, used encryption schemes that maintain the anonymity of the sender. However, the recipient can maintain his anonymity too by us-
326
Elke Franz and Anja Jerichow
ing mixes. However, the application of the protocols to this case seems to be straightforward. The protocols suggested allow payment before and after delivery. We explained the protocols for asymmetric encrypting schemes. The application to the more efficient hybrid encrypting schemes was sketched and should be feasible. Hash functions without trapdoor are the basis for the described protocols. Whether trapdoor hash functions are suitable and perhaps more efficient for payment of the mix providers, must also be investigated in future. We pointed out possible threats to estimate the achieved anonymity of the protocols. We can summarize that it makes sense just to focus on the mix network. The discussion of boundary conditions, especially the relations between users, network and providers, must be considered as well to allow secure anonymous payment. When the message transfer is not complete, there are problems for both, the user and the provider. Additional functions or protocols steps are necessary to guarantee the security and the fairness of the payment. These extensions and, moreover, possible risks if payment is triggered by the recipient is another point to be further discussed. Another open question is how to solve the sharing of payment between the sender and the recipient. Attacks, regarding the sharing, are of special interest. On one hand, the users must not pay less than the agreed amount of money. On the other hand, it must be avoided that the providers can get more money than necessary. In summary, though there are still topics for further investigation, this paper has shown that payment does not necessary break anonymity. We want to thank Guntram Wicke, Prof. Dr. Andreas Pfitzmann and Andreas Graubner for many helpful discussions.
References 1. David Chaum, "Untraceable Electronic Mail, Return Addresses, and Digital Pseudonyms", Communications of the A CM 24/2 (1981) 84-88. 2. L. Cottrell, "Mixmaster and Remailer Attacks", http ://www. obscura, com/~ loki/remailer/remailer-essay, html. 3. Andreas Pfitzmann, Birgit Pfitzmann and Michael Waidner, "ISDN-MIXes Untraceable Communication with Very Small Bandwidth Overhead", 7th IFIP International Conference on Information Security (IFIP/SEC'91), Elsevier, Amsterdam 1991. 4. David Goldschlag, Michael Reed and Paul Syverson, "Hiding Routing Information", Workshop on Information Hiding, LNCS 1174, Springer-Verlag, Berlin 1996, 137-150. 5. C. Gfilcfi and G. Tsudik, "Mixing E- mail with BABEL", ISOC Symposium on Network and Distributed System Security, 1996, 2-16.
A Mix-Mediated Anonymity Service and its Payment
327
6. Dogan Kesdogan, Hannes Federrath, Anja Jerichow, and Andreas Pfitzmann, "Location management strategies increasing privacy in mobile communication", 12th IFIP International Conference on Information Security (IFIP/SEC'96), Chapman & Hall, London 1996, 39-48. 7. T. Lopatic, C. Eckert, and U. Baumgarten, "MMIP - Mixed Mobile Internet Protocol", CMS 97 - Communications and Multimedia Security, IFIP TC-6 and TC-11, 22-23 Sept. 1997 in Athens (Greece). 8. E. Franz, A. Jerichow, and G. Wicke, "Payment Scheme for Mixes Providing Anonymity", International IFIP Working Conference on Electronic Commerce 98; June 4-5th 1998, LNCS 1402, Springer-Verlag, Berlin 1998, 94-108. 9. Andreas Pfitzmann and Michael Waidner, "Networks Without User Observability", Computers eJ Security 6/2, 158-166, February 1987. 10. E. Franz, A. Graubner, A. Jerichow, and A. Pfitzmann, "Modelling mix-mediated anonymous communication and preventing pool-mode attacks", 14th IFIP International Conference on Information Security (IFIP/SEC'98), Chapman &: Hall, London 1998. 11. Bruce Schneier, Applied Cryptography: Protocols, Algorithms, and Source Code in C, John Wiley & Sons, (2nd ed.) New York 1996. 12. Torben P. Pedersen "Electronic Payments of Small Amounts", Security Protocols 1996, LNCS 1189, Springer-Verlag, Berlin 1997, 59-68. 13. Anja Jerichow, Jan M/iller, Andreas Pfitzmann, Birgit Pfitzmann, and Michael Waidner, "Real-Time Mixes: A Bandwidth-Efficient Anonymity Protocol", IEEE Journal on Selected Areas in Communications, Special issue "Copyright and privacy protection", 4(1998). 14. E. Franz, A. Graubner, A. Jerichow, and A. Pfitzmann, "Comparison of commitment schemes used in mix-mediated anonymous communication for preventing pool-mode attacks", Australasian Conference on Information Security and Privacy (ACISP'98) in Brisbane, LNCS, Springer-Verlag. 15. Antoon Bosselaers, Ronald Cramer, Rolf Michelsen, Stig Mjr Frank Mfiller, Torben Pedersen, Birgit Pfitzmann, Cristian Radu, Peter de Rooij, Berry Schoenmakers, and Matthias Schunter, "Functionality of the Basic Protocols", CAFE Public Report IHS8341, CWI Amsterdam, October 7, 1995; CAFE (Esprit 7023) Deliverable IHS 8341 (confidential), September1,1995. 16. Andreas Pfitzmann, Birgit Pfitzmann, Matthias Schunter, and Michael Waidner, "Trusting Mobile User Devices and Security Modules", Computer 30/2 (1997) 6168. 17. David Chaum, "Design Concepts for Tamper Responding Systems", Crypto '83, Plenum Press, New York 1984, 387-392. 18. David Chaum, "Blind Signatures for untraceable payments", Crypto '82, Plenum Press, New York 1983, 199-203. 19. Paul F. Syverson, David M. Goldschlag, and Michael G. Reed, "Anonymous Connections and Onion Routing", IEEE Symposium on Security and Privacy, IEEE Computer Press, 1997, 44-54.
A Subjective Metric of Authentication Audun Jr Telenor R&D, 7005Trondheim, Norway
[email protected]
A b s t r a c t . Determining the authenticity of public keys in large-scale open networks can not be based on certificates alone, but must also include the binding between the key used for certification and it's owner, as well as the trust relationships between individual agents. This paper describes a method for computing authenticity measures based on certificates, on key binding, and on trust relationships. Two essential elements of the method are the opinion model which is a radically new way of representing trust, and subjective logic which consists of a set of logical operators for combining opinions. We show that our method for computing authenticity measures can be applied to both anarchic and hierarchic authentication networks.
1
Introduction
Public key c r y p t o g r a p h y seems to be the technical solution for securing global open telecommunication networks. The problem however is to find a reliable way of determining the authenticity of public keys in a large-scale open network. For this purpose, pure c r y p t o g r a p h y is not enough. It is for example not conceivable to have a single global authority t h a t is trusted for key generation and distribution. There will always be different administrative domains which typically will have conflicting economical and political interests. In this situation, each agent has to decide for herself which other agents she wants to trust, and based on this determine the legitimacy of certificates and the authenticity of keys. Several metrics of authentication have been proposed in the literature, such as the BBK model [BBK94], the P G P model [Zim95], the Maurer model [Mau96], and the Reiter-Stubblebine model [RS97]. These models will be briefly discussed in the next section, but we can already mention t h a t they all represent trust either as a discrete or as a continuous p a r a m e t e r in the range [0, 1]. All the mentioned models have drawbacks which we will t r y to overcome. In particular, we find neither the discrete, nor the continuous approaches satisfactory. The discrete are insufficient because they only provide a small set of possible trust values. The continuous and probability oriented models fall short because the operators for combining trust values often seem counterintuitive. This indicates t h a t by modelling trust as probability, i m p o r t a n t aspects of trust as a human cognitive phenomenon is missing. In our view, the missing component is ignorance and uncertainty which can not be reflected by probabilities.
330
Audun Josang
Recently, Reiter and Stubblebine[RS97] have proposed design principles for defining a metric for trust in authentication schemes. In our opinion, our model satisfies all principles. In particular we have solved the problem of the binding between key and key owner, which is only implicitly assumed in many previous approaches, but explicitly expressed in our approach. The computational model expresses trust as a two-dimensional metric, and uses special logical operators which are both intuitive and which have a sound mathematical basis. The two dimensional metric is called an opinion and can also be interpreted as a measure of uncertain probabilities. As such the opinion model and the set of logical operators together represent a logic of uncertain probabilities. The logic which is called subjective logic1 seems particularly suitable for modelling authentication networks. In particular we are able to model recommendation of certificates along paths of agents, and the combination of possibly conflicting recommendations in order to reach a consensus. 2 2.1
Previously
Proposed
Metrics
The BBK Model
The BBK model[BBK94] represents trust relationships between agents as edges, so that A ~ B means that agent A trusts agent B for example to determine the authenticity of public keys, and A - - -+ B that agent A trusts agent B to recommend other agents for the purpose of determining the authenticity of public keys, where trust is expressed as a value in the range [0, 1]. The BBK model has several weaknesses that have been pointed out in [Jcs97c,RS97]. Its recommendation and consensus operators are found to be easily manipulable and thereby inadequate for determining authenticity. 2.2
The Maurer Model
As in the BBK model, the Maurer model[Mau96] is based on direct and recommended trust represented as edges in a graph, but the interpretation of the edges is different. A direct edge A ~ B means that the user evaluating the metric "holds a certificate for B's public key(allegedly) issued and signed by entity A". Similarly, a recommendation edge A - - -4 B denotes that the user is in possession of a recommendation (for recommending or authenticating other entities) for B allegedly signed by entity A. Associated with each direct edge and recommendation edge is a value [0, 1], called a confidence parameter, that is assigned by the entity that created (the construct represented by) the edge. The Maurer model depends on an assumed binding between certificate and certifier. However, as pointed out in [RS97] agents do not sign certificates, keys do. To repeat from the previous paragraph, the edge A ~ B exists in the Maurer model if the user evaluating the metric "holds a certificate for B's public key (allegedly) issued and signed by entity A" ([Mau96] Definition 3.1). Maurer 1 A more detailed description of subject logic can be found in [Jcs97a,Jcs97d].
A Subjective Metric of Authentication
331
uses the word "allegedly" because "without verification, there exists no evidence that the certificate was indeed issued by the claimed entity". P u t another way, when the entity that allegedly signed the certificate is claimed with the certificate, this claim is at best a hint, and at worst an opportunity to be misled. It is therefore ambiguous how certificates should be represented in the Maurer model. A similar concern arises in the BBK model described in the previous section. Evaluating a metric requires the user to collect values from other entities for the various direct and recommendation edges. However, before the user can safely assign a value to the edge A ~ B or A - - --+ B, the user must authenticate this value as having come from A. Assuming that this authentication is performed cryptographically, (e.g. via certificates), again the user is asked to determine a key that can be used to authenticate A. As a result, the answer to the original authentication problem is a new and identical authentication problem, which therefore never can be solved. This fundamental flaw in both the Maurer and the BBK model makes them totally unsuitable for their intended usage. 2.3
The PGP Model
P G P (Pretty Good Privacy [Zim95]) contains one of the most popular civilian public key management systems in the world today. It is based on a representation of public keys and their certificates as a graph where the nodes are keys, and an edge K1 ~/(2 represent a certificate binding the signing private part of key K1 to the signed public part of k e y / ( 2 . The binding is verifiable by the public part of K1. Associated with each public key in the user's database is a trust value which can be unknown, untrusted, marginally trusted or fully trusted. P G P computes the legitimacy of each key as follows: P G P first declares legitimate the node Ko representing the users own public key and any node K~ such that Ko ) K~ is an edge in the graph. Now, if for some node Ky, there is an edge to Ky from a legitimate fully trusted node, then Ky is considered legitimate also. Alternatively, if for some node Kz, there are a minimum number of edges to Kz from marginally trusted nodes, then Kz is considered legitimate. The number of edges required from fully trusted or marginally trusted nodes can be adjusted, but 1 and 2 are the defaults. In practice, determination of node legitimacy is interwoven with assigning trust values to nodes. T h a t is, a trust value for certifying other keys is assigned to a key only after it has been determined to be legitimate, and thus its owner assumed to be known. The P G P trust model has several merits and does not have the defect of the BBK and Maurer models that the binding between a key owner and the certificate is assumed as part of the certificate. P G P first determines the binding between the owner and the key in the form of legitimacy, and then lets the user specify the trust in the certificates produced by the key. This makes P G P practical in real application, as its widespread usage also testifies. However, the limitations of the trust model in P G P is that the rules fore determining legitimacy and the trust values themselves are discrete. In Sec.5, we will build on the P G P approach by separating certification and binding, and use a twodimensional continuous metric which will be described in Sec.3.
332 2.4
Audun Josang The Reiter-Stubblebine Model
As in PGP, the Reiter-Stubblebine model[RS97] is based on representing keys and their certificates as nodes and edges in a directed graph, so that the edge /(1 ~ /(2 represents that the user is in possession of a certificate that signs Ks, and who's signature can be verified by using the public part of key K1. Each edge also has a numeric value associated that represents the amount of money for which the owner of K1 insures the attributes of/(2, i.e. the value for which the owner of/(1 will be liable to the user if the attributes bound t o / ( 2 in the certificate are incorrect, or if the private part of/(2 is used to mislead the user, intentionally or otherwise. In particular, if the private key corresponding to/(2 is compromised and used maliciously, then the owner of/(1 is liable for the stated amount. This form of insurance is called surety bonding. The insurance label of the edge/(1 ~ K~ must be obtained from the owner of/(1 in some reliable way, and so this value is stored in the certificate that K1 ~/(2 represents. This does not force the user to make assumptions about the true owner of/(1 as in the BBK and Maurer models. In fact it is possible t h a t / ( 1 has been compromised and used to forge the certificate/(1 ~ /(2, including its attributes and insurance value. In this case, whoever certified/(1 would be liable, and this can regress along a path arbitrarily far. By assuming the existence of an underlying framework for contract enforcement and insurance collection, they introduce an element which could have been used to enforce the security policy of the key authentication scheme in the first place, and thereby making the proposed model obsolete. What the authors try to achieve with the model is to shift all the burden of damages incurred by a misused key to whoever has certified the key. As a result, certifiers would have to be extremely careful, undoubtedly resulting in a very conservative behaviour. As such, this is not a model for determining trust, but a method for making certifiers behave carefully and correctly, and thereby for creating more trust in the certified keys. As already mentioned, by requiring contract and insurance payment enforcement, the method seems to be impractical at best, and to make itself obsolete at worst. 3
Representing
Trust
Mathematically
The evidence space and the opinion space are two equivalent models of representing uncertain probabilities, and this duality will be very useful for determining uncertain probabilities and for giving them intuitive interpretations. 3.1
T h e Evidence Space
The mathematical analysis leading to the expression for posteriori probability estimates of binary events can be found in many text books on probability theory, e.g. [CB90] p.298, and we will only present the results here.
A Subjective Metric of Authentication
333
It can be shown that posteriori probabilities of binary events can be represented by the beta distribution function. The beta-family of distributions is a continuous family of functions indexed by the two parameters a and ~. The beta(a, t3) distribution can be expressed using the gamma function F as:
F(a+t3) Oa_1(1_0) ~-1
0<0<1,
/(o I~, ~) - r ( a ) r ( Z )
a>0,
~>0
(1)
with the restriction that 0 ~ 0 if a < 1, and 0 ~ 1 if/3 < 1. The mean value of the beta distribution is given by E(O) = a/(a +/3). We will in the following only consider the subclass of beta distributions which we will call probability certainty density functions or pcdf for short. We will denote by 9 the set of pcdfs. In our notation, pcdfs will be characterised by the parameters {r, s} instead of {a, fl} through the following correspondence: a=r+l,
r>_O
~=s+l,
s>o.
and
(2)
Let ~ be a pcdf over the probability variable 0. In our notation ~ will then be characterised by r and s according to:
~ ( 0 1 r, s) = r ( r
F (+r + s+2) 1)F(s +
1)
0T(l_0)8
'
0<00,
-
s>0 -
(3)
The mean value of a pcdf is given by E(0) = (r + 1)/(r + s + 2). As an example, assume that an entity has produced r = 8 positive and s = 1 negative events. The pcdf expressed as ~(018, 1) is plotted in Fig.1.
0'.2
0.4
0.6
0.8
Fig. 1. Posteriori pcdf after 8 positive and 1 negative results
The curve plotted in Fig.1 must not be confused with an ordinary probability density function. A pcdf represents the certainty density regarding the expected probability of a binary event, and not the distribution of probabilities. This is explained in more detail in [Jcs97d].
334
3.2
Audun J0sang
The Opinion Space
For the purpose of believing a binary proposition such as for example: the key is authentic, we assume that the proposition will either be true or false, and not something in between9 However, because of our imperfect knowledge it is impossible to know with certainty whether it is true or false, so that we can only have an opinion about it, which translates into degrees of belief or disbelief as well as uncertainty which fills the void in the absence of both belief and disbelief. We express this mathematically as: b + d + u = 1,
{b, d, u} e [0, 1]3
(4)
where b, d and u designate belief, disbelief and uncertainty respectively9 Eq.(4) defines the triangle of Fig.2, and an opinion can be uniquely described as a point {b, d, u} in the triangle.
Definition 1. Let w = {b, d, u} be a triplet satisfying (4) where the first, second and third component correspond to belief, disbelief and uncertainty respectively9 Then o~ is called an opinion. We will denote by $2 the set of opinions9 [] As an example, the opinion w = {0 9 which corresponds the the pcdf of Fig.1 is represented as a point in the figure. Uncertainty ,,1
, ~:
- - . " "
- - . "t,
,--J'i--;:", ,,,:__.1,:_._.'.:.'___:, '___:<..__g....), Disbelief~ 1
L
- "-
' %":--gi---~)" L ) Belief
: 0
1
Fig. 2. Opinion Triangle
The horizontal bottom line between belief and disbelief in Fig.2 represents situations without uncertainty and is equivalent to a traditional probability model. Uncertainty is caused by the lack of evidence to support either belief or disbelief. In order to illustrate the interpretation of the uncertainty component we will use the following example, which is cited from JEll61]. Let us suppose that you confront two urns containing red and black balls, from one of which a ball will be drawn at random. To "bet on Redi" will mean that you choose to draw from Urn I; and that you will receive a prize a (say $100) if you draw a red ball and a smaller amount b (say $0) if you draw a black.
A Subjective Metric of Authentication
335
You have the following information: Urn I contains 100 red and black balls, but in ratio entirely unknown to you; there may be from 0 to 100 red balls. In Urn II, you confirm that there are exactly 50 red and 50 black balls. For Urn II, most people would agree that the probability of drawing a red ball is 0.5, because the chances of winning or loosing a bet on Redli are equal. For Urn I however, it is not obvious. If however one was forced to make a bet on Redi, most people would agree that the chances also are equal, so that the probability of drawing a red ball also in this case must be 0.5. This example illustrates extreme cases of probability, one which is totally certain, and the other which is totally uncertain, but interestingly they are both 0.5. In real situations, a probability estimate can never be absolutely certain, and a single valued probability estimate is always inadequate for expressing an observer's subjective belief regarding a real situation. By using opinions the degree of (un)certainty can easily be expressed such that the opinions about Redi and Redli become wi = {0, 0, 1} and Wli = {0.5, 0.5, 0.0} respectively. Opinions are considered individual, and will therefore have an ownership assigned whenever relevant. In our notation, superscripts indicate ownership, and subscripts indicate the proposition to which the opinion apply. For example Wp A is an opinion held by agent A about the truth of proposition p.
3.3
Equivalence between the Evidence and Opinion Spaces
We have defined # to be the class of pcdfs, a n d / 2 to be the class of opinions. Let
Wp = {bp, dp, Up} be an agent's opinion about a binary event p, and let ~p(rp, sp) be the same agent's probability estimate regarding p expressed as a pcdf. We now define Wp as a function of ~p(rp, Sp) according to:
bp dp Up
rl' rp+sp+l Sp
r~+~p+l
(5)
rpWsp+l
We see for example that the uniform ~(0, 0) corresponds to w = {0, 0, 1} which expresses total uncertainty, that ~(co, 0) or the absolute probability corresponds to w = {1, 0, 0} which expresses absolute belief, and that ~(0, oo) or the zero probability corresponds to w -- {0, 1, 0} which expresses absolute disbelief. By defining w as a function of ~ according to (5), the interpretation of w corresponds exactly to the interpretation of ~. Strictly speaking, opinions without uncertainty, such as for example w = {0.5, 0.5, 0}, do not have a clear equivalent representation as pcdf because the {r, s} parameters would explode. In order to avoid this problem, we can define $2' to be the class of opinions with non-zero uncertainty, that is with u # 0. Eq.(5) defines a bijective mapping between the evidence space and the opinion space so that any pcdf has an equivalent mathematical and interpretative representation as an opinion and vice versa, making it possible to produce opinions based on statistical evidence.
336
3.4
Audun Josang
Representing Trust
From an information security point of view, it can be observed that humans are trusted because they are believed to be honest whereas systems are trusted because they are believed to be secure [Jcs96], and this will form the basis for our definition of trust. Imagine an observer A who is considering her trust in a particular system. She can form the proposition p: "The system will resist malicious attacks." Now, her trust in the system will be her belief in p, expressed as wA. Let the same observer A consider her trust in a particular human agent. She must assume that the agent will either cooperate or defect. She can form the proposition q: "The agent will cooperate." Her trust in the agent can simply be expressed as wA, which is the belief that he will cooperate. In a similar way, trust in the authenticity of a cryptographic key can be expressed by defining r: "The key is authentic." and express the opinion wA. These simple examples demonstrate that trust easily can be expressed as an opinion. The whole framework for artificial reasoning based on subjective logic can therefore be used for reasoning about trust.
4
The Operators of Subjective Logic
Standard propositional logic operates on binary variables which can only take the values TRUE and FALSE. Subjective logic operates on our subjective perception about binary propositions, represented as opinions. Presently, subjective logic contains about 10 operators[Jr but due to limited space, we only describe the subset consisting of conjunction, consensus and recommendation here.
4.1
Conjunction
A conjunction of two opinions about two distinct propositions consists of determining from the two opinions a new opinion reflecting the conjunctive truth of both propositions. This corresponds to the logical binary "AND" operation in standard logic.
Definition 2. Let wp = {bp, dp, Up} and ~q = {bq, dq, Uq} be an agent's opinions about two distinct propositions p and q. Let wpAq = {bpAq,dpAq,UpAq} be the opinion such that 1. bpAq -~ bpbq 2. dpAq = dp + dq - dpdq 3. UpAq ----bpuq + upbq + UpUq Then WpAq is called the conjunction O] Wp and Wq, representing the agents opinion about both p and q being true. By using the symbol "A" to designate this operation, we get WpAq = Wp A Wq. []
A Subjective Metric of Authentication
337
As would be expected, conjunction of opinions is both commutative and associative. It must be assumed that the opinion arguments in a conjunction are independent. This means for example that the conjunction of an opinion with itself will be meaningless, because the conjunction rule will see them as if they were opinions about distinct propositions. When applied to opinions with absolute belief or disbelief, it produces the same results as the conjunction rule in standard logic, that is; it produces the truth table of logical "AND". In addition, when applied to opinions with zero uncertainty, it produces the same results as serial multiplication of probabilities. 4.2
C o n s e n s u s of I n d e p e n d e n t Opinions
This operator is most naturally expressed in the evidence space, so we will define it there first and subsequently map it over to the opinion space. Assume two agents A and B having observed an entity produce binary events over two different periods respectively, with A having observed r A positive and s A negative events, and B having observed r B positive and s B negative events. According to (3), their respective pcdfs are then ~(r A, 8A) and ~(r B, sS). Imagine now that they combine their observations to form a better estimate of the events' probability. This is equivalent to an imaginary agent [A, B] having made all the observations and who therefore is able to form the pcdf defined by ~ ( r A + r B, s A "4- sB). Definition 3. L e t ~ ( r A , s A) and ~(rpB , SpS ) be two pcdfs respectively held by the agents A and B regarding the truth of a proposition p. The pcdf ~(r A'B, s A'B) defined by 1. r A'B = r A + r B 2. ~p^PA'B SPAp+ S~p is then called the consensus rule ]or combining A's and B ' s estimates, as if it was an estimate held by an imaginary agent [A, B]. By using the symbol @ to designate this operation, we get ~(rA,B,s A,B) = ~(rpA, Sp A) G v(rB, SpB). [] The consensus rule for combining independent opinions is easily obtained by using Def.3 above and the evidence-opinion mapping of Eq.(5). T h e o r e m 1. Let wA = { bAp , d pA, u pA} and wB = {bp,dp,up B S B } be opinions respectively held by agents A and B about the same proposition p. Let wA'B = ,A,B , ap -A,B , UpA,B1 be the opinion such that Op A B + b ~B uA p ) / ~ 1. b ~ : : : (bpup 2. ~PAp,B (dpA u~B + dpB Up A )/~ a
where ~ = u A + u s - upAupS such that ~ ~ O. Then WpA,B is called the Bayesian consensus between Wp A and w B, representing an imaginary agent [A, B] 's opinion about p, as if she represented both A and B. By using the symbol • to designate .A,B = Wp A ~ CgBp. [] this operation, we get wv
338
Audun J0sang
It is easy to prove that @ is both commutative and associative which means that the order in which opinions are combined has no importance. Opinion independence is must be assumed, which obviously translates into not allowing an entity's opinion to be counted more than once The effect of independent consensus is to reduce the uncertainty. For example the case where several witnesses give consistent testimony should amplify the judge's opinion, and t h a t is exactly what the operator does. Consensus between an infinite number of not totally uncertain (i.e. u < 1) opinions would necessarily produce a consensus opinion with u = 0. Two opinions which both contain zero uncertainty can not be combined according to Def.1. This can be explained by interpreting uncertainty as room for influence, meaning that it is only possible to influence an opinion which has not yet been committed to belief or disbelief. An opinion containing zero uncertainty can only influence opinions which do contain uncertainty, not the opposite, and the result will always be the total elimination of uncertainty. In reality, opinions which do not include uncertainty are usually counterintuitive, except in specially designed situations such as games with carefully controlled probabilities. 4.3
Recommendation
Assume two agents A and B where A has an opinion about B, and B has an opinion about a proposition p. A recommendation of these two opinions consists of combining A's opinion about B with B's opinion about p in order for A to get an opinion about p. There is no such thing as physical recommendation, and recommendation of opinions therefore lends itself to different interpretations. The main difficulty lies with describing the effect of A disbelieving that B will give a good advice. For the definition of the recommendation operator, A's disbelief in the recommending agent B means that A thinks that B is uncertain about the t r u t h value of p. As a result A is also uncertain about the t r u t h value of p. D e f i n i t i o n 4. Let A , B and be two agents where w A -~ { b A B , d BA, U BA}
is A ' s
opinion about B ' s recommendations, and let p be a proposition where w B = dp, Up } is B ' s opinion about p expressed in a recommendation to A. Let ojAB = {bpA B ,a'p --AB ,upA B ~~ be the opinion such that bAB ~
A B
1. d~ B _ bBbp , 2. A B - bBd~
.
3. -PAB__ A A A B Up -- d B + u s "~ bBU p
then ~dp AB is called the recommendation rule for combining w A and w B expressing A ' s opinion about p as a result of the recommendation f r o m B . B y using the symbol | to designate this operation, we get w AB = w A | w B. [:] It is easy to prove that | is associative but not commutative. This means that the combination of opinions can start in either end of the chain, and that
A Subjective Metric of Authentication
339
the order in which opinions are combined is significant. In a chain with more than one recommending entity, opinion independence must be assumed, which for example translates into not allowing the same entity to appear more than once in a chain. B's recommendation must be interpreted as what B actually recommends to A, and not necessarily as B's real opinion. It is obvious that these can be totally different if B for example defects. It is important to notice that the recommendation rule can only be justified when it can be assumed that recommendation is transitive. More precisely it must be assumed that the agents in the chain do not change their behaviour (i.e. what they recommend) as a function of which entities they interact with. However, as pointed out in [Jcs96] and [BFL96] this can not always be assumed, because defection can be motivated for example by antagonism between certain agents. The recommendation rule must therefore be used with care, and can only be applied in environments where behaviour invariance can be assumed. 5
Evaluating
Authenticity
Public keys can be exchanged manually or electronically. For manual distribution, agent A1 can for example meet agent A2 physically and give him a diskette containing her public key kA1, and A2 can give his public key kA2 to her in return. The keys can then be considered authenticated through the persons' mutual physical recognition. These keys can then be trusted and used for confidential message exchange, or for certification of other keys. For electronic key distribution, keys need to be recommended and certified by someone whom the recipient trusts for recommending and certifying keys, and who's authenticated public key the recipient possesses. For example if A1 possesses A2's public key kA2 and A2 possesses A3's public key kA3, then A2 can send A3's public key to A1, certified by his private key kA21.Upon reception, A1 will verify A2's certificate, and if correct, will know that the received public key of A3 is authentic, and can then communicate confidentially with A3. This simple certification chain is illustrated in Fig.3. A1
~" A2
> A3
Fig. 3. Simple certification path
However, certificates are not enough. In order to get a binding between keys and key owners, the recipient of the certificate must have an opinion wA1A(kA2) about the authenticity of the key used to certify, that is, her opinion about the binding between the certifier and his public key. In addition, the recipient must A1 have an opinion WRT(A2) about the certifiers recommendation trustworthiness (RT), that is how much she trusts him to actually recommend and certify other
340
Audun Josang
keys. Finally, the certifier must actually recommend to the recipient his own opinion wA~(ka3) about the authenticity of the certified key. This opinion must be embedded in the certificate sent to A1. The definition of authenticity of a certified key can then be defined. D e f i n i t i o n 5. A1, A2 and A3 are three agents, kA1, kA2 and kA3 their respective A A1 public keys. Let ~dgA(kA2 ) and 02RT(A2) be A1 's opinion about the authenticity of k A2 , and about A2 's recommendation trustworthiness respectively. Let coA2A(kA, ) be A2's opinion about the authenticity of kA3. Then A1 's opinion about the authenticity of kA3 is defined by: A1
A1
A1
~2~ oj A 2
02KA(kA3 ) -- (02RT(A2) A 02gA(ka2)) ,r
KA(kA3)
[] In case there is a path through intermediate certifiers, as illustrated in Fig.4, opinions about recommendation trustworthiness WRT must also be recommended along the path and embedded in the certificate together with the certified key. The recommendation trustworthiness RT not only applies to immediate certification of keys, but also to the recommendation of other agents for further recommendations. In [J~s97d] these two types of trustworthiness were treated separately and called CT (certification trustworthiness) and RT respectively. However, since they necessarily are dependent, separate treatment would lead to computational inconsistencies, and we therefore use only RT to denote both types of trustworthiness.
A1
~
A2
~
. . .
~'-
An-1
:~ A n
Fig. 4. Chained certification path
Definition 6. Let the agents A1, ... An have chained trust and certification relationships according to Fig.~. A1 's opinion about the authenticity of kAn can then be expressed as: 02A1A(kA~) -~
A1
^ 02 A1
(WRT(A2) ' '
"~
KA(kA2)J(~''"
[02 A n - 2
~ ~ RT(An-1)
A ~2A n - 2 ~ 02 A n - 1 KA(kAn-1) ] ~ KA(kAn)
[] The framework defined above can now be used to compute authenticity in a given network. If desirable, subjective logic can be reduced to a one-dimensional probabilistic logic by using opinions without uncertainty, i.e. u = 0. The logic can also be made binary by only allowing binary belief components, i.e. b = 0 or b = 1. The full two-dimensional logic will be used in the examples below. We will start by describing how authenticity can be computed in an anarchic structure, and subsequently show how the same method can be used in a hierarchic structure.
A Subjective Metric of Authentication 5.1
341
Anarchic Authentication Network
In anarchic authentication networks each agent decides individually which other agents she will trust to produce certificates. Fig.5 illustrates a possible structure of public keys and their certificates as stored in agent A's private database.
Fig. 5. Structure of keys and certificates in agent A's database
This structure makes no assumption about any binding between key owners and certificates. In addition agent A must therefore keep a list of her opinions wAA about key authenticity, that is, her opinions about binding between keys and key owners. Agent A must also keep a list of her opinions WAT about recommendation trustworthiness, that is how much she trusts the key owners to actually recommend other keys and other agents. Tab.1 below gives an example of possible opinion values. Although it is not shown, a one-to-many binding between an agent and her different keys can perfectly well be accommodated within this structure. Key Key Authenticity Key owner Recommendation Trustworthiness
~A(k.~)
kx
X
ka {1.00, o.oo, 0.00} k . 1{0.98, 0.00, 002}1 a
T
a
~
~
I
A B
~T(x) {1.00, o.oo, 0.00}
I
{0.96, 0.02, 0.02} ,',
,,,
,
,
Table 1. Table of A's opinions about public keys and their owners
It is assumed that A knows B, C and D personally and therefore has firsthand evidence about their recommendation trustworthiness. It is also assumed
342
Audun Josang
that A's opinions about key authenticity is based on having physically exchanged public keys with them. Let A receive the public keys of agents E, F and G electronically. Embedded in the certificates are also the certifying agents' opinions about the key authenticity and recommendation trustworthiness according to Tab.2. Key Authenticity Recommendation Trustworthiness :d~A(k~) : {0.98, 0.00, 0.02} WRT(E)/~-----{0.99, 0.00, 0.01} CORT(F ) / ~ = {0.98, 0.01, 0.01} ~KA(kF)/~---- {0.95, 0.01, 0.04} WRT(F)C" = {0.90, 0.00, 0.10} ~CKA(kF)_-- {0.98, 0.00, 0.02} ~~ : {0.90, 0.05, 0.05} WRT(G) ~' = {0.99, 0.00, 0.01} Table 2. Table of opinions received by A
A key which is received electronically can be considered authentic if it has been certified by someone who is considered trustworthy, and who's public key is considered authentic and if the certifier recommends the key to be authentic. There are of course other considerations, such as e.g. that the cryptographic algorithm can not be broken, but it is assumed that these conditions are met. The authenticity of for example kE as seen by A can then be expressed as:
A A B 0JKA(kE) ---- (03RAT(B) A 03KA(k,) ) Q 03KA(kE) (6) = {0.922, 0.000, 0.078} When there are several certification paths to the same key, the authenticity can be computed as the consensus between the authenticities obtained for each path. The authenticity of kF as seen by A can then be computed as:
WKA(kF))~
t RT(C) A ~KA(kc)) Q 0JKA(kF))
= {0.951, 0.004, 0.045}
(7) When certificates pass through a chain of nodes, recommendation of each node must be included in the expression. The authenticity of ka as seen by A can be computed as: B B E 0JUA(k~:)) ~dKA(kG) (8) = {0.821, 0.046, 0.133} Opinions about key authenticity and recommendation trustworthiness must always be included when sending certificates to other agents. However, opinions based on recommendations from other agents, i.e. based on second-hand evidence, should never be passed to other agents. This is because the recipient may receive recommendations from the same agents, causing opinion dependence and computational inconsistencies. Only opinions based on first-hand evidence and experience should thus be recommended to other agents.
A Subjective Metric of Authentication 5.2
343
Hierarchic Authentication Network
The X.509 authentication framework [ITU89] defines a typical hierarchic network structure in which Certification Authorities (CAs) distribute certified cryptographic keys to the users. Fig.6 illustrates two partly overlapping hierarchies where user U3 belongs to both CA2 and CA3.
j CAI.. CA2
U1
U2
CA3
U3
U4
Key to the figure:
U5
CA
Certification Authority
U
User
Fig. 6. Hierarchic Authentication Network
U It is assumed that every user U has a certain opinion ~RT(CA) about its superior CA regarding recommendation trustworthiness. The same applies to trust between CAs both in upwards and downwards direction. Every user U possesses the public key kcA of his superior CA, to which is attached an opinion 02UA(kcA) about the key authenticity. The same applies to the CAs in upwards and downwards direction, as well as to the CAs regarding their users. The computations for determining key authenticities are analogous to the computations in case of an anarchic network. User Ul's opinion about the key authenticity of user U2 can for example be expressed as: U1 U1 ~dU1 h ~ ~.dCA2 ~dKA(ku2 ) = (~ORT(CA2) A KA(kcA2)] '~' KA(kv2)
(9)
The key authenticity of U3 as seen by U1 can be computed using both the short path via CA2 as well as the long path via CA1 and CA3 to obtain: 0.)U1 KA(ku3) =
U1 ~dU1 (0JRT(CA2) A KA(kcA2) ] {02CA2
[{ ,CA2
A ~CA2
"~
@ ~ KA(ku3) @ ~k~RT(CA1) ' ' KA(kcA1)] ( ,CA1 wCA1 "~ CA3 @ ly.,RT(CA3) A KA(kcA3) ! @ 02KA(ku3)) )
(10)
When the computation is based on intermediate CA's, the recommendations can be embedded inside a multi layer certificate, corresponding to the CA's through which it has passed, or each CA can send a certificate regarding the neighbouring CA directly to the recipient user. When the computation is based on multiple paths, the consensus must be computed by the agent where the paths meet which is CA2 in our example. Otherwise, opinions about the node where the paths meet will appear more than once in the expression, and thereby introduce dependence.
344
6
Audun Josang
Conclusion
We have proposed a radically new way of reasoning about trust and authentication using subjective logic and have described how this can be used to define a metric for authenticity as well as a framework for computing authenticity values. An essential aspect of our approach is to include the binding between keys and their owners, without which a certificate would have no real meaning to the users. We believe that this approach can be directly applied in real implementations of authentication networks.
References [BBK94] T. Beth, M. Borcherding, and B. Klein. Valuation of trust in open networks. In ESORICS 94. Brighton, UK, November 1994. [BFL96] Matt Blaze, Joan Feigenbaum, and Jack Lacy. Decentralized trust management. In Proceedings of the 1996 IEEE Conference on Security and Privacy, Oakland, CA, 1996. [CB90] George Casella and Roger L. Berger. Statistical Inference. Duxbury Press, 1990. JEll61] Daniel Ellsberg. Risk, ambiguity, and the Savage axioms. Quarterly Journal of Ecomonics, 75:643-669, 1961. [ITU89] ITU. Recommendation X.509, The Directory Authentication Framework. International Telecommunications Union, 1989. [Jos96] A. Josang. The right type of trust for distributed systems. In C. Meadows, editor, Proc. of the 1996 New Security Paradigms Workshop. ACM, 1996. [Jos97a] A. Josang. Artificial reasoning with subjective logic. In Abhaya Nayak, editor, Proceedings of the Second Australian Workshop on Commonsense Reasoning, 1997. [Jos97b] A. Josang. A model for trust in security systems. In Arto Karila and Timo Aalto, editors, Proceedings of the Second Second Nordic Workshop on Secure Computer Systems. Helsinki University of Technology, November 1997. [Jos97c] A. Josang. Prospectives for modelling trust in information security. In Vijay Varadharajan, editor, Proceedings of the 1997 Australasian Conference on Information Security and Privacy. Springer-Verlag, 1997. [Jos97d] Audun Josang. Modelling Trust in Information Security. PhD thesis, Norwegian University of Science and Technology, 1997. [Mau96] Ueli Maurer. Modelling a public-key infrastructure. In Computer Security ESORICS'96. Springer~ 1996. [RS97] Michael K. Reiter and Stuart G. Stubblebine. Toward acceptable metrics of authentication. In Proceedings of the lg97 IEEE Conference on Security and Privacy, Oakland, CA, 1997. [Zim95] P.R. Zimmermann. The Oj~icial PGP User's Guide. MIT Press, 1995.
A Sound Logic for Analysing Electronic C o m m e r c e Protocols Volker Kessler 1 and Heike Neumann 2. 1 Siemens AG, Corporate Technology, ZT IK 3, D-81730Munich,
[email protected] 2 MathematischesInstitut, University of Gie~en, Arndtstr. 2, D-35392 Gie~en,
Heike.B.NeumannCmath.uni-giessen.de
Abstract. We present an authentication logic which is capable to analyse one of the most important features of electronic commerce protocols, i.e., the accountability of transactions. By referring to a formal semantics we prove that the calculus is correct. We apply our logic to the payment systems SET and Payword. The formal analysis explains why micropayment systems based on chains of hashvalues do not lead to the same level of assurance as usual payment systems. Keywords: Electronic commerce, formal analysis, logic of authentication, micropayment, Payword, semantics, SET.
1 1.1
Introduction A c c o u n t a b i l i t y in E l e c t r o n i c C o m m e r c e
Electronic commerce (shortly: ecommerce) is supposed to be the killer application for the internet. The prerequisite for establishing an electronic marketplace is secure payment. Many protocols have been proposed to implement different kinds of payments: secure credit card payments, micropayments, and digital coins. As for all cryptographic protocols, the question arises whether these protocols actually provide the required security features. Authentication logics has become one of the most attractive method for the formal analysing of cryptographic protocols. Typically, the BAN-logic [4] and its various dialects prove the achievement of protocol goals like A believes B says "It's me", A and B being the communication p a r t n e r s J BAN-logics verify how the belief of the communication partners evolves during message exchange. Especially, it is assumed t h a t these partners are honest. Belief logics did not bother about proving something to a third party who was not involved in the communication. Thus they failed to determine accountability which is one of the most important requirements for ecommerce protocols because in an ecommerce scenario the involved parties m a y have conflicting interests and will not fully * Work on this topic was sponsored by Siemens AG. 1 Note that a formal analysis does not cope with deficiencies of the cryptogaphic algorithms in use; it is assumed that these are "secure".
346
Volker Kessler and Heike Neumann
trust each other. Of course, there will be some trust but it is limited. In case of a dispute the participants need pieces of evidence to convince an independent party such as a judge.
1.2
Formal Modelling of Accountability
There are some recent papers on the issue of formally analysing ecommerce protocols, e.g., [10], [5], [12]. Kailar's approach [10] is similar to ours. He presents a modified authentication logic by introducing a new predicate A cangrove r to B but he does not provide any formal semantics to justify the predicates and rules. Therefore it remains the question whether his calculus is correct. Unfortunately, most of BAN-dialects lack this justification given by an external formal semantics. There are some papers published on this issue like [2],[17], [19], and [3] but these logics do not apply to the issue of accountability. In this paper we build on the AUTLOG semantics given in [19] and, at full length, in [18] because this one can handle incomprehensible submessages. ~ A U T L O G is a so-called K45-1ogic (logic of belief) [8, p.57]. We introduce new predicates and new rules to model accountability and finally prove that the new calculus is correct with respect to the formal semantics. The main new feature is the modelling of the predicate "A canprove r to J ' . What does "proving" mean? An individual A can prove a statement to another individual J if he can convince the latter of the validity of the statement, cf. [10, p.315], [7]. In general, this means that A will perform some actions, e.g., arguing, showing evidences etc., so that after these actions J will believe that the statement r is true. In our context, this means that A has some messages, e.g., digital signatures, certificates, so that after sending these messages to J the recipient J will believe that the statement r is true. By this definition we are able to define the predicate canprove by means of the action s e n d p ( M , Q) already defined in [19]. This shows that the semantics of [19] is strong enough to introduce new predicates. Three consequences of our definition are easy to see. Firstly, as in [10] it holds the following equivalence for formulae r A believes r
~
A canprove r to A
The implication from left to the right is trivial according to our definition. The reverse direction is plausible but has to be proven in the formal context. Secondly, A believes r usually does not imply A canprove r to B for another person B ~ A. This reflects our intuitive understanding that an individual cannot always convince other individuals of his own beliefs. Thirdly, A canprove r to B for B ~ A does not necessarily imply A believes r A clever bad guy might lead the judge to the conclusion that he is innocent although he himself knows that he is not. Then, of course, it is even possible that 2 Note that in case of incomprehensible (sub)messages P sees M does not imply P believes P sees M. Therefore a rule which expresses this implication as stated in [3] is invalid and the original BAN-logic [4] is not sound.
A Sound Logic for Analysing Electronic Commerce Protocols
347
A canprove r to B although r is not true. This is different from the approach of Huima and Aura who use the modal logic $4 and assume "that X cannot be false if someone is able to prove X " [9, Sect. 4.2]. Note that Kailar does not use the modal operator believes in [10]. Therefore his calculus cannot be mapped to the semantics of usual modal logics as in [2], [17], and [19] and thus a formal validation of its calculus is not possible. In [9, Sect. 6] Huima and Aura put the question whether it is possible to conjunct provability with the believe-operator. In our approach the concept of proving and believing relate together in a quite natural way. Different from [1], [9], and [14] we do not concern where evidence tokens are needed but whether the evidence tokens yield enough evidence. 1.3
Freshness and " O l d n e s s "
Freshness is one of the most important features of the BAN-logics and already introduced in [4]. A message is called fresh if it was not used in a previous protocol run. This means that the message was not sent before a certain time. Especially in ecommerce protocols one has also to bother about the fact that a message was generated before a certain time point. For example, in the scenario of mobile nets, there might be a very expensive call using the private key of C to an information service provider V. Assume further that nearly at the same time the certificate on C's public key is revoked. Then the question arises whether the call was before the revocation or afterwards. In this case a time-stamp provider might help who certifies that a given message is at least as old as time point tl. Similar to [16] we introduce a predicate P saidt X, i.e., P sent the message X before time point t, and a predicate old(t, X) which means that the message X was sent by some principal P before time point t, i.e. the message X is at least as old as t.
1.4
Validation T i m e o f Public K e y s and Certificates
Usually in BAN-logics public keys are regarded as valid during the whole time. This assumption is reasonable in the context of usual authentication protocols where the belief is evolved during the investigated protocol run. But in the context of ecommerce protocols the issue of accountability refers to the establishment of a belief long after the protocol run. It might happen that a vendor tries to convince the judge that a certain public key was valid at a certain time point tl although the certificate has expired in the meantime between tl and the time of presentation before the judge. Therefore the logic must be able to express statements like " K is the valid public key of A until time point t."
1.5
Outline of the P a p e r
In section 2 we describe the calculus of [19] combined with the new predicates and rules developed for ecommerce protocols. Section 3 proves the soundness
348
Volker Kessler and Heike Neumann
of the new rules. In section 4 we apply the new calculus to two well-known electronic payment protocols S E T and Payword.
2
Description of A U T L O G
For the convenience of the reader we briefly repeat those definitions and rules from [19] which are i m p o r t a n t in the context of ecommerce 3 and introduce the new definitions and rules. See [19] for further explanation.
2.1
Syntax
The language of our logic consists of basic items like a set of agents 7~ = (P, Q, R, S , . . . , }, who participate in the protocol; a set of public key schemes : (Kp, KQ,...}, where K + denotes the public and K - denotes the private key; a set of basic messages and shared keys J~4o = (M, N, data, I(RQ,...); and a set of computable functions Y: = (enc, h, a}, where enc is used for encrytion and signature functions with message recovery (usually indicated by {X}K), h denotes all hash functions and a signatures without message recovery. The set of messages J~4 is the smallest set including the names of agents and basic messages in 7) U M 0 , the components K - and K + of the public key schemes K ~ K:, lists of messages, and computed messages F(M) for F E and M E M . Furthermore we need so-called localized messages. A localized message is denoted by Mp which stands for the message M in the view of P , e.g., ({X}K)B denotes the ciphertext in the view of B. This means that P does not necessarily understand MR. Similar to ~d we define the set A4~, of generalized messages which is closed under localization, see [19].
2.2
Protocol Runs
A run r is given by an infinite chain of states starting a some point tr in the past. The point to denotes the start of the current protocol run. At time t the state can be changed by one of the following four actions denoted by H(r't): 4
sendR(M, Q): P sends a message M to agent Q. receiveR (M): P reads a message which has been sent before. generateR(M): P generates a new message M. nameR(M, N ) : P gives a new basic name N to a received message M. 3 Especially we omit in this paper the stuff about key agreement and key derivation from [19] because the investigated protocols do not use these mechanisms. 4 These actions are allowed under certain conditions, cf. [19]. Note that it is possible to explain canprove without enlarging the set of actions. This shows that the semantics is very expressive.
A Sound Logic for Analysing Electronic Commerce Protocols
349
Time is formalized as a completely ordered set T, to denotes the starting point of the current protocol run so that the period from tr to to is the prehistory of the protocol. We assume that every agent is able to compare times, i.e. he believes t < t ~ if and only if t _< t ~. But we need not to assume that clocks are synchronized. For each point (r, t) we define the knowledge base S(r't)of agent P as the smallest set containing S(pr't) :=
{M E M I 3t' E [tr,t],N : H (r'r) E {generatep(M),receivep(M),namep(N,M)}} and which is closed under building lists, projection, computation, and decryption. We define the set of all worlds as: )IV = {(r,t)lr is a run and t > t~}. Each agent has some assumptions like the existence of good keys or trustworthy certification authorities, therefore each agent considers just a few worlds as possible. For each agent P we can fix a subset 142p C W of worlds that meet all assumptions P makes. We say a point (r t, t ~) is possible for P in (r, t) if (r', t') belongs to ~Yp and P cannot distinguish between (r,t) and (rt,tt). To formalize this intuition we have to define what we mean by "P cannot distinguish between two worlds." We call a "real" message M an instance M -~ X of a generalized message X iff M and X are identical except for all localized submessages in X, e.g., (A, {X}K) -~ (A, ( { X } K ) B ) . For the formal definition see also [19]. The function sight~ 't) maps each message M to the set A4o U {*} where * replaces all submessages of M that are unverifiable by P, for example
sight~,t)(h(X)) := ~ h(sight~'t)(x)), if 3 M e S(p~'t) : M ~ X
!
9,
else
sight~'t)(a(K-l'X)) := / a ( K - l ' sight~'t)(Z))' if K + E S(~'t); 3M E S(p~'t) : M ~ X
[
9, else Note that in these two cases the definitions are more general than in [19]. Now we can formally define the indistinguishability of worlds: Let 7-/(r't) ("history") denote the sequence of actions (H(r't~ Hff,t~)). We extend the definition of the function sight to sequences of actions in the canonical manner by replacing all messages M by sight~'t)(M). We write (r,t) mR (r',t ~) to express that P in (r, t) cannot distinguish between both worlds: (r,t) ~ p (r',t') . ,De~, (r',t') E W P A Sight(p"t')(~(pr't)) -~ sight~ 't"] ( ~ tr' 't'~ This relation ist transitive and euclidean but, in general, not reflexive. 2.3
S e m a n t i c s o f t h e Formulae
Let M, N E f14 be messages, X, Y E 2~4p be generalized messages and F E 9~. The new predicates are saidt, old(t, X), P c a n p r o v e r to R u n t i l t, and the inclusion of the validation time for keys. PhasX
(r,t) ~ P h a s X :~=~ 3 M ~ X with M E S(pr't)
350
VolkerKessler and Heike Neumann
P s e e s X (r,t) ~ P s e e s X : ~ 3M "~ X, 3t ~ ~ t, S N E M : H (~'t') = receiveR(N) and M E seensubs(e~.,) (N) The set seensubs(M) is the smallest set containing M itself and is closed under projection and decryption if the decryption key is in S. Hence seensubs(M) is a subset of submsgs(M) which denotes the set all syntactical submessages of M. P said/says/saidt, X Let set saidsubs(M) denote the smallest set containing M, being closed under projection and decryption (as above) and containing the preimage N, if it contains F(N) where N is an element of the knowledge base S. (r,t) ~ P s a i d X :~:~ 3M ~_ X, 3t' <_t, 3 N E 2~4, 3Q E 7) :
H (r't') = sendR(N, Q) and M E saidsub_(~ ~,)(N) ~I~ '
(r, t) ~ P says X : ~
3M -~ X, 3t' E [to, t], 3N E J~4, 3Q E 7) : H (r't') -~ sendp(N, Q) and M E saidsub~(~ ,4 (N)
(r, t) ~ P saidt, X :~:~ 3M ~_ X, 3 t" < t' < t, 3 N E .tel, SQ E 7) :
H (r't'') -- sendR(N, Q) and M E saidsub_(~ t,,)(N) PrecognizesX
(r,t) ~ P r e c o g n i z e s X :~:~ sight~'t)(X) ~ *
fresh(X) (r,t) ~ fresh(X):,d.e~. VM ~ X : M ~ U{submsgs(Y) ] 3t ~ to: g (r't) = sendR(Y)} old(t, X) is the counterpart to fresh(X), old(t, X) is true in a world (r, t') if there is an agent P who said X before t.
(r,t) ~ old(t',X) :~:~ 3P E 7): (r,t) ~ Psaidt, X K
P ~ t , Q. K is/was a good shared key between P and Q until time t': (r,t) ~ P ~-+t, Q : ~
VR e 7) Vt" ~ t': If (r,t") ~ R s a i d F ( K , M ) then (r,t") ~ R s e e s F ( K , M ) or R e {P,Q}
If t' = t we omit t~: P ~ Q. K e ~-~t' P. K p is/was a good public encryption key for P until time t~:
(r,t) p c t P. Vt" _< t': If (VX: (r,t") ~ Q s e e s { Z } K + implies (r,t") ~ Q s e e s X ) then Q = P.
A Sound Logic for Analysing Electronic Commerce Protocols
351
If t' = t we write e ~-~ P.
t P. g p is/was a good signature key for P until time t': (r,t) ~ a ~-~,t P : ~ : ~ Vt" -< t' : If (r,t") ~ Q s a i d F ( K - , M )
then
Q --= P or (r, t") ~ Q sees F ( K - , M). If t ~ = t we write a ~-~ P.
X ~ Y
(r, t) ~ X ~ Y : ~
sight (r't)(x) = sight (r't)(Y)
P b e l i e v e s r P believes that r is true if it is true in every world P considers possible: (r,t) ~ Pbelievesr
:~:~
Vr',t' : (r,t) ,~p (r',t') ~
(r',t') ~ r
C e r t ( P , K +, o', t) denotes a certificate which is a list containing the name of a participant, his public key used for the verification of digital signatures, and an expiration date signed by a certification authority. P c a n p r o v e r t o R u n t i l t ' P canprove the statement r to R if P has messages M I , . . . , M~ which he can send to R so that R, after receiving these messages in the time period [t, t'], will believe that r is true. 5
(r, t) ~ P canprove r to R until t' :r ~ M I , . . . , Mn E S(pr't) : Vt"e[t, t'] : If (r,t") ~ (RseesM1) A . . . A (RseesMn) then (r,t") ~ Rbelievesr The restriction on the time period until t' is due to the fact that some pieces of proof might be time-bounded. For example, after a certificate has expired P cannot prove any more that a signed message originates from the owner of the key unless P can prove that this message was generated before the certificate expired. If there is no such time restriction, i.e. t' = co, we simply write P canprove r to R. 2.4
Calculus
New features are the _<-relation on T, the oldness rules O1-3, rule A3, and the provability rules P1-6. Furthermore, we need the rules R3-4, C, C3, C5 for general messages X. 6 Inference R u l e s . M P : If r and (r --4 r then r M : If r is a theorem then P believes r is a theorem. A theorem is a formula which can be derived from axioms alone. The axioms The definition can be easily extended to the case that a group of participants wants to prove a fact to a third party. 6 In [19] these rules are only formulated for real messages M. But they hold for general messages as well.
352
Volker Kessler and Heike Neumann
are all instances of tautologies of propositional calculus, the ~-relation on T and the following axiom schemas:
Modalities. Our logic is a so-called K45-1ogic7 (belief logic) defined by K P believes r A P believes (r --+ r ~ P believes r 4 P believes r ) P believes P believes r 5 -,P believes r ~ P believes -~P believes r
Possession. H 1 P sees X ~ P has X H2PhasX1 A...APhasXn H 3 P has X ~ P has F ( X )
~ ~ Phas(X1,...,Xn)
Recognizability. R1 R2 R3 R4
PrecognizesXi ~ P r e e o g n i z e s ( X 1 , . . . ,Xn) P r e e o g n i z e s X A P h a s K -1 ~ Preeognizesenc(K,X) P has X ~ P recognizes h ( X ) P has ( K + , X ) ~ Precognizesa(K-,X)
Freshness. F1 fresh(Xi) F2 fresh(X)
~ fresh((X1,...,X~)) ~ fresh(F(X))
Oldness. O1 old(t, X 1 , . . . , X~) -+ old(t, X i ) 0 2 old(t, F ( X ) ) --+ old(t, X ) 0 3 old(t, X ) ) A t < t' --4 old(t', X )
Seeing. SE1 P s e e s ( X 1 , . . . , X n ) ~ PseesX~ S E 2 P sees { X } K A P has K -1 --+ P sees X
Saying. NV PsaidX A fresh(X) ~ PsaysX S A 1 P said ( X 1 , . . . , X n ) ---+ P saidXi SA2 Psays(X1,...,Xn) ~ PsaysXi SA3 Psaidh(X) A -~Pseesh(X) ~ PsaidX S A 4 P says h ( X ) A -,P sees h ( X ) ~ P says X
Authentication and Key Confirmation. A1 R s e e s F ( K , X ) A P ~ Q A - ~ P s a i d F ( K , X ) A 2 R sees F ( K - , X ) A ~r K Q ~ Q said(K-, X) A3 Rsees F(K-,X)
K
A ~ ~--~t Q A o l d ( t , F ( K - , X )
~ Qsaid(K,X) )
~ Q saidt(K-,X)
Comprehension. C P sees X A X p -- Y ~ P believes P sees Y C1 P r e c o g n i z e s X i ~ (XI,...,Xn)p - ((X1)p,...,(Xn)p) 7 In K45 it holds a lot of more axioms, e.g., R Pbelievesr A Pbelievesr ( Pbelieves (r A r but the axiom T Pbelievesr -+ r is not valid, cf. [6, p.114,132].
A Sound Logic for Analysing Electronic Commerce Protocols
353
C2 P r e c o g n i z e s X A P h a s K ) ( e n c ( K , X ) ) p _ enc(K, X p ) C 3 P has X > (h(X))p - h(Xp) C 5 P h a s ( K + , X ) ---+ ( a ( g - , Z ) ) p -~ a ( g - , X p ) Equivalences.
E1X-X E2X-YAY=_Z
>X=Z
E3 X =_ Y ---+ F ( X ) - F ( Y ) E4 X1 --- Y1 A . . . A Xn -- Yn
> (XI,...,Xn)
-~ (Y~,...,Yn)
Key Derivation. Sp~Q---+QKp
Provability. If P can prove the validity of an implication then it holds that if P can prove the condition then he can also prove the conclusion. P 1 Pcanprove(r -+ r . >
[P canprove r to J until t --+ P canprove r to J until t ] If P has a message X and it holds t h a t J will believe a statement ~ if he sees message X then P can prove the statement r to J (by sending X to J). P 2 P has X A (J sees X -+ J believes r > P canprove r to J If P has both a signed message on X and the corresponding public key, and he can prove to J t h a t this is a valid signature key of Q until time point t, and X equals to Y in the view of J then P can prove to J until t that Q said the message Y. P3
P h a s a ( K - , X ) A P h a s ( K + , X ) A P canprove {a ~ t Q} t o J A X j =_Y
> P canprove {Q said Y } to J until t
If everything holds as in the previous rule and, in addition, P can prove that the signature on X was generated before time point t then there is no time restriction any more, i.e., P can all the time prove t h a t Q said Y.
P4 P h a s a ( K - , X )
A P h a s ( K +, X ) A Pcanprove{a ~+t K Q } to J
A X j =- Y A P canprove old(t, a ( K - , X ) ) to J >P canprove {Q said Y } to J If P can prove t h a t Q said the hash value of X and J believes that Q generated h(X) himself then P can prove to J t h a t Q said X , cf. rule SA3. P 5 P canprove Q said h(X) to J until t A Jbelieves -~Q sees h(X)
>P canprove {Q said X } to J until t If P can prove t h a t Q said a list then, for each component of this list, P can prove that Q said this component, cf. rule SA1. P 6 P canprove {Q said (X1, 99 9 Xn)} to J until t
>P canprove {Q saidXi} to J until t
354 3
VolkerKessler and Heike Neumann Soundness
of AUTLOG
We prove that the new rules are valid. T h e o r e m 1. Rule O1 old(t, X I , . . .
, Xn)
~ old(t, Xi) is valid.
Proof Let (r, t r) ~ old(t, X1,..., Xn) 9 According to the definition there is an agent P with ( M 1 , . . . , M n ) ~ (X1,... ,Xn), t" < t < t', 3N E M, H (r't'') = seudp(N, Q) and (M1,..., Mn) E saidsubs(~.,,,)(N). Since saidsub(N) is closed under projection, it follows Mi E saidsubs(v.,,,,)(N) for all i which means by definition (r, t ~) ~ old(t, Xi). [] The proof for 02 is straightforward as well and 03 is an immediate consequence of the definition of old. T h e o r e m 2. Rule A3 is valid:
R sees F ( K - , X) A a ~+t Q A old(t, F ( K - , X) )
~ Q saidtF(K-, X)
Proof Let (r, tl) ~ R s e e s F ( K - , X ) A {a ~'+t K Q} A old(t,F(K-,X)). By (r, tl) ~ R s e e s F ( K - , X ) it follows that there exists M -~ X and (r, tl) R seesF(K-,M). Therefore there exists t2 ~ tl and H (r,~) = receiveR(N) where F ( K - , M) E seensubs(A..,2)(N) C_ submsgs(N), and thus there exists t3 < t2 : H (r'ts) = sendp(N, Q). Let tm~n be minimal satisfying H (r'tm'€ --~ sendp(Y, Q) and F ( K - , M) E submsgs(Y). By minimality of train it follows that F ( K - , M) and ( K - , M) cannot be submessages of messages received by P so that (r, tmin) ~ P s a i d F ( K - , M). By the assumption old(t, F(K~, M) it follows tmin <_ t . By definition of a ~ t Q this means that Q sent the signature or that P received it. But because of the minimality of train the second is not possible. From the definition of saidsub it follows (r, tl ) ~ Q saidt (K-, X). [:] Rule P1 follows directly from rules K and M. Rule P2 is an immediate consequence from our definition of canprove. T h e o r e m 3. Rule P3 is valid:
P h a s a ( K - , X ) A P h a s ( K + , X ) A Peanprove{a ~K+tQ } t o J A X j - Y ) P canprove (Q saidY) to J untilt Proof Let (r,t') ~ { P h a s a ( K - , X ) A Phas ( K + , X ) A Pcanprove {a ~ t Q} toJ A X j - Y} for t' _< t. It has to be shown that there exist messages q(r,t') so that for all t"e[t', t]: M 1 , . . . , M~ E ,~p If (r, t") ~ g sees ( / 1 , . . - , Mn) then (r, t") ~ J believes Q said Y. First there are messages M 1 , . . . , Ms E r,JR tl) SO that for all ill > tt: If (r, t") ~ J sees (M1,..., Ms) then (r, t") ~ J believes {a K
Q}.
A Sound Logic for Analysing Electronic Commerce Protocols
355
By definition of has it follows that there are messages Ms+l, M~+2, Ms+3 E s(r,t') with Ms+l -~ (r(K-,X), M~+2 -~ K + and Ms+3 "g X. So P can send these s + 3 messages to J. Now assume that for t"e[t ~, t] : p
--
_
_
(r, t") ~ J sees (MI,. 99 Ms, Ms+l, Ms+2, Ms+3) This implies (r, t") ~ J believes {a ~ Q}, (r, t") ~ J has (K +, X ) and since C5 is valid it holds (r, t") ~ [a(K-, X ) ] j - a ( K - , X j ) . By iterative application of the rules E2-E4 it follows (r, t") ~ [a(K-, X)]j - a ( K - , Y). Thus the conditons of rule C are fulfilled and it holds (r, t ~) ~ Jbelieves Jsees a ( K - , Y). J believes that A2 is true and therefore it follows from rule K: (r,t") ~ J b e l i e v e s Q s a i d Y
[]
T h e o r e m 4. Rule P4 is valid: P h a s a ( K - , X )
A P has ( K + , X ) A P canprove {a ~g t Q} to J A X j ~-- Y A P canprove old(t, a ( K - , X)) to J P canprove (Q saidY) to J . Proof Similar to the previous proof by using A3 in the end instead of A2. T h e o r e m 5. Rule P5 is valid: P canprove (Q saidh(X)) to J until t A
J believes-~Q sees h(X) ~
P canprove (Q said X) to J until t
Proof By assumption P can send messages to J so that for t"e[t', t] : (r, t '~) Jbelieves Q saidh(X). Thus (r, t") ~ J believes (Q saidh(X) A -~Q sees h(X)) and since J believes that rule SA3 is valid it follows by rule K that (r, t")
J believes Q said X.
[]
T h e o r e m 6. Rule P6 is valid: P canprove (Q said(X1,... ,An)) to J until t
P canprove (Q saidXi) to J until t . Proof If (r, t') ~ P canprove (Q said ( X 1 , . . . , Xn)) to J until t then P can send messages to J so that for t" e[t', t] : (r, t") ~ J believes Q said (X1,. 99 Xn). Since J believes that SA1 is valid it follows by K (r, t") ~ J believes Q saidXi. [] 4
Analysing
SET and Payword
To demonstrate how our logic works we compare two famous protocols: SET [11] and Payword [15]. Both are based on credit card payments but differ in the way they use cryptographic mechanisms. Payword is one of the so-called micropayment schemes which are especially designed for small amounts because in those cases the transactions costs of payment systems like SET would exceed the price of the product. Payword, and similar the protocol invented by Pedersen [13], is based on chains of hashvalues and only occasionally uses signatures created by a public key scheme. We analysed the entire protocols but in this paper we focus on the customer's acknowledge of the billing: the customer sends a set of data to the vendor which
356
VolkerKessler and Heike Neumann
is to be cleared by the bank. We examine the information the vendor is able to derive from the data he received. The goal is to find out whether the vendor can prove to the bank that the customer accepted a billing. We simplify the protocol step by omitting the transaction identification numbers and nonces as they play no part in proving accountability. Let B denote the bank with public encryption key K s+. 4.1
SET
Description. All participants have public keys and certificates for them signed by a certification authority. Customer and vendor have already agreed on the service or product and the price. Let OI denote the order information containing details about the service or product and the purchase amount. OI is a redundant message, which can be understood by everyone who sees it. P I denotes the payment instruction containing information about the purchase amount and the customer's credit card, e.g., its number, expiration date and the PIN. Then C sends to V the following messages
Cert(C, K +, a, t), {KCv}K+, {OI, h(PI)}Kcv, {KCB}K+ , {PI, h(OI)}KcB, a(K~, (h(OI), h(PI)))
Goal. After performing this protocol step the vendor V should believe that the customer C actually ordered goods or services
V believes C said OI And he should also believe that he is able to prove this to a judge J, i.e. we should be able to derive the formula
V believes V canprove (C said 0I) to J Since the achievement of the first goal can be proven by usual authentication logics we focus on the second goal.
Prerequisites. Concerning the belief of the vendor V it is assumed that: (V1) V has his own private key: V has K~. into the certification authority and believe that especially the judge shares this trust:
(v2) All participants trust
P sees Cert(Q, K~, a, t) ~
P believes {a
~-~tQ}
P believes (g sees Cert(Q, g ~ , a, t) ~ J believes {a ~-~t Q}) (v3) The certificates are completely understood by everyone: [Cert(C, K + , a, t)]p = Cert(C, K +, a, t) and v believes ([Cert(C, t)]j = Cert(C, g c+, t)).
A Sound Logic for Analysing Electronic Commerce Protocols
(v4)
357
V can comprehend the order information OI and V believes that J comprehends OI as well, i.e. it holds
(OI)v - OI , V believes(OI)j =_ OI, V believes ,1 recognizes h.(OI)
V reeognizesOI,
Neither V nor J need to understand tile hash value h(PI). We only assume
Y believes (h(PI))v)j - ((h(PI))v)
(Vb)
V believes that no customer signs forwarded messages that he does not understand. In addition, V believes that J shares his belief:
V believes ~ Csees h(OI) V believes J believes -~C sees h(OI). T r a n s a c t i o n . receivev(...) yields
Y sees Cart(C, g +, a, t), {KCV}K+, {OI, h(PI)}Kov,
(1)
{KcB}K+, {h(OI), PI)I(c~, a(Kc; h(OI), h(PI)) O u t l i n e o f t h e p r o o f . V believes t h a t rule P3 is correct, i.e. it holds the following:
Y b e l i e v e s V h a s a ( K c ; h(OI), (h(PI))v) A V believes Y has (K +, h(OI), (h(PI))y) A V believes V canprove {~ ~--~t C} to J A V believes (h(OI), (h(PI))y)j = (h(OI), ((h(PI))y) V believes V canprove {C said (h(OI), (h(PI))y } to J until t Applying rule P6 to the conclusion gives
V believes V canprove {C said (h(OI)} to J until t which leads together with prerequisite V5 to the goal (by applying P5 and K)
IV believes V canprove (C said O I) to J until t 1 Therefore we have to check whether the conditions of the implication above are fulfilled. D e c r y p t i o n . The customer's certificate is not encrypted, hence we have:
1 8E~ V sees Cart(C, K +, a, t)
(2)
V2, 2 MR V believes {a ~--~t C}
(3)
2, V3 c ~ V believes V sees Cart(C, K +, a, t)
(4)
4
H1) Y
believes V has Cart(C, K+c, a, t)
(5)
5 g~ V believes V has g +
(6)
V2, 5 P2,K V believes V canprove {a ~-~t C} to J
(7)
358
VolkerKessler and Heike Neumann
Because of prerequisite (V1) V knows his own private key and can thus decrypt the symmetric key Kcy. Therefore he can decrypt {OI, h(PI)}Kcv. 1, ... SE_~I V has (0I, h(PI) )
8 H~ V has (h(OI), h(PI)) V4 R3 Vrecognizes h(OI)
(8) (9) (10)
C o m p r e h e n s i o n of messages. We have to examine which parts of the messages the vendor and the judge can understand: 8, V4 e3,E~.2 (h(OI))y =- h(OI) 10, V4 C1,E~.2 [h(OI), h(PI)]v - (h(OI), (h(fI))y)
(11) (12)
Analogously we derive from (V4) by applying C1, E2-E4:
(OI, h(PI))y - (OI, (h(PI))y) Y believes (h(OI), (h(PI))y)g -- (h(OI), ((h(PI))y)
(13) (14)
Therefore we can deduce: 8, 13 --~ Y believes Y sees (OI, (h(PI))y)
(15)
15 HI,H~,K IV believes V has (h(OI), (h(PI))v) ]
(16)
E x a m i n i n g t h e signature. We can now analyse the vendor's understanding of the signature, since the vendor has the customer's public key.
1 c5,~.4 [a(K~, (h(OI), h(PI)))]y - a(K~, (h(OI), (h(PI))v))
(17)
Therefore it follows: 1, 17 --~ U believes V sees a(K~, (h(OI), (h(PI) )u)) H1,K [V believes V has a ( K ~ , ( h(OI), (h ( P I ) ) v )) ] 18
(18) (19)
Accountability. The formulae (6), (7), (14), (16), and (19) show that the conditions of the implication are fulfilled and thus the second goal is proven. The first goal is achieved by applying the rules A2, A1 to the formulae (3), (18), and prerequisite (Vb).
4.2
Payword
D e s c r i p t i o n . It is assumed that the customers have public keys and certificates for them signed by the bank. Assume the customer and the vendor have already agreed on the service or product and on the price. Before the transaction starts the customer chooses n E N and a random xn. He computes a chain of hashvalues x~ := h(xi+l) for i = n - 1,..., 0. The value
A Sound Logic for Analysing Electronic Commerce Protocols
359
Xo is called the "root" of this chain. The customer starts the "money transfer" by making a signed commitment consisting of the vendor's identification number, the root of the chain of "paywords" and the value the customer wants to assign to every hashvalue: C ~ V : Cert(C, K +, a, t), x0, value, a ( K c , I D v , xo, value) The vendor verifies the certificate and the customer's signature, and sends an acknowledgement to the customer. If the customer wants to pay the vendor, he simply sends xl to him. The vendor checks that xo = h(xl), and if it is, he delivers the goods.
G o a l . Similar to SET the goal is to obtain the formula:
V believes V canprove (C said xl) to J until t A n a l y s i s . At the first sight it seems that the analysis of Payword can be done in almost the same way as of SET. But there is a fundamental difference. In SET one of our prerequisites says that the order information has enough redundancy to be understood by everyone. This is a reasonable assumption, since the order information actually contains the name of the vendor, the date, a description of the ordered goods or services and so on. This is not true for the hashvalue xl in Payword. xl is a random number to everyone except for the customer, who knows the preimage of xl. Hence the prerequisite [Xl]p ~- Xl is not reasonable for any P ~ C. Correctness of Payword implicitly relies on the validity of the axiom P has h(M) --+ P recognizes M. But this axiom is invalid because h(M) E S (r't) does not imply sight~ 't) (M) ~ *, cf. [19]. Therefore we cannot prove the goal.
5
Conclusions
We have presented a provable sound extension of A U T L O G in order to analyse the accountability of participants in electronic commerce protocols. Our analysis of SET and Payword shows that there are actually differences in the features of two systems which justify the distinction between "normal" systems and micropayment systems based on chains of hashvalues. The difference is that the vendor comprehends a signature but he does not "comprehend" the iterated hash values. Thus our formal analysis confirms the general assumption that micropayment schemes do not lead to the same level of assurance as usual payment systems and should therefore used only for micropayments.
A c k n o w l e d g e m e n t s . We thank the referees and especially Michael Waidner for helpful comments on this paper.
360
Volker Kessler and Heike Neumann
References 1. N. Asokan, E. Van Heereweghen, M. Steiner, Towards A Framework ]or Handling Disputes in Payment Systems, IBM Research Report RZ 2996 (#93042) 03/09/1998, IBM Research Division, Zurich. 2. M. Abadi, M. Tuttle, "A Semantics for a Logic of Authentication," Proc. of the ACM Symp. of Principles of Distributed Computing, 1991, 201-216. 3. A. Bleeker, L. Meertens, "A Semantics for BAN-logic," Proc. of the DIMACS Workshop on Design and Formal Verification of Security Protocols, 1997. 4. M. Burrows, M. Abadi, R. Needham, A Logic of Authentication, Report 39 Digital Systems Research Center, Pao Alto, California, 1989. 5. D. Bolignano, "Towards the Formal Verification of Electronic Commerce Protocols," Proc. of the lOth Computer Security Foundations Workshop, Rockport, IEEE Computer Society Press 1997, 133-146. 6. B. Chellas, Modal Logic, Cambridge University Press, Cambridge, England, 1980. 7. R.A. DeMillo, R.J. Lipton, and A.J. Perlis, "Social Processes and Proofs of Theorems and Programs," Comm. ACM, vol. 22, no. 5, 1979. 8. R. Fagin, J. Halpern, Y. Moses, M. Vardi, Reasoning About Knowledge, MIT Press, Cambridge, Mass., 1995. 9. A. Huima, T. Aura, "Using a Multimodal Logic Express Conflicting Interests in Security Protocols," Proc. of the DIMACS Workshop on Design and Formal Verification of Security Protocols, 1997. 10. R. Kailar, "Accountability in Electronic Commerce Protocols," IEEE Trans. on Software Engineering, Vol. 22, No. 5, 1996, 313-328. 11. MasterCard and VISA Corporations, Secure Electronic Transaction (SET), http://www.mastercard.com/set and http://www.visa.com/ 12. C. Meadows, P. Syverson, "A Formal Specification of Requirements for Payment Transactions in the SET Protocol," Preproceedings of Financial Cryptography, 1998. 13. T. Pedersen, "Electronic Payments of Small Amounts," Proc. Security Protocols I996, Springer LNCS 1189, 59-68. 14. B. Pfitzmann, M. Waidner, Properties of Payment Systems: General Definition Sketch and Classification, IBM Research Report RZ 2823 (#90126) 05/06/1996, IBM Research Division, Zurich. 15. R. Rivest, A. Shamir, "Payword and Micromint: Two simple micropayment protocols," Proc. Security Protocols 1996, Springer LNCS 1189, 69-88. 16. S. Stubblebine, R. Wright, "An Authentication Logic Supporting Synchronization, Revocation, and Receny," Proc. Third A CM Conference on Computer and Communiations Security, New Delhi, 1996, 95-105. 17. P. Syverson, P. van Oorschot, "On Unifying Some Cryptographic Protocol Logics," Proc. of the IEEE Computer Society Syrup. on Security and Privacy 199~, 14-28. 18. G. Wedel, Formale Semantik fiir Authentifikationslogiken, Diplomarbeit FB Mathematik der RWTH Aachen, Nov. 1995. 19. G. Wedel, V. Kessler, "Formal Semantics for Authentication Logics," Computer Security - ESORICS 96, Rome, Springer LNCS 1146, 1996, 219-241.
Kerberos Version IV: Inductive Analysis of the Secrecy Goals Giampaolo Bella and Lawrence C Paulson Computer Laboratory - University of Cambridge New Museums Site, Pembroke Street Cambridge CB2 3QG (UK) Giampaolo. Bella@cl. cam. ac. uk,
Larry. Paulson@cl. cam. ac. uk
A b s t r a c t . An operational model of crypto-protocols is tailored to the detailed analysis of the secrecy goals accomplished by Kerberos Version IV. The model is faithful to the specification of the protocol presented by the MIT technical plan [14] - - e.g. timestamping, double session key delivery mechanism are included. It allows an eavesdropper to exploit the shared keys of compromised agents, and admits the accidental loss of expired session keys. Confidentiality is expressed from the viewpoint of each party involved in a protocol run, with particular attention to the assumptions the party relies on. If such assumptions are unrealistic, they highlight weaknesses of the protocol. This is particularly so from the viewpoint of the responder: the model suggests and proves a reasonable correction. K e y w o r d s : secrecy, secure key, non-expired timestamp, inductive method, machine proof.
1
Overview
Crypto-protocols are highly error-prone, but formal methods can be used to limit the risks deriving from their execution. Different kinds of methods could be used in combination to achieve the best results. Burrows et al. [6] developed a belief logic, i.e. a modal logic suitable to express the beliefs of parties, and analysed the authenticity properties of several classical protocols. Because of the failure to discover errors found by other methods, many extensions and variations have been proposed (e.g. [5]). State enumeration deals instead with systems of limited size, and checks that all the states reachable are safe. This has led to several encouraging results (e.g. [8, 11, 12, 20]). Another approach relying on the simple concept of mathematical induction comprises few methods that enhance state enumeration by inductive features (e.g [13]), and a purely inductive one. This method, simply called the "inductive method" in the sequel, deals with systems of infinite size and is based on theorem proving. It has achieved promising results with classical protocols such as OtwayRees, Needham-Schroeder and Yahalom [17, 18], and with real-world protocols such as the Internet protocol TLS [19] and Kerberos.
362
Giampaolo Bella, Lawrence C Paulson
The work on Kerberos was started by the authors last summer, and has gone through several stages. Several technical results about the Version IV 1 of the protocol were proven soon [1], but the analysis of its secrecy goals turned out to be a major task because of the mechanism that delivers session keys. The subsequent analysis [2] of the simpler BAN version of the same protocol (i.e. the version presented by Burrows et al. [6]) allowed the development of suitable proof strategies thanks to which the formal analysis of the secrecy goals of Version IV has recently been achieved. This work shows a strong protection over non-expired session keys even from attacks that exploit expired ones. However, the risk of loss of expired session keys weakens Bob's confidentiality guarantees. To strengthen them, we suggest to add a simple temporal check to the operation of the trusted party of the protocol, and prove it to be an efficacious cure. It has to be mentioned that Version IV is much harder to analyse than the BAN version. The mechanisation of the former executes in six times the CPU time required by the mechanisation of the latter. Section 2 sketches the main concepts of the inductive method. Kerberos Version IV is introduced in Section 3 and analysed in Section 4. Some related work is mentioned in Section 5, and Section 6 concludes. 2
Inductive
Method
Reminder
Crypto-protocols aim at keeping secret certain pieces of information in order to infer the secrecy of new ones. This can be naturally expressed by mathematical induction. The inductive method models a crypto-protocol as the set of all possible traces of events deriving from its execution by an infinite number of agents. The basic event has the form Says A B m s g . In the real world, networks are often threatened by an eavesdropper trying to access resources he is not entitled to get. A dishonest agent s p y is therefore included in the model. He controls a set bad of compromised agents. Besides, honest agents can leak by accident valuable information such as session keys. Security properties typically concern sets of messages. Three operators map a given set H of messages in another such set: parts H yields all information contained in H , components of compound messages, and bodies of all encrypted messages; a n a l z H yields all information accessible in H , components of compound messages, and bodies of messages encrypted under keys (recursively) extracted from H; synth H yields all compound messages and encrypted messages that can be built using elements of H as components. The spy's ability of monitoring the network traffic is formalised by spies e v s , inductively defined as the set of messages over the trace e v s - - the traffic over e v s 1 Version IV is the "original" Kerberos. Version V is based on the same message structure.
Kerberos Version IV: Inductive Analysis of the Secrecy Goals
363
- - plus the shared keys of compromised agents. Therefore, from the observation of the trace evs, the spy can send messages belonging to the set
synth(analz(spies evs ) ) Confidentiality of a key K is formally stated in terms of the analz operator K • analz(spiesevs) while K not appearing in traffic at all, even encrypted, is formalised in terms of
parts K ~ parts(spiesevs) Encryption is assumed to be safe, i.e. bodies of encrypted messages can not be read without knowing the corresponding key. Guarantees are expressed in form of theorems mechanised by Isabelle [16]. The inductive method is described in greater detail elsewhere [17]. 3
Kerberos
Version IV Overview
The development of Kerberos started during the mid 1980s within project Athena at MIT. After three trial versions, Version IV was released in 1989 [14].
Kerberos System
Fig. 1. Basic Kerberos Layout
Kerberos is a shared key protocol based on timestamps. Its basic layout is shown in Fig. 1. Kerberos is meant to provide secure communication over a local area network (LAN) and its use has become more and more widespread during this decade. However, we believe that the actual protocol has not yet been formally analysed extensively (see also Sec. 5). This might be due to the double authentication procedure that the initiator of the protocol has to go through, a feature that seems to provide reassuring guarantees by inspection, but also makes its formal analysis a lot harder.
364
Giampaolo Bella, Lawrence C Paulson
Shared-key protocols normally rely on a trusted third party: Kerberos relies on two. The first to take part to the protocol execution is the Kerberos Authentication Server (abbreviated by Kas), the second is the Ticket Granting Server (abbreviated by Tgs). A database contains the shared keys of all network users and only Kas can access it. The protocol in Fig. 2 is faithfully quoted from the MIT Athena technical plan [14]. Lifetimes are omitted because sending them over the network does not enhance secrecy, as pointed out by Bellovin and Meritt [4]. Kas does not need a shared key of its own, because it uses Alice's shared key looked up in the Kerberos database to communicate with her. Kas knows the shared key of Tgs.
I. AUTHENTICATION 1.
A - + Kas : A, Tgs, Tal
2.
Kas ~ A : {AuthKey, Tgs, Tk, {A, Tgs, AuthKey, Tk~gtg~)K~ AuthT~cket
II. AUTHORISATION
3
A
Tgs: JA, Tgs, A t Key,
TkiK,gs;A, Ta2iAo, VB
AuthT~cket
authenticator
ServT~cket
III. SERVICE 5.
A -+ B : {A, B, ServKey, Tt}K~,JA , Ta3}se,~Kr Y Serv T~cket
v authentzcator
6. B -~ A : {Ta3 + 1]s~,K~v Fig. 2. Kerberos Version IV
The AUTHENTICATION phase (messages 1 and 2) sees Alice logging onto a workstation in order to access the network. Alice sends her identifier to Kas and gets in reply a session key and an encrypted ticket, respectively called authkey and authticket in the sequel, which will be used in the next phase. They are sent encrypted under Alice's shared key that Kas has retrieved from the database. Alice can not decrypt the authticket as this is encrypted under the shared key of Tgs. The authkey has a lifetime of several hours: Alice is automatically logged out when this key expires. The AUTHORISATION phase (messages 3 and 4) occurs each time Alice wants to access the network resource Bob. Alice presents the authticket to Tgs together with an authenticator to show that the authticket was issued to her. Tgs issues her with a new session key and a new ticket, respectively called servkey and servticket
Kerberos Version IV: Inductive Analysis of the Secrecy Goals
365
below. The servkey has a short lifetime of few minutes, and the servticket is unintelligible to Alice, being encrypted under Bob's shared key. The SERVICE phase (messages 5 and 6) follows each authorisation phase. Alice presents the servticket to Bob along with a new authenticator. Bob's reply is borrowed from the Needham-Schroeder protocol. 4
Kerberos
Version IV Inductive
Analysis
We define the function Ct : e v e n t l i s t
~ bool
expressing the current time over a given trace, for creating timestamps. It is defined as the l e n g t h of a trace, since traces never shrink. Four natural numbers formalise the lifetimes Kerberos relies on. 1. AuthLife is the lifetime of the authkey. Tgs checks the authkey against this lifetime before issuing Alice with a servkey. 2. Servkife is the lifetime of the servkey. Bob checks the servkey to have not expired w.r.t, this lifetime before setting up a communication with Alice. 3. Autckife is the lifetime of any a u t h e n t i c a t o r . Both Tgs and Bob check the authenticator they receive to have not expired w.r.t, this lifetime in order to prevent the replay of past authenticators. 4. Respkife is the lifetime of any s e r v e r r e s p o n s e . Once Alice has contacted Kas or Tgs, she checks their replies to be not late w.r.t, this lifetime. A late reply would indicate some messages of the communication to be possibly faked. The first three lifetimes were introduced by the Athena Technical Plan [14]. The fourth is meant to safeguard Alice, and was suggested by the first author and Riccobene [3] from the observation that late server replies could indicate illegal actions to have been performed. Our model does not force agents to act. They could even reply late, but Alice would discard late replies. Note that temporal checks must involve timestamps. Therefore, saying that a session key has not expired means that the timestamp accompanying it inside the ticket has not expired. Similarly, saying that an authenticator has not expired means that the timestamp inside it has not expired. To enhance readability, we define the predicates 1. ExpirAuth T k e v s -
(Ct evs) > T k + AuthLife
2. ExpirServ
(Ct evs) > T t + ServLife
Tt evs
-
3. ExpirAutc T a e v s -
(Ct evs) > Ta + AutcLife
4. RecentResp T T ~ -
T ~_ T~+ RespLife
The inductive definition of Kerberos is presented in Appendix. Rules K1 to K6 formalise the protocol describing how to build new traces from each message. The other rules express respectively the spy's illegal activity (Fake), the accidental
366
Giampaolo Bella, Lawrence C Paulson
leak of expired authkeys (0opsl), and of expired servkeys (Oops2). Traces are formed in reverse: the new event is put on the front. The temporal checks follow a pattern that is maintained throughout the protocol. The current model allows the accidental leakage of session keys that have expired (see Oopsl and 0ops2), because there is some risk increasing over time that the spy might get hold of some keys used in past runs. Our analysis - discovers that non-expired session keys are strongly protected even from the illegal use of expired ones; - discovers that Bob's guarantees of confidentiality over the servkeys are weakened by the risk of loss of expired authkeys; - shows how to strengthen Bob's guarantees. 4.1
The Secrecy Goals
The technical results about Kerberos, such as possibility and regularity properties, are discussed elsewhere [1]. The secrecy goals are typically of two sorts: to limit the damage arising from a key compromise, and to keep keys confidential. The goals met by Kerberos are expressed by the following theorems. 2 Key-Compromise T h e o r e m s typically state that some keys remain secret even when some session keys have been leaked to the spy. Their proofs can be hard, as they can often require case analysis under the analz operator. They also serve as crucial simplification lemmas for proving confidentiality theorems. Most shared-key protocols prudently never use session keys to encrypt other session keys. Doing so would let the spy easily exploit the theft of one key to learn others. Kerberos takes the risk of encrypting the servkey by the authkey in the message sent by Tgs. Therefore, the compromise of the authkey would obviously compromise also the servkey, as confirmed by the following theorem, proven straightforwardly from the definition of analz. Theorem
1 (Compromise
of servkey from compromise
of authkey).
[I Says Tgs A (Crypt AuthKey {IKey ServKey, Agent B, Number Tt, ServTicket[}) E s e t evs; Key AuthKey E analz (spies evs); evs E kerberos [] Key ServKey E analz (spies evs) Despite this weakness, three other important goals are expected to to be met. 1. Both authkeys and shared keys should remain secure from the compromise of any session keys, as session keys never encrypt them. 2 Proofs are omitted here for the sake of brevity. They are available from the authors at http://www, cl. cam. ac. uk/~gb221/Kerberos/
Kerberos Version IV: Inductive Analysis of the Secrecy Goals
367
2. If Tgs encrypts a servkey by an authkey, then the compromise of another authkey should not compromise that servkey, because a servkey should never be encrypted by more than one authkey. 3. All keys should remain secure from the compromise of a servkey, since servkeys never encrypt other keys. Tackling the first goal required the definition of a set AuthKeys evs yielding the authkeys over a trace evs, and several lemmas to simplify it. The result was a long and time-consuming proof. The proofs of the remaining goals turned out to be a major task, so different strategies were investigated. When Tgs sends the fourth message, a stable association between a servkey and an authkey is created, which resembles the association between nonces created by the Yahalom protocol [18]. A servkey is associated with only one authkey, but an authkey can be associated with several servkeys (i.e. an authkey can encrypt several servkeys), and these facts could be proven. Such association is formalised by the predicate KeyCryptKey authkey servkey evs, which holds if Tgs sends a fourth message containing authkey and servkey in the trace evs. L e m m a 2 ( P r o t e c t i o n f r o m a set of c o m p r o m i s e d session keys). evs 6 kerberos ( SK C__ Compl (range shrK) ---+ (K 6 SK. -~ KeyCryptKey K SesKey evs) ) Key SesKey 6 analz (Key"SK t3 (spies evs)) ) (SesKey 6 SK] Key SesKey 6 analz (spies evs))
This l e m m a states that a set of compromised session keys S K that are never used by Tgs to encrypt the given session key does not help the spy to learn the session key. Precisely, the session key can be analysed from the traffictogether with 5K ifand only ifthe session key belongs to S K or could be analysed from the trafficalone. The result is very general because the assumption on the predicate KeyCryptKey can be refined to define the type of session key -- whether authkey or servkey -- as shown below. The proof consists of 40 Isabelle commands, necessary to apply several simplificationlemmas -- e.g. KeyCryptKey never holds on shared keys or on session keys not yet appeared on the trai~ic-- structural lemmas -- e.g. KeyCryptKey never associates an authkey with another authkey -- and unicity lemmas -- e.g. KeyCryptKey associates a servkey with one and only one authkey. This l e m m a is applied by the following three theorems that prove respectively the three expected goals stated above. Theorem
3 (Protection f r o m c o m p r o m i s e d
session key).
[] AuthKey 6 (AuthKeys evs) [3 range shrK; SesKey ~ range shrK; evs 6 kerberos ~] Key AuthKey 6 analz (insert (Key SesKey) (spies evs)) (AuthKey = SesKey ~ Key AuthKey 6 analz (spies evs))
368
Giampaolo Bella, Lawrence C Paulson
The theorem reads as follows: an authkey can be analysed from the traffic plus a session key if and only if the session key is the authkey or the authkey could be analysed from the traffic alone. This means t h a t the spy can not exploit stolen session keys to learn new authkeys: the authkeys are safe from the compromise of any other session keys. T h e guarantee is strong and desirable - - authkeys are valuable pieces of information, as they have a long lifetime - - and also applies to shared keys. The new proof simply applies the l e m m a stating t h a t an authkey (or a shared key) is never treated by Tgs as a servkey [i K 6 AuthKeys evs U range shrK; evs 6 kerberos ~] V K'. ~ KeyCryptKey K' K eve
and then concludes by l e m m a 2: only 2 Isabelle commands.
Theorem 4 (Protection from compromised different authkey). [i KeyCryptKey AuthKey ServKey evs; AuthKey # AuthKey'; AuthKey' ~ range shrK; evs 6 kerberos I] Key ServKey 6 analz (insert (Key AuthKey') (spies evs)) 4==~ (ServKey = AuthKey' ~ Key ServKey 6 analz (spies eve))
If an authkey is associated with a servkey, then the compromise of a different authkey does not help the spy to learn the servkey. The form is the same as that of the previous theorem. The proof uses l e m m a 2 as well as a unicity lemma about the unique association of a servkey to an authkey: [l KeyCryptKey AuthKey ServKey eve; AuthKey' # AuthKey; evs 6 kerberos ]] ~ KeyCryptKey AuthKey' ServKey evs
Theorem 5 (Protection from compromised s e r v k e y ) . [I ServKey ~ (AuthKeys evs); ServKey ~ (range shrK); evs 6 kerberos I] Key K 6 analz (insert (Key ServKey) (spies eve)) r (K = ServKey I Key K 6 analz (spies eve))
The theorem states t h a t no keys can be learned from the compromise of a servkey. The proof applies l e m m a 2 and a l e m m a stating t h a t a servkey is never treated by Tgs as an authkey: [I K ~ AuthKeys e v s ; K ~ r a n g e s h r K ; e v s q k e r b e r o s ===~ V K ' . -1 KeyCryptKey K K' e v e
I]
Kerberos Version IV: Inductive Analysis of the Secrecy Goals
369
C o n f i d e n t i a l i t y T h e o r e m s typically express the assumptions upon which each party can infer that a certain session key is secure from the spy. They have a common feature. Any parties mentioned by the assumptions are required to be uncompromised in order to protect the secrets they know. With Kerberos, an agent has to trust that the agent at the other end of the communication is not conspiring with the spy. However, there are other real-world protocols for situations when nobody trusts anybody else. T h e o r e m 6 ( C o n f i d e n t i a l i t y for K a s ) . [I Says Kas A (Crypt Ka ~IKey AuthKey, Agent Tgs, Number Tk, AuthTicket[}) E set evs; -~ ExpirAuth Tk evs; A ~ bad; evs E kerberos I] Key AuthKey ~ analz (spies evs) The only session keys administered by Kas are authkeys. This theorem assures Kas that an authkey is safe as long as it has not expired. The proof exploits a unicity l e m m a stating that Kas never distributes the same authkey to different
parties, and applies theorems 3 and 4. T h e o r e m 7 ( W e a k c o n f i d e n t i a l i t y for Tgs). [l Says Tgs A (Crypt AuthKey (IKey ServKey, Agent B, Number Tt, ServTicketl)) E set evs; Key AuthKey ~ analz (spies evs); ExpirServ Ttevs; A ~ bad; B ~ bad; B ~ Tgs; evs E kerberos l] Key ServKey ~ analz (spies evs) The theorem is targeted to the confidentiality of a servkey, as Tgs never sees any authkeys. In order to protect the servkey, the theorem assumes the authkey to be confidential (otherwise theorem 1 would apply). This is a weak guarantee: although Tgs can check the freshness of the timestamp, it can not check the confidentiality of the authkey. Theorems 3, 4, and 5 crucially help the proof.
T h e o r e m 8 (Realistic c o n f i d e n t i a l i t y for Tgs). [l Says Tgs A (Crypt AuthKey ~IKey ServKey, Agent B, Number Tt, ServTicketl}) q set evs; Says Kas A (Crypt Ka {IKey AuthKey, Agent Tgs, Number Tk, AuthTicketl)) E set evs; -~ ExpirAuth Tk evs; -~ ExpirServ Ttevs; A ~ bad; B ~ bad; B r Tgs; evs q kerberos i] Key ServKey ~ analz (spies evs)
This version is more realistic because Tgs and Kas are part of the same system, so they can inspect each other's activity: Tgs could check whether Kas has issued the authkey. The proof refines theorem 7 by theorem 6.
370
Giampaolo Bella, Lawrence C Paulson
Theorem 9 (Confidentiality for Tgs). [] Says Tgs A (Crypt AuthKey {IKey ServKey, Agent B, Number Tt, ServTicket]}) 6 set evs; Crypt (shrK Tgs) {IAgent A, Agent Tgs, Key AuthKey, Number Tkl} 6 parts (spies evs); ExpirAuth Tk evs; ~ ExpirServ Ttevs; A ~ bad; B ~ Bad; B # Tgs; evs 6 kerberos I] Key ServKey ~ analz (spies evs)
This is the most general version of confidentiality for Tgs because, if the involved parties can be trusted, Tgs gets the guarantee as soon as it receives the authticket {A, Tgs, AuthKey, Tk}Ktg, which is encrypted under its own shared key. The proof applies first a lemma stating that the authtickets originate with Kas, and then the previous theorem. Note that the theorem holds since the authticket mentioned by the second assumption appears in the traffic, but becomes useful to Tgs only when Tgs gets hold of the authticket, perhaps later. The same remark applies to other theorems involving parts.
Theorem 10 (Confidentiality over authkey for Alice). [I Crypt (shrK A) {IKey AuthKey, Agent Tgs, Number Tk, AuthTicket]} 6 parts (spies evs); ExpirAuth Tk evs; A ~ bad; evs 6 kerberos I] Key AuthKey ~ analz (spies evs)
Alice gets a strong guarantee 0f confidentiality over the authkey from the reception of a non-expired message encrypted under her shared key. The proof is based on a lemma stating that, since Alice is uncompromised, the message encrypted under her shared key has originated with Kas. Theorem 6 is then applied.
Theorem 11 (Confidentiality over s e r v k e y for Alice). [] Crypt (shrK A) {IKey AuthKey, Agent Tgs, Number Tk, AuthTicketl} 6 parts (spiesevs); Crypt AuthKey {IKey ServKey, Agent B, Number Tt, ServTicketl} 6 parts (spies evs); ExpirAuth Tk evs; ~ ExpirServ Ttevs; A ~ bad; B ~ bad;,B # Tgs; evs, 6 kerberos ]] Key ServKey ~ analz (spies evs)
The guarantee for Alice over the servkey is strong too, as she can check the reception of a message of the expected form encrypted under her shared key, then extract the authkey, and hence check again whether she gets an acceptable message encrypted under the authkey. The proof is based on the following sketch. Since Alice is uncompromised, the message mentioned by the first assumption originated with Kas. Theorem 6 then derives that the authkey is confidential, so
Kerberos Version IV: Inductive Analysis of the Secrecy Goals
371
the message mentioned by the second assumption originated with Tgs. Theorem 8 concludes. T h e o r e m 12 ( W e a k confidentiality for Bob). [I Crypt (shrK B) {IAgent A, Agent B, Key ServKey, Number Ttl} E parts (spies evs); Crypt AuthKey {IKey ServKey, Agent B, Number Tt, ServTicket I} q parts (spies evs)); Crypt (shrK A) {IKey AuthKey, Agent Tgs, Number Tk, AuthTicketl} q parts (spies evs); ExpirAuth Ttevs; ~ ExpirServ Tk evs; A ~ bad; B ~ bad; B ~ Tgs; evs q kerberos I] Key ServKey ~ analz (spies evs)
The theorem says that the servkey is secure provided that it has not expired and that the authkey encrypting it has not expired either. This is a weak guarantee because Bob has no role in the AUTHENTICATIONphase, never sees any authkeys, and can not check whether the authkey encrypting the servkey he gets has expired or not. However, the assumptions on the authkey are indispensable to the theorem because, should the authkey expire, it could be then leaked by accident and disclose (theorem 1) also the servkey to the spy. Not only is the weakness highlighted by the previous theorem due to the incautious design of the fourth message, but also to the lack of a connection between the expiring times of the two kinds of session keys. The following scenario could happen: an authkey expires, its user is logged out from the workstation; the authkey is somehow leaked; the servkeys encrypted under that authkey are still non-expired (i.e. can be used) but compromised to the spy. If Tgs only issued new servkeys when prompted with an authkey still valid for the whole lifetime of the servkeys, then the problem might be fixed. This intuition is confirmed by machine proofs. Rule K4 can be strengthened by the temporal check
(Ct evs)+ ServLife _ Tk+ AuthLife In this stronger protocol, when Bob receives a non-expired servkey, he is also assured that the authkey encrypting it has not expired either. This lemma lets us remove from the previous theorem those assumptions about the anthkey that could not be checked by Bob, so that his confidentiality guarantee is strengthened as follows. T h e o r e m 13 ( C o n f i d e n t i a l i t y for Bob - - fixed m o d e l ) . [I Crypt (shrK B) {IAgent A, Agent B, Key ServKey, Number Ttl} E parts (spies evs); ExpirServ T t e v s ; A ~ bad; B ~ bad; B ~ Tgs; evs q kerberos I] Key ServKey ~ analz (spies evs)
372 5
Giampaolo Bella, Lawrence C Paulson Related
Work
The approaches that have been tailored to the formal analysis of Kerberos Version IV are surprisingly not many. By contrast, a great number has been applied to the simpler BAN version. The first author and Riccobene analyse Version IV [3] by Gurevich's Abstract State Machine [7]. They use a detailed algebraic model to formalise all possible actions of honest agents, but the eavesdropper's potentialities are finite. Theorems are stated from the viewpoint of the single agent of an infinite set. Proofs are carried out by hand thanks to the little formal overhead, but automated support is under development. Mitchell et al. [15] model check a highly simplified version of Kerberos Version IV derived from Kohl et al. [10]. Timestamps are not included, and multiple runs are not allowed. They find no attacks on a system of size three - - initiator, Kerberos servers and responder - - and a "redirection" attack on a system of size four - - with two responders - - by which Alice might believe to be talking with Bob when in fact she has been redirected to Charlie, who is possibly compromised. They also check that the problem can be fixed according to the directions of RFC-1510 [9] by upgrading the authenticator. However, the official Kerberos Version IV [14] easily solves the problem by quoting Bob's name in message 4.
6
Conclusion
Kerberos Version IV has proven to be a remarkable case s t u d y ? Its analysis is far more complicated than the analysis of the BAN version [2]. The risks arising from the accidental loss of the different kinds of session keys have been formally tackled. Strong confidentiality assurances have been provided to each uncompromised party involved in the protocol, showing that non-expired session keys are not compromised by the accidental loss of expired ones. However, Kerberos requires the network clocks to be synchronised, which is a well known intrinsic weakness. Each party has been provided with certain assumptions to check in order to infer valuable guarantees. If a party is not able to check some of these assumptions, then the involved guarantee is weak. This is how it is discovered that, in a realistically hostile environment, Bob gets a weak confidentiality guarantee unless Tgs makes a suitable temporal check, not stated by the Athena Technical Plan [14]. Modelling the possible compromise of session keys has greatly complicated the analysis. However, a deployed protocol such as Kerberos Version IV must be resilient against such losses. A litmus test for any protocol analysis method is whether it addresses such issues. 3 The full Isabelle proof script executes in approximately 4 minutes on a 300 MHz Pentium Pro, the longest time amongst the protocols analysed thus far.
Kerberos Version IV: Inductive Analysis of the Secrecy Goals
373
A c k n o w l e d g e m e n t . The research was funded by EPSRC, grants GR/K57381 'Mechanizing Temporal Reasoning' and GR/K77051 'Authentication Logics'.
References 1. G. Bella, L. C. Paulson. Using Isabelle to Prove Properties of the Kerberos Authentication System. Proc. of DIMACS Workshop on Design and Formal Verification of Security Protocols, Orman and Meadows (eds.), 1997. 2. G. Bella, L. C. Paulson. Mechanising BAN Kerberos by the Inductive Method. Proc. of Conference on Computer Aided Verification, Springer, LNCS Series, 1998. 3. G. Bella, E. Riccobene. Formal Analysis of the Kerberos Authentication System. Journal of Universal Computer Science: Special Issue on Gurewch's Abstract State Machine, Springer, 1997. 4. S. M. Bellovin, M. Meritt. Limitations of the Kerberos authentication system. Computer Comm. Review, 20(5) 119-132, 1990. 5. S. H. Brackin. A HOL Extension of GNY for Automatically Analyzing Cryptographic Protocols. Proc. of Computer Security Foundations Workshop, IEEE Press, 1996. 6. M. Burrows, M. Abadi, R. M. Needham. A logic of authentication. Proceedings of the Royal Society of London, 426:233-271, 1989. 7. Y. Gurevich. Evolving Algebras 1993: Lipari Guide. In Specification and Validation Methods, Oxford University Press, E. BSrger (ed.), 1995. 8. R. Kemmerer, C. Meadows, J. Millen. Three Systems for Cryptographic Protocol Analysis. Journal of Cryptology, 7(2), 79-130, 1994. 9. J. Kohl, B. Neuman. The Kerberos Network Authentication Service (Version V). Internet Request for Comment RFC-1510, 1993. 10. J. Kohl, B. Neuman, T. Ts'o. The Evolution of the Kerberos Authentication Service. IEEE Press, 78-94, 1994. 11. G. Lowe. Breaking and Fixing the Needham-Schroeder Public-Key Protocol using FDR. Tools and Algorithms for the Construction and Analysis of Systems, Margaria and Steffen (eds.), LNCS1055, Springer Verlag, 147-166, 1996. 12. G. Lowe. Casper: a Compiler for the Analysis of Security Protocols. Oxford University, Computing Laboratory, Technical Report, 1996. 13. C. Meadows. The NRL Protocol Analyzer: An Overview. Journal of Logic Programming, 26(2), 113-131, 1996. 14. S. P. Miller, J. I. Neuman, J. I. Schiller, J. H. Saltzer. Kerberos authentication and authorisation system. Project Athena Technical Plan, Sec. E.2.1, 1-36, MIT, 1989. 15. J. C. Mitchell, M. Mitchell, U. Stern: Automated Analysis of Cryptographic Protocols Using Murphi. Proc. of IEEE Symposzum on Security and Privacy, 141-151, 1997. 16. L. C. Paulson. Isabelle: A Generic Theorem Prover. Springer, 1994. LNCS 828. 17. L. C. Paulson. Proving properties of security protocols by induction. Proc. of Computer Security Foundations Workshop, IEEE Press, 1997. 18. L. C. Paulson. On Two Formal Analyses of the Yahalom Protocol. Cambridge University, Computer Laboratory, Technical Report No. 432, July 1997. 19. L. C. Paulson. Inductive Analysis of the Internet Protocol TLS. Cambridge University, Computer Laboratory, Technical Report No. 440, Dec. 1997. 20. S. Schneider. Verifying Authentication Protocols Using CSP. Proc. of Computer Security Foundations Workshop, IEEE Press, 1997.
374
Giampaolo Bella, Lawrence C Paulson
A p p e n d i x . K e r b e r o s Version I V I n d u c t i v e Definition
kerberos :: event list set inductive kerberos Base
[] 6 kerberos
Fake
[I evs 6 kerberos; B # Spy; X 6 synth (analz (spies evs)) [] Says Spy B X # evs 6 kerberos
K1
[I evs 6 kerberos; A # Kas [] Says A Kas {[Agent A, Agent Tgs, Number (Ct evs)[} # evs 6 kerberos
K2
[[ evs 6 kerberos; A ~ Kas; Key AuthKey ~ used evs; Says A' Kas {[Agent A, Agent B, Number Tall} 6 set evs [] Says Kas A (Crypt (shrK A) {[Key AuthKey, Agent Tgs, Number (Ct evs), (Crypt (shrK Tgs) {[Agent A, Agent Tgs, Key AuthKey, Number (Ct evs)[}) [}) # evs 6 kerberos
K3
[[ evs 6 kerberos; A # Tgs; Says A Kas {[Agent A, Agent Tgs, Number Tall} 6 set evs; Says Kas' A (Crypt (shrK A) {[Key AuthKey, Agent Tgs, Number Tk, AuthTicket[}) 6 set evs; RecentResp Tk Tal [] Says A Tgs {[AuthTicket, (Crypt AuthKey {[Agent A, Number (Ct evs)[}), Agent B[} # evs 6 kerberos
K4
[I evs 6 kerberos; A # Says A' Tgs {[(Crypt Key (Crypt
Tgs; B # Tgs; Key ServKey ~ used evs; (shrK Tgs) {[Agent A, Agent Tgs, AuthKey, Number Tk[}), AuthKey {[Agent A, Number Ta2[}), Agent BI} 6 set evs; ExpirAuth Tk evs; ~ ExpirAutc Ta2 evs [] Says Tgs A (Crypt AuthKey {[Key ServKey, Agent B, Number (Ct evs), (Crypt (shrK B) {[Agent A, Agent B, Key ServKey, Number (Ct evs)[}) [}) # evs 6 kerberos
Kerberos Version IV: Inductive Analysis of the Secrecy Goals
375
K5
[I evs E kerberos; A ~ B; Says A Tgs {IAuthTicket, (Crypt AuthKey {IAgent A, Number Ta21)), Agent B[} 6 set evs; Says Tgs' A (Crypt AuthKey {IKey ServKey, Agent B, Number Tt, ServTicket~}) E set evs; RecentResp Tt Ta2 i] ===~ Says A B {iServTicket, (Crypt ServKey {IAgent A, Number (Ct evs5) i}) l} # evs E kerberos
K6
[i evs E kerberos; A ~ B; Says A' B {i(Crypt (shrK B) {iAgent A, Agent B, Key ServKey, Number Tti)), (Crypt ServKey {IAgent A, Number Ta3~)) i) E set evs; ExpirServ Ttevs; ~ ExpirAutc Ta3 evs i] ===~ Says B A Crypt ServKey (Number (Ta3 + 1)) # evs E kerberos
Oopsl
[] evs 6 kerberos; A # Spy; Says Kas A (Crypt (shrK A) {iKey AuthKey, Agent Tgs, Number Tk, AuthTicketi}) E set evs; ExpirAuth Tk evs i] ===~ Says A Spy {IAgent A, Agent Tgs, Number Tk, Key AuthKeyi} # evs E kerberos
Oops2
[] evs 6 kerberos; A ~ Spy; Says Tgs A (Crypt AuthKey {]Key ServKey, Agent B, Number Tt, ServTicketl}) 6 set evs; ExpirServ Tk evs i] ===~ Says A Spy {iAgent A, Agent B, Number Tt, Key ServKeyi} # evs E kerberos
Author Index
Anderson, R. 111 Augot, D. 241 Banfi, R. 17 Bella, G. 361 Bertino, E. 127 Bidan, C. 51 Bruschi, D. 17 Buccafurri, F. 127 Cuppens, F. 159 Dacier, M. 1 Debar, H. 1 Delaigle, J.-F. 241 Eizenberg, G. 275 Ferrari, E. 127 Foley, S. N. 33 Fontaine, C. 241 Franz, E. 313 Gabillon, A. 159 Greenwald, S. J. 191 Hall, C. 97 Horn, G. 277 Issarny, V. 51 Jerichow, A. 313 Jøsang, A. 329 Jungh¨ artchen, K. 295 Karjoth, G. 143
Kassab, L. L. 191 Kelsey, J. 97 Kesdogan, D. 295 Kessler, V. 345 Linnartz, J. P. M. G. 221, 257 Marsh, A. 209 Michels, M. 85 Nassehi, M. 1 Neumann, H. 345 Ortalo, R. 67 Paulson, L. C. 361 Polemi, D. 209 Posegga, J. 175 Preneel, B. 277 Quisquater, J.-J. 275 Reichl, P. 295 Rosti, E. 17 Rullo, P. 127 Schneier, B. 97 Stadler, M. 85 Sun, H.-M. 85 Talstra, J. C. 221 Varvitsiotis, A. 209 Vogt, H. 175 Wagner, D. 97 Wespi, A. 1