Lecture Notes in Computer Science Edited by G. Goos, J. Hartmanis, and J. van Leeuwen
2836
3
Berlin Heidelberg New York Hong Kong London Milan Paris Tokyo
Sihan Qing Dieter Gollmann Jianying Zhou (Eds.)
Information and Communications Security 5th International Conference, ICICS 2003 Huhehaote, China, October 10-13, 2003 Proceedings
13
Series Editors Gerhard Goos, Karlsruhe University, Germany Juris Hartmanis, Cornell University, NY, USA Jan van Leeuwen, Utrecht University, The Netherlands Volume Editors Sihan Qing Chinese Academy of Sciences, Institute of Software 44th Street South, ZhongGuanCun, Beijing 100080, China E-mail:
[email protected] Dieter Gollmann Microsoft Research Limited 7 J.J. Thomson Avenue, Cambridge CB3 0FB, UK E-mail:
[email protected] Jianying Zhou Institute for Infocomm Research 21 Heng Mui Keng Terrace, Singapore 119613 E-mail:
[email protected]
Cataloging-in-Publication Data applied for Bibliographic information published by Die Deutsche Bibliothek Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data is available in the Internet at
.
CR Subject Classification (1998): E.3, G.2.1, D.4.6, K.6.5, F.2.1, C.2, J.1 ISSN 0302-9743 ISBN 3-540-20150-5 Springer-Verlag Berlin Heidelberg New York This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law. Springer-Verlag Berlin Heidelberg New York a member of BertelsmannSpringer Science+Business Media GmbH http://www.springer.de © Springer-Verlag Berlin Heidelberg 2003 Printed in Germany Typesetting: Camera-ready by author, data conversion by PTP-Berlin GmbH Printed on acid-free paper SPIN: 10959817 06/3142 543210
Preface ICICS 2003, the Fifth International Conference on Information and Communication Security, was held in Huhehaote city, Inner Mongolia, China, 10–13 October 2003. Among the preceding conferences, ICICS’97 was held in Beijing, China, ICICS’99 in Sydney, Australia, ICICS 2001 in Xi’an, China, and ICICS 2002, in Singapore. The proceedings were released as Volumes 1334, 1726, 2229, and 2513 of the LNCS series of Springer-Verlag, respectively. ICICS 2003 was sponsored by the Chinese Academy of Sciences (CAS), the National Natural Science Foundation of China, and the China Computer Federation. The conference was organized by the Engineering Research Center for Information Security Technology of the Chinese Academy of Sciences (ERCIST, CAS) in co-operation with the International Communications and Information Security Association (ICISA). The aim of the ICICS conferences has been to offer the attendees the opportunity to discuss the state-of-the-art technology in theoretical and practical aspects of information and communications security. The response to the Call for Papers was surprising. When we were preparing the conference between April and May, China, including the conference venue, Huhehaote City, was fighting against SARS. Despite this 176 papers were submitted to the conference from 22 countries and regions, and after a competitive selection process, 37 papers from 14 countries and regions were accepted to appear in the proceedings and be presented at ICICS 2003. We would like to take this opportunity to thank all those who submitted papers to ICICS 2003 for their valued contribution to the conference. We wish to thank the members of the program committee and external reviewers for their effort in reviewing the papers in a short time. We are also pleased to thank Prof. Xizhen Ni, Dr. Yeping He, and other members of the organizing committee for helping with many local details. Special thanks to Dr. Jianying Zhou who took care of most of the tough work relating to the publishing affairs and contributed to the conference in variety of ways. It now seems that SARS is over. On behalf of the program committee and organizing committee we sincerely hope that you were able to enjoy not only the technical part of the conference, but also the historical city of Huhehaote and the beautiful grassland of Inner Mongolia in China. October 2003
Sihan Qing Dieter Gollmann
ICICS 2003 Fifth International Conference on Information and Communications Security Huhehaote, China October 10–13, 2003 Sponsored by Chinese Academy of Sciences and National Natural Science Foundation of China and China Computer Federation Organized by Engineering Research Center for Information Security Technology (Chinese Academy of Sciences) and International Communications and Information Security Association
General Chair Dequan He
Academician of the Chinese Academy of Engineering, China
Program Chairs Sihan Qing Dieter Gollmann
Chinese Academy of Sciences, China Microsoft Research, UK
Program Committee Feng Bao Thomas Berson Chin-Chen Chang Lily Chen Welland Chu Edward Dawson Robert Deng Jan Eloff
Institute for Infocomm Research, Singapore Anagram, USA MOE, Taiwan Motorola, USA THALES, Hong Kong, China Queensland University of Technology, Australia Institute for Infocomm Research, Singapore University of Pretoria, South Africa
VIII
Organization
Mariki Eloff Dengguo Feng Yongfei Han Lein Harn Yeping He Kwangjo Kim Xuejia Lai Chi-Sung Laih Javier Lopez David Naccache Eiji Okamoto Susan Pancho Jean-Jacques Quisquater Bimal Roy Claus Schnorr Vijay Varadharajan Yumin Wang Susanne Wetzel Tara Whalen Guozhen Xiao Lisa Yiqun Yin Moti Yung Jianying Zhou
University of South Africa, South Africa Chinese Academy of Sciences, China ONETS, China University of Missouri, USA Chinese Academy of Sciences, China Information and Communications University, Korea Swissgroup, Switzerland National Cheng Kung University, Taiwan University of Malaga, Spain Gemplus, France University of Tsukuba, Japan University of the Philippines, the Philippines UCL, Belgium Indian Statistical Institute, India University of Frankfurt, Germany Macquarie University, Australia Xidian University, China Stevens Institute of Technology, USA Dalhousie University, Canada Xidian University, China Princeton University, USA Columbia University, USA Institute for Infocomm Research, Singapore
Organizing Committee Xizhen Ni Yeping He
Chinese Academy of Sciences, China Chinese Academy of Sciences, China
External Reviewers Julien Brouchier, Xiaofeng Chen, Judy Zhi Fu, Pierre Girard, Guang Gong, Helena Handschuh, Wen-Jung Hsain, Qingguang Ji, Jianchun Jiang, WenChung Kuo, Bao Li, Tieyan Li, Dongdai Lin, Wenqing Liu, Hengtai Ma, Manish Mehta, Yang Meng, Pradeep Mishra, Mridul Nandi, Pascal Paillier, Pinakpani Pal, Jian Ren, Greg Rose, Hung-Min Sun, Shen-Chuan Tai, Lionel Victor, Chih-Hung Wang, Guilin Wang, Mingsheng Wang, Wenling Wu, Ching-Nung Yang, Wentao Zhang, Yongbin Zhou, Bo Zhu
Table of Contents
A Fast Square Root Computation Using the Frobenius Mapping . . . . . . . . Wang Feng, Yasuyuki Nogami, Yoshitaka Morikawa
1
A Forward-Secure Blind Signature Scheme Based on the Strong RSA Assumption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Dang Nguyen Duc, Jung Hee Cheon, Kwangjo Kim
11
Secure Route Structures for the Fast Dispatch of Large-Scale Mobile Agents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yan Wang, Chi-Hung Chi, Tieyan Li
22
On the RS-Code Construction of Ring Signature Schemes and a Threshold Setting of RST . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Duncan S. Wong, Karyin Fung, Joseph K. Liu, Victor K. Wei
34
A Policy Based Framework for Access Control . . . . . . . . . . . . . . . . . . . . . . . . Ricardo Nabhen, Edgard Jamhour, Carlos Maziero Trading-Off Type-Inference Memory Complexity against Communication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Konstantin Hypp¨ onen, David Naccache, Elena Trichina, Alexei Tchoulkine Security Remarks on a Group Signature Scheme with Member Deletion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Guilin Wang, Feng Bao, Jianying Zhou, Robert H. Deng
47
60
72
An Efficient Known Plaintext Attack on FEA-M . . . . . . . . . . . . . . . . . . . . . . Hongjun Wu, Feng Bao, Robert H. Deng
84
An Efficient Public-Key Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jianying Zhou, Feng Bao, Robert Deng
88
ROCEM: Robust Certified E-mail System Based on Server-Supported Signature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 Jong-Phil Yang, Chul Sur, Kyung Hyune Rhee Practical Service Charge for P2P Content Distribution . . . . . . . . . . . . . . . . . 112 Jose Antonio Onieva, Jianying Zhou, Javier Lopez ICMP Traceback with Cumulative Path, an Efficient Solution for IP Traceback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124 Henry C.J. Lee, Vrizlynn L.L. Thing, Yi Xu, Miao Ma
X
Table of Contents
A Lattice Based General Blind Watermark Scheme . . . . . . . . . . . . . . . . . . . . 136 Yongliang Liu, Wen Gao, Zhao Wang, Shaohui Liu Role-Based Access Control and the Access Control Matrix . . . . . . . . . . . . . 145 Gregory Saunders, Michael Hitchens, Vijay Varadharajan Broadcast Encryption Schemes Based on the Sectioned Key Tree . . . . . . . . 158 Miodrag J. Mihaljevi´c Research on the Collusion Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170 Gang Li, Jie Yang Multiple Description Coding for Image Data Hiding Jointly in the Spatial and DCT Domains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179 Mohsen Ashourian, Yo-Sung Ho Protocols for Malicious Host Revocation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191 Oscar Esparza, Miguel Soriano, Jose L. Mu˜ noz, Jordi Forn´e A DWT-Based Digital Video Watermarking Scheme with Error Correcting Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202 Pik-Wah Chan, Michael R. Lyu A Novel Two-Level Trust Model for Grid . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214 Tie-Yan Li, HuaFei Zhu, Kwok-Yan Lam Practical t-out-n Oblivious Transfer and Its Applications . . . . . . . . . . . . . . . 226 Qian-Hong Wu, Jian-Hong Zhang, Yu-Min Wang Adaptive Collusion Attack to a Block Oriented Watermarking Scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238 Yongdong Wu, Robert Deng ID-Based Distributed “Magic Ink” Signature from Pairings . . . . . . . . . . . . . 249 Yan Xie, Fangguo Zhang, Xiaofeng Chen, Kwangjo Kim A Simple Anonymous Fingerprinting Scheme Based on Blind Signature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260 Yan Wang, Shuwang L¨ u, Zhenhua Liu Compact Conversion Schemes for the Probabilistic OW-PCA Primitives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269 Yang Cui, Kazukuni Kobara, Hideki Imai A Security Verification Method for Information Flow Security Policies Implemented in Operating Systems . . . . . . . . . . . . . . . . . . . . . . . . . . 280 Xiao-dong Yi, Xue-jun Yang
Table of Contents
XI
A Novel Efficient Group Signature Scheme with Forward Security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292 Jianhong Zhang, Qianhong Wu, Yumin Wang Variations of Diffie-Hellman Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301 Feng Bao, Robert H. Deng, HuaFei Zhu A Study on the Covert Channel Detection of TCP/IP Header Using Support Vector Machine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313 Taeshik Sohn, JungTaek Seo, Jongsub Moon A Research on Intrusion Detection Based on Unsupervised Clustering and Support Vector Machine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325 Min Luo, Lina Wang, Huanguo Zhang, Jin Chen UC-RBAC: A Usage Constrained Role-Based Access Control Model . . . . . 337 Zhen Xu, Dengguo Feng, Lan Li, Hua Chen (Virtually) Free Randomization Techniques for Elliptic Curve Cryptography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348 Mathieu Ciet, Marc Joye An Optimized Multi-bits Blind Watermarking Scheme . . . . . . . . . . . . . . . . . 360 Xiaoqiang Li, Xiangyang Xue, Wei Li A Compound Intrusion Detection Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 370 Jianhua Sun, Hai Jin, Hao Chen, Qian Zhang, Zongfen Han An Efficient Convertible Authenticated Encryption Scheme and Its Variant . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382 Hui-Feng Huang, Chin-Chen Chang Space-Economical Reassembly for Intrusion Detection System . . . . . . . . . . 393 Meng Zhang, Jiu-bin Ju A Functional Decomposition of Virus and Worm Programs . . . . . . . . . . . . . 405 J. Krishna Murthy
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415
A Fast Square Root Computation Using the Frobenius Mapping Wang Feng, Yasuyuki Nogami, and Yoshitaka Morikawa Dept. of Communication Network Engineering, Okayama University, Okayama-shi, 700-8530, Japan {wangfeng, nogami, morikawa}@trans.cne.okayama-u.ac.jp
Abstract. The objective of this paper is to give a fast square root computation method. First the Frobenius mapping is adopted. Then a lot of calculations over an extension field are reduced to that over a proper subfield by the norm computation. In addition a inverse square root algorithm and an addition chain are adopted to save the computation cost. All of the above-mentioned steps have been proven to make the proposed algorithm much faster than the conventional algorithm. From the table which compares the computation between the conventional and the proposed algorithm, it is clearly shown that the proposed algorithm accelerates the square root computation 10 times and 20 times faster than the conventional algorithm in Fp11 and Fp22 respectively. At the same time, the proposed algorithm reduces the computation cost 10 times and 20 times less than the conventional algorithm.
1
Introduction
It is well known that in the modern IT-oriented society it is critically important to keep private information secure from evil eavesdroppers. As technology to ensure the security Elliptic Curve Cryptosystem (ECC), a public-key cryptosystem, has been widely studied[1] because it only requires 160 bits length key, while Rivest Shamir Adleman (RSA) cryptosystem based on the difficulty of large number factorization, which has been extensively used in the last two decades, needs 2000 bits length key. On the other hand, IC cards and mobile telephones have become quite compact in recent years, and it is not practical to implement the RSA cryptosystem on such devices with only scarce computation resources. So a lot of studies such as fast implementation of ECC are carried out[2]. In order to implement ECC, not only the acceleration of the fundamental operations, but also that of the square root (SQRT) computation over an extension field must be studied for every encryption process. The objective of this paper is to give a still faster SQRT computation method. First the Frobenius mapping is adopted for exponentiation. Then a lot of calculations over an extension field are reduced to that over a proper subfield by the norm computation. In addition the authors use the inverse SQRT algorithm which better fits to our objective than the conventional algorithm[1]. As far as the authors know there are no reports that the Frobenius mapping has been used in improving the SQRT S. Qing, D. Gollmann, and J. Zhou (Eds.): ICICS 2003, LNCS 2836, pp. 1–10, 2003. c Springer-Verlag Berlin Heidelberg 2003
2
W. Feng, Y. Nogami, and Y. Morikawa
computation speed before. All of the above-mentioned steps have been proven to make the SQRT computation much faster than the conventional algorithm. In this paper, the authors take examples for Fp11 and Fp22 as extension fields. From the table which compares the computation between the conventional and the proposed algorithm, it is clearly shown that the proposed algorithm accelerates the SQRT computation 10 times and 20 times faster than the conventional algorithm in Fp11 and Fp22 respectively. At the same time, the proposed method reduces the computation cost 10 times and 20 times less than the conventional algorithm. This shows the fact that the proposed algorithm has a great value for the fast SQRT computation. Throughout this paper, let p and m = rn be the characteristic and the extension degree, respectively, where p is an odd prime number larger than 3, r = 2u , u = 0, 1, · · · and n is an odd number. Am , Mm , and φm denote addition, multiplication and the Frobenius mapping in extension field Fpm , respectively. #Am , #Mm and #φm denote the number of these operations, respectively.
2
Background
An elliptic curve is generally given by E(x, y) = x3 + ax + b − y 2 = 0,
(1)
where a, b are the constant elements of finite field Fq , q is an odd prime number larger than 3. In Eq.(1), since all computations are carried out over Fq , Fq is called definition field and the solutions (x, y) of Eq.(1) are said Fq -rational points. Fq -rational points of Eq.(1) form an additive Abelian group over a definite geometric addition and the security of the ECC relies heavily upon the difficulty of the discrete logarithm problem over this group. For ensuring sufficient security, the order of elliptic curve that is the number of Fq -rational points must be larger than 2160 or have such a large prime factor[1]. At the same time ECC must resist all the known attacks: the anomalous elliptic curve attack[3], MenezesOkamamoto-Vanstone (MOV)[4], Frey-Ruck[5] and Weil descent attack[6]. The conditions of the definition field Fq that are enough to resist those attacks can fortunately come down to the following [6], [7], [8]: – the definition field is a prime field, – the extension degree m is not divisible by 4 and each odd prime factor of m is larger than or equal to 11. In order to encrypt a plain text by ECC, we have to map the text to an Fq rational point on Eq.(1). For this purpose we interpret the text as the distinct integer and regard the result as x-coordinate of Eq.(1). However the solution of y in Eq.(1) may not exist in Fq . More precisely when we let integer ξ be x-coordinate of an Fq -rational point, the point ξ, E(ξ, 0) is an Fq -rational point if and only if η = E(ξ, 0) is a quadratic power residue, where“
A Fast Square Root Computation Using the Frobenius Mapping
η
(q−1)/2
=
1 if η is a quadratic power residue (QPR) −1 if η is a quadratic power non residue (QPNR)
.
3
(2)
It should be noted that η is a nonzero element in this paper. Thus, realization ECC requires not only the implementation of Abelian addition of rational points but also that of the QPR test of Eq.(2) and a SQRT computation.
3 3.1
Fast Square Root Computation Quadratic Power Residue Test
QPR test for η can be done through calculating the left hand side of Eq.(2). For the computation over any finite field we usually resort to the binary method in which we compute η 1 , η 2 , η 4 , · · · and then combine them to produce the desired power of η. However in this paper since we are interested in the case of an extension field, we let q = prn , where r and n as described in the last place of Section1. In this instance we can regard the extension fields Fp11 and Fp22 as two especial cases. When q = prn , the left hand side of Eq.(2) can be rewritten as: rn
η (p
−1)/2
r
= (η 1+p
+···+pr(n−1) (pr −1)/2
)
= (α1+p+···+p
r−1
r
= α(p
−1)/2
)(p−1)/2 = β (p−1)/2 ,
(3)
where α = NFpr (η), β = NFp (α) and NFq (·) denotes the norm of · with respect to Fq . Since α is given as the product of all conjugates of η ∈ Fprn with respect to its subfield Fpr , it becomes a nonzero element in Fpr . Since multiplication in a lower extension field is more economic than that in a higher extension field, the reduction of Eq.(3) can give a fast implementation of QPR test. Therefore for QPR test over general extension field we can say that the maximum efficiency can be attained when we factorize the extension degree as much as possible and apply reduction such as Eq.(3) to each factor. It is noted that Eq.(3) is valid even when r = 1, in this case α = β = NFp (η) and β = NFp (α) becomes nonsense. Furthermore, NFpr (η) can be expressed by using the Frobenius mapping φ[i] (x): NFpr (η) =
n−1
φ[ri] (η),
i
φ[i] (x) = xp .
(4)
i=0
Since φ[i] (x) has the linearity: φ[i] (aξ + bζ) = aφ[i] (ξ) + bφ[i] (ζ) (a, b ∈ Fp , ξ, ζ ∈ Fprn ),
(5)
and any element η in Fprn is expressed as linear combination of basis. If the Frobenius mapping of the basis is simple, φ[i] (η) can be obtained almost without computation, especially in optimal extension field (OEF)[9] whose modular polynomial is an irreducible binomial, all one polynomial field (AOPF)[10] whose
4
W. Feng, Y. Nogami, and Y. Morikawa
modular polynomial is an irreducible all one polynomial and successive extension field (SEF)[11] that is a combination of OEF and AOPF. Moreover, we can save the considerable computation cost in norm computation by an addition chain where we use repeatedly the previously obtained values. Fig.1 shows an addition chain example to compute NFp (η) for η ∈ Fp11 . From Fig.1 we see that the required number of multiplication over Fp11 is only 5 in that addition chain, while from Eq.(4) it is 10 in direct computation. In general, the computations of NFpr (η) for η ∈ Fprn and NFp (α) for α ∈ Fpr require the following cost: #φrn = #Mrn = log2 (n) + w(n) − 1, (6) #φr = #Mr = log2 (r) + w(r) − 1, where · and w(·) show respectively the maximum integer less than · and the Hamming weight of · .
Fig. 1. An example of addition chain when r = 1, n = 11
In the last part of QPR test, we must compute the β (p−1)/2 as shown in Eq.(3). By the binary method the computation cost is: p−1 p−1 #M1 = log2 +w − 1. (7) 2 2 In the rest of this paper, we often encounter the same function of the number for evaluation, so we express the function in short: LW (·) = log2 (·) + w (·) . 3.2
(8)
Square Root Computation (The Conventional Algorithm)
Since SQRT computation over a finite field is a kind of discrete logarithm problem, its computation cost is expensive. For SQRT computation, we usually resort to the conventional algorithm.
A Fast Square Root Computation Using the Frobenius Mapping
5
The Conventional Algorithm Input: A nonzero quadratic power residue η ∈ Fprn . √ Output: A square root η ∈ Fprn . Preparation: (1) Factorize the order of multiplicative group in Fprm as prn − 1 = 2e s, s is odd, and e ≥ 1.
(9)
(2) Find an appropriate QPNR element θ ∈ Fprn and compute a = θs . Procedure: Step1: Compute b = η (s−1)/2 and set t0 = 0, k = 0. Step2: Iteratively compute tk by increasing k up to e − 2:
2e−2−k 0 if (atk b)2 · η =1 k tk+1 = tk + 2 ck , ck = . (10)
t 2 2e−2−k = −1 1 if (a k b) · η √
η = ate−1 bη. In the above description, a, e, s can be prepared if the field is given and these parameters are determined before SQRT computation. It is also noted that when e is equal to 1, Step2 is skipped and the result remains valid. Moreover, if we √ √ substitute Step3’ η −1 = ate−1 b for Step3 η = ate−1 bη in the conventional algorithm, then we regard the modified version as the inverse SQRT algorithm. Although the computation of Eq.(10) can use the binary method to improve its speed, we have to carry out all of exponentiations and multiplications over the extension field Fprn . Furthermore, the computation in Step1 is also done by using the binary method, but it is rather slower comparing with the Frobenius mapping when p is a very large odd prime number. Step3: Output the square root
3.3
Square Root Computation (Proposal)
As shown in Eq.(3), for η ∈ Fprn , r
α = η 1+p
+···+pr(n−1)
∈ F pr .
(11)
Multiplying the both sides of Eq.(11) by η and then taking the SQRT, we have: √
(pr +1)/2 √ −1 η = η ηE α ,
(n−1)/2
E=
pr(2i−1) .
(12)
i=1
In Eq.(12) we can effectively adopt the Frobenius mapping to compute η E . We can also compute the (pr +1)/2-th power mainly by using the Frobenius mapping, if we develop (pr + 1)/2 as follows: p−1 pr + 1 = · (pr−1 + pr−2 + · · · + p + 1) + 1. 2 2
(13)
6
W. Feng, Y. Nogami, and Y. Morikawa
The computation by using the Frobenius mapping is feasible for the parenthesis part in Eq.(13) and for (p − 1)/2 part we compute by the binary method. Consequently we obtain the following fast SQRT computation using the Frobenius mapping. The Proposed Algorithm Input: A nonzero quadratic power residue η ∈ Fprn . √ Output: A square root η ∈ Fprn . Step1: Compute α = NFpr (η).
(pr +1)/2 Step2: Computation for η η E in Eq.(12): (1) Compute ξ = η E by the Frobenius mapping. r−1
(2) From Eq.(13), compute ψ = ξ 1+p+···+p (3) Compute ζ = ψ
by using the Frobenius mapping.
(p−1)/2
by binary method.
E (pr +1)/2 and then multiply by η to get (4) Multiply ζ by ξ to get η
(pr +1)/2 ω = η ηE .
Step3: By using α ∈ Fpr , compute √ −1 √ Step4: Compute η = ω α .
√ −1 α with the inverse SQRT algorithm.
As mentioned above, we know that not only the binary method but also the Frobenius mapping is adopted for exponentiation in the proposed algorithm. And then a lot of calculations over an extension field are reduced to those over a proper subfield. Furthermore an addition chain and the inverse SQRT algorithm are also adopted to save the computation cost.
4
4.1
Comparison between the Conventional and the Proposed Algorithm Evaluation of the Conventional Algorithm
Since the cost of preparation in the conventional algorithm becomes negligibly small compared with that of the main procedure, in what follows, we only evaluate the cost of the main procedure. At first, suppose θ be a QPNR element of Fpr , then we have: r
θ(p
−1)/2
= −1.
Moreover n is an odd number, so we have:
(14)
A Fast Square Root Computation Using the Frobenius Mapping rn
θ(p
−1)/2
r
= (θ(p
7
−1)/2 1+pr +···+pr(n−1)
)
r
= (−1)1+p
+···+pr(n−1)
= −1,
(15)
where we should note the fact that 1 + pr + · · · + pr(n−1) is an odd number. Therefore, θ is also a QPNR element over Fprn . This shows that we can choose θ ∈ Fpr in the conventional algorithm. In the main procedure of the conventional algorithm, in Step1 we must compute η (s−1)/2 . By the binary method it requires the following cost: s−1 − 1. (16) #Mrn = LW 2 In Step2, when every ck in Eq.(10) is equal to 1, the computation cost becomes the maximum. In this case k = 0, 1, · · ·, for very k, atk b is: ab, aa2 b, aa2 a4 b, · · · ,
e−2
i
a2 b.
(17)
i=0
For example, aa2 b corresponding to k = 1 is computed by multiplying ab and a2 together, where we should note that ab has been already computed when k = 0. In addition, since a = θs is an element in Fpr , we can obtain ab with computation cost #Mr = n. Therefore, for every k the computation cost of atk b is: (18) #Mr = (e − 2) + n{(e − 2) + 1}. Next, we compute the square of atk b and then multiply by η for each k as shown in Eq.(10). Accordingly, the computation cost is given by #Mrn = 2(e − 1).
(19)
We compute 2e−2−k -th power for each k as shown in Eq.(10) and finally multiply by η in Step3. Therefore, these operations need the following computation cost: #Mrn =
e−2 i=1
4.2
i+1=
(e − 1)(e − 2) + 1. 2
(20)
Evaluation of the Proposed Algorithm
Before making a SQRT computation of an input element, we usually perform a QPR test. As mentioned in Section3.1, the left hand side of Eq.(2) is evaluated with two steps as shown in Eq.(3). It is considered that α of Eq.(11) has been computed in the QPR test. From Fig.1 η E in Eq.(12) has also been computed in the QPR test. Therefore the computation cost in Step1 and in Step2-(1) in the proposed algorithm is not necessary to count.
8
W. Feng, Y. Nogami, and Y. Morikawa
First, let us evaluate the cost of Step2-(2), Step2-(3) and Step2-(4), it is given by the following: #φrn = #Mrn = LW (r) − 1, #Mrn = LW
p−1 2
(21)
+ 1,
(22)
where Eq.(21) is the cost of (2), Eq.(22) is the cost of (3) and (4). Next, in Step3 √ −1 we apply the inverse SQRT algorithm described in Section3.2 to compute α , where α is given by Eq.(11). As mentioned in Section4.2, it is noted that α is given as a nonzero element in Fpr . By the evaluation of the conventional algorithm in Section4.1, we need the following computation cost:
s − 1 (e − 1)(e − 2) − 1 + (2e − 3) + 2(e − 1) + , #Mr = LW 2 2
(23)
if we subtract one multiplication from the cost of the conventional algorithm computation when n = 1, we can easily get the Eq.(23). √ −1 and ω in Step4, and the result is the objective At last, we multiply α √ −1 √ SQRT η. For this operation, since α and ω are nonzero elements in Fpr rn and Fp respectively, we need the following computation cost: #Mr = n.
5
(24)
Experimental Results and Conclusion
In this section, we restrict characteristic p and extension degree m as follows: p = 228 + 625 = 268436081,
(25a)
m = 11 and 22.
(25b)
And then we simulate the conventional and the proposed algorithms over Fp11 and Fp22 , where we construct Fp11 by adopting the following binomial as the modular polynomial[9]: x11 − 2.
(25c)
And we construct Fp22 as SEF by adopting the all one polynomial[11],[12]: x2 + x + 1.
(25d)
Based on Eq.(25), we can explicitly evaluate the computation cost of the fundamental arithmetic over Fp11 and Fp22 such as φm and Mm , where m is the extension degree, with #A1 and #M1 as mentioned in column A of Table 1. In column B, we convert and show the cost of those operations over Fp .
A Fast Square Root Computation Using the Frobenius Mapping
9
Table 1. Computation cost needed for a square root computation CPU: Pentium4, 2.67GHz A. Numbers of Operations #φ2 #φm #M1 #M2 #Mm
Fp11
Fp22
B. Computation Cost C. Simulation #A1
#M1
Result[µs]
QPR Test
−
5
31
−
5
1010
386
27.0
conventional
−
0
92
−
444
89688
27176
21.4 × 102
proposal
−
4
110
−
36
7272
2346
19.3 × 10
QPR Test
1
5
31
1
5
3045
999
92.0
conventional
1
0
31
115
930
566015
161266
12.5 × 103
proposal
1
10
31
165
42
26361
7992
65.6 × 10
Remarks: In this table, the cost of QPR test and a = θs is also evaluated.
We implemented the conventional and the proposed algorithms on a Pentium4 (2.67GHz) with C language. From Table 1, it is clearly shown that the proposed algorithm accelerates the SQRT computation 10 times and 20 times faster than the conventional algorithm in Fp11 and Fp22 respectively. At the same time, the proposed algorithm reduces the computation cost 10 times and 20 times less than the conventional algorithm. The main reason is that we adopt the Frobenius mapping and most multiplications over the definition field Fpm are replaced by those over its proper subfield Fp or Fp2 . Consequently, we can conclude that the proposed algorithm is quite effective compared with the conventional algorithm.
References 1. I.Blake, G.Seroussi, and N.Smart, Elliptic Curves in Cryptography, LNS 265, Cambridge University Press, 1999. 2. J.Guajardo, R.Blumel, U.Kritieger, and C.Paar, “Efficient Implementation of Elliptic Curve Cryptosystems on the TI MSP430x33x Family of Microcontrollers,” PKC2001, LNCS 1992, pp. 365–382, 2001. 3. T.Sato, and K.Araki, “Fermat Quotients and the Polynomial Time Discrete Lot Algorithm for Anomalous Elliptic Curve,” Commentarii Math. Univ. Sancti. Pauli, vol47, No.1, pp. 81–92, 1998. 4. A.Menezes, T.Okamoto, and S.Vanstone, “Reducing Elliptic Curve Logarithms to Logarithms in a Finite Field,” IEEE Trans. 39, pp. 1639–1646, 1993. 5. G.Frey and H.R¨ uck,“A Remark Concerning m-Divisibility and the Discrete Logarithm in the Divisor Class Group of Curves,” Math. Comp., vol.62, pp. 865–874, 1994.
10
W. Feng, Y. Nogami, and Y. Morikawa
6. P.Gaudry, F.Hess, and N.Smart,“Constructive and destructive facets of Weil descent on elliptic curves,” Hewlett Packard Lab. Technical Report, HPL-2000-10, 2000. 7. http://www.exp-math.uni-essen.de/˜diem/english.html 8. http://www.ieee.org/p1363 9. D.B.Bailey and C.Paar, “Optimal Extension Fields for Fast Arithmetic in PublicKey Algorithms,” Proc. Asiacrypt2000, LNCS 1976, pp. 248–258, 2000. 10. Y.Nogami,A.Saito, and Y.Morikawa, “Finite Extension Field with Modulus of AllOne Polynomial and Expression of Its Elements for Fast Arithmetic Operations,” Proc. of The International Conference on Fudamentals of Electronics, Communications and Computer Sciences (ICFS2002), R-18 pp. 10–15, 2002. 11. T.Kobayashi, K.Aoki, and F.Hoshino, “OEF Using a Successive Extension,” Proc. The 2000 Symposium on Cryptography and Information Security, no. B02, 2000, in Japanese. 12. Y.Nogami, Y.Fujii, and Y.Morikawa, “The Cost of Operations in Tower Field,” The 2002 Symposium on Cryptography and Information Security, vol.2, pp. 693– 698,2002.
A Forward-Secure Blind Signature Scheme Based on the Strong RSA Assumption Dang Nguyen Duc1 , Jung Hee Cheon2 , and Kwangjo Kim1 1
2
International Research Center for Information Security (IRIS) Information and Communication University (ICU) 58-4 Hwaam-dong, Yusong-gu, Deajeon, 305-732 Korea {nguyenduc, kkj}@icu.ac.kr http://www.iris.re.kr/ School of Mathematical Science, Seoul National University (SNU) San 56-1 Shillim-Dong, Kwanak-Gu, Seoul 151-747, Korea [email protected]
Abstract. Key exposures bring out very serious problems in security services. Especially, it is more severe in the applications such as electronic cash or electronic payment where money is directly involved. Forward secrecy is one of the security notions addressing the key exposure issues. Roughly speaking, forward secrecy is aimed to protect the validity of all actions using the secret key before the key exposure. In this paper, we investigate the key exposure problem in blind signature (with an application to the electronic cash in mind) and propose a blind signature scheme which guarantees forward secrecy. Our scheme is constructed from the provably secure Okamoto-Guillou-Quisquater (OGQ for short) blind signature scheme. Using the forking lemma by Pointcheval and Stern [4], we can show the equivalence between the existence of a forger with the solvability of the strong RSA problem. Further we show that our scheme introduces no significant communication overhead comparing with the original OGQ scheme.
1
Introduction
Digital signatures are the most well-known public key cryptography application which provides authentication of signing act. Clearly, the ability to sign (i.e., owning the secret keys) must be available to the signer only. In practice, it is very difficult to guarantee that secret keys cannot be compromised since many implementation and administration errors can be exploited. To relax the problem, an intuitive solution is to use many secret keys - each valid only within a short period of time - and preferably keeps the public key unchanged over its lifetime. Such strategy is called key evolution. However, key evolution must be designed carefully. For instance, if secret keys used in the past can be easily computed from the compromised secret key then key evolution does not help dealing with the key exposure problem. To address this issue, the notion of forward secrecy was introduced by Anderson S. Qing, D. Gollmann, and J. Zhou (Eds.): ICICS 2003, LNCS 2836, pp. 11–21, 2003. c Springer-Verlag Berlin Heidelberg 2003
12
D.N. Duc, J.H. Cheon, and K. Kim
[2]. Intuitively speaking, forward secrecy preserves security goal for all previous usage in case the current secret key is compromised. In other words, security goal is protected up to (forward ) the time of secret key exposure. An interesting extension of digital signature is blind signature proposed by Chaum [1]. Blind signature enables users to get a signer’s signatures on their messages without revealing the message contents. Blind signature plays one of key ingredients in electronic cash system where the bank plays as the signer and customers play as users. Roughly, let’s assume that a signature issued by the bank is equivalent to an electronic coin. Now, we consider the key exposure problem in case of blind signature (and so of an electronic cash system). It turns out that the key exposure problem in blind signature is very serious. Specifically, in electronic cash system, it is very severe since money is directly involved. When secret keys of the bank are stolen, attacker can generate as many valid electronic coins as he wants. Suppose that the bank is aware of key exposure and performs public key revocation. Since nobody can trust signature generated by using the stolen key, people who withdrawn their electronic coins but have not spent it, or who were paid electronic coins but have not deposited it will lose their money. The first solution, the bank can think of, is to make stealing his secret keys essentially hard. For example, the bank can use secret sharing technique to distribute secret keys to several sites together with a threshold blind signature scheme to issue signatures. Clearly, this approach makes it more difficult for attackers to steal secret keys since they have to break in all sites holding shared secrets to learn the bank’s secret keys. However, the above approach requires distributed computation that is very costly. Again, we turn to key evolution and forward secrecy. Specifically, the bank updates his secret key at discrete intervals and it is infeasible for an adversary to forge any signature valid in the past even if the current secret key is compromised. Blind signature is also seen to have other applications including electronic voting, auction, etc. All those applications are clearly vulnerable against key exposure problem. Thus relaxing the key exposure problem in blind signature is a useful feature not only in electronic cash but also in many other cryptographic applications. Our approach to construct a forward secure blind signature scheme is to extend a well-studied blind signature scheme in the literature. We choose the Okamoto-Guillou-Quisquater (OGQ for short) blind signature scheme as our candidate. This scheme is constructed from the witness indistinguishable identification protocol based on Guillou-Quisquater identification protocol by Okamoto ∗ [8]. This blind signature scheme works on ZN where N is a product of two large primes. The security of this scheme is proved by Pointcheval and Stern under random oracle model [4]. The scheme seems not to be vulnerable against generalized birthday attack [12] since this attack requires the knowledge of the order of the base group which is equivalent to factoring N . In this paper, we present a forward secure blind signature scheme by extending the OGQ blind signature scheme. Our scheme exhibits an efficient key updating protocol and introduces no significant overhead comparing to the OGQ scheme.
A Forward-Secure Blind Signature Scheme
13
The organization of the paper is as follows: In Section 2, we present background and definitions. The description of our forward secure blind signature scheme is given in Section 3. In Section 4, we analyze correctness, efficiency and security of our proposed scheme. Section 5 will be our conclusion and future work.
2 2.1
Background The Key-Evolving Blind Signature
In this section, we demonstrate a formal definition of a key-evolving blind signature scheme. The definition is adopted from the definition for a key-evolving digital signature given in [6]. Definition 1. A key-evolving blind signature scheme consists of five algorithms, FBSIG = , where 1. FBSIG.Setup is a probabilistic polynomial-time algorithm which takes the security parameter k as its input and outputs system parameters including the initial secret key SK1 and the public key P K of the signer. 2. FBSIG.Update is either deterministic or probabilistic algorithm. It takes the secret key SKi for current time period, period i, as its input, and outputs a new secret key SKi+1 for time period i + 1. 3. FBSIG.Signer and FBSIG.User are a pair of probabilistic interactive Turing machines which model the signer and an user involving in a signature issuing session, respectively. Both machines have the following tapes: a readonly input tape, a write-only output tape, a read/write work tape, a read-only random tape and two communication tapes (one read-only and one writeonly). The two machines may share a common read-only input tape as well. FBSIG.Signer has its secret key SKi on its input tape in time period i. FBSIG.User has a message m and the signer’s public key P Ki on its input tape. FBSIG.Signer and FBSIG.User engage in a signature issuing protocol. After the protocol ends, FBSIG.Signer either outputs ‘complete’ or ‘incomplete’, and FBSIG.User either outputs signature of the message m, (i, σ(m)), or ⊥ (i.e., error) respectively. 4. FBSIG.Verify is a deterministic algorithm which takes the public key of the signer, P K, and message, signature pair (m, i, σ(m)) as its input. It outputs either ‘accept’ or ‘reject’. Clearly, for every valid signature, FBSIG.Verify must output ‘accept’. We should emphasize that the period index, i, must be embedded into every signature. Otherwise, we cannot tell in which time period, the signature is issued.
14
2.2
D.N. Duc, J.H. Cheon, and K. Kim
Security Notions for a Key-Evolving Blind Signature with Forward Secrecy
Blindness. One characteristic of the ordinary cash is anonymity, meaning that user’s buying activities can not be traced by the bank who issues cash. Blind signature clearly needs to address this issue since it is a means of cash issuance in electronic cash system. In fact, blindness is stronger than “obtaining signature without revealing message”. To satisfies anonymity, blindness property implies that the signer cannot statistically distinguish signatures. In a key-evolving blind signature, one may argue that since the time period index must be included in every signature. Then, the signer may use the time period index to uniquely identify every signature if he updates his secret keys after issuing each signature. So blindness property will be lost. However, the time period index j is publicly available and the signer must agree with all involved parties on when his secret keys should be updated. Another issue one may concern is that if a time period is too short, then there will be only a few signatures issued in that period. It may make the signer easier to identify signatures later on. This can be prevented by requiring a more rigorous blindness property. Let’s consider the following game played by the signer (or any adversary that controls the signer) and two honest users, say U0 and U1 . – The signer chooses two messages m0 and m1 . – A referee chooses a random bit b and then mb and m1−b are given to U0 and U1 , respectively. – U0 and U1 engage with the signer to get signatures on their messages, mb and m1−b , respectively (not necessery in two different time periods since blindness property must be satisfied for all signatures, not just for signatures issued in one time period). Then, The two signatures are given to the signer. Finally, the signer outputs a guess for b, say b . The signer wins the game if b = b . If probability that the signer wins the game is no better probability of guessing the random bit b given no information (i.e., probability of 12 ), the signer cannot link a signature to its owner. We say that blindness property is satisfied. Forward Secrecy in Key-evolving Blind Signature. In different cryptographic schemes, forward secrecy may have different meanings depending on security goals for the schemes. In blind signature context, forward secrecy means unforgeability of signatures valid in previous time periods even if the current secret key of the signer is compromised. 2.3
Security Assumption
The security assumption of our scheme depends on the intractability of the strong RSA problem. The strong RSA problem is described as follows: Given a RSA modulus N (which is a product of two large primes) and a random ∗ ∗ element c ∈ ZN , find m and r ∈ ZN such that mr = c mod N . The strong RSA assumption implies that the strong RSA problem is intractable.
A Forward-Secure Blind Signature Scheme
15
The strong RSA assumption is usually used with a special modulus N , i.e., that is a product of two numbers, so called safe primes. We give definition of a safe prime as follows: Definition 2. Given a prime number q , if q = 2q + 1 is also prime, we call q is a safe prime number. (q is known as Sophie Germain prime.).
3
Our Forward Secure Blind Signature Scheme
In this section, we describe our forward secure variant of the OGQ blind signature scheme. We denotes ÷ by a division operation which gives the result as the quotient of the division (i.e., if a = qb + r then a ÷ b = q). The denotes string concatenation. Also, we assume that a collision-free hash function H is available where its domain and codomain are {0, 1}∗ and Zλ∗ (λ is a prime), respectively. Firstly, we explain our idea on implementing a key-evolving protocol for the OGQ blind signature scheme. The OGQ scheme works on the multiplicative ∗ where N is a product of two primes. Its secret key is a pair (r, s) group ZN and the corresponding public key is V = a−r s−λ where a and λ are public (λ is also prime). Updating the secret s is easy, we just compute s from s by squaring, say s = s2 . However, updating r (in a way the new public key is related to the old public key) is difficult because we do not know the order of ∗ . If we compute V 2 , we get V 2 = a−2r (s2 )−λ mod N . We cannot take a in ZN (2r, s2 ) as a new secret key pair since it is trivially easy to get r from 2r. To add ∗ randomness to the new r, we take a random exponent e from ZN and compute V 2 ae = a−2r+e (s2 )−λ mod N . l and r denote the quotient and the remainder of (2r − e) divided by λ, respectively. Then, we have V 2 ae = a−r (al s2 )−λ mod N . Now, we can take V 2 ae as a new public key, (r , s = al s2 ) as a new secret key. This key-evolving protocol is forward secure because in order to compute r or s from the new key pair (r , s ) and ae mod N , one needs to compute e from ae or s from s2 . Since e is taken randomly, both of problems are very root finding ∗ problem in ZN , which is equivalent to factoring N [14]. In an offline electronic cash system, payment can be made without online communication with the bank. In other words, verifiers should be able to verify signature without online communication with the signer. Therefore, in our case, ae should be embedded into every signature so that verifier can compute the public key from V and the period index. One may argue that it is no better than generating the new key pair at random and including the public key into every signature. However, in blind signature, users are in charge of hashing their messages. Thus, users are under no obligation to embed the correct time period index into signatures (which means forward secrecy is lost). In contrast, the public key in our scheme is continuously squared after every period. So for i verifiers to compute correct public key using period index (i.e., V 2 ), users must embed the correct time period index into signatures. We now describe each component of a five-tuple FBSIG = .
16
D.N. Duc, J.H. Cheon, and K. Kim
algorithm FBSIG.Setup(k) Generate randomly two safe primes p and q of length k/2 bits N ← pq ϕ(N ) ← (q − 1)(p − 1) Generate a random prime λ such that it is co-prime with ϕ(N ) ∗ of order greater than λ Choose a from ZN ∗ ∗ Choose r0 ∈R Zλ s0 , e ∈R ZN V ← a−r0 s−λ mod N 0 f1 ← ae mod N v1 ← V 2 ae mod N l ← (2r0 − e) ÷ λ r1 ← (2r0 − e) mod λ s1 ← al s20 mod N Erase p, q, e, r0 , s0 and ϕ(N ) SK1 ← (1, r1 , s1 , v1 , f1 ) P K ← (N, a, V, λ) RETURN (P K, SK1 ) algorithm FBSIG.Update(SKi ) (i, ri , si , vi , fi ) ← SKi ∗ Choose e ∈R ZN 2 e vi+1 ← vi a mod N fi+1 ← fi2 ae mod N l ← (2ri − e) ÷ λ ri+1 ← (2ri − e) mod λ si+1 ← al s2i mod N SKi+1 ← (i + 1, ri+1 , si+1 , vi+1 , fi+1 ) Erase SKi , e and l RETURN (SKi+1 )
Note that, i, vi and fi of SKi are not secret anyway. We prefer to keep P K unchanged to avoid confusion because if public key is changed, we need to perform public key revocation. The signature issuing protocol is given as follows: algorithm FBSIG.Signer(SKi ) On Error RETURN ‘incomplete’ (i, N, λ, a, ri , si , fi ) ← SKi Choose t ∈R Zλ∗ ∗ Choose u ∈R ZN t λ x ← a u mod N Send x to FBSIG.User
algorithm FBSIG.User(P K, m) On Error RETURN ⊥
Get x from FBSIG.Signer (N, λ, a, V ) ← P K Choose blinding factors ∗ α, γ ∈R Zλ∗ and β ∈R ZN α λ γ x ← xa β vi mod N c ← H(i fi m x )
A Forward-Secure Blind Signature Scheme
17
c ← (c − γ) mod λ Send c to FBSIG.Signer
Get c from FBSIG.User y ← (t + cri ) mod λ w ← (t + cri ) ÷ λ z ← aw usci mod N Send y, z to FBSIG.User
Get y, z from FBSIG.Signer y ← (y + α) mod λ w ← (y + α) ÷ λ w ← (c − c) ÷ λ z ← aw vi−w zβ mod N σ(m) ← (fi , c , y , z ) RETURN (i, σ(m))
RETURN ‘complete’
We assume that when users contact with the signer, i, vi and fi are available to users (i.e., in the signer’s read-only public directory). All users can access those information anonymously. The ‘On Error’ pseudo-code can be interpreted as ‘Whenever an (unrecoverable) error occurs’. In practice, an error will be caused by a communication error between FBSIG.User and FBSIG.Signer. To express the signature of a message, we will omit the index i on fi since attackers (when try to forge a signature) do not have to use the correct f for a period). algorithm FBSIG.Verify(m, i, σ(m), P K) (N, λ, a, V ) ← P K (f, c , y , z ) ← σ(m) i vi ← V 2 f mod N λ x ← ay z vic mod N If c = H(i f m x ) then RETURN ‘accept’ else RETURN ‘reject’
4 4.1
Analysis of FBSIG Correctness
Theorem 1. Suppose that FBSIG.Signer and FBSIG.User engage in a signature issuing protocol in period i such that FBSIG.Signer returns ‘complete’ and FBSIG.User returns signature on a message m, (i, σ(m)). Then, FBSIG.Verify always returns ‘accept’ on input (P K, i, σ(m)). λ
i
Proof. We will show that x = ay z (V 2 fi )c = x mod N . If the signature issuing protocol ends successfully then f = fi and we have:
λ
i
ay z (V 2 fi )c = ay (aw vi−w zβ)λ vic mod N
18
D.N. Duc, J.H. Cheon, and K. Kim
= ay aw λ z λ β λ vic −w = = = = = = = =
λ
mod N
a (a usi c )λ β λ vic −w λ mod N ay+α awλ uλ si cλ β λ vic −w λ mod N ay+wλ aα uλ si cλ β λ vic −w λ mod N at+cri aα uλ si cλ β λ vic −w λ mod N −c λ c −w λ at uλ aα (a−ri s−λ mod N i ) β vi α λ −c c −w λ xa β vi vi mod N (c −c)−w λ mod N xaα β λ vi α λ γ xa β vi = x mod N
y +w λ
w
Hence H(i f m x ) = H(i f m x ) = c always holds which means that FBSIG.Verify always returns ‘accept’. 4.2
Efficiency
We compare the key and signature sizes (in bits) of our key-evolving blind signature scheme and the OGQ blind signature scheme in the following table. Scheme Public Key Size Secret Key Size Signature Size Our FBSIG 5k + log λ + log(i) k + log λ 2k + 2 log λ + log(i) OGQ Scheme 3k + log λ k + log λ k + 2 log λ
Note that log(i) is bit length of time period index. In terms of computational cost, the signature issuing procedure remains the same as the OGQ scheme. In verification process, we need to so some squaring operations to compute vi . Our key updating is quite efficient. It needs three squaring operations, two exponen∗ . tiations, one division and three multiplications in ZN 4.3
Security
Security of OGQ Blind Signature. In [4], the authors showed that onemore unforgeability is related to security of RSA cryptosystem. Even though the complexity of reduction step in their security proof is not polynomial in all security parameters, it is still one of the best result for blind signature. We state two theorems regarding the security of our scheme as follows: Theorem 2. Our proposed scheme satisfies blindness property of a blind signature scheme. Proof. Let’s consider the game played by an adversary A (the signer or the one controls the signer) and two honest users, U0 and U1 described in Section 2.2. If A receives ⊥ from one of users, then he has no information to help guessing b other than a wild guess. Now suppose that he gets (i, σ(mb )) = (i, fi , c b , y b , z b ) and (j, σ(m1−b )) = (j, fj , c 1−b , y 1−b , z 1−b ) from two users instead of ⊥. Note
A Forward-Secure Blind Signature Scheme
19
that, what are exchanged between the signer and an user during signature issuing protocol are c, y and z. We call (c, y, z) is a view of the signer. We should show that, given any view (c, y, z) and any signature (m, i, σ(m)), there always exist uniquely blinding factors such that the resulting signature is (m, i, σ(m)) and the view of the signer is (c, y, z). This fact prevents the signer from deciding a given view corresponding to which signature since blinding factors are chosen randomly. The blinding factors α, β and γ can be uniquely computed given (c, y, z) and (m, i, σ(m)) = (m, i, f, c , y , z ) as follows: γ = c − c mod λ, α = y − y mod λ and β = z /(aw vi−w z) mod N where w and w are computed i just like in the signature issuing protocol and vi = V 2 f mod N . To conclude, in any case, any adversary A cannot gain any helpful information during the signing protocol to guess b. In other words, his probability of success in guessing b is 1/2. Theorem 3. If there exists a forger which can break forward security of our scheme. Then, with non-negligible probability, we can violate the strong RSA assumption. Proof. A forger F obtains P K of the signer as its input, and interacts with the signer in an arbitrary way to get a set of message (of his choice) signature pairs M S. Whenever he wants, he breaks in the system (let say at time period b) and learns SKb . Finally, with non-negligible probability, F outputs a forged message/signature pair for a time period j < b which is not in the set M S. We need to simulate the signer to interact with F during signature issuing protocol and provide an hashing oracle to answer F ’s hashing queries. As usual, F can only interact with the signer polynomially many sessions and ask the hashing oracle polynomially many queries. We also need to provide a random tape for F. First, we guess the period j that F will output a forged signature for that period. The break-in time of F must be period b > j. We can easily compute SKb to answer F’s break-in query by using the key setup and update procedure properly. We will run F twice with the same input P K. At the first time, assume that F outputs a forged signature (j, σ1 (m)) = (j, f, c1 , y1 , z1 ) on a message m and the h-th query on the hashing oracle is (j f m x1 ). It is expected that j V 2 f = vj mod N . Otherwise, we retry from the beginning. For the second time, we run F with the same random tape and answer to its hashing oracle queries the same values as in the first run until the h-th query, (j f m x1 ). Due to the forking lemma [4], with non-negligible probability, F will again output a forged signature on message m for the period j, (j, σ1 (m)) = (j, f, c2 , y2 , z2 ). j j λ λ Then it must be the case that ay1 z1 (V 2 f )c1 = ay2 z2 (V 2 f )c2 mod N . Thus, j e(c −c ) ay1 −y2 (z1 /z2 )λ = vj 2 1 mod N (vj = V 2 f mod N ). Since vj = a−rj sj −λ mod N , we can come up with the following equation aρ = bλ mod N for some integer number ρ and b. This equation enables us to violate the strong RSA assumption due to the following lemma. Lemma 1. Given a, b ∈ (Z/N Z)∗ , along with ρ, λ ∈ Z, such that aρ = bλ mod ∗ N and gcd(ρ, λ) = 1, one can efficiently compute µ ∈ ZN such that µλ = a mod N.
20
D.N. Duc, J.H. Cheon, and K. Kim
Proof. Since gcd(ρ, λ) = 1 we can use extended Euclidean algorithm to compute two integers ρ and λ such that ρρ = 1+λλ . Then, µ = bρ a−λ mod N satisfies λ µ = a mod N . Using the above lemma we can compute a λ-th root of a which contradicts with our security assumption, the RSA assumption since it is very likely that gcd(ρ, λ) = 1 (since λ is prime).
5
Conclusions and Future Work
We presented the first forward secure blind signature scheme and analyzed its security. We believe that forward secrecy provides really useful features for a blind signature scheme, considering its applications such as electronic cash or electronic payment systems. Our scheme is as efficient as the original OGQ scheme. The key evolving protocol is efficient and supports unlimited time periods. However, the signature size of our scheme is two times of the original signature. Reducing the signature size is left as the future work. Our scheme can also be extended to general groups whose orders are hard to find. In this case, the security assumption also changes to the strong root assumption [13] which is an analogy of the strong RSA assumption. An example of groups of unknown orders are class groups of imaginary quadratic orders. This generalization will be described in the full version of this paper. Acknowledgment. The first author is grateful to Dr. Zhang Fangguo for his helpful discussion on blind signature. The second author was partially supported by SNU foundation in 2003.
References 1. David Chaum, “Blind Signatures For Untraceable Payments”, Advances in Cryptology – CRYPTO’82, Plenum Publishing, pp. 199–204, 1982. 2. Ross Anderson, “Two Remarks on Public Key Cryptography”, Invited Lecture, Fourth Annual Conference on Computer and Communications Security, ACM, 1997. 3. Louis S. Guillou and Jean J. Quisquater, “A Practical Zero-Knowledge Protocol Fitted to Security Microprocessors Minimizing both Transmission and Memory”, Advances in Cryptology – EUROCRYPT’88, LNCS 330, Springer-Verlag, pp. 123– 128, 1988. 4. David Pointcheval and Jacques Stern, “Provably Secure Blind Signatures Schemes”, Advances in Cryptology – ASIACRYPT’96, LNCS 1163, Springer-Verlag, pp. 252– 265, 1996. 5. Gene Itkis and Leonid Reyzin, “Forward-Secure Signatures with Optimal Signing and Verifying”, Advances in Cryptology - CRYPTO’01, LNCS 2139, SpringerVerlag, pp. 332–354, 2001. 6. Mihir Bellare and Sara K. Miner, “A Forward-Secure Digital Signature Scheme”, Advances in Cryptology – CRYPTO’99, LNCS 1666, Springer-Verlag, pp. 431–448, 1999.
A Forward-Secure Blind Signature Scheme
21
7. Fangguo Zhang and Kwangjo Kim, “ID-Based Blind Signature and Ring Signature from Pairings”, Advances in Cryptology – ASIACRYPT’02, LNCS 2501, SpringerVerlag, pp. 533–547, 2002. 8. Tatsuki Okamoto, “Provably Secure and Practical Identification Schemes and Corresponding Signature Schemes”, Advances in Cryptology - CRYPTO’92, LNCS 740, Springer-Verlag, pp. 31–53, 1992. 9. Ari Juels, Michael Luby and Rafail Ostrovsky, “Security of Blind Signatures”. Advanced in Cryptology – CRYPTO’97, LNCS 1294, Springer-Verlag, pp. 150– 164, 1997. 10. Ronald Crammer and Victor Shoup, “Signature Scheme Based on the Strong RSA Assumption”, In ACM Transactions on Information and System Security, volume 3, pp. 161–185, 2000. 11. Claus P. Schnorr, “Security of Blind Discrete Log Signatures Against Interactive Attacks”, In Proceedings of ICISC’01, LNCS 2229, Springer-Verlag, pp. 1–12, 2001. 12. David Wagner, “Generalized Birthday Problem”, Advances in Cryptology – CRYPTO’02, LNCS 2442, Springer-Verlag, pp. 288–303, 2002. 13. Safuat Hamdy and Bodo Moller, “Security of Cryptosystems Based on Class Groups of Imaginary Quadratic Orders”, Advances in Cryptology – ASIACRYPT’00, LNCS 1976, Springer-Verlag, pp. 234–247, 2000. 14. Dan Boneh and Ramarathnam Venkatesan, “Breaking RSA May Not Be Equivalent to Factoring”, Advances in Cryptology – EUROCRYPT’98, LNCS 1403, SpringerVerlag, pp. 59–71, 1998.
Secure Route Structures for the Fast Dispatch of Large-Scale Mobile Agents Yan Wang1 , Chi-Hung Chi2 , and Tieyan Li3 1 2
Department of Computing, Division of Information and Communication Sciences, Macquarie University, NSW 2109, Australia Department of Computer Science National University of Singapore 3 Science Drive 2, Singapore 117543 {ywang,chich}@comp.nus.edu.sg 3 Infocomm Security Department, Institute for Infocomm Research, 21 Heng Mui Keng Terrace, Singapore 119613 [email protected]
Abstract. For the application of large-scale mobile agents in a distributed environment, where a large number of computers are connected together to enable the large-scale sharing of data and computing resources, security and efficiency are of great concern. In this paper, we present secure route structures and corresponding protocols for mobile agents dispatched in binary to protect the dispatch route information of agents. The binary dispatch model is simple but efficient with a dispatch complexity of O(log2 n). The secure route structures adopt the combination of public-key encryption and signature schemes and expose minimal route information to hosts. The nested structure can help to detect attacks as early as possible.
1
Introduction
Mobile agents are computational entities that are autonomous, mobile and flexible that can facilitate parallel processing. Very often, a mobile agent acts on behalf of its owner to migrate through the distributed network, completes the specified tasks and returns results back to the owner [1,2,3]. The use of mobile agents in a distributed environment is gaining increasing attention. For example, in a national scale Grid environment [4,5,6,7,8], a large number of computers are loosely coupled together to enable the large-scale sharing of data and computing resources, where agents, especially mobile agents, are naturally the tools for monitoring, managing hosts and deploying jobs. Typically, a mobile agent can carry a computational job and execute at a host after being dispatched there. Likewise, in a mobile agent based E-commerce environment [9], mobile agents can be dispatched as the request of a consumer (end-user) to visit e-shops for asking offers for a specified product, evaluating these offers and negotiating with shops. In the above-mentioned environments, an efficient dispatch model is important and initial dispatch route information should be protected against potential malicious hosts. Otherwise, some attacks may be S. Qing, D. Gollmann, and J. Zhou (Eds.): ICICS 2003, LNCS 2836, pp. 22–33, 2003. c Springer-Verlag Berlin Heidelberg 2003
Secure Route Structures for the Fast Dispatch of Large-Scale Mobile Agents
23
easily mounted breaking the deployment of agents. So, if the owner needs to dispatch large-scale mobile agents, the security and efficiency are of great concern [10,11]. Tamper-poof devices [12] and secure coprocessors [13] are hardware-based mechanisms that can be used for protecting mobile agents and hosts. Softwarebased approaches involve more work such as using Hiding Encrypted Function (HEF) [14], using proxy signature [15] and using delegation certificate [16]. However, these approaches are either limited in certain context or arise other security problems. The secure structure for an individual mobile agent is discussed in [10]. Several secure route structures are presented in [17] for protecting a serially migrating agent. But a serial migrating agent can only satisfy small-scale applications and it is not adequate for Grid computing or E-commerce where parallelism is exploited to ensure high performance and fast response. In such a case, dispatching agents in parallel is essential. However, the secure route structures for mobile agents become more complicated. In this paper, we focus on the issue of efficiently dispatching mobile agents while protecting their routes. We first present a fast binary dispatch model (FBD), which is able to efficiently dispatch a large number of mobile agents in parallel. Based on this model, we present several secure route structures and security enhanced parallel dispatch protocols, which will expose minimal route information to current hosts. The nested structure of secure route can help to detect attacks as early as possible. In terms of security and robustness, these models are improved one by one targeted at preserving the efficiency of the hierarchical dispatch model while ensuring route security. In this paper, we assume a secure mobile agent environment employing well-known public key cryptography [19] and X.509 certification framework [18,19,20]. In the following, we assume that there exists a secure environment including the generation, certification and distribution of public keys. Each host enables an execution environment for mobile agents and can know the authentic public key of other hosts. The rest of this paper is organized as follows: Section 2 first previews the BBD model, a basic binary dispatch model. Then it presents FBD model, a fast binary dispatch model. Two secure route structures based on FBD are presented in Section 3. The security properties of two route structures are also compared in this section. The complexities of the route generation of different structures are analyzed in Section 4. Finally, Section 5 concludes this work.
2
A Fast Binary Dispatch Model (FBD)
When there are n mobile agents, a serial dispatch model is to dispatch them one by one. But it is not efficient since the dispatch complexity is O(n). In [21,22], we proposed the basic binary dispatch (BBD) model. It is a typical parallel dispatch model where each parent agent can dispatch two child agents resulting in a binary dispatch tree structure with the dispatch complexity of O(log2 n). We term an agent as a Master Agent (e.g. A0 in Figure 1) if it is created at the home host (e.g. H0 ) and is responsible for dispatching a pool of
24
Y. Wang, C.-H. Chi, and T. Li
mobile agents to remote hosts. We call an agent a Worker Agent (WA) if its sole responsibility is to perform simple tasks assigned to it, e.g. accessing local data. If a WA also dispatches other worker agents besides performing the task of local data accessing, it is called a Primary Worker Agent (PWA).
A0 A1(T) A3(2T)
A7(3T) A8(4T)
A15 4T
A16 5T
A2 (2T) A4(3T)
A5(3T)
A6(4T)
A9(4T) A10(5T) A11(4T) A12(5T) A13(5T) A14(6T)
(a) A possible binary dispatch for 6T A0 A1(T) A3(2T)
A2 (2T) A4(3T)
A5(3T)
A7(3T) A8(4T) A9(4T) A10(5T) A11(4T)
A15 4T
A16 5T
A6(4T)
A12(5T) A13(5T) A14(6T)
A14 5T
(b) An optimized binary dispatch for 5T Master Agent
PWA
WA
Fig. 1. FBD dispatch tree with 16 mobile agents
While the BBD model [21,22] is efficient, it has a drawback. For example, if there are 16 mobile agents, 8 mobile agents arrive at their destinations and start their local tasks at 4T and other 8 mobile agents do at 5T. Here we distinguish the tasks of a PWA by dispatch tasks and local tasks. Agent A1 arrives at its destination at 1T but it can only start its local data access task at 4T since it has to dispatch other agents. The start time is the same with agents A2 to A8 . So do other PWAs. In other words, half of the n agents can start their tasks at time (log2 n)T and the other half at time (log2 n + 1)T . As shown in Figure 1, in the FBD model, a PWA is only responsible for dispatching 1 or 2 child agents before starting its local task. No virtual dispatch is necessary. But to obtain fast dispatch performance, partial adjustment is necessary. As shown in Figure 1, one node should be moved to the left branch so
Secure Route Structures for the Fast Dispatch of Large-Scale Mobile Agents
25
that the overall dispatch time is within (log2 n + 1)T (see Figure 1b). It is the same with 32 or n (when n = 2h , h is an integer) agents. We can observe in Figure 1b that A1 starts its local task at 3T no matter how many descendent agents it has. It is 4T for A2 and A3 , and 5T for A4 and A5 etc. The latest one is (log2 n + 1)T when having n agents altogether. The final one is the same with BBD model. That means that the starting times of all agents disperse equally from 3T to (log2 n + 1)T but the dispatch complexity remains O(log2 n). This significantly benefits the efficiency when the number of mobile agents is large. For the implementation strategy of both BBD and FBD models, in IBM Javabased Aglets system [1], if all agents have the same type of tasks with different arguments, a clone-based strategy can be adopted. This can reduce the network bandwidth. Otherwise, all agent classes can be packaged in a JAR file that can be attached with a dispatched agent. A new agent instance can be created from it. For both strategies, the common feature is that when a new agent is created, arguments can be encapsulated before it is dispatched. Here in this paper, we focus on the generic route structures and ignore implementation details.
3
Two Secure Route Structures
In this section, we will discuss possible solutions of secure route structure and dispatch protocol based on the FBD model. The structure of an agent can be described as follows: {Cer0/id0, S, C, D} Cer0 is the certificate of its sender, which should be a registered host in a PKI (Public Key Infrastructure) environment. With it, a receiver could verify the ownership of a coming agent. Without loss of generality, for simplicity, Cer0 can be replaced by the unique id of the sender. S is the state of an agent represented by a set of arguments. A route is part of it. C is the code of the agent and D is the results obtained after execution. It can be sent back through messages. In the FBD model, if no secure route structure is provided, a host where a PWA resides can know all addresses of the hosts where the PWA’s descendant agents should go. Attacks can be easily mounted without being detected. In this section, to propose several secure route structures, we adopted the combination of public-key encryption and signature schemes. In our protocol, all routes are generated by master agent A0 at home host H0 before any dispatch is performed. Routes are encrypted by public keys of corresponding hosts that will be visited. A carried encrypted route can be decrypted with the assistance of the destination host. The host also helps to dispatch child agents when a PWA arrives there. The agent can verify the validity of plaintext using included signature. The host can delete a used route after the corresponding dispatch is successful. In the following context, we assume the following scenario. A host (say, home host H0 here) needs to dispatch a pool of mobile agents to other hosts for execution. After generating corresponding secure routes, the master agent A0
26
Y. Wang, C.-H. Chi, and T. Li
dispatches 2 PWAs by FBD, encapsulating secure routes to them and then waits for returned results. To simplify, we also suppose that agent Ai should be dispatched to host Hi where once arriving, Ai should deploy its subsequent child agents if it is a PWA or complete its local task if it is a WA. In our description, ¯ denotes the one-way hash function. PA denotes the public key of participant h A while SA denotes A’s secret key. Also we will examine if these secure route structures can be used to detect the attacks as follows. ATK1 : route forging attack (forge a route) ATK2 : route delete attack (delete a unused route) ATK3 : dispatch skip attack (skip a predefined dispatch) ATK4 : replay attack (dispatch a forged agent to a visited host) ATK5 : wrong dispatch attack (dispatch an agent to a wrong host) ATK6 : dispatch disorder attack (break the predefined dispatch order) 3.1
Secure Route Structure (I)
During the process of dispatching, a PWA resides at the same host without any migration. Its task is to dispatch one or two child agents and then complete its local task. The secure route structure is as follows: Secure Route Structure (I) (i) For a PWA A at current host CH, ¯ r(A)=PCH [isPWA, ip(LH), r(LA), ip(RH), r(RA), ip(H0 ), t, SH0 (h(isPWA, ip(PH), ip(CH), ip(LH), r(LA), ip(RH), r(RA), ip(H0 ), id(H0 ), t))] (ii) For a WA A at current host CH, ¯ ip(PH), ip(CH), ip(H0 ), id(H0 ), t))] r(A)=PCH [isWA, ip(H0 ), SH0 (h(isWA, where – r(A) denotes the route obtained at host H that is encrypted by the public key of H, say PH ; – isPWA or isWA is the token showing the current agent is a PWA or a WA; – ip(H) denotes the address of host H ; – CH is the current host; LH and RH are the left child host and right child host and PH is the parent host of CH ; H0 is the home host; – LA is the left child agent of A and RA is the right one; – if current agent has only one child agent, ip(RH) and r(RH) are NULL; – id(H0 ) denotes the unique identification of H0 ; here for simplification, we use it to represent the ownership; – t is the unique timestamp when the route is generated at H0 and it is unique in all routes; In route structure (I), the route of an agent is encrypted by the public key of its destination host. The route is encapsulated when it is dispatched by its parent agent. Starting the binary dispatch process with secure routes, the master agent A0 dispatches two PWAs to different hosts, each being encapsulated with an encrypted route for future dispatch task. When an agent has successfully
Secure Route Structures for the Fast Dispatch of Large-Scale Mobile Agents
27
arrived at the current host CH, it should send back a feedback message to its parent host PH confirming the successful dispatch as follows: ¯ . . ), SCH (ip(CH), tR , SH (h(. ¯ . . ))] PP H [ip(CH), tR , SH0 (h(. 0 This message is encrypted by the public key of home host including the signature by H0 included in the dispatched agent’s route. tR is the time when the agent is received. The carried route r(A) can be decrypted with the secret key of CH so that the agent can know: – it is a PWA or a WA. This is used to determine if it needs to dispatch child agents; ¯ – the signature signed at host H0 , i.e., SH0 (h(isPWA, ip(PH), ip(CH), ¯ ip(LH), r(LA), ip(RH), r(RA), ip(H0 ), t)) for a PWA, or SH0 (h(isWA, ip(PH), ip(CH), ip(H0 ), t)) for a WA. If it is a PWA, it will also know – the address ip(LH) of the left child host LH and its route r(LA); – the address ip(RH) of the right child host RH and its route r(RA); For any PWA or WA, the route includes the address of H0 (i.e. ip(H0 )), the home host where A0 is residing. With this address, the agent can send its result back to A0 . Next, we illustrate the dispatch process through an example. 1 When A0 is dispatched to H1 , it carries its route r(A1 ). 2 After the route is decrypted, namely ¯ . . . . . ))} r={isPWA, ip(H3 ), r(A3 ), ip(H4 ), r(A4 ), ip(H0 ), t, SH0 (h(. A1 obtains addresses ip(H3 ) ip(H4 ) and ip(H0 ), routes r(A3 ) and r(A4 ). 3 Then A1 dispatches agent A3 to host H3 , encapsulating route r(A3 ) to it. 4 Once arriving H3 , A3 sends back a confirmation message as follows: ¯ . .), SH (id(H3 ), ip(H3 ), tR , SH (h(. ¯ . .))] msg = PH1 [ip(H3 ), tR3 , SH0 (h(. 3 3 0 where tR3 is the time when H3 received A3 5 After that A1 dispatches agent A4 to H4 and receives a message from A4 . 6 Hereafter A1 will start to complete its local task and return the result to A0 at H0 . Clearly, under this model, at any layer, only the addresses of the 2 child hosts are exposed to the current host. Next, we will examine if route structure (I) and its dispatch protocol can detect the above-mentioned attacks. Fist, route structure (I) adopts a nested structure. Each route is encrypted by the pubic key of the destination host. It does not need the agent to carry any key. Second, in each route, a signature by H0 is included which includes the information of the rest of the route. At a destination, the host could use the ¯ to check the signature and public key of H0 and the public hash function h verify the data integrity of the route. Since no party knows the private key of H0 , the signature cannot be forged. That means a forged route can be found by the destination host (ATK1 ). Even if a sub-route (say, r(LA) or r(RA)) is deleted by current host, the agent can also check the integrity via a trust third
28
Y. Wang, C.-H. Chi, and T. Li
party (TTP). And deleting a route will cause no results returned to master agent A0 . So a route deletion attack (ATK2 ) or a dispatch skip attack (ATK3 ) will be found. Meanwhile since t is unique in all routes and signatures, and signatures cannot be forged, a replay attack can be found by the destination host (ATK4 ). In a signature, the dispatch route, i.e. the path from parent host PH to current host CH and to child host LH or RH, is included also. This can reduce the redundancy of the route (ip(PH) and ip(CH) appear in the signature only) and detect a wrong dispatch (ATK5 ). But with route structure (I), a PW A could dispatch its right agent first or dispatch agents after the local task is completed. That means the dispatch order may not be strictly followed (ATK6 ). Thus the overall dispatch performance will be worsened. The reason is that two sub-routes for child agents are obtained simultaneously when a route is decrypted. Moreover there is no dependency between two dispatches. 3.2
Secure Route Structure (II)
In the following, an alternative route structure is presented where the route of the right child agent is included in the route of left child agent. When the left child agent is dispatched to the left child host, a feedback is returned to the current agent including the route for the right dispatch. With it, the current agent can dispatch the right child agent to right child host. Hereby, the dispatch order could not be broken (ATK6 ) while the properties against other attacks remain the same. Obviously in this route, the structures for left dispatch and right dispatch are different since a left dispatch should return a predefined route that is included ahead. For the right dispatch, there is no such a sub-route. Secure Route Structure (II) (i) For a PWA A at current host CH, if A is a left child agent of its parent agent at host PH, the route for A is: r(A)=PCH [isPWA, ip(LH), r(LA), ip(RH), ip(H0 ), r(ARS ), t, ¯ SH0 (h(isPWA, ip(PH), ip(CH), ip(LH), r(LA), ip(RH), ip(H0 ), r(ARS ), id(H0 ), t))] where – ARS is the right-sibling agent of A, namely, the right child agent of A’s parent agent; – r(RA) is not included in r(A). (ii) For a PWA A at current host CH, if A is a right child agent of its parent agent at host PH, the route for A is: ¯ r(A)=PCH [isPWA, KP A , ip(LH), r(LA), ip(RH), ip(H0 ), t, SH0 (h(isPWA, KP A , ip(PH), ip(CH), ip(LH), r(LA), ip(RH), ip(H0 ), id(H0 ), t))] where – KP A is a switch variable for parent agent PA that is encrypted by the public key of parent host PH, say PP H ;
Secure Route Structures for the Fast Dispatch of Large-Scale Mobile Agents
29
(iii) For a WA A at current host CH, if A is a left child agent of its parent agent at host PH, the route for A is ¯ ip(PH), ip(CH), r(ARS ), r(A)=PCH [isWA, r(ARS ), ip(H0 ), t, SH0 (h(isWA, ip(H0 ), id(H0 ), t))] where – ARS is the right-sibling agent of A, namely, the right child agent of A’s parent agent; (iv) For a WA A at current host CH, if A is a right child agent of its parent agent at host PH, the route for A is ¯ KP A , ip(PH), ip(CH), r(A)=PCH [isWA, KP A , ip(H0 ), t, SH0 (h(isWA, ip(H0 ), id(H0 ), t))]
Fig. 2. Dispatch process of structure (II)
In route structure (II), a PWA arriving at the destination knows that it has to dispatch 2 child agents and where they should go. But it does not have the route for the right child agent. Only after its left child agent is dispatched can the route for the right child agent be returned and hereafter the right dispatch can be performed. Similar to structure (I), the route for the right agent is encrypted by the public key of the right child host. So the left child host cannot decrypt it and don’t know the address where the corresponding agent should go. This could prevent a forged agent to be dispatched to the right child host by the left child agent. In terms of the route structure, the route for the right child agent, say r(RA), is moved from r(A) to the route of left child agent r(LA) (hereby r(RA) is denoted as r(ARS )). Likewise, in structure (II), a switch variable for current host CH is included in the route of its right child agent. Here we assume that each agent has its unique switch variable encrypted by the public key of its destination host. Only after the right child agent is dispatched can current agent obtain it to start its local task. Next, we will illustrate the dispatch process of agent A1 (see Figure 2). 1 When A1 arrives H1 , its decrypted route is ¯ . . ))} r={isPWA, ip(H3 ), r(A3 ), ip(H4 ), ip(H0 ), t, SH0 (h(. 2 A1 will know it is a PWA. Its left child agent is going to H3 with r(A3 ) while its right child agent is going to H4 but there is no route for it now. After A3 is dispatched to H3 , A1 obtains r(A4 ) from a message as follows:
30
Y. Wang, C.-H. Chi, and T. Li
¯ . . )), SH (ip(H3 ), ip(H3 ), r(A4 ), tR , msg=PH1 [ip(H3 ), r(A4 ), tR3 , SH0 (h(. 3 3 ¯ . . )))] SH0 (h(. where tR3 is the time when H3 received A3 . 3 Hereby A4 could be dispatched. 4 From the successful dispatch of A4 , A1 gets the switch variable KA1 to start its task and return the result to A0 at H0 . In fact structure (I) has the same dispatch process as shown in Figure 2. But the returned message is simpler. Moreover, it is easy to see structure (II) remains the same properties as structure (I) against attacks ATK1 to ATK5. Due to the special arrangement of the route r(RA), the dispatch order will be strictly followed so that the dispatch protocol can prevent dispatch disorder attack (ATK6 ). The comparison of the security properties of two structures is listed in Table 1. Table 1. Security Properties of Two Structures ATK1 ATK2 ATK3 ATK4 ATK5 ATK6 Route (I) Y Y, by A0 Y, by A0 Y Y N Route (II) Y Y Y Y Y Y Y : the attack can be prevented or detected; N : the attack cannot be prevented or detected.
4
Complexity Comparison of Route Structures
In this section, we analyze the complexity of route generation of different models. To simplify, we assume that the time to encrypt a message of arbitrary-length is a constant, say C. In structure (I), when a branch has m nodes, the route of the root is generated after two sub-routes are ready, which have m/2-1 and m/2 nodes respectively. T (n) = 2T (n/2) T (m) = T (m/2) + T (m/2 − 1) + C (2 ≤ m ≤ n/2) (1) T (1) = C T (m) = T (m/2) + T (m/2 − 1) + C) < 2T (m/2) + C yields T(m)=O(m). So T(n) is O(n). In route structure (II), the route of the right child agent is generated first (step 1 in Figure 3). Then it is included in the route of the left child agent (step 2 in Figure 3), which is included in the route of the parent agent (step 3 in Figure 3). If each sub-branch has m/2 nodes, the complexity is T (n) = 2T (n/2) T (m) = 2T (m/2) + C (2 ≤ m ≤ n/2) (2) T (1) = C
Secure Route Structures for the Fast Dispatch of Large-Scale Mobile Agents
31
Fig. 3. Steps in the route generation of structure (II)
So T(n) is O(n). Though structure (II) seems more complex than structure (I), their route generation complexities are the same. The complexity comparison of two structures is listed in Table 2. Table 2. Complexity Comparison of Two Structures Route Generation Complexity Route (I) O(n) Route (II) O(n)
5
Dispatch Complexity O(log2 n) O(log2 n)
Conclusions
This paper presented two secure route structures and corresponding dispatch protocols based on a fast binary dispatch (FBD) model ensuring both security and efficiency. They expose only minimal addresses to a host to perform dispatches. With the improvement of security performance in structure (II), the complexity of route generation remains unchanged. For practical applications, mobile agents with the same type tasks and physically close destinations can be put in the same group encapsulated with preencrypted routes. For verifying the integrity of a coming agent, the pure code can be included in the signature of a route after being hashed to a fixed length (e.g. 128 bytes by MD5 algorithm) when it is generated at the home host. And the length of the signature remains unchanged. Though structure (II) has better properties, once a predefined host is not reachable, all members predefined in a branch will not be activated. To resolve this problem, a robustness mechanism should be designed. Furthermore, in our future work, we will conduct experiments comparing the performance differences of different protocols. Acknowledgement. This work was partly supported by National University of Singapore. The authors would like to thank the anonymous reviewers for their valuable comments.
32
Y. Wang, C.-H. Chi, and T. Li
References 1. D. B. Lange, and M. Oshima, Programming and Deploying Java Mobile Agents with Aglets, Addison-Wesley Press, Massachusetts, USA, 1998 2. S. Papastavrou, G. Samaras, and E. Pitoura, Mobile Agents for World Wide Web Distributed Database Access, IEEE Transactions on Knowledge and Data Engineering, Vol. 12, Issue 5, Sept.-Oct. 2000, pp 802–820 3. D. B. Lange, and M. Oshima, Mobile Agents with Java: The Aglet API, appears in Mobility: Process, Computers, and Agents (edited by Milojicic, D., Douglis, F. and Wheeler, R.), Addison-Wesley Press, Reading, Massachusetts, USA, 1999, pp 495–512 4. I. Foster, C. Kesselman, J. Nick, S. Tuecke,The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration. Open Grid Service Infrastructure WG, Global Grid Forum, June 22, 2002 5. I. Foster, The Grid: A New Infrastructure for 21st Century Science. Physics Today, 55(2):42–47, 2002. 6. I. Foster, C. Kesselman, Computational Grids, Chapter 2 of ”The Grid: Blueprint for a New Computing Infrastructure”, Morgan-Kaufman, 1999. 7. M. Baker, R. Buyya and D. Laforenza, Grids and Grid Technologies for Wide-Area Distributed Computing, International Journal of Software: Practice and Experience, Volume 32, Issue 15, Wiley Press, USA, 2002. 8. A. Chervenak, I. Foster, C. Kesselman, C. Salisbury, S. Tuecke, The Data Grid: Towards an Architecture for the Distributed Management and Analysis of Large Scientific Datasets. Journal of Network and Computer Applications, 23:187–200, 2001 9. Y. Wang, K.-L. Tan and J. Ren, A Study of Building Internet Marketplaces on the Basis of Mobile Agents for Parallel Processing, World Wide Web Journal, Kluwer Academics Publisher, Vol. 5, Issue 1, 2002, pp 41–66 10. V. Varadharajan, Security Enhanced Mobile Agents, in Proceedings of the 7th ACM conference on Computer and Communications Security, November 1–4, 2000, Athens, Greece, ACM Press, pp 200–209 11. I. Foster, C. Kesselman, G. Tsudik, S. Tuecke, A Security Architecture for Computational Grids, Proc. 5th ACM Conference on Computer and Communications Security Conference, 1998, pp. 83–92 12. U. G. Wilhelm, Cryptographically Protected Objects, Technical Report, Ecole Polytechnique Federale de Lausanne, Switzerland, 1997 13. E. Palmer, An Introduction to Citadel-a Secure Crypto Coprocessor for Workstations, in Proceedings of IFIP SEC’94 (Curacao, 1994) 14. T. Sander and C.F. Tschdin, Protecting Mobile Agents Against Malicious Hosts, Mobile Agents and Security, LNCS Vol. 1419, Springer-Verlag, 1998, pp 44–60 15. P. Kotzanikolaou, M. Burmester, and V.Chrissikopoulos, Secure Transactions with Mobile Agents in Hostile Environments, ACISP 2000, LNCS 1841, Springer-Verlag, 2000, pp 289–297 16. A. Romao, and M.M. Sliva, Secure Mobile Agent Digital Signatures with Proxy Certificates, E-Commerce Agents, LNAI 2033, Springer-Verlag, 2001, pp 206–220 17. D. Westhoff, M. Schneider, C. Unger and F. Kenderali, Methods for Protecting a Mobile Agent’s Route, in Proceedings of the Second International Information Security Workshop (ISW’99), Springer Verlag, LNCS 1729, 1999, pp 57–71 18. P. Wayner, Digital Copyright Protection, SP Professional, Boston, USA, 1997
Secure Route Structures for the Fast Dispatch of Large-Scale Mobile Agents
33
19. A. Menezes, P. Oorschot, and S. Vanstone, Handbook of Applied Cryptography, CRC Press, 1996 20. CCITT Recommendation X. 509-1989. The Directory-Authentication Framework. Consultation Committee, International Telephone and Telegraph, International Telecommunication Union, Geneva, 1989 21. Y. Wang and J. Ren, Building Internet Marketplaces on the Basis of Mobile Agents for Parallel Processing, in the Procs. of 3rd International Conference on Mobile Data Management (MDM2002), IEEE Computer Society Press, Jan. 8-11 2002, Singapore, pp 61–68 22. Y. Wang, Dispatching Multiple Mobile Agents in Parallel for Visiting E-Shops, in the Proc. of 3rd International Conference on Mobile Data Management (MDM2002), IEEE Computer Society Press, Jan. 8-11 2002, Singapore, pp 53– 60
On the RS-Code Construction of Ring Signature Schemes and a Threshold Setting of RST Duncan S. Wong, Karyin Fung, Joseph K. Liu, and Victor K. Wei Department of Information Engineering The Chinese University of Hong Kong Hong Kong, China {duncan,kyfung2,ksliu9,kwwei}@ie.cuhk.edu.hk
Abstract. We propose a Reed-Solomon (RS) code construction of the 1-out-n (ring) signature scheme. It is obtained from the observation of the equivalency between the erasure correction technique of the RS code and the polynomial interpolation. The structure is very simple and yields a ring equation that can appropriately denoted by z1 + · · · + zn = v, which represents the summation of n evaluations of a polynomial. We also show how to extend the generic RST scheme [6] to a t-out-n threshold ring signature scheme. Keywords: Signature Schemes, Coding Theory
1
Introduction
The notion of ring signature was first formalized by Rivest, et al. [6] in 2001. The scheme concerns about the generation of a signature on a message by some signer who uses its own private key and some other parties’ public keys without their consent or assistance. Essentially, any signer can choose any set of possible signers that includes himself, and sign any message by using his secret key and the others’ public keys. Any verifier who has all the public keys can verify if a ring signature is actually produced by at least one of the possible signers. However, the verifier does not know who the real signer is. It is called a ring signature scheme and distinguishes itself from a group signature scheme as it does not have a group manager to predefine certain groups of users or revoke the identity of the actual signer, nor does require any cooperation among those parties whose public keys are included in a ring signature. In 2002, Bresson, et al. [3] extended the notion to a threshold setup. A (t, n)threshold ring signature scheme is defined to be a ring signature scheme of which at least t corresponding private keys of the n public keys are needed to produce a signature. Applications of ring signature and threshold signatures include leaking authoritative secrets in an anonymous way [6], communicating sensitive data among parties in ad-hoc groups [3], and some others. In this paper, we propose a new approach of constructing a ring signature scheme and also a new construction of a threshold ring signature scheme. We obtain the new ring signature scheme from the observation of the equivalency S. Qing, D. Gollmann, and J. Zhou (Eds.): ICICS 2003, LNCS 2836, pp. 34–46, 2003. c Springer-Verlag Berlin Heidelberg 2003
On the RS-Code Construction of Ring Signature Schemes
35
between the erasure correction technique of the Reed-Solomon (RS) code [5] and the polynomial interpolation. By modifying and considering a special case (when t = 1) of a (t, n)-threshold ring signature scheme using secret sharing1 , we obtain a new ring signature scheme with the ring structure being so simple that it can be represented by a summation of evaluations of a polynomial at n distinct nodes. In [6], the authors investigated the feasibility of using simple combining functions such as bitwise exclusive-or operations or simple summations. However, they fell short to obtain a secure one. In this paper, we propose to use simple summations as the combining function and discuss what additional requirements are needed in order to make the scheme secure. About the new construction of a threshold ring signature scheme, our approach can be described as a natural extension of the RST scheme [6] using a tandem construction technique. We will see that the extension retains the original ring-like structure of RST and the security proofs can be carried out without any major deviations. Our scheme is efficient for moderate number of possible signers n and small number of participating signers t. In addition, our technique can also be used to extend other ring signature schemes to threshold forms. The rest of the paper is organized as follows. In the next section, we review some ring signature schemes and threshold ring signature schemes. This is followed by the RS code construction of the ring signature scheme in Sec. 3. In Sec. 4, we review the RST scheme and propose a threshold extension to it using a tandem construction technique. Its security and complexity are also discussed. We conclude the paper in Sec. 5.
2
Related Work
A ring is a set of n parties, each of them is called a ring member. We assume that each ring member (indexed by) i, 1 ≤ i ≤ n, is associated with a publicly known trapdoor one-way permutation gi and a secret trapdoor information Ti which is known only by the ring member i. That is, only ring member i knows how to compute the inverse gi−1 efficiently, using the trapdoor information Ti . 2.1
Ring Signature
RST [6] is the first ring signature scheme ever proposed. Not only the notion is portrayed to a ring due to its geometric characteristics such as uniform periphery and the absence of center, their construction is also very well illustrated as a ring structure which consists of n nodes. In their construction, the real signer uses the public keys of other possible signers to construct an open ring with a gap. Then he uses his own private key to close the gap. Although ring signature was first formalized in 2001 by Rivest, et al., similar concept was actually raised earlier. In 1994, Cramer, et al. [4] proposed a proof of knowledge protocol which consists of the properties of a threshold ring 1
Due to Bresson, et al. in the full version of [3]. Available at www.di.ens.fr/˜bresson
36
D.S. Wong et al.
signature scheme at large. Their protocol was instantiated as a 1-out-n (ring) signature scheme by Abe, et al. in [1]. Besides the instantiation, the authors of [1] also proposed another ring signature scheme [1] which allows signers using the mixture of public keys for three-move type signature schemes and trapdoor one-way function type signature schemes at the same time. Their ring signature forms a hash chain, which is similar to the one found in [3,8]. 2.2
Threshold Ring Signature
A (t, n)-threshold ring signature scheme allows any t (or more) ring members to produce a signature for a message, whilst anyone who has the public information of all the ring members can perform the signature verification. However, any t−1 or fewer ring members cannot produce a valid signature. Similar to RST, a (t, n)-threshold ring signature scheme has the properties of set-up free and anonymity. Set-up free refers to the capability of having t ring members (participating signers) produce a threshold ring signature for any message on their own solely from their own secrets (their trapdoor information) and all the publicly known information (trapdoor permutations of all the ring members). This is done by the participating signers without any coordination with the other non-participating ring members. Anonymity refers to the requirement that it should be infeasible to determine the identity of any one of the participating signers with probability greater than t/n. This limited anonymity requirement can be either computational or unconditional. Witness Indistinguishable Signature (WIS). As mentioned before, the notion of threshold ring signature was actually lightly described by Cramer, et al. [4] when they proposed a proof of knowledge protocol in 1994. Their protocol allows a prover to show that he knows at least t out of n solutions without revealing which t instances are involved. Secret sharing technique is suggested to realize the threshold property. BSS1 and BSS2. Bresson, et al. [3] proposed a threshold ring signature using the concept of partitioning. We call it BSS1. In the full version of their paper2 , another threshold ring schemes using secret sharing technique is proposed. We call it BSS2 and review it in the next section. The scheme is more or less an instantiation of the WIS protocol above. In this paper, we show that a simple threshold ring signature scheme can also be built by extending directly from the RST ring signature scheme without using the secret sharing or costly partitioning technique.
3
The Reed-Solomon Code Construction of a Ring Signature Scheme
In this section, we describe a new approach of construction of a ring signature scheme. The ring equation of the new scheme is very simple. For a n-node ring, 2
Available online at www.di.ens.fr/˜bresson
On the RS-Code Construction of Ring Signature Schemes
37
the ring equation is represented by z1 +z2 +· · ·+zn = v where zi is an associated value of some node i, 1 ≤ i ≤ n and v specifies all members in the ring. We start from the description of BSS2 t-out-n threshold ring signature scheme, modify it, then take a special case when t = 1, and finally shows that a new ring signature scheme is constructed. The technique is based on the equivalency of erasure correction mechanism of Reed-Solomon (RS) code [5] to polynomial interpolation, which in term, links to the way of using the secret sharing technique by BSS2. 3.1
Review of BSS2
Let m ∈ {0, 1}∗ be some message to be signed into a t-out-n threshold ring signature. For simplicity, we index the ring members with numbers 1, · · · , t if they are participating signers or so called real signers, and with numbers t+1, · · · , n if they are non-participating signers. Let P1 , · · · , Pn be the public keys of all the n possible signers. Denote H : {0, 1}∗ → {0, 1} to be a cryptographic hash function and Ek,i : {0, 1} → {0, 1} to be the symmetric encryption function −1 of member i under a -bit symmetric key k and Ek,i to be the corresponding symmetric decryption function. is a system-wide security parameter. Let gi : {0, 1} → {0, 1} be the trapdoor one-way permutation of member i associating with the public key Pi and gi−1 be the corresponding inverse computed using some trapdoor information with respect to Pi . gi can be instantiated using the RSA variant of [6]. Appropriate domain adjustment is assumed to be present when the security parameters among the ring members are not the same. The signing algorithm of BSS2 proceeds as follows. Compute the symmetric key for E: k = H(m). Compute value at origin of the ring: v = H(P1 , · · · , Pn ). For i = t+1, · · · , n, randomly pick xi ∈R {0, 1} and compute yi = gi (xi ). Compute a sharing polynomial: Compute a polynomial f over GF(2 ) such that deg(f ) = n−t, f (0) = v and f (i) = Ek,i (yi ) for i = t+1, · · · , n. −1 (f (i))), for i = 1, · · · , t. 5. Compute xi = gi−1 (Ek,i 6. Output the signature: (P1 , · · · Pn , x1 , · · · , xn , f ). 1. 2. 3. 4.
For signature verification, the verifier checks whether ?
f (0) = H(P1 , · · · , Pn ), and ?
f (i) = EH(m),i (gi (xi )), for i = 1, · · · , n. The verifier accepts if all the equalities above hold. For simplicity, we use zi to denote EH(m),i (gi (xi )) for i = 1, · · · , n. We call these zi ’s are the nodes of a ring signature. They are also the n evaluations of the polynomial f . 3.2
Using RS Code
In BSS2 described above, the authors use the secret sharing technique to perform threshold proof. A different approach on interpreting their method is the erasure correction technique of the RS Code. This is obvious according to the fact that the erasure correction is equivalent to polynomial interpolation [7].
38
D.S. Wong et al.
The Modification. Let α be a primitive element in GF(2 ). For simplicity, let q = 2 . The specification of the polynomial f is now modified to deg(f ) = q − t − 1 f (0) = 0 f (α0 ) = −v f (αi ) = zi , i = t+1, · · · , n f (x) = 0, for all z ∈ GF(q) − {0, α0 , α1 , · · · , αn } and there are q−t distinct evaluations of f . Let f (x) = f0 + f1 x + f2 x2 + · · · + fq−t−1 xq−t−1 with variables {fi }0≤i≤q−t−1 . Since the number of variables matches the number of distinct evaluations, the polynomial f can be exactly determined. To complete the modification, we define the signature to be (P1 , · · · Pn , x1 , · · · , xn ). Notice that due to the large degree of f , we have to remove the description of f from the signature. As a consequence, a verifier needs to construct f on its own during signature verification. This is done by picking randomly n−t values of x’s from the signature, constructing f , and evaluating the following. ?
f (α0 ) = H(P1 , · · · , Pn ) f (αi ) = EH(m),i (gi (xi )), 1 ≤ i ≤ n. ?
More stringently, we only need to check the other t values of x’s in the signature for the equalities above. Also note that zi = EH(m),i (gi (xi )) is now denoting f (αi ), 1 ≤ i ≤ n. One additional minor detail is that if E and the trapdoor permutation gi of member i are probabilistic, then the signature should also include the sequence of coin flips which lead to the value of zi from xi . This is because the same set of zi , 1 ≤ i ≤ n, is required for re-constructing f . We can see that the modification only changes the number of evaluations with corresponding adjustment on the degree of f . Intuitively, modification increases the number of non-participating signers from n− t ring members as in BSS2 to q−t−2 ring members. These q−t−2 ring members are indexed by {αi }t+1≤i≤q−2 . Complexity. The complexity of finding polynomial f using the Vandermonde Approach or the classical Lagrange Approach3 is in O(q 3 ) or O(nq) in terms of the number of multiple precision arithmetic operations, respectively. None of them is practical for our application. Fortunately, here we only need to evaluate f (x) for x = αi , i = 1, · · · , t. In addition, q − n − 1 distinct evaluations of f yield the ‘magic number’ 0 and only n − t + 1 distinct values of x have nonzero results of f (x). By using these properties, we can reduce the complexity of Lagrange Approach to O(t2 (n−t)). Details can be found in Appendix A. 3
The general case is in O(q 2 ) while in our case, there are n − t + 1 summation terms and in each term, the complexity is in O(q). Hence the complexity is in O(nq).
On the RS-Code Construction of Ring Signature Schemes
39
Special Case when t = 1. We now show that when t = 1 (that is, a 1-out-n ring signature scheme), a new form of ring signature schemes is evolved from the modification above. This new form has a very simple ring structure, which is just the summation of the n nodes of the corresponding ring signature, namely v = z1 + z2 + · · · + zn . Let f (x) = f1 x + f2 x2 + · · · + fq−2 xq−2 be a polynomial with deg(f ) = q−2 over GF (q). Consider a RS code vector f = [f (α0 ) f (α1 ) · · · f (αq−2 )] and vectors eu = [α0 αu α2u α3u · · · α(q−2)u ], The transpose of
e0 e1 .. .
0 ≤ u ≤ q−2
eq−2 is a Vandermonde matrix. Since α, α , · · · , αq−2 are distinct, this matrix is nonsingular in GF (q) and thus can be used to solve for the polynomial coefficients fi , 1 ≤ i ≤ q−2, uniquely. This is given by 2
f=
q−2
fi ei
(1)
i=1
Note that e0 = [1 1 1 · · · 1]
(2)
< eu , ev >= 0
(3)
if 0 < u + v < q − 1, where < a, b > denotes the inner product of the two vectors. From (1), (2) and (3), we have < e0 , f >= 0, or equivalently, f (α0 ) + f (α1 ) + f (α2 ) + · · · + f (αq−2 ) = −v + f (α1 ) + f (α2 ) + · · · + f (αn ) + 0 + · · · + 0 = −v + z1 + z2 + · · · zn = 0. The final equality is the one we seek, in the case of t = 1.
4
Threshold Extension RST
In the following, we first review the RST ring signature scheme [6]. Then we show how to extend it ‘naturally’ to a threshold ring signature scheme.
40
4.1
D.S. Wong et al.
Review of RST
For simplicity, we describe the version of RST in which all the n ring members use trapdoor one-way permutations with the same domain due to some domain adjustment applied. Let E : {0, 1}k × {0, 1} → {0, 1} be a publicly defined symmetric encryption algorithm such that for any key K of length k, the function EK is a permutation over -bit strings where we define EK (x) as E(K, x) for any K ∈ {0, 1}k and x ∈ {0, 1} . It is modeled as a random (permutation) oracle [2]. Let h : {0, 1}∗ → {0, 1}k be a publicly defined hash function which is also modeled as a random oracle. Given some message m ∈ {0, 1}∗ to be signed, a set of trapdoor permutations L = {gi }1≤i≤n with the same domain {0, 1} of all the n ring members and the trapdoor information Ts of some ring member (the signer) s, 1 ≤ s ≤ n, the signer generates a ring signature σ = (r, L, x1 , · · · , xn ) by following the procedure below. 1. 2. 3. 4.
Compute K = h(m). Randomly pick n binary strings r, xi ∈R {0, 1} , 1 ≤ i ≤ n, i = s. Compute yi = gi (xi ), 1 ≤ i ≤ n, i = s. Find ys such that the following n-node ring equation satisfies. r = EK (yn ⊕ EK (· · · ⊕ EK (y2 ⊕ EK (y1 ⊕ r))))
(4)
We call equation (4) a n-node ring equation by considering ‘graphically’ that there are n nodes in a ring of which each node j has the same structure: out = EK (yj ⊕ in). 5. Compute xs = gs−1 (ys ) using the trapdoor information Ts . 6. Output σ = (r, L, x1 , · · · , xn ). The signature verification is done is the straightforward way. First the verifier computes K and all yi = gi (xi ), 1 ≤ i ≤ n. Then checks if the ring equation (4) is evaluated to r. The security of RST relies on having the capability of filling in a ‘gap’ only if at least one trapdoor information of the n ring members is known. The gap is between the output and input values of two cyclically consecutive E’s along the ring equation where a trapdoor permutation must be inverted in order to construct a valid signature. It is shown in [6] that there must be a gap along the ring equation and hence at least one trapdoor information needs to be known in order to construct a ring signature. In the following, we show how to extend the RST scheme ‘naturally’ to a threshold ring signature scheme. We first give a high level description of our approach below. 4.2
High Level Description
Our idea is to construct a ring equation such that in order to produce a ring signature on a message, one has to invert at least t distinct trapdoor one-way
On the RS-Code Construction of Ring Signature Schemes
41
permutations which entail the knowledge of t trapdoor information. To do this, we build a ( nt )-node RST ring equation and associate each of the ( nt ) combinations of t out of n ring members to each of the nodes. The symmetric key K is computed as h(m, t) or h(m, L, t). The reason of including t is explained in Sect. 4.5. For each node, say node i, the yi value is computed from a -bit random number xi by applying t distinct trapdoor permutations which correspond to the t ring members associated to node i if not all the t ring members are participating signers. Note that each of ( nt ) − 1 nodes has at least one associating ring member which is a non-participating signer, if there are exactly t participating signers out of n ring members. In this case, there is only one node which has all the t associating ring members be participating signers. Now suppose this node (called the participating signers’ node) is node s. To close the ‘gap’ of the ( nt )-node ring equation, these participating signers are required to invert their corresponding trapdoor one-way permutations for computing xs from ys . This natural extension from RST to a threshold ring signature scheme follows closely the basic structure of the ring equation of RST. The security requirement of the transformation procedure from xi to yi for each node i is that it is difficult to invert the transformation if not all the trapdoor information of the t associating ring members is known. We call the transformation a multiparty trapdoor transformation. In the following, we give the security requirements of such a function and describe a multiparty trapdoor transformation which is secure if and only if the trapdoor one-way permutations of all the n ring members are hard to invert when all the corresponding trapdoor information is not known. 4.3
Secure Multiparty Trapdoor Transformation
Definition 1. For any set of t distinct trapdoor one-way permutations denoted by {g1 , g2 , · · · , gt } with the corresponding trapdoor information {T1 , T2 , · · · , Tt }, and for all sufficiently large , a permutation F12···t : {0, 1} → {0, 1} is a secure multiparty trapdoor transformation if 1. computing y ← F12···t (x) is easy for any x ∈ {0, 1} , −1 (y) is easy for any y ∈ {0, 1} , if T1 , T2 , · · · , Tt are 2. computing x ← F12···t known, while −1 3. computing x ← F12···t (y) is hard for overwhelming portion of y ∈ {0, 1} if T1 , T2 , . . . , Tl−1 , Tl+1 , · · · , Tt are known but Tl , for any 1 ≤ l ≤ t. It is easy only for negligible portion of y ∈ {0, 1} . Negligibility is defined as usual, namely is negligible if for every constant c ≥ 0, there exists an integer kc such that (k) < k −c for all k ≥ kc . Obviously by following the proof sketch described in [6], it can be shown that our extension of RST retains signer anonymity and is computationally secure if and only if at least n − t + 1 trapdoor information among all the n trapdoor information is unknown.
42
4.4
D.S. Wong et al.
Tandem Construction
We now study how to build a secure trapdoor construction F12···t : {0, 1} → {0, 1} . One possible construction is to apply t trapdoor one-way permutations with the same domain in tandem. That is, F12···t = gt · gt−1 · · · · · g2 · g1 where f · g denotes the composition of two functions with the range of g being the same as the domain of f . Fig. 1 illustrates the structure of one node on a RSTbased ring equation with our tandem construction as the multiparty trapdoor transformation.
Fig. 1. One Node on a Ring Equation
It obviously satisfies the first two conditions of a secure multiparty trapdoor transformation stated in Definition 1. To see that it also satisfies the third condition of Definition 1, we show that the following proposition is true. Proposition 1. For any y ∈ {0, 1} , any set of t distinct trapdoor one-way permutations {gi }1≤i≤t with the same domain {0, 1} , define a permutation F12···t −1 as gt · gt−1 · · · · · g2 · g1 . Computing F12···t (y) is hard if and only if at least one of the corresponding trapdoor information is unknown. Proof. It is obvious to see (and can be shown by contradiction) that if F12···t is difficult to invert, then at least one of the t trapdoor information must be unknown. Hence the forward direction is true. For the reverse direction, suppose there exists an algorithm A which inverts F12···t in probabilistic polynomial time with non-negligible probability. That is, for sufficiently large (considered to be the security parameter), for any y ∈ {0, 1} , Pr[A(y, g1 , g2 , · · · , gt , l, T1 , T2 , · · · , Tl−1 , Tl+1 , · · · , Tt ) = x : x ∈ {0, 1} , y = gt · gt−1 · · · · · g2 · g1 (x)] > 1/Q() for some polynomial function Q. T1 , T2 , · · · , Tl−1 , Tl+1 , · · · , Tt are the trapdoor information corresponding to g1 , g2 , · · · , gl−1 , g1+1 , · · · , gt , for some 1 ≤ l ≤ t. Our goal, for the purpose of having contradiction occur, is to construct another probabilistic polynomial-time algorithm B which inverts, with nonnegligible probability, the trapdoor one-way permutation gl over {0, 1} without
On the RS-Code Construction of Ring Signature Schemes
43
knowing the corresponding trapdoor information Tl , and hence is equivalent to knowing Tl . The problem instance is described as follows. For any Y ∈ {0, 1} , find X ∈ {0, 1} such that Y = gl (X). Below is the algorithm B with A as a black box (denoted by B A ) which solves the problem instance in polynomial time with non-negligible probability. B A = “On input gl , a trapdoor permutation over {0, 1} , and Y ∈ {0, 1} , 1. Define arbitrarily t−1 distinct trapdoor permutations with the corresponding trapdoor information. They are denoted by (g1 , T1 ), (g2 , T2 ), · · · , (gl−1 , Tl−1 ), (gl+1 , Tl+1 ), · · · , (gt , Tt ).
All of them are operating over {0, 1} . 2. Compute y = gt · gt−1 · · · gl+1 (Y ). 3. Query the black box of form A with (y, g1 , g2 , · · · , gl−1 , gl , gl+1 , · · · , gt , l, T1 , T2 , · · · , Tl−1 , Tl+1 , · · · , Tt ).
Let the response be x ∈ {0, 1} . 4. Computes X = gl−1 ·gl−2 · · · g2 ·g1 (x) using Tl−1 , Tl−2 , · · · , T2 , T1 and output X.” Since algorithm A inverts F12···t with probability greater than 1/Q(), we can see that the success rate of B A is also non-negligible and it is in polynomial time. 2 4.5
Complexity and Security
When t = 2, our extended RST is illustrated in Fig. 2. We can see that when t = 1, our scheme is the same as the conventional RST scheme. This also implies that our scheme is a generalization of the RST scheme. On the complexity of the signature generation. The scheme carries out [( nt ) − 1]t trapdoor one-way permutations and t inversions of trapdoor one-way −1 can permutations. We assume that the computational complexity of EK or EK be ignored when compared with that of trapdoor one-way permutations. Hence t the complexity is in proportion to t( nt ) (whose upper bound is t( en t ) ) trapdoor one-way permutations. Complexity increases when t closes to n/2; and decreases when t closes to 1 or n. Therefore the scheme is suitable for a small group of participating signers or a very large group of participating signers with respect to the size of the ring. The Inclusion of t in the Computation of K. Some simplified variants compute K as h(m), h(m, L), or some other ways without the inclusion of t; however they are insecure in the threshold setup. Consider when t = n − 1, the number of nodes on the ring is n, which is also the case when t = 1. One can construct n secure trapdoor constructions for a (n−1, n)-threshold ring signature
44
D.S. Wong et al.
Fig. 2. The Extended RST (t = 2)
scheme such that they can be mapped to n secure trapdoor constructions of a (1, n)-threshold ring signature scheme. Hence a (1, n)-threshold ring signature can be produced from a (n − 1, n)-threshold ring signature for any message and any particular n-member ring. For example, considering the tandem construction described in Sec. 4.4, suppose n = 4, K = h(m) and the 4 secure trapdoor constructions are F123 = g1 · g2 · g3 , F124 = g2 · g1 · g4 , F234 = g3 · g2 · g4 and F134 = g4 · g1 · g3 . For a message m, let (r, L, n − 1, x1 , x2 , x3 , x4 ) be a (n − 1, n)-threshold ring signature of m. We can see that one can readily forge a (1, n)-threshold ring signature on the message m by designating Fi = gi , 1 ≤ i ≤ 4 and having the signature be (r, L, 1, x1 , x2 , x3 , x4 ) where x1 = g2 (g3 (x1 )), x2 = g1 (g4 (x2 )), x3 = g2 (g4 (x3 )) and x4 = g1 (g3 (x4 )). This attack can be generalized to any n and t. It may introduce concerns to systems in which various values of t and n of threshold ring signature schemes are allowed to be present at the same time. By defining K as h(m, t) or h(m, L, t) prevents the problem.
5 Concluding Remarks
In this paper, we show that a simple equation denoted by z_1 + ··· + z_n = v can also be a feasible ring equation provided that the z_i's are evaluations of some polynomial. Our construction can be considered as a reduction of a modified BSS2. We also show that a threshold ring signature scheme can be constructed by extending the generic RST scheme with a secure multiparty trapdoor transformation called the tandem construction. It is not difficult to see that the tandem construction (Sec. 4.4) can also be applied to those ring signature schemes based on the hash chaining technique [3,1,8] and
extend them to threshold ones. We notice that the complexity of the extension is high when n is large and t is close to n/2. However, it becomes quite efficient for moderate n and small t as the construction does not use any secret sharing or partitioning technique.
References
1. M. Abe, M. Ohkubo, and K. Suzuki. 1-out-of-n signatures from a variety of keys. In Proc. ASIACRYPT 2002, pages 415–432. Springer-Verlag, 2002. LNCS 2501.
2. M. Bellare and P. Rogaway. Random oracles are practical: A paradigm for designing efficient protocols. In Proc. 1st ACM Conference on Computer and Communications Security, pages 62–73. ACM Press, 1993.
3. E. Bresson, J. Stern, and M. Szydlo. Threshold ring signatures for ad-hoc groups. In Proc. CRYPTO 2002, pages 465–480. Springer-Verlag, 2002. LNCS 2442.
4. R. Cramer, I. Damgård, and B. Schoenmakers. Proofs of partial knowledge and simplified design of witness hiding protocols. In Proc. CRYPTO 94, pages 174–187. Springer-Verlag, 1994. LNCS 839.
5. I. S. Reed and G. Solomon. Polynomial codes over certain finite fields. SIAM J. Applied Math., 8:300–304, June 1960.
6. Ronald L. Rivest, Adi Shamir, and Yael Tauman. How to leak a secret. In Proc. ASIACRYPT 2001, pages 552–565. Springer-Verlag, 2001. LNCS 2248.
7. Victor K. Wei. Modulation, Coding and Cryptography: Theory, Algorithms and Source Programs. Draft, 1998.
8. F. Zhang and K. Kim. ID-based blind signature and ring signature from pairings. In Proc. ASIACRYPT 2002, pages 533–547. Springer-Verlag, 2002. LNCS 2501.
A The Complexity of Evaluating t Values of f(x) in Sec. 3.2
Let m(x) be the irreducible polynomial in Z_2[x] of degree ℓ. Hence the ring Z_2[x]/m(x) is a field GF(q) where q = 2^ℓ. Let α be a primitive element of GF(q), that is,

α^{q−1} ≡ 1 (mod m(x))    (5)

and q − 1 is the smallest positive integer for which the congruence above holds. {0, α^0, α^1, ..., α^{q−2}} forms a complete set of residues modulo m(x). The product of all the non-zero elements is

α^0 α^1 ··· α^{q−2} = α^{(q−1)(q−2)/2} ≡ 1 (mod m(x))    (6)
We can see that {a·0 + b, aα^0 + b, aα^1 + b, ..., aα^{q−2} + b} is also a (permuted) complete residue system. Set a = −1 and b = α^i where i ∈ {0, 1, ..., q−2}; then it becomes

{α^i − 0, α^i − α^0, α^i − α^1, ..., α^i − α^{i−1}, α^i − α^i, α^i − α^{i+1}, ..., α^i − α^{q−2}}.    (7)

Now, consider

c_i(x) = (x − 0)(x − α^0)(x − α^1) ··· (x − α^{i−1})(x − α^{i+1}) ··· (x − α^{q−2})    (8)
defined over GF(q). For x = α^i, where i ∈ {0, 1, ..., q−2}, c_i(α^i) is the product of all the non-zero elements of the complete residue system shown in (7), and hence equals 1 according to Eq. (6). In Sec. 3.2, we specify a polynomial f of degree q − t − 1 over GF(q) by q − t distinct evaluations: f(0) = 0, f(α^0) = y_0, f(α^i) = y_{i−t} for i = t + 1, ..., n, and f(α^j) = 0 for j = n + 1, ..., q − 2, where q = 2^ℓ for some large integer ℓ and α is a primitive element in GF(q). Our job is to evaluate f(α^i) for i = 1, ..., t. In the following, we show that the complexity of these t evaluations is in O(t^2(n − t)) with respect to the number of multiple-precision arithmetic operations. Using the Lagrange approach, we can express f(x) as

f(x) = g_0(x) y_0 + g_{t+1}(x) y_1 + ··· + g_n(x) y_{n−t},    (9)

where

g_0(x) = [x(x − α^{t+1}) ··· (x − α^{q−2})] / [(α^0 − 0)(α^0 − α^{t+1}) ··· (α^0 − α^{q−2})]    (10)
and

g_i(x) = [x(x − α^0)(x − α^{t+1}) ··· (x − α^{i−1})(x − α^{i+1}) ··· (x − α^{q−2})] / [(α^i − 0)(α^i − α^0)(α^i − α^{t+1}) ··· (α^i − α^{i−1})(α^i − α^{i+1}) ··· (α^i − α^{q−2})]    (11)

for t + 1 ≤ i ≤ n. Now, for x = α^1, g_i(α^1) can be written as

g_i(α^1) = [(α^1 − 0)(α^1 − α^0)(α^1 − α^2) ··· (α^1 − α^{q−2})] / [(α^i − 0)(α^i − α^0)(α^i − α^1) ··· (α^i − α^{i−1})(α^i − α^{i+1}) ··· (α^i − α^{q−2})] · [(α^i − α^1) ··· (α^i − α^t)] / [(α^1 − α^2) ··· (α^1 − α^t)(α^1 − α^i)]

= [c_1(α^1) / c_i(α^i)] · [(α^i − α^1) ··· (α^i − α^t)] / [(α^1 − α^2) ··· (α^1 − α^t)(α^1 − α^i)]

= [(α^i − α^1) ··· (α^i − α^t)] / [(α^1 − α^2) ··· (α^1 − α^t)(α^1 − α^i)]    (12)
The same simplification also applies when i = 0. Without loss of generality, the same technique can be applied to evaluate f at all the other values of x in {α^1, ..., α^t}. Now we estimate the complexity of our derivation in terms of the number of multiple-precision arithmetic operations. We need to do t distinct evaluations. In each evaluation, there are n − t + 1 summation terms. In each summation term, there are 2t subtractions, 2t + 1 multiplications and one division. Regarding the total number of exponentiations, there are many duplicated terms among the n − t + 1 summation terms and among the t evaluations. We can see that the number of distinct exponentiations is n, that is, computing α^i for 1 ≤ i ≤ n. Hence the total number of multiple-precision arithmetic operations is n + t(n − t + 1)(4t + 2) ≈ 4t^2(n − t).
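The simplification of Eq. (12) is easy to check numerically. The sketch below is only illustrative: it works over GF(2^8) with the primitive polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D) and α = 0x02, instead of the large GF(2^ℓ) assumed above, and the parameter values chosen in main are arbitrary. Subtraction in characteristic 2 is a bitwise XOR, and the powers of α would be precomputed once in a real implementation.

import java.util.stream.IntStream;

public final class Gf256LagrangeDemo {
    // Assumed illustrative field: GF(2^8) with primitive polynomial 0x11D, generator alpha = 0x02.
    private static final int PRIM = 0x11D;

    static int gfMul(int a, int b) {               // carry-less multiplication mod PRIM
        int r = 0;
        while (b != 0) {
            if ((b & 1) != 0) r ^= a;
            a <<= 1;
            if ((a & 0x100) != 0) a ^= PRIM;
            b >>= 1;
        }
        return r;
    }

    static int gfPow(int a, int e) {               // naive exponentiation, enough for a demo
        int r = 1;
        for (int k = 0; k < e; k++) r = gfMul(r, a);
        return r;
    }

    static int gfInv(int a) { return gfPow(a, 254); }   // a^(q-2) with q = 256

    /** g_i(alpha^1) following Eq. (12); valid for i > t so no factor vanishes. */
    static int giAtAlpha1(int i, int t) {
        int alphaI = gfPow(2, i);
        int num = 1;
        for (int j = 1; j <= t; j++) num = gfMul(num, alphaI ^ gfPow(2, j));        // (alpha^i - alpha^j)
        int den = gfPow(2, 1) ^ alphaI;                                              // (alpha^1 - alpha^i)
        for (int j = 2; j <= t; j++) den = gfMul(den, gfPow(2, 1) ^ gfPow(2, j));    // (alpha^1 - alpha^j)
        return gfMul(num, gfInv(den));
    }

    public static void main(String[] args) {
        // Arbitrary demo parameters: t = 3, i ranging over t+1..n with n = 6.
        IntStream.rangeClosed(4, 6).forEach(i -> System.out.println("g_" + i + "(alpha^1) = " + giAtAlpha1(i, 3)));
    }
}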
A Policy Based Framework for Access Control Ricardo Nabhen, Edgard Jamhour, and Carlos Maziero PPGIA – PUC PR – CURITIBA – PARANÁ - BRAZIL {rcnabhen, jamhour, maziero}@ppgia.pucpr.br
Abstract. This paper presents a policy-based framework for managing access control in distributed heterogeneous systems. This framework is based on the PDP/PEP approach. The PDP (Policy Decision Point) is a network policy server responsible for supplying policy information for network devices and applications. The PEP (Policy Enforcement Point) is the policy client (usually, a component of the network device/application) responsible for enforcing the policy. The communication between the PDP and the PEP is implemented by the COPS protocol, defined by the IETF. The COPS (Common Open Policy Service) protocol defines two modes of operation: outsourcing and provisioning. The choice between outsourcing and provisioning is supposed to have an important influence on the policy decision time. This paper evaluates the outsourcing model for access control policies based on the RBAC (Role-Based Access Control) model. The paper describes a complete implementation of the PDP/PEP framework, and presents the average response time of the PDP under different load conditions.
1 Introduction
In policy-based networking (PBN), a policy is a formal set of statements that define how the network's resources are allocated among its clients. Policies may be used to achieve better scaling in network management by describing common attributes of classes of objects, such as network devices, software services and users, instead of individually defining attributes for these elements. In order to implement PBN it is important to define a vendor-independent method for representing and storing network policies. A formal method for representing users, services, groups and network elements is also required. An important work in this field, called CIM (Common Information Model), was proposed by the DMTF (Distributed Management Task Force) [4]. The CIM model addresses the problem of representing network resources. PCIM (Policy Core Information Model) is an information model proposed by the IETF that extends CIM classes in order to support policy definitions for managing these resources [5]. PCIM is a generic policy model. Application-specific areas must be addressed by extending the policy classes and associations proposed by PCIM. For example, QPIM (QoS Policy Information Model) is a PCIM extension for describing quality-of-service policies [11]. In this context, this paper describes a PCIM extension for access control, called RBPIM (Role Based Policy Information Model), which makes it possible to represent network access control policies based on roles, as well as static and dynamic constraints, as defined by the proposed NIST RBAC standard [1].
Typically, PCIM is implemented using a PDP/PEP approach [9]. The PDP (Policy Decision Point) is a network policy server responsible for supplying policy information for network devices and applications. The PEP (Policy Enforcement Point) is the policy client (usually, a component of the network device/application) responsible for enforcing the policy. The communication between the PDP and the PEP is implemented by the COPS protocol, defined by the IETF [10]. The COPS (Common Open Policy Service) protocol defines two modes of operation: outsourcing and provisioning. In the outsourcing model, the PDP receives policy requests from the PEP and determines whether or not to grant these requests. Therefore, in the outsourcing model, the policy rules are evaluated by the PDP. In the provisioning model, the PDP prepares and "pushes" configuration information to the PEP. In this approach, a PEP can take its own decisions based on the locally stored policy information. The motivation for defining RBAC in PCIM terms can be summarized as follows. First, there are several situations where the same set of access control policies should be available to heterogeneous applications in a distributed environment. This feature can be achieved by adopting the PDP/PEP framework. Second, an access control framework requires access to information about users, services and applications already described in a CIM/PCIM repository. Implementing access control in PCIM terms makes it possible to leverage the existing information in the CIM repository, simplifying the task of keeping a unique source of network information in a distributed environment. The remainder of this paper is organized as follows: Section 2 presents a short description of the RBAC model used in this paper. Section 3 reviews some related works. Section 4 presents RBPIM. Section 5 presents the RBPIM framework implemented using the outsourcing model. Section 6 presents the performance evaluation results of a prototype of the RBPIM framework under various load conditions. Finally, the conclusion summarizes the main aspects of this project and points to future work.
2 RBAC Model
RBAC models have received broad support as a generalized approach to access control, and are well recognized for their many advantages in performing large-scale authorization control. Several RBAC models have been proposed, each one exploring features that, supposedly, exhibit true enterprise value. The RBAC model adopted by the RBPIM framework is based on the proposed NIST (National Institute of Standards and Technology) standard [1]. The RBPIM framework accommodates the most important RBAC features described in [1]. Also, the PEP implementation in the RBPIM framework (called RBPEP, Role Based PEP) is based on the APIs described in the proposed NIST RBAC functional specification [1]. This section presents a summary of the RBAC features used in the RBPIM framework. The purpose of this summary is to define a standard nomenclature for presenting the RBPIM framework in Sections 4 and 5. For a more complete description, please refer to the proposed NIST standard [1]. The proposed NIST standard presents an RBAC reference model based on four components: Core RBAC, Hierarchical RBAC, Static Separation of Duty Relations and Dynamic Separation of Duty Relations. The idea of organizing the reference
model in components is to permit vendors to partially implement RBAC features in their products. The Core RBAC model element includes five basic sets of data elements called users (USER), roles (ROLES), objects (OBS), operations (OPS), and permissions (PRMS). The main idea behind the RBAC model is that permissions are assigned to roles instead of being assigned to users. The User Assignment (UA) is a many-to-many relationship. An important concept in RBAC is that roles must be activated in a session: the user must select the roles he wants to activate within a session in order to get the permissions associated with those roles. A session is associated with a single user, and each user is associated with one or more sessions. The Permission Assignment (PA) is also a many-to-many relationship (i.e., a permission can be assigned to one or more roles, and a role can be assigned to one or more permissions). A permission is an approval to perform an operation (e.g., read, write, execute, etc.) on one or more RBAC protected objects (e.g., a file, directory entry, software application, etc.). The Hierarchical RBAC model element introduces role hierarchies (RH). Role hierarchies simplify the process of creating and updating roles with overlapping capabilities. In the proposed RBAC model, role hierarchies define an inheritance relation of permissions among roles; e.g., r1 "inherits" role r2 if all privileges of r2 are also privileges of r1. The Static Separation of Duty (SSD) model element introduces static constraints on the User Assignment (UA) relationship by excluding the possibility of the user assuming conflicting roles. The proposed RBAC model defines SSD with two arguments: a role set that includes two or more roles, and a cardinality greater than one indicating the maximum combination of roles in the set a user can be assigned. For example, to prevent a user from assuming both roles "r1" and "r2", one must define the set {r1, r2} with cardinality 2 (the user can then assume at most cardinality − 1 roles in the set). The Dynamic Separation of Duty (DSD) model element introduces constraints on the roles a user can activate within a session. The strategy for imposing constraints on the activation of roles is similar to the SSD approach, using a set of roles and a cardinality greater than one. Note that SSD imposes general constraints on which roles a user can assume, while DSD imposes constraints on which roles a user can simultaneously activate in a session. The RBPIM framework described in Sections 4 and 5 supports all four elements of the proposed NIST standard and proposes a more flexible method for defining UA relationships by combining Boolean conditions as defined by the PCIM standard and its extensions [6].
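As an illustration only (the class and method names below are ours, not part of the NIST specification or of RBPIM), the following sketch shows how the (role set, cardinality) semantics described above can be checked: a constraint is violated as soon as a user is assigned, or activates, cardinality or more roles from the set.

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SeparationConstraint {
    final Set<String> roleSet;
    final int cardinality;          // greater than one, as in the NIST model

    SeparationConstraint(Set<String> roleSet, int cardinality) {
        this.roleSet = roleSet;
        this.cardinality = cardinality;
    }

    /** True when 'roles' contains 'cardinality' or more roles from the constrained set. */
    boolean violatedBy(Set<String> roles) {
        int hits = 0;
        for (String r : roles) if (roleSet.contains(r)) hits++;
        return hits >= cardinality;
    }

    public static void main(String[] args) {
        // SSD example from the text: {r1, r2} with cardinality 2.
        SeparationConstraint ssd =
            new SeparationConstraint(new HashSet<>(Arrays.asList("r1", "r2")), 2);
        System.out.println(ssd.violatedBy(new HashSet<>(Arrays.asList("r1", "r3")))); // false
        System.out.println(ssd.violatedBy(new HashSet<>(Arrays.asList("r1", "r2")))); // true
    }
}

The same check applies to DSD constraints, the only difference being that it is evaluated against the roles activated in a session rather than against the roles assigned to the user.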
3 Related Works
Recent works have started exploring the advantages of the PDP/PEP approach for implementing an authorization service that can be shared across a heterogeneous system in a company. An interesting work in this field is XACML (eXtensible Access Control Markup Language), proposed by the OASIS consortium [12]. XACML is an XML-based language that describes both an access control policy language and a request/response language. The policy language is used to express access control policies. The request/response language is used for supporting the communication between PEP clients and PDP servers. The RBPIM framework described in this paper also uses the PDP/PEP approach. However, our approach differs from XACML on several points. First, RBPIM uses the standard COPS protocol for supporting the PEP/PDP communication, instead of XML. Second, the information
model used for describing policies is based on a PCIM extension. Third, RBPIM has been implemented to support a specific access control method, RBAC. This makes it possible to define a complete framework that includes the algorithms in the PDP, specially conceived for evaluating policies that include role hierarchies and both dynamic and static separation of duty. Most of the research efforts found in the literature refer to the use of the PCIM model and its extensions for developing policy management tools for QoS support [11]. However, a pioneering work defining a PCIM extension for supporting RBAC, called CADS-2, has been proposed by L.S. Bartz [3]. CADS-2 is a revision of a previous work, called hyperDRIVE, also proposed by Bartz [2]. hyperDRIVE is an LDAP [7] schema for representing RBAC. This schema can be considered a first step toward implementing RBAC using the PDP/PEP approach. However, hyperDRIVE was elaborated before the PCIM standard, and has been discontinued by the author. Like hyperDRIVE, CADS-2 defines classes suitable for implementation in a directory-based repository, such as LDAP. CADS-2 defines RBAC roles in terms of policy objects, and introduces classes to support different comparison operators, e.g., equal, greaterThan, lessThan. These operators permit the representation of complex comparison expressions over the attribute values of objects stored in an LDAP repository. These expressions are used to represent the conditions a user must satisfy in order to assume an RBAC role. The RBPIM model described in Section 4 uses some ideas presented in the CADS-2 model, especially the idea of mapping roles to users using Boolean expressions. Note that this approach offers an additional degree of freedom for creating RBAC policies because the UA (User Assignment) relationship can be expressed through Boolean expressions instead of a direct mapping between users and roles. However, a recent IETF publication called PCIMe (PCIM Extensions) proposes a different approach for representing Boolean expressions [6]. The RBPIM framework adopts the PCIMe strategy. Also, many features have been introduced in order to support the other elements of the RBAC model, such as role hierarchies, DSD and SSD, which are not supported in the original CADS-2 model.
4 RBPIM: The Role Based Policy Information Model
Figure 1 shows the PCIM model and the proposed RBPIM extensions for supporting RBAC policies. In the PCIM approach, a policy is defined as a set of policy rules (PolicyRule class). Each policy rule consists of a set of conditions (PolicyCondition class) and a set of actions (PolicyAction class). If the set of conditions described by the class PolicyCondition evaluates to true, then the set of actions described by the class PolicyAction must be executed. A policy rule may also be associated with one or more policy time periods (PolicyTimePeriodCondition class), indicating the schedule according to which the policy rule is active and inactive. Policy rules may be aggregated into policy groups (PolicyGroup class), and these groups may be nested to represent a hierarchy of policies. In a PolicyRule, rule conditions can be grouped in two different ways: DNF (Disjunctive Normal Form) or CNF (Conjunctive Normal Form). The way of grouping policy conditions is defined by the attribute ConditionListType in the PolicyRule class. Additionally, the attributes GroupNumber and ConditionNegated in the association class PolicyConditionInPolicyRule help to create condition
Fig. 1. PCIM class hierarchy and RBPIM extensions (the RBPIM classes RBACPolicyGroup, RBACRole, RBACPermission, SSDRBAC, DSDRBAC, AssignerRBACPermission and AssignerOperation extend the PCIM classes PolicyGroup, PolicyRule, Policy and PolicyAction).
expressions. In DNF, conditions within the same group number are ANDed (∧) and groups are ORed (∨). In CNF, conditions within the same group are ORed (∨) and groups are ANDed (∧). To illustrate this approach, suppose we have a set of five PolicyConditions Ci(GroupNumber, ConditionNegated) as follows: C = {C1(1, false), C2(1, true), C3(1, false), C4(2, true), C5(2, false)}. Then the overall condition for the PolicyRule will be evaluated as:

If ConditionListType = DNF: evaluate(C) = (C1 ∧ ¬C2 ∧ C3) ∨ (C4 ∧ C5)
If ConditionListType = CNF: evaluate(C) = (C1 ∨ ¬C2 ∨ C3) ∧ (C4 ∨ C5)
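A minimal sketch of how the GroupNumber/ConditionNegated grouping above can be evaluated is given below; it is illustrative only, and the class names are ours rather than PCIM's.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class PcimConditionDemo {
    static class Cond {
        final int group; final boolean negated; final boolean value;
        Cond(int group, boolean negated, boolean value) {
            this.group = group; this.negated = negated; this.value = value;
        }
        boolean effective() { return negated ? !value : value; }
    }

    /** dnf = true: AND within a group, OR across groups; dnf = false (CNF): the converse. */
    static boolean evaluate(List<Cond> conds, boolean dnf) {
        Map<Integer, List<Cond>> groups = new TreeMap<>();
        for (Cond c : conds) groups.computeIfAbsent(c.group, k -> new ArrayList<>()).add(c);
        boolean overall = !dnf;                       // identity of the outer operator
        for (List<Cond> g : groups.values()) {
            boolean groupVal = dnf;                   // identity of the inner operator
            for (Cond c : g) groupVal = dnf ? (groupVal && c.effective()) : (groupVal || c.effective());
            overall = dnf ? (overall || groupVal) : (overall && groupVal);
        }
        return overall;
    }

    public static void main(String[] args) {
        // The five conditions of the example, with arbitrary truth values for demonstration only.
        List<Cond> c = Arrays.asList(new Cond(1, false, true), new Cond(1, true, false),
            new Cond(1, false, true), new Cond(2, true, true), new Cond(2, false, false));
        System.out.println(evaluate(c, true));   // DNF: (C1 AND !C2 AND C3) OR (C4 AND C5)
        System.out.println(evaluate(c, false));  // CNF: (C1 OR !C2 OR C3) AND (C4 OR C5)
    }
}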
RFC 3460 proposes several modifications to the original PCIM standard. These modifications are called PCIMe (Policy Core Information Model Extensions) [6]. PCIMe solves many practical issues raised after the original PCIM publication. For example, PolicyCondition has been extended to support a straightforward way of representing conditions by combining variables and values. This extension is called SimplePolicyCondition. The strategy defined by SimplePolicyCondition is to build a condition as a Boolean expression evaluated as: does <variable> MATCH <value>? Variables are created as instances of specializations of PolicyVariable. The values are defined by instances of specializations of PolicyValue. The MATCH element is implicit in the model. PCIMe defines two types of variables: explicit (PolicyExplicitVariable) and implicit (PolicyImplicitVariable). Explicit variables are used to build conditions that refer to objects stored in a CIM repository. For example, consider the following condition: Person.Surname MATCH "Doe". Person.Surname refers to the Surname attribute of the class Person in the CIM model. This condition is expressed as PolicyExplicitVariable.ModelClass = "Person" and PolicyExplicitVariable.Property = "Surname". Because Person.Surname is a string, the PolicyStringValue subclass must be used in this condition, i.e., PolicyStringValue.StringList = "Doe". Observe that explicit variables are a very powerful instrument for reusing CIM information in policy-based management tools. Implicit variables are used to represent objects that are not stored in a CIM repository. They are especially useful for defining filtering rules with conditions based on protocol headers, such as source and destination addresses or protocol types.
For supporting filtering rules, PCIMe defines several specializations of PolicyImplicitVariable, such as PolicySourceIPv4Variable, PolicySourcePortVariable, etc. These specializations have no properties. For example, the condition “source IPv4 address” MATCH “192.168.0.0/24” would be represented using the class PolicySourceIPv4Variable and PolicyIPv4AddrValue. IPv4AddrList = “192.168.0.0/24”. PCIMe offers also the possibility of creating conditions that use sets or range of values instead of single values. For example, the condition “source port” MATCH “[1024 to 65535]” would be represented using the class PolicySourcePortVariable and PolicyIntegerValue.IntegerList=”1024..65535”. Values with wildcards are also permitted. Please, refer to the RFC 3460 for more details about this approach.
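The MATCH semantics can be pictured with a small sketch (illustrative only; PCIMe defines the information model, not an evaluation API): a condition holds when the variable's runtime value falls within the value list or range.

class SimpleConditionDemo {
    // <string variable> MATCH <string list>
    static boolean matchStringList(String variable, String[] stringList) {
        for (String s : stringList) if (s.equals(variable)) return true;
        return false;
    }

    // <integer variable> MATCH <integer range>
    static boolean matchIntegerRange(int variable, int low, int high) {
        return variable >= low && variable <= high;
    }

    public static void main(String[] args) {
        System.out.println(matchStringList("Doe", new String[] {"Doe"}));  // Person.Surname MATCH "Doe"
        System.out.println(matchIntegerRange(80, 1024, 65535));            // "source port" MATCH [1024..65535]
    }
}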
Fig. 2. RBPIM class associations
The RBPIM model is a PCIM extension for representing RBAC policies. The RBPIM class hierarchy is shown in Figure 1. The following classes have been introduced: RBACPermission and RBACRole (specializations of PolicyRule), AssignerRBACPermission and AssignerOperation (specializations of PolicyAction), and DSDRBAC and SSDRBAC (specializations of Policy). The RBACPolicyGroup class (a specialization of PolicyGroup) is used to group the information of the constrained RBAC model. As shown in Figure 2, the approach in the RBPIM model consists of using two specializations of PolicyRule for building the RBAC model: RBACRole (for representing RBAC roles) and RBACPermission (for representing RBAC permissions). RBACRole can be associated with lists of SimplePolicyCondition, AssignerRBACPermission and PolicyTimePeriodCondition instances. The instances of SimplePolicyCondition are used to express the conditions for a user to be assigned to a role (the UA relationship). The instances of AssignerRBACPermission are used to express the permissions associated with a role (the PA relationship). The instances of PolicyTimePeriodCondition define the periods of time during which a user can activate a role. RBACPermission can be associated with a list of SimplePolicyCondition and AssignerOperation instances. The instances of SimplePolicyCondition are used to describe the protected RBAC objects and the instances of AssignerOperation are used to describe the approved operations on these objects.
5 RBPIM Framework
5.1 Overview
Several IETF works describe the implementation of policy-based network management tools using the PDP/PEP approach [9,10]. The IETF defines that the PEP and the PDP communicate using the COPS (Common Open Policy Service) protocol [10]. COPS is an object-oriented protocol that defines a generic message structure for supporting the exchange of policy information between a PDP and its clients (PEPs). The COPS protocol defines two modes of operation: outsourcing and provisioning. The choice between outsourcing and provisioning is supposed to have an important influence on the policy decision time. In environments where network policies are mostly static, one can suppose that the provisioning approach will be faster than the outsourcing approach. However, if external events frequently trigger policy changes, the performance of the provisioning approach can be significantly reduced, and the outsourcing model could be a better choice. It is also possible to conceive hybrid approaches combining outsourcing and provisioning features. The RBPIM framework described in this paper uses a "pure" outsourcing model. Figure 3 illustrates the main elements in the RBPIM framework. The RBPIM framework adopts the PDP/PEP model using the outsourcing approach, i.e., the PDP carries most of the complexity and the PEP is comparatively light. In the RBPIM framework, the PEP is called the Role-Based PEP (RBPEP). The Role-Based PDP (RBPDP) is a specialized PDP responsible for answering the RBPEP questions. Observe that the RBPDP has an internal database (called the State DataBase) used for storing the state information of the RBPEP. The CIM/Policy Repository is an LDAP server that stores both objects that represent network information, such as users, services and network nodes, and objects that represent policies (including the RBPIM model described in Section 4). The PCLS (Policy Core LDAP Schema) supplies the guidelines for mapping PCIM into LDAP classes [8]. RBPIM is mapped to an LDAP schema as defined by PCLS. The Policy Management Tool is the interface for updating the CIM/Policy repository information and for administrating the PDP service.
Fig. 3. RBPIM Framework Overview (applications on network nodes call the RBAC API of a local RBPEP, which talks COPS to the RBPDP over TCP port 3288; the RBPDP runs the RBAC outsourcing algorithms and relies on the LDAP CIM/Policy repository, an RBPEP state database, and the Policy Management Tool).
5.2 RBAC APIs
As shown in Figure 3, the RBPEP offers a set of APIs permitting developers to build RBAC-aware applications without implementing a COPS interface. The RBPIM framework defines a set of five APIs:
• RBPEP_Open()
• RBPEP_CreateSession(userdn:string; out session:string, roleset[]:string, usessions:int)
• RBPEP_SelectRoles(session:string, roleset[]:string; out result:BOOLEAN)
• RBPEP_CheckAccess(session:string, operation:string, objectfilter[]:string; out result:BOOLEAN)
• RBPEP_CloseSession(session:string)
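A hypothetical Java binding of these five calls is sketched below; the interface name and the small result holder used for the "out" parameters are illustrative and do not necessarily match the prototype's actual signatures.

interface Rbpep {
    /** Holder for the "out" parameters of RBPEP_CreateSession. */
    class SessionInfo {
        public String session;     // opaque session identifier generated by the RBPEP
        public String[] roleset;   // roles assignable to the user (SSD already enforced)
        public int usessions;      // number of sessions already opened by the user
    }

    void open();                                            // RBPEP_Open
    SessionInfo createSession(String userdn);               // RBPEP_CreateSession
    boolean selectRoles(String session, String[] roleset);  // RBPEP_SelectRoles
    boolean checkAccess(String session, String operation,
                        String[] objectfilter);             // RBPEP_CheckAccess
    void closeSession(String session);                      // RBPEP_CloseSession
}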
The RBPEP_Open is the only API not related to RBAC. It establishes the connection between the PEP and the PDP. The API could be used by an application to ask the RBPEP to initiate the RBAC service. The RBPEP will process the API only if it is not already connected to the PDP. The RBPEP_CreateSession API establishes a user session and returns the set of roles assigned to the user that satisfy the SSD constraints. This approach differs from the standard CreateSession() function proposed by the NIST because it does not activate a default set of roles for the user. Instead, the user must explicitly activate the desired roles in a subsequent call to the RBPEP_SelectRoles API. This modification avoids the need for the user to drop unnecessarily activated roles in order to satisfy DSD constraints. In order to call the CreateSession API, an application must specify the user through a DN (distinguished name) reference to a CIM Person object that represents the user (userdn). The RBPIM framework does not interfere with the authentication process. It assumes the application has already authenticated the user and mapped the user login to the corresponding entry in the CIM repository. Because the DSD constraints are imposed only within a session, the CreateSession API returns to the application the number of sessions already opened by the user (usessions). Finally, the session parameter is a unique value generated by the RBPEP and returned to the application to be used in the subsequent calls. The RBPEP_SelectRoles API activates the set of roles defined by the roleset[] parameter. This API evaluates the DSD constraints in order to determine whether the set of roles can be activated or not. If all roles in the set roleset[] can be activated, the function returns result=TRUE. The SelectRoles API, unlike the standard AddActiveRole function proposed by the NIST, can be invoked only once in a session. Also, in the RBPIM approach, the standard function DropActiveRole proposed by the NIST was not implemented. We have evaluated that allowing a user to drop a role within a session would offer too many possibilities for violating SSD constraints. The RBPEP_CheckAccess API is similar to the standard CheckAccess function proposed by the NIST. This API evaluates whether the user has permission to execute the operation on the set of objects specified by the filter objectfilter[]. The objectfilter[] is a vector of expressions of the type "PolicyImplicitVariable=PolicyValue" or "PolicyExplicitVariable=PolicyValue" used for discriminating one or more objects. In the current RBPIM version, the expressions in objectfilter[] are ANDed, i.e., only the objects that simultaneously satisfy all the conditions in the vector are considered for authorization checking. For example, {"PolicyDestinationIPv4Variable=192.168.2.3", "Directory.Name=/usr/application"} specifies the object directory /usr/application on the host 192.168.2.3. The objectfilter[] vector is confronted with the conditions specified by the RBACPermission objects in the RBPIM model. If the user has the right to execute the operation on all the objects that satisfy the objectfilter[] vector, the function returns result=TRUE. The RBPIM framework does not consider relationships between the CIM classes. The explicit variable expressions are evaluated
independently, and must belong to the same object class in order to avoid an empty set of objects. Considering associations between the CIM classes is a complex issue left for future studies. As an alternative, a condition "DN=value", based on the distinguished name of an object, can be passed in the object filter to uniquely identify a CIM object, leaving the application the responsibility of querying the CIM repository. The RBPEP_CloseSession terminates the user session, and informs the PDP that the information about the session in the "state database" is no longer needed. The RBPEP API is currently implemented in Java, and throws exceptions to inform the applications about the errors returned by the PDP. Examples of exceptions are: "RBPEP_client not supported", "non-existent session", "userdn not valid", etc.
5.3 COPS Messages
The COPS protocol version used in the RBPIM framework is based on RFC 2748. This section presents a short summary of the COPS protocol; please refer to [10] for a more detailed description. Each COPS message consists of a common header followed by a number of typed objects. A field in the common header called "opcode" identifies the type of COPS message being represented. RFC 2748 defines ten types of COPS messages. In order to understand how these messages are used, it is important to note that the COPS protocol assumes a stateful operation mode. Requests from the PEP are installed or remembered by the remote PDP until they are explicitly deleted. A PEP requests a PDP decision using the REQ (Request) message, and the PDP responds to the REQ with a DEC (Decision) message (see Figure 4). The RPT message is used by the PEP to communicate to the PDP its success or failure in carrying out the PDP's decision. The DRQ message is sent by the PEP to remove a decision state from the PDP. A field in the common header called "client-type" identifies the policy client. The interpretation of all encapsulated objects that follow the common header is relative to the "client-type". A PEP sends an OPN (Open) message in order to verify whether its specific client-type is supported by the PDP. The PDP responds with a CAT (Client-Accept) message or with a CC (Client-Close) message (if the client is rejected). The CAT message specifies a timer in seconds (called the KA timer), used by each side to validate that the connection is still functioning when there is no other messaging. The PEP sends KA (Keep-Alive) messages to the PDP and the PDP echoes them back to the PEP. All the RBPEP APIs described in the previous section are mapped to COPS messages. Figure 4 illustrates the RBPEP API to COPS mapping; the general structure of each COPS message is also illustrated there. The RBPEP_Open API is mapped to the COPS OPN, CAT and CC messages. In all messages, the RBPEP uses the client-type 0x8000 to identify itself as an RBPEP client to the PDP. This value belongs to the range defined for enterprise-specific client-types (0x8000 to 0xFFFF). The OPN message carries the <PEPID> object that identifies the RBPEP to the PDP. The <PEPID> is a symbolic string, usually representing the IP address or the FQDN of the RBPEP host. If the PDP supports the RBPEP client-type, and the <PEPID> belongs to the list of authorized clients, it returns a CAT message; otherwise, it returns a CC message.
Fig. 4. RBPEP API to COPS Mapping (RBPEP_Open maps to the OPN/CAT/CC exchange; RBPEP_CreateSession, RBPEP_SelectRoles and RBPEP_CheckAccess map to REQ/DEC/RPT; RBPEP_CloseSession maps to DRQ).
The RBPEP will process the API only if it is not already connected to the PDP. The three APIs RBPEP_CreateSession, RBPEP_SelectRoles and RBPEP_CheckAccess are mapped to the COPS REQ, DEC and RPT messages. In all of these messages, the <Handle> object encapsulates the session identifier. In the REQ message, a dedicated object identifies the API being invoked and the <ClientSI> (Client Specific Information) objects are used to transport the parameters of the API. In the DEC message, the <Decision> objects are used to encapsulate the parameters returned by the PDP. In the RPT message, the <Report-Type> object carries the information about the success or failure of the RBPEP in implementing the decision delivered by the PDP. Because the RPT message is automatically generated by the RBPEP, the <Report-Type> always reports a success status. The RBPEP_CloseSession API is mapped to the COPS DRQ message. Like the other messages, the <Handle> object identifies the session. The <Reason> object transports a code that identifies why the state (session) is being removed. The codes used by the <Reason> object are defined in RFC 2748 [10].
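For illustration, the sketch below builds the 8-byte COPS common header defined by RFC 2748 (version/flags, op code, client-type, total message length) with the enterprise client-type 0x8000 used by the RBPEP; the class is ours and is not part of the RBPIM prototype.

import java.nio.ByteBuffer;

class CopsHeader {
    static final int VERSION = 1;                 // COPS protocol version per RFC 2748
    static final int RBPEP_CLIENT_TYPE = 0x8000;  // enterprise-specific client-type used by the RBPEP

    /** Builds the common header; the op code selects REQ, DEC, RPT, DRQ, OPN, CAT, CC, KA, etc. */
    static byte[] build(int opCode, int clientType, int messageLength) {
        ByteBuffer buf = ByteBuffer.allocate(8);
        buf.put((byte) (VERSION << 4));   // version in the high nibble, flags = 0
        buf.put((byte) opCode);
        buf.putShort((short) clientType);
        buf.putInt(messageLength);        // length of the whole message, header included
        return buf.array();
    }
}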
6 Evaluation
In order to evaluate the performance of the RBPIM framework, a Java-based RBPDP and an RBPEP scenario simulator were implemented (see Figure 5). This prototype is available for download at [13]. In the evaluation scenario, twenty RBPEP clients request the RBPIM policy service provided by a single RBPDP. Each RBPEP keeps a distinct COPS/TCP connection with the RBPDP. The RBPEP clients simulate typical access control scenarios created from text input files. Each line of these input files corresponds to an API call presented in Section 5.2. Several user sessions were created in the context of each RBPEP connection. For each connection served, the RBPDP generates an output file containing all COPS messages associated with the corresponding API call in the input file and the elapsed time from the instant of receiving the RBPEP's COPS message to the RBPDP's decision. In order to simulate different load scenarios, we introduced a random delay between each API call contained in the input files. By varying the range of the random delay, we created six load scenarios, as shown in Figure 6. Load scenario "1" is the lightest
scenario and number "6" is the heaviest one. The former makes the RBPDP receive 2.7 requests/second (on average) and the latter increases this number to 40 requests/second (on average). Figure 6 presents the results obtained with the Java prototype, using a Pentium IV 1.5 GHz with 256 MB RAM for hosting the RBPDP, and another identical machine for hosting the 20 RBPEP clients. Initially, we defined a small set of five role objects, hierarchically related, and six permission objects, corresponding to a small set of departmental policies grouped in a single RBACPolicyGroup object. Each role and permission object has been defined considering a small set of three or four conditions combining implicit and explicit variables. Also, three SSD constraints and one DSD constraint were considered. One observes from the results that the RBPEP_CreateSession API corresponds to the longest decision time. This is justified by the fact that this API prepares the state database by retrieving the list of the roles assigned to the user, free of SSD constraints.
Fig. 5. Simulation Scenario (20 RBPEP applications, driven by input files on a Pentium IV machine running Windows 2000, connect over TCP/COPS to the RBPDP, hosted on a dual Pentium III Linux machine with a MySQL state database; the CIM/PCIM repository is held by a SUN ONE Directory Server 5.1 LDAP server).
After this initial test, the number of RBPIM objects was increased. Each RBPIM object affects the response time of the RBPEP APIs differently. Because of the flexibility introduced in the UA relationship by the RBPIM approach, the number of role objects significantly affects the RBPEP_CreateSession API. Increasing the number of roles from five to twenty almost doubled the average response time. On the other hand, the effect of increasing the number of SSD objects is not important. The response times of the other APIs are not affected because the roles assigned to the user are saved in the state database for subsequent calls. The RBPEP_SelectRoles API is almost imperceptibly affected by the number of DSD objects and is not affected by the other RBPIM objects. The RBPEP_CheckAccess API should be affected by the number of permission objects associated with the roles. However, our tests showed that increasing the average number of permissions per role from two to ten has no significant effect on the response time. As a final remark, in all APIs, increasing the number of conditions associated with a role or permission object has no significant effect, because the DNF or CNF conditions are transformed into a single LDAP query. The results of the evaluation tests show that the number of role (RBACRole) objects is the most important parameter affecting the response time in the RBPIM framework. The results also show reasonable response times considering the Java implementation and the CPU capacity of the machines used in the simulation. A response time of 50 ms for RBPEP_CreateSession (100 ms with twenty roles) in scenario 4 is a reasonable result for an API that is invoked only once per session.
Also, the RBPEP_CheckAccess API's average response time presented reasonable results for applications that require decisions based on user events, and is not significantly affected by the number of RBAC policy objects.
Fig. 6. RBPDP decision time x API calls (average and maximum response time, in ms, of RBPEP_CreateSession, RBPEP_SelectRoles and RBPEP_CheckAccess under the six load scenarios).

Load scenario:  1        2       3       4       5       6
Delay range:    5-10 s   4-8 s   3-6 s   2-4 s   1-2 s   0-1 s
API calls/s:    2.7      3.3     4.4     6.7     13.3    40.0
7 Conclusion
This paper has presented a complete policy-based framework for implementing RBAC policies in heterogeneous and distributed systems. This framework, called RBPIM, has been implemented in accordance with the IETF standards PCIM and COPS, and also with the proposed NIST RBAC standard. The framework proposes a flexible RBAC model by permitting the relationships between users, roles, permissions and resource objects to be specified by combining Boolean expressions. The performance evaluation of the outsourcing model indicates that this approach is suitable for supporting RBAC applications that require decisions based on user events. This paper does not discuss the problems that could arise if the PDP fails. Future work must evaluate alternative solutions for introducing redundancy in the PDP service. Also, additional specifications are required for ensuring a secure COPS connection between the PDP and the RBPEPs. These studies will be carried out in parallel with the evaluation of provisioning and hybrid approaches for implementing the RBPIM framework. Also, some important PCIMe modifications must be taken into account in a revised version of the RBPIM information model. Finally, some studies are being developed for evaluating the use of the RBPIM framework for QoS management based on RBAC rules.
References
1. D.F. Ferraiolo, R.S. Sandhu, G. Serban, "A Proposed Standard for Role-Based Access Control", ACM Transactions on Information and System Security, Vol. 4, No. 3, August 2001, pp. 224–274.
2. L.S. Bartz, "LDAP Schema for Role Based Access Control", IETF Internet Draft, expired, October 1997.
3. L.S. Bartz, "CADS-2 Information Model", not published, IRS: Internal Revenue Service, 2001.
4. Distributed Management Task Force (DMTF), "Common Information Model (CIM) Specification", URL: http://www.dmtf.org.
5. B. Moore, E. Ellesson, J. Strassner, A. Westerinen, "Policy Core Information Model", IETF RFC 3060, February 2001.
6. B. Moore, E. Ellesson, J. Strassner, A. Westerinen, "Policy Core Information Model Extensions", IETF RFC 3460, February 2003.
7. W. Yeong, T. Howes, S. Killie, "Lightweight Directory Access Protocol", IETF RFC 1777, March 1995.
8. J. Strassner, E. Ellesson, B. Moore, R. Moats, "Policy Core LDAP Schema", IETF Internet Draft, January 2002.
9. R. Yavatkar, D. Pendarakis, R. Guerin, "A Framework for Policy-based Admission Control", IETF RFC 2753, January 2000.
10. D. Durham, Ed., J. Boyle, R. Cohen, S. Herzog, R. Rajan, A. Sastry, "The COPS (Common Open Policy Service) Protocol", IETF RFC 2748, January 2000.
11. Y. Snir, Y. Ramberg, J. Strassner, R. Cohen, B. Moore, "Policy QoS Information Model", IETF Internet Draft, November 2001.
12. OASIS, "eXtensible Access Control Markup Language (XACML) – Version 1.03", OASIS Standard, 18 February 2003, URL: http://www.oasis-open.org
13. RBPIM Project Web Site, http://www.ppgia.pucpr.br/~jamhour/RBPIM
Trading-Off Type-Inference Memory Complexity against Communication
Konstantin Hyppönen¹, David Naccache², Elena Trichina¹, and Alexei Tchoulkine²
¹ University of Kuopio, Department of Computer Science, P.O.B. 1627, FIN-70211 Kuopio, Finland {konstantin.hypponen, elena.trichina}@cs.uku.fi
² Gemplus Card International, Applied Research & Security Centre, 34 rue Guynemer, Issy-les-Moulineaux, 92447, France {david.naccache, alexei.tchoulkine}@gemplus.com
Abstract. While bringing considerable flexibility and extending the horizons of mobile computing, mobile code raises major security issues. Hence, mobile code, such as Java applets, needs to be analyzed before execution. The byte-code verifier checks low-level security properties that ensure that the downloaded code cannot bypass the virtual machine's security mechanisms. One of the statically ensured properties is type safety. The type-inference phase is the overwhelming resource-consuming part of the verification process. This paper addresses the RAM bottleneck met while verifying mobile code in memory-constrained environments such as smart cards. We propose to modify classic type-inference in a way that significantly reduces the memory consumption in the memory-constrained device to the detriment of its distrusted memory-rich environment. The outline of our idea is the following: throughout execution, the memory frames used by the verifier are MAC-ed and exported to the terminal and then retrieved upon request. Hence a distrusted memory-rich terminal can be safely used for convincing the embedded device that the downloaded code is secure. The proposed protocol was implemented on JCOP20 and JCOP30 Java cards using IBM's JCOP development tool.
1 Introduction
The Java Card architecture for smart cards [1] allows new applications, called applets, to be downloaded into smart cards. While the general security issues raised by applet download are well known [9], transferring Java's safety model into resource-constrained devices such as smart cards appears to require the devising of delicate security-performance trade-offs. When a Java class comes from a distrusted source, there are two basic ways to ensure that no harm will be done by running it.
The first is to interpret the code defensively [2]. A defensive interpreter is a virtual machine with built-in dynamic runtime verification capabilities. Defensive interpreters have the advantage of being able to run standard class files resulting from any Java compilation chain, but appear to be slow: the security tests performed during interpretation slow down each and every execution of the downloaded code. This renders defensive interpreters unattractive for smart cards, where resources are severely constrained and where, in general, applets are downloaded rarely and run frequently. Another method consists in running the newly downloaded code in a completely protected environment (sandbox), thereby ensuring that even hostile code will remain harmless. In this model, applets are not compiled to machine language, but rather to a virtual-machine assembly-language called byte-code. Upon download, the applet's byte-code is subject to a static analysis called byte-code verification, whose purpose is to make sure that the applet's code is well-typed. This is necessary to ascertain that the code will not attempt to violate Java's security policy by performing ill-typed operations at runtime (e.g. forging object references from integers or directly calling private API methods). Today's de facto verification standard is Sun's algorithm [7], which has the advantage of being able to verify any class file resulting from any standard compilation chain. While the time and space complexities of Sun's algorithm suit personal computers, the memory complexity of this algorithm appears prohibitive for smart cards, where RAM is a significant cost-factor. This limitation gave birth to a number of innovative workarounds such as [5], [6], [11], [10] and [8]. Our results: The work reported in this paper describes an alternative byte-code verification solution. Denoting by M_max the number of variables claimed by the verified method and by J the number of jump targets in it, we show how to securely distribute the verification procedure between the card and the terminal so as to reduce the card's memory requirements from O(M_max J) to O(J log J + c M_max), where c is a small language-dependent constant, or, when a higher communication burden is tolerable, to a theoretic O(log J + c M_max).
2 Java Security
The Java Virtual Machine (JVM) Specification [7] defines the executable file structure, called the class file format, to which all Java programs are compiled. In a class file, the executable code of methods (Java methods are the equivalent of C functions) is found in code-array structures. The executable code and some method-specific runtime information (namely, the maximal operand stack size S_max and the number of local variables L_max claimed by the method¹) constitute a code-attribute. We briefly overview the general stages that a Java code goes through upon download.
¹ M_max = L_max + S_max.
To begin with, the classes of a Java program are translated into independent class files at compile-time. Upon a load request, a class file is transferred over the network to its recipient where, at link-time, symbolic references are resolved. Finally, upon method invocation, the relevant method code is interpreted (run) by the JVM. Java's security model is enforced by the class loader restricting what can be loaded, the class file verifier guaranteeing the safety of the loaded code, and the security manager and access controller restricting library method calls so as to comply with the security policy. Class loading and security management are essentially an association of lookup tables and digital signatures and hence do not pose particular implementation problems. Byte-code verification, on which this paper focuses, aims at predicting the runtime behavior of a method precisely enough to guarantee its safety without actually having to run it.

2.1 Byte-Code Verification
Byte-code verification [4] is a link-time phase where the method's run-time behavior is proved to be semantically correct. The byte-code is the executable sequence of bytes of the code-array of a method's code-attribute. The byte-code verifier processes units of method-code stored as class file attributes. An initial byte-code verification pass breaks the byte sequence into successive instructions, recording the offset (program point) of each instruction. Some static constraints are checked to ensure that the byte-code sequence can be interpreted as a valid sequence of instructions taking the right number of arguments. If this pass ends normally, the receiver assumes that the analyzed file complies with the general syntactical description of the class file format. Then, a second verification step ascertains that the code will only manipulate values whose types are compatible with Java's safety rules. This is achieved by a type-based data-flow analysis which abstractly executes the method's byte-code, modelling the effect of the successive byte-codes on the types of the variables read or written by the code. The next section explains the semantics of type checking, i.e., the process of verifying that a given pre-constructed type is correct with respect to a given class file. We explain why and how such a type can always be constructed and describe the basic idea behind data-flow analysis.
The Semantics of Type Checking. A natural way to analyze the behavior of a program is to study its effect on the machine's memory. At runtime, each program point can be looked upon as a memory instruction frame describing the set of all the runtime values possibly taken by the JVM's stack and local variables. Since run-time information, such as actual input data, is unknown before execution starts, the best an analysis may do is reason about sets of possible computations. An essential notion used for doing so is the collecting semantics
defined in [3] where, instead of computing on a full semantic domain (values), one computes on a restricted abstract domain (types). For reasoning with types, one must precisely classify the information expressed by types. A natural way to determine how (in)comparable types are is to rank all types in a lattice L. The most general type is called top and denoted ⊤. ⊤ represents the potential simultaneous presence of all types, i.e. the absence of (specific) information. By definition, a special null-pointer type (denoted null) terminates the inheritance chain of all object descendants. Formally, this defines a pointed complete partial order (CPO) on the lattice L. Stack elements and local variable types are hence tuples of elements of L to which one can apply point-wise ordering.
Abstract Interpretation. The verification process described in [7] §4.9 is an iterative data-flow analysis algorithm that attempts to build an abstract description of the JVM's memory for each program point. A byte-code is safe if the construction of such an abstract description succeeds. For the first instruction of the method, the local variables that represent parameters are initialized with the types τ_j indicated by the method's signature; the stack is empty (ε) and all other local variables are filled with ⊤s. Hence, the initial frame is set to:

(ε, (this, τ_1, ..., τ_{n−1}, ⊤, ..., ⊤))
For other instructions, no information regarding the stack or the local variables is available. Verifying a method whose body is straight-line code (no branches) is easy: we simply iterate the abstract interpreter's transition function Φ over the successive instructions, taking the stack and register types after any given instruction as the stack and register types before the next instruction. The types describing the successive JVM memory-states produced by the successive instructions are called working frames. Denoting by in(i) the frame before instruction i and by out(i) the frame after instruction i, we get the following data-flow equation, where evaluation starts from the right:

in(i + 1) ← out(i) ← Φ_i(in(i))

We refer the reader to [10] and [8] for an explanation of the treatment of branches that introduce forks and joins into the method's flowchart. We recall that if an instruction i has several predecessors with different exit frames, i's frame is computed as the least common ancestor² (LCA) of all the predecessors' exit frames: in(i) = LCA{out(j) | j ∈ Predecessor(i)}.
² The LCA operation is frequently called unification.
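The sketch below illustrates point-wise LCA unification at a join point over a toy, hand-written lattice; a real verifier derives the lattice from the class hierarchy referenced by the verified code, and the lattice here is purely illustrative.

import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class FrameUnifier {
    // Toy lattice: Top > Object > {A, B}, Top > int.
    static final Map<String, String> PARENT = Map.of(
        "A", "Object", "B", "Object", "Object", "Top", "int", "Top", "Top", "Top");

    static String lca(String s, String t) {
        Set<String> ancestors = new HashSet<>();
        for (String x = s; ; x = PARENT.get(x)) { ancestors.add(x); if (x.equals("Top")) break; }
        for (String y = t; ; y = PARENT.get(y)) { if (ancestors.contains(y)) return y; }
    }

    /** Point-wise unification of two frames (tuples of stack and local variable types). */
    static String[] unify(String[] f1, String[] f2) {
        String[] out = new String[f1.length];
        for (int i = 0; i < f1.length; i++) out[i] = lca(f1[i], f2[i]);
        return out;
    }

    public static void main(String[] args) {
        // Two predecessor exit frames disagree on local 1 (A vs B): unified to Object.
        System.out.println(Arrays.toString(
            unify(new String[] {"int", "A"}, new String[] {"int", "B"})));
    }
}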
Finding an assignment of frames to program points which is sufficiently conservative for all execution paths requires testing them all; this is what the verification algorithm does. Whenever some in(i) is adjusted, all frames in(j) that depend on in(i) have to be adjusted too, causing additional iterations until a fix-point is reached (i.e., no more adjustments are required). The final set of frames is a proof that the verification terminated with success; in other words, that the byte-code is well-typed.

2.2 Sun's Type-Inference Algorithm
We assume that the reader is familiar with Sun's verification algorithm [7] and rely hereafter on the algorithm's description and notations introduced in [8]. We do not reproduce these here given the lack of space in these proceedings, but include these details in the ePrint version of this paper that will be posted on www.iacr.org after ICICS 2003. As one can see, the time complexity of Sun's algorithm is upper-bounded by O(D × I × J × L_max), where D is the depth of the type lattice, I is the total number of instructions and J is the number of jumps in the method. While from a theoretical standpoint the time complexity can be bounded by a crude upper bound O(I^4)³, practical experiments show that each instruction is usually parsed less than twice during the verification process. Space (memory) complexity is much more problematic, since a straightforward coding of Sun's algorithm yields an implementation where memory complexity is bounded by O(I L_max). Although this is still polynomial in the size of the downloaded applet, one must not forget that if L_max RAM cells are available on board for running applets, applets are likely to use up all the available memory so as to optimize their functional features, which in turn would make it impossible to verify these same applets on board. Here again, a straightforward simplification allows one to reduce this memory complexity from O(I L_max) to O(J L_max).
3 Trading-Off On-Board RAM against Communication
A smart card is nothing but one element in a distributed computing system which, invariably, comprises terminals (also called card readers) that allow cards to communicate with the outside world. 3
In the worst case, all instructions are jumps, and each instruction acts on c different variables, i.e., Lmax = c × I, where c is a language-dependent constant representing the maximal number of variables possibly affected by a single instruction. Additionally, one may show (stemming from the observation that the definition of a new type requires at least one new instruction) that D is the maximal amongst the depth of the primitive data part of the type lattice L (some langauge-dependent constant) and I. This boils down to a crude upper bound O(I 4 ). Considering that byte-code verification takes place only once upon applet downloading, even a relatively high computational overload would not be a barrier to running a byte-code verifier on board.
Trading-Off Type-Inference Memory Complexity against Communication
65
Given that terminals usually possess much more RAM than cards, it seems natural to rely on the terminal’s storage capabilities for running the verification algorithm. The sole challenge being that data stored in the terminal’s RAM can be subject to tampering. Note that the capacity of working with remote objects (Remote Method Invocation) would make the implementation of such a concept rather natural in Java4 . 3.1
The Data Integrity Mechanism
Our goal being to use of the terminal’s RAM to store the frames created during verification, the card must embark a mechanism allowing to ascertain that frame data is not modified without the card’s consent. Luckily, a classic cryptographic primitive called MAC (Message Authentication Code) [12] does just that. It is important to stress that most modern cards embark ad hoc cryptographic co-processors that allow the computation of MACs in a few clock cycles. The on-board operation of such co-processors is particularly easy through the cryptographic classes and Java Card’s standard APIs. Finally, the solution that we are about to describe does not impose upon the terminal any cryptographic computations; and there is no need for the card and the terminal to share secret keys. Before verification starts, the card generates an ephemeral MAC key k; this key will be used only for one method verification. We denote by fk (m) the MAC function, applied to data m. k should be long enough (typically 160 bits long) to avoid the illicit recycling of data coming from different runs of the verification algorithm. The protocol below describes the solution implemented by our prototype. In the coming paragraphs we use the term working frame, when speaking of in(i + 1) ← out(i) ← Φi (in(i)). In other words, the working frame is the current input frame in(i + 1) of the instruction i which is just about to be modelled. For simplicity, we assume that instruction number i is located at offset i. Shouldn’t this be the case, a simple lookup table A[i], which output represents the real offset of the i-th instruction, will fix the problem. The card does not keep the frames of the method’s instructions in its own RAM but uses the terminal as a repository for storing them. To ascertain data integrity, the card sends out, along with the data, MACs of the outgoing data. These MACs will subsequently allow the card to ascertain the integrity of the data retrieved from the terminal (in other words, the card simply sends MACs to itself via the terminal). The card associates with each instruction i a counter ci kept in card’s RAM. Each time that instruction i is rechecked (modelled) during the fix-point computation, its ci is incremented inside the card. The role of ci is to avoid playback attacks, i.e. the malicious substitution of type information by an older versions of this type information. 4
However, because of the current limitations of Java Cards, the prototype reported in this paper does not rely on RMIs.
66
3.2
K. Hypp¨ onen et al.
The New Byte-Code Verification Strategy
The initialize step is replaced by repeating the following for 2 ≤ i ≤ I: 1. Form a string representing the initialized (void) type information (frame) Fi for instruction i. 2. Append to this string a counter ci representing the current number of times that instruction i was visited. Start with ci ← 0. 3. Compute ri = fk (unchanged, ci , i, Fi ) = fk (unchanged, 0, i, Fi ). 4. Send to the terminal {unchanged, Fi , i, ri }. Complete the initialization step by: 1. Sending to the terminal {changed, F1 ← (, (this, τ1 , . . . , τn−1 , , . . . , )), 1, r1 ← fk (changed, c1 ← 0, 1, F1 )}, 2. Initializing an on-board counter τ ← 1. In all subsequent descriptions check ri means: re-compute ri based on the current i, the {ci , k} kept in the card and {Fi , changed/unchanged bit} sent back by the terminal and if the result disagrees with the ri sent back by the terminal, reject the applet. The main fix-point loop is the following: 1. If τ = 0 accept the applet, else query from the terminal an Fi for an instruction i which bit is set to changed. a) Check if the transition rules allow executing the instruction. In case of failure reject the applet. b) Apply the transition rules to the type information Fi received back from the terminal and store the result in the working frame. 2. For all potential successors j of the instruction at i: a) Query the terminal for {Fj , rj }; check that rj is correct. b) Unify the working frame with Fj . If unification fails reject the applet. c) If unification yields a frame Fj different than Fj then – increment cj , increment τ – compute rj = fk (changed, cj , j, Fj ), and – send to the terminal {changed, Fj , j, rj }. The terminal can now erase the old values at entry j and replace them by the new ones. 3. Decrement τ , increment ci , re-compute ri and send {unchanged, Fi , i, ri } to the terminal. Again, the terminal can now erase the old values at entry i and replace them by the new ones. 4. Goto 1. The algorithm that we have just described only requires the storage of I ci counters. Since time complexity will never exceed O(I 4 ), any given instruction can never be visited more than O(I 4 ) times. The counter size can hence be bound by O(log I) thereby resulting in an overall on-board space complexity of
Trading-Off Type-Inference Memory Complexity against Communication
67
O(I log I + cLmax ). where c is a small language-dependent constant (the cLmax component of the formula simply represents the memory space necessary for the working frame). Note that although in our presentation we allotted for clarity a ci per instruction, this is not actually necessary since the same ci can be shared by every sequence of instructions into which no jumps are possible; this O(J log J +cLmax ) memory complexity optimization is evident to Java verification practitioners. 3.3
Reducing In-Card Memory to O(log I + cLmax )
By exporting also the ci values to the terminal, we can further reduce card’s memory requirements to O(log I + cLmax ). This is done by implementing the next protocol in which all the ci values are kept in the terminal. The card generates a second ephemeral MAC key k and stores a single counter t, initialized to zero. – Initialization: The card computes and sends mi ← fk (i, ci ← 0, t ← 0) to the terminal for 1 ≤ i ≤ I. – Read ci : To read a counter ci : • The card sends a query i to the terminal. • The terminal returns {ci , mi }. • The card checks that mi = fk (i, ci , t) and if this is indeed the case then ci can be used safely (in case of MAC disagreement the card rejects the applet). – Increment ci : to increment a counter ci : 1. For j = 1 to I: • Execute Read cj • If i = j, the card instructs the terminal to increment ci . • The card computes mj = fk (j, cj , t + 1) and sends this updated mj to the terminal. 2. The card increments t. The value of t being at most equal to the number of steps executed by the program, t occupies an O(log I) space (in practice, a 32 bit counter). Note, however, that the amount of communication and computations is rather important: for every ci update, the terminal has to send back to the card the values and MACs of all counters associated with the verified method; the card checks all the MACs, updates them correspondingly, and sends them back to the terminal.
4
Implementation Details
We implemented algorithm 3.2 as a usual Java Card applet. It is uploaded onto the card and after initialization, waits a new applet to be received in order to check it for type safety. Thus, our prototype does not have any access to the Java Card Runtime Environment (JCRE) structures nor to Installer’s functions
68
K. Hypp¨ onen et al.
and by no means can it access information about the current contents of the card and packages residing on it. However, the purpose of our code is to check the type safety of newly uploaded applets. Given that new applets can make use of packages already existing on board, our verifier should have full information about the following structures: – the names of the packages already present on board and classes in these packages; – methods for resident classes, along with their signatures; – fields in resident classes and their types. Since this information cannot be obtained from the card itself, we had to assume that the newly downloaded applet uses only common framework packages, and pre-embed the necessary information about these packages into our verifier. The type lattice information is “derived” by the verifier from the superclass references and interface references stored in the byte arrays of classes. The terminal-side applet plays an active role in the verification process; it calls methods of the card-side applet and sends them all the necessary data. 4.1
Programming Tools and Libraries
The prototype has been implemented as a “normal” Java Card applet. It enjoys the full functionality of Sun’s off-card verifier, that we reverse-engineered in the course of this project using a special application called dump, from the JTrek library [13] originally developed by Compaq5 . JTrek contains the Trek class library, which allows navigation and manipulation of Java class files, as well as several applications built around this library; dump being one such application. dump creates a text file containing requested information for each class file of the trek (i.e., a path through a list of class files and their objects); in particular, the generated text file may contain class file’s attributes, instructions, constant pool, and source statements. All this makes it possible to reconstruct source code from class files. After decompiling the program class file (and fixing some of JTrek’s bugs in a process) we obtained, amongst other things: – Parsers for the Java Card CAP and export files; – The verifier’s static checks for all JCVM byte codes; – An abstract interpreter for the methods including the representation of the JCVM states. These tools were used to develop the terminal-side verifier applet; and some ideas were recycled for developing the card-side verifier applet. For actual applet development we used IBM Zurich Research Laboratory’s JCOP Tools [14]. This toolbox consists of the JCOP IDE (Integrated Development Environment) and BugZ, a source-level debugger. Furthermore, shell-like 5
JTrek is no longer downloadable from its web page.
Trading-Off Type-Inference Memory Complexity against Communication
69
APDU command execution environment, as well as command-driven CardMan are included for simple card management tasks, such as listing packages and applets installed on the card, displaying information about given CAP files, installing applets from an uploaded package, sending arbitrary APDU commands to the card, etc. JCOP Tools are shipped with the off-card Application Programming Interface (API). Using the provided implementations of these APIs, it is possible to develop applications that can: – Upload the CAP file onto a card; – Install the applet on a card; – Communicate with the card’s applet (i.e., send APDUs to the applet and receive APDUs from it); – Delete the applet instance and the package from the card. Since JCOP Tools can interact with any Java Card inserted into the reader, the availability of cryptographic functions depends on the card. The kit is shipped with three Java Cards; all of which support 3DES encryption/decryption, and two support RSA. Hence, the JCOP Tools provided us with all the necessary features for implementing both the card-side and the terminal-side parts of our protocol, testing them on virtual as well as real Java Cards and allowing to benchmark the whole. 4.2
Interaction between Terminal-Side and Card-Side Applets
The implemented prototype consists of the terminal-side and card-side applets. Both applets run in parallel. The verification algorithm is fully deterministic (with the exception of the selection of a single frame from the set of all frames marked as changed). Since the order in which marked frames are selected does not affect the final result (i.e., accept or reject the applet), the terminal-side applet can be “proactive” because it has all necessary information for running the verification process in parallel with the card6 . Using this strategy, we can avoid all requests from the card to the terminal given that the latter is fully aware of the current verification state and can hence provide the card-side applet with all required data without being prompted. Thus, the only data sent from the card to the terminal are response status and MAC-ed frames that have to be stored in the terminal. The terminal initiates all verification steps; it sends the card the results of the modelling of each instruction and the results of unification of different frames. The card-side applet simply checks that the verification process advances as it should and updates the instruction counters7 . 6
7
Note that this is not along the general design philosophy of our protocol whereby the terminal needs no other form of intelligence other than the capacity to receive data, store it and fetch it back upon request. We nonetheless implemented some extra intelligence in the terminal to speed-up the development of our proof of concept. Again, the previous footnote applies to this simplification as well.
70
K. Hypp¨ onen et al.
The Terminal-Side Applet. The terminal-side applet is based on Sun Microsystems’ off-card verifier. The latter was fully revised and some new functionality added. The communication with the card-side applet is implemented using IBM JCOP’s API. The terminal-side applet is in charge of the following tasks: – Prepare the CAP file components for sending them to the card-side applet. Parse the CAP file (storing it in the object structure) and check its compliance with Sun’s file format (structural verification being beyond the scope of our demonstrator, we left this part off-board for the time being); – Maintain the storage for frames and their MACs. Exchange frames with the card-side applet; – Resolve the problem of finding the LCA of two frames in nontrivial cases (trivial ones can be dealt with by our card-side applet) and send the result to the card. The Card-Side Applet. The card-side applet: – Controls the correctness of the verifier’s method calls by the terminal-side applet; – Checks and applies transition rules (i.e., performs type inference) to individual instructions. – Maintains a list of counters ci for all instructions; updates counter values as necessary; – Executes cryptographic functions; – Solves the problem Is type A a descendant of type B in the type lattice L? (in other words, is A B?) in order to check the result of the unification of two frames sent by the terminal; – For instructions invokespecial, invokestatic and invokevirtual, checks arguments for their type consistency and pushes the returned type onto the operand stack. Supports calls to all framework methods as well as to methods of the package being currently verified. The invokeinterface instruction is not yet supported. – The card-side applet can unify two frames for all types of stack and local variables except when both types to be unified are references to classes or arrays of references to classes. In this case, the card-side applet asks the terminal to perform unification, waits for results, and checks these results before accepting.
5
Conclusion
Our proof-of-concept (not optimized) implementation required 380 Kbytes for the terminal-side applet source code and 70 Kbyte for the card-side applet source code. With the maximum length of method’s byte-code set to 200 bytes and both, Smax and Lmax limited to 20 (the restrictions of the Java Cards shipped with
Trading-Off Type-Inference Memory Complexity against Communication
71
JCOP Tools), one needs 440 bytes of RAM to run our two-party verification procedure. When the verified byte-code is written into EEPROM (as is the case in most real-life scenarios), one would need only 240 bytes of on-board RAM and 8976+ 200 EEPROM bytes. The natural way to turn our prototype into a full-fledged verifier, is to incorporate it into the Installer applet, which has already its own representation of the CAP file components. We do not think that communication overhead is a serious concern. With the advent of fast card interfaces, such as USB, the transmission’s relative cost is reduced. Typically, USB tokens can feature various performances ranging from a 1.5 Mb/s (low-speed) to 12 Mb/s (full speed). But even with slower interfaces, such as ISO 7816-3 our prototype still functions correctly in real-time.
References 1. Z. Chen, Java Card Technology for Smart Cards: Architecture and Programmer’s Guide, The Java Series, Addison-Wesley, 2000. 2. R. Cohen, The defensive Java virtual machine specification, Technical Report, Computational Logic Inc., 1997. 3. P. Cousot, R. Cousot, Abstract Interpretation: a Unified Lattice Model for Static Analysis by Construction or Approximation of Fixpoints, Proceedings of POPL’77, ACM Press, Los Angeles, California, pp. 238–252. 4. X. Leroy, Java Byte-Code Verification: an Overview, In G. Berry, H. Comon, and A. Finkel, editors, Computer Aided Verification, CAV 2001, volume 2102 of Lecture Notes in Computer Science, pp. 265–285, Springer-Verlag, 2001. 5. X. Leroy, On-Card Byte-code Verification for Java card, In I. Attali and T. Jensen, editors, Smart Card Programming and Security, proceedings E-Smart 2001, volume 2140 of Lecture Notes in Computer Science, pp. 150–164, Springer-Verlag, 2001. 6. X. Leroy, Byte-code Verification for Java smart card, Software Practice & Experience, 32:319–340, 2002. 7. T. Lindholm, F. Yellin, The Java Virtual Machine Specification, The Java Series, Addison-Wesley, 1999. 8. N. Maltesson, D. Naccache, E. Trichina, C. Tymen Applet Verification Strategies for RAM-constrained Devices, In Pil Joong Lee and Chae Hoon Lim, editors, Information Security and Cryptology – ICISC 2002, volume 2587 of Lecture Notes in Computer Science, pp. 118–137, Springer-Verlag, 2002. 9. G. McGraw, E. Felten Java Security, John Wiley & Sons, 1999. 10. D. Naccache, A. Tchoulkine, C. Tymen, E. Trichina Reducing the Memory Complexity of Type–Inference Algorithms, In R. Deng, S. Qing, F. Bao and J. Zhou, editors, Information and Communication Security, ICICS 2002, volume 2513 of Lecture Notes in Computer Science, pp. 109–121, Springer-Verlag, 2002. 11. G. Necula, Proof-carrying code, Proceedings of POPL’97, pp. 106-119, ACM Press, 1997. 12. B. Schneier, Applied Cryptography: Second Edition: protocols, algorithms and source code in C, John Willey & Sons, 1996. 13. http://www.digital.com/java/download/jtrek/ 14. http://www.zurich.ibm.com/jcop/news/news.html
Security Remarks on a Group Signature Scheme with Member Deletion Guilin Wang, Feng Bao, Jianying Zhou, and Robert H. Deng Infocomm Security Department Institute for Infocomm Research 21 Heng Mui Keng Terrace, Singapore 119613 http://www.i2r.a-star.edu.sg/icsd/ {glwang, baofeng, jyzhou, deng}@i2r.a-star.edu.sg
Abstract. A group signature scheme allows a group member of a given group to sign messages on behalf of the group in an anonymous and unlinkable fashion. In case of a dispute, however, a designated group manager can reveal the signer of a valid group signature. Based on the Camenisch-Michels group signature scheme [7,8], Kim, Lim and Lee proposed the first group signature scheme with a member deletion procedure at ICISC 2000 [15]. Their scheme is very efficient in both communication and computation aspects. Unfortunately, their scheme is insecure. In this paper, we first identify an effective way that allows any verifier to determine whether two valid group signatures are signed by the same group member. Secondly, we find that in their scheme a deleted group member can still update his signing key and then generate valid group signatures after he was deleted from the group. In other words, the Kim-Lim-Lee group signature scheme [15] is linkable and does not support secure group member deletion. Keywords: Digital signature, group signature, member deletion.
1
Introduction
In 1991, Chaum and van Heyst first introduced the concept of group signatures [10]. In a group signature scheme, each group member of a given group is able to sign messages anonymously and unlinkably on behalf of the group. However, in case of later disputes, a designated entity called the group manager can reveal the identity of the signer by “opening” a group signature. From the viewpoints of verifiers, they only need to know a single group public key to verify group signatures. On the other hand, from the viewpoint of the signing group, the group conceals its internal organizational structures, but still can trace the signer’s identity if necessary. In virtue of these advantages, group signatures have many potentially practical applications, such as authenticating price lists, press releases, digital contract, e-voting, e-bidding and e-cash etc [11, 16,1]. A secure group signature scheme must satisfy the following six properties [1, 2]: S. Qing, D. Gollmann, and J. Zhou (Eds.): ICICS 2003, LNCS 2836, pp. 72–83, 2003. c Springer-Verlag Berlin Heidelberg 2003
Security Remarks on a Group Signature Scheme with Member Deletion
73
– Unforgeability: Only group members are able to sign messages on behalf of the group. – Anonymity: Given a valid signature of some message, identifying the actual signer is computationally hard for everyone but the group manager. – Unlinkability: Deciding whether two different valid signatures were computed by the same group member is computationally hard. – No Framing: Neither a group member nor the group manager can sign on behalf of other group members. – Traceability: The group manager is always able to open a valid signature and identify the actual signer. – Coalition-resistance: A colluding subset of group members (even if comprised of the entire group) cannot generate a valid signature that the group manager cannot link to one of the colluding group members. Up to now, a number of new group signature schemes and improvements have been proposed. In [11], Chen and Pedersen constructed the first scheme which allows new members to join the group dynamically. Camenisch and Stadler proposed the first group signature scheme in which the group public key and signatures have lengths independent of the group size [6]. At the same time, they introduced the new concept of signatures of knowledge, which has become a standard tool in the design of group signature schemes and other related cryptographic protocols. Generally speaking, signatures of knowledge allow a prover to non-interactively prove the knowledge of one or several secrets with respect to some public information. Based on the strong RSA assumption, Camenisch and Michels presented an efficient group signature scheme in [7,8]. Ateniese and Tsudik pointed out some obstacles that stand in the way of real world applications of group signatures, such as coalition attacks and member deletion [2]. In [1], Ateniese et al. presented a provably secure coalition-resistant group signature scheme. Based on the scheme in [7,8], Kim, Lim and Lee proposed the first group signature scheme with a member deletion procedure [15]. Their extension is very efficient in both communication and computation aspects. Whenever a member joins or leaves the group, the group manager only needs to publish two pieces of public information by doing several modular multiplications and exponentiations, and each group member can update his secret key by doing only one modular multiplication. Bresson and Stern also provided a group signature scheme with member deletion [5]. However, their scheme is not efficient when the number of deleted members is large. In addition, to deal with exposure of group members’ secret keys, Song constructed two forward secure group signature schemes in [18]. At the same time, she also extended her schemes to support member deletion. However, these two extensions are not much efficient in the sense that to verify a signature a verifier has to search all revocation tokens (Section 4.4 of [18]) by checking whether the signature is revoked. Therefore, the computational cost in signature verification is proportional to the size of deleted members. Based on the notion of dynamic accumulators, Camenisch and Lysyanskaya proposed a new efficient method for the member deletion problem in group signature schemes [9].
74
G. Wang et al.
In this paper, we discuss the security of the first group signature scheme with a member deletion procedure proposed by Kim, Lim and Lee in [15]. First of all, we point out that the requirements for security parameters listed in [15] are not sufficient to guarantee the system security. Secondly, we identify an effective way that allows any verifier to determine whether two valid group signatures are signed by the same group member. Thirdly, we find that in their scheme a deleted group member can also update his signing key and then generate valid group signatures after he was deleted from the group. In other words, the KimLim-Lee group signature scheme is linkable and does not support group member deletion. Furthermore, we discover that a newly joined group member can derive signing keys corresponding to the time periods before he joins the group. In some scenarios, this is also not a desirable property. The rest of this paper is organized as follows. We introduce related cryptographic assumptions in Section 2. Then, we review Kim-Lim-Lee scheme in Section 3 and present our security analysis in Section 4, respectively. Finally, the conclusion is given in Section 5.
2
Assumptions
In this section we give a brief description of three assumptions: strong RSA assumption [3,14], modified strong RSA assumption [7,8], and decisional DiffieHellman assumption [13,4]. These three assumptions are the security basis of the schemes in [7,8,15]. Let g be a suitable security parameter and G(g ) denote the set of groups whose order has length g and consists of two prime factors of length (g − 2)/2. k, 1 , 2 < g and are further security parameters. For simplicity, we define two intervals Γ and Γ by Γ := [21 − 22 , 21 + 22 ] and ˜ ˜ Γ := [21 − 2 , 21 + 2 ], where ˜ := (2 + k) + 1. In addition, let M(G, z) := e {(u, e)|z = u , u ∈ G, e ∈ Γ, e ∈ primes}. Let K be a key-generation algorithm that on input 1g outputs a group G ∈ G(g ) and z ∈ G/{±1}. Assumption 1 (Strong RSA Assumption): There exists a probabilistic polynomial-time algorithm K such that, for all probabilistic polynomial-time algorithm A and all sufficiently large g , the probability that A on input (G, z) outputs e ∈ Z>1 and u ∈ G satisfying z = ue is negligible. Assumption 2 (Modified Strong RSA Assumption): There exists a probabilistic polynomial-time algorithm K such that, for all probabilistic polynomialtime algorithm A, all sufficiently large g , all M ⊂ M(G, z) with |M| = O(g ), and suitably chosen k, 1 , 2 and , the probability that A on input (G, z, M) / M is negligible. outputs u ∈ G and e ∈ Γ satisfying z = ue and (u, e) ∈ Assumption 3 (Decisional Diffie-Hellman Assumption): There exists a probabilistic polynomial-time algorithm K such that, for all probabilistic polynomial-time algorithm A and all sufficiently large g , the probability that A on input g, g x , g y , and g z ∈R G can distinguishe whether g xy and g z are equal is negligible.
Security Remarks on a Group Signature Scheme with Member Deletion
75
For more discussions about these assumptions, please refer to [8]. Especially, Camenisch and Michels pointed out that Assumption 1 implies Assumption 2 (Section 3 of [8]).
3
Review of Kim-Lim-Lee Scheme
In this section we review the Kim-Lim-Lee group signature scheme [15]. In their scheme, the group manager is split into two roles: the membership manager (MM) and the revocation manager (RM). The whole scheme consists of six stages, i.e., system setup, join, delete, sign, verify and open. Hereafter, r ∈R denotes to select an element r from a set R uniformly and randomly. 3.1
System Setup
The group manager (MM) executes the following procedures: ˆ 1 , 2 , k, such that > 1, g > 1 > 2 and 1–1). Set security parameters g , , g > (2 + k) + 2, and choose a hash function H : {0, 1}∗ → {0, 1}k . 1–2). Choose a group G = g with order G and two random elements z, h ∈R G with the same large order (≈ 2g ) such that: a) In G assumptions 2 and 3 hold; b) Computing discrete logarithms in G to the bases g, h or z is infeasible. 1–3). Set a RSA modulus n = pq, where p and q (≈ 2g /2 ) are two large secure primes such that p, q = 1 mod 8 and p = q mod 8. 1–4). Choose a secret/public key pair (dN , eN ) such that dN eN = 1 mod φ(n). ˆ 1 , 2 , k, and prove that g, h and z have 1–5). Publish n, eN , G, g, h, z, H, g , , the same order, but keep p, q, and G privately. At the same time, the revocation manager (RM) selects his secret key xR ∈R [0, 2g − 1] and publishes yR = g xR as his public key. 3.2
Join
Assume that C := {G1 , G2 , · · · , Gm−1 } is the set of (m − 1) current group members in the system, and the membership key of the group member Gi is a pair (xi , yi ) that satisfies yixi = z,
xi ∈R [21 , 21 + 22 − 1],
where the secret key xi is a prime selected by the group member Gi and the public key yi is extracted by MM. When a user, say Alice, wants to join the system as the m-th group member, she does as follows:
76
G. Wang et al. ˆ
ˆ
2–1). Choose two random primes xm ∈R [21 , 21 + 22 − 1], x ˆm ∈R [2−1 , 2 − 1] 1 such that xm , x ˆm = 1 mod 8 and xm = x ˆm mod 8 . ˆm , z˜ := z xˆm , and commits to x ˜m and z˜. Then, 2–2). Alice computes x ˜m := xm x she sends x ˜m , z˜ and their commitments to MM. 2–3). To convince MM that x ˜m and z˜ are prepared correctly, Alice and MM execute the following interactive statistical zero-knowledge protocol2 : z ). W = SP K{(τ, ρ) : z x˜m = z˜τ ∧ z˜ = z ρ ∧ τ ∈ Γ }(˜ Now, we assume that the group’s public property key is UM := y1 · · · ym−1 y , where random element y ∈R G is known only by MM. When the above protocol is executed successfully, MM does the followings: 2–4). Generate Alice’s public key ym := z˜1/˜xm (= z 1/xm ). 2–5). Compute the new group’s public property key U M := y1 · · · ym−1 ym y by choosing a random number y ∈R G. 2–6). Compute the new group’s public renewal property key U N := (ym y /y )dN . 2–7). Generate the member Gm ’s secret property key Um := (y1 · · · ym−1 y )dN . 2–8). Publish (U M , U N ), and send (ym , Um ) to Alice securely. As the m-th group member Gm , Alice verifies her membership key (xm , ym ) by checking xm ≡ z, and ym (Um )eN ≡ U M . ym At the same time, each other valid group member Gi (1 ≤ i ≤ m − 1) updates his secret property key from Ui := (y1 · · · yi−1 yi+1 · · · ym−1 y )dN into U i := Ui · U N = (y1 · · · yi−1 yi+1 · · · ym−1 ym y )dN . He can also verify his new U i by checking yi (U i )eN ≡ U M . 3.3
Delete
Let the current group’s public property key be UM = y1 · · · ym y where y ∈R G. To delete a group member Gj (1 ≤ j ≤ m), MM performs the following deletion protocol: 3–1). By selecting y ∈R G, compute a new group’s public property key U M := UM y /(yj y ) (= y1 · · · yj−1 yj+1 · · · ym y ). 3–2). Compute a new group’s renewal public property key U N := (y /(yj y ))dN . 3–3). Publish (U M , U N ). Each valid group member Gi updates his secret property key from Ui to U i by computing U i := Ui · U N , and verifies U i by checking yi (U i )eN ≡ U M . 1
2
ˆ
ˆ
The authors of [15] require that xm , x ˆm ∈R [2−1 , 2 − 1]. However, this is wrong. Otherwise, Alice is unable to prove that she knows the value of xm belonging to the interval Γ . Therefore, we correct this error according to the descriptions in [7,8]. For the security of this protocol, please consult Theorem 2 in Section 5.5 of [7].
Security Remarks on a Group Signature Scheme with Member Deletion
3.4
77
Sign
To sign a message M , the member Gi , with the membership key (xi , yi ) and his secret property key Ui , does the followings: w 4–1) Choose a random integer w ∈R {0, 1}g , compute a := g w , b := yi yR , d := xi w w w weN . g h , α := Ui h and β := yR h 4–2) Choose r1 ∈R {0, 1}(2 +k) , r2 ∈R {0, 1}(g +1 +k) , and r3 ∈R {0, 1}(g +k) . 4–3) Compute t1 := br1 (1/yR )r2 , t2 := ar1 (1/g)r2 , t3 := g r3 , t4 := g r1 hr3 , and r 3 r 3 eN h . t5 := yR 4–4) Evaluate c := H(g||h||yR ||z||a||b||d||β||t1 ||t2 ||t3 ||t4 ||t5 ||M ). 4–5) Calculate s1 := r1 − c(xi − 21 ), s2 := r2 − cwxi , s3 := r3 − cw (all in Z).
The resulting signature on the message M is (c, s1 , s2 , s3 , a, b, d, α, β). Kim et al. [15] pointed out that such a group signature would be denoted by λ L = SP K{(θ, λ, µ) : z = bθ /yR ∧ 1 = aθ /g λ ∧ a = g µ µ µeN θ µ ∧ d = g h ∧ β = yR h ∧ θ ∈ Γ }(M ).
3.5
Verify
To verify a group signature (c, s1 , s2 , s3 , a, b, d, α, β) on a message M , a verifier checks its validity as follows: 1
1
s2 5–1) Compute t1 := z c bs1 −c2 /yR , t2 := as1 −c2 /g s2 , t3 := ac g s3 , t4 := 1 s 3 s 3 eN dc g s1 −c2 hs3 , and t5 := β c yR h . 5–2) Evaluate c := H(g||h||yR ||z||a||b||d||β||t1 ||t2 ||t3 ||t4 ||t5 ||M ). 5–3) Check c ≡ c ∈ {0, 1}k , s1 ∈ [−22 +k , 2(2 +k) ], s2 ∈ [−2g +1 +k , 2(g +1 +k) ], s3 ∈ [−2g +k , 2(g +k) ], and a, b, d, α, β ∈ G. 5–4) Accept the signature if and only if βUM /αeN ≡ b 3 .
3.6
Open
To trace the identity of the signer of a signature σ = (c, s1 , s2 , s3 , a, b, d, α, β), RM first checks its validity, then decrypts the ElGamal cipher text (a, b) to find yi = b/axR , generates the signature of knowledge P := SP K{ρ : yR = g ρ ∧ b/yi = aρ }(yi ||σ||M ) and reveals (yi , P). In this way, RM shows that he does not misattribute the group member Gi . The authors of [15] also provided a sign-tracing procedure that allows MM (under the help of RM) to check whether a specific valid group signature is signed by a specific member. We omit this procedure since our discussion has no relation to it. 3
Kim et al. assume that the list of all UM ’s and the corresponding updated dates are publicly available, and that the generating date is embedded in a signature. Therefore, the verifier can find a proper UM to check the validity of a given signature.
78
4 4.1
G. Wang et al.
Security of Kim-Lim-Lee Scheme Security Parameters
In this subsection, we will point out that the requirements for security parameters given in [15] are not sufficient to guarantee the security. The security parameters, ˆ are only required to satisfy the following conditions (see , k, 1 , 2 , g and , Definition 1 in Section 4.4 of [15]) > 1,
g > 1 > 2 ,
and g > (2 + k) + 2.
(1)
However, to guarantee the security of their scheme, we note that the following two conditions are also necessary. 2 >> 1 − (ˆ + 1 )/4,
and 1 > (2 + k) + 2.
(2)
We explain the reasons as follows. If 2 >> 1 − (ˆ + 1 )/4 does not hold, due to the work of Coppersmith in [12], MM can factor the value of x ˜m which is sent to him in Join protocol. Once x ˜m ’s two factors x ˆm and xm are known, MM can mount a framing attack by generating valid group signatures under the name of the member Gm (remember that MM has already know ym and Um ). Therefore, to provide the property of no framing, the first condition in Equation (2) is necessary. The requirement 1 > (2 + k) + 2 is not given in [7], while it is added in [8]. We note that without this requirement, the scheme in [7] may be insecure. For example, Camenisch and Michels suggested that the security parameters can be selected as follows (see Section 5.6 in [7]): = 9/8, k = 160, 1 = 860, 2 = 600, and g = ˆg = 1200. It is obvious that this suit of parameters satisfies all requirements in equations (1) and (2). Therefore, in such a case the security is guaranteed. However, if there is no requirement 1 > (2 + k) + 2, one can re-set 2 = 760 but keep other parameters unchanged. In this case, all requirements in equations (1) and 2 >> 1 − (ˆ + 1 )/4 are also satisfied but the scheme [7] is insecure because anybody (not necessarily a group member) can use (u := z, e := 1) as a valid membership certificate to generate valid group signatures. The correctness of this attack can be directly checked (refer to Section 5.3 of [7] for details of signature generation and verification). As for Kim-Lim-Lee scheme [15], similar attack is unlikely mounted unless an attacker also obtains a secret property key (UM /z)dN . However, it seems natural to add requirement 1 > (2 + k) + 2 to the Kim-Lim-Lee scheme since this scheme is an extension of the scheme in [7,8]. 4.2
Linkability
The authors of [15] claimed that similar to the Camenisch-Michels scheme [7,8] their scheme is also unlinkable. However, we find in fact their scheme is linkable. Before discussing the linkability of the Kim-Lim-Lee scheme, we first prove that yi g xi eN is an invariant for the group member Gi . More specifically, for i = j,
Security Remarks on a Group Signature Scheme with Member Deletion
79
we want to show that yi g xi eN = yj g xj eN holds only with a negligible probability. Since z, yi , yj ∈ G = g, we assume that z = g a0 , yi = g ai and yj = g aj for some x unknown a0 , ai , aj ∈ ZG . From z = yixi = yj j , we have ai xi = a0 mod G and x i eN x j eN = yj g , we get xi eN + ai = xj eN + aj mod G. aj xj = a0 mod G. If yi g Then, using ai xi = a0 mod G and aj xj = a0 mod G, we have (xi xj eN − a0 )(xi − xj ) = 0 mod G. This implies G|(xi xj eN − a0 )(xi − xj ).
(3)
Note that xi , xj ∈ [21 , 21 + 22 − 1] are two random primes selected by the members Gi and Gj , and they must be different. Otherwise, if Gi and Gj set xi = xj , then MM will extract the same value for yi and yj and find they are cheaters. Therefore, we have xi = xj and |xi | = |xj | = 1 + 1 (|r| denotes the bit-length of the integer r). At the same time, |G| ≈ g > 1 , G (the order of the cyclic group G) consists of two large prime factors and only MM knows the value of G. Furthermore, group members do not know the value of a0 , i.e., the discrete logarithm of z to the base g. Therefore, it is not difficult to see that Equation (3) holds only with a negligible probability. Consequently, for different i and j, yi g xi eN = yj g xj eN holds only with a negligible probability. Given a valid signature pair (c, s1 , s2 , s3 , a, b, d, α, β) on a message m, according to Step 4-1) in the signing protocol, we know that w w weN b = yi y R , d = g xi hw , α = Ui hw , β = yR h ,
for some wR ∈ {0, 1}g .
Note that at any moment in the system lifetime, UM = yi (Ui )eN holds for any current member Gi . Therefore, we have the following equalities (d/α)eN = g xi eN /UieN = yi g xi eN /UM .
(4)
Note that UM is unchanged in the time period T in which the group’s public property key UM is valid. At the same time, we have proved that yi g xi eN is an invariant for the member Gi , so the right most expression in equation (4) is an invariant for the group member Gi in the time period T . This implies that all signatures signed by the same group member in the same time period T are linkable. That is, given two valid group signatures (c, s1 , s2 , s3 , a, b, d, α, β) and ¯α ¯ which are signed in the same period T , anybody (not (¯ c, s¯1 , s¯2 , s¯3 , a ¯, ¯b, d, ¯ , β) necessarily a group member) can know whether they are the signatures of the same group member by checking ¯α d/α ≡ d/ ¯.
(5)
Furthermore, according to equation (4) and the fact that UM β = bαeN , we have the following equalities: deN b/β = deN UM /αeN = yi g xi eN .
(6)
Since yi g xi eN is an invariant for the member Gi (in all time periods), the above equalities show that deN b/β is also an invariant for the member Gi . This implies
80
G. Wang et al.
that all signatures signed by the same group member in all time periods are linkable. Equation (6) also shows that even one value of α or β is released, group signatures signed by the same member are still linkable. In other words, the Kim-Lim-Lee scheme reveals much more information so that it does not satisfy the unlinability. Note that linability also means that the anonymity of a signer does not satisfy in the sense that one opened group signature will reveal all other group signatures signed by the same group member. 4.3
A Member Is Deleted from the Group
In Setion 5 of [15], Kim et al. claimed that “The following theorem implies that non-group member or a deleted group member with his obsolete secret key cannot generate any valid signature by showing that forging a valid signature is equivalent to solving the RSA problem.” Theorem 1 [15]. There exists a probabilistic polynomial algorithm that on input yR , yi , h, UM and eN outputs (w, α) satisfying βUM /(αeN ) = b where w weN w h and b = yi yR if and only if it is able to solve the RSA problem. β = yR We do not find any problem in their proof of Theorem 1. However, we notice that Theorem 1 does not imply that a deleted group member cannot use his obsolete secret key to generate valid signatures. In other words, the above claim they made is wrong. The reason is that a deleted group member not only has yR , yi , h, UM and eN , but also has xi and Ui such that yixi = z and yi (Ui )eN = UM . Therefore, in the essence Theorem 1 has no relation to the forging ability of a deleted member after he is deleted. In the following, we give an example to show how a deleted group member can update his secret key and then generate valid group signatures as a valid member does (The authors of [9] also point out this problem but without details.). The only assumption is that he can access the newly updated group’s public renewal property key UN . This assumption is reasonable since UN is a public information (at least in the group of system members). Therefore, in the case a deleted member cannot access newly updated UN , we assume that he may collude with a valid group member. Let G1 , G2 , · · · , Gm , Gm+1 be (m + 1) current group members in the system, and the current group’s public property key be UM = y1 · · · ym ym+1 y . Later, for some reason, one group member is deleted by MM. Without loss of generality, we assume that Gm+1 is the deleted group member. Then, MM publishes the new group’s property key U M = y1 · · · ym y , for some y ∈R G, and new group’s renewal property key U N = (y /ym+1 y )dN . By using U N and U M , each valid group member updates his secret property key as described in Delete protocol in Section 3.3. For a secure group signature scheme with member deletion, Gm+1 should not be able to update his secret property key any more. However, in the scheme [15], Gm+1 can update his secret property key Um+1 as follows.
Security Remarks on a Group Signature Scheme with Member Deletion
81
Assume that before Gm+1 has been deleted, his secret property key is Um+1 , eN = UM where Um+1 = (y1 · · · ym y )dN . To update his which satisfies ym+1 Um+1 secret property key, he needs to compute a value U m+1 such that eN
ym+1 U m+1 = U M .
(7)
−1 −1 )dN = (y1 · · · ym y ym+1 )dN = This implies U m+1 = (U M ym+1 dN dN = Um+1 U N . Therefore, by using the same (y1 · · · ym y ) · (y /(y ym+1 )) method, the deleted member Gm+1 can also update his secret property key as a valid group member does. Consequently, Gm+1 can generate valid group signatures by using his membership key (xm+1 , ym+1 ) and newly secret property key U m+1 even after he has been deleted from the system. Now, we further consider whether the deleted member Gm+1 can update his secret property key continuously when the group of system members changes dynamically. The answer is positive. We assume the system is set up at the time τ0 , and a member joins or is deleted at the time τj . The time sequence satisfies τ0 < τ1 < · · · < τj < τj+1 < τ · · ·. At the time τj , MM publishes the group’s public property key UMj and the τj group’s public renewal property key UN . During the time period Tj := [τj , τj+1 ), τ each group member Gi uses his secret property key Ui j to generate signatures. Therefore, for each valid member Gi in the time period Tj , the following equality holds: τ τ (8) yi (Ui j )eN = UMj .
In addition, from the description of Join and Delete protocols, it is not difficult to see that either a member joins the system or is deleted from the system in the time period Tj , the following equality always holds: τ
τ
τ
UNj = (UMj /UMj−1 )dN .
(9)
Assume that the member Gm+1 is deleted at the time τj . He wants to get τj+t his secret property key Um+1 for the time period Tj+t that satisfies Equation τj+t eN τ τj+t τ dN (8), i.e., ym+1 (Um+1 ) = UMj+t . This implies that Um+1 = (UMj+t )dN /ym+1 = τj+t τj+t−1 dN τj+t τj+t−1 dN UN · (UM ) /ym+1 = UN · Um+1 . Therefore, for any time period Tj+t , the deleted member Gm+1 can update his secret property key by using the following equation: τ
τ
τ
τ
τ
τ
j+t j Um+1 = UNj+t · UNj+t−1 · · · UNj+2 · UNj+1 · Um+1 ,
for any t ∈ Z>0 .
(10)
By using Equation (10), a deleted member can update his secret property key as a valid member does. Therefore, the authors of [15] failed to provide a group signature scheme supporting secure member deletion. 4.4
A Member Joins the Group
Now, we want to know when a new group member joins the system in the time period Tj , whether he can get his secret property key corresponding to the time period Tj where j < j? Again, the answer is positive.
82
G. Wang et al.
Assume that Gm+1 joins the system at time τj , and gets his secret property τj for time period Tj . Similar to equation (10), we can derive the following key Um+1 equation: j−t j Um+1 = (UNj−t+1 · · · UNj−1 · UNj )−1 · Um+1 ,
τ
τ
τ
τ
τ
for any 0 < t < j.
(11)
Therefore, if a group member Gm+1 who joins the system in time period Tj can get old renewal property keys, he is able to derive his secret property key corresponding to early time periods. According to how to bind the signature generation date and time in a signature (Lim et. al do not provide details), this kind of secret property keys may enable group members who joins the group later to generate back-dated group signatures. The generation time and date are normally embedded in a signature to allow a verifier to easily find the appropriate public property key UM to check the validity of a signature. In such a case, a newly joined member can use an earlier secret property key to generate signatures which look as if they are signed before. In some applications, this property may be not desirable.
5
Conclusion
In this paper, we presented a security analysis of the Kim-Lim-Lee group signature scheme with a member deletion procedure [15]. Our analysis showed that this scheme is linkable and does not support secure group member deletion. More specifically, we demonstrated that a verifier can easily determine whether two group signatures are signed by the same group member, and that a deleted group member can also update his signing key and then generate valid signatures after he was deleted from the group. Furthermore, we discovered that a newly joined group member can derive signing keys corresponding to the time before he joins the group and generate back-dated group signatures. In some scenarios, this may be not a desirable property. In addition, we pointed out that the requirements for security parameters listed in [15] are not sufficient to guarantee the system security. Therefore, the Kim-Lim-Lee group signature scheme is insecure though it provides a very efficient member deletion procedure.
References 1. G. Ateniese, J. Camenisch, M. Joye, and G. Tsudik. A practical and provably secure coalition-resistant group signature scheme. In: Advances in Cryptology – CRYPTO’2000, LNCS 1880, pages 255–270. Berlin: Springer-Verlag, 2000. 2. G. Ateniese and G. Tsudik. Some open issues and new directions in group signature schemes. In: Financial Cryptography (FC’99), LNCS 1648, pages 196–211. Berlin: Springer-Verlag, 1999. 3. N. Baric and B. Pfitzman. Collision-free accumulators and fail-stopsignature schemes without trees. In: Advances in Cryptology – EUROCRYPT’97, LNCS 1233, pages 480–494. Berlin: Springer-Verlag, 1997.
Security Remarks on a Group Signature Scheme with Member Deletion
83
4. D. Boneh. The decision Diffie-Hellman problem. In: Proceedings of the Third Algorithmic Number Theory Symposium, LNCS 1423, pages 48-63. Berlin: SpringerVerlag, 1998. 5. E. Bresson and J. Stern. Efficient revocation in group signatures. In: Public Key Cryptography (PKC’01), LNCS 1992, pages 190–206. Berlin: Springer-Verlag, 2001. 6. J. Camenisch and M. Stadler. Effient group signature schemes for large groups. In: Advances in Cryptology – CRYPTO’97, LNCS 1294, pages 410–424. Berlin: Springer-Verlag, 1997. 7. J. Camenisch and M. Michels. A group signature scheme with improved efficiency. In: Advances in Cryptology – ASIACRYPT’98, LNCS 1514, pages 160–174. Berlin: Springer-Verlag, 1998. 8. J. Camenisch and M. Michels. A group signature scheme based on an RSA-variant. Technical Report RS-98-27, BRICS, University of Aarhus, November 1998. An earlier version appears in [7]. 9. J. Camenisch and A. Lysyanskaya. Dynamic accumulators and application to efficient revocation of anonymous credentials. In: Advances in Cryptology – CRYPTO 2002, LNCS 2442, pages 61–76. Berlin: Springer-Verlag, 2002. 10. D. Chaum and E. van Heyst. Group signatures. In: Advances in Cryptology - EUROCRYPT’91, LNCS 950, pages 257–265. Berlin: Springer-Verlag, 1992. 11. L. Chen and T. P. Pedersen. New group signature schemes. In: Advances in Cryptology - EUROCRYT’94, LNCS 950, pages 171–181. Berlin: Springer-Verlag, 1995. 12. D. Coppersmith. Finding a small root of a Bivariatre interger equation; Factoring with high bits known. In: Advances in Cryptology – EUROCRYPT’96, LNCS 1070, pages 178–189. Berlin: Springer-Verlag, 1996. 13. W. Diffie and M.E. Hellman. New directions in cryptography. IEEE Transactions on Information Theory, 6(IT-22):644-C654, 1976. 14. E. Fujisaki and T. Okamoto. Statistical zero-knowledge protocols to prove modular polynomial relations. In: Advances in Cryptology – CRYPTO’97, LNCS 1294, pages 16–30. Berlin: Springer-Verlag, 1997. 15. H.J. Kim, J.I. Lim, and D.H. Lee. Efficient and secure member deletion in group signature schemes. In: Information Security and Cryptology (ICISC 2000), LNCS 2015, pages 150–161. Berlin: Springer-Verlag, 2001. 16. A. Lysyanskaya and Z. Ramzan. Group blind digital signatures: A scalable solution to electronic cash. In: Financial Cryptography (FC’98), LNCS 1465, pages 184–197. Berlin: Springer-Verlag, 1998. 17. H. Petersen. How to convert any digital signature scheme into a group signature scheme. In: Security Protocols Workshop, LNCS 1361, pages 177–190. Berlin: Springer-Verlag, 1997. 18. D.X. Song. Practical forward secure group signature schemes. In: Proceedings of the 8th ACM Conference on Computer and Communications Security (CCS 2001), pages 225–234. New York: ACM press, 2001.
An Efficient Known Plaintext Attack on FEA-M Hongjun Wu, Feng Bao, and Robert H. Deng Institute for Infocomm Research 21 Heng Mui Keng Terrace, Singapore 119613 {hongjun,baofeng,deng}@i2r.a-star.edu.sg
Abstract. Yi et al. have proposed a cipher called the fast encryption algorithm for multimedia (FEA-M). Recently Mihaljevi´c and Kohno pointed out that FEA-M is insecure. However, their attacks are not efficient: their chosen plaintext attack and known plaintext attack require 237 -bit chosen plaintext and 260 -bit known plaintext, respectively. In this paper we give an efficient known plaintext attack against FEA-M. Our attack requires only 228 -bit known plaintext and about 233 XOR operations.
1
Introduction
Yi et al. have proposed a fast encryption algorithm for multimedia (FEA-M) [4]. FEA-M is a cipher based on the Boolean matrix operations. Mihaljevi´c and Kohno broke FEA-M with two attacks [2]. Their chosen plaintext attack requires about 225 chosen messages with the first 4096 bits being 0. Their known plaintext attack requires about 260 -bit known plaintext. Both attacks are not efficient due to the large amount of chosen/known plaintext required. In this paper, we give a very efficient known plaintext attack against FEAM. Under our attack, the key is recovered with 228 -bit known plaintext. And only about 233 XOR operations are needed in the attack. Our attack shows that FEA-M is extremely insecure. This paper is organized as follows. Section 2 introduces the cipher FEA-M. Our efficient known plaintext attack is given in Section 3. Section 4 concludes this paper.
2
Description of FEA-M
¯ The secret key of FEA-M is a 64 × 64 invertible binary matrix denoted as K. For each message being encrypted, a session key pair (K,V ) is generated, where K and V are 64 × 64 binary matrices and K is invertible. This pair is encrypted ¯ as with the use of K ¯ · K −1 · K ¯ K = K ¯ ·V ·K ¯ V =K S. Qing, D. Gollmann, and J. Zhou (Eds.): ICICS 2003, LNCS 2836, pp. 84–87, 2003. c Springer-Verlag Berlin Heidelberg 2003
(1) (2)
An Efficient Known Plaintext Attack on FEA-M
85
where ‘·’ denotes the matrix multiplication over GF (2) and K −1 denotes the inverse of K over GF (2). To encrypt a message, the message is divided into 64 × 64 binary matrices P1 , P2 , · · ·, Pr , · · ·. Each plaintext block Pi is encrypted into ciphertext Ci as C1 = K · (P1 + V ) · K + V Ci = K · (Pi + Ci−1 ) · K i + Pi−1
for i ≥ 2
(3) (4)
where the ‘+’ denotes the matrix addition over GF (2). The ciphertext together ¯ with (K ,V ) are sent to the receiver. The message could be recovered with K.
3
The Efficient Known Plaintext Attack
In this section, we will introduce our efficient known plaintext attack against FEA-M. The attack is applied to recover the session key pair (K, V ) in Subsection ¯ is recovered in Subsection 3.2. 3.1. The master secret key K 3.1
Recovering the Session Key Pair (K, V )
We assume that (Pi−1 + Ci ) is invertible for i ≥ 2. The impact of the noninvertible (Pi−1 + Ci ) on the attack is discussed at the end of this subsection. From (4), we obtain that I = (Pi−1 + Ci )−1 · K · (Pi + Ci−1 ) · K i
for i ≥ 2
(5)
where I is the identity matrix. Let Ai = Pi−1 + Ci and Bi = Pi + Ci−1 , we rewrite (5) as i for i ≥ 2 (6) I = A−1 i · K · Bi · K Combine any two consecutive equations in (6), we obtain Ai+1 · A−1 i · K · Bi = K · Bi+1 · K
for i ≥ 2
(7)
Solve the following linear equations for the binary unknown variables xi (2 ≤ i ≤ 4098), 4098 xi · Bi+1 = 0 (8) i=2
To solve (8), we write (8) as M · X = 0, where M is a 4096 × 4097 binary matrix (
i
,i mod 64)
64 with each element M (i,j) = Bj+2 for 1 ≤ i ≤ 4096 and 1 ≤ j ≤ 4097 i i ), and X is a binary vector (the floor function 64 denotes the integer part of 64 with 4097 elements with each element Xi = xi . A non-zero vector X satisfying M · X = 0 always exists since the rank of M is at most 4096, which is less than the number of variables. From a non-zero solution X, we define a set S as
S = {i|xi = 1} From (7) and the definition of S, we obtain the following relation
86
H. Wu, F. Bao, and R.H. Deng
Ai+1 · A−1 i · K · Bi = 0
(9)
i∈S
(9) can be written as T · Y = 0, where T is a 4096 × 4096 binary matrix, and i Y is a binary vector with 4096 elements, each element Yi = K ( 64 ,i mod 64) . It is known that the rank of a randomly generated m × n binary matrix is r (1 ≤ r ≤ min(m, n)) with probabiltiy Pr = 2r(m+n−r)−nm
r−1 i=0
(1 − 2i−m )(1 − 2i−n ) 1 − 2i−r
For an n × n binary matrix (n ≥ 64), the rank is n, n − 1, n − 2, n − 3 and n − 4 with probability 0.2888, 0.5776, 0.1284, 0.0052 and 4.7 × 10−5 , respectively. The probability that the rank being less than n − 4 is negligible. Since a non-zero Y (the session key K) is a solution to (9), the rank of T is less than 4096. The rank of T is less than 4092 with negligible probability, so there are only a few non-zero solutions to (9). We can filter the wrong K by substituting those solutions into any equation in (4). Once we know the value of K, V can be obtained by solving (3). Note that (7) holds only if Ai and Ai+1 are invertible. A randomly generated 64 × 64 binary matrix is invertible with probability 0.2888. The probability that both Ai and Ai+1 are invertible is about 0.083. We thus need about 216 blocks of known plaintext in the attack, that is equivalent to 228 -bit known plaintext. 3.2
3.2 Recovering the Secret Key K̄
We proceed to recover the master secret key K̄ from the session key pair (K, V). Let Z = K̄^{-1}. From (1) and (2), we obtain

Z · K' = K^{-1} · K̄,    Z · V' = V · K̄    (10)
V is invertible with probability 0.2888. If V is invertible, (10) can be simplified further by eliminating Z. Otherwise we solve (10) directly. The pair (K', V') is known to the attacker since it is sent together with the ciphertext. (Z, K̄) can be retrieved by solving at most 8192 linear equations in (10). In case too many solutions exist, one more pair (K, V) is needed to refine the result.

3.3 Complexity of the Attack
The expensive operations in the attack are related to 1) computing the inverses of 2^16 64 × 64 binary matrices to find 4097 invertible (A_i, A_{i+1}) pairs, 2) computing the matrix T, and 3) solving four groups of binary linear equations, (8), (9), (3) and (10). We use standard Gaussian elimination in the attack and assume that the attack is implemented on a 32-bit microprocessor. Computing the inverse of a 64 × 64 binary matrix requires about 2^13 XOR operations. We need about 2^31 XOR operations to form the matrix T. Solving each of (8), (9)
and (3) requires about 2^29.4 XOR operations. Solving (10) requires 2^32.4 XOR operations. The number of XOR operations required in the attack is about 2^16 × 2^13 + 2^31 + 3 × 2^29.4 + 2^32.4 ≈ 2^33.28. In the attack we use standard Gaussian elimination instead of Strassen's algorithm [3] or Coppersmith and Winograd's algorithm [1]. The reason is that the dimension of the matrices involved in the attack is small (at most 8192) and Gaussian elimination already performs well. The complete attack requires 2^28 bits of known plaintext and about 2^33 XOR operations. Our attack is more efficient than that in [2] because we developed an efficient technique to eliminate the quadratic terms in (1), (2) and (7), while the standard linearization technique (replacing each quadratic term with a new variable) is used in [2].
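To illustrate why plain Gaussian elimination is cheap here (our own sketch, not the authors' implementation), rows of a binary matrix can be packed into machine words so that each elimination step is a single word-wide XOR; the same routine also yields a non-zero null-space vector as needed for (8) and (9):

```python
def gf2_nullspace_vector(rows, ncols):
    """rows[i] is an int whose bit j is M(i, j).  Returns one non-zero solution
    of M.X = 0 as an int bit-vector, or None if only the trivial solution exists."""
    rows = rows[:]
    pivot_cols = []          # pivot_cols[k] = column of the k-th pivot row
    r = 0
    for c in range(ncols):
        pivot = next((k for k in range(r, len(rows)) if (rows[k] >> c) & 1), None)
        if pivot is None:
            # free column: set x_c = 1 and back-substitute the pivot variables
            x = 1 << c
            for k in range(r - 1, -1, -1):
                if bin(rows[k] & x).count("1") & 1:
                    x |= 1 << pivot_cols[k]
            return x
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for k in range(len(rows)):
            if k != r and (rows[k] >> c) & 1:
                rows[k] ^= rows[r]          # one XOR per eliminated row
        pivot_cols.append(c)
        r += 1
    return None
```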
4 Conclusions
In this paper, we proposed a known plaintext attack against FEA-M. It is much more efficient than the attacks reported earlier. Our attack shows that FEA-M is extremely weak and should not be used.
Acknowledgements. We would like to thank the anonymous reviewers of ICICS for their helpful comments.
References

1. D. Coppersmith and S. Winograd, "On the Asymptotic Complexity of Matrix Multiplication", SIAM Journal on Computing, Vol. 11 (1982), pp. 472–492.
2. M.J. Mihaljević and R. Kohno, "Cryptanalysis of Fast Encryption Algorithm for Multimedia FEA-M", IEEE Communications Letters, Vol. 6, No. 9, pp. 382–385, September 2002.
3. V. Strassen, "Gaussian Elimination is not Optimal", Numerische Mathematik, Vol. 13 (1969), pp. 354–356.
4. X. Yi, C.H. Tan, C.K. Siew, and M.R. Syed, "Fast Encryption for Multimedia", IEEE Transactions on Consumer Electronics, Vol. 47, No. 1, pp. 101–107, February 2001.
An Efficient Public-Key Framework Jianying Zhou, Feng Bao, and Robert Deng Institute for Infocomm Research 21 Heng Mui Keng Terrace Singapore 119613 {jyzhou,baofeng,deng}@i2r.a-star.edu.sg
Abstract. Public-key certificates play an important role in binding the public key with the identity of the owner of the corresponding private key. A certificate might be revoked before its scheduled expiry date by the issuing CA. Efficient and timely distribution of certificate revocation information is a big challenge facing the PKI providers. Existing certificate revocation schemes place considerable processing, communication, and storage overheads on the CA as well as the relying parties. To improve the current situation, we propose a revocation-free public-key framework, in which the maximum lifetime of a certificate is divided into short periods and the certificate could expire at the end of any period under the control of the certificate owner (or his manager in a corporate environment). The verifier can check the status of such a certificate without retrieving the revocation information from the CA. The new framework is especially useful for applications on wireless devices that are unable to make simultaneous connections. The new framework could be easily integrated into existing PKI products that support X.509-based certificates.
1 Introduction

The public-key infrastructure (PKI) provides an important support for various security services relying on public-key cryptography [AL99]. A public-key certificate binds the public key with the identity of the owner of the corresponding private key [ISO13888-1]. X.509 is an industry standard which defines the format of a public-key certificate [X509]. To ensure the authenticated binding, the certificate needs to be issued by a trusted third party (TTP) called the certification authority (CA). A certificate might be revoked before its scheduled expiry date by the issuing CA. Efficient and timely distribution of certificate revocation information is a big challenge facing the PKI providers. The IETF PKIX Working Group is developing the Internet standards to support an X.509-based PKI [RFC2459], which provides a framework on services related to issuing public-key certificates and distributing revocation information. In practice, distribution of revocation information constitutes a substantial cost of PKI. The efficiency could be significantly improved if a user can control the validity of his own certificate and others can check the validity of such a certificate without retrieving the revocation information from the CA (or the designated directory).
In this paper, we propose an efficient public-key framework that exempts the CA from certificate revocation. We define an extensible public-key certificate that divides its maximum lifetime into short periods and is allowed to expire at the end of any period. The undeniable information that extends the certificate’s expiry date is released at a regular interval, and might be controlled either by the certificate owner or by his manager in a corporate environment. The certificate verifier can determine whether the certificate is valid without contacting the CA or other trusted third parties. The new framework removes a major operational bottleneck in today’s PKI. It is especially useful for applications on wireless devices that are unable to make simultaneous connections. It could be easily integrated into existing PKI products that support X.509-based certificates. The rest of the paper is organized as follows. In Section 2, we review two standardized certificate revocation mechanisms in the IETF. After that, we propose a revocation-free public-key framework in Section 3, and discuss the integration with X.509 in Section 4. We conclude the paper in Section 5.
2 Certificate Revocation

Certificate revocation is one of the major issues in PKI. There are two standardized certificate revocation mechanisms in the IETF.
• CRL – Certificate Revocation List [RFC2459], which provides periodic revocation information.
• OCSP – On-line Certificate Status Protocol [RFC2560], which provides timely revocation information.

2.1 Certificate Revocation List

A CRL is a time-stamped list of serial numbers or other certificate identifiers for those certificates that have been revoked by a particular CA. It is signed by the relevant CA and made freely available in a public repository. Updates should be issued regularly, even if the list has not changed (thus enabling users possessing a CRL to check that it is the current one). The revoked certificates should remain on the list until their scheduled expiry date. The X.509 v2 CRL format profiled for Internet use in [RFC2459] defines the required and optional fields. The required fields identify the CRL issuer, the algorithm used to sign the CRL, the date and time the CRL was issued, and the date and time by which the CA will issue the next CRL. Additional information includes
• Reason Code – identifies the reason for the certificate revocation.
• Hold Instruction Code – indicates the action to be taken after encountering a certificate that has been placed on hold.
• Invalidity Date – provides the date on which it is known or suspected that the private key was compromised or that the certificate otherwise became invalid.
• Certificate Issuer – identifies the certificate issuer associated with an entry in an indirect CRL.

A main optional field is "CRL extensions", which provides methods for associating additional attributes with CRLs. The X.509 v2 CRL format allows communities to define private extensions to carry information unique to those communities. Each extension in a CRL may be designated as critical or non-critical. A CRL validation must fail if it encounters a critical extension which it does not know how to process. However, an unrecognized non-critical extension may be ignored. Operational protocols that deliver CRLs to client systems could be built based on a variety of different means such as LDAP, HTTP, FTP, and X.500.

A disadvantage of the CRL-based mechanism is that the time granularity of revocation is limited to the CRL issue period. For example, if a revocation is reported now, it will not be reliably notified to certificate verifiers until the next periodic CRL is issued – this may be up to one hour, one day, or one week depending on the frequency with which the CA issues CRLs.

2.2 Online Revocation and Verification

As a supplement to checking against a periodic CRL, the OCSP-based mechanism enables applications to determine the status of a certificate in a timely manner but with a much higher operational cost. An OCSP client issues a status request to an OCSP responder and suspends acceptance of the certificate in question until the responder provides a response. The OCSP responder must be one of the following parties.
• The CA who issued the certificate in question,
• A trusted responder whose public key is trusted by the requester, or
• A designated responder who holds a specially marked certificate issued directly by the CA, indicating that the responder may issue OCSP responses for that CA.

Upon receipt of a request, the OCSP responder either returns a definitive response or produces an error message. All definitive response messages should be digitally signed. The response for each of the certificates in a request mainly consists of "certificate status value" and "response validity interval". There are three certificate status values. "Good" indicates a positive response to the status inquiry. "Revoked" indicates that the certificate has been revoked. "Unknown" indicates that the responder does not know about the certificate being requested. There are two response validity intervals. "ThisUpdate" indicates the time at which the status being indicated is known to be correct. "NextUpdate" indicates the time at which newer information will be available about the certificate status. If "nextUpdate" is not set, it means newer revocation information is available all the time. Prior to accepting a signed response as valid, OCSP clients should confirm that
• The certificate identified in a received response corresponds to the one identified in the request.
• The signature on the response is valid.
• The identity of the signer matches the intended recipient of the request.
• The signer is currently authorized to sign the response.
• The time "thisUpdate" is sufficiently recent.
• The time "nextUpdate" is greater than the current time if it is set.

Both of the above IETF standardized revocation mechanisms require the certificate verifier to obtain the revocation information from a trusted third party to check the status of a public-key certificate. That could place considerable processing, communication, and storage overheads on the CA as well as the relying parties, which might be unaffordable to applications with limited computational and/or network capability. For instance, a wireless device may not be able to establish an extra connection with the CA to check the status of a certificate in an on-going communication session with another entity. Many efforts have been devoted to improving the efficiency of certificate revocation. The use of a certificate revocation tree (CRT) was suggested in [Ko98] to enable the verifier to get a short proof that the certificate was not revoked. A windowed revocation mechanism was proposed in [MJ00] to reduce the burden on certificate servers and network resources. A certificate revocation system was presented in [Mi01] to improve the CRL communication costs. More work on efficient certificate revocation can be found in [ALO98, Co00, NN98, WLM00]. Unfortunately, there is no scheme that exempts the CA from certificate revocation.
3 A New Public-Key Framework

Here we present a new public-key framework in which the maximum lifetime of a certificate is divided into short periods and the certificate could expire at the end of any period under the control of the certificate owner (or his manager in a corporate environment). The verifier can check the certificate status without retrieving the revocation information from the CA. This is based on a security building block, the "one-way hash chain" [La81]. One-way hash chains have been used in many applications, including one-time password authentication and micro-payment. A one-way hash chain can also be bound to a public-key certificate. In [Mi01], the CA generates a one-way hash chain for each user requesting a public-key certificate, and includes each user's last chained hash value in their certificate. The CA updates the status of users' certificates regularly by releasing the corresponding hash values instead of the CRL. The performance is improved in such a system. However, the CA still needs to be constantly involved to provide the revocation information to certificate verifiers. We intend to establish a new public-key framework that exempts the CA from testifying to the validity of a public-key certificate once the certificate has been issued by the CA. The exclusion of the CA's involvement is based on the assumptions that the CA's private key is well protected against compromise and the certificates issued by the CA are error-free.
Definition 1. A public-key framework is revocation-free if the CA need not provide the revocation information of public-key certificates that it has issued, and the verifier can check the certificate status without contacting the CA.

3.1 Generation of New Certificate

We first consider the situation that the validity of a public-key certificate is solely controlled by the certificate owner. SIGN_A(M) denotes party A's signature on message M. A user U's public-key certificate with an extensible expiry date could be generated in the following way.

Actions by U
1. Generate a pair of keys: SK_U – private key, PK_U – public key.
2. Define the certificate parameters: T – maximum lifetime, D – starting valid date, L – time period for refreshing validity of the certificate. Suppose j = T/L is an integer. The refreshing points are denoted as D_1 = D+L, D_2 = D+2*L, ..., D_j = D+j*L and illustrated in Figure 1.
3. Generate a one-way hash chain H^i(r) = H(H^{i-1}(r)) (i = 1, 2, ..., j), where H^0(r) = r and r is a random number known only to U.
4. Send (PK_U, D, H^j(r), j, L) to the CA.

Actions by the CA
1. Authenticate U's request in an out-of-band method.
2. Generate a certificate CERT_U = SIGN_CA(U, PK_U, D, H^j(r), j, L).¹
3. Issue CERT_U to U.
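A minimal sketch (ours, not the paper's implementation) of the user-side hash-chain setup just described; SHA-1 stands in for H (the paper does not fix a particular hash), and the CA signature step is omitted.

```python
import hashlib, os

def H(x: bytes) -> bytes:
    return hashlib.sha1(x).digest()

def setup_extensible_cert_data(T_days: int, L_days: int):
    assert T_days % L_days == 0
    j = T_days // L_days                 # number of refreshing periods
    r = os.urandom(16)                   # hash chain root, known only to U
    chain = [r]                          # chain[i] = H^i(r)
    for _ in range(j):
        chain.append(H(chain[-1]))
    # (PK_U, D, H^j(r), j, L) is what U sends to the CA; r and the
    # intermediate chain values stay with U.
    return r, chain, chain[j], j

r, chain, Hj_r, j = setup_extensible_cert_data(T_days=730, L_days=1)
```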
[Figure 1 depicts the certificate lifetime from D to D_j = D + j*L = D + T, divided into refreshing points D_1, ..., D_j, with the hash values H^{j-1}(r), H^{j-2}(r), ..., H^0(r) released successively at the refreshing points.]

Fig. 1. Certificate Expiry Date Extension
Compared with an ordinary public-key certificate, CERT_U contains extra data (H^j(r), j, L).² They will be used to control the validity of CERT_U.

¹ For simplicity, other less related information is omitted in CERT_U.
Definition 2. A public-key certificate CERT_U is (r,j,L)-extensible if the maximum number of extensions is j, the refreshing period is L, and the control seed is r.

Definition 3. A public-key certificate CERT_U is self-controlled (r,j,L)-extensible if CERT_U is (r,j,L)-extensible and r is known to U only.

3.2 Use of New Certificate

Once CERT_U is generated, it could either be delivered by the certificate owner U during a transaction, or be retrieved from a public directory maintained by a third party. At the starting valid date D, U can release H^{j-1}(r) to initialize the validity of CERT_U, which then has an expiry date D_1 = D+L. We focus our discussion on the use of the public-key certificate in digital signatures. Suppose the next refreshing point of CERT_U is D_e. When U generates a digital signature with SK_U, he will attach (H^i(r), i), where i = j - (D_e - D)/L, to the signature. (The hash value released at each refreshing point is illustrated in Figure 1.) Note that it is entirely up to U whether to release a hash value at a refreshing point. For example, if U does not generate any signature in the period between D_{e-1} and D_e, U need not release H^i(r). But later if U wants to generate signatures in the period between D_e and D_{e+1}, U can directly release H^{i-1}(r).

When a transacting party V wants to verify U's signatures, he first needs to check the status of CERT_U. Suppose V holds the CA's public verification key, and the current time that V verifies CERT_U is D_v. V can take the following steps to check the status of CERT_U (a code sketch of these checks is given below).
1. V verifies the CA's signature on (U, PK_U, D, H^j(r), j, L). If true, V is sure that U's public key is PK_U, the starting valid date is D, the maximum lifetime is T = j*L, the refreshing time period is L, and the last hash value in the one-way hash chain is H^j(r).
2. V checks that 0 ≤ i < j and H^{j-i}(H^i(r)) = H^j(r). If true, V believes that H^i(r) is a valid hash value in the one-way hash chain ended with H^j(r).
3. V checks that D_v ≤ D + (j-i)*L. If true, V concludes that CERT_U is valid now, and remains valid until D_e = D + (j-i)*L.

In such a way, U can control the validity of CERT_U by releasing the corresponding H^i(r) when generating digital signatures. V can check the status of CERT_U without retrieving the revocation information from the CA. Thus, the CA is exempted from certificate revocation in our new public-key framework.
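A hedged sketch of the verifier's checks 2 and 3 (the CA signature verification in step 1 is omitted); H is assumed to be the same hash used when the chain was generated, and the day-granular dates are our simplification.

```python
import hashlib
from datetime import date, timedelta

def H(x: bytes) -> bytes:
    return hashlib.sha1(x).digest()

def check_cert_status(Hj_r: bytes, j: int, L_days: int, D: date,
                      Hi_r: bytes, i: int, today: date):
    """Hj_r, j, L, D come from CERT_U; (Hi_r, i) is attached to the signature."""
    if not (0 <= i < j):
        return False, None
    v = Hi_r
    for _ in range(j - i):                      # check H^{j-i}(H^i(r)) == H^j(r)
        v = H(v)
    if v != Hj_r:
        return False, None
    expiry = D + timedelta(days=(j - i) * L_days)   # D_e = D + (j-i)*L
    return today <= expiry, expiry
```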
² CERT_U should also include an identifier of the hash function used to generate and verify the hash chain.
3.3 Protection of Hash Chain Root

In the above framework, the certificate owner U relies on the hash chain root r to control the expiry date of his public-key certificate CERT_U. There is an advantage in using a separate secret r to protect the private key SK_U: the system remains secure as long as either r or SK_U is not compromised. If SK_U is compromised, U could destroy r, and then CERT_U will expire shortly, at the next refreshing point. Similarly, if r is compromised, U could destroy SK_U and stop using it for signing. The risk remains, however, if r and SK_U are stored in the same computer system. If the system is broken into, both r and SK_U will be compromised. Then a hacker holding r and SK_U can always generate valid signatures by refreshing the validity of CERT_U until its maximum lifetime T. Therefore we need to protect them separately.

The hash chain root r and the private key SK_U are different in two aspects.
• r is needed only at the refreshing points while SK_U might be used at any time. That means SK_U should be highly available in a system while r could be kept "off-line".
• A signing key usually has a length of 1024 bits or above while the hash chain root can be as short as 128 bits. That implies SK_U is usually beyond a human's capability to memorize while r might be memorized.

Consequently, the hash chain root can be protected in a way different from the signing key. For individual users, the most straightforward approach is to remember the hash chain root r and manually input r at the time of refreshing CERT_U. After the hash value needed for refreshing is generated, r will be erased from the local computer system. That will minimize the possibility of compromise caused by system break-in. The hash chain root protection mechanism for corporate users is discussed below.

3.4 Manager-Controlled Certificate

In the above framework, the certificate owner U has full control over the validity of CERT_U until it reaches its maximum lifetime T. This can only address the need for certificate revocation caused by the compromise of private keys. However, a public-key certificate may have to be revoked by the manager of the certificate owner for other reasons such as termination of job or change of name.

Definition 4. A public-key certificate CERT_U is manager-controlled (r,j,L)-extensible if CERT_U is (r,j,L)-extensible and r is known to U's manager only.

This problem could be solved if the hash chain root is generated by a security server (SS), which is supposed to be administrated by the manager of corporate users. Then, the process of certificate generation is changed as follows.

Actions by U
1. U generates a pair of keys: private key SK_U and public key PK_U.
2. Suppose U has registered his password at the SS. U sends the request of a certificate for corporate use, together with PK_U, to the SS over an authenticated channel established with a password-based protocol (e.g., [BM92, Wu98]).

Actions by the SS
1. According to the corporate security policy, the SS defines the maximum lifetime of U's certificate as T, and the starting valid date as D. It also selects the time period for refreshing the validity of the certificate as L.
2. Suppose j = T/L is an integer. The SS selects a random number r as the root of a one-way hash chain, and generates the one-way hash chain H^i(r) = H(H^{i-1}(r)) (i = 1, 2, ..., j).
3. The SS sends (U, PK_U, D, H^j(r), j, L) to the CA.

Actions by the CA
1. The CA authenticates the SS's request for generating a public-key certificate in an out-of-band method.³ (This will prevent U from requesting a public-key certificate for corporate use without authorization.)
2. The CA may further challenge U for a signature to ensure U holds the corresponding private key. (This will prevent the SS from requesting a public-key certificate in the name of U who is unaware of it.)
3. The CA generates a certificate CERT_U = SIGN_CA(U, PK_U, D, H^j(r), j, L).
4. The CA issues CERT_U to U (via the SS).

When a refreshing date is approaching, the SS distributes the corresponding hash value to U. Suppose the next refreshing date of CERT_U is D_e. The security server calculates H^i(r) from r, where i = j - (D_e - D)/L, and distributes (H^i(r), i) to U. No protection is needed in distribution. U can easily verify that H^i(r) is the hash value to be released on the date D_e by checking whether j - i = (D_e - D)/L and H^{j-i}(H^i(r)) = H^j(r).

If the SS wants to revoke U's certificate for some reason instructed by the corporate management, it can do so by stopping the release of U's hash values; thus CERT_U will expire soon, at the next refreshing point. The SS could even temporarily invalidate CERT_U if U is on leave, and refresh CERT_U later if necessary by releasing the corresponding hash value. If U suspects a compromise of his private key, U could send a request to the SS to stop distribution of the next hash value.

The SS's role in our new public-key framework is fundamentally different from the CA's role in certificate revocation.

Availability
• The CA needs to make the revocation information available to any potential certificate verifier over the Internet, which may lead to a higher risk of denial-of-service attacks.
• The SS only needs to communicate with the internal certificate owners. There could be a set of security servers, each of which manages the hash chain roots for a group of clients. The connection to these security servers could be tightly controlled within the specified sub-domains to minimize the risk of system break-in and denial-of-service attacks.⁴
³ On-line authentication could be performed if a secure channel exists between the SS and the CA.
⁴ A dedicated security server may be set up to manage mobile corporate users, and the maximum lifetime of those certificates may be defined shorter than normal.
Authenticity
• The authenticity and integrity of the revocation information released by the CA need to be protected.
• The chained hash values released by the SS need no protection.

3.5 Comparison

We evaluate the performance of our new public-key framework against the CRL-based and OCSP-based mechanisms. We first consider the computing complexity. With the CRL-based or OCSP-based mechanism, signature generation and verification are needed when updating and verifying the certificate status. In our framework, only hash operations are required when updating and verifying the certificate status.

Now we discuss the communication overheads. With the CRL-based or OCSP-based mechanism, the CA (or a designated party) always needs to be contacted to check the status of a certificate. On the contrary, the status of a self-controlled certificate in our framework can be updated and verified without contacting any third party. Even in the case of a manager-controlled certificate, the cost of connecting to the security server is lightweight. For the latter case, let us take a look at the following two scenarios related to signature verification.
• 1 signer vs n verifiers – (1-n) scenario: one signer generates n signatures and sends them to n verifiers.
• n signers vs 1 verifier – (n-1) scenario: n signers generate n signatures and send them to one verifier.

In the (1-n) scenario, each of the n verifiers needs to contact the CA(s) to check the certificate status when the CRL-based or OCSP-based mechanism is used. In comparison, if the n signatures are generated at i different periods, only i (i ≤ n) connections between the signer and the SS are required in our framework. Obviously, when i is small (i.e., most of the signatures are generated within the same period), the communication overheads of our framework are much lower.

In the (n-1) scenario, n connections between the signers and the SS(s) are required in our framework. If each signer's certificate is issued by a different CA, n connections between the verifier and the CAs are also required for the CRL-based and OCSP-based mechanisms. If each signer's certificate is issued by the same CA, and verifications take place at k different periods, k connections between the verifier and the CA are required for the CRL-based and OCSP-based mechanisms. Usually k is almost equal to n in the OCSP-based mechanism. Even if k is small in the CRL-based mechanism, k CRLs are much longer than n 20-byte hash values. Therefore the communication overheads are not much different for the three mechanisms in this scenario. Table 1 shows the comparison result when the security server is used in our framework. We should also take into consideration the different types of communication when assessing the performance, i.e., connection with the CA over the Internet and connection with the SS over the Intranet.
Table 1. Comparison of Communication Overheads

Scenario | CRL                                        | OCSP                                       | Ours (using SS)
1-n      | n                                          | n                                          | i (different periods)
n-1      | n (different CAs) or k (different periods) | n (different CAs) or k (different periods) | n
In our public-key framework, the certificate status update is flexible as the update period is controlled by the parameter L. When L is selected very short, the certificate status update is almost real-time like the OCSP-based mechanism, but it is more efficient than the OCSP-based mechanism as demonstrated above. When L is selected long, the certificate status update is similar to the CRL-based mechanism, but L is a local parameter of individual certificates rather than a global one for all certificates as in the CRL-based mechanism. For instance, a certificate with an ordinary security requirement might have the maximum lifetime T = 2 years (730 days) and the refreshing period L = 1 day, giving the hash chain length j = 730. Alternatively, a certificate with a high security requirement could have T = 1 year (365 days) and L = 1 hour, giving j = 8760. It is not difficult for a security server to handle certificates with different status update periods. However, it is hard for a CA to manage certificates with different CRL release periods. Certificate verifiers will be confused if the CA releases more than one CRL. The above comparison shows that the overall performance of our public-key framework is better than that of the CRL-based and OCSP-based mechanisms.
4 Integration with X.509

X.509 is an industry standard which defines the format of a public-key certificate. The success of our new public-key framework is closely related to the interoperability achieved when the extra data for an extensible expiry date is integrated into the existing X.509 certificate. The X.509 v3 certificate basic syntax includes version number, serial number, issuer's signature algorithm identifier, issuer name, validity period, subject name, subject public key information, issuer unique id, subject unique id, and extensions [RFC2459]. The most flexible part of an X.509 v3 certificate is its "extensions" field. Each extension contains an extension id and the extension value, and may be designated as critical or non-critical. The extensions defined for X.509 v3 certificates provide methods for associating additional attributes with users or public keys and for managing the certification hierarchy. The X.509 v3 certificate format also allows communities to define private extensions to carry information unique to those communities. Current standard extensions are authority key id, subject key id, key usage, certificate policies, subject
alternative name, issuer alternative name, basic constraints, name constraints, policy constraints, and extended key usage.

As pointed out in Section 3.1, to support the extensible expiry date, the data (H^j(r), j, L) should be included in the certificate. From the structure of an X.509 v3 certificate, there are three possible extensions that the data (H^j(r), j, L) could be integrated into.
• The first option is "private extension". This extension could be defined locally, and allows X.509 v3 certificates to include more attributes. We could define a new private extension that specifies the data format as (H^j(r), j, L) to support the extensible expiry date.
• The second option is "subject key id". The subject key id extension provides a means of identifying certificates that contain a particular public key. As the hash chain root r is randomly selected when generating a public-key certificate, the data (H^j(r), j, L) could be regarded as a subject key identifier that uniquely links to the public key.
• The third option is "subject alternative name". The subject alternative name extension allows additional identities to be bound to the subject of the certificate. The data (H^j(r), j, L) could be regarded as an additional identity bound to the subject in the form of a locally defined "other name".

We have integrated the new public-key framework into S/MIME and SSL successfully with backward compatibility. In the integrated system, users of S/MIME and SSL can check the certificate status without retrieving the revocation information from the CA.
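As an illustration only (not the authors' implementation), the first option could look roughly as follows with the Python 'cryptography' package, assuming a recent version of the library that accepts an UnrecognizedExtension in a builder; the OID, the byte packing of (H^j(r), j, L), and the self-signed certificate are placeholders for what a real CA would do.

```python
import datetime, hashlib, os, struct
from cryptography import x509
from cryptography.x509.oid import NameOID
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa

# build the chain end H^j(r) as in Section 3.1
r, j, L_days = os.urandom(16), 730, 1
v = r
for _ in range(j):
    v = hashlib.sha1(v).digest()

ext_value = v + struct.pack(">II", j, L_days)           # (H^j(r), j, L), naively packed
ext_oid = x509.ObjectIdentifier("1.3.6.1.4.1.99999.1")  # hypothetical private OID

key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, u"U")])
cert = (
    x509.CertificateBuilder()
    .subject_name(name).issuer_name(name)                # self-signed for the sketch
    .public_key(key.public_key())
    .serial_number(x509.random_serial_number())
    .not_valid_before(datetime.datetime.utcnow())
    .not_valid_after(datetime.datetime.utcnow() + datetime.timedelta(days=j))
    .add_extension(x509.UnrecognizedExtension(ext_oid, ext_value), critical=False)
    .sign(key, hashes.SHA256())
)
```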
5 Conclusion

Certificate revocation is an important issue in the public-key infrastructure. Currently, there are two standardized certificate revocation mechanisms, either using CRLs for periodic revocation or using OCSP for on-line revocation, both of which place considerable processing, communication, and storage overheads on the CA as well as the relying parties. In this paper, we proposed a new public-key framework, where the certificate owner can control the validity of his certificate and the verifier can check the status of such a certificate without retrieving the revocation information from the CA. The new framework significantly improves the efficiency as a result of reduced computing and communication overheads on certificate verifiers and the CA. It is especially useful for applications on wireless devices that may not support simultaneous connections. We introduced security servers into our new framework for a corporate environment. Each security server manages the validity of public-key certificates for a specified group of corporate users, thus enabling the prompt suspension of an employee's certificate once he leaves the job. It plays a fundamentally different role than the CA in certificate revocation. The extension of the public-key certificate in our new framework is compatible with X.509, which makes it easy to integrate into existing PKI products.
Our new public-key framework is not intended to replace existing certificate revocation mechanisms completely. Instead, it provides a new option in the deployment of PKI, which might be extremely useful for some types of applications.
References

[AL99] C. Adams and S. Lloyd. "Understanding public-key infrastructure: concepts, standards, and deployment considerations". Indianapolis: Macmillan Technical Publishing, 1999.
[ALO98] W. Aiello, S. Lodha, and R. Ostrovsky. "Fast digital identity revocation". Lecture Notes in Computer Science 1462, Advances in Cryptology: Proceedings of Crypto'98, pages 137–152, Santa Barbara, California, August 1998.
[BM92] S. Bellovin and M. Merritt. "Encrypted key exchange: Password-based protocols secure against dictionary attacks". Proceedings of 1992 IEEE Symposium on Security and Privacy, pages 72–84, Oakland, California, May 1992.
[Co00] D. Cooper. "A more efficient use of delta-CRLs". Proceedings of 2000 IEEE Symposium on Security and Privacy, pages 190–202, Oakland, California, May 2000.
[ISO13888-1] ISO/IEC 13888-1. "Information technology – Security techniques – Non-repudiation – Part 1: General". ISO/IEC, 1997.
[Ko98] P. Kocher. "On certificate revocation and validation". Lecture Notes in Computer Science 1465, Proceedings of 1998 Financial Cryptography, pages 172–177, Anguilla BWI, February 1998.
[La81] L. Lamport. "Password authentication with insecure communication". Communications of the ACM, 24(11):770–772, November 1981.
[Mi01] S. Micali. "Certificate revocation system". US Patent 6292893, September 2001.
[MJ00] P. McDaniel and S. Jamin. "Windowed certificate revocation". Proceedings of IEEE INFOCOM'2000, pages 1406–1414, Tel-Aviv, Israel, March 2000.
[NN98] M. Naor and K. Nissim. "Certificate revocation and certificate update". Proceedings of the 7th USENIX Security Symposium, San Antonio, Texas, January 1998.
[RFC2459] R. Housley, W. Ford, W. Polk, and D. Solo. "Internet X.509 public key infrastructure certificate and CRL profile". RFC 2459, January 1999.
[RFC2560] M. Myers, R. Ankney, A. Malpani, S. Galperin, and C. Adams. "X.509 Internet public key infrastructure on-line certificate status protocol (OCSP)". RFC 2560, June 1999.
[WLM00] R. Wright, P. Lincoln, and J. Millen. "Efficient fault-tolerant certificate revocation". Proceedings of the 7th ACM Conference on Computer and Communications Security, pages 19–24, Athens, Greece, November 2000.
[Wu98] T. Wu. "The secure remote password protocol". Proceedings of the 1998 Internet Society Network and Distributed System Security Symposium, pages 97–111, San Diego, California, March 1998.
[X509] ITU-T. "Information technology – Open systems interconnection – The directory: Public-key and attribute certificate frameworks". ITU-T Recommendation X.509 (V4), 2000.
ROCEM: Robust Certified E-mail System Based on Server-Supported Signature

Jong-Phil Yang¹, Chul Sur¹, and Kyung Hyune Rhee²

¹ Department of Computer Science, Pukyong Nat'l Univ., 599-1, Daeyeon3-Dong, Nam-Gu, Pusan 608-737, Republic of Korea {bogus, kahlil}@mail1.pknu.ac.kr
² Division of Electronic, Computer and Telecommunication Engineering, Pukyong Nat'l Univ., 599-1, Daeyeon3-Dong, Nam-Gu, Pusan 608-737, Republic of Korea [email protected]
Abstract. In this paper we propose a new certified e-mail system which alleviates the computational overhead of mobile devices with limited computing power by using a server-supported signature scheme. Our system is also fault-tolerant and robust against mobile adversary and conspiracy attacks since it distributes secret information to several servers based on threshold cryptography. Keywords: Certified E-mail, Mail security, Secret sharing
1 Introduction

Nowadays e-mail has become an essential communication tool for business as well as academia. Due to easy and convenient communication over e-mail, many people and businesses are moving to on-line transactions, and Internet access is becoming more commonplace everywhere, so e-mail communication will increase tremendously in the near future. However, the Internet does not provide all the services required by the business communication model, such as secure, reliable and fair electronic exchange. Certified e-mail, a value added to an e-mail system, is a different solution from existing secure e-mail systems such as PGP and S/MIME [17]. Although PGP and S/MIME provide authentication, confidentiality and non-repudiation of origin, they do not guarantee fair exchange between two communicating parties. For secure and fair exchange, a certified e-mail system must additionally satisfy the property of fairness: at the end of the exchange, it must be guaranteed that either each party has received what it expects to receive or neither party has received anything useful. In order to achieve fairness, the sender of an e-mail has to be able to prove that the receiver has received it. On the other hand, the receiver has to be able to prove that the sender was the authentic originator of the message [11]. In this paper, we present a new certified e-mail system which is called ROCEM (Robust Certified E-Mail system). One goal of the new system is to reduce
the computational overhead of users in a mobile environment through server-supported signatures [12][18]. Another goal is to provide reliability and security against mobile adversary and conspiracy attacks through threshold cryptography. The rest of the paper is organized as follows. The next section describes the preliminaries that lead to the main idea of the paper. Section 3 outlines our certified e-mail system. We analyze and evaluate the proposed protocol in Section 4. Finally, we conclude in Section 5.
2 Preliminaries

2.1 Server-Supported Signatures

To make it possible for users who have a cellular phone or PDA to send certified e-mail, our proposal uses the server-supported signatures scheme proposed by N. Asokan et al. [12][18]. In this scheme, a server performs digital signatures on users' behalf. It is possible to provide security services such as non-repudiation of both origin and receipt of a signature. Moreover, if the server that signs a message is regarded as a TTP (Trusted Third Party), it can guarantee fair exchange between a sender and a receiver.

2.2 Threshold Cryptosystems
When users depend on a single server for their cryptographic operation, the configuration of the server becomes simple. However, the single server will be a main target of malicious adversaries, and when it is compromised, its whole cryptographic operation must be stopped. In this case, we can make use of secret sharing and threshold cryptosystems to develop a more robust server system. In an (n, t)-threshold signature scheme with n ≥ 2t + 1, there is a server system consisting of n servers and one secret/public key pair for the whole system. In the beginning, a TTP (trusted third party) computes secret shares s_i, 1 ≤ i ≤ n, from the secret key, and securely distributes s_i to each server. Any subset of t + 1 servers out of n can generate a signature with the secret key, but t or fewer servers cannot create a valid signature. To corrupt the whole server system, an adversary has to corrupt at least t + 1 servers and obtain their secret shares [5],[1],[13].
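A toy sketch (ours, not part of the paper) of the (t+1)-of-n secret sharing that such threshold schemes build on; the prime field and parameters are illustrative only.

```python
import random

P = 2**127 - 1  # a Mersenne prime, used here only as a small illustrative field

def share(secret, n, t):
    """Split 'secret' into n shares; any t+1 of them reconstruct it."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t)]
    def f(x):
        return sum(c * pow(x, k, P) for k, c in enumerate(coeffs)) % P
    return [(i, f(i)) for i in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

shares = share(secret=123456789, n=5, t=2)       # n >= 2t+1
assert reconstruct(shares[:3]) == 123456789       # any t+1 = 3 shares suffice
```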
2.3 Certified E-mail

Almost all certified e-mail systems require a TTP as a mediator for fair exchange of e-mail. Recently, many authors, such as G. Ateniese [6], K. Imamoto [8], J. Zhou [7] and B. Schneier [4], have researched certified e-mail systems. Certified e-mail systems can be classified into on-line protocols and optimistic protocols according to the involvement of the TTP. There are some desirable properties for certified e-mail:
– Fairness: Both the sender and the receiver obtain the result each of them desires, or neither of them does.
– Authentication: A communication partner is certainly the intended partner.
– Integrity: In the middle of a protocol, an adversary cannot forge a message.
– Non-repudiation: No party can withdraw its support from a contract after the protocol is over.

Especially, fairness is the most important requirement. For assuring fairness between a sender and a receiver, the system must be robust against a conspiracy attack between a user and a malicious TTP.
Fig. 1. Architecture of ROCEM
3 ROCEM (RObust Certified E-mail) System

3.1 Architecture

Fig. 1 shows the architecture of ROCEM. SIS (Secure Indexing Server) is a trusted authority that issues credentials which are used to support user authentication, and securely saves some information for users' signatures. DMD (Distributed Mail Delivery) is implemented by a set of n MDs (Mail Deliveries) (n ≥ 2t + 1), each running on a separate processor in a network. There is one service public/secret key pair in DMD. It is used for signing a message on users' behalf. The service secret key is not held by any MD for obvious reasons. Instead, n different shares of the service secret key are distributed and stored on each MD, and threshold cryptography is deployed to construct signatures on a message. A user who wants to send certified e-mail sends a request to a single MD in DMD, and that MD becomes the delegate for the user. The delegate must collaborate with SIS and the other n − 1 MDs in DMD to perform a cryptographic operation. In this paper, we assume the following:

– There is an authenticated communication channel between SIS and each MD.
– All users and MDs know the service public key.
– The cryptographic techniques that are used in our proposal are secure.

3.2 Notations
We introduce some notations that are used to describe our protocol:

– S, R : the identities of the sender and receiver, respectively.
– C : the information that explains a message M.
– MD_i : the identity of the i-th Mail Delivery, where 1 ≤ i ≤ n.
– NRT : non-repudiation token. This is signed by DMD.
– SK : a session key for a symmetric cryptosystem. It is used during a single session. The encryption of message M with session key SK is represented as [M]_SK.
– h_X() : a one-way collision-resistant hash function for user X. Users should personalize the hash function; for example, this can always be done by including their unique names as an argument, using h(X, M), where M is a message.
– H(M) : the message digest of a message M using a one-way collision-resistant hash function.
– K_X : a randomly chosen secret key from the range of h_X().
– K_X^i : a user X's (n − i)-th signing key. Based on K_X, the user X computes the hash chain K_X^0, K_X^1, ..., K_X^n, where K_X^0 = K_X and K_X^i = h_X^i(K_X) = h_X(K_X^{i-1}). K_X^n constitutes X's root signing key, the current value of i is the signature counter, and K_X^i is X's current signing key.
– Sig_X(M) : a digital signature on a message M with user X's secret key.
– E_X(M) : an encryption of a message M with user X's public key.
– Cre_X : a user X's credential, which is issued by SIS: Cre_X = Sig_SIS(X, n, K_X^n, SIS).
3.3 Basic Protocol
In this section, we introduce the mail delivery protocol used by ROCEM to send a user's certified e-mail. It is based on the server-supported signature scheme proposed by N. Asokan and X. Ding [12],[18]. In this paper, we assume that each MD_i (1 ≤ i ≤ n) already has its secret share s_i of the service secret key of DMD. Fig. 2 shows the mail delivery protocol for ROCEM.

[Step 0] To participate in ROCEM, each user X randomly generates K_X and computes K_X^n = h_X^n(K_X). X submits the root signing key K_X^n to SIS for a credential. SIS issues a credential for X and publishes X's credential to a directory service.
Fig. 2. Mail Delivery Protocol
[Step 1] A sender S who wants to send certified e-mail hashes a mail message M, and sends S, R, C, H(M), i, K_S^i as [M1] in Fig. 2 to an MD_h (1 ≤ h ≤ n) in DMD.

[Step 2] The MD_h which receives [M1] from S becomes the delegate. It verifies the current signing key K_S^i based on the root signing key in the sender's credential Cre_S, i.e., it checks that h_S^{n-i}(K_S^i) = K_S^n. For generating a candidate NRT, it obtains the signature counter j and signing key K_R^j of the receiver R from SIS. The delegate configures a message consisting of S, R, MD_h, C, H(M), i, j, K_S^i, K_R^j, denoted by α for convenience, and multicasts it to the n − 1 MD_{k≠h} (1 ≤ k ≤ n). Each MD_k including the delegate computes a partial signature PS_{s_k}(α) for α with its secret share s_k. All MD_k except the delegate send their partial signatures to the delegate as responses. For generating a signature of DMD, the delegate needs at least t + 1 correct partial signatures. Therefore, the delegate chooses t + 1 partial signatures and computes SIG_DMD(α). If the computed value is invalid, the delegate tries to compute SIG_DMD(α) again with another set of partial signatures. Finally, the delegate generates the candidate NRT, Sig_DMD(S, R, MD_h, C, H(M), i, j, K_S^i, K_R^j), and then sends it to both the sender and the receiver as [M2] in Fig. 2.

[Step 3] In procedure [M2] of Fig. 2, both the sender and the receiver proceed as follows:
– Mail sender: S verifies the received candidate NRT. If the verification is successful, S computes the next signing key K_S^{i-1} and encrypts it with the service public key of DMD. S sends E_DMD(K_S^{i-1}) with [M]_{K_S^{i-1}} to the delegate as [M3-S] in Fig. 2.
– Mail receiver: In the beginning, R reads C in [M2] in Fig. 2. After reading C, if R wants to receive the certified e-mail from S, R verifies the received candidate NRT. If the verification is successful, R computes the next signing key K_R^{j-1}, encrypts it with the service public key of DMD, and sends E_DMD(K_R^{j-1}) to the delegate as [M3-R] in Fig. 2.
[Step 4] The delegate multicasts E_DMD(K_S^{i-1}) and E_DMD(K_R^{j-1}) to the other MD_{k≠h} (1 ≤ k ≤ n) for decryption of the next signing keys encrypted with the service public key of DMD. Each MD_k including the delegate computes the partial decryptions PD_{s_k}(K_S^{i-1}), PD_{s_k}(K_R^{j-1}) with its secret share s_k. Except for the delegate, all MD_k send their partial decryptions to the delegate as a response. In addition, all MDs in DMD send PD_{s_k}(K_S^{i-1}), PD_{s_k}(K_R^{j-1}) and the identity of the delegate (MD_h) to SIS. SIS stores the received information for resolving a potential dispute. For decryption, the delegate needs at least t + 1 correct partial decryptions. Therefore, the delegate chooses t + 1 partial decryptions and decrypts K_S^{i-1}, K_R^{j-1}. Using the decrypted K_S^{i-1}, the delegate decrypts [M]_{K_S^{i-1}}. Finally, the delegate sends K_S^{i-1}, K_R^{j-1} to SIS. SIS checks the validity of the next signing keys for S and R as follows:
h_S^{n-i+1}(K_S^{i-1}) = K_S^n,   h_S(K_S^{i-1}) = K_S^i
h_R^{n-j+1}(K_R^{j-1}) = K_R^n,   h_R(K_R^{j-1}) = K_R^j
– If the verification is successful, SIS replaces signature counter i by i − 1 for S, and signature counter j by j − 1 for R. SIS stores K_S^{i-1} as the current signing key for S, and K_R^{j-1} for R. Then, SIS sends the delegate a "protocol proceed notification" message.
– If the verification fails, SIS sends the delegate a "protocol fail notification" message.

[Step 5] If the delegate receives the "protocol proceed notification" message, it sends K_R^{j-1} to the sender as [M4-S] and K_S^{i-1}, M to the receiver as [M4-R] in Fig. 2. If the delegate receives the "protocol fail notification" message, the delegate stops the mail delivery protocol.

[Step 6] Finally, S and R perform verification steps as follows:
– Sender: S checks whether the received K_R^{j-1} is the preimage of K_R^j in the candidate NRT. If the check is successful, S obtains the NRT with which R cannot repudiate the receipt of the mail message:

Sig_DMD(S, R, MD_h, C, H(M), i, j, K_S^i, K_R^j), K_R^{j-1}

Finally, S records K_S^i as an already used value by replacing signature counter i by i − 1.
– Receiver: R checks whether the received K_S^{i-1} is the preimage of K_S^i in the candidate NRT and whether the received message M is the preimage of H(M) in the candidate NRT. If the two checks are successful, R obtains the NRT with which S cannot repudiate the sending of the mail message:

Sig_DMD(S, R, MD_h, C, H(M), i, j, K_S^i, K_R^j), K_S^{i-1}

Finally, R records K_R^j as an already used value by replacing signature counter j by j − 1.
If there are any problems during [Step 6], a dispute can occur and a resolution procedure is necessary to resolve it. ROCEM is appropriate for threshold RSA [14],[16], because schemes based on discrete logarithms may require an agreed-upon random number to generate partial signatures [9],[10]. Such schemes can be implemented by adding a new first step, in which the delegate decides a random number based on suggestions from t + 1 MDs and notifies the others, before the servers generate partial signatures. When the system is deployed for mobile users, we suggest that the public exponent of the service public key of DMD be 3, i.e., e = 3, to minimize the computational overhead for them. There are methods to overcome the security weaknesses caused by using a small encryption exponent [3]. By using a small exponent, we can minimize the computational overhead of users who verify a signature or encrypt a message with the service public key of DMD.
3.4 Dispute Resolution
In this section, we classify disputes or attacks into four scenarios and explain how to solve each problem.

Case-1: A sender repudiates the e-mail that he/she sent.
– A receiver submits the NRT and mail message M to an arbiter. Then the arbiter, who works together with SIS, verifies the following:
1. The signature in the NRT by DMD is valid.
2. The current signing key of the sender in SIS is the same as the next signing key in the NRT.
3. The H(M) value in the NRT is the hash value of the mail message M.
– If at least one of these checks fails, the arbiter judges that the sender is correct. However, if all checks are successful, the sender is still allowed the opportunity to repudiate the e-mail by providing a different NRT corresponding to the same current signing key.

Case-2: A sender does not receive [M4-S], which is the proof that the receiver received the corresponding e-mail successfully.
– According to the mail delivery protocol, the delegate MD_h (1 ≤ h ≤ n) performs threshold decryption with the other n − 1 MD_{k≠h} (1 ≤ k ≤ n) after receiving [M3-S] and [M3-R]. When performing threshold decryption, all MDs in DMD send PD_{s_k}(K_S^{i-1}), PD_{s_k}(K_R^{j-1}) and the identity of the delegate (MD_h) to SIS. Therefore, it is impossible for the delegate not to send the decrypted K_S^{i-1}, K_R^{j-1} to SIS, so SIS possesses the correct K_S^{i-1}.
– For resolving the dispute, the sender submits the candidate NRT to an arbiter. The arbiter, who works together with SIS, verifies the following:
1. The signature in the NRT by DMD is valid.
2. The current signing key of the sender in the candidate NRT is a hash of the current signing key of the sender in SIS.
– If these checks are successful, the arbiter judges that the delegate was compromised and maliciously did not send the next signing key of the receiver to the sender. Therefore, the arbiter makes SIS send K_R^{j-1} to the sender.

Case-3: A receiver does not receive [M4-R], which is the proof that the sender sent the corresponding e-mail successfully.
– Basically, the solution for resolving the dispute is the same as in Case-2. If the receiver is correct, the arbiter makes the sender or the delegate send the mail message to the receiver.

Case-4: Fair exchange fails because of a conspiracy between a user (sender or receiver) and the delegate.
– Because of the threshold signature scheme, only when at least t + 1 mail deliveries are compromised is it possible to forge a signature or cause a failure of fair exchange.
– For example, consider a conspiracy attack between a sender and the delegate.
• The sender does not send the encrypted mail message [M]_{K_S^{i-1}} in [M3-S].
That is, the sender only sends E_DMD(K_S^{i-1}) as [M3-S]. The delegate performs threshold decryption to decrypt E_DMD(K_S^{i-1}). The delegate sends [M4-S] to the sender, and sends only K_S^{i-1} as [M4-R] to the receiver, or nothing. Consequently, the sender successfully obtains the NRT for the receiver in spite of not sending the mail message.
• According to [Step 6] in the mail delivery protocol, the receiver requests dispute resolution from an arbiter. The method for resolving this dispute is the same as in Case-2 and Case-3.
Fig. 3. Enhanced Protocol for Confidentiality
3.5 Simple Enhancement for Confidentiality of Mail Messages

The mail delivery protocol introduced in Section 3.3 does not provide confidentiality for the mail message. Therefore, we introduce a simple method for
confidentiality based on the DH key agreement protocol. Fig. 3 shows the enhanced protocol for confidentiality. From now on, we only introduce the parts that change relative to the mail delivery protocol in Section 3.3.

[Step 0] SIS selects a large prime p and a generator g of Z_p^* (2 ≤ g ≤ p − 2), and publishes them to users. Each user X chooses a secret x ∈_R Z_{p-1}, and computes y = g^x mod p to generate a DH key pair. Each user X randomly generates K_X and computes K_X^n = h_X^n(K_X). X submits the root signing key K_X^n to SIS. SIS issues a credential for X: Cre_X = Sig_SIS(X, n, K_X^n, g^x, SIS).

We introduce some additional notations which are used in this section:
– x_i : DH secret key of a user i.
– y_i : DH public key of a user i. That is, y_i = g^{x_i} mod p.
– T_i : local timestamp value of a user i.

[Step 1] S generates a timestamp value T_S based on the local system clock, and sends [M1] in Fig. 3 to an MD_h (1 ≤ h ≤ n) in DMD.

[Step 3] After receiving [M2], S verifies the received candidate NRT. If the verification is successful, S computes a session key SK for secure communication with R by using the DH public key y_R = g^{x_R} mod p in R's credential, S's own DH secret key x_S, and T_S:

SK = H(y_R^{x_S·T_S} mod p) = H(g^{x_R·x_S·T_S} mod p)
S computes the next signing key K_S^{i-1} and encrypts it with the service public key of DMD. S sends E_DMD(K_S^{i-1}), [M]_SK to the delegate as [M3-S].

[Step 4] & [Step 5] The delegate cannot see the mail message M, because it is encrypted with SK, which can be calculated only by S and R. If the delegate receives the "protocol proceed notification" message from SIS, the delegate only sends [M4-S] to S and [M4-R] to R.

[Step 6] R checks whether the received K_S^{i-1} is the preimage of K_S^i in the candidate NRT. If the check is successful, R computes the session key SK for secure communication with S by using the DH public key y_S = g^{x_S} mod p in S's credential, R's own DH secret key x_R, and T_S:

SK = H(y_S^{x_R·T_S} mod p) = H(g^{x_S·x_R·T_S} mod p)

Using SK, R decrypts [M]_SK and checks whether the received message M is the preimage of H(M) in the candidate NRT. If the check is successful, R obtains the NRT with which S cannot repudiate the sending of the mail message, and receives the mail message M.
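A toy sketch (illustrative parameters, not the paper's) of the session key derivation SK = H(g^{x_S·x_R·T_S} mod p) used in the enhanced protocol; the small Mersenne-prime field and generator are placeholders, far from real DH group choices.

```python
import hashlib, secrets, time

p = 2**127 - 1   # toy prime field, for illustration only
g = 5            # toy generator choice

x_S, x_R = 1 + secrets.randbelow(p - 2), 1 + secrets.randbelow(p - 2)  # DH secret keys
y_S, y_R = pow(g, x_S, p), pow(g, x_R, p)                              # DH public keys (in credentials)
T_S = int(time.time())                                                 # sender's timestamp

def H(x: int) -> bytes:
    return hashlib.sha1(x.to_bytes((x.bit_length() + 7) // 8 or 1, "big")).digest()

sk_sender   = H(pow(y_R, x_S * T_S, p))   # S's view
sk_receiver = H(pow(y_S, x_R * T_S, p))   # R's view
assert sk_sender == sk_receiver
```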
Fig. 4. Support for roaming user
3.6 Support for Roaming Users

Fig. 4 shows a conceptual procedure for supporting a roaming user who wants to send a certified e-mail. Users have devices with low computing power and limited battery, such as cellular phones and PDAs. A user who wants to send a certified e-mail connects to the nearest MD_h (1 ≤ h ≤ n) in DMD, and requests support for sending and signing mail messages. When a user hands over into another area, he/she tries to connect to an MD_h in the new area. The delegate that received a request from a user communicates with the other MDs to compute a threshold signature or decryption.
4 Security Evaluation
The security of ROCEM wholly depends on the security of the service secret key of DMD. Therefore, it is possible to use a proactive secret sharing scheme to make ROCEM more secure against a mobile adversary [2],[15] (see the sketch after the property list below). By using proactive secret sharing, we can periodically update the secret share of each MD_h, (1 ≤ h ≤ n) in a secure manner, and recover a compromised MD_h. In [18], the authors introduced a basic solution against denial-of-service attacks, which is also applicable to our scheme. ROCEM guarantees the desirable properties introduced in section 2.3, and provides additional security services:
– Fairness: By using the server-supported signatures scheme, fairness is provided between a sender and a receiver, if the DMD which supports users' signatures is correct.
– Authentication: Users can authenticate each other through the candidate NRT and credential.
– Confidentiality: In section 3.5, we introduced a simple approach for confidentiality.
– Non-repudiation: Through the NRTs of a sender and a receiver, neither party can successfully repudiate his/her own activities.
– Attack against a malicious MD: It is impossible for a single MD to forge or delete a message successfully.
– Attack against conspiracy between a user and an MD: To forge or delete a message successfully, a user must conspire with at least t + 1 MDs.
– Fast revocation: Fast revocation means revoking a user's signing ability. When a user's signing key is compromised, SIS can revoke the user's credential and immediately delete the user-related information held by SIS. Consequently, DMD no longer performs digital signatures on behalf of that user.
– More secure signature: Since DMD digitally signs a message on behalf of users, it is possible to use a stronger RSA key-pair without burdening users with the computational overhead.
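The proactive refresh mentioned at the start of this section can be illustrated with a small Shamir-sharing sketch: each MD_h holds a share of the DMD secret, and the shares are periodically re-randomized by adding a fresh sharing of zero, so the secret stays the same while old (possibly leaked) shares become useless. The field size, threshold and integer secret below are toy assumptions for illustration; they are not the actual parameters or the exact protocols of [2] or [15].

    import secrets

    P = 2**61 - 1          # toy prime field (illustrative)

    def share(secret, t, n):
        """Shamir (t+1)-out-of-n sharing: random degree-t polynomial with f(0) = secret."""
        coeffs = [secret] + [secrets.randbelow(P) for _ in range(t)]
        return {h: sum(c * pow(h, j, P) for j, c in enumerate(coeffs)) % P
                for h in range(1, n + 1)}

    def refresh(shares, t):
        """Proactive refresh: add a fresh sharing of 0 to every share."""
        zero_shares = share(0, t, len(shares))
        return {h: (s + zero_shares[h]) % P for h, s in shares.items()}

    def reconstruct(shares):
        """Lagrange interpolation at x = 0 over any t+1 shares."""
        total = 0
        for i, si in shares.items():
            num, den = 1, 1
            for j in shares:
                if j != i:
                    num = num * (-j) % P
                    den = den * (i - j) % P
            total = (total + si * num * pow(den, -1, P)) % P
        return total

    old = share(secret=123456789, t=2, n=5)     # 3-out-of-5 sharing over 5 MDs
    new = refresh(old, t=2)                     # same secret, unrelated shares
    assert reconstruct(dict(list(new.items())[:3])) == 123456789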
5 Conclusion
A new certified e-mail system with low computational overhead for mobile users is proposed. The scheme is also reliable and secure against mobile adversaries and conspiracy. Our proposal is suitable for users who want to send secure e-mails using a cellular phone or PDA with limited computing power or battery. The communication efficiency and implementation of the proposed scheme will be addressed in future work.
Acknowledgements. This work was supported by the Institute of Information Technology Assessment (IITA) of the Ministry of Information and Communication (MIC).
References
1. A. De Santis, Y. Desmedt, Y. Frankel and M. Yung. "How to share a function securely". In Proceedings of the 26th ACM Symposium on the Theory of Computing, pages 522–533, Santa Fe, 1994.
2. A. Herzberg, S. Jarecki, H. Krawczyk, and M. Yung. "Proactive secret sharing or: How to cope with perpetual leakage". Advances in Cryptology - Crypto'95, the 15th Annual International Cryptology Conference, Proceedings, volume 963 of LNCS, pages 457–469.
3. Alfred J. Menezes, Paul C. van Oorschot, Scott A. Vanstone. "Handbook of Applied Cryptography". CRC Press, 1997.
4. B. Schneier and J. Riordan. "A certified e-mail protocol". 13th Annual Computer Security Applications Conference, pages 100–106, Dec. 1998.
5. D. Malkhi and M. Reiter. "Byzantine quorum systems". Distributed Computing, 11(4):203–213, 1998.
6. G. Ateniese, B. de Medeiros and M. T. Goodrich. "TRICERT: A Distributed Certified E-Mail Scheme". In ISOC 2001 Network and Distributed System Security Symposium (NDSS'01), San Diego, CA, USA, Feb. 2001.
7. J. Zhou and D. Gollmann. "Certified electronic mail". In Computer Security - ESORICS'96 Proceedings, pages 55–61. Springer-Verlag, 1996.
8. Kenji Imamoto, Kouichi Sakurai. "A Certified E-mail System with Receiver's Selective Usage of Delivery Authority". INDOCRYPT 2002, LNCS 2551, pp. 326–338, 2002.
9. L. Harn. "Group oriented (t, n) digital signature scheme". IEE Proceedings - Computers and Digital Techniques, 141(5):307–313, September 1994.
10. M. Cerecedo, T. Matsumoto, H. Imai. "Efficient and secure multiparty generation of digital signatures based on discrete logarithms". IEICE Transactions on Fundamentals of Electronics, Information and Communication Engineers, E76-A(4):532–545, April 1993.
11. M. Franklin and M. Reiter. "Fair exchange with a semi-trusted third party". In Proc. ACM Conference on Computer and Communications Security, 1997.
12. N. Asokan, G. Tsudik, M. Waidner. "Server-Supported Signatures". European Symposium on Research in Computer Security, September 1996.
13. P. Gemmell. "An introduction to threshold cryptography". In CryptoBytes, a technical newsletter of RSA Laboratories, Vol. 2, No. 7, 1997.
14. R. Gennaro, S. Jarecki, H. Krawczyk, and T. Rabin. "Robust and efficient sharing of RSA functions". In Advances in Cryptology - Crypto'96, LNCS 1109, pp. 157–172, 1996.
15. S. Jarecki. "Proactive Secret Sharing and Public Key Cryptosystems". Master's thesis, MIT, 1996.
16. Victor Shoup. "Practical threshold signatures". In Proc. Eurocrypt 2000.
17. William Stallings. "Cryptography and Network Security: Principles and Practice". Second Edition, Prentice-Hall.
18. X. Ding, D. Mazzocchi and G. Tsudik. "Experimenting with Server-Aided Signatures". 2002 Network and Distributed Systems Security Symposium (NDSS'02), February 2002.
Practical Service Charge for P2P Content Distribution
Jose Antonio Onieva 1, Jianying Zhou 1, and Javier Lopez 2
1 Institute for Infocomm Research, 21 Heng Mui Keng Terrace, Singapore 119613
{onieva,jyzhou}@i2r.a-star.edu.sg
2 Computer Science Department, E.T.S. Ingenieria Informatica, University of Malaga, 29071 – Malaga, Spain
[email protected]
Abstract. With emerging decentralized technologies, peer-to-peer (P2P) content distribution arises as a new model for storage and transmission of data. In this scenario, one peer can play different roles, either as a distributor or as a receiver of digital contents. In order to incentivize the legal distribution of these contents and protect the network from free riders, we propose a charging model where distributors become merchants and receivers become customers. To help in the advertisement of digital contents and the collection of payment details, an intermediary agent is introduced. An underlying P2P payment protocol presented in [1] is applied to this scenario without total trust in the intermediary agent.
1 Introduction
A crucial factor in the rapid growth of the Internet is electronic commerce: the ability to advertise goods and services, search for suppliers, compare prices and make payments, all being conducted at the click of a few computer mouse buttons. Nowadays several factors have lit a fire under the peer-to-peer (P2P) movement: inexpensive computing power, bandwidth, and storage. In a P2P architecture, computers that have traditionally been used solely as clients communicate directly among themselves and can act as both clients and servers, assuming whatever role is needed at each moment. The new P2P networking paradigms offer new possibilities for content distribution over the Internet. Customer peers interchange roles with provider peers, and compete in this new networked economy. A major differentiating factor of P2P from traditional content distribution models is the lack of central management and control. This very important characteristic of P2P systems offers the ability to create efficient, scalable, anonymous (when required), and persistent services by taking advantage of the fully distributed nature of the systems.
If a peer distributing contents gets paid for this distribution, why would this peer distribute contents for free? This approach can incentivize legitimate
P2P content distribution, hence avoiding the actual problems of free riders and the legal issues for which P2P networks such as Napster and Gnutella have been strongly criticized [2,3]. Popular software for P2P networking like Napster, Gnutella [4], and Freenet [5] provides everybody with opportunities to exchange low-value digital goods. But potential merchants with low-value goods (i.e., users inside a P2P network) have no future in such a competitive digital world, because collecting payments and advertising their goods is hard compared with the profits expected. For such reasons, new solutions that help the merchant to gain entrance to P2P e-commerce should be designed. Previous work on paid P2P service [6] relies on a fully trusted on-line escrow server, which could be too expensive for those low-value transactions. In this paper, we introduce a P2P service and payment protocol in which the load of the merchant peer is significantly reduced to only the distribution of digital contents, while a weakly trusted intermediary agent is used for the advertisement and collection of (small) payments.
The rest of this paper is organized as follows. In section 2, we sketch the scenario under which we envisage the distribution of digital contents inside a P2P network, and identify the security requirements in that scenario. In section 3, we describe the underlying payment mechanism used in our approach. In section 4, we present our protocol for P2P content distribution and give an informal security analysis. Finally, before concluding the paper, a more practical view of the operations needed by the peers in the content distribution is explained in section 5.
Some basic notation used throughout the paper is as follows.
– M, C, B: merchant peer, customer peer, and broker/bank, respectively
– A, TTP: agent and trusted third party, respectively
– X, Y: concatenation of X with Y
– h(inf): one-way hash function over message inf
– KeyedHash_K(inf): message inf is hashed using a secret key K
– E_K(inf) and D_K(inf): symmetric encryption and decryption of inf
– S_U(inf): digital signature of entity U over message inf
– P_U(inf): encryption of inf using the public key of entity U
– A → B : X: entity A sends message X to entity B
– A ← B : X: entity A retrieves message X from entity B

2 Scenario and Requirements
A market study about P2P commerce is provided in [7], where peers can find evaluation functions and results about the behavior of such a system, permitting them to make decisions in advance. In that study, different parameters are used to evaluate the market such as cost of transportation, popularity of the contents and competitiveness of the peers. In this paper, we intend to reduce costs of transportation, i.e., reduce involvement of the peers in the framework.
Figure 1 shows a general scenario we can find in a P2P application. In this scenario, each peer entity desires to earn some money by selling its files (photos, music, videos etc.). But it is hard for every entity to advertise its goods and manage the (probably small) payments with so many entities. Then such an entity can seek a purchase agent for advertising goods and collecting payments, thus it only needs to provide the digital goods/contents.
Fig. 1. P2P Service and Payment Scenario
We affirm that in a protocol where multiple entities participate, and none of them is totally trusted, i.e. collusion between any pair of them is possible, total fairness cannot be obtained. Nevertheless, we assume that the purchase agent is weakly trusted by the merchant peer in the only sense that collusion with the customer is not possible. Tools and reasons for making this type of collusion harder (although not impossible) can be found in reputation issues and incentive schemes. As an incentive for the participation of the agent in this scenario, it could earn a part of each payment or a monthly percentage of each user's successful transactions. On the other hand, collusion with the merchant peer, or misbehavior of the agent by itself, has to be properly and efficiently treated in our protocol. Provision of evidence to the peers for later dispute resolution would be important to boost P2P e-commerce, where exchanges are carried out between parties that probably have no prior relations and whose identities could be highly volatile. The following properties are desirable in the above P2P service and payment scenario:
1. Confidentiality: The digital goods/contents should be disclosed only to the intended party (i.e. only to customers).
2. Payer anonymity: Payers may prefer keeping their everyday payment activities private, i.e. not allowing payees and in some cases even banks to observe and track their payments. There are two levels of anonymity: untraceability simply means that an adversary cannot determine a payer's identity in a run of a payment protocol; unlinkability means that, in addition, participation of the same player in two different payments cannot be linked.
3. Fairness: The customer cannot obtain the digital goods/contents either from the intermediary agent or from the merchant unless a payment is ensured to the merchant.
4. Timeliness: The transacting parties always have the ability to reach, in a finite amount of time, a point at which they can stop the protocol without loss of fairness.
5. Non-repudiation: It is impossible for a sender peer, after a successful execution of the protocol, to deny having distributed the digital goods. It is impossible for an agent, after a successful execution of the protocol, to deny having received the payment.
6. Light-weight merchant: Since the protocol is run in a P2P scenario, merchant peers should not be overloaded with payment issues.

3 A P2P Payment Protocol
General-purpose electronic payment systems have been widely studied, and can be classified into two categories: cash-like systems and check-like systems. In cash-like systems, special tokens denominated as electronic coins (or cash) are used [8,9]. The payer has previously withdrawn an amount of money in a withdrawal protocol, hence these are pre-paid payment protocols. In check-like systems, the payer usually issues a form (whether it be a check or a credit card slip) to the payee [10]. There is no previous withdrawal of money, and the payee must ensure that the payer possesses enough money to carry out the payment. So consulting the payer's bank is necessary prior to accepting it. Such systems are also denominated as on-line verification payments (e.g. [11]).
An electronic payment system for P2P scenarios was proposed in [1]. In this protocol, three entities are involved: merchant, customer, and broker/bank, who is trusted by the other entities. The same notation used in the original paper is listed below for the understanding of this scheme.
– ID_X: identity of entity X
– K_X: secret key used by entity X (and known only to it)
– SerNum: unique serial number associated with every digital note
– Value: value associated with the digital note
– T0, T1, T2, T3: time of issue, deadline of redemption, start time of refund, and expiry time for the digital note, respectively
Each digital note is prepared by the merchant as follows.
– IDMaterial = ID_M, Value, SerNum, T0, T1, T2, T3
– DigitalNote = IDMaterial, KeyedHash_KM(IDMaterial)
The main advantage of having the merchant create its own digital notes is that it can check double spending before-the-fact without contacting the broker/bank. The merchant peer then transfers digital notes to the broker who
adds a broker stamp such that the stamped digital note format is [DigitalNote, BrokerStamp, SVC].
– BrokerStamp = ID_B, KeyedHash_KB(DigitalNote, ID_B)
– Stamp Verification Code (SVC) = h(BrokerStamp)
Once the broker transfers SVC to the merchant, the stamped digital note is ready for circulation. Whenever the customer peer wants to purchase goods from a particular merchant peer, he approaches the broker and obtains a certain amount of digital cash (issued by that merchant peer and stamped and stored by the broker) using macro payment mechanisms such as credit card payment schemes. The stamped unspent digital note should be kept secret and protected by the entity possessing it, either the broker or the customer. A merchant peer can redeem the value of digital cash before the associated time T1. The merchant peer has to reveal the broker stamp to the broker. If the broker stamp is valid, the broker credits the merchant peer's account and marks the digital note as spent. Similarly, the customer peer can refund an unspent digital note before expiration, that is, after time T2 but before time T3. For the transaction between the merchant and customer peers, the customer sends the stamped digital note excluding the broker stamp to the merchant, who verifies that this SVC is unspent and that the current time is less than T1. Then, a fair exchange protocol is assumed for the exchange of the digital good and the broker stamp. Although the original paper did not provide any detail about the fair exchange, we claim that all the parties involved should have the ability to check the broker stamp validity.
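The note and stamp constructions above can be sketched with standard keyed hashes. HMAC-SHA256 and the field encoding below are assumptions made for this illustration, since [1] does not prescribe concrete primitives.

    import hashlib
    import hmac

    def keyed_hash(key, *fields):
        return hmac.new(key, "|".join(map(str, fields)).encode(), hashlib.sha256).hexdigest()

    def make_digital_note(km, id_m, value, serial, t0, t1, t2, t3):
        id_material = (id_m, value, serial, t0, t1, t2, t3)
        # DigitalNote = IDMaterial, KeyedHash_KM(IDMaterial)
        return id_material, keyed_hash(km, *id_material)

    def stamp(kb, digital_note, id_b):
        # BrokerStamp = ID_B, KeyedHash_KB(DigitalNote, ID_B)
        broker_stamp = (id_b, keyed_hash(kb, digital_note, id_b))
        # SVC = h(BrokerStamp)
        svc = hashlib.sha256(str(broker_stamp).encode()).hexdigest()
        return broker_stamp, svc

    km, kb = b"merchant-secret", b"broker-secret"    # illustrative secret keys
    note = make_digital_note(km, "M1", 5, 1001, "T0", "T1", "T2", "T3")
    broker_stamp, svc = stamp(kb, note, "B1")
    # The merchant circulates (note, svc); the customer later reveals broker_stamp,
    # and anyone can check that h(broker_stamp) equals the published SVC.
    assert hashlib.sha256(str(broker_stamp).encode()).hexdigest() == svc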
4 Our Approach
We design a P2P content distribution and payment protocol in which the load on the peer that plays the role of merchant is significantly reduced, hence motivating the participation of peers in this type of evolving e-commerce. The basic idea is the delegation of the merchant role to the agent during the payment phase. With this change, the customer peer does not need special digital notes for each merchant peer. Instead, he can buy digital contents from several merchant peers by interacting with only one agent who represents these merchant peers. Thus, the payment view of the P2P network changes to a new model (see Figure 2). A prior notation needed for the complete understanding of the protocol is as follows.
– DigitalContent: digital content that the merchant peer M sells to the customer peer C
– descr: description of the digital content that C obtains before starting purchase (i.e. from the agent's web site)
– P_ID: identifier of the digital content and its price
– utsn: unique transaction serial number
– L = (utsn, P_ID): label of the current transaction
– kc: session key generated by the agent and used by M to encrypt the digital content
– Cipher = E_kc(DigitalContent): ciphertext of the digital content encrypted with kc
– dc = h(DigitalContent): digest of the digital content
– IntegritySign = S_M(dc, descr, P_ID): digital content verification code generated by M and available at the agent's web site
– t = P_TTP(A, M, kc): ciphertext of the session key encrypted with the TTP's public key

4.1 P2P Service and Payment Protocol
In our protocol, we assume that each peer (acting as a customer) can set up a secure and confidential channel (SSL or IPSec) with its agent, broker/bank, and the TTP, and that the agent can also establish such a channel with the broker and the TTP. Our protocol consists of a main protocol and two sub-protocols. In the normal situation, only the main protocol will be executed among the customer peer, the agent, and the merchant peer, while the TTP is off-line and not involved. If there is something wrong in a transaction, the agent can initiate the cancel sub-protocol and the customer peer can initiate the resolve sub-protocol to terminate the transaction without loss of fairness.
The agent will prepare the digital notes for the merchants that it represents, and send these digital notes to the broker for stamping. The customer can obtain the stamped digital notes from the broker using a macro payment mechanism (e.g. credit card), and use these digital notes to purchase digital goods/contents from a merchant (via its agent). Suppose the customer C has obtained some digital notes from the broker B. At the beginning, C accesses information on the agent A's web page, and downloads descr, P_ID, and IntegritySign. Then C launches the following P2P service and payment main protocol.
1. C → A : M, L, DigitalNote, SVC
   (A checks, and IF correct follows)
2. A → M : M, L, kc, t, SVC, S_A(M, L, kc, t, SVC)
3. C ← M : A, L, Cipher, dc, t, S_M(A, L, h(Cipher), dc, t, SVC)
4. C → A : BrokerStamp
5. A → C : M, L, kc, S_A(M, L, kc)
At Step 1, the customer C sends a digital note to the agent A. A makes all necessary checks on the digital note before notifying the merchant M at Step 2 that there is a request pending from C. Such checking includes that the current time is earlier than the deadline of redemption T1, and that the digital note has not been spent yet. If correct, A provides M with its signature, which could be used to prove the amount of payment to be credited to M's account (if the transaction is completed), and the session key for encryption of the digital content. A also encrypts the session key kc with the TTP's public key (in order
to reduce the computational load on the merchant peer host). After verifying the purchase request redirected by A, M prepares the encrypted digital content and its signature, which could be used to prove the origin of the digital content. C retrieves the encrypted digital content from M at Step 3. Prior to submitting the broker stamp to A at Step 4, C verifies whether M has committed that the digital content sent in ciphertext is the one C expected. A releases the session key at Step 5 after obtaining the valid broker stamp from C.
If the above protocol is executed successfully, the customer peer obtains kc for the decryption of Cipher, and thus the digital contents, and the agent obtains the broker stamp. Then the agent can send the broker stamp to the broker for redemption. If A does not receive the broker stamp in a pre-determined amount of time before T1, it can launch the following cancel sub-protocol.
4'. A → TTP : A, M, L, SVC, S_A(cancel, A, M, L, SVC)
    IF not resolved THEN
5'. A ← TTP : S_TTP(cancel, A, M, L, SVC)
    ELSE
5'. A ← TTP : BrokerStamp
In such a case, A sends the TTP a cancel request. Then, if the protocol has not been resolved, the TTP verifies A's signature on the request. If correct, the TTP signs a cancel affidavit. If the protocol was resolved by C, the TTP gives A access to retrieve the valid broker stamp. Note that the agent can obtain both the broker stamp and a cancel affidavit, which would result in an unfair situation. Nevertheless, we consider a poll solution to revoke the broker stamp redemption. In this poll solution, the broker has access to the cancel affidavits from the TTP server, and it will execute that operation (searching for fraudulent redemption operations) before redeeming the agent.
If C does not get the session key for decryption of the digital content in the main protocol before time T1, it appeals to the TTP in a resolve sub-protocol.
5'. C → TTP : A, M, L, h(Cipher), dc, t, SVC, S_M(A, L, h(Cipher), dc, t, SVC), BrokerStamp
    IF not cancelled THEN
6'. C ← TTP : kc
    ELSE
6'. C ← TTP : S_TTP(cancel, A, M, L, SVC)
In such a case, C sends to the TTP all the information received from M as well as the broker stamp. If the protocol has not been cancelled, the TTP verifies M's signature and checks whether the hash of the broker stamp equals SVC. If everything is positive, the TTP decrypts t, verifies that the key kc is intended for A and M, and finally stores kc for C's access. If the protocol has been revoked by A, the TTP will provide a cancel affidavit.
Some financial issues should be taken into account. The agent could send all broker stamps to the broker in batch mode. Similarly, the broker could credit the
agent's account in batch mode, allowing an elapsed time for this operation, such that it can retrieve from the TTP all the cancel affidavits and revoke the broker stamp redemption (and hence the agent's bank account credit operation) if needed. None of these financial assumptions seems hard to obtain.
Finally, we would like to state that a complementary design based on a reputation system [12] could help to boost P2P commerce. Reputation is the only mechanism available to peers to evaluate a candidate provider of a requested service in terms of quality, reliability and correctness, and it thus plays a significant role in the selection of agents. So, if a situation arises in which an agent is misbehaving, a reputation network can "mark" this entity, thus preventing the next fraudulent action.

4.2 Dispute Resolution
Disputes can arise, and we show how the resolution with an arbitrator proceeds for all the entities involved in such a dispute.
Origin of digital content: If M denies having sent a particular digital content, then C gives descr, IntegritySign, A, L, P_ID, Cipher, dc, t, kc, SVC, and M's signature to the arbitrator. The arbitrator checks
– descr fits with DigitalContent
– dc = h(DigitalContent)
– IntegritySign is M's signature on (dc, descr, P_ID)
– t = P_TTP(A, M, kc)
– DigitalContent = D_kc(Cipher)
– M's signature on A, L, h(Cipher), dc, t, SVC
If all the above checks are positive, the arbitrator concludes that the digital content is from M . If C receives a wrong digital content, some of the first three checks in the list might be false. However, C can demonstrate the misbehavior of M with IntegritySign. If M can present A’s signature on a different session key kc for the same transaction L, the arbitrator concludes that A is the misbehaving party. Payment received by A: A possible dispute could arise between M and A if the latter did not credit M ’s account after transferring a broker stamp to B for redemption. A could obtain incentives from the merchant peers, and the commission depends on how many successful payments it carries out. However, A may try to keep the entire payment of the digital good. If A denies having completed a transaction (L, SV C), M should present to the arbitrator M, L, kc , t, SV C and A’s signature on it. Then the arbitrator checks the signature and if A cannot present a cancel affidavit signed by the TTP for that transaction (L, SV C), the arbitrator concludes that A completed the transaction and must pay M for it. Note that if A tries to misbehave by completing the transaction and obtaining a cancel affidavit it will eventually succeed. But B will prevent it from crediting A’s account if B obtains the cancel affidavit from the TTP. In this case only
C will benefit. C obtains kc, and thus the digital content, while A is not redeemed. C could spend the same stamped digital note again later, or get it refunded by B.
Invalid broker stamp: If C tries to misbehave by sending an invalid broker stamp, two cases are possible.
– C sends the invalid broker stamp at Step 4. Assuming that A is not going to collude with C, as we discussed in section 2, A will detect an invalid broker stamp using SVC and will reject it.
– C stops the protocol at Step 4, and contacts the TTP to resolve. If the transaction has not been cancelled, the TTP will check, using the SVC signed by M, that the broker stamp provided by C is valid before providing kc to C.
Origin of SVC: Suppose M colludes with C and sends to C at Step 3 an SVC which has already been spent by C. Then C could contact the TTP and successfully get kc with the resolve sub-protocol. Whenever A tries to fetch the broker stamp from the TTP, it will discover M's fraudulent behavior and go to the arbitrator. If M can present A's signature on the same transaction (L, SVC), the arbitrator concludes that the misbehaving party is A and A will have to pay M for the transaction. Otherwise, M is identified as the colluding party.

4.3 Security Analysis
Now we informally analyze whether our P2P service and payment satisfies the requirements described in section 2. – Confidentiality: The digital goods/contents are disclosed only to the intended customer peer. Although the agent knows the deciphering key, it does not have the knowledge about the encrypted digital content if it is transmitted over a private channel (e.g. SSL or IPSec enabled) from M to C. Similarly, the TTP, if involved, cannot get the encrypted digital contents either. – Payer anonymity: Only the first level of anonymity is reached, that is, untraceability. A customer peer never needs to reveal its identity except its IP address for receiving messages during the protocol. – Fairness: As we mentioned before, our fairness is achieved under the assumption of no collusion between the agent and the customer. In the main protocol, fairness will not be lost until Step 3 since neither the agent has obtained the broker stamp nor the customer gets the key to decrypt the digital content. After Step 3, both of them obtain what they expect (digital good and broker stamp) or none of them obtains any valuable information. – Timeliness: After notifying the merchant of a request for purchase at Step 2, the agent has the ability of cancelling the protocol if needed to reach the end of the protocol without breach of fairness. On the other hand, the customer can terminate the protocol at any time before releasing the broker stamp, or initiate the resolve sub-protocol after Step 4.
– Non-repudiation: Proofs of origin of digital contents and payment received by the merchant are discussed in section 4.2. If the digital content provided at the end of a successful execution of the protocol does not fit with the description signed by the merchant on IntegritySign, the customer can obtain the evidence from Step 3 for a dispute resolution. If the agent cheats the merchant by falsely denying receipt of the payment from the customer, the merchant can get the evidence from Step 2 for a dispute resolution. – Light-weight merchant: For each protocol run, the merchant only needs one signature verification and generation. (Although IntegritySign token is also generated by the merchant, this operation can definitely be carried out in an off-line process.) More importantly, the merchant only receives one service request from the agent, and makes the encrypted digital contents available to the customer. The merchant does not need to take care of advertisement and payment.
5 Practical View
In order to give a more practical view of the involvement of the entities in our P2P service and payment protocol, we give an instantiated execution of the protocol. We define a typical P2P scenario where one of the peers tries to purchase a file. This file description and associated advertisement are hosted in an agent server. Note that if an agent advertises similar files (belonging to different peers), an analysis about the competition between different peers in the distribution of the contents should be undertaken. A preliminary study can be found in [7]. We sketch a scenario (see Figure 2) where a previous contract or relation exists between a merchant peer and an agent. This is something totally necessary, since at least, the merchant must register to use the agent’s hosting services. As we analyzed earlier, the merchant peer has a very light participation in the protocol, an important property that will facilitate the involvement of peers distributing, in an exchange for a small amount of money, digital contents over P2P networks. 1. A peer, who is surfing the web, visits http://www.curious-papers.com and once inside, clicks the section “Snakes”. He reads the abstract or description descr and decides to buy it. So he pushes the button “buy it”. This operation forms a transaction label L, downloads the content verification code IntegritySign token, and uploads a valid stamped digital note (excluding broker stamp). 2. The agent’s server verifies the validity of the digital note, and checks that SV C is unspent. If correct, it redirects this request to another peer who owns the paper, along with the product information received from the customer peer (L), the key needed to encrypt the contents (kc ), and the fingerprint of the broker stamp (SVC). 3. The merchant peer prepares the encrypted version of the paper (Cipher), and generates a signature. Then it notifies the customer peer to retrieve.
Fig. 2. Application Scenario
4. The customer peer downloads the cipher paper and the merchant peer’s signature. An add-in component in the customer peer’s browser verifies the merchant peer’s signature and the fingerprint of the broker stamp. If correct, the customer peer is asked for approval of the description signed by the merchant peer by pressing the button “OK”. 5. Then the customer peer’s computer sends the broker stamp to the agent. A window of notification should pop up to advise the customer that once the broker stamp is sent, it will be the non-return point of the transaction. The add-in component waits for the session key. 6. After receiving the broker stamp, the agent proceeds to send the session key. At the customer peer side, the add-in component will verify the agent’s signature, decrypt the cipher paper, and display the paper to the customer. 7. If the agent does not receive the broker stamp within a determined time (depending on the security policy) a cancel sub-protocol can be launched, obtaining either a cancel affidavit for the digital note or the valid broker stamp from the TTP. 8. If the customer peer does not receive the session key within a determined time, the add-in component redirects a request to the TTP in order to resolve the protocol. If the session key is received, the add-in component will decrypt the cipher paper and display the paper to the customer. If a TTP signed cancel affidavit is received, the add-in component will pop up a window to notify the customer that the transaction has been cancelled. Redemption and refund phases proceed according to the underlying P2P payment protocol. The broker has the ability of cancelling the redemption phase as stated before.
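To make steps 3 to 6 of this walkthrough more concrete from the customer side, the following sketch uses the pyca/cryptography package; RSA PKCS#1 signatures and Fernet symmetric encryption are stand-ins chosen for this example only, not algorithms mandated by the protocol, and the transaction fields are placeholders.

    import hashlib
    from cryptography.fernet import Fernet
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import rsa, padding

    # Merchant side (step 3): encrypt the content with kc and sign the commitment.
    merchant_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    kc = Fernet.generate_key()                     # session key chosen by the agent
    content = b"digital content"
    cipher = Fernet(kc).encrypt(content)
    dc = hashlib.sha256(content).hexdigest()
    commitment = b"|".join([b"A", b"L", hashlib.sha256(cipher).digest(),
                            dc.encode(), b"t", b"SVC"])
    signature = merchant_key.sign(commitment, padding.PKCS1v15(), hashes.SHA256())

    # Customer-side add-in (steps 4 and 6): verify the commitment, then decrypt
    # once kc arrives, and check that the plaintext matches the signed digest dc.
    merchant_key.public_key().verify(signature, commitment,
                                     padding.PKCS1v15(), hashes.SHA256())
    assert hashlib.sha256(Fernet(kc).decrypt(cipher)).hexdigest() == dc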
6 Conclusion
With the emergence of wireless technology, grid computing, and other technologies where the storage and transmission of data are carried out without a centralized server, it is clear that new models of charging and distribution should not only comply with the requirements of this new topology but also provide an efficient and practical solution. In this paper, we introduced a new entity that, without being totally trusted, acts as a hub of the topology, helping the distributors in the collection of possibly small payments and the advertisement of the digital contents. We made use of an underlying P2P payment protocol and applied it to our practical P2P content distribution scenario, where a merchant peer's workload is largely shifted to an intermediary agent, so that each peer can be easily involved in distributing digital contents and receiving payment via the agent. We also discussed the trustworthiness presumed for each of the entities in our model.
References
1. Anantharaman, L., Bao, F.: An efficient and practical peer-to-peer e-payment system. Manuscript (2002)
2. Adar, E., Huberman, B.: Free riding on Gnutella (2000)
3. Golle, P., Leyton-Brown, K., Mironov, I., Lillibridge, M.: Incentives for sharing in peer-to-peer networks. Lecture Notes in Computer Science 2232 (2001) 75–87
4. http://www.gnutella.com
5. http://freenet.sourceforge.net
6. Horne, B., Pinkas, B., Sander, T.: Escrow services and incentives in peer-to-peer networks. In: Proceedings of the 3rd ACM Conference on Electronic Commerce, ACM Press (2001) 85–94
7. Antoniadis, P., Courcoubetis, C.: Market models for P2P content distribution. In: AP2PC'02 (2002)
8. Boly, J.P., Bosselaers, A., Cramer, R., Michelsen, R., Mjolsnes, S.F., Muller, F., Pedersen, T.P., Pfitzmann, B., de Rooij, P., Schoenmakers, B., Schunter, M., Vallee, L., Waidner, M.: The ESPRIT project CAFE – high security digital payment systems. In: ESORICS (1994) 217–230
9. Rivest, R.L., Shamir, A.: PayWord and MicroMint: Two simple micropayment schemes. In: Security Protocols Workshop (1996) 69–87
10. Asokan, N., Janson, P.A., Steiner, M., Waidner, M.: The state of the art in electronic payment systems. IEEE Computer 30 (1997) 28–35
11. Bao, F., Deng, R., Zhou, J.: Electronic payment systems with fair on-line verification. In: IFIP TC11 16th Annual Working Conference on Information Security: Information Security for Global Information Infrastructures, Kluwer Academic Publishers (2000) 451–460
12. Damiani, E., Vimercati, S.C.D., Paraboschi, S., Samarati, P., Violante, F.: A reputation-based approach for choosing reliable resources in peer-to-peer networks. In Atluri, V., ed.: Computer and Communications Security, ACM (2002) 207–216
ICMP Traceback with Cumulative Path, an Efficient Solution for IP Traceback
Henry C.J. Lee, Vrizlynn L.L. Thing, Yi Xu, and Miao Ma
Institute for Infocomm Research, 21 Heng Mui Keng Terrace, Singapore 119613
{hlee, vriz, yxu, miaom}@i2r.a-star.edu.sg
Abstract. DoS/DDoS attacks constitute one of the major classes of security threats in the Internet today. The attackers usually use IP spoofing to conceal their real location. The current Internet protocols and infrastructure do not provide intrinsic support to traceback the real attack sources. The objective of IP Traceback is to determine the real attack sources, as well as the full path taken by the attack packets. Different traceback methods have been proposed, such as IP logging, IP marking and IETF ICMP Traceback (ITrace). In this paper, we propose an enhancement to the ICMP Traceback approach, called ICMP Traceback with Cumulative Path (ITrace-CP). The enhancement consists in encoding the entire attack path information in the ICMP Traceback message. Analytical and simulation studies have been performed to evaluate the performance improvements. We demonstrated that our enhanced solution provides faster construction of the attack graph, with only marginal increase in computation, storage and bandwidth.
1 Introduction
The Internet is increasingly becoming the pervasive means of communications for all media. At the same time, this has also generated many security problems. In this paper, we look at the issues relating to Denial-of-Service (DoS) [1] and Distributed DoS (DDoS) attacks. In a DoS attack, a huge quantity of malicious packets is typically generated and directed towards one or many victims. DDoS is a variation of DoS in that the attacker launches an attack not from one single source, but from several sources that the attacker has already penetrated. As a result, legitimate data traffic is disrupted, servers are compromised, and services are denied to the legitimate users. In such attack scenarios, attackers usually send packets with spoofed IP addresses so as to hide their true network location from the victims and the network infrastructure.
The IP [2] packet contains two addresses: source and destination. The destination address is used by the routing architecture to deliver the packet. The IP network routing infrastructure does not verify the authenticity of the source address carried in IP packets. The source address is used by the destination host to determine the source for message reply. In general, no entity is responsible for the correctness of the source address. The scenario is the same as sending a letter using the postal service; the postal service does not care about the correctness or authenticity of the source address, it merely makes sure that the letter is delivered to the correct destination. Consequently, the design of the IP protocol and forwarding mechanism makes it difficult to identify the real origin of a
packet. This characteristic of the Internet is exploited by some malicious users to hide their source and identity. Some mechanisms such as "Ingress filtering" [3] have been proposed to enforce the validity of source IP addresses originating from a stub network. However, such mechanisms are quite limited, as they can only be used in edge networks and their universal enforcement is difficult. The objective of IP Traceback is to determine the actual origin of IP packets, so as to institute accountability. Several approaches have been proposed to address this issue. IP logging has been proposed in [5], where the intermediate routers log the passage of all IP packets. The log information is then stored in the routers. The combination of the logs from the various routers can then be used, if necessary, to trace the path taken by any IP packet. IP marking has been proposed in [6], where the intermediate routers add information derived from their addresses into the IP packets (e.g. in the Identification field) with a certain probability. The victim of an attack can thus examine this information found in the attack packets so as to construct the path taken, which eventually leads to the true attack origin. ICMP Traceback has been proposed in [4], where intermediate routers probabilistically generate an ICMP Traceback message for a forwarded IP packet, and send the ICMP Traceback message to the same destination as the IP packet. The victim of an attack can thus use the received ICMP Traceback messages to construct the attack path.
This paper is organized as follows: section 1 gives an introduction to the paper. Section 2 describes the various current approaches for IP Traceback, namely IP logging, IP marking and ICMP Traceback. Section 3 describes our proposed enhancement to the ICMP Traceback messages. Section 4 describes the comparisons between our approach and the ICMP Traceback. Section 5 describes simulation studies and results. Section 6 concludes the paper.
2 Background
The challenge of IP Traceback is to find an efficient and scalable way to track the source of an arbitrary IP packet. The source can be an Ingress point to the traceback-enabled network, the actual host or network of origin, or compromised routers within the enabled network. It depends on the extent to which the traceback framework is deployed. In an attack, it is possible that some routers may be subverted, hence there is a need to construct the attack path, which comprises the routers traversed by packets from the “source” to the victim. In the case of a DDoS attack where packets come from potentially many secondary sources, there’ll be many attack paths. The attack graph is defined as the set of attack paths. The objective of the IP traceback mechanism is to construct the attack graph with the constraint that it should minimize the time that routers spend on tracking and minimize the storage used to keep the tracking information. Lastly, the solution should not adversely impact the privacy of legitimate users. There are two main approaches to perform traceback: infrastructure scheme and end host scheme. In the first approach, infrastructure scheme, the network is responsible for maintaining the traceback state information necessary for the victim and the network to construct the attack graph. IP logging scheme belongs to this category. In the end host scheme, the end hosts, which are the potential victims, maintain the traceback state information. IP marking and ICMP Traceback belong to this category.
2.1 IP Logging
In this approach, the network routers log the passage of all IP packets. The key challenge here lies in the potentially huge information storage requirement. For example, if a router were to log all packets in their entirety, each OC-192 link at 1.25 GB/s at the router would require 75 GB of storage for a 1-minute query buffer. The storage requirement quickly becomes prohibitive as the number of router links increases. One solution, SPIE (Source Path Isolation Engine) [5], has been proposed for IP version 4. The mechanism is designed to identify the true source of a particular IP packet given a copy of the packet to be traced and an approximate time of receipt. In order to take care of packet transformations as packets are routed from source to destination, the mechanism identifies the invariant portions of the 20-byte IPv4 header. The fields that are susceptible to change include the TOS (Type of Service), TTL (Time to Live), Checksum and Options fields. The logging is based on the invariant portion of the IP header and the first 8 bytes of payload. Based on statistics collected, the 28-byte prefix described above results in a collision rate of approximately 0.00092% in a WAN environment and 0.139% in a LAN environment. To further reduce the storage requirement, instead of storing the entire 28-byte prefix, hashing is performed on it, followed by Bloom filter processing. The scheme reduces the memory storage requirement in the router to 0.5% of link bandwidth per unit time. It also maintains privacy and prevents eavesdropping on legitimate traffic streams.
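A sketch of the digesting step might look as follows: hash the 28-byte invariant prefix and set a few positions in a Bloom filter. The filter size, number of hash functions and use of SHA-256 are illustrative assumptions for this example, not the parameters of [5], and one byte per filter position is used here instead of a real bit array for simplicity.

    import hashlib

    FILTER_BITS = 2 ** 20      # illustrative Bloom filter size
    NUM_HASHES = 3             # illustrative number of hash functions

    def packet_digest(invariant_header, payload):
        """Digest = invariant IPv4 header fields + first 8 bytes of payload."""
        return invariant_header + payload[:8]

    def bloom_positions(digest):
        for i in range(NUM_HASHES):
            h = hashlib.sha256(bytes([i]) + digest).digest()
            yield int.from_bytes(h[:8], "big") % FILTER_BITS

    def record(bloom, digest):
        for pos in bloom_positions(digest):
            bloom[pos] = 1

    def seen(bloom, digest):
        return all(bloom[pos] for pos in bloom_positions(digest))

    bloom = bytearray(FILTER_BITS)
    d = packet_digest(b"\x45\x00" + b"\x00" * 18, b"payload-bytes")
    record(bloom, d)          # logging as the packet is forwarded
    assert seen(bloom, d)     # later traceback query for the same packet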
2.2 IP Marking
The intermediate routers mark the IP packets with additional information so that the victim can use it to determine the attack path. Approaches proposed include node append, node sampling and edge sampling [6]. The node append mechanism is similar to the IP Record Route option [2], in that the addresses of the successive routers traversed by an IP packet are appended to the packet. The victim can thus easily trace back the source of such attack packets. However, this method introduces very high overhead in terms of router processing and packet space. The node sampling approach reduces such overhead by probabilistic marking of IP packets. The edge sampling approach, as its name implies, marks an edge of the network topology traversed by the IP packets, instead of just the node. Most proposed algorithms put the marking information in the Identification field of the IP header. This type of mechanism has an inherent disadvantage in that it affects the format of IP packets. The necessary changes in the IP packet format depend on the algorithm used. The standardization of a format for IP marking becomes an issue.
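A minimal sketch of the edge-sampling idea from [6] is shown below; the marking probability and the representation of the mark as a small dictionary are illustrative simplifications (the real scheme compresses this state into the 16-bit Identification field).

    import random

    MARK_PROBABILITY = 0.04   # illustrative marking probability

    def mark(packet, router_addr):
        """Edge sampling: probabilistically start a mark, otherwise extend or age it."""
        if random.random() < MARK_PROBABILITY:
            packet["mark"] = {"start": router_addr, "end": None, "distance": 0}
        elif "mark" in packet:
            m = packet["mark"]
            if m["distance"] == 0:
                m["end"] = router_addr       # this router closes the sampled edge
            m["distance"] += 1               # later routers only increment the hop count
        return packet

    pkt = {"payload": b"..."}
    for hop in ["R1", "R2", "R3", "R4"]:     # routers on the path to the victim
        pkt = mark(pkt, hop)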
2.3 ICMP Traceback (ITrace)
In the ICMP Traceback mechanism, a new ICMP message type, ICMP Traceback (ITrace), is defined to carry information on the route that an IP packet has taken. Whereas IP Marking requires overloading some fields in the IP header, which raises backward compatibility problems, ICMP Traceback uses out-of-band messaging to achieve packet tracing.
As an IP packet passes through a router, an ICMP Traceback message (ITrace) [4] is generated with a low probability of about 1/20000. Assuming that the average diameter of the Internet is 20 hops, this probability value translates to a net increase in traffic of about 0.1%. This ITrace message is then sent randomly, with equal probability, to the destination or to the origin of the IP packet. In the event of a DoS/DDoS attack, the destination node can then use it to trace back the attack path. On the other hand, the ITrace message provides information for the origin to decipher reflector attacks. When a router generates an ITrace message, it may generate one of the following: a back link, a forward link, or both. Each link element defines a link along which the packet has travelled or will travel. The link element comprises 3 components: the interface name at the generating router, the source and destination IP addresses of the link, and finally a link-level association string that is used to tie together Traceback messages emitted by adjacent routers. On LANs, this string is constructed by concatenating the source and destination MAC addresses of the two interfaces. Finally, each ITrace message contains a variable-length RouterID field.
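The per-packet generation decision can be sketched as follows; the message is represented as a simplified dictionary with illustrative field names rather than the actual ITrace message format.

    import random

    ITRACE_PROBABILITY = 1.0 / 20000   # roughly one ITrace message per 20,000 packets

    def maybe_generate_itrace(packet, router_id, prev_hop, next_hop):
        """Probabilistically emit a (simplified) ITrace message for a forwarded packet."""
        if random.random() >= ITRACE_PROBABILITY:
            return None
        return {
            "router_id": router_id,
            "back_link": (prev_hop, router_id),      # link on which the packet arrived
            "forward_link": (router_id, next_hop),   # link on which the packet leaves
            "traced_packet": packet,                 # (part of) the traced packet
            # Sent, with equal probability, towards the packet's destination
            # or back towards its claimed origin.
            "sent_to": random.choice([packet["dst"], packet["src"]]),
        }

    msg = maybe_generate_itrace({"src": "10.0.0.7", "dst": "192.0.2.1"}, "R5", "R4", "R6")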
3 ICMP Traceback with Cumulative Path (ITrace-CP)
The current IETF ITrace proposal allows routers to generate ITrace messages for the source and destination of IP packets. In the context of DoS/DDoS attacks, the victims can make use of the received ITrace messages to construct the attack paths and ultimately identify the attackers. Since each ITrace message only carries one or two links of the entire path, the victim will have to re-construct the various attack paths from the various segments. The task will be especially difficult in the event of a DDoS attack. This attack graph construction procedure would be facilitated if the ITrace messages were made to carry the entire path information from the routers nearest to the attackers all the way to the victim. With this approach, the victim will only need to identify the attack packets in order to establish the entire attack path or attack graph. In the following section, we propose and analyze various solutions to encode the path traversed by the attack packets into the ITrace message. Our enhancement only applies to the ITrace messages sent to the destination address.
A simple approach would be to generate the ITrace message with the IP Record Route option, so that the subsequent routers will append their addresses to the ITrace message. However, this approach has some drawbacks. Firstly, the ITrace message may not take the same path as the corresponding attack packet to the victim. In this case, the ITrace message will record the wrong route information. Furthermore, the Record Route option is limited to 9 routers. This is because the header length field of the IP header is a 4-bit field, limiting the entire IP header to 60 bytes. Since the fixed part of the IP header is 20 bytes, and the RR option uses 3 bytes of overhead, this leaves 37 bytes for the list, allowing up to 9 IP addresses. As such, it may have been sufficient in the early days of the ARPANET, but it is of limited use given the extent of the Internet today. Last but not least, most hosts/routers ignore or discard this option.
Our approach constructs the ITrace message in a different way. Instead of encoding the path information in the IP packet header's Record Route option, we use an enhanced ITrace message, called the ICMP Traceback with Cumulative Path (ITrace-CP) message, to store the path
128
H.C.J. Lee et al.
information. When a router receives an IP packet, it generates an ITrace-CP message with a certain probability. However, instead of sending the ITrace-CP message to the destination address of the IP packet, it is sent to the next hop router. This "next hop" should be, as far as possible, the same as the next hop for the corresponding IP packet. The ITrace-CP packet will also contain as much of the IP packet as possible, including the final destination address. In addition, the ITrace-CP message should be sent after the corresponding IP packet. At the next hop router, the router will process the ITrace-CP message as follows. There are two possibilities:
1. If the ITrace-CP packet is forwarded to the same router as the corresponding IP packet, then the router will generate a new ITrace-CP message and append its own IP address. The new ITrace-CP is then forwarded to the next hop router of the corresponding IP packet.
2. Otherwise, the router that processes the ITrace-CP will generate a new ITrace-CP message for the final destination, without making any changes to the payload.
As a result, full or partial path information is stored in the ITrace-CP message when it reaches its destination. The problem now is how to identify corresponding IP and ITrace-CP messages. The simplest way is for the routers to store the IP packets for a short duration and compare them to the received ITrace-CP messages. However, if we take the example of a router with 16 OC-192 links at 1.25 GB per second, this would translate to a storage requirement of 2 GB for 100 ms of buffering. We propose three schemes to reduce the storage requirement for matching corresponding IP packets and ITrace-CP messages. In the subsequent analysis, we use 100 ms as the upper bound for the inter-arrival time between an IP packet and its corresponding ITrace-CP message, if any. We will also use the same router configuration as above and assume that the average IP packet size is 256 bytes.
3.1 Scheme 1: Basic Packet Identification (BPI)
Typically, the source of an IP packet sets the Identification field to a value that must be unique for that source-destination pair and protocol for the time the packet will be active in the Internet. Hence, the value of the Identification field can be used, together with the source and destination addresses and the protocol number, to uniquely identify an IP packet in a short time window. In order to take care of possible fragmentation, the flags and fragment offset fields of the IP packet can also be included for packet identification. In this way, if fragmentation did occur, the victims will be able to construct the paths taken by the fragmented packets and the paths taken by the non-fragmented packets, and link them together through the Identification field mentioned earlier, which uniquely identifies the packet stream.
3.2 Scheme 2: Hash-Based Packet Identification
In this approach, instead of storing the BPI of a packet, the routers determine the hash of the BPI to reduce the storage requirement. The Hash function used must satisfy the
following requirements. Firstly, the function must distribute a highly correlated set of input values (i.e. the BPI information) as uniformly as possible over the hash function's output space. Secondly, the hash function should be computationally efficient so as to minimize the computational overhead. A 16-bit hash will result in a collision rate of less than 0.002%. For our router configuration, 100 ms of hashes will require a buffer size of less than 16 MB.
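A 16-bit digest over the BPI fields, as used in scheme 2, could be computed as below; SHA-256 truncated to 16 bits and the field encoding are assumed choices for this illustration, not the hash specified by the authors.

    import hashlib

    def bpi(identification, src, dst, protocol, flags=0, frag_offset=0):
        """Basic Packet Identification: the header fields that identify an IP packet."""
        return f"{identification}|{src}|{dst}|{protocol}|{flags}|{frag_offset}"

    def bpi_hash16(identification, src, dst, protocol, flags=0, frag_offset=0):
        """16-bit digest of the BPI, as stored in the router's short-lived buffer."""
        digest = hashlib.sha256(bpi(identification, src, dst, protocol,
                                    flags, frag_offset).encode()).digest()
        return int.from_bytes(digest[:2], "big")    # keep only 16 bits

    h = bpi_hash16(0x1C46, "10.0.0.7", "192.0.2.1", 6)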
3.3 Scheme 3: Hash-Based Packet Identification with Indicator Bit
In addition to scheme 2, this approach sets a bit in the IP packets to indicate that an ITrace-CP message has been generated for a specific IP packet. One possibility is to use the first bit of the 3-bit flags field, which is currently unused. This reduces the need to keep those packets where the bit is not set, or their hashes, in downstream routers, hence significantly reducing the storage and processing requirement. The disadvantage is that some changes to the IP packet processing are required. Let ITRACE_CP_DONE be the bit in the IP packet that indicates whether an ITrace-CP message has been generated for it. The pseudo code of the algorithm at each router is as follows.

    For each IP packet received at a router:
        If (ITRACE_CP_DONE is set) then {
            Calculate the packet's hash and store it in the buffer
                (this hash will be kept for 100 ms)
            forward the IP packet to R (next hop router)
                (record R in the buffer)
        } else {
            generate an ITrace-CP message with a probability p
            set the ITRACE_CP_DONE bit in the IP packet
            forward the IP packet to R (next hop router)
            send the new ITrace-CP message to R
        }
If an ITrace-CP packet has not been received within 100 ms for an IP packet that has been stored in the buffer, it is possible that the ICMP packet is routed differently from the IP packet, or that the inter-arrival time between the IP packet and its corresponding ITrace-CP is higher than 100 ms. In all cases, a new ITrace-CP message will be generated. Using the bit information, and assuming that the maximum number of hops traversed by the packets is 20, the hash storage requirement is reduced to 16 KB (16 MB * 20 / 20000). However, this scheme is vulnerable to another form of exploitation. If the attacker artificially sets this bit in all attack packets, then ITrace messages will be generated at the first router for all the packets and all the subsequent routers will construct the cumulative path. Although this mechanism worsens the DoS/DDoS attacks by doubling the attack traffic, the victims will be able to detect the attack, construct the attack graphs and determine the true sources almost instantaneously.
4 Comparison of ITrace-CP with ITrace
We compare the ITrace-CP and ITrace mechanisms in terms of computation, bandwidth and storage overheads. Firstly, in terms of bandwidth, the overhead is minimal
as the additional information carried is the IP addresses of the intermediate routers. Assuming that the average path length is 10, the additional information carried is only 40 bytes per ITrace-CP message. Given the infrequency of the messages, the overhead remains minimal even if more router information needs to be included. In terms of storage, each router only needs to provide less than 16 MB (for ITrace-CP scheme 2) for an inter-arrival time of 100 ms between an IP packet and its corresponding ITrace-CP message. If a bit is used to mark the IP packet (ITrace-CP scheme 3), the storage overhead is only 16 KB. In terms of computation, the hash function introduces minimal overhead. In summary, the additional overheads of ITrace-CP are small compared to the ITrace scheme, and similarly small compared to other IP traceback proposals such as IP logging and IP marking. However, the ITrace-CP scheme performs much better than the ITrace scheme in its ability to trace back the attack source quickly, because more information on the attack path is carried inside each ITrace-CP message. We now consider a network scenario where the attack path comprises L routers. Let p be the probability of generating an ITrace or ITrace-CP message. We determine their respective performances in attack path construction. The performance metric is the probability that the full attack path can be constructed with a given number of attack IP packets (N). For the ITrace-CP scheme, the entire attack path can be constructed by the victim when the router furthest from the victim generates at least one ITrace-CP message. Hence, the probability PE that the full path can be constructed after the victim has received N IP packets is given by (1). Note that PE is independent of the path length L.

PE = 1 − (1 − p)^N   (1)
For the basic ITrace scheme, each ITrace message can contain the forward link, the back link, or both links. For simplicity, we assume that an ITrace message with either the forward or the back link enables the victim to discover two router addresses on the attack path, whereas an ITrace message with both links enables the victim to discover three. Let PB1 and PB2 denote the full-path construction probabilities for ITrace (forward or back link) and ITrace (both links), respectively:

PB1 = (PE)^(L/2)   (2)

PB2 = (PE)^(L/3)   (3)
Figures 1 and 2 plot the probability of path construction as a function of the number of IP packets received. With 20,000 attack packets, ITrace-CP has a 63% chance of constructing the entire path, versus ITrace (both links), which has chances of 47%, 22%, and 10% for 5-, 15-, and 20-hop attack paths respectively. With 50,000 packets, the probabilities are 92% for ITrace-CP, 87% for 5-hop ITrace, 65% for 15-hop ITrace, and 57% for 20-hop ITrace. The figures show clearly that the ITrace-CP mechanism requires far fewer packets to construct the entire attack path than ITrace.
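As a rough check, expressions (1)-(3) can be evaluated directly. The short calculation below is ours, not part of the paper; it simply plugs in p = 1/20,000.

p = 1.0 / 20000          # ITrace / ITrace-CP generation probability

def p_cp(N):             # eq. (1): ITrace-CP, independent of L
    return 1 - (1 - p) ** N

def p_b1(N, L):          # eq. (2): ITrace with forward or back link
    return p_cp(N) ** (L / 2)

def p_b2(N, L):          # eq. (3): ITrace with both links
    return p_cp(N) ** (L / 3)

for N in (20000, 50000):
    print(N, round(p_cp(N), 2),
          [round(p_b2(N, L), 2) for L in (5, 15, 20)])
# p_cp is about 0.63 for N = 20000 and about 0.92 for N = 50000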
Fig. 1. Performance comparison between ITrace-CP and ITrace (forward link): probability of path construction versus the number of packets N.
Fig. 2. Performance comparison between ITrace-CP and ITrace (forward and back links): probability of path construction versus the number of packets N.
5 Simulation Studies
Our simulation studies evaluate the effectiveness of the ITrace-CP and ITrace (forward or back link, and both links) mechanisms in terms of the time taken to establish the attack graphs in the event of DoS and DDoS attacks. We use the ns-2 network simulation software to
model both traceback mechanisms. We constructed agents for the attacker, the routers, and the victim.

5.1 Simulation Model
As discussed earlier, we assume that, for ITrace with the forward or back link, two routers are detected per ITrace message, and three routers in the case of the both-links encoding option. For the ITrace-CP scheme, the number of routers detected is the number of router addresses encoded in the message. Since we are comparing the schemes in terms of the time taken to establish the attack graphs, evaluation based on false positives was not considered; therefore, only attack traffic was generated and hash collisions were not simulated. In this simulation, each router generates ITrace or ITrace-CP messages with a probability of 1/20,000 (determined by a random number generator) on the attack traffic it receives. When the victim receives these messages, it discovers the intermediate routers of the attack graphs. The time taken to detect various numbers of routers on the attack graphs was recorded.

5.2 Network Scenario
We performed simulation studies on the two schemes using a linear network topology, with attackers situated 5, 15, and 20 hops away from the victim. The tree topology was not simulated because, in the case of multiple attackers at the leaves of the tree, they would be treated as independent attack paths, which would be equivalent to simulating the linear topology. For example, in Figure 3, if an attacker sends packets through routers 3 and 2 to the victim at node 1, and ITrace messages are generated by router 2 based on these packets, router 2 should not be treated as having been detected on the attack path of another attacker sending packets through routers 4 and 2.
Fig. 3. Tree topology
In all simulation scenarios, the effective attack traffic arriving at the victim is 1 Mbit/s. However, as we are interested in the relative performance of the two schemes, this number is only indicative.

5.3 Results
The average times (for 30 runs) for the construction of various hops of the attack path for the ITrace (forward link and both links) and ITrace-CP were obtained. The graphs were
plotted and shown in Figures 4 to 6. In all the figures, the x-axis represents the number of hops of the attack path discovered, while the y-axis represents the time taken in seconds. The three figures correspond to attack paths of length 5, 15, and 20, respectively. In Figure 4, the performance of the ITrace-CP scheme improves over ITrace for the forward-link option but not the both-links option. The average times taken to detect the full path were 27 s, 19 s, and 23 s for the ITrace forward-link option, the ITrace both-links option, and the ITrace-CP scheme, respectively.
Fig. 4. Average time taken to detect various numbers of hops (5-hop attack path)
In Figure 5, the performance of the ITrace-CP scheme becomes better than the ITrace schemes starting from the detection of 4 hops of the attack path. The window from 10 to 13 hops shows significant improvement by the ITrace-CP scheme, with the peak improvement at the detection of the 13th hop. However, the high average time taken to detect the 14th and 15th hops reduces the improvement. The average times taken to detect the full path were 40 s, 27 s, and 22 s for the ITrace forward-link option, the ITrace both-links option, and ITrace-CP, respectively. In Figure 6, the performance of the ITrace-CP scheme becomes better than the other two schemes starting from the detection of 4 hops of the attack path. The low gradient of the ITrace-CP curve from the 1st to the 14th hop indicates that about 14 hops of the attack path can be detected within roughly the same average time. The window from 10 to 18 hops shows significant improvement by the ITrace-CP scheme, with the peak improvement at the detection of the 18th hop. However, the high average time taken to detect the 19th and 20th hops reduces the improvement. The average times taken to detect the full path were 39 s, 24 s, and 18 s for the ITrace forward-link option, the ITrace both-links option, and ITrace-CP, respectively.
Fig. 5. Average time taken to detect various numbers of hops (15-hop attack path)
Fig. 6. Average time taken to detect various numbers of hops (20-hop attack path)
6 Conclusion
The objective of IP traceback is to determine the true source of DoS/DDoS attacks. This paper first gave an overview of existing approaches to IP traceback. We then proposed an enhanced ICMP traceback scheme, called ITrace-CP (ICMP Traceback with Cumulative Path), that encodes cumulative attack-path information. We described the ITrace-CP protocol and the mechanism for constructing ITrace-CP messages so that
they contain the addresses of all the routers on the attack path. As part of the ITrace-CP protocol, we proposed three schemes for the routers to match corresponding IP packets and ITrace-CP messages. We carried out a qualitative comparison of the ITrace-CP scheme with the ITrace scheme in terms of storage, bandwidth, and computational requirements, and deduced that ITrace-CP introduces marginal overhead in storage and bandwidth and acceptable computational overhead. Analytical studies were performed to compare their performance in terms of the probability of attack-path construction as a function of the number of attack packets and the attack-path length. We found that the performance of ITrace-CP is independent of the attack-path length and that its probability of path construction is significantly higher than that of ITrace, for all path lengths. Simulation studies were also conducted to further evaluate their relative effectiveness in constructing DoS and DDoS attack paths. Our simulations showed that the ITrace-CP mechanism performs better than the ITrace mechanism and takes significantly less time to construct the entire attack path for longer attack paths.
7 Future Work
In the ICMP Traceback proposal, it is recommended that ITrace messages be generated with a probability of 1/20,000 so as to limit the increase in data traffic to less than 0.1%. However, in the ITrace-CP scheme, given that cumulative path information is carried, it is more logical to generate the ICMP messages nearer to the attackers, in other words further from the victim. We will investigate how the probability can be chosen so as to further improve performance. We have also assumed an upper bound of 100 ms on the packet inter-arrival time; a more rigorous study of this will enable a more accurate determination of the buffer allocation as well as optimal performance of ITrace-CP.
A Lattice Based General Blind Watermark Scheme

Yongliang Liu1, Wen Gao1,2, Zhao Wang3, and Shaohui Liu1

1 Dept. of Computer Science and Engineering, Harbin Institute of Technology, China
2 Institute of Computing Technology, Chinese Academy of Sciences, China
3 Dept. of Control Science and Engineering, Harbin Institute of Technology, China
[email protected]
Abstract. Digital watermarking is a very active research area that has received a considerable amount of attention in many multimedia applications. For most watermark applications, it is often desirable to retrieve the embedded information without access to the host data; this is known as blind watermarking. Most previous blind watermark schemes either suffer significantly from host-data interference or require considerable storage, so a simple and effective blind watermark scheme is urgently needed. In this paper we attempt to resolve this question: a lattice-based general blind watermark scheme is proposed in which the host-data interference is eliminated entirely and only a small storage cost is needed. It therefore has considerable advantages over previously proposed schemes. Experimental results demonstrate the power of this scheme.
1 Introduction

Digital watermarking is a very active research area that has received a considerable amount of attention in recent years. Many excellent papers have appeared in dedicated conferences and workshops [1]-[4]. The basic idea behind digital watermarking is to embed information into host data so that, if the embedded information can be reliably recovered, this information can establish the affiliation between the data and its original owner. The embedding process involves imperceptibly (for human audio or visual systems) modifying the host data using a secret key and the watermark to produce watermarked data. The modifications must be made such that reliable extraction of the embedded watermark is possible even under a "reasonable" level of distortion applied to the watermarked data. Some typical distortions that digital watermark schemes are expected to survive include smoothing, compression, rotation, translation, cropping, scaling, resampling, digital-to-analog and analog-to-digital conversion, and linear and nonlinear filtering. These distortions, whether intentional or incidental, are known as attacks. In some instances, the amount of information that can be hidden and detected reliably is important. The hiding capacity is the value of a game [5][6] between the information hider and the attacker. Here, capacity means the maximal embedding rate for a given level of distortion and any watermark scheme. Digital watermarking has a number of important multimedia applications. The interest in digital watermarking was first triggered by its potential use for copyright protection of multimedia data exchanged in digital form. However, watermarking has been used for a variety of other purposes. For example, watermarking was proposed as a means of
tracing traitors [7]. In many applications, it is often desirable to retrieve the embedded information without access to the host data; this is known as blind watermarking. Blind watermarking has been extensively explored in recent years [8][9]. Early blind watermark schemes were built on the principle of spread spectrum. Although this technique allows reliable communication even under strong attacks, spread-spectrum-based systems offer relatively little robustness when the host signal is not known at the decoder, and blind detection of spread-spectrum watermarks suffers significantly from host-data interference. It has been shown recently that blind watermarking can be considered communication with side information (the host data) at the watermark encoder [10], and thus improved blind watermark schemes can be designed. This insight has led to a new group of blind watermark schemes; a key paper in this field is the work by Costa [11]. For the additive white Gaussian noise case, Costa showed theoretically that the interference from the host data can be eliminated. However, the proof involves a huge, unstructured, random codebook, which is not feasible in practical systems. Eggers and Girod proposed a suboptimal scalar Costa scheme (SCS) [12] to reduce the complexity, but a large amount of storage is still required for this scheme. Similar situations exist in [9][13]. Thus a simple and practical blind watermark scheme that does not suffer from host-data interference is urgently required. In this paper, we attempt to resolve this question: a lattice-based general blind watermark scheme is proposed in which the host-data interference is eliminated entirely and only a small storage cost is needed. The scheme therefore has considerable advantages over previously proposed schemes. The outline of this paper is as follows. In Section 2, we review basic lattice principles. In Section 3, a lattice-based general blind watermark scheme is proposed. In Section 4, experimental results are provided. In Section 5, we provide some theoretical analysis. Finally, a conclusion is given, and future research directions are proposed.
2 Lattice Theories

2.1 Lattice

An n-dimensional lattice Λ is a discrete subgroup of real Euclidean n-space R^n. Without essential loss of generality, we will assume that Λ spans R^n. For example, the set of integers Z is a discrete subgroup of R, so Z is a one-dimensional lattice. A fundamental region of Λ is a region R(Λ) ⊆ R^n that includes one and only one point from each coset of Λ in R^n. Algebraically, R(Λ) is a set of coset representatives for the cosets of Λ in R^n. Every x ∈ R^n may, therefore, be written uniquely as x = a + b for some a ∈ R(Λ) and b ∈ Λ. Symbolically, we may write R^n = R(Λ) + Λ.
A fundamental Voronoi region Rv(Λ) of Λ is a fundamental region in which every point a ∈ Rv(Λ) is a minimum-energy point in its coset Λ + a. The set of translates {Rv(Λ) + b | b ∈ Λ} of a fundamental Voronoi region Rv(Λ) tiles n-space.
2.2 The mod-Λ Map

Given Rv(Λ), the mod-Λ map mod-Λ : R^n → Rv(Λ) is defined by x ↦ a, where a is the unique element of Rv(Λ) such that a ≡ x mod Λ. We write this map simply as a = x mod Λ. This is just a concrete way of writing the natural homomorphism from R^n to R^n/Λ. For example, if Z is the integer lattice, then a fundamental Voronoi region for Z is Rv(Z) = [0, 1). For x ∈ R, a = x mod Z is the fractional part of x.
2.3 The Code Based on a Lattice

Given an n-dimensional lattice Λ and a transmission channel, the channel input is a point a ∈ Rv(Λ), where Rv(Λ) is the fundamental Voronoi region of Λ. For transmission, an arbitrary lattice point b ∈ Λ is added to a ∈ Rv(Λ) to form an input x ∈ R^n to the channel. The channel output is y = x + N = a + b + N, where N is the channel noise. At the receiver, the received channel output y is decoded as follows:

A. First, y is reduced to z = y mod Λ, the unique element of Rv(Λ) that is congruent to y mod Λ. Then z = y mod Λ = (a + N) mod Λ, since b ∈ Λ; the effect of b thus disappears completely. Define N' ∈ Rv(Λ) as N' = N mod Λ. Since z = (a + N) mod Λ, we have z = (a + N') mod Λ.

B. Given z, the decoder finds a corresponding sequence â ∈ Rv(Λ).

This is the whole process of encoding and decoding based on a lattice.
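For the one-dimensional lattice Λ = qZ used later in this paper, the mod-Λ map and the cancellation of the lattice point b can be illustrated as follows. This small sketch is ours, not part of the paper, and the numeric values are arbitrary.

q = 8.0                                # illustrative quantization step; Lambda = qZ

def mod_lattice(x, q):
    # reduce x to the fundamental Voronoi region [0, q)
    return x % q

a = 3.1                                # channel input, a point of [0, q)
b = 5 * q                              # arbitrary lattice point added for transmission
noise = 0.4                            # channel noise N

y = a + b + noise                      # channel output
z = mod_lattice(y, q)                  # equals (a + N) mod Lambda; b has no effect
print(z, mod_lattice(a + noise, q))    # both print 3.5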
3 General Blind Watermark Scheme

Based on the principles above, we present a lattice-based general blind watermark scheme as follows.

A. Watermark Embedding
Let x denote the host data (image, audio, or video) and m the watermark message. First, the host data are transformed into the frequency domain, for example by the discrete cosine transform (DCT) or the discrete wavelet transform (DWT). For convenience of notation, the transformed host data are still denoted by x. Next, select N coefficients from x: x_1, x_2, ..., x_N. The selection must take into account the quality and robustness of the resulting watermarked data. Then x_1, x_2, ..., x_N are quantized; the simplest choice is uniform quantization. Let q be the quantization step and x_i^q the quantized version of x_i, defined as follows:

x_i^q = kq, if x_i ∈ [(k − 1/2)q, (k + 1/2)q), k ∈ Z, i = 1, 2, ..., N.

Define Λ = {kq | k ∈ Z}; it is obvious that Λ is a one-dimensional lattice, its Voronoi region is [0, q), and x_i^q ∈ Λ, i = 1, 2, ..., N.
For the watermark message m to be embedded, combining the host data with the embedding algorithm used and applying suitable processing, we obtain b = (b_1, b_2, ..., b_N), the actual watermark sequence to be embedded. It is therefore reasonable to assume that the watermark sequence satisfies certain conditions, and we will use this property later. The watermark sequence can be embedded into the host data in different ways. Here, let the invertible transform T denote the embedding mapping, with its inverse denoted by T^{-1}. The invertibility is needed in order to extract the watermark sequence; in fact, we merely require that T has a generalized inverse, namely, that extracting the watermark is possible. The embedding process is then

x'_i = x_i^q + T(b_i), i = 1, 2, ..., N,

where x'_i denotes a coefficient of the watermarked data in the frequency domain, and T(b_i) can be considered a modification of the quantized host data. In order to apply the lattice theory we assume T(b_i) ∈ [0, q), i = 1, 2, ..., N; this assumption is reasonable from the discussion above. Let x̄_i = x'_i for the N selected coefficients and x̄_i = x_i for all other coefficients;
then we perform the corresponding inverse frequency transform on {x̄_i} to obtain the watermarked host data x̄. In the above embedding process, the quantization step and the coefficient selection method can be kept as the key K. The watermark message can be validly extracted only if the key K is known.

B. Watermark Extracting
During transmission, the watermarked data may be subject to intentional attacks (an attacker's malicious manipulation) or incidental attacks (for example, common signal processing). Let the distorted watermarked data be x̃. In practice, this distortion has to be small enough that it does not significantly degrade quality, i.e., the distortion does not impair the business value of the data. When extracting the watermark, the decoder applies the corresponding frequency-domain transform to the received data x̃; for simplicity, we still denote the transformed data by x̃. Next, the authorized decoder uses the key K to find the positions used for watermark embedding, namely the N coefficients

x̃_i = x'_i + n_i, i = 1, 2, ..., N,

where n_i denotes any possible distortion. Then, using knowledge of the quantization step q, we compute

T̂(b_i) = x̃_i mod Λ = (T(b_i) + n_i) mod Λ,

which gives a valid estimate T̂(b_i) of T(b_i); applying the inverse transform T^{-1} to T̂(b_i) yields a valid estimate b̂_i of b_i, i = 1, 2, ..., N. Finally we obtain the estimate m̂ of the watermark message m. When x̃_i = x'_i, i = 1, 2, ..., N, namely when no distortion is introduced to the watermarked data during transmission, then T̂(b_i) = x'_i mod Λ = T(b_i) mod Λ = T(b_i), so we obtain the accurate estimate m̂ = m of the watermark message.
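A minimal end-to-end sketch of the scalar case is given below. It is our illustration, not the authors' implementation: it assumes a uniform quantization step q, binary watermark bits, and the affine choice T(b) = q/4 + b·q/2, so that T(b) ∈ [0, q) and T is inverted by simple thresholding.

import random

q = 16.0                                  # assumed uniform quantization step (part of the key)

def T(bit):                               # illustrative invertible embedding transform
    return q / 4 + bit * q / 2            # maps {0, 1} into [0, q)

def T_inv(value):
    return 0 if value < q / 2 else 1

def embed(coeff, bit):
    quantized = q * round(coeff / q)      # x_i^q, the nearest point of the lattice qZ
    return quantized + T(bit)             # x'_i = x_i^q + T(b_i)

def extract(received):
    return T_inv(received % q)            # T_hat(b_i) = x~_i mod Lambda, then invert T

bits = [random.randint(0, 1) for _ in range(8)]
coeffs = [random.uniform(-100, 100) for _ in range(8)]
watermarked = [embed(c, b) for c, b in zip(coeffs, bits)]
noisy = [w + random.uniform(-q / 8, q / 8) for w in watermarked]   # mild distortion n_i
print(bits == [extract(v) for v in noisy])                          # True whenever |n_i| < q/4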
4 Experimental Results

The standard 256 × 256 Lena image was used as the host image in our experiments, and the watermark message is a 64 × 64 binary image. The Lena image and the watermark image are shown in Figure 1(a) and Figure 1(g), respectively. The watermark sequence is b = (b_1, b_2, ..., b_4096), where b_i = 0 or 1, i = 1, 2, ..., 4096.
To embed the watermark, we first perform the DWT on the host image. The coefficients to which the watermark sequence is added are chosen at random from the low- and middle-frequency parts of the transformed host data. Next, a non-uniform quantization method is used, namely choosing a variable quantization step based on the coefficient values. For simplicity, T is taken to be a linear transform. In the embedding process we use redundancy embedding [12] (embedding the watermark sequence repeatedly) and a (7,4) Hamming code in order to enhance the robustness of the watermark scheme. Our experimental results are as follows:
Fig. 1. Experimental results: (a) host image; (b) watermarked image; (c) attacked image (blurring); (d) attacked image (cropping); (e) attacked image (sharpening); (f) attacked image (JPEG compression); (g) initial watermark; (h)-(l) watermarks extracted from (b)-(f), respectively.
The watermarked Lena image and four watermarked Lena images attacked, respectively, by blurring, cropping, sharpening, and JPEG compression are shown in Figure 1(b)-(f). Correspondingly, the recovered watermark images
extracted from (b)-(f) are shown in Figure 1(h)-(l). Table 1 gives some numerical experimental results, where NC denotes the similarity between the original watermark and the extracted watermark.

Table 1. Experimental Results

Attack              Peak signal-to-noise ratio   Bit error rate   NC
No attack           38.5364                      0.761%           1.011
Blur                31.6327                      5.985%           0.979
Cropping            10.4561                      8.981%           1.006
Sharpen             23.5876                      10.749%          0.961
JPEG compression    27.3532                      9.743%           0.976
5 Remarks

In the previous sections we gave a simple and effective general scheme for blind watermarking and provided experimental results. In this scheme, the interference of the host data is eliminated by the mod-Λ map, based on quantizing the host-data coefficients in the frequency domain, and blind watermark extraction is implemented successfully. In the following, this scheme is discussed in more detail.

A. Selection of coefficients. The selection of the N coefficients determines the embedding amount (strength) and placement, and has a significant effect on the rate, robustness, and imperceptibility of the watermark embedding.

B. Quantization step q. It is obvious that robustness increases with q, and this scheme can resist stronger attacks with a larger quantization step. But increasing the quantization step makes the quantization noise larger and leads to too much distortion. Hence, there is a trade-off in selecting the quantization step: a small quantization step is enough for weak attacks, but a larger quantization step is needed to extract the watermark under strong attacks. A more effective quantization method is non-uniform quantization.

C. Invertible transform T. The invertible transform T determines the intensity of the embedding, which has a significant effect on robustness and imperceptibility.

D. Assumptions on the distortion induced by embedding and attacks. In the above scheme, we have made some assumptions about the distortions. The reasonableness of the assumption on the modification induced by the embedding process was stated above; we now clarify the reasonableness of the assumption on the distortion induced by attacks. Usually, an incidental attack induces either a small distortion or a large distortion of the host data; the latter significantly impairs the quality of the host data, and is outside our interest. So we focus on intentional attacks. Both the watermark user and the attacker must consider distortion constraints. Watermarking can be thought of as a game [5] between the information hider and the attacker, so the information hider can define a distortion function and specify constraints on the admissible distortion levels for itself and the attacker. It then seeks the maximum rate of reliable transmission of the watermark message m over all possible watermark strategies and all attacks satisfying the specified constraints. This is done by application of information-theoretic principles
[14]. Based on the human audio/visual system [15], the hider can embed the maximum amount of watermark message. This means the attacker cannot remove the watermark without causing serious degradation of the data quality, while the attacker's ability to embed another watermark is also limited (this is a very important issue, which can also be addressed by using a time-stamp). Hence, the assumption is reasonable.

E. The mod-Λ map. The mod-Λ map was described in detail in Section 2.2. In this scheme, the main function of the mod-Λ map is to eliminate the interference of the host data.

F. Effect of attacks on extracting the watermark message, and countermeasures. It can be seen from the above extraction process that attacks have a significant effect on obtaining a valid estimate. To improve the robustness of the watermark scheme, the following measures can be used: (1) improve the robustness of the watermark scheme itself — it can be seen that the choice of the invertible transform T is very important; (2) redundancy embedding, namely embedding the watermark sequence repeatedly; (3) making use of error-correcting codes to decrease the bit error rate; (4) taking other countermeasures [16].
6 Conclusion

This paper presents a general blind watermarking scheme, and we successfully implement blind watermark extraction with it. Although this scheme may not be optimal, it is simple and practical to implement and independent of the host data, so it has a clear advantage over previous schemes such as blind spread-spectrum watermarking, SCS, and quantization index modulation (QIM) [9]. Further research will seek a more effective general blind watermark scheme and blind watermark schemes that satisfy application requirements. Content-based watermarking is also under consideration.
References
1. Proceedings of the SPIE/IS&T International Conference on Security and Watermarking of Multimedia Contents, vol. 3657, January 25-27, 1999.
2. Proceedings of the SPIE International Conference on Security and Watermarking of Multimedia Contents IV, vol. 4675, January 20-25, 2002, San Jose, CA.
3. Ross J. Anderson (Ed.): Information Hiding, First International Workshop, Cambridge, U.K., May 30-June 1, 1996, Proceedings. Lecture Notes in Computer Science 1174, Springer, 1996, ISBN 3-540-61996-8.
4. Ira S. Moskowitz (Ed.): Information Hiding, 4th International Workshop, IHW 2001, Pittsburgh, PA, USA, April 25-27, 2001, Proceedings. Lecture Notes in Computer Science 2137, Springer, 2001.
5. T. Basar, G. J. Olsder: Dynamic Noncooperative Game Theory. SIAM Classics in Applied Mathematics, SIAM, Philadelphia, 1999.
6. P. Moulin, A. Ivanovic: The Watermark Selection Game. Proc. Conference on Information Science and Systems, Baltimore, MD, March 2001.
7. A. Fiat, T. Tassa: Dynamic Traitor Tracing. Journal of Cryptology, vol. 14, pp. 211-223, 2001.
8. W. Zeng, B. Liu: On resolving rightful ownership of digital images by invisible watermarks. Proc. IEEE Conf. Image Processing, vol. 1, CA, Oct. 1997, pp. 552-555.
9. B. Chen, G. W. Wornell: Provably robust digital watermarking. Proc. of SPIE: Multimedia Systems and Applications, vol. 3845, pp. 43-54, 1999.
10. B. Chen, G. W. Wornell: An information-theoretic approach to the design of robust digital watermarking systems. Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP), Phoenix, AZ, March 1999.
11. M. H. M. Costa: Writing on Dirty Paper. IEEE Trans. on Information Theory, vol. 29, no. 3, pp. 439-441, May 1983.
12. J. Eggers, J. K. Su: Performance of a practical blind watermarking scheme. Proceedings of SPIE, vol. 4314, 2001.
13. J. Chou, S. Pradhan, K. Ramchandran: A Robust Blind Watermarking Scheme based on Distributed Source Coding Principles. Proceedings of SPIE, 2000.
14. P. Moulin: The role of information theory in watermarking and its application to image watermarking. Signal Processing, vol. 81, pp. 1121-1139, 2001.
15. N. J. Jayant, J. Johnston, R. Safranek: Signal compression based on models of human perception. Proc. IEEE, vol. 81, pp. 1385-1422, 1993.
16. A. Miyazaki, A. Okamoto: Analysis of watermarking systems in the frequency domain and its application to design of robust watermarking systems. IEICE Trans., vol. 85, no. 1, pp. 117-124, Jan 2002.
Role-Based Access Control and the Access Control Matrix

Gregory Saunders1, Michael Hitchens2, and Vijay Varadharajan2

1 School of Information Technologies, University of Sydney, Australia
[email protected]
2 Department of Computing, Macquarie University, Australia
{michaelh,vijay}@ics.mq.edu.au
Abstract. The Access Matrix is a useful model for understanding the behaviour and properties of access control systems. While the matrix is rarely implemented, access control in real systems is usually based on access control mechanisms, such as access control lists or capabilities, that have clear relationships with the matrix model. In recent times a great deal of interest has been shown in Role Based Access Control (RBAC) models. However, the relationship between RBAC models and the Access Matrix is not clear. In this paper we present a model of RBAC based on the Access Matrix which makes the relationships between the two explicit. In the process of constructing this model, some fundamental similarities between certain capability models and RBAC are revealed. In particular, we outline a proof that RBAC and the ACM are equivalent with respect to the policies they can represent. From this we conclude that, in a similar way to access lists and capabilities, RBAC is a derivation of the Access Matrix model.
1 Introduction
Computer systems contain large amounts of information, much of which is of a sensitive nature. It is necessary to be able to define what entities have access to this information and in what ways they can access it. These functions are variously known as access control or authorisation. The basic model of access control is the Access Control Matrix (ACM) [1,2]. The ACM specifies individual relationships between entities wishing access (subjects) and the system resources they wish to access (objects). For each subject-object pair the allowable access appears in the corresponding entry in the (two-dimensional) matrix. Current access control mechanisms do not implement the ACM directly, due to well known efficiency problems [3]. However, most access control mechanisms in current use are based on models which have a direct relationship with the ACM. Recently there has been an increasing interest in other models of access control. One of the more prominent of these has been Role Based Access Control (RBAC) [4,5,6]. The interest in RBAC is often claimed to be its ability to manage access control policies more effectively. The policies of real world organisations are often of a sophisticated nature and cannot be readily expressed within the framework of
the ACM or its immediate derivatives. RBAC, amongst other proposals, shows promise in being able to express real-world policies. As might be expected, the advantages of RBAC do not come without cost. The ACM is a relatively simple concept, and it and its closely related derivatives (access control lists and capabilities) have been extensively studied. Even here though, the differences between access control lists and capabilities have made it difficult to compare systems based on these models in any formal way. It is important that there be a means for comparing the expressive power of different models in order to determine if they meet the needs of a particular application. In a previous paper [7] we presented a formalism, based on that of Harrison, Ruzzo and Ullman [2], which encompasses both access control lists and capabilities, making it easier to compare such systems. In this paper we extend that formalism to encompass RBAC. In the process it becomes clear that RBAC has significant fundamental similarities to capability-based access control. In particular we outline a proof that RBAC and the ACM are equivalent in the policy sets they can represent. We conclude that RBAC, in a similar manner to access control lists and capabilities, is a derivation of the Access Matrix model. This is in contrast to some contributions to the literature, e.g. [8,9], which contend that RBAC is an alternative to traditional DAC and MAC. The rest of this paper is arranged as follows. The following section contains a revised and simplified description of our basic model, first presented in [7]. Section 3 extends the basic model to form a matrix model. Sections 4 and 5 extend the model to describe a capability system and RBAC, respectively, and fundamental similarities between these models are discussed. An example illustrating these similarities is presented in Section 6. In Section 7 we outline a proof that RBAC and the ACM are equivalent in the policy sets they can represent. Section 8 concludes the paper with suggestions for future research.
2 The Base Model
We begin with a basic model which is expanded in later sections to describe the various access control models. This model is a revision and simplification of one we described in an earlier work [7], which in turn was based on the access matrix model of Harrison et al. [2]. We shall base our access control models on a series of definitions, each of which declares the existence of one of three things:
1. a set;
2. a container (list, queue, vector, matrix, etc.) the contents of which are either elements of a set defined earlier, or a set or container (recursively); or
3. a mapping between sets defined earlier.
Each of the models in this formulation is an extension of the following six definitions, some of which may be augmented depending on the model.

Definition 1 Rts, the set of Rights (e.g. read, write, execute, own)

Definition 2 Obj, the set of Objects (e.g. files)
Table 1. The primitive operations available to the commands in C

enter x into Y        delete x from Y
create object Xo      destroy object Xo
create subject Xs     destroy subject Xs
Definition 3 Sbj, the set of Subjects (e.g. users, processes)

Definition 4 C, the set of commands

Definition 5 B, the set {grant, deny}

Definition 6 f, a function from Sbj × Obj × Rts to B

Each element of C is a command of the form:

command α(X1, ..., Xi)
  if cnd1 and cnd2 ... cndj then
    op1 ... opk
end

The commands provide the only means of manipulating the elements of the access control system, in the same way that the methods of an object-oriented class provide the only means for manipulating the private variables of that class. The contents of C are determined by the model under consideration, and each model will typically provide commands for creating and destroying objects and subjects, and for conferring and revoking access privileges between subjects. The symbol α is the name of the command. The arguments X1 ... Xi may be elements of any set declared earlier. Within the commands, each cndj is a condition using either f or the operator 'in', which tests membership in a set or container. Each opk is one of the primitive operations in Table 1. The enter and delete operations are defined more generally than in the model of Harrison et al.: the enter operation inserts an element x into a set or container Y, while delete removes it. We assume the other operations have their intuitive meanings. The function f determines whether a given subject has a given right for a given object. The exact definition of f depends on the model in question.
3 The Access Matrix Model
To model the Access Control Matrix [2] we begin with Definitions 1–6 from the previous section, and extend them with

Definition 7 M, a matrix, indexed by Obj and Sbj, each element of which is a subset of Rts.
Table 2. The set C of commands for the access control matrix model.
command CREATE(sbj, obj)
  create object obj
  enter own into M[sbj, obj]
end

command DESTROY(sbj, obj)
  if own in M[sbj, obj] then
    destroy object obj
end

command CONFER_r(sbj, sbj2, obj)
  if own in M[sbj, obj] then
    enter r into M[sbj2, obj]
end

command REMOVE_r(sbj, sbj2, obj)
  if own in M[sbj, obj] then
    delete r from M[sbj2, obj]
end

command CHOWN(sbj, new, obj)
  if own in M[sbj, obj] then
    delete own from M[sbj, obj]
    enter own into M[new, obj]
end
The contents of C are shown in Table 2. Commands for the creation and destruction of subjects are similar to those for objects and are omitted here and in the later models. The function f returns grant if rt ∈ M[s, o]. So much for the Access Control Matrix model then. As is well known, the space requirements of the matrix prohibit the actual use of this model in a computer system. There are, however, methods for reducing the space required. For example, we can replace Definition 7 with:

Definition 8 (replaces 7) M, a set of triplets (s, o, rts) where s ∈ Sbj, o ∈ Obj and rts ⊆ Rts,

and then remove those triplets where rts = ∅ to save space, assuming that the majority of matrix entries are, in fact, empty [3]. This would require modifications to the commands in C; for example, in the CREATE command we add "enter (sbj, obj, {own}) into M" in place of the existing enter operation.
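As an illustration of this sparse representation (our sketch, not the authors'), the triplets can be held in a dictionary keyed by (subject, object), with missing keys standing for empty entries:

# M stores only non-empty entries; a missing key means rts = {}
M = {}

def f(s, o, rt):
    return "grant" if rt in M.get((s, o), set()) else "deny"

def CREATE(sbj, obj):
    M[(sbj, obj)] = {"own"}                        # enter (sbj, obj, {own}) into M

def CONFER(r, sbj, sbj2, obj):
    if f(sbj, obj, "own") == "grant":
        M.setdefault((sbj2, obj), set()).add(r)

CREATE("alice", "file1")
CONFER("read", "alice", "bob", "file1")
print(f("bob", "file1", "read"), f("bob", "file1", "write"))   # grant deny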
4 Capability Containers
Capability systems partition the matrix M by subject, storing a set of (o, rts) tuples, called capabilities, for each subject. Some capability systems (e.g. [10,11]) allow capabilities to be stored within objects. When a subject wishes to access an object, it locates a capability within one of its objects and presents it to the system. Note that objects here are used in the security sense as containers rather than in an object-oriented sense. Beginning again with Definitions 1–6, this can be modelled in the following way:

Definition 9 OR, the set of (o, rts) tuples.

Definition 10 CapO, a set of objects that may contain capabilities, such that CapO ⊆ Obj.
Table 3. The set C of commands for the capability container model.

command CREATE(sbj, obj)
  create object obj
  enter (obj, {Rts}) into sbj
end

command DESTROY(sbj, obj)
  if f(sbj, obj, destroy) then
    destroy object obj
end

command CONFER_r(sbj, capo, obj)
  if f(sbj, obj, confer) then
    enter (obj, {r}) into capo
end

command REMOVE_r(sbj, capo, obj)
  if f(sbj, obj, remove) then
    delete (obj, {r}) from capo
end
Definition 11 CA a many-to-many mapping from OR to CapO. Instead of storing capabilities in a central repository (OR), they are stored within objects. The CA mapping simply tells us which capabilities are contained in a particular CapO. This scheme raises a number of interesting issues. Firstly, what capabilities does a subject possess on creation? One possibility would be to create a mapping from some characteristic of the new subject, its owner for example, to a set of capabilities which the subject will possess on creation. Another solution would have the subject inherit some or all of the capabilities of its parent. These solutions are not mutually exclusive, and the second has the advantage of being able to support the principle of least privilege by dynamically restricting the capabilities that a child subject inherits, or perhaps by temporarily deactivating capabilities under certain conditions (the password capability system of Anderson et al. [11] has facilities for doing this). To model this we introduce: Definition 12 proclist a many-to-many mapping from Sbj to OR. which tells us which capabilities held in a subject can presently be used. Another issue raised by this scheme is that by possessing a capability for a capability containing object a subject may, depending on the rights in the capability, be able to acquire and use the capabilities in that object. We designate the set of capability containing objects from which a subject can acquire capabilities as the active capability containing objects. Definition 13 active a many-to-many mapping from Sbj to CapO giving the CapOs which are reachable from a given subject. A CapO called x is reachable by a subject s if x = s, or if a capability for x is an element of proclist(s), or if a capability for x is an element of CA(y) (where y ∈ CapO) and y is reachable. A process wishing to access an object would simply present a capability for the object from among the capabilities available in any of the objects to which
it has a capability, or can get one. The function f therefore takes the form

f(s, o, rt) = grant, if ((o, rts) ∈ proclist(s) ∨ (o, rts) ∈ ∪_{c ∈ active(s)} {x | x ∈ CA(c)}) ∧ rt ∈ rts
              deny, otherwise

and the set C of commands is shown in Table 3.
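One way to realise the reachability rule behind active(s), and hence f, is sketched below. This is our reading of Definitions 11–13, not code from the paper; representing capabilities as (object, frozenset-of-rights) pairs and using an "acq" right to model acquisition are assumptions.

def reachable(s, proclist, CA, acq="acq"):
    # CapOs from which subject s can draw capabilities (Definition 13)
    active, frontier = set(), [s]
    while frontier:
        c = frontier.pop()
        if c in active:
            continue
        active.add(c)
        for (o, rts) in proclist.get(c, set()) | CA.get(c, set()):
            if acq in rts:                       # a capability that lets us open container o
                frontier.append(o)
    return active

def f(s, o, rt, proclist, CA):
    caps = set(proclist.get(s, set()))
    for c in reachable(s, proclist, CA):
        caps |= CA.get(c, set())
    return "grant" if any(obj == o and rt in rts for obj, rts in caps) else "deny"

# capabilities are (object, frozenset-of-rights) pairs
proclist = {"proc": {("box", frozenset({"acq"}))}}
CA = {"box": {("file1", frozenset({"read"}))}}
print(f("proc", "file1", "read", proclist, CA))   # grant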
5 Role-Based Models
Sandhu et al. [5] define four reference models for RBAC. RBAC0 defines a basic RBAC system. RBAC1 augments RBAC0 with role hierarchies. RBAC2 adds constraints to RBAC0, and RBAC3 combines RBAC1 and RBAC2. In this paper we focus on RBAC1, which has the following components [5]:
• U, R, P, and S (users, roles, permissions, and sessions, respectively);
• PA ⊆ P × R, a many-to-many permission-to-role assignment relation;
• UA ⊆ U × R, a many-to-many user-to-role assignment relation;
• user : S → U, a function mapping each session si to the single user user(si);
• RH ⊆ R × R, a partial order on R called the role hierarchy or role dominance relation, also written ≥; and
• roles : S → 2^R, a function mapping each session si to a set of roles roles(si) ⊆ {r | (∃r' ≥ r)[(user(si), r') ∈ UA]} (which can change with time); session si has the permissions ∪_{r ∈ roles(si)} {p | (∃r'' ≤ r)[(p, r'') ∈ PA]}.
It may come as a surprise to realize that there are fundamental similarities between the capability model presented earlier, and RBAC models. In fact, the process of deriving a RBAC model from the model presented in the previous section is largely one of renaming. We extend our previous definitions with: Definition 14 (replaces 9) P the set of Permissions, such that P = OR. The subjects in a RBAC system are neither users, nor processes, but a new entity called a session. When a user logs in, a new session is created which is active in a subset of their roles. This is analogous to the user creating a process with a subset of their capabilities. We can model this with: Definition 15 S the set of Sessions. Definition 16 (replaces 6) f a function from S × Obj × Rts to B In RBAC systems, roles relate users to permissions. This is analogous to giving a user a capability for a capability containing object. Definition 17 R a set of Roles.
Table 4. The set C of commands for the role-based model.

command CREATE(role, obj)
  create object obj
  enter (obj, {Rts}) into P
  enter ((obj, {Rts}), role) into PA
end

command DESTROY(role, obj)
  if f(role, obj, destroy) then
    destroy object obj
end

command CONFER_r(role1, obj, role2)
  if f(role1, obj, confer) then
    enter ((obj, {r}), role2) into PA
end

command REMOVE_r(role1, obj, role2)
  if f(role1, obj, remove) then
    delete ((obj, {r}), role2) from PA
end
Perhaps the most important difference between capability container models and RBAC models is that roles are not objects as CapOs are. Therefore, it is not possible to manipulate roles in the same way as normal system objects. Also, a permission is not required to access a Role. Instead, role membership is determined independently of any permissions held by a user (indeed, the permissions held by a user are determined by role membership). Lastly, roles are not, strictly speaking, sets of permissions (though they can be usefully thought of as such). So we require a mechanism to tell us which permissions are assigned to a role, just as we required a mechanism to map capabilities to the objects which contained them.

Definition 18 (replaces 11) PA, a many-to-many mapping from P to R.

In the capability container models, the subjects are themselves capability containers and therefore behave in a similar manner to roles. In RBAC the subjects are restricted to inheriting permissions from roles; they cannot contain permissions that are not inherited from roles. Furthermore, it is not possible to obtain permissions from a Subject by possessing a permission for that subject. We require a mechanism to tell us which roles are being used by a particular session. This mechanism performs a similar function to proclist from Definition 12, in that it allows for a subset of the available roles to be made active.

Definition 19 roles, a many-to-many mapping from S to R.

Some RBAC models allow roles to be partially ordered in a Role Hierarchy. This is analogous to having a capability containing object which contains capabilities for other capability containing objects. The active mapping from Definition 13 provides an almost identical function in container based capability models. In RBAC models the role hierarchy is defined by

Definition 20 RH, a partial order on the set R of roles.

The commands of the set C are defined in Table 4, and the function f from Definition 16 takes the form

f(s, o, rt) = grant, if (o, rts) ∈ ∪_{rl ∈ roles(s)} {p | p ∈ PA[rl]} ∧ rt ∈ rts
              deny, otherwise
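The following sketch (ours, not the authors') shows how the RBAC1 components UA, PA, RH and roles could be evaluated, with seniors inheriting the permissions of their juniors as described above; the data structures — permissions as (object, rights) pairs, RH as a set of (junior, senior) pairs — are assumptions. The example in the next section instantiates it.

def juniors(role, RH):
    # all roles dominated by `role`, including itself; RH holds (junior, senior) pairs
    out, frontier = set(), [role]
    while frontier:
        r = frontier.pop()
        if r in out:
            continue
        out.add(r)
        frontier.extend(jr for (jr, sr) in RH if sr == r)
    return out

def session_permissions(session, roles, PA, RH):
    perms = set()
    for r in roles.get(session, set()):
        for jr in juniors(r, RH):
            perms |= {p for (p, rl) in PA if rl == jr}
    return perms

def f(session, obj, rt, roles, PA, RH):
    return "grant" if any(o == obj and rt in rts
                          for (o, rts) in session_permissions(session, roles, PA, RH)) else "deny"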
Fig. 1. The chief security officer example: (a) the role hierarchy, in which the CSO role is senior to SO1, SO2, and SO3; (b) the access matrix, in which CSO has {read, write} for O2, SO1 has {read} for O1, SO2 has {read} for O1 and {read, execute} for O2, and SO3 has {read, write} for O3.
We now have a basic RBAC model derived from the container based capability model of the previous section. The following section presents examples to illustrate the similarities between RBAC and container based capability models.
6 An Example
We can illustrate the similarities between RBAC models and capability container models with an example taken from Sandhu et al. [5]. Space constraints preclude the inclusion of a more complex example. One may be found, however, in [12]. Consider the role hierarchy found in Figure 1(a) in which the Chief Security Officer (CSO) role inherits from three junior Security Officer (SO) roles. In this example the set R of roles is simply {CSO, SO1, SO2, SO3} and the partial order set RH contains {(SO1, CSO), (SO2, CSO), (SO3, CSO)}. Figure 1(b) is an example matrix describing the rights each of the security officer roles has for objects O1, O2 and O3. In a Role Based system, this matrix is represented by the sets P of permissions and PA of permission assignments:

P = { p1 = (O1, {read}), p2 = (O2, {read, write}), p3 = (O2, {read, execute}), p4 = (O3, {read, write}) }

PA = { (p1, SO1), (p1, SO2), (p2, CSO), (p3, SO2), (p4, SO3) }
This means that any user active in the CSO role is able to use permission p2 and also any of the other permissions by virtue of the inheritance relationships. Figure 2 illustrates the same scenario in terms of the capability container model. The set of capability containing objects, CapO, would be {CSO, SO1, SO2, SO3}. The capabilities contained within the CSO object include {(SO1, acq), (SO2, acq), (SO3, acq)} where acq represents the set of rights which enable the acquisition and use of capabilities from the destination object. In addition to the capabilities mentioned above, the set OR contains the permissions of set P . Furthermore, the SO1 object contains the capability p1 ,
Fig. 2. The example using the capability container model: the CSO object contains the capabilities (SO1, {acq}), (SO2, {acq}), (SO3, {acq}) and (O2, {r, w}); SO1 contains (O1, {r}); SO2 contains (O1, {r}) and (O2, {r, x}); SO3 contains (O3, {r, w}).
the SO2 object contains the capabilities p1 and p3 , the SO3 object contains p4 and lastly, the CSO object contains p2 . Since CSO also contains capabilities for SO1, SO2 and SO3, any user who holds a capability for CSO is able to retrieve capabilities from SO1, SO2 and SO3 in an analogous way to role inheritance.
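Using the data of this example, the RBAC sketch given after Table 4 can be instantiated as follows (our illustration; it assumes the juniors, session_permissions and f functions from that sketch are in scope).

p1 = ("O1", frozenset({"read"}))
p2 = ("O2", frozenset({"read", "write"}))
p3 = ("O2", frozenset({"read", "execute"}))
p4 = ("O3", frozenset({"read", "write"}))

PA = {(p1, "SO1"), (p1, "SO2"), (p2, "CSO"), (p3, "SO2"), (p4, "SO3")}
RH = {("SO1", "CSO"), ("SO2", "CSO"), ("SO3", "CSO")}   # (junior, senior) pairs
roles = {"s1": {"CSO"}}                                 # a session active in the CSO role

print(f("s1", "O3", "write", roles, PA, RH))            # grant, inherited from SO3
print(f("s1", "O2", "execute", roles, PA, RH))          # grant, inherited from SO2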
7 Comparing ACM and RBAC
In this section we investigate the relationship between the policies which can be expressed using an ACM-based approach and those which can be expressed using an RBAC approach.

7.1 Canonical RBAC Form
Consider an instance of the RBAC0 model discussed above. Intuitively this instance represents some unique policy set, that is, a unique set of ‘subject can do action to object’ rules. However, a given policy set may be represented by multiple different instances of the RBAC0 model. Consideration of the policy sets which can be represented in a RBAC approach will be simplified if it is possible to make a one-to-one mapping between the RBAC specification and a policy set. For an instance of the RBAC0 model at some instant in time t, we derive its canonical form at t using Algorithm 1. If a permission covers more than one object, Algorithm 1 can be extended to iterate over those objects. The canonical form has a single role for each user, with all their permissions assigned to that role, and no inheritance. Note, however, that the canonical form adheres to the rules of RBAC0 .
Algorithm 1: Derive the canonical form of an RBAC0 instance.
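The body of Algorithm 1 appeared as a figure and is not reproduced here. The sketch below is our reconstruction from the surrounding description and the proofs (one fresh role per user, each multi-right permission split into single-right permissions, users with no rights omitted); it is not the authors' algorithm.

def canonical(U, UA, PA):
    # permissions are assumed to be (object, rights) pairs
    Uc, Rc, Pc, UAc, PAc = set(), set(), set(), set(), set()
    for u in U:
        user_roles = {rl for (usr, rl) in UA if usr == u}
        effective = {p for (p, rl) in PA if rl in user_roles}
        if not effective:
            continue                          # right-less users do not appear (cf. the proof of Theorem 2)
        role_u = ("role", u)                  # a fresh role named after the user
        Uc.add(u); Rc.add(role_u); UAc.add((u, role_u))
        for (obj, rts) in effective:
            for rt in rts:                    # split into single-right permissions
                p = (obj, frozenset({rt}))
                Pc.add(p); PAc.add((p, role_u))
    return Uc, Rc, Pc, UAc, PAc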
Theorem 1. The canonical form of an RBAC0 instance represents the same policy set as that instance.

Proof Sketch: Assume that for a policy set ps1 represented by an instance of RBAC0 we derive, using Algorithm 1, the canonical policy set psc1. Further assume that ps1 does not represent the same policy set as psc1. Then either:
1. ps1 ⊃ psc1 (i.e. ps1 contains all the policy rules of psc1, plus others); or
2. ps1 ⊂ psc1 (i.e. psc1 contains all the policy rules of ps1, plus others); or
3. ps1 ∩ psc1 ≠ ∅, but neither ps1 ⊃ psc1 nor ps1 ⊂ psc1; or
4. ps1 ∩ psc1 = ∅.
Assume ps1 ⊃ psc1. This implies that ∃o, ri, p, rl, u : p = (o, {r1 ... rn}) ∈ P ∧ (p, rl) ∈ PA ∧ memberof(u, rl) ∧ ri ∈ {r1 ... rn}, and ¬∃p', rl' : (p', rl') ∈ PA ∧ (u, rl') ∈ UA ∧ p' = (o, {ri}). But from Algorithm 1: ∀p ∈ P where p = (o, {r1 ... rn}), if ∃u, rl : (p, rl) ∈ PA ∧ (u, rl) ∈ UA, then ∃p1 ... pn ∈ P : p1 = (o, {r1}), ..., pn = (o, {rn}), and ∀pi ∈ {p1 ... pn} ∃rl' : rl' ∈ R ∧ (pi, rl') ∈ PA ∧ (u, rl') ∈ UA. But then we can take p' = pi and rl' as above, and therefore ∃p', rl' : (p', rl') ∈ PA ∧ (u, rl') ∈ UA ∧ p' = (o, {ri}). This is a contradiction; therefore our initial assumption was false and ps1 ⊅ psc1. Similarly it can be shown that possibilities 2–4 above also lead to a contradiction. It follows that ps1 and psc1 represent the same policy set.

Theorem 2. Any two different instances of RBAC0 in canonical form represent different policy sets.
Algorithm 2: Transform an instance of RBAC0 to the ACM model.
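Algorithm 2's body likewise did not survive extraction. From the text, it converts an RBAC0 instance into an ACM, which could plausibly be done as follows (our sketch, using the sparse representation of Definition 8).

def rbac_to_acm(UA, PA):
    # every user becomes a subject and receives, for each object, the union of the
    # rights of the permissions reachable through its roles
    M = {}
    for (u, role) in UA:
        for (p, rl) in PA:
            if rl == role:
                obj, rts = p                  # permissions assumed to be (object, rights) pairs
                M.setdefault((u, obj), set()).update(rts)
    return M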
Proof Sketch: Assume we have two different instances (I1 and I2) of RBAC0 in canonical form and that both represent the same policy set. Since the instances are different, at least one of U, R, P, UA or PA is different in I1 and I2. Consider the first possibility, in which the set U differs. It can be seen from steps 1 and 2 of Algorithm 1 that only those users who have at least one right will appear in the user sets in canonical form. Since both I1 and I2 represent the same policy set, it follows that they must have the same set of users. Possibilities 2–5 can be eliminated in similar fashion. Therefore I1 is equivalent to I2, but this is a contradiction, since we assumed they were different. Therefore two different instances of RBAC0 in canonical form must represent two different policy sets. We note that any change in the original instance of RBAC0 can be duplicated in its canonical form. However, a single modification of the original instance may require many modifications in its canonical form.

7.2 Equivalence of ACM and Canonical RBAC
Having established a one-to-one mapping between the canonical form of RBAC and the abstract policy sets represented, we can compare the range of policy sets which can be expressed in the ACM and RBAC approaches. It should be obvious to the reader that each ACM instance represents a unique policy set. It remains to be determined whether the ACM and RBAC0 models are equivalent in the policy sets they can represent, i.e. for all sets of abstract policy rules of the form 'subject can do action to object', is the set of such sets representable using the ACM model the same as the set representable using canonical RBAC0? First we must establish that for any given policy set represented in the canonical RBAC0 model we can construct an equivalent ACM representation, and vice-versa. Proof of this will show that the range of both representations is the same. We then show that converting from the ACM representation to RBAC and back again (and vice-versa) produces the original policy set (and representation). For any given policy set represented in the RBAC0 model we can construct an equivalent representation in the ACM model. Consider an arbitrary RBAC0 instance rb. We can easily construct an ACM instance, a, using Algorithm 2. For any given policy set represented by an instance, a, of the ACM model we can construct a canonical RBAC0 representation, rbc, using Algorithm 3. That the above transformation produces a representation of a policy, in the other model, is obvious. What needs to be proved is that the policy set represented before and after transformation is the same.
Algorithm 3: Transform an instance of the ACM model to RBAC0 .
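A plausible rendering of Algorithm 3 (ours, not the authors') builds the canonical RBAC0 instance directly from the sparse ACM of Definition 8:

def acm_to_rbac(M):
    # one role per subject, one single-right permission per (object, right) pair
    U, R, P, UA, PA = set(), set(), set(), set(), set()
    for (s, obj), rts in M.items():
        if not rts:
            continue
        role_s = ("role", s)
        U.add(s); R.add(role_s); UA.add((s, role_s))
        for rt in rts:
            p = (obj, frozenset({rt}))
            P.add(p); PA.add((p, role_s))
    return U, R, P, UA, PA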
Theorem 3. Algorithm 2 results in a representation in the ACM model of the same policy set represented by the original RBAC0 instance rb.

Proof Sketch: By Theorem 1 the policy sets represented by rb and rbc are the same. Therefore we only need to prove that the same policy set is represented by a and rbc. The proof is similar to that for Theorem 1 and is omitted.

Theorem 4. Algorithm 3 results in a representation in the canonical RBAC0 model of the same policy set represented by the original ACM instance.

The proof is similar to that for Theorem 3 and is omitted. Having established that converting from the ACM model to the canonical RBAC0 model, and vice versa, results in a representation of the same policy set, it follows naturally that a conversion from ACM to canonical RBAC0 and back again will result in a representation of the same policy set. Consider:
• a, an instance of the ACM model;
• a is converted to rbc, an instance of the canonical RBAC0 model (Algorithm 3);
• a and rbc represent the same policy set (Theorem 4);
• rbc is now converted to a', an instance of the ACM model (Algorithm 2);
• a' and rbc represent the same policy set (Theorem 3); and
• as rbc represents the same policy set as both a and a', a and a' must represent the same policy set.
The mapping from a canonical RBAC0 instance to an ACM instance and back again can be handled similarly. Hence the policy sets which can be represented in the ACM model and the canonical RBAC0 model are equivalent.
8 Conclusion
We have presented a formal model of Role-Based Access Control which is derived from the Access Control Matrix. Such a model places RBAC in relation to the traditional access control models and enables comparisons to be made between systems based on the various models. In the process we have demonstrated fundamental similarities between RBAC and capabilities. That RBAC should be related to a derivative of the ACM should come as no surprise. The ACM is the
fundamental expression of discretionary access control, and capabilities are an intuitive method of viewing it. Understanding the relationship between capabilities and RBAC and, more distantly, RBAC and the ACM, opens the possibility of applying results known for those models to RBAC (and vice-versa). It should also simplify comparisons, such as in terms of safety analysis, between systems based on the various models. Two broad areas of future work offer themselves. First is the examination of the implications of placing RBAC in a taxonomy of access control models. Does its similarity to capabilities indicate that implementations of RBAC based on capabilities have promise? Can known properties of capability systems be applied to RBAC systems? Second is further extending our formalism to other access control models, such as the Chinese Wall and other lattice-based models [13].
References
1. Lampson, B.W.: Protection. Operating Systems Review 8 (1974)
2. Harrison, M.A., Ruzzo, W.L., Ullman, J.D.: Protection in operating systems. Communications of the ACM 19 (1976)
3. Sandhu, R.S., Samarati, P.: Access control: Principles and practice. IEEE Communications Magazine 32 (1994)
4. Ferraiolo, D., Kuhn, R.: Role-based access controls. In: 15th NIST-NCSC National Computer Security Conference. (1992)
5. Sandhu, R.S., Coyne, E.J., Feinstein, H.L., Youman, C.E.: Role-based access control models. IEEE Computer 29 (1996)
6. Sandhu, R.S., Ferraiolo, D., Kuhn, R.: The NIST model for role-based access control: Towards a unified standard. In: Proceedings of the Fifth ACM Workshop on Role-Based Access Control. (2000)
7. Saunders, G., Hitchens, M., Varadharajan, V.: An analysis of access control models. In: Proceedings of the Fourth Australasian Conference on Information Security and Privacy. (1999)
8. Sandhu, R., Munawer, Q.: How to do discretionary access control using roles. In: Proceedings of the Third ACM Workshop on Role-Based Access Control. (1998)
9. Osborn, S., Sandhu, R., Munawer, Q.: Configuring role-based access control to enforce mandatory and discretionary access control policies. ACM Transactions on Information and System Security 3 (2000)
10. Dearle, A., di Bona, R., Farrow, J., Henskens, F., Hulse, D., Lindström, A., Norris, S., Rosenberg, J., Vaughan, R.: Protection in the grasshopper operating system. In: Proceedings of the 6th International Workshop on Persistent Object Systems. (1994)
11. Anderson, M., Pose, R.D., Wallace, C.S.: A password-capability system. The Computer Journal 29 (1986)
12. Saunders, G., Hitchens, M., Varadharajan, V.: Role-based access control and the access control matrix. Operating Systems Review 35 (2001)
13. Sandhu, R.S.: Lattice-based access control models. IEEE Computer 26 (1993)
Broadcast Encryption Schemes Based on the Sectioned Key Tree

Miodrag J. Mihaljević

Mathematical Institute, Serbian Academy of Sciences and Arts, Kneza Mihaila 35, 11001 Belgrade, Serbia and Montenegro
[email protected]
Abstract. This paper proposes a family of key management schemes for stateless receivers, and particularly two of the family members, called SKT-A and SKT-B. A basic strategy of the proposed approach could be formulated as follows: before dealing with the set covering issues, perform an appropriate preprocessing over the underlying tree in order to specify a more suitable underlying structure for the set covering. The main underlying idea for developing the novel family of key management schemes is the employment of appropriate clustering of the keys and users, and the employment of heterogeneous cluster-oriented local key management. The proposed schemes are compared with recently reported ones, and the advantages of the novel schemes are pointed out. Keywords: Broadcast encryption, key management, stateless receivers.
1 Introduction
Broadcast encryption (BE) schemes define methods for encrypting content so that only privileged users are able to recover the content from the broadcast. Later on, this flagship BE application was extended to another one - media content protection (see [16] or [12], for example). This application has the same one-way nature as an encrypted broadcast: a recorder makes an encrypted recording, and a player needs to play it back. This situation allows no opportunity for the player and recorder to communicate. Accordingly, in this paper we are dealing with stateless receivers - devices in which the operations must be accomplished based only on the current transmission and its initial configuration, because these receivers do not have the possibility to update their state from session to session. When cryptography is used for securing communications, a session-encrypting key (SEK) is used to encrypt the data. Ensuring that only the valid members of the selected group have the SEK at any given time instance is the key management problem in BE. Whenever the SEK is invalidated, there needs to be another set of keys, called the key-encrypting keys (KEKs), that can be used to encrypt and transmit the updated SEK to the valid members of the group. Hence, the key management problem reduces to the problem of distributing the
KEKs to the members such that at any given time instant all the valid members can be securely reached and updated with the new SEK. The difficulty of managing cryptographic keys used arises from the dynamic membership change problem. A number of methods has been reported in the literature employing the following approach: Provide the receivers with a collection of the keys in such a manner that the communication overload is reduced. The first breakthrough in BE key management is reported in [8] where the schemes in which each receiver has a fixed set of reusable keys were proposed. However, the complexity of these schemes was strongly dependent on the size of the adversarial coalition. Later on, a number of different schemes as well as the system approaches, have been reported and analyzed - see [15], [19]-[20], [3], [1], [9], [16], [17], [18], [2] and [4], for example, and recently, certain results have been reported in [11], [13], [6], [5] and [14], as well. According to [11], the most interesting variant of BE deals with stateless receivers and has the following requirements: (a) Each user is initially given a collection of symmetric encryption keys; (b) The keys can be used to access any number of broadcasts; (c) The keys can be used to define any subset of users as privileged; (d) The keys are not affected by the user’s “viewing history”; (e) The keys do not change when other users join or leave the system; (f) Consecutive broadcasts can address unrelated privileged subsets; (g) Each privileged user can decrypt the broadcast by himself; (h) Even a coalition of all non-privileged users cannot decrypt the broadcast. This paper addresses the problem of developing improved BE key management schemes assuming the above given requirements. Related Work The most relevant references for this work are [16], [11] and [1]. On the other hand, note that the origins for these references include [15] and [19]-[20], and accordingly these references will be discussed here as well. An important characteristic of the system Iolus, [15], is that it solves the scalability problem by making use of a hierarchy. Iolus’s tree hierarchy consists of clients at the leaves with multiple levels of group security agents (agents, in short) above. For each tree node, the tree node (an agent) and its children (clients or lower level agents) form a subgroup and share a subgroup key. There is no globally shared group key. Thus a join or a leave in a subgroup does not affect other subgroups; only the local sub-group key needs to be changed. The approaches [20]-[19] have proposed a different hierarchy. The employed tree hierarchy consists of keys, with individual keys at leaves, the group key at the root, and subgroup keys elsewhere. There is a single key server for all the clients. There are no agents, but each client is given multiple keys (its individual key, the group key, and some subgroup keys). Following the results from [20]-[19], [1] has addressed the problem where the keys hierarchy consists of the long-lived keys, and a relaxed concept of the users subgroup specification is employed. A starting point was the observation that the requirement “no users outside the target set can decrypt the message” is too strict for many applications, i.e. some free-riders
may be tolerated. It is pointed out in [1] that: (i) breaking a large population into smaller subgroups and solving the key management problem independently for each subgroup results in a good performance trade-off; (ii) by increasing the number of keys, and thereby the sets, the probability of finding a smaller cover increases. Recent papers [16] and [11] have addressed the BE scenario with the stateless receivers. The basic idea in the most efficient stateless broadcasting encryption schemes is to represent any privileged set of users as the union of s subsets of a particular form. A different key is associated with each one of these sets, and a user knows a key if and only if he belongs to the corresponding set. The broadcaster encrypts SEK s times under all the keys associated with the set in the cover. Consequently, each privileged user can easily access the program, but even a coalition of the non-privileged users cannot recover SEK. The simplest implementation of this idea is to cover the privileged set with singleton sets. A better solution is to associate the users with the leaves of a binary tree, and to cover the privileged set of leaves with a collection of subtrees. For further considerations, let N be the number of receivers and R the number of revocations. In [16], a generic framework, is given by encapsulating several previously proposed revocation methods called Subset-Cover algorithms. These algorithms are based on the principle of covering all non-revoked users by disjoint subsets from a predefined collection, together with a method for assigning KEKs to subsets in the collection. Two types of revocation schemes in the Subset-Cover Framework, are proposed [16] with a different performance tradeoff. Both schemes are tree-based, namely the subsets are derived from a virtual tree structure imposed on all receivers in the system. The first proposed scheme, Complete Sub-Tree scheme (CST), requires a message length of Rlog2 (N/R) and storage of log2 N keys at the receiver and constitutes a moderate improvement over previously proposed schemes. The second called the Subset Difference algorithm (SD) exhibits a substantial improvement: it requires a message length of 2R where R is the number of revocations. The improved performance of SD is primarily due to its more sophisticated choice of covering sets. Let i be any vertex in the tree and let j be any descendent of i. Then Si,j is the subset of leaves which are descendants of i but are not descendants of j. Note that Si,j is empty if i = j. Otherwise, Si,j looks like a tree with a smaller subtree cut out. An alternative view of this set is a collection of subtrees which are hanging off the tree path from i to j. The SD scheme covers any privileged set P defined as the complement of R revoked users by the union of O(R) of these Si,j sets. What is shown in [11] is that SD collection of sets can be reduced: The basic idea of the Layered Subset Difference (LSD) scheme is to use only a small subcollection of Si,j sets employed by SD scheme which suffices to represent any such P as the union of O(R) of the remaining sets, with a slightly larger constant. Since there are fewer possible sets, it is possible to reduce the number of initial keys given to each user. In [11], it is shown that if we allow the number of sets in the cover to grow by a factor of two, we can reduce the number of keys
from $O((\log_2 N)^2)$ to $O((\log_2 N)^{3/2})$, and then this technique was extended and it has been shown how to reduce the number of keys to $O((\log_2 N)^{1+\epsilon})$ for any fixed $\epsilon < 1$.

Contributions of the Paper. This paper proposes a family of key management schemes, and particularly two of the family members, called SKT-A and SKT-B. The proposed family is based on a heterogeneous logical key hierarchy. The approach employed in this paper is different from the previously reported ones, and could be formulated as follows: before dealing with the set covering issues, perform an appropriate preprocessing over the underlying tree in order to specify a more suitable underlying structure for the set covering. The main underlying idea for developing the novel family of key management schemes is the employment of appropriate clustering of the keys and users, and the employment of heterogeneous cluster-oriented local key management. Accordingly, the underlying ideas include the following: (i) specification of appropriate partitions/sections of the key tree; (ii) performing key management on a section-by-section basis; (iii) in a general case, employment of different key management schemes in different sections; (iv) in certain cases, employment of modified local (section related) key management schemes which provide a relaxed specification of the privileged set.

Assuming that $H_0$ and $R_0$ are the scheme parameters, $0 \leq H_0 \leq \log_2 N$ and $0 \leq R_0 \leq R$, the proposed SKT-A key management scheme has the following main characteristics: dimension of the storage@receiver overload $O((H_0)^{1.5} - H_0 + \log_2 N)$; dimension of the communications overload $O(R + R_0((\log_2 N) - H_0) - R_0 \log_2 R_0)$; dimension of the processing@receiver overload $O(H_0)$. Assuming that $H_0$, $H_1$ and $R_0$, $R_1$ are the scheme parameters, $0 \leq H_0 + H_1 \leq \log_2 N$ and $0 \leq R_1 \leq R_0 \leq R$, the proposed SKT-B key management scheme has the following main characteristics: dimension of the storage@receiver overload $O((H_0)^{1.5} + (H_1)^{1.5} - H_0 - H_1 + \log_2 N)$; dimension of the communications overload $O(R + R_0 + R_1((\log_2 N) - H_1 - H_0) - R_1 \log_2 R_1)$; dimension of the processing@receiver overload $O(\max\{H_0, H_1\})$.

As an illustrative comparison, note the following. Assuming a huge group with heavy dynamics, when $N = 2^{27}$ and the revocation rate is $2^{-12}$ (i.e., approximately 0.025% of the receivers should be revoked), and assuming approximately the same communication overload, the proposed schemes require approximately three and ten times smaller storage@receiver overload in comparison with LSD [11] and SD [16], respectively. Also, under this scenario, the proposed schemes require approximately three times smaller processing@receiver overload in comparison with LSD and SD.

Organization of the Paper. Section 2 yields the underlying ideas for developing the improved key management schemes. A novel family of key management schemes and two particular members of the family, called SKT-A and SKT-B, are proposed in Section 3. Main characteristics of the proposed general scheme and the particular ones are analysed in Section 4, including a comparison of SKT-A and SKT-B with
recently reported schemes targeting the same key management scenario. Finally, some concluding discussions are given in Section 5.
2 Underlying Ideas for the Improved Key Management Schemes
Recall that recently proposed, highly efficient key management schemes [16] and [11], have been developed by focusing on obtaining the solution for the underlying set covering problem using the tree based paradigm. The approach employed in this paper is a different one and could be formulated as follows: Before dealing with the set covering issues, perform an appropriate preprocessing over the underlying tree in order to specify a more suitable underlying structure for the set covering. So, the employed preprocessing could be also considered as a particular divide-and-conquer method for key management. The main underlying idea for developing a novel family of the key management schemes is employment of appropriate clustering of the keys and users, and employment a heterogeneous cluster oriented local key management. Accordingly, the underlying ideas include the following: • specification of the appropriate partitions/sections of the keys tree; • performing key management on the section-by-section basis; • in a general case, employment different key management schemes in different sections; • optionally, in certain cases, employment of modified local (section related) key management schemes which provide a relaxed specification of the privileged set. The proposed key management scheme is based on a novel underlying structure, called sectioned key tree, for assigning KEKs to the receivers and for SEK distribution. The opportunity for employment of different key management schemes in different sections opens a door for desired optimization of the key management overload characteristics. For example recall that CST re-keying requires significantly smaller storage@receiver overload at the expense of increased communications overload in comparison with LSD based re-keying. Accordingly, employing the CST based technique in one subset of the tree sections and LSD based one in another subset, for example, yields an opportunity for obtaining the desired overall characteristics. Also note the following two characteristics of SD and LSD schemes: (i) communications overload is linear with R; (ii) storage@receiver overload is polynomial with logN . These characteristics open a door for the trade-off based on divide-and-conquer approach. Additionally, note that, for example, a relaxed version of LSD, which does not perform the strict revocations but the relaxed ones in a manner similar to that reported in [1], could be employed as the appropriate one in certain cases. Also note that, although the key management is based on the section-bysection processing, this has no impact on the storage and processing complexity at the receivers side.
3 Key Management Based on the Sectioned Key Tree

3.1 Center Side
From the center's point of view, the key management scheme consists, as in the usual case, of the following two main components: (i) an underlying graph structure for assigning the keys and receivers; (ii) methods employed for distributing a session key (SEK) to the stateless receivers. Beyond this conceptual similarity, the proposed scheme differs from the reported ones as follows:
- the underlying structure, called the sectioned key tree (SKT), is a particular tree structure different from the previously employed ones;
- the distribution of the SEK is based not on a single method but on the employment of a number of different methods.

The Underlying Structure. The proposed key management scheme is based on an underlying structure in the form of a partitioned key tree obtained by the following horizontal and vertical splitting:
- a number of horizontal layers is specified;
- each layer is partitioned into a number of sections, and each section contains a sub-tree whose root is identical to a leaf of the upper layer section.

In a special case, the following can be enforced: each of the layers has the same height, and each layer's section contains the same number of nodes. Accordingly, each section contains the same subtree. In a general case, the tree is partitioned into $L$ horizontal layers with the heights $H_\ell$, $\ell = 0, 1, \ldots, L-1$, respectively. Then, the top layer contains a subtree with $2^{H_{L-1}}$ leaves, and a layer $\ell$ consists of
$$\prod_{i=\ell+1}^{L-1} 2^{H_i} = 2^{\sum_{i=\ell+1}^{L-1} H_i}$$
sections, each containing a sub-tree with $2^{H_\ell}$ leaves. Illustrative examples of the underlying structure for the key assignment employed in the proposed key management scheme are displayed in Fig. 1. Accordingly, we assume the following basic scenario for the key management based on the above underlying structure: $N$ receivers grouped into $M$ clusters, $R$ revocations in total, assuming $R_m$ revocations from a cluster with index $m$, $m = 1, 2, \ldots, M$, where the parameter $M$ is an integer such that $\sum_{m=1}^{M} R_m = R$ and $N/M$ is an integer, $M \leq N$.

Section-by-Section Key Management. The proposed key management scheme assumes section-by-section key management and, in a general case, it yields the opportunity to employ different local key management schemes in different sections. Assuming an SKT with $L$ layers, and that a layer $\ell$ contains $M^{(\ell)}$ sections, $\ell = 0, 1, \ldots, L-1$, we propose the following section-by-section key management:
Fig. 1. An illustration of the proposed sectioned key tree.
– layer 0 processing:
  • For the subtree corresponding to section $j$, identify the set $R_j^{(0)}$ of the leaves (receivers) which should be revoked, $j = 1, 2, \ldots, M^{(0)}$.
  • Perform section-by-section processing: for the revocations over the subtree in section $j$, employ a desired key management scheme for revocation of the elements in $R_j^{(0)}$, $j = 1, 2, \ldots, M^{(0)}$.
– layer $\ell$ processing, $\ell = 1, 2, \ldots, L-1$:
  • For the subtree corresponding to section $j$, identify the set $R_j^{(\ell)}$ of the leaves which correspond to the sections in layer $\ell-1$ affected by the revocations, and which accordingly should be revoked, $j = 1, 2, \ldots, M^{(\ell)}$.
  • Perform section-by-section processing: for the revocations over the subtree in section $j$, employ a desired key management scheme for revocation of the elements in $R_j^{(\ell)}$, $j = 1, 2, \ldots, M^{(\ell)}$.

So, at the center side, the procedure for revoking a number of receivers consists of the following main steps: (a) the center specifies the set of receivers which should be revoked; (b) employing the section-by-section processing, the center decides on the KEKs (nodes of the tree) which should be used for new SEK delivery (encryption); (c) the center broadcasts the following message: (i) implicit information (in a general case) on the employed KEKs; (ii) the SEK encrypted by each of the employed KEKs.

Let $E(\cdot)$ denote the algorithm employed for encryption of the new SEK, $newSEK$; let $I_m$ denote the information on a KEK with index $m$, $KEK_m$, employed for encryption of the new SEK, $m = 1, 2, \ldots, M$, where $M$ is the total number of KEKs employed for covering the desired subset of receivers; and let $F_{newSEK}(\cdot)$ denote the algorithm employed for the payload encryption. Accordingly, the BE center broadcasts the following:
$$[[I_1, I_2, \ldots, I_M, E_{KEK_1}(newSEK), E_{KEK_2}(newSEK), \ldots, E_{KEK_M}(newSEK)], F_{newSEK}(Payload)] = [[I_1, I_2, \ldots, I_M, C_1, C_2, \ldots, C_M], PayloadCiphertext].$$
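As an illustration of step (c), the sketch below assembles such a re-keying broadcast. The cover of KEK indices, the XOR-based `encrypt` stand-in, and all names are assumptions for illustration only; the paper does not prescribe a particular cipher, and the payload encryption $F_{newSEK}(\cdot)$ is omitted.

```python
# Minimal sketch of the center-side broadcast; 'encrypt' is a toy XOR stand-in,
# NOT a real cipher, and 'cover' stands for the KEK indices chosen in step (b).
import os, hashlib

def encrypt(key: bytes, message: bytes) -> bytes:
    pad = hashlib.sha256(key).digest()        # placeholder keystream, illustration only
    return bytes(m ^ p for m, p in zip(message, pad))

def build_rekey_broadcast(kek_store: dict, cover: list, new_sek: bytes):
    # kek_store maps KEK index m -> KEK_m; the message carries (I_m, C_m) pairs,
    # where I_m is modelled simply as the KEK index and C_m = E_{KEK_m}(newSEK).
    return [(m, encrypt(kek_store[m], new_sek)) for m in cover]

kek_store = {m: os.urandom(32) for m in range(8)}   # toy KEK tree nodes
new_sek = os.urandom(32)
broadcast = build_rekey_broadcast(kek_store, cover=[2, 5], new_sek=new_sek)
```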
3.2 Receivers Side
At the receiver side the situation is equivalent to the one related to the employment of the CST, SD, or LSD based approaches. A receiver should store a number of cryptographic keys, monitor the communication channel to see whether its current SEK should be exchanged, and, if so, extract the new SEK based on certain processing employing a memorized key. Actually, a receiver is not aware of the underlying structure employed at the center side. At a receiver's side the re-keying is performed as follows. Each receiver monitors the re-keying broadcast by the center. In this message, a non-revoked receiver will find information on a KEK it possesses which should be used for recovering the new SEK. Based on this information and the encrypted form of the new SEK, the non-revoked receiver will recover the new SEK. Upon receiving a broadcast message, the receiver performs the following operations:
– Finding the $I_m$ which is related to the receiver: if the receiver is revoked, no such information will be found;
– Employing $I_m$ and the keys stored at the receiver, perform processing in order to recover the $KEK_m$ employed for $newSEK$ encryption;
– Recovering the new SEK by performing the decryption $E^{-1}_{KEK_m}(C_m)$.
Finally, after recovering the new SEK, the payload is obtained by $F^{-1}_{newSEK}(PayloadCiphertext)$.
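Continuing the toy sketch from the center side above, a receiver-side recovery could look as follows; again this is purely illustrative, and $I_m$ is modelled simply as the KEK index (with XOR, decryption reuses `encrypt`).

```python
def recover_new_sek(broadcast, my_kek_indices: set, my_keks: dict):
    for m, c_m in broadcast:                  # scan the header for an I_m we hold
        if m in my_kek_indices:
            return encrypt(my_keks[m], c_m)   # E^{-1}_{KEK_m}(C_m)
    return None                               # revoked receiver: no matching I_m

sek = recover_new_sek(broadcast, my_kek_indices={5}, my_keks={5: kek_store[5]})
assert sek == new_sek
```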
3.3 Two Particular Key Management Schemes
As illustrative examples, this section specifies two particular key management schemes called SKT-A and SKT-B, where SKT stands for Sectioned Key Tree.
SKT-A. SKT-A is a particular key management scheme based on the following sectionization of the key tree and the local re-keying:
– there are two horizontal layers, and the height of the bottom one is equal to $H_0$; accordingly, the upper layer has height equal to $\log_2 N - H_0$;
– the LSD revocation method is employed in each section of the bottom layer, and the CST revocation method is employed in the upper layer-section.

SKT-B. SKT-B is a particular key management scheme based on the following sectionization of the key tree and the local re-keying:
– there are three horizontal layers, and the heights of the bottom and middle ones are equal to $H_0$ and $H_1$, respectively; accordingly, the top layer has height equal to $\log_2 N - H_0 - H_1$;
– the LSD revocation method is employed in each section of the two lower layers, and the CST revocation method is employed in the upper layer-section.
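As a small illustration of the SKT-A sectionization, the helper below (an assumption for illustration, not from the paper) computes how many bottom-layer sections, and what upper-section height, a given choice of N and H0 induces.

```python
# Sketch of the SKT-A layout: for N = 2^n receivers and bottom-layer height H0,
# the bottom layer holds 2^(n - H0) sections of 2^H0 leaves each (each run with LSD),
# and the single upper section has height n - H0 (run with CST).
import math

def skt_a_layout(num_receivers: int, h0: int):
    n = int(math.log2(num_receivers))
    assert 2 ** n == num_receivers and 0 <= h0 <= n
    return {
        "bottom_sections": 2 ** (n - h0),
        "leaves_per_section": 2 ** h0,
        "upper_section_height": n - h0,
    }

print(skt_a_layout(2 ** 27, h0=9))
```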
4 Analysis of the Proposed Key Management Schemes

4.1 Main Characteristics of the Proposed Schemes
This section is focused on the following issues of the considered key management schemes: (i) communications: dimension of the message overload to be sent for the re-keying; (ii) storage@receiver: dimension of the keys which should be stored at a receiver; (iii) processing@receiver: processing overload due to the key updating at a receiver.

Main Characteristics of SKT-A. Taking into account the results reported in [16] and [11], it can be shown that SKT-A key management has the following main characteristics.

Proposition 1. SKT-A key management requires the following overload for $R$ revocations in total which affect $R_0$ different sections:
- dimension of the storage@receiver overload: $O((H_0)^{1.5} - H_0 + \log_2 N)$;
- dimension of the communications overload: $O(R + R_0((\log_2 N) - H_0) - R_0 \log_2 R_0)$;
- dimension of the processing@receiver overload: $O(H_0)$.

Sketch of the Proof. Recall that in the SKT-A scheme there are $2^{\log_2 N - H_0}$ sections in the lower layer, and each of them is controlled via the basic LSD technique [11]; the upper layer consists of only one section where the CST technique [16] is employed. Note that the re-keying of a receiver is performed via the lower layer section or the upper layer one. Accordingly, a receiver should store the keys related to LSD and CST based re-keying. A section-oriented basic LSD technique requires $(H_0)^{1.5}$ keys, and the upper section-oriented CST requires $\log_2 N - H_0$ keys. So, the dimension of the storage@receiver overload is $O((H_0)^{1.5} - H_0 + \log_2 N)$.

Regarding the processing@receiver overload, note the following. A new SEK could be delivered to the receiver employing the LSD or CST related keys. If an LSD related key is employed, the new SEK recovery at the receiver requires a processing overload proportional to $H_0$. If a CST related key is employed, the new SEK recovery requires a processing@receiver overload proportional to $\log_2 \log_2 2^{\log_2 N - H_0} = \log_2(\log_2 N - H_0)$. So the maximum processing@receiver overload is $O(\max\{H_0, \log_2(\log_2 N - H_0)\}) = O(H_0)$.

Finally, regarding the communications overload, suppose that there are $r_m$ revocations in the $m$th section, $m = 1, 2, \ldots, 2^{\log_2 N - H_0}$, noting that $\sum_{m=1}^{2^{\log_2 N - H_0}} r_m = R$ and $\sum_{m=1}^{2^{\log_2 N - H_0}} (1 - \delta_{0, r_m}) = R_0$, where $\delta_{a,b}$ is a function which takes the value 1 if $a = b$, and 0 otherwise. LSD based revocation within a section $m$ requires a communication overload of dimension $O(r_m)$, assuming $r_m > 0$. So, revocation of all $R$ receivers requires a communication overload of dimension $O(R)$. Also, $R_0$ revocations should be performed over the upper section employing CST, which requires an additional communication overload of dimension $O(R_0 \log_2(2^{\log_2 N - H_0}) - R_0 \log_2 R_0)$.
Accordingly, the dimension of the communications overload is given by $O(R + R_0((\log_2 N) - H_0) - R_0 \log_2 R_0)$.

Main Characteristics of SKT-B. Taking into account the results reported in [16] and [11], it can be shown that SKT-B key management has the following main characteristics.

Proposition 2. SKT-B key management requires the following overload for $R$ revocations in total which affect $R_0$ and $R_1$ different sections in the lower two layers, the bottom (0-th) and the middle (1-st) ones, respectively:
- dimension of the storage@receiver overload: $O((H_0)^{1.5} + (H_1)^{1.5} - H_0 - H_1 + \log_2 N)$;
- dimension of the communications overload: $O(R + R_0 + R_1((\log_2 N) - H_1 - H_0) - R_1 \log_2 R_1)$;
- dimension of the processing@receiver overload: $O(\max\{H_0, H_1\})$.

The proof of Proposition 2 follows the same lines as the proof of Proposition 1.

Analysis of a General Case. We assume the following: (i) the tree is partitioned into $L$ horizontal layers of height $H_\ell$, $\ell = 0, 1, \ldots, L-2$, and $H_{L-1} = \log_2 N - \sum_{\ell=0}^{L-2} H_\ell$; (ii) $R$ revocations of the receivers imply the revocation of $R_\ell$ sections in the $\ell$-th layer, $\ell = 0, 1, \ldots, L-1$ (note that $R \geq R_0 \geq R_1 \geq \ldots \geq R_{L-1}$, and accordingly $R + \sum_{\ell=0}^{L-1} R_\ell \leq (L+1)R$); (iii) LSD revocation is employed in the sections at the layers $\ell = 0, 1, \ldots, L-2$, and CST is employed at the top ($(L-1)$-st) tree layer.

Proposition 3. The considered key management has the following characteristics:
- dimension of the storage@receiver overload: $O(\sum_{\ell=0}^{L-2} H_\ell^{1.5} - \sum_{\ell=0}^{L-2} H_\ell + \log_2 N)$;
- dimension of the communications overload: $O(R + \sum_{\ell=0}^{L-3} R_\ell + R_{L-2}((\log_2 N) - \sum_{\ell=0}^{L-2} H_\ell) - R_{L-2} \log_2 R_{L-2})$;
- dimension of the processing@receiver overload: $O(\max\{H_\ell, \ell = 0, 1, \ldots, L-2\})$.

Proposition 3 can be proved following the same lines as the proofs of Propositions 1 and 2.

4.2 Comparison with the Previously Reported Schemes
This section yields a comparison of the main characteristics of the proposed key management schemes, SKT-A and SKT-B, and the Complete Sub-Tree (CST) [16], Subset Difference (SD) [16] and Layered Subset Difference (LSD) [11] schemes. The same characteristics as the ones considered in Section 4.1 are compared, i.e. the communications, storage@receiver and processing@receiver overloads. Based on the results on CST, SD and LSD reported in [16] and [11], and the results given in Section 4.1 of this paper, the comparison is summarized in Table 1.
Also note that, employing the same arguments as the ones used for the security evaluation of the CST, SD and LSD schemes, it can be shown that the proposed family of key management schemes is secure.
Table 1. Comparison of the main characteristics of the proposed key management schemes and the Complete Sub-Tree (CST) [16], Subset Difference (SD) [16] and Layered Subset Difference (LSD) [11] schemes, assuming N receivers, R revocations, and that all the parameters are positive integers.

Technique (parameters)                                               | Communication                                     | Storage@receiver                            | Processing@receiver
CST [16] (N, R)                                                      | O(R log2(N/R))                                    | O(log2 N)                                   | O(log2 log2 N)
SD [16] (N, R)                                                       | O(R)                                              | O((log2 N)^2)                               | O(log2 N)
LSD [11] (N, R)                                                      | O(R)                                              | O((log2 N)^(1+ε))                           | O(log2 N)
proposed SKT-A (N, H0, R, R0; H0 < log2 N; R0 ≤ R)                   | O(R + R0((log2 N) − H0) − R0 log2 R0)             | O((H0)^1.5 − H0 + log2 N)                   | O(H0)
proposed SKT-B (N, H0, H1, R, R0, R1; H0 + H1 < log2 N; R1 ≤ R0 ≤ R) | O(R + R0 + R1((log2 N) − H1 − H0) − R1 log2 R1)   | O((H0)^1.5 + (H1)^1.5 − H0 − H1 + log2 N)   | O(max{H0, H1})
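A rough way to read the storage column of Table 1 is to evaluate the expressions numerically; the sketch below does so for the illustrative scenario from the introduction. Since the O(·) entries hide constants, and the H0/H1 (and ε) values here are assumptions, the printed numbers are only indicative of relative order.

```python
# Indicative evaluation of the storage@receiver column of Table 1 (constants ignored).
import math

def storage(scheme, n, h0=None, h1=None, eps=0.5):
    log_n = math.log2(n)
    if scheme == "CST":   return log_n
    if scheme == "SD":    return log_n ** 2
    if scheme == "LSD":   return log_n ** (1 + eps)
    if scheme == "SKT-A": return h0 ** 1.5 - h0 + log_n
    if scheme == "SKT-B": return h0 ** 1.5 + h1 ** 1.5 - h0 - h1 + log_n
    raise ValueError(scheme)

N = 2 ** 27                      # the illustrative scenario from the introduction
for s, kw in [("SD", {}), ("LSD", {}), ("SKT-A", {"h0": 9}), ("SKT-B", {"h0": 6, "h1": 6})]:
    print(s, round(storage(s, N, **kw), 1))
```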
5 Discussion
An appropriate underlying structure for BE key management has been proposed which yields a possibility for the section-by-section processing, and improved overall characteristics of the developed method in comparison with the previously reported ones. Table 1 shows that the clustering and combining of heterogeneous schemes appear as a powerful approach for developing improved key management schemes which yield a possibility for appropriate trade-offs between the main overloads related to the key management. Also note that the proposed key management is based on a heterogeneous logical key hierarchy. The main origin for the gain obtained by the proposed key management in comparison with the previously reported ones is due to the employed dedicated divide-and-conquer approach: (i) partition of the key tree into the sections which appears as a very powerful technique for obtaining improved characteristics of a key management scheme; (ii) performing overall key management based on a number of local (the section oriented) key managements; in a general case these key managements can be different.
References 1. M. Abdalla, Y. Shavitt and A. Wool, “Key management for restricted multicast using broadcast encryption”, IEEE/ACM Trans. Networking, vol. 8, pp. 443–454, Aug. 2000. 2. S. Banerjee and B. Bhattacharjee, “Scalable secure group communication over IP multicast”, IEEE Journal on Selected Areas in Communications, vol. 20, pp. 1511– 1527, Oct. 2002. 3. R. Canetti, T. Malkin and K. Nissim, “Efficient communication-storage tradeoffs for multicast encryption”, EUROCRYPT’99, Lecture Notes in Computer Science, vol. 1592, pp. 459–474, 1999. 4. K.-C. Chan and S.-H. Gary Chan, “Distributed server networks for secure multicast”, IEEE Journal on Selected Areas in Communications, vol. 20, pp. 1500–1510, Oct. 2002. 5. P. D’Arco and D.R. Stinson, “Fault tolerant and distributed broadcast encryption”, CT-RSA 2003, Lecture Notes in Computer Science, vol. 2612, pp. 263–280, 2003. 6. G. Di Crescenzo and O. Kornievskaia, “Efficient re-keying protocols for multicast encryption”, SCN 2002, Lecture Notes in Computer Science, vol. 2576, pp. 119–132, 2003. 7. U. Feige, “A threshold of ln(n) for approximating set cover”, Jour. ACM, vol. 45, pp. 634–652, July 1998. 8. A. Fiat and M. Naor, “Broadcast encryption”, Advances in Cryptology – CRYPTO’93, Lecture Notes in Computer Science, vol. 773, pp. 480–491, 1994. 9. J.A. Garay, J. Staddon and A. Wool, “Long-lived broadcast encryption”, CRYPTO 2000, Lecture Notes in Computer Science, vol. 1880, pp. 333–352, 2000. 10. M.R. Garey and D.S. Jonson, Computers and Intractability: A Guide to the Theory of NP-Completeness. San Francisco, CA: Freeman, 1979. 11. D. Halevy and A. Shamir, “The LCD broadcast encryption scheme”, CRYPTO 2002, Lecture Notes in Computer Science, vol. 2442, pp. 47–60, 2002. 12. J. Lotspiech, S. Nusser and F. Prestoni, “Broadcast encryption’s bright future”, IEEE Computer, (7 pages) August 2002. 13. J.H. Ki, H.J. Kim, D.H. Lee and C.S. Park, “Efficient multicast key management for stateless receivers”, ICISC 2002, Lecture Notes in Computer Science, vol. 2587, pp. 497–509, 2003. 14. N. Matsuzaki, T. Nakano and T. Matsumoto, “A flexible tree-based key management framework”, IEICE Trans. Fundamentals, vol. E86-A, pp. 129–135, 2003. 15. S. Mittra, “Iolus: A framework for scalable secure multicasting”, Proc. ACM SIGGCOM’97, pp. 277–288, Sept. 1997. 16. D. Naor, M. Naor and J. Lotspiech, “Revocation and tracing schemes for stateless receivers”, CRYPTO 2001, Lecture Notes in Computer Science, vol. 2139, pp. 41– 62, 2001. 17. R. Poovendran and J. S. Baras, “An information theoretic approach for design and analysis of rooted-tree-based multicast key management schemes”, IEEE Trans. Inform. Theory, vol. 47, pp. 2824–2834, Nov. 2001. 18. R. Poovendran and C. Bernstein, “Design of secure multicast key management schemes with communication budget constraint”, IEEE Communications Letters, vol. 6, pp. 108–110, March 2002. 19. D. Wallner, E. Harder and R. Agee, “Key management for multicast: Issues and architectures”, RFC 2627, http://www.ietf.org/rfc/rfc2627.txt 20. C.K. Wong, M. Gouda, and S.S. Lam, “Secure group communications using key graphs”, IEEE/ACM Trans. Networking, vol. 8, pp. 16–31, Feb. 2000.
Research on the Collusion Estimation

Gang Li and Jie Yang
Institute of Image Processing & Pattern Recognition, Shanghai Jiaotong University, Shanghai 20030 China
Abstract. Digital watermarking is now well accepted as an effective digital content marking technique, but it is still far from practical application. One reason is that a watermarking technique must be robust against malicious attacks, while the knowledge of attacks is limited. Here we propose a formulation of the collusion attack using an estimation-based concept. The algorithm aims at a high-probability estimate of the watermark, and it can also be used for the estimation of hidden information.
1 Introduction

The research on digital watermarking has concentrated on copyright protection. In this approach, a watermark is embedded into the host image as noise. When necessary, the watermark can be extracted. The difficulty of this application is to develop the robustness of the technique so that it survives various kinds of malicious attacks. On the other side, the research on malicious attacks has not been given enough emphasis. However, malicious attacks block the application of digital watermarking like a wall. A digital watermark cannot be applied to real copyright protection unless it can be proved that it is robust enough against all kinds of malicious attacks. For this purpose, a benchmark [7] should be proposed to certify the robustness of a watermarking scheme. But all this work should be based on research on the malicious attacks themselves. This paper describes our research on collusion estimation. With the collusion attack, at least two instances of the same watermarking algorithm and the same watermark are available. The estimation consists of two main stages: (a) determine the presence of any hidden information; (b) produce an estimate of the watermark with high probability. In further work, we want to use different parts of one watermarked image in place of two different watermarked images. In this way, the work can be applied to detect secret information hidden in one image.
2 Problem Formulation

Traditionally, watermark attacks try to remove the watermark from a watermarked image. The wide class of existing attacks can be divided into four main categories [1]:
removal attacks, geometrical attacks, cryptographic attacks and protocol attacks. The common methods include denoising, collusion, averaging, transforms, filtering and so on. However, this paper aims at retrieving the hidden watermark under the least requirements. According to our research, this work must be based on the following assumptions:
1) At least two instances of the same watermarking algorithm and the same watermark are available.
2) The watermark should have a distribution different from that of the host images in the feature subspace. In fact, this is an easy requirement; most images meet it.
3) The original host image, the watermark and the embedding method are not needed. In other words, this is an all-blind watermark estimation. The output of the estimation is a difference image, which is the effect of watermarking on the host image. Because the embedding method is not provided, we do not try to completely retrieve the watermark, but to estimate the difference image between the original and watermarked images.
Unobtrusiveness means that the watermark should be perceptually invisible, but it is difficult for the watermark to be invisible in the feature sub-space. That is the theoretical basis of our research. According to the assumptions above, there are mainly three difficulties blocking this work:
1) Faint signal: the intensity of the watermark is typically only about 0.1 times that of the host image, and during estimation we have no prior knowledge of the watermark. This is a big difficulty in estimation.
2) If we consider the watermark embedding as a linear embedding, the collusion estimation is a blind separation problem.
3) We believe that an ICA algorithm must be used to solve the blind separation problem. However, the ICA algorithm requires at least two observed channels. How to obtain the two channels is another difficulty.
3 Proposed Technique

Here, we first use ICA to decompose the host image into 160 channels. The aim of the decomposition is to solve the faint signal problem, because the watermark signal can be stronger than the host image signal in some channels. The independency of the ICA decomposition is important for watermark estimation: the existence of a watermark reduces the independency of two watermarked images. According to this rule, we can compute the watermark.
3.1 ICA Decomposition

ICA was proposed as a method to deal with problems related to the cocktail-party problem. Suppose the data $X = \{x_1, \ldots, x_n, \ldots, x_N\}$ are observed independently and generated by a mixture model [2], and the source component is $S = \{s_1, \ldots, s_m, \ldots, s_M\}$. Using a vector-matrix notation $A$, the mixture model can be described by:
$$X = AS = \sum_{i=1}^{M} a_i s_i \qquad (1)$$
where $A$ is an $M \times N$ matrix, and $a_i$ is the $i$th vector of $A$. $S$ is the source component, which is statistically independent and cannot be directly observed. The mixture model above is called the independent component analysis, or ICA, model. It describes how the observed data are generated by a process of mixing the source components. The approach to ICA can be described as estimating the mixing matrix $A$ and computing its inverse matrix $W$. Then the independent components can be simply obtained by:
$$\hat{s} = \hat{W} x \approx s \qquad (2)$$
In this way, the source components are statistically independent of each other, so $S$ are called the scales of the feature sub-space [3]. The decomposition of images by ICA has been discussed much. Olshausen and Field modeled visual data by a simple linear generative process [4]. In a similar way, Aapo Hyvarinen and Patrik Hoyer proposed a fast independent component analysis method for image data [5][6]. Extending the ICA model to image decomposition begins with the selection of images that are as "natural" as possible, because we wish to make an ICA decomposition model for all digital images. Then, from the images, a number of 16 by 16 image patches are sampled randomly. That means the dimensionality of the observed data $X$ is 256. The starting point for FastICA is an image patch $X$. The process can be divided into two steps: the preprocessing before decomposition and the extraction of independent components. The preprocessing includes two parts. The first part is to center $X$, in order to make $x$ a zero-mean variable. It can be formulated by
$$X = X - E(X) \qquad (3)$$
The other part is to whiten the observed data. In other words, the covariance matrix of $X$ equals the identity matrix:
$$E\{XX^T\} = I \qquad (4)$$
After the preprocessing, the inverse matrix $W$ can be computed by an iterative learning rule. The FastICA learning rule begins with a random matrix $W_0$, and finds a direction to maximize nongaussianity, then updates $W$. In Hyvarinen's paper, this iteration can be described by
$$w^{+} = E\{x g(w^T x)\} - E\{g'(w^T x)\} w \qquad (5)$$
where $w$ is a weight vector of $W$, $g(u) = \tanh(au)$, and $w$ should be normalized after every iteration. The mixing matrix $A$ in the ICA model can be computed as a set of basis images $A = \{a_1, \ldots, a_m, \ldots, a_M\}$ (see Fig. 1), and the ICA model for images can be denoted by Fig. 2. When decomposing an image, we divide the original image into 16 by 16 patches and reshape each patch into a one-dimensional $X$. If we multiply $X$ by the demixing matrix $W$, the coefficients $S$ of the 160 channels are obtained. On the other hand, if we multiply $S$ and $A$, the image is reconstructed.
Fig. 1. The ICA basis of patches computed by FastICA
Fig. 2. The linear synthesis model
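A minimal sketch of the patch-based decomposition described above is given below, using scikit-learn's FastICA as a stand-in for the authors' implementation; the synthetic image, the patch tiling, and the parameter choices are assumptions, while the 16×16 patch size and 160 channels follow the text.

```python
# Sketch: learn an ICA basis from 16x16 patches and obtain per-patch channel coefficients.
import numpy as np
from sklearn.decomposition import FastICA

def extract_patches(image: np.ndarray, size: int = 16) -> np.ndarray:
    h, w = image.shape
    patches = [image[i:i + size, j:j + size].ravel()
               for i in range(0, h - size + 1, size)
               for j in range(0, w - size + 1, size)]
    return np.asarray(patches, dtype=float)          # each row is a 256-dim patch

rng = np.random.default_rng(0)
image = rng.random((256, 256))                       # stand-in for a natural image
X = extract_patches(image)
X -= X.mean(axis=0)                                  # centering, cf. Eq. (3)

ica = FastICA(n_components=160, whiten="unit-variance", random_state=0, max_iter=500)
S = ica.fit_transform(X)                             # per-patch coefficients of 160 channels
A = ica.mixing_                                      # columns approximate the basis images
```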
3.2 Independency of Decomposition

The independency of the decomposition coefficients provides an estimation method. Suppose $C = \{c_{1,1,1}, \ldots, c_{k,i,j}, \ldots, c_{M,N,O}\}$ is the decomposition coefficient set of an image, $S_{i,j} = \{c_{1,i,j}, \ldots, c_{k,i,j}, \ldots, c_{M,i,j}\}$ is the decomposition coefficient sequence of one patch, and $T_k = \{c_{k,1,1}, \ldots, c_{k,i,j}, \ldots, c_{k,N,O}\}$ is the $k$th channel sequence. $M$ is the number of channels, and $N$ and $O$ are the numbers of rows and columns of patches. Suppose the independency of two vectors is weighted by (6):
$$IND(A, B) = \frac{A \cdot B}{\sqrt{A \cdot A} \times \sqrt{B \cdot B}} = \cos(\alpha) \qquad (6)$$
in which $\alpha$ is the angle between the two vectors. If $A$ is independent of $B$, $IND(A, B)$ is close to 0; on the other hand, if $A$ is similar or identical to $B$, $IND(A, B)$ is close to 1. The independency of the decomposition coefficients here refers to the fact that the $S_{i,j}$ of one image decomposition is independent of the $S_{i,j}$ of another image, and $T_k$ is in the same situation. Fig. 3 and Fig. 4 show the IND between two images.
Fig. 3. IND of T between two images (Mean: -0.0087 variance: 0.0085)
Fig. 4. IND of S between two images (Mean -0.0163 variance 0.0052)
From the figures above, we can deduce that the IND(A,B) of two independent images tends to 0. However, when a watermark is embedded in the two images, the IND(A,B) increases to some degree. Fig. 5 and Fig. 6 show the situation of the same two images after the watermark has been added.
Fig. 5. The IND of T between two watermarked images (Mean: 0.1025 variance: 0.0189)
Fig. 6. The IND of S between two watermarked images (Mean: 0.1181 variance: 0.0153)
From the figures above, we can see that the mean and variance increase. That can help us to decide whether a watermark has been embedded in the two images.
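The test suggested by Figs. 3-6 can be sketched as follows: compute the IND of Eq. (6) channel by channel and compare its mean against the statistics observed for unwatermarked images. The arrays and the toy check here are illustrative assumptions.

```python
# Sketch of the IND test: cosine similarity between the k-th channel sequences of
# two images; a raised mean IND over the channels hints at a shared watermark.
import numpy as np

def ind(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def mean_channel_ind(S1: np.ndarray, S2: np.ndarray) -> float:
    # S1, S2: (num_patches, num_channels) coefficient arrays; column k is T_k.
    return float(np.mean([ind(S1[:, k], S2[:, k]) for k in range(S1.shape[1])]))

# Toy check: independent noise gives IND near 0, a shared additive component raises it.
rng = np.random.default_rng(1)
S1, S2 = rng.normal(size=(256, 160)), rng.normal(size=(256, 160))
shared = rng.normal(size=(256, 160)) * 0.5
print(mean_channel_ind(S1, S2), mean_channel_ind(S1 + shared, S2 + shared))
```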
3.3 Computing the Watermark

The computation of the watermark is based on the rule that the embedding process is a linear addition in the ICA decomposition sub-space. The demonstration can be seen in Fig. 7. In the ICA decomposition sub-space, we consider $T_k$ as a vector of the $k$th channel, and use $T_k$ as the example here. In Fig. 7 the vectors $OA$ and $OB$ represent the $T_k$ of two original images, and they are orthogonal. $AA'$ and $BB'$ represent the $T_k$ of the watermark, and $AA'$ is the same as $BB'$. Because we regard the embedding process as a linear addition, $OA'$ and $OB'$ are the watermarked images, which are what we can obtain during estimation. We can also consider that the length of the watermark vector is short.
Fig. 7. A demonstration of embedding process in ICA decomposition sub-space
When computing, we can suppose the length of the watermark, and the problem is how to determine the direction of the watermark vector. All the potential points compose a circle
$$P = \{\, p \mid \| p - a \| = Length_w \,\} \qquad (7)$$

Fig. 8. A demonstration of the estimation

According to the independency of two independent images, we want to find a pair of points at the same position on the two circles such that the two vectors are orthogonal to each other (see Fig. 8):
$$Vec_W = \{\, \alpha \mid (OA - \alpha) \perp (OB - \alpha) \,\} \qquad (8)$$
where $Vec_w$ is the estimation of the $k$th channel of the watermark; the error allowed in the orthogonality condition is the variance of $IND(A, B)$ of the original images.
The work to find $Vec_w$ is onerous because of the high dimension. Here, we treated the problem as a global optimization problem and used a GA (Genetic Algorithm) to find $Vec_w$. The algorithm is effective; of course, there may be more efficient algorithms, but that is not the most important issue in this paper.
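A toy version of this search, with a simple random-mutation loop standing in for the GA and made-up vectors, is sketched below: it looks for a candidate watermark vector of fixed length that makes the two de-watermarked channel vectors nearly orthogonal, in the spirit of Eqs. (7)-(8). Everything here is an illustrative assumption, not the authors' GA.

```python
import numpy as np

def estimate_watermark(OA, OB, length, iters=5000, seed=0):
    rng = np.random.default_rng(seed)
    cand = rng.normal(size=OA.shape)
    cand *= length / np.linalg.norm(cand)
    best, best_err = cand, np.inf
    for _ in range(iters):
        trial = cand + rng.normal(scale=0.05 * length, size=OA.shape)   # mutate
        trial *= length / np.linalg.norm(trial)                          # keep |alpha| fixed
        err = abs(np.dot(OA - trial, OB - trial))                        # orthogonality error
        if err < best_err:
            best, best_err, cand = trial, err, trial
    return best, best_err

rng = np.random.default_rng(2)
w = rng.normal(size=64); w *= 1.0 / np.linalg.norm(w)          # hidden watermark, length 1
a = rng.normal(size=64)
b = rng.normal(size=64); b -= (b @ a) / (a @ a) * a            # originals made orthogonal, per the model
est, err = estimate_watermark(a + w, b + w, length=1.0)
```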
4 Simulation Experiments

In the sections above, we demonstrated the whole procedure of collusion estimation. In this section, we investigate the performance of the algorithm using a simulation. For simplicity, we used a direct DWT-based watermarking algorithm and two natural images as host images.
Fig. 9. A simulation experiment: (A) watermark; (B) one host image; (C) the other host image.
Fig. 10. The result of the simulation experiment: (A) the target difference image; (B) the estimation result.
The embedding method is a common DWT-based watermarking algorithm. We used the collusion estimation algorithm above to estimate the difference image. The result is shown in Fig. 10.
5 Conclusion

A watermark estimation based on collusion has been presented in this paper. The most interesting feature is that it is an all-blind watermark attack and it can give an estimation of the watermark. A simulation has been reported to demonstrate the effect of the estimation. The attack method tells us that if a watermark is embedded into different host images, it must have features similar to the host image to ensure that the watermark remains unobtrusive in the feature sub-space.
References
1. M. Kutter: A fair benchmark for image watermarking systems. Electronic Imaging'99, Security and Watermarking of Multimedia Contents, San Jose, CA, USA, Vol. 3657, Jan 1999, 219–239.
2. Comon, Pierre: Independent component analysis. A new concept? Signal Processing, Vol. 36, Apr 1994, 287–314.
3. Hyvarinen, A., Hoyer, P.: Emergence of complex cell properties by decomposition of natural images into independent feature subspaces. ICANN 99. Ninth International Conference on (Conf. Publ. No. 470), Vol. 1, 1999, 257–262.
4. Olshausen, B.A., Field, D.J.: Sparse coding with an overcomplete basis set: A strategy employed by V1? Vision Research, Vol. 37, 3311–3325.
5. Hyvarinen, A.: Fast ICA for noisy data using Gaussian moments. ISCAS '99. Proceedings of the 1999 IEEE International Symposium on, Vol. 5, 1999, 57–61.
6. Hyvarinen, A., Cristescu, R., Oja, E.: A fast algorithm for estimating overcomplete ICA bases for image windows. IJCNN '99. International Joint Conference on, Vol. 2, Jul 1999, 894–899.
7. S. Voloshynovskiy, S. Pereira and V. Iquise: Attack modeling: towards a second generation watermarking benchmark. Signal Processing, Special Issue: Information Theoretic Issues in Digital Watermarking, May 2001, 1177–1214.
Multiple Description Coding for Image Data Hiding Jointly in the Spatial and DCT Domains

Mohsen Ashourian¹ and Yo-Sung Ho²

¹ Azad University of Iran, Majlesi Branch, P.O. Box 86315-111, Isfahan, Iran
[email protected]
² Kwangju Institute of Science and Technology (K-JIST), 1 Oryong-dong Puk-gu, Kwangju, 500-712, Korea
[email protected]
Abstract. In this paper, we propose a new method for hiding a signature image in the host image. We encode the signature image by a balanced two-description subband coder and embed the descriptions in the different portions of the host image. We split the host image into two images from its even and odd rows, and embed the information of one signature description in the first portion of the host image in the spatial domain, and the other description in the second portion in the DCT domain. In both cases, we employ proper masking operation to reduce visibility of embedded information in the host image. At the receiver, the multiple description decoder combines the information of each description to reconstruct the original signature image. We experiment the proposed scheme for embedding gray-scale signature images of 128×128 pixels in the gray-scale host image of 512×512 pixels, and evaluate the system robustness to various attacks.
1 Introduction

In data hiding schemes, perceptually invisible changes are made to image pixels for embedding additional information [1]. Data hiding can be used to embed control or reference information in digital multimedia data for various applications, such as tracking the use of a particular video for pay-per-view, billing for commercials in audio/video broadcast, and for watermarking. Unlike traditional encryption methods where it is obvious that some information is encoded, perceptually invisible data hiding in image or video offers an alternative approach for secret information transmission. Main features of the image data hiding scheme are the method of encoding a signature image and the way to embed the signature information into the host information. In the image hiding method given by Chae and Manjunath [2], the signature image is encoded using lattice vector quantization of its subbands. An improved version of the above system using channel optimized vector quantization for the signature signal encoding is also suggested [3]. Both methods are robust to JPEG compression
and addition of noise; however, they are not robust to some attacks, such as cropping and down-sampling. In this paper, we suggest to use a multiple description coding method for encoding the signature image and embedding the information of the two descriptions in both the spatial and DCT domains of the host image. The main advantage of encoding the signature image by two descriptions and embedding these descriptors in the host signal is that with an appropriate strategy, we can reconstruct a high quality signature signal when we receive both descriptions without any error. On the other hand, if the host signal is attacked, we can retrieve a less corrupted description from the host image and reconstruct an acceptable quality signature image using the less corrupted description. After we provide an overview of the proposed image hiding system in Section 2, we explain the encoding process of the signature image using multiple description coding in Section 3. Section 4 and Section 5 explain the data embedding and extraction processes respectively. Finally we present experimental results of the proposed scheme in Section 6, and summarize the paper in Section 7.
2 Overview of the Proposed Method

Fig. 1 shows the overall structure of the proposed system for signature image embedding. We encode the signature image using a two-description subband coder. The outputs of the two descriptions are represented by $D_o$ and $D_e$. The host image is divided into two parts formed from its odd and even rows, $I_o$ and $I_e$, which are analogous to two communication channels. The bit stream of the first description, $D_o$, is embedded in the spatial domain of $I_o$, and the bit stream of the other description, $D_e$, is embedded in the DCT domain of $I_e$. Fig. 2 shows the block diagram of recovering the signature image at the receiver. We use the original host image and the received host image to recover the two descriptions, and reconstruct the signature image using the MDC subband decoder.
3 Multiple Description Coding of the Signature Image

Multiple description coding (MDC) was originally proposed for speech transmission over noisy channels [4]. El-Gamal and Cover provided the information-theoretic analysis of MDC [5], and Vaishampayan devised a method for the multiple description scalar quantizer design [6]. Recently, MDC has been studied as an approach for transmission of compressed visual information over error-prone environments [7]. Various MDC schemes for images have been proposed for wireless and computer network applications [7]. In this paper, we develop a fixed-rate MDC subband image coder using multiple description scalar quantization for the subband signals.
Fig. 1. Signature image embedding in the host image
Fig. 2. Signature image recovery
In the first stage for signature image encoding, we decompose the signature image using the Haar wavelet transform, resulting in four subbands usually referred to as LL, LH, HL and HH. Except for the lowest frequency subband (LL), the probability density function (PDF) for other subbands can be closely approximated with the Laplacian distribution. Although the LL subband does not follow any fixed PDF, it contains the most important visual information. We use a phase scrambling operation to change the PDF of this band to a nearly Gaussian shape [8]. Fig. 3 gives the block schematic of the phase scrambling method. As shown in Fig. 3, the fast Fourier transform (FFT) operation is performed on the subband and then a pseudo-random noise is added to the
phase of its transformed coefficients. The added random phase could be an additional secret key between the transmitter and the registered receiver. We encode the subbands using a PDF-optimized two-description scalar quantizer, assuming the Laplacian distribution for the high frequency bands and the Gaussian distribution for the LL subband after phase scrambling. We devise index assignment schemes for the subband scalar quantizers with different output bit-rates [6]. A sample index assignment for the three-bit quantizer is shown in Fig. 4, where rows and columns are the quantization indices of the first and second descriptions.
Fig. 3. Phase-scrambling of the lowest frequency subband (the subband is transformed by FFT, a random phase is added, and the inverse FFT is applied)
Fig. 4. Sample of index assignment used for subband multiple description scalar quantizers
In this paper, we set the image encoding bit-rate at three bits per sample (bps) and obtain PSNR values over 31 dB for the different tested images, which is satisfactory for image hiding applications [1]. We use an integer bit-allocation scheme among the four subbands based on their energies. The subband energy information (15 bits) can be sent as side information, or it can be encoded with a highly robust error correction method and embedded in the host image. We use the folded binary code (FBC) to represent the output indices of the quantizers for higher error resilience. We then scramble the output indices of each description and arrange them as two binary sequences D_e = {d_{e,1}, d_{e,2}, ..., d_{e,n}} and D_o = {d_{o,1}, d_{o,2}, ..., d_{o,n}}. In order to embed the data, we convert the binary elements of the sequences to bipolar bits by mapping each bit from {0, 1} to {−1, 1}.
4 Data Embedding in the Host Image

Data embedding in the host image can be performed in the spatial or the frequency domain [1]. While data embedding in the spatial domain is more robust to geometrical attacks, such as cropping and down-sampling, data embedding in the frequency domain usually offers more robustness to signal processing attacks, such as noise addition, compression and lowpass filtering [1]. As shown in Fig. 1, we embed data in both the spatial and the DCT domain. We form two images, Ie and Io, from the even and odd rows of the host image. One description of the signature image is embedded in the spatial domain of Io, and the other description is embedded in the DCT domain of Ie. In effect, the transmission channels for the two signature image descriptions are Ie and Io. In the proposed system, we need the host image at the receiver for signature image recovery; however, using different methods for embedding information in the texture area of the host image [1], the system can easily be extended to blind image hiding applications.

4.1 Data Embedding in the Spatial Domain

We embed each element of the binary sequence D_o = {d_{o,1}, d_{o,2}, ..., d_{o,n}} in a pixel x_{i,j} ∈ I_o by

x̂_{i,j} = x_{i,j} + M(i,j) · α_o · d_{o,k}    (1)

where the positive scaling factor α_o determines the modulation amplitude of the watermark signal in the spatial domain, and M(i,j) is a spatial masking factor derived from the normalized absolute value of the gradient G(i,j) at x_{i,j}:

M(i,j) = 0.5 · (1 + G(i,j))    (2)
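A minimal sketch of the spatial-domain embedding of Eqs. (1)–(2) (Python/NumPy; the exact gradient normalization and the value of α_o are assumptions not specified in the text):

```python
import numpy as np

def spatial_embed(I_o, bits_o, alpha_o=2.0):
    """Embed bipolar bits d_o in the odd-row image I_o following Eqs. (1)-(2).

    bits_o : 1-D array of {-1, +1} values, one per pixel in row-major order.
    """
    gy, gx = np.gradient(I_o.astype(float))
    G = np.hypot(gx, gy)
    G = G / (G.max() + 1e-12)          # normalized absolute gradient (one possible reading of Eq. 2)
    M = 0.5 * (1.0 + G)                # Eq. (2)
    d = bits_o.reshape(I_o.shape)
    return I_o + M * alpha_o * d       # Eq. (1)
```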
4.2 Data Embedding in the DCT Domain

We embed the second description of the signature image in the second portion of the host image, Ie. We distribute the bit stream D_e = {d_{e,1}, d_{e,2}, ..., d_{e,n}} among the 8×8 pixel blocks. The new DCT coefficients of the k-th block, Ŵ^k_{i,j}, are obtained from the original coefficients W^k_{i,j} by

Ŵ^k_{i,j} = W^k_{i,j} + N_k(i,j) · α_e · d_{e,m}    (3)

where N_k is a masking matrix derived from the DCT coefficients of each block using the Watson model [9], and the positive scaling factor α_e determines the modulation amplitude of the embedded signal in the DCT domain. In practice, since the signature image is smaller than the host image, we only embed data in the DCT coefficients of the middle frequency bands.
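The block-DCT embedding of Eq. (3) can be sketched as follows (Python with SciPy; the choice of middle-frequency positions and the constant masking value stand in for the Watson-model mask actually used in the paper):

```python
import numpy as np
from scipy.fft import dctn, idctn

# Hypothetical choice of "middle frequency" positions inside an 8x8 DCT block.
MID_BAND = [(2, 3), (3, 2), (3, 3), (2, 4), (4, 2)]

def dct_embed(I_e, bits_e, alpha_e=4.0):
    """Embed bipolar bits d_e in the 8x8 DCT blocks of I_e following Eq. (3)."""
    out = I_e.astype(float).copy()
    m = 0
    for r in range(0, I_e.shape[0] - 7, 8):
        for c in range(0, I_e.shape[1] - 7, 8):
            block = dctn(out[r:r+8, c:c+8], norm='ortho')
            for (i, j) in MID_BAND:
                if m >= len(bits_e):
                    break
                N_k = 1.0                                   # placeholder for the Watson-model mask
                block[i, j] += N_k * alpha_e * bits_e[m]    # Eq. (3)
                m += 1
            out[r:r+8, c:c+8] = idctn(block, norm='ortho')
    return out
```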
5 Signature Image Recovery

Fig. 2 shows the process of signature image recovery. We use the original host image and the received host image to derive the even portions (I_e, Î_e) and the odd portions (I_o, Î_o). To recover the description embedded in the spatial domain, we use the original image pixels x_{i,j} ∈ I_o and the received image pixels x̂_{i,j} ∈ Î_o and extract the embedded bits by

d̂_{o,k} = 0.5 · ( sign( (x̂_{i,j} − x_{i,j}) / (α_o · M(i,j)) ) + 1 )    (4)

Since M(i,j) and α_o are positive parameters, Eq. 4 can be simplified to

d̂_{o,k} = 0.5 · ( sign( x̂_{i,j} − x_{i,j} ) + 1 )    (5)

Similarly, we derive the description embedded in the DCT domain from the difference between the DCT coefficients of the received image and the original DCT coefficients of I_e:

d̂_{e,m} = 0.5 · ( sign( (Ŵ^k_{i,j} − W^k_{i,j}) / (α_e · N_k(i,j)) ) + 1 )    (6)

Since N_k(i,j) and α_e are positive parameters, Eq. 6 can be simplified to

d̂_{e,m} = 0.5 · ( sign( Ŵ^k_{i,j} − W^k_{i,j} ) + 1 )    (7)
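A corresponding extraction sketch for Eqs. (5) and (7) (Python/NumPy; it assumes the same block scan and coefficient positions as the embedding sketches above, and breaks sign ties towards 0):

```python
import numpy as np
from scipy.fft import dctn

def extract_spatial_bits(I_o, I_o_received):
    """Recover bits embedded in the spatial domain (Eq. 5): 1 where the pixel grew, else 0."""
    diff = I_o_received.astype(float) - I_o.astype(float)
    return (diff > 0).astype(int).ravel()

def extract_dct_bits(I_e, I_e_received, positions):
    """Recover bits embedded in the DCT domain (Eq. 7) from the given block positions."""
    bits = []
    for r in range(0, I_e.shape[0] - 7, 8):
        for c in range(0, I_e.shape[1] - 7, 8):
            d = dctn(I_e_received[r:r+8, c:c+8].astype(float), norm='ortho') \
                - dctn(I_e[r:r+8, c:c+8].astype(float), norm='ortho')
            bits.extend(int(d[i, j] > 0) for (i, j) in positions)
    return np.array(bits)
```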
The subband quantization indices are obtained by proper arrangement of the extracted bits. Owing to the multiple description scheme used for information embedding, we can reconstruct three signature images: one from each description alone and one from their combination. The receiver uses the index assignment, as illustrated for the three-bit quantizer in Fig. 4, and reconstructs each subband. When the reconstructed indices of the two descriptions are far apart, we assume that one of the two descriptions has been heavily corrupted by noise; in that case, by comparing the MSE between the original host image and the received one in the areas that carry each description, we can decide which index should be selected.
6 Experimental Results and Analysis

In our scheme, the host image should be at least 6 times larger in size than the signature image, because we use two descriptions with three bits per pixel quantization. We use a gray-scale host image of 512×512 pixels and a signature image of 128×128 pixels. We use the "Lena" image as the host image for all the experiments. In order to control the host image distortion caused by data embedding, we can change the embedding factors in the spatial and DCT domains. We set the two modulation factors, α_e and α_o, such that the host image PSNR stays above 35 dB in our experiments. Fig. 5 shows the host image after data embedding.
Fig. 5. The host image after data embedding
We arrange two series of experiments. For the image hiding application, two images, "Barbara" and "Elaine", are used as signature images, and for the watermarking application, the "IEEE" logo image is used. Fig. 6 shows the reconstructed signature images and Fig. 7 shows the reconstructed logo image. For data hiding in image transmission applications, PSNR values of the reconstructed signature images are given. For copyright protection, we should make a binary decision on the presence or absence of the signature image, because the presence of the signature matters more than the quality of the reconstructed image. We define the similarity factor between the recovered logo image ŝ(m,n) and the original signal s(m,n) as

ρ = Σ_{m,n} ŝ(m,n) s(m,n) / Σ_{m,n} ŝ(m,n)²    (8)
Based on the value of ρ, we make a decision on the presence (ρ close to 1) or absence (ρ close to 0) of the logo image. We provide PSNR values and ρ for several main types of attacks to evaluate the system performance.
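For reference, the similarity factor of Eq. (8) is straightforward to compute (Python/NumPy):

```python
import numpy as np

def similarity_factor(s_hat, s):
    """Similarity factor of Eq. (8): cross-correlation normalized by the energy of the recovered logo."""
    s_hat = s_hat.astype(float)
    s = s.astype(float)
    return float(np.sum(s_hat * s) / np.sum(s_hat ** 2))
```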
Robustness to Gaussian Noise: We add Gaussian noise with different variances to the normalized host signal after signature embedding. Fig. 8 shows the PSNR values of the signature images for additive noise with different variances. From Fig. 8 we conclude that, for a certain range of noise, our strategy shows good performance in resisting Gaussian noise for data hiding applications.
Fig. 6. Reconstructed signature images
Fig. 7. Reconstructed logo image
Fig. 8. PSNR variation of recovered signature images for additive Gaussian noises
Fig. 9 shows the value of the similarity factor (ρ) for the hidden logo. We can see that even at high additive noise levels the ρ value is higher than 0.75, which means the watermark can still be recovered.
Fig. 9. Similarity factor variation of logo image for additive Gaussian noises
Resistance to JPEG Compression: The JPEG lossy compression algorithm with different quality factors (Q) is tested. Fig. 10 shows the PSNR variation for different Q factors and Fig. 11 shows the similarity factor variation due to JPEG compression for the logo image. As shown in these figures, PSNR values drop sharply for Q smaller than 50, and ρ drops for Q smaller than 40.
Fig. 10. PSNR variation of recovered signature images due to JPEG compression
Fig. 11. Similarity factor variation of recovered logo image due to JPEG compression
Resistance to Median and Gaussian Filtering: Median and Gaussian filters of 3×3 mask size are implemented on the host image after embedding the signature. We choose the Gaussian filter standard deviation equal to 0.5. PSNR values of recovered signature image are listed in Table 1, and the similarity factors for the recovered logo image are listed in Table 2. Table 1. PSNR (dB) values of the recovered signature images after implementing median and Gaussian filters on the host image
            Median Filter    Gaussian Filter
Barbara     21.90            26.80
Elaine      20.65            25.82
Table 2. Similarity factor values of the recovered logo images after implementing median and Gaussian filters on the host image
      Median Filter    Gaussian Filter
ρ     0.80             0.85
Resistance to Cropping: In our experiment, we have cropped parts of the host image corners. Fig. 12 shows a sample of the host image after 20% cropping. We fill the cropped area with the average value of the remaining part of the image. Table 3 shows the PSNR values and Table 4 shows the similarity factor when some parts of the host image corners are cropped. The considerably good resistance is due to the existence of two descriptions in the image and the scrambling of the embedded information, which make it possible to partly reconstruct the signature information lost in the cropped area from the description available in the non-cropped area.
Fig. 12. Sample of the host image with embedded data after 20% cropping

Table 3. PSNR (dB) values of the recovered signature image for different percentages of cropping of the host image

            5%       10%      15%      20%
Barbara     24.58    22.42    21.60    20.92
Elaine      24.15    23.04    22.10    20.01
Table 4. Similarity Factor values of the recovered logo image for different percentage of cropping the host image
      5%      10%     15%     20%
ρ     0.92    0.84    0.76    0.69
Resistance to Down-sampling: Table 5 shows the PSNR values of the recovered signature images, and Table 6 shows the similarity factor of the logo image, after several down-sampling factors. Due to the loss of information in the down-sampling process, the host image cannot be recovered perfectly after up-sampling. However, it is possible to recover the signature image from the available host image pixels in the spatial domain.

Table 5. PSNR (dB) values of the recovered signature image after different amounts of down-sampling of the host image
            1/2      1/4     1/8
Barbara     27.18    21.1    18.2
Elaine      28.03    21.3    16.7
Table 6. Similarity factor of the recovered logo image after different amounts of down-sampling of the host image

      1/2     1/4     1/8
ρ     0.82    0.76    0.67
7 Conclusion
We have presented a new image hiding scheme for embedding a gray-scale image into another gray-scale image, based on multiple description subband image coding and data embedding jointly in the spatial and DCT domains. We examined the system performance for signature image embedding in another image for secure transmission, and for logo image embedding for watermarking purposes. As the results show, multiple description coding of the signature image and embedding in different domains make it possible to recover the signature signal with good quality even when the host image undergoes different geometrical and signal processing operations. The system performance could be further improved by estimating the image data hiding capacity in the different domains [10] and using it for optimum bit allocation among the descriptions. Acknowledgements. This work was supported in part by Kwangju Institute of Science and Technology (K-JIST), in part by the Korea Science and Engineering Foundation (KOSEF) through the Ultra-Fast Fiber-Optic Networks (UFON) Research Center at K-JIST, and in part by the Ministry of Education (MOE) through the Brain Korea 21 (BK21) project.
References
1. Petitcolas, F.A.P., Anderson, R.J., and Kuhn, M.G.: Information Hiding – a Survey. Proceedings of the IEEE, Vol. 87, No. 7, (1999) 1062–1078.
2. Chae, J.J., and Manjunath, B.S.: A Robust Embedded Data from Wavelet Coefficients. Proceedings of SPIE, Storage and Retrieval for Image and Video Databases VI, (1998) 308–317.
3. Mukherjee, D., Chae, J.J., Mitra, S.K., and Manjunath, B.S.: A Source and Channel-Coding Framework for Vector-Based Data Hiding in Video. IEEE Transactions on Circuits and Systems for Video Technology, Vol. 10, No. 6, (2000) 630–645.
4. Jayant, N.S.: Sub-sampling of a DPCM Speech Channel to Provide Two Self-contained Half-rate Channels. Bell System Technical Journal, Vol. 60, No. 4, (1981) 501–509.
5. El-Gamal, A.A., and Cover, T.M.: Achievable Rates for Multiple Descriptions. IEEE Trans. on Information Theory, Vol. 28, No. 11, (1982) 851–857.
6. Vaishampayan, V.A.: Design of Multiple Description Scalar Quantizers. IEEE Trans. on Information Theory, Vol. 39, No. 5, (1993) 821–834.
7. Goyal, V.K.: Multiple Description Coding: Compression Meets the Network. IEEE Signal Processing Magazine, Vol. 18, Issue 5, (2001) 74–93.
8. Kuo, C.C.J., and Hung, C.H.: Robust Coding Technique – Transform Encryption Coding for Noisy Communications. Optical Engineering, Vol. 32, No. 1, (1993) 150–153.
9. Wolfgang, R.B., Podilchuk, C.I., and Delp, E.J.: Perceptual Watermarks for Digital Images and Video. Proceedings of the IEEE, Vol. 87, No. 7, (1999) 1108–1126.
10. Moulin, P., and O'Sullivan, J.A.: Information-theoretic Analysis of Information Hiding. IEEE Transactions on Information Theory, Vol. 49, No. 3, (2003) 563–593.
Protocols for Malicious Host Revocation Oscar Esparza, Miguel Soriano, Jose L. Muñoz, and Jordi Forné Department of Telematics Engineering. Technical University of Catalonia. C/ Jordi Girona 1 i 3. Campus Nord, Mod C3, UPC. 08034 Barcelona. Spain. {oscar.esparza, soriano, jose.munoz, jforne}@entel.upc.es
Abstract. Mobile agents are software entities that consist of code, data and state, and that can migrate autonomously from host to host executing their code. Security issues restrict the use of code mobility despite its benefits. The protection of mobile agents from the attacks of malicious hosts is considered by far the most difficult security problem to solve in mobile agent systems. Using a Trusted Third Party in the mobile agent system can help to solve this problem. The Host Revocation Authority [2] is a TTP that keeps track of which hosts acted maliciously in the past and have for this reason been revoked. Each agent sender consults the HoRA before sending an agent in order to remove all the malicious hosts from the agent's itinerary. Accordingly, the revoked hosts will not receive mobile agents any more. This paper presents two new protocols that can be used to revoke malicious hosts.
1 Introduction
Mobile agents are software entities that move code, data and state to remote hosts, and that can migrate from host to host performing actions autonomously on behalf of a user. The use of mobile agent technology saves bandwidth and permits off-line and autonomous execution, in comparison with conventional distributed systems based on message passing. As a consequence, mobile agents are especially useful for performing functions automatically in almost all electronic services. Despite their benefits, the massive use of mobile agents is restricted by security issues. We have two main entities in this scenario, the agent and the host. Protection is necessary when trustworthy relationships between entities cannot be assured, so these are the main cases that can be found:
– The agent attacks the host: host protection from malicious agent attacks can be achieved by using sand-boxing techniques and proper access control.
– Communication security: the agent's protection while it is migrating from host to host can be achieved with cryptographic protocols, like TLS.
– The host attacks the agent: there is no published solution that protects mobile agents completely from the attacks of an executing host. This kind of attack is known as the problem of malicious hosts.
This paper introduces two new protocols that help to solve the problem of malicious hosts by using a Host Revocation Authority (HoRA from here on). The
HoRA was introduced in [2] and must be considered an independent Trusted Third Party (TTP) in a mobile agent system, just as the Certification Authority is in the Public Key Infrastructure (PKI). The HoRA stores in a list the identifiers of those hosts that have been proven malicious and hence have been revoked. Before sending an agent, each origin host consults the revocation information (1) by asking the HoRA directly or (2) by consulting a local copy of the list of revoked hosts, so that all the revoked hosts can be deleted from the agent's itinerary. As a result, the revoked hosts will not execute agents any more. The origin hosts can use the two new protocols to revoke a host by demonstrating to the HoRA that it acted maliciously. The paper is organized as follows: Section 2 presents state-of-the-art solutions to the problem of malicious hosts; Section 3 details how the HoRA works; Section 4 presents the host revocation protocols; and finally, some conclusions can be found in Section 5.
2 Malicious Hosts
The attacks performed by a malicious host that is executing the mobile agent are considered, by far, the most difficult problem to solve regarding mobile agent security. On the one hand, it is possible to assure the integrity and authentication of code, data or results that come from other hosts by using digital signature or encryption techniques. On the other hand, it is difficult to detect or prevent the attacks performed by a malicious host during the agent's execution, i.e. to guarantee execution integrity. Malicious hosts could try to get some profit from the agent by reading or modifying the code, the data, the communications or even the results, due to their complete control over the execution. The agent cannot hold a decryption key because the hosts could read it. Furthermore, there is no guarantee that the host runs the complete code correctly, or that it does not simply block the migration to other hosts. There are two types of approaches: (1) attack detection approaches, whose aim is detection during or after the attack; and (2) attack avoidance approaches, which try to avoid the attacks before they happen.
2.1 Attack Detection Approaches
Attack detection approaches permit the origin host to know whether its agent was tampered with during or after the execution due to illegal modifications of code, data or execution flow. In [4], Minsky et al. introduce the idea of replication and voting. In each stage, hosts execute the agent in parallel and send several replicas of the agent to a set of independent hosts in the next stage. This implies a waste of resources that makes the solution impractical. In [8], Vigna introduces the idea of cryptographic traces. The running agent takes traces of the instructions that alter the agent's state due to external variables. If the agent owner suspects that a host acted maliciously and wants to verify the execution, it asks for the traces and executes the agent again. Therefore, the executing hosts
must store the traces for an indefinite period of time because the origin host can ask for them. Furthermore, verification is performed only in case of suspicion, but how a host becomes suspicious is not explained. In [1], the authors introduce a protocol for detecting suspicious hosts by limiting the agent's execution time. Using this suspicion detection protocol jointly with the cryptographic traces approach, it is possible to detect suspicious hosts and to ask for the traces as soon as the agent returns to the origin host. In our opinion, attack detection approaches are not enough on their own. These kinds of mechanisms must be combined with punishment policies. A host will turn to malicious behavior only if the benefits of tampering with the agent are greater than the punishment, so the harder the punishment, the fewer attacks will be performed by the hosts. Little attention has been paid to punishment mechanisms in mobile agent systems. In [2], the HoRA was introduced as a TTP that fills the lack of an entity with punishment capabilities. The HoRA stores in a list the identifiers of those hosts that have been proven malicious, and for this reason have been revoked. In this sense, the punishment lies in preventing the revoked hosts from executing agents.
2.2 Attack Avoidance Approaches
Detection techniques are not useful for services where the benefits of tampering with a mobile agent are greater than the possible punishment. In those cases, only attack avoidance approaches can be used. Unfortunately, there is no current approach that avoids attacks completely. Yee introduces the idea of a closed tamper-proof hardware subsystem [9] where agents can be executed in a secure way, but this forces each host to buy hardware equipment. Environmental key generation [5] makes the agent's code impossible to decipher until the proper conditions occur in the environment, but this requires the host to monitor the environment continuously. Roth presents the idea of cooperative agents [6] that share secrets and decisions and have disjoint itineraries. This makes collusion attacks difficult, but not impossible. Hohl presented obfuscation [3] as a mechanism to assure execution integrity during a period of time, but this time depends on the computation capacity of the malicious host. The use of encrypted programs [7] is proposed as the only way to give privacy and integrity to mobile code. Hosts execute the encrypted code directly, and a decryption function is used when the agent reaches the origin host to recover the results. The difficulty here is to find functions that can be executed in an encrypted way.
3 Host Revocation Authority
The HoRA [2] must be considered an independent TTP in a mobile agent system, just as the Certification Authority is in the PKI. The HoRA keeps track of those hosts that have been proven malicious and hence have been revoked. Before sending an agent, each origin host consults the revocation information
(1) by asking the HoRA directly or (2) by consulting a local copy of the list of revoked hosts, so that all the revoked hosts can be deleted from the itinerary. As a result, the revoked hosts will not execute agents any more. This mechanism can be considered neither a detection approach nor an avoidance approach, but a blend of both. The first attack performed by a host cannot be avoided, but if the agent sender proves that the host acted maliciously, this host will be revoked, so any further attack from this malicious host will be avoided. In this section we include a brief description of the tasks that the HoRA must perform. A more detailed explanation of this topic can be found in [2]. The two main tasks that the HoRA must perform are:
– Keeping the revocation information: the aim of host revocation is to distinguish the malicious hosts from the honest ones. Unfortunately, it is not possible to know whether an honest host will turn to malicious behavior in the current transaction. However, it is possible to know if a host acted maliciously in the past. The HoRA knows which hosts have been revoked by saving their host identifiers in a list.
– Revoking malicious hosts: it is possible to revoke a host if some proofs of its malicious behavior can be found. In Section 4 we introduce two new protocols that can be used by the origin hosts to revoke malicious hosts.
Additionally, the HoRA performs a set of jobs that depends on the way the origin hosts consult the revocation information. Assuming that the HoRA works in a similar way as the Certification Authority regarding certificate revocation, two possible revocation policies can be followed (a sketch of the first one is given after this list).
– Off-line Revocation Policy: it is based on the distribution of revocation information using a Host Revocation List (HRL from here on), i.e. a list of revoked host identifiers signed by the HoRA. Origin hosts must download a copy of the HRL in order to consult it before executing an agent. Origin hosts must also update the list periodically to take into account new malicious hosts. In this sense, the HRL works in a similar way as the Certificate Revocation List in the PKI.
– On-line Revocation Policy: before sending a mobile agent, each origin host asks the HoRA whether there are any revoked hosts in the agent's itinerary. The HoRA sends a signed response to the origin host pointing out which hosts have been revoked. This mechanism works in a similar way as the Online Certificate Status Protocol used in the PKI.
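As a rough illustration of the off-line policy, the following sketch (Python; the HRL signature check is abstracted behind a callable, and all names are illustrative rather than taken from [2]) removes revoked hosts from an itinerary using a local copy of the HRL:

```python
def filter_itinerary(itinerary, hrl_entries, hrl_signature, verify_hora_signature):
    """Off-line policy sketch: check the signed HRL, then drop revoked hosts.

    itinerary             : list of host identifiers in visiting order
    hrl_entries           : list of revoked host identifiers from the HRL
    hrl_signature         : HoRA's signature over the HRL
    verify_hora_signature : callable performing the actual signature check
    """
    if not verify_hora_signature(hrl_entries, hrl_signature):
        raise ValueError("HRL signature invalid; refuse to send the agent")
    revoked = set(hrl_entries)
    return [host for host in itinerary if host not in revoked]
```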
4 Revoking Malicious Hosts
Revocation can only be performed if there are proofs of malicious behavior. To obtain such proofs, one of the existing detection and proving mechanisms must be used. As the cryptographic traces approach [8] is the most widely known, we use it in our scheme.
The rest of the section presents two possible protocols that can be used to revoke malicious hosts. Before starting with the protocol details, some notation used in the message and agent passing must be introduced:
– We denote a mobile agent that moves from host x to host y as Agent_{x→y}().
– We denote a message from host x to host y as Message_{x→y}().
– We denote the signed copy of document D as sign_α[D], where α is the signing host identifier.
– We denote the One-Way Hash Function value of document D as OWHF(D).
4.1 Host Revocation Protocol
The main revocation protocol can be divided into three parts: (1) an agent sending part, in which origin hosts execute the agent and include some data that can be used as proof of execution integrity; (2) a proof checking part, in which the origin host can ask the executing hosts for the traces if some of them are suspected of malicious behavior; and finally, (3) a host revocation part, in which the HoRA can revoke a host if some proofs of its malicious behavior can be found. To make the explanation clearer, the host revocation protocol is presented by using an example. The following assumptions have been used in this example:
– The agent's itinerary has only two hosts.
– It is assumed that the origin host uses the off-line policy, in order to keep the status checking part independent of the host revocation protocol. Consequently, the origin host internally consults its local copy of the HRL to verify the status of the hosts in the itinerary.
– None of the hosts in the itinerary has been revoked, but the second one is going to turn to malicious behavior in the current transaction.
– It is assumed that privacy is not required. If it were, it would be possible to use encryption in those parts that must be confidential.
– The cryptographic traces approach [8] has been taken as the detection and proving mechanism.
Agent Sending Part. In the agent sending part, the mobile agent travels from host to host executing its code and data. Each executing host must send to the origin host a proof that links the code, the data, the results and the traces of the execution. This proof can be used later to revoke a host if its malicious behavior can be demonstrated. A description of the steps needed in the example is included below:
1. The origin host (O) consults internally its local copy of the HRL. As none of the hosts has been revoked, the agent can be sent to the first host in the itinerary. The agent carries the code and some input data. A Traces Storage Timestamp (TST from here on) is also included to indicate when the origin host loses its right to start a host revocation process. This time will be
used to determine the expiry time of the proofs, i.e. after this time TST all proofs can be deleted by the executing hosts. Of course, all data included in the agent must be signed in order to avoid repudiation attacks. Therefore the origin host sends Host1 the following agent: Agent_{O→1}(A), where A = sign_O[Code, Data_O, TST].
2. When Host1 receives the agent, it extracts the code from A and executes it. The traces are created automatically during the execution. As the size of the traces is expected to be too large, a hash value of them is sent to the origin host as a proof. The complete traces will be sent in case the executing host becomes suspicious. The results and some input data for the following host are also included in the agent. The signature of Host1 certifies that there is a link between the code, the data, the traces and the results. The following agent is sent to the next host: Agent_{1→2}(B), where B = sign_1[A, Data1, Results1, OWHF(Traces1)].
3. When Host2 receives the agent, it extracts the code and data from B and modifies them in order to obtain some profit, so the code is executed in a tampered way. After this, it prepares the agent to be sent to the next host in the itinerary. As the following host is the origin host, it is not necessary to include data for the execution in the next host. So Host2 sends the origin host the following agent: Agent_{2→O}(C), where C = sign_2[B, Results2, OWHF(Traces2)].
Figure 1 shows the agent passing for the described example.
Fig. 1. Agent Sending Part
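The nested signed structures A, B and C of the agent sending part can be sketched as follows (Python; the signature primitive is passed in as a callable and SHA-256 over canonical JSON stands in for the unspecified OWHF, so all concrete choices here are assumptions):

```python
import hashlib, json

def owhf(document) -> str:
    """One-way hash over a JSON-serializable document (SHA-256 as one possible choice)."""
    return hashlib.sha256(json.dumps(document, sort_keys=True).encode()).hexdigest()

def signed(signer_id, payload, sign):
    """Wrap a payload with a signature; `sign` is the signer's private-key operation."""
    return {"signer": signer_id, "payload": payload, "sig": sign(signer_id, payload)}

def agent_sending_example(sign, code, data_o, tst, data1, results1, traces1, results2, traces2):
    # Step 1: origin host O builds A = sign_O[Code, Data_O, TST]
    A = signed("O", {"code": code, "data_O": data_o, "TST": tst}, sign)
    # Step 2: Host1 builds B = sign_1[A, Data1, Results1, OWHF(Traces1)]
    B = signed("1", {"A": A, "data1": data1, "results1": results1, "owhf_traces1": owhf(traces1)}, sign)
    # Step 3: Host2 builds C = sign_2[B, Results2, OWHF(Traces2)]
    C = signed("2", {"B": B, "results2": results2, "owhf_traces2": owhf(traces2)}, sign)
    return A, B, C
```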
Proof Checking Part. In the proof checking part, the origin host asks the suspicious hosts for their traces in order to verify the execution integrity.
4. The origin host asks for the traces of Host2 because it is detected as suspicious [1]. The following message asking for its traces is sent: Message_{O→2}(sign_O[send Traces2]).
5. Host2 replies with a signed message containing the complete traces. The sent message has the following format: Message_{2→O}(sign_2[Traces2]).
The origin host performs this set of verifications when it has the traces:
– It verifies that Traces2 coincides with the hash value OWHF(Traces2) sent in step 3. If there is an inconsistency in the hash value, there is a proof that Host2 did not execute the agent properly.
– It executes the agent again and verifies that the execution agrees with Traces2. If the traces agree with the execution, the host can be considered honest. However, if there is an inconsistency in the execution, there is a proof that Host2 did not execute the agent properly.
Host Revocation Part. In the host revocation part, the origin host starts a host revocation process because there are proofs that a host did not act honestly while executing the agent.
6. The revocation process consists in sending the HoRA the signed proofs to demonstrate that the host did not execute the agent properly: Message_{O→HoRA}(sign_O[C, sign_2[Traces2]]). The HoRA receives the request for Host2's revocation. First of all, the TST is verified in order to know whether the origin host can still start a revocation process. After that, the HoRA performs the same set of verifications as the origin host performed in step 5 of the proof checking part. These verifications were: (1) confirming that the hash value of the traces matches the traces, and (2) executing the agent again and verifying that the execution matches the traces. Obviously, the HoRA must have a module with agent execution capabilities to do these tasks. If the proofs are finally considered valid, the malicious host is revoked and its identifier is added to the list of revoked hosts that the HoRA keeps internally. As all messages are properly signed, Host2 cannot perform a repudiation attack. Figure 2 shows the message passing for the proof checking and host revocation parts.
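The verifications on the received traces, performed both by the origin host and by the HoRA, can be sketched as follows (Python; it reuses the data layout of the previous sketch, and re-execution of the agent is abstracted behind a callable):

```python
import hashlib, json

def owhf(document) -> str:
    """Same one-way hash as in the sending-part sketch (SHA-256 over canonical JSON)."""
    return hashlib.sha256(json.dumps(document, sort_keys=True).encode()).hexdigest()

def verify_traces(agent_C, traces2, reexecute_agent):
    """Check Host2's proofs: (1) the supplied traces hash to the value signed in C,
    (2) re-executing the agent reproduces those traces."""
    claimed = agent_C["payload"]["owhf_traces2"]   # structure assumed from the earlier sketch
    if owhf(traces2) != claimed:
        return False                               # inconsistency in the hash value
    return reexecute_agent() == traces2            # otherwise check execution consistency
```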
Fig. 2. Proof Checking and Host Revocation Parts
4.2 Provisional Revocation Protocol
There is the possibility that a host goes out of service just after executing the agent, so its traces cannot be sent if the origin host asks for them. Such a host can be revoked until it proves that the agent was executed properly. This has no further consequences, as the host is out of service and so cannot execute any agent. The message passing of the provisional revocation process continues the previous example, but it starts in step 5 because the message that includes the traces was not sent by the host. Figure 3 shows these new messages:
5. The origin host does not receive a response from Host2, so it starts a provisional host revocation process. The origin host sends a message to the HoRA informing it that the traces have not been sent by Host2. The message also contains the proofs that the origin host has: Message_{O→HoRA}(sign_O[C, Traces2 not received]).
6. The HoRA receives the message asking for Host2's provisional revocation. First of all, the TST is verified in order to know whether the origin host can still start a revocation process. If so, the HoRA asks Host2 directly for the traces. A signed message is sent: Message_{HoRA→2}(sign_HoRA[send Traces2]).
Fig. 3. Provisional Revocation Protocol
– If Host2 does not reply to the message, its identifier is included in the list that the HoRA keeps internally, so it is provisionally revoked. During a certain period of time Host2 has the possibility of sending the traces to the HoRA. After this time, the status of the host passes to permanently revoked.
– If Host2 replies to the message with the traces, the HoRA has all the information needed and can perform the normal verifications. The process continues as a usual revocation process: Message_{2→HoRA}(sign_2[Traces2]).
7. In both cases the HoRA must inform the origin host about the status of the host (revoked or not), so that the origin host knows what to do with the results of the agent: Message_{HoRA→O}(sign_HoRA[host status]).
Note that provisionally revoked hosts must not be included in the HRL until their status passes to permanently revoked, in order not to revoke honest hosts.
4.3 Attacks
The attacks that can be performed against the protocols are basically focused on hiding the proofs:
– A malicious host can try to modify the input data for the next host, the results, the hash value of the traces or even the traces. In the previous example, the malicious host Host2 could try to modify Results2, OWHF(Traces2)
or Traces2. If this host is suspicious, its traces will be requested, the agent will be executed again and finally the proof of the malicious behavior will be found.
– The malicious host can try not to send the proofs. As there are only two proofs, there are two possible attacks of this kind:
• The malicious host does not send OWHF(Traces2). Without the hash value of the traces there is no proof that links the input data, the results and the traces. This attack is considered a denial of service attack because an incomplete agent is received. In this case the host can be revoked directly by sending the HoRA the proof that the malicious host did not send the hash value: Message_{O→HoRA}(sign_O[C, Incomplete Agent]), where C = sign_2[B, Results2, −].
• The malicious host does not send Traces2. In this case the malicious host pretends to be out of service. A provisional revocation process is started and, finally, if the traces are not sent, the host will be permanently revoked.
– An origin host can try to implicate an honest host by starting a provisional revocation process: Message_{O→HoRA}(sign_O[C, Traces1 not received]). This kind of attack can be avoided if the honest host stores Traces1 until the TST expires.
4.4 Drawbacks
The approach has the following drawbacks:
– A non-deliberate error during execution could lead to a host being revoked if it is considered suspicious. This may seem a disproportionate measure, but in the authors' opinion hosts must assure correctness in all transactions.
– The list that the HoRA keeps internally grows indefinitely. This problem can be solved by using an Agent Execution Certificate, i.e. a certificate issued by the HoRA that permits the hosts to execute agents during a validity period. In this case, the HoRA does not revoke the host identifier, but the certificate.
– The HoRA must be accessible to all hosts. An alternative topology based on repositories and a replication policy between entities should be considered.
5 Conclusions
This paper introduces two new protocols that help to solve the problem of malicious hosts by using a Host Revocation Authority [2]. The HoRA keeps track of which
hosts acted maliciously in the past, and for this reason they have been revoked. Each agent sender consults the HoRA before sending an agent in order to remove from the agent’s itinerary all the malicious hosts. Accordingly, the revoked hosts will not receive mobile agents any more. These two new protocols can be used by an origin host to revoke a host by demonstrating to the HoRA that it acted maliciously.
References 1. O. Esparza, M. Soriano, J.L. Mu˜ noz, and J. Forn´e. A protocol for detecting malicious hosts based on limiting the execution time of mobile agents. In IEEE Symposium on Computers and Communications – ISCC’2003, 2003. 2. O. Esparza, M. Soriano, J.L. Mu˜ noz, and J. Forn´e. Host Revocation Authority: a Way of Protecting Mobile Agents from Malicious Hosts. In International Conference on Web Engineering (ICWE 2003), LNCS. Springer-Verlag, 2003. 3. F. Hohl. Time Limited Blackbox Security: Protecting Mobile Agents From Malicious Hosts. In Mobile Agents and Security, volume 1419 of LNCS. Springer-Verlag, 1998. 4. Y. Minsky, R. van Renesse, F. Schneider, and S.D. Stoller. Cryptographic Support for Fault-Tolerant Distributed Computing. In Seventh ACM SIGOPS European Workshop, 1996. 5. J. Riordan and B. Schneier. Environmental Key Generation Towards Clueless Agents. In Mobile Agents and Security, volume 1419 of LNCS. Springer-Verlag, 1998. 6. V. Roth. Mutual protection of cooperating agents. In Secure Internet Programming: Security Issues for Mobile and Distributed Objects, volume 1906 of LNCS. SpringerVerlag, 1999. 7. T. Sander and C.F. Tschudin. Protecting mobile agents against malicious hosts. In Mobile Agents and Security, volume 1419 of LNCS. Springer-Verlag, 1998. 8. G. Vigna. Cryptographic traces for mobile agents. In Mobile Agents and Security, volume 1419 of LNCS. Springer-Verlag, 1998. 9. B.S. Yee. A sanctuary for mobile agents. In DARPA workshop on foundations for secure mobile code, 1997.
A DWT-Based Digital Video Watermarking Scheme with Error Correcting Code Pik-Wah Chan and Michael R. Lyu* Department of Computer Science and Engineering The Chinese University of Hong Kong Shatin, Hong Kong {pwchan, lyu}@cse.cuhk.edu.hk
Abstract. In this paper, a digital video watermarking algorithm is proposed. We present a novel DWT-based blind digital video watermarking scheme with a scrambled watermark and an error correcting code. Our scheme embeds different parts of a single watermark into different scenes of a video in the wavelet domain. To increase the robustness of the scheme, the watermark is refined with the error correcting code, which is itself embedded as a watermark in the audio channel. Our video watermarking algorithm is robust against the attacks of frame dropping, averaging and statistical analysis, which were not solved effectively in the past. Furthermore, it allows blind retrieval of the embedded watermark, which does not need the original video, and the watermark is perceptually invisible. The algorithm design, evaluation, and experimentation of the proposed scheme are described in this paper.
1 Introduction
We have seen an explosion of data exchange on the Internet and the extensive use of digital media. Consequently, digital data owners can transfer multimedia documents across the Internet easily, and there is an increase in the concern over copyright protection of digital content [1, 2, 3]. In the early days, encryption and access control techniques were employed to protect the ownership of media. They do not, however, protect against unauthorized copying after the media have been successfully transmitted and decrypted. Recently, watermarking techniques have been utilized to maintain copyright [4, 5, 6]. In this paper, we focus on employing digital watermarking techniques to protect digital multimedia intellectual copyright and propose a new algorithm for video watermarking. Video watermarking introduces some issues not present in image watermarking. Due to the large amount of data and the inherent redundancy between frames, video signals are highly susceptible to pirate attacks, including frame averaging, frame dropping, frame swapping, statistical analysis, etc. [4]. However, the currently proposed algorithms do not solve these problems effectively. In our scheme, we attack this
The work described in this paper was fully supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No. CUHK4182/03E).
problem by applying scene change detection and scrambled watermarks in a video. The scheme is robust against frame dropping, as the same part of the watermark is embedded into the frames of a scene. For different scenes, different parts of the watermark are used, making the scheme robust against frame averaging and statistical analysis. At the same time, an audio watermark is included to enhance the robustness of the scheme. The error correcting code of a video watermark can be embedded as an audio watermark and used for refining the embedded watermark during detection. Our approach cultivates an innovative idea in embedding different parts of a watermark according to scene changes, and in embedding its error correcting code as an audio watermark. Although the concept is quite simple, this approach has never been explored in the literature, and its advantages are clear and significant. The effectiveness of this scheme is verified through a number of experiments. This paper is organized into four sections. The next section presents the details of the novel video watermarking scheme, and the experimental results are shown in Section 3. Section 4 provides a conclusion and further improvements of this scheme.
2 A Video Watermarking Scheme
The new watermarking scheme we propose is based on the Discrete Wavelet Transform. Fig. 1 shows an overview of our watermarking process. In our scheme, an input video is split into an audio stream and a video stream, which undergo watermarking separately. In addition, the watermark is decomposed into different parts, which are embedded in the corresponding frames of different scenes of the original video.
Fig. 1. Overview of the watermarking process
As applying a fixed image watermark to each frame in the video leads to problems in maintaining statistical and perceptual invisibility [7], our scheme employs independent watermarks for successive but different scenes. Applying independent watermarks to each frame also presents a problem: regions in each video frame with little or no motion remain the same frame after frame. These motionless
regions may be statistically compared or averaged to remove independent watermarks [8, 9], so we use an identical watermark within each motionless scene. With these mechanisms, the proposed method is robust against the attacks of frame dropping, averaging, swapping, and statistical analysis. At the same time, error correcting codes are extracted from the watermark and embedded as an audio watermark in the audio channel, which in turn makes it possible to detect and correct changes in the extracted watermarks. This additional protection mechanism enables the scheme to overcome the corruption of a watermark, so the robustness of the scheme is increased under certain attacks. The newly proposed scheme consists of four parts: watermark preprocess, video preprocess, watermark embedding, and watermark detection. Details are described in the following sections.

2.1 Watermark Preprocess

The watermark preprocess consists of two parts, the video watermark and the audio watermark. After both watermarks are preprocessed, they are embedded into the video channel and the audio channel, respectively.

Video Watermark. The watermark is scrambled into small parts in the preprocess, and these parts are embedded into different scenes so that the scheme can resist a number of attacks specific to video. A 256-grey-level image is used as the watermark, as shown in Fig. 3a, so 8 bits represent each pixel. The watermark is first scaled to a particular size according to

p + q = n,   p and q > 0    (1)

where m is the number of scene changes and n, p, q are positive integers, and the size of the watermark should be

(2^p · 64) × (2^q · 64)    (2)

Then the watermark is divided into 2^n small images of size 64 × 64. Figs. 2 and 3 show the procedure and the result of the watermark preprocess with m = 10, n = 3, p = 1, and q = 2.
Fig. 2. Overview of watermark preprocess.
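A sketch of this video-watermark preprocess (Python/NumPy; nearest-neighbour scaling is an arbitrary choice, and the wavelet-domain encryption step of [10] is omitted):

```python
import numpy as np

def preprocess_watermark(wm, p, q):
    """Scale a grey-level watermark to (2^p*64) x (2^q*64), split it into 2^(p+q)
    64x64 tiles, and decompose each tile into 8 bit-planes laid side by side."""
    target = (64 * 2**p, 64 * 2**q)
    # nearest-neighbour scaling (an assumption; any resampling method would do)
    rows = (np.arange(target[0]) * wm.shape[0] / target[0]).astype(int)
    cols = (np.arange(target[1]) * wm.shape[1] / target[1]).astype(int)
    scaled = wm[np.ix_(rows, cols)]

    parts = []
    for r in range(0, target[0], 64):
        for c in range(0, target[1], 64):
            tile = scaled[r:r+64, c:c+64].astype(np.uint8)
            # 8 bit-planes of the 64x64 tile, concatenated horizontally -> 64 x 512 binary image
            planes = [(tile >> b) & 1 for b in range(7, -1, -1)]
            parts.append(np.hstack(planes))
    return parts          # 2^(p+q) binary images m_0 ... m_{2^n - 1}
```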
Fig. 3. (a) Original watermark (b–i) Preprocessed watermarks m0–m7 (j) Encrypted watermark m'0
In the next step, each small image is decomposed into 8 bit-planes, and a larger image m_i, consisting only of 0's and 1's, is obtained by placing the bit-planes side by side. These processed images are used as watermarks, and in total 2^n independent watermarks are obtained. To make the scheme more robust, the processed watermarks m are transformed to the wavelet domain and encrypted [10]. Sample preprocessed watermarks are shown in Fig. 3, where (a) is the original watermark, (b)-(i) represent the scrambled watermarks in the spatial domain, and (j) shows the encrypted watermark of (b), i.e., m'0.
Audio Watermark. An error correcting code is extracted from the watermark image and embedded in the audio channel as an audio watermark. This watermark provides the error correction and detection capability for the video watermark. In the detection phase, it is extracted and used for refining the video watermark. Different error correcting coding techniques can be applied, such as Reed-Solomon coding [11] and Turbo coding [12]. The error correcting code plays an important role for a watermark, especially when the watermark is corrupted, i.e., when it is damaged significantly. The error correcting code overcomes the corruption of a watermark and can make the watermark survive serious attacks. Moreover, the scheme also takes advantage of watermarking the audio channel, because it provides an independent channel for embedding the error correcting code, which gives extra information for watermark extraction. Therefore, the scheme is more robust than other schemes that use the video channel alone. The key to error correction is redundancy. Indeed, the simplest error correcting code is simply to repeat everything several times. However, in order to keep the audio watermark inaudible, we cannot embed too much information into the audio channel. In our scheme, we apply averaging to obtain the error correcting code. Within a small region of an image, the pixels are similar. Therefore, an average value of a small region can be used to estimate the pixels within that particular region. The average value of the pixels in each region is calculated as follows:
a_k = (1 / (x · y)) Σ_{x,y} W(p + x, q + y)    (3)

where W is the watermark image, k is the k-th block of the average image, (p, q) is the coordinate of region k, (x, y) is the coordinate of a pixel in region k, and x × y is the size of a block. A sample is shown in Fig. 4.
Fig. 4. (a) Original video watermark (b) Visualization of averaging (c) Audio watermark (average of a)
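The block averaging of Eq. (3) can be sketched as follows (Python/NumPy; the 8×8 block size is an assumption for illustration):

```python
import numpy as np

def block_average(wm, block=(8, 8)):
    """Average the watermark over non-overlapping blocks (Eq. 3) to obtain
    the low-rate error correcting code embedded in the audio channel."""
    h, w = wm.shape
    bh, bw = block
    avg = wm[:h - h % bh, :w - w % bw].astype(float)
    avg = avg.reshape(h // bh, bh, w // bw, bw).mean(axis=(1, 3))
    return avg            # one average value per block, block k indexed row-major
```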
2.2 Video Preprocess

Our watermarking scheme is based on a 4-level DWT. All frames in the video are transformed to the wavelet domain. Moreover, scene changes are detected from the video by applying the histogram difference method to the video stream.
Fig. 5. After scene change detection, watermark m1 is used for the first scene. When there is a scene change, another watermark m3 is used for the next scene.
After scene change detection, as shown in Fig. 5, independent watermarks are embedded in video frames of different scenes. Within a motionless scene, an identical watermark is used for each frame. The watermark for each scene can be chosen with a pseudo-random permutation such that only a legitimate watermark detector can reassemble the original watermark.
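A sketch of histogram-difference scene change detection (Python/NumPy; the bin count and the threshold are assumptions, since the paper does not specify them):

```python
import numpy as np

def detect_scene_changes(frames, threshold=0.4, bins=64):
    """Return the indices of frames that start a new scene.

    frames    : iterable of grey-level frames (2-D arrays)
    threshold : normalized histogram-difference level above which a cut is declared
    """
    cuts = [0]
    prev_hist = None
    for idx, frame in enumerate(frames):
        hist, _ = np.histogram(frame, bins=bins, range=(0, 255))
        hist = hist / hist.sum()
        if prev_hist is not None and 0.5 * np.abs(hist - prev_hist).sum() > threshold:
            cuts.append(idx)
        prev_hist = hist
    return cuts
```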
2.3 Watermark Embedding

The watermark is then embedded into the video frames by changing the position of some DWT coefficients according to the following condition:

if W[j] = 1, exchange C[i] with max(C[i], C[i+1], C[i+2], C[i+3], C[i+4])
else, exchange C[i] with min(C[i], C[i+1], C[i+2], C[i+3], C[i+4])    (4)

where C[i] is the i-th DWT coefficient of a frame, and W[j] is the j-th pixel of a certain watermark [13]. The sequence of watermark coefficients used is stated in Fig. 6.
Fig. 6. Embedding watermarks in a frame. Higher frequency coefficients of the watermark are embedded into the higher frequency part of the video frame. Also, only the middle frequency wavelet coefficients of the frame (middle frequency sub-bands) are watermarked [9].
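The coefficient-exchange rule of Eq. (4) can be sketched as follows (Python/NumPy, operating on a one-dimensional array of middle-frequency wavelet coefficients; the non-overlapping five-coefficient windows and the coefficient ordering are assumptions, and the 4-level DWT itself is assumed to be computed elsewhere, e.g. with PyWavelets):

```python
import numpy as np

def embed_bits(coeffs, bits, step=5):
    """Embed binary watermark pixels into a vector of mid-frequency DWT coefficients.

    For each bit, the current coefficient is swapped with the maximum (bit = 1)
    or the minimum (bit = 0) of its window of `step` coefficients, following Eq. (4).
    """
    c = coeffs.copy()
    for j, bit in enumerate(bits):
        i = j * step
        if i + step > len(c):
            break
        window = c[i:i + step]
        k = int(np.argmax(window)) if bit else int(np.argmin(window))
        c[i], c[i + k] = c[i + k], c[i]     # exchange C[i] with the max/min of the window
    return c
```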
The emphasis of this scheme is the video watermark. The audio watermark is used to support the video watermark and make it more robust. Namely, the audio watermark is used for refining the video watermark in the detection phase, so the error correcting code is stored in the audio channel. We have applied a simple audio watermarking technique, spread spectrum, as proposed in [16], in this scheme.

2.4 Watermark Detection

The watermark is detected through the following process, an overview of which is shown in Fig. 7.
Fig. 7. Overview of detection of the watermark
A test video is split into a video stream and an audio stream, and the watermarks are extracted separately by audio watermark extraction and video watermark extraction. Then the extracted watermark undergoes a refining process.
Video Watermark Detection. The video stream is processed to get the video watermark. In this step, scene changes are detected from the tested video, and each video frame is transformed to the wavelet domain with 4 levels. Then the watermark is extracted with the following condition:

if WC[i] > median(WC[i], WC[i+1], WC[i+2], WC[i+3], WC[i+4]), W[j] = 1
else, W[j] = 0    (5)

where WC[i] is the i-th DWT coefficient of a watermarked video frame, and W[j] is the j-th pixel of an extracted watermark [13]. As an identical watermark is used for all frames within a scene, multiple copies of each part of the watermark may be obtained. The watermark is recovered by averaging the watermarks extracted from different frames. This reduces the effect of attacks carried out on designated frames. Then we can combine the 8 bit-planes and recover the 64 × 64 image, i.e., 1/2^n part of the original watermark. If enough scenes are found and all parts of the watermark are collected, the original large watermark image can be reconstructed. This is shown in Fig. 8, where the original frame, the watermarked frame, and the extracted watermark are depicted. Moreover, if some parts of the watermark are lost, the final watermark can still survive. We will show this later.
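The matching extraction rule of Eq. (5), mirroring the embedding sketch above under the same assumptions:

```python
import numpy as np

def extract_bits(wm_coeffs, num_bits, step=5):
    """Recover watermark bits from the mid-frequency DWT coefficients of a
    watermarked frame, following Eq. (5)."""
    bits = []
    for j in range(num_bits):
        i = j * step
        if i + step > len(wm_coeffs):
            break
        window = wm_coeffs[i:i + step]
        bits.append(1 if window[0] > np.median(window) else 0)
    return np.array(bits)
```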
Fig. 8. (a) Original frame (b) Watermarked frame (c) Extracted watermark corresponding to Fig. 3(g) (d) Recovered watermark
Audio Watermark Detection and Refining. At the same time, error correcting codes are extracted from the audio stream, and the extracted video watermark is refined with this information using the following equation:

W'(i, j) = ( P · W(i, j) + Q · a_k ) / ( P + Q )    (6)

where W(i, j) is the extracted video watermark, a_k is the average value of the k-th block of the average image recovered from the audio channel, (i, j) is the coordinate of the video watermark, and P : Q is the ratio of importance of the extracted video watermark to the audio watermark.
After extracting and refining the watermark, a similarity measurement between the extracted and the reference watermarks is used for an objective judgment of the extraction fidelity. It is defined as

NC = Σ_{i,j} W(i, j) W'(i, j) / Σ_{i,j} W(i, j)²    (7)

which is the cross-correlation normalized by the reference watermark energy to give unity as the peak correlation [14]. We will use this measurement to evaluate our scheme in our experiments.
3 Experimental Results

To evaluate the performance of the new video watermarking scheme, several experiments have been done: the experiment with various dropping ratios, the experiment with various numbers of frames colluded, the experiment with various quality factors of MPEG, and the experiment with various cropping ratios. Another DWT-based watermarking scheme, which embeds an identical watermark in all frames, is used for comparison with the proposed scheme. A video clip with 1526 frames of size 352 × 288 is used in our experiment. The video consists of 10 scene changes. The NC values are retrieved when the watermarked video is under different attacks. The experimental results are described in detail in the following.
3.1 Experiment with Frame Dropping
As a video contains a large amount of redundancy between frames, it may suffer attacks by frame dropping. This experiment is aimed at examining the robustness of the scheme under attack by frame dropping. Different percentages of frames are dropped and the obtained result is shown in Fig. 9. Our scheme achieves better performance because, in each scene, all frames are embedded with the same watermark. This prevents attackers from removing the watermark by frame dropping. If they try to remove one part of the watermark, they need to remove the whole chunk of frames (i.e., the whole scene), and this would lead to significant damage to the video. In addition, when frames are dropped, the error is only introduced to a correspondingly small part of the watermark. For the DWT-based scheme (i.e., the non-scene-based one), however, the error is introduced to the whole watermark, which makes its performance worse. The performance of the scheme is significantly improved by combining it with an audio watermark, especially when the dropping rate of video frames is high. The improvement increases with the dropping rate of the frames. This is because when the dropping rate increases, the error of the extracted watermark increases and significantly damages the watermark. The error correcting code from the audio watermark provides information to correct the error and overcome part of the corruption of the video watermark; thus the NC values of the watermark are higher than those obtained without the error correcting code. Moreover, the error correcting code is
embedded in the audio channel. Frame dropping would not affect the audio channel much. Our scheme takes advantage of this to avoid destroying the information, and the error correcting code can still be used to refine the watermark and improve the NC value.
Fig. 9. NC values under frame dropping. From the experiment, we found that our scheme achieves better performance than the DWT-based scheme without scene-based watermarks.
3.2 Experiment with Frame Averaging and Statistical Analysis
Frame averaging and statistical analysis are other common attacks on video watermarks. When attackers collect a number of watermarked frames, they can estimate the watermark by statistical averaging and remove it from the watermarked video [17, 18]. The scenario is shown in Fig. 10.
Fig. 10. Scenario of statistical averaging attack.
Our proposed scheme performs better because it crops a watermark into pieces and embeds them into different frames, making the watermarks resistant to attacks by frame averaging during watermark extraction. The identical watermark used within a scene prevents attackers from taking advantage of motionless regions in successive frames and removing the watermark by comparing and averaging the frames statistically [19]. Independent watermarks used for successive, but different, scenes prevent attackers from colluding with frames from completely different scenes to extract the watermark.
Fig. 11. NC values under statistical averaging. After this attack is applied to the watermarked video with different numbers of video frames colluded, watermarks are extracted and NC values are obtained. It is found that the proposed scheme can resist statistical averaging quite well.
3.3 Experiment with Lossy Compression
This experiment is aimed at testing the robustness of the scheme under attack by lossy compression. Fig. 12 shows the NC values of the extracted watermarks with different quality factors of MPEG.
Fig. 12. NC values under lossy compression. From the experiment, we found that the proposed scheme improves the robustness for watermark protection.
The performance of the scheme is significantly improved by combining it with the audio watermark, especially when the quality factor of MPEG is low. This is because when the quality factor of MPEG is low, the error of the extracted watermark increases and the watermark is damaged significantly. As the error correcting code is provided by the audio watermark, it can survive the lossy compression attack, which is applied to the video channel. The proposed scheme without the audio watermark has similar performance to the other DWT-based scheme because both of them satisfy the following condition: higher frequency DWT coefficients of the watermark are embedded into the higher frequency part of the video frame, and the high frequency sub-band DWT coefficients (HH) of the video frame are not watermarked. This approach makes
the watermark survive MPEG lossy compression, as lossy compression removes the details of the image [20].
3.4 Experiment with Attacks on Watermarked Frame
DWT inherits many advantages in resisting attacks on the watermarked frames. It achieves both spatial and frequency localization and perceptual invisibility, and it withstands attacks by image processing techniques [15]. Cropping is one of the attacks applied to video frequently. Fig. 13 shows the results for the watermarked video under different ratios of cropping. It is also found that the proposed scheme gives the best result.
Fig. 13. NC values under cropping
4 Conclusion and Future Work
This paper proposes an innovative blind video watermarking scheme with scrambled watermarks and error correcting code. The process of this video watermarking scheme, including watermark preprocessing, video preprocessing, watermark embedding, and watermark detection, is described in detail. Experiments are performed to demonstrate that our scheme is robust against attacks by frame dropping, frame averaging, and statistical analysis. The robustness of the scheme is enhanced by combining it with audio watermarks. The scheme can be improved by making use of information from the video, such as time information, to increase the robustness of the watermark. We will pursue this improvement in the future.
References
1. A. Piva, F. Bartolini, and M. Barni: Managing copyright in open networks. IEEE Internet Computing, Volume 6, Issue 3, pp. 18–26, May-June 2002.
2. Chun-Shien Lu and Hong-Yuan Mark Liao: Multipurpose Watermarking for Image Authentication and Protection. IEEE Transactions on Image Processing, Volume 10, Issue 10, Oct 2001, pp. 1579–1592.
3. C. S. Lu, S. K. Huang, C. J. Sze, and H. Y. M. Liao: Cocktail watermarking for digital image protection. IEEE Transactions on Multimedia, Volume 2, pp. 209–224, Dec. 2000.
4. Joo Lee and Sung-Hwan Jung: A survey of watermarking techniques applied to multimedia. Proceedings 2001 IEEE International Symposium on Industrial Electronics (ISIE 2001), Volume 1, pp. 272–277, 2001.
5. M. Barni, F. Bartolini, R. Caldelli, A. De Rosa, and A. Piva: A Robust Watermarking Approach for Raw Video. Proceedings 10th International Packet Video Workshop PV2000, Cagliari, Italy, 1–2 May 2000.
6. M. Eskicioglu and J. Delp: An overview of multimedia content protection in consumer electronics devices. Signal Processing: Image Communication 16 (2001), pp. 681–699, 2001.
7. N. Checcacci, M. Barni, F. Bartolini, and S. Basagni: Robust video watermarking for wireless multimedia communications. Proceedings 2000 IEEE Wireless Communications and Networking Conference (WCNC 2000), Volume 3, pp. 1530–1535.
8. Bijan G. Mobasseri: Direct sequence watermarking of digital video using m-frames. Proceedings International Conference on Image Processing (ICIP-98), Chicago, Illinois, Volume 3, pp. 399–403, October 4–7, 1998.
9. Mitchell D. Swanson, Bin Zhu, and Ahmed H. Tewfik: Multiresolution Video Watermarking using Perceptual Models and Scene Segmentation. Proceedings International Conference on Image Processing (ICIP '97), 3-Volume Set, Volume 2, Washington, DC, October 26–29, 1997.
10. P. P. Dang and P. M. Chau: Image encryption for secure Internet multimedia applications. IEEE Transactions on Consumer Electronics, Volume 46, Issue 3, pp. 395–403, Aug. 2000.
11. Lijun Zhang, Zhigang Cao, and Chunyan Gao: Application of RS-coded MPSK modulation scenarios to compressed image communication in mobile fading channel. Proceedings 2000 52nd IEEE Vehicular Technology Conference, VTS-Fall VTC 2000, Volume 3, 2000, pp. 1198–1203.
12. A. Ambroze, G. Wade, C. Serdean, M. Tomlinson, J. Stander, and M. Borda: Turbo code protection of video watermark channel. IEE Proceedings - Vision, Image and Signal Processing, Volume 148, Issue 1, Feb 2001, pp. 54–58.
13. F. Y. Duan, I. King, L. Xu, and L. W. Chan: Intra-block algorithm for digital watermarking. Proceedings IEEE 14th International Conference on Pattern Recognition (ICPR'98), Volume II, pp. 1589–1591, 17–20 August 1998.
14. Chiou-Tung Hzu and Ja-Ling Wu: Digital watermarking for video. Proceedings 1997 13th International Conference on Digital Signal Processing, DSP 97, Volume 1, pp. 217–220, 2–4 Jul 1997.
15. Xiamu Niu and Shenghe Sun: A New Wavelet-Based Digital Watermarking for Video. 9th IEEE Digital Signal Processing Workshop, Texas, USA, Oct. 2000.
16. D. Kirovski and H. Malvar: Robust spread-spectrum audio watermarking. Proceedings IEEE International Conference on Acoustics, Speech, and Signal Processing, 2001, Volume 3, pp. 1345–1348.
17. K. Su, D. Kundur, and D. Hatzinakos: A Novel Approach to Collusion-Resistant Video Watermarking. Security and Watermarking of Multimedia Contents IV, E. J. Delp and P. W. Wong, eds., Proc. SPIE, Volume 4675, p. 12, San Jose, California, January 2002.
18. K. Su, D. Kundur, and D. Hatzinakos: A Content-Dependent Spatially Localized Video Watermark for Resistance to Collusion and Interpolation Attacks. Proceedings IEEE International Conference on Image Processing, October 2001.
19. Yiwei Wang, John F. Doherty, and Robert E. Van Dyck: A wavelet-based watermarking algorithm for ownership verification of digital images. IEEE Transactions on Image Processing, Volume 11, No. 2, Feb 2002.
20. Eugene T. Lin, Christine I. Podilchuk, Ton Kalker, and Edward J. Delp: Streaming Video and Rate Scalable Compression: What Are the Challenges for Watermarking? Proceedings SPIE International Conference on Security and Watermarking of Multimedia Contents III, Volume 4314, January 22–25, 2001, San Jose, CA.
A Novel Two-Level Trust Model for Grid

Tie-Yan Li¹, HuaFei Zhu¹, and Kwok-Yan Lam²

¹ Infocomm Security Department, Institute for Infocomm Research (I²R), 21 Heng Mui Keng Terrace, Singapore 119613
{litieyan, huafei}@i2r.a-star.edu.sg
² School of Software, Tsinghua University, Beijing 100084, PR China
[email protected]
Abstract. Trust is hard to establish in a service-oriented grid architecture because of the need to support end-user single sign-on and dynamic transient services. In order to enhance the security provided by the Grid Security Infrastructure (GSI), this paper proposes a two-level trust model and the corresponding trust metric evaluation algorithms. The upper level defines the trust relationships among Virtual Organizations (VOs) in a distributed manner. The lower level justifies the trust values within a grid domain. This novel model provides an integrated trust evaluation mechanism to support secure and transparent services across security domains. It is flexible, scalable and interoperable. We design the implementation for embedding the trust scheme into GSI. At this stage, we achieve an additional authentication means between grid users and grid services.
1 Introduction
A computational grid is a collection of heterogeneous computers and resources spreading across multiple administrative domains with the objective of providing users easy access to these resources. Grid applications are distinguished from traditional client-server applications by their simultaneous use of massive amounts of resources with dynamic requirements. Such resources are typically drawn from multiple administrative domains interconnected by complex communication structures, and need to be accessed with stringent performance requirements. To achieve these goals, the Globus Toolkit [4] was developed (current version 3.0) by the grid research community and is currently the most widely used grid infrastructure. Security services in Globus are provided by the Grid Security Infrastructure (GSI) [1] – the de facto security standard in the grid community, which provides basic security properties such as authentication, authorization and confidentiality. However, as pointed out by [7], GSI suffers from many potential security drawbacks such as uncontrolled delegation, leaky infrastructure and insecure services. Thus, further security mechanisms are needed to complement GSI in order to ensure the security of grid services (see Section 2 for a detailed review of GSI and its trust issues). At present, no complete trust model for grid has been proposed. A CA-based trust model was drafted [8] and is being proposed to the Global Grid Forum (GGF).
However, while the document describes the trust requirements in a grid, trust is solely built on authentication of identity certificates. As authentication alone is not sufficient for establishing strong security, it is clear that a proper trust evaluation model for grid is needed. In the literature, several well-known trust models have already been proposed [18,15,16]. The X.509 trust model [18] is a centralized approach such that each participant has a certificate signed by a central CA. Since GSI employs X.509 certificates, this trust model can be used within a grid domain. The SPKI trust model [15] offers more flexibility by supporting delegation certificates. This property is similar to the proxy certificates supported by GSI. However, issues related to the control of proxy/delegation certificates remain unsolved. PGP [16] adopts a distributed trust model that builds trust on an entity from its neighbors. Though all of these trust models were designed for specific scenarios, none of them fits the grid environment well directly.

In this paper, we propose a two-level trust model. The grid architecture is divided into two levels: the domain (lower) level and the VO (upper) level. We note that a process utilizing resources from different security domains traverses its local domain, the VOs and the remote domain respectively. The security requirements and management structures within a domain (i.e., intra-grid) and outside a domain (i.e., extra-grid or within a VO) are different. Thus, the domains and the VOs are expected to adopt different trust models. Besides, since computing trust metrics is desirable, we assign different trust evaluation mechanisms to these two levels. Two distributed trust value evaluation algorithms based on paths have been introduced in [14,17]. Inspired by these approaches, we adopted some of their results in our grid trust model. We emphasize the following major properties of our scheme:
- A two-level trust model is suitable for the two levels of the grid architecture, hence suitable for centralized grid domains and distributed VOs.
- Different trust metric evaluation algorithms are deployed in grid domains and VOs separately.
- The proposed model is an integrated solution. It is flexible, scalable and independent of the underlying security components.

This paper is organized as follows: Section 2 reviews the grid security infrastructure. Section 3 elaborates our trust model, where the grid architecture, the two-level trust model and the extensions are given in detail. Implementation issues of our approach are addressed in Section 4. Section 5 concludes the paper and points out directions for further research.
2 Trust Issues in GSI
Brief overview of GSI: The grid security infrastructure [2] is built on well-known security standards such as X.509 certificate data structures [6], the SSL protocol [10] and the Generic Security Service API (GSS-API) [9]. The basic security components of the Globus Toolkit provide the mechanisms for authentication, authorization and confidentiality among grid services. GGF complemented these standards with
proxy certificates [5] in order to allow users' single sign-on and delegation. In [3], the members also proposed a comprehensive OGSA (Open Grid Service Architecture) security architecture and a set of security components that encapsulate the required security functionalities. OGSA is a set of open standards serving as the basis of all grid-related applications and is gaining global popularity among the scientific as well as the industrial grid communities.
Trust issues in GSI: Although GSI has been widely adopted as the core component of grid applications, GSI, which provides a basic secure and reliable grid computing environment, is still at an early stage of development. Since GSI is built upon PKI, risk factors due to the use of PKI have to be considered carefully, such as compromise of private keys or theft of certificates. Beyond this, security issues related to proxy certificates are still hot topics [11], e.g., how to specify the rights that may be delegated and how to specify the validity period of a delegation certificate. These issues are under intensive investigation in GGF security working groups. The security concerns of using delegation certificates arise mainly from an individual grid user's ignorance of the trust relationships outside its own local domain. Therefore, building the trust relationships throughout the entire grid environment is necessary. The establishment of trust can substantially broaden the user's view on top of the grid domains and help users make sound choices on delegation of rights. Hence, the security of the whole system is enhanced.
Fig. 1. A process traversing domains and VO
To elaborate the trust issues within GSI, a typical grid application that supports user single sign-on and transient services can be described in the following case (illustrated in Figure 1): A grid user U within a certain grid domain X (D_X) is going to run a process P.¹ P can be launched on a remote host and is able to generate a sub-process P_sub² to be launched further on its behalf.

¹ In GSI, U will generate a proxy certificate for P using U's original certificate. U can therefore log on to the system once, delegate its rights and perform multiple processes.
² Similarly, P uses its proxy certificate to generate P_sub's proxy certificate. GSI allows delegation to continue using this method, forming a delegation chain.

If the process
needs any resource R provided by another domain Y (D_Y), P or P_sub has to traverse the intermediate network before arriving at D_Y. On receiving the request, R will first verify the certificate chain and, if it is valid, P_sub is allowed to access the resources. From this case, we notice that the trust from P to R is actually based on a path including several intermediates (N_0, N_1, ..., N_k):

P ← N_k ← · · · ← N_0 ← R

Several trust issues arise from this.

Suppose P and R are in the same domain: for R to trust P, R will verify the certificate chain provided by P. If all the certificates are valid, the request is approved. R completely trusts P due to successful verification. The "trust by authentication" method can be effective due to the presence of centralized management mechanisms (e.g., a unique root CA or authorization policy) within a domain. However, considering an invalid proxy certificate generated by a malicious host, the sub-process could be a faked one accomplishing a malicious task. Thus, R still takes some risk in judging to what degree it can trust P. In other words, R should estimate P's trustworthiness beforehand.

Suppose P and R are in different domains: the problem is more complicated. R has to trust all the intermediate hosts along the path that P traversed before arriving at R. Even worse, the security policies in the two domains as well as in the VO are different. Any negative statement from any intermediate host towards the request may make the request fail along the whole trust path. Thus, a trust relationship is hard to set up between R and P. Certainly, for R to trust P, a mechanism to evaluate the trust degree along the whole path is also necessary.

From the above problem statements, we can see the need for a trust model as well as a trust evaluation method for grid computing. Firstly, the scheme needs to define direct or mutual trust relationships between two hosts within a domain, as well as indirect trust relationships traversing intermediaries. Secondly, due to the dynamic nature of the grid, trust relationships might also need to be established dynamically using intermediaries in a distributed manner. Especially, it should also set up the basis for satisfying the security requirements to achieve single sign-on and delegation in the grid.
3 Two-Level Trust Model

3.1 Two-Level Architecture of Grid
Figure 2 depicts a conceptual model of the grid architecture. We observe that a grid domain is a set of computing resources geographically coupled together to provide a virtual computing resource uniformly. Normally, the grid resources within a domain share the same security policies (e.g., rules of authentication and authorization) and are protected by the same edge network checkpoints (e.g., firewalls). Centralized management for an individual grid domain is apparently suitable for such a condition. However, since different domains may have different security
Fig. 2. Two-level architecture of grid
levels, policies, mechanisms and strategies, maintaining an integrated security infrastructure that includes authentication services, user registries, authorization engines, network layer protection and other security services across all of them is impossible. On the other hand, grid computing is only meaningful if it provides an integrated service for large-scale scientific computing. Virtual Organizations (VOs) are proposed to solve this paradox. As shown in the upper level of Figure 2, a VO is formed dynamically as the members of grid domains join or leave it. Although the security policies, authentication credentials and identities belonging to a member's domain are likely to be managed, issued and defined only within the scope of that domain, the members joining a VO should at least maintain some trust relationship with each other in order to support secure and automatic cross-domain operation. Therefore, defining and establishing such a kind of trust relationship is essential to grid security. We therefore propose a flexible and scalable trust model: it can be flexibly adopted in either centralized or distributed environments; it can also be extended well to fit very large scale computing scenarios. We elaborate our model as well as the trust degree evaluation algorithms in detail in the following sections.

3.2 Centralized Trust Model in Domain
As mentioned above, the X.509 certificate architecture is used within a GSI domain. The centralized certificate architecture determines a centralized trust model within a domain. Thus, we suppose a central server acts as the overall system authority in charge of all security mechanisms (such as assigning policies and issuing certificates). GSI is also compatible with non-X.509-based models like Kerberos [12] by using interoperable gateways. Our trust model is not limited to X.509, but can be easily adapted to other centralized systems. Indeed, our trust model is independent of the underlying security platforms as long as the centralized management feature dominates in the various security domains.
Maintaining a trust table. Since every domain has a domain manager, we assign the role of trust evaluation to this domain manager. The domain manager may maintain a trust relationship table for all the domain members. In each record of the table, a trust value associated with a member's identity is initially assigned and then adjusted by the central authority. The trust relationship between any two members must be computed by the domain manager. Although a hierarchical structure is also supported, we simply study a two-level top-down model in order to demonstrate our trust model. We describe the trust functions as follows.

Computing trust value. Notions:
– DM denotes a domain managed by a domain manager (i.e., a root certificate authority in X.509).
– f_{X-Y} denotes the trust function from X to Y.

The trust value from P to R can be computed via DM indirectly:

P --f_{P-DM}--> DM --f_{DM-R}--> R

Therefore, we get the trust function from P to R as

f_{P-R} = f_{P-DM} × f_{DM-R}

Note that f_{P-DM} = f_{DM-P} since only DM can decide the trust value. If the trust policy within a domain is unique, we use f_{DM} to denote the trust function of the domain.
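As a rough illustration of how a domain manager could keep such a table and compose trust values, consider the following Python sketch; the class name, method names, and the use of plain multiplication over values in [0, 1] are assumptions made for illustration, not details fixed by GSI or by the paper.

```python
class DomainManager:
    """Illustrative sketch of a per-domain trust table kept by the domain manager."""

    def __init__(self):
        self.table = {}  # member identity -> trust value in [0, 1]

    def register(self, member_id, initial_value):
        """Assign the initial trust value when a member's certificate is issued."""
        self.table[member_id] = initial_value

    def adjust(self, member_id, new_value):
        """The central authority later adjusts a member's trust value."""
        self.table[member_id] = new_value

    def trust_between(self, p, r):
        """f_{P-R} = f_{P-DM} * f_{DM-R}: compose the two table lookups."""
        return self.table.get(p, 0.0) * self.table.get(r, 0.0)

# Hypothetical usage within one domain:
# dm = DomainManager()
# dm.register("P", 0.9); dm.register("R", 0.8)
# print(dm.trust_between("P", "R"))   # 0.72
```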
3.3 Distributed Trust Model in VO
A VO is formed dynamically, where any member can join and leave at any time and from anywhere. As the members come from different security domains, they may not share the same security policy. The decentralized structure makes it difficult to establish trust in the grid. We therefore employ a distributed trust evaluation scheme to fit the grid environment (inspired by our original scheme [19]). A formal modelling of PKI by Maurer can be found in [20]. We start by mapping a VO (a limited distributed network environment) to a graph G. We consider a member of a VO as a node of G and a path (e.g., N_k ← · · · ← N_1 ← N_0) between two members of a VO as an edge of G. The graph G can be further defined as follows:

DEFINITION 1: A graph G = (V, E) has a finite set V of vertices and a finite set E ⊆ V × V of edges. The transitive closure G* = (V*, E*) of a graph G = (V, E) is defined to have V* = V and to have an edge (u, v) in E* if and only if there is a path from u to v in G.

We map a newly generated path into an edge of the transitive closure of graph G. We can then compute the trustworthiness based on the transitive closure of graph G. Whenever a VO is constructed, a corresponding graph is built. Therefore, any activity (e.g., joining, leaving or updating) in the VO may cause an update of the graph.
Building a trust graph. When a new member M decides to join a VO, it should apply for a direct trust value from an already existing member. Consequently, we should define the initialization of the direct trust degree computing algorithm. We remark that trust is a predicate; however, we can assign a value v to trust conditioned on the output of the predicate. This value is called a trust degree or a trust value. In a trust graph, suppose a set of nodes (N_k, ..., N_1, N_0) are the existing members. M joins G through a direct recommender N_0 and will be assigned a direct trust value dtv_M^{N_0}. dtv_M^{N_0} is a value in [0, 1]. tr_M^{N_0} = 0 implies that N_0 distrusts M completely, while tr_M^{N_0} = 1 implies that N_0 trusts M completely.

Notions:
– Cert_X: participant X's certificate that binds semantics (e.g., a name or an e-mail address) to its public key;
– History_M^{N_0}: the record of history as N_0 is a direct recommender of M; if M is a new participant in the network, then History_M^{N_0} ← Null;
– Pred_X(·, ·): a predicate defined over the set of certificates according to the strategy defined by X;
– Strategy_X(·, ·, ·): a deduction algorithm used by X to compute a value based on the set of historical parameters.

Joining a graph, a new member's direct trustworthiness can be computed as follows:
Input: (Cert_{N_0}, Cert_M, History_M^{N_0});
Computing: u ← Pred_{N_0}(Cert_{N_0}, Cert_M)
  If u = 0, then v ← 0
  Else, v ← Strategy_{N_0}(Cert_{N_0}, Cert_M, History_M^{N_0} | u = 1);
Output: a value dtv_M^{N_0} ← v.

In the case when M leaves G, we assign History_M^{N_0} ← dtv_M^{N_0}. The trust relationship between M and N_0 is maintained when M resumes by joining again. If a recommender, i.e., N_0, is leaving, its successor M's trust has to be redirected to N_0's recommenders so as to maintain this trust relationship in G. Suppose N_1 is N_0's recommender; N_1 will put dtv_{N_0}^{N_1} into its history History_{N_0}^{N_1} when N_0 leaves. M, as N_0's successor, has to gain a redirected trust value rtv_M^{N_1} from N_1 as follows:
Input: (Cert_{N_1}, Cert_M, History_{N_0}^{N_1}, History_M^{N_1}, dtv_M^{N_0});
Computing: u ← Pred_{N_1}(Cert_{N_1}, Cert_M)
  If u = 0, then v ← 0
  Else, v ← Strategy_{N_1}(Cert_{N_1}, Cert_M, History_{N_0}^{N_1}, History_M^{N_1} | u = 1, dtv_M^{N_0});
Output: a value rtv_M^{N_1} ← v.
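A minimal Python sketch of the two procedures above is shown below; Pred and Strategy are passed in as callables because the paper leaves them policy-defined, and every name here is illustrative rather than part of the paper's implementation.

```python
def direct_trust_value(pred, strategy, cert_n0, cert_m, history_m=None):
    """Compute dtv_M^{N0}: the direct trust value N0 assigns to a joining member M."""
    u = pred(cert_n0, cert_m)                    # policy-defined predicate over certificates
    if u == 0:
        return 0.0
    return strategy(cert_n0, cert_m, history_m)  # deduction over historical parameters

def redirect_trust_value(pred, strategy, cert_n1, cert_m,
                         history_n0=None, history_m=None, dtv_m_n0=0.0):
    """Compute rtv_M^{N1}: the trust N1 redirects to M after recommender N0 leaves."""
    u = pred(cert_n1, cert_m)
    if u == 0:
        return 0.0
    return strategy(cert_n1, cert_m, (history_n0, history_m, dtv_m_n0))

# Hypothetical policy: accept every certificate and give brand-new members 0.5.
# pred = lambda cert_a, cert_b: 1
# strategy = lambda cert_a, cert_b, history: 0.5 if history is None else 0.6
# dtv = direct_trust_value(pred, strategy, "cert_N0", "cert_M")
```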
Computing trust value. Based on the trust graph, we can now compute the trust value from node M to node N. Since M and N join the VO independently, they
may not know of each other's existence. Before setting up any trust between them, one may try to reach the other via a certain route. Such a path searching procedure from M to N can be completed by a path finder, say PathServer [13]. To formulate the computation of trust between two end nodes, we first define two notions below. Suppose P_1, ..., P_k are k paths provided to M by a path finder. These paths are referred to as delegation paths. Let N(P_i) be the set of intermediates {N_{i1}, N_{i2}, ..., N_{il}} in the i-th path P_i.
- P_1, ..., P_k are called independent, denoted by DP(P_1, ..., P_k), if ∀ P_i, P_j, N(P_i) ∩ N(P_j) = ∅, 1 ≤ i, j ≤ k.
- P_1, ..., P_k are called relevant, denoted by RP(P_1, ..., P_k), if there exist i, j such that N(P_i) ∩ N(P_j) ≠ ∅.

Based on the above notions, we can define a trust value tv_M^N by computing

tv_M^N = F(History_M^N, tv_{P_1}, tv_{P_2}, ..., tv_{P_k}),

where tv_{P_j} = min{tv_M^{N_{j1}}, ..., tv_{N_{jl}}^N}, in which tv_{N_X}^{N_Y} = dtv_{N_X}^{N_Y} if dtv_{N_X}^{N_Y} ≠ 0, and tv_{N_X}^{N_Y} = rtv_{N_X}^{N_Y} if dtv_{N_X}^{N_Y} = 0.

Case 1: (P_1, P_2, ..., P_k) are independent paths. The combination of these trust values is defined by

tv_comb = (1/k) Σ_{i=1}^{k} tv_{P_i}

The trust value is defined as:

tv_M^N ← ρ × tv_M^N + (1 − ρ) × tv_comb

where ρ is referred to as a trust factor, which is determined by N completely, and ρ = 0 if tv_M^N is not recorded in the history.

Case 2: Suppose (P_1, P_2, ..., P_k) are k relevant paths. These paths are divided into t sets so that the paths in each set are independent. These sets are denoted by SP_1, SP_2, ..., SP_t. The trust value is defined as:

tv_comb = (1/t) Σ_{i=1}^{t} tv_{P_{ij}}

where P_{ij} is a path chosen at random from the set SP_i.

tv_M^N ← ρ × tv_M^N + (1 − ρ) × tv_comb

where ρ is referred to as a trust factor, which is determined by N completely, and ρ = 0 if tv_M^N is not recorded in the history. Note that we choose one path at random in each relevant set SP_j as the input to compute the trust value, to reduce the computational complexity at N's side. Finally, we get the trust function (for Case 1 or 2) from M to N as:

f_{M−N} = tv_M^N
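The path-based combination above could be coded, for example, as follows; the dictionary-based edge lookup, the function names, and the random choice of one path per relevant set are assumptions layered on top of the formulas, not the paper's implementation.

```python
import random

def path_trust(edges, path):
    """tv_P: the minimum trust value along the consecutive edges of a path.

    'edges' maps (node_a, node_b) -> trust value (dtv if nonzero, else rtv);
    'path' is the node sequence M, N_1, ..., N_l, N."""
    return min(edges[(a, b)] for a, b in zip(path, path[1:]))

def combine_independent(path_values):
    """Case 1: average the trust values of independent paths."""
    return sum(path_values) / len(path_values)

def combine_relevant(sets_of_paths, edges):
    """Case 2: pick one path at random from each independent set and average."""
    picks = [path_trust(edges, random.choice(s)) for s in sets_of_paths]
    return sum(picks) / len(picks)

def update_trust(previous_tv, tv_comb, rho):
    """tv <- rho * previous_tv + (1 - rho) * tv_comb; rho = 0 when there is no history."""
    return rho * previous_tv + (1.0 - rho) * tv_comb
```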
3.4 Extensions
We have formulated the trust functions above. In this subsection, we consider different situations in which we compute the trust values between R and P, as follows.

Notions:
– f_{DM_i} denotes the trust function in the i-th domain DM_i;
– f_{VO_j} denotes the trust function in the j-th VO, VO_j;
– VO_set denotes a set of intermediate connecting VOs.

Case 1: [P, R ∈ DM_i]

f_{P−R} = f_{P−DM} × f_{DM−R}

(as stated in Section 3.2; simply f_{P−R} = f_{DM_i})

Case 2: [P ∈ DM_i, R ∈ VO_j]
If VO_j ∩ DM_i ≠ ∅, f_{P−R} = f_{DM_i} × f_{VO_j}
If VO_j ∩ DM_i = ∅ and ∃ VO_set, DM_i ∩ VO_set ∩ VO_j ≠ ∅, f_{P−R} = f_{DM_i} × f_{VO_set} × f_{VO_j}
If VO_j ∩ DM_i = ∅ and ∀ VO_set, DM_i ∩ VO_set ∩ VO_j = ∅, f_{P−R} = 0

Case 3: [P ∈ DM_i, R ∈ DM_j]
If ∃ VO_set, DM_i ∩ VO_set ∩ DM_j ≠ ∅, f_{P−R} = f_{DM_i} × f_{VO_set} × f_{DM_j}
If ∀ VO_set, DM_i ∩ VO_set ∩ DM_j = ∅, f_{P−R} = 0

(*) Case 4: In Cases 1–3, the VOs are supposed to be located in a flat structure (the upper level described in this paper). Indeed, multi-level (or hierarchical) VOs can also be organized. For example, a Super VO (SVO) can be formed by combining several VOs. The members of the SVO are collected from the members of the VOs. The algorithm proposed above (in Section 3.3) could be used to evaluate the trust degree in the SVO (however, we omit the complex derivations in this paper).
4 Implementation Design
The model introduced in Section 3 is to be implemented as an extension of the Globus Toolkit 3.0 [4] platform. Since GT3.0 has been deployed in many scientific grids and provides a set of cryptographic means as fundamental security
mechanisms, yet is unable to provide a trust model, we can embed our scheme into GSI. Several design issues are discussed here.

Fig. 3. Authentication procedure

In GT3.0, a grid user invoking a job on a grid resource may involve several steps, of which the first two are relevant to authentication. As illustrated in Figure 3:
O1: The user generates a job request with a description of the job to be started. The user then signs this request with their GSI proxy credentials and sends the signed request to the Master Managed Job Factory Service (MMJFS, formerly called the "Gatekeeper" in GT V2.2).
O2: The MMJFS verifies the signature on the request and establishes the identity of the user who sent it. Then it uses the grid-mapfile (a mapping table of global grid identities to local accounts) to determine a local account for the grid user.

By applying our trust model, several steps P1–P4 are added (the shaded part of Figure 3) for the resource to set up the trust path back to the user. A second chance is provided for evaluating the trust along the intermediates between two end entities that perhaps have never met before. Consequently, we say that the channel is authenticated again resiliently.
P1: After the MMJFS verifies the delegation chain, it consults the domain server to obtain a trust value tv_{DM_Y}.
P2: Outside domain DM_Y, in a VO, trust paths are set up. We finally get the trust value from domain DM_Y to domain DM_X as tv_{VO}.
P3: The trust value of the user's domain DM_X is computed as tv_{DM_X}.
P4: The path trust value, tv_{DM_Y} × tv_{VO} × tv_{DM_X}, is sent to the MMJFS for justification. If successful, the MMJFS will resume step O2.

To carry out P1 and P3, the domain server should maintain a trust table. Every member of the domain will have an item (e.g., its identity and the
associated trust value) in the table. The trust value can be assigned initially when the user's certificate is first generated. Specifically, the server might generate a DM-trustmap file while generating the root CA. All other users' certificates generated with the program grid-cert-request and sent to the root CA will be recorded in the DM-trustmap with trust values assigned by local security rules. Using the algorithm stated in Section 3.2, two parties may consult the server to set up their trust relationship.

Step P2 requires a distributed trust evaluation application installed on each member of the VO. This application implements the algorithm for computing the trust degree. An example of such an application is "PathServer" [13]. PathServer is a web-based service for finding the paths from a source to a target. The service provides a WWW interface by which a user can submit a request in the form of a source's and a target's PGP key identifiers, and the user receives in real time a display of the requested paths. It is implemented to work in the context of the PGP system and could be adapted to other public key systems, but the adaptation to GSI is not easy. The members of a VO may have different security policies as well as various security mechanisms, so it is difficult to evaluate the trust relationships between them uniformly. Trust built solely on certificate authentication could simplify this situation. On this assumption, a new application "PathFinder" is installed on each VO member, and a trust graph is built and updated periodically from the database of certificates maintained by the different certificate servers. We are developing the programs for implementing our trust model on GT3.0. For the case of evaluating trust relationships under different security policies (i.e., mediating trust among hybrid PKIs [21]), we will be finding matching algorithms to compute the trust values among them.
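To make steps P1–P4 more concrete, the sketch below shows how the MMJFS-side justification might look; the acceptance threshold, the file format of the hypothetical DM-trustmap, and all function names are assumptions made for illustration and are not fixed by the paper or by GT3.0.

```python
def load_dm_trustmap(path):
    """Read a hypothetical DM-trustmap file: one 'identity value' pair per line."""
    table = {}
    with open(path) as f:
        for line in f:
            identity, value = line.split()
            table[identity] = float(value)
    return table

def path_trust_value(tv_dm_y, tv_vo, tv_dm_x):
    """P4: combine the three trust values gathered in P1, P2 and P3."""
    return tv_dm_y * tv_vo * tv_dm_x

def justify_request(tv_dm_y, tv_vo, tv_dm_x, threshold=0.5):
    """Return True if the MMJFS should resume step O2; the threshold is an assumed local policy."""
    return path_trust_value(tv_dm_y, tv_vo, tv_dm_x) >= threshold

# Hypothetical run: resource-side domain value, VO path value, user-side domain value.
# print(justify_request(0.9, 0.8, 0.9))   # True for threshold 0.5
```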
5 Conclusions and Future Directions
We proposed a novel two-level trust model suitable for the grid environment. Through analyzing the trust requirements in the grid context, we identified a two-level architecture that best reflects the style of grid applications. We elucidated the details of evaluating the trust metrics in the two levels of the model and integrated them to provide a complete trust solution. With it, a transient service can be transparently processed while crossing different security domains safely. Based on GT3.0, we designed the building blocks to construct the trust evaluation applications. Using our scheme, the objective of resilient authentication between two grid end entities is achieved. At the time of writing, grids are being implemented in several major scientific institutes where trust is set up by pre-defined security policies. As grids emerge quickly in various areas, we argue that the proposed model is suitable for evaluating trust in distributed grid scenarios and can be adopted later on. Further on, we will implement our model and apply it to real production systems to gain more practical experience.
References
1. I. Foster, C. Kesselman, G. Tsudik, S. Tuecke: A Security Architecture for Computational Grids. Proc. 5th ACM Conference on Computer and Communications Security, pp. 83–92, 1998.
2. R. Butler, D. Engert, I. Foster, C. Kesselman, S. Tuecke, J. Volmer, V. Welch: A National-Scale Authentication Infrastructure. IEEE Computer, 33(12):60–66, 2000.
3. Nataraj Nagaratnam, et al.: Security Architecture for Open Grid Services. GGF OGSA Security Workgroup. http://www.ggf.org/ogsa-sec-wg
4. Globus Toolkit V3.0 of the Globus project. http://www.globus.org
5. S. Tuecke, et al.: Internet X.509 Public Key Infrastructure Proxy Certificate Profile. IETF Internet Draft, Apr. 2003. http://www.ietf.org/internet-drafts/draft-ietf-pkix-proxy-05.txt
6. Housley, R., W. Polk, W. Ford, and D. Solo: Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile. RFC 3280, April 2002.
7. Mike Surridge: A Rough Guide to Grid Security. V1.1, IT Innovation Centre, 2002.
8. M. Thompson, et al.: CA-based Trust Model for Grid Authentication and Identity Delegation. Grid Certificate Policy Working Group, Oct. 2002.
9. Linn, J.: Generic Security Service Application Program Interface, Version 2, Update 1. RFC 2743, January 2000.
10. A. Freier, P. Kariton, P. Kocher: The SSL Protocol: Version 3.0. Netscape Communications, Inc., CA, Mar. 1996.
11. Simon N. Foley: Trust Management and Whether to Delegate. Security Protocols, LNCS 2467, pp. 151–157, 2002.
12. Kohl, J. and C. Neuman: The Kerberos Network Authentication Service (V5). RFC 1510, September 1993.
13. M. Reiter and S. Stubblebine: Resilient authentication using path independence. IEEE Transactions on Computers, Vol. 47, No. 12, December 1998.
14. M. K. Reiter and S. G. Stubblebine: Authentication metric analysis and design. ACM Transactions on Information and System Security, 2(2):138–158, 1999.
15. C. Ellison et al.: SPKI certificate theory. Internet Request for Comments: 2693, September 1999.
16. Phil Zimmermann: Pretty Good Privacy (PGP), PGP User's Guide. MIT, October 1994.
17. Tuomas Aura: Distributed Access-Rights Management with Delegation Certificates. Secure Internet Programming 1999: 211–235.
18. Mendes, S. and Huitema, C.: A new approach to the X.509 framework: Allowing a global authentication infrastructure without a global trust model. In Proceedings of NDSS'95.
19. Huafei Zhu, Bao Feng and Robert H. Deng: Computing of Trust in Distributed Networks. Cryptology ePrint Archive: Report 2003/056.
20. Ueli Maurer: Modelling a Public-Key Infrastructure. ESORICS'96, LNCS 1146, pp. 325–350, 1996.
21. Joachim Biskup, Yucel Karabulut: Mediating Between Strangers: A Trust Management Based Approach. 2nd Annual PKI Research Workshop. http://middleware.internet2.edu/pki03/
Practical t-out-n Oblivious Transfer and Its Applications

Qian-Hong Wu, Jian-Hong Zhang, and Yu-Min Wang

State Key Lab. of Integrated Service Networks, Xidian Univ., Xi'an, Shanxi 710071, P. R. China
{woochanhoma,jhzhs}@hotmail.com, [email protected]
Abstract. General constructions of t-out-n (string) oblivious transfer and a millionaire protocol are presented using a two-lock cryptosystem, which enables Alice to send Bob a secret without a shared key. In the proposed t-out-n (string) oblivious transfer, Alice cannot determine which t messages Bob received even if she has unlimited computational power, while Bob cannot learn the other n − t messages if the discrete logarithm problem is infeasible. The scheme requires constant rounds. Alice needs n + t modular exponentiations and Bob needs 2t modular exponentiations. Furthermore, the basic scheme is improved to meet public verifiability and extended to distributed oblivious transfers. As applications, an efficient PIR scheme and a millionaire protocol are built.
1 Introduction
Rabin [17] proposed the concept of oblivious transfer (OT) in the cryptographic scenario. In this case, Alice has only one secret (bit) m and would like Bob to get it with probability 0.5. On the other hand, Bob does not want Alice to know whether he gets m or not. For 1-out-2 OT, Alice has two secrets m_1 and m_2 and would like to let Bob get one of them at Bob's choice. Again, Bob does not want Alice to know which secret he chooses. 1-out-n OT is a natural extension of 1-out-2 OT to the case of n secrets. Nevertheless, constructing 1-out-n OT from 1-out-2 OT is not trivial. A general approach for constructing string t-out-n OT is first to construct a basic 1-out-2 (bit) OT, then to construct a k-bit string 1-out-2 OT by invoking k runs of the bit 1-out-2 OT, then to construct a string 1-out-n OT by invoking the basic string 1-out-2 OT for many runs, typically n or log_2 n runs [4,5,13], and finally to construct the string t-out-n OT by invoking t runs of the string 1-out-n OT scheme [21]. 1-out-n OT schemes can also be built from basic techniques directly [19,20]. The reduction approach is studied in [2,4,5].¹

Oblivious transfer has found many applications in cryptographic studies and protocol design, such as secure multiparty computation, private information retrieval (PIR), fair electronic contract signing, oblivious secure computation, etc. [6,8,11,12]. Its computational requirements and bandwidth consumption are quite demanding, and they are likely to be the bottleneck in many applications that invoke it.

Our main contribution is to directly implement efficient string t-out-n OT using a two-lock cryptosystem. It is basically simple to realize t-out-n (string) oblivious transfer using a two-lock cryptosystem: Alice locks n secret messages in n boxes and sends them to Bob; then Bob locks his chosen t boxes and resends them to Alice; Alice removes her locks from these boxes and delivers them back to Bob, and Bob can read t of the n secret messages. We introduce concrete two-lock cryptosystems: one is based on the Knapsack problem; the other is based on the discrete logarithm problem. Then we propose concrete t-out-n (string) oblivious transfer schemes based on the discrete logarithm problem. Alice cannot determine which t messages Bob received even if she has unlimited computational power, while Bob cannot learn the other n − t messages if the discrete logarithm problem is infeasible. The proposed protocols require constant (i.e., 2 for Alice and 1 for Bob) rounds of communication. In our basic t-out-n oblivious transfer scheme, Alice requires n + t modular exponentiations and Bob requires 2t modular exponentiations. In Tzeng's scheme [21], the most efficient previous scheme to our best knowledge, Alice needs to compute 2nt modular exponentiations and Bob needs to compute 2t modular exponentiations. Hence our scheme is more efficient. We also improve our basic t-out-n oblivious transfer scheme with public verifiability and extend it to distributed oblivious transfers. As applications, efficient PIR schemes and millionaire protocols are built.

¹ This paper is supported by the Chinese National Natural Science Foundation (No. 69931010).
2 Models

2.1 t-out-n Oblivious Transfer
A t-out-n OT scheme is a two-party protocol where Alice possesses n (string) secrets m_1, m_2, . . . , m_n and would like to reveal t of them to Bob. A t-out-n OT scheme should satisfy the following requirements:
Correctness: If both Alice and Bob follow the protocol, Bob gets t secrets after executing the protocol with Alice.
Receiving ambiguity: After executing the protocol with Bob, Alice shall not learn which t secrets Bob has received.
Sending privacy: After executing the protocol with Alice, Bob gets no information about the other n − t messages or their combinations.

2.2 Two-Lock Cryptosystem
Suppose that Alice wishes to send a secret message m to Bob. Alice and Bob have encryption algorithms A and B respectively. Alice chooses her random secret key k and Bob chooses his random secret key s. If B_s(A_k(m)) = A_k(B_s(m)) for any k and s, they may begin their confidential communication with the following procedure:
Step 1. Alice sends Bob: Y = A_k(m).
Step 2. Bob sends Alice: Z = B_s(Y).
Step 3. Alice sends Bob: C = A_k^{-1}(Z).
Step 4. Bob decrypts: m = B_s^{-1}(C).
Here, A_k^{-1}(·) denotes the decryption of A_k(·). Bob can decrypt the ciphertext C and reveal the message m = B_s^{-1}(C). We call such a cryptographic primitive a two-lock cryptosystem. In the case that A = B, it is also known as commutative encryption [3]. A two-lock cryptosystem should meet the following security requirement: it is infeasible for an adversary to find k such that C = A_k^{-1}(Z) or s satisfying Z = B_s(Y). Two-lock cryptosystems are fragile under man-in-the-middle attacks, so an authenticated channel from Bob to Alice is required. The authenticated channel can be achieved with authentication techniques, and we omit it in the description.
2.3 Construction of t-out-n Oblivious Transfer Using Two-Lock Cryptosystem
Let Alice possess n (string) secrets m_1, m_2, . . . , m_n and be willing to reveal t of them to Bob. Suppose Bob is interested in secrets m_{i_1}, . . . , m_{i_t}. Assume that Alice chooses her random secret key k and Bob chooses secret keys s_1, . . . , s_t. It is convenient to implement t-out-n OT using a two-lock cryptosystem as follows:
Step 1. Alice sends Bob: Y_1 = A_k(m_1), . . . , Y_n = A_k(m_n).
Step 2. Bob sends Alice: Z_1 = B_{s_1}(Y_{i_1}), . . . , Z_t = B_{s_t}(Y_{i_t}).
Step 3. Alice sends Bob: C_1 = A_k^{-1}(Z_1), . . . , C_t = A_k^{-1}(Z_t).
Step 4. Bob decrypts: m_{i_1} = B_{s_1}^{-1}(C_1), . . . , m_{i_t} = B_{s_t}^{-1}(C_t).
To achieve sending privacy, Alice's encryption algorithm should meet the following security requirement: given C_1, Z_1, . . . , C_t, Z_t, it is infeasible to find k satisfying C_1 = A_k^{-1}(Z_1), . . . , C_t = A_k^{-1}(Z_t). On the other hand, if Bob's encryption is semantically secure, then receiving ambiguity is guaranteed. Clearly, in this direct construction Alice's computation complexity is O(n + t) and Bob's is O(t). The complexity of the most efficient previous construction, which requires t calls to 1-out-n oblivious transfer, is O(nt) and O(t) respectively. Our construction is more efficient.
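A compact Python sketch of this generic construction is given below; the callables standing in for A_k, A_k^{-1}, B_s and B_s^{-1} and the function signature are illustrative assumptions, so any concrete two-lock cryptosystem (such as the ones in Section 3) could be plugged in.

```python
def t_out_n_ot(enc_a, dec_a, enc_b, dec_b, alice_key, bob_keys, messages, choices):
    """Generic t-out-n OT over a two-lock cryptosystem (illustrative sketch).

    enc_a(k, x)/dec_a(k, x) are Alice's lock and unlock, enc_b(s, x)/dec_b(s, x)
    are Bob's; the two locks are assumed to commute, as the scheme requires."""
    # Step 1: Alice locks every message and sends Y_1..Y_n to Bob.
    Y = [enc_a(alice_key, m) for m in messages]
    # Step 2: Bob adds his own lock to the t boxes he chose.
    Z = [enc_b(s, Y[i]) for s, i in zip(bob_keys, choices)]
    # Step 3: Alice removes her lock from the returned boxes.
    C = [dec_a(alice_key, z) for z in Z]
    # Step 4: Bob removes his lock and reads his t messages.
    return [dec_b(s, c) for s, c in zip(bob_keys, C)]
```

Section 4.1 instantiates the two locks with modular exponentiation in a prime-order group.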
2.4 New Solution to Millionaires' Problem Using Two-Lock Cryptosystem
Suppose Alice has a secret integer a and Bob has a secret integer b, with 1 ≤ a, b ≤ n. They are willing to compare a and b, but they do not want to reveal their secrets. This problem is known as the millionaire problem. It has been extensively researched and has many applications such as sealed auctions, electronic cash, etc. It can be solved with general function evaluation techniques. In this section, we give a new solution to this problem with a two-lock cryptosystem.
Let the message space be partitioned into two disjoint sets, M_0 and M_1. Let m_{i,α} be a random message in M_α, where 1 ≤ i ≤ n, α ∈ {0,1}. The following protocol enables Alice and Bob to securely compare their secret integers.
Step 1. Alice sends Bob Y_1 = A_k(m_{1,0}), . . . , Y_a = A_k(m_{a,0}), Y_{a+1} = A_k(m_{a+1,1}), . . . , Y_n = A_k(m_{n,1}).
Step 2. Bob sends Alice Z = B_s(Y_b).
Step 3. Alice sends Bob C = A_k^{-1}(Z).
Step 4. Bob decrypts m_{b,α} = B_s^{-1}(C). If m_{b,α} ∈ M_0, Bob learns that b ≤ a; else, if m_{b,α} ∈ M_1, then b > a.
In the above millionaire protocol, Alice needs n encryptions and 1 decryption, and Bob needs 1 encryption and 1 decryption. This solution achieves the same efficiency as the classical Yao millionaire protocol. It is more efficient than the general solution with secure function evaluation techniques when n is small; for instance, Alice and Bob may compare their ages.
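The comparison protocol can be sketched on top of the same generic two-lock interface as follows; the encoding of M_0 and M_1 as a tag carried inside each message is an assumption chosen only to keep the example short.

```python
import random

def millionaire(enc_a, dec_a, enc_b, dec_b, alice_key, bob_key, a, b, n):
    """Return True if b <= a, False otherwise, following the two-lock protocol.

    Messages are (tag, nonce) pairs: tag 0 stands for membership in M_0 and
    tag 1 for membership in M_1; this encoding is an illustrative assumption."""
    # Step 1: Alice prepares Y_1..Y_n, switching from M_0 to M_1 after index a.
    Y = [enc_a(alice_key, (0 if i <= a else 1, random.getrandbits(64)))
         for i in range(1, n + 1)]
    # Step 2: Bob adds his lock to the box indexed by his own value b.
    Z = enc_b(bob_key, Y[b - 1])
    # Step 3: Alice removes her lock.
    C = dec_a(alice_key, Z)
    # Step 4: Bob removes his lock and checks which set the message lies in.
    tag, _ = dec_b(bob_key, C)
    return tag == 0          # tag 0 (M_0) means b <= a
```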
3 Concrete Two-Lock Cryptosystems

3.1 Two-Lock Cryptosystem Based on Knapsack Problem
The knapsack problem asks, given integers a_1, . . . , a_l, S and l, whether or not there are x_1, . . . , x_l satisfying x_1 a_1 + . . . + x_l a_l = S, where x_i ∈ {0,1}, 1 ≤ i ≤ l. Karp presented this problem, known as the knapsack problem, in 1972. It has been proved NP-complete and is believed to be intractable in the generic case. The difficulty of this problem is related to the knapsack density and the dimension l of the knapsack vector (a_1, . . . , a_l). The density d(a) of the knapsack vector (a_1, . . . , a_l) is defined as d(a) = l/log_2 max{a_1, . . . , a_l}. The first knapsack cryptosystem [15], presented by Merkle and Hellman in 1978, was cracked by Shamir [18] because of the intrinsic trait of its superincreasing sequence. Lagarias and Odlyzko [14] proved that any knapsack problem with knapsack density less than 0.645 could be solved in polynomial time. Chor and Rivest presented a knapsack cryptosystem over finite field arithmetic [10]. The knapsack density in this system is more than 0.645, and it is difficult to find a permutation and a modulus to convert the knapsacks into superincreasing ones. However, typical realizations of the Chor-Rivest scheme were also cryptanalyzed by Vaudenay [22] because of the known low cardinality of the subset sum and the symmetry of the trapdoor information. The reason why most knapsack cryptosystems have been cracked is that the knapsacks in these systems are special, i.e., they are either derived from superincreasing knapsacks or sparse knapsacks, or they lead to a low cardinality of the subset sum. Hence, a cryptosystem based on the knapsack problem is expected to be secure only when the knapsack density is at least 1, the knapsacks cannot be converted into superincreasing ones, and the construction does not lead to a low cardinality of the subset sum. The following two-lock cryptosystem based on the knapsack problem satisfies the above requirements.
Let t, k, n, l be security parameters. Let Alice wish to send Bob a positive integer sequence m = (s_1, . . . , s_l) = (u_{1,1}, . . . , u_{l,1}) + (v_{1,1}, . . . , v_{l,1}), where the
binary length of s_i is n and s_i ≠ s_j (i ≠ j). They begin their confidential communication as follows.

Alice: For h = 1, . . . , t, select random positive integers e_h, M_h, f_h, N_h satisfying M_h > k·max{u_{1,h}, . . . , u_{l,h}}, N_h > k·max{v_{1,h}, . . . , v_{l,h}}, (e_h, M_h) = 1, (f_h, N_h) = 1 and (M_h, N_h) = 1. For j = 1, . . . , l, select integers β_{j,h} ← Z; compute u_{j,h+1} = e_h u_{j,h} mod M_h and v_{j,h+1} = f_h v_{j,h} mod N_h. Using the Chinese Remainder Theorem, compute (y_1, . . . , y_l) such that u_{j,t+1} = y_j mod M_t and v_{j,t+1} = y_j mod N_t, i.e., y_j = (u_{j,t+1} N_t^{φ(M_t)} + v_{j,t+1} M_t^{φ(N_t)}) mod M_t N_t, where φ(·) is the Euler totient function. Then select a random integer α and send Y = (Y_1, . . . , Y_l) = (y_1 − α, . . . , y_l − α) to Bob.

Bob: Select a random nonsingular matrix B = (b_{i,j})_{l×l}, where b_{i,j} ∈_R {0,1} and the Hamming weight of each column is k. Send Z = (z_1, . . . , z_l) = Y B to Alice.

Alice: For h = t, . . . , 1, compute d_h = e_h^{-1} mod M_h and g_h = f_h^{-1} mod N_h. Let U_{j,t} = d_t(z_j + kα) mod M_t and V_{j,t} = g_t(z_j + kα) mod N_t for j = 1, . . . , l. For h = t−1, . . . , 1, calculate U_{j,h} = d_h U_{j,h+1} mod M_h and V_{j,h} = g_h V_{j,h+1} mod N_h for j = 1, . . . , l. Finally, send Bob C = (c_1, . . . , c_l) = (U_{1,1} + V_{1,1}, . . . , U_{l,1} + V_{l,1}).

Bob: Compute m = (s_1, . . . , s_l) = (c_1, . . . , c_l) B^{-1}.

Proof. For simplicity, we assume t = 1 (note that this is not suggested in practice). Since b_{1,j} + . . . + b_{l,j} = k for j = 1, . . . , l, it follows that Z = (z_1, . . . , z_l) = (y_1 − α, . . . , y_l − α) B, i.e., z_j = b_{1,j} y_1 + . . . + b_{l,j} y_l − kα for j = 1, . . . , l. Note that M_h > k·max{u_{1,h}, . . . , u_{l,h}} and N_h > k·max{v_{1,h}, . . . , v_{l,h}}. Then for j = 1, . . . , l,

c_j = U_{j,1} + V_{j,1}
    = d_1(z_j + kα) mod M_1 + g_1(z_j + kα) mod N_1
    = d_1(b_{1,j} e_1 u_{1,1} + . . . + b_{l,j} e_1 u_{l,1}) mod M_1 + g_1(b_{1,j} f_1 v_{1,1} + . . . + b_{l,j} f_1 v_{l,1}) mod N_1
    = (b_{1,j} u_{1,1} + . . . + b_{l,j} u_{l,1}) mod M_1 + (b_{1,j} v_{1,1} + . . . + b_{l,j} v_{l,1}) mod N_1
    = (b_{1,j} u_{1,1} + . . . + b_{l,j} u_{l,1}) + (b_{1,j} v_{1,1} + . . . + b_{l,j} v_{l,1})
    = b_{1,j} s_1 + . . . + b_{l,j} s_l.

Then we get the equation (c_1, . . . , c_l) = (s_1, . . . , s_l) B. Hence, (s_1, . . . , s_l) = (c_1, . . . , c_l) B^{-1}.

We give an informal analysis of the above protocol. Assume that the channel from Bob to Alice is authenticated, so adversaries cannot mount a man-in-the-middle attack. They can only intercept the following data.
(1) For j = 1, . . . , l, Y_j = ((e_t(. . . (e_1 u_{j,1} mod M_1) . . .) mod M_t) N_t^{φ(M_t)} + (f_t(. . . (f_1 v_{j,1} mod N_1) . . .) mod N_t) M_t^{φ(N_t)}) mod M_t N_t − α, where α, e_h, f_h, M_h, N_h for h = 1, . . . , t, u_{j,1} and v_{j,1} are unknown.
(2) Z = (z_1, . . . , z_l), where z_j = Y_1 b_{1,j} + . . . + Y_l b_{l,j} for j = 1, . . . , l. The matrix entries b_{i,j} ∈_R {0,1} are unknown.
(3) C = (c_1, . . . , c_l), where c_j = d_1(. . . (d_t(z_j + kα) mod M_t) . . .) mod M_1 + g_1(. . . (g_t(z_j + kα) mod N_t) . . .) mod N_1 for j = 1, . . . , l. Here α, d_h, g_h, M_h and N_h for h = 1, . . . , t are unknown.
It is impossible for adversaries to extract information about α, d_h, g_h, M_h and N_h for h = 1, . . . , t and (s_1, . . . , s_l) from (1). If adversaries wish to extract information about α, d_h, g_h, M_h and N_h from (3), they need to find α, d_h, g_h, M_h and N_h for h = 1, . . . , t satisfying c_j = d_1(. . . (d_t(z_j + kα) mod M_t) . . .) mod M_1 + g_1(. . . (g_t(z_j + kα) mod N_t) . . .) mod N_1 for j = 1, . . . , l. This is computationally infeasible when the space to which α, d_h, g_h, M_h and N_h belong is large enough and the security parameters l, t are also large. Even if they found α, d_h, g_h, M_h and N_h satisfying equation (3), they would not know whether these values also satisfy (1), due to the denseness of the rational numbers. Assume that the adversaries intend to find a nonsingular matrix (b_{i,j})_{l×l} from (2) satisfying z_j = b_{1,j} y_1 + . . . + b_{l,j} y_l; this is a random knapsack problem with approximate density l/log_2(M_t N_t). Let l ≥ 1000, t ≥ 50, k = 128, n = 100 and M_t N_t ≤ 2^{900}. Then d(a) > 1, so it is secure against attacks with the L³ algorithm. Because the knapsack is generated in a random way, it cannot be converted into a superincreasing knapsack, so it is secure under attacks with Shamir's algorithm. Conversely, if adversaries found a nonsingular matrix (b_{i,j})_{l×l} from (2) satisfying z_j = y_1 b_{1,j} + . . . + y_l b_{l,j}, they would be able to decrypt the message as Bob does, i.e., they would have found an equivalent key. However, finding such an equivalent key is as difficult as finding the actual key used by Bob.
3.2 Two-Lock Cryptosystem Based on Discrete Logarithm
Let G be a cyclic multiplicative group whose order q is a large prime such that it is infeasible to calculate discrete logarithms in G. Typically, G is the set of quadratic residues of Z_p^*, where p = 2q + 1 is also prime, or G is the set GF(2^t)\{0}, where t = 2^s − 1 is a prime, e.g., t = 2^{11} − 1. Any element in G\{1} is a generator of G. Let Alice wish to send Bob a secret message m ∈ G\{1}. They run the following protocol:
1. 2. 3. 4.
Alice chooses a random integer x ∈ Zq∗ , and sends Bob X = mx . Bob chooses a random integer y ∈ Zq∗ , and sends Alice Y = X y . Alice calculates u = x−1 mod q, and sends Bob Z = Y u . Bob computes v = y −1 mod q, and gets message m = Z v .
The above cryptosystem is secure unless an adversary can compute discrete logarithms; however, this is infeasible if q is a large prime. It is a slight extension of the commutative encryption in [3].
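A runnable Python sketch of this exchange over the quadratic residues modulo a safe prime is shown below; the tiny demo parameters (p = 23, q = 11) are far too small for real use and are chosen only so the example is self-contained.

```python
# Two-lock exchange based on discrete logarithms, as a minimal sketch.
# Demo parameters only: p = 23 is a safe prime (p = 2q + 1 with q = 11); G is
# the subgroup of quadratic residues mod p, which has prime order q.
import random

p, q = 23, 11

def keygen():
    """Pick a random exponent in Z_q^* together with its inverse mod q."""
    x = random.randrange(1, q)
    return x, pow(x, -1, q)        # modular inverse needs Python 3.8+

def demo(m):
    x, u = keygen()                # Alice's key pair (x, x^{-1} mod q)
    y, v = keygen()                # Bob's key pair (y, y^{-1} mod q)
    X = pow(m, x, p)               # Step 1: Alice locks m
    Y = pow(X, y, p)               # Step 2: Bob adds his lock
    Z = pow(Y, u, p)               # Step 3: Alice removes her lock
    return pow(Z, v, p)            # Step 4: Bob removes his lock, recovering m

m = 4                              # 4 = 2^2 is a quadratic residue mod 23, so m is in G
assert demo(m) == m
```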
4 t-out-n Oblivious Transfer Protocols
Assume Alice possesses n (string) secrets m_1, m_2, . . . , m_n and would like to reveal t of them to Bob. It is insecure to directly construct t-out-n oblivious transfer as shown in Section 2.3 with the above two-lock cryptosystems. Some
modifications are required. We first consider the case of the two-lock cryptosystem based on the Knapsack problem. A plausible modification is that Alice runs an (l, l) threshold secret sharing protocol on each secret before Alice and Bob execute the oblivious transfer, to prevent Bob from obtaining combined information about the secrets outside his choice. In the case of the two-lock cryptosystem based on the discrete logarithm, the direct construction is secure if m_1, m_2, . . . , m_n are distributed uniformly in G\{1}. However, if the sizes of m_1, m_2, . . . , m_n are short, the direct construction is insecure. In this case, the messages need padding with random strings. We only give concrete constructions of t-out-n oblivious transfer based on the discrete logarithm problem; constructions based on the Knapsack problem are similar.

4.1 t-out-n Oblivious Transfer Based on Discrete Logarithm
For simplicity, we assume that a secure padding has been applied in the following description. Consider the protocol in the honest-but-curious model, in which both Alice and Bob are assumed to be honest but try to obtain more information than they are entitled to. The scheme given here can be considered an efficient extension of the scheme presented in [3].
Step 1. Alice chooses a random integer x ∈ Z_q^*, and sends Bob X_1 = m_1^x, . . . , X_n = m_n^x.
Step 2. Bob chooses random integers y_1, . . . , y_t ∈ Z_q^*, and sends Alice Y_1 = X_{i_1}^{y_1}, . . . , Y_t = X_{i_t}^{y_t}, where X_{i_1}, . . . , X_{i_t} ∈ {X_1, . . . , X_n}.
Step 3. Alice calculates a = x^{-1} mod q, and sends Bob Z_1 = Y_1^a, . . . , Z_t = Y_t^a.
Step 4. Bob computes b_1 = y_1^{-1} mod q, . . . , b_t = y_t^{-1} mod q, and gets the messages m_{i_1} = Z_1^{b_1}, . . . , m_{i_t} = Z_t^{b_t}.
The scheme takes only three rounds. Alice needs to send n + t elements of G to Bob and Bob needs to send t elements to Alice. In Tzeng's scheme, Alice needs to send nt elements to Bob and Bob needs to send t elements to Alice, so our scheme consumes less bandwidth. For computation, Alice needs n + t modular exponentiations and Bob needs 2t modular exponentiations. In Tzeng's scheme, Alice needs 2nt modular exponentiations and Bob needs 2t modular exponentiations. Hence, our scheme is also more efficient in terms of computation.
If Alice and Bob follow the protocol, Bob will get t secrets after finishing the protocol; this is obvious. Bob gets no information about the other n − t secrets if the discrete logarithm problem in G is infeasible. The choice of Bob is unconditionally secure. This is due to the following fact: since X_1, . . . , X_n are generators of G, for any Y_i in G there exist r_1, . . . , r_n such that Y_i = X_1^{r_1}, . . . , Y_i = X_n^{r_n}. Therefore, Alice cannot get any information about Bob's choice even if she has unlimited computing power.
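For illustration, the basic scheme can be coded directly with modular exponentiation as sketched below; the toy safe-prime parameters are reused from the earlier two-lock example and the padding step is omitted, so this demonstrates the message flow only and is not a secure implementation.

```python
import random

p, q = 23, 11                       # toy safe prime p = 2q + 1; G = quadratic residues mod p

def basic_t_out_n_ot(messages, choices):
    """Run the basic DL-based t-out-n OT locally and return Bob's t messages."""
    # Step 1: Alice blinds every message with her secret exponent x.
    x = random.randrange(1, q)
    X = [pow(m, x, p) for m in messages]
    # Step 2: Bob blinds the t values he chose with fresh exponents y_j.
    ys = [random.randrange(1, q) for _ in choices]
    Y = [pow(X[i], y, p) for i, y in zip(choices, ys)]
    # Step 3: Alice removes her blinding with a = x^{-1} mod q.
    a = pow(x, -1, q)
    Z = [pow(Yj, a, p) for Yj in Y]
    # Step 4: Bob removes his own blinding and recovers m_{i_1}, ..., m_{i_t}.
    return [pow(Zj, pow(y, -1, q), p) for Zj, y in zip(Z, ys)]

# Hypothetical run with n = 4 quadratic residues as messages and t = 2 choices.
msgs = [2, 4, 8, 16]                # powers of 2 mod 23 lie in the QR subgroup
print(basic_t_out_n_ot(msgs, [1, 3]))   # expected: [4, 16]
```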
4.2 Publicly Verifiable t-out-n Oblivious Transfer
In applications that require a high standard of security, it is important to enable anyone to verify that Alice sends Bob the secrets as committed and that Bob also chooses
the secrets according to his former commitment. The following protocol meets these security requirements. We first present some useful zero-knowledge proofs to achieve public verifiability for the t-out-n oblivious transfer. Let g, h, g_1, . . . , g_n, h_1, . . . , h_n be independent generators of G. H(·): {0,1}* → {0,1}* is a publicly known hash function. Here, ZKP{x | R(x)} means a zero-knowledge proof that the prover knows a secret x such that R(x) is true.

Zero-knowledge Proof of Equality of Discrete Logarithms. The efficient protocol described below allows Alice to prove to Bob that she knows an integer x satisfying y_i = g_i^x for i = 1, . . . , n, where y_1, . . . , y_n are publicly known.
Step 1. Select a random integer u, and compute c = H(g_1^u || . . . || g_n^u), a = u − cx mod q. The resulting witness is (c, a).
Step 2. The proof is valid if c = H(g_1^a y_1^c || . . . || g_n^a y_n^c).
It is a generalization of the zero-knowledge proof ZKP{x | y_1 = g_1^x ∧ y_2 = g_2^x} due to Chaum and Pedersen [9]. We denote the above protocol by ZKP{x | y_1 = g_1^x ∧ . . . ∧ y_n = g_n^x}. This protocol is a computational zero-knowledge proof.

Zero-knowledge Proof of Partial Discrete Logarithms. The following efficient zero-knowledge proof allows Alice to prove that she knows y, k such that Y = g_k^y ∧ k ∈ {1, . . . , n}.
Step 1. Compute x = y^{-1} mod q.
Step 2. Select random u and c_i for i = 1, . . . , n, i ≠ k, and compute c = H(Y^u g_1^{c_1} . . . g_{k−1}^{c_{k−1}} g_{k+1}^{c_{k+1}} . . . g_n^{c_n}).
Step 3. Compute c_k = c ⊕ (c_1 ⊕ · · · ⊕ c_{k−1} ⊕ c_{k+1} ⊕ · · · ⊕ c_n), s = u − c_k x mod q.
The resulting witness is (s, c_1, · · · , c_n). The proof is valid if c_1 ⊕ · · · ⊕ c_n = H(Y^s g_1^{c_1} . . . g_n^{c_n}). This is derived from [7]. There is a similar scheme in [1] which works as a 1-out-n signature. We denote the above protocol by ZKP{y, k | Y = g_k^y ∧ k ∈ {1, . . . , n}}.

Proposed Publicly Verifiable t-out-n Oblivious Transfer. With the above zero-knowledge proofs, the following publicly verifiable t-out-n oblivious transfer protocol enables anyone to verify that Bob will get t out of n secrets from Alice as they committed.
Step 0. Alice chooses random integers r_1, . . . , r_n ∈ Z_q^*, and publishes (u_1, v_1) = (g^{r_1}, m_1 h^{r_1}), . . . , (u_n, v_n) = (g^{r_n}, m_n h^{r_n}) as her commitments to the secrets m_1, m_2, . . . , m_n. Bob chooses random integers y_1, . . . , y_t ∈ Z_q^*, and publishes w_1 = g_{i_1}^{y_1}, . . . , w_t = g_{i_t}^{y_t} as his commitments to his choices i_1, . . . , i_t ∈ {1, . . . , n}.
Step 1. Alice chooses a random integer x ∈ Zq*, and publishes X_1 = m_1^x, ..., X_n = m_n^x; (U_1, V_1) = (u_1^x, v_1^x), ..., (U_n, V_n) = (u_n^x, v_n^x); ZKP{x | U_1 = u_1^x ∧ V_1 = v_1^x ∧ ... ∧ U_n = u_n^x ∧ V_n = v_n^x}; ZKP{xr_1 | U_1 = g^{xr_1} ∧ V_1/X_1 = h^{xr_1}}, ..., ZKP{xr_n | U_n = g^{xr_n} ∧ V_n/X_n = h^{xr_n}}.
Step 2. Bob verifies the above zero-knowledge proofs. If the check fails, Bob aborts the protocol; otherwise Bob publishes Y_1 = X_{i_1}^{y_1}, ..., Y_t = X_{i_t}^{y_t}, where i_j ∈ {1, ..., n} for j = 1, ..., t, together with ZKP{y_j | Y_j/w_j = (X_{i_j}/g_{i_j})^{y_j} ∧ i_j ∈ {1, ..., n}} for j = 1, ..., t.
Step 3. Alice checks the above zero-knowledge proofs. If the check fails, Alice aborts the protocol; otherwise Alice calculates a = x^{-1} mod q, and publishes Z_1 = Y_1^a, ..., Z_t = Y_t^a, ZKP{x | Y_1 = Z_1^x ∧ ... ∧ Y_t = Z_t^x ∧ U_1 = u_1^x}.
Step 4. Bob validates the above zero-knowledge proof, computes b_j = y_j^{-1} mod q for j = 1, ..., t, and obtains the messages m_{i_j} = Z_j^{b_j} for j = 1, ..., t.
After the above protocol is fulfilled, a verifier is convinced that Bob will get t secrets, in accordance with the choice hidden in his former commitments, from the n secrets committed by Alice. A trivial proof shows that this t-out-n OT scheme does not degrade the unconditional ambiguity of the receiver's choice.
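For concreteness, the equality-of-discrete-logarithms proof used above can be sketched as follows, in the same toy subgroup as the earlier sketch; the hash-to-Zq construction is our own assumption, and the "independent generators" here are merely illustrative (in practice they must be generated so that no discrete-log relation among them is known).

```python
import hashlib, random

q = 1019; p = 2 * q + 1                      # same toy group as before

def H(*elems):
    data = "||".join(str(e) for e in elems).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def prove_eq_dlog(x, gens):
    """Produce y_i = g_i^x and a witness (c, a) that all y_i share the exponent x."""
    ys = [pow(g, x, p) for g in gens]
    u = random.randrange(1, q)
    c = H(*[pow(g, u, p) for g in gens])
    a = (u - c * x) % q
    return ys, (c, a)

def verify_eq_dlog(gens, ys, witness):
    c, a = witness
    # Valid iff c = H(g_1^a y_1^c || ... || g_n^a y_n^c), since g^a y^c = g^u.
    return c == H(*[(pow(g, a, p) * pow(y, c, p)) % p for g, y in zip(gens, ys)])

gens = [4, 9, 16]                            # illustrative subgroup elements
x = random.randrange(1, q)
ys, wit = prove_eq_dlog(x, gens)
assert verify_eq_dlog(gens, ys, wit)
```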
4.3 Distributed t-out-n Oblivious Transfer Scheme
For a distributed oblivious transfer scheme, there are three types of parties: one sender Alice, p servers S_1, S_2, ..., S_p, and one receiver Bob. Alice has n secrets m_1, m_2, ..., m_n. Let Γ = {τ_1, ..., τ_λ} be a monotone access structure over the p servers S_1, S_2, ..., S_p. Each τ_i = {S_{i_1}, S_{i_2}, ..., S_{i_δ}} is an authorized set of servers such that all servers in τ_i together can reconstruct the shared secret. Assume that the n messages m_1, m_2, ..., m_n are shared according to Γ by some secret sharing scheme S such that Reconstruct(S(i, τ)) = m_i if and only if τ ∈ Γ, where S(i, τ) denotes the shares of m_i held by the servers in τ and Reconstruct(·) is the secret reconstruction algorithm. By [16], a distributed oblivious transfer scheme should meet the following requirements:
Correctness: if Alice and the servers follow the protocol and Bob receives information from the servers in some τ ∈ Γ, Bob can compute the t secrets m_{i_1}, ..., m_{i_t}, where i_1, ..., i_t are his choice.
Sender's privacy: even if Bob receives information from a set of servers which contains an authorized set, he gains no information about any other m_i, i ∉ {i_1, ..., i_t}. Furthermore, if Bob receives information from a set of servers which is not contained in any authorized set, he gains no information about any m_i, 1 ≤ i ≤ n.
Receiver's ambiguity: any set of servers which is not contained in any authorized set cannot gain any information about Bob's choice of secrets.
Security against receiver-server collusion: after Bob gets m_{i_1}, ..., m_{i_t}, any set of servers which is not contained in any authorized set cannot gain any information about any other m_i, i ∉ {i_1, ..., i_t}.
We combine our t-out-n oblivious transfer scheme with a general secret sharing scheme S to form a t-out-n Γ-OT scheme as follows.
Step 1. Server S_j obtains the shares m_{i,j} of m_i from the secret sharing scheme S, 1 ≤ i ≤ n, 1 ≤ j ≤ p.
Step 2. Server S_j chooses a random integer x_j ∈ Zq*, and sends Bob X_{1,j} = m_{1,j}^{x_j}, ..., X_{n,j} = m_{n,j}^{x_j}, 1 ≤ j ≤ p.
Step 3. Let τ be an authorized set whose servers Bob contacts. Bob contacts S_j ∈ τ with Y_{1,j} = X_{i_1,j}^{y_{1,j}}, ..., Y_{t,j} = X_{i_t,j}^{y_{t,j}}, where X_{i_1,j}, ..., X_{i_t,j} ∈ {X_{1,j}, ..., X_{n,j}}. S_j responds with Z_{1,j} = Y_{1,j}^{a_j}, ..., Z_{t,j} = Y_{t,j}^{a_j}, where a_j = x_j^{-1} mod q.
Step 4. Bob computes m_{i_1,j}, ..., m_{i_t,j} for each S_j ∈ τ and recovers m_{i_k} from the shares m_{i_k,j} with S_j ∈ τ, 1 ≤ k ≤ t.
The scheme meets the above requirements if the underlying secret sharing scheme is secure. The receiver's choice is unconditionally ambiguous, and the sender's privacy is guaranteed if the discrete logarithm problem in G is difficult.
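As an illustration only, the sketch below instantiates this combination with the simplest possible access structure (Γ containing the single set of all p servers) and with multiplicative p-out-of-p sharing standing in for the unspecified scheme S; all parameter choices and names are ours.

```python
import random

q = 1019; P = 2 * q + 1                      # toy subgroup of quadratic residues mod P
g = 4
rand_exp = lambda: random.randrange(1, q)

num_servers, n, t = 3, 4, 2
messages = [pow(g, rand_exp(), P) for _ in range(n)]

# Share each m_i multiplicatively among the servers (our stand-in for S):
# m_i = prod_j m_{i,j} mod P, so only the full server set can reconstruct.
def share(m):
    parts = [pow(g, rand_exp(), P) for _ in range(num_servers - 1)]
    prod = 1
    for s in parts:
        prod = prod * s % P
    parts.append(m * pow(prod, -1, P) % P)
    return parts

shares = [share(m) for m in messages]        # shares[i][j] = m_{i,j}

# Step 2: each server S_j blinds its shares with its own x_j.
xs = [rand_exp() for _ in range(num_servers)]
X = [[pow(shares[i][j], xs[j], P) for i in range(n)] for j in range(num_servers)]

# Step 3: Bob runs the two-lock exchange with every server for his t choices.
choice = [0, 2]
recovered_shares = [[None] * num_servers for _ in range(t)]
for j in range(num_servers):
    ys = [rand_exp() for _ in range(t)]
    Y = [pow(X[j][i], yk, P) for i, yk in zip(choice, ys)]
    a_j = pow(xs[j], -1, q)
    Z = [pow(Yk, a_j, P) for Yk in Y]                 # server removes its lock
    for k in range(t):
        recovered_shares[k][j] = pow(Z[k], pow(ys[k], -1, q), P)

# Step 4: Bob reconstructs m_{i_k} by multiplying the recovered shares.
for k, i in enumerate(choice):
    m = 1
    for s in recovered_shares[k]:
        m = m * s % P
    assert m == messages[i]
```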
5 5.1
Applications Efficient Private Information Retrieval (PIR)
Efficient string oblivious transfer schemes can improve the practical efficiency of schemes in which oblivious transfer is used. One primary application is private information retrieval (PIR), in which a user (U) wants to query some data blocks from a database without the database manager (DBM) learning which data blocks he is interested in [8]. Regular PIR does not restrict U to obtaining only one data block of the database. When U intends to obtain more than one block, our scheme is more efficient than known schemes. Furthermore, Step 0 can be precomputed to improve the efficiency of the system. Assume that the database has n data blocks m_1, ..., m_n, each in G. The following steps enable U to obtain the data blocks m_{i_1}, ..., m_{i_t} in which U is interested.
Step 0. DBM chooses a random integer x ∈ Zq* and publishes X_1 = m_1^x, ..., X_n = m_n^x.
Step 1. U chooses random integers y_1, ..., y_t ∈ Zq*, and sends Y_1 = X_{i_1}^{y_1}, ..., Y_t = X_{i_t}^{y_t} to DBM, where X_{i_1}, ..., X_{i_t} ∈ {X_1, ..., X_n}.
Step 2. DBM calculates u = x^{-1} mod q, and sends Z_1 = Y_1^u, ..., Z_t = Y_t^u to U.
Step 3. U computes v_1 = y_1^{-1} mod q, ..., v_t = y_t^{-1} mod q, and obtains the messages m_{i_1} = Z_1^{v_1}, ..., m_{i_t} = Z_t^{v_t}.
5.2
Secure Comparison of Secret Integers
We consider the two-party case in the honest-but-curious model. Let Alice have a secret integer a and Bob have a secret integer b, and suppose it is known that 1 ≤ a, b ≤ n. They wish to compare a with b without revealing their secrets. This
problem is known as the millionaires' problem. It can be solved with general function evaluation techniques. In this section, we give a solution to this problem with the two-lock cryptosystem. Let the message space be partitioned into two sets, M_0 and M_1. The following protocol enables Alice and Bob to securely compare their secret integers.
Step 1. Alice chooses a random integer x ∈ Zq* and sends X_1 = m_{1,0}^x, ..., X_a = m_{a,0}^x, X_{a+1} = m_{a+1,1}^x, ..., X_n = m_{n,1}^x to Bob, where m_{i,α} is a random message in M_α, 1 ≤ i ≤ n and α ∈ {0, 1}.
Step 2. Bob chooses a random integer y ∈ Zq*, and sends Y = X_b^y to Alice, where b ∈ {1, ..., n}.
Step 3. Alice calculates e = x^{-1} mod q, and sends Z = Y^e to Bob.
Step 4. Bob computes d = y^{-1} mod q, and gets the message m_{b,α} = Z^d. If m_{b,α} ∈ M_0, Bob learns that b ≤ a; otherwise b > a. Bob tells Alice the result of the comparison.
In the above scheme, Alice cannot learn Bob's secret integer b even if she has unlimited computational power. Bob cannot learn Alice's secret integer a if the discrete logarithm problem in G is infeasible. Alice requires n + 1 modular exponentiations and Bob requires 2 modular exponentiations.
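A toy sketch of this comparison protocol follows, in the same toy subgroup as before. Since the paper does not fix how M_0 and M_1 are distinguished, we assume (for illustration only) that the partition is given by the low bit of a hash of the group element; everything else follows the four steps above.

```python
import hashlib, random

q = 1019; p = 2 * q + 1                       # toy prime-order subgroup as before
g = 4
rand_exp = lambda: random.randrange(1, q)

def in_M(m, alpha):
    """Assumed public partition: m lies in M_alpha iff SHA-256(m) has low bit alpha."""
    return (hashlib.sha256(str(m).encode()).digest()[-1] & 1) == alpha

def sample_M(alpha):
    while True:                                # rejection-sample a random element of M_alpha
        m = pow(g, rand_exp(), p)
        if in_M(m, alpha):
            return m

def compare(a, b, n=10):
    """Returns True iff b <= a, following the two-lock comparison protocol."""
    # Step 1: Alice encodes positions 1..a with M0 messages, a+1..n with M1.
    msgs = [sample_M(0) if i <= a else sample_M(1) for i in range(1, n + 1)]
    x = rand_exp()
    X = [pow(m, x, p) for m in msgs]
    # Step 2: Bob picks position b and blinds it.
    y = rand_exp()
    Y = pow(X[b - 1], y, p)
    # Step 3: Alice removes her lock.
    Z = pow(Y, pow(x, -1, q), p)
    # Step 4: Bob removes his lock and checks which half the message lies in.
    m_b = pow(Z, pow(y, -1, q), p)
    return in_M(m_b, 0)                        # membership in M0 means b <= a

assert compare(a=7, b=3)
assert not compare(a=2, b=9)
```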
6
Concluding Remarks
In this paper we introduced a new cryptographic primitive, the two-lock cryptosystem, which enables Alice to send Bob a secret without a shared secret key. We then gave general constructions of t-out-n (string) oblivious transfers and millionaires' protocols using the two-lock cryptosystem, and introduced two concrete two-lock cryptosystems: one based on the knapsack problem, the other based on the discrete logarithm problem. In the proposed t-out-n (string) oblivious transfer schemes, Alice cannot determine which t messages Bob received even if she has unlimited computational power, while Bob cannot learn the other n − t messages if the discrete logarithm problem is infeasible. The proposed protocols require a constant number of rounds of communication. In our scheme based on the discrete logarithm problem, Alice requires n + t modular exponentiations and Bob requires 2t modular exponentiations. We also improved our basic t-out-n oblivious transfer scheme with public verifiability and extended it to distributed oblivious transfers. As applications, efficient PIR schemes and a millionaires' protocol are built. In the proposed PIR scheme, when users intend to obtain more than one block, our scheme is very efficient and practical. In our millionaires' protocol, Alice requires n + 1 modular exponentiations and Bob requires 2 modular exponentiations. It is practical in many applications such as electronic auctions and age-comparing scenarios.
References
1. M. Abe, M. Ohkubo, and K. Suzuki. 1-out-of-n Signatures from a Variety of Keys. ASIACRYPT 2002, pages 415–432, 2002.
2. G. Brassard, C. Cr´epeau. Oblivious Transfers and Privacy Amplification. EUROCRYPT’97, pages 334–346, 1997. 3. F. Bao, R. Deng, P. Feng. An Efficient and Practical Scheme for Privacy Protection in E-commerce of Digital Goods. ICICS’03, pages 162–170. 2000. 4. G. Brassard, C. Cr´epeau, J.-M. Robert. Information Theoretic Reduction among Disclosure Problems. 27th IEEE Symposium on Foundations of Computer Science, pages 168–173, 1986. 5. G. Brassard, C. Crepeau, M. Santha. Oblivious Transfer and Intersecting Codes. IEEE Trans. on Inf. Th., special issue in coding and complexity, Vol. 42, No. 6, pages 1769–1780, 1996. 6. M. Ben-Or, S. Goldwasser, A. Wigderson. Completeness Theorems for Noncryptographic Fault-tolerant Distributed Computation. 20th ACM Symposium on the Theory of computing, pages 1–10, 1988. 7. R.Cramer, I. Damgard, and B. Schoenmakers. Proofs of Partial Knowledge and Simplified Design of Witness Hiding Protocols. CRYPTO’94, pages 174–187, 1994. 8. B. Chor, O. Goldreich, E. Kushilevitz, M. Susdan. Private Information Retrieval, Journal of the ACM 45(6), pages 965–982, 1998. 9. D. Chaum and T. Pedersen. Transferred Cash Grows in Size. EUROCRYPT’92, pages 390–407, 1993. 10. B. Chor, R. L. Rivest. A Knapsack Type Public-key Cryptosystem Based on Arithmetic in Finite Field. CRYPTO’84, pages 54–65, 1985. 11. S. Even, O. Goldreich, A. Lempel. A Randomized Protocol for Signing Contracts, Communications of the ACM 28, pages 637–647, 1985. 12. O. Goldreich, R. Vainish. How to Solve any Protocol Problem: An Efficient Improvement. CRYPTO’87, pages 73–86, 1988. 13. H. A. Hussain, J. W. A. Sada, and S. M. Kalipha. New Multistage Knapsack Public-key Cryptosystem. International Journal of Systems Science, Vol. 22, No. 11, pages 2313–2320, Nov. 1991. 14. J. C. Lagarias, and A. M. Odlyzko. Solving Low-density Subset Sum Problems. 24th IEEE Symposium on Foundations of Computer Science, pages 1–10, 1983. 15. R. C. Merkle, and M. Hellman. Hiding Information and Signatures in Trapdoor Knapsack. IEEE Transactions on Information Theory, Vol.24, No.5, pages 525– 530, 1978. 16. M. Naor, B. Pinkas. Distributed Oblivious Transfer. ASIACRYPT’00, pages 205– 219, 2000. 17. M. Rabin. How to Exchange Secrets by Oblivious Transfer. Technical Report TR81, Aiken Computation Laboratory, Harvard University, 1981. 18. A. Shamir, and A. Fiat. On the Security of the Merkle-Hellman Cryptographic Scheme. IEEE Trans. On Information Theory, Vol.26, No.3, pages 339–340, May 1980. 19. J. P. Stern. A New and Efficient All-or-nothing Disclosure of Secrets Protocol. ASIACRYPT’98, pages 357–371, 1998. 20. A. Salomaa, L. Santean. Secret Selling of Secrets with Several Buyers. 42nd EATCS Bulletin, pages 178–186, 1990. 21. W. Tzeng. Efficient 1-out-of-n Oblivious Transfer Schemes. PKC’02, pages 159– 171, 2002. 22. S. Vaudenay. Cryptanalysis of the Chor-Rivest Cryptosystem. CRYPTO’98, pages 243–256, 1998.
Adaptive Collusion Attack to a Block Oriented Watermarking Scheme Yongdong Wu and Robert Deng Institute for Infocomm Research 21, Heng Mui Keng Terrace, Singapore, 119613 {wydong,deng}@i2r.a-star.edu.sg
Abstract. In this paper, we propose an adaptive collusion attack to a block oriented watermarking scheme [1]. In this attack, traitors conspire to selectively manipulate watermarked blocks so as to remove the watermark information. To this end, the traitors compare the watermarked blocks generated from the same original block. If two watermarked blocks are not equal, they average these two blocks to generate a pirated block, and then replace the watermarked blocks with the pirated blocks so as to build a pirated image. The pirated image carries no watermark but has much higher quality than the watermarked images. We also give a theoretical analysis of the probability of successful traitor tracing. Both theoretical and experimental results demonstrate that our attack is very effective when four or more traitors are involved in the collusion. In the case of fewer than four traitors, we show how to integrate our collusion attack with an adaptive random attack to improve the quality of the pirated images as well as to defeat the tracer.
1
Introduction
The rapid development of computer networks and the increased use of multimedia data via the Internet have resulted in faster and more convenient exchange of digital information. With the ease of editing and perfect reproduction, protection of ownership and prevention of unauthorized manipulation of digital audio, image and video materials have become important concerns. Digital watermarking is a technique used to identify ownership and fight piracy in digital distribution networks. Its principle is to embed special labels in digital contents so as to degrade the quality of pirated copies, or to identify at least one traitor with high probability. In recent years, researchers have made considerable progress in watermarking schemes [2,3,4,5,6] which are more and more robust against many traditional attacks, such as nonlinear geometric attacks and common image transformations. However, most invisible watermarking schemes are prone to collusion attacks under a very general framework. Such attacks do not consider any specific watermarking scheme, given that the probability of implicating an innocent user is reasonably low. In a collusion attack, a group of traitors collectively obtains an average of their individually watermarked copies and escapes from being identified. Ergun et al. [7] proved that the upper bound on the size of the traitor
group is O(√(n/ln n)) when no traitor is captured, where n is the size of the cover signal. For example, given an image of size n = 512 × 512, the number of traitors required is roughly 145 for a successful attack. This result is of more importance in theory than in practice because the number of traitors is too big unless the target image is of high value; for a low-value image, it is probably not worth collecting that many watermarked images. However, from the viewpoint of watermarking designers, a good watermarking scheme should approach this upper bound. Celik et al. [8] propose a collusion-resilient watermarking method, wherein the host signal is pre-warped randomly prior to watermarking. As each copy undergoes a distinctive warp, Celik et al. claimed that collusion through averaging either yields low-quality results or requires substantial computational resources to undo the random warps. At the time of writing of this paper, we only have access to the abstract of that paper. In ICICS'02, Das and Maitra [1] presented an invisible spatial-domain watermarking scheme to defeat many attacks, including some collusion attacks. The scheme divides the image into small blocks and modifies the intensity of some blocks depending on the bit values of a secret key. Given a watermarked image which is suspected of being attacked, the recovery process traces back the exact key value using either a standard correlation measure or error correcting codes. This method can survive nonlinear geometric attacks, common image transformations, and intentional attacks both in the spatial and the frequency domains. The experiments presented by the authors of [1] demonstrated that this watermarking scheme can withstand certain collusion attacks such as average, minimum and maximum attacks. However, our collusion attack to be presented in this paper breaks the scheme when four or more traitors collude. Our collusion attack on the watermarking scheme of [1] is adaptive in nature. In this attack, the traitors conspire to select watermarked blocks. When the traitors find two different blocks produced from the same original block, they average these two blocks to obtain a pirated block, and substitute the watermarked blocks with the pirated blocks so as to create a pirated image. This attack not only removes the watermark, but also recovers 88% or 94% of the manipulated blocks with the conspiracy of four or five traitors, respectively. As the number of traitors increases, the quality of the pirated image improves exponentially. This fact may lure users to join the traitor group in order to get a high-quality copy of an image. To create a pirated image when fewer than 4 traitors are available, we propose an adaptive random attack which creates a pirated image of degraded quality. Then we combine the collusion attack and the adaptive random attack to generate a pirated image of good quality without revealing any traitors to the tracer. We give a theoretical analysis of the probability of successful traitor tracing. We also implement the attack; the experimental results are in concert with the theoretical conclusions. This paper is organized as follows. Section 2 introduces the scheme addressed in [1]. Section 3 first shows the general collusion attacks mentioned in [1], followed by detailed descriptions of our attack; we also elaborate the analysis of the attack. Section 4 contains the results of our experiments which demonstrate
the efficiency of our attack and the improvement of the quality of the pirated image.
2
Overview of the Das and Maitra Watermarking Scheme
The watermarking scheme proposed by Das and Maitra [1] is block oriented: an image I is divided into n blocks of size β × β and the blocks are scanned in raster order, i.e., from left to right and then from top to bottom. Denote the j-th block as I_j, j = 0, 1, ..., n − 1, and denote by U the β × β block whose elements are all 1s. In the following, all image block operations are matrix operations unless stated otherwise. 2.1
Watermark Embedding Process
In the embedding process, an image owner produces a unique secret key for each user. Using this key, a unique watermarked image W for the user is generated. Let π(·) be a pre-defined permutation of the n integers 0, 1, ..., n − 1 (π(·) is the same for all users but unknown to any user). The owner embeds a watermark or key as follows:
1. For each user, select a random key k = (k_0, k_1, ..., k_{m−1}) of length m, where m < n. Let s = 0.
2. Let j = π(s); thus I_j is a block for watermarking.
3. Calculate the minimum ψ_l and the maximum ψ_h of the intensities of the block I_j.
4. Calculate δ = max(µ, α(ψ_h − ψ_l)), where µ and α are constants (1 ≤ µ ≤ 3 and 0.05 ≤ α ≤ 0.10).
5. If k_s = 1, let W_j = I_j + δU; otherwise W_j = I_j − δU.
6. Let s ← s + 1. If s ≤ m − 1, go to step (2).
7. All the blocks W_j obtained in step (5) and the n − m non-watermarked blocks I_{π(s)} (m ≤ s ≤ n − 1) are assembled to form the watermarked image W.
Steps (2)–(5) embed one key bit into an image block to generate a watermarked block, and step (6) repeats the one-bit embedding process until all the key bits are used up. After the embedding process, the image owner gives the watermarked image to the user and inserts the key k and the user information into a secret database, which is kept securely by the image owner. 2.2
Watermark Retrieving Process
The watermark retrieving process requires the availability of the inspected image, the original image and the secret database. The original scheme [1] was designed to foil malicious geometric operations such as the affine transformation and image cropping which are not related to our discussion. Therefore, we will ignore the steps related to counter geometric operations. The simplified process
for retrieving the embedded values is as follows:
0. Pre-processing: compute the difference between the non-watermarked blocks of the original image and those of the inspected image so as to increase robustness.
1. Let s = 0.
2. Set j = π(s); then (I_j, W_j) is a pair of cover block and watermarked block.
3. Write the j-th inspected block whose top-left location is (x, y) as W_j(x, y) and the original block whose top-left location is (x, y) as I_j(x, y). Calculate the difference between W_j(x, y) and each neighbouring block I_j(x + ∆x, y + ∆y), and select the minimum difference as the similarity measure Γ_{x,y} between the inspected block and the original block. That is, Γ_{x,y} = min(W_j(x, y) − I_j(x + ∆x, y + ∆y)) over all ∆x, ∆y = −0.5c to 0.5c, where c is a predefined constant and the minus operation sums the pixel differences of the two blocks.
4. If Γ_{x,y} > 0, let k′_s = 1; if Γ_{x,y} < 0, let k′_s = 0; and if Γ_{x,y} = 0, take k′_s = 0 or 1 depending on the outcome of a coin toss.
5. Let s ← s + 1. If s ≤ m − 1, go to step (2).
6. Read the key k of each record in the secret database. Define the correlation factor between the retrieved value k′ and the inspected key k as corr = 1 − d(k, k′)/m, where d(k, k′) is the Hamming distance between k and k′, i.e., the number of differing bits between the two keys. If corr > γ, k is regarded as the correct key; otherwise k is rejected, where γ ≥ 0.5.
Step (3) finds a matched window so as to increase the robustness of the scheme. However, if there is no geometric modification of the image, the position of the matched block is the same as that in the inspected image; in other words, ∆x = ∆y = 0 if there is no geometric manipulation.
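The embedding of Section 2.1 and the bit retrieval of Section 2.2 can be condensed into the following NumPy sketch (window search omitted, i.e. ∆x = ∆y = 0; the permutation π, the constants, and all names are our own illustrative choices, not the authors' implementation).

```python
import numpy as np

beta, mu, alpha = 4, 1.0, 0.05
rng = np.random.default_rng(0)

def blocks(img):
    h, w = img.shape
    return [(r, c) for r in range(0, h, beta) for c in range(0, w, beta)]

def embed(img, key, pi):
    """Embed key bits into the blocks selected by the (secret) permutation pi."""
    wm = img.astype(float).copy()
    pos = blocks(img)
    for s, bit in enumerate(key):
        r, c = pos[pi[s]]
        blk = wm[r:r+beta, c:c+beta]
        delta = max(mu, alpha * (blk.max() - blk.min()))
        blk += delta if bit == 1 else -delta
    return wm

def retrieve(orig, inspected, pi, m):
    """Recover the key bits from the sign of the summed block difference."""
    pos = blocks(orig)
    bits = []
    for s in range(m):
        r, c = pos[pi[s]]
        gamma = np.sum(inspected[r:r+beta, c:c+beta] - orig[r:r+beta, c:c+beta])
        bits.append(1 if gamma > 0 else 0 if gamma < 0 else int(rng.integers(2)))
    return bits

img = rng.integers(0, 256, size=(64, 64)).astype(float)
m = 100
pi = rng.permutation(len(blocks(img)))
key = rng.integers(0, 2, size=m).tolist()
watermarked = embed(img, key, pi)
assert retrieve(img, watermarked, pi, m) == key
```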
3
Adaptive Attack Scheme
Due to the large number of variations of collusion attacks, it is hard to prove that a watermarking scheme is resistant to all collusion attacks. In [1], Das and Maitra showed that their scheme survives the benchmark attacks provided in [2] when only a few traitors are involved. In the experiments presented in [1], three traitors collude to generate a pirated image from watermarked images W1 , W2 , and W3 generated from the same original image in the following way. – Take three pixels z1 , z2 , z3 from the same location of the images W1 , W2 , and W3 . – Construct a pixel value z = f (z1 , z2 , z3 ) where f is taken to be one of the functions from median, max, min, average and weighted average. – Construct an image I with all the pixel values z.
Using this attack, the correlation factor between the traitor’s key and the detected key is much greater than 50% while the correlation between the key of an innocent user and the detected key is very close to 50% [1]. Therefore, the traitors can be identified and no one will be wrongly implicated. From these tests, Das and Maitra claimed that their watermarking scheme was resilient to collusion attack. However, the scheme is vulnerable to our collusion attack. 3.1
Our Collusion Attack
The main difference between our attack and the attack in [1] is that we distinguish the changes in individual blocks, while the attack in [1] just operates passively on the watermarked images. Suppose that there are t traitors whose watermarked images are W_1, W_2, ..., W_t. Denote the j-th block in the i-th watermarked image as W_{ij}. Assume the original image is I = {I_0, I_1, ..., I_{n−1}} and the pirated image is I′ = {I′_0, I′_1, ..., I′_{n−1}}, where W_{ij}, I_j, I′_j (j = 0, 1, ..., n − 1) are β × β blocks. The process of constructing the pirated image is as follows:
1. Let j = 0.
2. If W_{1j} = ... = W_{tj}, then I′_j = W_{1j} and go to step (4).
3. If ∃ i ∈ {2, 3, ..., t} such that W_{ij} ≠ W_{1j}, then I′_j = 0.5(W_{ij} + W_{1j}).
4. Let j ← j + 1. If j ≤ n − 1, go to step (2).
5. Arrange all the above blocks I′_j to generate the pirated image I′.
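Steps (1)–(5) translate directly into the following NumPy sketch (the block layout and all variable names are ours; image dimensions are assumed divisible by β).

```python
import numpy as np

def collusion_attack(copies, beta=4):
    """copies: list of t watermarked images (2-D float arrays of equal shape)."""
    w1 = copies[0]
    pirated = w1.copy()
    h, w = w1.shape
    for r in range(0, h, beta):
        for c in range(0, w, beta):
            sl = (slice(r, r + beta), slice(c, c + beta))
            # Step 3: if some copy's block differs from the first copy's block,
            # average the two, which cancels the +delta/-delta watermark exactly.
            for other in copies[1:]:
                if not np.array_equal(other[sl], w1[sl]):
                    pirated[sl] = 0.5 * (other[sl] + w1[sl])
                    break
            # Step 2: otherwise the block of the first copy is kept as is.
    return pirated
```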
In step (2), if W_{1j} = ... = W_{tj}, the pirated block I′_j is the same as the watermarked block W_{1j}. This happens either because the block I_j is not selected for embedding, or because all W_{ij} (i = 1, 2, ..., t) are generated by changing the original block I_j with the same value. For this kind of block, the watermark information is preserved. In step (3), if ∃ W_{ij} ≠ W_{1j}, the original block I_j has been manipulated with different values so as to produce two different watermarked blocks, i.e., (I_j + δU) and (I_j − δU). In this situation, the traitors just average the two watermarked blocks to recover the original block exactly. Mathematically, I′_j = 0.5(W_{ij} + W_{1j}) = 0.5((I_j + δU) + (I_j − δU)) = I_j. Intuitively, the image owner would expect a pirated image to be of low quality; apparently, the quality of the pirated image resulting from our attack is better than that of any watermarked image! The last weapon of the owner is to start the tracing process to identify traitors. Can the owner still identify the traitors when a pirated image is confiscated? Unfortunately for the owner, the answer is negative, as we will demonstrate below. 3.2
Resilience to Tracing
Assume that the random key value of the watermarked image for the ith traitor is ki = {ki0 , ki1 , · · · , ki(m−1) }, for i = 1, 2, · · · , t. Split the bit positions of the keys into two disjoint sets: S1 = {s | k1s = k2s = · · · = kts , 0 ≤ s ≤ m − 1} and
the complement set S2 = {s | s ∉ S1, 0 ≤ s ≤ m − 1}. We consider each key bit position s ∈ {0, 1, ..., m − 1} in two cases.
Case 1: s ∈ S1. By definition, an element of S1 indicates a position where all the keys have the same bit value. Thus, the probability is P(s ∈ S1) = 2 · 2^{−t} = 2^{1−t}. Let a = π(s) be the index of an image block. From the embedding process, we see that the original block I_a is used to produce the same watermarked block for all the traitors. According to the construction process of pirated images, the pirated block I′_a is the same as the traitors' block W_{1a} (step 2 in Subsection 3.1). Thus, the retrieved key bit is k′_s = k_{is}, i = 1, 2, ..., t. That is to say, the key bit embedded in the block I_a can be detected correctly. The expected number of detected key bits in case 1 is E1 = m·P(s ∈ S1) = m·2^{1−t}.
Case 2: s ∈ S2. Conversely, an element of S2 indicates a position where at least one key differs from the others. Thus, the probability is P(s ∈ S2) = 1 − P(s ∈ S1) = 1 − 2^{1−t}. Let b = π(s) be the index of an image block. Based on the embedding process, the original block I_b is used to produce two kinds of watermarked blocks. The traitors construct a pirated block I′_b by averaging two different blocks so as to remove the watermark completely (step 3 in Subsection 3.1). Thus, the similarity measure Γ_{x,y} = 0 in the retrieval process (Subsection 2.2). Therefore, the bit of the detected key is determined by a coin toss; that is to say, only half of the key bits k′_s (s ∈ S2) are detected successfully. Consequently, the expected number of detected key bits in case 2 is E2 = m·P(s ∈ S2)/2 = 0.5m(1 − 2^{1−t}).
Finally, the total expected number of detected key bits is E1 + E2 = m·2^{1−t} + 0.5m(1 − 2^{1−t}) = m(0.5 + 2^{−t}). To avoid being traced, the size t of the traitor group should satisfy 0.5 + 2^{−t} < γ, thus t > −log2(γ − 0.5). In the example of [1], the threshold is γ = 0.6, thus we have t ≥ 4. In other words, four or more traitors can create a pirated image while no traitor is identified. From the above analysis, every block I′_b in the pirated image is recovered exactly; consequently, the expected number of recovered blocks is (1 − 2^{1−t})m.
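These expectations are easy to tabulate; the small script below reproduces the threshold t ≥ 4 for γ = 0.6.

```python
# Expected fraction of correctly detected key bits after the collusion attack.
gamma = 0.6
for t in range(2, 11):
    expected = 0.5 + 2 ** (-t)          # (E1 + E2) / m
    traced = expected > gamma           # the tracer succeeds only above gamma
    print(f"t = {t:2d}  E[corr] = {expected:.4f}  traitors traced: {traced}")
# Output shows traced = True for t = 2, 3 and False from t = 4 onwards,
# matching the bound t > -log2(gamma - 0.5).
```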
4
Experiments
We performed two experiments, one demonstrates that our attack is efficient and the other shows that the pirated image is of better quality than any watermarked image. 4.1
Watermark Removal
Select α = 0.05, µ = 1, β = 4 and γ = 60% as given in [1]. The test image is a 256 × 256 gray image. Figure 1 illustrates the relationship between the correlation value corr and the number of traitors. In Figure 1, the correlation value for each experiment is the maximum value between the traitors’ keys and the key retrieved from the pirated image, i.e., the correlation between retrieved key and the key of the most unlucky traitor. We draw the theoretical result in solid line and the experiment result in dotted line. From the experiment curves, we see that the risk of the traitors decreases exponentially, which in turn shows that our attack is very effective. In our attack, no traitor can be identified when the collusion involves four or more traitors.
Fig. 1. Resilience to the detector (correlation value vs. number of traitors; theoretical and experimental results)
However, if there are fewer than 4 traitors, they will be traced with high probability. For example, if there are only two traitors, the similarity is up to 75%, and both traitors will be implicated. The random attack in [1] changes the pixel values randomly, and the tracer can still detect the traitors correctly. In the following paragraphs, we introduce an adaptive random attack on blocks.
Adaptive Random Attack. Firstly, a traitor estimates δ with the formula δ = max(µ, α(ψ_h − ψ_l)) for each block in his watermarked image. Then, he modifies the block randomly by
decreasing the value of each pixel of the block by δ or increasing each pixel by δ. If he increases the block pixels, he further increases 10% of the pixels of the block by 1; conversely, if he decreases the block pixels, he further decreases 10% of the block pixels by 1. The modified block is the corresponding pirated block. The traitor repeats this random attack until all blocks are processed. Finally, he assembles all the pirated blocks to generate a pirated image. In the retrieval process, the pre-processing stage is of no use against this adaptive random attack because each block is modified randomly, so that the shift value is almost 0. Table 1 below shows the similarity Γ_{x,y} and the key detection probability. The second column represents the value imposed by the owner, and the third column represents the value imposed by the random attack, where the value in parentheses indicates that only 10% of the pixels are modified by that value. The fourth column is the average similarity measure Γ̄_{x,y} = Γ_{x,y}/β² derived in the retrieval process. The fifth column is the detected key bit, and the last column is the watermark bit corresponding to the owner's modification. From this table, we know that 50% of the detected bits are identical to those of the watermark (rows 1 and 4 in Table 1). That is to say, the traitor cannot be identified because the correlation value is less than the threshold γ = 60%. However, in the pirated image, non-watermarked blocks are modified by δ(+1), 50% of the watermarked blocks are modified by 2δ(+1), and the other 50% of the watermarked blocks are roughly recovered. Therefore, the quality of the pirated image is degraded.
Table 1. Adaptive Random Attack

     Change1   Change2    Γ̄_{x,y}      Detected bit   Embedded bit
1    +δ        +δ(+1)     2δ + 0.1      1              1
2    +δ        −δ(−1)     −0.1          0              1
3    −δ        +δ(+1)     +0.1          1              0
4    −δ        −δ(−1)     −2δ − 0.1     0              0
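A sketch of the adaptive random attack described above follows; the random ±δ sign and the choice of the 10% pixel subset per block are our own illustrative interpretations.

```python
import numpy as np

def adaptive_random_attack(watermarked, beta=4, mu=1.0, alpha=0.05, seed=0):
    """Single-traitor attack: shift every block by +/-delta and push roughly 10%
    of its pixels one step further, so the average block shift is +/-(delta + 0.1)."""
    rng = np.random.default_rng(seed)
    out = watermarked.astype(float).copy()
    h, w = out.shape
    for r in range(0, h, beta):
        for c in range(0, w, beta):
            blk = out[r:r+beta, c:c+beta]
            delta = max(mu, alpha * (blk.max() - blk.min()))  # traitor's own estimate
            sign = 1.0 if rng.random() < 0.5 else -1.0
            blk += sign * delta
            # push ~10% of the block's pixels a further +/-1 in the same direction
            k = max(1, round(0.1 * blk.size))
            mask = np.zeros(blk.shape, dtype=bool)
            mask.reshape(-1)[rng.choice(blk.size, size=k, replace=False)] = True
            blk[mask] += sign
    return out
```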
Hybrid Attack. In order to create a pirated image of good quality when there are only two or three traitors, the traitors integrate the adaptive random attack with our collusion attack. Specifically, the traitors remove the watermark embedded in the blocks that differ, using the collusion attack, to alleviate the distortion, and then manipulate the remaining blocks with the random attack to reduce the correlation. This hybrid attack provides the traitors with a pirated image of acceptable quality while reducing the risk of being identified. In Table 2, we compare the hybrid attack with the adaptive random attack when there are 2 or 3 traitors. The second column lists the expected number of blocks whose average modification is 2δ + 0.1, the third column lists the expected number of blocks whose average modification is 0.1, and the fourth column lists the expected number of blocks whose average modification is 0. The last column lists the expected number of detected key bits. From
Table 2, for the watermarked blocks, we see that the traitors are safe. The pirated image generated with the hybrid attack has better quality than that generated with the random attack because the former contains fewer modified blocks.

Table 2. Hybrid Attack

                              #blocks (|Γ̄_{x,y}| = 2δ+0.1)   #blocks (|Γ̄_{x,y}| = 0.1)   #blocks (|Γ̄_{x,y}| = 0)   #Expected detected bits
Adaptive Random Attack                 0.5m                          0.5m                        0                       0.5m
Hybrid Attack (2 traitors)             m/4                           m/4                         m/2                     0.5m
Hybrid Attack (3 traitors)             m/8                           m/8                         3m/4                    0.5m

4.2
Image Quality Improvement
Unlike the previous attacks which decrease the quality of the watermarked image, our adaptive attack increases the image quality. This remarkable feature may seem unreasonable. Generally, the SNR of an image will decrease when an independent noise is embedded. However, if the inserted signal depends on the original signal, the SNR of the original signal may increase. Figure 2 shows the original image and a pirated image obtained from our attack.
Fig. 2. Original image and pirated image
To quantitatively describe the improvement of the watermarked image, we adopt the traditional definition of the PSNR (peak signal-to-noise ratio) between the original image I and an inspected image I′:

σ = (1/(W·H)) · Σ_{x=1}^{W} Σ_{y=1}^{H} (I(x, y) − I′(x, y))²
Fig. 3. PSNR vs. number of traitors
Fig. 4. Number of corrected blocks vs. number of traitors
PSNR = 10(log_10 255² − log_10 σ)
Figure 3 shows the quality measured in our experiments for the attack without the enhanced random attack. The PSNR of a pirated image constructed by 2 traitors is only 44 dB, but the PSNR of a pirated image constructed by 10 traitors is up to 64 dB; the quality of the attacked image increases steadily. Assume the t key values are random and independent. Then the expected number of corrected blocks is (1 − 2^{1−t})m. Figure 4 shows the percentage of corrected blocks vs. the number of traitors, with the theoretical result drawn as a solid line and the experimental result as a dotted line. The experimental result matches the theoretical conclusion very well. From Figure 4, when there are 4 and 5 traitors, 88% and 94% of the changed blocks can be recovered, respectively.
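The PSNR figures quoted above follow the standard definition; a minimal helper for reproducing them (assuming the two images differ in at least one pixel, so σ > 0):

```python
import numpy as np

def psnr(original, inspected):
    """PSNR in dB between the original image I and an inspected image I'."""
    sigma = np.mean((original.astype(float) - inspected.astype(float)) ** 2)
    return 10.0 * (np.log10(255.0 ** 2) - np.log10(sigma))
```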
5
Conclusion
In ICICS’02, a block oriented watermarking scheme in spatial domain [1] was proposed to defeat many traditional attacks, including the collusion attack mentioned in [2]. The scheme divides an image into small blocks and modifies the intensity of some blocks depending on a secret key. In this paper, we proposed an adaptive collusion attack to this scheme by selectively manipulating the watermarked blocks. When the traitors find two unequal watermarked blocks generated from the same original block, they remove the watermark information by simply averaging them out. This attack not only removes the watermark, but also increases the image quality dramatically. We presented a theoretical analysis on the probability of successful tracing. Our experimental result matches the theoretical conclusion very well, and demonstrates that the attack is effective. From our analysis and experiments, we conclude that the method addressed in [1] can resist the conspiracy of at most three traitors. In order to create a high quality pirated image when there are only two or three traitors, we propose an adaptive random attack and integrate it with our collusion attack. Specifically, in the hybrid attack, the traitors remove the watermarks in the blocks which are not identical with the adaptive collusion attack, and then they create other pirated blocks with the adaptive random attack. The pirated blocks from both attacks form a complete pirated image. This hybrid pirated image has acceptable quality and the traitors are safe from being identified.
References 1. Tanmoy Kanti Das and Subhamoy Maitra, “A Robust Block Oriented Watermarking Scheme in Spatial Domain”, Proceedings of the 4th International Conference on Information and Communications Security (ICICS’02), LNCS 2513, pp. 184–196, 2002 2. I.J. Cox, J. Kilian, T.Leighton and T. Shamoon, “Secure Spread Spectrum Watermarking for Multimedia”, IEEE Trans. on Image Processing, Vol. 6, No. 12, 1673–1687, 1997. 3. I.Pitas, “A Method for Signature Casting on Digital Images”, Proceedings of the IEEE Int. Conf. On Image Processing, Vol.3, pp. 215–218, 1996. 4. R.B.Wolfgang and E.J.Delp, “A Watermark for Digital Images”, Proceedings of the IEEE Int. Conf. On Image Processing, Vol.3, pp. 219–222, 1996. 5. M.D.Swanson, B.Zhu and A.H.Tewfik, “Transparent Robust Image Watermarking”, Proceedings of the IEEE Int. Conf. On Image Processing, Vol.3, pp. 211–214, 1996. 6. J.F.Delaigle, C.De Vleeschouver and B.Macq, “Digital Watermarking”, Proceedings of SPIE, Optical Security and Counterfeit Deterrence Techniques, Vol.2659, pp. 99–110, 1996. 7. Funda Ergun, Joe Kilian, and Ravi Kumar, “A Note on the Limits of CollusionResistant Watermarks”, Advances in Cryptology – EUROCRYPT’99, LNCS 1592, pp. 140–149, 1999 8. M. Celik, G. Sharma, A.M. Tekalp, “Collusion-Resilient Fingerprinting Using Random Pre-Warping (abstract)”, submitted to IEEE Int. Conf. On Image Processing 2003.
ID-Based Distributed “Magic Ink” Signature from Pairings Yan Xie, Fangguo Zhang, Xiaofeng Chen, and Kwangjo Kim International Research center for Information Security (IRIS) Information and Communications University(ICU), 58-4 Hwaam-dong Yusong-ku, Taejon, 305-732 KOREA {yanxie, zhfg, crazymount, kkj}@icu.ac.kr
Abstract. The advantage of ID-based systems is the simplification of key distribution and certificate management: a user can directly use his identity as his public key instead of an arbitrary number, and thus can prove his identity without providing a certificate from a CA. Revocable blind signatures are becoming more practical, because complete anonymity can be abused in real-world applications, for instance the perfect-crime concern in e-cash systems. The "magic ink" signature provides revocable anonymity: the signer has some capability to revoke a blind signature and investigate the original user in case of abnormal activity, while keeping a legal user's privacy anonymous. With a single signer, a "magic ink" signature lets the signer trace the original user of a message without any limitation, which does not satisfy anonymity for legal users. We therefore let n signers sign the message and distribute the commitment during the signature procedure by (n, n) threshold secret sharing, so that a single signer's revocability is limited: only under the agreement and cooperation of the set of n signers can the user's identity be discovered. In this paper an ID-based (n, n) threshold "magic ink" signature is proposed, along with its analysis and applications.
1
Introduction
Blind signatures, introduced by Chaum [6], can be used to protect privacy, such as the anonymity of users in electronic cash systems. However, unconditional anonymity facilitates crimes such as perfect crimes, illegal purchases, etc. [17]. In order to solve these problems, several technologies for anonymity revocation have been proposed, such as fair blind signatures [14], indirect discourse proofs [7], "magic ink" signatures [1,11], group signatures [18,21], and so on. Physically, a "magic ink" signature can be described as follows: a user writes some message on an envelope using magic ink; simultaneously this message is copied onto a blank sheet of paper inside the envelope through carbon paper; then the signer writes his signature on the envelope, and this signature also appears on the inside paper; finally, the signer and the user keep the envelope and the signed inside paper, respectively. Normally the message is invisible on the envelope, but in some
cases (criminal activity) the signer can discover this invisible message. The "magic ink" signature provides a revocable anonymity solution: the signer has some capability of revoking a blind signature to investigate abnormal activity, while keeping legal actions anonymous. The first "magic ink" signature [11] is based on the Digital Signature Standard; this scheme achieves revocable anonymity through a set of distributed servers using a threshold cryptosystem, instead of enrolling a trusted third party as in fair blind signatures, and thus achieves better security and availability. In a traditional CA-based public key cryptosystem, each participant should provide a digital certificate to prove the validity of his identity and public key; this procedure obviously consumes considerable system resources. In 1984, Shamir proposed an ID-based encryption and signature scheme [16], which directly uses a user's identity as his public key and so simplifies the key distribution and certificate management process. Bilinear pairings, namely the Weil pairing and the Tate pairing of algebraic curves, were first used in cryptography to analyze the discrete logarithm problem, as in the MOV attack [13] and the FR attack [8]. Recently, bilinear pairings have found various applications in cryptography; more precisely, they can be used to construct ID-based cryptographic schemes [3,4,12,19,5,10,15,20].
2
Some Properties of Bilinear Pairing
We assume G1 and G2 are two cyclic groups of the same prime order q, where G1 is an additive group and G2 is a multiplicative group. A map e : G1 × G1 → G2 is a bilinear pairing if it satisfies the following properties:
1. Bilinear: e(P_1 + P_2, Q) = e(P_1, Q)e(P_2, Q) and e(P, Q_1 + Q_2) = e(P, Q_1)e(P, Q_2).
2. Non-degenerate: there exist P, Q ∈ G1 such that e(P, Q) ≠ 1.
3. Computability: for P, Q ∈ G1, there exists an efficient algorithm to compute e(P, Q).
There are some hard computational problems in G1, as follows:
1. Discrete Logarithm Problem (DLP): given two group elements P and Q, it is difficult to find an integer n such that P = nQ.
2. Decision Diffie-Hellman Problem (DDHP): given P, aP, bP, cP for a, b, c ∈ Zq*, decide whether c ≡ ab mod q.
3. Computational Diffie-Hellman Problem (CDHP): given P, aP, bP for a, b ∈ Zq*, compute abP.
4. Gap Diffie-Hellman Problem (GDHP): the class of problems where the DDHP is easy but the CDHP is hard.
We assume that the CDHP and the DLP are intractable in this paper, that is, there is no polynomial-time algorithm that solves the CDHP or the DLP with non-negligible probability. We call a group G a Gap Diffie-Hellman group when the DDHP is easy and the CDHP is hard in G. Such groups can be found on supersingular elliptic curves or hyperelliptic curves over finite fields, and the bilinear pairing can be derived from the Weil or Tate pairing.
3
Structure
3.1
Computation and Communication
We assume there are a set of n signers and k receivers, all of which are polynomial-time randomized Turing machines. In the communication model, we also assume that any receiver can establish a point-to-point communication channel with each signer over a secure channel. An adversary can corrupt up to n − 1 of the n signers. 3.2
ID-Based “Magic Ink” Signature
An ID-based "magic ink" signature scheme involves three parties and consists of five steps, described as follows:
– The three parties are the Trust Authority (TA), the n signers, and the receiver.
– Setup is a randomized algorithm run by the TA which, on input a security parameter, generates the system parameters and a master key.
– In the Extract step, the TA takes as input the system parameters, the master key, and an arbitrary ID ∈ {0, 1}*, and outputs a private key S_ID. Here ID is the signer's identity, which is treated as the signer's public key.
– Signature is a signature generation protocol between the receiver and the set of n signers: the signers output a blind signature, and the receiver finally produces a valid signature or fails. The signers record a signature-view invariant in their database for each blind signature.
– Verification is a randomized algorithm that takes a message m with its signature and the signers' identities as input, and outputs acceptance or rejection.
– Tracing occurs in case of illegal activities: the signers search their database of signature-view invariants for a value that can be linked to the valid signature. From this value, the signers can find the original signature receiver.
4
Basic Idea of ID-Based “Magic Ink” Signature (Single Signer)
An ID-based "magic ink" signature can be regarded as a combination of an ID-based signature with a revocable blind signature. We first describe the basic idea of the ID-based "magic ink" signature with a single signer. Let G1 be a cyclic additive group and G2 a cyclic multiplicative group, both of the same prime order q; our scheme is built on a Gap Diffie-Hellman group. We view the bilinear map as e : G1 × G1 → G2. At the beginning of the protocol, the TA runs Setup and Extract. During the generation of the signer's private key, we can also use n TAs with (n, n) threshold secret sharing to share the master key, in order to limit the power of the TA.
Setup: Let P be a generator of G1. Randomly choose a number s ∈ Zq* as the master key of the trust authority, and set P_pub = sP. Construct two cryptographic hash functions H : {0,1}* → Zq and H1 : {0,1}* → G1. The system parameters are {q, P, P_pub, G1, G2, e, H, H1}.
Extract: Assume the signer's identity is ID. The public key is computed as Q_ID = H1(ID), and the private key of the signer is S_ID = sQ_ID.
Signature:
– The signer randomly chooses a number r ∈ Zq*, computes R = rP, and sends R to the receiver.
– The receiver randomly chooses a number a ∈ Zq* as a blinding factor, computes t = e(aP_pub, R) and c = H(m, t) for his message m, and sends the blinded value c′ = a^{-1} c mod q to the signer.
– After receiving c′, the signer uses his private key S_ID to produce the blind signature S′ = c′ S_ID + r P_pub, and sends S′ to the receiver.
– S′ is unblinded with the factor a by computing S = S′ a; the final signature on the message m is (S, t, m).
The protocol is shown in Fig. 1.
Verification: The receiver can verify whether the signature is valid by using the signer's public key to check e(S, P) = e(Q_ID, P_pub)^{H(m,t)} · t. The receiver accepts the signature if the above equation holds.
Receiver                                   Signer
a ∈R Zq*                                   r ∈R Zq*, R = rP
                 <-------- R --------
t = e(aP_pub, R)
c = H(m, t)
c′ = a^{-1} c (mod q)
                 -------- c′ ------->
                                           S′ = c′ S_ID + r P_pub
                 <-------- S′ -------
S = S′ a
Fig. 1. ID-based “magic ink” signature protocol
Tracing: The value c^{-1}S identifies a valid signature (m, t, S), while (c′, S′) is seen by the signer during the signature session. For each signature we have c′^{-1}S′ = c^{-1}S, since c′^{-1}S′ = (a^{-1}c)^{-1} · (Sa^{-1}) = c^{-1}a · Sa^{-1} = c^{-1}S. From a valid signature (m, t, S), the signer can easily calculate c^{-1}S, where c = H(m, t). So if an illegal receiver needs to be discovered, the signer can compare the value c^{-1}S against the database of signature-view invariants c′^{-1}S′; if the signer finds the same value in the database, the original receiver is identified.
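To see the Setup/Extract/Signature/Verification/Tracing algebra end to end, the toy sketch below replaces the pairing groups by an exponent-tracking mock (elements of G1 are stored as their discrete logarithms, so the "pairing" is just multiplication of exponents modulo q); this is completely insecure and is only meant to illustrate why the verification equation and the signature-view invariant hold. All parameters, hash instantiations, and names are our own assumptions.

```python
import hashlib, random

q = 1019                       # toy group order; real schemes use ~160-bit primes

# Exponent-tracking mock of the pairing groups (insecure, illustration only):
# x*P in G1 is stored as the scalar x mod q, g^y in G2 as the exponent y mod q,
# so e(aP, bP) = g^(ab) becomes a*b mod q, multiplication in G2 becomes addition
# of exponents, and exponentiation in G2 becomes multiplication of exponents.
def e(A, B):
    return A * B % q

def H(m, t):                   # hash to Zq* (our own instantiation, kept invertible)
    h = int.from_bytes(hashlib.sha256(f"{m}||{t}".encode()).digest(), "big")
    return 1 + h % (q - 1)

def H1(identity):              # map an identity string into "G1"
    return int.from_bytes(hashlib.sha256(identity.encode()).digest(), "big") % q

# Setup and Extract
P = 1                          # generator of G1
s = random.randrange(1, q)     # TA's master key
P_pub = s * P % q
Q_ID = H1("signer@example.com")            # the signer's identity is the public key
S_ID = s * Q_ID % q                        # signer's private key

# Signature session
m = "message to be signed"
r = random.randrange(1, q); R = r * P % q  # signer's commitment
a = random.randrange(1, q)                 # receiver's blinding factor
t = e(a * P_pub % q, R)
c = H(m, t)
c_blind = c * pow(a, -1, q) % q            # c' = a^-1 * c mod q
S_blind = (c_blind * S_ID + r * P_pub) % q # signer: S' = c'*S_ID + r*P_pub
S = S_blind * a % q                        # receiver unblinds: S = S'*a

# Verification: e(S, P) = e(Q_ID, P_pub)^H(m,t) * t
assert e(S, P) == (e(Q_ID, P_pub) * H(m, t) + t) % q

# Tracing: the signature-view invariant c'^-1 * S' equals c^-1 * S
assert pow(c_blind, -1, q) * S_blind % q == pow(c, -1, q) * S % q
```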
5
ID-Based Distributed “Magic Ink” Signature (Multiple Signers)
Clearly the single-signer case cannot satisfy the privacy requirement, because a single signer can trace the user at will. Therefore, we provide an (n, n)
threshold scheme by modifying our previous single-signer construction: the signer is replaced by n signers in such a way that key generation and signature generation require the collaboration of all n signers, while no subgroup of fewer than n participants can forge a signature. The n signers individually sign the message with their own private keys and send the results to the user through point-to-point communication with the receiver, and the receiver combines those partial signatures into an ID-based "magic ink" signature. The advantage of the ID-based distributed "magic ink" signature is that the signature-view invariant is hidden from each individual signer while the original ID-based blind signature requirements are still satisfied, so without the agreement and cooperation of all n signers the signature cannot be revoked. The protocol of the ID-based distributed "magic ink" signature is described as follows. Let G1 be a cyclic additive group and G2 a cyclic multiplicative group, both of the same prime order q. We view the bilinear map as e : G1 × G1 → G2.
Setup: Let P be a generator of G1. Randomly choose a number s ∈ Zq* as the master key of the trust authority, and set P_pub = sP. Construct two cryptographic hash functions H : {0,1}* → Zq and H1 : {0,1}* → G1. The system parameters are {q, P, P_pub, G1, G2, e, H, H1}.
Extract: Assume each signer's identity is ID_i. The public key of each signer is Q_{ID_i} = H1(ID_i), and the private key of each signer is S_{ID_i} = sQ_{ID_i}. The public key of the scheme is Q_ID = Σ_{i=1}^{n} Q_{ID_i}.
Signature Session:
– The n signers obtain an (n, n) secret sharing (r_1, r_2, ..., r_n) of a randomly chosen number r ∈ Zq* by letting r = Σ_{i=1}^{n} r_i. Each signer computes R_i = r_i P and sends R_i to the receiver.
– The receiver computes R = Σ_{i=1}^{n} R_i and randomly chooses a number a ∈ Zq*. The receiver computes t = e(aP_pub, R) and c = H(m, t) for the message m, and sends the blinded value c′ = a^{-1} c mod q to each signer.
– Each signer individually generates the partial blind signature S′_i = c′ S_{ID_i} + r_i P_pub, and secretly sends it to the receiver.
– After receiving all the S′_i, the receiver computes S′ = Σ_{i=1}^{n} S′_i = c′ Σ_{i=1}^{n} S_{ID_i} + Σ_{i=1}^{n} r_i P_pub. He then unblinds S′ by computing S = S′ a, and (S, t, m) is the valid ID-based distributed "magic ink" signature on the message m.
Fig. 2 shows the protocol.
Verification: The verification is similar to the single-signer case: the receiver uses the public key Q_ID to check whether the signature is valid via the equation e(S, P) = e(Q_ID, P_pub)^{H(m,t)} · t.
Receiver                                   Signer i
a ∈R Zq*                                   r_i ∈R Zq*, R_i = r_i P
                 <-------- R_i -------
R = Σ_{i=1}^{n} R_i
t = e(aP_pub, R)
c = H(m, t)
c′ = a^{-1} c (mod q)
                 -------- c′ ------->
                                           S′_i = c′ S_{ID_i} + r_i P_pub
                 <-------- S′_i ------
S′ = Σ_{i=1}^{n} S′_i
S = S′ a
Fig. 2. ID-Based Distributed “Magic Ink” signature protocol
Tracing: Since S′ is blind to each signer and each S′_i is secretly sent to the receiver, no single signer can learn S′ without cooperating with the other n − 1 signers. Only when all n signers work together to compute S′ = Σ_{i=1}^{n} S′_i can the signature-view invariant be recovered. Through this value, the signers can compare against the signature and trace the original signature receiver.
6 6.1
Analysis of ID-Based Distributed "Magic Ink" Signature
6.1 Correctness
This scheme is a valid signature; the proof of the verification equation is as follows:

e(S, P) = e(S′a, P) = e(Σ_{i=1}^{n} S′_i · a, P)
        = Π_{i=1}^{n} e(ac′S_{ID_i} + ar_i P_pub, P)
        = Π_{i=1}^{n} e(cS_{ID_i}, P) · Π_{i=1}^{n} e(ar_i P_pub, P)
        = e(Σ_{i=1}^{n} S_{ID_i}, P)^c · Π_{i=1}^{n} e(aP_pub, P)^{r_i}
        = e(sQ_ID, P)^c · Π_{i=1}^{n} e(aP_pub, r_i P)
        = e(sQ_ID, P)^c · Π_{i=1}^{n} e(aP_pub, R_i)
        = e(sQ_ID, P)^c · e(aP_pub, R)
        = e(Q_ID, sP)^c · t
        = e(Q_ID, P_pub)^{H(m,t)} · t

6.2
Blindness
This scheme achieves the blindness requirement, because the value sent to the signers is blinded beforehand by a randomly chosen integer a ∈ Zq*, and the signers only sign the blinded value c′. After receiving the blind signature, the user can unblind it using the blinding factor a and obtain the valid signature, but a signer cannot find any relationship between S′ and S; a signer has only a probability of 1/q of correctly guessing the unblinded signature, so we can say this scheme is blind. 6.3
Revocable Anonymity
A valid magic ink signature should provide revocable anonymity, and this scheme supports such a function. The signers see c′ and S′ during each signature session, so they can pre-compute the value c′^{-1}S′ and store it in a specific database. When a user needs to be traced, the signers compute the value c^{-1}S from the signature (S, t, m). Since this equals the signature-view invariant, the signers can search for this value in the database to find the original user, so the revocability property is maintained. In the distributed magic ink scheme, the tracing mechanism requires the cooperation of all n signers, because no signer can obtain S′ by himself; the revocability of the signers is therefore controlled. 6.4
Unforgeable Security
We consider the following game: assume without loss of generality that an adversary can corrupt n − 1 signers, and let the public keys of these n − 1 signers be Q_{ID_i}, i = 1, 2, ..., n − 1. The adversary can thus obtain the corresponding S_{ID_i} and compute the partial signatures S′_i. If he could also compute S_{ID_n}, he could forge a valid ID-based distributed "magic ink" signature. However, computing S_{ID_n} = sH1(ID_n) from sP and H1(ID_n) is equivalent to solving the CDHP in G1.
Robustness
If the signature does not pass the verification, some signers are dishonest. Since each signer sends his partial signature S′_i to the receiver, the receiver can check each partial signature by verifying whether e(S_i, P) = e(Q_{ID_i}, P_pub)^{H(m,t)} e(aP_pub, R_i), where S_i = S′_i a. If one of these checks fails, we can conclude that the corresponding signer made a mistake or is cheating.
6.6
Comparison and Efficiency
Jakobsson first proposed a distributed "magic ink" signature [11] in 1997. The comparison with our proposed scheme is shown in Table 1. We denote by DMIS the distributed "magic ink" signature [11], by IDDMIS the ID-based distributed "magic ink" signature, by M the cost of a multiplication over G2, by D the cost of a division over a finite field, by A a point addition over G2, by e the cost of a Weil pairing computation in G1, by m the cost of a multiplication over a finite field, by E the cost of an exponentiation over a finite field, and by I the cost of an inversion over a finite field. Table 1. Comparison with Distributed "Magic Ink" Signature
                                DMIS                    IDDMIS
Number of costs (receiver)      (2n+1)E + 4m + 2I       1e + 2M + 1D + (2n−1)A
Number of costs (each signer)   2E + 3m                 3M + 1A
Private key size (bits)         160                     161
Public key size (bits)          1024                    161
Threshold                       (n, t)                  (n, n)
Based problem                   DLP                     CDHP
Compared with DMIS, the advantages of our protocol are as follows:
– Thanks to the ID-based signature, the n signers can directly use their identities, such as an e-mail address related to their unique information, instead of a certificate issued by a Certification Authority. This simplifies key distribution and management in our scheme.
– Comparing the computational costs on the receiver's side of the two schemes, we find that if n, the number of distributed signers, is at least 2, the computational cost on the user side of our scheme is lower than that of the previous scheme. If the system uses a large number of distributed signers, our scheme becomes even more efficient as n increases. For example, according to [2], on a PIII 1 GHz one multiplication over a finite field costs 0.006 milliseconds; when n = 20, each receiver takes 197 milliseconds in the previous scheme, whereas in our protocol each receiver takes 25 milliseconds.
7
Application
Unconditional anonymity may facilitate perfect crimes such as money laundering, blackmailing, etc., so a revocable e-cash system, in which the anonymity of the user can be revoked in urgent cases, is desirable in practice. Our ID-based distributed "magic ink" signature scheme can be used in such a revocable e-cash system: we treat the bank as the signers and the buyer as the receiver;
during the withdrawal step, the buyer first randomly chooses a message m as his e-coin and obtains a valid ID-based distributed "magic ink" signature on his coin from the bank; the bank assigns n different parties to sign this coin and at the same time stores each part of the signature-view invariant in their databases. During the payment step, the vendor simply verifies whether the coin is valid by checking the bank's signature, and if the coin is valid the vendor deposits it at the bank. When the bank detects illegal activities such as blackmail or money laundering, it can search the database of signature-view invariants to find the corresponding user. Also, if the bank cooperates with the user, it can perform coin tracing to compute the final coin and signature. But because a distributed signature is used, the revocability of the bank is limited: only with the cooperation of all n parties can the bank obtain the signature-view invariant. In some previous fair e-cash schemes, a trusted third party (TTP) was used to send the bank the pseudonym included by the user during the signature procedure, in order to help the bank perform tracing; our scheme does not need the enrollment of a TTP, which obviously reduces the protocol complexity and saves system resources.
8 Conclusion
In this paper, we proposed an ID-based distributed “magic ink” signature scheme. Our scheme combines the advantages of ID-based signatures and the traditional “magic ink” signature scheme, and it can be used to design a revocable-anonymity e-cash system without a TTP. A disadvantage of our scheme is its (n, n) threshold, which limits flexibility. Since no (n, t) threshold ID-based signature seems to exist so far, designing an (n, t) threshold version to improve efficiency and availability is left for future work.
References 1. F. Bao and R. Deng, “A new type of “magic ink” signature towards transcriptirrelevant anonymity revocation”, PKC’99, LNCS 1560, pp. 1–11, Springer-Verlag, Berlin Heidelberg 1999. 2. P.S.L.M. Barreto, H.Y. Kim, B. Lynn, and M. Scott, “Efficient algorithms for pairing-based cryptosystems”, Advances in Cryptology-Crypto 2002, LNCS 2442, pp. 354–368, Springer-Verlag, 2002. 3. D. Boneh and M. Franklin, “Identity-based encryption from the Weil Pairing”, Advances in Cryptology-Crypto’2001, LNCS 2139, pp. 213–29, Spring-Verlag, 2001. 4. D. Boneh, B. Lynn, and H. Shacham, “Short signatures from the Weil pairing”, Advances in Cryptology-Asiacrypt 2001, LNCS 2248, pp. 514–532, Springer-Verlag, 2001. 5. J.C. Cha and J.H. Cheon, “An Identity-based signature from gap Diffie-Hellman groups”, Cryptology ePrint Archive, Report 2002/018, available at http://eprint.iacr.org/2002/018/. 6. D. Chaum, “Blind signatures for untraceable payments”, Advanced in CryptologyCrypto’82, 1983, Plenum NY, pp. 199–203.
7. Y. Frankel, Y. Tsiounis, M. Yung, “Indirect discourse proofs: achieving efficient fair off-line e-cash”, Advanced in Cryptology-Asiacrypt’96, LNCS 1163, pp. 286–300, Springer-Verlag, 1996 8. G. Frey and H. R¨ uck, “A remark concerning m-divisibility and the discrete logarithm in the divisor class group of curves”, Mathematics of Computation, 62, pp. 865–874, 1994. 9. S. D. Galbraith, K. Harrison, and D. Soldera, “Implementing the Tate pairing”, ANTS 2002, LNCS 2369, pp. 324–337, Springer-Verlag, 2002. 10. F. Hess, “Exponent group signature schemes and efficient identity based signature schemes based on pairings”, Cryptology ePrint Archive, available at http://eprint.iacr.org/2002/012/. 11. M. Jakobsson and M. Yung, “Distributed magic ink signatures”, Advances in Cryptology-EUROCRYPT’97, LNCS 1233, pp. 450–464, Spring-Velag, 1997. 12. A. Joux, “The Weil and Tate Pairing as building blocks for Public Key Cryptosystem”, ANTS 2002, LNCS 2369, pp. 20–32, Springer-Verlag, 2002. 13. A. Menezes, T. Okamoto, and S. Vanstone, “Reducing elliptic curve logarithms to logarithms in a finite field”, IEEE Transaction on Information Theory, 39: 1639– 1646, 1993. 14. Y. Mu, K.Q. Nguyen, and V. Varadharajan, “A fair electronic cash scheme”, ISEC2001, LNCS 2040, pp. 20–32, Springer-Verlag, 2001 15. K.G. Paterson, “ID-based signatures from pairings on elliptic curves”, Cryptology ePrint Archive, available at http://eprint.iacr.org/2002/004/. 16. A. Shamir, “Identity-based cryptosystems and signature schemes”, Advances in Cryptology-Crypto’84, LNCS 196, pp. 47–53, Springer-Verlag, 1984. 17. B.V. Solms and D. Naccache, “On blind signatures and perfect crimes”, Computers and security, 11(6):581–583, 1992. 18. J. Traor, “Group signature and their relevance to privacy-protecting off-line electronic cash systems”, Proc. of ASISP99, LNCS 1587, pp. 228–243, Springer-Verlag, 1999. 19. F. Zhang, S. Liu and K. Kim, “ID-Based one round authenticated tripartite key agreement protocol with pairings”, Cryptology ePrint Archive, available at http://eprint.iacr.org/2002/122/. 20. F. Zhang and K. Kim, “ID-Based blind signature and ring signature from pairings”, Asiacrypt2002, New Zealand, LNCS 2501, pp. 533–547, Springer-Verlag, 2002. 21. F. Zhang, F.T. Zhang and Y. Wang, “Fair electronic cash systems with multiple banks”, SEC 2000, pp. 461–470, Kluwer, 2000.
A Simple Anonymous Fingerprinting Scheme Based on Blind Signature

Yan Wang, Shuwang Lü, and Zhenhua Liu

State Key Laboratory of Information Security, Graduate School of the Chinese Academy of Sciences, Beijing 100039
ywang [email protected]
Abstract. Using the blind version of a modification of the DSA signature scheme together with a cut-and-choose technique, an anonymous fingerprinting protocol is proposed which offers anonymity for the buyer in a stronger sense by resisting collusion between the merchant and the registration center.
1 Introduction

1.1 Background
With the fast development of information technology and electronic commerce during the past several years, people can easily access and process digital content. How to protect intellectual property has become more and more important in this information era, and a lot of research work has been invested in the design of methods that technically support the copyright protection of digital data. Among them, fingerprinting has emerged as an important means of copyright protection. When a buyer purchases some digital data, information unique to him is embedded in the data. Thus, upon finding a redistributed copy, the original merchant of the data can identify the original buyer of that copy with the help of the embedded information. Fingerprinting can therefore be used to deter people from illegally redistributing the digital data they purchased. However, in classical symmetric fingerprinting schemes ([BMP85], [BS95]), even if the merchant succeeds in identifying a dishonest buyer, she cannot prove to others (usually a judge) that the buyer is guilty, because the merchant knows the fingerprinted copy as well. In [PS96], asymmetric fingerprinting was introduced, where only the buyer knows the fingerprinted copy; when the merchant finds a redistributed copy, she can obtain a proof of the buyer's guilt that convinces any third party. There is still the drawback that the buyer has to provide his identity during purchases. This is contrary to anonymity,
Supported by the National Natural Science Foundation of China under Grant No. 60277027, the National Grand Fundamental Research 973 Program of China under Grant No. TG1999035804, and the Innovation Foundation of the State Key Laboratory of Information Security.
which is one of the basic needs in electronic commerce. So in [PW97], anonymous fingerprinting was proposed, where the buyer can purchase his copy anonymously; only when he redistributes his fingerprinted copy can his identity be revealed. Many papers have been devoted to constructions for anonymous fingerprinting in recent years ([Dom98], [DJ98], [Dom99], [CCCW01], [KKK02], etc.). Usually, a buyer registers himself with a registration center to get a certified pseudonym, or some certification on the information that will be used in the anonymous purchase process. Then he can interact with the merchant without revealing his real identity. Later, when a redistributed copy of that buyer is found, the merchant can identify the buyer with the help of the secret information extracted from the redistributed copy. However, as was pointed out in [S01], quite a few of those anonymous fingerprinting schemes only offer anonymity in a weak sense (they are called semi-anonymous fingerprinting), because the merchant and the registration center can link their views of the purchase and thus easily identify the buyer's shopping behavior. In a stronger sense of anonymity ([PW97]), as long as the buyer does not redistribute his fingerprinted copy, even a collusion of the merchant and the registration center cannot identify the buyer or link different purchases of the same buyer. In [PW97], a framework for anonymous fingerprinting is proposed which offers this stronger anonymity, but it is rather inefficient because it relies heavily on zero-knowledge proofs, and explicit constructions are not provided in that paper. Based on digital coins, [PS99] presented the first explicit construction for anonymous fingerprinting in the stronger sense and provided a detailed analysis of their construction. However, in their coin-based anonymous fingerprinting scheme, the accused buyer has to participate in the trial to deny the charges if possible. So in [PS00], the same authors improved the coin-based anonymous fingerprinting by introducing direct non-repudiation, i.e., the merchant has enough information to convince any judge of the buyer's guilt without the buyer's participation. [Cam00] is another approach to constructing anonymous fingerprinting, where group signatures are used to offer buyers' anonymity and unlinkability.

1.2 Our Contribution
In this paper, we use a blind signature directly to offer anonymity and unlinkability. The buyer selects some secret information and hides it in elements of a group in which the computation of discrete logarithms is infeasible. The elements and some corresponding commitments are then sent to the registration center, which is asked to give a blind signature on some of the buyer's information. Using a cut-and-choose technique, the registration center asks the buyer to open some of the commitments to check whether the relationship among them is as the buyer claims. If the opened information passes all the verifications, the registration center is convinced with high probability that the remaining information also satisfies the relation. The registration center then gives the buyer a blind signature on the element that takes the buyer's secret information as its discrete logarithm, and this signature is used as a certification for the buyer. When the buyer buys some copy,
he gives the certification to the merchant, and the information that corresponds to the certification is embedded in the copy and serves as an identifier for the buyer. Only when the merchant extracts the identifier from a redistributed copy can the views of the merchant and the registration center be linked and thus the buyer's identity be revealed. Our main contribution is an easy and explicit way to ensure that the buyer can prove to the merchant that he has correctly registered at the registration center, without allowing the registration center and the merchant to link their views. Compared with previous constructions, ours is more direct and explicit, and it thus offers a good alternative for practical use. Moreover, under the assumption that the registration center does not collude with buyers, the protocol achieves the property of direct non-repudiation. With only a little change, we can integrate the efficient fingerprint embedding protocol presented in [KT01] into our anonymous fingerprinting scheme, obtaining a rather concrete anonymous fingerprinting scheme that brings fingerprinting closer to practical use. The paper is organized as follows. Section 2 gives our construction of the new anonymous fingerprinting scheme. Its security is discussed in Sect. 3. In Section 4, the paper is concluded with some suggestions for future work.
2 Construction of Anonymous Fingerprinting
We will exploit the blind signature scheme proposed in [Cam94], obtained by blinding a modification of the DSA signature scheme. A blind signature is a protocol allowing Bob to obtain a valid signature on a message m from a signer Alice while preventing Alice from seeing the message and its signature. Even if Alice sees the message and its signature later, she is unable to link the message-signature pair to the particular instance of the signing protocol that produced it. The blind signature scheme we use is as follows ([Cam94]). Let p be a prime with q a large prime factor of p − 1, and let g ∈ Zp∗ be a generator of order q. The signer generates a secret signing key x ∈ Zq∗ randomly and publishes the public key y = g^x mod p. In the following protocol, the signer, Alice, gives a blind signature on m (gcd(m, q) = 1), which is selected by the signature receiver Bob.
1. a) Alice randomly chooses k ∈ Zq and computes R̃ = g^k mod p.
   b) Alice checks whether gcd(R̃, q) = 1. If this is not the case, she goes back to step a); otherwise, she sends R̃ to Bob.
2. a) Bob checks that gcd(R̃, q) = 1.
   b) Bob randomly chooses α, β ∈ Zq and computes R = R̃^α g^β mod p.
   c) Bob checks whether gcd(R, q) = 1. If this is not the case, he goes back to step b); otherwise, he computes m̃ = α m R̃ R^(−1) mod q and sends m̃ to Alice.
3. Alice forwards s̃ = k m̃ + R̃ x mod q to Bob.
4. Bob determines s = s̃ R R̃^(−1) + β m mod q and r = R mod q. Then Bob accepts the signature pair (r, s) as a correct signature on m if g^s = y^r r^m mod p holds.
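To make the message flow concrete, here is a toy, non-normative sketch of the protocol above in Python. The parameters are deliberately tiny and insecure, the variable names are ours, and for the final check we keep the full group element R, reducing it modulo q only where it appears in an exponent, which is the reading under which the verification equation holds.

```python
import math, secrets

# Toy parameters (illustration only -- far too small to be secure).
q = 107                      # large prime factor of p - 1 in a real system
p = 6 * q + 1                # 643, prime
g = pow(2, (p - 1) // q, p)  # element of order q in Zp*

x = secrets.randbelow(q - 1) + 1       # signer's secret key
y = pow(g, x, p)                       # signer's public key

m = 42                                 # message to be blindly signed, gcd(m, q) = 1
assert math.gcd(m, q) == 1

# Step 1 (signer): choose k, send R_tilde = g^k mod p with gcd(R_tilde, q) = 1.
while True:
    k = secrets.randbelow(q - 1) + 1
    R_tilde = pow(g, k, p)
    if math.gcd(R_tilde, q) == 1:
        break

# Step 2 (receiver): blind with alpha, beta; send m_tilde.
while True:
    alpha = secrets.randbelow(q - 1) + 1
    beta = secrets.randbelow(q - 1) + 1
    R = (pow(R_tilde, alpha, p) * pow(g, beta, p)) % p
    if math.gcd(R, q) == 1:
        break
m_tilde = (alpha * m * (R_tilde % q) * pow(R % q, -1, q)) % q

# Step 3 (signer): s_tilde = k*m_tilde + R_tilde*x mod q.
s_tilde = (k * m_tilde + (R_tilde % q) * x) % q

# Step 4 (receiver): unblind and verify g^s == y^r * R^m (mod p).
s = (s_tilde * (R % q) * pow(R_tilde % q, -1, q) + beta * m) % q
r = R % q
assert pow(g, s, p) == (pow(y, r, p) * pow(R, m % q, p)) % p
print("blind signature verified:", (r, s))
```

Running the sketch several times also illustrates the blinding: the signer only ever sees (R̃, m̃, s̃), which is unlinkable to the resulting pair (r, s).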
We also need a bit commitment scheme (e.g., a scheme from [Ped92]) in our fingerprinting protocol. A bit commitment scheme is a method that allows Bob to commit to the value of a bit in a way that prevents the other party (Alice) from learning it without Bob's help. Later, Bob can open the commitment to show Alice which bit he committed to, and he should not be able to cheat by showing a bit other than the one he originally chose. Some conventions: p and q are as described above, i.e., q is a large prime factor of p − 1, where p is prime. We assume Zq is the unique subgroup of order q of the multiplicative group Zp∗ and that the computation of discrete logarithms in it is infeasible. Three generators g, g1, g2 of this subgroup are randomly generated and published. Every party generates a secret signing key x ∈ Zq∗ randomly and publishes the public key y = g^x mod p. There are four parties in our construction: the buyer (B), the merchant (M), the registration center (RC), and the judge (J). The anonymous fingerprinting protocol consists of four sub-protocols: registration, fingerprinting, identification, and trial. We assume all communications can be carried out anonymously, as is also assumed by e-commerce protocols and previous anonymous fingerprinting protocols. The private and public key pairs of RC and B are denoted by (xRC, yRC) and (xB, yB), where xRC ∈ Zq∗ and xB ∈ Zq∗ are secretly and randomly selected by RC and B, respectively. sigB(m) is the signature generated by B on m using his private key xB. The commitment to the bit string b is denoted by Com(b).
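The paper only names [Ped92] for the commitment; as one possible instantiation, here is a minimal Pedersen-style commitment sketch over the same kind of group. The second generator h and the toy parameters are our own assumptions for illustration; in a real instantiation nobody may know log_g h.

```python
import secrets

# Minimal Pedersen-style commitment sketch over a subgroup of order q in Zp*.
# p, q, g as in the blind-signature sketch above; h is a second generator whose
# discrete logarithm with respect to g must be unknown to the committer.
q = 107
p = 6 * q + 1
g = pow(2, (p - 1) // q, p)
h = pow(3, (p - 1) // q, p)   # assumption: an "independent" generator, toy only

def commit(b, r=None):
    """Commit to b in Zq; returns (commitment, opening randomness)."""
    r = secrets.randbelow(q) if r is None else r
    return (pow(g, b % q, p) * pow(h, r, p)) % p, r

def open_commitment(com, b, r):
    """Check that (b, r) opens the commitment com."""
    return com == (pow(g, b % q, p) * pow(h, r, p)) % p

com, r = commit(57)
assert open_commitment(com, 57, r)
```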
2.1 Registration
1. B randomly chooses T secret elements bi ∈ Zq∗, i = 1, . . . , T. He then computes G1^i = g1^bi mod p and G2^i = g2^bi mod p and checks whether gcd(q, G1^i) = 1 holds; if not, he chooses a new bi at random, until gcd(q, G1^i) = 1 holds for all i. He then commits to bi and G1^i and sends Com(bi), Com(G1^i), G2^i, i = 1, . . . , T, to RC. RC checks whether any G2^i has been used by a former buyer. If not, the protocol goes to the next step; otherwise, RC asks the buyer to make new choices and the corresponding commitments. Note that one of the G1^i (i = 1, . . . , T) will be blindly signed by RC in the following steps.
2. a) RC randomly chooses k ∈ Zq and computes R̃ = g^k mod p.
   b) RC checks whether gcd(R̃, q) = 1. If this is not the case, it goes back to step a); otherwise, it sends R̃ to B.
3. a) B checks that gcd(R̃, q) = 1.
   b) B randomly chooses αi, βi ∈ Zq and computes Ri = R̃^αi g^βi mod p.
   c) B checks whether gcd(Ri, q) = 1. If this is not the case, he goes back to step b); otherwise, he computes G̃1^i = αi G1^i R̃ Ri^(−1) mod q. He also commits to αi and βi. Then he sends G̃1^i, Com(αi), Com(βi), i = 1, . . . , T, to RC.
4. RC randomly chooses v ∈ {1, . . . , T} and asks B to open {Com(αi), Com(βi), Com(bi), Com(G1^i)} for all i ∈ {1, . . . , T} with i ≠ v. After B opens those commitments, RC checks whether G1^i = g1^bi mod p, G2^i = g2^bi mod p and G̃1^i = αi G1^i R̃ Ri^(−1) mod q, for i ∈ {1, . . . , T} with i ≠ v. If all these equations hold, the protocol goes to the next step; otherwise, it stops.
5. B sends his signature on his real identity (IDB) and G2^v, i.e., sigB(IDB, G2^v), to RC.
6. RC verifies the signature using B's public key yB. If the signature is not valid, the protocol stops. Otherwise, RC computes s̃v = k G̃1^v + R̃ xRC mod q and sends it to B.
7. B determines sv = s̃v Rv R̃^(−1) + βv G1^v mod q and rv = Rv mod q. Then B accepts the signature pair (rv, sv) as a correct signature on G1^v if g^sv = yRC^rv rv^(G1^v) mod p holds.
This registration protocol is the integration of the blind DSA-modification signature scheme with the cut-and-choose technique. We note that after the registration protocol, B holds a blind signature (rv, sv) on G1^v issued by RC, and RC holds the record (G2^v, IDB, sigB(IDB, G2^v)).
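The cut-and-choose check of step 4 is easy to prototype. The sketch below is only illustrative: it uses toy parameters and a simple hash-based commitment in place of the [Ped92] commitment, and all names are ours.

```python
import hashlib, secrets

# Illustrative cut-and-choose sketch for step 4 of the registration protocol.
q = 107; p = 6 * q + 1
g1 = pow(2, (p - 1) // q, p); g2 = pow(5, (p - 1) // q, p)
T = 8

def com(value):
    nonce = secrets.token_bytes(16)
    return hashlib.sha256(nonce + str(value).encode()).hexdigest(), nonce

def com_ok(digest, nonce, value):
    return digest == hashlib.sha256(nonce + str(value).encode()).hexdigest()

# Buyer side: T candidate secrets, their group elements, and commitments.
b = [secrets.randbelow(q - 1) + 1 for _ in range(T)]
G1 = [pow(g1, bi, p) for bi in b]
G2 = [pow(g2, bi, p) for bi in b]
com_b = [com(bi) for bi in b]
com_G1 = [com(Gi) for Gi in G1]

# RC side: pick v at random and ask the buyer to open everything except index v.
v = secrets.randbelow(T)
for i in range(T):
    if i == v:
        continue
    assert com_ok(*com_b[i], b[i]) and com_ok(*com_G1[i], G1[i])
    assert G1[i] == pow(g1, b[i], p) and G2[i] == pow(g2, b[i], p)
# A buyer who cheats at exactly one index escapes detection with probability 1/T.
```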
2.2 Fingerprinting
This protocol is executed between an anonymous buyer B and the merchant M. Suppose there is a secure embedding process by which neither party leaks its secret information and M can completely extract the embedded information upon finding a redistributed fingerprinted copy.
1. B makes another commitment Com′(bv) to bv and sends Com′(bv), G1^v and its signature (rv, sv) to M. He also proves in a zero-knowledge way that Com′(bv) indeed commits to the discrete logarithm of G1^v with respect to g1.
2. M verifies the signature (rv, sv) on G1^v using RC's public key yRC. If the signature is not valid, the protocol stops.
3. M and B enter a secure two-party protocol. M's secret inputs are the original copy P0 and Com′(bv); B's secret inputs are bv and Com′(bv). The output of the protocol for B is a fingerprinted copy PB of P0.
Remark 1. [KT01] presented an efficient fingerprinting protocol in which the homomorphic property of the Okamoto-Uchiyama cryptosystem is used to realize the embedding. That paper's main contribution is a construction for efficient asymmetric fingerprinting, and it does not pay much attention to the registration process. With some relevant changes, their embedding protocol can be adapted to steps 2 and 3 of our protocol above. We would thus obtain a concrete and efficient anonymous fingerprinting protocol that brings fingerprinting closer to practical use.
2.3 Identification
When M finds a redistributed copy, she tries to trace an illegal user as follows:
1. M extracts a value bv from the redistributed copy using the underlying extracting scheme. She then computes g1^bv, searches her database, and finds an item {G1^v, rv, sv} by simply comparing g1^bv with G1^v. M gives {bv, G1^v, rv, sv} to RC for possible identification of the buyer.
2. RC verifies that G1^v = g1^bv and that (rv, sv) is a valid signature on g1^bv. It then computes g2^bv and searches its database for an item (G2^v, IDB, sigB(IDB, G2^v)) with G2^v = g2^bv. If it cannot find such an item, M stops tracing. Otherwise, RC sends IDB and sigB(IDB, g2^bv) to M.
3. Using the public key yB that corresponds to IDB, M verifies that sigB(IDB, g2^bv) is a valid signature generated by B.
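The identification step boils down to two table lookups, one at the merchant and one at RC. The following minimal sketch assumes each party keeps a dictionary keyed by the corresponding group element; the data-structure names are ours.

```python
# Minimal sketch of the identification lookups.
q = 107; p = 6 * q + 1
g1 = pow(2, (p - 1) // q, p); g2 = pow(5, (p - 1) // q, p)

merchant_db = {}   # G1^v -> (r_v, s_v), filled during fingerprinting
rc_db = {}         # G2^v -> (ID_B, sig_B), filled during registration

def identify(b_v):
    G1_v = pow(g1, b_v, p)
    if G1_v not in merchant_db:
        return None                      # extracted value matches no purchase
    r_v, s_v = merchant_db[G1_v]
    G2_v = pow(g2, b_v, p)               # computed by RC from the same b_v
    return rc_db.get(G2_v)               # (ID_B, sig_B), or None -> stop tracing
```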
2.4 Trial
1. M sets proof = (bv, IDB, sigB(IDB, g2^bv)) and sends it to J.
2. J verifies whether sigB(IDB, g2^bv) is a signature on (IDB, g2^bv) by B. If the signature passes the verification, J is convinced that B is a traitor.
3 Security Discussion
We now analyze the security of our construction. We assume all the primitives used are secure.

3.1 Security for the Innocent Buyer
The secret information bv is the crucial element of the protocol. M is not able to frame an innocent buyer if she cannot get the value bv. In fact, the knowledge about bv that other parties can obtain consists of g1^bv, g2^bv, Com(bv), Com′(bv) and the zero-knowledge proof about bv in the fingerprinting protocol. Because the embedding procedure is assumed to be secure, no knowledge about bv is leaked during embedding. Because computing the discrete logarithm of g1^bv or g2^bv is infeasible and the cryptographic primitives are assumed to be secure (i.e., the proof is actually zero-knowledge and the commitments are semantically secure), the buyer's identifier bv is kept secret from the other parties during the purchase. Only when M gets a redistributed copy of B can she obtain the identifier by using the underlying extracting algorithm. We consider another aspect. An attacker (possibly a collusion of M and RC) may forge a secret element b and embed it in her original copy P0 to get PB. Though she can get the buyer's identity IDB from RC, she cannot convince J that B redistributed the copy PB, because no party can forge the signature sigB(IDB, g2^b), which must be included in the proof. If a buyer has already been taken to trial for an illegal redistribution, his new purchases are still safe, because the buyer would take care not to use the same secret information in different purchases, to avoid possible accusation by the merchant.
3.2 Anonymity and Unlinkability
We state that as long as bv is kept secret from M and RC, B stays anonymous. As discussed above, bv remains secret as long as the buyer does not redistribute the fingerprinted copy with bv embedded in it.
For the unlinkability of a buyer's different purchases, we note that the generators g1 and g2 are generated independently, so the corresponding elements cannot be linked. Moreover, due to the unlinkability between the message-signature pair and the signer's view in blind signature schemes, the view of RC in the registration protocol cannot be linked to M's view in the fingerprinting protocol. So even a collusion of M and RC cannot identify an honest buyer by linking their views of the purchases.
3.3 Security for the Merchant
First we assume that, using the underlying embedding scheme, M can completely extract the embedded information whenever the redistributed copy is similar enough to the original copy and the tolerated size of a buyer collusion is not exceeded. We state that if RC honestly executes the registration protocol and does not collude with B, M can trace an illegal buyer with high probability. We note that after the verifications by RC in the registration protocol, with probability 1 − 1/T the discrete logarithm of the element that is to be blindly signed by RC (say G1^v) equals that of the element G2^v, which is used as a record in the database of RC. This is because the buyer does not know a priori which instances of the secret information he will be asked to open. So with probability 1 − 1/T, the merchant can identify the traitor. Though sometimes M may not be able to trace a traitor, we note that an honest merchant will never make a wrong accusation. We think that protecting the merchant from accusing innocent buyers is the more important property, because a wrong accusation would cost her more, by lowering her reputation, than the loss caused by giving up some tracing.
Remark 2. We think it is reasonable to assume that RC does not collude with the buyer, since in a practical environment RC is usually a rather trustworthy agent. (Note that we do not assume RC is a totally trusted third party; otherwise we could simply let the buyer and the merchant give their secret information to it, which is not the case we discuss here.) There is also little incentive for RC to collude with any buyer, as RC is usually supported by the merchant or some organizations. Preventing collusion between M and RC is the more important issue.
Remark 3. If B can prove to RC in a zero-knowledge way that the contents of the bit commitments satisfy the desired relation described in step 4 of the registration protocol, we can improve the registration protocol and relax our assumption that RC does not collude with B. This is worth noting for our future research.
4 Conclusion
We have proposed an explicit anonymous fingerprinting protocol based on a blind signature scheme. Under some reasonable assumptions, not only can the
merchant trace the traitor with high probability, but our protocol also offers anonymity and unlinkability in a stronger sense for the buyer. There is still some trust assumption we have made on the registration center. We hope to improve the registration protocol and to improve the probability that a merchant can trace illegal buyers when finding redistributed copies. The efficiency of the registration protocol should also be improved. Moreover, there is a strong need for the development of secure embedding and extracting processes, which should be studied alongside the development of watermarking techniques.
References

[BMP85] Blakley, G.R., Meadows, C., Purdy, G.B.: Fingerprinting long forgiving messages. CRYPTO '85. LNCS 218, Springer, (1985) 180–189
[BS95] Boneh, D., Shaw, J.: Collusion-secure fingerprinting for digital data. IEEE Trans. on Inform. Theory, vol. IT-44 (1998) 1897–1905
[Cam00] Camenisch, J.: Efficient anonymous fingerprinting with group signatures. ASIACRYPT 2000. LNCS 1976, Springer, (2000) 415–428
[Cam94] Camenisch, J.L., Piveteau, J.M., Stadler, M.A.: Blind signatures based on the discrete logarithm problem. Eurocrypt '94. LNCS 950, Springer, (1995) 428–432
[CCCW01] Chung, C., Choi, S., Choi, Y., Won, D.: Efficient anonymous fingerprinting of electronic information with improved automatic identification of redistributors. Proc. of ICISC 2000. LNCS 2015, Springer, (2001) 221–234
[DJ98] Domingo-Ferrer, J., Herrera-Joancomarti, J.: Efficient smart-card based anonymous fingerprinting. Smart Card Research and Advanced Application – CARDIS'98. Springer, (1998) 221–228
[Dom98] Domingo-Ferrer, J.: Anonymous fingerprinting of electronic information with automatic identification of redistributors. IEE Electronics Letters, vol. 34, no. 13, Jun. (1998) 1303–1304
[Dom99] Domingo-Ferrer, J.: Anonymous fingerprinting based on committed oblivious transfer. Public Key Cryptography '99. LNCS 1560, Springer, (1999) 43–52
[KKK02] Kim, M., Kim, J., Kim, K.: Anonymous fingerprinting as secure as the bilinear Diffie-Hellman assumption. Proc. of ICICS 2002. LNCS 2513, Springer, (2002) 97–108
[KT01] Kuribayashi, M., Tanaka, H.: A new anonymous fingerprinting scheme with high enciphering rate. Proc. of INDOCRYPT 2001. LNCS 2247, Springer, (2001) 30–39
[Ped92] Pedersen, T.P.: Non-interactive and information-theoretic secure verifiable secret sharing. Crypto '91. LNCS 576, Springer, (1992) 129–140
[PS96] Pfitzmann, B., Schunter, M.: Asymmetric fingerprinting. EUROCRYPT '96. LNCS 1070, Springer, (1996) 84–95
[PS99] Pfitzmann, B., Sadeghi, A.: Coin-based anonymous fingerprinting. EUROCRYPT '99. LNCS 1592, Springer, (1999) 150–164
[PS00] Pfitzmann, B., Sadeghi, A.: Anonymous fingerprinting with direct non-repudiation. Proc. of ASIACRYPT 2000. LNCS 1976, Springer, (2000) 401–414
[PW97] Pfitzmann, B., Waidner, M.: Anonymous fingerprinting. EUROCRYPT '97. LNCS 1233, Springer, (1997) 88–102
[S01] Sadeghi, A.: How to break a semi-anonymous fingerprinting scheme. Proc. of Information Hiding 2001. LNCS 2137, Springer, (2001) 384–394
Compact Conversion Schemes for the Probabilistic OW-PCA Primitives

Yang Cui, Kazukuni Kobara, and Hideki Imai

University of Tokyo, Institute of Industrial Science, Komaba 4-6-1, Tokyo, Japan
[email protected], {kobara,imai}@iis.u-tokyo.ac.jp
http://imailab-www.iis.u-tokyo.ac.jp/imailab.html
Abstract. In this paper, we propose two new generic conversion schemes which achieve IND-CCA security from probabilistic public key encryption primitives, given that the underlying primitives are OW-PCA secure in the random oracle model. Compared with the previous generic conversions (GEM, REACT, etc.), both of our proposals have the advantage of compactness (when the input size of the public key encryption primitive is large), which is especially meaningful on bandwidth-saving channels. Note that the second of our proposals does not necessarily have to carry out the re-encryption used for the validity check, which greatly accelerates decryption.
Keywords. Generic conversion, IND-CCA, probabilistic public key encryption, random oracle model, OW-PCA, compactness.
1 Introduction
The fundamental task of cryptography is to provide confidentiality for communication systems, and encryption mainly targets this task. Public key encryption schemes are expected to have the strongest notion of security, IND-CCA [15], i.e., indistinguishability against adaptive chosen-ciphertext attacks. However, very few primitives have been proven to achieve this security in the standard model while remaining practical. The random oracle model, first suggested by Fiat and Shamir [5] and expanded by Bellare and Rogaway [1], is a promising way to design and analyze provably secure and efficient schemes, and many public key primitives, combined with suitable paddings, have been converted from a weaker security notion to the strongest one in this model. Heuristic as it is said to be, the random oracle model is used in numerous public key cryptosystems, including those adopted by standardization groups. Thus, it is meaningful to build a generic conversion which can be applied to many different encryption primitives and enhance their security, provided the conversion holds some advantages over others. This work presents generic conversions which are provably secure in the random oracle model and especially efficient in data redundancy when the message input of the encryption primitive is long. They are applicable to numerous
probabilistic public key encryption primitives, such as ElGamal [4], Okamoto-Uchiyama [12], and Paillier [13].
1.1 Related Work
Since Bellare and Rogaway proposed the Optimal Asymmetric Encryption Padding (OAEP) scheme [2], several generic conversions have come into public view. Fujisaki and Okamoto first introduced a conversion [6] that generates IND-CCA security from IND-CPA security, i.e., indistinguishability against chosen-plaintext attacks, which is a somewhat stronger requirement on the primitive. Soon afterwards, they improved it to a generic conversion [7] using the more general assumption of one-way (OW-CPA) primitives. Independently, Pointcheval built another generic conversion with a similar result [14]. REACT, presented by Okamoto and Pointcheval [11], differs from the previous works because it is based on OW-PCA security, i.e., one-wayness against plaintext-checking attacks, whose security generally relies on a kind of gap problem [10]. It is efficient in terms of operating speed, but pays more data redundancy as the cost of its "on the fly" speed. More recently, Coron et al. gave a generic conversion, GEM [3], with a more compact ciphertext but at the price of computation overhead. Note that all of the above works except OAEP can be applied to probabilistic encryption.
1.2 Our Result
Although OAEP is widely used now, the IND-CCA security of the OAEP conversion is not based on OW-CPA but on partial-domain one-wayness against chosen-plaintext attacks (PDOW-CPA) [8], which restricts its applicability to a few special primitives. Furthermore, OAEP cannot be applied to probabilistic public key encryption primitives either. Thus, for convenience, we aim at a slight modification of the original OAEP that achieves a compact ciphertext and applicability to probabilistic primitives. In this paper, we propose two schemes, which are the most compact generic conversions for probabilistic encryption primitives when the primitive input is large. The first scheme, P1, has the most compact size; the second scheme, P2, does not necessarily need the re-encryption in the decryption process, at the cost of a somewhat more redundant ciphertext.
1.3 Outline of the Paper
In Section 2, we start with some security notions; Section 3 presents the new constructions; Section 4 gives our security proofs, which use the plaintext-checking oracle; Section 5 compares the data redundancy of the generic conversions; finally, Section 6 concludes our proposals.
2 Preliminaries
In this section, we recall some basic security definitions in order to state the security requirements of our schemes.

2.1 Public Key Encryption
Definition 1. Public key encryption is defined by a triple of algorithms (K, E, D):
– the key generation algorithm K: on input 1^k (k ∈ N), it produces, in time polynomial in k, a pair of keys (pk, sk), the public and secret key respectively;
– the encryption algorithm E: on input of a message m ∈ {0, 1}^n and the public key pk, the algorithm E(m; r) produces the ciphertext c ∈ {0, 1}^∗ of m (with random coins r ∈ Ω);
– the decryption algorithm D: using a ciphertext c and the secret key sk, D returns the plaintext m, or outputs ⊥ when the ciphertext is invalid. This algorithm is deterministic.
The basic security requirement is one-wayness (OW), which roughly means that one cannot derive the whole plaintext from the ciphertext without knowing the secret key. This does not mean, however, that no information about the plaintext is available to the attacker. In order to keep any information about the message confidential, indistinguishability (IND) is required. Thus, we have the following definitions.
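As a fixed point of reference for the conversions that follow, the triple (K, E, D) can be written as a small interface; the sketch below is ours (names and types are illustrative assumptions, not part of the paper).

```python
from abc import ABC, abstractmethod
from typing import Optional, Tuple

# Interface sketch of the (K, E, D) triple from Definition 1.
class PublicKeyEncryption(ABC):
    @abstractmethod
    def keygen(self, k: int) -> Tuple[object, object]:
        """K: on input 1^k, output a key pair (pk, sk)."""

    @abstractmethod
    def encrypt(self, pk: object, m: bytes, r: bytes) -> bytes:
        """E: probabilistic encryption of m under pk with random coins r."""

    @abstractmethod
    def decrypt(self, sk: object, c: bytes) -> Optional[bytes]:
        """D: deterministic decryption; returns None (i.e. the symbol ⊥) on invalid c."""
```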
2.2 Security Notions
Definition 2. In the notion of one-wayness, there exists no adversary A that, given the public data only, can recover the whole preimage of the ciphertext within a polynomial time bound t and with an inverting probability greater than ε:

Pr[ (pk, sk) ← K(1^k), m ←R {0,1}^n, r ←R Ω : A(Epk(m; r)) = m ] ≤ ε
Note that in the public key setting, one-wayness against chosen-plaintext attacks (OW-CPA) is the minimum requirement on a cryptosystem; otherwise it is insecure, because the encryption key is public to everyone, including attackers. In [11], Okamoto and Pointcheval introduced a plaintext-checking oracle PCO that checks whether an input pair (m; c) is a corresponding plaintext-ciphertext pair under the public key primitive: it outputs 1 when the relation holds and 0 otherwise. Obviously, this is equivalent to the chosen-plaintext attack when the encryption algorithm is deterministic, so in that case it is not a strong security notion.
Definition 3. A public key encryption scheme is said to be OW-PCA if no polynomial-time adversary A, given the public data and the help of the PCO, can recover the whole preimage of the ciphertext with at most q queries to the PCO oracle, within a time bound t and with a winning probability greater than ε:

Pr[ (pk, sk) ← K(1^k), m ←R {0,1}^n, r ←R Ω, c ← Epk(m; r) : PCO(A(c); c) = 1 ] ≤ ε
Remark. After clarifying the security notions, we find that OW-PCA and OWCPA make the same meaning when the encryption algorithm is deterministic, which means that distinct conversions with the same public key encryption primitive, although based on the two assumptions are reduced to the same hard math problem. However, what we focus on here is the probabilistic encryption primitive, OW-PCA and OW-CPA will reduce to different math problem, the former always has a reduction to the gap problem [10], though the latter is reduced to the computation problem. For example, to break the OW-PCA of the ElGamal cryptosystem, it is equivalent to solve the Gap-Diffie-Hellman problem [10]. Rather, to break the OW-CPA of the ElGamal scheme is as hard as solving Computation-Diffie-Hellman problem, which is obviously harder.
3 New Generic Conversions
We present two new generic conversions which achieve IND-CCA security from any OW-PCA probabilistic encryption primitive. Note that our padding scheme P1 has the advantage in data redundancy, while P2 does not necessarily need a re-encryption for the integrity and validity check in the decryption stage, which is expected to speed up the scheme greatly.
Conversion P1. The initialization of the scheme works as follows. A random number R ∈ {0, 1}^k1 is chosen; k1 and k denote the lengths of the random numbers R and r, k2 and k3 are the lengths of y2 and y3 respectively, and the message size n is the total length of y2 and y3: n = k2 + k3. G and H are two ideal hash functions, and Epk and Dsk denote the public key encryption and decryption algorithms, respectively:
G : {0, 1}^k1 → {0, 1}^(k+n),  H : {0, 1}^n → {0, 1}^k1.
On input R, G generates G(R) = r||r′, with r = [G(R)]_k and r′ = [G(R)]_n (we define [G(R)]_k as the first k bits and [G(R)]_n as the last n bits of G(R)).

Encryption of m:
  y1 := R ⊕ H(m ⊕ r′)
  y2 := [m ⊕ r′]_k2
  y3 := [m ⊕ r′]_k3, 0 < k3 ≤ n
  y4 := Epk(y3; r)
  c := y1||y2||y4
  return c

Decryption of c:
  parse c as y1||y2||y4
  y3 := Dsk(y4)
  R := y1 ⊕ H(y2||y3)
  r′ := [G(R)]_n
  r := [G(R)]_k
  if Epk(y3; r) ≠ y4, output ⊥; otherwise, return m = r′ ⊕ (y2||y3)

Fig. 1. Conversion P1
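A toy instantiation of Fig. 1 may help to fix ideas. Everything in the sketch below is an illustrative assumption of ours: the OW-PCA primitive is textbook ElGamal over a 61-bit prime, the random oracles G and H are SHA-256 with distinct prefixes, and the byte-level parameter sizes are chosen only so the example runs; nothing here is secure as written.

```python
import hashlib, secrets

# Toy instantiation of conversion P1 (Fig. 1); all parameters are illustrative.
K1, K, K2, K3 = 16, 8, 9, 7          # |R|, |r|, |y2|, |y3| in bytes
N = K2 + K3                          # message length in bytes

P = 2**61 - 1                        # toy ElGamal modulus (prime)
GEN = 3                              # toy generator

def G(R):  return hashlib.sha256(b"G" + R).digest()[:K + N]
def H(x):  return hashlib.sha256(b"H" + x).digest()[:K1]
def xor(a, b):  return bytes(u ^ v for u, v in zip(a, b))

def eg_keygen():
    x = secrets.randbelow(P - 2) + 1
    return pow(GEN, x, P), x                     # (pk, sk)

def eg_encrypt(pk, m_int, coins):                # deterministic given coins
    r = int.from_bytes(coins, "big") % (P - 2) + 1
    return pow(GEN, r, P), (m_int * pow(pk, r, P)) % P

def eg_decrypt(sk, c):
    c1, c2 = c
    return (c2 * pow(c1, P - 1 - sk, P)) % P

def p1_encrypt(pk, m):                           # m is an N-byte message
    R = secrets.token_bytes(K1)
    r, r2 = G(R)[:K], G(R)[K:]                   # r2 plays the role of r'
    masked = xor(m, r2)
    y1 = xor(R, H(masked))
    y2, y3 = masked[:K2], masked[K2:]
    y4 = eg_encrypt(pk, int.from_bytes(y3, "big"), r)
    return y1, y2, y4

def p1_decrypt(pk, sk, c):
    y1, y2, y4 = c
    y3 = eg_decrypt(sk, y4).to_bytes(K3, "big")
    R = xor(y1, H(y2 + y3))
    r, r2 = G(R)[:K], G(R)[K:]
    if eg_encrypt(pk, int.from_bytes(y3, "big"), r) != y4:   # re-encryption check
        return None                                          # reject (⊥)
    return xor(r2, y2 + y3)

pk, sk = eg_keygen()
msg = secrets.token_bytes(N)
assert p1_decrypt(pk, sk, p1_encrypt(pk, msg)) == msg
```

Note that only y3 (k3 bits) goes through the public key primitive; this is exactly where the compactness of P1 comes from when the primitive input is large.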
Conversion P2. This conversion applies one more hash function for the validity and integrity check, leaves out the re-encryption step, and thereby speeds up decryption, at the cost of some extra computation. The initialization is as follows: the random number is generated as R||r with total length k1 + k, and the output of the hash function G is now n bits. k2, k3, k4, k5 denote the lengths of y2, y3, y4, y5, respectively. The hash functions G, H1 and H2 have the mappings
G : {0, 1}^k1 → {0, 1}^n,  H1 : {0, 1}^n → {0, 1}^k1,  H2 : {0, 1}^(k1+k4) → {0, 1}^k5.
Remark. We modify the input of the encryption primitive, encrypting only part of the padding output, which transfers the security assumption to one-wayness alone [9]. The input size is flexible and can be changed according to different settings. Next, we prove our proposals via the following theorems.
4 Security Analysis

We prove in this section that our generic conversions achieve IND-CCA security from a probabilistic OW-PCA primitive in the random oracle model. Let us first give the following theorem, which shows that conversion P1 is IND-CCA secure.
Encryption of m:
  y1 := R ⊕ H1(m ⊕ G(R))
  y2 := [m ⊕ G(R)]_k2
  y3 := [m ⊕ G(R)]_k3, 0 < k3 ≤ n
  y4 := Epk(y3; r)
  y5 := H2(y4; R)
  c := y1||y2||y4||y5
  return c

Decryption of c:
  parse c as y1||y2||y4||y5
  y3 := Dsk(y4)
  R := y1 ⊕ H1(y2||y3)
  m′ := G(R) ⊕ (y2||y3)
  if y5 ≠ H2(y4; R), output ⊥; otherwise, return m = m′

Fig. 2. Conversion P2
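The decisive difference in Fig. 2 is that the validity check is a hash tag over (y4, R) instead of a re-encryption of y3. The following fragment is ours and only illustrates that control flow: it reuses the eg_* helpers, G, H, xor and the size constants from the P1 sketch above, uses H (rather than a separate H1) and derives the encryption coins from G(R) for brevity, whereas Fig. 2 draws r separately; the tag length K5 is an illustrative choice.

```python
# Illustrative P2-style encrypt/decrypt; depends on the P1 sketch above.
K5 = 16
def H2(y4, R):
    c1, c2 = y4
    return hashlib.sha256(b"H2" + c1.to_bytes(8, "big")
                          + c2.to_bytes(8, "big") + R).digest()[:K5]

def p2_encrypt(pk, m):
    R = secrets.token_bytes(K1)
    masked = xor(m, G(R)[K:])                    # here G(R) masks the message
    y1 = xor(R, H(masked))                       # H plays the role of H1
    y2, y3 = masked[:K2], masked[K2:]
    y4 = eg_encrypt(pk, int.from_bytes(y3, "big"), G(R)[:K])
    return y1, y2, y4, H2(y4, R)                 # y5 = H2(y4; R)

def p2_decrypt(sk, c):
    y1, y2, y4, y5 = c
    y3 = eg_decrypt(sk, y4).to_bytes(K3, "big")
    R = xor(y1, H(y2 + y3))
    if y5 != H2(y4, R):                          # no re-encryption needed
        return None                              # reject (⊥)
    return xor(G(R)[K:], y2 + y3)
```

Notice that p2_decrypt no longer needs the public key at all, which is the "no re-encryption" advantage of P2.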
Theorem 1. Let A be a CCA adversary against the indistinguishability of the P1 conversion, with advantage ε and running time t, making at most qD, qG and qH queries to the decryption oracle and the hash functions Gen and Hash, respectively. Then there exists an algorithm B against the OW-PCA security of the asymmetric encryption scheme with advantage ε′ and running time t′, where
ε′ ≥ ε − qD/2^k − (qD + qG)/2^k1 + qD/2^k3
t′ = t + TH · qH + TG · qG + (TD + O(1)) · qD
Corollary 1. From Theorem 1 we immediately obtain that any probabilistic public key encryption primitive satisfying OW-PCA can be converted to IND-CCA security by our conversion P1 in the random oracle model.
Proof. Theorem 1 is proven by Lemma 1 together with the simulation of the decryption oracle. TG, TH, TD denote the execution times of Gen, Hash and the decryption oracle, respectively.
Lemma 1. Suppose there exists an adversary A which chooses plaintexts of the P1 conversion adaptively, makes at most qG and qH queries to the hash functions Gen and Hash respectively, and distinguishes m0 and m1 given the corresponding ciphertext with non-negligible advantage ε in time t. Then there is an algorithm B which breaks the one-wayness against plaintext-checking attacks with advantage ε′ and time t′, where
ε′ ≥ ε − qG/2^k1
t′ = t + TH · qH + TG · qG
Proof. The algorithm B acts as follows. First, B simulates the Gen and Hash oracles for the queries from A; if B simulates the answers correctly, A breaks the indistinguishability of the encryption with non-negligible advantage. Algorithm B maintains two tables, the G-List and the H-List, both initially empty. On an input to Gen or Hash, the response is generated by the oracle, and the input-output pairs of both oracles are recorded exactly, on the G-List and H-List respectively.
First, B runs A in find-stage mode. We let g denote a k1-bit input to Gen and G its (n + k)-bit output, and h an n-bit input to Hash and H its k1-bit output, where G and H are generated uniformly at random by the oracles. Next, B runs A in guess-stage mode. When (R′, mb), b ∈ {0, 1}, is taken as the input of the conversion, B first goes to the G-List to check whether there exists a (g, G) pair such that the corresponding G satisfies (1),

G(R′) = r || {mb ⊕ (y2||∗)}     (1)

and (y2||∗) (where ∗ is any k3-bit string) is in the query table h of the H-List. If both are satisfied, then ∗ is queried to the plaintext-checking oracle, checking whether the pair (∗, y4) is a plaintext-ciphertext pair. If it is not, then output "Invalid"; otherwise, return y3 := ∗ and mb = (y2||y3) ⊕ [G(R′)]_n. A successfully distinguishes mb given the correct simulation by B. However, it is difficult for B to simulate all the queries from A, such as simulating Gen when R′ is asked, without knowing y3 exactly. Therefore we define the following events AskG and AskH:
– AskG is the event that R′ has been asked to the Gen oracle among the qG queries, before (y2||∗) is queried to the Hash oracle.
– AskH is the event that (y2||∗) has been asked to the Hash oracle among the qH queries, before R′ is queried to the Gen oracle.
By the definition of the advantage ε, Pr(Win) = (ε + 1)/2, and

Pr(Win) ≤ (Pr(AskG ∨ AskH) + 1) / 2     (2)
we know that

Pr(AskG ∨ AskH) ≥ ε     (3)
Since the probability that one query to Gen equals R′ is 1/2^k1, for at most qG queries,

Pr(AskG) ≤ 1 − (1 − 1/2^k1)^qG ≤ qG/2^k1     (4)

If the event AskG does not happen and the event AskH happens, then the algorithm B can recover the plaintext of the OW-PCA primitive. Obviously, since Pr(AskG ∧ AskH) = 0,

Pr(AskG ∨ AskH) = Pr(AskG) + Pr(AskH)     (5)
Thus, from (3), (4) and (5), we get (6):

Pr(¬AskG ∧ AskH) = Pr(AskH) = Pr(AskG ∨ AskH) − Pr(AskG) ≥ ε − qG/2^k1     (6)
The algorithm B uses at most t + TH · qH + TG · qG steps, where TH is the number of steps needed to check whether a query h is new, and similarly for TG. This finishes the proof of Lemma 1.
Decryption Oracle. When facing the adaptive chosen-ciphertext attack, there must be a decryption oracle, whose responses B simulates by checking the H-List and G-List on each query. When the adversary A submits a ciphertext c = (y1||y2||y4), B searches the H-List, using the two query-answer tables from Lemma 1, for entries of the form (y2||∗) in the h query table. For each such query, the plaintext-checking oracle PCO checks whether the pair (∗, y4) is a plaintext-ciphertext pair. If some hi passes this examination, Hi ⊕ y1 is searched for in the query table g of the G-List. If there exists a pair (gj, Gj) with gj = Hi ⊕ y1, we use the public key of the OW-PCA primitive to encrypt ∗ once more under the coins [Gj]_k, checking the validity of ∗ by:

Epk(∗; [Gj]_k) ?= y4     (7)
If the relation holds, A recovers the plaintext, given the correct simulation by B, as follows:

mb = (y2||∗) ⊕ [G(R′)]_n     (8)
However, the adversary A may reject a valid ciphertext because the simulation fails. We have to consider the following events.
– AskG′ is the event that R is asked to the Gen oracle among the qG queries.
– AskH′ is the event that (y2||∗) has been asked to the Hash oracle among the qH queries.
Since R is chosen randomly by B and A cannot obtain it without querying Gen, the probability that one query to Gen accidentally matches [G(R)]_k is 1/2^k; hence, for at most qD queries to the decryption oracle,

Pr(¬AskG′) ≤ 1 − (1 − 1/2^k)^qD ≤ qD/2^k     (9)

For the same reason, the probability of rejecting a valid ciphertext because Hash was not queried is

Pr(¬AskH′) ≤ 1 − (1 − 1/2^k1)^qD ≤ qD/2^k1     (10)

In particular, without querying either Gen or Hash, the adversary A can only guess y3 at random and check it with the plaintext-checking oracle, so

Pr(¬AskG′ ∧ ¬AskH′) = qD/2^k3     (11)
Therefore, together with Lemma 1, we can conclude that the total advantage of the algorithm B is

Pr(¬AskG ∧ AskH ∧ UnfailingDecryption)
  = Pr(AskH) − [Pr(¬AskG′) + Pr(¬AskH′) − Pr(¬AskG′ ∧ ¬AskH′)]
  = Pr(AskG ∨ AskH) − Pr(AskG) − Pr(¬AskG′) − Pr(¬AskH′) + Pr(¬AskG′ ∧ ¬AskH′)
  ≥ ε − qD/2^k − (qD + qG)/2^k1 + qD/2^k3     (12)
The number of steps is at most t + TH · qH + TG · qG + (TD + O(1)) · qD, where TD is the time of the decryption oracle and the plaintext-checking time is bounded by a constant. This proves Theorem 1.
Theorem 2. Let A be a CCA adversary against the indistinguishability of the P2 conversion, with advantage ε and running time t, making at most qD, qG, qH1 and qH2 queries to the decryption oracle and the hash functions Gen, H1 and H2, respectively. Then there exists an algorithm B against the OW-PCA security of the asymmetric encryption scheme with advantage ε′ and running time t′, where
ε′ ≥ ε − qD/2^k − (qD + qG)/2^k1 + qD/2^k3
t′ = t + TH1 · qH1 + TH2 · qH2 + TG · qG + (TD + O(1)) · qD
Proof. Theorem 2 follows from Lemma 2 and Theorem 1.
Lemma 2. In the conversion P2, the algorithm B can perfectly simulate the H2 random oracle.
Proof. Lemma 2 is proven as follows.
1. The output of H2 is H2′; if H2′ ≠ y5, then output a random value and record the pair on the H2-List.
2. Otherwise, H2′ = y5: check the G-List records; if the query appears there, go back to the H1-List to find the corresponding pairs and recover the message; if it is not on the G-List, reject the data and output ⊥.
Thus the simulation of the H2 oracle is perfect. This finishes the proof of Theorem 2.
Corollary 2. From Theorem 2 we obtain that any OW-PCA secure probabilistic primitive can be embedded into our conversion P2, yielding IND-CCA security.
Remark. We have shown the security reduction cost of both conversions in the random oracle model. Owing to the slight difference between P1 and P2, the results are similar. Their performance, however, is distinct: P1 is more compact and P2 is more efficient in the decryption process.
5 Comparison
In the above sections, we have proven the security of our proposals: IND-CCA in the random oracle model. We also claim that the P1 and P2 schemes perform well with respect to ciphertext compactness. We compare our schemes with previous work applicable to probabilistic primitives, such as the FO (Fujisaki-Okamoto), Pointcheval, REACT and size-efficient GEM schemes. (Since FO's first scheme [6] requires IND-CPA primitives and cannot process arbitrarily long messages, we use the enhanced FO scheme [7] in the evaluation.) Furthermore, because our proposals have an advantage when the input size of the primitive is large, we take the following setting: the length of the random number input and of the hash function output is set to 160 bits, and the message input of the public key encryption primitive is 1024 bits, except that for ECC it is 160 bits. All of these parameters are reasonable and practical. We consider some commonly used probabilistic public key encryption primitives: ElGamal, OU (Okamoto-Uchiyama), Paillier and ECC (elliptic curve cryptosystem).

Table 1. Comparison of the data redundancy (data redundancy = ciphertext size − plaintext size, in bits)

Conversion    Re-encryption  Condition  ElGamal  OU    Paillier  ECC
P1            Yes            OW-PCA     1184     2208  1184      320
P2            No             OW-PCA     1344     2368  1344      480
FO            Yes            OW-CPA     2048     3072  2048      320
Pointcheval   Yes            OW-CPA     2208     3232  2208      480
REACT         No             OW-PCA     2208     3232  2208      480
GEM           No             OW-PCA     2048     3072  2048      320
From the table, we can see that when the input size is large, our schemes have an evident advantage in data redundancy, whereas the other schemes perform roughly the same. For ElGamal, our schemes save nearly half of the redundancy compared with the largest ones. Even in the no-re-encryption setting, our proposal P2 still saves much of the redundancy, which is very meaningful for bandwidth-saving communication systems. However, we can also see that for ECC, or other systems with short message inputs, our advantage disappears, and the "on-the-fly" REACT becomes advantageous in that case.
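The P1 and P2 rows of Table 1 can be checked with back-of-the-envelope arithmetic: for P1 the redundancy is |y1| + |y4| − |y3| and P2 adds the tag y5. The sketch below is ours; the assumed ciphertext sizes of the primitives for a 1024-bit input (160-bit for ECC) are chosen to match the table's setting.

```python
# Back-of-the-envelope check of the P1 / P2 rows of Table 1 (all sizes in bits).
K1 = 160                       # random number / hash output length
K5 = 160                       # P2 tag length (assumption)
primitive = {                  # name: (input size k3, ciphertext size) -- assumed
    "ElGamal":  (1024, 2048),
    "OU":       (1024, 3072),
    "Paillier": (1024, 2048),
    "ECC":      (160,  320),
}
for name, (k3, ct) in primitive.items():
    p1 = K1 + ct - k3          # |y1| + (|y4| - |y3|)
    p2 = p1 + K5               # plus the hash tag y5
    print(f"{name:8s}  P1: {p1:4d}   P2: {p2:4d}")
```

With these assumptions the script reproduces the P1 and P2 columns of the table exactly.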
6 Conclusion
Finally, we conclude that our generic conversions are provably secure in the random oracle model and practical. Our schemes have the advantage of ciphertext compactness, which is particularly valuable in settings where bandwidth is the first concern.
References 1. M. Bellare and P. Rogaway. Random Oracles Are Practical: A paradigm for designing efficient protocols, in Proc. First Annual Conference on Computer and Communications Security, ACM, 1993. 2. M. Bellare and P. Rogaway. Optimal Asymmetric Encryption – How to Encrypt with RSA. In Eurocrypt’94, LNCS 950, pages 92–111. Springer-Verlag, Berlin, 1995. 3. J.Coron, H.Handschuh, M.Joye, P.Paillier, D.Pointcheval, C.Tymen. GEM: A Generic Chosen-Ciphertext Secure Encryption Method. In CT-RSA 2002, LNCS 2271, pages 263–276. Springer-Verlag, Berlin, 2002. 4. T.ElGamal. A Public Key Cryptosystem and a Signature Scheme Based on the Discrete Logarithms. IEEE Transactions on Information Theory, IT-31(4):469–472, July, 1985. 5. A.Fiat and A.Shamir. How to Prove Yourself: Practical Solutions of Identification and Signature Problems. In Crypto’86, LNCS 263, page 186–194, Springer-Verlag, Berlin, 1987. 6. E.Fujisaki and T.Okamoto. How to Enhance the Security of Public-Key Encryption at Minimum Cost. In PKC’99, LNCS 1560, pages 53–68. Springer-Verlag, Berlin, 1999. 7. E.Fujisaki and T.Okamoto. Secure Integration of Asymmetric and Symmetric Encryption Schemes, In Crypto’99, Springer-Verlag, LNCS 1666, pp.537–554, 1999. 8. E.Fujisaki, T.Okamoto, D.Pointcheval and J.Stern. RSA-OAEP Is Secure under the RSA Assumption. In Crypto’01, LNCS 2139, pages 260–274, Springer-Verlag, Berlin, 2001. 9. K.Kobara and H.Imai. OAEP++: A very simple way to apply OAEP to deterministic OW-CPA primitives. Manuscript, Aug.2002 http://eprint.iacr.org/2002/130 10. T.Okamoto and D.Pointcheval. The Gap-Problems: A New Class of Problems for the Security of Cryptographic Schemes. In PKC’01, LNCS 1992, pages 104–118, Springer-Verlag, Berlin, 2001. 11. T.Okamoto and D.Pointcheval. REACT: Rapid Enhanced-Security Asymmetric Cryptosystem Transform. In CT-RSA 2001, LNCS 2020. pages 159–175. 12. T.Okamoto and S.Uchiyama. A New Public-Key Cryptosystem as Secure as Factoring. In Eurocrypt’98, LNCS 1403, pages 308–318, Springer-Verlag, Berlin, 1998. 13. P.Paillier. Public-Key Cryptosystems Based on Composite Degree Residuosity Classes. In Eurocrypt’99, LNCS 1592, pages 223–238, Springer-Verlag, Berlin, 1999. 14. D.Pointcheval. Chosen-Ciphertext Security for Any One-Way Cryptosystem. In PKC’00, LNCS 1751, pages 129–146,Springer-Verlag, Berlin, 2000. 15. C.Rackoff and D.Simon. Non-interactive Zero-knowledge Proof of Knowledge and Chosen Ciphertext Attack. In Crypto’91, LNCS 576, pages 433–444, SpringerVerlag, 1992.
A Security Verification Method for Information Flow Security Policies Implemented in Operating Systems*

Xiao-dong Yi and Xue-jun Yang

College of Computer, National Univ. of Defense Technology, Changsha 410073, China
[email protected]
Abstract. Nowadays, operating system security depends heavily on the security policies implemented in the system, so it is necessary to verify whether a secure operating system's implementation of its security policies is correct. This paper provides a general and automatable security verification method, suitable for deployment in practice, for verifying information flow security policies implemented in information systems, especially in secure operating systems. We first use information flow graphs (IFGs) to express the information flow security policies specified by temporal logic. Then, based on this expression method, we provide a verification framework to check whether the implementation of an information system satisfies the restrictions of its security policies. Finally, a security verification framework based on mandatory access control (MAC), which fits current secure operating systems, is given.
Keywords. Secure Operating System, Security Verification, Information Flow Security Policy
1 Introduction

Information security is attracting more and more attention and has become one of the key elements in designing operating systems. An important way to secure an operating system is to enforce various security policies, and information flow security policies are among the most widely used. An information flow security policy is a policy that restricts the system's information flows. This paper focuses on how to verify whether an operating system's implementation satisfies the restrictions of its information flow security policies. Early secure operating systems were implemented based on a single security model, so their security verification concentrated on verifying whether the implementation meets that security model. Operating systems of this class include UCLA [3], ASOS [4] and EROS [5]. Their security verification methods are all ad hoc, have no generality, and cannot be used in the verification of other operating systems. Current secure operating systems tend towards multiple policies and policy flexibility. In this kind of OS, security policies are built as a loadable module instead of being built
This paper is supported by China National 863 Software Project 2002AA1Z2101, “Server Operating System Kernel”.
into the system. Furthermore, security policies in today's operating systems are freely customized instead of being limited to several classic policies. All of these developments make it necessary to design a security verification framework that is not only general but also automatable. The paper [2] proposes a general security verification framework. It contains the following three parts:
1. the policy, specified in temporal logic;
2. the operating system implementation, specified in Z;
3. the verification, based on temporal logic.
The specification and verification methods based on temporal logic can be automatic or semi-automatic, but they can hardly be used in practice because, even for the specification of an OS alone, it is hard to guarantee correctness and the workload is large. We propose a new method that makes the verification framework of [2] usable in practice. It carries out the verification in a non-logic way, described as follows:
1. information flow graphs are used to express the information flow security policies specified by temporal logic;
2. the operating system is modeled by a simplified state machine based on mandatory access control;
3. the verification is based on an algorithm.
In Chapter 2, we define information flow graphs and use them to express security policies specified in temporal logic. In Chapter 3, we propose our verification framework for general information systems and then refine it for current secure operating systems. Chapter 4 is a brief summary.
2 The Expression Method Based on Information Flow Graphs The first step of our work is to express the policies originally specified in temporal logic in a new way, based on information flow graphs. 2.1 Information Flow Security Policy and Its Specification in Temporal Logic An information flow security policy restricts the information flows generated by a system: it states that the information flows in the system should satisfy certain attributes. Formally, an information flow security policy can be specified as SP = <C, Op, P, A> [2]: • C is a finite set of information classes; • Op is a set of relations, including the equality relation (=), on the information classes in C; • P is a finite set of primitive propositions, each describing an information flow between two information classes; • A is a set of policy statements. Policy statements are the key element of a security policy, used to judge whether an information flow is legal or not. The key question, then, is what language should be used to specify the policy statements. The paper [2] uses temporal logic and argues that it is very expressive.
2.1.1 Introduction to Temporal Logic Temporal logic was first proposed by Pnueli and is used to specify and verify concurrent software such as operating systems and network protocols [1][2]. Temporal logic includes temporal operators in addition to the traditional logical operators. We use one kind of temporal logic, CTL (Computation Tree Logic), to specify information flow security policies. Let p be an atomic proposition. CTL formulae are defined recursively as follows [6]:
φ ::= p | ¬φ | φ ∨ φ | EXφ | E (φUφ ) | A(φUφ )
Also, for convenience, we use the following abbreviations:
EFφ = E(true U φ)
AFφ = A(true U φ)
EGφ = ¬AF¬φ
AGφ = ¬EF¬φ
AXφ = ¬EX¬φ
2.1.2 Using Temporal Logic to Specify Information Flow Security Policies In this paper, since our research is at an early stage, we use a subset of CTL plus the first-order quantifier ∀ to specify security policies. The syntax of our custom logic is:
φ ::= p | ¬φ | φ ∨ φ | ∀φ | EFφ
together with the following abbreviations:
∃φ = ¬∀¬φ
AGφ = ¬EF¬φ
Our custom logic uses only a few temporal logic operators, but most commonly used information flow policies can be specified with it. We will include other temporal logic operators to enhance its expressive power in later research. 2.2 Information Flow Graph (IFG) An entity's information flow graph describes the information flows from or to the entity: it records all the information flows between the entity's information class and other information classes. It is specified as follows. Definition 2.1. An entity's information flow graph (IFG) is a directed graph G = <S, E, l>, where: • S is a finite set of information classes; • E is a set of information flows between classes,
E ⊆ {a → b | a, b ∈ S } .
For any a, b ∈ S, a → b ∈ E means that information flows from information class a to information class b. Note that information flows in information flow graphs are intransitive; all information flows must be stated explicitly. That is, when a → b ∈ E and b → c ∈ E, we cannot conclude that information can flow from a to c. Whether information has flowed from a to c is determined by whether a → c exists in E; • l ∈ S indicates the information class of the entity described by this graph.
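To make Definition 2.1 concrete, the following is a minimal Python sketch of an IFG; the class and method names are our own assumptions, not notation from the paper, and the intransitivity rule is reflected by storing every flow explicitly.

```python
from dataclasses import dataclass, field

@dataclass
class IFG:
    """Information flow graph G = <S, E, l> of one entity (Definition 2.1)."""
    classes: set            # S: finite set of information classes
    label: str              # l: information class of the entity itself
    flows: set = field(default_factory=set)   # E: explicit flows (a, b) meaning a -> b

    def add_flow(self, a, b):
        # Flows are intransitive: only the pair (a, b) itself is recorded.
        assert a in self.classes and b in self.classes
        self.flows.add((a, b))

    def has_flow(self, a, b):
        # a -> c is TRUE only if (a, c) was added explicitly,
        # never inferred from a -> b and b -> c.
        return (a, b) in self.flows

# An empty IFG for an entity of class 'secret' over classes {public, secret}:
g = IFG(classes={"public", "secret"}, label="secret")
g.add_flow("public", "secret")
assert g.has_flow("public", "secret") and not g.has_flow("secret", "public")
```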
Definition 2.2. If an IFG's E is an empty set, we say the graph is an empty IFG. Different values of l give different empty IFGs, so there are |S| empty IFGs in total.
In information systems, we use an IFG to describe a state of an entity. But security policies restrict the sequences of state transitions of entities, so we must introduce state transitions into IFGs. Definition 2.3. In information systems, state transitions are performed by a set of predefined operations. Every operation in the system changes the IFGs of one or more entities. We call this kind of change the information system's operation on an IFG, abbreviated as an IFG operation. Clearly, every predefined operation of the system induces an IFG operation. An IFG is intended to record all information that has ever flowed between information class l and other information classes up to the present. If information has flowed from s to o, this flow becomes part of the history and remains in the IFGs of both s and o; even if o is later deleted, the fact that information flowed from s to o cannot be undone. Therefore the information flows in an IFG are never deleted; in other words, the number of information flows never decreases. This gives the following IFG transition rule. IFG transition rule. Let an IFG operation change the IFG G = <S, E, l> to
G' = <S', E', l'>; then |E'| ≥ |E|. In other words, an IFG operation never reduces the number of information flows in the IFG. Definition 2.4. An entity's reachable IFG set, abbreviated as the entity's reachable set, is the set of IFGs obtained by performing some sequence of IFG operations on empty IFGs. The IFGs in the reachable set are called reachable IFGs. Different information systems, with different operations, have different reachable sets. Since the reachable set contains all possible IFGs of the entity in the information system, and an IFG describes one state of the entity, the reachable set is the entity's state space, and it is finite. Definition 2.5. An extended IFG is specified as Ge = <S, E, l, f>, where
f : E → N, where N is the set of natural numbers. An extended IFG adds a function f assigning each element of E a natural number. Some security policies state that the amount of information flow between two entities can never exceed a given bound; such policies can be specified using extended IFGs. 2.3 The Expression of Temporal Logic Security Policies Based on IFGs The state of an information system is the set of the states of all the system's entities, which can be specified by the set of all the entities' IFGs, namely a subset of the entity reachable set. A security policy says that the sequences of the system's state transitions must satisfy certain security attributes. From Definition 2.4, a reachable IFG is the result of performing some state transition operations on an entity. So, semantically, we hope to build a one-to-one map between an IFG and a
policy statement specified by temporal logic. We can then use the IFGs that make a policy statement TRUE to express that policy statement. Before building such a map between an IFG G and a policy statement, we must be able to calculate the truth value of a temporal logic policy statement given an IFG G. We call this the policy statement's truth value on G; if it is TRUE, we say that G satisfies the policy statement. In information systems, the restrictions of security policies are implemented by restricting the system's operations. In the information system's state machine, whether an operation is allowed by the security policy can only be judged from the history of information flows; no information about future states is available. Consequently, when implementing systems, the truth values of some temporal logic policy statements cannot be calculated. For example, EF(x → y) states that in some future state along some path there will be an information flow x → y; but in the current state we know nothing about future states, so the truth value of this statement cannot be calculated. Theorem 2.1. If a policy statement specified by temporal logic satisfies the following conditions, its truth value on an IFG can be calculated: 1. there is no negation ¬ before the temporal operator AG; 2. there is a negation ¬ before the temporal operator EF. Proof: Semantically, if there is no negation ¬ before AG, the statement claims that the system should always HAVE some property, so we only need to check every "current state" against the statement to conclude that the system satisfies the security policy. On the other hand, if there is a negation ¬ before EF, the statement claims that the system should NOT HAVE some property, so again it suffices to check every "current state". An IFG can only describe the sequence of one entity's state transitions, but security policies claim that the sequence of the whole system's state transitions should satisfy some attributes. We would like to judge whether the system satisfies the security policy only by judging whether every entity satisfies it; in other words, we hope that if all of the system's entities satisfy the policy, then the system satisfies the policy, and that if the system does not satisfy the policy, then at least one entity's IFG violates it. But this is not always true. For example, consider the policy statement AG((a → b) ∧ (c → d) ⇒ ¬(EF(a → d))), which states that if information has flowed from a to b and from c to d, then no information should ever flow from a to d. Assume a system and its four entities a, b, c, d look like Fig. 1(1), where the numbers on the arrows indicate the order in which the information flows were generated. The system of Fig. 1 does not satisfy the statement, but the IFGs of all the entities a, b, c, d do satisfy it. Theorem 2.2. Consider a policy statement specified by temporal logic such that all of its primitive propositions (the elements of P) share a common information class c.
In other words, all of the primitive propositions of the policy statement are of the form c → x or x → c, where x stands for any information class. Then
Fig. 1. (1) is the system’s IFG. (2), (3), (4) and (5) are IFGs of entity a, b, c and d respectively. The square in IFG stands for the entity’s information class l.
we can judge whether a system satisfies the security policy only by judging whether all the system's entities satisfy the policy. Note that the information class c can be a constant element of C or a variable quantified by the first-order quantifier ∀ or ∃. Proof: A statement satisfying the conditions of this theorem only states relations between information class c and other information classes. So whether the system satisfies the security policy is equivalent to whether all entities of information class c satisfy the policy. Definition 2.6. If a policy statement specified by temporal logic satisfies Theorems 2.1 and 2.2, we call it a calculable statement, and we call a security policy a calculable security policy if all of its statements are calculable. The verification that follows is based on calculable statements. The most practical and commonly used security policies, such as MLS and the Chinese Wall, are all calculable. For calculable statements that contain a negation ¬ before the temporal operator EF but where ¬ does not apply to EF directly (i.e., other symbols separate them), we can transform the statement into the form ¬EF without changing its truth value. Definition 2.7. For a calculable statement p in which every EF is directly negated by ¬, we calculate its truth value on a reachable IFG G = <S, E, l> as follows: 1. If p contains no temporal logic operators, then besides the first-order quantifiers ∀ and ∃, p can only contain elements of P (primitive propositions such as x → y) and relations between information classes drawn from
Op (such as x ≤ y). For a proposition x ≤ y, we calculate its truth value from the definition of ≤. For a proposition x → y, it is TRUE if x → y ∈ E and FALSE otherwise. 2. If p is of the form AGq, then p is TRUE if q's truth value on the IFG G is TRUE,
meaning that up to now the entity's IFG satisfies q, i.e., the sequence of state transitions described by G satisfies p so far. 3. If p is of the form ¬EFq, then p is TRUE if q's truth value on the IFG G is FALSE, meaning that up to now the entity's IFG does not satisfy q, i.e., the sequence of state transitions described by G satisfies p so far. Definition 2.8. A calculable security policy's legal reachable IFG set, abbreviated as the policy's legal reachable set, is the set of reachable IFGs that satisfy all the policy statements. The IFGs in the legal reachable set are called legal reachable IFGs. At this point we have expressed the information flow security policies specified by temporal logic through their legal reachable sets; this is the basis of the work that follows. 2.4 Conclusions In this section, we first used our custom logic to specify commonly used information flow security policies. We then constructed the policies' legal reachable sets from the policies' information classes, policy statements, and the information system's operations. Note that not all security policies specified by temporal logic can be expressed with IFGs. We defined calculable policies and used two theorems to show that all calculable policies can be expressed with IFGs. The most practical and commonly used policies in current operating systems, such as MLS and the Chinese Wall, are calculable and can be specified using our method.
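As an illustration of how the truth values of Definition 2.7 could be evaluated and how a legal reachable set could be filtered, here is a small Python sketch; the encoding of statements and all helper names are our own assumptions, not notation from the paper.

```python
# A reachable IFG is represented here simply as a frozenset of flows (a, b).
# A calculable statement is either ("AG", q) or ("NOT_EF", q), where q is a
# predicate over the flow set (Definition 2.7); this encoding is ours.

def truth_value(statement, flows):
    kind, q = statement
    if kind == "AG":        # AG q: the history recorded in the IFG satisfies q
        return q(flows)
    if kind == "NOT_EF":    # not EF q: the history recorded in the IFG never satisfies q
        return not q(flows)
    raise ValueError("not a calculable statement")

def legal_reachable_set(reachable_ifgs, statements):
    # Keep exactly those reachable IFGs on which every statement is TRUE (Definition 2.8).
    return {g for g in reachable_ifgs
            if all(truth_value(s, g) for s in statements)}

# Example policy: "no information may ever flow from 'secret' to 'public'".
policy = [("NOT_EF", lambda flows: ("secret", "public") in flows)]
reachable = {frozenset(), frozenset({("public", "secret")}),
             frozenset({("secret", "public")})}
assert legal_reachable_set(reachable, policy) == {
    frozenset(), frozenset({("public", "secret")})}
```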
3 The Security Policy Verification Framework In this section, we first give a verification framework for information systems. We then propose a simplified verification framework based on mandatory access control that is suitable for verifying current secure operating systems. 3.1 The Specification of Information Systems An information system can be specified as a state machine
M =< S , E , τ , s0 > [2],
where: • S is the set of system states, described by means of state variables; • E is the set of entities in the system; • τ is the state transition relation, τ ⊆ S × S; and • s0 is the initial state of the system. For all information systems, our framework assumes that there are no information flows between any two entities in the initial state, i.e., every entity's IFG is an empty IFG initially.
In an information system, state transitions are caused by the system's predefined operations. The information system's implementation of a security policy SP = <C, Op, P, A> is defined as the system's interpretation of the security policy, specified as I = <η, OPS, F>: • η : E → C assigns an information class to every entity in the system; • OPS is a set whose elements have the form Cond_i ⇒ OP_i, meaning that the operation OP_i may be performed only when the condition Cond_i is satisfied; • F defines the corresponding IFG operation for every operation in OPS. For
information systems, it is straightforward to calculate the resulting IFG after an IFG operation is performed on an IFG. In general, an IFG operation adds some information flows to the IFG or changes the information class of the IFG (i.e., changes the IFG's l). The interpretation I connects the system's state machine with the security policy. In I, the system's state is defined by the set of all entities' information classes. The operations in OPS change the states, corresponding to τ in the state machine. The condition Cond_i attached to an operation reflects the restrictions of the security policies. The principal task of verification is to judge whether an operation that satisfies Cond_i also satisfies the security policy. In other words, for every legal reachable IFG, is the IFG obtained after performing an operation whose condition Cond_i is satisfied also legal reachable? 3.2 The Security Verification Framework for Information Systems The verification of the security policies implemented in an information system is divided into two steps: 1. Construct the security policies' legal reachable sets. For a security policy specified by temporal logic, we first work out all the reachable IFGs from the operations defined in the system's interpretation. Next, we calculate the truth value of every policy statement on every reachable IFG; if all statements are TRUE we keep the IFG, otherwise we drop it. When this is finished, the IFGs that remain are exactly the legal reachable IFGs. 2. Verify using the method stated in Theorem 3.1. Theorem 3.1. Given a calculable security policy SP = <C, Op, P, A>, a system state machine M = <S, E, τ, s0>, and an interpretation I = <η, OPS, F>: for every IFG in the policy's legal reachable set, if it satisfies the condition Cond_i of some OP_i in OPS, perform the corresponding IFG operation defined in F on that IFG to obtain a new IFG. Doing this for all IFGs in the legal reachable set yields the set A' of new IFGs. If A' = A (where A here denotes the legal reachable set), we say that the implementation of the information system satisfies the security policy.
Proof: The state of the system's state machine, State, is defined as the set of the legal reachable IFGs of all the system's entities, namely State ∈ 2^A. Let the system's state transit from State to State'; then State' ∈ 2^{A'}. Because A' = A, State' ∈ 2^A is a legal state, so the theorem holds. 3.3 The Security Verification Framework for Current Secure Operating Systems An operating system is a kind of information system whose operations are system calls, so the information system verification framework applies to it. Here we provide an interpretation based on mandatory access control and then give a simplified verification framework for secure operating systems. In current secure operating systems, security policies are often implemented by means of a mandatory access control (MAC) mechanism. MAC assigns a label to every subject (process, etc.) and every object (inode, pipe, file, etc.), and the label's content is defined by the security policies. When a subject tries to access an object, MAC submits the subject's and the object's labels to the security policy, and the security policy decides whether the access is granted. We find that the operations between entities can be built from five basic operations: subjects read or write objects, subjects create or destroy entities, and subjects relabel entities' information classes. To describe the corresponding IFG operations of these five basic operations, we first define two basic IFG operations: 1. Add a new information flow between two entities. If a new information flow l1 → l2 is added between two entities, the two entities' IFGs
G1 = <S1, E1, l1> and G2 = <S2, E2, l2> will change. Let the two IFGs change to G1' = <S1', E1', l1'> and G2' = <S2', E2', l2'> respectively; then S1' = S1, S2' = S2, l1' = l1, l2' = l2, E1' = E1 ∪ {l1 → l2}, and E2' = E2 ∪ {l1 → l2} ∪ {x → l2 | (x → l1) ∈ E1}; 2. Relabel the entity's information class. If we relabel the entity's information class from l to m, the entity's IFG changes. Let its IFG G = <S, E, l> change to G' = <S', E', l'>; then S' = S, E' = E ∪ {l → l'} ∪ {x → l' | (x → l) ∈ E}, and l' = m.
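A minimal Python sketch of these two basic IFG operations follows; it assumes an IFG is just a (label, set-of-flows) pair, and the function names are ours, not the paper's.

```python
# An IFG is modeled as a pair (label, flows) where flows is a set of (a, b) pairs.

def add_flow(g1, g2):
    """Basic IFG operation 1: add the flow l1 -> l2 between the entities of g1 and g2."""
    l1, e1 = g1
    l2, e2 = g2
    e1_new = e1 | {(l1, l2)}
    # The flow also propagates into g2's history: everything that already
    # flowed into l1 is recorded as having flowed into l2 as well.
    e2_new = e2 | {(l1, l2)} | {(x, l2) for (x, y) in e1 if y == l1}
    return (l1, e1_new), (l2, e2_new)

def relabel(g, m):
    """Basic IFG operation 2: relabel the entity's information class from l to m."""
    l, e = g
    e_new = e | {(l, m)} | {(x, m) for (x, y) in e if y == l}
    return (m, e_new)

# Example: a flow from a 'low' entity into a 'high' entity.
low = ("low", set())
high = ("high", set())
low, high = add_flow(low, high)
assert ("low", "high") in high[1]
```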
The five basic MAC operations are described as follows: 1. Read(s,o): the subject s reads the object o. This operation adds a new information flow η(o) → η(s) to the IFGs of s and o, following the basic IFG operation 1 above. 2. Write(s,o): the subject s writes the object o. This operation adds a new information flow η(s) → η(o) to the IFGs of s and o, following the basic IFG operation 1 above.
3. Create(s,x): the subject s creates a new entity x. The IFG of x is the same as the IFG of s. 4. Destroy(s,x): the subject s destroys the entity x. This has no effect on the IFG of s. 5. Relabel(s,x,l): the subject relabels the information class of entity x to l. This operation changes the information class of x's IFG to l, following the basic IFG operation 2 defined above. These five basic operations form a dynamic label system, i.e., the entities' labels can change while the system is running. As pointed out by [2], a dynamic label system is very flexible and powerful. So, in the interpretation I = <η, OPS, F> of a security policy implemented on MAC, OPS is defined as follows:
OPS = {Cond_R(η(s), η(o)) ⇒ Read(s,o), Cond_W(η(s), η(o)) ⇒ Write(s,o), Cond_C(η(s)) ⇒ Create(s,x), Cond_D(η(s), η(x)) ⇒ Destroy(s,x), Cond_L(η(s), η(x), l) ⇒ Relabel(s,x,l)}. The five conditions Cond in OPS are computed by the security policy's implementation
code on top of MAC. We want to verify whether the implementation of an operating system satisfies the security policy. Following Theorem 3.1, we only need to check whether the new IFG is still a legal reachable IFG when each of the five basic operations, with its condition satisfied, is performed on every legal reachable IFG. Theorem 3.2. For a calculable security policy SP = <C, Op, P, A>, the operating system's state machine
M =< S , E , τ , s0 > and the MAC interpretation
I = <η, OPS, F>: after performing the following three basic operations on the security policy's legal reachable set A, we get a new set of IFGs A'. If A' = A, we say that the implementation of the operating system satisfies the security policy. The three operations and their IFG operations are: 1. Read(s,o) while satisfying Cond_R.
∀G1 = <S1, E1, l1>, G2 = <S2, E2, l2> ∈ A, if Cond_R(η(s), η(o)) is satisfied and η(s) = l1, η(o) = l2, then add a new information flow l2 → l1 to G1 and G2 following the basic IFG operation 1 defined above, obtaining G1' and G2'. Then A' = A ∪ {G1', G2'}. 2. Write(s,o) while satisfying Cond_W. ∀G1 = <S1, E1, l1>, G2 = <S2, E2, l2> ∈ A, if Cond_W(η(s), η(o)) is satisfied and η(s) = l1, η(o) = l2, then add a new information flow l1 → l2 to G1 and G2 following the basic IFG operation 1 defined above, obtaining G1' and G2'. Then A' = A ∪ {G1', G2'}.
3. Relabel(s,x,m) while satisfying Cond_L. ∀G = <S, E, l> ∈ A, if Cond_L(η(s), η(x), m) is satisfied and l = η(x), then relabel the information class of G from l to m following the basic IFG operation 2 defined above, obtaining G'. Then A' = A ∪ {G'}.
Proof: This is a special case of Theorem 3.1. 3.4 Conclusions Our verification framework is more practical than that of [2]: it can easily be applied to the verification of real operating systems. This is its advantage. Its disadvantage is that the framework assumes that the conditions of all operations are expressed only in terms of information classes, whereas some systems implement conditions based on both information classes and simple relations between entities. In such systems, the relations between entities must be converted into relations between information classes, which is not always straightforward. However, to support multiple policies and policy flexibility, many current operating systems provide a MAC framework, such as the MAC framework in FreeBSD 5.0. In such frameworks, the conditions for judging whether an operation is legal are all based on information classes, for the sake of generality and flexibility, which makes our framework suitable for their verification.
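To make the closure check of Theorems 3.1 and 3.2 concrete, here is a small Python sketch; the representation of IFGs and conditions is our own assumption, and the `apply_op` functions stand in for the IFG operations defined by F.

```python
def satisfies_policy(legal_set, guarded_ops):
    """Closure check of Theorem 3.1: for every legal reachable IFG and every
    operation whose condition holds on it, the resulting IFG must again lie
    in the legal reachable set (A' = A)."""
    new_ifgs = set(legal_set)
    for cond, apply_op in guarded_ops:
        for g in legal_set:
            if cond(g):
                new_ifgs.add(apply_op(g))
    return new_ifgs == legal_set

# Toy example: IFGs are frozensets of flows; one guarded read-like operation.
legal = {frozenset(), frozenset({("low", "high")})}
read_low_to_high = (lambda g: True, lambda g: frozenset(g | {("low", "high")}))
assert satisfies_policy(legal, [read_low_to_high])
```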
4 Summary The paper contains two major parts. One concerns IFGs, which express information flow security policies specified by temporal logic in an engineering-friendly way. The other introduces a security verification framework for information systems and, in particular, for operating systems. Together the two parts give a simple and direct way to answer whether a system's implementation of one or more security policies is correct. Our research on general and automatable verification is at an early stage. Major future work includes introducing more temporal logic operators, refining the expression method, lowering the framework's computational complexity, and putting it into practice.
References
[1] E. Allen Emerson. Temporal and Modal Logic. 1995.
[2] Ramesh V. Peri. Specification and Verification of Security Policies. PhD Dissertation, 1996.
[3] Bruce J. Walker, Richard A. Kemmerer, and Gerald J. Popek. Specification and Verification of the UCLA Unix Security Kernel. ACM, 1980.
[4] Ben L. Di Vito, Paul H. Palmquist, Eric R. Anderson, and Michael L. Johnston. Specification and Verification of the ASOS Kernel. IEEE, 1990.
[5] J. S. Shapiro, S. Weber. Verifying Operating System Security. Computer and Information Sciences Technical Report MS-CIS-97-26, 1997.
[6] Joost-Pieter Katoen. Concepts, Algorithms, and Tools for Model Checking. Lecture Notes of the Course "Mechanised Validation of Parallel Systems", 1998/1999.
A Novel Efficient Group Signature Scheme with Forward Security Jianhong Zhang, Qianhong Wu, and Yumin Wang State key Lab. Of Integrated Service Networks, Xidian University, Xi’an Shannxi 710071 China {jhzhs,woochanhoma}@hotmail.com [email protected]
Abstract. A group signature scheme allows a group member to sign a message anonymously on behalf of the group. In case of a dispute, the group manager can reveal the actual identity of signer. In this paper, we propose a novel group signature satisfying the regular requirements. Furthermore, it also achieves the following advantages: (1) the size of signature is independent of the number of group members; (2) the group public key is constant; (3) Addition and Revocation of group members are convenient; (4) it enjoys forward security; (5) The total computation cost of signature and verification requires only 7 modular exponentiations. Hence, our scheme is very practical in many applications, especially for the dynamic large group applications. Keywords: Group signature scheme, forward security, revocation, anonymity, unlinkability
1 Introduction
Digital signatures play an important role in our modern electronic society because they have the properties of integrity and authentication. The integrity property ensures that the received messages are not modified, and the authentication property ensures that the sender is not impersonated. In well-known conventional digital signatures, such as RSA and DSA, a single signer is sufficient to produce a valid signature, and anyone can verify the validity of any given signature. Because of its importance, many variations of digital signature scheme were proposed, such as blind signature, group signature, undeniable signature etc, which can be used in different application situations. A group signature was introduced by Chaum and van Heyst [1]. It allows any member of a group to anonymously sign a document on behalf of the group. A user can verify a signature with the group public key that is usually constant and unique for the whole group. However, he/she cannot know which individual of the group signs the document. Many group signature schemes have been proposed [1,2,3,5,6,7,8]. All of them are much less efficient than regular signature
This work is supported by the National Natural Science Foundation of China (No. 69931010).
S. Qing, D. Gollmann, and J. Zhou (Eds.): ICICS 2003, LNCS 2836, pp. 292–300, 2003. c Springer-Verlag Berlin Heidelberg 2003
schemes. Designing an efficient group signature scheme is still an open problem. The recent scheme proposed by Ateniese et al. is particularly efficient and provably secure [2]. Unfortunately, several limitations still render all previous solutions unsatisfactory in practice. Giuseppe Ateniese pointed out two important problems of group signatures in [3]: one is how to deal with exposure of group signing keys; the other is how to allow efficient revocation. In this paper, we propose a novel and efficient group signature scheme with forward security to solve these two problems. The concept of forward security was proposed by Ross Anderson [4] for traditional signatures, and several schemes satisfying the efficiency properties have recently been proposed for traditional and threshold signatures. Previous group signature schemes do not provide forward security. Forward-secure group signature schemes allow individual group members to join or leave a group or update their private signing keys without affecting the group public key. By dividing the lifetime of all individual private signing keys into discrete time intervals, and by tying all signatures to the time interval in which they are produced, group members who are revoked in time interval i have their signing capability effectively stripped away in time interval i+1, while all their signatures produced in time interval i or before remain verifiable and anonymous. In 2001, Song [5] first presented a practical forward-secure group signature scheme. Our proposed scheme is slightly more efficient than Song's scheme. The rest of this paper is organized as follows. In Section 2, we overview the informal model of a secure group signature scheme and its security requirements. Our group signature scheme is proposed in Section 3, and the corresponding security analysis is given in Section 4. In Section 5, we analyze the efficiency of our scheme and compare its cost with Song's scheme. Finally, we conclude the paper.
2 Group Signature Model and Security Requirements
The concept of group signature was introduced by Chaum and van Heyst [1]. It allows a group member to anonymously sign a message on behalf of the group. Anyone can verify a group signature with the group public key. In case of a dispute, the group manager can open the signature to identify the signer. Participants: A group signature scheme involves a group manager (responsible for admitting/deleting members and for revoking the anonymity of group signatures, e.g., in case of dispute or fraud), a set of group members, and a set of signature verifiers; all participants are modeled as probabilistic polynomial-time interactive Turing machines. Communication: All communication channels are assumed to be asynchronous, and the communication channel between a signer and a receiver is assumed to be anonymous. Group signature schemes are defined as follows (see [8] for more details). A group signature scheme is comprised of the following procedures:
1. Setup: On input of a security parameter 1^l, this probabilistic algorithm outputs the initial group public key P and the secret key S for the group manager. 2. Join: An interactive protocol between the group manager and a user that results in the user becoming a new group member. 3. Sign: An interactive protocol between a group member and a user whereby a group signature on a user-supplied message is computed by the group member. 4. Verify: An algorithm for establishing the validity of a group signature given a group public key and a signed message. 5. Open: An algorithm that, given a signed message and a group secret key, determines the identity of the signer. A secure group signature scheme must satisfy the following properties: 1. Correctness: Signatures produced by a group member using Sign must be accepted by Verify. 2. Unforgeability: Only group members are able to sign messages on behalf of the group. 3. Anonymity: Given a signature, identifying the actual signer is computationally hard for everyone but the group manager. 4. Unlinkability: Deciding whether two different signatures were computed by the same group member is computationally hard. 5. Exculpability: Even if the group manager and some of the group members collude, they cannot sign on behalf of non-involved group members. 6. Traceability: The group manager can always establish the identity of the member who issued a valid signature. 7. Coalition-resistance: A colluding subset of group members cannot generate a valid group signature that cannot be traced. To achieve practicality, we propose in this paper a group signature scheme supporting the above properties and two further attributes, revocation and forward security. Revocability: The group manager can revoke the membership of a group member so that this member cannot produce a valid group signature after being revoked. Forward security: When a group signing key is exposed, previously generated group signatures remain valid and do not need to be re-signed.
3 Our Proposed Group Signature Scheme
3.1 System Parameters
The group manager (GM) randomly chooses two primes p1, p2 of the same size such that p1 = 2p1' + 1 and p2 = 2p2' + 1, where p1' and p2' are also primes. Let n = p1·p2 and let G = <g> be a cyclic subgroup of Zn*. GM randomly chooses an integer x as his secret key and computes the corresponding public
key y = g^x (mod n). GM selects a random integer e (e.g., e = 3) satisfying gcd(e, φ(n)) = 1 and computes d satisfying de = 1 mod φ(n), where φ(n) is the Euler totient function. h(·) is a collision-resistant hash function (e.g., SHA-1, MD5). The lifetime of the group key is divided into T time intervals, which are publicly known. (c, s) = SPK{γ : y = g^γ}(·) denotes a signature of knowledge of log_g y in G (see [2,6] for details). Finally, the group manager publishes the public key (y, n, g, e, h(·), IDGM, T), where IDGM is the identity of the group manager.
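As a rough illustration of this setup phase, here is a Python sketch with toy parameters; the helper names and the tiny primes are our own choices for readability and are nowhere near secure sizes.

```python
from math import gcd

# Toy safe primes: p1 = 2*p1' + 1, p2 = 2*p2' + 1 with p1', p2' prime.
p1_prime, p2_prime = 11, 23
p1, p2 = 2 * p1_prime + 1, 2 * p2_prime + 1      # 23 and 47, both prime
n = p1 * p2
phi_n = (p1 - 1) * (p2 - 1)

g = 4            # assumed generator of a cyclic subgroup of Z_n*
x = 13           # GM's secret key (toy value)
y = pow(g, x, n) # GM's public key

e = 3
assert gcd(e, phi_n) == 1
d = pow(e, -1, phi_n)     # d*e = 1 mod phi(n)
T = 10                    # number of time intervals

print("public key:", (y, n, g, e, T))
```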
3.2 Join Procedure
If a user, say Bob, wants to join the group, he executes an interactive protocol with GM. First, Bob chooses a random number k ∈ Zn* as his secret key and computes his identity IDB = g^k (mod n) and the signature of knowledge (c, s) = SPK{γ : IDB = g^γ}(·), which shows that he knows a secret value satisfying IDB = g^k (mod n). Bob keeps k secret and sends (IDB, (c, s)) to the group manager. After receiving (IDB, (c, s)), the group manager first verifies the signature of knowledge (c, s). If the verification holds, GM stores (IDB, (c, s)) in his group member database and then generates a membership certificate for Bob: GM randomly chooses a number α ∈ Zn* and computes rB = g^α mod n, sB = α + rB·x,
wB0 = (rB·IDGM·IDB)^{−d^T} mod n. GM sends (sB, rB, wB0) to Bob via a private channel and stores (sB, rB, wB0) together with (IDB, (c, s)) in his local database. After Bob receives (sB, rB, wB0), he verifies the following relations: g^{sB} = rB·y^{rB} mod n,
rB·IDGM·IDB = wB0^{−e^T} (mod n). If both equations hold, Bob stores (sB, rB, wB0) as his initial membership certificate.
3.3 Evolving Procedure
Assume that Bob holds the group membership certificate (sB, rB, wBj) at time period j. At time period j + 1, he computes his new membership certificate via the evolving function f(x) = x^e (mod n); his new membership certificate becomes (sB, rB, wBj+1), where wBj+1 = (wBj)^e mod n. (Note that wBj = (g^{sB}·IDGM·IDB)^{−d^{T−j}} mod n.)
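A minimal sketch of this evolving step, under the toy parameters assumed earlier (n, e and the starting value are ours, not test vectors from the paper):

```python
def evolve(w_j, e, n):
    # Certificate component for the next period: w_{j+1} = (w_j)^e mod n.
    return pow(w_j, e, n)

# Moving a certificate component forward is cheap, but going backwards would
# require extracting e-th roots mod n, i.e. knowledge of d (and hence of the
# factorization of n), which is what gives the scheme its forward security.
n, e = 1081, 3
w = 245                      # some certificate component w_{B0} (toy value)
history = [w]
for _ in range(3):           # evolve through three periods
    w = evolve(w, e, n)
    history.append(w)
print(history)
```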
3.4 Sign Procedure
Suppose that Bob holds the group membership certificate (sB, rB, wBj) at time period j. To sign a message m at time period j, Bob randomly chooses two numbers q1, q2 ∈ Zn* and computes z1 = g^{q1}·y^{q2} mod n, u = h(z1, m), r2 = wBj^u mod n, r1 = q1 + (sB + k)·u·h(r2), r3 = q2 − rB·h(r2)·u. The resulting group signature on m is (u, r1, r2, r3, m, j).
3.5 Verify Procedure
Given a group signature (u, r1, r2, r3, m, j), a verifier checks its validity as follows.
1) Compute
z1' = IDGM^{u·h(r2)} · g^{r1} · r2^{h(r2)·e^{T−j}} · y^{r3} mod n
    = IDGM^{h(r2)·u} · g^{q1+(k+sB)·u·h(r2)} · wBj^{h(r2)·u·e^{T−j}} · y^{r3} mod n
    = y^{q2−rB·u·h(r2)} · IDGM^{h(r2)·u} · g^{q1} · g^{sB·u·h(r2)} · g^{k·u·h(r2)} · (rB·IDGM·IDB)^{−h(r2)·d^{T−j}·e^{T−j}·u}
    = IDGM^{h(r2)·u} · g^{q1} · g^{sB·u·h(r2)} · IDB^{h(r2)·u} · (rB·IDGM·IDB)^{−u·h(r2)} · y^{−rB·u·h(r2)} · y^{q2}
    = g^{q1} · y^{q2}    (1)
2) Compute u' = h(z1', m) and check whether u' = u. If it holds, the verifier is convinced that (u, r1, r2, r3, m, j) is a valid group signature on m from a legitimate group member.
In case of a dispute, GM can open signature to reveal the actual identity of the signer who produced the signature. Given a signature (u, r1 , r2 , r3 , m, j), GM firstly checks the validity of the signature via the VERIFY procedure. Secondly, GM computes the following steps: Step 1: computes η = 1/(uh(r2 ))mod φ(n) .
uh(r )
h(r )eT −j
y r3 mod n. Step 2: computesz1 = IDGM 2 g r1 r2 2 η Step 3: checks IDB rB = (g r1 y r3 /z1 ) mod n . If there is duple (rB , IDB ) satisfying the above Step3, it is concluded that IDB is the actual identity of the signer.
A Novel Efficient Group Signature Scheme with Forward Security
3.7
297
Revoking Procedure
Suppose the membership certificate of the group member Bob need to be revoked at time period j, the group manager computes the following quantification: T −j
Rj = (rB IDB )d
mod n
and publishes duple (Rj , j)in the CRL(the Certificate Revocation List). Given a signature (u, r1 , r2 , r3 , m, j), when a verifier identifies whether the signature is produced by a revoked group member or not, he computes the following quantification
uh(r )
h(r )eT −j
y r3 mod n. Step 1: z1 = IDGM 2 g r1 r2 2 eT −j uh(r2 ) r1 r3 Step 2: z1 (Rj ) = g y mod n
(2)
For the signature (u, r1 , r2 , r3 , m, j), if the signature satisfies the above equation (2). We can conclude that the signature is revoked.
4
Security Analysis
In this subsection we show that our proposed group signature scheme is a secure group signature scheme and satisfies forward security. Correct: we can conclude that a produced group signature by a group member can be identified from equation (1) of the above Verifying Procedure. Anonymity: Given a group signature (u, r1 , r2 , r3 , m, j),z1 is generated through two random numbers q1 and q2 which are used once only and u = h(z1 , m) , so that we can infer that u is also a random number generated by random seed z1 . Any one (except for a group manager) cannot obtain any information about the identity of this signer from the group signature (u, r1 , r2 , r3 , m, j). Unlinkability: Given time period j, two different group signatures (u, r1 , r2 , r3 , m, j)and (u , r1 , r2 , r3 , m , j) , we can know that u(or u ) is a random number generated by random seed z1 , and uis different in each signing procedure and used once only, and u or random number q1 and q2 are included in r1 and r2 . However, an adversary cannot get the relation between the signature (u, r1 , r2 , r3 , m, j) and the signature (u , r1 , r2 , r3 , m , j) . Unforgeability: In this group signature scheme, the group manager is the most powerful forger in the sense. If the group manager wants to forge a signature at time period j, he chooses (z1 , r2 , r3 , j) (or (z1 , r2 , r1 , j)) and computes u = h(z1 , m). According to the equation (1), for solving r1 , he needs solve the discrete logarithm so that he cannot forge a group signature. Furthermore, as an adversary, because an adversary hasn’t a valid membership certificate, he cannot forge a group signature satisfying the verification procedure. And in view of the group manager, he cannot forge a valid group signature without knowing private k of group member. Forward Security: Assume an attacker breaks into a group member’s system in time period j and obtains the member’s membership certificate. Because
298
J. Zhang, Q. Wu, and Y. Wang
of the one-way property of f (x), the attacker cannot compute this member’s membership certificate corresponding to previous time period. Hence the attacker cannot generate the group signature corresponding to the previous time. Assume that the group member Bob is revoked at time period j, the group manager only revokes the group membership certificate of the time period j. then any valid signature with corresponding time period before j is still accepted. Because of the obtained signature (u, r1 , r2 , r3 , m, t), t < j. the signature (u, r1 , r2 , r3 , m, t) is still a valid signature on m and Bob would not need to produce a new signature on m. Revocation: When a user, say Bob, is expelled from the group starting from the time period i,Ri and i will be published in CRL. Assume a verifier has a signature for period j, where j ≥ i. To check whether the membership certificate of the j−i group member has been expelled, the verifier simply computes Rj = (Ri )e and T −j checks whether the equation (Rje )uh(r2 ) = (g r1 y r3 /z1 ) mod n holds or not. If it holds, it means that the signature has been revoked. Collision-resistant: Assume that two group members collude to forge a signature. Because they don’t know factorization of n and membership certificate of Bob, Furthermore, in Join phase, though the identification for each group member is computed by themselves according to number k , for two conspiracy group members, it is equivalent to forge group manager ElGamal signature to produce a new membership certificate for them. So that they cannot produce a valid membership certificate. Suppose that the group manager and a group member collude to produce the signature of a group member Bob. because they don’t know the private key k or (rB , sB , wBi )of group member Bob respectively, they cannot forge ’s signature. Efficiency: for the whole signature phase and verification phase, our scheme only needs 7 modular exponentiations, however, Song’s scheme needs more than 20 modular exponentiations. This implies that our scheme is very practical in large group applications. Table 1. The comparison of computational load of our scheme vs. Song Scheme
Signing phase computation Song’s Scheme 22E+1H+6M Proposed Scheme 3E+3H+5M
5
Verifying phase Total computation computation 14E+1H+6M 36E+2H+12M 4E+3M+1H 7E+8M+4H
Efficiency Analysis
In this section we show the efficiency of our scheme over that of Song scheme. In a signature scheme, the computational cost of signature is mainly determined by modular exponentiation operator. Let E, M and H respectively denote the
A Novel Efficient Group Signature Scheme with Forward Security
299
computational load for exponentiation, multiplication and hash. Then table 1 shows the comparison of computational load of our scheme vs. Song scheme. Signing phase and verifying phase in our scheme have less computation against Song’s scheme. Modular exponentiation is a complicated operator and plays a determinate role in a signature scheme. From the above data, we conclude that our scheme has computational advantage over that of Song. To the best of our knowledge, it takes the much least computation in group signature schemes. Hence, our proposed scheme is suitable to large group.
6
Conclusion
In this paper, we propose a new group signature scheme with forward-security. Our scheme satisfies not only the traditional security properties of the previous group signature schemes, but also forward security. Our scheme is efficient in the sense in that it is independent of the number of the group members and the size of group signature and the size of group key are independent of the number of time periods and the number of revoked members. Our scheme is a practical group signature scheme. Acknowledgments. The author would like to thank Dr. Wu Qianhong, Dr. Wang Jilin, Ms. Wu Menghong, Dr. Chen Zewen as well as the anonymous referees for their helpful comments.
References [1] D. Chaum, F. Heyst. Group Signature. Proceeding EUROCRYPT’91. SpringerVerlag, 1992, pp. 257–265. [2] G. Ateniese, J. Camenish, M. Joye, and G. Tsudik. A Practical and Provably Secure Coalition-Resistant Group signature Scheme. In M. Bellare, editor, Crypto’2000, vol(1880) of LNCS, Springer-Verlag, 2000, pp. 255–270. [3] G. Ateniese and G. Tsudik. Some Open Issues and New Direction in Group Signature. In Financial Cryptograph’99, 1999. [4] Ross Anderson. Invited Lecture, 4th ACM Computer and Communications Security, 1997. [5] Dawn Xiaodong Song, Practical forward secure group signature schemes. Proceedings of the 8th ACM conference on Computer and Communications Security, Pennsylvania, USA, November, pp. 225–234. [6] J. Camenish and M. Michels. A Group Signature with Improved Efficiency. K. Ohta and. Pei, editors, Asiacrypt’98.Vol 1514 of LNCS, Springer-Verlag,1999, pp. 160–174. [7] W. R. Lee, C. C. Chang. Efficient Group Signature Scheme Based on the Discrete Logarithm. IEE Proc. Computer Digital Technology, 1998, vol.145 (1), pp.15–18. [8] Constantin Popescu. An Efficient Group Signature Scheme for Large Groups. Studies in Informatics and Control. With Emphasis on Useful Applications of Advanced Technology, Vol.10 (1), 2001, pp. 3–9.
300
J. Zhang, Q. Wu, and Y. Wang
[9] Emmanuel Bresson and Jacques Stern. Efficient Revocation in Group Signature. PKC’2001, LNCS 1992, Springer-Verlag, Berlin Heidelberg 2001, pp. 190–206, 2001. [10] Michel Abdalla and Leonid Reyzin. A new forward secure digital signature scheme. In ASIACRYPT, Springer-Verlag, 2000, pp. 116–129. [11] Y. Tseng, J. Jan. A novel ID-based group signature, In T.L. Hwang and A.K. Lenstra, editors, 1998 international Computer Symposium, Workshop on Cryptology and Information Security, Tainan, 1998, pp. 159–164. [12] C. Popescu. Group signature schemes based on the difficulty of computation of approximate e-th roots, Proceedings of Protocols for Multimedia Systems (PROMS2000), Poland, pp. 325–331, 2000. [13] S. Kim, S.Park, D.Won,Group signatures for hierarchical multi-groups, Information Security Workshop, Lecture Notes in Computer Sciences 1396, SpringerVerlag, 1998, pp. 273–281. [14] M. Stadler, Publicly verifiable secret sharing, Advances in Cryptology, EUROCRYPT’96 lecture Notes in Computer Sciences 1070, Springer-Verlag, 1996, pp. 190–199. [15] A. Fiat and A. Shamir. How to prove yourself: practical solutions to identification and signature problems. In Advances in Cryptology – CRYPTO’86, vol. 263 of LNCS, pp. 186–194, Springer-Verlag, 1987. [16] S. Goldwasser, S. Micali, and R. Rivest. A digital signature scheme secure against adaptive chosen-message attacks. SIAM Journal on Computing, 17(2): 281–308, 1988. [17] J. Kilian and E. Petrank. Identity escrow. In Advances in Cryptology – CRYPTO’98, vol.1642 of LNCS, pp. 169–185, Springer-Verlag, 1998. [18] A. Lysyanskaya and Z. Ramzan. Group blind digital signatures: A scalable solution to electronic cash. In Financial Cryptography (FC’98), vol. 1465 of LNCS, pp. 184–197, Springer-Verlag, 1998. [19] R. Gennaro, H. Krawczyk, and T. Rabin. RSA-based Undeniable Signature. J. Cryptology, Volume (13)4, 2000, pp. 397–416.
Variations of Diffie-Hellman Problem Feng Bao, Robert H. Deng, and HuaFei Zhu Infocomm Security Department, Institute for Infocomm Research. 21 Heng Mui Keng Terrace, Singapore 119613. {baofeng, deng, huafei}@i2r.a-star.edu.sg
Abstract. This paper studies various computational and decisional Diffie-Hellman problems by providing reductions among them in the high granularity setting. We show that all three variations of computational Diffie-Hellman problem: square Diffie-Hellman problem, inverse Diffie-Hellman problem and divisible Diffie-Hellman problem, are equivalent with optimal reduction. Also, we are considering variations of the decisional Diffie-Hellman problem in single sample and polynomial samples settings, and we are able to show that all variations are equivalent except for the argument DDH ⇐ SDDH. We are not able to prove or disprove this statement, thus leave an interesting open problem. Keywords: Diffie-Hellman problem, Square Diffie-Hellman problem, Inverse Diffie-Hellman problem, Divisible Diffie-Hellman problem
1
Introduction
The Diffie-Hellman problem [9] is a golden mine for cryptographic purposes and is more and more studied. This problem is closely related to the difficult of computing the discrete logarithm problem over a cyclic group[11]. There are several works to study classical and variable Diffie-Hellman problems([13], [14], [21], [18]) in the generic model. For the decisional Diffie-Hellman problem setting, there is alternative, yet equivalent notation, called matching Diffie-Hellman problem, have been studied by Handschuh, Tsiounis and Yung [10]. These variations are by now the security of many protocols relying on ([1], [2], [5], [6],[8]). Tatsuaki Okamoto and David Pointcheval[16] introduce a new notion called the Gap-Problems, which can be considered as a dual to the class of the decision problems. While Sadeghi and Steinerhere [19] rigourously consider a set of DiffieHellman related problems by identifying a parameter termed granularity, which describes the underlying probabilistic space in an assumption. This paper studies various computational and decisional problems related to the Diffie-Hellman problems by providing reductions among them in the high granularity setting, i.e., we consider the variations of Diffie-Hellman problem defined over some cyclic group with explicit group structure. More precisely, we are interested in studying relationship among variations of Diffie-Hellman problem including computational and decisional cases in single and polynomial setting and try to obtain reductions that are efficient so that an advantage against one of these problems can be reached against the other one. S. Qing, D. Gollmann, and J. Zhou (Eds.): ICICS 2003, LNCS 2836, pp. 301–312, 2003. c Springer-Verlag Berlin Heidelberg 2003
302
F. Bao, R.H. Deng, and H. Zhu
The basic tools for relating the complexities of various problems are polynomial reductions and transformations. We say that a problem A reduces in polynomial time to another problem B, denoted by A ⇐ B, if and only if there is an algorithm for A which uses a subroutine for B, and each call to the subroutine for B counts as a single step, and the algorithm for A runs in polynomial-time. The latter implies that the subroutine for B can be called at most a polynomially bounded number of times. The practical implication comes from the following proposition: If A polynomially reduces to B and there is a polynomial time algorithm for B, then there is a polynomial time algorithm for A also. Specially, for considering variation of Diffie-Hellman problem in polynomial time sampling case, we need to define the conception of efficient constructing algorithm to meet the requirement of the standard hybrid technique. Our contributions: In this report, we are considering useful variations of Diffie-Hellman problem: square computational(and decisional) Diffie-Hellman problem, inverse computational(and decisional) Diffie-Hellman problem and divisible computational(and decisional) Diffie-Hellman problem. We are able to show that all variations of computational Diffie-Hellman problem are equivalent to the classic computational Diffie-Hellman problem if the order of a underlying cyclic group is a large prime. We remark that our reduction is efficient, that is an advantage against one of these problems can be reached against another one. Also, we are considering variations of the decisional Diffie-Hellman problem in single sample and polynomial samples settings, and we are able to show that all variations are equivalent except for the argument DDH ⇐ SDDH. We are not able to prove or disprove this statement, thus leave an interesting open problem in this report.
2
Variations of Computational Diffie-Hellman Problem
Let p be a large prime number such that the discrete logarithm problem defined in Zp∗ is hard. Let G ∈ Zp∗ be a cyclic group of prime order q and g is assumed to be a generator of G. Though out this paper, we assume that G is prime order, and security parameters p, q are defined as the fixed form p=2q + 1 and ord(g)=q. A remarkable computational problem has been defined on this kind of set by Diffie and Hellman [9]. More precisely, Diffie-Hellman assumption (CDH assumption) is referred to as the following statement: Computational Diffie-Hellman problem (CDH): On input g, g x , g y , computing g xy . An algorithm that solves the computational Diffie-Hellman problem is a probabilistic polynomial time Turing machine, on input g, g x , g y , outputs g xy with non-negligible probability. Computational Diffie-Hellman assumption means that there is no such a probabilistic polynomial time Turing machine. This assumption is believed to be true for many cyclic groups, such as the prime sub-group of the multiplicative group of finite fields.
Variations of Diffie-Hellman Problem
2.1
303
Square Computational Diffie-Hellman Assumption
Let G ∈ Zp∗ defined as above, we are interested in the square computational Diffie-Hellman problem, which has been studied at by a set of researchers already (see [3], [12],[13], [14] for more details). We remark that the reduction presented in this section emphasizes its efficient and optimal characteristic. Therefore our work is non-trivial indeed. Square computational Diffie-Hellman problem (SCDH): On input g, g x , com2 puting g x . An algorithm that solves the square computational Diffie-Hellman problem 2 is a probabilistic polynomial time Turing machine, on input g, g x , outputs g x with non-negligible probability. Square computational Diffie-Hellman assumption means that there is no such a probabilistic polynomial time Turing machine. Fortunately, we are able to argue that the SCDH assumption and CDH assumption are equivalent. SCDH ⇐ CDH Proof: Given an oracle A1 , on input g,g x , g y , outputs g xy , we want to show 2 that there exists an algorithm A2 , on input g x , outputs g x . Given a random value u := g r , we choose t1 , t2 ∈ Zq at random, and compute u1 = ut1 = g rt1 , 2 and u2 = ut2 = g rt2 . Therefore we are able to compute v = A1 (u1 , u2 )= g r t1 t2 2 with non-negligible probability. It follows that g r can be computed from v, t1 , t2 immediately with same advantage. CDH ⇐ SCDH 2 Proof: Given an oracle A2 , on input g, g x , outputs g x , we want to show that there exists an algorithm A1 , on input g, g x , g y , outputs g xy . Now given g x , we 2 choose s1 , s2 , t1 , t2 ∈ Zq at random and compute v1 := A2 (g x s1 ) =g (xs1 ) , v2 := 2 2 A2 ((g y )s2 ) =g (ys2 ) . Finally, we compute v3 := A2 (g xs1 t1 +ys2 t2 ) = g (xs1 t1 +ys2 t2 ) . Since s1 , s2 , t1 , t2 are known already, it follows that g xy can be computed from v1 , v2 , v3 , s1 , s2 , t1 , t2 immediately with same advantage. 2.2
Inverse Computational Diffie-Hellman Assumption
We are also interested in such a computational variation of computational DiffieHellman problem, called inverse computational Diffie-Hellman assumption (InvCDH assumption) first studied at [17]. Inverse computational Diffie-Hellman problem (InvCDH): On input g, g x , −1 outputs g x . An algorithm that solves the inverse computational Diffie-Hellman problem −1 is a probabilistic polynomial time Turing machine, on input g, g x , outputs g x with non-negligible probability. Inverse computational Diffie-Hellman assumption means that there is no such a probabilistic polynomial time Turing machine. Fortunately, we are able to argue that the SCDH assumption and InvCDH assumption are also equivalent. InvCDH ⇐ SCDH 2 Proof: Given an oracle A2 , on input g, g x , outputs g x , we want to show that −1 there exists an algorithm A3 , on input g x , outputs g x . Given a random value
304
F. Bao, R.H. Deng, and H. Zhu
g r , we set h1 ← g r and h2 ← g. Finally, we view (h1 , h2 ) as an input to the −2 −1 oracle A2 to obtain A2 (h1 , h2 ) = g r r . It follows that g r can be computed from A2 immediately with same advantage. SCDH ⇐ InvCDH −1 Proof: Given an oracle A3 , on input g, g x , outputs g x , we want to show 2 that there exists an algorithm A2 , on input g, g x , outputs g x . Now given g, g r , we set h1 ← g r and h2 ← g. Finally, we view (h1 , h2 ) as an input to the oracle −1 2 A3 to obtain A3 (h1 , h2 )= A3 (g r , (g r )r ). It follows that g r can be computed from A3 with the same advantage. 2.3
Divisible Computation Diifie-Hellman Assumption
Yet, there is another variation of CDH assumption, called divisible computation Diffie-Hellman assumption, which is interesting from point of views of both theoretical research and practice. Divisible computation Diifie-Hellman problem (DCDH problem): On random input g,g x , g y , computing g y/x . We refer this oracle to as divisional computation Diffie-Hellman problem. An algorithm that solves the divisible computational Diffie-Hellman problem is a probabilistic polynomial time Turing machine, on input g, g x , g y , outputs g x/y with non-negligible probability. Divisible computation Diffie-Hellman assumption means that there is no such a probabilistic polynomial time Turing machine. As desired, we are able to show that divisible computational DiifieHellman assumption is equivalent to computational Diffie-Hellman assumption: CDH ⇐ DCDH Proof: Suppose we are given an divisible computation Diffie-Hellman oracle denoted by A4 , on input g, g x , g y , outputs g y/x . We want to show that there exists an algorithm A1 , on input g, g x , g y , outputs g xy . Given g, g x , g y , we choose s1 , s2 , t1 , t2 ∈ Zq at random, and compute v1 := A4 (g, (g x )s1 , g s2 )=g xs1 /s2 , v2 : = A4 (g, g t1 , (g y )t2 = g t1 /(yt2 ) . Finally, we compute v := A3 (v1 , v2 ) = g (xys1 t2 )/(s2 t1 ) . Since s1 , s2 , t1 , t2 are known already, it follows that g xy can be computed from v, s1 , s2 , t1 , t2 immediately with same advantage. DCDH ⇐ CDH Proof: Suppose we are given an computational Diffie-Hellman oracle A1 , on input g, g x , g y , it outputs g xy . We want to show that there exists an algorithm A4 , on input g, g x , g y , outputs g y/x . Suppose we are given a triple g, g x , g y now. By assumption, we are given a computational Diffie-Hellman oracle A1 , consequently, we are able to construct an InvCDH oracle A3 . Viewing g, g y as −1 input to A3 to obtain v := g y . Finally, one views g, g x , v as input to A1 to x/y obtain g . We prove the fact that if the underlying group with prime order q, all variations of computational Diffie-Hellman problem are equivalent, i.e., CDH ⇔ SCDH ⇔ InvCDH ⇔ DCDH.
3 Variations of Decisional Diffie-Hellman Problem
In this section, we study variations of the decisional Diffie-Hellman problem. It has been known for years that various DDH-based problems have been published many times and commented on from many angles. Reductions among them were recently given in the work of Sadeghi and Steiner [19] in the generic model, whereas the present paper provides reductions in the high granularity setting. Before formally studying the relationships among these variations, we provide formal definitions of the related problems.
3.1 Formal Definitions on Variations of Decisional Diffie-Hellman Problem
Decisional Diffie-Hellman assumption (DDH): Let G be a large cyclic group of prime order q as defined above. We consider the following two distributions:
– Given a Diffie-Hellman quadruple g, g^x, g^y and g^{xy}, where x, y ∈ Zq are chosen uniformly at random;
– Given a random quadruple g, g^x, g^y and g^r, where x, y, r ∈ Zq are chosen uniformly at random.
An algorithm that solves the decisional Diffie-Hellman problem is a statistical test that can efficiently distinguish these two distributions. The decisional Diffie-Hellman assumption means that there is no such polynomial statistical test. This assumption is believed to be true for many cyclic groups, such as the prime-order subgroup of the multiplicative group of a finite field.
Square decisional Diffie-Hellman assumption (SDDH): Let G be a large cyclic group of prime order q as defined above. We consider the following two distributions:
– Given a square Diffie-Hellman triple g, g^x and g^{x^2}, where x ∈ Zq is chosen uniformly at random;
– Given a random triple g, g^x and g^r, where x, r ∈ Zq are chosen uniformly at random.
An algorithm that solves the square decisional Diffie-Hellman problem (SDDH for short) is a statistical test that can efficiently distinguish these two distributions. The square decisional Diffie-Hellman assumption means that there is no such polynomial statistical test.
Inverse decisional Diffie-Hellman assumption (InvDDH): Let G be a large cyclic group of prime order q as defined above. We consider the following two distributions:
– Given an inverse Diffie-Hellman triple g, g^x and g^{x^{-1}}, where x ∈ Zq is chosen uniformly at random;
– Given a random triple g, g^x and g^r, where x, r ∈ Zq are chosen uniformly at random.
An algorithm that solves the inverse decisional Diffie-Hellman problem (InvDDH for short) is a statistical test that can efficiently distinguish these two distributions. The inverse decisional Diffie-Hellman assumption means that there is no such polynomial statistical test.
Divisible decision Diffie-Hellman assumption (DDDH): Let G be a large cyclic group of prime order q as defined above. We consider the following two distributions:
– Given a divisible Diffie-Hellman quadruple g, g^x, g^y and g^{x/y}, where x, y ∈ Zq are chosen uniformly at random;
– Given a random quadruple g, g^x, g^y and g^r, where x, y, r ∈ Zq are chosen uniformly at random.
An algorithm that solves the divisible decision Diffie-Hellman problem (DDDH for short) is a statistical test that can efficiently distinguish these two distributions. The divisible decision Diffie-Hellman assumption means that there is no such polynomial statistical test.
3.2 Relations among Variations of Decisional Diffie-Hellman Assumption
Analogous to the arguments above, we consider relations among the variations of the decisional Diffie-Hellman assumption. We first prove the equivalence between the InvDDH and SDDH assumptions.
InvDDH ⇐ SDDH. Proof: Given a distinguisher D1 which is able to tell a square Diffie-Hellman triple from a random triple with non-negligible probability, we want to show that there exists a polynomial distinguisher D2 which is able to tell an inverse Diffie-Hellman triple from a random triple with non-negligible advantage. We are given g, g^x and g^r, where r is either x^{-1} or a random string. Setting h1 ← (g^r)^s, h2 ← g^s and h3 ← (g^x)^s, where s ∈ Zq is a random string, we remark that if r = x^{-1}, then h1 = (g^{x^{-1}})^s, h2 = (g^{x^{-1}})^{sx} and h3 = (g^{x^{-1}})^{s x^2}, i.e., (h1, h2, h3) is a square Diffie-Hellman triple with respect to the base h1. If r is random, then (h1, h2, h3) is a random triple. We then view (h1, h2, h3) as input to the oracle D1 to obtain the correct value b ∈ {0, 1} (b = 0 if the answer of D1 is "SDDH triple", and 1 otherwise). Therefore, we have a polynomial distinguisher D2 which is able to tell an inverse Diffie-Hellman triple from a random triple with the same non-negligible advantage.
SDDH ⇐ InvDDH. Proof: Given a distinguisher D2, which is able to tell an inverse decisional Diffie-Hellman triple from a random triple with non-negligible advantage, we want to show that there exists a distinguisher D1 that is able to tell a square decisional Diffie-Hellman triple from a random triple with non-negligible advantage. We are given g, g^x, g^r, where either r = x^2 or r ∈ Zq is a random string. Setting h1 ← g^x, h2 ← (g^r)^s and h3 ← g^{s^{-1}}, where s ∈ Zq is a random string, we remark that if r = x^2, then h1 = g^x, h2 = (g^x)^{xs} and h3 = (g^x)^{(xs)^{-1}}, i.e., (h1, h2, h3) is an inverse Diffie-Hellman triple with respect to the base h1. If r is a random string, then (h1, h2, h3) is a random triple. We view (h1, h2, h3) as input to the inverse decisional Diffie-Hellman distinguisher D2 to obtain the correct value b ∈ {0, 1} (b = 0 if the answer of D2 is an
InvDDH triple, and 1 otherwise). Therefore, we have a polynomial distinguisher D1 which is able to tell a square Diffie-Hellman triple from a random triple with the same non-negligible advantage. Based on the above arguments, we know that SDDH ⇔ InvDDH. We then consider the equivalence between DDDH and DDH.
DDDH ⇐ DDH. Proof: Given (g, g^x, g^y, g^{x/y}), one simply submits (g, g^y, g^{x/y}, g^x) to the DDH distinguisher to decide the divisible format of the quadruple.
DDH ⇐ DDDH. Proof: Conversely, given (g, g^x, g^y, g^{xy}), one queries the DDDH distinguisher with (g, g^{xy}, g^y, g^x) and returns its answer (queries can easily be randomized if needed). Therefore, we know that DDDH ⇔ DDH.
Finally, we consider whether DDH ⇔ SDDH or not. First, we show the following fact.
SDDH ⇐ DDH. Proof: Given a distinguisher D which is able to tell a standard decisional Diffie-Hellman quadruple from a random quadruple with non-negligible advantage, we want to show that there exists a distinguisher D1 that is able to tell a square decisional Diffie-Hellman triple from a random triple with non-negligible advantage. Suppose we are given a triple (g, g^x, g^z), where g^z is either of the form g^y or g^{x^2}. We choose two strings s, t at random and compute u ← (g^x)^s, v ← (g^x)^t, w ← (g^z)^{st}. We remark that if (g, g^x, g^z) is a square Diffie-Hellman triple then (g, u, v, w) is a Diffie-Hellman quadruple, and if (g, g^x, g^z) is a random triple then (g, u, v, w) is a random quadruple. Finally, we view the quadruple (g, u, v, w) as an input to the distinguisher D to obtain the correct value b ∈ {0, 1} (b = 0 if the answer of D is "DDH quadruple", and 1 otherwise). Therefore, if D is able to distinguish a Diffie-Hellman quadruple from a random quadruple with non-negligible advantage, then there is a square Diffie-Hellman distinguisher D1 that is able to tell a square decisional Diffie-Hellman triple from a random triple with the same non-negligible advantage.
Unfortunately, we are not able to show that DDH ⇐ SDDH. This leaves an interesting research problem. Recalling that the computational Diffie-Hellman problem (CDH assumption) is equivalent to the square computational Diffie-Hellman problem (SCDH assumption), we believe the corresponding conjecture is true if the underlying group G is a subgroup of Zp*, e.g., |G| = q and p = 2q + 1.
Conjecture: Under this assumption on the group structure of G, DDH is equivalent to SDDH.
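The triple-to-quadruple mapping used in this SDDH ⇐ DDH argument can be written out directly. The following is a minimal sketch in the same toy order-11 subgroup used earlier; the DDH distinguisher to which the quadruple would be fed is assumed and not shown.

# A sketch of the re-randomization used in SDDH <= DDH: a candidate square
# triple (g, g^x, g^z) becomes (g, u, v, w); if z = x^2 the result is a
# Diffie-Hellman quadruple, otherwise a random one.
import random

p, q, g = 23, 11, 4

def sddh_to_ddh_instance(gx, gz):
    s = random.randrange(1, q)
    t = random.randrange(1, q)
    u = pow(gx, s, p)          # g^(x*s)
    v = pow(gx, t, p)          # g^(x*t)
    w = pow(gz, s * t % q, p)  # g^(z*s*t) = g^((x*s)*(x*t)) when z = x^2
    return (g, u, v, w)        # feed this to any DDH distinguisher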
3.3 Polynomial Samples Setting
We are interested in generalized variations of the Diffie-Hellman problem. These assumptions play a central role in the construction of dynamic group protocols ([1], [3], [6], [7], [19], [20]). In this section, we consider variations of the decisional Diffie-Hellman problem in the polynomial samples setting. We study these generalized variations by first providing some related notions, and then we present optimal reductions from one to another.
Generalized decisional Diffie-Hellman assumption: for any k, the following distributions are indistinguishable:
– The distribution R2k of random tuples (g1, ..., gk, u1, ..., uk) ∈ G^{2k}, where g1, ..., gk and u1, ..., uk are uniformly distributed in G;
– The distribution D2k of tuples (g1, ..., gk, u1, ..., uk) ∈ G^{2k}, where g1, ..., gk are uniformly distributed in G and u1 = g1^r, ..., uk = gk^r for a random r ∈ Zq.
An algorithm that solves the generalized decisional Diffie-Hellman problem is a statistical test that can efficiently distinguish these two distributions. The generalized decisional Diffie-Hellman assumption means that there is no such polynomial statistical test. Similarly, one can extend the other variations of the decisional Diffie-Hellman problem to the general case.
Generalized square decisional Diffie-Hellman assumption (GSDDH): Let G be a large cyclic group of prime order q as defined above. We consider the following two distributions:
– The distribution R3k of random tuples (g1, ..., gk, g1^{x1}, ..., gk^{xk}, u1, ..., uk) ∈ G^{3k}, where g1, ..., gk and u1, ..., uk are uniformly distributed in G and x1, ..., xk are uniformly distributed in Zq;
– The distribution D3k of tuples (g1, ..., gk, g1^{x1}, ..., gk^{xk}, u1, ..., uk) ∈ G^{3k}, where g1, ..., gk, g1^{x1}, ..., gk^{xk} are uniformly distributed while u1 = g1^{x1^2}, ..., uk = gk^{xk^2} for each xi uniformly distributed in Zq.
An algorithm that solves the generalized square decisional Diffie-Hellman problem is a statistical test that can efficiently distinguish these two distributions. The generalized square decisional Diffie-Hellman assumption means that there is no such polynomial statistical test.
Generalized inverse decisional Diffie-Hellman assumption (GInvDDH): Let G be a large cyclic group of prime order q as defined above. We consider the following two distributions:
– The distribution R3k of random tuples (g1, ..., gk, g1^{x1}, ..., gk^{xk}, u1, ..., uk) ∈ G^{3k}, where g1, ..., gk and u1, ..., uk are uniformly distributed in G and x1, ..., xk are uniformly distributed in Zq;
– The distribution D3k of tuples (g1, ..., gk, g1^{x1}, ..., gk^{xk}, u1, ..., uk) ∈ G^{3k}, where g1, ..., gk, g1^{x1}, ..., gk^{xk} are uniformly distributed while u1 = g1^{x1^{-1}}, ..., uk = gk^{xk^{-1}} for each xi uniformly distributed in Zq.
An algorithm that solves the generalized inverse decisional Diffie-Hellman problem (GInvDDH for short) is a statistical test that can efficiently distinguish these two distributions. The generalized inverse decisional Diffie-Hellman assumption means that there is no such polynomial statistical test.
Now we are able to show that the generalized decisional Diffie-Hellman assumption holds even in the polynomial sampling setting, under the standard DDH assumption. The argument is by mathematical induction.
6-DDH ⇐ 4-DDH. Proof: Let us consider a machine M that can get a non-negligible advantage in distinguishing D4 from R4. We define a 6-DDH distinguisher M' which runs as follows: given any six-tuple (g1, g2, g3, u1, u2, u3), which comes from either R6 or D6, M' runs M on the quadruple (g1g2, g3, u1u2, u3) and simply forwards the answer. As shown by the equations below, if (g1, g2, g3, u1, u2, u3) follows the distribution D6, then (g1g2, g3, u1u2, u3) follows the distribution D4; the same holds between R6 and R4. As a consequence, our new machine gets the same advantage in distinguishing D6 from R6 as M has in distinguishing D4 from R4, performing just one more multiplication in G, where G is assumed to be a cyclic group of order q and g a generator of this group. We denote the output of M (respectively M') as follows: if the input comes from D4 (respectively D6), it outputs 1, and it outputs 0 if the input tuple comes from R4 (respectively R6).

Pr[M(g1g2, g3, u1u2, u3) = 1 | (g1, g2, g3, u1, u2, u3) ∈ R6]
= Pr[M(g^{x1+x2}, g^{x3}, g^{x4+x5}, g^{x6}) = 1 | x1, x2, x3, x4, x5, x6 ∈ Zq]
= Pr[M(g^x, g^y, g^z, g^r) = 1 | x, y, z, r ∈ Zq]
= Pr[M(g1, g2, u1, u2) = 1 | (g1, g2, u1, u2) ∈ R4]

and

Pr[M(g1g2, g3, u1u2, u3) = 1 | (g1, g2, g3, u1, u2, u3) ∈ D6]
= Pr[M(g^{x1+x2}, g^{x3}, g^{r(x1+x2)}, g^{rx3}) = 1 | x1, x2, x3, r ∈ Zq]
= Pr[M(g^x, g^y, g^{rx}, g^{ry}) = 1 | x, y, r ∈ Zq]
= Pr[M(g1, g2, u1, u2) = 1 | (g1, g2, u1, u2) ∈ D4]

4-DDH ⇐ 6-DDH. Proof: Let us consider a machine M that can get a non-negligible advantage in distinguishing D6 from R6. We define a 4-DDH distinguisher M' which runs as follows: on a given quadruple (g1, g2, u1, u2), M' runs M on the six-tuple (g1, g2, g1^s g2^t, u1, u2, u1^s u2^t), for randomly chosen s and t in Zq, and simply forwards the answer. Once again, the advantage of our new distinguisher M' is exactly the same as the advantage of M, with only a few more computations; we assume again that g is a generator of G, and we use the fact that Zq is a field.

Pr[M'(g1, g2, u1, u2) = 1 | (g1, g2, u1, u2) ∈ D4]
= Pr[M(g^{x1}, g^{x2}, g^{sx1+tx2}, g^{rx1}, g^{rx2}, g^{srx1+trx2}) = 1 | x1, x2, r, s, t ∈ Zq]
= Pr[M(g^{x1}, g^{x2}, g^{x3}, g^{rx1}, g^{rx2}, g^{rx3}) = 1 | x1, x2, x3, r ∈ Zq]
= Pr[M(g1, g2, g3, u1, u2, u3) = 1 | (g1, g2, g3, u1, u2, u3) ∈ D6]

and
Pr[M'(g1, g2, u1, u2) = 1 | (g1, g2, u1, u2) ∈ R4]
= Pr[M(g^{x1}, g^{x2}, g^{sx1+tx2}, g^{y1}, g^{y2}, g^{sy1+ty2}) = 1 | x1, x2, s, t, y1, y2 ∈ Zq]
= Pr[M(g^{x1}, g^{x2}, g^{x3}, g^{y1}, g^{y2}, g^{y3}) = 1 | (x1, x2, x3, y1, y2, y3) ∈ Zq^6]
= Pr[M(g1, g2, g3, u1, u2, u3) = 1 | (g1, g2, g3, u1, u2, u3) ∈ R6]

Based on the above argument, we obtain the useful result that the decisional Diffie-Hellman problems 4-DDH and 6-DDH are equivalent. We know that the obtained reductions are optimal, since an advantage against one of these problems yields the same advantage against the other one. Therefore, under the sole classical decisional Diffie-Hellman assumption, the generalized decisional Diffie-Hellman assumption holds for any k. With the same technique, the generalized square decisional Diffie-Hellman assumption and the generalized inverse decisional Diffie-Hellman assumption can be proved as well. We also remark that the standard hybrid technique provides an alternative approach to proving the decisional Diffie-Hellman assumption in the polynomial sampling setting.
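The two tuple transformations used in these reductions are purely algebraic and can be sketched directly. The helper names and the toy modulus below are illustrative only; the distinguishers M and M' themselves are assumed.

# A sketch of the transformations in the 6-DDH <=> 4-DDH arguments above.
# six_to_four is what the 6-DDH distinguisher M' feeds to a 4-DDH
# distinguisher M; four_to_six is the converse with fresh blinding s, t.
import random

p, q = 23, 11

def six_to_four(g1, g2, g3, u1, u2, u3):
    return (g1 * g2 % p, g3, u1 * u2 % p, u3)

def four_to_six(g1, g2, u1, u2):
    s = random.randrange(1, q)
    t = random.randrange(1, q)
    g3 = pow(g1, s, p) * pow(g2, t, p) % p     # g1^s * g2^t
    u3 = pow(u1, s, p) * pow(u2, t, p) % p     # u1^s * u2^t
    return (g1, g2, g3, u1, u2, u3)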
4 Conclusions
We have studied the relationships among variations of the Diffie-Hellman problem, in both the computational and the decisional case, with efficient reductions. We have shown that all four variations of the computational Diffie-Hellman problem are equivalent if the order of the underlying cyclic group is a large prime. We have also considered variations of the decisional Diffie-Hellman problem in the single-sample and polynomial-samples settings. We are able to show that all variations are equivalent except for the direction DDH ⇐ SDDH, which is left as an interesting open problem.
References 1. Eli Biham, Dan Boneh, and Omer Reingold. Breaking generalized Diffie Hellman modulo a composite is no easier than factoring. Information Processing Letters, 70:83–87, 1999. 2. Bresson, Chevassut and Pointcheval, The Group Diffie-Hellman Problems, SAC’02. 3. Mike Burmester, Yvo Desmedt, and Jennifer Seberry. Equitable key escrow with limited time span (or, how to enforce time expiration cryptographically). In K. Ohta and D. Pei, editors, Advances in Cryptology – ASIACRYPT ’98, number 1514 in Lecture Notes in Computer Science, pages 380–391. Springer Verlag, Berlin Germany, 1998. 4. D.Beaver: Foundations of Secure Interactive Computing. CRYPTO 1991: 377–391. 5. Dan Boneh. The Decision Diffie-Hellman problem. In Third Algorithmic Number Theory Symposium, number 1423 in Lecture Notes in Computer Science, pages 48–63. Springer Verlag, Berlin Germany, 1998.
6. Christian Cachin, Klaus Kursawe, and Victor Shoup. Random oracles in Constantinople: Practical asynchronous Byzantine agreement using cryptography. In Proceedings of the 19th Annual ACM Symposium on Principles of Distributed Computing, Portland, Oregon, July 2000. ACM. Full version appeared as Cryptology ePrint Archive Report 2000/034 (2000/7/7). 7. Jan Camenisch, Ueli Maurer, and Markus Stadler. Digital payment systems with passive anonymity evoking trustees. In E. Bertino, H. Kurth, G. Martella, and E. Montolivo, editors, Proceedings of the Fourth European Symposium on Research in Computer Security (ESORICS), number 1146 in Lecture Notes in Computer Science, pages 33–43, Rome, Italy, September 1996. Springer Verlag, Berlin Germany. 8. Ronald Cramer and Victor Shoup. A practical public key cryptosystem provably secure against adaptive chosen ciphertext attack. In Hugo Krawczyk, editor, Advances in Cryptology-CRYPTO’98, number 1462 in Lecture Notes in Computer Science, pages 13–25. International Association for Cryptologic Research, Springer Verlag, Berlin Germany, 1998. 9. Whitfield Diffie and Martin Hellman. New directions in cryptography. IEEE Transactions on Information Theory, IT No.2(6):644–654, November 1976. 10. Helena Handschuh, Yiannis Tsiounis, and Moti Yung. Decision oracles are equivalent to matching oracles. In International Workshop on Practice and Theory in Public Key Cryptography ’99 (PKC ’99), number 1560 in Lecture Notes in Computer Science, Kamakura, Japan, March 1999. Springer Verlag, Berlin Germany. 11. Kevin S. McCurley. The discrete logarithm problem. In Carl Pomerance, editor, Cryptology and Computational Number Theory, volume 42 of Proceedings of Symposia in Applied Mathematics, pages 49–74, Providence, 1990. American Mathematical Society. 12. Ueli M. Maurer and Stefan Wolf. Diffie-Hellman oracles. Neal Koblitz, editor. Advances in Cryptology-CRYPTO ’96, number 1109 in Lecture Notes in Computer Science, pages 268–282. International Association for Cryptologic Research, Springer Verlag, Berlin Germany, 1996. 13. Ueli M. Maurer and Stefan Wolf. Lower bounds on generic algorithms in groups. In Kaisa Nyberg, editor, Advances in Cryptology-EUROCRYPT ’98, number 1403 in Lecture Notes in Computer Science, pages 72–84. International Association for Cryptologic Research, Springer Verlag, Berlin Germany, 1998. 14. Ueli M. Maurer and Stefan Wolf. Diffie-Hellman, Decision Diffie-Hellman, and discrete logarithms. In IEEE Symposium on Information Theory, page 327, Cambridge, USA, August 1998. 15. Moni Naor and Omer Reingold. Number theoretic constructions of efficient pseudorandom functions. In 38th Symposium on Foundations of Computer Science (FOCS), pages 458–467. IEEE Computer Society Press, 1997. 16. Tatsuaki Okamoto and David Pointcheval, The Gap-Problems: a New Class of Problems for the Security of Cryptographic Schemes. Proceedings of the 2001 International Workshop on Practice and Theory in Public Key Cryptography (PKC’2001)(13-15 February 2001, Cheju Island, South Korea) K. Kim Ed., pages 104–118, LNCS 1992, Springer-Verlag, 2001. 17. Birgit Pfitzmann and Ahmadeza Sadeghi. Anonymous fingerprinting with direct non-repudiation. T. Okamoto, editor. Advances in Cryptology – ASIACRYPT ’2000, number 1976 in Lecture Notes in Computer Science, Kyoto, Japan, 2000, pages 401–414. International Association for Cryptologic Research, Springer Verlag, Berlin Germany.
18. Victor Shoup. Lower bounds for discrete logarithms and related problems. In Walter Fumy, editor, Advances in Cryptology-EUROCRYPT’97, number 1233 in Lecture Notes in Computer Science, pages 256–266. International Association for Cryptologic Research, Springer Verlag, Berlin Germany, 1997. 19. Ahmad-Reza Sadeghi, Michael Steiner: Assumptions Related to Discrete Logarithms: Why Subtleties Make a Real Difference; Eurocrypt 2001, LNCS 2045, Springer-Verlag, May 2001, 243–260. 20. Michael Steiner, Gene Tsudik, and Michael Waidner. Key agreement in dynamic peer groups. IEEE Transactions on Parallel and Distributed Systems, 11(8):769– 780, August 2000. 21. Stefan Wolf. Information theoretically and Computationally Secure Key Agreement in Cryptography. PhD thesis, ETH Zurich, 1999.
A Study on the Covert Channel Detection of TCP/IP Header Using Support Vector Machine
Taeshik Sohn(1), JungTaek Seo(2), and Jongsub Moon(1)
(1) Center for Information Security Technologies, Korea University, Seoul, Korea {743zh2k,jsmoon}@korea.ac.kr
(2) National Security Research Institute, ETRI, Daejeon, Korea [email protected]
Abstract. Nowadays, threats to information security have become a big issue in Internet environments. Various security solutions, such as IDS, firewalls and VPNs, are used as countermeasures against such problems. However, the TCP/IP protocol on which the Internet is based has inherent vulnerabilities of its own. In particular, it is possible to establish a covert channel using TCP/IP header fields such as identification, sequence number, acknowledgement number, timestamp and so on [3]. In this paper, we focus on covert channels using the identification field of the IP header and the sequence number field of the TCP header. To detect such covert channels, our approach uses a Support Vector Machine, which has excellent performance in pattern classification problems. Our experiments showed that the proposed method could discern the abnormal cases (including covert channels) from normal TCP/IP traffic using a Support Vector Machine.
Keywords: Intrusion detection, covert channel, support vector machine, TCP/IP protocol security
1 Introduction
These days, the Internet environment has many information security problems as its networks grow rapidly. Various solutions for security protection, such as IDS, firewalls and VPNs, have therefore evolved. Although these solutions are widely used, systems are still vulnerable due to problems in the protocols themselves or to defects in the security solutions. One such vulnerability is the possibility of covert channel creation. A covert channel is defined as a communication channel used by a process to transmit information by methods that violate the system's security policy [1]. Among the many TCP/IP covert channel schemes, this paper analyzes the attack methods that transmit covert data using the identification field and the sequence number field of the TCP/IP header [3]. We then use a Support Vector Machine (SVM) to detect such TCP/IP covert channels. SVM, a kind of universal feed-forward network proposed by Vapnik in
This research is supported by Korea University Grant
1995 [4], is efficient in complex pattern recognition and classification; in particular, it is well suited to binary classification problems [4][13]. This paper is organized as follows: Section 2 addresses related work on covert channel techniques. Section 3 describes the background of the SVM. Section 4 analyzes covert channels in the TCP/IP header. Section 5 describes our detection approach. Experiments are explained in Section 6, followed by conclusions and future work in Section 7.
2 Related Work
TCP/IP is described in RFC 791 and RFC 793, and many covert channels have been identified in the TCP/IP protocol. A security analysis of TCP/IP is found in [10]. Papers [2] and [3] describe related work on establishing various covert channels, and covert channels are discussed more generally in a variety of papers. A general survey of information-hiding techniques is given in "Information Hiding - A Survey." John McHugh [2] provides a wealth of information on analyzing a system for covert channels in "Covert Channel Analysis." In particular, in "Covert Channels in the TCP/IP Protocol Suite" [3], Craig Rowland describes the possibility of passing covert data in the IP identification field, the initial sequence number field and the TCP acknowledgement sequence number field, and he programmed a simple proof-of-concept raw socket implementation. "Covert Messaging Through TCP Timestamps" [15] describes a tunnel using the timestamp field of the TCP header.
3 Support Vector Machine
3.1 Background
A Support Vector Machine is a learning machine that plots the training vectors in high-dimensional feature space, labeling each vector by its class. The SVM views the classification problem as a quadratic optimization problem. It combines generalization control with a technique to avoid the ”curse of dimensionality” by placing an upper bound on a margin between the different classes, making it a practical tool for a large and dynamic data set. SVM classifies data by determining a set of support vectors, which are members of the set of training inputs that outline a hyper plane in feature space. The SVM is based on the idea of structural risk minimization, which minimizes the generalization error, i.e. true error on unseen examples. The number of free parameters used in the SVM depends on the margin that separates the data points to classes but not on the number of input features, thus SVM does not require a reduction in the number of features in order to avoid overfitting. SVM provides a generic mechanism to fit the data within a surface of a hyperplane of a class through the use of a kernel function. The user may provide a kernel function, such as a linear, polynomial, or sigmoid curve, to the SVM during the
training process, which selects support vectors along the surface of the function. This capability allows a broader range of problems to be classified. The primary advantage of the SVM for binary classification and regression is that it provides a classifier with minimal VC-dimension, which implies a low expected probability of generalization errors [4][5][6].
3.2 SVM for Classification
In this section we review some basic ideas of support vector machines; for details about the SVM for classification and nonlinear function estimation, see [8][9][13]. Given the training data set {(x_i, d_i)}_{i=1}^{l}, with input data x_i ∈ R^N and corresponding binary class labels d_i ∈ {−1, 1}, the SVM classifier formulation starts from the following assumption: the classes represented by the subsets d_i = 1 and d_i = −1 are linearly separable, i.e., there exist w ∈ R^N and b ∈ R such that

  w^T x_i + b > 0 for d_i = +1,
  w^T x_i + b < 0 for d_i = −1.    (1)

The goal of the SVM is to find an optimal hyperplane for which the margin of separation ρ is maximized, where ρ is defined as the separation between the separating hyperplane and the closest data point. If the optimal hyperplane is defined by w^T x + b_0 = 0, then the function g(x) = w^T x + b_0 gives a measure of the distance from x to the optimal hyperplane. Support vectors are the data points x^(s) that lie closest to the decision surface. For a support vector x^(s) and the canonical optimal hyperplane g, we have

  r = g(x^(s)) / ||w_0|| = +1/||w_0|| for d^(s) = +1,  −1/||w_0|| for d^(s) = −1.    (2)

Since the margin of separation satisfies ρ ∝ 1/||w_0||, ||w_0|| should be minimal to achieve the maximal separation margin. The mathematical formulation for finding the canonical optimal separating hyperplane, given the training data set {(x_i, d_i)}_{i=1}^{l}, is the following quadratic problem:

  minimize τ(w, ξ) = (1/2)||w||^2 + c Σ_{i=1}^{l} ξ_i
  subject to d_i(w^T x_i + b) ≥ 1 − ξ_i, ξ_i ≥ 0, i = 1, ..., l.    (3)

Note that the global minimum of the above problem must exist, because φ(w) = (1/2)||w_0||^2 is convex in w and the constraints are linear in w and b. This constrained optimization problem is dealt with by introducing Lagrange multipliers α_i ≥ 0 and a Lagrangian function

  L(w, b, ξ; α, ν) = τ(w, ξ) − Σ_{i=1}^{l} α_i [d_i(w^T x_i + b) − 1 + ξ_i] − Σ_{i=1}^{l} ν_i ξ_i,    (4)

which leads to

  ∂L/∂w = 0 ⟺ w = Σ_{i=1}^{l} α_i d_i x_i,    (5)
  ∂L/∂b = 0 ⟺ Σ_{i=1}^{l} α_i d_i = 0,    (6)
  ∂L/∂ξ_k = 0, for 0 ≤ α_k ≤ c, k = 1, ..., l.    (7)

The solution vector thus has an expansion in terms of a subset of the training patterns, namely those patterns whose α_i is non-zero, called (as previously defined) support vectors. By the Karush-Kuhn-Tucker complementarity conditions we have

  α_i [d_i(w^T x_i + b) − 1] = 0 for i = 1, ..., l.    (8)

By substituting (5), (6) and (7) into L (equation (4)), the problem becomes: find multipliers α_i which maximize

  Θ(α) = Σ_{i=1}^{l} α_i − (1/2) Σ_{i=1}^{l} Σ_{j=1}^{l} α_i α_j d_i d_j ⟨x_i · x_j⟩    (9)

subject to

  0 ≤ α_i ≤ c, i = 1, ..., l, and Σ_{i=1}^{l} α_i d_i = 0.    (10)

The hyperplane decision function can thus be written as

  f(x) = sgn( Σ_{i=1}^{l} d_i α_i (x · x_i) + b ),    (11)

where b is computed using (8). To construct the SVM, the optimal hyperplane algorithm has to be augmented by a method for computing dot products in feature spaces nonlinearly related to the input space. The basic idea is to map the data into some other dot product space F (called the feature space) via a nonlinear map φ and to perform the above linear algorithm in F; i.e., for nonseparable data {(x_i, d_i)}_{i=1}^{l}, where x_i ∈ R^N and d_i ∈ {+1, −1}, preprocess the data with

  φ : R^N → F, x ↦ φ(x),    (12)

where the dimension of F may be much larger than N. Here w and φ(x_i) need not be computed explicitly. According to Mercer's theorem,

  ⟨φ(x_i), φ(x_j)⟩ = K(x_i, x_j),    (13)

and K(x, y) can be computed easily on the input space. Finally the nonlinear SVM classifier becomes

  f(x) = sgn( Σ_{i=1}^{l} α_i d_i K(x_i, x) + b ).    (14)

Several choices for the kernel K(·, ·) are possible:
  K(x, y) = y^T x : Linear SVM
  K(x, y) = (y^T x + 1)^d : Polynomial SVM of degree d
  K(x, y) = exp{−||x − x_k||^2 / (2σ^2)} : RBF SVM
  K(x, y) = tanh(k y^T x + θ) : MLP SVM
4 An Analysis of Covert Channels in a TCP/IP Header
The TCP/IP header contains a number of fields in which information can be stored and sent to a remote host in a covert manner. Within each header there is a multitude of fields that are not used for normal transmission or are optional fields set as needed by the sender of the datagrams. An analysis of the fields of a typical TCP/IP header that are either unused or optional reveals many possibilities for data to be stored and transmitted in them. For our purposes, we focus on encapsulating data in the mandatory fields. This is not because they are better than the optional fields; rather, these fields are less likely to be altered in transit than the IP or TCP options fields, which are sometimes changed or stripped off by packet filtering mechanisms or through fragment re-assembly. Therefore we encode and decode the following fields: the IP packet identification field and the TCP initial sequence number field [3]. The identification field encoding method simply replaces the IP identification field with the numerical ASCII representation of the character to be encoded. This allows for easy transmission to a remote host, which simply reads the IP identification field and translates the encoded ASCII value back to its printable counterpart. For the identification field of the IP header, for example, the character 'H' (ASCII 72) is encoded as 72*256 = 18432. A covert channel server receiving such packets on a specific port divides the identification value of the received packets by 256 and thus obtains the covert data. The sequence number field works the same way, except that the character is multiplied by 256*65536; for example, 'H' (ASCII 72) is encoded as 72*256*65536 = 1207959552. This yields more realistic-looking identification and sequence numbers [3]. However, there is a difference between normal TCP/IP header fields and abnormal TCP/IP header fields that contain a forged identification or sequence number as described above. Moreover, because each forged packet transmits covert data like a TCP connection attempt, it sets the SYN flag and has specific values of the IP flags and fragmentation offset that differ from normal TCP/IP packets. Even so, it is very difficult to distinguish such packets through a specific detection rule or by the intuition of an observer. Thus, we propose a detection method that uses the time relation between packets and the characteristics of the modified packets to discern covert channels in the identification and sequence number fields.
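The field values produced by this encoding are easy to compute. The following is a minimal sketch of the covert_tcp-style encodings just described; it only computes and decodes the field values, and crafting or sending raw packets is outside its scope.

# One ASCII character per packet, carried in the IP identification field
# (char * 256) or in the TCP initial sequence number (char * 256 * 65536).

def encode_ip_id(ch: str) -> int:
    return ord(ch) * 256                  # e.g. 'H' (72) -> 18432

def decode_ip_id(ip_id: int) -> str:
    return chr(ip_id // 256)

def encode_tcp_seq(ch: str) -> int:
    return ord(ch) * 256 * 65536          # e.g. 'H' (72) -> 1207959552

def decode_tcp_seq(seq: int) -> str:
    return chr(seq // (256 * 65536))

assert encode_ip_id('H') == 18432
assert decode_tcp_seq(encode_tcp_seq('H')) == 'H'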
Fig. 1. Proposing the 1st Method for SVM learning
Fig. 2. Proposing the 2nd Method for SVM Learning
5 Proposing the Detection Methods Using SVM
In this section, we propose the learning methods of an SVM to detect covert channels in the TCP/IP header. First, we preprocess single TCP/IP packets and perform the SVM learning on the preprocessed packets. Learning method 1 considers one preprocessed packet as a single input of the SVM. We therefore expect the detection result to depend closely on the number of features used in the preprocessing procedure, because this method only uses the characteristics of a single packet itself without examining the sequential relation between packets (illustrated in Figure 1). Next, we propose a second method that considers not a single packet but the successive sequential relation between packets. This method uses a sequence of three TCP/IP packets, slid by one, as a single input. Using such a packet-sliding scheme is based on the difference between TCP/IP packets carrying covert channels and normal TCP/IP packets: we can assume that the data transmitted over a covert channel is correlated across successive packets. Accordingly, if we consider successive packets as a single input for SVM learning, we can expect the detection of a covert channel in the TCP/IP header to be more effective (illustrated in Figure 2).
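A small sketch of the two ways of forming SVM inputs described in this section is given below, assuming each packet has already been preprocessed into a fixed-length feature vector (see Section 6); the function names are illustrative.

# Method 1 uses one preprocessed packet per input; method 2 concatenates a
# sliding window of three consecutive packets, slid by one packet at a time.
from typing import List, Sequence

def single_packet_inputs(packets: Sequence[List[float]]) -> List[List[float]]:
    return [list(p) for p in packets]

def sliding_window_inputs(packets: Sequence[List[float]],
                          window: int = 3) -> List[List[float]]:
    inputs = []
    for i in range(len(packets) - window + 1):
        merged: List[float] = []
        for p in packets[i:i + window]:   # three consecutive packets
            merged.extend(p)
        inputs.append(merged)
    return inputs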
6 Experiment
6.1 Experiment Methods
We use covert_tcp [3] as the covert channel generation tool for the TCP/IP header. Covert_tcp exploits the covert channels that exist inside TCP/IP header traffic; the trojan packets themselves are masqueraded as common TCP/IP traffic. The experimental data consists of an SVM training data set and an SVM test data set for the detection of covert channels in a TCP/IP header. We collected normal TCP/IP packets using the tcpdump tool and abnormal TCP/IP packets (containing covert fields) generated by covert_tcp. We also separated the attack cases into covert channels in the identification field of the IP header and covert channels in the sequence number field of the TCP header, and tested each case. Tables 1 and 2 show the features used in the preprocessing procedure for the SVM training and test data sets. Each feature is converted to a decimal value, that is, each 16-bit (2-byte) hexadecimal value in the raw dump of the TCP/IP packets is rearranged as a decimal integer.

Table 1. The features for the covert channel using the identification field
Using Field: Identification of IP header
  # of features: 1 - Feature description (bits): Identification(16)
  # of features: 3 - Feature description (bits): Identification(16) + Flags/Fragment Offset(16) + IP header Checksum(16)
  # of features: 5 - Feature description (bits): Identification(16) + Flags/Fragment Offset(16) + IP header Checksum(16) + TCP Control Flag(16)* + TCP header Checksum(16)
*TCP Control Flag(16) includes TCP HLEN(4) + Reserved(6).

Table 2. The features for the covert channel using the sequence number field
Using Field: Sequence Number of TCP header
  # of features: 2 - Feature description (bits): Sequence Number(32)
  # of features: 4 - Feature description (bits): Sequence Number(32) + TCP Control Flag(16)* + TCP header Checksum(16)
*TCP Control Flag(16) includes TCP HLEN(4) + Reserved(6).
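A minimal sketch of this preprocessing is given below: it pulls the raw header words listed in Tables 1 and 2 out of a captured packet and treats each as one decimal-valued feature. It assumes a 20-byte IP header with no options immediately followed by the TCP header, and the split of the 32-bit sequence number into two 16-bit features is our own reading of Table 2.

import struct

def id_features(pkt: bytes, n_features: int = 5) -> list:
    ident, = struct.unpack_from("!H", pkt, 4)      # IP identification
    flags, = struct.unpack_from("!H", pkt, 6)      # IP flags + fragment offset
    ip_csum, = struct.unpack_from("!H", pkt, 10)   # IP header checksum
    tcp_ctl, = struct.unpack_from("!H", pkt, 32)   # TCP HLEN/reserved/flags
    tcp_csum, = struct.unpack_from("!H", pkt, 36)  # TCP checksum
    return [ident, flags, ip_csum, tcp_ctl, tcp_csum][:n_features]

def seq_features(pkt: bytes, n_features: int = 4) -> list:
    seq_hi, = struct.unpack_from("!H", pkt, 24)    # sequence number, high 16 bits
    seq_lo, = struct.unpack_from("!H", pkt, 26)    # sequence number, low 16 bits
    tcp_ctl, = struct.unpack_from("!H", pkt, 32)   # TCP HLEN/reserved/flags
    tcp_csum, = struct.unpack_from("!H", pkt, 36)  # TCP checksum
    return [seq_hi, seq_lo, tcp_ctl, tcp_csum][:n_features]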
After we preprocess one IP or TCP header packet to extract the fields shown in Tables 1 and 2, the extracted fields constitute one SVM input. The experiments were performed in two cases: a single preprocessed packet is used as one input in the first method, and a sequence of three preprocessed packets is used in the other method. In this experiment, the receiving window size means the number of sequenced packets used for one input. So, when we use three sequences of
data for one input, the window size is three. The window is slid by one as the input sequence advances by one. The example shown in Figure 3 demonstrates the two kinds of data sets: one data set (Training data set 1, Test data set 1) consists of single packets, and the other (Training data set 2, Test data set 2) consists of sequences of three packets.
Fig. 3. A Preprocess Procedure of Raw Packets(No sliding/Sliding)
Table 3. SVM training data set
  Normal packets:   Training Set 1 - No Sliding (total 10,000): individual TCP/IP packets (5,000) | Training Set 2 - Sliding (total 10,000): series of 3 TCP/IP packets (5,000)
  Abnormal packets: Training Set 1 - No Sliding: individual ID/SEQ-exploited packets using covert_tcp (5,000) | Training Set 2 - Sliding: series of 3 ID/SEQ-exploited packets using covert_tcp (5,000)

Table 4. SVM test data set
  Normal packets:   Test Set 1 - No Sliding (total 1,000): individual TCP/IP packets (500) | Test Set 2 - Sliding (total 1,000): series of 3 TCP/IP packets (500)
  Abnormal packets: Test Set 1 - No Sliding: individual ID/SEQ-exploited packets using covert_tcp (500) | Test Set 2 - Sliding: series of 3 ID/SEQ-exploited packets using covert_tcp (500)
Table 3 describes the SVM training data set for the detection of covert channels in a TCP/IP header. As mentioned above, the SVM Training set consists of a training data set 1 which is comprised of individual packets and training data
set 2 which is comprised of attack units each consisting of three successive packets. The SVM training set 1 thus consists of 10,000 packets, divided into 5,000 normal packets and 5,000 abnormal packets having spurious identification or sequence number fields. The SVM training set 2 is comprised of 10,000 units, where one unit is a sequence of consecutive packets reflecting the time relation between packets; these 10,000 units are likewise divided into 5,000 normal units and 5,000 abnormal units having spurious identification or sequence number fields. The SVM test data set of Table 4 has the same organization as the SVM training data set above, but the test data set totals 1,000 packets and 1,000 units, respectively. All the SVM detection experiments were performed using the freeware package mySVM [14]. Also, to compare the detection performance, we used two SVM kernel functions: linear and polynomial.

Table 5. The experiment results of covert channel detection in a TCP/IP header
                                              Test Set 1 (No Sliding)      Test Set 2 (Sliding)
Header field      Kernel       Features       FP      FN      TC           FP      FN      TC
Identification    Linear       1              31.50   7.00    61.50        31.40   14.60   54.00
                               3              0.90    31.70   67.40        0.20    14.20   85.60
                               5              0.40    0.50    99.10        0.00    0.10    99.90
                  Polynomial   1              16.90   43.00   40.10        3.00    9.20    87.80
                               3              29.30   0.00    70.70        1.20    5.20    93.60
                               5              0.50    0.00    99.50        0.10    0.10    99.80
Sequence Number   Linear       2              1.00    33.20   65.80        11.10   1.00    87.90
                               4              0.00    1.30    98.70        0.90    0.00    99.10
                  Polynomial   2              2.50    28.00   69.50        0.50    7.50    92.00
                               4              0.00    0.10    99.90        0.00    0.10    99.90
*The degree of the polynomial kernel = 3; FP = False Positive (%), FN = False Negative (%), TC = Total Correctness (%)
Table 6. The experiment results of each parameter – ID field (%)
              TS1     TS2     KR1     KR2     F1      F3      F5
Detection(%)  67.68   86.78   77.92   81.92   60.85   79.33   99.56
*TS1 = Test Set 1 (No Sliding), TS2 = Test Set 2 (Sliding), KR1 = Linear, KR2 = Polynomial, F# = the number of features
6.2 Experiment Results
We analyzed the detection results of each Test Set1 and Test Set2 according to the two SVM kernel functions and the variation of the number of features. Table 5 shows the overall experiment results with identification and sequence number
Table 7. The experiment results of each parameter – SEQ field (%)
              TS1     TS2     KR1     KR2     F2      F4
Detection(%)  73.05   94.73   87.88   90.33   78.80   99.40
*TS1 = Test Set 1, TS2 = Test Set 2, KR1 = Linear, KR2 = Polynomial, F# = the number of features
Fig. 4. The result graph of ID covert channel
Fig. 5. The result graph of SEQ covert channel
fields. The resulting graph of covert channel detection using the identification field is shown in Figure 4, and the resulting graph of covert channel detection using the sequence number field is illustrated in Figure 5. Table 5 describes all the experiment results. Also, Tables 6 and 7 show the detection results for the covert channels using the identification and sequence number fields according to the number of features, the SVM kernel functions, and the SVM data sets that take the time relation between packets into account.
In the result analysis related to the SVM learning pattern, we can see that it is more effective to classify sequences of three packets that reflect the time relation (the correctness of Test Set 2 using the ID field is 86.78%, and the correctness of Test Set 2 using the SEQ field is 94.73%). We can also see that, as the number of features increases, the correctness increases. That is, even though it is not detailed in this paper, if the training data set has 5 or more features for the ID field or 4 or more features for the SEQ field, such cases can classify covert packets with a correctness of about 99%. As for the SVM kernel function, the polynomial kernel of degree 3 was more effective than the linear kernel.
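As a rough illustration of this experiment, the following hedged sketch trains and scores binary SVM classifiers with the two kernels compared above. The paper's experiments used the mySVM package; scikit-learn's SVC is used here only as a stand-in, and X_train, y_train, X_test, y_test are assumed to hold the preprocessed feature vectors and labels (0 = normal, 1 = covert) built as in Section 6.1.

from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix

def evaluate(kernel_kwargs, X_train, y_train, X_test, y_test):
    clf = SVC(**kernel_kwargs).fit(X_train, y_train)
    tn, fp, fn, tp = confusion_matrix(y_test, clf.predict(X_test)).ravel()
    total = tn + fp + fn + tp
    return {"FP%": 100.0 * fp / total,          # false positives
            "FN%": 100.0 * fn / total,          # false negatives
            "TC%": 100.0 * (tp + tn) / total}   # total correctness

# Example usage (X_train, y_train, X_test, y_test assumed to exist):
# linear_result = evaluate({"kernel": "linear"}, X_train, y_train, X_test, y_test)
# poly_result = evaluate({"kernel": "poly", "degree": 3}, X_train, y_train, X_test, y_test)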
7 Conclusion and Future Work
Covert channel attacks are an increasing potential threat to the Internet, and as yet there has been no good solution for covert channel detection. The goal of this research was to propose a detection method for covert channels in the TCP/IP header using an SVM, which has excellent performance in pattern classification. The SVM learning methods for detecting a covert channel consisted of one method that considers a single TCP/IP packet as one SVM input and another that considers three sequential TCP/IP packets as one SVM input. The experimental environment was subjected to informal tests in a laboratory testbed. The results show that, under these conditions, the detection provided by SVM learning had a high correctness rate, as illustrated in Table 5. Future work will include expanding the training and test sets and experimenting with various kernels and constraint parameters that can be used to improve performance.
References
1. U.S. Department Of Defence, 1985. Trusted Computer System Criteria.
2. John McHugh, Covert Channel Analysis, Portland State University, 1995
3. Craig H. Rowland, "Covert Channels in the TCP/IP protocol suite", First Monday, 1996
4. Vapnik V., "The Nature of Statistical Learning Theory", Springer-Verlag, New York, 1995.
5. Burges C.J.C., "A Tutorial on Support Vector Machines for Pattern Recognition", Data Mining and Knowledge Discovery, Boston, 1998.
6. Cortes C., Vapnik V., "Support Vector Network", Machine Learning, Vol. 20, pp. 273–279, 1995.
7. Cristianini N., Shawe-Taylor J., "An Introduction to Support Vector Machines", Cambridge University Press, 2000.
8. Schölkopf B., Sung K.K., Burges C., Girosi F., Poggio T., Vapnik V., "Comparing support vector machines with Gaussian kernels to radial basis function classifiers", IEEE Transactions on Signal Processing, Vol. 45, No. 11, pp. 2758–2765, 1997.
9. C. Campbell and N. Cristianini, "Simple Learning Algorithms for Training Support Vector Machines", 1998
10. S.M. Bellovin, "Security Problems in the TCP/IP protocol suite", Computer Communication Review, 19(2):32–48, April 1989
11. S. Mukkamala et al., "Intrusion Detection Using Neural Networks and Support Vector Machines", Proceedings of IEEE IJCNN, May 2002, pp. 1702–1707.
12. Dorothy E. Denning, "An Intrusion Detection Model", IEEE Transactions on Software Engineering, Number 2, page 222, February 1987
13. Pontil, M. and Verri, A., "Properties of Support Vector Machines", A.I. Memo No. 1612; CBCL paper No. 152, MIT, Cambridge, 1997.
14. Joachims T., "mySVM – a Support Vector Machine", University of Dortmund
15. John Giffin, "Covert Messaging Through TCP Timestamps", PET 2002
16. Behrouz A. Forouzan, "TCP/IP Protocol Suite", McGraw Hill
A Research on Intrusion Detection Based on Unsupervised Clustering and Support Vector Machine*
Min Luo, Lina Wang, Huanguo Zhang, and Jin Chen
School of Computer, The State Key Laboratory of Software Engineering, Wuhan University, Wuhan, 430072, Hubei, P.R. China [email protected]
Abstract. An intrusion detection algorithm based on unsupervised clustering (UC) and support vector machine (SVM) is presented, combining the fast speed of UC with the high accuracy of SVM. The basic idea of the algorithm is to decide whether the SVM classifier is used or not by comparing the distances between a network packet and the cluster centers, so the number of packets that go through the SVM is reduced. We can therefore obtain a tradeoff between speed and accuracy in the detection. The experiment uses the KDD99 data sets, and its result shows that this approach can detect intrusions in network connections efficiently.
1 Introduction

Intrusion detection systems are an integral part of any complete network security system. Currently, the most widely deployed and commercially available methods for intrusion detection employ signature-based detection. These methods extract features from various audit streams and detect intrusions by comparing the feature values to a set of attack signatures provided by human experts. Such methods can only detect previously known intrusions, since only these intrusions have corresponding signatures. Hence, many approaches such as data mining and knowledge discovery have been proposed to detect intrusions [1–4]. However, the intrusion models that all these methods adopt depend entirely on the instances in the training data sets, so clean data sets are crucial for building a practical IDS. In fact, collecting clean data sets is very difficult and costly, so it is essential to study unsupervised intrusion detection methods. In practice, unsupervised intrusion detection has many advantages over supervised detection. The main advantage is that it does not require a purely normal training set, since the detection algorithm can be performed over unlabeled data, which is easy to obtain from a real-world system. In addition, an unsupervised detection algorithm can be used to analyze historical data for forensic analysis.
* Supported by the National Natural Science Foundation of China (90104005, 90204011)
In this paper, a new intrusion detection algorithm based on unsupervised clustering (UC) and support vector machine (SVM), the UCSVM-ID algorithm, is presented. Compared with some other intrusion detection algorithms [5–11], the UCSVM-ID algorithm has the merits of high efficiency and accuracy, and it does not need labeled training data sets. Therefore, little change is needed when the UCSVM-ID algorithm is deployed in actual systems, since training data sets collected directly from those systems can be used as they are. This paper is organized as follows. In Section 2, we discuss related work on UC-based and SVM-based intrusion detection approaches and their drawbacks. Section 3 outlines the key elements of the UCSVM-ID intrusion detection algorithm. Section 4 gives the experiment results, and Section 5 summarizes the experimental results and discusses the weaknesses of the algorithm.
2 Related Work

In recent years, Portnoy, Eskin and others have done relevant work on clustering-based intrusion detection (ID) [5–7]. They proposed a geometric framework for unsupervised anomaly detection: the data are mapped to a feature space, and intrusions are detected by finding outliers based on the positions of the points in the feature space. They presented several algorithms, including a clustering-based algorithm and a k-nearest-neighbor-based algorithm. Mukkamala applied a standard supervised SVM algorithm to ID in 2002 [8–10], and Rao also applied an SVM algorithm to build a host-based ID system [11]. However, clustering-based algorithms suffer from drawbacks such as a low detection rate, especially for DoS attacks, and a high false positive rate. As for the SVM-based ID algorithms, because of some weaknesses of the SVM itself (they spend more time on training and testing, and they cannot deal with symbolic data), they cannot be applied well to practical systems either, even though their detection rate is high. The basic idea of the UCSVM-ID algorithm presented in this paper is to decide whether the SVM classifier is utilized or not by comparing the distances between a network packet and the cluster centers produced by the UC algorithm; thus only the data that are hard for the UC algorithm to classify are sent to the SVM. This reduces the number of packets going through the SVM, increases the detection speed of the UCSVM-ID algorithm, and still exploits the SVM's accurate classification. Additionally, we improve the UC algorithm and train different SVMs for different network connection protocols, which enables the UCSVM-ID algorithm to process symbolic data effectively. The experiment uses the KDD99 data sets, and its result indicates that the UCSVM-ID algorithm can overcome the drawbacks of intrusion detection algorithms based on UC or SVM alone and can detect intrusions in network connections efficiently.
3 Intrusion Detection Algorithm Based on UC and SVM
3.1 Basis of Clustering

Clustering is the subject of active research in several fields such as statistics, pattern recognition, and machine learning, all of which require a partition of a given set of objects into clusters that optimizes a given objective function [12]. Clustering is a division of data into groups of similar objects; it models the data by its clusters. Data modeling puts clustering in a historical perspective rooted in mathematics, statistics, and numerical analysis. From a machine learning perspective, clusters correspond to hidden patterns, and the search for clusters is unsupervised learning.

3.2 Basis of Support Vector Machine

The SVM (Support Vector Machine), the learning approach originally developed by Vapnik and Cortes in 1995 [13], is a major result of machine learning research in recent years. According to statistical learning theory, a learning machine has to follow the SRM (Structural Risk Minimization) principle rather than the ERM (Empirical Risk Minimization) principle in order to make the deviation between the actual and the ideal outputs as small as possible when the data are subject to some (fixed but unknown) distribution; that is, minimizing the upper bound of the error probability is expected. The SVM is an instance of that theory. Compared to conventional ANNs (Artificial Neural Networks), the SVM has not only a succinct structure but also advanced technical performance, especially generalization capability, which has been demonstrated by plenty of experiments. In a word, the SVM is an approach that maps the training data nonlinearly into a higher-dimensional feature space via a kernel function, and constructs a separating hyperplane with maximum margin there [14].

3.3 UCSVM-ID Algorithm

The UCSVM-ID algorithm is based on two assumptions. The first assumption is that the number of normal instances far exceeds the number of intrusions. The second assumption is that the intrusions themselves are qualitatively different from the normal instances. The basic idea is that since the intrusions are rare and different from normal instances, they will appear as outliers in the data and can be detected as such. The UCSVM-ID algorithm is made up of two stages, a training stage and a testing stage. In the training stage, the training algorithms produce classification models from the training sets; in the testing stage, the detection algorithms then use these models to classify new data. The algorithms of the two stages are described in detail as follows.
3.3.1 UCSVM-ID detection algorithm. The UCSVM-ID detection algorithm is composed of two algorithms: one is the UC-based detection algorithm, and the other is the unsupervised SVM-based detection algorithm. The whole framework of the detection algorithm is shown in the following figure.
[Figure 1 depicts the framework: raw data flows through a data preprocess step into a clustering-based classifier; the result is compared with a threshold, and depending on the comparison the packet is either output directly or passed to the SVM-ID detector, which contains per-protocol SVM-based IDS classifiers (TCP, UDP, ICMP, ...) whose decision is then output.]
Fig. 1. Framework of UCSVM-ID Algorithm
The formal description of the UCSVM-ID detection algorithm is as follows. Given a threshold ε and a test data set Z = {x1, x2, ..., xn}, xi ∈ R^n, where each xi has been normalized using the statistics obtained by the preprocessing algorithm:
Step 1: if Z = ∅, then stop;
Step 2: repeat Steps 3-5 until Z = ∅;
Step 3: choose xi ∈ Z, i = 1..n; Z = Z − {xi};
Step 4: take the normal and anomalous cluster centers (k clusters in total) produced by the clustering algorithm in the training stage, and calculate the distances between xi and the normal centers and between xi and the anomalous centers, denoted dist_normal(Ol, xi) and dist_anomaly(Om, xi) respectively;
Step 5: find the minimal dist_normal(Ol, xi) and dist_anomaly(Om, xi) separately, denoted dist_normal(Ol_min, xi) and dist_anomaly(Om_min, xi). If |dist_normal(Ol_min, xi) − dist_anomaly(Om_min, xi)| ≥ ε, then take the Omin achieving min(dist_normal(Ol_min, xi), dist_anomaly(Om_min, xi)), look up the class label of Omin and label xi with it; else put xi into the corresponding SVM detection model and classify it according to its protocol type.
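A minimal sketch of this detection step is given below. The cluster centers, the threshold eps, and the per-protocol SVM models (assumed to expose a predict method) would all come from the training stage described next; the function and variable names are illustrative.

import math

def dist(a, b):
    # Euclidean distance between two feature vectors
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def classify(x, protocol, normal_centers, anomalous_centers, eps, svm_models):
    d_norm = min(dist(c, x) for c in normal_centers)
    d_anom = min(dist(c, x) for c in anomalous_centers)
    if abs(d_norm - d_anom) >= eps:                 # clusters decide it
        return "normal" if d_norm <= d_anom else "anomalous"
    return svm_models[protocol].predict(x)          # ambiguous: ask the SVM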
3.3.2 UCSVM-ID training algorithm. The detailed algorithms used in the training stage are given below.
1) UC-based intrusion detection algorithm. Assume we fix a constant L and use the feature vector of xi to represent the cluster center Oj.
Step 1: C1 = {x1}, O1 = x1(feature), num_cluster = 1, Z = {x1, x2, ..., xn};
Step 2: if Z = ∅, then stop;
Step 3: repeat Steps 4-7 until Z = ∅;
Step 4: choose xi ∈ Z, i = 2..n; Z = Z − {xi};
Step 5: find the cluster center Omin that is closest to xi among all created clusters, i.e., the Omin such that dist(Omin, xi) ≤ dist(Om, xi) for all m = 1..num_cluster;
Step 6: if dist(Omin, xi) ≤ L, then add xi into the corresponding cluster Cj, i.e. Cj = Cj ∪ {xi}, and adjust the center of Cj, i.e. compute the average of the feature vectors of all the instances of cluster Cj and let the result be the new center of Cj; go to Step 3;
Step 7: else create a new cluster: num_cluster = num_cluster + 1, C_num_cluster = {xi}, O_num_cluster = xi(feature);
where dist(O, xi) is the Euclidean distance, num_cluster is the number of currently created clusters, n is the number of instances of the training data set, C1, ..., C_num_cluster are the created clusters, and Oj is the center of cluster Cj. Clearly, the algorithm costs less time than num_cluster × n, since all clusters are traversed in every loop of the algorithm (num_cluster being the number of created clusters when the algorithm finishes); thus it is efficient. Unlike the UC-based ID algorithm of Portnoy et al. [7], this algorithm uses the mean value of the data contained in a cluster as its center, so it can express the distances between data better. Additionally, we need to label the clusters after they are created. Under the two assumptions it is highly probable that, among the finally created clusters, the clusters containing normal data will have a much larger number of instances than those containing anomalous data. Therefore we label some percentage N of the clusters, those containing the largest numbers of instances, as 'normal'; the rest of the clusters are labeled as 'anomalous' and are considered to contain attacks [7].
2) One-class SVM-based intrusion detection algorithm. The standard SVM algorithm is a supervised learning algorithm: it requires labeled training data to create its classification rule and cannot be used in our experiment, so we use the unsupervised SVM algorithm (one-class SVM) presented by Schölkopf in [15]. This algorithm does not require its training set to be labeled to determine a decision surface. It attempts to find a small region where most of the data lies and labels points in that region as class +1; points in other regions are labeled as class −1. The main idea is that the algorithm attempts to find the hyperplane that separates the data points from the origin with maximal margin. After mapping the input data space X into a high-dimensional feature space H via a kernel, the algorithm treats the origin as the only member of the second class. Then, using "relaxation parameters", it separates the image of the one class from the origin, after which the two-class classification SVM algorithm is employed. The one-class SVM algorithm can be formulated as follows. Suppose we are given some data set drawn from an underlying probability distribution P and we want to estimate a "simple" subset S of the input space such that the probability that a test point drawn from P lies outside of S is bounded by some a priori specified value v ∈ (0, 1). The solution to this problem can be obtained by estimating a function f which is positive on S and negative on the complement of S. In other words, the function f takes the value +1 in a "small" region where most of the data lies, and −1 elsewhere. Given a training data set Z = {x1, x2, ..., xl}, xi ∈ R^n, 1 ≤ i ≤ l, where each xi is a data item obtained after standardizing the raw data using the statistics coming from the preprocessing
algorithm. Let ϕ : Rⁿ → H be a kernel map which transforms the training examples into the feature space H. Then, to separate the dataset from the origin, we need to solve the following quadratic optimization problem:

    min_{w ∈ H, ξ ∈ R^l, ρ ∈ R}   (1/2)‖w‖² + (1/(νl)) ∑_i ξ_i − ρ        (1)
    s.t.   (w · ϕ(x_i)) ≥ ρ − ξ_i ,   ξ_i ≥ 0

where ν ∈ (0, 1) is a parameter that controls the tradeoff between maximizing the distance from the origin and containing most of the data in the region created by the hyperplane, and corresponds to the ratio of "outliers" in the training dataset. Then the decision function

    f(x) = sgn((w · ϕ(x)) − ρ)

will be positive for most examples x_i contained in the training set. If we introduce Lagrange multipliers α_i and rewrite formula (1) in terms of them, we can represent formula (1) as

    min_α   (1/2) ∑_{i,j} α_i α_j Kϕ(x_i, x_j)
    s.t.   0 ≤ α_i ≤ 1/(νl) ,   ∑_i α_i = 1 .

In terms of the Lagrange multipliers, the decision function is

    f(x) = sgn( ∑_i α_i Kϕ(x_i, x) − ρ ) .

At the optimum, ρ can be computed from the Lagrange multipliers for any x_i whose corresponding Lagrange multiplier satisfies 0 < α_i < 1/(νl):

    ρ = ∑_j α_j Kϕ(x_j, x_i) .
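As an illustration of how such a detector could be trained and queried, the following minimal sketch uses scikit-learn's OneClassSVM (which wraps LIBSVM). The feature matrices, the ν and γ values and the synthetic data are placeholders, not the paper's setup.

```python
# Hypothetical sketch: one-class SVM anomaly detector in the spirit of the formulation above.
# X_train / X_test stand in for standardized numeric feature matrices.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 41))                      # stand-in for preprocessed normal traffic
X_test = np.vstack([rng.normal(size=(50, 41)),
                    rng.normal(loc=4.0, size=(5, 41))])   # a few synthetic outliers

# nu plays the role of the outlier ratio v; gamma is the Gaussian-kernel width g.
detector = OneClassSVM(kernel="rbf", nu=0.1, gamma=1.0 / 41)
detector.fit(X_train)

pred = detector.predict(X_test)                            # +1 = normal region, -1 = anomaly
print("flagged anomalies:", int(np.sum(pred == -1)))
```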
One property of the optimization is that for the majority of the data points α_i = 0, which makes the decision function efficient to compute. In our experiment we used LIBSVM [16], an integrated tool for support vector classification and regression that implements the one-class SVM.
3.4 The Processing of Symbolic Features in the Data Set
For each TCP/IP connection in the KDD99 data set [17], there are 41 quantitative and qualitative features, among which 8 are symbolic (protocol_type, service, flag, land, logged_in, root_shell, is_host_login, is_guest_login) and the rest are continuous. So we have to process the instances' features differently.
The processing of continuous features in the dataset is described in Section 4.2; the processing of symbolic features is as follows:
a) Processing of symbolic features in the UC algorithm. Assume xi and xj are two raw instances possessing a symbolic feature, and let d(xi, xj) be the distance between them on that feature. When the symbolic features of xi and xj are equal, let d(xi, xj) = 0; otherwise, let d(xi, xj) = C, where C is a constant.
b) Processing of symbolic features in the SVM algorithm. Since the SVM algorithm itself cannot process symbolic data, we adopt the following method. Because only three kinds of protocols (TCP, UDP, ICMP) exist in the KDD99 test data, and data packets of the same protocol are expected to be more similar to each other, we trained three SVM classifiers, as shown in Figure 1. While training, we split the training data by protocol (TCP, UDP, ICMP) and send each part to its SVM classifier after stripping the protocol-type attribute off the data; the detection process works the same way. This makes the model produced by the SVM algorithm more accurate. Additionally, symbolic features such as land, logged_in, root_shell, is_host_login and is_guest_login only take the values 0 or 1, so we can handle them in the same way as continuous features. We illustrate how the service and flag features are processed as follows. First, suppose the service feature has four values — http, ftp, telnet and smtp — which appear repeatedly in the datasets, e.g. http, http, telnet, ftp, smtp, ftp, telnet, smtp, …, ftp. We encode the four values as 0001, 0010, 0100 and 1000 respectively and replace the service feature with four sub-features (service1, service2, service3, service4). When the service feature's value is http, its sub-features are set as service1 = 0, service2 = 0, service3 = 0, service4 = 1; when the value is ftp, they are set as service1 = 0, service2 = 0, service3 = 1, service4 = 0; and so on. In this way the symbolic service feature is converted into four continuous features, and the flag feature can be processed in the same way. The merit of this encoding is that the distance contributed by the service or flag feature is the same for every pair of records with different values, so no bias is introduced.
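A small sketch of this one-bit-per-value encoding is given below; the value list and column names are only illustrative and do not come from the paper (the real service feature has many more values).

```python
# Hypothetical sketch: expanding a symbolic feature into binary sub-features,
# following the encoding described above (one bit position per symbolic value).
SERVICE_VALUES = ["http", "ftp", "telnet", "smtp"]   # illustrative subset only

def encode_service(value):
    """Return the sub-feature vector (service1, ..., service4) for one record."""
    vec = [0] * len(SERVICE_VALUES)
    # http -> 0001, ftp -> 0010, telnet -> 0100, smtp -> 1000
    vec[len(SERVICE_VALUES) - 1 - SERVICE_VALUES.index(value)] = 1
    return vec

records = ["http", "http", "telnet", "ftp", "smtp", "ftp"]
encoded = [encode_service(r) for r in records]
print(encoded[0])   # http -> [0, 0, 0, 1]
print(encoded[3])   # ftp  -> [0, 0, 1, 0]
```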
4 Experiment
4.1 Description of Data Sets
The KDD Cup 1999 [17] data sets are the authoritative benchmark data sets in the intrusion detection field. The data were acquired from the 1998 DARPA intrusion detection evaluation program and consist of about 4,900,000 data instances. For each TCP/IP connection, there are 41 quantitative and qualitative features. Some are basic features (e.g., duration, protocol type), while other features are obtained
by using some domain knowledge (e.g., the number of failed login attempts). Among all 41 features, 8 are symbolic and 33 are continuous. Attacks in the data sets are divided into four main categories: (1) DOS (Denial of Service), such as the ping-of-death attack; (2) U2R (User to Root), such as the eject attack; (3) R2U (Remote to User), such as the guest attack; (4) PROBING, such as port scanning. In order to satisfy the two assumptions above, we need to filter the raw training data. We choose 60,638 instances from the raw data as the training set; among them there are 60,032 normal instances and 606 attack instances. Table 1 shows the attacks included in the training set. Table 1. Type and number of attacks in the training data set
When picking the test datasets, we choose 4 groups of data altogether, each containing 20,000 records. Group 1 and group 2 are selected from the training set, while the other two groups are selected from the KDD Cup 99 data outside the training set (we deliberately include some data that do not appear in the training set, i.e., unknown intrusions). Table 2 shows the number of data in the testing data sets.
4.2 Preprocessing
As for continuous features, different features of the raw data are on different scales, which causes bias toward features with larger values over features with smaller values. As an example, given two 3-feature vectors xi = {1000, 1, 2} and xj = {2000, 2, 1}, then

    d(xi, xj) = sqrt( |xi1 − xj1|² + |xi2 − xj2|² + |xi3 − xj3|² )
              = sqrt( |1000 − 2000|² + |1 − 2|² + |2 − 1|² ) .

Obviously, the first feature dominates the whole distance. To solve the problem, we have to standardize the measurements. Given measurements for a variable f, this can be performed as follows. Firstly, calculate the mean absolute deviation S_f:

    S_f = (1/n) ∑_{i=1}^{n} |x_if − m_f|

where x_1f, …, x_nf are the n measurements of f, and m_f is the mean value of f, that is

    m_f = (1/n) ∑_{i=1}^{n} x_if .

Secondly, calculate the standardized measurement:

    z_if = (x_if − m_f) / S_f .
Then we can convert every instance in the training sets to a new one based on the previous three formulas. It is a transformation of an instance from its own space to our standardized space, based on statistical information retrieved from the training sets, which solves the problem above. Table 2. Number of data in the testing data set
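A minimal sketch of this standardization step is shown below; the array shapes and variable names are illustrative assumptions rather than the paper's implementation.

```python
# Hypothetical sketch: standardize continuous features with the mean absolute deviation,
# using statistics computed on the training set and reused later on test data.
import numpy as np

def fit_standardizer(X_train):
    m = X_train.mean(axis=0)                      # m_f: per-feature mean
    s = np.abs(X_train - m).mean(axis=0)          # S_f: per-feature mean absolute deviation
    s[s == 0] = 1.0                               # guard against constant features
    return m, s

def standardize(X, m, s):
    return (X - m) / s                            # z_if = (x_if - m_f) / S_f

X_train = np.array([[1000.0, 1.0, 2.0], [2000.0, 2.0, 1.0]])
m, s = fit_standardizer(X_train)
print(standardize(X_train, m, s))
```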
4.3 Experiment Results
Before the experiment, we trained the UC and SVM algorithms using the preprocessed training data. Because the Gaussian kernel gave preferable results in our previous experiments with the one-class SVM algorithm, we adopt the Gaussian kernel in this experiment. The parameters used in the experiment are listed below.
a) Cluster width L, which determines the maximum distance at which two connection records are assigned to the same cluster;
b) Percentage of the largest clusters N, the ratio of clusters that are labeled 'normal' in the detection algorithm;
c) Kernel parameter g, the width parameter of the Gaussian kernel;
d) Support vector ratio n, the ratio of support vectors to the whole data in the SVM algorithm;
e) Threshold ε, which decides how the data flow is split in the detection algorithm.
In the experiment, the detection rate is defined as the number of intrusion instances detected by the system divided by the total number of intrusion instances present in the test set. The false positive rate is defined as the number of normal instances incorrectly classified as intrusions divided by the total number of normal instances. However, due to the lack of good methods for choosing parameters, we could only use trial and error. Based on the experience from our previous experiments, we set L = 40, N = 20%, g = 1/41 and n = 0.1. The performance of the UCSVM-ID algorithm under different thresholds ε is shown in the following table.
Table 3. Performance Comparison of UCSVM-ID algorithm using different thresholds, DT: Detection Time (s), DR: Detection Rate (%), FPR: False Positive Rate (%)
As we can see in Table 3, the detection rate and detection time for each group both fall as the threshold ε decreases, which is in accord with our expectation. When the threshold ε decreases, the number of data sent to the SVM classifier decreases. Under this condition, detection in the UCSVM-ID algorithm is done mainly by the UC algorithm, which is fast but less accurate, thus both the detection rate and the detection time are reduced. The overall performance is clearly better when ε = 0.1. Now we set ε = 0.1, keep the values of the other parameters the same, and compare the performances of the UC, SVM and UCSVM-ID algorithms. The outcomes of this experiment are displayed in Table 4. Table 4. Performance comparison of algorithms. DT: Detection Time (s), DR: Detection Rate (%), FPR: False Positive Rate (%)
It can be seen from Table 4 that, in terms of detection time, the UCSVM-ID algorithm is slower than the UC algorithm but faster than the SVM algorithm for all four groups of data. That is because the UCSVM-ID algorithm handles a large share of the data in the UC stage, which reduces the amount of data classified by the SVM and thereby greatly decreases the detection time. In terms of detection rate, UCSVM-ID is better than the UC algorithm but worse than the SVM algorithm: the UC algorithm deals with the most typical and easily classified data, while the data that are hard to classify are classified precisely by the SVM algorithm. Therefore, as we expected, the UCSVM-ID algorithm offers a better overall trade-off between speed and accuracy than the UC and SVM algorithms when detecting intrusions. Table 5 reveals the outcomes of experiments that employ these algorithms to detect known and unknown intrusions over the group 3 and group 4 test sets. Known intrusions are those included both in the test set and in the training set; unknown intrusions are those included only in the test set (for example, the udpstorm intrusion in DOS and the spy intrusion in R2U).
Table 5. Detection ratios of algorithms for known and unknown intrusions
As shown in Table 5, the UCSVM-ID algorithm attains a high detection rate for the various intrusions and overcomes the difficulty the UC algorithm has in detecting R2U and DOS attacks. (In many R2U intrusions the attackers pretend to be legitimate users who are authorized to use the network, or use it in a seemingly legitimate way, so their features are not qualitatively different from normal instances; the UC algorithm may therefore cluster these instances together with normal ones and the intrusion goes undetected. In addition, DOS intrusions produce so many instances that they occur in numbers similar to normal instances, so the UC algorithm has difficulty detecting DOS attacks.) The reason is that the UCSVM-ID algorithm employs the SVM algorithm rather than the UC algorithm to detect R2U and DOS intrusions: the one-class SVM detects anomalies essentially by measuring the difference from normal data, and the resulting decision surface is influenced only by the support vectors. The detection rates of the UCSVM-ID algorithm in both group 3 and group 4 exceed 75%, showing that it can detect unknown intrusions effectively.
5 Conclusions
The experimental results indicate that the UCSVM-ID algorithm, based on UC and SVM, is effective for intrusion detection. By combining UC and SVM the algorithm achieves both high speed and high accuracy, and it does not rely on labeled or filtered training sets. Furthermore, the algorithm performs well in detecting unknown intrusions. Finally, since it is simple and fast, it can be used in real-world systems without much modification.
However, due to the lack of good methods for choosing parameters, we could only use trial and error. Future work is to find a method that lets the algorithm determine the two parameters itself, perhaps based on evolutionary algorithms.
References
1. Ghosh, A.K.: Learning Program Behavior Profiles for Intrusion Detection. USENIX, 1999
2. Cannady, J.: Artificial Neural Networks for Misuse Detection. National Information Systems Security Conference, 1998
3. Ryan, J., Lin, M-J., Miikkulainen, R.: Intrusion Detection with Neural Networks. Advances in Neural Information Processing Systems 10, Cambridge, MA: MIT Press, 1998
4. Luo, J., Bridges, S.M.: Mining Fuzzy Association Rules and Fuzzy Frequency Episodes for Intrusion Detection. International Journal of Intelligent Systems, John Wiley & Sons, 2000, 687–703
5. Eskin, E., Arnold, A., et al.: A Geometric Framework for Unsupervised Anomaly Detection: Detecting Intrusions in Unlabeled Data. Data Mining for Security Applications (DMSA-2002). Kluwer, 2002
6. Honig, A., Howard, A., et al.: Adaptive Model Generation: An Architecture for the Deployment of Data Mining-based Intrusion Detection Systems. Data Mining for Security Applications (DMSA-2002). Kluwer, 2002
7. Portnoy, L., Eskin, E., et al.: Intrusion Detection with Unlabeled Data Using Clustering. In Proceedings of the ACM CSS Workshop on Data Mining Applied to Security (DMSA-2001), 2001
8. Mukkamala, S., Janowski, G., et al.: Identifying Important Features for Intrusion Detection Using Support Vector Machines and Neural Networks. Proceedings of the 2003 Symposium on Applications and the Internet, 2003, 209–216
9. Mukkamala, S., Janowski, G., et al.: Intrusion Detection Using Neural Networks and Support Vector Machines. Proceedings of the IEEE International Joint Conference on Neural Networks 2002, Hawaii, May 2002, 1702–1707
10. Mukkamala, S., Sung, A.H.: Comparison of Neural Networks and Support Vector Machines in Intrusion Detection. Workshop on Statistical and Machine Learning Techniques in Computer Intrusion Detection, June 11–13, 2002
11. Rao, X.: An Intrusion Detection Based on SVM. Journal of Software, 2002, 14(4), 798–803
12. Silberschatz, A., Tuzhilin, A.: What Makes Patterns Interesting in Knowledge Discovery Systems. IEEE Transactions on Knowledge and Data Engineering, 1996, 970–974
13. Vapnik, V.: The Nature of Statistical Learning Theory. New York, NY: Springer-Verlag, 1995
14. Cristianini, N., Shawe-Taylor, J.: An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge University Press, 2000
15. Schölkopf, B., Platt, J.C., et al.: Estimating the Support of a High-Dimensional Distribution. Neural Computation, 2001, 13(7), 1443–1471
16. http://www.csie.ntu.edu.tw/~cjlin/libsvm
17. http://kdd.ics.uci.edu/databases/kddcup99/task.html
UC-RBAC: A Usage Constrained Role-Based Access Control Model* Zhen Xu, Dengguo Feng, Lan Li, and Hua Chen State Key Lab. of Information Security, ISCAS, Haidian District, Zhongguancun South 4th Street No. 4, Beijing, China {xuzhen, feng, lilan, ch}@is.iscas.ac.cn
Abstract. Role-based access control (RBAC) models have received broad support as a generalized approach to access control. However, there are requirements to limit the maximum number of times a role assigned to a user may be used, which cannot be modeled under current RBAC models. We present the UC-RBAC model, an extended RBAC model, to tackle such dynamic aspects. UC-RBAC supports such constraints during periodic time. The constraints can be set to limit the usage of a role both by a specified user and by all users assigned to the role. The formal definition and semantics of the model are presented.
1 Introduction
In recent years, the importance of role-based access control (RBAC) has been widely recognized [1, 2]. Vendors have implemented RBAC features in their products, and a proposed voluntary consensus standard for role-based access control is now available [1, 3]. In RBAC, permissions are not assigned to users directly but to roles, and users are assigned to roles as well. Hence users acquire permissions by being members of roles. After activating a role, a user can exercise the role's permissions. RBAC can directly support the security policies of organizations and greatly simplify authorization administration [1]. Although RBAC has been widely investigated and several extensions have been proposed [4, 5, 6], it fails to address some requirements. One of these requirements is to control the number of times a user can play a role during periods of time. Such a requirement is very common in daily life. For example, when a project manager is on vacation and a new project has to begin, a one-time authorization is needed to allow the vice manager to exercise the manager's role during the vacation. Another example is that there are always constraints on the number of times an ATM card can draw cash from ATM machines in a day. To address such requirements, a constraint on the number of times that a user can play a role during certain periods of time is added to role activation. In this paper, we present a Usage Constrained RBAC (UC-RBAC) model, an extended RBAC model. One of the main features of the UC-RBAC model is its number *
This research was supported in part by NSFC Grant 60025205, 60273027, Chinese National 973 Project G1999035802 and Chinese National 863 Project 2002AA141080.
of times constraint on users' usage of a role, called a role ticket. Such constraints can be defined over periods of time, which greatly improves the temporal granularity of number-of-times constraints. This model can greatly improve information system security and alleviate the burden of security managers. Temporary authorization is one of the problems it can address. In organizations, many authorizations are temporary, meaning that they are valid during certain periods and can be exercised only a limited number of times. In traditional models, to accomplish such goals, security managers have to create roles and assign them to users, and when the desired job is over, security managers again have to revoke the roles from the users explicitly. There are two major disadvantages. First, security administrators may not revoke the roles in time, and the operations are error-prone. Second, such manual operation tends to be a burden on security managers. Another advantage of our model is its ability to prevent users from abusing their privileges to a certain degree. Role tickets limit the maximum number of times users may play their roles. When the constraints are properly set, users will not be able to abuse their privileges at will. Since the inception of the RBAC model [7], it has been a hot topic in the research area of access control. In 1996, Sandhu et al. presented a framework of four role-based access control models [8]. Constraints play a vital role in the RBAC model [6, 9]. The work of Giuri and Iglio [10] defined a formal model of RBAC with separation-of-duty constraints on role activation. Later, Ahn et al. provided a role-based authorization constraints specification language, RCL2000 [11]. In [18], Jason Crampton discussed specifying and enforcing constraints in role-based access control. The work of Gustaf Neumann and Mark Strembeck explored the engineering and enforcement aspects of context constraints in RBAC environments [19]. [20] captured an exhaustive set of temporal constraint needs for access control that has recently been proposed. A temporal role-based access control model was proposed in [12], where temporal constraints are associated with role activations. The status of a role can be activated or deactivated, and a role can be activated only when its status is activated. They also introduce triggers to express the dependencies among role activations and deactivations. Our work also concerns the activation and deactivation of roles, but the constraints are on a specified user's activation/deactivation of roles. In other words, the constraints have finer granularity than those in [12]. The temporal aspects are considered in our work too: we incorporate the periodic time from [12], which in turn borrowed it from [13], and the period expression of which originally came from [14]. The main feature of our work is the role ticket, a kind of constraint on users' usage of roles. A role ticket limits the maximum number of times a user can play a designated role. Related work can be seen in [15]. The Originator Control (ORGCON) policy is an access control policy whose objective is to let the object owner control the usage of administrative rights propagated by him. Usage control (UCON) is presented as a means of controlling and managing usage of digital objects [16, 17]. Although all three try to control the usage of rights, they differ significantly in motivation, objectives and control targets. The remainder of this paper is organized as follows.
In Section 2, we describe the role-based access control model and the periodic time that serve as bases for our work. The usage-constrained role-based access control model is presented formally in Section 3. Section 4 concludes the paper.
2 Preliminaries
In this section, we present the role-based access control model and the periodic time.
2.1 RBAC Model
The RBAC model we use in this paper is mainly the one proposed by Sandhu et al. [9]. There are four basic components of RBAC: a set of users, a set of roles, a set of permissions and a set of sessions. A user is a human being or an autonomous agent; a role is a job function or job title within the organization, with some associated semantics regarding the authority and responsibility conferred on a member of the role; and a permission is an approval of a particular mode of access to one or more objects in the system. When a user logs in to the system he establishes a session, and during this session he can request to activate some subset of the roles he is authorized to play. A user may have multiple sessions open at the same time, and each session may have a different combination of active roles. The user assignment (UA) and permission assignment (PA) relations are both many-to-many: a user can be a member of many roles and a role can have many users; similarly, a role can have many permissions and the same permission can be assigned to many roles. There is a partially ordered role hierarchy RH, also written ≥, where x ≥ y signifies that role x inherits the permissions assigned to role y. Inheritance along the role hierarchy is transitive, and multiple inheritance is allowed in partial orders. Below is the formal definition of the RBAC model [adapted from 9]; a small illustrative sketch follows the definition.
Definition 2.1 (RBAC Model). The RBAC model has the following components:
• U, R, P and S, the sets of users, roles, permissions and sessions respectively,
• PA ⊆ P × R, a many-to-many permission (to role) assignment relation,
• UA ⊆ U × R, a many-to-many user (to role) assignment relation,
• RH ⊆ R × R, a partially ordered role hierarchy (written ≥),
• user : S → U, a function mapping each session si to the single user user(si) (constant for the session's lifetime),
• roles : S → 2^R, a function mapping each session si to a set of roles roles(si) ⊆ {r | (∃r′ ≥ r) [(user(si), r′) ∈ UA]} (which can change with time), so that session si has the permissions ∪_{r ∈ roles(si)} {p | (∃r″ ≤ r) [(p, r″) ∈ PA]}, and
• a collection of constraints that determine whether or not values of various components of the RBAC model are acceptable (only acceptable values will be permitted).
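The following minimal sketch illustrates these components; the user, role and permission names and the tiny role hierarchy are purely illustrative.

```python
# Hypothetical sketch of the RBAC components in Definition 2.1.
USERS = {"alice", "bob"}
ROLES = {"employee", "manager"}

UA = {("alice", "manager"), ("bob", "employee")}                    # user assignment
PA = {("read_report", "employee"), ("sign_report", "manager")}      # permission assignment
RH = {("manager", "employee")}                                      # manager >= employee

def senior_to(r1, r2):
    """r1 >= r2 in the (reflexive) role hierarchy."""
    return r1 == r2 or (r1, r2) in RH

def activatable_roles(user):
    """Roles r such that (user, r') is in UA for some r' >= r."""
    return {r for r in ROLES for r_prime in ROLES
            if senior_to(r_prime, r) and (user, r_prime) in UA}

def session_permissions(active_roles):
    """Permissions p such that (p, r'') is in PA for some r'' <= an active role."""
    return {p for r in active_roles for r2 in ROLES
            if senior_to(r, r2) for (p, pr) in PA if pr == r2}

print(activatable_roles("alice"))          # {'manager', 'employee'}
print(session_permissions({"manager"}))    # {'sign_report', 'read_report'}
```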
2.2 Periodic Time
We use a formal symbolic representation of periodic time introduced in [13]. The formalism for periodic expressions in [13] is in turn based on the one proposed in [14]. A periodic expression denotes an infinite set of time instants. A basic notion underlying periodic expressions is that of calendars, introduced in [14]. A calendar is a countable set of contiguous intervals, numbered by integers called the indexes of the intervals. In the rest of this paper, we assume there exists a set of calendars containing Years, Months and Days, the finest granularity of which is Days, the tick of the system. There may exist a subcalendar relationship between calendars: given two calendars C1 and C2, we say that C1 is a subcalendar of C2 (written C1 ⊆ C2) if each interval of C2 is covered by a finite number of intervals of C1. Periodic expressions are composed of calendars and are more general than them: they can express periodic instants that are not contiguous, for example the set of Monday mornings. Periodic expressions are formally defined as follows [13].
Definition 2.2 (Periodic Expression). Given calendars Cd, C1, …, Cn, a periodic expression P is defined as

    P = ∑_{i=1}^{n} Oi.Ci ▷ r.Cd

where O1 = all, Oi ∈ 2^ℕ ∪ {all}, Ci ⊆ Ci−1 for i = 2, …, n, Cd ⊆ Cn, and r ∈ ℕ. The part ahead of ▷ identifies the set of starting points of the intervals the expression represents, while the latter part specifies the duration of each interval in terms of calendar Cd. For example, all.Months + {1, 10}.Days ▷ 2.Days represents the set of intervals starting at the same instants as the first and tenth day of every month, each with a duration of 2 days. When Oi's value is all, it is omitted; r.Cd is omitted when it is 1.Cn. Periodic time is the combination of a periodic expression and a time interval that bounds it. It is defined as follows.
Definition 2.3 (Periodic Time). Given a periodic expression P, a periodic time is a pair <[begin, end], P> where P is a periodic expression and [begin, end] is a time interval denoting the lower and upper bounds imposed on the instants in P.
The infinite set of time instants denoted by a periodic expression P is expressed by Π(P). We follow the definition of the function Π, which is formally defined as follows.
Definition 2.4 (Function Π). Let P = ∑_{i=1}^{n} Oi.Ci ▷ r.Cd be a periodic expression. Then Π(P) is a set of time intervals whose common duration is r.Cd and whose set S of starting points is computed as follows:
• If n = 1, S contains all the starting points of the intervals of calendar C1.
• If n > 1 and On = {n1, …, nk}, then S contains the starting points of the n1-th, …, nk-th intervals (all intervals if On = all) of calendar Cn included in each interval of Π(∑_{i=1}^{n−1} Oi.Ci ▷ r.Cd).
For example, let the periodic expression P be all.Years + {6, 7}.Months + {1, 10}.Days ▷ 2.Days. Π(P) is the set of intervals whose common duration is 2.Days and whose starting points are the instants of the first and tenth day of every June and July. The set of time instants denoted by <[begin, end], P> is defined by the function Sol. Below is the formal definition.
Definition 2.5 (Function Sol). Let t be a time instant, P a periodic expression, and begin and end two time instants. Then t ∈ Sol(<[begin, end], P>) if and only if there exists an interval Δ ∈ Π(P) such that t ∈ Δ and begin ≤ t ≤ end. The result of Sol(<[1/1/2003, 2/1/2003], all.Months + {1, 10}.Days ▷ 2.Days>) is two sets of time instants, [1/1/2003, 1/2/2003] and [1/10/2003, 1/11/2003].
The function Pti maps a time instant and a periodic time to a time interval, provided the time instant is an element of the interval and the interval is one of the intervals of the periodic time. The formal definition is as follows.
Definition 2.6 (Function Pti). Let t be a time instant and pt = <[begin, end], P> a periodic time. Pti(t, pt) is determined as follows:
• Pti(t, pt) = [tb, te], where [tb, te] ∈ Π(P) and t ∈ [tb, te];
• otherwise it is not defined.
The function Ovp is used to determine whether the time lines of two periodic times overlap. It is formally defined as follows.
Definition 2.7 (Function Ovp). Let pt and pt′ be two periodic times. Ovp(pt, pt′) is determined as follows:
• Ovp(pt, pt′) = true, if there exists a time instant t with t ∈ Sol(pt) and t ∈ Sol(pt′);
• Ovp(pt, pt′) = false, if there does not exist a time instant t with t ∈ Sol(pt) and t ∈ Sol(pt′).
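The small sketch below illustrates a membership test in the spirit of Sol for the example periodic time <[1/1/2003, 2/1/2003], all.Months + {1, 10}.Days ▷ 2.Days>; the flat representation (start days per month, duration in days) is an assumption made only for illustration, not the paper's encoding.

```python
# Hypothetical sketch: is a date inside Sol(<[BEGIN, END], all.Months + {1,10}.Days |> 2.Days>)?
from datetime import date, timedelta

START_DAYS = (1, 10)        # O_n = {1, 10} on the Days calendar
DURATION_DAYS = 2           # r.Cd = 2.Days
BEGIN, END = date(2003, 1, 1), date(2003, 2, 1)

def in_sol(t):
    """True if instant t lies in one of the periodic intervals and within [BEGIN, END]."""
    if not (BEGIN <= t <= END):
        return False
    for day in START_DAYS:
        start = date(t.year, t.month, day)
        if start <= t < start + timedelta(days=DURATION_DAYS):
            return True
    return False

print(in_sol(date(2003, 1, 10)))   # True  (second interval of January)
print(in_sol(date(2003, 1, 15)))   # False
```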
3 UC-RBAC Model
In this section we present the UC-RBAC model. We first make an assumption about the system behavior of a user's close-session operation. Next, some basic definitions are given. After that we present the formal semantics of the model. At last, the administrative aspects of the model are discussed.
3.1 Assumption
We assume that a close-session operation implies a series of deactivations of the roles that are still active at the end of the session. The assumption is reasonable and it will help in discussing our model.
3.2 Definitions
We first discuss how to express the number of times. Intuitively it could be represented by the natural numbers ℕ; however, 0 is also needed. Below comes the definition.
Definition 3.1 (Number of Times). Number of Times, denoted NT, is the set of numbers such that for any element nt, nt ∈ ℕ ∪ {0}.
To describe users' requests to activate and deactivate roles we introduce user requests, defined as follows.
Definition 3.2 (User Request, User Request Sequence). A user request is of the form (operation, user, role, session) where operation ∈ {create_session, close_session, activate, deactivate}, user ∈ Users, role ∈ Roles and session ∈ Sessions. A user request sequence, denoted URS, is an infinite sequence whose t-th element, denoted URS(t), is the set of user requests occurring at time t.
There are 4 kinds of user request: users may request to (de)activate a role in a session and to create or close a session. Closing a session implies deactivating the roles that are still active. When the request is create_session or close_session, the role field of the request is ignored. User requests bring about role events, formally defined as follows.
Definition 3.3 (Role Event). A role event is of the form (activate, user, role, session) or (deactivate, user, role, session) where user ∈ Users, role ∈ Roles, session ∈ Sessions, (user, role) ∈ UA and user(session) = user.
Users can activate roles assigned to them in their sessions and deactivate the ones they activated in the same sessions. For example, user u starts a session s. During the session, he activates roles r1 and r2, then deactivates r1 and closes s. The event list corresponding to this description is:
(activate, u, r1, s) (activate, u, r2, s) (deactivate, u, r1, s) (deactivate, u, r2, s)
Note that (deactivate, u, r2, s) is not explicitly requested by u; his close-session operation brings about this role event. In the remainder of this paper, we do not distinguish such implicit role events from other role events.
We introduce role tickets to express constraints on the number of times users can play the roles assigned to them. A role ticket limits the maximum number of times a user may exercise a role during periods of time. For the user it is just like receiving a ticket when being assigned a role, with a constraint on the number of times the role can be used; the "ticket" has its usage limits in terms of number of times and valid time periods. It is formally defined below. A role ticket can be associated with every period or with the whole periodic time.
Definition 3.4 (Role Ticket, Role Ticket Set). A role ticket is of the form (I, T, U, R, A), where I is a periodic time, T ∈ NT, U ∈ Users ∪ {All}, R ∈ Roles, (U, R) ∈ UA when U ≠ All, and A ∈ {All, Each} indicates whether the constraint is set on every period of the periodic time or on the whole periodic time. A role ticket set, denoted RTS, is a set of role tickets.
The element All in U indicates that the role ticket applies to all users assigned to the role; that is, the sum of all users' numbers of times playing the role cannot exceed a predefined number. Figure 1 shows some examples of role tickets. RT1 is associated with the role Ra and assigned to Ua; it indicates that Ua can play Ra no more than once every first day of a month during the year 2003. If the role ticket is changed to RT2, it means that Ua can exercise Ra no more than once over all the first days of months during the year 2003.
(RT1) (<[1/1/2003, 12/31/2003], all.Months + {1}.Days>, 1, Ua, Ra, Each)
(RT2) (<[1/1/2003, 12/31/2003], all.Months + {1}.Days>, 1, Ua, Ra, All)
(RT3) (<[1/1/2003, 12/31/2003], all.Months + {1}.Days>, 1, All, Ra, All)
(RT4) (<[1/1/2003, 12/31/2003], all.Months + {1}.Days>, 0, Ua, Ra, Each)
Fig. 1. Examples of role tickets
In RT3, the user field is All, which means the numbers of times of playing Ra sum up to no more than 1 over all the first days of months in the year 2003. The T field of RT4, however, is 0, so this role ticket will block any activation request from Ua. In other words, our model does not yet support blocking a user's activation of roles only during certain periodic times: in order for Ua to be able to activate Ra, such a role ticket must be removed explicitly.
3.3 Formal Semantics
The dynamics of role events, successful role activations and the status of role activation are depicted as a sequence of snapshots. A snapshot models the current set of role events, successful role activations and activated roles of users. For convenience of notation, we define three sequences, RES (Role Event Sequence), US (Usage Sequence) and RAS (Role Activation State), respectively.
Definition 3.6 (Role Event Sequence, Usage Sequence, Role Activation State). For all integers t ≥ 0,
1. The role event sequence, denoted RES, is an infinite sequence whose t-th element, denoted RES(t), is the set of role events that occur at time t.
2. The usage sequence, denoted US, is an infinite sequence whose t-th element, denoted US(t), is the set of activation role events that occur at time t and succeed in activating their roles.
3. The role activation state, denoted RAS, is an infinite sequence whose t-th element, denoted RAS(t), is the set of tuples (user, role, session) where user ∈ Users, role ∈ Roles, session ∈ Sessions, and the role activated by the user is active at time t.
Next, we introduce a function UC to compute the count of usage of roles by users during a periodic time.
Definition 3.5 (Function UC). Let u be a user or All, r a role, pt = <[begin, end], P> a periodic time, and us a usage sequence. The value of UC(u, r, pt, us) is determined as follows:
• UC(u, r, pt, us) is the count of all role events (activate, u, r, session) in all us(t) with t ∈ Sol(pt), when u ≠ All;
• UC(u, r, pt, us) is the count of all role events (activate, u′, r, session) in all us(t) with t ∈ Sol(pt) and u′ ∈ Users, when u = All.
Definition 3.6 (System State). Let rts be a role ticket set. A system state is of the form <res, us, ras>, where res is a role event sequence, us is a usage sequence and ras is a role activation sequence. res and ras should satisfy the following constraints, for all t ≥ 0:
1. if (activate, r, u, s) ∈ res(t) and there does not exist a role ticket (I, T, U, R, A) ∈ rts with R = r and U = u, then (u, r, s) ∈ ras(t+1);
2. if (activate, r, u, s) ∈ res(t), there exists a role ticket (I, T, U, R, A) ∈ rts and there does not exist a role ticket (I′, 0, U, R, A′) ∈ rts, where T > 0, R = r, U = u or U = All, t ∈ Sol(I), A = All, and UC(U, r, I, us) < T, then (u, r, s) ∈ ras(t+1);
3. if (activate, r, u, s) ∈ res(t), there exists a role ticket (I, T, U, R, A) ∈ rts and there does not exist a role ticket (I′, 0, U, R, A′) ∈ rts, where T > 0, R = r, U = u or U = All, t ∈ Sol(I), A = Each, and UC(U, r, <Pti(t, I), P>, us) < T, then (u, r, s) ∈ ras(t+1).
Definition 3.7 (Execution Model). A system state <res, us, ras> is an execution model of a role ticket set rts and a user request sequence urs where ras(0) = Ø and rts satisfies the condition that for any two role tickets (I, T, U, R, A), (I′, T′, U′, R′, A′) ∈ rts with U = U′, R = R′, T ≠ 0 and T′ ≠ 0, Ovp(I, I′) = false. Moreover, urs and res should satisfy the following constraints, for all t ≥ 0:
1. (activate, user, role, session) ∈ res(t) iff (activate, user, role, session) ∈ urs(t);
2. if (deactivate, user, role, session) ∈ urs(t), then (deactivate, user, role, session) ∈ res(t);
3. if (close_session, user, rolex, session) ∈ urs(t), then (deactivate, user, roles(session), session) ∈ res(t) (rolex can be any valid value);
4. if (deactivate, user, role, session) ∈ res(t), then (deactivate, user, role, session) ∈ urs(t), or (close_session, user, rolex, session) ∈ urs(t) and role ∈ roles(session) (rolex can be any valid value).
The constraints in the definition demonstrate the relationship between user requests and role events: role events only come from user requests. Figure 2 shows an example of the execution model. The elements of the sequences are associated with a time stamp that can be expressed as an integer (note that we have assumed Day as the finest granularity of time). Role events come from user requests. The first role event (activate, Ua, Ra, S1) leads to the activation of Ra. However, the second (activate, Ua, Ra, S1) is blocked, because Ua's usage of Ra has reached the limit set by RT1. The role event (activate, Ua, Rb, S1) is blocked by RT2. The role activation sequence changes according to the results of the role events.
Role ticket set
(RT1) (<[1/1/2003, 12/31/2003], all.Months ▷ 1.Days>, 1, Ua, Ra, Each)
(RT2) (<[1/1/2003, 12/31/2003], all.Months ▷ 1.Days>, 0, Ua, Rb, Each)

Time stamp   User request sequence
1/2/2003     {(create_session, Ua, X, S1)}
1/3/2003     {(activate, Ua, Ra, S1), (activate, Ua, Rb, S1)}
1/4/2003     {(activate, Ua, Ra, S1)}
1/5/2003     {(close_session, Ua, X, S1)}

Time stamp   Role event sequence
1/3/2003     {(activate, Ua, Ra, S1), (activate, Ua, Rb, S1)}
1/4/2003     {(activate, Ua, Ra, S1)}
1/5/2003     {(deactivate, Ua, Ra, S1)}

Time stamp   Usage sequence
1/3/2003     {(activate, Ua, Ra, S1)}

Time stamp: 1/2/2003  1/3/2003  1/5/2003
Role activation sequence: {(Ua, Ra, S1)}

Fig. 2. Example of execution state
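A minimal sketch of the activation check behind this example is shown below: an activation succeeds only while the usage count UC for the applicable role ticket is below its limit T. The period key is a stand-in for the Pti/Sol machinery and all names are illustrative.

```python
# Hypothetical sketch of the role-ticket activation check used in Fig. 2.
from collections import namedtuple

RT = namedtuple("RT", "limit user role scope")           # periodic time omitted in this sketch
counts = {}                                              # (counted_user, role, period_key) -> UC

def uc(user, role, key):
    return counts.get((user, role, key), 0)

def try_activate(ticket, user, role, key):
    if ticket.role != role or ticket.user not in (user, "All"):
        return True                                      # no applicable ticket: allowed
    who = "All" if ticket.user == "All" else user
    if uc(who, role, key) >= ticket.limit:
        return False                                     # limit reached: activation blocked
    counts[(who, role, key)] = uc(who, role, key) + 1    # record the successful usage
    return True

rt1 = RT(limit=1, user="Ua", role="Ra", scope="Each")
print(try_activate(rt1, "Ua", "Ra", "2003-01"))          # True: first activation in the period
print(try_activate(rt1, "Ua", "Ra", "2003-01"))          # False: RT1 allows only one
```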
3.4 Administration of the Role Ticket Set
The administrative operations on the role ticket set mainly include adding a role ticket and removing a role ticket. First, we introduce two functions, AddRT and RmRT.
Definition 3.8 (Function AddRT). Given a role ticket set rts and a role ticket rt = (I, T, U, R, A), AddRT(rts, rt) = rts′, where rts′ is determined as follows:
• rts′ = rts ∪ {rt} if, when T ≠ 0, there does not exist a role ticket rt′ = (I′, T′, U, R, A′) ∈ rts with T′ ≠ 0 and Ovp(I, I′) = true, and, when T = 0, there does not exist a role ticket (I′, 0, U, R, A′) ∈ rts (where I′ can be any periodic time and A′ any valid value);
• rts′ = rts otherwise.
Definition 3.9 (Function RmRT). Given a role ticket set rts and a role ticket rt = (I, T, U, R, A), RmRT(rts, rt) = rts′, where rts′ is determined as follows:
• rts′ = rts if rt ∉ rts;
• rts′ = rts − {rt} otherwise.
Function AddRT adds a new role ticket to rts if there does not exist a role ticket whose user and role are the same as the new one's and whose time line overlaps the new one's. Function RmRT removes the desired role ticket. With the definition of the role ticket set and the add/remove role ticket functions, we obtain the following result.
Theorem 3.1. For a role ticket set rts, if there do not exist two role tickets (I, T, U, R, A) and (I′, T′, U, R, A′) with T ≠ 0, T′ ≠ 0 and Ovp(I, I′) = true, then after any number of add role ticket and remove role ticket operations there still do not exist two such role tickets.
Since the role ticket set of an execution model is initialized to be empty, this theorem tells us that at any time instant there is no more than one role ticket that applies to a role event in the execution model.
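The following sketch illustrates AddRT and RmRT; tickets are simplified to named tuples and Ovp is passed in as a predicate, since periodic times are not modelled here. All names are assumptions for illustration.

```python
# Hypothetical sketch of AddRT / RmRT (Definitions 3.8 and 3.9).
from collections import namedtuple

RT = namedtuple("RT", "period limit user role scope")

def add_rt(rts, rt, ovp):
    """Return rts with rt added, unless a conflicting ticket for the same (user, role) exists."""
    for other in rts:
        if (other.user, other.role) != (rt.user, rt.role):
            continue
        if rt.limit != 0 and other.limit != 0 and ovp(rt.period, other.period):
            return rts                       # overlapping non-zero tickets: reject
        if rt.limit == 0 and other.limit == 0:
            return rts                       # a blocking (T = 0) ticket already exists: reject
    return rts | {rt}

def rm_rt(rts, rt):
    return rts - {rt} if rt in rts else rts

rts = frozenset()
rt1 = RT("first day of each month in 2003", 1, "Ua", "Ra", "Each")
rts = add_rt(rts, rt1, ovp=lambda p, q: p == q)
print(len(rts), len(add_rt(rts, rt1, ovp=lambda p, q: p == q)))   # 1 1  (duplicate rejected)
```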
4 Conclusions In this paper we have presented UC-RBAC model, an extended RBAC model. The main innovative feature of the model is role ticket, constraints on the maximum number of times a user can play a role. Role tickets are defined during periods of time. The formal definition and semantics are introduced. Limiting usage of roles is a common security requirement. Our work makes it possible to model such requirement under RBAC framework. The idea of limiting usage of privileges is a general one. It can be incorporated into other security models and other aspects of RBAC such as RA and role delegation. Another possible direction of the future work is to increase the granularity of the constraints.
References
1. R. Sandhu, D. Ferraiolo, and D. Kuhn, The NIST model for role-based access control: towards a unified standard, In Proceedings of the Fifth ACM Workshop on Role-Based Access Control (Phoenix, AZ), p. 47–63, 2000
2. Gregory Tassey, Michael P. Gallaher, et al., Economic Impact Assessment of NIST's Role-Based Access Control (RBAC) Program, http://www.nist.gov/director/prog-ofc/report02-1.pdf, 2003
3. American National Standard for Information Technology – Role Based Access Control, http://csrc.nist.gov/rbac/rbac-std-ncits.pdf, 2003
4. Elisa Bertino, Piero Andrea Bonatti, Elena Ferrari, TRBAC: A temporal role-based access control model, ACM Transactions on Information and System Security (TISSEC), v.4 n.3, p. 191–233, August 2001
5. E. Barka and R. Sandhu, A role-based delegation model and some extensions, In Proceedings of the 23rd National Information Systems Security Conference (Baltimore, Md.), p. 16–19, Oct. 2000
6. Trent Jaeger, On the increasing importance of constraints, In Proceedings of the Fourth ACM Workshop on Role-Based Access Control, p. 33–42, October 28–29, 1999
7. D. Ferraiolo and D. Kuhn, Role based access control, In Proceedings of the 15th Annual Conference on National (USA) Computer Security (Gaithersburg, MD), p. 554–563, 1992
8. R. Sandhu, Edward J. Coyne, Hal L. Feinstein, Charles E. Youman, Role-Based Access Control Models, Computer, v.29 n.2, p. 38–47, February 1996
9. R. Sandhu, Role hierarchies and constraints for lattice-based access controls, In Proceedings of the Fourth European Symposium on Research in Computer Security (ESORICS 96, Rome, Italy, Sept.), E. Bertino, Ed., Springer-Verlag, New York, NY, 1996
10. L. Giuri and P. Iglio, A formal model for role-based access control with constraints, In Proceedings of the 9th IEEE Workshop on Computer Security Foundations (Kenmare, Ireland, June), IEEE Press, Piscataway, NJ, p. 136–145, 1996
11. Gail-Joon Ahn, R. Sandhu, Role-based authorization constraints specification, ACM Transactions on Information and System Security (TISSEC), v.3 n.4, p. 207–226, Nov. 2000
12. Elisa Bertino, Piero Andrea Bonatti, Elena Ferrari, TRBAC: A temporal role-based access control model, ACM Transactions on Information and System Security (TISSEC), v.4 n.3, p. 191–233, August 2001
13. Elisa Bertino, Claudio Bettini, Elena Ferrari, Pierangela Samarati, An access control model supporting periodicity constraints and temporal reasoning, ACM Transactions on Database Systems (TODS), v.23 n.3, p. 231–285, Sept. 1998
14. M. Niezette and J. Stevenne, An efficient symbolic representation of periodic time, In Proc. First International Conference on Information and Knowledge Management, 1992
15. Marshall Abrams, et al., Generalized Framework for Access Control: Towards Prototyping the ORGCON Policy, Proceedings of the 14th National Computing Security Conference, p. 257–266, 1991
16. Jaehong Park and R. Sandhu, Towards Usage Control Models: Beyond Traditional Access Control, Proc. of the 7th ACM Symposium on Access Control Models and Technologies, 2002
17. J. Park, R. Sandhu, Originator Control in Usage Control, 3rd International Workshop on Policies for Distributed Systems and Networks (POLICY'02), June 05–07, p. 60–68, 2002
18. Jason Crampton, Specifying and enforcing constraints in role-based access control, Proceedings of the Eighth ACM Symposium on Access Control Models and Technologies, p. 43–50, 2003
19. Gustaf Neumann, Mark Strembeck, An approach to engineer and enforce context constraints in an RBAC environment, Proceedings of the Eighth ACM Symposium on Access Control Models and Technologies, p. 65–79, 2003
20. James B. D. Joshi, Basit Shafiq, Arif Ghafoor, Elisa Bertino, Dependencies and separation of duty constraints in GTRBAC, Proceedings of the Eighth ACM Symposium on Access Control Models and Technologies, p. 51–64, 2003
(Virtually) Free Randomization Techniques for Elliptic Curve Cryptography Mathieu Ciet1 and Marc Joye2 1 UCL Crypto Group Place du Levant 3, 1348 Louvain-la-Neuve, Belgium [email protected] – http://www.dice.ucl.ac.be/crypto/ 2 Gemplus, Card Security Group La Vigie, Avenue du Jujubier, ZI Ath´elia IV, 13705 La Ciotat Cedex, France [email protected] – http://www.geocities.com/MarcJoye/ http://www.gemplus.com/smart/
Abstract. Randomization techniques play an important role in the protection of cryptosystems against implementation attacks. This paper studies the case of elliptic curve cryptography and propose three novel randomization methods, for the elliptic curve point multiplication, which do not impact the overall performance. Our first method, dedicated to elliptic curves over prime fields, combines the advantages of two previously known solutions: randomized projective coordinates and randomized isomorphisms. It is a generic point randomization and can be related to a certain multiplier randomization technique. Our second method introduces new elliptic curve models that are valid for all (non-supersingular) elliptic curves over binary fields. This allows to use randomized elliptic curve isomorphisms, which in turn allows to randomly compute on elliptic curves with affine coordinates. Our third method adapts a double ladder attributed to Shamir. We insist that all our randomization methods share the common feature to be free: the cost of our randomized implementations is virtually the same as the cost of the corresponding non-randomized implementations. Keywords: Randomization, elliptic curve cryptography, implementation attacks, side-channel analysis, elliptic curve models, point multiplication algorithms.
1
Introduction
The celebrated RSA cryptosystem is the most widely deployed cryptosystem, but things are beginning to change. More and more applications propose to use the elliptic curve digital signature algorithm (ECDSA) to sign digital documents or messages. Elliptic curve cryptography bases its security on the hardness of computing discrete logarithms. More precisely, the elliptic curve discrete logarithm problem (ECDLP) consists in recovering the value of the multiplier k, given points P and Q = [k]P on an elliptic curve. There are two main families of elliptic curves used
in cryptography [1]: elliptic curves over large prime fields and non-supersingular elliptic curves defined over binary fields. Although an elliptic curve cryptosystem may be mathematically sound and meet standard security requirements, it may totally succumb to implementation attacks. A powerful implementation attack, due to Kocher et al. [16,17], monitors certain side-channel information (e.g., running time or power consumption) during the course of a crypto-algorithm and thereby tries to deduce some sensitive data. For example, the double-and-add algorithm (Fig. 1-a) — i.e., the additive analogue of the so-called square-and-multiply algorithm — used for computing Q = [k]P does not behave regularly. This is even more true for elliptic curves as the classical formulæ for point doubling and point addition are different. To thwart simple power analysis (SPA) [17] (i.e., side-channel leakage from a single power trace), this algorithm is usually replaced with the 'double-and-add always' algorithm [6] (Fig. 1-b). Throughout this paper, we will use this latter algorithm to benchmark our randomization methods (as well as NAF based variants; cf. Appendix A).

Input: P, k = (1, k_{ℓ−2}, ..., k_0)_2; Output: Q = [k]P
R0 ← P
for i = ℓ−2 down to 0 do
  R0 ← [2]R0
  if (ki = 1) then R0 ← R0 + P
endfor
return R0
Input: P, k = (1, k_{ℓ−2}, ..., k_0)_2; Output: Q = [k]P
R0 ← P
for i = ℓ−2 down to 0 do
  R0 ← [2]R0
  b ← ¬ki; Rb ← Rb + P
endfor
return R0
(a) Double-and-add algorithm
(b) ‘Double-and-add always’ algorithm
Fig. 1. Binary point multiplication algorithms
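A minimal sketch of the 'double-and-add always' ladder of Fig. 1-b is given below; plain integers stand in for curve points, so "add" and "double" replace the elliptic-curve operations and every name is illustrative.

```python
# Hypothetical sketch of Fig. 1-b with integers standing in for points.
def double_and_add_always(k, P, add, dbl):
    """One doubling and one (possibly dummy) addition per bit, as in Fig. 1-b."""
    bits = bin(k)[2:]                 # k = (1, k_{l-2}, ..., k_0)_2
    R = [P, P]                        # R[0] holds the result, R[1] absorbs dummy additions
    for bit in bits[1:]:
        R[0] = dbl(R[0])
        b = 1 - int(bit)              # b = not k_i
        R[b] = add(R[b], P)
    return R[0]

# With integers, point addition is + and doubling is *2, so [k]P is just k*P.
k, P = 0b101101, 7
print(double_and_add_always(k, P, add=lambda a, b: a + b, dbl=lambda a: 2 * a) == k * P)  # True
```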
Resistance against SPA does not imply resistance against the more sophisticated differential power analysis (DPA) [17]. In [6], Coron explains how to mount a DPA-type attack against the 'double-and-add always' algorithm. At step i, this attack requires forming two sets of points: the first set comprises the points Pj such that Γ([∑_{t=i}^{ℓ−1} k_t 2^{t−i}]Pj) = 0 and the second set the points Pj such that Γ([∑_{t=i}^{ℓ−1} k_t 2^{t−i}]Pj) = 1, where Γ(P) denotes a Boolean selection function (e.g., the value of any specific bit in the binary representation of P). To avoid this attack, one has to prevent the attacker from forming the two sets. This can be achieved by randomizing point P or multiplier k; or better, as recently exemplified by Goubin [10], by randomizing both P and k. In the last four years, several randomization methods have been proposed (e.g., [6,14]). This paper proposes further randomization methods that all have in common to be (virtually) free, leading to performances surpassing those of
prior art. It is organized as follows. In the next section, we deal with elliptic curves over large prime fields. We propose a generic and free method for point randomization and compare it with a previous multiplier randomization. In Section 3, we introduce new models for elliptic curves over binary fields. Based on these, we propose free point randomization methods allowing to work in affine coordinates and so answer a problem left open in [14]. In Section 4, we present a regular variant of Shamir’s double ladder. Our variant allows to construct a free multiplier randomization method. Finally, we conclude in Section 5.
2
Point Randomization over Large Prime Fields
This section deals with point randomization techniques for elliptic curves defined over large prime fields. The case of elliptic curves over binary fields is treated in Section 3. Let Fp be a (large) prime field with p > 3. An elliptic curve over Fp is given by the points (x, y) ∈ Fp × Fp satisfying the Weierstraß equation E/Fp : y² = x³ + ax + b
(1)
along with point O at infinity. 2.1
Previous Work
For preventing DPA-type attacks, Coron [6] suggests to represent the base point P = (x, y) ∈ E \ {O} by an equivalent projective representation P* := (r²x, r³y, r) — where r is randomly chosen in F×p — and to compute Q* := [k]P* = (X*k, Y*k, Z*k) in Jacobian coordinates. The result of the point multiplication, Q = [k]P, is then obtained as Q = (X*k/(Z*k)², Y*k/(Z*k)³) if Z*k ≠ 0 and Q = O otherwise. The same technique applies if P* is represented with homogeneous coordinates instead of Jacobian coordinates. We refer the reader to [6] for detail. Another efficient means for randomizing base point P, proposed by Joye and Tymen [14], consists in working with isomorphic curves. All elliptic curves defined by the Weierstraß equations

    E^(u)/Fp : y² = x³ + u⁴ax + u⁶b    with u ∈ F×p

are isomorphic to the initial elliptic curve given by Eq. (1). So the evaluation of Q = [k]P can be carried out by picking a random r ∈ F×p, computing Q* := [k]P* = (x*k, y*k) on E* := E^(r) where P* = (r²x, r³y), and finally obtaining Q = (r⁻²x*k, r⁻³y*k). This technique naturally extends to projective coordinates [14]. If we compare the two methods, depending on the implementation, both have advantages. For efficiency reasons, point multiplications on elliptic curves over
large prime fields are done using Jacobian coordinates [5] and curve parameter a is suggested to be selected as a = −3 [1]. The first method — randomized projective representations — allows to keep the value a = −3. The second method — randomized isomorphic elliptic curves — allows, in commonly used point multiplication algorithms, to simplify the addition formulæ by taking the Z-coordinate of base point P equal to 1. Assuming that Q = [k]P is computed with the 'double-and-add always' algorithm, the performances of the two methods are summarized in Table 1. The costs of pre- and post-computations are neglected. The bit-length of k is denoted by |k|₂.

Table 1. Number of multiplications (in Fp) for computing Q = [k]P in Jacobian coordinates on an elliptic curve with parameter a = −3

Method                             | 'double-and-add always' (Fig. 1-b) | NAF-based variants¹: simple | NAF-based variants¹: HM ([12])
No randomization                   | 19 · |k|₂                          | 17½ · |k|₂                  | 15 · |k|₂
Randomized representations ([6])   | 24 · |k|₂                          | 20 · |k|₂                   | 17⁷⁄₉ · |k|₂
Randomized EC isomorphisms ([14])  | 21 · |k|₂                          | 20½ · |k|₂                  | 17²⁄₉ · |k|₂

2.2
New Method: 2P ∗
We now present a new randomization method, applicable to most left-to-right point multiplication algorithms, that combines the advantages of the two aforementioned methods: the value of parameter a and the Z-coordinate of base point P are unchanged. Previously known solutions randomize the input base point P as P* := Υ(P) and compute [k]P*, from which the value of Q := [k]P is derived. Our idea is fairly simple yet very efficient: instead of randomizing P, we randomize [2]P, by choosing the method of randomized projective coordinates for the function Υ. This allows to keep the Z-coordinate of P equal to 1 throughout the point multiplication algorithm. Figure 2 depicts a slight modification of the basic 'double-and-add always' algorithm (Fig. 1-b) including our randomization method. The NAF based variants (Appendix A) can be adapted similarly. If Υ denotes the randomized projective representation method ([6]), then we need 19 · |k|₂ field multiplications for evaluating Q = [k]P with our modified algorithm of Fig. 2 and 17½ · |k|₂ (resp. 15 · |k|₂) with the corresponding adaptation of the NAF based variants, on an elliptic curve with parameter a = −3. In other words, as shown in Table 1, these algorithms have the same complexity as their deterministic (i.e., non-randomized) counterparts. Compared to the state of the art, this translates into a speedup factor of ≈ 10% for the 'double-and-add always' algorithm and of ≈ 13% for the NAF based variants.
1
The NAF based variants are described in Appendix A.
Input: P, k = (1, k_{ℓ−2}, ..., k_0)_2; Output: Q = [k]P
P* ← Υ(P)   [base-point randomization]
R0 ← [2]P*
for i = ℓ−2 down to 1 do
  b ← ¬ki; Rb ← Rb + P
  R0 ← [2]R0
endfor
b ← ¬k0; Rb ← Rb + P
return Υ⁻¹(R0)
Fig. 2. Randomized algorithm 2P*
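The toy sketch below illustrates the randomization Υ used around the ladder of Fig. 2: the point is lifted to randomized Jacobian coordinates (r²x, r³y, r), operated on there, and mapped back. The curve, point and variable names are small illustrative values, and only a single doubling is shown; the ladder itself would proceed exactly as in the figure.

```python
# Hypothetical sketch of Upsilon / Upsilon^{-1} with randomized Jacobian coordinates.
import secrets

p, a = 97, 2                      # toy curve y^2 = x^3 + 2x + 3 over F_97
P = (3, 6)                        # 6^2 = 3^3 + 2*3 + 3 (mod 97)

def upsilon(pt):
    r = secrets.randbelow(p - 1) + 1
    x, y = pt
    return (x * r * r % p, y * r * r * r % p, r)          # randomized Jacobian representative

def upsilon_inv(pt):
    X, Y, Z = pt
    zi = pow(Z, -1, p)
    return (X * zi * zi % p, Y * zi * zi * zi % p)

def jacobian_double(pt):
    X, Y, Z = pt
    S = 4 * X * Y * Y % p
    M = (3 * X * X + a * Z ** 4) % p
    X3 = (M * M - 2 * S) % p
    Y3 = (M * (S - X3) - 8 * Y ** 4) % p
    return (X3, Y3, 2 * Y * Z % p)

def affine_double(pt):
    x, y = pt
    lam = (3 * x * x + a) * pow(2 * y, -1, p) % p
    x3 = (lam * lam - 2 * x) % p
    return (x3, (lam * (x - x3) - y) % p)

print(upsilon_inv(jacobian_double(upsilon(P))) == affine_double(P))   # True
```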
It is also worth noting that our randomization technique is generic in the sense that it applies to numerous point multiplication algorithms. 2.3
Interpretation
In our case, the randomization of base-point P can nicely be related to randomization techniques of multiplier k in the computation of Q = [k]P . This pushes a step further previous observations made by Okeya and Sakurai in [21]. Let E denote an elliptic curve over Fp with #E points. Instead of computing Q := [k]P directly, Coron suggests in [6] to pick a short random number r (typically r is 32-bit integer) and then compute Q in a random way as k∗ := k + r · #E
and Q = [k ∗ ]P .
In order to optimize modular arithmetic, elliptic curves recommended in the cryptographic standards are defined over a prime field Fp where p is a generalized Mersenne prime, that is, a prime of the form p = 2^ℓ ± 2^m ± 1 where m is relatively small. As a result, since from Hasse's theorem we have |#E − p − 1| ≤ 2√p, it follows that the binary representation of #E is likely to be a '1' followed by a long run of '0's. For example, in hexadecimal, the elliptic curve "secp160k1" from [2, Section 2.4] has #E = 01 00000000 00000000 0001B8FA 16DFAB9A CA16B6B3₁₆ points. The randomized multiplier, k*, then typically looks like

    k* := k + r · #E = (r)₂ ‖ k_{ℓ−1} ⋯ k_{ℓ−t} ‖ (some bits) ,

where the trailing bits are denoted α.
Observe that the t most significant bits of multiplier k appear in clear. If [k*]P is evaluated with the 'double-and-add always' algorithm then, letting k* = r·2^ℓ + ⌊k/2^{ℓ−t}⌋·2^{ℓ−t} + α, we first compute P1 := [r]P, and continue with ⌊k/2^{ℓ−t}⌋·2^{ℓ−t} + α as the multiplier.
Remarking that with the ‘double-and-add always’ algorithm, (true/dummy) point additions are always performed with point P (not P1 ), our randomized algorithm 2P ∗ (Fig. 2) can be seen, in the previous example, as a variation of the randomized multiplier method where [2]P ∗ plays the role of P1 , for the leading bits of k.
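A minimal sketch of the multiplier randomization recalled above is shown below: k is replaced by k* = k + r·#E for a short random r, so that [k*]P = [k]P. The curve order is the one quoted above for "secp160k1"; the multiplier value and function names are illustrative.

```python
# Hypothetical sketch of the randomized-multiplier countermeasure k* = k + r * #E.
import secrets

def randomize_multiplier(k, curve_order):
    r = secrets.randbits(32)          # short random value, as suggested in [6]
    return k + r * curve_order

n = 0x01_00000000_00000000_0001B8FA_16DFAB9A_CA16B6B3   # #E of "secp160k1"
k = 0x1234567
k_star = randomize_multiplier(k, n)
print(k_star % n == k % n)            # True: both multipliers yield the same point [k]P
```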
3
Point Randomization over Binary Fields
3.1
Previous Work
The Weierstraß equation for non-supersingular elliptic curves over F2m is given by E/F2m : y 2 + xy = x3 + ax2 + b
(∪{O}) .
(2)
The use of randomized projective representations ([6]) for preventing DPAtype attacks is not restricted to elliptic curves over prime fields and equally apply to elliptic curves over binary fields. On the contrary, the method of randomized isomorphisms does not apply for elliptic curves over binary fields because the x-coordinate of a point is invariant through isomorphism, as noticed in [14]. This is most unfortunate because, over F2m , affine coordinates lead to better performances [7].2 The next section explains how to overcome this limitation without performance penalty. 3.2
New Representation
Rather than considering the short Weierstraß equation (Eq. (2)), we consider elliptic curves given by the extended model /F m : y 2 + xy + y = x3 + Ax2 + Bx + C E 2
(∪{O})
(3)
with , A, B, C ∈ F2m . As shown in the next proposition, this model is as general as the classical Weierstraß model. (given by Eq. (2) and Eq. (3), Proposition 1. The elliptic curves E and E respectively) are isomorphic over F2m if and only if there exists σ ∈ F2m such that A = a + . B = 2 + σ 2 3 2 C =b+ a+ +σ Furthermore, the isomorphism ∼ ϕ : E −→ E, 2
O −→ O . (x, y) −→ (x + , y + σ)
(4)
In [11], the authors suggest to use projective rather than affine coordinates. This comes from the ratio of inversion to multiplication. In [11] this ratio is roughly 10 to 1 whereas in [7] it is roughly 3 to 1. For hardware architectures affine coordinates are more suitable.
354
M. Ciet and M. Joye
Proof. This is an application of [18, Theorem 2.2].
\ {O}. The inverse of P1 is −P1 = Let P1 = (x1 , y1 ) and P2 = (x2 , y2 ) ∈ E (x1 , x1 + y1 + ). If P1 = −P2 then P1 + P2 = (x3 , y3 ) where x3 = λ2 + λ + A + x1 + x2 and y3 = (x1 + x3 )λ + x3 + y1 + y1 +y2 if x1 = x2 , with λ = x1 +x2 y1 +2 +B otherwise . x1 + + x1 + Neglecting (field) additions (i.e., xors), the addition formulæ on our extended model only requires an additional squaring for the computation of 2 , compared to the formulæ in classical Weierstraß model [1, § A.10]. If the value of 2 is precomputed or if normal bases [9] are used, its cost can be neglected too. Consequently, the computation of Q = [k](x, y) can be carried as follows: 1. 2. 3. 4.
Randomly choose , σ ∈ F2m ; Form P ∗ = (x + , y + σ); Compute Q∗ := [k]P ∗ on E; ∗ If Q = O output O else Q = (x∗k , yk∗ ) and output Q = (x∗k + , yk∗ + σ).
A better way for eliminating the additional cost due to the computation of 2 , valid in all cases, is to replace the extended model of Eq. (3) by the corresponding quartic form. This is achieved by replacing (x, y) with (x, y + x2 ). Doing so, we obtain an elliptic curve, isomorphic to Eq. (3), given by the equation 2 4 2 Q E /F2m : y + xy + y = x + (A + )x + Bx + C .
(5)
Q \ {O} is given The sum of two points P1 = (x1 , y1 ) and P2 = (x2 , y2 ) ∈ E by x3 = λ2 + λ + A + x1 + x2 and y3 = (x1 + x3 )(λ + x1 + x3 ) + x3 + y1 + 2 x1 + x2 + xy11 +y if x1 = x2 , +x2 . with λ = y1 +B otherwise . x1 + These formulæ only involve 1 squaring, 2 multiplies and 1 inversion to add or double points, as for the classical Weierstraß model. Neglecting the cost of (field) additions, the computation of Q = [k](x, y) can thus be evaluated in a random way and without penalty as: 1. 2. 3. 4.
Randomly choose , σ ∈ F2m ; Form P ∗ = (x + , y + σ + x2 + 2 ); Q ; Compute Q∗ := [k]P ∗ on E ∗ If Q = O output O else Q = (x∗k , yk∗ ) and output Q = (x∗k + , yk∗ + σ + (x∗k )2 + 2 ).
(Virtually) Free Randomization Techniques for Elliptic Curve Cryptography
4
355
Multiplier Randomization
A very natural way [4] to randomize multiplier k consists in choosing a random integer r of the size of k and to compute Q := [k]P as Q = [k − r]P + [r]P . Another possibility is to write k as k = k/rr + (k mod r) for a random r. Letting S := [r]P , we can obtain Q = [k]P as Q = [k1 ]P + [k2 ]S
(6)
where k1 := k mod r and k2 := k/r. The randomized splitting of k is generally disregarded as it appears to double the running time: two point multiplications have to be computed instead of one. However, as noted by Shamir (see [8]), if one has to evaluate y := g k hd in a group G, the intermediate values g k and hd are not needed [25]. The next figure describes a regular variant of Shamir’s double ladder, using additive notations and where G is the group of points of an elliptic curve. We let denote the bit-length of max(k, d) —and thus k−1 and/or d−1 are equal to 1. Input: P , k = (k−1 , k−2 , . . . , k0 )2 , S, d = (d−1 , d−2 , . . . , d0 )2 Output: Q = [k]P + [d]S R1 ← P ; R2 ← S; R3 ← P + S; c ← 2d−1 + k−1 ; R0 ← Rc for i = − 2 down to 0 do R0 ← [2]R0 b ← ¬(ki ∨ di ); c ← 2di + ki ; Rb ← Rb + Rc endfor return R0 Fig. 3. Regular variant of Shamir’s double ladder
Applied to the evaluation of Eq. (6), we see that this variant only requires one point doubling and one point addition per bit, that is, exactly the same cost as the ‘double-and-add always’ algorithm. The NAF based variants (Appendix A) can be adapted along the same lines.
5
Conclusion
This paper dealt with randomization techniques for elliptic curve cryptography; three free novel methods were presented: – randomized algorithm 2P ∗ ; – randomized isomorphisms in affine coordinates; – randomized algorithm based on Shamir’s ladder. Furthermore, we gave an original interpretation of certain point randomization techniques in terms of multiplier randomizations. We also introduced new models for elliptic curves over binary fields.
356
M. Ciet and M. Joye
Acknowledgements. Part of this work was done while the first author was visiting Gemplus. Thanks go to David Naccache, Philippe Proust and JeanJacques Quisquater for making this arrangement possible.
References 1. IEEE Std 1363-2000. IEEE Standard Specifications for Public-Key Cryptography. IEEE Computer Society, August 29, 2000. 2. SECG: Standard for Efficient Cryptography Group. SEC 1: Elliptic Curve Cryptography. Certicom Research, Version 1.0, September 20, 2000. Available at URL http://www.secg.org/secg docs.htm. 3. Ian Blake, Gadiel Seroussi, and Nigel Smart. Elliptic Curves in Cryptography, volume 265 of London Mathematical Society. Cambridge University Press, 2000. 4. Christophe Clavier and Marc Joye. Universal exponentiaion algorithm. In C ¸ .K. Ko¸c, D. Naccache, and C. Paar, editors, Cryptographic Hardware and Embedded Systems – CHES 2001, volume 2162 of Lecture Notes in Computer Science, pages 300–308. Springer-Verlag, 2001. 5. Henri Cohen, Atsuko Miyaji, and Takatoshi Ono. Efficient elliptic curve using mixed coordinates. In K. Ohta and D. Pei, editors, Advances in Cryptology - ASIACRYPT ’98, volume 1514 of Lecture Notes in Computer Science, pages 51–65. Springer-Verlag, 1998. 6. Jean-S´ebastien Coron. Resistance against differential power analysis for elliptic curve cryptosystems. In C ¸ .K. Ko¸c and C. Paar, editors, Cryptographic Hardware and Embedded Systems (CHES ’99), volume 1717 of Lecture Notes in Computer Science, pages 292–302. Springer-Verlag-Verlag, 1999. 7. Erik De Win, Serge Mister, Bart Preneel, and Michael Wiener. On the performance of signature schemes based on elliptic curves. In J.-P. Buhler, editor, Algorithmic Number Theory Symposium, volume 1423 of Lecture Notes in Computer Science, pages 252–266. Springer-Verlag-Verlag, 1998. 8. Taher ElGamal. A public key cryptosystem and a signature scheme based on discrete logarithms. IEEE Transactions on Information Theory, 31(4):469–472, 1985. 9. Shuhong Gao and Hendrik W. Lenstra, Jr. Optimal normal bases. Designs, Codes and Cryptography, 2:315–323, 1992. 10. Louis Goubin. A Refined Power-Analysis Attack on Elliptic Curve Cryptosystems. In Y. Desmedt, editor, Public Key Cryptography (PKC 2003), volume 2567 of Lecture Notes in Computer Science, pages 199–210. Springer-Verlag, 2003. 11. Darrel Hankerson, Julio L´ opez Hernandez, and Alfred Menezes. Software implementation of elliptic curve cryptography over binary fields. In C ¸ .K. Ko¸c and C. Paar, editors, Cryptographic Hardware and Embedded Systems – CHES 2000, volume 1965 of Lecture Notes in Computer Science, pages 1–24. Springer-Verlag, 2000. 12. Yvonne Hitchcock and Paul Montague. A new elliptic curve scalar multiplication algorithm to resist simple power analysis. In L.M. Batten and J. Seberry, editors, Information Security and Privacy (ACISP 2002), volume 2384 of Lecture Notes in Computer Science, pages 214–225. Springer-Verlag, 2002. 13. Kouichi Itoh, Jun Yajima, Masahiko Takenaka, and Naoya Torii. DPA countermeasures by improving the window method. In B.S. Kaliski Jr., C ¸ .K. Ko¸c, and C. Paar, editors, Cryptographic Hardware and Embedded Systems – CHES 2002, volume 2523 of Lecture Notes in Computer Science, pages 303–317. Springer-Verlag, 2003.
(Virtually) Free Randomization Techniques for Elliptic Curve Cryptography
357
14. Marc Joye and Christophe Tymen. Protections against differential analysis for elliptic curve cryptography: An algebraic approach. In C ¸ .K. Ko¸c, D. Naccache, and C. Paar, editors, Cryptographic Hardware and Embedded Systems (CHES 2001), volume 2162 of Lecture Notes in Computer Science, pages 377–390. SpringerVerlag-Verlag, 2001. 15. Neal Koblitz. CM-curves with good cryptographic properties. In J. Feigenbaum, editor, Advances in Cryptology – CRYPTO ’91, volume 576 of Lecture Notes in Computer Science, pages 279–287. Springer-Verlag, 1992. 16. Paul Kocher. Timing attacks on implementations of Diffie-Hellman, RSA, DSS, and other systems. In N. Koblitz, editor, Advances in Cryptology – CRYPTO ’96, volume 1109 of Lecture Notes in Computer Science, pages 104–113. Springer-Verlag, 1996. 17. Paul Kocher, Joshua Jaffe, and Benjamin Jun. Differential power analysis. In M. Wiener, editor, Advances in Cryptology – CRYPTO ’99, volume 1666 of Lecture Notes in Computer Science, pages 388–397. Springer-Verlag, 1999. 18. Alfred J. Menezes. Elliptic curve public key cryptosystems. Kluwer Academic Publishers, 1993. 19. Fran¸cois Morain and Jørge Olivos. Speeding up the computations on an elliptic curve using addition-subtraction chains. Inform. Theor. Appl., 24:531–543, 1990. 20. Katsuyuki Okeya, Kunihiko Miyazaki, and Kouichi Sakurai. A fast scalar multiplication method with randomized projective coordinates on a Montgomery-form elliptic curve secure against side channel attacks. In K. Kim, editor, Information and Communications Security, volume 2288 of Lecture Notes in Computer Science, pages 428-439. Springer-Verlag, 2002. 21. Katsuyuki Okeya and Kouichi Sakurai. Power analysis breaks elliptic curve cryptosystems even secure against the timing attack. In B.K. Roy and E. Okamoto, editors, Progress in Cryptology – INDOCRYPT 2000, volume 1977 of Lecture Notes in Computer Science, pages 178–190. Springer-Verlag, 2000. 22. Richard Schroeppel, Hilarie Orman, Sean W. O’Malley, and Oliver Spatscheck. Fast key exchange with elliptic curve systems. In D. Coppersmith, editor, Advances in Cryptography – CRYPTO ’95, volume 963 of Lecture Notes in Computer Science, pages 43–56. Springer-Verlag, 1995. 23. Jerome A. Solinas. An improved algorithm for arithmetic on a family of elliptic curves. In B.S. Kaliski Jr., editor, Advances in Cryptology – CRYPTO ’97, volume 1294 of Lecture Notes in Computer Science, pages 357–371. Springer-Verlag, 1997. 24. Jerome A. Solinas. Efficient arithmetic on Koblitz curves. Designs, Codes and Cryptography, 19:195–249, 2000. 25. Jerome A. Solinas. Low-weight binary representations for pairs of integers. Technical Report CORR 2001-41, CACR, Waterloo, 2001. Available at URL http://www.cacr.math.uwaterloo.ca/˜techreports/2001/corr2001-41.ps.
A
NAF-Based Regular Point Multiplication Algorithms
The computation of the inverse of a point P = (x, y) on an elliptic curve is free. So, the m-ary point multiplication algorithms for computing Q = [k]P can be speeded up by using a signed representation for k. In particular, for m = 2, a non-adjacent form (NAF) representation —that is, representing k as k = i=0 κi 2i with κi ∈ {−1, 0, 1} and κi · κi−1 = 0, ∀i— gives rise to a speedup factor of ≈ 11% [19].
358
M. Ciet and M. Joye
At first glance, NAFs do not seem to help in reducing the complexity of the ‘square-and-multiply always’ algorithm. However, the non-adjacency property, κi · κi−1 = 0, can be exploited by scanning two digits per iteration. We consider the following three cases and the corresponding operations to be performed (point doublings and point additions/subtractions are respectively denoted by D and A and underlined symbols represent dummy operations): DDAD; − (κi , κi−1 ) = (0, 0): − (κi , κi−1 ) = (0, ±1): D D A D ; − (κi , κi−1 ) = (±1, 0): D D A D . The cases (κi , κi−1 ) = (±1, ±1) and (κi , κi−1 ) = (±1, ∓1) never occur. The resulting algorithm is depicted on the next figure. Function sign(·) returns the sign of an integer (i.e., if a ≥ 0 then sign(a) = 0 and sign(a) = 1 if a < 0). Input: P , k = (1, κ−1 , . . . , κ0 )NAF Output: Q = [k]P R0 ← P ; i ← − 1 while (i ≥ 1) do h ← |κi |; Rh ← [2]Rh ; R0 ← [2]R0 b ← ¬|κi + κi−1 |; s ← ¬ sign(κi + κi−1 ) Rs ← −Rs ; Rb ← Rb + P ; Rs ← −Rs h ← ¬h; Rh ← [2]Rh i←i−2 endwhile h ← |i|; Rh ← [2]Rh b ← h ∨ ¬|κ0 |; s ← ¬ sign(κ0 ) Rs ← −Rs ; Rb ← Rb + P ; Rs ← −Rs return R0 Fig. 4. Simple NAF-based variant of the ‘double-and-add always’ algorithm
This algorithm is highly regular: at each iteration, there are two point doublings followed by a point addition and a point doubling, whatever the values of scanned digits. The cost per digit is 32 point doublings and 12 point addition; this has to be compared to the 1 point doubling and 1 point addition of the ‘double-and-add always’ algorithm. In Jacobian coordinates, a point doubling costs 8 multiplies when parameter a = −3 and 10 multiplies in the general case whereas a point addition costs 11 multiplies, provided that the Z-coordinate of P is set to 1 and 16 multiplies in the general case. Therefore, the algorithm of Fig. 4 is up to ≈ 8% faster with the same memory requirements (and ≈ 17% faster with randomized representations; see Table 1). A more involved algorithm using similar ideas was proposed by Hitchcock and Montague [12]. It basically corresponds to
(Virtually) Free Randomization Techniques for Elliptic Curve Cryptography
359
− (κi , κi−1 ) = (0, 0): DDA; − (κi , κi−1 ) = (0, ±1): D D A ; DDA. − (κi ) = (±1): 5 According to [12], the expected cost per digit is 10 9 point doublings and 9 point addition. The corresponding number of field multiplications for computing [k]P is listed in Table 1. As presented in [12], a ‘SPA-resistant NAF formatting’ algorithm is needed prior to the computation of Q = [k]P . We give hereafter a variant that does not require a prior recoding.
Input: P , k = (1, κ−1 , . . . , κ0 )NAF Output: Q = [k]P R0 ← P ; i = − 1 while (i ≥ 1) do h ← |κi |; Rh ← [2]Rh ; R0 ← [2]R0 b ← ¬|κi + κi−1 |; s ← ¬ sign(κi + κi−1 ) Rs ← −Rs ; Rb ← Rb + P ; Rs ← −Rs i ← i − 1 − ¬h endwhile h ← |i|; Rh ← [2]Rh b ← h ∨ ¬|κ0 |; s ← ¬ sign(κ0 ) Rs ← −Rs ; Rb ← Rb + P ; Rs ← −Rs return R0
Fig. 5. Modified Hitchcock-Montague algorithm (without recoding algorithm)
There is an important class of elliptic curves, which consists of the so-called anomalous binary curves (ABC for short) first proposed by Koblitz [15]. An ABC curve over F2n is given by the Weierstraß equation E/F2m : y 2 + xy = x3 + ax2 + 1
with a ∈ F2 .
Let τ denote the Frobenius endomorphism, τ (x, y) := (x2 , y 2 ). In [22,23, i 24], methods are proposed to decompose an integer k as k = i κi τ with κi ∈ {−1, 0, 1} and κi · κi−1 = 0, and the double-and-add algorithm is replaced by a τ -and-add algorithm, where τ application consists in two squarings. This method is particularly useful when optimal normal bases are used for representing elements F2m , see [9]. In that case, an adaptation of the simple NAF-based algorithm (Fig. 4) is more advantageous than the corresponding adaptation of the Hitchcock-Montague algorithm (Fig. 5) since, neglecting τ applications, the (expected) cost per digit amounts to 12 point addition vs. 59 point addition.
An Optimized Multi-bits Blind Watermarking Scheme* Xiaoqiang Li, Xiangyang Xue, and Wei Li Department of Computer Science and Engineering, Fudan University, Shanghai 200433, China [email protected],
Abstract. This paper presents a new multi-bits watermarking scheme in DCT domain based on a chaotic Direct Sequence Spread Spectrum (DSSS) communication system, which is combined with error correcting codes (ECC) and Human Visual System (HVS) model in spatial domain. To extract the hidden watermark from a possibly corrupted watermarked image without error, we model watermarking as a digital communication problem and apply BCH channel coding and shuffling. To ensure optimal adaptive DCT watermark, we also demonstrate how to optimally embed a watermark given the constraints imposed by the mask in the spatial domain. The robustness of the algorithm has been tested with StirMark 4.0. Without the original image during the decoding process, the algorithm allows for the recovery of 64 bits of information in a 256h256 graylevel image after a significant JPEG compression and other common signal processing attack.
1
Introduction
The World Wide Web, digital networks and multimedia afford virtually unprecedented opportunities to pirate copyrighted material. Consequently, the idea of using a robust digital watermark to detect and trace copyright violation has therefore stimulated significant interest among artists and publishers. In order for a watermark to be useful it must be robust to a variety of possible attacks by pirates. These include robustness against compression such as JPEG, scaling and aspect ratio changes, rotation, cropping, row and column removal, addition of noise, filtering, cryptographic and statistical attacks, as well as insertion of other watermarks. A discussion of possible attacks is given in [1]. In this paper however, we consider only attacks do not change geometry of the image. Our aim is to construct a robust multi-bits DCT domain watermark which takes into account the properties of the human visual system (HVS) and resist attack such as JPEG compression. Much work has been done in the now relatively mature field of DCT domain watermarking. The most recent work involves sophisticated masking models incorporating brightness, frequency and contrast which have been used in combination with an embedding into 8h8 DCT blocks. With few exceptions, the *
This work was supported in part by NSF of China under contract number 60003017, China 863 Projects under contract numbers 2001AA114120 and 2002AA103065.
S. Qing, D. Gollmann, and J. Zhou (Eds.): ICICS 2003, LNCS 2836, pp. 360–369, 2003. © Springer-Verlag Berlin Heidelberg 2003
An Optimized Multi-bits Blind Watermarking Scheme
361
work in watermarking has involved a one-bit watermark. That is, at the detection a binary decision is made as to the presence of the watermark most often using hypothesis testing [2]. It is detected by correlating the known watermark sequence with either the extracted watermark or a transformed version of the watermarked signal itself (if the original host signal is not available for extraction). If the correlation factor is above a given threshold then the watermarking is detected. Barni [3] encodes roughly 10 bits by embedding 1 watermark from a set of 1000 into the DCT domain. The recovered watermark is the one which yields the best detector response. In practice however, many more applications are possible when the watermark length is of the order 60 bits since this allows for a unique identifier specifying the owner and buyer of an images as well as possibly indicating the type of content in the image. Such schemes are much flexible, but the problem is more challenging. To extract the hidden watermark without errors or with an acceptable low error rate, much effort has been made. Hernandez et al. modeled the watermarking process as a communication system and analyzed the performance of the watermarking process in terms of error probability when the watermarked image are corrupted by additive noise, cropping, and linear filtering [4]. Huang et al. [5] also present a robust information bit-hiding algorithm in discrete cosine transform (DCT) domain. Both two algorithms can afford multi-bits watermarking, but they need original image during extraction process. In this paper we first make a further investigation of Hartung’s scheme in [6], then we modify their scheme and present a new multi-bits watermarking without use original image during extraction process. To balance the trade-off between the capacity and robustness, we select 27 AC coefficients in each DCT block to embed watermark. Watermarks are modulated by chaotic sequences for precise detection and security purpose. To improve the robustness we adopt BCH channel code and shuffling algorithm to encode the watermark information. We also propose a novel way to optimally embed a watermark in DCT domain given the constraints imposed by the mask in the spatial domain, which ensure the watermark is been unperceptive and improve the robustness of the watermark. With the proposed algorithm, we embed a 64 bits length watermark into a 256h256 graylevel image. The robustness of our algorithm has been tested with StirMark 4.0. The experimental results demonstrate that the embedded watermark is perceptually invisible and our scheme performs well resisting common signal processing procedures such as Gaussian noise disturbing, scaling change, Gaussian filter, and JPEG compression with quality factor as low as 20. This paper is organized as follows. We introduce chaotic spread-spectrum system and model watermarking as a digital communication in section 2. Then in Section 3 we present new multi-bits watermarking scheme in DCT domain based on a chaotic Direct Sequence Spread Spectrum (DSSS) communication system, which is combined with errors correct codes (ECC) and Human Visual System (HVS) model in spatial domain. In Section 4, we demonstrate how to optimally embed a watermark given the constraints imposed by the mask in the spatial domain. The experimental results with the StirMark 4.0 on various images and the drawn conclusion are given in Section 5 and 6, respectively.
362
X. Li, X. Xue, and W. Li
2 Watermarking Based on Chaotic Spread-Spectrum We model the watermarking procedure as digital communication problem, as shown in Fig.1. In spread spectrum communication, a narrow band signal is transmitted over a much larger bandwidth such that the signal energy present in any single frequency is undetectable. Similarly the watermark bits are spread by a large factor called chip-rate so that it is imperceptible. We modify the method given in [6] for watermark insertion and extraction, using chaotic spread-spectrum technique. Note that for the sake of brevity, the block diagram of information extraction modeling is not shown. BCH
Watermark
Cover Image
8×8 DCT
Spread with ChipRate Embedding/ modulator
IDCT
Watermarked Image
HVS Fig. 1. Watermark embedded framework.
Chaos is a deterministic, random-like process found in non-linear, dynamical system, which is non-period, non-converging and bounded. Moreover, it has a very sensitive dependence upon its initial condition and parameter. Chaotic signal can be used in communication. A chaotic map is a discrete-time dynamical system x k + 1 = f ( x k ),
0 < x k < 1,
k = 0 ,1 , 2 ,...
(1)
running in chaotic state. The chaotic sequence {xk : k = 0,1,2.,..} can be used as spread-spectrum sequence in place of PN sequence in conventional DSSS communication system. Chaotic sequences are uncorrelated when their initial values are different, so in chaotic spread-spectrum systems, a user corresponds to an initial value. Contrasted to PN sequences, chaotic sequences have following advantages: 1) is non-period, non-converging, has an analogy to random process; 2) has a very sensitive dependence upon its initial condition and parameter; 3) easily been produced with initial value and an iterative equation; 4) due to the nonlinear behavior, decoding the chaotic map without prior information is extremely difficult. This allows a chaotic sequence to have better security, and a lower probability of detection and interference. The main contribution of chaotic sequence is enhancing the security of watermarking. In our algorithm, we use chaotic sequence produced by Hybrid chaotic dynamic system equation, because this chaotic sequence performance well on auto-correlation and correlation-restrain. Hybrid chaotic dynamic system equation is defined as follow:
An Optimized Multi-bits Blind Watermarking Scheme
1 − 2x2 1 1 − × (−2x)1.2 y= 2 1 − 2x − (2x −1)0.7
− 1 ≤ x < 0.5 − 0.5 ≤ x < 0 0 ≤ x ≤ 0.5 0.5 < x ≤ 1
363
(2)
After get the chaotic state trajectories according to initial value with equation (3), then select a threshold value to transform the trajectories into bipolar value sequence including –1 and 1, we can get chaotic sequence at last. Chaotic sequence highly depends on the selection of secret key. Without the secret key in detection, even if the embedding process is totally transparent to attacker, then can only detect the encrypted watermark data that are incomprehensible.
3 Watermarking Scheme 3.1 Watermark Generation Let N be the total number of DCT coefficients used to embed the watermark in an image. Let rc be the chip-rate used to spread the information bits. Then a total of N/rc information bits could be embedded in the image. Let {aj} be the sequence of information bits that has to be embedded into the image. This discrete signal is spread by a large factor, that is the chip-rate rc , to obtain the spread sequence {bi}: bi = a j , where j ⋅ rc ≤ i < ( j + 1) ⋅ rc (3) The purpose of spreading is to add redundancy by embedding one bit of information into rc DCT coefficients of the image. The spread sequence {bi} is then modulated by a chaotic sequence {pi} generated by method described in section 2, where pi {-1,1}. pi serves for frequency spreading. The modulated signal is scaled with a scaling factor a: wi = α ⋅ bi ⋅ pi (4) Where wi is the spread spectrum watermark, which is also a sequence with size equal to the selected DCT coefficients of image. Error correcting code (ECC). As mentioned above, watermarking can be viewed as a communication problem. Therefore, the detected signals at the receiving end may have some bit errors. To ensure the robustness of the watermarking, we encode the information by using error-correcting code. In the experiments of this research, we use the BCH codes because there is an ample selection of block lengths and code rates. Let W, W = {wi, 0 i L}, denote the watermark with length of L. We apply the BCH code (n, r, t), where n is the length of the codeword, k is the length of message and t is the number of bit errors that can be corrected, to the W and obtain a bit stream to hide L −1 r
L X = X i = xt ; xt ∈ {−1,1},0 ≤ t < ⋅ n r i =0
(5)
364
X. Li, X. Xue, and W. Li
While ECC can correct some error and improve the robustness of watermark, it needs many redundant bits. Considering the capacity and robustness, in experiment, we have used BCH codes (31,16,3). Shuffling. Note that interleaving coding technique is known in the communication theory as an effective way to combat bursts of errors. It is expected that interleaving techniques can be used to improve the robustness of the algorithm against the large size cropping [7]. In our algorithm, we use shuffling replace the interleaving. There are two advantages we use shuffling in our scheme. First, the performance of shuffling is prior to interleaving. Second, shuffling also enhances security since the shuffling table or a key for generating the table is needed to correctly extract the hidden data. A key k=(k0, k1) is chosen by the copyright owner, where k0 is an arbitrary integer, and k1 is an integer within the interval [N/3, 2N/3] and is prime to N. Define f (i) = (k 0 + k1 ) mod N i = 0,1,..., N (6) Clearly, a one-to-one mapping between i and f(i) exists. In extraction procedure, we can derive i from f(i) by using above algorithm on f(i). Define i = ( f (i ) − k 0 ) × k 2 mod N (7) Where k2 satisfies the equation (8) k 2 × k1 = 1 mod N (8) 3.2 Watermark Insertion To embed the watermark, the host image f (x, y) is split into a set of nonoverlapping blocks of 8h8, denoted by fk (x’, y’), 0 x’,y’ 8, k = 0, 1, …, K 1, where the subscript k denotes the index of blocks, the K the total number of blocks. Performing DCT on each fk (x’, y’), we obtain the DCT coefficients of each block, Fk (u, v). To embed the data, the DCT coefficients are modified as follows:
F (u , v) + a ⋅ x n , Fk (u , v) ∈ Rk Fk’ (u, v) = k ohterwise Fk (u , v)
(9)
Where a is scaling factor, may be different from different color channel; Rk denotes a subset of all the DCT coefficients in the kth block, i.e.,
Rk ⊂ {Fk (u , v ),0 ≤ u, v < 8} The size of Rk is denoted by l. To embed signals in the host image as strongly as possible, we vary a according to different characteristics of host image. Based on the perceptual model described in [8], scaling factor a should be small for those image contains mainly smooth regions, and large for the images with high texture complexity. Many people have researched on how to use these coefficients to embed data. Cox et al. suggested that hidden data should be placed in those perceptually significant components [9]. Specifically, they embedded data in the low-frequency coefficients. Others suggested using mid-frequency. Huang et al. claimed more robustness could be achieved if watermarks are embedded in DC components since DC components have much large perceptual capacity that any AC component [5]. In our scheme, we select 27 low-frequency AC DCT coefficients in each block for
An Optimized Multi-bits Blind Watermarking Scheme
365
embedding signals considering the invisibility, capacity and robustness requirements of watermarks. Performing inverse DCT on the image modified in the DCT domain, we can obtain stego-image f‘(x, y). In inverse DCT, the watermark is truncated or modulated in the spatial domain in order to satisfy masking constraints. The problem with these approaches is that spatial domain truncation or modulation leads inevitably to the degradation of the watermark in the DCT domain. In section 4, we present a new frame to embed watermark adaptively to resolve this problem. 3.3 Watermark Extraction The watermark could be extracted without using the original, unwatermarked image by means of a correlation receiver. But the chaotic noise sequence {pi} is needed for watermark extraction. We first get the watermarked AC DCT coefficient as embedding process. Then the demodulation process is the multiplication of the watermarked image with the same chaotic-noise signal {pi} that was used for embedding. This is followed by summation over a window of length equal to the chip-rate, yielding the correlation sum sj for the jth information bit. The watermarked image v’i=vi+wi, where wi=ahbih pi. The statistical characteristics of AC DCT coefficients have been studied and its distribution tends to the Gaussian distribution [3]. So we can describe the extraction process theoretically as follows: sj =
( j +1 ) ⋅rc −1
∑ (p
i
⋅ v ’i )
i = j ⋅rc
sj ≈
( j +1)⋅rc −1
∑
i = j ⋅rc
( p i ⋅ vi ) +
( j +1)⋅rc −1
∑(p
2
i
i = j⋅rc
⋅ α ⋅ bi ) + ∆
(10)
s j = a j ⋅ rc ⋅ α
∆ = −(
( j +1 ) ⋅ rc −1
( j +1 ) ⋅rc −1
i = j ⋅ rc
i = j ⋅ rc
∑
pi ) ⋅ E (
∑ v’ ) i
sign( s j ) = sign(a j ⋅ rc ⋅α ) = sign(a j ) = a j
(11)
(12)
This is because rc>0, a>0. Thus the embedded bit can be retrieved without any loss. This means that the embedded information bit is 1 if the correlation is positive and –1 if it is negative. But since the AC DCT coefficients of image tend to Gaussian distribution inaccurately, there may be errors in the extracted watermark bits. We use BCH code to improve the robustness of watermarking.
4
Optimized Adaptive DCT Watermark
To best make a tradeoff between perceptual invisibility and robustness to compress and other common signal processing, many algorithms [3,5,6] only adjust scaling parameter in equation (4). In other words, they embed a watermark in the DCT
366
X. Li, X. Xue, and W. Li
domain and then truncate or modulate in the spatial domain in order to satisfy masking constraints. The problem with these approaches is that spatial domain truncation or modulation leads inevitably to the degradation of the watermark in the DCT domain. To resolve this question, we present a framework here which combined adjusting scaling parameter and mask in the spatial domain. We assume that we are given an image to be watermarked denoted I. We are also given a masking function V(I) which return two matrices of the same size of I containing the values ∆ pi , j and ∆ ni , j corresponding to the amount by which pixel Ii,j can be respectively increased and decreased without being noticed. We note that these are not necessarily the same since we also take into account truncation effects. That is pixels are integers in the range 0-255 consequently it is possible to have a pixel whose value is 1 which can be increased by a large amount, but can be decrease by at most 1. The function V can be a complex function of texture, luminance, contrast, frequency and patterns. In our scheme, we use HVS model presented in [8] to get masking function V(I) by calculating the Just Noticeable Distortion (JND) mask of image directly in the spatial domain. This algorithm contains three aspects: texture and edge analysis, edge separation and reclassification, luminance sensitivity analysis. Firstly, the original image is divided in blocks of 8[8 pixels. Then to compute JND matrix for each 8[8 pixels block as follow: V ( x, y ) = l ( x, y ) + dif ( x, y ) (13) Where l ( x, y) represents the additional noise threshold and dif ( x, y ) represents the basic noise threshold of the block it belongs to. At last, we can get masking function V(I) of original image. The central problem in this scheme is that during embedding we would like to increase or decrease the DCT coefficients as much as possible for maximum robustness, at the same time we must satisfy the constraints imposed by V in the spatial domain. In order to accomplish this, we defined optimization problem as follows: I i , j − ∆ ni , j ≤ I ’i , j ≤ I i , j + ∆ pi , j (14) To realize this aim, we design an adaptive algorithm as follows. Adaptive Algorithm Step: 1) Select the scaling parameter a0 as initial value to embed watermark and get watermarked image I0’. 2) Using equation (14) to modify watermarked image to satisfy masking constraints in the spatial domain, and denoted modified watermarked image I0’’. 3) Computer the peak signal-to-noise ratio (PSNR) value using equation (15), denoted as PSNR(I0’’) . PSNR( I 0 ’’) = 20 ⋅ log10 (
255
∑ [ I ( x, y ) − I
0
’’( x, y)] 2 N 2
)
(15)
4) Increase scaling parameter with iterative equation (16) and repeat step 1 using new scalar factor: ai+1 = ai + 1 i = 0,1,2... (16) 5) Repeat step 2 and 3 to computer PSNR(Ii+1’’) by replacing I0” with Ii+1’’.
An Optimized Multi-bits Blind Watermarking Scheme
367
6) If the absolute value of PSNR(Ii’’) PSNR(Ii+1’’) is less than 0.01, we consider Ii+1’’ as adaptive watermarked image, the algorithm is over; else, repeat step 4 and 5 until the absolute of PSNR(Ii’’) PSNR(Ii+1’’)<0.01 Fig.2 is an example using above algorithm with “Lean” image. When scaling factor is 7, watermark can be recoverd from watermarked image without bit error when the quality factor of compression is 50%. While, after we use equation (14) to modify the aforementioned watermarked image with scaling factor 7, watermark is extracted with BER 26.56% fo. If we use adaptive algorithm, watermark is extracted with BER 4.69%. Experiment show that our algorithm improves the robustness given the constraints imposed by the mask in the spatial domain. In other word, adaptive algorithm ensures the watermark invisible in the case of making the watermark more robust.
(a)
(b)
(c)
Fig.2. (a) Original image. (b) Watermarked image with scaling parameter 50. (c) Result of using adaptive algorithm to deal with watermarked image (b)
5
Experiment Results and Discussion
We have tested the proposed algorithm on images with various content complexity, e.g., “Lena,” “Peppers,” “F16” and “Baboon.” The experimental results with the “Lena” and “Baboon” images of 256h256 graylevel are shown in Figure 2 and Table 1 respectively. The former is considered a representative of less complicated images, while the latter is a representative of relatively more complicated images. In experiment, the scaling test was done by scaling the watermarked image down or up and rescaled back to its original size during watermark extracted process; the cropping test was done by cutting out the watermarked image borders and filled the cut section with black pixels during watermark extracted process. Based our new algorithm, Fig.2 (a) and (c) shows the original and the adaptive watermarked images, respectively. The peak signal-to-noise ratio (PSNR) of the watermarked image with respect to the original image is 34.72dB. The watermarks are perceptually invisible when we compare Fig.2 (a) and (c). Table 1 lists the various test function in StirMark4.0 and the tested results with our proposed algorithms applied to the “Baboon” image. The watermarks can be extracted with no error when the PSNR of the watermarked image corrupted by additive Gaussian noise is about
368
X. Li, X. Xue, and W. Li
26dB. For the “Baboon” image, the PSNR can as low as 16.25dB. It is seen that our algorithm can successfully resist 3h3 Gaussian filter, rescaling and JPEG compression with a quality factor of 20. The success achieved with our approach can be attributed to the error correction coding, the selection of AC DCT coefficient and adaptive algorithm for watermarking. Table 1. Test results with StirMark4.0 for “Baboon” image StirMark attack JPEG_50 JPEG_40 JPEG_30 JPEG_20 JPEG_10 NOISE_2 NOISE_4 NOISE_6 NOISE_8 CROP_95 CROP_90 CROP_85 RESC_200 RESC_150 RESC_90 RESC_75 RESC_50 RESC_200
BER (%) 0 0 0 0 10.94 0 0 0 15.63 0 0 3.13 0 0 0 0 0 0
StirMark attack GAUSSIAN_3_3_0.5 GAUSSIAN_3_3_1 GAUSSIAN_3_3_1.5 GAUSSIAN_3_3_2 MEDIAN_2 MEDIAN_3 MEDIAN_4 MEDIAN_5 ROTSCALE_0.25 ROTSCALE_-0.25 ROTSCALE_0.5 ROTSCALE_-0.5 ROTSCALE_0.75 ROTSCALE_-0.75 LATESTRNDDIST_0.95 LATESTRNDDIST_1.05 RNDDIST_0.95 RNDDIST_1.05
BER (%) 0 0 7.81 7.81 3.13 3.13 35.94 35.94 0 0 26.56 31.25 48.44 39.06 56.25 54.69 50.00 56.25
However, Table 1 also indicates the shortcoming of the presented algorithm. The algorithm failed in passing the test function that lost synchronization, such as cropping, rotation, and random bending. It failed severely with the general linear transformation, the large angle rotation, and the large size cropping. This is because without original image during extraction process, geometric distortion can’t be estimated from the two images and be inverted. It appears that spread-spectrum based watermarking techniques are robust against noise corruption and collusion attack, while vulnerable to errors in synchronization. This is a large challenge faced by the steganography/watermarking community, especially in blind watermarking scheme. Many researchers [10,11] have used the image invariant feature resisting geometric attack to resolve this problem and have acquired good experiment results.
6
Conclusion
In this article we have described a new algorithm for the embedding DCT watermarking in an optimal manner. Based on direct sequence spread spectrum, we use the channel coding (specifically, BCH codes) to improve the robustness. Chaotic sequence improves the security of the watermark. To ensure optimal adaptive DCT watermark, we present a new framework and demonstrate how to optimally embed a
An Optimized Multi-bits Blind Watermarking Scheme
369
watermark given the constraints imposed by the mask in the spatial domain. Experiment results show that our adaptive algorithm is more robust than traditional algorithm, and ensures the watermark invisible. Without the original image during the decoding process, the algorithm allows for the recovery of 64 bits of information in a 256h256 graylevel image after a significant JPEG compression and other common signal processing attack. The main shortcoming of our algorithm is that it failed resisting geometric attack. We plan address this problem in near future.
References 1.
F. A. P. Petitcolas and R. J. Anderson, “Attacks on copyright marking systems”, In second international information hiding workshop, pp. 219–239, USA, Apr. 1998 2. G. Voyatzis and and I. Pitas, “The use of watermarks in the protection of digital multimedia products”, Proceedings of the IEEE, 87(7), July 1999. 3. M. Barni, et al., “A DCT-domain System for Robust Image Watermarking,” Signal Processing, vol. 66,no. 3, pp.357–372, May 1998. 4. J. R. Hernandez, F. P. Gonzalez, et al., “Performance analysis of a 2-D-multipulse amplitude modulation scheme for data hiding and watermarking of still images,” IEEE J. Select. Areas Commun. Vol. 16, pp. 510–524, 1998. 5. J. Huang and Y. Q. Shi, “Reliable Information Bit Hiding,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 12, no. 10, pp. 916–920, Oct. 2002. 6. F. Hartung and B. Girod, “Watermarking of Uncompressed and Compressed Video.” Signal Processing, vol. 66, no. 3, May 1998. 7. F. Elmasry and Y. Q. Shi, “2-D interleaving for enhancing the robustness of watermarking signals embedded in still image,” in Proceedings of ICME2000, New York, 2000. 8. Z. Y. Du, Y. Zhou, and P. Z. Lu, “An Optimized Spatial Data Hiding Scheme Combined with Convolutional Codes and Hilbert Scan”, LNCS2532, PCM2002. 9. I. J. Cox, M. L. Miller and A. L. McKellips, “Watermarking as communications with side information,” Proc. IEEE, vol. 87, no. 7, pp.1062–1077, July 1999. 10. P. Bas, J. Chassery and B. Macq, “Geometrically Invariant Watermarking Using Feature Points,” IEEE Transactions on Image Processing, vol.11, no.9, Sep. 2002. 11. P. Dong, N. P. Galatsanos, “Affine Transform Resistant Watermarking Based on Image Normalization,” in Proceeding of ICIP 2002, New York, Sep. 2002.
A Compound Intrusion Detection Model Jianhua Sun, Hai Jin, Hao Chen, Qian Zhang, and Zongfen Han Internet and Cluster Computing Center Huazhong University of Science and Technology, Wuhan, 430074, China {jhsun, hjin}@hust.edu.cn
Abstract. Intrusion detection systems (IDSs) have become a critical part of security systems. The goal of an intrusion detection system is to identify intrusion effectively and accurately. However, the performance of misuse intrusion detection system (MIDS) or anomaly intrusion detection system (AIDS) is not satisfying. In this paper, we study the issue of building a compound intrusion detection model, which has the merits of MIDS and AIDS. To build this compound model, we propose an improved Bayesian decision theorem. The improved Bayesian decision theorem brings some profits to this model: to eliminate the flaws of a narrow definition for intrusion patterns, to extend the known intrusions patterns to novel intrusions patterns, to reduce risks that detecting intrusion brings to system and to offer a method to build a compound intrusion detection model that integrates MIDS with AIDS.
1
Introduction
Security of network systems is becoming increasingly important, and intrusion detection system (IDS) is a critical technology to help protect systems. There are two well-known kinds of intrusion detection systems: misuse intrusion detection system (MIDS) and anomaly intrusion detection system (AIDS). MIDS is efficient and accurate in detecting known intrusions, but cannot detect novel intrusions without unknown signature patterns. AIDS can detect both novel and known attacks, but false alarm rate is high. Hence, MIDS and AIDS are often used together to complement each other. Compound intrusion detection system (CIDS) comprises models of both the normal behavior of the system and the intrusive behavior of the intruder. This kind of model gives us an improved indication of the quality of the alarm, and thus in some sense the most ”advanced” detectors [1]. Early research [7] and [21] suggested that the two main systems ought to be combined to provide a complete intrusion detection system capable of detecting a wide array of different computer security violations [2]. In this paper, we study a compound intrusion detection model. Unlike single MIDS or AIDS, we use CIDS to achieve higher detection rate and lower false alarm rate. Using one simple similarity measure, MIDS and AIDS can work well independently. By applying an improved Bayesian decision theorem to our
This paper is supported by Key Nature Science Foundation of Hubei Province under grant 2001ABA001
S. Qing, D. Gollmann, and J. Zhou (Eds.): ICICS 2003, LNCS 2836, pp. 370–381, 2003. c Springer-Verlag Berlin Heidelberg 2003
A Compound Intrusion Detection Model
371
model, the combination of MIDS and AIDS is achieved. Furthermore, by applying this improved theorem, the model is sensitive to various kinds of false alarms and minimizes the risks incurred by these false decisions. The main contributions of this paper are as follows. • We propose a novel compound intrusion detection model that integrates MIDS with AIDS. • We improve the Bayesian decision theorem that suits real security environment to minimize the risks incurred by false decisions. The rest of this paper is organized as follows. In section 2, we discuss some research background. Section 3 explains the reasons to improve Bayesian decision theorem, and show how to build this compound intrusion detection model. In section 4, we evaluate our intrusion detection model using sequences databases. Section 5 ends with a conclusion and some discussion.
2
Research Background
Warrender et al. gives us a comparison of anomaly detection techniques and draws a conclusion that the short sequences of system calls are more important than the particular method of analysis [25]. Stephanie Forrest presents an approach for modeling normal sequences using look ahead pairs [5] and contiguous sequences [14]. Lane examines unlabeled data for anomaly detection by comparing the sequences of users’ actions during an intrusion to the users’ normal profile [16][17][18]. In Linux operating system, a program consists of a number of system calls, and different processes have different system calls sequences. Because the diversities of processes coding, there are difference in the order and the frequency of invocation of system calls [22]. So the speciality in the order and the frequency of system calls provides clear separation between different kinds of processes. Experiments in [5] show that short sequences of system calls of processes generate a stable signature for normal behaviors and the short range ordering of system calls appears to be remarkably consistent. This suggests a simple model of normal behaviors.
3
Intrusion Detection Based on Improved Bayesian Decision Theorem
The goal of an IDS is to detect and respond to an intrusion when it happens, and is able to keep security from being disturbed by the false alarms. False negative means that when an intrusion really happens, but IDS does not catch it. A false positive is a situation where an abnormity defined by the IDS happens, but it does not turn out to be a real intrusion. Hence low false negatives and low false positives are the goal of an IDS. General IDSs often ignore the risks of false negatives and false positives. To minimize the risks, we build an intrusion detection model based on an improved Bayesian decision theorem.
372
3.1
J. Sun et al.
Related Work
Decision-making is needed in many domains, which has relationship with misclassification cost and class membership probability. In order to make an optimal decision, misclassification cost and class membership probability should be estimated first. In recent years, there is a substantial amount of research on cost-sensitive issue. Gaffney et al. [6] provide an expected cost metric and demonstrate that, contrary to common device, the value of an intrusion detection system and the optimal operation of that system depend not only on the system’s ROC (Receiver Operating Characteristic) curve, but also on cost metrics and the probability of hostility of operating. MetaCost offers a general method for cost-sensitive learning [4], which is based on the assumption that costs are known in advance and are the same for all examples. Certainly previous research has been based on the assumption that misclassification costs are the same for all examples and known in advance, but in general these costs are example-dependent and different. For example, different mistakes in diagnosis can cause distinct risks. In medical treatment, false positives and false negatives also occur in physical examination. A positive test for AIDS or cancer, when the person is disease free, is a false positive. The person suffers psychally from the outcome that he has a disease when he actually does not. A false negative is when there actually is a disease but the results come back as negative. A finding of no cancer, when there actually is cancer, is a false negative. The patient will be devastated because he does not get the timely treatment that he needs. Obviously, the results produced by these two wrong diagnostic decisions lead to apparently different harm. These false results cannot be completely eliminated, but they can be reduced. Meanwhile, class membership probabilities need to be estimated. Class membership probabilities are example-specific and not known in advance [26]. Direct cost-sensitive decision-making is made on the assumption that any learned classifier can provide conditional probability estimates for training examples and can also provide conditional probability estimates for test examples [26]. Axelsson offers a set of data, that in diagnosis even though the test is 90% certain, the chance of actual having the disease is only 1/100, because the population of health people is much larger than the population with the disease [2]. It is the same with security field that the number of normal actions is much larger than that of intrusion actions, and the chances of a real intrusion are small, even if an intrusion decision is made. Hence, it is important to make an optimal decision to minimize the risks incurred by wrong decisions. Bayesian decision theorem is an alternative method. Traditional Bayesian decision method is applied to many fields of real life, for example, market decision to reduce management risk and to make a larger profit. This method considers the action that produces the highest expected utility or lowest expected risk to be the most appropriate [24]. However, in order to calculate the expected utilities or risk, the probability of class membership must be calculated first. This method takes into account not only the risk incurred by the actions, but also the probability of class membership which is not suit for the requirements
A Compound Intrusion Detection Model
373
of intrusion detection. There are some intrusions that occur rarely but cause large damage. So we remove the function of probability and get the improved Bayesian decision theorem to satisfy the requirement of security problem. We choose Internet Information Service (IIS) [11] as the attack target service, and believe that experiment on other services can have the similar result. We first construct two databases (details about construction can be found in section 4.1). One is called NSCS database that contains normal system calls sequences, and the other is called ISCS database that contains intrusion system calls sequences. Several kinds of intrusion were estimated to attack the IIS, and during the intrusive process system calls sequences were recorded by Strace for NT [8]. Tables 1–4 list the experiment results. We take Table 1 as example to illustrate these tables. In Table 1, the 11 fields form a system call sequence of length 11 in ISCS database, and the sequence is ”NtWaitForMultipleObjects→NtWaitForSingleOb ject→NtWaitForMultipleObjects→NtWaitForSingleObject→ NtWaitForSingleO bject→NtWaitForSingleObject→ NtWaitForSingleObject→NtWaitForMultipleO bjects→ NtWaitForSingleObject→NtWaitForSingleObject→ NtWaitForSingleOb ject”. The number in the first line in Table 1 is the total number of this sequence occurs in the experiment. In these four tables, the numbers range from 23293 to 12. Maybe some intrusions occur incidentally, but if signatures of these intrusions are matched, intrusion decision should be made. So to determine whether an intrusion occurs or not, we should remove the function of its probability. In order to meet with the special requirement of security, we improve Bayesian decision theorem, apply it to our model to consider the different losses caused by various kinds mistakes, and offer an outcome that minimizes losses and risks. 3.2
Improved Bayesian Decision Theorem
Decision is more commonly called action in decision-making domain. Particular actions is denoted by a, while a set of all possible actions under consideration is denoted by Φ. Φ is defined as: Φ = {a1 , a2 , ..., ac }
(1)
Each element in Φ incurs some loss, which is often the function of decision and state of nature. The decision table is used to denote the relationship. Table 5 is the general form of a decision table. In Table 5, wi is the ith state of nature, αj is the jth action, and λ (αj , wi ) is a risk function related to αj and wi . The quantity w, which affects the decision process, is commonly called the state of nature. In making decision it is important to consider what the possible state of nature is. The symbol Ω is used to denote the set of all possible states of nature. Then Ω = {w1 , w2 , ..., wc }. (2) In this model, c equals 2, w1 denotes normal, and w2 denotes intrusion. Accordingly a1 means that the sequence is normal and can be passed over, and
374
J. Sun et al. Table 2. Seqence Sample Two
Table 1. Seqence Sample One Number field 1 field 2 field 3 field 4 field 5 field 6 field 7 field 8 field 9 field 10 field 11
Number field 1 field 2 field 3 field 4 field 5 field 6 field 7 field 8 field 9 field 10 field 11
23293 NtWaitForMultipleObjects NtWaitForSingleObject NtWaitForMultipleObjects NtWaitForSingleObject NtWaitForSingleObject NtWaitForSingleObject NtWaitForSingleObject NtWaitForMultipleObjects NtWaitForSingleObject NtWaitForSingleObject NtWaitForSingleObject
Table 3. Seqence Sample Three Number field 1 field 2 field 3 field 4 field 5 field 6 field 7 field 8 field 9 field 10 field 11
12830 NtQueryDefaultLocale NtAllocateVirtualMemory NtQueryVirtualMemory NtAllocateVirtualMemory NtQueryVirtualMemory NtFreeVirtualMemory NtDeviceIoControlFile NtRemoveIoCompletion NtDeviceIoControlFile NtDeviceIoControlFile NtDeviceIoControlFile
Table 4. Seqence Sample Four
139 NtQueryInformationToken NtSetInformationThread NtOpenKey NtOpenKey NtWaitForSingleObject NtReleaseSemaphore NtPulseEvent NtQueryInformationToken NtSetInformationThread NtFsControlFile NtCreateFile
Number field 1 field 2 field 3 field 4 field 5 field 6 field 7 field 8 field 9 field 10 field 11
12 NtQueryAttributesFile NtCreateFile NtQueryVolumeInformationFile NtQueryInformationFile NtSetInformationThread NtQueryInformationToken NtSetInformationThread NtCreateFile NtSetInformationThread NtQueryInformationToken NtSetInformationThread
Table 5. General Form of a Decision Table α1 α2 ... αi ... αc
w1 λ (α1 , w1 ) λ (α2 , w1 ) ... λ (αi , w1 ) ... λ (αc , w1 )
w2 λ (α1 , w2 ) λ (α2 , w2 ) ... λ (αi , w2 ) ... λ (αc , w2 )
... ... ... ... ... ... ...
wj λ (α1 , wj ) λ (α2 , wj ) ... λ (αi , wj ) ... λ (αc , wj )
... ... ... ... ... ... ...
wc λ (α1 , wc ) λ (α2 , wc ) ... λ (αi , wc ) ... λ (αc , wc )
a2 means that a signal of intrusion is emitted and that an action responds to the signal. A random variable is denoted by X, and a particular realization of X is denoted by x. x = x1 , x2 , ..., xn (xi ∈system calls) and x1 , x2 , ..., xn means a sequence of system calls like x1 → x2 → ... → xn , such as fstat64 →mmap2 →read → ... →close→munmap→rt sigprocmask. Each x is classified into a normal sequences set or an intrusion sequences set.
A Compound Intrusion Detection Model
375
In decision theory, a key element is the risk function. If a particular action αi is taken and wj (i, j = 1, 2, ..., c) turns out to be the true state of nature, then a risk λ(αi, wj) is incurred. λ(α1, w2) is the risk incurred when the sequence is ignored but turns out to be an intrusion; λ(α2, w1) is the risk incurred when a signal of intrusion is emitted but the sequence turns out to be normal. λ(α1, w2) is normally much larger than λ(α2, w1). The traditional expected conditional risk R(αi | x) can be obtained from the following formula:

R(αi | x) = E[λ(αi, wj)] = Σ_{j=1}^{c} λ(αi, wj) P(wj | x),   i = 1, 2, ..., c   (3)
where P(wj | x) is the conditional probability of wj for a given x and can be obtained through Bayes' theorem:

P(wi | x) = p(x | wi) P(wi) / Σ_{j=1}^{c} p(x | wj) P(wj),   i = 1, ..., c   (4)
where the prior probabilities P(wi) are assumed known. Instead, we replace P(wj | x) in (3) by a similarity measure to obtain the improved Bayesian decision theorem. The similarity measure we use is similar to that of [16]; it differs in that we compare system call sequences, whereas [16] compares command sequences. The set of normal system call sequences is denoted by Ψ1, and the set of intrusion system call sequences by Ψ2. Once Ψ1 and Ψ2 are formed, we compare an incoming sequence to the sequences in Ψ1 and Ψ2 and calculate the similarity between the observed sequence and each of the two sets. If the two similarity values differ widely, we directly classify the sequence into Ψ1 or Ψ2; for example, if an observed sequence x has similarity 0.8 with Ψ1 and 0.2 with Ψ2, x is classified into Ψ1. Otherwise, if the two similarity values differ little, we use the Bayesian decision theorem to decide whether the sequence is normal or not. The similarity measure simply assigns a score equal to the number of identical tokens found in the same locations of the two sequences, giving a higher score to adjacent identical tokens than to separated identical tokens. We define the similarity of an observed sequence x to a set of sequences Ψi as:

Sim(x, Ψi) = max_{seq∈Ψi} {Sim(x, seq)},   i = 1, ..., c   (5)
and the sequence y most similar to x in Ψi is the one attaining this maximum:

Sim(x, y) = max_{seq∈Ψi} {Sim(x, seq)},   i = 1, ..., c   (6)
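The measure is not given in closed form here; a minimal Python sketch of one adjacency-weighted instantiation, modeled on the run-length scoring of [16] (the function names and the exact weighting are assumptions, not the authors' code), is:

```python
def sim_pair(x, y):
    """Positional similarity of two equal-length system-call sequences: a match at
    position i scores 1 plus the length of the current run of matches, so adjacent
    identical tokens earn more than separated ones."""
    score, run = 0, 0
    for a, b in zip(x, y):
        run = run + 1 if a == b else 0
        score += run
    return score

def sim_to_set(x, seq_set):
    """Sim(x, Psi_i) of Eq. (5): the best score of x against any stored sequence."""
    return max((sim_pair(x, s) for s in seq_set), default=0)
```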
The improved expected conditional risk R(αi | x) is obtained as in (7):

R(αi | x) = E[λ(αi, wj)] = Σ_{j=1}^{c} λ(αi, wj) Sim(x, Ψj),   i = 1, 2, ..., c   (7)
Intrusions belonging to the same intrusion category have identical or similar attack principles and intrusion techniques. Therefore they have identical or similar system call sequences and are significantly different from normal system call sequences. Most novel attacks are variants of known attacks, and the "signature" of known attacks can be sufficient to catch novel variants [9]. In experiments, it is easy to obtain complete normal traces; however, due to limited knowledge of known intrusions, we can only obtain the known intrusion traces. In order to detect novel intrusions, we use this similarity measure to extend the known intrusion traces to novel intrusion traces. Among R(α1 | x), R(α2 | x), ..., R(αc | x), the optimal decision is αk, obtained from the following:

R(αk | x) = min_{i=1,...,c} R(αi | x)   (8)
In our model we simply compare R(α1 | x) and R(α2 | x) and choose the action that brings less risk to the system. That is the improved Bayesian view of optimal decision making. We build two profile databases: one is called the NSCS database and the other the ISCS database. Misuse intrusion detection is achieved on the basis of the ISCS database, and, relying on the NSCS database, anomaly intrusion detection is realized. These two kinds of detection sub-models can work independently. Through the improved Bayesian decision theorem, misuse intrusion detection and anomaly intrusion detection are combined. This improved Bayesian decision theorem brings four benefits to the model: it eliminates the flaws of a narrow definition of normal and intrusion patterns; extends known intrusion patterns to novel intrusion patterns; reduces the risks that intrusion detection brings to the system; and offers a method to build a compound intrusion detection model that integrates MIDS with AIDS.
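As a rough illustration of this decision rule (a sketch only, reusing sim_to_set from the snippet above; the cost values, class order, and function names are hypothetical):

```python
# Hypothetical cost matrix lam[action][state]; index 0 = normal, 1 = intrusion.
# Missing a real intrusion (lam[0][1]) is assumed far costlier than a false alarm (lam[1][0]),
# in line with a cost ratio C in the 20-30 range discussed later.
LAM = [[0.0, 25.0],   # action a1: pass the sequence as normal
       [1.0,  0.0]]   # action a2: emit an intrusion signal

def improved_risks(x, nscs, iscs, lam=LAM):
    """R(a_i | x) = sum_j lam[i][j] * Sim(x, Psi_j)  (Eq. 7), with Psi_1 = NSCS, Psi_2 = ISCS."""
    sims = [sim_to_set(x, nscs), sim_to_set(x, iscs)]
    return [sum(lam[i][j] * sims[j] for j in range(2)) for i in range(2)]

def decide(x, nscs, iscs):
    """Choose the action with minimal expected risk (Eq. 8)."""
    risks = improved_risks(x, nscs, iscs)
    return "normal" if risks[0] <= risks[1] else "intrusion"
```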
4
Experiment
We conduct this experiment on the privileged process Sendmail. Sendmail provides various services, has relatively many vulnerabilities, and tends to be easy to subvert. Since Sendmail runs with root privilege, it has access to more parts of the system, so attackers target Sendmail to gain root privilege. Privileged processes obviously deserve more attention, which is why this experiment is conducted on Sendmail. Sendmail runs on a cluster with the Linux operating system in the Internet and Cluster Computing Center (ICCC) at Huazhong University of Science and Technology (HUST), and Strace 4.0 for Linux [10] is used to trace processes. 4.1
Sequences Databases Construction
NSCS database and ISCS database are constructed in this experiment. The implementation of NSCS database follows the method described in [5].
The procedure for constructing these two databases can be found in our previous work [15]. We trace Sendmail running for two months and, after selecting typical data, obtain traces totaling 5.5 million system call sequences. Table 6 lists the total number of unique system call sequences for different sequence lengths; the longer the sequence length, the more unique system call sequences there are. Tables 7, 8 and 9 list some sequence samples and the total number of occurrences of each sequence for different sequence lengths; the longer the sequence length, the smaller the total number of occurrences of each sequence.

Table 6. Total Numbers of Unique System Calls Sequences Given Different Sequence Lengths

  Total number of unique system call sequences   Sequence length
  1348                                            6
  1622                                            9
  1938                                            12

Table 7. Total Numbers of Sequence Samples with Length 6

  Sequence sample                                           Total number
  fcntl64→fcntl64→fcntl64→fcntl64→fcntl64→fcntl64           742951
  flock→fstat64→flock→flock→fstat64→flock                   111113
  time→getpid→getpid→stat64→lstat64→geteuid32               92456

Table 8. Total Numbers of Sequence Samples with Length 9

  Sequence sample                                                               Total number
  fcntl64→fcntl64→fcntl64→fcntl64→fcntl64→fcntl64→fcntl64→fcntl64→fcntl64      734558
  flock→fstat64→flock→flock→fstat64→flock→flock→fstat64→flock                  99528
  time→getpid→getpid→stat64→lstat64→geteuid32→lstat64→geteuid32→open           73744
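As a rough illustration (not the authors' tooling), counts of the kind reported in Tables 6–9 can be derived from a traced call stream with a sliding window:

```python
from collections import Counter

def sequence_counts(trace, length):
    """Count every contiguous system-call sequence of a given length in a trace.
    `trace` is the list of call names in execution order; len(result) is the number of
    unique sequences (Table 6), result[seq] the total for one sequence (Tables 7-9)."""
    windows = (tuple(trace[i:i + length]) for i in range(len(trace) - length + 1))
    return Counter(windows)

# Example with a made-up trace fragment:
# counts = sequence_counts(["fcntl64", "fcntl64", "flock", "fstat64", "flock", "time"], 3)
```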
Afterwards, we construct the ISCS database. We generate traces of three types of intrusion behaviors that attack Sendmail effectively: U2R (User to Root), buffer overflow, and forwarding loop. The sunsendmailcp script, representing U2R, uses a special command line option to cause Sendmail to append an email message to a file; by using this script, a local user might obtain root access. The syslog attack, representing buffer overflow, uses the syslog interface to overflow a buffer in Sendmail and leaves a port open for later intrusion. The forwarding loop writes special email addresses and forward files to form a logical circle that sends letters from machine to machine [5]. During the intrusions, intrusion system call sequences are obtained. Strace runs on Sendmail
Table 9. Total Numbers of Sequence Samples with Length 12

  Sequence sample                                                                                  Total number
  fcntl64→fcntl64→fcntl64→fcntl64→fcntl64→fcntl64→fcntl64→fcntl64→fcntl64→fcntl64→fcntl64→fcntl64  725927
  flock→fstat64→flock→flock→fstat64→flock→flock→fstat64→flock→flock→fstat64→flock                  90746
  time→getpid→getpid→stat64→lstat64→geteuid32→lstat64→geteuid32→open→fstat64→flock→open            64939

Table 10. Detection Rates with Different Cost Ratio C

  Category          C=10 old/new   C=20 old/new   C=30 old/new   C=40 old/new
  U2R               82.5 / 26.5    91.3 / 83.1    92.6 / 73.5    82.3 / 53.7
  Buffer Overflow   91.2 / 35.7    83.5 / 76.2    89.1 / 72.4    91.3 / 34.3
  Forwarding Loop   92.5 / 56.3    88.2 / 74.7    84.4 / 76.4    89.2 / 46.9
for two months to trace intrusion traces. The total number of intrusion system calls turns out to be 300K, and the number of unique intrusion system call sequences is about 342 for length 6, about 420 for length 9, and about 513 for length 12. 4.2
Detect Known and Novel Intrusions
To determine whether a system call sequence x is normal or not, we compare x with the sequences in the ISCS database and the NSCS database. If Sim(x, ISCS) is not less than λI, x is an intrusion system call sequence; likewise, if Sim(x, NSCS) is not less than λN, x is a normal system call sequence. λN is a threshold above which a behavior is regarded as normal, and λI is a threshold above which it is deemed an intrusion. Using the similarity measure, misuse intrusion detection based on the ISCS database and anomaly intrusion detection based on the NSCS database can work well independently. Otherwise, we use the improved Bayesian decision theorem to make the decision. Table 10 compares the detection rates for old and new intrusions with sequences of length 12 and different cost ratios C. Here new intrusions are those that have no corresponding instances in the training data. From the table we see that the detection rates of old intrusions are independent of C: because the system call sequences of these old intrusions are stored in the ISCS database, old intrusions are easy to detect. The detection rates of new intrusions, however, depend on C, and a high detection rate is obtained for C between 20 and 30. 4.3
Experiment on IIS
An additional experiment is carried out on Internet Information Service (IIS) [11]. IIS runs on a cluster with the Windows operating system, and Strace for NT [8] is
used to trace processes. We trace IIS running free of intrusions for four months, after which there are about 14000 unique system call sequences in the NSCS database for sequence length 11. To collect raw data, Strace for NT [8] is used to record the system call sequences of the IIS process, and WinDump [12] and WinPcap [13] are used to collect network packets. While collecting data, several kinds of intrusions are launched against IIS: Ihttp, representing a U2R attack; Iiscrash, a kind of buffer overflow attack; and other kinds of intrusions such as DDOS and Fluxay47. Three packages, numbered 1, 2 and 3, are collected over two weeks; each package includes traces of IIS. These three packages are used as raw data, and the experimental results are shown in Table 11. The figures in Table 11 show that the time consumed for detection is acceptable.

Table 11. Time Consumed for Detection

              Size of package   Time consumed
  Package 1   123M              5ms
  Package 2   123M              6ms
  Package 3   203M              8ms
During the first two weeks we train this model; the size of the NSCS database is 5.19k and the false alarm rate is high. Three months later, the size increases to 132k, and the false alarm rate stays below 10%. This shows that the richness of the NSCS database (and ISCS database) affects the performance of the model.
5
Conclusions and Discussion
In this paper, we propose a compound intrusion detection model based on an improved Bayesian decision theorem to reduce the false alarm rate and minimize the risks of false negatives and false positives. To achieve detection, the NSCS database and the ISCS database must be established first. Using the similarity measure, misuse intrusion detection based on the ISCS database and anomaly intrusion detection based on the NSCS database can work well independently. By applying the improved Bayesian decision theorem to our model, the combination of misuse intrusion detection and anomaly intrusion detection is achieved. Through the improved Bayesian decision theorem, we define a risk model that formulates the expected risk of an intrusion detection decision, and we present risk-sensitive machine learning techniques that produce detection models minimizing the risks of false negatives and false positives. Empirical experiments show that our model and deployment techniques are effective in reducing the overall intrusion detection risk. The results show that the detection rates of new intrusions depend on the cost ratio C, and a high detection rate can be obtained for C between 20 and 30. Whether the NSCS and ISCS databases are rich enough influences the performance of this model; as long as these databases are kept rich enough, intrusions can be detected effectively. In order to collect normal system call sequences as
many as possible, we need to trace the Sendmail service long enough, keep the service free of attacks or intrusions, and exercise as many kinds of Sendmail services as possible. In contrast to the NSCS database, the ISCS database is easier to build: sequences that differ considerably from those in the NSCS database are inserted into the ISCS database. To deal with novel intrusions effectively, the ISCS database should be maintained frequently. Although some problems exist in our model, it provides an alternative approach to intrusion detection. We will attempt to apply other theories and techniques to the intrusion detection field.
References 1. S. Axelsson, “Intrusion Detection Systems: A Taxonomy and Survey”, Technical Report No 99-15, Dept. of Computer Engineering, Chalmers University of Technology, Sweden, March 2000 2. S. Axelsson, “The Base-Rate Fallacy and its Implications for the Difficulty of Intrusion Detection”, Proc. of the 6th ACM Conference on Computer and Communications Security, Kent Ridge Digital Labs, Singapore, November 1–4, 1999, pp. 1–7 3. G. Casella and R. Berger, Statistical Inference, Wadsworth & Brooks/Cole, Belmont, California, 1990, pp. 260–270 4. P. Domingos, “Metacost: A General Method for Making Classifiers Cost-sensitive”, Proc. of 5th Int. Conf. on Knowledge Discovery and Data Mining KDD, 1999, pp. 155–164 5. S. Forrest, S. A. Hofmeyr, A. Somayaji, and T. A. Longsta., “A Sense of Self for Unix Processes”, Proc. IEEE Symposium on Security and Privacy, Los Alamitos, CA, 1996, pp. 120–128 6. J. E. Gaffney and J. W. Ulvila, “Evaluation of Intrusion Detectors: A Decision Theory Approach”, Proc. of IEEE Symposium on Security and Privacy, 2001, pp. 50–61 7. L. Halme and B. Kahn, “Building a Security Monitor with Adaptive User Work Profiles”, Proc. of the 11th National Computer Security Conference, Washington DC, Oct, 1988, pp. 274–283 8. http://razor.bindview.com/tools/desc/strace readme.html 9. http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html 10. http://www.wi.leidenuniv.nl/˜wichert/strace 11. http://www.microsoft.com/iis 12. http://windump.polito.it/ 13. http://winpcap.polito.it/ 14. S. A. Hofmeyr, S. Forrest, and A. Somayaji, “Intrusion Detection using Sequences of System Calls”, Journal of Computer Security, 6, 1998, pp. 151–180 15. H. Jin, J. Sun, H. Chen, and Z. Han, “A Risk-sensitive Intrusion Detection Model”, Proc. of International Conference on Information Security and Cryptography (ICISC’02 ), LNCS 2587, Spinger-Verlag, 2003, pp. 107–117 16. T. Lane and C. E. Brodley, “Sequence Matching and Learning in Anomaly Detection for Computer Security”, Proc. of the AAAI-97 Workshop on AI Approaches to Fraud Detection and Risk Management, Menlo Park, CA: AAAI Press. 1997, pp. 43–49
17. T. Lane and C. E. Brodley, “Temporal Sequence Learning and Data Reduction for Anomaly Detection”, Proc. of the Fifth ACM Conference on Computer and Communications Security, 1998, pp. 150–158 18. T. Lane and C. E. Brodley, “Temporal Sequence Learning and Data Reduction for Anomaly Detection”, ACM Trans. on Information and System Security, 2, 1999, pp. 295–331 19. J. Lin, X. Wang, and S. Jajodia, “Abstraction-based Misuse Detection: High-level Specifications and Adaptable Strategies”, Proc. of IEEE Computer Security Foundations Workshop, Rockport, MA, June 1998, pp. 190–201 20. R. P. Lippman, D. J. Fried, I. Graf, J. W. Haines, K. R. Kendall, D. McCllung, D. Weber, S. E. Webster, D. Wyschogrod, R. K. Cunningham, and M. A. Zissman, “Evaluating Intrusion Detection Systems: The 1998 DARPA Off-line Intrusion Detection Evaluation”, Proc. of DARPA Information Survivability Conference and Exposition, Jan 25–27, 2000, vol.2, pp. 12–26 21. T. F. Lunt, “Automated Audit Trail Analysis and Intrusion Detection: A survey”, Proc. of the 11th National Computer Security Conference, Baltimore, Maryland, 1988, NIST, pp. 65–73 22. Y. Okazaki, I. Sato, and S. Goto, “A New Intrusion Detection Method based on Process Profiling”, Proc. of the 2002 Symposium on Applications and the Internet (SAINT’02 ), pp. 82–91 23. J. Sun, H. Jin, H. Chen, and Z. Han, “A Data Mining Based Intrusion Detection Model”, Proc of Fourth International Conference on Intelligent Data Engineering and Automated Learning (IDEAL’03), 2003 24. T. Terano, K. Asai, and M. Sugeno, Fuzzy Systems Theory and Its Applications, Boston Academic Press, 1992, pp. 20–99 25. C. Warrender, S. Forrest, and B. Pearlmutter, “Detecting Intrusions using System Calls: Alternative Data Models”, Proc. of IEEE Symposium on Security and Privacy, 1999, pp. 133–145 26. B. Zadrozny and C. Elkan, “Learning and Making Decisions When Costs and Probabilities are Both Unknown”, Proc. of the Seventh International Conference on Knowledge Discovery and Data Mining (KDD’01 ), pp. 204–213
An Efficient Convertible Authenticated Encryption Scheme and Its Variant Hui-Feng Huang and Chin-Chen Chang Department of Computer Science and Information Engineering National Chung Cheng University, Chiayi, Taiwan {hfhuang, ccc}@cs.ccu.edu.tw
Abstract. The authenticated encryption scheme allows the specified receiver to simultaneously recover and verify a message. Recently, to protect the receiver’s benefit of a later dispute, Wu and Hsu proposed a convertible authenticated encryption scheme in which the receiver can convert the signature into an ordinary one that can be verified by anyone. However, Wu and Hsu’s scheme doesn’t consider that once the intruder knows the message then the intruder can also easily convert a signature into an ordinary digital signature. In this situation, the intruder may force the signer to be responsible for the terms of agreement of the documents and cause confusion. In this paper, we propose an efficient convertible authenticated encryption scheme which can provide better protection for both the signer and the specified receiver. On the other hand, we also propose an efficient and lower communication convertible authenticated encryption scheme with message linkages. It can be regarded as a variant of the convertible authenticated encryption scheme in that it is designed to link up the message blocks to avoid the message block being reordered, replicated, or partially deleted during the transmission.
1
Introduction
A digital signature on an electronic document plays the same role as a handwritten signature does on paper documents. Its main purpose is to specify the person responsible for the document. In some applications, it is not necessary for anyone to verify the validity of the signature while keeping the message secret from the public. For example, the use of credit cards only needs to be verified by the credit card company. Another application [1] is the case where a receiver is a public office and a signer is a public officer. The signer must sign an official document that will be published after a few years. The straightforward approach is that a signer uses the specified receiver's encryption key to encrypt both the generated signature and the message. In this way, only the specified receiver can recover both the message and its corresponding signature and then check the validity of the signature. However, this method is costly in terms of the computational complexities and the communication overheads. To improve the efficiency, some researchers such as Horster et al. [4] developed authenticated
encryption schemes by modifying from Nyberg-Rueppel’s scheme [5]. In the authenticated encryption scheme, the signer may make a signature-ciphertext for a message and send it to a specified recipient. Only the specified recipient has the ability to recover and verify the message. But these authenticated encryption schemes are not digital signature schemes, no one except the specified receiver can be convinced of the signer’s valid signature. Further, consider the case of a later dispute, e.g., the credit card user denies having signed a signature. In this situation, the credit card company should have the ability to prove the dishonesty of those users. Then, it might be required to reveal the message along with its signature for verifying. To protect the recipient in case of a later dispute, some schemes [6,9] utilize an interactive repudiation settlement procedure between the recipient and the third party. It is inefficient due to the interactive communication. In 1999, based on Horster et al.’s scheme, Araki et al. [2] proposed a limited verifier signature scheme and a convertible limited verifier signature scheme in which a receiver can convert a limited verifier signature into an ordinary digital signature. In this way, as the signer denies the signature, the receiver can prove the dishonesty of the signer by revealing an ordinary signature that can be verified by any verifier (or judge). However, the conversion of the signature requires the signer to release one more parameter. This results in a further communication burden. In addition, it may be unworkable if the signer is uncooperative. Later, Wu and Hsu [8] proposed a convertible authenticated encryption scheme that can easily produce the ordinary signature without the cooperation of the signer, and their scheme is more efficient than Araki et al.’s in terms of the computation complexities and the communication costs. However, Wu and Hsu’s [8] convertible authenticated encryption scheme doesn’t consider that once the intruder knows message m then the intruder can also easily convert a signature into an ordinary digital signature. In this situation, the intruder may force the signer to be responsible for the terms of agreement of the documents and cause confusion. Using the concept of the ElGamal’s [3] public key cryptosystem and Schnorr’s [7] signature scheme, we improve Wu and Hsu’s [8] convertible authenticated encryption scheme to resolve the above problems. In the normal procedure, only the specified recipient can recover the message and verify the signature with an authenticated encryption mechanism. Once the signer denies the signature, the specified recipient can prove the dishonesty of the signer by revealing an ordinary signature that can be verified by any verifier (or judge) without the cooperation of the signer. In our method, not only are the computation complexities simpler than that of Wu and Hsu’s but also the signer doesn’t have to worry about an attacker knowing the message and forcing him to be responsible for the terms of agreement of the documents and create confusion. Therefore, the proposed scheme can provide better authenticity to both the signer and the specified receiver. On the other hand, consider the situation when a message is large. It must be divided into a sequence of message blocks, and each message block must be encrypted and signed individually, then it will require more computation and
communication costs. Thus, based on the ElGamal’s [3] public key cryptosystem and Schnorr’s [7] signature scheme, we also propose an efficient and low communication cost convertible authenticated encryption scheme with message linkages. It can be regarded as a variant of the previously proposed convertible authenticated encryption scheme in that it is designed to link up the message blocks to avoid the message block being reordered, replicated, or partially deleted during the transmission. And the proposed convertible authenticated encryption scheme with message linkages still provides the protection to both the signer and the specified receiver. This paper is organized as follows. In the next section, we introduce the proposed convertible authenticated encryption scheme. In Section 3, we will present an efficient and low communication cost convertible authenticated encryption scheme with message linkages. The security analyses and the performances of the proposed two schemes are discussed in Section 4. Some conclusions will be made in the last section.
2
The Proposed Scheme
In this section, we propose a convertible authenticated encryption scheme based on ElGamal's [3] public key cryptosystem and Schnorr's [7] signature scheme. There are two phases in our scheme: the signing/verification phase and the conversion phase. In the signing/verification phase, the signer constructs a signature with message recovery for some specified recipient. In case of a later dispute, in the conversion phase, the recipient can reveal the converted signature, and then any verifier (or judge) can prove the dishonesty of the signer without the signer's cooperation. Initially, the system authority (SA) chooses a large prime number p such that p − 1 has a large prime factor q (q ≥ 2^256 and p ≥ 2^512). Let g be a generator of order q over GF(p). SA also selects a one-way hash function h(). Then SA publishes p, q, g and h(). Each user in the system, Ut, owns a secret key xt in Zq and computes the corresponding public key yt = g^xt mod p. Suppose that Ua is the signer, Ub the recipient, and m the message to be signed. Following the concepts of ElGamal's public key cryptosystem and Schnorr's signature scheme, we describe these two phases as follows: 2.1
The Signing/Verification Phase
1. The signer Ua selects a random number k ∈ Zq∗ and computes c = m × yb^(q−k) mod p.
2. Compute r = h(m, yb, g^k) mod q and s = k − xa·r mod q.
3. Finally, Ua sends the signature (c, r, s) for m to the recipient Ub.

After receiving the signature (c, r, s), Ub uses his secret key xb and Ua's public key ya to recover the message m as

m = c × (ya^r × g^s)^xb mod p.   (1)
Then, Ub can verify the signature with the following equality:

r = h(m, yb, ya^r g^s) mod q.   (2)
If it holds, the signature is valid. Hence, the recipient Ub confirms that this secret message m and its signature were sent by the signer Ua. For the security of Schnorr's [7] signature scheme, the random number k should not be reused with a different message.
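To make the flow concrete, here is a minimal, insecure toy sketch in Python (tiny parameters, SHA-256 standing in for h(), and an assumed encoding of the hash inputs; this is not the authors' implementation):

```python
import hashlib

# Toy parameters, far too small to be secure: q divides p - 1 and g has order q mod p.
p, q, g = 23, 11, 2

def H(*parts):
    """Stand-in for the one-way hash h(); the comma-joined encoding is an assumption."""
    data = ",".join(str(v) for v in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def keygen(x):
    """Private key x in Z_q, public key y = g^x mod p."""
    return x, pow(g, x, p)

def sign(m, x_a, y_b, k):
    """Signer U_a builds (c, r, s) for message m addressed to recipient U_b (Sect. 2.1)."""
    c = (m * pow(y_b, q - k, p)) % p
    r = H(m, y_b, pow(g, k, p)) % q
    s = (k - x_a * r) % q
    return c, r, s

def recover_and_verify(c, r, s, x_b, y_a, y_b):
    """Recipient U_b recovers m via Eq. (1) and checks Eq. (2)."""
    shared = (pow(y_a, r, p) * pow(g, s, p)) % p      # equals g^k mod p
    m = (c * pow(shared, x_b, p)) % p
    ok = (r == H(m, y_b, shared) % q)
    return m, ok

# Demo with hypothetical keys x_a = 3, x_b = 7, message m = 9 and nonce k = 5.
x_a, y_a = keygen(3)
x_b, y_b = keygen(7)
c, r, s = sign(9, x_a, y_b, k=5)
print(recover_and_verify(c, r, s, x_b, y_a, y_b))     # -> (9, True)
```

In a real deployment p and q would be at least 512 and 256 bits, k would be fresh for every message, and m would be encoded into Zp*.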
2.2
The Conversion Phase
Later on, when the signer denies the signature, Ub can prove the dishonesty of the signer by revealing the message m together with the converted signature (r, s). With this converted signature, anyone (or a judge) can verify its validity from Equation (2). This phase is for the specified recipient to convince the judge that a signature is the signer's true one, provided that it is valid. In our conversion phase, only the recipient can reveal the message m and the converted signature (r, s) for any verifier to check whether Equation (2) holds or not. Therefore, the signer Ua cannot repudiate that he ever sent the message m to the recipient Ub. The following theorems show the correctness of the proposed scheme.

Theorem 1. If the signature (c, r, s) is produced by the proposed scheme, then the recipient can recover the message m as m = c × (ya^r × g^s)^xb mod p.

Proof: From yb = g^xb mod p, g^q = 1 mod p, and c = m × yb^(q−k) mod p, we have

c × (ya^r × g^s)^xb mod p
= m × yb^(q−k) × (ya^r × g^s)^xb mod p
= m × g^(xb(q−k)) × (g^(xa·r+s))^xb mod p
= m × g^(xb(q−k)) × (g^k)^xb mod p
= m × (g^q)^xb mod p = m,

where k = xa·r + s mod q.

Theorem 2. If the converted signature (r, s) of message m is produced by the proposed scheme, then the converted signature can be verified by Equation (2).

Proof: Since r = h(m, yb, g^k) mod q and s = k − xa·r mod q, we obtain g^k = ya^r × g^s mod p. Therefore, r = h(m, yb, (ya^r × g^s)) mod q.

It is obvious that our convertible authenticated encryption scheme can easily produce the ordinary signature without the cooperation of the signer. Since the recipient Ub's public key yb appears in the verification equation r = h(m, yb, ya^r g^s) mod q, it is easy to tell that the signer Ua only has to be responsible to the recipient Ub and not to others. Therefore, it is very convenient for the document's signer to clarify the responsibility.
3
Variant
For data communications to achieve integrity, privacy and authentication, if the message is large, it must be divided into a sequence of message blocks. Since each message block is encrypted and signed individually, it will require more computation and communication costs. In this section, we propose an efficient convertible authenticated encryption scheme with message linkages that can be regarded as a variant of the scheme in Section 2. It is designed to link up the message blocks to avoid the message block being reordered, replicated, or partially deleted during the transmission. The proposed scheme not only provides the linkages among signature blocks, but also has better performance and low communication costs. In the following subsection, we describe our convertible authenticated encryption scheme with message linkages. 3.1
Convertible Authenticated Encryption Scheme with Message Linkages
The proposed scheme also consists of two phases: the signing/verification phase and the conversion phase. The system initialization is the same as that presented in Section 2. We describe these two phases as follows.

The signing/verification phase. Without loss of generality, assume that the signer Ua wants to send a message M to the specified receiver Ub. Message M is made up of the sequence M1, M2, · · · , Mn, where Mi ∈ GF(p). The signer Ua executes the following procedure to construct the signature blocks for message M.

1. Let r0 = 0 and choose a random number t ∈ Zp, then compute ri = Mi × f(ri−1 ⊕ t) mod p for i = 1, 2, · · · , n, where f() is a public one-way hash function and ⊕ denotes the exclusive-or operator.
2. Select a random number k ∈ Zq∗ and compute c = t × yb^(q−k) mod p.
3. Compute r = h(L, yb, g^k) mod q and s = k − xa·r mod q, where L = h(M1 ∥ M2 ∥ · · · ∥ Mn), h is a public one-way hash function, and ∥ denotes concatenation.

Finally, Ua sends the signature (r, s, c, L, r1, r2, · · · , rn) to Ub in a public way. Note that ri is used as a linking parameter between the ith and (i + 1)th message blocks. After receiving the set {r, s, c, L, r1, r2, · · · , rn}, Ub uses his secret key xb and Ua's public key ya to recover the message blocks {M1, M2, · · · , Mn} as follows.

1. Compute t = c × (ya^r × g^s)^xb mod p.
2. Recover the message blocks {M1, M2, · · · , Mn} by computing Mi = ri × f(ri−1 ⊕ t)^(−1) mod p, for i = 1, 2, · · · , n, and r0 = 0.
And using Ua ’s public key ya , Ub can verify the signature with the following equations: L = h(M1 M2 · · · Mn ) and r = h(L, yb , yar g s ) mod q.
(3) (4)
If Equations (3) and (4) hold, the signature is valid. Hence, the recipient Ub confirms that this secret message M (or {M1, M2, · · · , Mn}) and its signature were indeed sent by the signer Ua. It can be seen that message recovery and verification are sped up almost n times. For the security of Schnorr's signature scheme, the random number k should not be reused with different messages.

The conversion phase. Later on, if the signer repudiates the signature, Ub can prove the dishonesty of the signer by revealing the message blocks {M1, M2, · · · , Mn} and the converted signature (r, s, L). With this converted signature, anyone (or a judge) can verify its validity from Equations (3) and (4). This phase is for the specified recipient to convince a judge that a signature is the signer's true one if it is valid. In the conversion phase, the recipient only reveals the message blocks M1, M2, · · · , Mn and the converted signature (r, s, L) for any verifier to check whether Equations (3) and (4) hold or not. It is obvious that our convertible authenticated encryption scheme with message linkages can also easily produce the ordinary signature without the cooperation of the signer. The verification equation r = h(L, yb, ya^r g^s) mod q contains Ub's public key yb, so everyone (or a judge) can confirm that the signer Ua sent the message blocks M1, M2, · · · , Mn to the specified receiver Ub and not to other receivers. This provides better protection to both the signer and the receiver. The following theorem shows that the message blocks {M1, M2, · · · , Mn} can be correctly recovered and verified.

Theorem 3. If the signature (r, s, c, L, r1, r2, · · · , rn) is produced by the proposed scheme, then the recipient can recover the message blocks {M1, M2, · · · , Mn} by computing Mi = ri × f(ri−1 ⊕ t)^(−1) mod p, for i = 1, 2, · · · , n, and r0 = 0; and the converted signature (r, s) for the message blocks {M1, M2, · · · , Mn} can be verified by Equations (3) and (4).

Proof: From yb = g^xb mod p, g^q = 1 mod p, and c = t × yb^(q−k) mod p, we have c × (ya^r × g^s)^xb mod p = t × g^(xb(q−k)) × (g^(xa·r+s))^xb mod p = t × (g^q)^xb mod p = t, where k = xa·r + s mod q. Since ri = Mi × f(ri−1 ⊕ t) mod p, Mi can be obtained as Mi = ri × f(ri−1 ⊕ t)^(−1) mod p, with r0 = 0. On the other hand, r = h(L, yb, ya^r g^s) mod q and s = k − xa·r mod q, where L = h(M1 ∥ M2 ∥ · · · ∥ Mn). We obtain k = xa·r + s mod q and h(M1 ∥ M2 ∥ · · · ∥ Mn) = L. Therefore, r = h(L, yb, ya^r g^s) mod q, where g^k = ya^r × g^s mod p.

Hence, in the proposed convertible authenticated encryption scheme with message linkages, the verifier checks these n message blocks against the signer's public key with only one verification instead of n verifications. Thus, the proposed scheme is very efficient and has low communication costs.
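The chaining itself is easy to prototype; the following sketch (our own toy prime and hash encoding, not the paper's implementation) links and unlinks blocks exactly as in steps 1 and 2 above:

```python
import hashlib

P = 2**61 - 1   # toy prime standing in for p; the scheme itself requires a much larger modulus

def f(value):
    """Public one-way hash f() mapped to a nonzero residue mod P; the encoding is an assumption."""
    digest = hashlib.sha256(str(value).encode()).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1

def link_blocks(blocks, t):
    """Signing side: r_i = M_i * f(r_{i-1} XOR t) mod p, with r_0 = 0."""
    r_prev, links = 0, []
    for m in blocks:
        r_prev = (m * f(r_prev ^ t)) % P
        links.append(r_prev)
    return links

def unlink_blocks(links, t):
    """Receiver side: M_i = r_i * f(r_{i-1} XOR t)^(-1) mod p, with r_0 = 0."""
    r_prev, blocks = 0, []
    for r_i in links:
        blocks.append((r_i * pow(f(r_prev ^ t), -1, P)) % P)
        r_prev = r_i
    return blocks

msgs = [101, 202, 303]        # hypothetical message blocks M_1..M_n
t = 987654321                 # the secret t that the recipient recovers from c
assert unlink_blocks(link_blocks(msgs, t), t) == msgs
```

Because each r_i feeds the hash for the next block, reordering, deleting, or replicating any link changes every later block that is recovered, which is the linkage property argued above.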
In summary, we combine both ElGamal’s encryption and Schnorr’s signature schemes into our convertible authenticated encryption scheme and convertible authenticated encryption scheme with message linkages.
4
Discussions
In this section, we explore the security and performance of the two proposed schemes. 4.1
Security Analyses
In our two schemes, encrypting and signing are based on ElGamal's cryptosystem and Schnorr's signature scheme, respectively. Thus, the security of the two proposed schemes rests on the difficulty of solving the discrete logarithm problem. In these two schemes, any user's private key xt must be kept secret: from the public key yt = g^xt mod p, no one can derive the corresponding private key xt. This security results from the difficulty of solving the discrete logarithm problem. Moreover, in our schemes, the ordinary signature is embedded in the authenticated encryption signature, so the receiver can easily release the converted signature to any verifier (or judge) when the signer denies having signed. First, we consider the security of the proposed convertible authenticated encryption scheme. The signer Ua first uses ElGamal's public key cryptosystem to generate the ciphertext c of m by computing c = m × yb^(q−k) mod p, where k ∈ Zq is a secret random number, and Ua applies the concept of Schnorr's signature scheme to construct the ordinary signature (r, s) for the message m, where r = h(m, yb, ya^r g^s) mod q and s = k − xa·r mod q. Then he delivers the signature (c, r, s) to the specified recipient Ub. After receiving the signature (c, r, s), Ub can recover the message m = c × (ya^r g^s)^xb mod p with his secret key xb and the signer Ua's public key ya. Hence, Ub confirms that the message m was sent by Ua by checking that r = h(m, yb, ya^r g^s) mod q holds. By the properties of ElGamal's public key cryptosystem [3], without Ub's secret key xb no one can decrypt the message m and check its validity from the signature (c, r, s). Thus, when the signer makes a signature-ciphertext for a message and sends it to a specified receiver, only the specified receiver has the ability to recover and verify the message. At the same time, according to Schnorr's signature scheme, with r = h(m, yb, g^k) mod q and s = k − xa·r mod q, no one without the signer's private key xa can forge the signature (r, s) for the message m, where k is a secret random number. Hence, no one can masquerade as the signer Ua to forge a valid signature-ciphertext (c, r, s) and send it to a specified recipient Ub. For the security of Schnorr's signature scheme, the secret random number k should not be reused with a different message. On the other hand, the receiver can release the converted signature (r, s) of message m to any verifier (or judge) when the signer repudiates his signing. Furthermore, in our convertible authenticated
encryption scheme, any attacker cannot obtain the message m before the signature is converted. In this situation, it is more and more difficult for the attacker to forge other signatures from the valid signature (c, r, s). Because we combine the concept of ElGamal’s encryption and Schonrr’s signature schemes into our convertible authenticated encryption scheme, the security analysis for our convertible authenticated encryption scheme is the same as ElGamal’s encryption and Schnorr’s signature schemes. Hence, the proposed convertible authenticated encryption scheme as mentioned in the Section 2 is secure. In addition, the receiver Ub ’s public key yb is protected in the verified equation r = h(m, yb , yar g s ) mod q, so even if the intruder knows the signature of message m, it is still hard for others to believe that the message m has been sent to the intruder by the signer Ua . Hence, it is much easier to tell that the signer Ua only has to be responsible for the recipient Ub and not for others. On the other hand, if another receiver Ui wants to find out if m such that r = h(m , yi , yar g s ) mod q holds, then he can fake the message m sent to him by the signer Ua . However, m is protected under the one-way hash function h(), so it is very difficult to derive m . Therefore, it is very convenient for the document’s signer to clarify the responsibility. Next, the proposed convertible authenticated encryption scheme with message linkages is the extension of our convertible authenticated encryption scheme. It is also to combine both ElGamal’s encryption and Schonrr’s signature schemes into our convertible authenticated encryption scheme with message linkages. Thus, the security analysis of our convertible authenticated encryption scheme with message linkages is the same as ElGamal’s [3] encryption and Schnorr’s [7] signature schemes. In the proposed scheme, the signer Ua sends the signature (r, s, c, L, r1 , r2 , · · · , rn ) to the specified receiver Ub , Ub uses his secret key xb and recover the message blocks {M1 , M2 , · · · , Mn } the signer Ua ’s public key ya to by computing Mi = ri × f (ri−1 t)−1 mod p, for i = 1, 2, · · · , n, and r0 = 0, where t = c × (yar × g s )xb mod p. After computing t and recovering the message blocks {M1 , M2 , · · · , Mn }, Ub verifies the message blocks to check whether L = h(M1 M2 · · · Mn ) and r = h(L, yb , yar g s ) mod q or not. In the same way, by applying ElGamal’s public key cryptosystem [3], without Ub ’s secret key xb , unauthorized user cannot decrypt the message blocks {M1 , M2 , · · · , Mn } and check its validity from the signature (r, s, c, L, r1 , r2 , · · · , rn ). On the other hand, since r = h(L, yb , yar g s ) mod q and s = k − xa r mod q, according to Schnorr’s signature scheme [7], without the signer’s private key xa , anyone cannot forge the signature (r, s, L) for message blocks {M1 , M2 , · · · , Mn }. Hence, the intruder cannot masquerade as a signer Ua to forge the valid signature blocks (r, s, c, L, r1 , r2 , · · · , rn ) and send it to a specified recipient Ub . In case of a later dispute, the recipient can reveal the converted signature (r, s, L) of message blocks {M1 , M2 , · · · , Mn } to any verifier (or judge) for verifying. Furthermore, in the proposed convertible authenticated encryption scheme with message linkages, any intruder also cannot obtain the message blocks {M1 , M2 , · · · , Mn } before the signature is converted. In this case, it is more and more difficult for the attacker
to forge other signatures from the valid signature blocks (r, s, c, L, r1, r2, · · · , rn). In the following, some security problems are considered.

1. If an intruder knows one message block Mi, he may try to derive the remaining message blocks. Although he may obtain f(ri−1 ⊕ t) = Mi^(−1) × ri mod p, he cannot derive t, because t is protected by the one-way hash function. Thus, our scheme can withstand the known-plaintext attack.
2. If the message blocks are reordered, modified, deleted or replicated, then the signature equations r = h(L, yb, ya^r g^s) mod q and s = k − xa·r mod q must be modified as well. Since L = h(M1 ∥ M2 ∥ · · · ∥ Mn), reordering, deleting, modifying or replicating blocks is guaranteed to be hard. Hence, the proposed convertible authenticated encryption scheme with message linkages is secure.

The equation r = h(L, yb, ya^r g^s) mod q also contains L and the public key yb of the receiver Ub, so the signer Ua need not worry that, once others know the message, they can force him to be responsible for the documents and cause confusion. Therefore, the two proposed schemes provide better protection to both the signer and the receiver. 4.2
Performances and Comparisons
The concept of convertible authenticated encryption was first proposed by Araki et al. [1,2]. However, in their schemes, the conversion of a signature requires the signer to release one more parameter, which needs additional communication and may be unworkable if the signer is unwilling to cooperate. Wu and Hsu [8] improved upon Araki et al.'s [2] scheme so that the conversion does not require the cooperation of the signer, and their scheme outperforms Araki et al.'s scheme in terms of computation complexity and communication cost. For this reason, we only compare our convertible authenticated encryption scheme of Section 2 with Wu and Hsu's scheme [8]. Wu and Hsu's scheme consists of the signing/verification phase and the conversion phase. To clarify the comparison, we briefly describe the processes as follows; the notation is the same as in Section 2.

The Signing/Verification Phase. To produce the signature for m, which contains some redundancy, the signer Ua first selects a random number k and computes

r1 = m(h(yb^k mod p))^(−1) mod p   (5)
r2 = m(h(g^k mod p))^(−1) mod q   (6)
s = k − xa·r2 mod q.   (7)
Finally, Ua sends the signature (r1, r2, s) for m to the recipient Ub. Ub can recover the message m as m = h((g^s × ya^r2)^xb mod p) × r1 mod p.
Then, Ub verifies the signature with the following equality:

r2 = h(m, h(g^s ya^r2 mod p)) mod q.   (8)
If it holds, the signature is valid.

The Conversion Phase. Later on, if the signer denies the signature, Ub can prove the dishonesty of the signer by revealing the converted signature (r2, s) for the message m. With this converted signature, anyone (or a judge) can verify its validity with Equation (8). In their conversion phase, the recipient must reveal the message m and the converted signature (r2, s) for any verifier (or judge) to check whether Equation (8) holds or not. In their approach, the signer Ua might worry that once others know the message m they can force him to be responsible for the documents and cause confusion, since Equation (8) does not contain the public key yb of the receiver Ub.

For convenience, the following notation is used in the performance evaluation: Te is the time for one exponentiation; Ti the time for one inverse computation; Tm the time for one modular multiplication; Th the time for executing the adopted one-way hash function; and |x| the bit-length of an integer x. The times for modular addition and subtraction are ignored, since they are much smaller than Te, Ti, Tm, and Th. We summarize the comparison of our convertible authenticated encryption scheme with Wu and Hsu's scheme in Table 1. As shown in Table 1, the computational complexities of signature generation, message recovery and verification, and verifying the converted signature are Th + 2Te + 2Tm, Th + 3Te + 2Tm, and Th + 2Te + Tm, respectively. It is obvious that the proposed scheme is more efficient and simpler than Wu and Hsu's scheme. In addition, the proposed convertible authenticated encryption scheme with message linkages not only provides the linkages among signature blocks, but also has better performance and low communication costs.

Table 1. Comparisons of Wu and Hsu's scheme and the proposed scheme in computation costs

  Item                                Wu and Hsu's scheme     The proposed scheme
  Length of original signature        |p| + 2|q|              |p| + 2|q|
  Length of converted signature       2|q|                    2|q|
  Signature generation                3Th + Ti + 2Te + 2Tm    Th + 2Te + 2Tm
  Message recovery and verification   3Th + 3Te + 2Tm         Th + 3Te + 2Tm
  Signature conversion                0                       0
  Verifying converted signature       2Th + 2Te + Tm          Th + 2Te + Tm
5
Conclusions
We use the concept of ElGamal’s encryption and Schnorr’s signature schemes to improve Wu and Hsu’s convertible authenticated encryption scheme. Comparing with Wu and Hsu’s scheme, our method is more efficient in computational complexity. On the other hand, we have also proposed an efficient convertible authenticated encryption scheme with message linkages which can be regarded as a variant of our convertible authenticated encryption scheme. Since, these two proposed schemes can provide better protection for both the signer and the specified receiver, so it is very convenient to fairly clarify the auditing responsibility. Therefore, the simplicity, efficiency, and low communications characteristics make our schemes very attractive in many electronic transactions.
References 1. Araki, S., Uehara, S., Imamura, K.: Convertible Limited Verifier Signature Based on Horster’s Authenticated Encryption. 1998 Symposium on Cryptography and Information Security. 2. Araki, S., Uehara, S., Imamura, K.: The Limited Verifier Signature and Its Application. IEICE Transactions on Fundamentals, Vol. E82-A , No.1. (1999) 63–68. 3. ElGamal, T.: A Public Key Cryptosystem and a Signature Scheme Based on Discrete Logarithms. IEEE Transactions on Information Theory, Vol. 30, No. 4. (1985) 469– 472. 4. Horster, P., Michels, M., Petersen, H.: Authenticated Encryption Schemes with Low Communication Costs. Electronics Letters, Vol. 30, No. 15. (1994) 1212–1213. 5. Nyberg K., Rueppel, R.A.: Message Recover for Signature Schemes Based on the Discrete Logarithm Problem. Advance in Cryptology-EUROCRYPT’94. Lecture Notes in Computer Science, Vol.950. Springer-Verlag, Berlin Heidelberg New York (1995) 182–193 6. Petersen, H., Michels, M.: Cryptanalysis and Improvement of Signcryption Schemes. IEE Proceedings-Computers and Digital Techniques, Vol. 145, No. 2. (1998) 149– 151. 7. Schnorr, C.P,: Efficient Identification and Signatures for Smart Cards. Advances in Cryptology-Crypto’89. Lecture Notes in Computer Science, Vol. 435. SpringerVerlag, Berlin Heidelberg New York (1990) 339–351. 8. Wu, T.S., Hsu, C. L.: Convertible Authenticated Encryption Scheme. The Journal of Systems and Software, Vol. 62. (2002) 205–209. 9. Zheng, Y.: Signcryption and Its Applications in Efficient Public Key Solutions. Advances in Information Security Workshop (ISW’97). New York (1997) 291–312.
Space-Economical Reassembly for Intrusion Detection System Meng Zhang and Jiu-bin Ju Jilin University, College of Computer Science and Technology, Changchun 130012, China [email protected], [email protected]
Abstract. The reassembly of IP fragments and TCP streams are very important in Intrusion Detection Systems (IDS). However, existing reassembly algorithms that cache fragments entirely are memory-greedy. It is vulnerable to memory exhaustion denial of service (DOS) attacks. In this paper, we present a spaceeconomical algorithm based on enhanced DAWG (Directed Acyclic Word Graph) automaton, which can detect the occurrences of a set of patterns in an out-of-order data stream. In contrast to existing algorithms, our algorithm scans each fragment by a multi-pattern matching automaton and just caches the returned solid-size index data structures, thus the memory requirement involved in caching fragments is largely reduced. Experiments and analysis show that our new algorithm greatly reduces the memory usage of reassembly in IDS and outperforms existing algorithms.
1 Introduction

The reassembly of IP fragments and TCP streams has become an indispensable function of Intrusion Detection Systems (IDS). An IDS is vulnerable if it does not handle this problem properly: attackers can launch insertion, evasion, and denial of service [9] attacks on an IDS by intentionally scrambling the fragment streams. Detailed explanations of these techniques can be found in [1, 2, 3, 9]. TCP segments may arrive out of order because segments are transmitted as IP datagrams. The receiving TCP stack resequences the data if necessary, passing the received data in the correct order to the application. The process of taking a collection of unordered, sequenced packets and reconstructing the stream of data they contain is termed "reassembly" [9]. The cache algorithm [4,5] is employed to perform reassembly in TCP/IP stacks: it caches all the fragments that are not yet acknowledged and passes the reassembled data to the application when the acknowledging packet has arrived. Many ID systems also employ this algorithm for reassembly, including snort [7], nfr [8] and PreludeIDS [14]. The IDS is inherently more prone to resource starvation attacks than the end systems, and an IDS that employs the cache algorithm to reassemble is vulnerable: there are many methods for an attacker to force the IDS to consume all available memory resources [9]. In this article, we present an algorithm that performs reassembly and multi-pattern matching of out-of-order streams with low space complexity, named On-Line Reassembly (OLR). DAWG (Directed Acyclic Word Graph), a flexible and powerful
data structure related to suffix trees and similar structures, is employed in OLR. Based on the work of [10, 11, 13], we extend the DAWG for a pattern set to a multi-pattern matching automaton, by which we implement the indexing of factors of the pattern set. Compared with the cache algorithm, OLR is fairly space-economical: it does not cache the input fragments entirely, but stores the processing results of fragments as fixed-size index data structures. The memory space of a fragment is always far larger than that of the constant-space index, so the memory requirement involved in caching fragments is largely reduced. Besides reassembly, OLR also performs multi-pattern matching in fragment streams, and its performance is better than that of matching patterns serially [12].
2 The DAWG

Let Σ be a nonempty alphabet and Σ* the set of words over Σ, with ε as the empty word. Let w be a word in Σ*; |w| denotes its length, w[i] its ith letter, and w[i:j] its factor (subword) that begins at position i and ends at position j. If w = xyz with x, y, z ∈ Σ*, then x, y and z denote factors (subwords) of w, x is a prefix of w, and z is a suffix of w. Let p ∈ Σ* be a pattern and denote by pr the reverse of p. Let P = {p1, p2, …, pk} be a set of k patterns, and denote by Pr = {p1r, p2r, …, pkr} the set of reverse patterns. Pref(P) denotes the set of all prefixes of P, Suff(P) the set of all suffixes of P, and Fact(P) the set of its factors. If v = v1wv2, then w is said to occur in v at position |v1| and at end position |v1w|. A position (resp. end position) of w in a pattern set P refers to a pair (i, j), where j is a position of w in pi; endposP(w) is the set of all possible end positions of w in P.

Definition 1. Let P = {p1, p2, …, pk}. For u, v ∈ Fact(P), define u ≡P v iff endposP(u) = endposP(v). [u]P denotes the equivalence class of u under ≡P. DAWG(P) is a directed acyclic graph with set of nodes {[u]P | u ∈ Fact(P)} and set of edges E = {([u]P, a, [ua]P) | u, ua ∈ Fact(P), a ∈ Σ}. The node [ε]P is called the source of DAWG(P). The longest element of a ≡P equivalence class is called the representative of this class.

Viewed as a finite automaton with every state accepting, the DAWG is a deterministic automaton recognizing the factors of P. If the edge (q, a, q') exists, the function out(q, a) returns q'; otherwise out(q, a) is undefined. The edges of the DAWG are divided into two categories: the edge ([u]P, a, [ua]P) is called primary if ua is the representative of [ua]P, otherwise it is called secondary. With each node [u]P we associate a number depth([u]P), defined as the depth of [u]P in the tree of primary edges. Equivalently, depth([u]P) is the length of the representative of [u]P. If the edge (q, a, q') is primary, then depth(q') = depth(q) + 1; otherwise depth(q') > depth(q) + 1. If w = pi for some pi ∈ P, then the node [w]P is called a terminal node for pi. Let p be a state of DAWG(P) different from the initial state, and let u be a word of the equivalence class p. The suffix link of p, denoted failP(p), is the state q whose representative is the longest suffix z of u such that u ≢P z. We have depth(q) < depth(p). The sequence (p, failP(p), failP^2(p), …) is therefore finite and ends at the initial state of DAWG(P); this sequence is called the suffix path of p.
3 The OLR Algorithm

3.1 Indexing Factors by 2-Tuple
In this section, we describe the method used in OLR for caching the fragments. First we introduce the 2-tuple set SP = {(q, L) | q is a state of DAWG(P), 0 < L ≤ depth(q)}. For a tuple (q, L) ∈ SP, StoPP(q, L) denotes the suffix of length L of the representative of q. For each f ∈ Fact(P), there is a nonempty tuple set SFP(f) = {(q, L) | f is a suffix of the representative of q, L = |f|}, such that for each t ∈ SFP(f), StoPP(t) = f. This means any factor can be represented by a tuple, through which the factor can be rebuilt by the function StoPP. In general, the memory space of a tuple, which is constant, is smaller than that of the fragments it represents.
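To make the (q, L) representation concrete, here is a small Python sketch of a suffix automaton, i.e. the DAWG of a single pattern, as a simplification of the multi-pattern DAWG(P) used by OLR; the class and method names are ours, and step/sto_p only mirror the StepP and StoPP operations described here and in Sect. 3.2.

```python
class FactorAutomaton:
    """Suffix automaton (single-pattern DAWG): factors are addressed by (state, length)."""

    def __init__(self, pattern):
        self.pattern = pattern
        # per state: outgoing edges, suffix link, depth (representative length),
        # and end position of the representative's first occurrence (used by sto_p)
        self.next, self.link, self.depth, self.first = [{}], [-1], [0], [0]
        last = 0
        for ch in pattern:
            last = self._extend(last, ch)

    def _new_state(self, nxt, link, depth, first):
        self.next.append(nxt); self.link.append(link)
        self.depth.append(depth); self.first.append(first)
        return len(self.next) - 1

    def _extend(self, last, ch):
        cur = self._new_state({}, -1, self.depth[last] + 1, self.depth[last] + 1)
        p = last
        while p != -1 and ch not in self.next[p]:
            self.next[p][ch] = cur
            p = self.link[p]
        if p == -1:
            self.link[cur] = 0
        else:
            q = self.next[p][ch]
            if self.depth[p] + 1 == self.depth[q]:
                self.link[cur] = q
            else:
                clone = self._new_state(dict(self.next[q]), self.link[q],
                                        self.depth[p] + 1, self.first[q])
                while p != -1 and self.next[p].get(ch) == q:
                    self.next[p][ch] = clone
                    p = self.link[p]
                self.link[q] = self.link[cur] = clone
        return cur

    def step(self, state, length, ch):
        """StepP-style scan step: from the tuple (q, L) of the longest pattern factor ending
        at the current input position, return the tuple after reading ch (suffix links
        absorb mismatches)."""
        while state != -1 and ch not in self.next[state]:
            state = self.link[state]
            length = self.depth[state] if state != -1 else 0
        if state == -1:
            return 0, 0
        return self.next[state][ch], length + 1

    def sto_p(self, state, length):
        """StoP-style rebuild: recover the factor from its constant-size (q, L) tuple."""
        end = self.first[state]
        return self.pattern[end - length:end]

# Example: track the longest pattern factor ending the scanned stream, as a (q, L) tuple only.
dawg = FactorAutomaton("abcab")
q, L = 0, 0
for ch in "xxabca":
    q, L = dawg.step(q, L, ch)
print(L, dawg.sto_p(q, L))   # 4 abca
```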
3.2 Multi-pattern Matching Algorithm

We extend the multi-pattern DAWG and design a multi-pattern matching algorithm on it. In this algorithm, when the input w[0:i] has been scanned, the longest factor of P that is a suffix of w[0:i] is recorded by a tuple (q, L) with StoPP(q, L) = u. Both q and L change while scanning the input. We attach to each node of the DAWG a pointer, denoted TermP(q), to the closest terminal node on its suffix path. The sequence (q, TermP(q), TermP^2(q), …), which we call the terminal path of q, is then finite and ends at the initial state of DAWG(P). An occurrence of the pattern set is reported iff there are at least two states in the terminal path of q and among them there is a state q' such that depth(q') ≤ L. In the pattern matching procedure, the current tuple (q, L) with StoPP(q, L) = u records the longest factor of P in the current input w[0:i]. Let w[i+1] = a; if ua is a factor of P, then (q, L) transits to ([ua]P, L+1). Otherwise a mismatch occurs: q transits to q', the state of the longest factor of P that is a suffix of ua, and L transits to its length. We call q' the fail state of q for input 'a'; it is found by searching the suffix (fail) path of q when the mismatch occurs [11]. Procedure StepP is the DAWG scan function: it reads a character 'a' from the input and returns a tuple (q', L') where StoPP(q', L') is the longest factor of P that is a suffix of ua.

3.3 Reassembly and Multi-pattern Matching of Out-of-Order Stream
Definition 2. Let X ∈ Σ*, and let b = b1, b2, …, bk be an ascending integer sequence, where b1 = 0, 0 ≤ bi < |X|, and bi ≠ bj for all i ≠ j. A fragment set of X on b is a word set {f1, f2, …, fk} where fi = X[bi : bi+1 − 1] and fk = X[bk : |X|]. Let Od be a permutation of 1, 2, …, k and Od[i] its ith number; the sequence F1, F2, …, Fk, denoted Fserial(X, b, Od), is a fragment stream of X if Fi = fOd[i]. The problem of reassembly and multi-pattern matching in an out-of-order stream is to find all occurrences of a finite pattern set P in a fragment stream. The traditional
method of solving this problem is the cache algorithm: it caches each fragment fi in the order of bi and processes the reassembled data with a pattern matching algorithm. The OLR algorithm solves the problem in a different way. It first builds the automata DAWG(P) and DAWG(Pr). The search phase of the algorithm is divided into two parts: the fragment process and the exporting process. Let the input of the algorithm be a pattern set P and a fragment stream Fs = Fserial(X, b, Od). In the fragment process, the algorithm processes the current input fragment f with the function ScanFragP and records the two tuples returned; by these tuples, the longest factors of P in the prefix and in the suffix of f are recorded. In the exporting process, OLR searches the recorded data for patterns. Let p be a pattern that occurs in X. If p is a factor of a fragment of Fs, then p will be detected by ScanFragP. Otherwise p is split across several fragments, in which a suffix of one fragment is a prefix of p, a prefix of another fragment is a suffix of p, and the other fragments are factors of p. So p must have been recorded in the tuples returned by ScanFragP after processing these fragments.

3.3.1 Data Structure

Each fragment of X is represented by a data structure providing three attributes:
  data: the data of the fragment
  offset: offset in X
  length: length of the data
The OLR algorithm stores the processing result of fragments in a data structure called FragNode. FragNode provides the following basic attributes:
  offset: offset of the fragment in X
  length: length of the fragment
  flag: flag of the fragment
  prefix: the tuple representing the longest factor in the prefix of the fragment
  suffix: the tuple representing the longest factor in the suffix of the fragment
Each FragNode n is the processing result of the fragment X[n.offset : n.offset+n.length−1], denoted Frag(n). The relationship between Frag(n) and P is indicated by the flag attribute, which takes one of three values:
  (i) FAC: Frag(n) is a factor of P.
  (ii) NFC: Frag(n) is not a factor of P.
  (iii) UCT: whether Frag(n) is a factor or not is uncertain.
The FragNodes generated by OLR can be organized in well-known data structures, such as a linked list or a splay tree, in which FragNodes are ordered by their offset; the choice of data structure influences the performance of accessing FragNodes. The global variable representing the FragNode set in OLR is denoted FragBuffer, and its ith FragNode by FragBuffer[i]. The list Pattern stores the numbers of the patterns found at run time.

3.3.2 Fragment Process

In this stage, the fragment stream is processed and the results are stored in FragBuffer and Pattern. FragProcess is the main procedure. The function ScanFragP is called by FragProcess for multi-pattern matching and indexing of each fragment. ScanFragP scans the input string v from left to right, starting from a tuple (q, L) with StoPP(q, L) = u. At the end, it returns two tuples: prefix, for which StoPP(prefix) is the longest factor in the prefix of uv, and suffix, for which StoPP(suffix) is the longest factor in
suffix of uv. If there are patterns in uv, they are recorded in the Pattern list. All this work is done by scanning v only once.

ScanFragP((q, L), v)
1  state ← q; mLen ← L; pos ← 0; prefix ← null; suffix ← null;
2  while ( pos < |v| ) {
3    if ( state != [ ]P && state is terminal && mLen == depth(state) )
4      A pattern is matched, insert it into Pattern;
5    term ← TermP(state);
6    while ( term != [ ]P ) {
7      if ( mLen >= depth(term) )
8        A pattern is matched, insert it into Pattern;
9      term ← TermP(term);
10   }
11   (state, mLen) ← StepP((state, mLen), v[pos]);
12   if ( this is the first time a mismatch occurs ) prefix ← (state, mLen);
13   pos++;
14 }
15 if ( prefix == null ) prefix ← (state, mLen);
16 suffix ← (state, mLen);
17 return ((prefix, suffix));

In FragProcess, there are three cases in processing the incoming fragment:
(i) Backward incorporating. For a fragment f, if f.offset = n.offset + n.length where n is a node of FragBuffer, f is called the backward fragment of n, and f will be backward incorporated into n. In backward incorporating, OLR scans f with DAWG(P) beginning at n.suffix, and stores the returned 2-tuple in n.suffix.
(ii) Forward incorporating. If f.offset + f.length = n.offset, f is called the forward fragment of n, and f will be forward incorporated into n. In this case, if Frag(n) is not a factor of P or is uncertain, OLR scans f with DAWG(Pr) beginning at n.prefix. If Frag(n) is a factor of P, OLR scans f with DAWG(Pr) beginning at ([ ]Pr, 0). The returned 2-tuple is stored in n.prefix.
(iii) Create new node. If f is neither a backward fragment nor a forward fragment of any node, f is called an increase fragment, and a new FragNode is created. The precise definition is presented in the following section.
The flag attribute of a FragNode is set when the FragNode is created. If incorporation occurs on n, n.flag changes in the following cases:
(i) If a mismatch occurs within an incorporation operation on n, n.flag is set to NFC.
(ii) If no mismatch occurs, n.flag = FAC and the incorporation is forward, the flag of n is set to UCT. In this case whether Frag(n) is a factor or not is uncertain, and there is no overlap between StoPP(n.prefix) and StoPP(n.suffix).
The OLR algorithm handles overlapping fragments in a manner that always favors old data: only the new data within a fragment is processed. We give below the pseudo code of the FragProcess procedure. The input is a fragment of a stream.
FragProcess(frag)
1  if ( frag is an increase fragment ) {
2    create a new FragNode from frag and insert it into FragBuffer;
3  } else {
4    dset ← the new data of frag;
5    for ( data ← each fragment in dset ) {
6      Search for a node ff such that data is the backward fragment of ff;
7      if ( ff exists ) {
8        Backward incorporate data into ff;
9      } else {
10       Search for a node bf such that data is the forward fragment of bf;
11       if ( bf exists ) {
12         Forward incorporate data into bf;
13 } } } }

3.3.3 Export Fragment
When an acknowledgement packet is received, the acknowledged fragments are exported from the FragBuffer. In this stage, OLR searches the fragments recorded by FragNodes for patterns and inserts the numbers of matched patterns into the Pattern list. In an IDS, the intrusion detection function is then called to detect whether the combination of packet header and matched pattern indicates an intrusion; if the Pattern list is empty, only the packet header is checked. In this procedure, a 2-tuple s is used to represent the longest factor in the data already exported. Through s, patterns that span more than one exported data block will not fail to be reported. The following is the pseudo code of Pattern_Match:
Pattern_Match(FragBuffer)
1  for ( each fn ∈ { nodes of FragBuffer that are acknowledged } ) {
2    (NULL, s) ← ScanFragP(s, the data corresponding to fn.prefix);
3    if ( fn.flag == UCT )
4      (NULL, s) ← ScanFragP(s, StoPP(fn.suffix));
5    s.length ← s.length + fn.length;
6    if ( fn.flag == NFC )
7      s ← fn.suffix;
8    delete fn;
9  }
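To make the records of Sect. 3.3.1 concrete, the following is a minimal C-style sketch of the Fragment and FragNode structures together with the adjacency tests used by FragProcess. The type and field names are illustrative only; they mirror the attributes listed above and do not come from the authors' implementation.

/* Illustrative only: field names follow Sect. 3.3.1; the concrete types
   (e.g. the state component of the 2-tuple) are assumptions. */
#include <stddef.h>
#include <stdio.h>

typedef struct { int state; size_t mlen; } Tuple;   /* the (state, mLen) pair   */
typedef enum { FAC, NFC, UCT } Flag;                 /* factor / not / uncertain */

typedef struct {                                     /* an incoming fragment     */
    const unsigned char *data;                       /* the data of the fragment */
    size_t offset;                                   /* offset in X              */
    size_t length;                                   /* length of data           */
} Fragment;

typedef struct {                                     /* processing result        */
    size_t offset, length;
    Flag   flag;
    Tuple  prefix, suffix;                           /* longest factors found    */
} FragNode;

/* f is the backward fragment of n: it starts exactly where Frag(n) ends. */
static int is_backward(const FragNode *n, const Fragment *f) {
    return f->offset == n->offset + n->length;
}

/* f is the forward fragment of n: it ends exactly where Frag(n) starts. */
static int is_forward(const FragNode *n, const Fragment *f) {
    return f->offset + f->length == n->offset;
}

int main(void) {
    FragNode n = { .offset = 100, .length = 20, .flag = UCT };
    Fragment f = { .data = NULL, .offset = 120, .length = 8 };
    printf("backward: %d, forward: %d\n", is_backward(&n, &f), is_forward(&n, &f));
    return 0;
}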
4 Analysis of Algorithm

4.1 The Correctness of Algorithm
First, we present the data set that procedure Pattern_Match really inspects. Let Fs be a fragment stream of X and let fset be an initially empty factor set of X. When all the fragments of Fs have been processed, the following steps are executed on FragBuffer: (1) For each node n, if n.flag = FAC or UCT, add Frag(n) to fset; if n.flag = NFC, add StoPP(n.prefix) and StoPP(n.suffix) to fset.
(2) Process fset iteratively by the following rule: ∀ a, b ∈ fset with a = X[a1: a2] and b = X[b1: b2], if b1 = a2 + 1, then delete a and b from fset and add ab = X[a1: b2] to fset, until no two elements of fset can be incorporated. We denote the resulting fset by value(P, Fs). For any stream Fs of X and pattern set P, value(P, Fs) is unique. According to procedures FragProcess and Pattern_Match, value(P, Fs) is the data set that Pattern_Match really inspects.

Lemma 1. Let P ∩ Fact({X}) ≠ ∅, Fs be a fragment stream of X, and p ∉ Fact(Fs) be a pattern that occurs in X. Then ∃ r ∈ value(P, Fs) such that p is a factor of r. If P ∩ Fact({X}) = ∅, then P ∩ Fact(value(P, Fs)) = ∅.

Proof: If P ∩ Fact({X}) = ∅, since value(P, Fs) is a subset of Fact({X}), we have P ∩ Fact(value(P, Fs)) = ∅. Now let P ∩ Fact({X}) ≠ ∅ and p = X[k1: k2] ∈ P ∩ Fact({X}). Then there exist Fsm = X[m1: m2] with m1 ≤ k1 ≤ m2 and Fsn = X[n1: n2] with n1 ≤ k2 ≤ n2. Since p ∉ Fact(Fs), Fsm and Fsn are different; let m > n. Let I = max(i | FragBuffer[i].offset ≤ k1) and J = min(j | FragBuffer[j].offset + FragBuffer[j].length − 1 ≥ k2); then StoPP(FragBuffer[I].suffix) = X[begin: b1] and StoPP(FragBuffer[J].prefix) = X[e0: end]. Since begin ≤ k1 and k2 ≤ end, p ∈ Fact({X[begin: end]}), and X[begin: end] is contained in an element of value(P, Fs), so ∃ r ∈ value(P, Fs) such that p is a factor of r.
Theorem 1. Any occurrence in a fragment stream of X of a pattern of P is found by the OLR algorithm.

Proof: Let p be any pattern of P that occurs in X, p = X[k1: k2]. Then there exist Fsm = X[m1: m2] with m1 ≤ k1 ≤ m2 and Fsn = X[n1: n2] with n1 ≤ k2 ≤ n2. Two cases can be distinguished: (i) m = n. Then p is a factor of Fsm, and according to procedure ScanFragP, p is detected by ScanFragP in procedure FragProcess. (ii) m > n. FragBuffer is processed by procedure Pattern_Match, which performs multi-pattern matching on value(P, Fs). By Lemma 1, ∃ r ∈ value(P, Fs) such that p is a factor of r, so p is detected by ScanFragP in procedure Pattern_Match. Hence any occurrence of a pattern of P in the fragment stream is found by the OLR algorithm.

4.2 Space Usage
In this section, we analyze the average-case and worst-case space of the FragBuffer generated by OLR. The size of FragBuffer is related to the number of fragments in the stream and to the arrival order of the fragments. If the fragments of Fserial(X, b, Od) arrive in order, there is only one node in FragBuffer during the run of OLR. If, in the permutation Od, all the odd numbers are in front of all the even numbers, or the reverse, the peak number of nodes in FragBuffer during the run of OLR is maximal. Fragment orders with this property are called gap orders.
Definition 3. Let pm be a permutation of {1, 2, …, n}, pm[i] the ith member of pm, and pm−1[i] the position of i, i.e. pm[pm−1[i]] = i. For any i, 1 ≤ i ≤ n, if any of the following conditions is satisfied, i is called an increase point of pm:
(i) pm[i] = 1 and pm−1[2] > i;
(ii) pm[i] = n and pm−1[n−1] > i;
(iii) pm−1[pm[i]−1] > i and pm−1[pm[i]+1] > i.
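For example, take n = 6 and the gap order pm = (1, 3, 5, 2, 4, 6). Position 1 is an increase point by condition (i), since pm−1[2] = 4 > 1; positions 2 and 3 are increase points by condition (iii), since both values adjacent to pm[2] = 3 and to pm[3] = 5 arrive later; positions 4, 5 and 6 are not increase points, because a value adjacent to each of pm[4] = 2, pm[5] = 4 and pm[6] = 6 has already arrived. This permutation therefore has 3 = 6/2 increase points, the maximum allowed by Lemma 2 below.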
A position that is not an increase point of pm is called a stable point of pm. The number of increase points of pm is denoted Node(pm). A fragment Fi ∈ Fs = Fserial(X, b, Od) where i is an increase point of Od is called an increase fragment of Fs; the other fragments are called stable fragments. The following two lemmas give the upper bound and the mean of the number of increase points. The proofs can be obtained from the authors via email.

Lemma 2. The maximum number of increase points of a permutation of {1, …, n}, n ≥ 1, is ⌈n/2⌉.

Lemma 3. The average number of increase points of a permutation of {1, …, n}, n ≥ 1, is (n+1)/3.

Let the input fragment stream be FS, and denote by Peek(FS) the maximum memory space of FragBuffer during the run of OLR.

Theorem 2. Let the memory space of each FragNode be S bytes and let FS be a fragment stream with n fragments. Then the maximum of Peek(FS) is ⌈n/2⌉·S. If the orders of fragment streams are distributed with equal probability, the mean of Peek(FS) is ((n+1)/3)·S.
Proof: Let Fs = Fserial(X, b, Od). According to the OLR algorithm, if i, 1 ≤ i ≤ n, is a stable point of Od, at least one of FOd−1[Od[i]−1] and FOd−1[Od[i]+1] has been processed before fragment Fi is input, and Fi is incorporated into the node n of FragBuffer such that FOd−1[Od[i]−1] is a factor of Frag(n) or FOd−1[Od[i]+1] is a factor of Frag(n). If i is an increase point of Od, neither FOd−1[Od[i]−1] nor FOd−1[Od[i]+1] has been input, and a new node is created and inserted into FragBuffer. So there are Node(Od) nodes in FragBuffer generated by procedure FragProcess after all n fragments have arrived. By Lemma 2, the maximum of Peek(FS) is ⌈n/2⌉·S. If the orders of fragment streams are distributed with equal probability, then by Lemma 3 the mean of Peek(FS) is ((n+1)/3)·S.
4.3 Performance
The performance is measured in terms of the number of inspections on the stream. According to procedure FragProcess, each fragment is scanned once when it is input. If the first forward fragment of a node n arrives, an extra scan, called a reverse scan, is performed on StoPP(n.prefix) with DAWG(Pr). In the stage of pattern
matching, for each node n of FragBuffer, StoPP(n.prefix) is scanned again by procedure Pattern_Match. The number of inspections on the stream is the sum of the inspections of these three parts. The following theorem bounds the maximum number of inspections of the OLR algorithm; the proof can be obtained from the authors via email.

Theorem 3. For FS = Fserial(X, b, Od), let f be the number of fragments, n = |X|, and m the length of the longest pattern in P. The maximum number of inspections of the OLR algorithm, denoted MaxScan(FS), is bounded as follows:
MaxScan(FS) < n + f·m,   if m ≤ (n − ⌈f/2⌉) / ⌈f/2⌉;
MaxScan(FS) < 3n − f,    if m > (n − ⌈f/2⌉) / ⌈f/2⌉.
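For instance, for a stream of n = 1500 bytes split into f = 10 fragments with longest pattern length m = 20, we have ⌈f/2⌉ = 5 and (n − 5)/5 = 299 ≥ m, so the first case applies and MaxScan(FS) < 1500 + 10·20 = 1700; the out-of-order processing then costs at most roughly 13% more character inspections than a single in-order pass over the stream.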
5 Application of OLR in IDS
We implement the OLR algorithm in a Snort plug-in that is in charge of the reassembly and pattern matching of TCP streams.

5.1 Snort and Stream4
Snort is an open-source network intrusion detection system that relies on protocol analysis and pattern matching. It defines a rule language that describes attack signatures and the corresponding response actions. The following is an example Snort rule:
alert tcp $EXTERNAL_NET 27374 -> $HOME_NET any (msg:"BACKDOOR subseven 22"; flags: A+; content: "|0d0a5b52504c5d3030320d0a|"; reference:arachnids,485; sid:103; classtype:misc-activity; rev:3;)

The rule contains two parts: the rule header and the rule options. The rule header contains an action (alert in this case), a protocol (TCP), a source netmask and source port, and a destination netmask and port (any in this case). In the rule options, the msg string is the alert message to send if the rule is matched. The optional flags field specifies a set of TCP flags that must be set for a packet to match. The content and uricontent fields specify a string to match in the payload of the packet. Snort has a plug-in architecture for integrating new functions and technologies. In Snort, the Stream4 plug-in is in charge of the reassembly of TCP streams. It monitors TCP connections based on TCB (TCP control block) reconstruction and TCP state tracing; the reassembly is performed by the cache algorithm. All the fragments of a TCP connection are cached in a Splay tree ordered by their offset. When the ACK of a range of the stream arrives, Stream4 exports the data and delivers it to the Snort detection engine.

5.2 OLR Plug-in of Snort
The OLR plug-in inherits the TCB reconstruction and the TCP state tracing of Stream4. It employs the OLR algorithm to perform reassembly. In the OLR plug-in, each
TCP connection is treated as an independent fragment stream and has its own FragBuffer and DAWGs. The pattern set of a DAWG is the set of parameters of the keywords "content" and "uricontent" in the TCP Snort rules. According to the Snort 2.0 rule set, Snort classifies TCP connections into 131 classes. For each class, DAWGs are built from the pattern set generated from the rule set that the class matches, and the TCP connections that belong to the same class share the DAWGs of that class. In the OLR plug-in, when a TCP packet arrives, the state of the rebuilt TCB to which the packet belongs is updated according to the TCP header of the packet. If the packet carries data, it is processed by procedure FragProcess. If patterns are matched, they are delivered to the detection engine as rebuilt TCP fragments together with their offset and length in the stream. When the ACK of a range of the stream arrives, the data cached in FragBuffer are rebuilt as TCP fragments with their offset and length in the stream and delivered to the detection engine. The detection engine does not inspect the payload data; it only checks whether the combination of packet header and matched pattern indicates an intrusion. From the source code, we find that Snort and PreludeIDS do not have the ability to perform consecutive pattern matching across a TCP stream: if an attack signature is split across different rebuilt stream data, they cannot detect it. The NFR 2.0.3 research version performs consecutive pattern matching correctly. It matches each pattern serially and stores the matching state of each pattern in a list, through which the new data and the old data are processed in succession. Compared with OLR, this method is poor in both memory space and performance. In OLR, the matching states of all the patterns are kept in one tuple and the patterns are searched in parallel, so consecutive pattern matching of TCP streams is performed faster and in a more space-economical way.
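To illustrate this organisation, the following is a rough C sketch of the per-connection state; the type names (RuleClass, OlrSession) are hypothetical and merely mirror the description above, not the plug-in's actual source.

/* Hypothetical types: Dawg, FragNode and the class table are placeholders for
   the corresponding structures described in the text. */
#include <stddef.h>
#include <stdio.h>

typedef struct Dawg Dawg;                /* DAWG(P) / DAWG(Pr), opaque here   */
typedef struct FragNode FragNode;        /* node described in Sect. 3.3.1     */

typedef struct {                         /* one of the 131 TCP rule classes   */
    const Dawg *dawg_fwd;                /* DAWG(P),  shared by all sessions  */
    const Dawg *dawg_rev;                /* DAWG(Pr), shared by all sessions  */
} RuleClass;

typedef struct {                         /* per-connection state              */
    const RuleClass *cls;                /* shared automata of its class      */
    FragNode *frag_buffer;               /* private FragBuffer (list or tree) */
    size_t    frag_count;
    unsigned  tcp_state;                 /* rebuilt TCB state                 */
} OlrSession;

int main(void) {
    printf("per-connection overhead in this sketch: %zu bytes\n", sizeof(OlrSession));
    return 0;
}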
5.3 Experiments
We compare the performance and memory usage of the OLR algorithm and the cache algorithm in several experiments. All experiments were conducted on a 600 MHz Celeron running Snort version 2.0 with the full rule set. The test data is a series of network traffic traces with multiple TCP connections, generated by a modified fragrouter [15] and some TCP applications. The traces were generated with different parameters, including the payload length and the order of TCP fragments, and were recorded to packet trace files by tcpdump. Fragment lengths of 1, 5, 10, 50, 100 and 200 bytes and three kinds of order (normal order, random order and gap order) were used. Factors of the pattern set are inserted into the payloads of the fragments at random. In the experiments, Snort reads packets from the trace files. Time is measured with the Pentium cycle counter. All results are the mean of 10 runs.

Experiment 1. We recorded the memory usage of the FragBuffer of OLR and of Stream4 at run time. Eighteen traffic traces were tested; Fig. 1 to Fig. 3 show three of the results. A point (n, m) on a curve means that the size of FragBuffer is m bytes after packets 1 to n of the trace have been processed.

Experiment 2. We compared the running times of OLR and Stream4. The traffic traces differ both in size and in packet count, so the running time and memory usage of different traces are not directly comparable. However, for traces with the same fragment
Fig. 1. Experiment result of a normal order trace with 100-byte fragment length.
Fig. 2. Experiment result of a random order trace with 100-byte fragment length.
Fig. 3. Experiment result of a gap order trace with 100-byte fragment length.
length and type of order, the ratio of the running times of the two algorithms is stable, and so is the ratio of their memory usage. Therefore, we compare the two algorithms in terms of these ratios. Fig. 4 shows the ratio (peak size of the Stream4 FragBuffer) / (peak size of the OLR FragBuffer) for random order traces with fragment lengths of 1, 5, 10, 50, 100 and 200 bytes. Fig. 5 shows the ratio (running time of OLR) / (running time of Stream4), measured in CPU cycles, for random order traces with fragment lengths of 1, 5, 10, 50, 100 and 200 bytes.

Fig. 4. Memory usage ratio of Stream4 and OLR on traces of different fragment lengths.

Fig. 5. Running time ratio of OLR and Stream4 on traces of different fragment lengths.
6 Conclusions
We have presented an algorithm that solves the problem of TCP stream reassembly and IP defragmentation for IDSs. Compared with other methods that cache whole fragments, our algorithm caches each fragment as a two-tuple, a constant-size data structure, so the memory required for caching fragments is greatly reduced. A multi-pattern matching algorithm based on the DAWG automaton is also designed for OLR. The analysis and experiments show that our approach
is space-economical and that its performance approximates that of the cache algorithm. By using the OLR algorithm, the resistance of an IDS to resource-starvation attacks is enhanced.
References
1. Christopher Kruegel, Fredrik Valeur, Giovanni Vigna, Richard Kemmerer. Stateful Intrusion Detection for High-Speed Networks. 2002 IEEE Symposium on Security and Privacy, May 12-15, Berkeley, California, 2002.
2. M. Handley, C. Kreibich and V. Paxson. Network Intrusion Detection: Evasion, Traffic Normalization, and End-to-End Protocol Semantics. Proc. USENIX Security Symposium, 2001.
3. Cisco Systems, Inc. The Science of Intrusion Detection System Attack Identification, 2002. http://www.snort.org/docs/dssa_wp.pdf
4. G. P. Chandranmenon and G. Varghese. Reconsidering Fragmentation and Reassembly. In PODC: 17th ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, 1998.
5. Linux IP Stack Source Code, http://lxr.linux.no/
6. FreeBSD IP Stack Source Code, http://www.freebsd.org/
7. Snort: The Open Source Network Intrusion Detection System, http://www.snort.org
8. NFR Network Intrusion Detection (NFR NID), http://www.nfr.com/products/NID/
9. Thomas H. Ptacek and Timothy N. Newsham. Insertion, Evasion, and Denial of Service: Eluding Network Intrusion Detection. Technical report, Secure Networks, Inc., 1998.
10. A. Blumer, J. Blumer, D. Haussler, A. Ehrenfeucht, M. T. Chen, and J. Seiferas. The Smallest Automaton Recognizing the Subwords of a Text. Theoretical Computer Science, 40:31-55, 1985.
11. M. Crochemore and C. Hancart. Automata for Matching Patterns. In Handbook of Formal Languages, G. Rosenberg and A. Salomaa, eds., volume 2, Linear Modeling, Springer-Verlag, 1997, 399-462.
12. Mike Fisk, George Varghese. Fast Content-Based Packet Handling for Intrusion Detection. UCSD Technical Report CS2001-0670, 2001.
13. Gregory Kucherov and Michael Rusinowitch. Matching a Set of Strings with Variable Length Don't Cares. In Proceedings of the 6th Annual Symposium on Combinatorial Pattern Matching, Lecture Notes in Computer Science, vol. 937, Springer-Verlag, 1995.
14. PreludeIDS, http://www.prelude-ids.org
15. Fragrouter, http://www.anzen.com/research/nidsbench/
A Functional Decomposition of Virus and Worm Programs

J. Krishna Murthy
Department of Computer Science & Engineering, Guru Nanak Engineering College, Ibrahimpatnam, A.P., India
Phone: +9104055332276, +919849409307
[email protected]
Abstract. This paper presents a decomposition of virus and worm programs based on their core functional components. The decomposition yields a catalogue of six functions performed by such malicious programs and a classification of the various ways these functions are implemented. The catalogue and classification provide a foundation for improving current reactive technologies for virus detection and for developing new proactive technologies for the same purpose. Current state-of-the-art reactive technologies identify malicious programs by matching signatures: sequences of bits collected from previously infected documents. The catalogue presented may be used to train engineers in what to "look for" when studying infected documents to extract signatures, to concisely document how various viruses work, and to exchange this information with other engineers, thus speeding up signature discovery. The catalogue may also be used to develop automatic recognizers using program pattern recognition techniques. When generalized, these recognizers can identify new but related viruses without any new signatures.
1 Introduction
Virus detection approaches can be broadly classified into two categories: AV software that employs static methods of detection and AV software that employs dynamic methods of detection. Static methods involve scanning programs for a sequence of symbols that is always found in any program infected with the virus; dynamic methods detect viruses by running a suspect program in an environment that emulates an actual PC [Kumar 92]. Commonly known static methods of detection are signature scanning, checksumming, integrity shells and heuristics. Among these, the most widely used method is signature scanning [Bontchev 02a] because it is simple to implement. The chief disadvantage of signature scanning is that it cannot detect unknown viruses. Dynamic methods of detection provide a means for detecting known and unknown viruses in programs by executing the program in an emulated environment. If the program under emulation makes anomalous accesses to system resources, it can be flagged as a virus. The main problem with this approach is an accidental execution of the virus program, which may break the defense mechanism of the emulator and thus execute on the actual computer system. In this case, we see that instead of defending a user from the virus, the
defense mechanism may actually aid the virus in compromising the user’s system, by providing the user with a false sense of security.
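As a point of reference for the static signature scanning discussed above, here is a minimal illustrative sketch of a naive scanner; the signature bytes, file handling and names are placeholders and do not correspond to any real anti-virus product or signature.

/* Naive signature scanning: report whether a fixed byte sequence occurs in a
   file. Real scanners use far more efficient multi-pattern automata; this
   only illustrates the idea of static matching. The signature below is a
   placeholder, not a real virus pattern. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static int contains_signature(const unsigned char *buf, size_t len,
                              const unsigned char *sig, size_t sig_len) {
    if (sig_len == 0 || sig_len > len)
        return 0;
    for (size_t i = 0; i + sig_len <= len; i++)
        if (memcmp(buf + i, sig, sig_len) == 0)
            return 1;                                /* signature found */
    return 0;
}

int main(int argc, char **argv) {
    static const unsigned char sig[] = { 0xDE, 0xAD, 0xBE, 0xEF };  /* placeholder */
    if (argc != 2) { fprintf(stderr, "usage: %s <file>\n", argv[0]); return 2; }
    FILE *fp = fopen(argv[1], "rb");
    if (!fp) { perror("fopen"); return 2; }
    fseek(fp, 0, SEEK_END);
    long n = ftell(fp);
    rewind(fp);
    if (n < 0) { fclose(fp); return 2; }
    unsigned char *buf = malloc(n > 0 ? (size_t)n : 1);
    if (!buf) { fclose(fp); return 2; }
    size_t got = fread(buf, 1, (size_t)n, fp);
    fclose(fp);
    int hit = contains_signature(buf, got, sig, sizeof sig);
    printf("%s: %s\n", argv[1], hit ? "signature found" : "no signature found");
    free(buf);
    return hit;
}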
What This Paper Presents
This paper presents a physiology for a class of programmed threats¹ commonly named viruses and worms. The paper identifies the various functional organs and their characteristics in virus and worm programs. The reasons for doing a physiological study of viruses and worms are the following. Our study of the widely available virus and worm creation toolkits, namely the VBSWorm generator kit, the Walrus Macro Virus Generator, and W97MVCK, available from web sites [Heavens 02], shows that these software systems provide a variety of options for generating different types of worms and viruses. The options provided were similar across the different toolkits. This motivates a thorough dissection of virus and worm code for the program features that are achieved using these options. These program features may not individually qualify as malicious, but a combination of these features does.
Contributions and Impact of This Research
The physiology of virus and worm programs provides a starting point and a framework for developing techniques for static analysis of programs. It identifies virus and worm program properties that are found in most classes of computer viruses. The paper studies implementations of malicious behavior in existing virus and worm programs, thus providing a better understanding of these behaviors. The behaviors identified provide a new way of proactively detecting virus and worm programs when used with static analysis tools.
2 Physiology
This section presents the main contribution of our research: the physiology of worm and virus programs.
2.1 Physiology of Viruses and Worm Programs
Physiology is defined as "the study of all the functions of a living organism or any of its parts" [Websters 98]. Previous researchers have shown that computer viruses are artificial life forms, performing functions similar to those of biological life forms [Spafford 94, Witten 90]. This work extends the analogy further by identifying and studying the functional organs of virus and worm programs. In Figure 1 we present an abstract model of an organ.
¹ A threat to a computer system is defined as a potential occurrence of a malicious or non-malicious event that has an adverse effect on the assets and resources associated with a computer system.
Fig. 1. An abstract model for an organ of a virus or worm program
Definition: An organ is defined as a 4-tuple {subject, action, object, function}.

• Object: An object is a passive system resource that is used to store information. Each object is assigned a security label. An object is uniquely identified by the following attributes:
Address: Each object in a system has an address, which is used to access the object.
Property: This is a characteristic or attribute possessed by an object.
Security Label: A security label is defined as an attribute associated with a computer system entity to denote its hierarchical sensitivity and need-to-know attributes. A security label consists of two components: a hierarchical security level and a possibly empty set of non-hierarchical security categories. In this model a security label is referred to as a label.

• Subject: Subjects are active entities in a system. A security label is associated with each subject. Subjects are also considered to be objects: thus S ⊆ O. Subjects can initiate requests for resources and utilize these resources to complete a computing task. Subjects are usually system processes or tasks, which are initiated on behalf of the user. Each subject is uniquely identified by the following attributes:
Identifier: An identifier consists of the name and address information of a subject, which can aid in uniquely locating the subject.
Security Label: The security label of a subject has the same definition as that of the security label of an object. It is used to enforce a security policy in the system, which decides in what way a subject can act on an object. For example, an object with a security label of {Administrator: write/read/execute, User: read/execute} can only be written to by users with administrator-level privileges, while others can only read and execute the object.
• Action: This is an abstraction comprising procedures that are initiated on behalf of a subject and are applied to an object. An action is always invoked by a trigger. An action consists of the following attributes:
Trigger: An action procedure executes when a trigger event for the action occurs. The triggering event can be a call-based event or a time-based event. A call-based event occurs when some other function or procedure calls the action procedure; such events are asynchronous in nature. An example is a call to an action procedure when a logic condition in a program evaluates to True². Another example is an interrupt generated by the system when a user hits a specific combination of keys on the keyboard. Time-based triggers are synchronous signals generated by the system, which may be received by the virus organ. The virus organ may in turn decide to act on the event or ignore it.
Procedure: A procedure is a sequence of functions which, when applied by a subject to an object, produces a result.
• Function: A function is a unique outcome of an action initiated by the subject on an object. In the current model of classification, we have identified six functions defined as outcomes of actions. The function characterizes the behavior of an organ. By fixing the function field of a 4-tuple organ to one of the six organ functionalities, we identify the subjects, objects and actions that may be involved. The organs in Figure 2 form the organ set O = {N, S, C, G, I, P} of virus and worm programs. By analyzing the source code (extracted from infected documents) of selected virus and worm programs in the wild, and by studying reports on viruses by virus researchers and anti-virus vendors, we have identified the following functional organs in viruses and worms. Each organ consists of code which executes to produce one of the following program functions:
• i(N)stall
• (S)urvey
• (C)onceal
• Propa(G)ate
• (I)nject
• (P)ayload

² True and False are boolean types.
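As an illustration only, the 4-tuple organ model can be written down as a small data type; the names below are hypothetical and merely mirror the fields and six functions defined above, not an implementation by the author.

/* Illustrative encoding of the {subject, action, object, function} organ model. */
#include <stdio.h>

typedef enum {                     /* the six organ functions of Sect. 2.1 */
    FN_INSTALL, FN_SURVEY, FN_CONCEAL, FN_PROPAGATE, FN_INJECT, FN_PAYLOAD
} OrganFunction;

typedef struct { const char *identifier; const char *label; } Subject;
typedef struct { const char *address;    const char *label; } Object;
typedef struct { const char *trigger;    const char *procedure; } Action;

typedef struct {                   /* an organ is a 4-tuple */
    Subject       subject;
    Object        object;
    Action        action;
    OrganFunction function;
} Organ;

int main(void) {
    /* hypothetical example: an installer organ triggered at system start-up */
    Organ installer = {
        .subject  = { "user process", "User: read/execute" },
        .object   = { "autostart configuration", "Administrator: write" },
        .action   = { "time-based: system restart", "write start-up entry" },
        .function = FN_INSTALL
    };
    printf("organ function id: %d\n", (int)installer.function);
    return 0;
}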
This study of viruses and worms deals with their functional organs; it does not include a clean host program Ph as a functional organ of a virus. Let U be the set of programs that can execute on a given computer system, with Ph ∈ U. Ph is called the host program when code segments implementing the organs of the virus are inserted in it. The host program is called a vector when it is used to carry the virus across different computer systems. Ph has been included in Figure 2 for completeness, since a virus program cannot be present in a system without attaching itself to a program (Ph). A high-level representation of the infection and replication cycles of worm and virus programs is shown in Figure 3.
Fig. 2. The functional organs of virus and worm programs shown as grayed nodes
Let V be a set of code segments implementing viral characteristics. Then Pi = Ph ∪ V. The operation of a virus program involves an infected program Pi which, when executed, performs a set of functions characteristic of the organs present in the set O. The operation of a worm program involves a program from U performing a set of functions characteristic of the organs present in the set O. The organs of a virus program execute functions that lead a system from an uncompromised integrity state to a compromised integrity state. One complete execution of the functions of the identified organs is called an infection cycle in the case of a virus and a replication cycle in the case of a worm. A mandatory requirement for a virus program is the absence of a Propagator organ in the infection cycle, while a mandatory requirement for a worm program is the presence of a Propagator organ.

2.1.1 Installer
Definition: An installer creates and maintains the installation qualifier for the virus to execute on the victim system and ensures the automatic interpretation of code segments from the set V.
Fig. 3. A representation of the replication cycle for a worm program
An installation qualifier is a permanent or a semi-permanent change in a machine's integrity state. A semi-permanent change is a change that may be reset when a system is restarted. This definition considers two criteria for a code segment to qualify as an Installer:
1. The code should cause a (semi-)permanent change in the machine's integrity state to indicate that the system is infected.
2. The code may ensure that the virus program is invoked after every time ti the system is restarted, or on the occurrence of an event.
2.1.2 Surveyor
Definition: A surveyor actively identifies appropriate targets (network hosts or objects) and their locators for other organs to perform correctly.
Here, a locator is an address or path information leading to the target. The function of identifying suitable targets and their locators is divided into three sub-functions, which the surveyor may carry out:
1. Find locators for host and network objects
2. Find vulnerabilities
3. Sense the replication qualifier's status

2.1.3 Concealer
Definition: A concealer prevents the discovery of the activity and structure of a virus program for the purpose of avoiding virus detection and forensics.
Webster's dictionary defines "forensics" as "the use of science and technology to investigate and establish facts in a criminal or civil court of law." Software forensics is the use of forensics in software-related disputes. It has been used for three purposes:
• Author identification
• Author discrimination
• Author characterization
2.1.4 Propagator
Definition: The propagator provides the logistic mechanisms for the transfer of virus code. Logistic mechanisms are technical and/or non-technical methods for the transfer of a virus from an infected network host to another target host.
The Propagator is a mandatory organ of a worm program. It is responsible for transferring a copy of the worm program from one host to another. The Surveyor organ provides it with the vulnerabilities to be exploited; thus the Propagator executes the exploits that are received from the Surveyor.

2.1.5 Injector
Definition: The injector organ injects a copy of the virus into the victim object such that the virus is placed in the execution space of the victim object. The copy of the virus may be exact or evolved, after being processed by the concealer organ.
The execution space of an object is the code segment of the victim object or the environment in which the interpretation of the object will take place. The injector is a mandatory organ of a virus program. It enforces the mechanisms for copying the virus code into a clean³ object within a system. The mechanisms of injection are based on one condition that must always hold: the virus must have information about the objects it is going to attack. In other words, injection can occur only on known objects. Hence, there is always an exchange of information between the Injector and the Surveyor organs for the injection process to execute. Figure 4 displays the virus injection process in a program. The important design issue in a virus is the selection of the injection point X shown in the figure. The selection of X requires the injection condition to hold. The virus injection shown in the figure may not always involve the insertion of all the virus instructions between two instructions of the target object. The virus instructions may be appended at the end or beginning of the target, and an instruction transferring program control to the virus block may be inserted at any desired point X in the target. This helps the virus reduce the work required to create enough space in the program code segment for inserting the complete virus block and to recompute the relative addresses referenced by the program instructions. This is an important reason why viruses do not choose arbitrary points of injection in target objects. We see that the majority of viruses written in low-level languages inject their code at the beginning or end of the target object. This conclusion does not hold for viruses implemented in scripting languages, because there the insertion can take place at a desired point X using a call to the virus function. In this case there is no need to recompute the relative addresses after code insertion, since that is taken care of by the language implementation itself (during the compilation or interpretation stage). A virus implementation only has to check that the selected injection point lies inside the target's main⁴ routine. Injection of virus code into binary programs depends on the file format of the target. Usually a virus or worm is confined to injecting code into objects that adhere to a narrow range of file formats, usually one or two. Current-day platforms like Microsoft Windows use the Portable Executable format (PE file format) to store program-loading⁵ information. The section table contains information about each section in the
³ Clean is a relative term here, since the object may have been infected by another virus.
⁴ The C language equivalent of main is main(int argc, char **argv).
⁵ The linker provides the loading information in the file header of an executable, and a loader uses this information to load the program image into memory.
Fig. 4. Injection of a virus into a target
executable code. The commonly known sections of executable code are the .text, .data and .bss sections, which respectively contain the program code, the program data and the statically defined data of a program. During the injection process, the virus usually patches a new section header into the section table present in the executable's image. The body of the virus is appended to the end of the original host program, and the PE header's AddressOfEntryPoint field (the program entry point) is updated to point to the virus's code (present at the end of the executable). Also, the number-of-sections field in the PE header is incremented by one. Thus, whenever this modified image is executed, the virus code executes first and, after finishing its execution, transfers control to the actual code of the program image. Other methods of injection into binary executables are usually variations of this technique. (A simple heuristic check based on these PE fields is sketched at the end of this section.)

2.1.6 Payload
Definition: The Payload organ can be considered a thunk, since it behaves as a closure that is created to delay evaluation. The thunk consists of a set of symbol sequences which may be interpreted
a. at a time tp after the installation of the virus, where 0 < tp < Tp (a finite time),
b. at an instance of a logic condition being satisfied, or
c. when a system- or user-generated event occurs.
This section carries out the task for which the virus has been constructed. The task payload can range from a benign to a malicious activity intended by the virus author(s). The task payload section is identified if it carries out anomalous activity on the victim host or network.
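As a purely illustrative companion to the PE-injection layout described above, the following sketch flags a Windows PE file whose entry point lies inside its last section, one symptom of the append-and-patch injection just discussed. It uses the standard Win32 PE structures from windows.h; the heuristic and all naming are ours, error handling is minimal, and it is not a method proposed in this paper.

/* Heuristic sketch (not a full detector): report whether AddressOfEntryPoint
   falls inside the last section of a PE image, as often happens after the
   append-and-patch injection described in the text. Windows-only. */
#include <stdio.h>
#include <stdlib.h>
#include <windows.h>

/* returns 1 if the entry point is in the last section, 0 if not, -1 on error */
static int entry_in_last_section(const unsigned char *buf, size_t len) {
    if (len < sizeof(IMAGE_DOS_HEADER)) return -1;
    const IMAGE_DOS_HEADER *dos = (const IMAGE_DOS_HEADER *)buf;
    if (dos->e_magic != IMAGE_DOS_SIGNATURE) return -1;
    if ((size_t)dos->e_lfanew + sizeof(IMAGE_NT_HEADERS) > len) return -1;
    const IMAGE_NT_HEADERS *nt =
        (const IMAGE_NT_HEADERS *)(buf + dos->e_lfanew);
    if (nt->Signature != IMAGE_NT_SIGNATURE) return -1;
    WORD nsec = nt->FileHeader.NumberOfSections;
    if (nsec == 0) return -1;
    const IMAGE_SECTION_HEADER *last = IMAGE_FIRST_SECTION(nt) + (nsec - 1);
    DWORD ep = nt->OptionalHeader.AddressOfEntryPoint;
    return ep >= last->VirtualAddress &&
           ep <  last->VirtualAddress + last->Misc.VirtualSize;
}

int main(int argc, char **argv) {
    if (argc != 2) { fprintf(stderr, "usage: %s <pe-file>\n", argv[0]); return 2; }
    FILE *fp = fopen(argv[1], "rb");
    if (!fp) { perror("fopen"); return 2; }
    fseek(fp, 0, SEEK_END);
    long n = ftell(fp);
    rewind(fp);
    if (n <= 0) { fclose(fp); return 2; }
    unsigned char *buf = malloc((size_t)n);
    if (!buf) { fclose(fp); return 2; }
    size_t got = fread(buf, 1, (size_t)n, fp);
    fclose(fp);
    int r = entry_in_last_section(buf, got);
    free(buf);
    if (r < 0)       printf("%s: not a valid PE file\n", argv[1]);
    else if (r == 1) printf("%s: entry point in last section (suspicious)\n", argv[1]);
    else             printf("%s: entry point not in last section\n", argv[1]);
    return 0;
}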
3 Conclusions
Detecting viruses and worms by studying their behavior is a new development in the field of anti-virus research. This paper identifies the organs of virus programs and gives abstract definitions for them. We present a method of decomposing malicious behavior using a 4-tuple representation: {Subject, Object, Action, Function}. This model classifies the different aspects of a malicious program on the basis of who executes it, what it acts on, how it acts, and the results of the action. The advantage of this method of classification is the easy identification of code segments in a malicious program. While studying virus and worm source code as part of this work, we frequently observed that different viruses, separated by the time of their appearance in the wild, had very similar source code. Sometimes, parts of the source code of a virus seemed to have been copied from older viruses. Viruses that had remarkably different source code (even those implemented in different languages) displayed identical program behavior. A conclusion from this observation is that although detecting viruses in general is an undecidable problem, detecting the class of most commonly occurring viruses by studying previous virus behavior is possible.
References
[Bishop 01] Mat Bishop. A Critical Analysis of Vulnerability Taxonomies. Technical Report 96-11, Department of Computer Science, University of California at Davis, April 19, 2001.
[Bontchev 02] V. V. Bontchev. Extracting Word Macros. Personal Communication, 17 March 2002.
[Bontchev 02a] V. V. Bontchev. Number of Signatures per Anti-Virus Software. Personal Communication, 18 March 2002.
[Bontchev 98] V. V. Bontchev. Methodology of Computer Anti-Virus Research. PhD dissertation, University of Hamburg, Hamburg, 1998.
[Bontchev 96] V. V. Bontchev. Possible Macro Virus Attacks and How to Prevent Them. Proceedings of the 6th Virus Bulletin Conference, September 1996, Brighton, UK, Virus Bulletin Ltd, Oxfordshire, England, 1996.
[Chess 91] D. M. Chess. Virus Verification and Removal Tools and Techniques. http://www.research.ibm.com/antivirus/SciPapers/Chess/CHESS3/chess3.html, November 18, 1991.
[Cifuentes 94] C. Cifuentes. Reverse Compilation Techniques. PhD dissertation, Queensland University of Technology, 1994.
[Cohen 94] F. Cohen. A Short Course in Computer Viruses. John Wiley and Sons, 1994.
[Cohen 85] F. Cohen. Computer Virus. PhD dissertation, Department of Computer Science, University of Southern California, 1985.
[Cohen 84] F. Cohen. Computer Viruses - Theory and Experiments. Computers and Security, Volume 6, Number 1, pp. 22-35, 1984.
[Eichin 89] Mark W. Eichin and Jon A. Rochlis. With Microscope and Tweezers: An Analysis of the Internet Virus of November 1988. Proceedings of the 1989 IEEE Computer Society Symposium on Security and Privacy, 1989.
[Fyoder 98] Fyoder. Remote OS Detection via TCP/IP Stack Fingerprinting. http://www.insecure.org/nmap/nmap-fingerprinting-article.txt, October 18, 1998.
[Group 99] H. R. Group. The Honeynet Project. http://www.honeynet.org, 2001.
[Howard 97] J. D. Howard. An Analysis of Security Incidents on the Internet. PhD dissertation, Carnegie Mellon University. http://www.cert.org/research/JHThesis/Start.html, 1997.
[Ko 97] C. Ko, M. Ruschitzka, and K. Levitt. Execution Monitoring of Security-Critical Programs in Distributed Systems: A Specification-Based Approach. Proc. IEEE Symposium on Security and Privacy, 1997.
[Kumar 92] Sandeep Kumar and E. H. Spafford. Generic Virus Scanner in C++. Proceedings of the 8th Computer Security Applications Conference, 2-4 December 1992.
[Microsoft 02] Microsoft MSDN. Using Script Encoder. MSDN, http://msdn.microsoft.com, 2002.
[Moore 01] D. Moore. The Spread of the Code-Red Worm (CRv2). CAIDA, http://www.caida.org, 2001.
[Morris 85] R. T. Morris. A Weakness in the 4.2BSD Unix TCP/IP Software. Technical Report Computer Science #117, AT&T Bell Labs, 1985.
[Heavens 02] VX Heavens. Virus Creation Tools. http://vx.netlux.org/dat/vct.shtml, 2002.
[Pethia 99] R. Pethia. The Melissa Virus: Inoculating Our Information Technology from Emerging Threats. Testimony of Richard Pethia, http://www.cert.org/congressional_testimony/pethia9904.html, 1999.
[Sander 02] P. A. Porras. Virology Lecture Notes. http://www.tulane.edu/~dmsander/WWW/224/224Virology.html, 2002.
[Skulason 91] A.S. Fridrik Skulason and Vesselin Bontchev. A New Virus Naming Convention. CARO meeting, http://vx.netlux.org/lib/asb01.html, 1991.
[Spafford 94] Eugene H. Spafford. Computer Viruses as Artificial Life. Artificial Life, Volume 1, Number 3, pp. 249-265, 1994.
[Spafford 89] E. H. Spafford. The Internet Worm Program: An Analysis. ACM Computer 19(1), pp. 17-57, 1989.
[Weaver 02] N. Weaver. Potential Strategies for High Speed Active Worms: A Worst Case Analysis. http://www.cs.berkeley.edu/~nweaver, 2002.
[Websters 98] Merriam-Webster's Collegiate Dictionary, 10th edition. International Thomson Publishing, ISBN 0877797099, 1998.
[Wildlist 02] The WildList FAQ. The WildList Organization International. http://www.wildlist.org/faq.htm, 2001.
[Witten 90] I. H. Witten, H. W. Thimbleby, G. F. Coulouris, and S. Greenberg. Liveware: A New Approach to Sharing Data in Social Networks. International Journal of Man-Machine Studies, 1990.
Author Index
Ashourian, Mohsen 179
Bao, Feng 72, 84, 88, 301
Chan, Pik-Wah 202 Chang, Chin-Chen 382 Chen, Hao 370 Chen, Hua 337 Chen, Jin 325 Chen, Xiaofeng 249 Cheon, Jung Hee 11 Chi, Chi-Hung 22 Ciet, Mathieu 348 Cui, Yang 269 Deng, Robert H. 72, 84, 88, 238, 301 Duc, Dang Nguyen 11 Esparza, Oscar
191
Feng, Dengguo 337 Feng, Wang 1 Forn´e, Jordi 191 Fung, Karyin 34 Gao, Wen
136
269
Jamhour, Edgard Jin, Hai 370 Joye, Marc 348 Ju, Jiu-bin 393
Ma, Miao 124 Maziero, Carlos 47 Mihaljevi´c, Miodrag J. 158 Moon, Jongsub 313 Morikawa, Yoshitaka 1 Mu˜ noz, Jose L. 191 Murthy, J. Krishna 405 Nabhen, Ricardo 47 Naccache, David 60 Nogami, Yasuyuki 1 Onieva, Jose Antonio
Han, Zongfen 370 Hitchens, Michael 145 Ho, Yo-Sung 179 Huang, Hui-Feng 382 Hypp¨ onen, Konstantin 60 Imai, Hideki
Li, Lan 337 Li, Tie-Yan 214 Li, Tieyan 22 Li, Wei 360 Li, Xiaoqiang 360 Liu, Joseph K. 34 Liu, Shaohui 136 Liu, Yongliang 136 Liu, Zhenhua 260 Lopez, Javier 112 L¨ u, Shuwang 260 Luo, Min 325 Lyu, Michael R. 202
47
Kim, Kwangjo 11, 249 Kobara, Kazukuni 269 Lam, Kwok-Yan 214 Lee, Henry C.J. 124 Li, Gang 170
Rhee, Kyung Hyune
112 100
Saunders, Gregory 145 Seo, JungTaek 313 Sohn, Taeshik 313 Soriano, Miguel 191 Sun, Jianhua 370 Sur, Chul 100 Tchoulkine, Alexei 60 Thing, Vrizlynn L.L. 124 Trichina, Elena 60 Varadharajan, Vijay Wang, Wang, Wang, Wang,
Guilin 72 Lina 325 Yan 22, 260 Yu-Min 226
145
Wang, Yumin 292 Wang, Zhao 136 Wei, Victor K. 34 Wong, Duncan S. 34 Wu, Hongjun 84 Wu, Qian-Hong 226 Wu, Qianhong 292 Wu, Yongdong 238 Xie, Yan 249 Xu, Yi 124 Xu, Zhen 337 Xue, Xiangyang
360
Yang, Jie 170 Yang, Jong-Phil 100 Yang, Xue-jun 280 Yi, Xiao-dong 280 Zhang, Fangguo 249 Zhang, Huanguo 325 Zhang, Jian-Hong 226 Zhang, Jianhong 292 Zhang, Meng 393 Zhang, Qian 370 Zhou, Jianying 72, 88, 112 Zhu, HuaFei 214, 301