Privacy Protection for E-Services
George Yee, National Research Council Canada, Canada
IDEA GROUP PUBLISHING Hershey • London • Melbourne • Singapore
Acquisitions Editor: Michelle Potter
Development Editor: Kristin Roth
Senior Managing Editor: Amanda Appicello
Managing Editor: Jennifer Neidig
Copy Editor: Kim Barger
Typesetter: Sharon Berger
Cover Design: Lisa Tosheff
Printed at: Yurchak Printing Inc.
Published in the United States of America by Idea Group Publishing (an imprint of Idea Group Inc.), 701 E. Chocolate Avenue, Hershey PA 17033. Tel: 717-533-8845; Fax: 717-533-8661; E-mail: [email protected]; Web site: http://www.idea-group.com
and in the United Kingdom by Idea Group Publishing (an imprint of Idea Group Inc.), 3 Henrietta Street, Covent Garden, London WC2E 8LU. Tel: 44 20 7240 0856; Fax: 44 20 7379 0609; Web site: http://www.eurospanonline.com
Copyright © 2006 by Idea Group Inc. All rights reserved. No part of this book may be reproduced, stored or distributed in any form or by any means, electronic or mechanical, including photocopying, without written permission from the publisher.
Product or company names used in this book are for identification purposes only. Inclusion of the names of the products or companies does not indicate a claim of ownership by IGI of the trademark or registered trademark.
Library of Congress Cataloging-in-Publication Data
Privacy protection for e-services / George Yee, editor.
p. cm.
Summary: "This book reports on the latest advances in privacy protection issues and technologies for e-services, ranging from consumer empowerment to assess privacy risks, to security technologies needed for privacy protection, to systems for privacy policy enforcement, and even methods for assessing privacy technologies"--Provided by publisher.
Includes bibliographical references and index.
ISBN 1-59140-914-4 (hardcover) -- ISBN 1-59140-915-2 (softcover) -- ISBN 1-59140-916-0 (ebook)
1. Electronic commerce--Security measures. 2. Computer security. 3. Data protection. 4. Privacy, Right of. I. Yee, George.
HF5548.37.P753 2006
005.8--dc22
2005032104
British Cataloguing in Publication Data
A Cataloguing in Publication record for this book is available from the British Library.
All work contributed to this book is new, previously-unpublished material. The views expressed in this book are those of the authors, but not necessarily of the publisher.
Privacy Protection for E-Services
Table of Contents
Preface ................................................................................................................... vii

Section I: Issues and Challenges

Chapter I
Exercising the Right of Privacy ................................................................................ 1
    Scott Flinn, National Research Council Canada, Canada
    Scott Buffett, National Research Council Canada, Canada

Chapter II
Privacy Issues in the Web Services Architecture (WSA) ....................................... 29
    Barbara Carminati, University of Insubria at Como, Italy
    Elena Ferrari, University of Insubria at Como, Italy
    Patrick C. K. Hung, University of Ontario Institute of Technology (UOIT), Canada

Chapter III
The Impact of Information Technology in Healthcare Privacy ................................. 56
    Maria Yin Ling Fung, University of Auckland, New Zealand
    John Paynter, University of Auckland, New Zealand

Chapter IV
E-Services Privacy: Needs, Approaches, Challenges, Models, and Dimensions ....... 94
    Osama Shata, Specialized Engineering Office, Egypt
Section II: Privacy Protection From Security Mechanisms and Standards

Chapter V
Privacy Protection Through Security ................................................................... 115
    Martine C. Ménard, Policy Research Initiative, Canada

Chapter VI
Pseudonym Technology for E-Services ................................................................. 141
    Ronggong Song, National Research Council Canada, Canada
    Larry Korba, National Research Council Canada, Canada
    George Yee, National Research Council Canada, Canada

Chapter VII
Privacy Enforcement in E-Services Environments ................................................. 172
    Carlisle Adams, University of Ottawa, Canada
    Katerine Barbieri, University of Ottawa, Canada

Chapter VIII
Protecting Privacy Using XML, XACML, and SAML ........................................... 203
    Ed Simon, XMLsec Inc., Canada

Section III: Privacy Protection Architectures and Other Privacy Topics

Chapter IX
Privacy Management Architectures for E-Services .............................................. 234
    Larry Korba, National Research Council Canada, Canada
    Ronggong Song, National Research Council Canada, Canada
    George Yee, National Research Council Canada, Canada

Chapter X
Modeling Method for Assessing Privacy Technologies ......................................... 265
    Michael Weiss, Carleton University, Canada
    Babak Esfandiari, Carleton University, Canada

Chapter XI
Legislative Bases for Personal Privacy Policy Specification ................................. 281
    George Yee, National Research Council Canada, Canada
    Larry Korba, National Research Council Canada, Canada
    Ronggong Song, National Research Council Canada, Canada

About the Authors ................................................................................................ 295

Index .................................................................................................................. 300
Preface
Almost no one realizes exactly how important privacy is in his or her life.
Bruce Schneier in "Secrets and Lies" (Schneier, 2000)

This book arises from the confluence of three recent trends, namely, the growth of the Internet and e-services, the growth of consumer awareness of their lack of privacy, and the spread of privacy legislation enacted by many jurisdictions. The first two trends in part dictate the need for the third trend, but as we will see, privacy legislation was not enacted solely for e-services and also involves non-electronic privacy. Let us examine each of these trends in more detail.

Growth of the Internet and E-Services. The Internet is growing by leaps and bounds, as can be seen by the rapidly increasing amounts of information returned by search engines. The search problem faced by Internet users today is not the lack of information from searches, but how to make sense of the reams and reams of information generated by a search. In addition, the Internet is growing from the spread of computer technology in developing countries as well as the harnessing of the Internet for new applications such as healthcare. Accompanying the growth of the Internet has been the availability of a diverse number of e-services. Most consumers are familiar with online banking or online retailing (e.g. Amazon.com) via the Internet. Other Internet-based services such as e-learning (online courses), e-government (online government services such as tax information), and e-health (online medical advice and others) are becoming more commonplace as well. Before proceeding any further, it is useful to have a common understanding of an e-service. An e-service for the purposes of this book is characterized by the following attributes (Yee & Korba, 2005):
•	The service is accessible across the Internet.

•	The service is performed by application software (service software) that is owned by a provider (usually a company).

•	The provider's service software can make use of the service software of other providers in order to perform its service.

•	A provider can have more than one such service.

•	The service is consumed by a person or another application accessing the service across the Internet.

•	There is usually a fee that the consumer pays the provider for use of the service.

•	The consumer has privacy and security preferences for the service that may or may not be followed by the provider.
For example, consider Amazon.com. Its retailing service is accessible across the Internet. The service is performed by application software owned by Amazon.com. Amazon.com makes use of other providers (e.g. Paypal.com) to provide its service. The service is consumed by individual users across the Internet for a fee (which is built into the price of a product). Finally, each user has privacy and security preferences for the service, such as not wanting personal contact information to be disclosed to other parties without permission and wanting such information to be stored securely by the provider. These preferences bring us to the next trend.

Growth of Consumer Awareness of Their Lack of Privacy. Everyone who has ever purchased anything from the Internet has had the experience of pausing and wondering if it is "safe" to enter one's credit card information. This is an example of a consumer becoming aware of his/her possible lack of privacy. Clearly, the more one is exposed to new services on the Internet and the varied personal information that is demanded by these services, the more one wonders whether the personal information that one enters will be kept safe. Other factors also come into play to push this message home. One very important factor is the frequent news reports of break-ins to credit card servers and of corporate process errors that result in consumers' private records being faxed to total strangers. How can consumers have any confidence or trust in providers if they keep hearing of such events in the news? In fact, studies have shown that the growth of e-commerce would be many times the current rate if consumers could trust their e-service providers, and a key component of mistrust is the lack of privacy. According to Van Slyke, Belanger, and Comunale (2004), worldwide Internet commerce is expected to reach $8.5 trillion in 2005, of which online retail sales is the most evident, with U.S. consumers spending $51.3 billion online in 2001, $72.1 billion in 2002, and a projected $217.8 billion in 2007. However, these authors also report that not all forecasts are as rosy: while total online spending is increasing, per person online spending is quickly declining. The authors indicate that concerns over privacy and trust are among the most important factors that turn an online buyer into a non-buyer. Finally, consumer attention is being focused on the right to privacy by government legislation (the third trend). For example, in Canada, federal privacy legislation has forced common consumer service providers such as dentists and eye glass makers to request that consumers sign forms giving the providers permission to collect their private information.

Spread of Privacy Legislation Enacted by Many Jurisdictions. In recent years, more and more jurisdictions have enacted privacy legislation, which, as noted above, focuses the consumer on his/her rights to privacy and contributes to consumer privacy awareness. The federal legislation referred to at the end of the previous paragraph is known as the Personal Information Protection and Electronic Documents Act (PIPEDA) (Government of Canada) and
contains provisions for provider accountability, identification of the purpose of private data collection, consent of the consumer for data collection, and others. In the European Union (EU), privacy is defined as a human right under Article 8 of the 1950 European Convention on Human Rights and Fundamental Freedoms. The implementation of this Article can be traced to The Directive (European Union Directive). The Directive applies to all sectors of EU public life and is framed in terms of "data subjects" (owners of private data), "data controllers" (entities having control over private data and accountable for correct processing and handling of the data), and "data processors" (entities that process private data on behalf of data controllers). The structure of this framework balances the fundamental rights of the data subject against the legitimate interests of data controllers. Privacy protection in the United States is achieved through a patchwork of legislation at the federal and state levels. Privacy legislation is largely sector-based (Banisar, 1999). Of prominent recent interest are privacy laws enacted in the U.S. healthcare sector, as exemplified by the Health Insurance Portability and Accountability Act (HIPAA) (U.S. Government). HIPAA consumer privacy provisions include the right to obtain a copy of one's medical record, the right to make corrections to one's health information, and the right to give or withhold one's permission for the use or sharing of one's health information by a service provider. In all cases of privacy legislation described above, the legislation applies to all pertinent activities that an individual may have in daily life involving the exchange of private information, and not only to electronic activity.

The confluence of these three trends yields the following conclusions:
•	If e-services are to grow and succeed, consumer privacy must be protected.

•	E-service providers will need to protect consumer privacy as prescribed by law.

•	International e-services will be subject to privacy laws from multiple jurisdictions. There will be a need for international cooperation to ensure that privacy laws are consistent across national boundaries.
This book suggests solutions for the first two points. The third point is a very complex legal and political issue that requires much in-depth research. It is outside the scope of this book.
Current Situation With Protection of Electronic Privacy

The current response of e-service providers to the need for consumer privacy has been weak at best. There are essentially two tiers of privacy provisions offered: the first tier is merely the posting of the Web site's privacy policy, while the second tier consists of the use of P3P (Platform for Privacy Preferences Project) (World Wide Web Consortium) technology that allows the automatic checking of a consumer's privacy preferences against the provider's privacy policy. Merely posting the Web site's privacy policy and requesting consumers to read it is tantamount to a joke. It is more a legalistic
self-protection action than one that has the consumer's best interest at heart. First of all, most consumers will not bother to read it (who would?), since there are many other more pressing (and perhaps more interesting) things to do. Secondly, and more importantly, the posted provisions do not speak to the consumer's personal privacy needs, only to the provider's needs. Everyone is different and has different privacy needs. The expectation of one provider policy fitting everyone's needs is ridiculous.

The second tier, the use of P3P technology for automatic checking of a consumer's privacy preferences against the provider's policy, is not much better. It only solves the problem of the consumer not bothering to read the policy. The problem of a possible mismatch between the consumer's privacy preferences and the provider's privacy policy is still present. To solve this latter problem, providers need to implement systems that allow a consumer to negotiate the provider's privacy policy where it fails to match up with the consumer's privacy preferences (Yee & Korba, January 2003, May 2003). While we are on the subject of P3P technology, we should also mention the AT&T Privacy Bird (AT&T), a user agent in the form of a bird on the user's screen that changes color to signal that the user's privacy preferences are incompatible with a Web site's privacy policy. This is a cute way of promoting P3P technology and makes the technology more attractive to use, but it still does not solve the problem of matching personal preferences against provider policies.

In addition to the current problems of consumers expressing their privacy preferences, the other side of the coin has to do with provider follow-up of consumers' privacy preferences. Let us assume that consumers can express privacy preferences to providers. How do consumers know that their preferences will be followed? Currently, most e-service sites do not provide any form of guarantee that they will even honor their own privacy policies, let alone the privacy preferences of consumers. Yet, such guarantees are needed to avoid consumer mistrust of e-services.

Finally, there is today a lack of good security to protect consumers' private information while in the possession of providers. Witness the recent news events concerning break-ins at servers holding consumer credit card information, and bank negligence that saw the faxing of customers' private banking records to a total stranger (the bank continued the faxing even after it was told of it!). This lack of security is becoming even more visible to the public eye as the media focuses on high-profile uses of private information in sectors such as healthcare. Another blight on security is the possibility of an insider attack, which is extremely difficult to defend against. In the healthcare sector, abuse of private health information by insiders is rampant. Given today's situation with consumer privacy protection, it is an understatement to say that there is much work to be done. Hopefully, this book will supply some of the missing pieces.
Challenges and Opportunities

In discussing challenges and opportunities (C & O) in e-services privacy protection, a logical path to take is to first address C & O that arise from the current situation as
described in the previous section. After that, it would be useful to mention any remaining items. C & O arising from the current situation with e-services privacy protection include:
•	Negotiation (perhaps automated) mechanisms are needed to allow e-service consumers to negotiate a provider's privacy policy. Some headway is being made in the area of XML-based Web services, where languages such as SOAP, WS-Policy (World Wide Web Consortium-1), and XACML (OASIS) provide facilities to capture privacy preferences and build negotiation systems. However, this work towards negotiation systems is still at the research stage.

•	Providers need to implement systems (policy conformance systems) that automatically implement the provisions of a consumer's privacy policy (statement of privacy preferences) and provide guarantees that consumer privacy preferences are followed.

•	There needs to be better security put in place to protect a consumer's private information while it is in the provider's possession. In particular, there is a need to find better defenses against insider attack.

•	Security implementers need to consider other paradigms of privacy protection. For example, it is not always the protection of private information once it has been given away that is the most effective. The other paradigm is not to give away the private information at all and still fulfill the requirements of e-services. A technology that follows the latter paradigm is pseudonym technology. Another example of a paradigm-like shift in thinking is to apply the mechanisms of digital rights protection to privacy rights protection.

•	Consumers need to be educated on their privacy rights (e.g. through exposure to privacy legislation) as well as the use of the Internet. Such education will prepare them to formulate effective privacy policies and negotiate them on the Internet.

•	Consumers also need to be empowered with tools and techniques to assess privacy risks in order to make the right privacy choices, either in formulating their privacy policies or in negotiation.

•	Standards are needed for policy negotiation and conformance systems to promote cross-service development, consistent essential operation, and trust.
Other challenges and opportunities that complement current privacy needs include:
•	System implementers need to examine their system's privacy requirements and learn from good, published examples of privacy architectures that fulfill such requirements.

•	To make progress in privacy technologies, methods and tools are needed to assess one privacy technology against another and to compare privacy technologies in terms of measures of effectiveness.
Organization of This Book

This book reports on the latest advances in privacy protection issues and technologies for e-services. It is organized into three sections and 11 chapters. A brief description of each chapter follows.
Section I: Issues and Challenges

Chapter I discusses privacy from the viewpoint of the consumer of e-services. It provides a foundation for developing approaches to empower users with control over their private information. It proposes a technique for risk management assessment designed to help consumers evaluate a situation to identify and understand potential privacy concerns. The chapter discusses how a consumer can understand exposure risks and how information can be controlled and monitored to mitigate the risks. It also proposes a method for assessing the consumer's value of personal information. In addition, it presents a mechanism for automated negotiation to facilitate fair, private information exchange. The authors believe that these or similar techniques are essential to give consumers of e-services meaningful control over the personal information they release.

Chapter II discusses privacy challenges of Web services, which are based on a set of XML standards such as Universal Description, Discovery and Integration (UDDI), Web Services Description Language (WSDL), and Simple Object Access Protocol (SOAP). To enable privacy protection for Web service consumers across multiple domains and services, the World Wide Web Consortium (W3C) published a document called "Web Services Architecture (WSA) Requirements" that defines some fundamental privacy requirements for Web services. However, no comprehensive solutions to the various privacy issues have so far been defined. This chapter focuses on privacy technologies by first discussing the main privacy issues in WSA and related protocols. Then, it provides illustrations of the standardization efforts going on in the context of privacy for Web services and proposes different technical approaches to tackle the privacy issues.

Chapter III examines privacy issues in the health sector and how they are handled in the United States and New Zealand. The increased use of the Internet and the latest information technologies such as wireless computing are revolutionizing the healthcare industry by improving services and reducing costs. These advances in technology help to empower individuals to understand and take charge of their healthcare needs. For example, patients can search for healthcare information over the Internet and interact with physicians. However, the same advances in technologies have also heightened privacy awareness. Privacy issues include healthcare Web sites that do not practice the privacy policies they preach, computer break-ins, insider and hacker attacks, temporary and careless employees, virus attacks, human errors, system design faults, and social engineering. The chapter reports on a study using a sample of 20 New Zealand health Web sites.

Chapter IV describes several aspects of electronic privacy such as needs, approaches, challenges, and models. The author's view is that privacy protection, although of
interest to many parties such as industry, government, and individuals, is very difficult to achieve since these stakeholders often have conflicting needs and requirements and may even have conflicting understandings of privacy. Therefore, finding one model or one approach to privacy protection that satisfies all these stakeholders is a daunting task. The chapter discusses various aspects of privacy protection, such as the development of privacy policies, the privacy needs of individuals and organizations, the challenges of adopting and coping with privacy policies, the tools and models to support privacy protection in both public and private networks, related laws that protect or constrain privacy, as well as spamming and Internet censorship in the privacy context. The author hopes that understanding these privacy aspects will assist researchers in developing policies and systems that will bring the interests of the different parties into better alignment.
Section II: Privacy Protection From Security Mechanisms and Standards

Chapter V discusses how implementing network and computer security measures can protect the privacy of Internet users. Personally identifiable information is valuable to clients and businesses alike, and therefore, both are responsible for securing privacy. Clients and businesses need to understand the vulnerabilities, threats, and risks that they face. They need to know what information requires protection and from whom. Businesses, in addition, need to comprehend the business issues involved in securing data. Privacy-protecting security measures need to be a strong mix of technological, physical, procedural, and logical measures, where each measure is implemented in overlapping layers. According to the author, privacy solutions must be flexible, meet the objectives and business goals, and be revised on a regular basis.

Chapter VI describes the use of pseudonyms, a privacy protection technology that is rising quickly to prominence. Current e-services allow for easy and efficient personal data collection through integration, interconnection, and data mining, since the user's real identity is used. Pseudonym technology with unlinkability, anonymity, and accountability can give the user the ability to control the collection, retention, and distribution of his/her personal information. The chapter explores the challenges, issues, and solutions associated with pseudonym technology for privacy protection in e-services. The chapter describes a general pseudonym system architecture, discusses the relationships between pseudonyms and other privacy technologies, and summarizes pseudonym application requirements. Based on these requirements, the chapter compares a number of existing pseudonym technologies. In addition, the chapter gives an example of a pseudonym application: the use of an e-wallet for e-services.

Chapter VII presents technologies for privacy enforcement (techniques that can be used to ensure that an organization's privacy promises will be kept). It gives an introduction to the current state of privacy enforcement technologies for e-services environments, proposes a comprehensive privacy enforcement architecture, and discusses some issues and challenges related to privacy enforcement solutions. The authors state that the goal of their proposed architecture, aside from bringing together many of the current isolated technologies, is to ensure consistency between the advertised
privacy promises and the actual privacy practices of the e-service provider, so that users can have greater confidence that their personal data will be safeguarded as promised.

Chapter VIII presents a tutorial on how two new XML-based technologies, XACML (eXtensible Access Control Markup Language) and SAML (Security Assertion Markup Language), can be used to help protect privacy in e-services. The chapter briefly introduces XML, and then details the privacy features of XACML and SAML. The chapter illustrates concepts with detailed examples. The author hopes that readers will be both informed and intrigued by the possibilities for privacy applications made possible by XML, XACML, and SAML.
Section III: Privacy Protection Architectures and Other Privacy Topics

Chapter IX first describes some driving forces and approaches for developing and deploying a privacy architecture for e-services. It then reviews several architectures that have been proposed or developed for managing privacy. The chapter offers the reader a quick tour of ideas and building blocks for creating privacy-protection enabled e-services and describes several privacy information flow scenarios that can be applied in assessing any e-service privacy architecture. The authors conclude the chapter with a summary of the work covered and a discussion of some outstanding issues in the application of privacy architectures to e-services.

Chapter X proposes a modeling framework for assessing privacy technologies. The main contributions of the framework are to allow the modeling of aspects of privacy and related system concerns (such as security and scalability) in a more comprehensive manner than the dataflow diagrams traditionally used for privacy analysis. The chapter also takes a feature interaction perspective that allows reasoning about conflicts between a service user's model of how the service works and the service's actual implementation. To demonstrate the framework, the authors illustrate how it can be applied to the analysis of single sign-on solutions such as .Net Passport.

Chapter XI describes how recent privacy legislation in Canada, the European Union, and the United States can be used to define the minimum and necessary content of a personal privacy policy. The authors believe that the use of a personal privacy policy to express an individual's privacy preferences is best suited for managing consumer privacy in e-commerce. The chapter first motivates the reader with an e-service privacy policy model that explains how personal privacy policies can be used for e-services. It then derives the minimum and necessary (because it is the law) content of a personal privacy policy by examining some key privacy legislation selected from Canada, the European Union, and the United States.
Conclusions

The editor of this book has collected material that addresses most of the challenges and opportunities mentioned. The material ranges from consumer empowerment to assess privacy risks, to security technologies needed for privacy protection, to systems for privacy policy enforcement, and even methods for assessing privacy technologies. The editor is confident that the reader will find this book invaluable in the domain of e-services privacy protection. This book is intended for consumers, educators, researchers, designers, and developers who are interested in the protection of consumer privacy for Internet services. Although there are other books on privacy, no other book contains the latest information and deals with the challenges and opportunities of consumer privacy protection as presented here.
References

AT&T. (n.d.). Articles about AT&T Privacy Bird. Retrieved August 26, 2005, from http://www.privacybird.com/news.html

Banisar, D. (1999, September 13). Privacy and data protection around the world. 21st International Conference on Privacy and Personal Data Protection.

European Union Directive. (1995). Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the protection of individuals with regard to the processing of personal data and on the free movement of such data. Unofficial text retrieved September 5, 2003, from http://aspe.hhs.gov/datacncl/eudirect.htm

Government of Canada. (n.d.). Personal Information Protection and Electronic Documents Act. Retrieved February 28, 2005, from http://www.privcom.gc.ca/legislation/index_e.asp

OASIS. (n.d.). OASIS standards and other approved work. Retrieved August 26, 2005, from http://www.oasis-open.org/specs/index.php#xacmlv2.0

Schneier, B. (2000). Secrets and lies: Digital security in a networked world. John Wiley & Sons.

U.S. Government. (n.d.). Office for Civil Rights, HIPAA: Medical privacy. National standards to protect the privacy of personal health information. Retrieved February 28, 2005, from http://www.hhs.gov/ocr/hipaa/

Van Slyke, C., Belanger, F., & Comunale, C. L. (2004, June). Factors influencing the adoption of Web-based shopping: The impact of trust. ACM SIGMIS Database, 35(2).

World Wide Web Consortium. (n.d.). Platform for Privacy Preferences Project (P3P). Retrieved August 26, 2005, from http://www.w3.org/P3P/
World Wide Web Consortium-1. (n.d.). Links to SOAP and WS-Policy descriptions. Retrieved August 26, 2005, from http://www.w3.org/

Yee, G., & Korba, L. (2003, January). Bilateral e-services negotiation under uncertainty. In Proceedings of the 2003 International Symposium on Applications and the Internet (SAINT2003), Orlando, Florida, USA.

Yee, G., & Korba, L. (2003, May). The negotiation of privacy policies in distance education. In Proceedings of the 14th IRMA International Conference, Philadelphia, Pennsylvania, USA.

Yee, G., & Korba, L. (2005). Negotiated security policies for e-services and Web services. In Proceedings of the 2005 IEEE International Conference on Web Services (ICWS 2005), Orlando, Florida, USA.
Acknowledgments
I would like to thank all the authors for their excellent contributions to this book. My heartfelt gratitude goes to all the reviewers who provided insightful and constructive comments, in particular to Carlisle Adams of the University of Ottawa, Michael Weiss of Carleton University, Maria Fung of the University of Auckland, Ed Simon of XMLsec Inc., and last but not least, my colleagues Larry Korba, Scott Flinn, Scott Buffett, and Ronggong Song of the National Research Council Canada. My thanks also go to all who were involved in the collation and review process of this book. Without their support, the project could not have been satisfactorily completed. A special note of thanks goes to all of the staff at Idea Group Inc. Their contributions throughout the whole process from inception of the initial idea to final publication have been very helpful. In particular, I would like to thank Jan Travers, whose guidance through the initial contract formulation process made it all possible, and Kristin Roth, whose timely reminders kept me on track. Finally, I am indebted to the Institute for Information Technology, National Research Council Canada, for providing me the resources for this project.
George Yee National Research Council Canada, Canada August 2005
Section I: Issues and Challenges
Chapter I
Exercising the Right of Privacy Scott Flinn, National Research Council Canada, Canada Scott Buffett, National Research Council Canada, Canada
Abstract

This chapter discusses privacy from the perspective of the consumer of e-services. It proposes a technique for risk management assessment designed to help consumers evaluate a situation to identify and understand potential privacy concerns. The technique centers around a series of questions based on common principles of privacy protection. The chapter discusses how a consumer can understand exposure risks and how information can be controlled and monitored to mitigate the risks. It also proposes a method for assessing the consumer's value of personal information, and a mechanism for automated negotiation is presented to facilitate fair, private information exchange. The authors believe that these or similar techniques are essential to give consumers of e-services meaningful control over the personal information they release. This forward-looking chapter provides a foundation for developing methods to empower users with control over their private information.
Introduction

In a transaction or relationship in which there is an expectation of privacy, the roles of the parties involved are generally not symmetric. In the simplest case, one party (which we will call the sender) discloses information to another (the receiver). More generally, one or more senders will disclose information to one or more receivers. We also refer to senders as relying parties because they must rely on receivers to keep information in confidence. Widely recognized privacy principles (such as those of the CSA Model Code introduced in the next section) can be interpreted from both perspectives. For the receiver, they define a duty of confidentiality. For the sender, they define a right of privacy that is primarily concerned with the sender's ability to exercise control over the exposure of sensitive information.

The privacy concerns relating to a direct relationship between parties are often spelled out in legal agreements or legislation. In these cases, it may be clear to relying parties what their exposure is and how to manage it. In many cases, however, relationships may be informal, indirect, or complex multi-party relationships. In these circumstances, the situation is much less clear for senders. This chapter focuses on senders who must manage their exposure in these complex situations. It outlines a vision for privacy management from a sender's perspective that has three major components. The ultimate objective is to assist senders in making good decisions regarding the trade-offs between the potential costs and benefits of disclosing sensitive information.

First, senders must be aware of the potential privacy risks they face as they conduct their business online. What are they? How likely are they? What are the associated costs? What incentives balance the risks? We discuss these and other questions, and introduce a decision process based on a risk management approach. Second, senders must have the means to limit their exposure, obtaining benefits in proportion to the value of the information they reveal. We discuss the valuation of personal information from both the senders' and receivers' perspectives and describe in detail a protocol for negotiating equitable exchanges. Third, senders should have some degree of direct control over their personal information following its release. We review the challenges this goal presents and discuss techniques to address it. Each of these components has implications for the architecture and design of e-services, which we will highlight throughout the discussion.

This chapter is very much a forward-looking one. It is important to recognize that privacy is an ill-defined concept, meaning many different things to different people. In a recent article on the meaning of privacy, Daniel Solove put it this way: "Privacy is a concept in disarray. Nobody can articulate what it means" (Solove, 2006). The comment is especially relevant for this chapter with its focus on individuals who may each have their own conception of privacy. It is therefore unrealistic to expect to find turn-key technologies capable of ensuring end-user privacy today. In this regard, any potential solution is necessarily forward-looking. The proposals in this chapter are not yet widely deployed
or thoroughly evaluated, but we believe they reflect some of the most promising avenues for further progress.
Background

In recent years, there has been a growing interest in the research of new methods designed to help Web users maintain control over their private information. This is due in part to the surprisingly slow growth of electronic commerce. A study by Udo (2001) found that concern over privacy and security is the number one reason why Web users do not make purchases over the Web. Culnan and Armstrong (1999) also argue that concerns over Internet privacy have a negative influence on the likelihood of electronic exchange. Contrary to popular belief, experience does not tend to breed wariness. In fact, increased Web usage has been shown to decrease concern over privacy (Metzger, 2004). One reason for this is that, with experience, people tend to see the benefits of giving away information, such as Web site personalization, customer profiling, or lower prices. Another factor that has a positive relation with reduced concern over privacy is perceived control over one's private information (Metzger, 2004). Culnan and Armstrong argue that consumers are more willing to share their private information if they believe that fair information practices are in place. Fair information practices are those that (1) reveal why the information is being collected and how it will be used, and (2) give consumers control over its possible uses. Empowering users with knowledge of the advantages and disadvantages of releasing private data, and also with control over such data after its release, is thus a vital step toward overcoming fears associated with privacy and achieving growth and prosperity in the area of electronic commerce.

Privacy policies help businesses to inform users of their data-collecting practices, ideally putting visitors at ease so they feel uninhibited in their participation. However, many users do not read such privacy policies, believing them to be too time-consuming to read or too difficult to understand. The Platform for Privacy Preferences Project (P3P) enables Web sites to express their privacy policies in a machine-readable format, allowing P3P user agents to read and "understand" policies on behalf of the user (Cranor, Langheinrich, Marchiori, Presler-Marshall, & Reagle, 2002). Cranor et al. have worked extensively on user interfaces for these agents, including the AT&T Privacy Bird (Cranor, 2003; Cranor, Arjula, & Guduru, 2002).

Even if an information sender is well informed of the privacy policy, he/she might still find the data collection too intrusive. In this case it may be important for the receiver to either offer some form of incentive, or at least make the sender understand the benefits of transmitting their data. Recent work in privacy economics research shows that people are typically willing to share their private information if they foresee a sufficient reward in return. Chellappa and Sin (2005) and Culnan and Bies (2003) argue that, when attempting to collect user data in order to personalize Web sites, consumers are willing to share preference information in exchange for benefits such as convenience if the quantified value of services outweighs the quantified loss of privacy. Hann, Hui, Lee, and Png (2002) show that economic incentives affect users' willingness to share
information, and derive consumers' monetary worth of secondary use of personal information. A Cheskin study (2000) found that many expert users (younger males especially) are willing to sacrifice more private information if it leads to better prices. While people understand that their information has value, since different users value their information differently, techniques are needed to help determine these values in order to facilitate effective decision making. Utility elicitation techniques (Chajewska, Koller, & Parr, 2000) can be used to help senders determine how they privately value their personal information. Once these values have been assessed, since information receivers are not likely to agree on such values, negotiation may take place to determine a suitable exchange. Earlier drafts of P3P included a protocol for multi-round negotiation. However, it was believed that this made P3P too complicated, and the protocol was thus dropped from the specification. Cranor and Resnick (2000) show that under assumptions of user anonymity, publicly known Web site strategies, and no negotiation transaction costs for users, take-it-or-leave-it offers yield just as much Web site profit as any negotiation strategy. On the other hand, Buffett, Jia, Liu, Spencer, and Wang (2004) show that when rewards are offered these assumptions are no longer valid, and give a protocol for multi-issue automated negotiation (Jennings, Faratin, Lomuscio, Parsons, Sierra, & Wooldridge, 2001; Fatima, Wooldridge, & Jennings, 2004; Jennings, Parsons, Sierra, & Faratin, 2000) where the information receiver can offer a certain level of service (e.g., 10% discount, free delivery) in exchange for private information. Multi-attribute utility theory (Keeney & Raiffa, 1976) is used to rank each party's preferences.
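To give a concrete sense of how an additive multi-attribute utility function can rank negotiation outcomes, consider the following minimal sketch. The attributes, weights, and utility values are invented for illustration only; they are not taken from the cited protocols or studies.

```python
# Hypothetical sketch of additive multi-attribute utility scoring for a
# privacy negotiation offer. Attribute names, weights, and utility values
# are illustrative assumptions, not part of the cited protocols.

from typing import Dict

# A consumer's (sender's) weights over negotiation attributes; they sum to 1.
WEIGHTS: Dict[str, float] = {
    "discount": 0.3,       # value of the price reduction offered
    "email_shared": 0.5,   # disutility of sharing an e-mail address
    "free_delivery": 0.2,  # value of free delivery
}

# Single-attribute utilities in [0, 1] for each possible attribute level.
UTILITIES: Dict[str, Dict[str, float]] = {
    "discount": {"none": 0.0, "5%": 0.5, "10%": 1.0},
    "email_shared": {"yes": 0.0, "no": 1.0},
    "free_delivery": {"no": 0.0, "yes": 1.0},
}

def utility(offer: Dict[str, str]) -> float:
    """Additive multi-attribute utility of an offer for this consumer."""
    return sum(WEIGHTS[a] * UTILITIES[a][level] for a, level in offer.items())

# Two offers a provider might make during negotiation.
offer_a = {"discount": "10%", "email_shared": "yes", "free_delivery": "no"}
offer_b = {"discount": "5%", "email_shared": "no", "free_delivery": "yes"}

print(utility(offer_a))  # 0.3  -> the discount alone does not offset disclosure
print(utility(offer_b))  # 0.85 -> keeping the e-mail address private is preferred
```

Under these assumed weights, the offer that keeps the e-mail address private scores higher even though it carries a smaller discount, which is exactly the kind of trade-off a negotiation agent would reason about on a sender's behalf.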
Privacy Principles and Standards

The Canadian Standards Association (CSA) Model Code for the Protection of Personal Information, developed in 1995-96 by the CSA, is the set of standards for Canada addressing the way organizations collect, use, disclose, and protect personal information. It also defines the right of individuals to have access to personal information about themselves, and, if necessary, to have the information corrected. Ten interrelated principles form the basis of the CSA Model Code. In one form or another, they are widely accepted throughout the western world. Privacy standards and legislation in other jurisdictions may differ in terminology and how the overlapping concepts are divided, but the spirit will be largely the same. Notable examples include the European Privacy Directive (95/46/EC), and the Code of Fair Information Practices defined in 1973 by the United States Department of Health, Education and Welfare. We refer to the principles of the CSA Model Code throughout this chapter, so it will be useful to list them in full. They are:

1.	Accountability. An organization is responsible for personal information under its control.

2.	Identifying purposes. The purposes for which personal information is collected shall be identified by the organization.

3.	Consent. The knowledge and consent of the individual are required for the collection, use, or disclosure of personal information, except where inappropriate.

4.	Limiting collection. The collection of personal information shall be limited to that which is necessary for the purposes identified by the organization.

5.	Limiting use, disclosure, and retention. Personal information shall not be used or disclosed for purposes other than those for which it was collected.

6.	Accuracy. Personal information shall be as accurate, complete, and up-to-date as is necessary for the purposes for which it is to be used.

7.	Safeguards. Personal information shall be protected by security safeguards appropriate to the sensitivity of the information.

8.	Openness. An organization shall make readily available specific information about its policies and practices relating to the management of personal information.

9.	Individual access. Upon request, an individual shall be informed of the existence, use, and disclosure of his or her personal information and shall be given access to that information. An individual shall be able to challenge the accuracy and completeness of the information and have it amended as appropriate.

10.	Challenging compliance. An individual shall be able to address a challenge concerning compliance with the above principles to the designated individual or individuals accountable for the organization's compliance.
The Risk Management Perspective

The privacy principles discussed in the previous section can be interpreted equally by senders and receivers, but their presentation is clearly framed from the perspective of the receiver. Indeed, modern privacy legislation in Europe, Canada, and elsewhere is substantially similar to the CSA Model Code and defines a receiver's duty of confidentiality under the law. For senders, the privacy principles are more informative than functional. They can help senders understand what to expect from the parties they rely on, but they do not clearly indicate what actions a sender should take to achieve optimal (or even acceptable) trade-offs.

Here we describe a risk management approach to privacy that can help relying parties evaluate a given situation and understand its privacy implications. We begin with the observation that people are already engaged in an informal risk management process. We face risks every time we go online. In the absence of mature technologies that can strongly assure our privacy in a way tailored to each person's conception of the term, we must make our own decisions regarding the trade-offs involved in releasing sensitive information. Presently, however, the information and understanding we have to guide those decisions are woefully inadequate. The approach we describe here is not intended to be a comprehensive, precise, and expensive actuarial process. Rather, it is designed to inform, structure, and support the informal risk management process that is already practiced by default.
The approach is based on a set of questions that loosely resemble the principles of the CSA Model Code. Taken together, they define a functional decision procedure. The underlying premise is that senders who can quickly and accurately answer these questions with respect to any potential action they might take will generally be more aware of their exposure and be better able to manage it.

1.	What are the risks?

2.	How likely is each potential risk?

3.	What is the potential impact of exposure to me or to others?

4.	What incentive do I have to risk exposure?

5.	How can I control the risk or limit my exposure?

6.	What commitments have been made to me, and who is responsible for them?

7.	How would I know if something went wrong?

8.	What recourse do I have if commitments made to me are violated?
Each of these questions presents unique challenges. The remainder of this section examines each question in more detail, summarizing the challenges and reviewing conventional approaches for addressing them. The section concludes with a brief review of the relationship between the risk management questions and conventional privacy principles.
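Although the chapter treats this weighing informally, questions 2 through 4 can be pictured as a simple expected-cost comparison. The sketch below is purely illustrative: the risks, probabilities, and dollar figures are assumptions made up for the example, not measured values.

```python
# Purely illustrative sketch of the informal weighing implied by questions
# 2-4: compare the incentive for disclosure against the expected cost of the
# identified risks. All numbers are invented assumptions, not measured data.

risks = [
    # (description, estimated probability, estimated cost in dollars)
    ("increase in spam and marketing calls", 0.60,    20.0),
    ("information resold to data brokers",   0.25,   100.0),
    ("identity theft after a server breach", 0.01,  5000.0),
]

incentive = 50.0  # e.g., value of a discount offered in exchange for the data

expected_cost = sum(probability * cost for _, probability, cost in risks)
print(f"Expected cost of disclosure: ${expected_cost:.2f}")            # $87.00
print("Disclose" if incentive > expected_cost else "Withhold the information")
```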
Making Risk Management Decisions

What are the Risks?

Risk identification is the foundation of risk management: you cannot manage risks if you are not aware of them. Privacy is a sweeping concept that ranges from freedom of thought and control over one's body to protection of one's reputation and protection from searches and interrogations (Solove, 2002). Privacy risks are correspondingly diverse. In cyberspace, they range from minor intrusions such as spam e-mail to global surveillance (Poole, 1999) and widespread data aggregation and profiling (Garfinkel, 2001).

Senders become aware of privacy risks in different ways. In some domains, privacy legislation or codes of professional conduct may require that informed consent be obtained before information is collected. In this case, it is the duty of the receiver to clearly explain what data will be collected, how it will be used, and, ideally, other factors such as how long it will be retained and how the sender can monitor its use and accuracy on an ongoing basis. When informed consent is not explicitly required, receivers may provide similar information on a voluntary basis. This is commonly done through privacy policies associated with Web sites that collect personal information. Such policies are often written in a natural language (e.g., English) and are intended only for human consumption. This practice places a considerable burden on senders to read and understand policies that
are often lengthy and filled with legal jargon. Policies may also be encoded in a machine-readable format. The P3P specification is the most common example of this. Site operators and Web service providers can encode their policy and make it available in a standard way. People using the sites and services can then utilize a software agent that has been configured with their personal privacy preferences. Upon first contact with the site or service, the agent can retrieve the encoded privacy policy and compare it with its recorded preferences, alerting the sender if a discrepancy is found. The AT&T Privacy Bird plug-in for Microsoft's Internet Explorer browser (http://www.privacybird.com/) is an example of such an agent.

Privacy policies are certainly an improvement over a state of relative ignorance regarding a receiver's intentions. However, they have proven to be a relatively weak tool for engendering trust in end users. In a recent study that examined Internet users' awareness and perceptions of Web site privacy policies (Flinn & Lumsden, 2005), a large number of respondents drew attention to one or more of the following perceived weaknesses:
•	Privacy policies typically disclaim the sharing of information, rather than assuring its protection;

•	The legal standing of privacy policies is not well known and is presumed to be very weak; and

•	Privacy policies are subject to change at any time, which is widely presumed to mean that site operators can, with impunity, ignore any promises they may have made simply by changing their policy.
Senders may also seek to educate themselves regarding the risks associated with the release of sensitive information. In the absence of informed consent or trusted privacy policies, this is often all the sender has to rely on.
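To make the preference-checking mechanism described above more concrete, the following sketch shows how a P3P-style agent might flag mismatches between a site's declared practices and a user's recorded preferences. The policy and preference structures are simplified stand-ins invented for this example; they do not reproduce the actual P3P vocabulary or the Privacy Bird implementation.

```python
# Minimal illustrative sketch of how a P3P-style user agent might flag
# mismatches between a site's declared practices and a user's preferences.
# The dictionaries below are simplified stand-ins, not real P3P statements.

site_policy = {
    "email":   {"purpose": "marketing", "retention": "indefinite"},
    "zipcode": {"purpose": "delivery",  "retention": "transaction"},
}

user_preferences = {
    # For each data item: the purposes the user accepts and the longest
    # retention period the user will tolerate.
    "email":   {"allowed_purposes": {"delivery"},              "max_retention": "transaction"},
    "zipcode": {"allowed_purposes": {"delivery", "marketing"}, "max_retention": "indefinite"},
}

RETENTION_ORDER = ["transaction", "one_year", "indefinite"]  # least to most invasive

def find_mismatches(policy, prefs):
    """Return human-readable warnings where the policy exceeds the preferences."""
    warnings = []
    for item, practice in policy.items():
        pref = prefs.get(item)
        if pref is None:
            warnings.append(f"No preference recorded for '{item}'; ask the user.")
            continue
        if practice["purpose"] not in pref["allowed_purposes"]:
            warnings.append(f"'{item}' would be used for {practice['purpose']}.")
        if RETENTION_ORDER.index(practice["retention"]) > RETENTION_ORDER.index(pref["max_retention"]):
            warnings.append(f"'{item}' would be retained longer than the user allows.")
    return warnings

for w in find_mismatches(site_policy, user_preferences):
    print("Privacy warning:", w)  # an agent like Privacy Bird would alert the user here
```

In this assumed configuration the agent would warn about the e-mail address (marketing purpose, indefinite retention) but stay silent about the postal code, mirroring the alert-on-discrepancy behaviour described above.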
How Likely is Each Potential Risk?

Awareness of the risks is of value only when combined with accurate empirical knowledge of their likelihood. There is always a possibility of unwanted disclosure whenever sensitive information is entrusted to a receiver. The decision as to whether the benefits of releasing information outweigh the risks clearly depends on the probabilities involved. How attractive a target does the receiver present? How carefully have they safeguarded their systems? Do they have problems with morale or disgruntled employees? Presently, this is a difficult question to answer even in common situations. For example, how can one estimate the likelihood that the information collected by a receiver will be compromised by a malicious insider? Such acts are often performed for reasons that are not strictly rational and are therefore difficult to model or predict. In today's environment, the threats are largely unquantified.

This is an area in which reputation (evidence of past performance) may have considerable potential as a predictor of future performance. If an organization demonstrates an ability to maintain its privacy commitments over some period of time, it is
reasonable to expect that it will continue to do so (and conversely if it has a record of failing its commitments). The well-known eBay auction system (http://www.ebay.com/) provides a good example. It operates a reputation system in which parties to a transaction rate each other on their experience. When entering into a transaction with a previously unknown party, an eBay user can refer to the ratings (and the reasons behind them) to assess potential risks and their likelihood. This is a highly regarded and widely used system that exists precisely for the purpose of risk identification and assessment.

With respect to privacy protection in e-commerce, however, the eBay system falls short in a number of ways. First, the risks it helps identify are generally not privacy risks. This is due at least in part to the difficulty of detecting privacy violations in eBay transactions: if they are not detected, they cannot be reported. This is discussed further in this section. Second, eBay is a relatively small, closed environment. Because auctions occur within that closed environment, eBay is in a position to aggregate reputation data and make it available to its clients. In a larger, more distributed reputation system, where would the data come from?

There are a number of possibilities. For example, recent legislation in California compels organizations to notify their clients of inappropriate disclosure of personal information. In other jurisdictions, some organizations have voluntarily notified clients of breaches, believing that their customers will value the perceived commitment to privacy issues. Customers may be unhappy to learn that their confidentiality has been compromised, but they will be less happy still if they learn it through a third party and consequently believe that the breach has been covered up or dismissed as unimportant. Over time, these notifications will produce a body of raw data that would fuel a reputation network.

In the absence of readily available and trustworthy reputation data, consumers have little guidance regarding the likelihood of various privacy risks when interacting with a Web site or service. Trust mark programs, such as those provided by TRUSTe, the Better Business Bureau, VeriSign, ePublicEye, and others, are a notable exception. In all of these programs, the trust mark provider undertakes some kind of evaluation of a Web site (depending on the program), and awards its trust mark seal if the site meets the standards of the program. The site is then entitled to display the seal. The expectation is that visitors to the Web site will be assured by the seal that the site is trustworthy. In most cases, the seal is an active hyperlink to a page that confirms the authenticity of the seal and provides additional information regarding what the seal actually signifies.

Each program is slightly different. For example, the TRUSTe program is primarily concerned with best practices for privacy protection. The VeriSign program is tightly coupled with its digital server certificate business and focuses on the assurance of authenticity. The Better Business Bureau (BBB) offers a privacy seal that is similar to the TRUSTe program. It also offers a reliability seal that, among other things, requires a participant to have a satisfactory complaint handling record with the BBB. In this regard, it functions somewhat like a reputation system in that the seal is linked to the past behavior of the organization.
Like Web site privacy policies, trust marks have not yet proven to be a powerful tool for engendering trust in end users. In a recent study that examined Internet users' awareness and perception of trust marks (Flinn & Lumsden, 2005), respondents drew attention to one or more of the following perceived weaknesses:
• Trust mark programs are not yet widely recognized, and the assurances they provide (and the differences between programs) are poorly understood;
• There is skepticism about the strength of the assurances provided, especially given the commercial nature of the trust mark programs; and
• Authenticity of trust mark seals is perceived as a problem (even respondents who knew that seals can be validated by clicking on them were not confident that they would know if validation information was forged).
What is the Potential Impact of Exposure?

The essence of a risk management decision is to weigh potential benefit against potential cost. Once you know the privacy risks and how likely each one is, the third variable in assessing the potential cost is the harm associated with each risk. What damage might be caused; how severe might it be; and why should one be concerned about it? In some cases, disclosure of information may result in an increase in junk mail, spam, or unwanted phone calls at home. In others, it may result in denial of insurance coverage, loss of employment, or social isolation.

The direct impact of releasing personal information may be easy to predict. For example, the receiver may use the information to market other products and services to you. Even then, it is often unclear how frequently you may be contacted. Indirect impacts, on the other hand, are rarely obvious. Once information is disclosed beyond the initial receiver, its flow is generally hidden. It is difficult for the sender to know how widely it is disseminated or the purposes for which it is used. It is especially difficult to know how it might be combined with other sources of information about the sender.

Browser cookies provide a good example. A cookie is a small piece of data left by a Web site on a user's local disk. When the user returns to that Web site, the data is retrieved by the site, allowing it to "remember" certain things about the user. A third-party cookie is one associated with a request for embedded content from another domain, such as a banner ad or content for an inline frame. It is well known that third-party browser cookies are used to link visits by a single browser across multiple Web sites. In theory, these linkages can be used to associate fragments of information about an individual stored at each site to build a more detailed dossier that can be shared between them. In practice, it is a difficult phenomenon to quantify.

When browser cookies were first introduced in the Netscape browser, they were completely invisible. Over time, browsers evolved more sophisticated tools for filtering and managing cookies (Millett, Friedman, & Felten, 2001). Yet even today, it is difficult to assess the potential impact of accepting a browser cookie. Strategies based on accepting cookies only from sites representing a trusted brand, or filtering on the basis of P3P compact privacy policies (as is done in some contemporary browsers) are heuristics at best. They are not based on certain and objective knowledge of the consequences of accepting a given cookie.
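To make the third-party distinction concrete, the sketch below (in Python; the host names are illustrative) applies the same simple test that blanket third-party-cookie blocking relies on: a request is treated as third-party when its host does not share the page's base domain. A production filter would consult the Public Suffix List rather than the two-label shortcut used here.

```python
from urllib.parse import urlparse

def base_domain(url: str) -> str:
    """Crude registrable-domain guess: the last two labels of the hostname."""
    host = urlparse(url).hostname or ""
    parts = host.split(".")
    return ".".join(parts[-2:]) if len(parts) >= 2 else host

def is_third_party(page_url: str, request_url: str) -> bool:
    """Flag a request (for example, one that sets a cookie) made to another domain."""
    return base_domain(request_url) != base_domain(page_url)

# A banner ad fetched from an advertising network is third-party; a same-site
# image is not. (Host names are made up for the example.)
print(is_third_party("https://shop.example.com/item",
                     "https://ads.tracker-net.example/banner.gif"))   # True
print(is_third_party("https://shop.example.com/item",
                     "https://www.example.com/logo.png"))             # False
```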
What Incentive do I Have to Risk Exposure?

Once the impact of exposure has been estimated, it can be compared with the potential benefits. The benefits are usually more direct than the privacy risks, and are therefore much easier to evaluate. For example, the benefits of conducting a banking transaction online may be that it can be done at a convenient time (late at night, say) and without the need for a time consuming trip to the local branch. The theoretical privacy risks of banking online may be well known, but the practical risks are less clear. In more complex situations, personal information can be viewed as a commodity that may be exchanged for increased benefit, thereby creating an opportunity for mutually beneficial negotiation. The upcoming section on managing exposure explores this idea in depth.
How can I Control the Risk or Limit My Exposure?

It is always difficult to control the proliferation of information once it has been released. This is especially true when the information is stored and communicated digitally because digital records can be duplicated and communicated with such ease. Suppose, for example, that you have stored sensitive personal information gathered from clients after obtaining their consent for its collection and use. Each time you perform a system backup of the database or file server, you create a copy of the data that must be managed according to the consent you have been given regarding time limits for retention. This creates complexity, and complexity leads to errors. Similarly, when disks or servers are replaced, the data are copied to the replacement hardware. Care must be taken to ensure that the original disks are scrubbed to guard against inappropriate disclosure of the information. These issues are well understood and are reflected in industry best practices.

Furthermore, receivers are increasingly bound by legislation to implement common privacy principles such as those of the CSA Model Code. These principles are squarely focused on creating control mechanisms for senders. They limit use, disclosure, and retention times. They give senders the ability to monitor and update information for accuracy, and they create avenues for recourse and dispute resolution. However, they describe the things that receivers must do and, in so doing, stop short of directly giving control to senders. From the sender's perspective, there is little, if any, opportunity to influence the process. In the absence of direct control mechanisms, it becomes a matter of trust. In deciding whether to disclose information, senders must consider how much they trust receivers to implement solid practices.

In the long term, digital rights management (DRM) technologies have the potential to address the issue of direct control. Although today they are primarily concerned with ensuring that people pay for things they consume, they are more generally capable of ensuring that usage rights and limitations remain bound to the information they describe. Korba and Kenny (2002) have explored this idea, showing how contemporary DRM technology can be adapted to achieve privacy rights management.
When direct control is not possible, another strategy for senders is to strategically minimize the disclosure of sensitive information. The idea is to reveal only as much information as is strictly necessary to achieve a goal or receive some benefit. The process is made strategic by ensuring that the value of the information disclosed is balanced with the value of the benefit received. To do this, a sender must have an objective way of estimating the values of both the information to be disclosed and the benefit to be received, and the sender and receiver require a shared protocol through which they can negotiate an equitable exchange. The section on managing exposure discusses these issues in depth and introduces a number of techniques and protocols for achieving these objectives.
What Commitments Have Been Made to Me?

Once you have assessed the potential costs and benefits of engaging in a transaction and considered how to balance them, you might decide to proceed. Before doing so, you should probably give some thought to what might go wrong. To begin with, it is important to have a clear idea of precisely what to expect of the receiver in terms of the handling of your sensitive information. The privacy principles can guide this assessment. For what purposes will the information be used? How long will it be retained? What opportunities exist to monitor and adjust it for accuracy?

Armed with a realistic set of expectations, you can then plan what to do if one or more of your expectations are not met. Is there a clear accountability structure? Is there an individual or office to which complaints can be addressed? Is there a dispute resolution process? The answers to these questions affect the trade-off between potential benefits and risks. Risks may be weighted more heavily if there does not appear to be an accountability structure to provide a safety net should problems arise.

Unfortunately, good answers may be difficult to find. If they are available at all, they will often be found in written privacy policies (which suffer from a number of perceived weaknesses, as noted earlier). This is another area in which reputation is relevant. The trust placed in the commitments made by receivers will be less blind if empirical data describing their past behavior are available from trustworthy sources.
How Would I Know if Something Went Wrong?

Suppose you visit an e-commerce Web site and find an item you would like to purchase. To complete the transaction you will have to provide a valid shipping address and payment information. You read the privacy policy, which promises not to share your personal information with any third party. So you decide to trust the site operators and proceed to purchase the item. A short time later your address is sold to a third party, in contravention of the advertised privacy policy. Would you be aware of the violation? Suppose the company is honorable and intends to respect its policy, but someone within the company having legitimate access (a database administrator, say) pilfers the payment data and sells it on the black market. How would you know?
In general, it can be very difficult to detect privacy breaches. This is really a corollary of the assertion that it is difficult to control the proliferation of information once it is disclosed. In both cases, the root of the problem lies in the lack of visibility into information flow. From the sender's perspective, little can be done to monitor the flow of personal information once it has been disclosed.

As an alternative, one can look for external evidence of a policy violation. For example, when submitting shipping information in the example above, you might use a unique perturbation of your name and record the fact that you sent that name to that site. If you subsequently receive postal mail from a third party addressed to that name, then you know the site released your information in contravention of its policy. (This technique is not new; it is commonly used both online and off-line, and similar features are beginning to appear in personal firewall software. One popular online usage is to subscribe to a mailing list using an e-mail address that is unique to that list. If you receive spam at that address, then you know the address was compromised through the list).

If a reputation network is available, then violations could be recorded as they are detected and shared with other members of the network. Reconsider your visit to the e-commerce site, this time assuming the existence of such a reputation network and a user agent that is able to utilize it. The agent could consult the network upon first visiting the site. If it detects previous privacy violations, it can report the details to you.

These examples are intended only as a suggestion as to how the problem of detection may be tackled. In particular, the technical and political challenges involved in building the kind of reputation network we suggest are immense. In the next section (Privacy Control) we describe some preliminary steps in this direction.
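As a minimal sketch of this unique-identifier technique, assume a mail provider that supports plus-addressing (the mailbox, domain, and site names below are illustrative). Each site is handed its own tagged address, so unsolicited mail arriving at a tag immediately names the site that disclosed it.

```python
import hashlib
from typing import Optional

MAILBOX = "alice"        # hypothetical base mailbox
DOMAIN = "example.net"   # hypothetical domain whose mail server supports plus-addressing

def alias_for(site: str) -> str:
    """Derive a per-site address such as alice+3f9c2a@example.net."""
    tag = hashlib.sha256(site.encode()).hexdigest()[:6]
    return f"{MAILBOX}+{tag}@{DOMAIN}"

# Record which alias was handed to which site at disclosure time.
disclosures = {alias_for(s): s for s in ("shop-a.example.com", "list-b.example.org")}

def who_leaked(recipient_address: str) -> Optional[str]:
    """If unexpected mail arrives at a tagged address, name the original receiver."""
    return disclosures.get(recipient_address)

print(who_leaked(alias_for("shop-a.example.com")))   # shop-a.example.com
```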
What Recourse do I Have?

Recourse can soften the impact of a privacy breach by correcting a mistake or providing compensation for damages. An obvious avenue for recourse is created by legislation based on common privacy principles, which typically requires organizations to appoint a privacy officer and to facilitate a dispute resolution process. In practice, however, remedial actions are often very modest. In Canada, for example, there are no immediate consequences for a company that fails to resolve privacy complaints brought against it. Unsatisfied complainants can refer their cases to the federal Privacy Commissioner who has relatively broad powers, including the power to summon witnesses, administer oaths, and compel the production of evidence if voluntary cooperation is not forthcoming. The Commissioner acts as an advocate for complainants and seeks to negotiate a satisfactory resolution but ultimately has limited power to impose settlement conditions on unwilling parties.

Opportunities for recourse are nevertheless an important element in building trust. As one of the pioneers of the risk management approach to security has put it, "I trust anyone against whom I have recourse, and no one else" (Dan Geer, personal communication).
Risk Management and the CSA Model Code

The risk management approach can be thought of as traditional privacy principles viewed from the sender's perspective — the flip side of the privacy coin, so to speak. Whereas the CSA Model Code can be used by receivers as a guide to maintaining their duty of confidentiality, the questions discussed in the previous section are intended for senders as a guide to exercising their right of privacy. Not surprisingly, then, many of the risk management questions reflect principles of the CSA Model Code. The correspondences are highlighted here to reinforce the relationship between the two perspectives.

What are the risks? Answers to this question can be drawn from principles 2 (identifying purposes), 5 (limiting use, disclosure, and retention), and 8 (openness). Principles 2 and 8 are explicitly concerned with communicating intentions to senders. Principle 5 defines responsibilities with respect to information collected, but the policies it mandates are often communicated to senders as well.

How likely is each potential risk? Although likelihood is an empirical matter that is best supported by historical evidence, principles 4 (limiting collection), 5 (limiting use, disclosure, and retention), and 7 (safeguards) can help in estimating it. Principles 4 and 5 help define the potential extent of exposure, and principle 7 is concerned with providing protections against exposure.

What is the potential impact of exposure to me or to others? The risk management process depends strongly on assessment of the value of information disclosed and benefits received, but the privacy principles are not directly concerned with value assessment. However, such assessments are implicit in the management of sensitive information by a receiver. For example, principle 7 requires that safeguards appropriate to the sensitivity of the information be provided.

How can I control the risk or limit my exposure? Principles 3 (consent) and 9 (individual access) exist for the purpose of giving some control to senders of information. Consent is typically given subject to conditions that limit what receivers can do with information they collect. Principle 9 extends control over information to the period following its collection. Principle 8 (openness) is also important in this context because it helps senders understand their rights and the opportunities available to them.

What commitments have been made to me, and who is responsible for them? This question is reflected in principles 1 (accountability), 8 (openness), 9 (individual access), and 10 (challenging compliance). Principle 1 asserts that an organization is accountable for its practices. It does not directly provide further assistance to senders, but reflects the importance of accountability. Principle 8 works to inform senders of what to expect and who is accountable, and principles 9 and 10 help ensure that policies and procedures will be detailed and concrete.

How would I know if something went wrong? Principles 8 (openness) and 9 (individual access) provide some assistance in this regard. Principle 9 creates the opportunity for individuals to monitor the information that describes them and to ensure that it is accurate. Principle 8 informs them of the opportunity. Of course, inaccuracy is only one of many things that can go wrong. Nevertheless, it is a good place to start.
What recourse do I have if commitments made to me are violated? This question corresponds with principles 1 (accountability), 8 (openness), and 10 (challenging compliance). Principle 10 in particular exists for the purpose of providing an avenue of recourse to senders of information. Principle 1 establishes accountability of the organization, and principle 8 ensures that senders can learn of their rights and opportunities for redress.
Privacy Control

Given the risk management perspective of the previous section, we now turn to the practical matter of what senders can do for themselves to help maintain their own privacy. We discuss the topic in two parts. First we consider what can be done to develop an awareness of information exposure. Then we introduce some practical techniques to manage that exposure.
Understanding Exposure

Sensitive information can be inappropriately exposed in a variety of ways. Here we identify three general classes of exposure and describe some of the ways that senders can develop awareness of exposure in each category.
Intentional Disclosure

It is common for receivers to share the information they collect from senders with other parties. Their intention to do so is shaped by legislation, professional codes of conduct, and business goals, and it is communicated to potential senders through legal contracts, published privacy policies, and other means. The onus is largely on senders to understand the legal and regulatory environment and to review the privacy commitments made to them through contracts, policies, and other agreements. Presently there are few tools to assist with these tasks.

However, a good start has been made as part of the Privacy Incorporated Software Agents project (PISA) (Blarkom, Borking, & Olk, 2003), a European Fifth Framework Programme project whose goal was to develop privacy enhancing technologies to assist users of intelligent software agents in electronic commerce settings. As part of this project, Patrick and Kenny (2003) undertook a detailed analysis of the European Privacy Directive with a view to developing human computer interaction (HCI) principles to guide the design of application software. The goal is to design software capable of exposing privacy concerns and guiding users in effective privacy management. Through an engineering psychology approach, they defined four requirement categories for effective privacy interface design — comprehension, consciousness, control, and consent — and identified requirements within each category. This effort is focused precisely on the need to
assist users in understanding the legal and regulatory context in which they are operating. However, it is limited by the difficulty of obtaining meaningful data to guide the operation of the interfaces they propose. For example, they suggest that opportunities for controlling the disposition of sensitive information be made obvious through clearly recognized user interface controls, but do not go further to suggest what the controls might be. In systems where there is no coupling between the sender's user interface and the data held by a receiver, the essential problem of control remains.

The Platform for Privacy Preferences Project introduced earlier is perhaps the best known example of a technology designed to proactively alert users to the possibilities for intentional disclosure. It has been widely deployed, in large part because support for it was included in version 6 of Microsoft's Internet Explorer browser (support has subsequently been added in Netscape and Mozilla browsers as well). As we noted in the previous section, P3P supports a mode of operation in which a user agent (such as the AT&T Privacy Bird) automatically compares the details of a privacy policy it fetches from each Web site visited with locally configured privacy preferences. If a discrepancy is found, the user is alerted. Any subsequent decision to disclose sensitive information to the site will then be made with heightened awareness of the policy.
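The comparison step such a user agent performs can be sketched as follows. Policies are abstracted here to (data, purpose, recipient) triples rather than parsed from the actual P3P XML vocabulary, and the preference list is illustrative; the point is only the mechanical check that triggers the alert.

```python
# Each declared practice and each user rule is a (data, purpose, recipient)
# triple; "any" in a rule matches every value of that field.
Practice = tuple[str, str, str]

user_blocklist: set[Practice] = {
    ("email", "telemarketing", "third-party"),
    ("health", "any", "any"),
}

def violations(site_policy: set[Practice], blocklist: set[Practice]) -> set[Practice]:
    """Return the site's declared practices that match a blocklist rule."""
    def matches(practice: Practice, rule: Practice) -> bool:
        return all(r in ("any", p) for p, r in zip(practice, rule))
    return {p for p in site_policy if any(matches(p, r) for r in blocklist)}

site_policy = {("email", "telemarketing", "third-party"), ("name", "admin", "ours")}
print(violations(site_policy, user_blocklist))
# {('email', 'telemarketing', 'third-party')} -> alert the user before any disclosure
```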
Accidental or Unplanned Disclosure

Even when receivers make commitments to hold sensitive information in confidence, breaches of privacy may still result from malicious insiders or outsiders, sloppy procedures for securing databases or recycling equipment, and so on. Receivers, as custodians of sensitive information, have a variety of tools for protecting against accidental disclosure. These include industry best practices for privacy impact assessment (PIA) and threat risk analysis (TRA), the ISO 17799 information security standard, and a host of practices and products for network and information security.

Within large organizations, sensitive client information can flow in complex ways as needs arise for different parts of the organization to perform some processing relating to the client. Managing this complexity can be a particularly difficult challenge. When it is not managed correctly, accidental or unplanned disclosures can occur. IBM's EPAL (Enterprise Privacy Authorization Language [EPAL 1.1], 2003) has been designed with this challenge in mind. It allows organizations to encode their internal policies in a way that enables processing software to automatically constrain its actions as required to respect the policy.

It is much harder for senders to understand and quantify the risk of accidental disclosure by a receiver, and harder still for them to exert any control over it. Large-scale reputation systems may be well suited to addressing the problem by providing a way for users to obtain objective evidence of past behavior. The reputation approach still falls short of making it possible for senders to exert some direct influence over the handling of their information by receivers, but it does make relevant information available to senders upon which more informed decisions can be based. Unfortunately, we do not yet have the global-scale trust network that is a prerequisite for such a system, although Branchaud and Flinn (2004) have proposed a scalable trust management infrastructure called xTrust
that appears well suited for the task. We will return to this briefly at the end of this section when we discuss the role of user agents in understanding exposure.
Tracking, Linking, and Profiling

Much has been made of the potential for receivers to collude to build more detailed personal dossiers than any one receiver could compile on its own. In the context of the Internet, and the World Wide Web in particular, browser cookies and Web bugs are frequently cited as sources of concern. Web bugs are images or inline frames placed on a Web page by a third party for the purpose of monitoring the activity of the Web page's visitors. These elements are essentially invisible (usually being 1×1 pixels in size) and collect such information as IP address, URL of the current page, URL of the Web bug image, the time the Web bug was viewed, type of browser that fetched the image, and previously set cookie values. The Bugnosis plug-in for Microsoft's Internet Explorer browser is a popular tool for rendering Web bugs visible (see http://www.bugnosis.org/).

There are many concerns about the potential impact of cookies, Web bugs, and similar devices. Many of the threats are well known and have been frequently described; yet the scope of the problem is largely unquantified. Although we know what exposure is theoretically possible, we do not yet know how likely it is. Quantification is essential to support accurate risk management decisions. Some preliminary studies have been completed. For example, a number of studies have investigated the prevalence of Web bugs (Martin, Wu, & Alsaid, 2003; SecuritySpace Web Bug Report, 2005). These studies highlight the difficulty of fully quantifying the threat. Although the prevalence of Web bugs can be measured (and it is surprisingly high), we can still only speculate as to their purpose. Certainly the purpose is surveillance of some kind — Web bugs are not useful for anything else. But is the surveillance sinister? Are the sites tracking and profiling individual users, possibly in collusion with each other, or are they merely doing the necessary bookkeeping associated with advertising programs? Some sites surely are engaged in tracking and profiling, but the proportion is still unknown.

It is well known that browser cookies can assist with tracking individuals as they traverse the Web, linking their activities across multiple sites in a way that permits the aggregation of data from those sites. Here again, the extent of the tracking is unknown. Surveys of common cookie usage (e.g., SecuritySpace Internet Cookie Report [2005]) clearly show that most are being used for simple session tracking. How many of these are being leveraged for other purposes? What is the impact of cookies that are in fact used for tracking and profiling? Millett, Friedman, and Felten have conducted a retrospective analysis of cookie management features in Web browsers (Millett et al., 2001), describing a progression from early browsers, in which cookies were virtually invisible, to the relatively sophisticated tools available in contemporary browsers. Yet even now, it is difficult to understand the potential exposure associated with any given cookie. The simple strategy of blocking all third party cookies is quite effective because it breaks the link that allows visits to
different sites to be associated with a single individual. But is it overkill? Do the risks really outweigh the potential benefits? The trade-offs are not yet well understood, especially by typical Internet users. People are clearly concerned about cookies and have made serious attempts to educate themselves concerning the risks, but most still find the privacy implications of the technology to be very confusing (Flinn & Lumsden, 2005).
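A Bugnosis-style scan for the 1×1 third-party images described above can be sketched as a simple heuristic (this is not Bugnosis's actual rule set, and a robust tool would use a real HTML parser rather than regular expressions):

```python
import re
from urllib.parse import urlparse

IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
ATTR = re.compile(r'(\w+)\s*=\s*"([^"]*)"')

def likely_web_bugs(html: str, page_host: str) -> list[str]:
    """Flag images that are at most 1x1 pixels and served from another host."""
    bugs = []
    for tag in IMG_TAG.findall(html):
        attrs = {name.lower(): value for name, value in ATTR.findall(tag)}
        tiny = attrs.get("width", "") in ("0", "1") and attrs.get("height", "") in ("0", "1")
        src = attrs.get("src", "")
        src_host = urlparse(src).hostname or page_host
        if tiny and src_host != page_host:
            bugs.append(src)
    return bugs

page = '<p>Welcome back!</p><img src="https://metrics.tracker.example/px.gif" width="1" height="1">'
print(likely_web_bugs(page, "shop.example.com"))
# ['https://metrics.tracker.example/px.gif']
```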
User Agents and Visibility

The Bugnosis Web bug detector, the AT&T Privacy Bird, and contemporary browser cookie managers are all examples of user agents designed to render visible privacy risks that were previously invisible. They bring potential risks to the attention of the user precisely when they are most relevant and create negligible cognitive burden the rest of the time. Their major weakness is that their ability to detect risks and accurately report on them is limited because they rely solely on evidence found in normal HTTP communication (augmented by privacy policies and user configuration in the case of Privacy Bird).

The same idea can be applied to any of the other risk categories we have described in this chapter. We have been working with a system called Omnivore that provides an alerting framework into which risk-specific modules can be plugged (Flinn & Stoyles, 2004). For example, Web bug and P3P modules replicate the basic functionality of Bugnosis and the Privacy Bird. However, the project seeks to address the limitations of these tools by involving secondary sources of evidence and using predictive modeling. As we have noted throughout the chapter, reputation systems are a promising source for secondary evidence. For example, the automatic evaluation of a privacy policy can be complemented by consulting the reputation network to determine if policy violations have been reported for a site. The Omnivore architecture provides a framework for utilizing such evidence in a comprehensive alerting mechanism.

The Omnivore system also explores the potential of predictive modeling. The prototype system is structured as an HTTP proxy that monitors all communication with a browser. From this position, the proxy is able to observe the exchange of cookies and other relevant data and is therefore capable of performing the same linking and tracking analysis as any remote site. In this way, it builds up a relationship graph that it can subsequently use to predict the effects of visiting sites or accepting new cookies in terms of relationships in the graph.

These and other techniques can be combined. Suppose, for example, that you are visiting a new e-commerce site for the first time. Based on the relationship graph, the Omnivore user agent is able to determine that your visit to this site can be linked, through a common advertiser, to other sites you have previously visited. The agent might then consult the reputation network for evidence that the companies involved have partnerships to trade or sell personal information. If such evidence is found, the proxy could inform you of the potential for information you submit to the new site to be shared with others. In this case the risk is not purely theoretical — it is based on empirical evidence, and concrete data (such as the names of the sites in question) can be provided. The proxy might also block the exchange of data that would allow the companies involved to correlate your visits to their sites.
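A toy version of this linking analysis, with illustrative data rather than Omnivore's actual implementation, treats two sites as linkable whenever they are observed to embed content from the same third-party host:

```python
from collections import defaultdict
from itertools import combinations

# Observed (site visited, third-party host contacted) pairs, as a proxy would
# log them while relaying browser traffic.
observations = [
    ("news.example.com", "ads.tracker.example"),
    ("shop.example.org", "ads.tracker.example"),
    ("blog.example.io",  "cdn.static.example"),
]

def relationship_graph(obs):
    """Link any two sites that embed content from the same third party."""
    sites_by_third_party = defaultdict(set)
    for site, third_party in obs:
        sites_by_third_party[third_party].add(site)
    edges = defaultdict(set)
    for third_party, sites in sites_by_third_party.items():
        for a, b in combinations(sorted(sites), 2):
            edges[a].add((b, third_party))
            edges[b].add((a, third_party))
    return edges

graph = relationship_graph(observations)
print(graph["shop.example.org"])
# {('news.example.com', 'ads.tracker.example')}: visits to the two sites can be
# correlated through the shared advertiser, so warn before submitting information.
```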
Many of these ideas are still at an early stage of development. We nevertheless believe that this approach is an effective way to help senders understand their potential exposure.
Managing Exposure

Privacy Management as a Risk Management Strategy

Once an information sender is fully aware of what information is being requested, the expected usages of such information, and the possible effects that may be realized as a result of this information transfer, a risk analysis should be done to determine the cost/benefit of such a transmission. In particular, the sender should attempt to answer the following two risk management questions from the previous section:
• What is the potential impact of exposure to me or to others?
• What incentive do I have to risk exposure?
In other words, the information sender must be able to determine not only the possible risks and the likelihood of such events, but also must be able to quantify how meaningful and severe the potential damage might be. Incentives offered by the receiver may simply be the completion of the transaction for which the data was requested, such as a purchase, registration, or download. However, information seekers may offer extra incentives in exchange for additional information, such as a discount or a free gift. Quantification of answers to these two questions is a difficult task, but a necessary step toward assisting the sender in determining whether the private information should be provided.

Once the sender has evaluated the benefit of a potential exchange, one cannot assume that this will be universally accepted as fair market value. Receivers' valuations will vary as well, depending on the potential usage of the information and also on the importance or significance of the particular sender (regular customer, target demographic, etc.). Thus negotiation might be necessary in order to find a mutually agreeable exchange.

In this section, we discuss the process of computing personal valuations and determining fair exchanges by first formally defining the concept of private information contracts. We then propose a technique for utility elicitation as a means for computing the sender's value for private information, and demonstrate a protocol for performing automated negotiation of such private information contracts. We conclude with a brief discussion on strategies for carrying out these negotiations.
Private Information Contracts

We view privacy negotiation as the process through which an information sender and an information receiver determine a suitable contract for private information exchange. The terms of the contract dictate the information to be sent, the parties that will receive
such information (since the receiver may share it), the purposes for collecting the data, and the duration for which the data will be retained. In addition, the contract may specify a reward, denoting the compensation the sender receives in return for the information. Under this model, a private information contract can be viewed as a composition of two major components: (1) the privacy component and (2) the reward component. Since the terms related to the privacy component correspond to the terms that can be dictated using P3P, we can formalize the privacy component as a P3P statement.

More formally, let D, R, and P be sets of allowable values for requested data, intended recipients, and purposes, respectively, as given in the P3P specification. Let S be the set of P3P statements, where each element s = ⟨d, r, p, t⟩ contains a set d ⊆ D of data, a set r ⊆ R of recipients, a set p ⊆ P of purposes, and a real-valued retention time t ∈ ℝ.

Rewards in these contracts could include discounts on merchandise, free software or document downloads, air miles, and so forth. We assume that typically rewards would just be short term offers to be redeemed immediately, rather than long-term offers such as a lifetime gold membership. Let T denote the set of tokens, where each token represents some reward.

A valid contract ⟨s, t⟩ then consists of a statement s ∈ S and a token t ∈ T. A deal involving more than one statement and/or more than one token can be viewed as a deal containing more than one contract. Once both sides agree to a contract ⟨s, t⟩, the information sender transmits information as given by the terms in s, and the information receiver credits the sender with the reward represented by the token t.
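Transcribed into data structures, the contract model looks roughly like the sketch below; the vocabulary values are illustrative stand-ins for the P3P-defined sets D, R, and P.

```python
from dataclasses import dataclass
from math import inf

@dataclass(frozen=True)
class Statement:
    """The privacy component s = <d, r, p, t> of a contract."""
    data: frozenset        # d, e.g. {"name", "email"}
    recipients: frozenset  # r, e.g. {"ours", "delivery"}
    purposes: frozenset    # p, e.g. {"admin", "telemarketing"}
    retention: float       # t, in days here; use inf for <indefinitely/>

@dataclass(frozen=True)
class Contract:
    """A candidate deal: a statement plus a reward token."""
    statement: Statement
    token: str             # e.g. "10% discount" or "free shipping"

s = Statement(frozenset({"email"}), frozenset({"ours"}), frozenset({"admin"}), 7)
offer = Contract(s, "free shipping")
```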
Elicitation of Sender Preferences for Information Contracts

To model the information sender's preferences over the set of possible private information contracts, we employ the concept of multi-attribute utility (Keeney & Raiffa, 1976). Utility elicitation techniques are then used to help determine the sender's multi-attribute utility function. We model utility as follows. Let u_s : S → ℝ be the utility function over the set of statements, indicating the sender's utility of each statement, and let u_t : T → ℝ be the utility function over the set of tokens, indicating the sender's utility of each token.

We assume that the two attributes S and T in our model are mutually utility independent. That is, the sender's preference relation for statements remains the same regardless of which token is in question and vice-versa. Given that the attributes are mutually utility independent and that u_s and u_t are fully specified over their respective domains, the two-attribute utility function u can then be expressed by the bilinear function

u(⟨s, t⟩) = k_s u_s(s) + k_t u_t(t) + k_st u_s(s) u_t(t)    (1)

for all s ∈ S and t ∈ T, where k_s, k_t, and k_st are scaling constants that sum to 1 (refer to Keeney and Raiffa [1976], for example).

To determine a reasonably close approximation of the sender's true utility function, we employ the method of utility elicitation as given by Chajewska et al. (2000). With this
approach, utility for a given option is treated as a random variable that is drawn from a known distribution. A series of questions about preferences is then asked to reduce the variance of the distribution of possible outcomes for the true utility. Each subsequent question asked is the one whose answer is predicted to yield the most valuable information. That is, given a decision strategy (in our case, a decision on whether to accept an offer, or perhaps a negotiation strategy), the next question to ask is the one such that the information to be gained will increase the expected utility of the strategy the most.

A general description of the technique is given briefly as follows (see Buffett, Scott, Spencer, Richter, and Fleming [2004] for a more detailed examination). Given utility distributions U over the set of options and a decision strategy π, the expected utility Eu[π|U] of the strategy given the distributions is computed. The goal is then to determine the question to ask the decision maker that will provide the most valuable information. Let q be such a question with n possible answers. If the user gives the ith answer with probability p(i), and the resulting distributions given this new information are U_i, then the posterior expected utility after asking q is

Σ_{i=1}^{n} p(i) Eu[π | U_i]

Typical questions follow a gamble pattern. Let x_1, x_2, and x be alternatives such that utilities u(x_1) and u(x_2) are known, and x_1 ≻ x ≻ x_2 (x_1 is preferred over x, which is preferred over x_2). The sender could then be asked, "Given a choice between receiving (a) alternative x for sure and (b) a lottery which gives alternative x_1 with probability s and alternative x_2 with probability 1−s, which would you choose?" If the sender chooses x, then we know that u(x) > u(x_1)s + u(x_2)(1−s); otherwise, u(x) < u(x_1)s + u(x_2)(1−s). The probability distribution function for u(x) is then updated accordingly.

This approach can be used to determine the information sender's utilities for the consequences that can be associated with giving away certain information. For example, if there is a 25% chance that the sender will receive phone calls from telemarketers if he gives up his name and phone number, then given some reward t, his utility for receiving t in exchange for this risk is equal to his utility for a contract in which he would receive t in exchange for his name and phone number.

We use the technique for utility elicitation described above to derive the user's utilities as follows. Let C be the set of possible consequences that can occur as a result of the candidate private information exchange, and let U be the set of probability distribution functions for the sender's true utility for each consequence in the set C. Also, let π be the chosen strategy in the current negotiation, and let Eu[π|U] be the expected utility of executing π, given the beliefs U about the user's utilities. Note that in some cases π might be quite simple, such as in take-it-or-leave-it negotiations where an offer o is proposed by the business, and thus the strategy is simply π(o) = "accept" if and only if we believe that the user's utility for o is greater than or equal to some threshold. In these cases, Eu[π|U] can be quite simple to compute. In other scenarios, such as multi-round bilateral negotiations, π can be quite
complex, and thus computation of Eu[π|U] might require more sophisticated game-theoretic or decision-theoretic techniques. In either case, we assume that its computation, or at least its estimation, is feasible. The question q that maximizes the expected utility of π is then determined. Let u_q denote the expected increase in utility associated with asking q. Also, a bother cost bc is determined, indicating how much the user will be bothered if another question is asked. This bother cost can be viewed as the reduction in utility that will be realized by the user as a result of having to answer a question. Thus, the question q is asked if and only if u_q > bc. Once there is no q that satisfies this inequality, the question period is terminated since not enough information can be gained to make the effort worthwhile.

For example, consider the following simple privacy negotiation where the information receiver offers the sender a private information contract. The contract states that the sender will receive a particular discount on a product in exchange for her name and home address, which may be given to a third party. The negotiation is simply a take-it-or-leave-it type. We estimate that the consequence "receive junk mail" will occur with 0.8 probability as a result of this information sharing and that the user's utility for receiving junk mail and the discount is either 0.5 or 0.6, each with 50% likelihood. Also assume that the sender's utility of no exchange is 0.62. That is, the sender would prefer not to make an exchange rather than accept a contract with expected utility below 0.62. Thus the strategy for the offer o is π(o) = accept iff u(o) > 0.62. Since the expected utility of o is Eu[o] = 0.55 · 0.8 + 1 · 0.2 = 0.64 (since the mean utility given that the consequence occurs is 0.55 and we assume that the utility of no consequence is 1), the sender should accept the offer and expect 0.64 utility. However, there is a chance that the sender's true utility for the offer is below 0.62 (when the utility for the consequence is 0.5), and thus there is a possibility that we have wrongly advised the sender. If we could further ascertain the sender's true utility, we could reduce the chance of such ill-advised suggestions, and consequently increase the expected utility of the transaction.

Let q be a question that allows us to determine whether the sender's utility is 0.5 or 0.6. If we find that the utility is 0.5 (with 50% probability), then we know the expected utility of the offer is Eu[o] = 0.5 · 0.8 + 1 · 0.2 = 0.6. Since this is less than 0.62, the sender should be advised to reject the offer, thus achieving the no-offer utility of 0.62. If we find that the utility is 0.6 (with 50% probability), then we know the expected utility of the offer is Eu[o] = 0.6 · 0.8 + 1 · 0.2 = 0.68, in which case the sender should accept the offer. Thus the expected utility of the strategy (which is to accept if and only if Eu[o] > 0.62), given that we will know the answer to q, is 0.62 · 0.5 + 0.68 · 0.5 = 0.65. Asking q thus increases the expected utility of the strategy by 0.01 (since it had been 0.64). If this value exceeds the bother cost of asking the question, then the sender should be asked; otherwise the elicitation period is over. See Buffett, Scott, et al. (2004) for a deeper examination into the computation of bother cost. At this point, best estimates of the sender's utilities over the set of possible contracts can be computed, and negotiation can commence.
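The arithmetic in this example is small enough to check mechanically. The sketch below simply recomputes the expected utilities and the 0.01 value of asking the question; it is bookkeeping only, not the elicitation machinery of Chajewska et al. (2000).

```python
p_junk_mail = 0.8          # probability of the "receive junk mail" consequence
u_no_consequence = 1.0     # utility when the consequence does not occur
reserve = 0.62             # utility of making no exchange at all

def expected_utility(u_consequence: float) -> float:
    return u_consequence * p_junk_mail + u_no_consequence * (1 - p_junk_mail)

# Before asking: the mean consequence utility is 0.55, so Eu[o] = 0.64 > 0.62 -> accept.
print(round(expected_utility(0.55), 2))              # 0.64

# After asking q we would advise according to the revealed utility, then average.
eu_if_low = max(expected_utility(0.5), reserve)      # 0.60 < 0.62, so reject -> 0.62
eu_if_high = max(expected_utility(0.6), reserve)     # 0.68 >= 0.62, so accept -> 0.68
value_knowing_answer = 0.5 * eu_if_low + 0.5 * eu_if_high
print(round(value_knowing_answer, 2))                # 0.65, a gain of 0.01; ask q only
                                                     # if that gain exceeds the bother cost
```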
Note that it is possible that negotiation and utility elicitation can be interleaved. This can, however, complicate the situation as new issues surface on how to conduct this effectively. For simplicity here we treat the two phases separately.
Automated Negotiation of Information Contracts

In this section we discuss the PrivacyPact protocol (Buffett, Jia, et al., 2004), a bilateral bargaining protocol for the exchange of private information. The PrivacyPact protocol dictates how private information contracts can be offered between the information receiver and sender until a mutually agreeable contract is found (or the negotiation is terminated). The protocol aims to guide the negotiation process efficiently to a convergence by attempting to force each party to make progress. Specifically, a party p in the negotiation is not allowed to make an offer that is necessarily worse to the other party than another offer previously made by p.

While the receiver's and sender's utility functions are private, a partial order of each participant's preference ranking is mutually known. Specifically, we assume that the receiver necessarily values a statement s no more than another statement s′ if the data, recipients, and purposes specified in s are subsets of those specified in s′ and the retention time specified in s is no longer than that in s′. For the sender, the opposite is true. More formally, we define the partial order operator ≼ over the set S as follows. Let s, s′ ∈ S be two statements with s = ⟨d, r, p, t⟩ and s′ = ⟨d′, r′, p′, t′⟩ (for the P3P retention value <indefinitely/>, we assume t = ∞). Then

s ≼ s′  ⟺  d ⊆ d′, r ⊆ r′, p ⊆ p′, and t ≤ t′.

For example, let s1, s2, and s3 be P3P statements where s1 = ⟨{e-mail address}, {ours}, {admin}, 1 week⟩, s2 = ⟨{name, e-mail address}, {ours}, {admin, telemarketing}, 3 weeks⟩, and s3 = ⟨{name, phone number}, {ours}, {admin, telemarketing}, 3 weeks⟩. Then s1 ≼ s2, since the data, recipients, and purposes specified in s1 are subsets of those specified in s2, and the retention time specified in s1 is less than that specified in s2. On the other hand, s1 ⋠ s3, since {e-mail address} is not a subset of {name, phone number}. It is possible that the receiver could value obtaining {name, phone number} more and thus have a higher utility for s3, or conversely that the sender could prefer giving up {name, phone number} less and thus have a lower utility for s3. However, these preferences would be only privately known.

Similar to the ordering over statements, since some rewards are mutually agreed to be "more" or "better" than others, we define a partial order relation ≼ over T. For example, while the sender's preference over various free software downloads might be subjectively decided upon and therefore be only privately known, it is assumed to be obvious to all that the sender would value a 20% discount more than a 10% discount on the same items. For any two tokens t, t′ ∈ T, t ≼ t′ if and only if it is mutually known that the reward represented by t is no greater than that represented by t′.

These mutually agreeable partial orderings over statements and tokens induce a partial ordering over offers. For any two offers ⟨s, t⟩ and ⟨s′, t′⟩, if s ≼ s′ and t′ ≼ t then the sender prefers ⟨s, t⟩ at least as much, while the receiver prefers ⟨s′, t′⟩. That is, the sender prefers less information and more reward, while the receiver prefers more information and less reward.
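The statement ordering ≼ is straightforward to implement. In the sketch below a statement is a plain (data, recipients, purposes, retention) tuple, and the example reproduces the s1, s2, s3 comparison above.

```python
def statement_leq(s, s2) -> bool:
    """True iff s precedes-or-equals s2: no more data, recipients, purposes, or time."""
    d, r, p, t = s
    d2, r2, p2, t2 = s2
    return d <= d2 and r <= r2 and p <= p2 and t <= t2

week = 7  # retention expressed in days
s1 = (frozenset({"e-mail address"}), frozenset({"ours"}),
      frozenset({"admin"}), 1 * week)
s2 = (frozenset({"name", "e-mail address"}), frozenset({"ours"}),
      frozenset({"admin", "telemarketing"}), 3 * week)
s3 = (frozenset({"name", "phone number"}), frozenset({"ours"}),
      frozenset({"admin", "telemarketing"}), 3 * week)

print(statement_leq(s1, s2))                             # True:  s1 precedes s2
print(statement_leq(s1, s3) or statement_leq(s3, s1))    # False: s1 and s3 are incomparable
```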
PrivacyPact is an alternating-offers bargaining protocol under which the participants initially agree on the subdomains of negotiation, D′ ⊆ D, R′ ⊆ R, P′ ⊆ P, and T′ ⊆ T. Each message sent under the protocol consists of a candidate private information contract offer. Starting with the receiver, offers are sent one at a time from alternating parties until either an agreement is reached or a party chooses to terminate the negotiation. Offers sent must conform to the following two constraints:

• Constraint 1: Information receiver makes progress. Let ⟨s_r^n, t_r^n⟩ be the n-th message sent by the information receiver. Suppose that this n-th message is the current message and the information sender is determining whether the message is legal. ⟨s_r^n, t_r^n⟩ must satisfy

∀ i = 1, …, n−1:  s_r^i ⋠ s_r^n  or  t_r^n ⋠ t_r^i    (2)

That is, no offer to the information sender may be worse than a previous offer: relative to each of the receiver's previous offers, either the new offer does not ask for more information, or it offers more in return.

• Constraint 2: Information sender makes progress. Now consider ⟨s_s^n, t_s^n⟩, the n-th message sent by the information sender. This offer must be no worse for the information receiver than any previous offer:

∀ i = 1, …, n−1:  s_s^n ⋠ s_s^i  or  t_s^i ⋠ t_s^n    (3)
Negotiation will end, and an agreement is considered to have been reached, if either of the following two termination conditions is met:
• Condition 1: Information receiver terminates. If the information receiver's current offer ⟨s_r^n, t_r^n⟩ is the same as, or an improvement from the information sender's point of view over, one of the sender's previous offers, then this offer must be accepted by the sender, thus ending the negotiation. More formally, if there exists ⟨s_s^i, t_s^i⟩ with i = 1, …, n−1 such that s_r^n ≼ s_s^i and t_s^i ≼ t_r^n, then the current offer ⟨s_r^n, t_r^n⟩ must be accepted by the information sender.

• Condition 2: Information sender terminates. If the information sender's current offer ⟨s_s^n, t_s^n⟩ is the same as, or an improvement from the information receiver's point of view over, one of the receiver's previous offers, then this offer must be accepted by the receiver. Such an offer ends the negotiation. More formally, if there exists ⟨s_r^i, t_r^i⟩ with i = 1, …, n such that s_r^i ≼ s_s^n and t_s^n ≼ t_r^i, then the current offer ⟨s_s^n, t_s^n⟩ must be accepted by the information receiver.
One participant p may choose to terminate the negotiation at time n, perhaps because of time constraints or because the protocol disallows any further offers that he deems satisfactory. At this time, p can offer a take-it-or-leave-it message over the set of previous
offers ⟨s_p^i, t_p^i⟩, i = 1, …, n. In this case, p's opponent can choose to accept one of p's previous offers or decline. In either instance, the negotiation is terminated. This last-chance message ensures that a mutually acceptable offer can always be reached under this protocol if one exists (see Buffett, Jia, et al. [2004] for a proof).

Without being able to accurately characterize the opponent's preferences over the set of possible offers, or its urgency to find a deal, designing effective negotiation strategies is difficult. One simple route to take is a miserly approach, where offers are made simply by starting with the one with highest utility, then the next highest, and so forth, until either the receiver accepts an offer or makes one that the sender finds acceptable. This method is rather short-sighted, since no attempt is made to cooperate with the opponent by offering deals that it might find more enticing. Thus negotiations could take too long and break off before a deal is reached. A more cooperative strategy is to make offers in which the opponent is likely to have some interest, but which at the same time are good for the offering party. This concept of making effective trade-offs that keep utility constant while possibly increasing utility for the opponent, first proposed by Faratin, Sierra, and Jennings (2002), involves finding offers that maximize a similarity measure when compared with the opponent's past offers. Offers that are more similar to the opponent's offers are likely to be preferred. Buffett, Jia, et al. (2004) have performed several experiments with various strategies of this form within the PrivacyPact protocol.
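The bookkeeping implied by the progress constraints and termination conditions can be sketched as follows (statements are ordered componentwise as before, and tokens by a mutually known numeric value such as a discount percentage; this is an illustration, not the published PrivacyPact implementation):

```python
def statement_leq(s, s2):
    """Componentwise ordering over (data, recipients, purposes, retention) tuples."""
    return all(a <= b for a, b in zip(s, s2))

def receiver_offer_legal(new_offer, receiver_history):
    """Constraint 1: the receiver's new offer must not be (weakly) worse for the
    sender than any of the receiver's earlier offers."""
    s_new, t_new = new_offer
    return all(not (statement_leq(s_old, s_new) and t_new <= t_old)
               for s_old, t_old in receiver_history)

def sender_offer_legal(new_offer, sender_history):
    """Constraint 2: the sender's new offer must not be (weakly) worse for the
    receiver than any of the sender's earlier offers."""
    s_new, t_new = new_offer
    return all(not (statement_leq(s_new, s_old) and t_old <= t_new)
               for s_old, t_old in sender_history)

def receiver_must_accept(sender_offer, receiver_history):
    """Condition 2: the sender's offer matches or improves on a past receiver offer."""
    s_s, t_s = sender_offer
    return any(statement_leq(s_r, s_s) and t_s <= t_r
               for s_r, t_r in receiver_history)

# Tokens here are plain numbers (a discount percentage), so reward ordering is <=.
email_only = (frozenset({"email"}), frozenset({"ours"}), frozenset({"admin"}), 7)
name_email = (frozenset({"name", "email"}), frozenset({"ours"}), frozenset({"admin"}), 7)

receiver_history = [(name_email, 10)]          # receiver asked for more data, offered 10%
print(receiver_offer_legal((name_email, 5), receiver_history))   # False: same ask, less reward
print(receiver_offer_legal((email_only, 10), receiver_history))  # True: asks for less data
print(receiver_must_accept((name_email, 8), receiver_history))   # True: gives the requested
                                                                  # data for no more reward
```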
Future Trends

The future holds dramatically greater complexity. Information and communications technologies have matured to the point where a wide range of business and social processes are migrating to cyberspace. Today, information shared across public networks is of relatively limited sensitivity: e-mail addresses, other contact information, credit card numbers, unstructured information exchanged primarily through e-mail, and so on. Online banking involves more sensitive information (e.g., account balances or transaction details), but the potential exposure is relatively low because only two parties are involved, and the communication channel is strongly secured (although authentication of users remains a relative weakness).

Already many jurisdictions have projects underway that will increase the amount and sensitivity of information being processed and exchanged. A good example is furnished by electronic health record (EHR) systems. One vision of the electronic health record is the one-patient-one-record model in which a single virtual record follows an individual from birth to death. It accumulates medical history over time, and clinicians and others access the record as necessary to deliver care.

Privacy issues in these systems are complex because many people have a legitimate need to interact with the record at different times. A single record may be accessed by physicians, nurses, specialists, other clinicians and by management, administrative, and clerical staff working for hospitals, clinics, private practices, testing labs, ambulance and paramedical services, insurers, and government agencies.
Each participant in the care-giving process has legal and ethical obligations regarding the information they access and produce, leading to confusion over questions of ownership and liability. These conditions make it difficult to even specify a privacy policy, much less to implement and enforce it. What hope does the healthcare consumer have of understanding the potential exposure?

To manage this complexity, custodians of personal information can look to technologies like EPAL to begin the process of ensuring compliance with privacy obligations at every step. For senders, the challenge is even greater because the processes are more opaque, and opportunities for control are less direct. We believe that a three-prong approach is required to address the need; we have discussed two of them in depth in this chapter.

First, before senders can hope to manage their exposure or control information that has been released, they must first understand the legal and regulatory environment and the potential for exposure. Sophisticated user agents capable of detecting potential risks and communicating them to users appear to be a promising approach, especially when they have rich sources of evidence to draw upon such as global scale reputation systems. The Omnivore system suggests one way to approach the problem. User interfaces should also be designed in a way that helps users remain aware of the risks and the opportunities to control or mitigate them (Patrick & Kenny, 2003).

Second, given sufficient awareness and understanding, users will be in a position to intelligently manage their exposure. Tools to help assess the value of personal information and to fairly negotiate its exchange will be essential. The PrivacyPact protocol is an example of such a tool.

Third, senders must have opportunities for direct control over information they have released. Privacy legislation provides this to some degree, but senders remain dependent on the compliance of receivers. The privacy rights management concept (Korba & Kenny, 2002) has the potential to give senders some meaningful control.
Conclusions

In this chapter, we propose a novel approach to privacy management that focuses on the information sender's perspective. In a typical private information exchange, information receivers have particular duties with respect to the information, as outlined by modern privacy standards such as the CSA Model Code. Information senders must understand these duties and rely on the receivers to uphold their commitments and ensure that no harm comes to them. Even with privacy policies, it is difficult for typical senders to understand the potential benefits and risks of information disclosure, and their rights regarding the monitoring and control over such information. Thus many people prefer to disengage themselves from situations where such information transfer may occur. Among other things, this has created a major drag on the growth of electronic commerce, since potential buyers are lost.

To fill this void, we present a framework for helping an information sender exercise the right of privacy. This includes a set of risk management questions, designed to help the
sender evaluate a given situation and understand the potential implications of private information disclosure. These questions loosely resemble the principles of the CSA Model Code. While the CSA Model Code outlines the duties of information receivers, our questions can help information senders realize how information will be used and how disclosure can benefit and/or harm them. We also highlight the differences between intentional and unplanned exposure and show how unplanned exposure can be limited, by discussing a few useful technologies that are in place today. Even once the sender is aware of the risks and benefits of disclosure, it may still be unclear just how valuable such a disclosure would be. To help a sender understand the value of information, we model preferences using multi-attribute utility theory and employ a utility elicitation technique. Since such values are not likely to be agreed upon with the receiver, a mechanism is presented for automated negotiation of private information contracts. These contracts can include a reward offered by the receiver as extra incentive to provide requested information. The PrivacyPact protocol is presented as the set of rules to which bargaining must conform, and a simple negotiation strategy is presented.
References

Branchaud, M., & Flinn, S. (2004). xTrust: A scalable trust management infrastructure. In S. Marsh (Ed.), Proceedings of the Second Annual Conference on Privacy, Security and Trust (PST 2004) (pp. 207-218). Fredericton, New Brunswick, Canada: Electronic Text Centre, University of New Brunswick. Retrieved February 27, 2005, from http://dev.hil.unb.ca/Texts/PST/pdf/branchaud.pdf

Buffett, S., Jia, K., Liu, S., Spencer, B., & Wang, F. (2004). Negotiating exchanges of P3P-labeled information for compensation. Computational Intelligence, 20(4), 663-677.

Buffett, S., Scott, N., Spencer, B., Richter, M. M., & Fleming, M. W. (2004, October 14-15). Determining Internet users' values for private information. In S. Marsh (Ed.), Proceedings of the Second Annual Conference on Privacy, Security and Trust (PST 2004) (pp. 79-88). Fredericton, New Brunswick, Canada.

Chajewska, U., Koller, D., & Parr, R. (2000). Making rational decisions using adaptive utility elicitation. In Proceedings of the 17th National Conference on Artificial Intelligence (AAAI-00), Austin, Texas, USA (pp. 363-369).

Chellappa, R., & Sin, R. (2005). Personalization versus privacy: An empirical examination of the online consumer's dilemma. Information Technology and Management, 6(2-3).

Cranor, L. F. (2003, April 6). Designing a privacy preference specification interface: A case study. Presented at the CHI'03 Workshop on HCI and Security Systems. Retrieved February 27, 2005, from http://www.andrewpatrick.ca/CHI2003/HCISEC/hcisec-workshop-cranor.pdf

Cranor, L. F., Arjula, M., & Guduru, P. (2002). Use of a P3P user agent by early adopters. In Proceedings of the ACM Workshop on Privacy in the Electronic Society (pp. 1-10). ACM Press.
Cranor, L. F., Langheinrich, M., Marchiori, M., Presler-Marshall, M., & Reagle, J. (2002). The Platform for Privacy Preferences 1.0 (P3P1.0) specification. Retrieved from http://www.w3.org/TR/P3P/

Cranor, L. F., & Resnick, P. (2000). Protocols for automated negotiations with buyer anonymity and seller reputations. Netnomics, 2(1), 1-23.

Culnan, M., & Armstrong, P. (1999). Information privacy concerns, procedural fairness, and impersonal trust: An empirical investigation. Organization Science, 10(1), 104-115.

Culnan, M., & Bies, R. (2003). Customer privacy: Balancing economic and justice considerations. Journal of Social Issues, 59(2), 104-115.

Enterprise Privacy Authorization Language (EPAL 1.1) (IBM Research Report). (2003, October 1). Retrieved February 28, 2005, from http://www.zurich.ibm.com/security/enterprise-privacy/epal/Specification/

Faratin, P., Sierra, C., & Jennings, N. (2002). Using similarity criteria to make issue tradeoffs in automated negotiations. Artificial Intelligence, 142, 205-237.

Fatima, S., Wooldridge, M., & Jennings, N. R. (2004). An agenda-based framework for multi-issue negotiation. Artificial Intelligence, 152(1), 1-45.

Flinn, S., & Lumsden, J. (2005, October 12-14). User perceptions of privacy and security on the Web. In Proceedings of the Third Annual Conference on Privacy, Security and Trust (PST 2005). St. Andrews, New Brunswick, Canada.

Flinn, S., & Stoyles, S. (2004, September 20-23). Omnivore: Risk management through bidirectional transparency. In V. Raskin (Ed.), Proceedings of the New Security Paradigms Workshop (NSPW). Liverpool, Nova Scotia, Canada.

Garfinkel, S. (2001). Database nation. O'Reilly.

Hann, I., Hui, K., Lee, T., & Png, I. (2002). Online information privacy: Measuring the cost-benefit trade-off. In Proceedings of the 23rd International Conference on Information Systems.

Internet cookie report (SecuritySpace Research Report from E-Soft Inc.). (2005, February 1). Retrieved February 27, 2005, from http://www.securityspace.com/s_survey/data/man.200501/cookieReport.html

Jennings, N. R., Faratin, P., Lomuscio, A. R., Parsons, S., Sierra, C., & Wooldridge, M. (2001). Automated negotiation: Prospects, methods and challenges. International Journal of Group Decision and Negotiation, 10(2), 199-215.

Jennings, N. R., Parsons, S., Sierra, C., & Faratin, P. (2000). Automated negotiation. In Proceedings of the 5th International Conference on the Practical Application of Intelligent Agents and Multiagent Systems (PAAM-2000), Manchester, UK (pp. 23-30).

Keeney, R. L., & Raiffa, H. (1976). Decisions with multiple objectives: Preferences and value tradeoffs. John Wiley & Sons.
Korba, L., & Kenny, S. (2002, November 18-22). Towards meeting the privacy challenge: Adapting DRM. In Proceedings of the ACM Workshop on Digital Rights Management, Washington, DC, USA. (Held in conjunction with the Ninth ACM Conference on Computer and Communications Security). Retrieved February 27, 2005, from http://www.iit.nrc.gc.ca/iit-publications-iti/docs/NRC-44956.pdf

Martin, D., Wu, H., & Alsaid, A. (2003). Hidden surveillance by Web sites: Web bugs in contemporary use. Communications of the ACM, 46(12), 258-264.

Metzger, M. J. (2004). Privacy, trust, and disclosure: Exploring barriers to electronic commerce. Journal of Computer-Mediated Communication, 9(4).

Millett, L. I., Friedman, B., & Felten, E. (2001). Cookies and Web browser design: Toward realizing informed consent online. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 46-52). ACM Press.

Patrick, A. S., & Kenny, S. (2003, March 26-28). From privacy legislation to interface design: Implementing information privacy in human-computer interfaces. In R. Dingledine (Ed.), Proceedings of Privacy Enhancing Technologies Workshop (PET2003), Lecture Notes in Computer Science 2760 (pp. 107-124). Dresden, Germany: Springer.

Poole, P. S. (1999). Echelon: America's secret global surveillance network. Retrieved May 20, 2005, from http://fly.hiwaay.net/%pspoole/echelon.html

Solove, D. J. (2002). Conceptualizing privacy. California Law Review, 90, 1087-1156. Retrieved May 20, 2005, from http://ssrn.com/abstract=313103

Solove, D. J. (To appear, January 2006). A taxonomy of privacy. University of Pennsylvania Law Review, 154. Retrieved May 20, 2005, from http://ssrn.com/abstract=667622

Trust in the wired Americas (Research Report from Cheskin Research). (2000, July). Retrieved February 27, 2005, from http://www.cheskin.com/p/ar.asp?mlid=7&arid=12&art=0

Udo, G. J. (2001). Privacy and security concerns as major barriers for e-commerce: A survey study. Information Management and Computer Security, 9(4), 165-174.

van Blarkom, G. W., Borking, J. J., & Olk, J. G. E. (Eds.). (2003). Handbook of privacy and privacy-enhancing technologies: The case of intelligent software agents. Den Haag, The Netherlands: College bescherming persoonsgegevens.

Web bug report (SecuritySpace Research Report from E-Soft Inc.). (2005, February 1). Retrieved February 27, 2005, from http://www.securityspace.com/s_survey/data/man.200501/webbug.html
Chapter II
Privacy Issues in the Web Services Architecture (WSA) Barbara Carminati, University of Insubria at Como, Italy Elena Ferrari, University of Insubria at Como, Italy Patrick C. K. Hung, University of Ontario Institute of Technology (UOIT), Canada
Abstract

A Web service is a software system that supports interoperable application-to-application interactions over a network. Web services are based on a set of XML standards such as Universal Description, Discovery and Integration (UDDI), Web Services Description Language (WSDL), and Simple Object Access Protocol (SOAP). Recently, there have been increasing demands and discussions about Web services privacy technologies in the industry and research community. To enable privacy protection for Web service consumers across multiple domains and services, the World Wide Web Consortium (W3C) published a document called "Web Services Architecture (WSA) Requirements" that defines some fundamental privacy requirements for Web services. However, no comprehensive solutions to the various privacy issues have been so far defined. For these reasons, this chapter will focus on privacy technologies by first discussing the main privacy issues in WSA and related protocols. Then, this chapter illustrates the standardization efforts going on in the context of privacy for Web services and proposes different technical approaches to tackle the privacy issues.
Introduction

Privacy is a state or condition of limited access to a person (Schoeman, 1984). Information privacy relates to an individual's right to determine how, when, and to what extent information about the self will be released to another person or to an organization (Leino-Kilpi, Dassen, Gasull, Lemonidou, Scott, & Arndt, 2001). Threats to information privacy can come from insiders and from outsiders in each organization (Fischer-Hubner, 2001). In general, privacy policies describe what information an organization collects from individuals (e.g., consumers) and what it does with it (e.g., for which purposes). Many studies show that good privacy protection is an important factor in generating good business (Bennett, 1997). In the U.S., the Privacy Act of 1974 (Fischer-Hubner, 2001) requires that federal agencies grant individuals access to their identifiable records that are maintained by the agency, ensure that existing information is accurate and timely, and limit the collection of unnecessary information and the disclosure of identifiable information to third parties. In Canada, the Personal Information Protection and Electronic Documents Act (PIPEDA) governs privacy issues related to collected data, including those collected via traditional Web-based applications (PIPEDA, 2005). In contrast, the European Union Data Protection Directive (Steinke, 2002) contains two statements that conflict with the U.S. approach. The first statement requires that an organization must inform individuals about the purposes for which it collects and uses information about them, how to contact the organization, and the types of third parties to which it discloses the information. The second statement requires that personal data on EU citizens may only be transferred to countries outside the 15-nation bloc that adopt these rules or are deemed to provide "adequate protection" for the data. As a result, these statements imply that no information on any EU citizen can be transferred to the U.S. because of the conflicts between the two privacy regimes. Consequently, this creates obstacles for conducting business activities between these two regions. To overcome the difficulties, the U.S. government has a voluntary scheme called "Safe Harbour" to provide an adequate level of data protection that can safeguard transfers of personal data from the EU to the U.S. The U.S. companies doing business in the EU must certify to the Commerce Department that they will follow the regulations of the EU directive. Any violation would be subject to prosecution by the Federal Trade Commission (FTC) for deceptive business practices. For instance, biometrics (Grijpink, 2001) and healthcare applications (Cheng & Hung, 2005) have to seriously enforce privacy protection. Under the Health Insurance Portability and Accountability Act of 1996 (HIPAA) privacy rules (HL7, 2004) in the U.S., Protected Health Information (PHI) includes individually identifiable health information related to past, present, and future physical and mental health conditions, as well as the past, present, and future payment for the provision of healthcare to an individual. HIPAA provides a set of standard policies that healthcare providers must exercise in order to protect a patient's privacy, together with a standard set of electronic transaction formats and regulations to ensure the privacy and security of healthcare-related transactions.
Similarly, the Personal Health Information Protection Act of 2004 (PHIPA) in Canada establishes rules for the collection, use, and disclosure of personal health information about individuals that protect the confidentiality of that information
and the privacy of individuals with respect to that information, while facilitating the effective provision of healthcare (PHIPA, 2005). For example, applied research in e-health services shows that integrated views can result in different information formats being provided through the use of intelligent electronic health data access, analysis, and visualization tools (Lacroix, 2002; Tan & Hung, 2005). However, e-health services that link e-patients' health datasets to other sources of patient-specific data pose significant risks to the privacy of stored patient data. Indeed, these "micro-datasets" often contain identifiable and sensitive information such as genetic or demographic data about individuals, for example, name, age, sex, address, phone number, employment status, family composition, and DNA (Quantin, Allaert, & Dusserre, 2000). In this sense, disclosure of sensitive information about particular individuals may not only create personal embarrassment but may also lead to social ostracism (France, 1996). All of these facts underscore that privacy is a very important topic, as more and more business applications are deployed on the Internet. As Web services are becoming more and more popular for supporting different business applications, there are also increasing demands and discussions about Web services privacy technologies in the industry and research community. The information exchange in such a Web services-based business environment must be protected by privacy-enhancing technologies, especially if the information may be sensitive (Senicar, Jerman-Blazic, & Klobucar, 2003). Thus, a privacy framework is required to support Web services-based businesses. To enable privacy protection for Web service consumers across multiple domains and services, the World Wide Web Consortium (W3C) published a document called "Web Services Architecture (WSA) Requirements" that defines some specific privacy requirements for Web services. However, no comprehensive solutions to the various privacy issues have been so far defined. For these reasons, this chapter will focus on privacy technologies by first discussing the main privacy issues in WSA and related protocols. Then, this chapter illustrates the standardization efforts going on in the context of privacy for Web services, and proposes different technical approaches to tackle the privacy issues. The remainder of this chapter is organized as follows: Section iv discusses the motivation and background information of this chapter. Next, Section v gives a literature review. Section vi introduces privacy policy enforcement in WSA. Then, Section vii addresses privacy issues in Web services discovery agencies. Section viii discusses strategies for privacy enforcement in Web services discovery agencies. Section ix presents the future trends, whereas Section x discusses the conclusions.
Motivation and Background

In this section, we first give the motivations behind the chapter, and then we present some background information on the security mechanisms that are needed to understand the solutions proposed in Section vii.
Motivation

Figure 1 depicts different privacy concerns existing in the context of the Web services architecture (WSA). On the left-hand side, the users interact with the Web services application via information exchanges. The information exchanges between the users and the Web services application always contain different confidential and sensitive data. Referring to the publish/find/bind model in Web services (Mohen, 2002), one can imagine that Web services providers publish their Web services descriptions (e.g., WSDL documents) at registries (e.g., UDDI) for the public to access. Then, the users (Web service requestors) find the appropriate Web services at the registries. In many cases, there may be a mediator (i.e., a service locator) that helps to find appropriate Web services for requestors. This process is called matchmaking (Zhang & Zhang, 2002). Once the Web services are found, the Web services application attempts to bind to each Web service via SOAP messages. From the user's point of view, privacy concerns mainly arise in the registries and Web services. For example, the users may want the registries to protect their privacy, such as their identities and what information they have retrieved from the registries. In addition, the users may also want to validate the privacy policies of business entities and services based on their privacy preferences (W3C, 2002a). This means that the Web services application may only bind to those Web services whose privacy policies satisfy its preferences. From another point of view, the privacy policies defined in UDDI for specific business entities and services must be consistent with the privacy policies defined in the WSDL documents of Web services. As mentioned, there is little research addressing Web services privacy. Few privacy standards exist beyond a principal statement made by IBM and Microsoft: "organizations creating, managing, and using Web services will often need to state their privacy policies and require that incoming requests make claims about the senders' adherence to these policies" (IBM, 2002).
Figure 1. Privacy concerns in Web services architecture (WSA) (Hung, Ferrari, & Carminati, 2004)
Figure 2. An illustrative e-health database application example
For illustration, Figure 2 shows an e-health database application example involving three entities: a Web services application, a Web service, and an e-health care database. The Web services application can be any healthcare application at a health institute that is connected to a Web service at another health institute over the Internet. One can assume that the Web service is used as an interface to receive a request (e.g., retrieve/store healthcare data) from the application and then communicate with the e-health care database at the backend (e.g., read/write data). Once the request is completed, the Web service returns a result (e.g., an acknowledgment or health data) to the application. Let's assume that this example is a healthcare scenario in the U.S. Thus, there is a HIPAA-compliant privacy policy enforced at the Web service according to the law (Cheng & Hung, 2005). This means that every request must be checked and verified against the privacy policy. If the request is eligible according to the privacy rules set out in HIPAA, the Web service will handle the request and initiate the process. Otherwise, the Web service will deny the request. In general, there are six rights that HIPAA gives patients with regard to their PHI (HSS, 2004a):

1. The right to view and make a copy of a patient's own medical records.
2. The right to request the correction of inaccurate health information.
3. The right to find out where PHI has been shared for purposes other than care, payment, or healthcare operations.
4. The right to request special restrictions on the use or disclosure of PHI.
5. The right to request PHI to be shared with the patient in a particular way.
6. The right to file complaints.
The scenario described so far suffices to describe and justify the five "Web Services Architecture (WSA) Requirements" introduced by W3C (Ref: AC020) for enabling privacy protection for the consumer (user) of a Web service across multiple domains and services (W3C, 2002b):

• AR020.1: The WSA must enable privacy policy statements to be expressed about Web services;
• AR020.2: Advertised Web service privacy policies must be expressed in P3P (W3C, 2002c);
• AR020.3: The WSA must enable a consumer to access a Web service's advertised privacy policy statement;
• AR020.5: The WSA must enable delegation and propagation of privacy policy; and
• AR020.6: Web services must not be precluded from supporting interactions where one or more parties of the interaction are anonymous.
The major purpose of these WSA requirements is to enforce privacy policies in the context of WSA, where the AR020.6 requirement is strongly related to workflow/business process integration issues (Hung & Chiu, 2003). Applying privacy policies in the context of WSA is one of the first important steps toward developing a technical framework for supporting Web services privacy policies. In particular, the requirements recommend adopting P3P technologies to define privacy policies. However, these WSA requirements do not cover all the related issues to be investigated for real scenarios. For instance, one can imagine that vocabularies vary in different business applications. Thus, it is essential to have a vocabulary-independent privacy meta-language for WSA. The WSA requirement AR020.5 points out another relevant issue: the privacy policies also have to be enforced in delegation and propagation situations, as shown in Figure 1. Web services may delegate some sub-activities that are decomposed from the assigned activities to other Web services. This assignment process is also called delegation or propagation (IBM, 2002). Another important area to be further investigated is the privacy concerns in the intermediaries that pass the SOAP messages between the Web services application and the Web services/registries, as shown in Figure 1. In the following sections, we focus on two main aspects: privacy issues arising in Web services discovery agencies (Section vii) and the definition of a framework for privacy policy enforcement (Section vi), compliant with the WSA requirements introduced in this section.
Background on Security Mechanisms

Ever since security became an essential asset for all information systems, several security solutions for protecting data have been proposed (see Stallings [2000] for an overview). In general, these solutions exploit access control mechanisms for enforcing data confidentiality and integrity, and cryptography-based solutions for assuring confidentiality and authenticity during data transmission. In what follows, we give a brief overview of these techniques, which will be used in Section vii for privacy enforcement in the context of WSA.
Access Control Mechanisms

The task of an access control mechanism is to prevent unauthorized operations on the managed data. The access control mechanism (or reference monitor) is a software module
that intercepts each access request submitted by a user to the system and determines whether the access should be partially or totally authorized or whether it should be denied. In order to decide what is authorized for a user, the access control mechanism consults a set of authorizations, which state for each user the rights he/she can exercise on the objects managed by the system. In its basic format, an authorization consists of three main components: a subject, who is the entity to which the authorization is granted; an object, which is the resource to which the access is granted; and an access mode, specifying the action that the subject can exercise on the object. The authorizations are specified according to the access control policies adopted by the system, that is, the high-level security rules stating how the information should be managed. More precisely, the access control policies are translated into authorizations by means of an access control model, which specifies the characteristics of the authorization's basic components (subjects, objects, and access modes) and states how access control should take place. In the past years, several access control models have been proposed, for example, mandatory (Bell & LaPadula, 1975), discretionary (Gollmann, 1999), role-based (Ferraiolo, Sandhu, Gavrila, Kuhn, & Chandramouli, 2001), and credential-based (Winslett, Ching, Jones, & Slepchin, 1997), which differ mainly in the subject and object specification and in the way they perform access control.
Encryption Algorithms

In general, the main role played by encryption primitives in security mechanisms is to obscure the data, thus making it inaccessible to users not supplied with the appropriate decryption keys. Encryption algorithms can be grouped into two main classes: symmetric and asymmetric algorithms. In symmetric encryption (e.g., DES, Blowfish, RC5, AES), a unique key, called the secret key, is used to both encrypt and decrypt the data. Thus, according to symmetric encryption, if user A wants to send confidential data to user B, A has to encrypt the data with the secret key that it shares with B. The main drawback of this approach is that it requires a secure channel for the secret key exchange. By contrast, asymmetric encryption (e.g., RSA, ElGamal) exploits two distinct keys: a public key that can be published and distributed, and a private key, which must be kept secret. In general, asymmetric algorithms are defined in such a way that data encrypted with the public key can only be decrypted using the corresponding private key. This property makes asymmetric encryption able to ensure data confidentiality. Indeed, according to asymmetric encryption, with each user is associated a pair of private and public keys. Thus, if user A wants to send confidential data to user B, A has to encrypt the data with the public key of B. Since B is the only one that has the corresponding private key, he/she is the only one able to decrypt the data sent by A. Some asymmetric algorithms (e.g., RSA) can also be used to verify the source of the information. In this case, if user A wants to send user B data and allow B to verify its authenticity, A has to encrypt the data with its private key. If B is able to decrypt the data with the public key that corresponds to the claimed sender, it is assured of the source's authenticity.
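In the Web services setting, these primitives are typically applied to XML content through the W3C XML Encryption syntax. The fragment below is a minimal sketch of the common hybrid pattern, in which the payload is encrypted with a symmetric session key that is in turn encrypted with the receiver's public key; the key name and cipher values are placeholders rather than real data:

<!-- Payload encrypted with an AES session key; the session key is itself
     encrypted with B's RSA public key and carried inside ds:KeyInfo -->
<EncryptedData xmlns="http://www.w3.org/2001/04/xmlenc#"
               xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
  <EncryptionMethod Algorithm="http://www.w3.org/2001/04/xmlenc#aes128-cbc"/>
  <ds:KeyInfo>
    <EncryptedKey>
      <EncryptionMethod Algorithm="http://www.w3.org/2001/04/xmlenc#rsa-1_5"/>
      <ds:KeyInfo><ds:KeyName>UserB-PublicKey</ds:KeyName></ds:KeyInfo>
      <CipherData><CipherValue>...</CipherValue></CipherData>
    </EncryptedKey>
  </ds:KeyInfo>
  <CipherData><CipherValue>...</CipherValue></CipherData>
</EncryptedData>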
Hash Functions

Hash functions (e.g., MD5, SHA-1) play an important role in several authentication mechanisms (e.g., authentication protocols and digital signatures). A function H() is a hash function if and only if it satisfies the following properties: (1) taking as input a message of arbitrary length, H always returns a fixed-length output; (2) given a hash value h, it must be computationally infeasible to find a value x such that H(x)=h (one-way property); (3) it must be computationally infeasible to find a pair x, y, with x different from y, such that H(y)=H(x) (strong collision resistance property). Given the above-mentioned properties, the hash function can be exploited to produce a fingerprint of the message to be authenticated. Indeed, one of the most well-known usages of hash functions is in digital signatures. The digital signature of a message M consists of the encryption of the hash value of M, called the digest, with the private key of the user. Thus, when user A wants to send a message M to user B and allow B to verify its authenticity, A computes the digest of M and encrypts it with its private key (i.e., it digitally signs the message). When B receives the message and its digital signature, it decrypts the digital signature with the public key of A, thus obtaining the hash value of the original message. Then, B computes the hash value of the received message and compares this value with the decrypted hash value. Due to the properties of hash functions and asymmetric encryption, if the hash values match, B is assured that the message has not been modified and has been effectively created by A.
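In Web services, this digest-and-sign pattern is commonly expressed with the W3C XML Signature syntax, which is what WS-Security builds on to sign SOAP messages. The fragment below is a minimal sketch; the reference URI and the digest and signature values are placeholders:

<!-- The Reference points to the signed content; DigestValue carries its hash,
     and SignatureValue carries the hash of SignedInfo encrypted with A's private key -->
<Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
  <SignedInfo>
    <CanonicalizationMethod Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
    <SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/>
    <Reference URI="#signedContent">
      <DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
      <DigestValue>...</DigestValue>
    </Reference>
  </SignedInfo>
  <SignatureValue>...</SignatureValue>
  <KeyInfo><KeyName>UserA-PublicKey</KeyName></KeyInfo>
</Signature>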
Literature Review

In recent years, there have been increasing demands and discussions about privacy technologies for supporting Web services-based applications. For example, WS-Policy describes the business policies to be enforced on intermediaries and endpoints (IBM, 2002). The business policies contain certain requirements such as required security tokens, supported encryption algorithms, and privacy rules. A WS-Policy is represented by a policy expression, that is, an XML Infoset representation of one or more policy statements. WS-Policy includes a set of general messaging-related assertions defined in WS-PolicyAssertions (IBM, 2002) and a set of security policy assertions related to supporting the WS-Security specification defined in WS-SecurityPolicy (IBM, 2002). In particular, WS-Security describes how to attach security tokens such as X.509 certificates to SOAP messages (IBM, 2002). However, the current WS-Policy specification does not discuss privacy rules in detail. Even though WS-Privacy is intended to describe a model for defining subject privacy preferences and organizational privacy practice statements, WS-Privacy has not been developed yet (IBM, 2002). Next, the Platform for Privacy Preferences Project (P3P) working group at W3C develops the P3P specification for enabling Web sites to express their privacy practices (W3C, 2002c). On the other hand, P3P user agents allow users to automatically be informed of site practices and to automate decision making based on the Web sites' privacy practices. Thus, P3P also provides a language, called the P3P Preference Exchange Language
1.0 (APPEL1.0), to be used to express users' preferences for making automated or semi-automated decisions regarding the acceptability of machine-readable privacy policies from P3P-enabled Web sites (W3C, 2002a). Referring to the illustrative e-health database application example, let's assume that the Web services application sends a request to store the health data (e.g., Hospital, Treatment, and Pharmacy) to the Web service. Before the application submits the health data to the Web service, the application must check the P3P privacy policy (Figure 3) posted by the Web service against its APPEL privacy preference (Figure 4). The privacy policy at the Web service states that "The Web service is the only recipient collecting the data (Hospital, Treatment, and Pharmacy), for healthcare purposes." Referring to the P3P privacy policy shown in Figure 3, the STATEMENT assertion describes the data practices as applied to data elements. The RECIPIENT assertion describes the legal entity, or domain, beyond the service provider and its agents where data may be distributed. In this case, the recipient is the service itself, which means that only the Web service collects the data. The PURPOSE assertion describes the reason(s) for data collection and use. In this case, the purpose is healthcare only. Then, the DATA-GROUP assertion describes the data to be transferred or inferred. This case covers the data in the hospital, treatment, and pharmacy. On the Web services application side, the privacy preference states that "The Web service must be the only recipient to collect the data (Hospital, Treatment, and Pharmacy), for healthcare purposes." Referring to the APPEL privacy preference shown in Figure 4, there are two statements (rules) defined: only the Web service can be the recipient of the data of hospital, treatment, and pharmacy, and the purpose must be healthcare only. Based on this example, the P3P privacy policy at the Web service matches the rules set in the application's APPEL privacy preference. Thus, the application will submit the health data to the Web service, and the Web service will write the data into the e-health care database shown in Figure 2.
Figure 3. An illustrative P3P privacy policy
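For concreteness, a P3P 1.0 policy of the kind just described might look roughly as follows. This is only a minimal sketch: the discuri, the custom data schema for Hospital, Treatment, and Pharmacy, and the use of other-purpose to express a healthcare purpose are illustrative assumptions rather than part of the P3P base vocabulary:

<POLICIES xmlns="http://www.w3.org/2002/01/P3Pv1">
  <POLICY name="ehealth-ws" discuri="http://ws.example.org/privacy.html">
    <ENTITY>
      <DATA-GROUP>
        <DATA ref="#business.name">Example e-Health Web Service</DATA>
      </DATA-GROUP>
    </ENTITY>
    <ACCESS><all/></ACCESS>
    <STATEMENT>
      <!-- Collected only for healthcare purposes -->
      <PURPOSE><other-purpose>healthcare</other-purpose></PURPOSE>
      <!-- The Web service (and its agents) is the only recipient -->
      <RECIPIENT><ours/></RECIPIENT>
      <RETENTION><stated-purpose/></RETENTION>
      <!-- Hospital, Treatment, and Pharmacy data, from an assumed custom schema -->
      <DATA-GROUP>
        <DATA ref="http://ws.example.org/schema#Hospital"/>
        <DATA ref="http://ws.example.org/schema#Treatment"/>
        <DATA ref="http://ws.example.org/schema#Pharmacy"/>
      </DATA-GROUP>
    </STATEMENT>
  </POLICY>
</POLICIES>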
Figure 4. An illustrative APPEL privacy preference
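A matching APPEL preference might be sketched as follows; the namespaces and connective values are those of the APPEL 1.0 working draft, and, as above, the healthcare purpose expressed through other-purpose is an illustrative assumption:

<appel:RULESET xmlns:appel="http://www.w3.org/2002/04/APPELv1"
               xmlns:p3p="http://www.w3.org/2002/01/P3Pv1">
  <!-- Accept a policy only if the service itself is the sole recipient
       and the sole purpose is healthcare -->
  <appel:RULE behavior="request">
    <p3p:POLICY>
      <p3p:STATEMENT>
        <p3p:RECIPIENT appel:connective="and-exact"><p3p:ours/></p3p:RECIPIENT>
        <p3p:PURPOSE appel:connective="and-exact">
          <p3p:other-purpose>healthcare</p3p:other-purpose>
        </p3p:PURPOSE>
      </p3p:STATEMENT>
    </p3p:POLICY>
  </appel:RULE>
  <!-- Catch-all rule: block any other policy -->
  <appel:RULE behavior="block">
    <p3p:POLICY/>
  </appel:RULE>
</appel:RULESET>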
Furthermore, Lategan and Olivier (2002) propose a conceptual model for enhancing the decision making at the user agents by using the Chinese Wall security policy based on the P3P framework. Though the P3P framework is not mainly designed for supporting Web services privacy policies, the P3P working group is currently studying the feasibility of applying a revised version of P3P into a Web services privacy policy framework. Next, the Enterprise Privacy Authorization Language (EPAL) technical specification is used to formalize privacy authorization for actual enforcement within an intra- or inter-enterprise setting for business-to-business privacy control (IBM, 2003). EPAL concentrates on the privacy authorization by abstracting data models and user-authentication from all deployment details. The goal behind EPAL is to enable an enterprise to encode its privacy-related data-handling policies and practices in XML for facilitating privacy enforcement in enterprise information systems. Its recent emergence as a fine-grained, privacy-related language standard is in response to the irreversible trend of having more
and more dynamically formed and evolving federations of organizations in this e-business era. The EPAL vocabulary includes hierarchies of data-categories, user-categories, and purposes, as well as sets of actions, obligations, and conditions. Data-categories are used to define different categories of collected data that are handled differently from a privacy perspective, such as financial data. User-categories are used to describe the users or groups accessing collected data, such as investors. Purposes are used to model the intended service for which data is used, such as an investment. Actions are used to model how the data is used, such as buy and sell. Obligations are used to define actions that must be taken by the environment of EPAL, such as, "No personal data will be released to any unauthorized party." In particular, conditions are Boolean expressions, such as, "all sellers must have signed the confidential agreement form." A vocabulary may be shared by more than one enterprise. On the other hand, the EPAL policy defines the privacy authorization rules that allow or deny actions on data-categories by user-categories for certain purposes under certain conditions while mandating certain obligations. EPAL works together with the access control markup language XACML (OASIS, 2003a), as well as recent developments in access control for the Semantic Web (Agarwal, Sprick, & Wortmann, 2004), to achieve privacy data protection. Referring to the illustrative e-health database application example, let's assume that the Web services application (a healthcare system) sends a request to the Web service to retrieve health data for healthcare purposes. Moreover, let's assume that the Web service adopts the EPAL privacy policy shown in Figure 5.
Figure 5. An illustrative EPAL policy
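In EPAL terms, the authorization rule described below (allowing healthcare systems to retrieve hospital, treatment, and pharmacy data for healthcare purposes) could be sketched roughly as follows. This is a schematic fragment only: the element and attribute names are abbreviated from the EPAL vocabulary concepts (user-category, data-category, purpose, action) and are not guaranteed to be schema-valid:

<!-- Schematic EPAL-style rule; the refid values refer to entries of an assumed vocabulary -->
<rule id="r1" ruling="allow">
  <user-category refid="healthcare-system"/>
  <action refid="retrieve"/>
  <data-category refid="hospital"/>
  <data-category refid="treatment"/>
  <data-category refid="pharmacy"/>
  <purpose refid="healthcare"/>
</rule>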
In particular, this privacy policy contains a privacy authorization rule that allows healthcare systems (the user-category) to "retrieve" (the action) the health data from Hospital, Treatment, and Pharmacy (the data-categories) for healthcare purposes (the purpose). In this case, the Web service will read the health data from the e-health care database and return the health data to the application. There are a few research works related to Web services privacy policies. For example, Langheinrich (2002) discusses a privacy awareness system targeted at ubiquitous computing environments. In the privacy awareness system, privacy proxies, which are implemented as a set of SOAP services, handle privacy-relevant interactions between data subjects and data collectors but also provide access to specific user control capabilities disclosed in the privacy policy. Though this work is not mainly targeted at the context of Web services, it provides a basic framework for implementing Web service privacy-enhancing technologies in the future. Further, Rezgui et al. (2002) view privacy in Web services from the aspects of user privacy, data privacy, and service privacy. In particular, service privacy includes three types of policies: the usage policy stating the purposes for which the information collected can be used; the storage policy specifying whether and for how long the information collected can be stored by the Web service; and the disclosure policy stating if and to whom the information collected from a given user can be revealed. In addition, they have also applied their model to a digital government architecture that aims at preserving citizens' privacy (Rezgui et al., 2002). Also, Tumer, Dogac, and Toroslu (2003) present a semantic-based privacy framework for Web services by using DAML-S.
Privacy Policy Enforcement in WSA

Referring to the WSA requirements introduced in Section iv, enabling privacy protection for the consumer of a Web service across multiple domains and services only defines the guidelines according to which privacy-enhancing technologies for Web services should be designed (Hung, Ferrari, & Carminati, 2004). The first important point is that, from the language point of view, different business applications will certainly wish to adapt the privacy policies to their own circumstances. Here we propose the idea of having a vocabulary-independent framework, able to adapt to different Web services applications. Figure 6 shows the concept of domain-specific vocabularies for supporting different types of business applications in the proposed privacy authorization language framework. For example, one can imagine that there exists a financial or medical application-specific vocabulary (Webmethod, 2002). The vocabulary can be described by using DAML-S (DAML, 2003) or the OWL Web Ontology Language (W3C, 2003a), as shown in the sketch after this paragraph. DAML-S defines an upper ontology for describing the semantics of Web services. OWL is an XML language proposed by W3C for defining Web ontologies. An OWL ontology includes descriptions of classes, properties, and their instances, as well as formal semantics for deriving logical consequences in entailments (Carminati, Ferrari, & Bertino, 2005). The second point is defining a protocol for policy enforcement. As far as AC020 is concerned, P3P is proposed to be the privacy authorization language in WSA. An example of such a protocol is presented in Hung et al. (2004), and discussed below assuming that privacy policies and preference exchange rules are specified using P3P (W3C, 2002c) and APPEL (W3C, 2002a), respectively.
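As an illustration of such a domain-specific vocabulary, a small fragment of a hypothetical medical vocabulary expressed in OWL might look as follows (a minimal sketch; the base URI and the class names HealthData, Treatment, and Pharmacy are illustrative assumptions only):

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xml:base="http://example.org/vocab/medical">
  <owl:Ontology rdf:about=""/>
  <!-- Top-level concept for privacy-sensitive medical data -->
  <owl:Class rdf:ID="HealthData"/>
  <owl:Class rdf:ID="Treatment">
    <rdfs:subClassOf rdf:resource="#HealthData"/>
  </owl:Class>
  <owl:Class rdf:ID="Pharmacy">
    <rdfs:subClassOf rdf:resource="#HealthData"/>
  </owl:Class>
</rdf:RDF>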
Figure 6. A privacy authorization language framework (Hung, Ferrari, & Carminati, 2004)
Figure 7. Protocol for enforcing privacy policies in WSA (Hung, Ferrari, & Carminati, 2004)
Parties involved in the protocol are: the Web services provider (A), the Web services consumer (B), the discovery agency (C), a SOAP intermediary (D), and a Web services partner (E) (see Figure 7). The interactions are described as follows:

1. A→C: Request the discovery agency's privacy policy in P3P.
2. A: The Web services provider matches its privacy preferences in APPEL with the discovery agency's privacy policy.
3. A→C: If they match, the Web services provider publishes the service in WSDL and the related privacy policies in P3P. Otherwise, the provider can decide what to do.
4. B→C: Find an appropriate Web service via UDDI.
5. B: The Web services consumer matches the discovery agency's and the service provider's privacy policies in P3P with its privacy preferences in APPEL.
6. B→D: If they match, the Web services consumer attempts to bind to the Web service via a SOAP message, attaching a P3P privacy policy in the SOAP header that SOAP intermediaries are required to obey (see the sketch following this list).
7. D→A: If the SOAP intermediaries all obey the privacy policy, the SOAP message will be passed to the Web service.
8. A→B: Request consent from the Web services consumer for propagation and delegation of information if there is a need.
9. A→E: Once consent is given by the Web services consumer, the Web services provider will pass the information to the Web services partner if and only if the privacy policy at the partner is also compatible with the provider's.
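As an illustration of step 6, a consumer might attach a reference to its P3P policy to the SOAP header so that intermediaries can inspect it before forwarding the message. The sketch below is purely illustrative: the priv:PrivacyPolicyRef element and its namespace are hypothetical, since the WSA does not standardize such a header element:

<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Header>
    <!-- Hypothetical header block carrying a reference to the consumer's P3P policy;
         soap:mustUnderstand="1" requires a processing node to understand this block -->
    <priv:PrivacyPolicyRef xmlns:priv="urn:example:privacy"
                           soap:mustUnderstand="1">
      http://consumer.example.org/p3p/policy.xml
    </priv:PrivacyPolicyRef>
  </soap:Header>
  <soap:Body>
    <!-- Application payload, e.g., the e-health data request -->
  </soap:Body>
</soap:Envelope>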
It is our belief that such issues, independent of the privacy language used, represent the basic steps towards the standardization of Web services privacy technologies in the coming years.
Privacy Issues in Web Services Discovery Agencies

In this section, we focus on a particular class of privacy issues arising in the WSA, namely those referring to discovery agencies. Discovery agencies provide a searchable set of service descriptions in centralized or distributed UDDI registries. Discovery agencies take service requestors' queries and then search for appropriate Web services to suit the specific requirements in the queries. In current practice, a UDDI entry is optional for Web services, in that the service provider can also send the service description directly to the service requestor. However, this usually occurs only after two business partners have agreed on terms of doing e-business over the Internet. For this reason, in the following we focus on the more general case in which a discovery agency acts as a service locator that helps to find appropriate Web services for requestors. The interaction between the service providers and the discovery agency is called a publish operation (see Figure 8). Next, Web services requestors find the appropriate Web services by querying the discovery agencies. In addition to a list of Web services, the discovery agencies return some value-added information, such as performance evaluations and predictions. This process is usually called matchmaking (Zhang & Zhang, 2002). Once a suitable Web service has been selected, the Web services requestor gets the corresponding WSDL document and tries to bind with the Web service via SOAP. In this scenario, there may be several privacy concerns (see Figure 8), which we classify according to the publish-find-bind model. Let us first consider the publish operation. In this case, the service provider may not want the discovery agency to access some of its personal information. For example, the Web services providers may have to provide some registration information to the discovery agencies.
Figure 8. Privacy concerns in Web services discovery agencies (Carminati, Ferrari, & Hung, 2005)
The registration information may contain some sensitive and confidential information for handling the business transactions, such as the trade-off model between quality and cost of service, which must be provided to the Web service requestor. Furthermore, the service provider may be an individual, who may not want to release identifiable information, such as mailing address, phone number, and social security number (SSN), to unauthorized or unaffiliated parties. Privacy concerns also exist in the find operation submitted to the discovery agencies. For example, the Web services requestors may want the discovery agencies to protect their privacy, such as their identities. Additionally, the service requestor may not want the discovery agencies to release the details of their queries, or even the query patterns, to any unauthorized or unaffiliated party. This is because the Web services requestors may be concerned that the discovery agencies could release the information to competitors or use it for other purposes such as marketing promotions. Finally, let us consider the bind operation. A Web service requestor may want to validate the privacy policies of business entities and services based on its privacy preferences (W3C, 2002c) before binding to the Web service. This means that the Web services requestors may only bind to those Web services whose privacy policies match their privacy constraints or preferences. From another point of view, the privacy policies defined in UDDI registries for specific business entities and services must be consistent with the privacy policies defined in the service descriptions (WSDL documents). Recently, the W3C P3P Beyond HTTP task force (W3C, 2003b) recommended associating a privacy policy with the UDDI entries as one of the technical approaches to tackle this concern. However, when optional associations are provided, the Web services providers must ensure that multiple associations do not conflict with each other in different UDDI entries. In addition, discovery agencies may delegate some tasks to other services. There also exist privacy concerns in this delegation and propagation, in that the Web services
providers and requestors are concerned not only about how the discovery agencies protect their confidential and sensitive information but also about whether the discovery agencies may delegate and propagate their information to other third parties without obtaining their consent. Based on all the scenarios just discussed, a first requirement is that the discovery agencies should have their own privacy policies governing the use of collected data, with the following two properties (W3C, 2003c):
• Identifying purposes: All the information collected from Web services providers and requestors will only be used for performing publish and find operations, respectively.
• Limiting use, disclosure, and retention: Providers' and requestors' information must not be used or disclosed for purposes other than performing the publish and find operations, respectively, for which it was collected, except with the consent of the subject or as required by law. Web services providers' and requestors' information must be retained only as long as necessary for the fulfillment of the publish and find operations, respectively.
Then, we need to devise suitable mechanisms for privacy enforcement. Some of them are described in Section viii.
Strategies for Privacy Enforcement in Web Services Discovery Agencies

With respect to the publish and find operations, the WSA can be considered as a third-party architecture, in that the owner of the information (i.e., the Web service providers) is distinct from the entities (i.e., the discovery agencies) responsible for managing information descriptions and for answering queries. In a third-party architecture, it is not always possible to adopt the traditional techniques developed for database protection (i.e., those relying on the existence of a trusted reference monitor), since they require the presence of a trusted party implementing the access control mechanism. Thus, we need to devise strategies for ensuring privacy in discovery agencies that do not always rely on the availability of a trusted third party. Moreover, it is not possible to devise a single solution for privacy enforcement that fits all environments, since the right solution depends on many factors, such as the trust in the discovery agency, the sensitivity of the data, and the trade-off between efficiency and the degree of privacy assurance. Therefore, in the following, we describe three different kinds of solutions for privacy enforcement, whose applicability depends on the above-mentioned characteristics (Carminati, Ferrari, & Hung, 2005).
Access Control-Based Solution

The first kind of proposed solution relies on the presence of a trusted party inside the WSA, in charge of managing an access control mechanism and specifying access control policies (see Background on Security Mechanisms). Access control mechanisms regulate the access to UDDI registries through a set of access control policies, ensuring privacy for both Web service requestors and providers. Such a trusted party can be either the discovery agency or a third party. Exploiting an access control-based solution implies that, when a Web services requestor submits a query to the discovery agency, the access control mechanism filters the query answer according to the specified access control policies, and possibly prunes some portions of the answer if the requesting subject does not have proper authorizations for it. A key component of this solution (see Figure 9, solution 1) is the availability of an access control model according to which access control policies can be specified. An access control policy states which Web services consumers/providers can access which UDDI entries (or portions of them) and under which access mode and conditions. To this purpose, several access control models have been proposed (see Background on Security Mechanisms). Also, some XML-based languages for encoding authorization rules are available today (OASIS, 2003b; Bertino, Carminati, & Ferrari, 2001). The choice of the right access control model and language mainly depends on the sensitivity of the information in UDDI registries and the kinds of privacy constraints we would like to enforce. Therefore, in the following, we use an abstract notation for authorization rules, which does not make any assumption on the policy language and the underlying model.
Figure 9. Three solutions for enforcing privacy (Carminati, Ferrari, & Hung, 2005): solution 1, access control mechanism; solution 2, encryption mechanism; solution 3, hash mechanism
More precisely, an authorization rule is represented as a tuple (subj_spec, obj_spec, acc_mode, constraints), where subj_spec denotes the subjects to which the rule applies (e.g., user IDs, roles, conditions on user credentials); obj_spec denotes the protection objects to which the rule applies (e.g., a whole UDDI registry, portions of it, registration information); acc_mode is the access mode granted by the rule; and constraints are the conditions under which the rule applies. With this solution, it is thus possible to address some of the privacy issues presented in Section vi by simply stating the right access control policies. For instance, consider the privacy concerns related to the publish operation to the discovery agencies, that is, the need to ensure the privacy of the registration information submitted by Web services providers to UDDI registries. According to the access control-based solution, to prevent the discovery agency from releasing this information to competitors, the trusted party specifies an access control policy stating that the registration information of a Web service must be made available only to the Web service provider that has published it. Another interesting use of the access control-based solution is the possibility of enforcing Web services requestors' privacy preferences. Indeed, as remarked in Section vi, a user may want the discovery agency to return only those services validating his/her privacy preferences. This can be easily obtained if the third party defines an access control policy that allows a Web services requestor to access a UDDI entry only if its privacy preferences are validated by the corresponding Web service privacy practices.
Example 1

The following are examples of authorization rules enforcing the above-mentioned privacy requirements (for simplicity, we assume that all authorization rules refer to the same discovery agency):

R1 = (service_req1, business_entity, read, match_privacy_preferences)
R2 = (service_provider, registration_info, read, registration_info.id = service_provider.id)

The first authorization rule authorizes the service requestor service_req1 to see the business entities only of those services whose privacy policies match its privacy preferences. By contrast, the second rule makes registration information available only to the service provider to which it refers. Note that the overhead required to implement the access control-based solution is similar to the one we have in conventional database management systems (DBMSs) for enforcing access control. Similar to the proposed solution, in a standard DBMS each query is intercepted by the reference monitor, which verifies whether the access request can be authorized or not, according to the specified access control rules. In relational DBMSs, access rules are stored in system catalogues (i.e., relational tables). Therefore, access control is very efficient, since it can be performed by issuing a few SQL queries on the system catalogues.
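To make the abstract notation more concrete, a rule such as R2 could be encoded in an XML-based authorization language like XACML (OASIS, 2003b). The fragment below is only a schematic sketch: it follows the general Rule/Target/Condition structure of XACML, but the subject, resource, action, and condition expressions are abbreviated and would need to be spelled out according to the concrete XACML schema adopted:

<!-- Schematic XACML-style encoding of R2 (abbreviated; not schema-valid as written) -->
<Rule RuleId="R2" Effect="Permit">
  <Target>
    <!-- subj_spec: service providers registered with the discovery agency -->
    <Subjects><Subject>service_provider</Subject></Subjects>
    <!-- obj_spec: the registration information stored in the UDDI registry -->
    <Resources><Resource>registration_info</Resource></Resources>
    <!-- acc_mode: read -->
    <Actions><Action>read</Action></Actions>
  </Target>
  <!-- constraints: a provider may only read its own registration information -->
  <Condition>registration_info.id = service_provider.id</Condition>
</Rule>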
Cryptographic-Based Solution
Under the access control-based solution, the data contained in the UDDI registry are published in clear text; this solution therefore also requires a certain degree of trust in the discovery agency, namely that it does not release unauthorized information to other parties. To relax this assumption, we propose an alternative solution that still relies on a trusted party but does not require discovery agencies to be trusted. To prevent discovery agencies from maliciously using the data they manage, we insert an additional module whose goal is to make such information unusable by the discovery agencies. More precisely, we propose a solution similar to the one in Carminati, Ferrari, and Hung (2005). This solution relies on an encryption module (see Figure 9, solution 2), which encrypts different portions of the same UDDI entry with different encryption keys, according to the specified access control policies, and then publishes the encrypted copy of the entry to the UDDI registry. When a Web services provider publishes its service descriptions, the access control module marks such data with the applicable access control policies, and then the encryption module encrypts it with one or more keys, depending on the result of the marking. Finally, the encrypted copy of the UDDI entry is submitted to the UDDI registry. In such a scenario, one can assume that the trusted party carries out the task of key management, supplying the right keys to the right Web services requestors according to the specified access control policies.

When a Web services requestor needs to perform a query, it submits an encrypted query, that is, a query stating the conditions of the search in an encrypted form. This has the further benefit of preventing the discovery agency from tracing the requestor's queries. Clearly, the discovery agency must be able to evaluate queries over encrypted data without having the decryption keys. To this purpose, a method similar to the one proposed in Carminati, Ferrari, and Hung (2005) can be adopted.
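As a rough sketch of the encryption module's role, the following encrypts each portion of a UDDI entry under a key determined by the set of access control policies marked on it. The Fernet cipher from the Python cryptography package and the dictionary-based entry layout are assumptions made for illustration; the chapter does not prescribe a particular cipher or data structure.

```python
# Sketch only: cipher choice and entry layout are illustrative assumptions.
from cryptography.fernet import Fernet

def encrypt_entry(entry_portions: dict, portion_policies: dict):
    """Encrypt each portion of a UDDI entry under a key chosen by its policy marking.

    entry_portions:   portion name -> plaintext string
    portion_policies: portion name -> frozenset of applicable policy names
    Returns the encrypted entry and the key assigned to each distinct marking.
    """
    keys = {}        # one key per distinct set of applicable policies
    encrypted = {}
    for name, text in entry_portions.items():
        marking = portion_policies[name]
        key = keys.setdefault(marking, Fernet.generate_key())
        encrypted[name] = Fernet(key).encrypt(text.encode())
    return encrypted, keys

# The trusted party would then distribute each key only to the requestors whose
# credentials satisfy the corresponding access control policies.
```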
Example 2
Suppose that the service provider MyBank does not want its competitors to access the details of the new home banking service it offers to its clients, whereas such details can be seen by all the other customers accessing the UDDI registry. According to the cryptographic-based solution, the UDDI entry corresponding to MyBank is encrypted with two different keys, k1 and k2: k1 is used to encrypt the business service element corresponding to the new home banking service, whereas k2 is used to encrypt all the remaining portions of the business entity element associated with the considered service provider. MyBank's competitors will then receive only k2, whereas all the other requestors will receive both k1 and k2.

The performance overhead implied by the use of a cryptographic-based solution is mainly related to two factors: encryption generation and management, and the overhead implied by querying encrypted data. Let us first consider encryption generation and management. Clearly, the cost of encryption generation depends on the number of encryption keys used, which in turn depends strictly on the number of specified authorization rules, and on the size of the ciphered data. However, encryption
generation is done once, when the UDDI entry is submitted to the discovery agency, and therefore this cost does not greatly impact overall performance. By contrast, the cost of update management should be carefully considered. Indeed, each time an authorization rule or a portion of an UDDI entry is modified, the entry encryption may need to be updated. To limit the overhead implied by such operations, an incremental approach can be used (similar to the one presented in Carminati and Ferrari [2003]), which modifies only those portions of the encryption that are actually affected by the update, without regenerating the whole encryption from scratch each time an update occurs. As far as query processing is concerned, the development of efficient techniques for querying encrypted data is still an open research issue. However, some work has been done both in the context of relational (Hacigumus, Iyer, Li, & Mehrotra, 2002) and XML data (Carminati, Ferrari, & Hung, 2005), which can also be applied to UDDI registries. Finally, it is important to note that one of the main drawbacks of any encryption-based scheme is that it requires costly key management procedures (e.g., safe key storage, key recovery, and key delivery procedures), whose cost depends on the number of keys to be managed. Therefore, one of the key issues is that of devising a key generation method able to minimize the number of encryption keys that need to be managed. With the naïve solution, according to which a different key is associated with each different set of authorization rules applied to a portion of an UDDI entry, this number is, in the worst case, equal to 2^N_authrules, where N_authrules is the number of specified authorization rules. However, more sophisticated key assignment schemes can be devised (such as the one proposed in Bertino, Carminati, and Ferrari [2002]), which greatly reduce the number of keys that need to be managed. More precisely, according to the approach proposed in Bertino, Carminati, and Ferrari (2002), the number of keys to be generated is linear in the number of specified authorization rules.
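To see where the 2^N_authrules worst case comes from, here is a purely illustrative counting sketch. The rule names and the idea of one key per distinct rule set follow the naïve scheme described above; the scheme of Bertino, Carminati, and Ferrari (2002) itself is not reproduced here.

```python
# Hypothetical illustration of the naive key count (one key per distinct set of
# authorization rules applied to some portion of the UDDI entry).
from itertools import combinations

def naive_key_count(portion_rule_sets):
    """Keys needed = number of distinct rule sets occurring among the portions."""
    return len({frozenset(s) for s in portion_rule_sets})

rules = ["R1", "R2", "R3"]
# Worst case: the entry has a portion for every possible subset of the rules
# (including a portion governed by no rule at all), i.e., 2**3 = 8 keys.
worst_case = [set(c) for n in range(len(rules) + 1) for c in combinations(rules, n)]
print(naive_key_count(worst_case))  # 8
```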
Hash-Based Solution
Both solutions discussed so far are suitable for a scenario in which there exists a trusted third party managing the access control policies and the encryption keys. However, there are cases in which it is not possible to rely on this assumption. For this reason, an additional approach has been proposed in Carminati, Ferrari, and Hung (2005) (Figure 9, solution 3), which exploits hashing techniques and does not rely on a trusted third party. This approach addresses the privacy concerns of Web services requestors that do not want to release their query's details, or even its pattern, to any unauthorized or unaffiliated party. According to this approach, the Web services providers publish hashed service descriptions in an untrusted discovery agency. More precisely, the published version contains all the information regarding how to contact the Web services providers in clear text, whereas all the other information is hashed using a standard hash function. Thus, when a Web services requestor looks for a service with certain properties, it generates a query specifying all the conditions on the properties as hashed values and submits it to the untrusted discovery agency, which cannot infer the search criteria. The discovery agency is able to evaluate the hashed query over the hashed descriptions and to return to the service requestor the information for contacting the Web services that match its requirements, since this information is in clear
text. Having this information, the Web services requestor is then able to contact the Web services provider for further interactions.

The overhead implied by this solution is related to the generation of the hashed entries and to answering queries containing hash values. This overhead is, however, lower than that of the cryptographic-based solution, since the hashing of the UDDI entry and the query generation are not driven by authorization rules. The cost of these operations therefore depends mainly on the size of the input data and the selected hash function. Moreover, this approach incurs no key management overhead.
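A minimal sketch of the hash-based scheme, purely to illustrate the flow: SHA-256 as the standard hash function, dictionaries for the published entry, and exact-match query conditions are all assumptions of this example rather than choices made by the cited works.

```python
# Sketch only: hash function and record layout are illustrative assumptions.
import hashlib

def h(value: str) -> str:
    return hashlib.sha256(value.encode()).hexdigest()

def publish(description: dict, contact: dict) -> dict:
    """Provider side: hash every searchable property; keep contact data in clear text."""
    return {"properties": {h(k): h(v) for k, v in description.items()},
            "contact": contact}

def find(registry: list, conditions: dict) -> list:
    """Requestor side: conditions are hashed, so the agency never sees them in clear."""
    hashed = {h(k): h(v) for k, v in conditions.items()}
    return [entry["contact"] for entry in registry
            if all(entry["properties"].get(k) == v for k, v in hashed.items())]

registry = [publish({"category": "home_banking", "country": "IT"},
                    {"provider": "MyBank", "endpoint": "https://mybank.example/ws"})]
print(find(registry, {"category": "home_banking"}))
# [{'provider': 'MyBank', 'endpoint': 'https://mybank.example/ws'}]
```

The untrusted agency can match the hashed conditions against the hashed properties without learning either, and only the clear-text contact information is returned to the requestor.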
Proposed Solutions for the Five UDDI Scenarios In this section, we show how the solutions presented so far can be effectively applied to discovery agencies. The WSA (W3C, 2002b) defines five major types of UDDI registries, described in what follows. For each of them, we discuss the applicable solutions.
1. Internal Enterprise Application UDDI Registry
This UDDI registry is for a company's internal use. Therefore, all the entities accessing the UDDI registry (either as publishers or requestors) belong to the same organization in which the UDDI resides, and the UDDI registry is placed behind the firewall. In such a scenario, where all the Web services are well known and trusted within the organization, there is no need to apply cryptographic or hash-based solutions. By contrast, the access control-based solution is more appropriate, where the tasks of specifying access control policies and managing the access control module can be carried out by the organization itself. Applying the access control-based solution to the Internal UDDI scenario has a further benefit, in that the access control mechanism can also be used to enforce workflow rules within the organization, which usually express precedence relationships among the execution of Web services. This feature can be achieved by defining proper access control rules that limit the access of a Web services requestor to the UDDI entries of only those Web services that the requestor may use according to the business rules in place at the organization.
2. Portal UDDI Registry
A portal UDDI registry is preferred whenever there is a need to distinguish between the services offered to external partners and those for internal use. In this scenario, the UDDI registry is located in the service provider's environment, outside the firewall or in a demilitarized zone (DMZ) between firewalls. In this case, the UDDI registry manages Web services descriptions all belonging to the same organization, but it is also queried by external partners. Here too an access control-based solution can be used; however, the underlying access control model must support additional ways of qualifying the subjects to which a policy applies, with respect to traditional identity-based mechanisms,
since the identity of external requestors cannot always be easily verified. A possible solution is the use of an access control model based on subject credentials (Jones, Ching, & Winslett, 1995). According to such models, the subjects to which an access control policy applies are determined by exploiting the notion of credential, which represents, in this context, a set of Web services requestor properties to be used for access control purposes. Moreover, in this scenario the UDDI registry may not be located behind the firewall, and thus could be untrusted. In this case, a possible solution is to apply the cryptographic-based approach.
3. Partner Catalog UDDI Registry This type of UDDI registry publishes Web services descriptions to be used by a particular company. A partner catalog UDDI registry acts like a private UDDI registry that sits behind the firewall. This kind of private UDDI registry contains only approved, tested, and valid Web services descriptions from legitimate business partners. Therefore, all the considerations made for the internal UDDI registries are still valid for the partner catalogue UDDI registries.
4. E-Marketplace UDDI Registry E-Marketplace UDDI registries are used to publish service descriptions related to the Web services for which a service provider intends to compete for requestors’ business. These kinds of UDDI registries are usually hosted by an industrial consortium, with the goal of managing the descriptions of the Web services providers intending to integrate with other providers for requestor’s business purposes. In this scenario, applying the access control-based solution implies that the party hosting the UDDI registry must specify the access control policies and manage the access control module. Even in this case, a credential-based paradigm is the most appropriate one for performing access control. Moreover, similar to the Portal UDDI scenario, the UDDI registry could be untrusted. In that case, the cryptographic-based solution can be adopted.
5. UDDI Business Registry
Web services providers may also wish to publish to the UDDI Business Registry or to other public registries, so that they may be discovered by new potential business partners or service users. This scenario is characterized by the absence of a trusted third party or a trusted UDDI registry. Therefore, only the hash-based solution is applicable.

Table 1 summarizes the discussion so far by showing the applicability of each of the proposed solutions in all the considered UDDI scenarios. From Table 1, it is clear that the access control-based solution can be applied only in those scenarios where it is possible to identify a trusted third party. In addition, when the UDDI registry is not behind the firewall, the cryptographic solution must be applied together with the access control-based one.
Table 1. Applicability of the proposed solutions

UDDI scenario                                  | AC | Crypto | Hash
Internal Enterprise Application UDDI Registry  | P  | A-Nr   | A-Nr
Portal UDDI Registry                           | P  | P      | A-Nr
Partner Catalog UDDI Registry                  | P  | A-Nr   | A-Nr
E-Marketplace UDDI Registry                    | P  | P      | A-Nr
UDDI Business Registry                         | NA | NA     | P

P = preferred; A-Nr = applicable but not required; NA = not applicable
Future Trends
One of the potential adopters of such a privacy framework is the healthcare industry (also shown in Figure 2). With the increasing digitization of health information, such as Electronic Medical Records (EMR), one can expect the demand for privacy-enhancing technologies for healthcare applications, especially those based on Web services, to keep growing. In the context of Web services, the traditional view of an access control model for healthcare applications should be extended with an enterprise-wide privacy policy for the management and enforcement of individual privacy preferences (Powers, Ashley, & Schunter, 2002). For illustration, an access control model must be extended to accommodate the privacy rules of different countries, such as HIPAA in the U.S. According to the U.S. Department of Health and Human Services (HSS), HIPAA is a set of rules to be followed by health plans, doctors, hospitals, and other healthcare providers in the U.S. (HSS, 2004b). The HIPAA privacy rules create national standards to protect individuals' health information, and such standards must therefore also be supported in Web services. A conceptual layered architecture is consequently needed to facilitate the design and implementation of privacy act-compliant Web services-based applications.
Conclusions
In the past few years, Web services privacy issues have attracted more and more attention from industry and the research community. As the number of Web services-based business applications increases, the demand for privacy-enhancing technologies for Web services can be expected to increase as well. This chapter provided an overview of privacy issues in the WSA, surveyed related technologies, and proposed solutions for some of these issues. In particular, we have presented a privacy enforcement framework compliant with the privacy requirements defined in the "Web Services Architecture Requirements" document. Furthermore, we have proposed a suite of strategies for privacy enforcement in Web services discovery agencies.
References
Agarwal, S., Sprick, B., & Wortmann, S. (2004). Credential based access control for semantic Web services. In AAAI Spring Symposium — Semantic Web Services.
Bell, D., & LaPadula, L. (1975). Secure computer systems: Unified exposition and multics interpretation (ESD-TR-75-306). Hanscom Air Force Base, Bedford, MA.
Bennett, C. J. (1997). Arguments for the standardization of privacy protection policy: Canadian initiatives and American and international responses. Government Information Quarterly, 14(4), 351-362.
Bertino, E., Carminati, B., & Ferrari, E. (2002). A temporal key management scheme for secure broadcasting of XML documents. In ACM Conference on Computer and Communications Security.
Bertino, E., Castano, S., & Ferrari, E. (2001). On specifying security policies for Web documents with an XML-based language. In ACM Symposium on Access Control Models and Technologies (SACMAT).
Carminati, B., & Ferrari, E. (2003). Management of access control policies for XML document sources. International Journal of Information Security, 1(4), 236-260.
Carminati, B., Ferrari, E., & Bertino, E. (2005). Assuring security properties in third-party architectures. In Proceedings of the 21st International Conference on Data Engineering (ICDE 2005), Tokyo, Japan.
Carminati, B., Ferrari, E., & Hung, P. C. K. (2005, September/October). Exploring privacy issues in Web services discovery agencies. IEEE Security & Privacy Magazine.
Cheng, V. S. Y., & Hung, P. C. K. (2005). Health Insurance Portability and Accountability Act (HIPAA) compliant access control model for Web services. The International Journal of Health Information Systems and Informatics, 1(1).
DAML. (2003). DAML-S: Semantic markup for Web services, version 0.9. Retrieved from http://www.daml.org/services/daml-s/0.9/daml-s.html
Ferraiolo, D. F., Sandhu, R. S., Gavrila, S. I., Kuhn, D. R., & Chandramouli, R. (2001). Proposed NIST standard for role-based access control. ACM Transactions on Information and System Security (TISSEC), 4(3), 224-274.
Fischer-Hubner, S. (2001). IT-security and privacy. In Lecture notes in computer science 1958. Springer-Verlag.
France, F. (1996). Control and use of health information: A doctor's perspective. International Journal of Biomedical Computing, 43(1-2), 19-25.
Gollmann, D. (1999). Computer security. John Wiley & Sons.
Grijpink, J. (2001). Privacy law: Biometrics and privacy. Computer Law & Security Report, 17(3), 154-160.
Hacigumus, H., Iyer, B. R., Li, C., & Mehrotra, S. (2002). Executing SQL over encrypted data in the database service provider model. In Proceedings of the SIGMOD Conference.
HL7. (2004). HIPAA claims and attachments preparing for regulation. Retrieved from http://www.hl7.org/memonly/downloads/Attachment_Specifications/HIPAA_and_Claims_Attachments_White_Paper_20040518.pdf
HSS. (2004a). Medical privacy — National standards to protect the privacy of personal health information. Retrieved from http://www.hhs.gov/ocr/hipaa/
HSS. (2004b). Protecting personal health information in research: Understanding the HIPAA privacy rule. U.S. Department of Health & Human Services (HSS). Retrieved from http://privacyruleandresearch.nih.gov/pr_02.asp
Hung, P. C. K., & Chiu, D. K. W. (2003). Workflow-based information integration in a Web services environment. In Proceedings of the First International Conference on Web Services (ICWS'03), Las Vegas, Nevada, USA.
Hung, P. C. K., Ferrari, E., & Carminati, B. (2004). Towards standardized Web services privacy technologies. In Proceedings of the 2004 IEEE International Conference on Web Services (ICWS) (pp. 174-181).
IBM. (2002). Security in a Web services world: A proposed architecture and roadmap (White Paper, Version 1.0). Retrieved from http://www-106.ibm.com/developerworks/library/ws-secroad/
IBM. (2003). Enterprise Privacy Authorization Language (EPAL) (IBM Research Report). Retrieved from http://www.zurich.ibm.com/security/enterprise-privacy/epal
Jones, V. E., Ching, N., & Winslett, M. (1995). Credentials for privacy and interoperation. In Proceedings of the New Security Paradigms Workshop (pp. 92-100).
Lacroix, Z. (2002). Biological data integration: Wrapping data and tools. IEEE Transactions on Information Technology in Biomedicine, 6(2), 123-128.
Langheinrich, M. (2002). A privacy awareness system for ubiquitous computing environments. In Proceedings of the 4th International Conference on Ubiquitous Computing (UbiComp2002), Lecture Notes on Computer Science 2498 (pp. 237-245).
Lategan, F. A., & Olivier, M. S. (2002). A Chinese wall approach to privacy policies for the Web. In Proceedings of the 26th Annual International Computer Software and Applications Conference (COMPSAC'02).
Leino-Kilpi, H., Valimaki, M., Dassen, T., Gasull, M., Lemonidou, C., Scott, A., & Arndt, M. (2001). Privacy: A review of the literature. International Journal of Nursing Studies, 38, 663-671.
Mohen, C. (2002). Tutorial: Application servers and associated technologies. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD'02), Madison, USA.
OASIS. (2003a). eXtensible Access Control Markup Language (XACML). Retrieved from http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=xacml
OASIS. (2003b). eXtensible Access Control Markup Language (XACML 2.0). Retrieved from http://docs.oasis-open.org/xacml/access_control-xacml-2_0-core-spec-cd04.pdf
PHIPA. (2005). Personal Health Information Protection Act of 2004. Retrieved from http://www.e-laws.gov.on.ca/DBLaws/Statutes/English/04p03_e.htm#BK0
PIPEDA. (2005). Personal Information Protection and Electronic Documents Act of 2004. Retrieved from http://laws.justice.gc.ca/en/P-8.6/text.html
Powers, C. S., Ashley, P., & Schunter, M. (2002). Privacy promises, access control, and privacy management — Enforcing privacy throughout an enterprise by extending access control. In Proceedings of the Third International Symposium on Electronic Commerce (pp. 13-21).
Quantin, C., Allaert, F., & Dusserre, L. (2000). Anonymous statistical methods versus cryptographic methods in epidemiology. International Journal of Medical Informatics, 60(2), 177-183.
Rezgui, A., Ouzzani, M., Bouguettaya, A., & Medjahed, B. (2002). Preserving privacy in Web services. In Proceedings of the 4th International ACM Workshop on Web Information and Data Management, Virginia, USA (pp. 56-62).
Rezgui, A., Wen, Z., & Bouguettaya, A. (2002). Enforcing privacy in interoperable e-government applications. In dg.o 2002 NSF Conference.
Schoeman, E. D. (1984). Philosophical dimensions of privacy: An anthology. New York: Cambridge University Press.
Senicar, V., Jerman-Blazic, B., & Klobucar, T. (2003). Privacy-enhancing technologies — Approaches and development. Computer Standards & Interfaces, 25, 147-158.
Stallings, W. (2000). Network security essentials: Applications and standards. Prentice Hall.
Steinke, G. (2002). Data privacy approaches from US and EU perspectives. Telematics and Informatics, 19, 193-200.
Tan, J. K. H., & Hung, P. C. K. (2005). E-security: Framework for privacy and security in e-health data integration and aggregation. In E-health care information systems: An introduction for students and professionals (pp. 450-478). Jossey-Bass.
Tumer, A., Dogac, A., & Toroslu, H. (2003). Semantic based privacy framework for Web services. In Proceedings of the WWW'03 Workshop on E-Services and the Semantic Web (ESSW 03), Budapest, Hungary.
W3C. (2002a). A P3P Preference Exchange Language 1.0 (APPEL1.0) (World Wide Web Consortium [W3C] Working Draft). Retrieved from http://www.w3.org/TR/P3P-preferences/
W3C. (2002b). Web services architecture requirements (World Wide Web Consortium [W3C] Working Draft). Retrieved from http://www.w3.org/TR/2002/WD-wsa-reqs-20021114
W3C. (2002c). The Platform for Privacy Preferences 1.0 (P3P1.0) specification (World Wide Web Consortium [W3C] Recommendation). Retrieved from http://www.w3.org/TR/P3P/
W3C. (2003a). OWL Web ontology language. Web-Ontology (WebOnt) Working Group. Retrieved from http://www.w3.org/2001/sw/WebOnt
W3C. (2003b). P3P: Beyond HTTP (P3P Task Force Report). Retrieved from http://www.w3.org/P3P/2003/p3p-beyond-http/Overview.html
W3C. (2003c). P3P Beyond HTTP Task Force. Retrieved from http://www.w3.org/P3P/2003/03-binding.html
Webmethod. (2002). Enterprise Web services in the financial services industry — Driving new integration solutions (Webmethod Technical Document).
Winslett, M., Ching, N., Jones, V., & Slepchin, I. (1997). Using digital credentials on the World Wide Web. Journal of Computer Security, 5(3), 255-267.
Zhang, Z., & Zhang, C. (2002). An improvement to matchmaking algorithms for middle agents. In Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems (pp. 1340-1347).
Chapter III
The Impact of Information Technology in Healthcare Privacy Maria Yin Ling Fung, University of Auckland, New Zealand John Paynter, University of Auckland, New Zealand
Abstract The increased use of the Internet and latest information technologies such as wireless computing is revolutionizing the healthcare industry by improving services and reducing costs. The advances in technology help to empower individuals to understand and take charge of their healthcare needs. Patients can participate in healthcare processes, such as diagnosis and treatment, through secure electronic communication services. Patients can search healthcare information over the Internet and interact with physicians. The same advances in technology have also heightened privacy awareness. Privacy concerns include healthcare Web sites that do not practice the privacy policies they preach, computer break-ins, insider and hacker attacks, temporary and careless employees, virus attacks, human errors, system design faults, and social engineering. This chapter looks at medical privacy issues and how they are handled in the U.S. and New Zealand. A sample of 20 New Zealand health Web sites was investigated.
Introduction
Advances in information technology have increased the efficiency of providing healthcare services to patients. Using Web-based technology, the healthcare team can also include the patient, who must be an informed decision maker and active participant in his or her care. These same advances also improve the features, functions, and capabilities of electronic medical record systems and potentially increase the number of parties, namely hospitals, insurance companies, marketing agencies, pharmaceutical companies, and employers, that may have unauthorized access to private medical information. These systems are justifying themselves in terms of cost and life savings.

Access to mobile computing devices in the healthcare industry is also evolving. Wireless computing devices enable physicians, clinicians, and nurses to enter patient data at the point of care (Kimmel & Sensmeier, 2002). Disease management systems provide caregivers with information on the efficacy of drugs and treatments at various stages of a medical condition. Using bar-coding technology together with decision support, systems can ensure that patients receive the correct medication or treatment.

Healthcare organizations must manage a tremendous amount of information, from clinical test results, to financial data, to patient tracking information. While most healthcare organizations have policies and procedures in place to guarantee at least minimum levels of privacy protection, these are not core features of most technology systems in the healthcare industry. This is true despite the fact that unauthorized disclosure of an individual's private medical information can affect one's career, insurance status, and even reputation in the community. Without adequate privacy protection, individuals must take steps to protect themselves from what they consider harmful and intrusive uses of their health information, often at significant costs to their health.

Healthcare privacy is an increasingly complex legal and operational issue facing the healthcare industry. For example, in the areas of mental health, HIV, pharmaceuticals, and genetic information, issues of privacy and the appropriate use of health information have already shown themselves to be particularly sensitive. The public has also become increasingly conscious of privacy issues, such as protection of electronic medical records, commercial uses of health information, and insurer and employer access to patient-identifiable information. The increasing use of the Internet also brings a corresponding need for privacy awareness. The very nature of electronic records makes them more easily transportable and thus accessible.

Healthcare professionals face many challenges as they seek ways to deliver quality healthcare while maximizing efficiency and effectiveness and at the same time ensuring privacy. A substantial barrier to improving the quality of and access to healthcare is the lack of enforceable privacy rules. Individuals share a great deal of sensitive, personal information with their doctors. This information is then shared with others, such as insurance companies, pharmacies, researchers, and employers, for many reasons. Yet unlike other personal information, there is very little legal protection for medical records. This chapter focuses mainly on the impact that information technology has on healthcare privacy and the ways in which privacy can be achieved. We examine this in the context of the situation in the U.S.A.
and in New Zealand, which supposedly has the world's
strictest privacy legislation in the Privacy Act (1993). Comparisons to other countries are also made where information security technology has been applied in the medical domain.
What is Health Information? The American Health Information Management Association (AHIMA) (The American Health Information Management Association and the Medical Transcription Industry Alliance, 1998) defines health information as:
•	Clinical data captured during the process of diagnosis and treatment.
•	Epidemiological databases that aggregate data about a population.
•	Demographic data used to identify and communicate with and about an individual.
•	Financial data derived from the care process or aggregated for an organization or population.
•	Research data gathered as a part of care and used for research or gathered for specific research purposes in clinical trials.
•	Clinical data and observations taken by trainees in a teaching hospital.
•	Reference data that interacts with the care of the individual or with the healthcare delivery systems, like a formula, protocol, care plan, clinical alerts, or reminders.
•	Coded data that is translated into a standard nomenclature or classification so that it may be aggregated, analyzed, and compared.
AHIMA further states that healthcare information and data serve important functions, including:
•	Evaluation of the adequacy and appropriateness of patient care.
•	Use in making decisions regarding healthcare policies, delivery systems, funding, expansion, education, and research.
•	Support for insurance and benefit claims.
•	Assistance in protecting the legal interests of the patients, healthcare professionals, and healthcare facilities.
•	Identification of disease incidence to control outbreaks and improve public health.
•	Provision of case studies and epidemiological data for the education of health professionals.
•	Provision of data to expand the body of medical knowledge.
What is Healthcare Privacy?
Healthcare is a service industry that relies on information for every aspect of its delivery. Health information is important to patients, medical practitioners, healthcare professionals, and institutions, as well as to society as it directs the health of the population. It must be protected as a valuable asset, and in its primary form as the medical record of a unique individual, it must be safeguarded. Privacy of health information is a legitimate concern. Such concerns grow as technology makes it possible for confidential medical records and sensitive health information, such as mental illness, HIV, substance abuse, sexually transmitted disease, and genetic information, to be made available to employers, bankers, insurers, credit card companies, and government agencies for making decisions about hiring, firing, and loan approval, and for developing consumer marketing. The application of information technology to healthcare, especially the development of electronic medical records and the linking of clinical databases, has generated growing concerns regarding the privacy and security of health information. The security and integrity of electronic health data must be protected from unauthorized users. However, in the medical field, accessibility for certain authorized functions must overrule any other concerns; for example, when a doctor needs to access information about a patient in order to provide emergency treatment, it is imperative that the data become available without delay (Ateniese, Curtmola, de Medeiros, & Davis, 2003). While patients have a strong interest in preserving the privacy of their personal health information, they may also have an interest in medical research and other efforts by healthcare organizations to improve the quality of medical care they receive.
Categories of Healthcare Privacy In addition to technological revolutions, which are the main cause for privacy concerns, there are three distinct kinds of violations of health information privacy according to the congressional testimony of Janlori Goldman, director of the Health Privacy Project at Georgetown University (Starr, 1999):
•	Individual misappropriation of medical records;
•	Institutional practices — ambiguous harm to identifiable individuals; and
•	Institutional practices — unambiguous harm to identifiable individuals.
Individual Misappropriation of Medical Records Starr (1999) states that this category involves individuals who misuse medical data, often publicly disclosing sensitive information and typically violating both the policies of the institutions that kept the records and the laws of their state. It is by far the most common
type of violation of health information privacy that can be corrected by stronger penalties and more aggressive enforcement of privacy laws and policies. According to Health Privacy Project: Medical Privacy Stories (2003), examples include:
•	Following the rape accusations against basketball player Kobe Bryant, the alleged victim's medical records were subpoenaed by Bryant's defense lawyers from a Colorado hospital. After a hospital employee released the records to a judge, attorneys for the hospital asked that judge to throw out the subpoenas and destroy the records he had already received, citing state and federal medical privacy laws. Attorneys for the victim are also attempting to prevent Bryant's defense team from gaining access to her medical records from two other hospitals. However, a number of news stories have published sensitive medical information that reporters allege came from hospital employees (Miller, 2003).
•	A hospital clerk at Jackson Memorial Hospital in Miami, Florida stole the social security numbers of 16 patients named Theresa when they registered at the hospital. The hospital clerk then provided the social security numbers and medical record information to a friend, also named Theresa, who opened up over 200 bank and credit card accounts and bought six new cars (Sherman, 2002).
Institutional Practices: Ambiguous Harm to Identifiable Individuals
This category consists of the use of personal health data for marketing and other purposes where the harm to the individual is ambiguous or relatively small. For example, a chemist or pharmacist sells patient prescription records to a direct mail and pharmaceutical company for tracking customers who do not refill prescriptions, and for sending patients letters encouraging them to refill and consider alternative treatments. The problem is not so much that this harms the customers, who might even have appreciated the reminders; what worries them most is the hands into which such lists might fall. This may also raise the question of the merchandising of health data for purposes unrelated to those for which patients provided the original information.
Institutional Practices: Unambiguous Harm to Identifiable Individuals
This category consists of institutional practices that do cause harm to identifiable individuals. Unlike the other two categories, this one raises much more serious privacy issues and needs correction and reform. Starr stresses that the commingling of the insurance and employment functions in the United States has led to serious abuse of confidential medical information, and that the development of genetics has made possible a new and insidious form of discrimination. He recommends security measures such as encryption, the use of a universal health identifier, segmentation of medical records, and biometric identifiers for, and audit trails of, those accessing medical records (Starr, 1999).
Examples from Health Privacy Project: Medical Privacy Stories (2003) and Starr (1999) include:
•	Two hundred and six respondents in a survey reported discrimination as a result of access to genetic information, culminating in loss of employment and insurance coverage or ineligibility for benefits (Science and Engineering Ethics, 1996).
•	A survey found that 35% of Fortune 500 Companies look at people's medical records before making hiring and promotion decisions (Unpublished study, University of Illinois at Urbana-Champaign, 1996).
•	An Atlanta truck driver lost his job in early 1998 after his employer learned from his insurance company that he had sought treatment for a drinking problem (J. Appleby, "File safe? Health Records May Not Be Confidential," USA Today, March 23, 2000, p. A1).
Technological Changes Information technologies, such as the Internet and databases, have advanced substantially in the past few years. With the adoption of these new technologies, the healthcare industry is able to save billions of dollars in administrative and overhead costs. These savings can be used to discover new drugs or expand coverage for the uninsured. Through these new technologies, patient care will also be improved; for example, telemedicine allows medical specialists to “examine” and “treat” patients halfway around the world. Perhaps most importantly, information technologies help to empower individuals to understand and to take charge of their own healthcare needs. Patients become active participants in the healthcare process through secure electronic communication services. Wilson, Leitner, and Moussalli (2004) suggest that by putting the patient at the center of the diagnosis and treatment process, communication is more open, and there is more scope for feedback or complaint. This enhances and supports human rights in the delivery of healthcare.
The Internet and Patients The use of Internet in the healthcare industry involves confidential health information being developed and implemented electronically. There are already several applications available on the Internet for caregivers and patients to communicate and for the electronic storage of patient data. These applications include: electronic mail, online conversations and discussion lists (online chat and NetMeeting), information retrieval, and bulletin boards. Caregivers and patients use electronic mail and online chat to communicate. Patients can search the Web for information about symptoms, remedies, support groups, and health insurance rates. They can also obtain healthcare services, such as second opinions and medical consultations, and products, such as prescription drugs, online (Choy, Hudson, Pritts, & Goldman, 2002 ).
Patient databases are stored on the Internet, with some providers storing complete patient records in Internet-accessible sites. Patients can interact with databases to retrieve tailored health information (selection-based on personal profile, disease, or a particular need such as travel or cross-border healthcare) (Wilson et al., 2004). However, the availability of medical information in an electronic form (whether or not available over the Internet) raises privacy issues.
The Internet and Health Professionals Through the use of the Internet, health professionals will have the most up-to-date information available at the click of a mouse. Hospitals, clinics, laboratories, and medical offices will be digitally linked together, allowing for the quick, efficient transfer and exchange of information. The test results will be digitized, allowing for a speedy transfer from labs to hospitals while gaining back the valuable time lost in physical transport. For example, Telehealth in Canada will make geography disappear on a large scale (Siman, 1999). It is a new initiative that significantly improves health services, particularly to remote and rural areas. It also allows physicians to do a complete physical examination of the patient via a digital link. Diagnosis can be made over long-distance telephone lines, rather than after long-distance travel, thus saving the patient the strain and cost of travel. Physicians and other caregivers may use the Internet to discuss unusual cases and obtain advice from others with expertise in treating a particular disease or condition (Siman, 1999).
The Internet and Health-Related Activities The Internet can support numerous health-related activities beyond the direct provision of care. By supporting financial and administrative transactions, public health surveillance, professional education, and biomedical research, the Internet can streamline the administrative overhead associated with healthcare, improve the health of the nation’s population, and lead to new insight into the nature of disease. In each of these domains, specific applications can be envisioned in which the Internet is used to transfer text, graphics, and video files (and even voice); control remote medical or experimental equipment; search for needed information; and support collaboration, in real time, among members of the health community (Committee on Enhancing the Internet for Health Applications: Technical requirements and implementation strategies, 2000). For example, the Internet could do the following (Committee on Enhancing the Internet for Health Applications: Technical requirements and implementation strategies, 2000):
•	Enable consumers to access their health records, enter data or information on symptoms, and receive computer-generated suggestions for improving health and reducing risk;
•	Allow emergency room physicians to identify an unconscious patient and download the patient's medical record from a hospital across town;
•	Enable homebound patients to consult with care providers over real-time video connections from home, using medical devices capable of transmitting information over the Internet;
•	Support teams of specialists from across the country who wish to plan particularly challenging surgical procedures by manipulating shared three-dimensional images and simulating different operative approaches;
•	Allow a health plan to provide instantaneous approval for a referral to a specialist and to schedule an appointment electronically;
•	Enable public health officials to detect potential contamination of the public water supply by analyzing data on nonprescription sales of antidiarrheal remedies in local pharmacies;
•	Help medical students and practitioners access, from the examining room, clinical information regarding symptoms they have never before encountered; and
•	Permit biomedical researchers at a local university to create three-dimensional images of a biological structure using an electron microscope 1,000 miles away.
Also called "Medicine of the Millennium," telemedicine connects geographically separate healthcare facilities via telecommunications, video, and information systems. The purpose of telemedicine is remote clinical diagnosis and treatment, remote continuing medical education, and access to central data repositories for electronic patient records, test requests, and care outcomes.

However, the increasing use of the Internet brings a corresponding need for privacy awareness. The very nature of electronic records makes them more easily transportable and, thus, accessible. Privacy on the Internet is becoming more and more of a concern as confidential information transmitted via the Internet may be intercepted and read by unauthorized persons. Some commonly used Internet protocols may allow information to be altered or deleted without this being evident to either the sender or receiver. Patients may be totally unaware that their personally identifiable health information is being maintained or transmitted via the Internet, and worse still, they may be subject to discrimination, embarrassment, or other harm if unauthorized individuals access this confidential information. While technology can and should be used to enhance privacy, it can also be used to undermine privacy.
Why are There Healthcare Privacy Concerns?
Undoubtedly, the Internet is a valuable tool for improving healthcare because of its ability to reach millions of Internet users at little or no additional cost and its absence of geographic and national boundaries. Unfortunately, the Internet is also an ideal tool for
the commission of fraud and other online crime. Examples of such fraud include healthcare scams such as the selling of misbranded and adulterated drugs, and bogus miracle cures.

Many of the bigger healthcare Web sites collect information by inviting users to create a personalized Web page where they can acquire medical information tailored specifically to their age, gender, medical history, diet, weight, and other factors. Some sites offer alerts on special medical conditions, health and fitness quizzes, and even the opportunity to store one's own medical records and prescriptions online in case of emergency (Medical privacy malpractice: Think before you reveal your medical history, 2001). Other Web sites collect information using cookies. Cookies are small pieces of data stored by the user's Internet browser on the user's computer hard drive. Cookies are sent by the user's browser to the Web server when the user revisits the same Web site. Hence information such as the number of visits, average time spent, pages viewed, and the user's e-mail address can be collected and used to recognize the user on subsequent visits and provide tailored offerings or content.

The California HealthCare Foundation recently examined the privacy policies and practices of 21 popular health sites, including DrKoop.com, Drugstore.com, and WebMD.com (Medical privacy malpractice: Think before you reveal your medical history, 2001). They found that visitors to the sites are not anonymous, and that many leading health Web sites do not practice the privacy policies they preach. In some cases, third-party ad networks run banner ads on the sites that collect information and build detailed profiles of each individual's health conditions.

In New Zealand, no such survey had previously been published. In order to examine the privacy policies and practices of New Zealand health sites, 20 medical-related Web sites were chosen from the electronic yellow pages (www.yellowpages.com.nz) and studied for the purpose of this chapter. At this site, the individual listings are arranged such that those that have Web sites appear first. Only unique sites were examined; those with multiple listings or branches were ignored, as were those with only a simple banner ad in the electronic yellow pages. Of these 20 Web sites, three were medical insurers or offered a medical insurance policy as one of their services; one was for health professionals to use to support traveling patients; and the rest were medical clinics and hospitals. The results show that all but one of these Web sites collected personal information, but only one had a privacy statement, and it was very obscure; three used cookies; and none mentioned the purpose for which the information was collected. New Zealand Information Privacy Principle 3 requires that a well-expressed Web site should have a privacy statement. A privacy statement tells consumers that their privacy rights are being considered (Wiles, 1998). The Web sites studied all failed to meet such a requirement. The results of the above studies indicate that healthcare privacy concerns are not just problems in New Zealand, but universal ones.
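To make the cookie mechanism described earlier in this section concrete, here is a minimal sketch of a server that recognizes a returning visitor. Flask, the cookie name, and the visit counter are assumptions made for illustration; they do not describe what any particular health site actually does.

```python
# Illustrative sketch only: framework, cookie name, and counter are assumptions.
from flask import Flask, request, make_response

app = Flask(__name__)

@app.route("/")
def index():
    # Read back the cookie the browser sends on a repeat visit.
    visits = int(request.cookies.get("visits", "0")) + 1
    resp = make_response(f"You have visited this health portal {visits} time(s).")
    # Store the updated count in the user's browser for up to a year; a real site
    # could just as easily store an identifier used to build a profile of the visitor.
    resp.set_cookie("visits", str(visits), max_age=60 * 60 * 24 * 365)
    return resp

if __name__ == "__main__":
    app.run()
```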
According to Anderson (1996), many medical records can be easily obtained by private detectives, who typically telephone a general practice, family health services authority, or hospital and pretend to be the secretary of a doctor giving emergency treatment to the person who is the subject of the investigation.

Although privacy is a concern because electronic information is vulnerable to hackers and system errors that can expose patients' most intimate data, the most persistent risk to
security and privacy is through the people who have authorized access, much more so than the hackers or inadvertent system errors. As medical information systems are deployed more widely and made more interconnected, security violations are expected to increase in number.
What are the Concerns?
The American Health Information Management Association (1998) estimates that when a patient enters a hospital, roughly 150 people have legitimate access to that person's medical record, including food workers, pharmacists, lab technicians, and nursing staff, each with a specific authority to view the components of the record necessary for their job and each with a unique ability to act within the system. The increasing use of the Internet in the healthcare industry has also heightened privacy concerns. The CERT Coordination Center at Carnegie Mellon University, a national resource for collecting information about Internet security problems and disseminating solutions (Committee on Enhancing the Internet for Health Applications: Technical requirements and implementation strategies, 1997), lists seven general areas of vulnerability:
•	Compromised system administrator privileges on systems that are unpatched or running old OS versions.
•	Compromised user-level accounts that are used to gain further access.
•	Packet sniffers and Trojan horse programs.
•	Spoofing attacks, in which attackers alter the address from which their messages seem to originate.
•	Software piracy.
•	Denial of Service.
•	Network File System and Network Information System attacks and automated tools to scan for vulnerabilities.
In addition to the above vulnerabilities, other concerns are:
•	To whom should organizations be allowed to disclose personal health information with and without patient consent? Under what conditions may such disclosures be made?
•	What steps must organizations take to protect personal health information from loss, unauthorized editing, or mischief?
•	What types of security technologies and administrative policies will be considered sufficient protection?
Additional Common Threats and Attacks A threat is any of the capabilities, intentions, and attack methods of adversaries to exploit or cause harm to information or a system. Threats are further defined as being passive (monitoring but no alteration of data) and active (deliberate alteration of information). King, Dalton, and Osmanoglu (2001) define four common threat consequences and the sources of threats in the following sections:
•	Disclosure — If information or data is revealed to unauthorized persons (breach of confidentiality).
•	Deception — If corporate information is altered in an unauthorized manner (system or data integrity violation).
•	Disruption — If corporate resources are rendered unusable or unavailable to authorized users (denial of service).
•	Usurpation — If the corporate resources are misused by unauthorized persons (violation of authorization).
Temporary or Careless Employees
Electronic health records stored at healthcare organizations are vulnerable to internal as well as external threats. Even with the protection of firewalls, careless employees, temporary employees, or disgruntled former employees cause far more problems than do hackers. Because a company's employees have tremendous access to the company's resources, it is possible for the computer system to be compromised internally as well as by third parties. For example, an employee attaches a database of 50,000 names to an e-mail and sends it to a business partner who is working on a marketing campaign at another company. It is then very likely that the data could be intercepted or harvested by a third party and used for improper or unauthorized purposes (Silverman, 2002).
Human Errors and Design Faults
A serious threat to the confidentiality of personal health information in hospitals and health authorities is the poor design and lax administration of access controls (Anderson, 1996). In many hospitals, all users may access all records; it is also common for users to share passwords or to leave a terminal permanently logged on for the use of everyone in a ward. This causes a breakdown of clinical and medico-legal accountability and may lead to direct harm. Other design errors include improperly installing and managing equipment or software, accidentally erasing files, updating the wrong file, or neglecting to change a password or back up a hard disk.
Insiders
Another source of threat comes from trusted personnel (the insiders) who engage in unauthorized activities (copying, stealing, or sabotaging information while their actions remain undetected) or in activities that exceed their authority (abusing their access). The insiders may disable the network operation or otherwise violate safeguards through actions that require no special authorization.
Crackers, Hackers, and Other Intruders
While internal threats consist of authorized system users who abuse their privileges by accessing information for inappropriate reasons or uses, external threats consist of outsiders who are not authorized to use an information system or access its data, but who nevertheless attempt to access or manipulate data or to render the system inoperable. Computer break-ins have occurred in the healthcare industry. The Health Care Privacy Project, a non-profit corporation in Washington DC, reported that a hacker found a Web page used by the Drexel University College of Medicine in Pennsylvania that linked to a database of 5,500 records of neurosurgical patients (Health Privacy Project: Medical privacy stories, 2003). The records included patient addresses, telephone numbers, and detailed information about diseases and treatments. After finding the database through the search engine Google, the hacker was able to access the information by typing in identical usernames and passwords. Drexel University shut down its database upon learning of the vulnerability, and a university spokeswoman stated that officials had been unaware that the database was available online, as it was not a sanctioned university site. The "2002 Computer Crime and Security" survey conducted by the Computer Security Institute (CSI) with the participation of the San Francisco Federal Bureau of Investigation's (FBI) Computer Intrusion Squad found that the threat from computer crime and other information security breaches continues unabated and that the threat from within the organization is far greater than the threat from outside the organization. More respondents cited their Internet connection as a frequent point of attack (74%) than cited their internal systems (33%); 28% suffered unauthorized access or misuse on their Web sites within the last twelve months; 21% said that they did not know if there had been unauthorized access or misuse; 55% reported denial of service; and 12% reported theft of transaction information (Cyber crime bleeds U.S. corporations, survey shows; Financial losses from attacks climb for third year in a row, 2002).
Social Engineering
According to King et al. (2001), a social engineering attack involves impersonating an employee with known authority, either in person (disguised) or by using an electronic means of communication (e-mail, fax, or the telephone). For example, an attacker places a phone call to the system administrator claiming to be a corporate executive who has lost the modem pool number and forgotten the password. In a hospital, an outsider places a phone call to an authorized insider, pretending to be a physician in legitimate need of medical information.
Information Warfare
A RAND Corporation study of information warfare scenarios in 1995 suggested that terrorists using hacker technologies could wreak havoc in computer-based systems underlying emergency telephone services, electric power distribution networks, banking and securities systems, train services, pipeline systems, information broadcast channels, and other parts of our information infrastructure (Committee on Enhancing the Internet for Health Applications: Technical requirements and implementation strategies, 2000). Although the above examples do not specifically describe threats to healthcare organizations, they do indicate the growing vulnerability of information systems connected to public infrastructure such as the Internet. As such, the drive for increased use of electronic health information linked together by modern networking technologies could expose sensitive health information to a variety of threats that will need to be appropriately addressed.
Healthcare Privacy Concerns in the United States
According to Ball, Weaver, and Kiel (2004), a national survey of e-health behavior in the U.S. found that 75% of people are concerned about health Web sites sharing information without their permission and that a significant percentage do not and will not engage in certain health-related activities online because of privacy and security concerns. For example, 40% will not give a doctor online access to their medical records; 25% will not buy or refill prescriptions online; and 16% will not register at Web sites. However, nearly 80% said that a privacy policy enabling them to make choices about whether and how their information is shared would make them more willing to use the Internet for their private health information. A Pew report (2005) documented that 89% of health seekers were concerned about privacy issues, with fully 71% very concerned. When people were made aware of the possibility of the issuance of universal medical ID numbers, a Gallup poll found that 91% opposed the plan; 96% opposed the placement on the Web of information about themselves held by their own doctor (The Gallup Organisation, 2000). On the other hand, healthcare administrators are aware of security issues and have many safeguards in place. In a recent survey of healthcare information technology executives, participants ranked the protection of health data as their primary concern (Reid, 2004). Hospitals, for example, indicate that current security technologies in use include anti-virus software (100%), firewalls (96%), virtual private networks (83%), data encryption (65%), intrusion detection (60%), vulnerability assessment (57%), public key infrastructure (20%), and biometrics (10%). Virtually all respondents expected to use all these technologies to some degree during the next two years (The Gallup Organisation, 2000).
Recent evidence indicates that many medical organizations are lagging behind in their implementation of appropriate security systems. A study of 167 U.S. hospitals conducted by research firm HCPro found that 76% had not conducted an information security audit, and only half planned to do so by April 2001 (Johnson, 2001). Of the hospitals that had performed an audit, 51% said that they would need major improvements to, or a complete overhaul of, their security systems, and 49% claimed that they would have to significantly change or replace their security policies. Alarmingly, only 5% said they had an annual budget for HIPAA compliance. The inadequacy of some medical providers’ security systems was recently underscored by the hacking of the University of Washington Medical Center (UWMC) computers (Thomson Corporation and health data management, 2005). SecurityFocus.com reported that an intruder was able to break into the UWMC computers and view the name, address, and Social Security number and medical procedures of over 4,000 cardiology patients. Theoretically, the UWMC could face potential lawsuits by distressed patients.
Healthcare Privacy Concerns in New Zealand
A survey was conducted in 1998 to study the practice and plans in New Zealand for the collation and retention of health records about identifiable individuals, with particular reference to the implications for privacy arising from the increased use of National Health Index numbers (NHI) (Stevens, 1998). What is the NHI? The NHI provides a mechanism to uniquely identify healthcare users. It was developed to help protect personally identifying health data held on computer systems and to enable linkage between different information systems whilst still protecting privacy. The NHI database records contain information on each person to whom an NHI number has been allocated: their name, date of birth, date of death, address, gender, ethnicity (up to three entries allowed), residence status, and other names by which they may be known (Stevens, 1998). It does not, however, contain any clinical information, and its availability for research purposes tends to be limited chiefly to a peripheral role in cohort studies and clinical trials (Stevens, 1998). Alongside the NHI database is the Medical Warnings System (MWS) database, which can only be accessed via the individual's NHI number. The MWS is designed to warn healthcare providers of the presence of any known risk factors that may be important in making clinical decisions about individual patient care. The MWS database records contain individuals' NHI numbers, donor information (e.g., heart or kidney), contact details for next of kin (name, relationship, and phone number), medical warnings (typically allergies and drug sensitivities, classified as "danger" or "warning" or unverified "report"), medical condition alerts (such as diabetes), and summaries of healthcare events (so far these have been limited to hospital admissions, showing dates of admission and discharge, hospital, and diagnosis or procedure code). These two databases are maintained by the New Zealand Health Information Service (NZHIS), a division of the Ministry of Health formed in 1991. NZHIS is responsible for the collection, extraction, analysis, and distribution of health information.
In this survey, it is found that a statement on the Ministry Web site states that access to the MWS is “restricted solely for the use of providers in the context of caring for that individual.” However, it is estimated that there are some 20,000 people who have direct access to the MWS and a further 70,000 who potentially have access to it, so that in practice, the security of the system probably relies heavily upon the difficulty of getting a hold of the NHI for the individual subject of an unauthorized enquiry (Johnson, 2001). The survey further reveals that the same Web site document states, in respect of the NHI and MWS systems, “The Privacy Commissioner will be continuously involved in ensuring that the very highest possible standards of integrity and probity are maintained.” Yet NZHIS do not appear to have taken any steps either to check with the Privacy Commissioner before making that statement or, having made the statement without the Commissioner’s knowledge or agreement, to involve him/her at all in checking arrangements for operation of these databases. At the least, therefore, the statement is misleading in suggesting a form of endorsement by the Privacy Commissioner. During the survey, more than one doctor contacted admitted that they use a different name for transactions involving their own healthcare, because they do not trust the security of records held by hospitals, laboratories, and other healthcare agencies with which they deal. This implies that the more health records that are to be integrated, the more users that must be concerned about the possibility of security breaches in any one part of the larger system. This also implies that the functions of an information system can be subverted if it does not gain and keep the confidence of both users and subjects.
Who Has Access to the Healthcare Information?
There are a variety of organizations and individuals who have an interest in medical data, both within and outside of the healthcare industry. Usually access to the health information requires a patient's agreement by signing a "blanket waiver" or "general consent form" when the patient obtains medical care. Signing of such a waiver allows healthcare providers to release medical information to employers, insurance companies, medical practitioners, government agencies, court orders or legal proceedings, direct marketers, medical institutions, hospitals, and newsgroups/chat rooms on the Internet.
Employers
Employers have an interest in an employee's fitness to work and fitness to perform particular tasks such as flying airplanes, controlling air traffic, and driving trains, buses, trucks, and cars. Some self-insured businesses establish a fund to cover the insurance claims of employees, which requires employees' medical records to be open for inspection by employers instead of an insurance company.
Insurance Companies
Insurance companies seek to combat rising costs of care by using large amounts of patient data to judge the appropriateness of medical procedures. They may also have an interest in healthcare data about a person's injuries and illnesses in relation to medical claims. In New Zealand, the accident records of the Accident Rehabilitation and Compensation Insurance Corporation (ACC), which are used for calculating workplace premiums, are shared with healthcare organizations. For example, to be eligible for weekly compensation, an injured person must be (a) incapacitated through injuries and (b) an earner at the time of the incapacity. ACC obtains medical opinion to clarify incapacity. It also obtains information from Inland Revenue, employers, and accountants to satisfy the second criterion.
Medical Practitioners
Medical practitioners have an explicit statutory obligation to disclose information on patients who have a serious physical condition, notifiable disease, or impairment that the doctor knows is likely to result in significant danger to the public (Clarke, 1990). In some cases it may be important that sensitive health data be conveyed as part of the information provided with a referral, in particular if the patient has been diagnosed as HIV-positive.
Government Agencies
In the U.S., government agencies may request citizens' medical records to verify claims made through Medicare, MediCal, Social Security Disability, and Workers Compensation. In New Zealand, government agencies such as the Inland Revenue Department may share information with healthcare organizations and ACC for tax and benefits purposes.
Medical Institutions and Clinical Researchers
Medical institutions such as hospitals or individual physicians require health information for evaluation of quality of service. This evaluation is required for most hospitals to receive their licenses. Clinical researchers and epidemiologists need health information to answer questions about the effectiveness of specific therapies, patterns of health risks, behavioral risks, environmental hazards, or genetic predisposition for a disease or condition (e.g., birth defects).
Direct Marketers
Drug companies want to know who is taking which drug so that they can conduct post-marketing surveillance to develop marketing strategies. Direct marketers use health-screening tests to collect medical information and build up data banks for businesses promoting and selling products related to the information collected.
Court Orders/Legal Proceedings
In the U.S., medical records may be subpoenaed for court cases involving people who are in litigation, an administrative hearing, or a workers' compensation hearing. In the (less litigious) New Zealand context, this is more likely to involve the granting of powers of attorney to make decisions on medical matters for patients who are not capable of making such decisions.
Internet Service Providers/Users
The Internet is available for individuals to share information on specific diseases and health conditions. While the Web sites dispense a wide variety of information, there is no guarantee that information disclosed in any of these forums is confidential.
Mechanisms for Addressing Healthcare Privacy
Today, healthcare organizations are confronting the challenge of maintaining easy access to medical/clinical data while increasing data security. Technology is only part of the solution. Many healthcare organizations have deployed, to varying degrees, mechanisms to protect the security and integrity of their medical records, such as the use of strong enterprise-wide authentication, encryption, several levels of role-based access control, auditing trails, computer misuse detection systems, protection of external communications, and disaster protection through system architecture as well as through physically different locations. Among other strategies, databases are also used to address security and access control. One database will have consumer identification (ID) geographic information linked to an ID number. The second database will have actual clinical information indexed by patient ID number but no personal data (Ball et al., 2004). However, there are obstacles to the use of security technologies which are yet to be resolved.
Table 1. Functions of technological security tools
• Availability: Ensuring that information is accurate and up to date when needed
• Accountability: Ensuring that access to and use of information is based on a legitimate need and right to know
• Perimeter identification: Knowing and controlling the boundaries of trusted access to the information system, both physically and logically
• Controlling access: Ensuring that access is only to information essential to the performance of jobs, and limiting access beyond a legitimate need
• Comprehensibility and control: Ensuring that record owners, data stewards, and patients understand and have effective control over appropriate aspects of information privacy and access
Technological Solution
Technological security tools are essential components of modern distributed healthcare information systems. At the highest level, they serve five key functions, as seen in Table 1 (Committee on Enhancing the Internet for Health Applications: Technical requirements and implementation strategies, 1997): availability, accountability, perimeter identification, controlling access, and comprehensibility and control. However, these types of controls focus more on protecting information within healthcare provider institutions and do not address the problems of unrestricted exploitation of information (e.g., for data mining) after it has passed outside the provider institution to secondary players or to other stakeholders in the health information services industry (Committee on Enhancing the Internet for Health Applications: Technical requirements and implementation strategies, 1997).
Table 2. Key elements of a security policy
• Confidentiality: Ensuring that the message is not readable by unauthorized parties whilst in transit, by applying strong encryption tools such as Digital Certificates
• Integrity: Ensuring that the message is not damaged or altered whilst in transit, by using secure private networks and Digital Signatures
• Authenticity: Ensuring that the user is a trusted party, by using user ID/password and/or Digital Certificates
• Non-repudiation: Ensuring that the sender cannot claim the message is counterfeit, or deny sending and receiving it, by using secure private networks and Digital Signatures
• Auditing: Recording user connectivity and site access for audit purposes
• Accountability: Identifying clear responsibilities of organizations and individual users through compliance with legislation and security policies
In New Zealand, the Health Intranet, a communications infrastructure that allows health information to be exchanged between healthcare providers in a secure way, defines six key elements that any security policy must address (New Zealand Health Information Service, 2001): confidentiality, integrity, authenticity, non-repudiation, auditing, and accountability (see Table 2).
Security Architecture
The primary goal of a security architecture design in the healthcare industry is the protection of the healthcare provider's assets: hardware, software, network components, and information resources. The Health Care Financing Administration (HCFA) (CPRI toolkit: Managing information security in health care, 2005) suggests that technical protection measures are traditionally grouped into three high-level categories:
• Confidentiality measures provide the mechanism to ensure that the privacy of information is maintained. Mechanisms include encryption (e.g., virtual private networks, end-to-end, and link level encryption).
• Integrity measures enhance the reliability of information by guarding against unauthorized alteration. Protection measures include digital signatures and strong authentication using certificates provided through the Public Key Infrastructure (PKI) initiative (a signing sketch follows this list).
• Availability measures seek to ensure that information assets are accessible to internal and external users when needed and guard against "denial of service" attacks. Protection measures include firewalls and router filters for mitigating availability risks created by denial of service attacks.
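To make the integrity measure concrete, the sketch below signs a message and verifies it on receipt. It is only a minimal illustration under stated assumptions: it uses the open-source Python `cryptography` package, and it generates keys ad hoc, whereas a PKI would distribute the public key in a certificate issued by a trusted authority.

```python
# Minimal sketch of signing and verifying a message, assuming the
# open-source "cryptography" package. Keys are generated ad hoc here;
# in a PKI deployment the public key would come from a certificate.
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes

signing_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
verify_key = signing_key.public_key()

message = b"Lab result for patient record 12345: potassium 4.1 mmol/L"

# Sender signs the message with the private key.
signature = signing_key.sign(
    message,
    padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                salt_length=padding.PSS.MAX_LENGTH),
    hashes.SHA256(),
)

# Receiver verifies with the public key; any alteration of the message
# raises an InvalidSignature exception instead of passing silently.
verify_key.verify(
    signature,
    message,
    padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                salt_length=padding.PSS.MAX_LENGTH),
    hashes.SHA256(),
)
print("signature verified")
```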
While developing guidelines for the clinical security system for BMA (British Medical Association), Ross Anderson (1996) identified a few shortcomings of the NHS (UK National Health Services) wide network, which are useful for any security architectures to be built for the healthcare industry:
• The absence of an agreed common security policy enforced by all the systems that will connect to the network.
• The lack of confidence in the technical security measures such as firewalls.
• Many of the NHS wide network applications are unethical, in that they make personal health information available to an ever-growing number of administrators and others outside the control of both patient and clinician. Such availability contravenes the ethical principle that personal health information may be shared only with the patient's informed and voluntary consent. For example, the administrative registers will record patients' use of contraceptive and mental health services, while the NHS clearing system will handle contract claims for inpatient hospital treatment and contain a large amount of identifiable clinical information.
• Item of service and other information sent over existing electronic links between general practitioners and family health services authorities. While registration links are fairly innocuous, at least two suppliers are developing software for authorities that enables claims for items of service, prescriptions, and contract data to be pieced together into a "shadow" patient record that is outside clinical control (Advanced information system, Family Health Services computer unit, 1995; Data Logic product information at http://www.datlog.co.uk/).
Table 3 shows a typical security architecture, the components of which are based on the ten basic security services (physical security; firewalls; intrusion detection; access control; authentication; privacy and integrity (encryption); electronic signature/non-repudiation; virus protection; audit trail creation and analysis; and database security) identified by HCFA and on a list of application-specific baseline requirements for the healthcare industry proposed by King et al. (2001). Some of the components and guidelines are also adopted from Anderson's UK NHS model.
Table 3. Security architecture principles and guidelines

• Confidentiality
Security services: Encryption is required over all communications channels (e.g., Internet, ISP-based connections, dial-up, etc.). Confidential data must be kept encrypted on user laptops and workstations. Such information is to be disclosed only to named individuals on a need-to-know basis.
Mechanisms: Firewalls—use at connection to Internet and boundary points. Physical control—central office and Data Center continued physical security; integrated smart card access control. Encryption—application-specific, primarily DES-based and PKI-based key management; SSL. Database security—proprietary, DBMS-specific; DAC, PKI-enabled; RBAC (role-based access control) integrated.

• Integrity
Security services: Business unit managed change control is required. Field-level change history must be maintained. Rollback functionality is required.
Mechanisms: Encryption—application-specific, primarily DES-based and PKI-based key management; SSL (Secure Sockets Layer).

• Availability
Security services: Virus scanning and redundant, high-availability solutions are required. Strong system configuration, change control, and regular backup/restore processes are required.
Mechanisms: Virus prevention—workstation-based and server-based programs; signed applications.

• Identification and authentication
Security services: Strong authentication (encrypted username and password, token, certificate).
Mechanisms: Authentication—user ID and password-based with limited smart card pilots; private key-based with multi-factor identification.

• Authorization and access control
Security services: Authorization by business unit or function and detailed role-based access control are required.
Mechanisms: Access control—platform-specific access control lists; RBAC-based, centrally managed access.

• Non-repudiation
Security services: Strict change controls are required. Field-level file change history must be maintained. Digital signatures for the creator and the checker are required.
Mechanisms: Electronic signature—FIPS 140-1 digital signature; escrow for encryption keys (not signing keys).

• Auditing and monitoring
Security services: System-level auditing of user access, file changes, failed login attempts, and alarms.
Mechanisms: Audit trail creation and analysis—logs generated on a platform-specific basis; consistent log content, directive data reduction and analysis. Intrusion detection—automated monitoring of limited entry/exit points; pro-active with integrated action plan.

• Compliance with regulations
Security services: Compliance with legislation.
Mechanisms: For example, HIPAA in the U.S., the European Union Data Protection Directive, or the Health Information Privacy Code 1994 in New Zealand.
Encryption
An increasing number of health practitioners transfer patient health information by electronic mail across wide area networks, for example, using mailbox systems to transfer registration data and item of service claims to family health services authorities, links between general practitioners and hospitals for pathology reports, and Internet electronic mail to communicate with patients who require continuing management. Anderson (1996) suggests that the problem may be tackled using cryptography: encryption and digital signatures can protect personal health information against disclosure and alteration, whether accidental or malicious, while in transit through a network. Encryption is a tool for preventing attack and interception during transmission and storage of data, for assuring confidentiality and integrity of information, and for authenticating the asserted identity of individuals and computer systems, by rendering the data meaningless to anyone who does not know the "key." Information that has been cryptographically wrapped cannot be altered without detection; for example, the integrity of a health message is destroyed by removal of personal identifiers or by encryption of crucial pieces of the message. At the destination, the receiver decrypts the message using the same key (symmetrical encryption) or a complementary but different key (asymmetrical encryption) (New Zealand Health Information Service, 2001). Pretty Good Privacy (PGP) and GNU-PGP are commonly used third-party encryption software, available free for most common makes of computer.
There are two types of encryption systems: public-key encryption and private-key encryption. The most commonly used and secure private-key encryption system is the Data Encryption Standard (DES) algorithm developed by IBM in the 1970s, which is gradually being replaced by the newer and more efficient Advanced Encryption Standard (AES), chosen by the U.S. government after a long, open contest. According to Wayner (2002), the basic design of DES consists of two different and complementary actions: confusion and diffusion. Confusion consists of scrambling up a message or modifying it in some nonlinear way. Diffusion involves taking one part of the message and modifying another part so that each part of the final message depends on many other parts of the message. DES consists of 16 alternating rounds of confusion and diffusion. Public-key encryption is quite different from DES. The most popular public-key encryption system is the RSA algorithm, developed in the late 1970s, which uses two keys. If one key encrypts the data, then only the other key can decrypt it. Each person can create a pair of keys and publicize one of the pair, perhaps by listing it in some electronic phone book; the other key is kept secret. If someone wants to send you a message, they encrypt it with your public key; only the other key of the pair can decrypt that message, and only you have a copy of it. In a very abstract sense, the RSA algorithm works by arranging the set of all possible messages in a long loop in an abstract mathematical space (Wayner, 2002). Public key cryptography is the underlying means for authenticating users, securing information integrity, and protecting privacy. For example, New Zealand's North Health is planning to encrypt patients' NHI numbers and to deposit the information in a database, so that information about any individual can only be retrieved by means of the encrypted identifier. In wide area networks, both Secure Sockets Layer (SSL) encryption and IP security (IPSec) should be deployed to allow the continued evaluation of different modes of securing transactions across the Internet. SSL is used to transport encrypted messages on a communication channel so that no message can be "intercepted" or "faked"; it provides authentication through digital certificates and privacy through the use of encryption. The IPSec protocol, a standards-based method of providing privacy, integrity, and authenticity to information transferred across IP networks, provides IP network-layer encryption.
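As a concrete illustration of the two types working together, the sketch below encrypts a record field with a symmetric AES key and then wraps that key with an RSA public key, the hybrid pattern commonly used in practice. This is a minimal sketch assuming the open-source Python `cryptography` package; the function names and the sample record string are illustrative only and are not drawn from any system described in this chapter.

```python
# Minimal sketch of hybrid encryption (AES for the data, RSA for the key),
# assuming the open-source "cryptography" package. Illustrative only.
import os
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Receiver generates an RSA key pair; the public key may be published.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

def protect_record(plaintext: bytes, receiver_public_key):
    """Encrypt data with a fresh AES key, then wrap that key with RSA."""
    aes_key = AESGCM.generate_key(bit_length=256)
    nonce = os.urandom(12)
    ciphertext = AESGCM(aes_key).encrypt(nonce, plaintext, None)
    wrapped_key = receiver_public_key.encrypt(
        aes_key,
        padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                     algorithm=hashes.SHA256(), label=None),
    )
    return wrapped_key, nonce, ciphertext

def recover_record(wrapped_key, nonce, ciphertext, receiver_private_key):
    """Unwrap the AES key with the RSA private key, then decrypt the data."""
    aes_key = receiver_private_key.decrypt(
        wrapped_key,
        padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                     algorithm=hashes.SHA256(), label=None),
    )
    return AESGCM(aes_key).decrypt(nonce, ciphertext, None)

record = b"NHI ABC1234: penicillin allergy"
wrapped, nonce, ct = protect_record(record, public_key)
assert recover_record(wrapped, nonce, ct, private_key) == record
```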
Virtual Private Network (VPN)
Virtual private networks (VPNs) are standard secure links between companies and their resource users, which allow a company's local networks to be linked together without their traffic being exposed to eavesdropping. They can reduce the exposure to interception of international network traffic. With the increasing use of the Internet in the healthcare industry, VPNs play a significant role in securing privacy. VPNs use tunneling and advanced encryption to permit healthcare organizations to establish secure, end-to-end, private network connections over third party networks. Some practical applications include accessing and updating patient medical records, tele-consultation for medical and mental health patients, electronic transfer of medical images (x-ray, MRI, mammography, etc.), psychiatric consultations, distance learning, and data vaulting (ScreamingMedia, 1999).
The Hawaii Health Systems Corporation (HHSC) has created a Virtual Private Healthcare Network and Intranet solution that allows for collaboration between its 12 hospitals, 3,200 employees, and 5,000 partners located worldwide. By creating a sophisticated healthcare network that supports high speed, broadband data connectivity, doctors, specialists, and administrators can collaborate throughout the State of Hawaii just as if they were together at the same hospital. This scalable solution also allows existing and future partners, clients, and suppliers to connect to the HHSC network to collaborate and share data. By using a unique subscription profile concept, the network provides impenetrable security and allows for the free and secure flow of mission critical data (ScreamingMedia, 1999).
Firewalls When private networks carrying confidential data are connected to the Internet, firewalls must be utilized extensively to establish internal security boundaries for protecting the internal private network, computers, data, and other “electronic assets” from tampering or theft by outsiders. Firewalls are a collection of network computing devices such as routers, adaptive hubs, and filters working in tandem and configured to ensure that only expressly permitted packets of data may enter or exit a private network. Firewalls will screen all communication between the internal and external networks according to directives established by the organization. For example, Internet access to an internal patient data system should be entirely prohibited or limited only to those people authenticated by a password or security token (Committee on Enhancing the Internet for Health Applications: Technical requirements and implementation strategies, 2000). Communications security is also important. Some general practices have branch surgeries, and many hospitals have branch clinics, so the possibility of access via a dial up modem from branches is often raised (Anderson, 1996). In such cases, the main additional risk is that an outside hacker might dial up the main system and gain access by guessing a password. In order to avoid that, Anderson (1996) suggests that there should be no direct dial access to the main computer system. Instead, the main system should dial back the branches. Extra effort should also be made to educate users to choose passwords with care, and all incidents should be investigated diligently.
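A rough sketch of the "expressly permitted packets" idea follows: connection attempts are screened against an allow-list and everything else is dropped. The rule set, zone names, and hostnames are hypothetical and merely stand in for the directives an organization would configure on its actual firewall products.

```python
# Minimal sketch of default-deny, allow-list packet screening.
# Rules, zones, and hostnames are hypothetical examples.
from dataclasses import dataclass

@dataclass
class Packet:
    src_zone: str      # "internal" or "external"
    dst_host: str
    dst_port: int
    protocol: str      # "tcp" or "udp"

# Only expressly permitted traffic passes; everything else is dropped.
ALLOW_RULES = [
    {"src_zone": "internal", "dst_host": "mail.hospital.example", "dst_port": 443, "protocol": "tcp"},
    {"src_zone": "external", "dst_host": "www.hospital.example",  "dst_port": 443, "protocol": "tcp"},
]

def permitted(packet: Packet) -> bool:
    """Return True only if the packet matches an explicit allow rule."""
    for rule in ALLOW_RULES:
        if (packet.src_zone == rule["src_zone"]
                and packet.dst_host == rule["dst_host"]
                and packet.dst_port == rule["dst_port"]
                and packet.protocol == rule["protocol"]):
            return True
    return False   # default deny

# Direct external access to an internal patient data system is rejected.
print(permitted(Packet("external", "patient-db.hospital.internal", 1433, "tcp")))  # False
```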
Audit Trails and Intrusion Detection Monitoring
Transaction logs and audit trails are important, as changes to patient data can be closely monitored and traced. Audit trails record who made alterations to particular files and when. The use of audit trails is invaluable, as they can be used as evidence in a court of law. The HCFA information systems create audit logs that record, in a centralized repository, logon and logoff; instances where a role is authorized access or denied access; the individual acting in that role; the sensitivity level of the data or other asset accessed; and what type of access was performed or attempted (e.g., whether the nature of the requested action was to create, read, update, execute a program, or delete). Anderson (1996) suggests that periodic audits should be carried out, and from time to time these should include penetration tests. For example, a private detective might be paid to obtain the personal health information of a consenting patient. In this way, any channels that have developed to sell information on the black market may be identified and closed off. Intrusion detection is primarily a reactive function that responds as attacks are identified. HCFA recommends the use of intrusion detection software to monitor network and host-based assets and the employment of a computer emergency response team to report and respond when incidents occur.
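A minimal sketch of the kind of centralized audit entry described above follows. The logged fields mirror the HCFA list (who, when, acting role, sensitivity of the asset, type of access, and whether it was granted), but the file-based schema, field names, and sample identifiers are illustrative assumptions rather than any real product's format.

```python
# Minimal sketch of an audit-trail entry writer; the schema is illustrative only.
import json
import datetime

AUDIT_LOG = "audit_trail.jsonl"   # append-only; centrally collected in practice

def record_access(user_id: str, role: str, patient_id: str,
                  action: str, sensitivity: str, granted: bool) -> None:
    """Append one audit entry: who, when, acting role, asset sensitivity,
    what type of access was attempted, and whether it was authorized."""
    entry = {
        "timestamp": datetime.datetime.utcnow().isoformat() + "Z",
        "user_id": user_id,
        "role": role,
        "patient_id": patient_id,
        "action": action,          # e.g., create, read, update, execute, delete
        "sensitivity": sensitivity,
        "granted": granted,
    }
    with open(AUDIT_LOG, "a") as log:
        log.write(json.dumps(entry) + "\n")

record_access("jdoe", "ward_nurse", "NHI-ABC1234", "read", "high", True)
record_access("temp042", "clerk", "NHI-XYZ9876", "update", "high", False)
```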
Biometric Systems New technology called “biometric authentication” is being used to replace passwords and other security measures with digital recognition of fingerprints or other unique attributes. Biometrics uses individual physiological (finger-scan, iris scan, hand-scan, and retina-scan) or behavioral characteristics (voice and signature scans) to determine or verify identity. The most commonly used is the physiological biometrics. Because biometric security is based on a unique feature of an individual’s body, for instance, a fingerprint, it is very difficult to copy, steal, or replicate this information (The Independent Research Group, 2002). Iris-scan is very suitable for use by healthcare institutions. Iris-scan can verify or identify a person based on the unique characteristics of the human iris. The strengths of iris-scan include its high resistance to false matching, the stability of the iris over time, and the ability to use this biometric to provide access to healthcare information or entry into physically secure locations, such as a medical record-keeping or information technology department. A study done in Albuquerque, New Mexico indicates that the most effective technologies currently available for identification verification (i.e., verifying the claimed identity of an individual who has presented a magnetic stripe card, smart card, or PIN) are systems based on retinal, iris, or hand geometry patterns (Stevens, 1998). On the other hand, single-sign-on technology enables users to log on via user IDs and passwords, biometrics, or other means to gain immediate access to all information systems linked to a network (Clarke, 1990). Single sign-on (SSO) is the capability to authenticate to a given system/application once, and then all participating systems/applications will not require another authentication (King et al., 2001). Both technologies are designed to provide increased security in an unobtrusive manner (Clarke, 1990). St. Vincent Hospital and Health Care Services, Indianapolis had implemented a combined biometric and singlesign-on system in one of its acute care departments using different types of biometric readers to identify physicians and nurses.
Smart Cards
Internet commerce interests are pushing forward aggressively on standards for developing and deploying token-based cryptographic authentication and authorization systems (e.g., the Mastercard-Visa consortium and CyberCash Inc.) (Siman, 1999). A smart card token is a smart card about the size of a credit card with a liquid crystal display on which a number appears that changes every minute or so. Each user's card generates a unique sequence of numbers over time, and a host that shares the secret algorithm can generate the corresponding sequence for a user who has been assigned access privileges. The number can be used as a session password. The write-controlled internal memory supports services such as user-specific information storage, authentication, and cryptographic certificate management. Some cards even have biometric access control features. Employees and appropriate contractors will be issued smart cards or tokens that store a private key and other essential authentication information.
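The changing number on such a token can be illustrated as a code derived from a secret shared between the card and the host, recomputed every minute. The sketch below shows only the general idea; commercial tokens use their own, often proprietary, algorithms, and the shared secret here is a placeholder.

```python
# Minimal sketch of a time-based one-time code derived from a shared secret.
# Illustrative only; real tokens use vendor-specific schemes.
import hmac
import hashlib
import struct
import time

SHARED_SECRET = b"per-card secret provisioned at issue time"   # placeholder

def token_code(secret: bytes, at=None, step: int = 60) -> str:
    """Derive a 6-digit code that changes every `step` seconds."""
    counter = int((time.time() if at is None else at) // step)
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return f"{value % 1_000_000:06d}"

# Card and host compute the same sequence, so the code serves as a session password.
now = time.time()
assert token_code(SHARED_SECRET, now) == token_code(SHARED_SECRET, now)
print(token_code(SHARED_SECRET))
```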
Access Control
A serious threat to the confidentiality of personal health information in hospitals and health authorities is the poor design and lax administration of access controls (Anderson, 1996). Anderson stresses that, in particular, the introduction of networking may turn local vulnerabilities into global ones: if systems with ineffective access controls are connected together in a network, then instead of the data being available merely to all staff in the hospital, they might become available to everyone on the network. However, access controls must also be harmonized among networked systems, or moving information from one system to another could result in leaks. The solution for this is to have a common security policy that clearly states who may access what records and under what circumstances. Anderson emphasizes that the following are important to the implementation of effective access controls:
• A senior person such as a hospital manager or partner in general practice must be responsible for security, especially if routine administration is delegated to junior staff. Many security failures result from delegating responsibility to people without the authority to insist on good practice.
• The mechanisms for identifying and authenticating users should be managed carefully. For example, users should be educated to pick passwords that are hard to guess and to change them regularly, and terminals should be logged off automatically after being unused for five minutes.
• Systems should be configured intelligently. Dangerous defaults such as maintenance passwords and anonymous file transfer access supplied by the manufacturer should be removed. User access should be restricted to departments or care teams as appropriate. With hospital systems that hold records on many people, only a few staff should have access to the files of patients not currently receiving treatment.
Password Management
In many hospitals all users may access all records, and users often share passwords and leave terminals permanently logged on for the use of everyone in a ward. Such behavior causes a breakdown of clinical and medico-legal accountability and may lead to direct harm: one case has been reported in which a psychiatric patient changed prescription information at a terminal that was left logged on (Anderson, 1996).
It is important for administrators to educate all users that passwords issued to an individual should be kept confidential and not be shared with anyone. When a user ID is issued to a temporary user who needs access to a system, it must be deleted from the system when the user has finished his or her work. All passwords should be distinctly different from the user ID, and ideally they should be alphanumeric and at least six characters in length. Also, passwords should be changed regularly, at least every 30 days. Rittinghouse and Ransome (2004) suggest that it is a good security practice for administrators to make a list of frequently used forbidden passwords. Standard passwords that are often used to get access to different systems for maintenance purposes are not recommended.
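The rules just described (alphanumeric passwords of at least six characters, distinct from the user ID, not on an administrator-maintained forbidden list, and changed at least every 30 days) can be expressed as a simple check. The sketch below is illustrative; the forbidden-password list, thresholds as coded, and function name are assumptions rather than part of any cited system.

```python
# Minimal sketch of the password policy checks described above. Illustrative only.
import datetime

FORBIDDEN = {"password", "letmein", "admin123"}   # administrator-maintained list

def password_acceptable(user_id: str, password: str,
                        last_changed: datetime.date) -> bool:
    today = datetime.date.today()
    return (
        len(password) >= 6                                  # at least six characters
        and any(c.isalpha() for c in password)
        and any(c.isdigit() for c in password)              # alphanumeric
        and password.lower() != user_id.lower()             # distinct from the user ID
        and password.lower() not in FORBIDDEN               # not a known-bad password
        and (today - last_changed).days <= 30               # rotated within 30 days
    )

print(password_acceptable("mfung", "mfung", datetime.date.today()))      # False
print(password_acceptable("mfung", "ward7gate", datetime.date.today()))  # True
```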
Database Security
Database authentication and access control will be public key enabled and role-based. This means that a user will employ a multi-factor authentication procedure based on knowledge of his/her private key to obtain access to a database. Once authentication is complete, access, sometimes down to the record level, will be granted or denied based on the user's roles and associated privileges. Database security will be implemented on a discretionary access control (DAC) basis.
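A minimal sketch of role-based, record-level access control in this spirit follows: a role determines the permitted actions, and access to a particular record is then granted or denied. The role names, the care-team relation used for the record-level check, and the sample identifiers are hypothetical illustrations, not part of any system described in this chapter.

```python
# Minimal sketch of role-based access control with a record-level check.
# Roles, permissions, and the care-team relation are hypothetical examples.
ROLE_PERMISSIONS = {
    "treating_clinician": {"read", "update"},
    "ward_nurse": {"read"},
    "billing_clerk": set(),          # no access to clinical records
}

CARE_TEAM = {"NHI-ABC1234": {"dr_smith", "nurse_jones"}}   # patient -> current carers

def access_granted(user_id: str, role: str, patient_id: str, action: str) -> bool:
    """Allow an action only if the role permits it and the user is on the
    patient's current care team (the record-level restriction)."""
    if action not in ROLE_PERMISSIONS.get(role, set()):
        return False
    return user_id in CARE_TEAM.get(patient_id, set())

print(access_granted("nurse_jones", "ward_nurse", "NHI-ABC1234", "read"))       # True
print(access_granted("nurse_jones", "ward_nurse", "NHI-ABC1234", "update"))     # False
print(access_granted("dr_brown", "treating_clinician", "NHI-ABC1234", "read"))  # False
```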
Social Engineering and Careless Disclosure Safeguards The weakest link in security will always be people, and the easiest way to break into a system is to engineer your way into it through the human interface (CPRI toolkit: Managing information security in heath care, 2005). The main threat to the confidentiality of clinical records is carelessness in handling telephone/e-mail/fax inquiries, instant messaging and on-site visits, and inadequate disposal of information. According to King et al. (2001), social engineering safeguards consist of non-technical (procedural) means that include: security training for all corporate users; security awareness training for all system administration personnel with well-documented procedures, handling, and reporting; and security awareness training for personnel responsible for allowing outside visitors into restricted areas (such as assigned escorts). With regard to careless disclosure, Anderson (1996) developed a set of common sense rules that the best practices have used for years and that are agreed by the UK NHS Executives. Whether records are computerized or not, these rules of best practice can be summed up as clinician-consent-call back-care-commit:
• Only a clinician should release personal health information. It should not be released by a receptionist or secretary.
• The patient's consent must be obtained, except when the patient is known to be receiving treatment from the caller or in the case of emergency or the statutory exemptions. In the latter two cases the patient must be notified as soon as reasonably possible afterward.
• The clinician must call back if the caller is not known personally, and the number must be verified, for example, in the Medical Directory. This procedure must be followed even when an emergency is claimed, as private investigators routinely claim emergencies.
• Care must be taken, especially when the information is or may be highly sensitive, such as HIV status, details of contraception, psychiatric history, or any information about celebrities.
• The clinician must commit a record of the disclosure to a ledger. This should have the patient's name; whether consent was sought at the time (and, if not, the date and means of notification); the number called back and how it was verified; and whether anything highly sensitive was disclosed.
In addition, the guidelines for disclosure by telephone should also apply to faxes. Verifying the identity or, failing that, the location of the caller is just as important as it is when disclosing personal health information over the telephone. It is the BMA's established advice that personal health information should be faxed only to a machine that is known to be secure during working hours.
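The disclosure ledger called for by the "commit" rule above can be sketched as a simple record carrying the listed fields; the structure and the sample values below are illustrative only.

```python
# Minimal sketch of a disclosure-ledger entry with the fields listed above.
from dataclasses import dataclass

@dataclass
class DisclosureEntry:
    patient_name: str
    consent_obtained: bool
    notification: str          # if consent not sought: date and means of notification
    number_called_back: str
    number_verified_via: str   # e.g., "Medical Directory"
    highly_sensitive: bool

ledger = []
ledger.append(DisclosureEntry(
    patient_name="A. Patient",
    consent_obtained=True,
    notification="",
    number_called_back="+64 9 555 0100",
    number_verified_via="Medical Directory",
    highly_sensitive=False,
))
```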
Equipment Theft, Loss, and Damage
Anderson (1996) considers the most serious threat to the continued availability of computerized clinical information in general practice to be theft of the computer, which has been experienced by over 10% of general practices surveyed. Data can also be destroyed in other ways, such as by fire, flood, equipment failure, and computer viruses. He suggests that physical security measures must be taken, and that hygiene rules to control the risk of computer virus infestation must be applied together with a tested recovery plan. Since most organizations do not perform realistic tests of their procedures, with the result that when real disasters strike recovery is usually held up for lack of manuals and suppliers' phone numbers, it is important that a drill based on a realistic scenario, such as the complete destruction of a surgery or hospital computer room by fire, be carried out, and that a full system recovery to another machine from back-up media held off site be performed. Another measure is to keep several generations of back-ups, because with equipment failures and virus attacks it may take time to notice that something has gone wrong. A typical schedule in a well-run establishment might involve back-ups aged one, two, three, four, eight, and twelve weeks, as well as daily incremental back-ups.
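The back-up schedule just described (generations aged one, two, three, four, eight, and twelve weeks, plus daily incrementals) can be written down as a simple retention rule. The sketch below merely illustrates that schedule; keeping exactly seven daily incrementals is an added assumption, not part of the cited advice.

```python
# Minimal sketch of the back-up retention schedule described above.
from datetime import date, timedelta

WEEKLY_GENERATIONS = [1, 2, 3, 4, 8, 12]   # ages, in weeks, of full back-ups to keep
DAILY_INCREMENTALS = 7                     # assumed: keep the last week of incrementals

def backups_to_keep(today: date):
    """Return the back-up dates that should still be held (off site where possible)."""
    keep = [today - timedelta(weeks=w) for w in WEEKLY_GENERATIONS]        # full back-ups
    keep += [today - timedelta(days=d) for d in range(1, DAILY_INCREMENTALS + 1)]
    return sorted(set(keep))

for d in backups_to_keep(date(2006, 1, 2)):
    print(d.isoformat())
```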
Limitations of Security Technologies
Despite an aggressive move toward computerized healthcare records in recent years and ongoing parallel technological improvements, there are still limitations to the security technologies available for achieving usable and secure systems (Gillespie, 2001).
Firewalls Firewalls do not offer perfect protection, as they may be vulnerable to so-called tunneling attacks, in which packets for a forbidden protocol are encapsulated inside packets for an authorized protocol, or to attacks involving internal collusion (Gillespie, 2001). One of the concerns with firewalls is that most firewalls pass traffic that appears to be Web pages and requests more and more, as it is the way to get things to work through the firewall. The solution is to re-implement the whole as Web services (Webmail being a good example). These pressures continually erode the effectiveness of firewalls (Ateniese et al., 2003). For example, the NHS Network in Britain is a private intranet intended for all health service users (family doctors, hospitals, and clinics — a total of 11,000 organizations employing about a million staff in total). Initially, this had a single firewall to the outside world. The designers thought this would be enough, as they expected most traffic to be local (as most of the previous data flows in the health service had been). What they did not anticipate was that as the Internet took off in the mid-1990s, 40% of traffic at every level became international. Doctors and nurses found it very convenient to consult medical reference sites, most of which were in America. Trying to squeeze all this traffic through a single orifice was unrealistic. Also, since almost all attacks on healthcare systems come from people who are already inside the system, it was unclear what this central firewall was ever likely to achieve (Ateniese et al., 2003).
Cryptography The basis for many of the features desired for security in healthcare information systems depends on deploying cryptographic technologies. However, there are limitations to the use of cryptography. One problem is that security tools based on cryptography are still largely undeployed. One general weakness is poor usage of the system by individuals that includes: easily guessed passwords to the cryptographic system are chosen, or even written down on a sticker and stuck on the notebook, or people use the same password across different systems. The password then becomes as safe as the weakest system that is using it (which will often be something like a Web browser that has been told to remember the password) (Anderson, 2005; Gutmann, 2005). The other problem is that cryptography does not solve the security problem, that is, cryptography transforms the access problem into a key management problem, including authentication, digital signatures, information integrity management, session key exchange, rights management, and so on. It is observed that as the scope of key management services grows, trust in the integrity of key assignments tends to diminish, and the problems of revocation in the case of key compromise become much more difficult (Gillespie, 2001). Although public key infrastructure can help deal with the problem, it has also introduced complexities of its own. This has led to organizations effectively misusing cryptographic keys, as managing them appropriately has become too complex. The simplest example is that everyone in the organization really does get the same key.
Biometrics The deployment of biometrics is proven to be advantageous to the healthcare providers because it provides added security, convenience, reduction in fraud, and increased accountability. It increases the level of security by providing access to health information to authorized individuals and locking out those with nefarious intent. However, there are drawbacks to the technology. For example, when performing an iris-scan, individuals must remain perfectly still during enrollment and presentation, or the system will not be able to scan the iris, therefore causing false non-matching and failure-to-enroll anomalies to occur. Reid (2004) further identified a few drawbacks to biometrics: hardware costs, user perception, placement, and size. For example, iris-scans require specialized cameras with their own unique light source that can be very expensive. The user perception on having infrared light shined into the eye is quite disconcerting. To get the iris in the proper position can be quite time consuming. Some cameras can use eye recognition techniques to try to auto-pan and focus the camera, but such solutions do increase the cost of the camera and may still require some user coordination. The current size of the camera, which has been reduced to that of a desktop camera on steroids, is still very large. It needs further reduction to be able to work efficiently on a desk.
Hardware and Software Costs The costs of putting secure technologies in place can be tremendous. Very often the implementation of secured systems requires procurement of new software and hardware as the legacy system becomes obsolete. Unfortunately, there are not many commercial tools readily available in the market to integrate legacy systems into modern distributed computing environments. Furthermore, such integration will involve many database content inconsistencies that need to be overcome, including patient identifier systems, metadata standards, information types, and units of measurement. Overall the lack of standards for security controls and for vendor products that interoperate between disparate systems will hinder the implementation and enforcement of effective security solutions.
Legislation
The importance of assuring the privacy and confidentiality of health information has always been acknowledged. However, up until recently the legal protection for personal information has been patchy and disorganized (ScreamingMedia, 1999).
U.S. Legislation
The healthcare industry is currently going through an overhaul to meet government-mandated regulations stemming from HIPAA to ensure patient confidentiality, privacy, and efficiency. HIPAA, which was passed in 1996 and effective in 2001, gives consumers the right to their medical records, to limit disclosure, and to add or amend their records. Providers must have complied by April 2003. Entities covered include health insurers, physicians, hospitals, pharmacists, and alternative practitioners such as acupuncturists. HIPAA requires all healthcare providers, health insurers, and claims clearinghouses to develop and implement administrative, technical, and physical safeguards to ensure the security, integrity, and availability of individually identifiable electronic health data. Failure to comply with HIPAA can result in civil fines of up to $25,000 a year for each violation of a standard. Because HIPAA encompasses dozens of standards, the fines can add up quickly, and wrongful disclosure of health information carries a criminal fine of up to $250,000, 10 years imprisonment, or both (King et al., 2001).
N.Z. Legislation
In New Zealand, the Privacy Act, which came into force on July 1, 1993, provides a measure of legal protection for all personal information, including health information, and applies to the public and private sectors and to information held in both paper and electronic formats. The Health Information Privacy Code 1994, which is consistent with the provisions of the Privacy Act 1993 (s.46), was issued by the Privacy Commissioner specifically to protect the privacy of personal health information. While the code protects personal health information relating to an identifiable individual, it does not apply to statistical or anonymous information that does not enable the identification of an individual. The Medicines Act 1981 was issued by the Ministry of Health to penalize any unauthorized sale of prescription medicines, publication of advertisements containing insufficient information about precautions and side effects, and advertising the availability of new medicines before their approval for use in New Zealand. Under Section 20, the maximum penalty for an individual is up to six months imprisonment or a fine not exceeding $20,000. Sections 57 and 18 carry a maximum penalty for an individual of three months imprisonment and a fine not exceeding $500. The Privacy Commissioner also considered the application of the Privacy Act to the process of caching (Anderson, 2001). Caching occurs when a Web page accessed by users is temporarily stored by the user's computer (client caching) or by the network server that provides the user with Internet access (proxy caching). The Commissioner also considered that the Privacy Act applied to the use of cookies within New Zealand and offered sufficient protection. It was proposed that using cookies for the purpose of collecting, holding, or giving access to personal information would be an offense unless the Web site indicated such information would be gathered; however, the Privacy Commissioner did not support the creation of such an offense.
European Union Data Protection Directive International action may further affect the ways in which personal health information is transmitted over the Internet. The EU Data Protection Directive, which went into effect on October 25, 1998, requires EU member states to block outbound transmissions of data to countries that do not have laws providing a level of privacy protection similar to that in the country where the data originated (Siman, 1999). The directive affords the people to whom the data refer a host of rights, including the right to be notified of data collection practices, to access information collected about them, and to correct inaccuracies (Stevens, 1998). In 1998, New Zealand addressed three aspects of the Privacy Act to ensure it is adequate for the purposes of the EU directive. This is important for New Zealand businesses dealing in personal data originating from Europe because the directive limits the exportation of data to third countries (countries outside the EU) that do not have an adequate privacy protection (Wiles, 1998). The three aspects are:
• The channeling of data from Europe through New Zealand to unprotected data havens;
• Limits on who may exercise rights of access and correction under the Privacy Act; and
• The complaints process. “With a long queue of complaints awaiting investigation, the EU may have concerns that our complaints system is not sufficiently resourced to provide timely resolution of complaints.”
In view of the above, the Privacy Commissioner emphasized that the Privacy Act is built upon a desire that the collection, holding, use, and disclosure of personal information be carefully considered and that all activities in this area be as open as possible.
Future Trends
The growth of wireless computing in healthcare will take place for two reasons (The Independent Research Group, 2002):
• For all electronic medical record systems to work, physicians cannot be tied down to wired PC workstations. They will need to use some type of wireless device that allows them access to the relevant hospital databases.
• As the cost of healthcare continues to rise, many individuals are being treated on an outpatient basis. To keep track of an outpatient’s vital statistics or signal when the patient needs immediate medical attention, many pervasive devices, such as toilet seats, scales, smart shirts, smart socks, and pacemakers, are being developed that collect relevant patient information. Collected data can then be transmitted via a wireless device using a wireless or mobile network to the patient’s physician, who can then decide on possible interventions.
In recent years, technological advancements in sophisticated applications and interoperability have increased the popularity of wireless LANs (WLANs) and the use of wireless technology in healthcare. With the faster connection speeds of broadband LANs, healthcare providers have also developed a number of applications to improve patient safety and the healthcare delivery process. According to Kourey (2005), the use of personal digital assistants (PDAs) has become increasingly popular because “PDAs provide access to data and e-mail, store and retrieve personal and professional information and facilitate communication in wireless environments, their use among healthcare professionals has skyrocketed. Industry experts predict the trend to continue. In 2001, 26% of American physicians used handheld devices for tasks related to patient care. While some experts predict this number to reach 50% by 2005.” Increasingly, clinicians can check on patient data or order treatments through secure wireless networks from anywhere in the hospital. For example (Hermann & Norine, 2004):
• A nurse is automatically notified on a handheld wireless device that a patient’s blood pressure is falling.
• A doctor on rounds receives the results of an important blood test on a wireless PDA instead of having to call the lab for the information.
• A telemetry system records the vital signs of dozens of patients in critical care and sends them wirelessly to a central control station for continuous, around-the-clock monitoring.
• A surgeon completing a procedure writes after-care orders while still in the operating room and transmits them to the clinical information system, making them instantly part of the patient’s electronic record.
Rittinghouse and Ransome (2004) stress that, because wireless computing is still a very new technology, employees who have not been properly educated about wireless security may not realize the dangers a wireless network can pose to an organization. They classify WLAN security attacks into two types:
• Passive attacks — An unauthorized party simply gains access to an asset and does not modify its content (i.e., eavesdropping). While eavesdropping, an attacker simply monitors network transmissions, evaluating packets for specific message content. For example, a person might listen to the transmissions between two workstations broadcast on a LAN, or tune into transmissions between a wireless handset and a base station.
• Active attacks — An unauthorized party makes deliberate modifications to messages, data streams, or files. It is possible to detect this type of attack, but it is often not preventable. Active attacks usually take one of four forms (or some combination of them):
1. Masquerading: The attacker successfully impersonates an authorized network user and gains that user’s level of privileges.
2. Replay: The attacker monitors transmissions (a passive attack) and retransmits messages as if they were sent by a legitimate user.
3. Message modification: Occurs when an attacker alters legitimate messages by deleting, adding, changing, or reordering the content of the message.
4. Denial-of-service (DoS): A condition in which the attacker prevents normal use of a network.
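To make the distinction concrete, the following minimal Python sketch (an editorial illustration, not part of the cited classification) shows one standard way active tampering such as message modification or masquerading can be detected: the sender attaches a keyed message authentication code (HMAC), and the receiver recomputes and compares it. The shared key, message contents, and function names are hypothetical.

```python
import hmac
import hashlib

SHARED_KEY = b"example-shared-secret"  # hypothetical pre-shared key

def protect(message: bytes) -> tuple[bytes, str]:
    """Sender side: attach an HMAC tag so tampering can be detected."""
    tag = hmac.new(SHARED_KEY, message, hashlib.sha256).hexdigest()
    return message, tag

def verify(message: bytes, tag: str) -> bool:
    """Receiver side: recompute the tag; a mismatch signals an active attack."""
    expected = hmac.new(SHARED_KEY, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)

msg, tag = protect(b"BP reading: 120/80 for patient 1234")
assert verify(msg, tag)                                          # unmodified message passes
assert not verify(b"BP reading: 90/60 for patient 1234", tag)    # modification is detected
```

Note that an integrity check of this kind does not, by itself, stop passive eavesdropping or replay; eavesdropping is typically addressed by encryption, and replay by adding nonces or timestamps to each message.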
When patient information is sent wirelessly, additional security measures are advisable, although a well-defined wireless utility basically protects confidentiality and restricts where the signal travels (Hermann & Norine, 2004). Tabar (2000) suggests that the growth of new technology also creates a unique security threat and requires user authentication protocols. For example, PDAs, laptops, and even mobile carts can fall into unauthorized hands, so the electronic ID must be stored elsewhere. Vendors are working on solutions such as hardware ID tokens that are inserted into the mobile device before use and radio transmitter-tracking devices. Browser-based-only applications on the mobile computing device are also used so that patient data resides only on the server and cannot be accessed by the mobile computing device once it is outside the WLAN coverage area. Turisco and Case (2001) argue that while vendors are responsible for code sets, encryption, privacy, and audit trails, user organizations need to manage the devices with extreme care. Physical security is of paramount concern in wireless communications: the device needs to be turned off when not in use and kept in a safe place. Tabar (2000) concurs that the greatest hurdle in information security still rests with the user, and no technology can make up for slack policies and procedures. “Changing perceptions, culture and behavior will be the biggest challenges,” says Monica Summers, IS Director at Beaufort Memorial Hospital, Beaufort, S.C. “It’s not just the technology. You could slap down $5 million in technology, and it won’t stop people from giving out their password.” (Tabar, 2000).
Conclusions
Privacy is not just about security measures; it is at least as much about what information is collected, collated, and practically recoverable. Health information has always been regarded as highly sensitive and must be protected by medical ethics and privacy legislation. The emergence of new technology and new organizational structures in the healthcare industry has opened up the means and the desire to collect and collate such information in ways never previously considered. The increased use of the Internet and the latest information technologies, such as wireless computing, is revolutionizing the healthcare industry by improving healthcare services and, perhaps most importantly, empowering individuals to understand and take charge of their own healthcare needs. Patients become involved and actively participate in healthcare processes, such as diagnosis and treatment, through secure electronic communication services. Patients can search healthcare information over the Internet and interact with physicians. This enhances and supports human rights in the delivery
of healthcare. The same technologies have also heightened privacy awareness. Privacy concerns include: healthcare Web sites that do not practice the privacy policies they preach, computer break-ins, insider and hacker attacks, temporary and careless employees, virus attacks, human errors, system design faults, and social engineering. Other concerns are the collection, collation, and disclosure of health information. Healthcare providers and professionals must take into account the confidentiality and security of the information they collect and retain. They must also ensure that their privacy policies or secure technologies meet the public expectation and abide by the law. Such policies and technologies must also be implemented to ensure the confidentiality, availability, and integrity of the medical records. If this is not done, resources could be wasted in developing secure systems, which never reach fruition, and the new systems will never gain the confidence of the public or of the health professionals who are expected to use them. Technology is, to a large extent, both the cause of and the solution to concerns about the protection of personal health information. However, there are limitations to the secure technologies that need on-going research and development. Technologies, if coupled with physical security control, employee education, and disaster recovery plans, will be more effective in securing healthcare privacy. Further advances of new information technologies, if designed and monitored carefully, will continue to benefit the healthcare industry. Yet patients must be assured that the use of such technologies does not come at the expense of their privacy.
References
Anderson, R. (1996, January 2). Clinical system security: Interim guidelines. Retrieved June 2005, from http://www.ftp.cl.cam.ac.uk/ftp/users/rja14/guidelines.txt
Anderson, R. (2001). Security engineering: A guide to building dependable distributed systems. Wiley. Retrieved June 2005, from http://www.ftp.cl.cam.ac.uk/ftp/users/rja14/c18_anderson.pdf
Anderson, R. (n.d.). Why cryptosystems fail. Retrieved June 2005, from http://www.cl.cam.ac.uk/users/rja14/
Ateniese, G., Curtmola, R., de Medeiros, B., & Davis, D. (2003, February 21). Medical information privacy assurance: Cryptographic and system aspects. Retrieved June 2005, from http://www.cs.jhu.edu/~ateniese/papers/scn.02.pdf
Ball, M., Weaver, C., & Kiel, J. (2004). Healthcare information management systems: Cases, strategies, and solutions (3rd ed.). New York: Springer-Verlag.
Carter, M. (2000). Integrated electronic health records and patient privacy: Possible benefits but real dangers. MJA 2000, 172, 28-30. Retrieved 2001, from http://www.mja.com.au/public/issues/172_01_030100/carter/carter.html
Choy, A., Hudson, Z., Pritts, J., & Goldman, J. (2002, April). Exposed online: Why the new federal health privacy regulation doesn’t offer much protection to Internet
users (Report of the Pew Internet & American Life Project). Retrieved June 2005, from http://www.pewinternet.org/pdfs/PIP_HPP_HealthPriv_report.pdf
Clarke, R. (1990). Paper presented to the Australian Medical Informatics Association, Perth. Australian National University. Retrieved 2001, from http://www.anu.edu.au/people/Rogger.Clarke/DV/PaperMedical.html
Committee on Enhancing the Internet for Health Applications: Technical requirements and implementation strategies. (1997). For the record: Protecting electronic health information. Washington: National Academy Press. Retrieved 2001, from http://bob.nap.edu/html/for/contents.html
Committee on Enhancing the Internet for Health Applications: Technical requirements and implementation strategies. (2000). Networking health: Prescriptions for the Internet. Washington: National Academy Press. Retrieved 2001, from http://www.nap.edu/books/0309068436/html
Constantinides, H., & Swenson, J. (2000). Credibility and medical Web sites: A literature review. Retrieved 2001, from http://www.isc.umn.edu/research/papers/medcred.pdf
CPRI toolkit: Managing information security in health care. (n.d.). Target IT architecture Vol. 6. Security architecture version 1. The Centers for Medicare & Medicaid Services (CMS) Information Security (previously Health Care Financing Administration). Retrieved June 2005, from http://www.cms.hhs.gov/it/security/docs/ITAv6.pdf
Cyber crime bleeds U.S. corporations, survey shows; financial losses from attacks climb for third year in a row. (2002, April 7). Retrieved June 2005, from http://www.gocsi.com/press/20020407.jhtml?_requestid=953064
DeadMan’s handle and cryptography. (n.d.). Retrieved June 2005, from http://www.deadmanshandle.com/papers/DMHAndCryptology.pdf
E-commerce, privacy laws must mesh. (2001, February/April). News from the Office of The Privacy Commissioner, (39). Retrieved 2001, from http://www.privacy.org.nz/privword/pwtop.html
Fox, S. (n.d.). Vital decisions. Pew Internet & American Life Project. Retrieved June 2005, from http://www.pewinternet.org
Fox, S., & Wilson, R. (n.d.). HIPAA regulations: Final. HIPAA Regs 2003. Retrieved June 2005, from http://www.hipaadvisory.com/regs/
Gillespie, G. (2001). CIOs strive to increase security while decreasing the ‘obstacles’ between users and data. Retrieved 2001, from http://www.healthdatamanagement.com/html/current/CurrentIssueStory.cfm?PostID=9059
Gutmann, P. (n.d.). Lessons learned in implementing and deploying crypto software. Retrieved June 2005, from http://www.cs.auckland.ac.nz/~pgut001/
Health Insurance Portability and Accountability Act of 1996. Public Law No. 104-191. Section 1173, USC 101. (1996).
Health privacy (About 1.4 million computer records for in-home supportive service breached). (2004, October 21). California Healthline. Retrieved June 2005, from
http://www.californiahealthline.org/index.cfm?Action=dspItem&itemID=106520&ClassCD=CL141
Health Privacy Project: Medical privacy stories. (2003). Retrieved June 2005, from http://www.healthprivacy.org/usr_doc/Storiesupd.pdf
Hermann, J., & Norine, P. (2004). Harnessing the power of wireless technology in healthcare. Retrieved April 2005, from http://www.johnsoncontrols.com/cgi
Healthbeat. (2001). Business and Finance: Survey. Retrieved June 2005, from http://www.ihealthbeat.org/members/basecontentwireless/pdfs/healthcare.pdf
Johnson, A. (2001). The Camelot Avalon: Healthcare procrastinates on HIPAA. Retrieved 2001, from http://www.camelot.com/newsletter.asp?PageID=326&SpageId=504
Kimmel, K., & Sensmeier, J. (2002). A technological approach to enhancing patient safety (White Paper). The Healthcare Information and Management Systems Society (HIMSS) (sponsored by Eclipsys Corporation). Retrieved June 2005, from http://www.eclipsys.com/About/IndustryIssues/HIMSS-TechApproach-toPatientSafety-6-02-FORMATTED.pdf
King, C., Dalton, C., & Osmanoglu, T. (2001). Security architecture: Design, deployment & operations. USA: Osborne/McGraw Hill.
Kourey, T. (n.d.). Handheld in healthcare part two: Understanding challenges, solutions and future trends. Retrieved March 2005, from http://www.dell4healthcare.com/offers/article_229.pdf
Medical Privacy Malpractice: Think before you reveal your medical history. (n.d.). Retrieved 2001, from http://www.perfectlyprivate.com/beware_medical.asp
Miller, M. (2003, September 8). Issues of privacy in Bryant case. Los Angeles Times.
Ministry of Health Press Release. (2001). Conviction for Internet drugs warning to other. Retrieved 2001, from http://www.moh.govt.nz/moh.nsf/aa6c02e6249e77359cc256e7f0005521d/5332afe9b6839587cc256ae800647e1b?OpenDocument
More prosecutions tipped for online drugs. (2001, October 19). Retrieved 2001, from http://www.stuff.co.nz/inl/index/0,1008,978172a1896,FF.html
New Zealand Health Information Service. (2001). Health intranet: Health information standards. Retrieved 2001, from http://www.nzhis.govt.nz/intranet/standards.html
Null, C. (2003, March 4). Google: Net hacker tool du jour. Wired News.
Paynter, J., & Chung, W. (2002, January). Privacy issues on the Internet. In Proceedings of the Thirty-Fifth Hawaii International Conference on System Sciences (HICSS-35), Hawaii, USA.
Reid, P. (2004). Biometrics for network security. New Jersey: Prentice Hall.
Rogers, R. D. (1998). Information protection in healthcare: Knowledge at what price? We’re drowning in information and starving for knowledge. Retrieved 2001, from http://www.privacy.org.nz/media/aichelth.html
Rossman, R. (2001). The Camelot Avalon: Despite HIPAA uncertainty, security must prevail. Retrieved 2001, from http://www.camelot.com/newsletter.asp?PageID=416&SpageID=504
Rittinghouse, J., & Ransome, J. (2004). Wireless operational security. USA: Elsevier Digital Press.
ScreamingMedia, Business Wire. (1999). SevenMountains Software and SCI Healthcare Group to deliver a secure virtual private networking solution to Hawaii Health Systems Corporation. Retrieved 2001, from http://www.industry.java.sun.com/javanews/stories/story2/0,1072,18810,00.html
Sherman, D. (2002). Stealing from the sick. Retrieved May 21, 2002, from http://www.NBC6.net
Silverman, M. (2002). Inside the minds: Privacy matters. Retrieved June 2005, from http://www.duanemorris.com/articles/static/SilvermanBookExcerpt.pdf
Siman, A. J. (1999). The Canada health infoway — Entering a new era in healthcare. Retrieved 2001, from http://www.hc-sc.gc.ca/ohih-bsi/available/documents/ecompriv_e.html
Stevens, R. (1998). Medical record databases. Just what you need? Retrieved 2001, from http://www.privacy.org.nz/people/mrdrep.html
Starr, P. (1999). Privacy & access to information: Striking the right balance in healthcare (Health and the Right to Privacy, Justice Louis Brandeis Lecture). Massachusetts Health Data Consortium, Boston, USA. Retrieved 2001 and June 2005, from http://www.nchica.org/HIPAAResources/Samples1/privacylessons/P-101%20Massachusetts%20Health%20Data%20Consortium.htm
Tabar, P. (2000). Data security: Healthcare faces a tricky conundrum of confidentiality, data integrity and timeliness. Healthcare Informatics. Retrieved April 2005, from http://www.healthcare-informatics.com/issues/2000/02_00/cover.htm
The American Health Information Management Association and the Medical Transcription Industry Alliance. (1998). AHIMA position statement: Privacy official. Retrieved 2001, from http://www.ahima.org/infocenter/index.html
The Gallup Organisation. (2000). Public attitudes towards medical privacy. The Institute for Health Freedom. Retrieved June 2005, from http://www.forhealthfreedom.org/Gallupsurvey/
The Independent Research Group. (2002). SAFLINK Report. Retrieved 2001, from http://www.cohenresearch.com/reports/sflk_11-05-02.pdf
Thomson Corporation and health data management. Survey: Hospitals boosting data security. (n.d.). Retrieved June 2005, from http://www.healthdatamanagement.com/html/
Turisco, F., & Case, J. (2001). Wireless and mobile computing. Retrieved April 2005, from http://www.chcf.org/documents/ihealth/WirelessAndMobileComputing.pdf
Wayner, P. (2002). Disappearing cryptography. Information hiding: Steganography & watermarking (2nd ed.). USA: Elsevier Science.
Wiles, A. (1998). Integrated care and capitation: New challenges for information protection. Retrieved 2001, from http://www.privacy.org.nz/shealthf.html
Wilson, P., Leitner, C., & Moussalli, A. (2004). Mapping the potential of eHealth: Empowering the citizen through eHealth tools and services. Retrieved June 2005, from http://www.cybertherapy.info/pages/e_health_2004.pdf
Chapter IV
E-Services Privacy: Needs, Approaches, Challenges, Models, and Dimensions Osama Shata, Specialized Engineering Office, Egypt
Abstract
This chapter introduces several aspects related to e-privacy, such as needs, approaches, challenges, and models. It argues that e-privacy protection, although of interest to many parties such as industry, government, and individuals, is very difficult to achieve since these stakeholders often have conflicting needs and requirements and may even have conflicting understandings of e-privacy. Finding one model or one approach to e-privacy protection that satisfies all of these stakeholders is therefore a challenging task. Furthermore, the author hopes that this chapter presents an acceptable definition of e-privacy and uses this definition to discuss various aspects of e-privacy protection, such as principles for developing e-privacy policies, individuals' and organizations' needs regarding various privacy issues, challenges of adopting and coping with e-privacy policies, tools and models to support e-privacy protection in both public and private networks, related legislation that protects or constrains e-privacy, and spamming and Internet censorship in the context of e-privacy. The author hopes that understanding these aspects will assist researchers in developing policies and systems that bring the conflicting e-privacy protection needs of individuals, industry, and government into better alignment.
Introduction The Internet in general and the World Wide Web (WWW) in particular, were initially intended to facilitate sharing of information between individuals, research centers, organizations, and so forth. However, they have now become the fastest growing means to provide a variety of services such as e-government, e-commerce, e-communication, e-entertainment, e-education, e-investment, and so on. Although “electronic services” (e-services) is a term that implies the offering of services by electronic means, it is mostly used now to mean the offering of services via the Internet/WWW. E-services are of various types, including those that enable individuals and organizations to access information (e.g., surfing the WWW) and those that facilitate transmitting of data (e.g., banking application, e-shopping). Individuals and organizations using and offering e-services are subject to many potential threats (e.g., unauthorized intrusion and collection of IP addresses, session hijacking, copying/stealing information digitally stored, etc.). This raises the need for high standards of security measures. One of the threats that is receiving growing attention is violating the privacy of users using e-services. One type of violation may occur by harmful software that attacks computers to collect sensitive information for purposes such as identity theft, or to destroy stored information. This requires continuous adopting of new and up-to-date protection techniques. A second type of privacy violation is committed by organizations offering e-services. Such organizations tend to collect some of an individual’s personal identifiable information (PII), which is considered critical for the organizations’ interests, but also is seen private by the individual using the e-services. This necessitates preventing PII from being collected without consent and protecting PII collected with consent. This has raised Internet privacy protection as one of the top policy issues for legal institutions and legislators. In order to resolve this conflict of interests between individuals and organizations, several laws and acts have been issued with the aim of balancing the interests of the two parties. The purpose of these laws and acts is to organize the process of collecting, processing, and protecting PII of individuals using e-services, and hence, to provide some protection for individuals. This is what we call in this chapter “e-service privacy protection,” or “e-privacy” for short. E-privacy is a concept that is difficult to define. It is seen differently by the parties involved. Some of the organizations that collect PII may view the Internet as a public environment, and those who connect to it should expect to be noticed. Other organizations offer free services, thus those who use the services should expect some trade off. On the other hand, individuals believe that their online activities and all their PII are private and belong to them. Since these individuals switch between TV channels and view whatever they prefer in privacy without being tracked, they expect the same privacy when surfing the WWW. Legislators always debate comprehensively, before the issuing of any related privacy law, on how to balance the interests of the collecting organizations and individuals, and what principles and standards may be used (e.g., the Canadian
Personal Information Protection and Electronic Documentation Act [Government of Canada-1]). In addition, e-privacy may be broadened to cover individuals’ rights not to receive any unsolicited advertisements (spamming) in their e-mail inboxes, as well as their rights to access Web sites without restrictions (Internet censorship). Another question is whether or not the meaning of e-privacy would differ according to whether the communication network used is public (Internet) or private (belongs to an organization or a workplace). There are many issues related to e-privacy such as its criticality, types, scope, standards, legal requirements, challenges, approaches, and how to protect it. This chapter will discuss some of these issues. In particular, it aims to: 1.
Introduce e-privacy, define and consider it from various perspectives,
2. Highlight standards and principles of e-privacy policies,
3. Identify various challenges of adopting and coping with e-privacy policies,
4. Discuss e-privacy in the context of public networks,
5. Relate e-privacy to electronic security,
6. Identify critical organizational, legal, and technical issues in managing e-privacy,
7. Introduce example models and approaches to maintain individuals’ e-privacy,
8. Discuss spamming and Internet censorship in the context of e-privacy, and
9. Introduce e-privacy considerations in private networks.
What is E-Privacy?
Privacy is an abstract word that has various meanings, scopes, and dimensions for individuals, depending on each individual’s background, psychology, beliefs, and ethics. However, most individuals will relate its meaning to their right to act free from unauthorized intrusion and to their right to keep what they believe to be private from others. Hence, we can look at e-privacy as an individual’s right to act freely online (on the Internet/WWW) without being monitored, traced, or restricted, and to keep his or her PII from being collected or distributed to other parties without consent. Unfortunately, this definition may not be accepted by some organizations that find it necessary to monitor individuals while they are online and to collect some necessary PII. This chapter will use the above definition of e-privacy as a base when discussing the balancing of conflicting interests between individuals and organizations and when highlighting some other related issues. The next section will introduce examples of violations of individuals’ privacy online and discuss the need for protecting e-privacy.
The Need for E-Services Privacy Protection
New technology has enabled electronic services providers and other parties to monitor online users and to collect and transfer users’ PII. These technological capabilities have raised concerns among users that their PII could be used in ways that they would consider an invasion of their e-privacy. In an attempt to understand the need for e-privacy protection, it is helpful to list some examples of misuse of the Internet that have affected individuals’ e-privacy, whether the misuse was intentional or unintentional. In a recent article, Cobb and Cobb (2004) gave an example to illustrate what happens when people fail to understand how technology may affect privacy. Some years ago, legislators in the State of Florida authorized counties to put all public records on the Web. As a result, anyone could view the private data in the records, such as names, social security numbers, addresses, and in some cases signatures. This can undoubtedly be classified as a privacy violation and can lead to various crimes such as identity theft. A sample record used by Cobb and Cobb (2004) may be examined for illustration at http://www.privacyforbusiness.com/example1.htm. While Cobb and Cobb’s example of privacy invasion was unintentional and would not be considered a criminal act, one can find many examples of intentional privacy invasion for the sake of electronic fraud. One example is the break-in at Western Union’s Web site, in which the hackers are thought to have copied credit card information for more than 15,000 customers (ID Theft, Schemes, Scams, Frauds). Large organizations also suffer from private information violations. In a presentation, Fred Holborn (2003) reported that: “92% of large organizations detected computer security attacks in 2003; 75% acknowledged financial losses due to computer breaches; theft of proprietary information caused the greatest financial loss — $2.7 million average.” One can argue that these examples of privacy invasion may occur in non-electronic environments as well and that the use of the Internet and electronic services has just made them easier. However, there are examples of privacy invasions that occur only because of electronic services and the use of the Internet/WWW. Such invasions are committed by unauthorized software. This software is of various types, including spy-ware, ad-ware, viruses, cookies, online activity trackers, scum-ware, and Web beacons. However, these various types share the characteristic of being uninvited. In most cases, the user is not aware of their existence until his or her computer starts functioning in an unexpected way, and they can be difficult to remove. While some unauthorized software can cause serious problems such as destruction of personal data and identity theft, the aim of much other unauthorized software is to collect information for the sake of sending ads or even for security purposes. The first type of unauthorized software involves illegal activities and requires individuals and organizations that keep online data to increase their security measures (e.g., to use firewalls). Meanwhile,
the second type of the unauthorized software would need to be controlled by some policy. To understand this better consider the following examples:
• An online shopping Web site tracks its visitors’ online activities and collects their PII to e-mail them discounted offers.
• A search engine tracks its users’ surfing and passes their PII to its sponsoring organizations.
• An e-government Web site collects PII with individuals’ consent, but shares the PII with other government agencies.
• An organization is increasing its security measures and checks outgoing e-mail to protect its critical data from being leaked, or is monitoring its employees’ online activities during working hours to increase productivity — is this an invasion of its employees’ e-privacy?
Such examples of what may be considered e-privacy invasion have led many legislators to conclude that industry needs standards, policies, and laws to organize the process of monitoring individuals’ online activities and of collecting and using online private data. Some governments (e.g., U.S., EU, Australia, and Canada) have already issued related bills and laws (e.g., the Canadian Personal Information Protection and Electronic Documentation Act [Government of Canada-1]). The discussion of the need for e-services privacy protection usually focuses on e-services provided within public networks rather than on e-services provided within non-public networks (although the latter type also requires attention). However, the topic of e-privacy is still evolving, and there is no unified definition or scope for online privacy and protection. We introduce in the next section a discussion of the related standards, principles, and models applicable to e-privacy.
Standards, Principles, and Models of E-Services Privacy Protection
The rapid growth of Internet e-services and Web-based applications that target consumers and collect their PII has led to concerns over e-privacy. This has required the issuing of laws for enforcing e-privacy protection. Most e-privacy-related laws require sites and organizations that collect PII from individuals using an e-service to adopt an e-privacy policy with minimum enforced standards and specifications. In many cases the laws and regulations for e-privacy are amendments to off-line privacy acts that are already in place. In Canada, there is the Personal Information Protection and Electronic Documents Act. The purpose of this act, as stated on the Department of Justice Web site, is (Department of Justice - Canada): “to provide Canadians with a right of privacy with respect to their personal information that is collected, used, or disclosed by an organization in the private
sector in an era in which technology increasingly facilitates the collection and free flow of information.” The privacy provisions of the Personal Information Protection and Electronic Documents Act are based on the Canadian Standards Association’s Model Code for the Protection of Personal Information, recognized as a national standard in 1996 (Government of Canada). The code’s 10 principles are: “Accountability,” “Identifying Purposes,” “Consent,” “Limiting Collection,” “Limiting Use, Disclosure, and Retention,” “Accuracy,” “Safeguards,” “Openness,” “Individual Access,” and “Challenging Compliance.” This act is supposed to cover both the government and the private sectors. There are also other provincial acts (e.g., Alberta’s Personal Information Protection Act [Alberta Government] and British Columbia’s Personal Information Protection Act). The UK has the Data Protection Act 1998 (UK, 1998). The act “applies to a data controller in respect of any data only if: (a)
the data controller is established in the United Kingdom and the data are processed in the context of that establishment, or
(b)
the data controller is established neither in the United Kingdom nor in any other EEA State but uses equipment in the United Kingdom for processing the data otherwise than for the purposes of transit through the United Kingdom.” The act’s main principles emphasize that personal data: shall be processed fairly and lawfully; obtained only for one or more specified and lawful purposes; shall be adequate, relevant, and accurate; shall not be kept for longer than is necessary; shall be protected by appropriate technical and organizational measures against unauthorized or unlawful processing and against accidental loss or destruction; and shall only be transferred to a country or territory outside the European Economic Area (EEA) under conditions that ensure an adequate level of protection of personal data and the rights and freedoms of data subjects (UK, 1998). The reader may notice the great similarity between the principles of the UK’s Data Protection Act 1998 and those of the Canadian Personal Information Protection and Electronic Documents Act. Australia has the Federal Privacy Law. It contains 11 Information Privacy Principles (IPPs), which apply to Commonwealth government agencies (Federal Privacy Commissioner [Australia], 1988). The 11 principles are: manner and purpose of collection of personal information; solicitation of personal information from the individual concerned; solicitation of personal information generally; storage and security of personal information; information relating to records kept by the record-keeper; access to records containing personal information; alteration of records containing personal information; record-keeper to check accuracy and so forth of personal information before use; personal information to be used only for relevant purposes; limits on use of personal information; and limits on disclosure of personal information. The law also has 10 National Privacy Principles that apply to parts of the private sector and all health service providers.
While the Canadian act and the Australian law cover both the federal and private sectors, other countries have laws and acts to govern the federal sector and leave the private sector to develop its own privacy policies (e.g., USA).
In the United States, legislators have passed legislation regarding information practices and e-privacy for the federal government. However, legislators are still debating whether an e-privacy act for the private sector is needed or whether industry self-regulation is enough to protect individuals’ PII. The European Union has a set of directives related to e-privacy called “The Data Protection Directive” (e.g., Directive 2002/58/EC of the European Parliament and of the Council of 12 July 2002 concerning the processing of personal data and the protection of privacy in the electronic communications sector [Directive on privacy and electronic communications]) (The European Commission, 2002). While e-privacy laws and acts may differ according to the political structure and local cultures, they share the objective of protecting PII that uniquely identifies a user (e.g., full name, social security number, e-mail address), or data that uniquely identifies a particular device or a location used by a user (e.g., IP address). A standard e-privacy policy would state:
(a) The purpose for which PII needs to be collected, and that this purpose shall be made clear to individuals before the collecting process begins,
(b) Whether collecting PII will be automatic, or whether individuals will be notified before the collecting process begins,
(c) What PII is collected,
(d) How the collected PII will be used,
(e) If and how cookies are used,
(f) That the collecting organization is responsible for protecting the PII collected,
(g) What security policies are used, with references to them,
(h) The conditions under which the PII may be released,
(i) For how long the collected PII will be retained, and
(j) The privacy act and principles that the policy is based on.
A few policies will state more enhanced standards such as:
(a) The collecting organization would provide information regarding its management of the collected PII to concerned individuals, and
(b) Concerned individuals shall be able to access their PII and challenge the appropriateness of the PII.
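To make the policy elements (a) through (j) above more tangible, the following sketch shows one hypothetical way a provider might capture them as a structured, machine-checkable record; the field names and values are illustrative assumptions, not part of any standard or of this chapter's recommendations.

```python
# Hypothetical structured representation of the policy elements (a)-(j) above.
policy = {
    "purpose": "order fulfilment and delivery",            # (a) why PII is collected
    "notice_before_collection": True,                       # (b) individuals notified first
    "pii_collected": ["full name", "e-mail", "address"],    # (c) what is collected
    "uses": ["processing orders", "customer support"],      # (d) how it will be used
    "cookies": {"used": True, "purpose": "session only"},   # (e) if and how cookies are used
    "controller_responsible": True,                          # (f) organization protects the PII
    "security_policy_ref": "https://example.org/security",  # (g) reference to security policies
    "release_conditions": ["legal requirement"],             # (h) when PII may be released
    "retention_days": 365,                                   # (i) how long PII is retained
    "legal_basis": "PIPEDA",                                  # (j) act/principles relied upon
}

# A simple completeness check: every required element must be present and non-empty.
required = ["purpose", "notice_before_collection", "pii_collected", "uses", "cookies",
            "controller_responsible", "security_policy_ref", "release_conditions",
            "retention_days", "legal_basis"]
missing = [k for k in required if policy.get(k) in (None, "", [], {})]
print("Policy is complete" if not missing else f"Missing elements: {missing}")
```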
Questions that arise here are whether e-services providers really adopt clear e-privacy policies and whether e-privacy laws and acts really protect an individual’s e-privacy. An answer may be provided in the online report “Surfer Beware: Personal Privacy and the Internet” by the Electronic Privacy Information Center (1997). The report states: “The Electronic Privacy Information Center (EPIC) reviewed 100 of the most frequently visited Web sites on the Internet. We checked whether sites collected personal information, had established privacy policies, made use of cookies, and allowed people to visit without disclosing their actual identity. We found that few Web sites today have explicit privacy
policies (only 17 of our sample) and none of the top 100 Web sites meet basic standards for privacy protection.” While laws and acts are meant to force e-services providers to adopt clear e-privacy policies, they differ from one country to another according to culture and political structure. Robert Lee (1997) conducted research comparing how personal-privacy-related regulations in two countries with similar ideas of personal freedom and similar governmental structures — the United States of America and Australia — affect Internet applications collecting PII. Roberts (2003) states that: “Despite the similarities in culture and aspirations for individual freedom from bureaucracy in the United States and Australia, this limited research demonstrated that access to private information on individuals was more freely available in the United States than Australia. The difference in individual privacy protection resulted from the extension of Australian federal privacy regulations to cover commercial businesses in addition to government databases.” The next section identifies some challenges that may be encountered when adopting and coping with an e-privacy policy.
Challenges
E-privacy policies consider e-privacy in two dimensions:
1. Providing the protection for individuals’ PII against unauthorized collection and usage when using e-services, and
2. Providing the protection for individuals’ PII, when collected with consent, against electronic theft or reproduction by a third party.
We have focused our discussion on the first dimension since we believe that the second dimension is more related to electronic security than to e-privacy. However, maintaining the second dimension faces many challenges, and hence it is included in our discussion in this section. Adopting and coping with e-privacy policies face several challenges. We list some of them below and classify them into policy challenges and security challenges:
Policy Challenges
• Enforcing standards among all collectors of PII. As we clarified in an earlier section, laws and acts differ from one country to another based on culture, beliefs, and political structure. An organization may provide an e-service to thousands of individuals across the globe and may be subject to some e-privacy laws. This organization’s competitors, based in other countries and providing the same service also to thousands of individuals across the globe, may
not be bound by similar laws. This gave rise to “Safe Harbour” agreements, such as the safe harbour agreement between the U.S. and the EU (U.S. Department of Commerce, 2000). In such an agreement, U.S. organizations may voluntarily participate in the safe harbour and be committed to cooperate and comply with the European Data Protection Authorities. This will ease the flow of information from EU organizations to participating U.S. organizations.
• Diversity of sectors. There are differences between the public (government) and private sectors. The public sector is often more willing to commit to higher e-privacy standards than the private sector. For example, the public sector does not seek to share collected PII outside its departments, while private sector organizations may find it necessary to trade collected PII with other private sector organizations for commercial purposes, competition, and so on.
• Diversity of laws and legislation. When a multi-national organization has several branches with several Web sites in different jurisdictions, to which e-privacy law would it be subject?
• Diversity of individuals. Some individuals may accept (reasonable) risks in giving up their PII for getting an e-service (e.g., have free access to software). For example, Yahoo uses Web Beacons to track Yahoo users (Yahoo). How would an e-privacy policy balance between those and other individuals who prefer (total) protection?
• Internal resistance from organizations that have to adopt an e-privacy policy, since violating the policy may have unpleasant legal consequences.
• Exceptions. Almost every e-privacy law or act has some exceptions that affect the proper implementation of the e-privacy policies that refer to that law or act. While one may understand releasing or hiding PII for legal or security reasons, other exceptions may be confusing. For example, the Canadian Personal Information Protection and Electronic Documents Act (Government of Canada) notes that an individual may inquire about the existence, use, or disclosure of his or her PII and can have access to it. However, the act also states that, “In certain situations, an organization may not be able to provide access to all the personal information it holds about an individual.” And, “Exceptions may include information that is prohibitively costly to provide, information that contains references to other individuals, information that cannot be disclosed for legal, security, or commercial proprietary reasons….” But, who determines that the information is prohibitively costly to provide? Why did providing it become costly, while collecting it was affordable? Also, who determines the commercial proprietary reasons?
• Conflict with other laws. An e-privacy law in one country may conflict with another law in another country, or even in the same country. For example, in 2004, British Columbia’s Information
and Privacy Commissioner released a report warning that Canadians’ privacy was at risk and that the USA PATRIOT Act violates British Columbian privacy laws (Information and Privacy Commissioner for British Columbia, 2004). The report clarifies that necessary changes to the British Columbian privacy laws are needed to protect British Columbians’ personal information from being seized under the controversial American law. A second example of conflicts of laws is the potential misuse of the Digital Millennium Copyright Act (DMCA), passed in the United States in 1998 by Congress (U.S. Government, 1998). Among other concerns, there is the concern that this law may be misused by some parties to violate individuals’ e-privacy. A copyright holder may use the DMCA subpoena to force an Internet service provider (ISP) to release PII of an Internet user based on a claim of copyright infringement. What if there is no actual copyright infringement? What if an irrelevant IP address was released by mistake? Could misuse or abuse be involved?
Security Challenges (Related to Providing Enough Security for Collected PII)
All e-privacy policies state that a collecting organization is responsible for protecting the PII it collects. The issue here is that there are no unified security measures. This raises several questions:
• Would senior management in all collecting organizations equally appreciate and understand the issue of security and be committed to spending on high security techniques and skills?
• What security techniques are enough to protect the collected PII (firewalls, authentication, anti-virus software, data encryption, etc.)?
• Would adopting high security measures conflict with individuals’ rights to access their stored PII?
• Would adopting some security techniques (e.g., authentication) undermine e-privacy?
Authentication refers to a set of techniques that may be used to verify that the user of a system is really who he or she claims to be (e.g., using a password known only to the person logging in). However, there are experts at breaking (simple) passwords, as well as software programs that assist in this task. The need for more secure authentication systems may require collecting more data from the user (e.g., answers to private and confidential questions) or using cookies that assist in identifying the computer used. While authentication can help protect e-privacy by making sure that those who access PII stored electronically are authorized to do so, it may also undermine e-privacy, as argued by Kent and Millett (2003), since it could result in authentication systems that:
• “Increase requests for identification,
• Increase the collection of personal information,
• Decrease the ability of individuals to understand and participate in data collection decisions,
• Facilitate record linkage and profiling, and
• Decrease the likelihood that individuals will receive notice of or have the right to object to third-party access to personal information.”
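For readers who want to see the basic mechanism under discussion, the following is a minimal, hypothetical sketch of password-based authentication in Python: the service stores only a salted, slow hash of the password, and at login it recomputes and compares the hash. The function names and parameters are illustrative assumptions, not a recommendation from this chapter.

```python
import hashlib
import os
import hmac

def register(password: str) -> tuple[bytes, bytes]:
    """Store a random salt and a slow, salted hash instead of the password itself."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return salt, digest

def authenticate(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """Recompute the hash for the supplied password and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return hmac.compare_digest(candidate, stored_digest)

salt, digest = register("correct horse battery staple")
print(authenticate("correct horse battery staple", salt, digest))  # True
print(authenticate("guess123", salt, digest))                      # False
```

Even this simple design illustrates the trade-off raised above: verifying identity requires the provider to hold additional data about the user (here, a salt and a password hash), and stronger schemes typically collect still more.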
Critical Issues in Managing E-Privacy
Adopting an e-privacy policy is not a matter of choice in some countries; it is a must. There is no question that more countries will pass laws that ask e-services providers to adopt clear e-privacy policies. However, establishing an effective e-privacy policy that is in compliance with applicable laws and acts to protect e-privacy requires the integration of the following guidelines at three levels: organizational, legal, and technical. On the organizational level, senior management needs a deep understanding and appreciation that having a proper e-privacy policy would actually benefit its e-service business. If an e-service provider is publicly recognized as not protecting users’ privacy, this would have a dramatic, damaging effect on its reputation and business. Management must be willing to spend generously on technology and skills to put e-privacy in place. E-privacy must be seen as adding value to the organization’s business and not as a barrier to it. Hence, the development of the e-privacy policy, its requirements, and its resources must be integrated within the organization’s overall business plan. The implications of the e-privacy policy and its implementation for the organization must be considered at the early stages of designing and developing the organization’s overall business plan so that the needed PII is properly identified and techniques for collecting, storing, processing, controlling, and transferring PII are properly implemented. A clerk responsible for managing the e-privacy policy must be understanding toward concerned individuals and accept that a concerned individual is entitled to have access to his or her PII record at his or her chosen time. The clerk must help the individual to the largest extent authorized by the applicable law or act. On the legal level, a deep understanding of the applicable laws on e-privacy is needed. Legal advisors must frequently revise the e-privacy policy. On the technical level, proper measures and technologies for data security must be adopted to protect PII from improper access while it is being collected, stored, used, processed, and transferred between servers and sites. Some guidelines that may help with data security and protecting e-privacy are listed next. Some of these guidelines may help e-services providers that collect, store, and transfer data electronically; others could be helpful to individuals seeking to protect their e-privacy while using an e-service or surfing the Web (the list of guidelines is not intended to be comprehensive or to guarantee full protection, but rather offers suggestions to consider):
• Use public key encryption (PKE) to collect sensitive data from individuals (with their consent) and for data flow between sites and servers (a brief illustrative sketch follows this list).
Encryption is a technique used to encode data so that it cannot be understood by anyone other than the intended recipient. Public key encryption has recently become a cornerstone of online business and e-services concerned with providing a high level of protection to data collected and transmitted electronically.
• Use authentication and authorization techniques for accessing stored data.
Authorization is finding out if an authenticated person has the privileges to access some classified data.
• Use encryption when storing data.
• Use antivirus software, and update it frequently.
Frequently use antivirus software to scan and clean computer disks and memory of viruses, worms, and Trojan horses that can cause serious damage to data and computer functioning.
• Use firewalls.
Use a firewall system (which could be hardware or software) to enforce an access control policy. Use it to protect networked computers from possible intrusion that may compromise e-privacy by restricting communication between the Internet and a networked computer that contains data to be protected.
• Prevent/control cookies.
Always check for cookies, block them, or at least be alert when a cookie will be placed on a computer hard disk, and delete unwanted ones. Many e-services Web sites place cookies on an individual’s machine to recognize those who revisit their sites. Cookies are small text files that contain some information (e.g., preferences of an individual when he or she visits that Web site). In principle, cookies do not automatically collect PII, but they can save PII provided by an individual with consent. While cookies were originally meant to exchange information (PII) with the Web site that sent them and for which the individual has given PII by consent (first-party cookies), other cookies (third-party cookies) can compromise e-privacy. Third-party cookies may track an individual’s online activities and send information about him/her to Web sites that the individual knows nothing about. Cookies can easily be blocked, removed, or protected against by using opt-out cookies.
• Consider anonymous Web surfing.
Anonymous surfing helps to protect e-privacy by making it difficult for Web sites visited to collect PII (e.g., IP address) or to track an individual’s online activities. The idea depends on not contacting the intended Web site directly but through a second site that uses an anonymous surfing proxy that will not allow the individual’s particulars to be passed to the intended site. But can an individual really trust the second site?
•
Consider secure e-mail. Some tools can help an individual to access, store, and send e-mail in an encrypted environment.
Copyright © 2006, Idea Group Inc. Copying or distributing in print or electronic forms without written permission of Idea Group Inc. is prohibited.
106 Shata
• •
Use proper tools to block spam, and filter incoming e-mail. Secure online communication. Encrypt TCP/IP communication such as instant messaging, HTTP, FTP, voicemail faxes, and streaming audio/video.
•
Frequently run privacy/security risk assessments to identify the greatest risk associated with unauthorized intrusion to sensitive stored data.
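As an illustration of the first guideline, the following minimal sketch shows how an e-service might encrypt a collected PII field with a provider's public key so that only the holder of the matching private key can read it. It assumes the third-party Python cryptography package; the key size, padding choices, helper names, and sample e-mail address are illustrative assumptions, not recommendations made in this chapter.

# Minimal sketch: encrypting a collected PII field with RSA public key encryption.
# Requires the third-party "cryptography" package (pip install cryptography).
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes

# The e-service provider generates a key pair once; only the public key is
# given to the components that collect PII from individuals.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

def encrypt_pii(field: str) -> bytes:
    # Anyone holding the public key can encrypt, but cannot decrypt.
    return public_key.encrypt(
        field.encode("utf-8"),
        padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                     algorithm=hashes.SHA256(), label=None))

def decrypt_pii(ciphertext: bytes) -> str:
    # Only the holder of the private key (the provider) can recover the field.
    plaintext = private_key.decrypt(
        ciphertext,
        padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                     algorithm=hashes.SHA256(), label=None))
    return plaintext.decode("utf-8")

token = encrypt_pii("alice@example.com")   # hypothetical PII value
print(decrypt_pii(token))                  # -> alice@example.com

In practice, the private key would be generated and kept only on the provider's protected servers, while the public key could be distributed to any component that collects PII.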
Approaches for E-Privacy Management

The increasing concern of individuals about exposing their e-privacy when accessing e-services has led researchers to investigate approaches for managing e-privacy. In particular, individuals have limited experience and resources compared to e-services providers; the latter have ample resources to develop and enforce their e-privacy policies, whereas individuals may even have difficulty understanding some lengthy e-privacy policies.

One of the leading approaches for e-privacy management is the Platform for Privacy Preferences Project (P3P) developed by the World Wide Web Consortium (W3C) (W3C, 2004). This is a protocol that may be used as an intermediary between Web sites and Internet users. A Web site can express its e-privacy policy requirements for each of its Web pages using the P3P language, specifying, for example, what PII is to be collected from a user (e.g., one page may not need to collect any PII; a second page may need to place cookies on the user's machine; a third page may ask for the user's e-mail, etc.). A user indicates what PII he or she is willing to release to Web sites, whether he or she would like to be notified at the time the PII is released, and other preferences (e.g., that disclosure of PII take place only over a secured communication channel) to an agent (usually a browser) that also understands the P3P language. The user can then visit Web sites and leave it to the agent to "negotiate" the indicated e-privacy preferences with the visited sites. The protocol is more a PII disclosure organizer than an e-privacy protector: if the user agrees to disclose his or her e-mail address, then whenever a Web page asks for it, the agent provides it, saving the user from typing it each time a page requests it. In this sense, P3P does not eliminate the need for other e-privacy protection measures (e.g., encryption). Nor is it an intelligent agent that would advise a user whether to trust a Web site or whether to release PII. An increasing number of e-services providers are adopting the P3P protocol, and commercial tools have been developed to help users declare their e-privacy preferences (e.g., the IBM P3P Policy Editor [IBM]).

A second approach to e-privacy is proposed by Tumer, Dogac, and Toroslu (2003). This approach introduces a framework in which a Web site classifies any PII it requests as mandatory or optional. Mandatory means the data is necessary for the service to take place (e.g., user name, contact telephone number). Optional means not necessary; it can also take the form of a rule (e.g., a certain data item such as e-mail is optional if the user
provides a telephone number; otherwise it becomes mandatory), or it can be absolutely optional data that has no effect on the service provided. A user can associate one of three permission levels with each of his or her PII items: free (may be released unconditionally), limited (may be released only if mandatory), or not-given (may not be released). Permission levels are declared in a context ontology. Services are organized in a hierarchical node structure, and a collection of privacy rule sets is associated with the nodes of a service ontology. General principles govern the release of data (e.g., any service node inherits the permission definitions associated with a higher service node in the hierarchy, and specializations override generalities). The main goal of this approach is to disclose only the minimal data needed by an e-service provider: a user's agent stores the user's preferences and negotiates with a visited Web site to disclose minimal data (a minimal sketch of this disclosure decision is given below). Again, this framework does not eliminate the need for other e-privacy protection measures (e.g., encryption), and it is not an intelligent agent that would advise a user whether to trust a Web site or whether to release PII. Work in the area of e-privacy protection and e-privacy agents is active; other related work includes authentication services such as Microsoft Passport (Microsoft).
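The disclosure rule just described can be made concrete with a small sketch. The dictionaries and function below are invented for illustration and are not part of the framework of Tumer, Dogac, and Toroslu (2003); they simply show how a user agent could decide, item by item, whether a PII item may be released.

# Illustrative sketch of the permission-level matching described above.
# Permission levels: "free", "limited", "not-given"; request necessity:
# "mandatory" or "optional". All names and values are invented examples.

permissions = {
    "name": "free",          # may be released unconditionally
    "telephone": "limited",  # may be released only if mandatory
    "e-mail": "not-given",   # may never be released
}

request = {"name": "mandatory", "telephone": "mandatory", "e-mail": "optional"}

def may_disclose(item: str, necessity: str) -> bool:
    # Default to the safest level when an item has no declared permission.
    level = permissions.get(item, "not-given")
    if level == "free":
        return True
    if level == "limited":
        return necessity == "mandatory"
    return False

released = {item: may_disclose(item, necessity) for item, necessity in request.items()}
print(released)   # {'name': True, 'telephone': True, 'e-mail': False}

A fuller implementation would also walk the service hierarchy so that a node inherits the permissions of its ancestors, with more specific definitions overriding general ones.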
Spamming

Our discussion so far has focused on e-privacy invasion and protection in the context of collecting, using, distributing, or accessing individuals' PII without their consent. A second dimension of e-privacy invasion is spamming: the sending of unsolicited e-mail. Commercial organizations send bulk e-mail for advertising and business purposes, possibly containing material that some individuals find offensive. What are the differences between receiving uninvited advertisements in e-mail inboxes and receiving uninvited advertising flyers in regular mailboxes? One difference is the much greater quantity of spam e-mail. Another is that a spam e-mail may carry a virus (perhaps unintentionally) that can wreak havoc with a computer. In most cases, spam e-mail does not include viruses, worms, Trojan horses, or spyware that steal PII or destroy files, but it is still a form of e-privacy invasion and can cause several problems, such as:
• Wasting an individual's time reading junk e-mail,
• Forcing content on a recipient that he or she may find offensive, and
• Quickly filling the recipient's inbox quota, possibly causing important e-mail not to be delivered.
Many spam systems scour the Internet for any e-mail addresses visible on Web pages (a simple illustration is given below). Other systems guess descriptive group names such as staff@..., faculty@..., or users@... combined with the domains of large organizations. A third type of spam system relies on hackers who break into the e-mail directories of individuals, organizations, and newsgroups and copy the e-mail addresses found there; advertising materials are then sent to the copied addresses.
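The address-harvesting pattern mentioned above is easy to picture with a short sketch; the page text and addresses below are invented examples.

# Illustrative sketch of harvesting visible e-mail addresses from page text.
import re

EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

page_text = """
Contact our sales team at sales@example.com or the webmaster at
webmaster@example.org. Obfuscated forms such as 'info at example dot com'
are not matched by this simple pattern, which is why some sites publish
addresses that way.
"""

print(EMAIL_PATTERN.findall(page_text))
# ['sales@example.com', 'webmaster@example.org']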
A more harmful type of spam system relies on hackers taking control of individuals' computers and using them to send spam. Unfortunately, there is no guaranteed way to fully fight spam. Fighting spam can take place at the individual level or at the ISP level. Individuals may fight spam by installing anti-spam software that checks e-mail messages for patterns that are frequent in spam (e.g., "lowest mortgage ever," "right time to buy," etc.); a minimal sketch of this keyword matching and the false positives it can produce is given at the end of this section. Such a system collects what it believes to be spam, stores it in an area of the e-mail server's hard disk for a pre-defined period of time (e.g., two weeks), and reports the addresses and headers of the assumed spam to the recipient in the form of a list. The recipient has to go through the list and decide whether he or she would like to retrieve any of the blocked e-mails. This is not very effective, as the recipient still has to spend time checking the list and may find it difficult to judge an e-mail from its header alone. Some servers try to protect their clients by checking received e-mails before forwarding them.

ISPs fight spam with several approaches, such as spam filters. Spam filters work by blocking e-mail received from IP addresses known to send spam, blocking e-mail addressed to more than a maximum number of recipients, or checking e-mail content for words and structures known to be used by spammers. However, the spam filter approach may cause problems such as the "false positive," where valid e-mails are blocked and never delivered to their intended recipients. According to Loren McDonald (McDonald), "A recent study by Return Path indicated that approximately 12% of all e-mail messages sent to valid e-mail addresses at the top nine ISPs and Web mail service providers did not end up in recipients' inboxes as intended." This is largely due to spam filters. The problem of false positives may itself be considered a form of e-privacy invasion: spam filters are used to protect users from spam, which is one form of e-privacy invasion, yet they may fail to deliver valid e-mail, which is another form of e-privacy invasion affecting both the sender and the recipient.

It seems that there is no fully effective way to fight spam. Spammers use several techniques to trick ISPs and make it difficult for recipients to identify and report them (e.g., hiding addresses, using decimal/hexadecimal addresses, redirection). In response to concerns over this growing problem, several countries (e.g., the U.S. and the EU) have considered issuing applicable laws. However, it is not clear how to fight spammers outside the boundaries of those countries, since spammers can send spam from any location in the world to any other location in the world.
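As noted above, here is a minimal sketch of the keyword-matching idea behind many content-based spam filters and of how it produces false positives. The pattern list and messages are invented examples, not taken from any real filter.

# Illustrative sketch of naive keyword-based spam filtering and false positives.

SPAM_PATTERNS = ["lowest mortgage ever", "right time to buy", "act now"]

def looks_like_spam(message: str) -> bool:
    # Flag a message if it contains any known spam pattern (case-insensitive).
    text = message.lower()
    return any(pattern in text for pattern in SPAM_PATTERNS)

messages = [
    "Lowest mortgage ever!!! Act now and save thousands.",        # real spam
    "Hi Sam, is this the right time to buy the office printer?",  # legitimate
]

for msg in messages:
    print(looks_like_spam(msg), "-", msg)
# The first message is caught, but the second, legitimate message is also
# flagged: a false positive that would never reach its intended recipient.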
Internet Censorship

A third form of e-privacy invasion is Internet censorship (IC). IC refers to installing software on computers to restrict Internet surfing. Some parents use IC software (e.g., McAfee Office: Guard Dog) to limit their children's access to Web sites considered inappropriate (e.g., sites containing material that promotes drugs, discrimination, or violence). In addition, some schools and public libraries use IC for
similar reasons; this is understandable. However, IC is used on a wider scale by some authorities and governments that install software on ISPs' servers to prevent all individuals in a community or country from accessing certain Web sites (e.g., political Web sites). This may be considered by some parties (e.g., human rights and civil liberties groups) and individuals as a form of e-privacy invasion. IC software adopts several techniques, such as blocking Web sites that appear on a list of sites known to offer offending material, or blocking Web sites based on a list of banned words.

This section aims at briefly highlighting various aspects of IC, not at arguing its appropriateness. First, countries that argue in favor of some IC adopt laws to restrict access to Web sites that offer Internet content conflicting with their off-line laws; Electronic Frontiers Australia (2002) has published a comprehensive report on Internet censorship laws and policies around the world. Second, individuals who believe that IC violates their e-privacy and their right to surf the Internet freely seek techniques to bypass it; in addition, restricted Web sites seek techniques to be accessed, as they consider the restriction a violation of their e-privacy in online publishing. Freerk (2003) provides a lengthy and detailed discussion of methods of censorship and ways to bypass IC. Third, technically speaking, restriction based on a list of banned words may limit access to pages that are actually scientific, legal in nature, or even intended for children (e.g., preventing access to Web pages on sexual harassment policies because of the word sexual, or on adult education because of the word adult); a minimal sketch of this over-blocking problem is given below. There is an interesting report by the Electronic Privacy Information Center discussing how content filters may block access to kid-friendly Web sites (Electronic Privacy Information Center-1, 1997). In addition, the banning may slow down the loading of Web pages, which have to be analyzed first.
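The over-blocking problem noted above can be illustrated with a small sketch. The banned-word list and page titles are invented examples; real filtering products use far larger lists and more elaborate rules, but the failure mode is the same.

# Illustrative sketch of banned-word blocking and the over-blocking it causes.

BANNED_WORDS = {"sexual", "adult", "drugs", "violence"}

def is_blocked(page_text: str) -> bool:
    # Block a page if any banned word appears anywhere in its text.
    words = {w.strip(".,!?:;").lower() for w in page_text.split()}
    return bool(words & BANNED_WORDS)

pages = [
    "University policy on sexual harassment and how to report it",
    "Adult education evening courses: enrolment information",
    "Gardening tips for spring",
]
for page in pages:
    print(is_blocked(page), "-", page)
# The first two pages are blocked even though both are legitimate; only the
# third, unrelated page gets through.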
E-Privacy in Private Networks

E-privacy in private networks is an expression mostly used to refer to employers monitoring employees' online activities at the workplace. From the employees' point of view, this monitoring violates their e-privacy. Employers, on the other hand, claim that such monitoring is essential for the benefit of the workplace. According to a CNN report published in 2000 (CNN, 2000): "A recent survey by the American Management Association finds 54% of companies said they monitored their employees' Internet connections, while 38% said they reviewed worker e-mail messages." To the best of my knowledge, there is no current legislation that specifically addresses this conflict of interests. However, most courts' interpretations of privacy laws, when applied to e-privacy cases, support employers. The employers' main point is that they purchased the computers to be used by employees who are paid to work a certain number of hours per working day. There is no difference between an employee who is
absent and one who comes to work but wastes his or her time surfing the Web for personal purposes. In addition, some employees may use their workplace computers for harassing activities that could expose their employers to legal liability. This area of e-privacy is still evolving, with many debates and many open questions: Should there be monitoring in the first place? Should the monitoring be occasional or continuous? Should all employees be monitored, or a random selection, or only those with low productivity? Should employees get relief from monitoring during lunch breaks? Is there a difference between monitoring Web surfing and monitoring e-mail messages? Does it make a difference whether an employee is using the organization's e-mail account or a personal e-mail account? Should monitoring of e-mail focus on content, the time taken to compose it, the sender, and/or the recipient? Would an employer show some tolerance if an employee is doing some online e-learning to improve his or her qualifications rather than surfing for adult content? In the absence of a specific law governing e-privacy in private networks, what other laws or acts might be applied, either by the employer or by the employee: human rights acts, online data protection acts for public networks, communication acts, information privacy acts? At this time, and until the matter is settled, most advice to employers focuses on the need to have an explicit monitoring policy and to tell employees about the intention to start monitoring them. The best advice to employees, on the other hand, is to expect that their online activities are monitored and to be prepared to justify them; or, better yet, to leave their e-privacy outside before entering the workplace and claim it back on the way out.
Future Work

The area of e-privacy has three main dimensions: legal, organizational, and technical. Each of these dimensions has many aspects that require further work. The legal dimension needs new laws to govern e-privacy issues between employees and employers. In addition, the current laws, being relatively recent, must be assessed for possible improvements, and a framework for international legal collaboration is required to fight spam. The organizational dimension also has many aspects that need further work. It is not enough for each organization to have a clear e-policy for monitoring employees; the policy must balance and cater to both the employees' and the organization's interests. Employees working under the pressure of e-privacy invasion and continuous monitoring may not be able to think and act freely and naturally, and this may affect productivity. In addition, some employees may spend many hours at work, probably more than they spend at home or in private places, and they need to access their private e-mail at the workplace. In the same way that employers show tolerance when employees use telephones and faxes at work, some tolerance is also needed for online activities. Future research could investigate what may be considered an acceptable limit of tolerance. Future work may also look at how multinational
organizations may cope with multiple e-privacy laws, and at determining a minimum set of security measures to be recommended and used by all organizations. The technical dimension is under continuous development. Work is under way to improve restricting software so that it blocks only offending sites and not friendly ones as well; more accurate blocking techniques and algorithms are needed, as are more advanced techniques to fight spam. Providing adequate security for collected PII against unauthorized access during storage, processing, and transfer is the focus of much current research (e.g., is it better to provide security at the application level or at the IP level?). Current research also focuses on developing software tools that individuals can use easily to enforce and guarantee their e-privacy policies.
Conclusions

With e-services on the Internet growing at an unprecedented rate, the issue of e-privacy is receiving growing attention as well. E-privacy is a difficult term to define, as it is seen differently by various stakeholders with conflicting interests: governments, individuals, commercial organizations, legislators, liberty advocates, and so on. This chapter has introduced key areas of e-privacy that are receiving significant research attention, including the nature of and critical need for e-privacy, the relationship between e-privacy and electronic security, e-privacy policies, legal and technical aspects and challenges, e-privacy management approaches (forms, models), and e-privacy considerations in public and private networks. As the topic is still unsettled, one may expect to see increasing research and debate in these areas.
References

Alberta Government. (2003). Personal Information Protection Act, S.A. 2003, c. P-6.5. Retrieved February 18, 2005, from http://www.psp.gov.ab.ca/index.cfm?page=legislation/act/index.html
CNN. (2000). More employers taking advantage of new cyber-surveillance software. Retrieved February 28, 2005, from http://archives.cnn.com/2000/US/07/10/workplace.eprivacy/
Cobb, S., & Cobb, C. (2004). Florida's ID Theft Kit. Retrieved February 25, 2005, from http://www.cobb.com/help/art-florida.htm
Cranor, L. (2002). Web privacy with P3P. USA: O'Reilly & Associates.
Cranor, L., Langheinrich, M., Marchiori, M., Presler-Marshall, M., & Reagle, J. (2002). The Platform for Privacy Preferences 1.0 (P3P1.0) specification. Retrieved February 24, 2005, from http://www.w3.org/TR/P3P/
Department of Justice - Canada. (n.d.). Privacy provisions highlights. Retrieved February 3, 2005, from http://canada.justice.gc.ca/en/news/nr/1998/attback2.html
Electronic Frontiers Australia. (2002). Internet censorship: Law & policy around the world. Retrieved February 23, 2005, from http://www.efa.org.au/Issues/Censor/cens3.html#intro
Electronic Privacy Information Center. (1997). Surfer beware: Personal privacy and the Internet. Retrieved February 15, 2005, from http://www.epic.org/reports/surfer-beware.html
Electronic Privacy Information Center-1. (1997). Faulty filters: How content filters block access to kid-friendly information on the Internet. Retrieved February 5, 2005, from http://www2.epic.org/reports/filter-report.html
Federal Privacy Commissioner (Australia). (1988). Information privacy principles under the Privacy Act 1988. Retrieved February 11, 2005, from http://www.privacy.gov.au/publications/ipps.html
Freerk, O. (2005). How to bypass Internet censorship. Retrieved February 12, 2005, from https://ssl-account.com/zensur.freerk.com/ (European mirror: http://www.zensur.freerk.com/)
Government of Canada. (n.d.). The Personal Information Protection and Electronic Documents Act (Unofficial Version). Retrieved February 6, 2005, from http://www.privcom.gc.ca/legislation/02_06_01_01_e.asp
Government of Canada-1. (n.d.). The Personal Information Protection and Electronic Documents Act (Official Version). Retrieved February 18, 2005, from http://www.parl.gc.ca/36/2/parlbus/chambus/house/bills/government/C-6/C-6_4/C-6_cover-E.html
Holborn, F. (2003). Theft happens: Data security for intellectual property managers. Retrieved February 15, 2005, from http://ipsociety.net/psiframe-ip-security.pdf
IBM. (n.d.). P3P policy editor. Retrieved February 11, 2005, from http://www.alphaworks.ibm.com/tech/p3peditor
ID Theft, Schemes, Scams, Frauds. (n.d.). Identity theft examples using social engineering and phone phishing techniques. Retrieved February 11, 2005, from http://www.crimes-of-persuasion.com/Crimes/Telemarketing/Inbound/MajorIn/id_theft.htm
Information and Privacy Commissioner for British Columbia. (2004). Privacy and the USA Patriot Act: Implications for British Columbia public sector outsourcing (USA Patriot Act threatens Canadians' privacy). Retrieved February 10, 2005, from http://www.oipcbc.org/sector_public/usa_patriot_act/pdfs/report/privacyfinal.pdf
Kent, S., & Millett, L. (Eds.). (2003). Privacy challenges in authentication systems. In Who goes there?: Authentication through the lens of privacy. USA: The National Academies Press.
McDonald, L. (n.d.). Why 12% of your e-mails are not reaching their intended recipients. Retrieved February 18, 2005, from http://www.emaillabs.com/articles/email_articles/article_unknownbounces.html
Microsoft. (n.d.). Microsoft Passport. Retrieved February 20, 2005, from http://www.microsoft.com/myservices/passport
Office of the Privacy Commissioner of Canada. (n.d.). Privacy legislation. Retrieved from http://www.privcom.gc.ca/legislation/index_e.asp
Roberts, L. (2003). Personal privacy and the Internet. Info Tech Talk, 8(3), 2-3. Retrieved February 7, 2005, from http://www.ndu.edu/irmc/elearning/newletters/newletters_pdf/itt0603.pdf
The European Commission. (2002). Directive 2002/58/EC of the European Parliament and of the Council of 12 July 2002 concerning the processing of personal data and the protection of privacy in the electronic communications sector (Directive on privacy and electronic communications). Retrieved February 6, 2005, from http://europa.eu.int/comm/internal_market/privacy/law_en.htm
Tumer, A., Dogac, A., & Toroslu, H. (2003). A semantic based privacy framework for Web services. Retrieved February 22, 2005, from http://www.srdc.metu.edu.tr/webpage/publications/2003/TumerDogacToroslu.pdf
UK. (1998). Data Protection Act. Retrieved May 14, 2005, from http://www.hmso.gov.uk/acts/acts1998/19980029.htm
U.S. Department of Commerce. (2000). Safe Harbour Agreement. Retrieved May 14, 2005, from http://www.export.gov/safeharbor/
U.S. Government. (1998). The Digital Millennium Copyright Act of 1998 - U.S. Copyright Office Summary. Retrieved February 13, 2005, from http://www.copyright.gov/legislation/dmca.pdf
VanderLeest, S. H. (Ed.). (2001). Being fluent and faithful in a digital world. Calvin College. Retrieved February 18, 2005, from http://www.calvin.edu/academic/rit/webBook
W3C. (2004). Platform for Privacy Preferences Project (P3P). Retrieved February 19, 2005, from http://www.w3.org/P3P/
Yahoo. (n.d.). Yahoo! Privacy Center. Retrieved May 14, 2005, from http://privacy.yahoo.com/privacy/us/pixels/details.html
Section II: Privacy Protection From Security Mechanisms and Standards
Chapter V
Privacy Protection Through Security

Martine C. Ménard, Policy Research Initiative, Canada
Abstract

This chapter discusses how implementing network and computer security can protect the privacy of Internet users. It argues that personally identifiable information is valuable to clients and businesses alike and that, therefore, both are responsible for securing privacy. They must understand the vulnerabilities, threats, and risks they face, what information requires protection, and from whom. Businesses must also understand the business issues involved in securing data. Finally, security measures should be a strong mix of technological, physical, procedural, and logical measures, each implemented in overlapping layers. Proposed solutions must be flexible, meet the objectives and business goals, and be revised on a regular basis. The author hopes that by understanding the proposed security solutions, readers will be able to implement steps to protect their own privacy or their clients' privacy.
Introduction

The popularity of the Internet, with its many available e-services, is bringing online more people who simply want to browse or to use services such as online banking or online shopping anonymously. However, their privacy as individuals is at stake, since a great deal of information can be and is collected without their knowledge or consent. Much of this collection is done stealthily through cookies, logs on visited Web sites, or the silent installation of viruses, spyware, and Trojan horses, often delivered through e-mail, drive-by downloads, or infected but otherwise legitimate downloads. Organizations such as department or retail stores, banks, government departments, or any businesses with online services are in a position to collect much information about individuals by offering consumers savings or convenience in exchange for personal information. These processes infringe on privacy because they remove customers' rights to be left alone and not be profiled, to be free from surveillance, and to control the information they share. Furthermore, privacy is breached when customers do not know what is being collected about them, how long it will be kept, the purposes of collection, and whether it will be shared with third parties.
The term "hackers" can describe both good and bad individuals. Hackers use their programming, analytical, and problem-solving skills either to expose vulnerabilities and security issues and promote solutions, or to exploit their knowledge of systems' vulnerabilities for their own gain. What distinguishes them, therefore, is the intent and goal behind their actions. The Oxford dictionary supports this by defining hackers as: (1) programmers that use their skills for good, or (2) individuals who use computers to gain unauthorized access to computer networks. For the purpose of this chapter, hackers will be treated as individuals with bad or evil intentions who use their skills and computers for unauthorized, illegal, or criminal activity against people, organizations, other computers, or networks.
Organizations are also vulnerable to privacy issues because of their increased reliance on digital data and services. A security and privacy breach could anger customers and shareholders, make share prices drop, remove competitive advantages, and threaten the organization's existence. Organizations are even more at risk of being targeted because their information resources are more attractive. An attacker who gains illegal access to their systems not only gains prominence in the hacking community but could also obtain thousands of credit card numbers, usernames and passwords, and other information useful for committing identity theft. Therefore, organizations need to put security safeguards in place to protect themselves from liabilities, protect shareholders' investments, and assign accountability. Both organizations and home users have a responsibility for protecting privacy. First, users must believe that their information is valuable, and they must act as if it is. This includes implementing security measures and forfeiting conveniences or savings when the exchange is not right or equal. Organizations must also implement security measures
since they are the most likely targets of hackers. But security is not all that organizations must do to gain the trust and confidence of Internet users; they must also take steps that make them accountable and transparent about what they can and cannot do with the information they collect. Finally, securing the privacy of individuals and organizations that use or implement e-services is much more than setting up firewalls, antivirus software, or applying patches and updates. It is understanding what needs to be protected, how it needs to be protected, and from whom. It is understanding the vulnerabilities, threats, and risks that exist and their effects on particular resources and data. It is understanding how different security measures work together and what their weaknesses are, and implementing the best strategy for minimizing each risk.
Security ≠ Privacy

Security is not the same as privacy, nor does implementing sound security practices guarantee that privacy will be achieved. This is because privacy is most concerned with identifiable user data and users' rights to control what can be collected about them, what it can be used for, and to whom it may be disclosed. The only way organizations can protect user data from misuse is by implementing policies, standards, and fair information practices (Cavoukian, 2005). On the other hand, privacy cannot be obtained without security, since security provides the physical, logical, and procedural safeguards needed to keep the information private.
This chapter provides tangible solutions and implementation strategies that will help secure networks and minimize the threats and risks that exist. By implementing these solutions, both organizations and home users begin to protect their privacy. The chapter first provides some background on privacy and how digitalization is affecting it. It then deals with the following security issues: the role of security in protecting privacy, threat and risk analysis, the role of security policies, and methods of securing the infrastructure. Finally, a short overview of future trends in privacy and security is given.
Background

The Oxford dictionary defines privacy as: (1) being private, undisturbed, and (2) being free from intrusion or public attention. Baase (2003) expands this definition by including the right to be free from surveillance and to control our own information. Privacy can also be defined as personal information that we wish to keep confidential and away from third parties (Cady & McGregor, 2002; Ghosh, 2002). Most of the information we want to keep private falls into two categories: data properties and behavioural characteristics. Data properties include name, address, phone number, job title, height, and any other information we believe makes us unique individuals. Behavioural characteristics include such things as the schedules we keep, the shops we visit, the banks we do business with,
purchases we make, sites we browse, and the length of time we remain there. As soon as the collection of information becomes intrusive, is used for surveillance or to create profiles of our habits, or is used to send us coupons or target us for specific marketing schemes, organizations are breaching our privacy. Our privacy is also compromised when our personal information is shared or used in ways that we did not expect or without our consent.

In today's society, which is always hungry for more information, people need to take responsibility for their own privacy. A good starting point is awareness and education. Both of these strategies help people understand what happens once they have shared their information and the threats that surround this sharing, and give them the knowledge and confidence to say no to certain requests. Hyatt (2001) and Meeks (2000) believe that we are our own worst enemy, since we freely volunteer information in exchange for convenience and savings. For example, we give up some of our privacy when we fill out ballots for the chance to win the trip of a lifetime, fill out surveys or warranty questionnaires, or enable our browser to remember our passwords. Rarely do we think about what other organizations could do with our information. According to Bellotti (1998) and Ghosh (2001), one of the biggest issues with online privacy is not knowing what information is being collected about us. Who uses it? For what purposes can it be used? Where is it stored and for how long? Who has access to it? What measures are in place to secure it? How are we transmitting or conveying it? And finally, what does it look like? The sad part is that once we have given our information away, we can never get it back, and we no longer own it; the collecting organizations can do whatever they want with it (Meeks, 2000).

It is becoming harder to interact with strangers without sharing some sort of information (Baase, 2003). Without this exchange, banks may not want to lend us money, since they could not verify whether we have secure jobs, check our credit history, or judge whether we are likely to repay them. Landlords may not want to rent us apartments if we cannot provide good character references indicating that we will pay the rent on time and will not damage their property. In an off-line society, as is evident in small towns, everyone knew one another, so the exchange of private information was not necessary, since much of it was already known: we always went to the same branch for our banking needs, and the store clerk knew where we worked because he or she knew our kids or lived down the block from us. In big cities, however, we became more anonymous and therefore started to trade some personal information to obtain certain services and privileges, such as providing our driver's license number, address, and phone number when writing checks. This information was necessary to establish trust.

In the past, we did not worry too much about privacy because the information was not in a digital format; for stores and other organizations, building profiles would have been very time consuming and hard. The information they collected was for their own protection: if we did not pay, they could track us down to demand payment. In an online world, however, information is already in a digital format, and much of it can be collected easily and effortlessly without customer consent or knowledge. This information can also be traced back to one specific person.
The ease of gathering information and using it for intrusive purposes, such as creating profiles, keeping people under surveillance, and analyzing their shopping habits so that they can be better targeted to spend more, is the basis of the privacy debate.
The problem is that organizations collect much more information than is really needed, just because it is easy to do so, without letting customers know what will be done with it (Cady & McGregor, 2002; Ghosh, 2001). Furthermore, with all the malicious software that exists to collect personal information and the new threats that keep appearing, it is safe to say that in an online world no one is anonymous. However, Hyatt (2001) believes that people can still have total anonymity if they are willing to put forth the effort. How much privacy they will have depends on their lifestyle, financial resources, values, awareness of the problem, how many sacrifices in savings are acceptable to them, and how much work they, as individuals or families, are willing to put in. It seems that privacy is inversely proportional to convenience and savings: the more privacy we have, the less convenience and savings we have, and vice versa.

Hyatt's vision and some of his solutions described below are rather drastic and too difficult for the average person to achieve. For example, he encourages people to use a drop box instead of a street address; to buy a house, transfer it to a trust, and put all utility bills under it; and to have two phone numbers, the first public and hooked to a voice mail where messages can be picked up at a convenient time, and the second, unlisted, reserved for family and a few close friends. Hyatt also suggests creating a new identity, with a credit card and e-mail address, and using this identity for most of one's interactions. It can easily be seen that for the average person this is not achievable, since it is expensive, time consuming, and very inconvenient. This total anonymity means a total change in lifestyle, habits, and the way we interact with people. With one slip of the tongue, by you or by someone who knows your real identity, all your hard work could be destroyed. This fake identity could also be tracked and profiles created from it. Finally, organizations or people that truly want to find you would do so, since there will always be weak points that can be exploited. Hyatt (2001) understands this difficulty in obtaining total anonymity and instead proposes that we become much more aware of the information we share on a daily basis and guard it as something truly valuable. We need to start asking why organizations need that information and what it will be used for, and we should not be afraid or intimidated to refuse to give it out. In that case, we need to be ready to give up our purchases or extra savings, but ultimately the choice is ours. To have privacy, we need to take control of our information and not be so naïve as to think that organizations can protect it better. If we do not protect it or are not careful with the information we disclose, then why should organizations take our privacy seriously?

Scott McNealy of Sun Microsystems, on the other hand, believes that we have absolutely no privacy, and we should learn to live with it (Sprenger, 1999). His statement is probably closer to reality than we would like to think. Even if we still have some privacy left, it is eroding quickly: by using credit cards and cell phones, crossing international borders, making reservations, driving, going online, and having pizza delivered to our doors, we compromise our privacy (Baase, 2003). Cady and McGregor (2001) also support McNealy and believe that total anonymity is hard to achieve because we live in a world that is hungry for information.
This information is needed for everyday transactions, and we are quick to pass our personal information along to obtain faster service without really thinking about the consequences. We must stop thinking that privacy needs to be all or nothing. Each case should be evaluated individually, based on what information
is shared, whether this information is freely or easily available, what it will be used for, and the reputation of the organizations with which we deal.

Much of our personal information has been publicly available to third parties for a long time without much privacy concern on our part, because it was hard for them to derive useful value from the available information: the information was segregated in different databases with no easy link between them. The phone book is a good example of how much the Internet has changed the accessibility and ease of use of our information. A few years ago, if you were looking for someone in a different city, you needed a phone book for that city, and if you did not know where the street was located, you had to find a map and locate the address. The Internet has simplified this process and made searches much more efficient and rapid. We can find just about anyone, anywhere, with just a few clicks of the mouse, by name or by phone number. Many people directories also have a map function, so that with an extra click you know exactly where a person's house is located and can view the neighbourhood through satellite imagery.

Technology, with its evolution, rapid growth, and development, has changed the ways organizations process personal information and data and has therefore changed our view of privacy. The technology behind the Internet and e-services helps in the collection and storage of information. Also, because of the nature of the Internet, users are not aware of what is or can be collected, what purposes it serves, or what might be oozing out of their PCs without their knowledge. This digital medium also enables easy access to large amounts of information and makes its distribution easy (Baase, 2003). Once information has been gathered, it becomes easy to search and analyze. Organizations can now merge their information with other databases or link to them and, with the advances in data mining tools, can easily create profiles of their clients, which can be sold to other organizations that want to target specific people. With the help of good data mining tools, organizations can also build many different profiles of a client: information that was once disparate and independent can now be combined, and organizations can draw conclusions about the buying patterns of their customers.

Even though technology is playing a big role in the erosion of our privacy, we cannot blame these problems solely on it (Hyatt, 2001). Many of the problems stem from our naïve eagerness to embrace technology before we have truly understood and tamed it. What we saw as good has been pushed to the limits and used in unexpected ways. The technology that has made it simple to find long-lost friends has also made it simple for organizations to manipulate markets, for governments to increase surveillance, for hackers to commit credit card fraud, for crooks to carry out identity theft, and for disgruntled friends, coworkers, and strangers to stalk their prey. There have also been many instances where our personal information has been exposed to many eyes because of computer glitches and software updates gone wrong. We cannot hope for organizations to keep our information private; we need to take steps to protect it ourselves. Cady and McGregor (2002) believe that if people keep giving out personal information for convenience and greater savings, then chances are that very good profiles are being built about them.
These profiles can then be sold to anyone wanting to target a specific set of consumers. Customers have not yet realized that their information has value, not only to the organizations collecting it and their partners, but also to themselves. How much value it has depends on the individual, the information being shared, the venue in which it is being collected, and whether the information is readily available. But one thing is certain: we all
have the right to be free from profiling, the right to be left alone, and the right not to be bombarded with telemarketing phone calls and spam at any time of the day or night. We should also feel that we have a choice in controlling our information and not be intimidated into giving it out (Baase, 2003; Ghosh, 2001; Meeks, 2000).
The Role of Security in Protecting Privacy

Security helps protect our privacy by reducing our risk of exposure and by protecting organizations' assets such as client information, intellectual property, strategic goals, and financial statements (Cady & McGregor, 2001; Canavan, 2001; Gollmann, 1999). Without proper protection of these assets, consumer confidence and trust can quickly disintegrate, making customers look for other solutions to their needs, such as returning to off-line services or taking their business elsewhere, and affecting organizations' bottom lines (Delio, 2005; Hare, 2002). Therefore, a relaxed attitude toward security is never acceptable, since publicity about a privacy breach can quickly destroy an organization and remove its competitive advantage.

Security also ensures the confidentiality, integrity, and availability (CIA) of data. Confidentiality is important for privacy since it assures that data available on the network is read and used only by authorized employees; it also prevents unauthorized disclosure of information. Integrity deals with the information itself, ensuring that it has not been altered or modified by unauthorized people. Finally, availability ensures that both data and systems are in place and usable for the purposes for which they were created. Table 1 summarizes the differences between privacy and security and their respective roles in protecting data and information.

Even if organizations promise privacy yet have not implemented any security measures to protect their databases and other assets, privacy will not be achievable, since hackers could easily access the information from the Internet and sell it (Ghosh, 2001; Scalet, 2005). Furthermore, as databases of information grow and are consolidated, hackers will be more tempted to break in and obtain information for identity theft, credit card fraud, or phishing scams (Delio, 2005). Also, without security measures, people off the street could walk into organizations and freely obtain confidential information. However, Ghosh warns that security alone does not provide privacy: without a strong privacy policy that states what can and cannot be done with the information collected, organizations could simply sell their information to third parties. Finally, appropriate technology and procedures need to be in place to maintain and enforce the policy.

Security cannot be defined only in terms of technologies, since not all solutions are applicable to all organizations and users. Security is as much about understanding business issues, the information that needs protection, and the human factors involved as it is about implementing products (Parker, 1998). Security is not a fixed objective that can be achieved, but rather a process that is always evolving as businesses change (Day, 2003).
Table 1. Differences between privacy and security

Privacy is concerned with:
• Protecting identifiable user data
• How identifiable information is collected and kept
• Fair use practices
  o Who can use the information?
  o For what purposes can the data be used?
  o Where is it stored and for how long?
  o Who has access to it?
  o What measures are in place to secure it?
  o How is it transmitted?
  o What rights and access do users have to it?
  o Who is accountable?
  o Are users asked for consent before their information is used?
• Standards and policies

Security is concerned with:
• Reducing risk of exposure
• Understanding business issues and their risks
• Minimizing human factors and their risks
• Protecting data, information, and systems
• Confidentiality: only authorized users can read or use the information
• Integrity: knowledge that the information has not been altered or modified by unauthorized users
• Availability: both data and systems can be used for the purpose for which they were created

Working together, privacy and security increase:
• User trust and confidence
• Accuracy of data
• Organization competitiveness
• Online services

Working together, privacy and security decrease:
• Likelihood of theft, both physical and electronic
• Misuse of information and data
• Leaks

Therefore, Privacy + Security = Freedom, where freedom is defined as freedom of choice, personal control of information that is given, and self-determination (Cavoukian, 2005). Freedom also provides users with more confidence that their information will not be used in unacceptable ways, thereby increasing their reliance on online services. Freedom provides organizations with incentives to carefully use the information they are given and to increase their online presence to better meet user needs and wants.
Furthermore, security can help home users and organizations protect their information by preventing viruses, malware, and spyware from being installed and used to access their computers and information. It is difficult to assess what information these programs send back home and what it is used for. They may be logging all keystrokes, hoping to obtain username and password combinations, credit card numbers, bank accounts, or e-mail addresses, or they might simply log all Web sites visited so that they can send appropriate pop-up ads or spam messages.
Table 2. Summary of threats to security and privacy

• Software: viruses, worms, and Trojan horses; spyware, malware, grayware; drive-by downloads; misconfiguration; software bugs; cookies
• E-mail: phishing; spam
• Users: hackers; user error
Even if we are careful about what applications we install and what attachments we open, we can still become infected through drive-by downloads: malicious programs found on Web pages that get installed on PCs just by viewing an infected page. Cookies also have the capability to collect information; they can track passwords and keep track of items in shopping carts even after the session has ended. Finally, the simple act of going online can provide hackers with much of the information they need to mount their attacks. With simple DOS commands and freely available tools, hackers can find our computer name, our IP address (from which our ISP can be determined), our operating system, and our browser version with its installed patches. Implementing good security measures can eliminate much of this information leakage. Therefore, the first step toward protecting our privacy is to secure our own PC and prevent as much of this information from leaking out as possible. A summary of security and privacy threats is given in Table 2.
Threat and Risk Analysis

Many are reluctant to spend time and money on a threat and risk analysis (TRA), thinking that they hold no valuable information and could not possibly be targets of hackers (Parker, 1998). However, with high-speed Internet connections, home users are now offering their PCs for all to see and use. Many hackers use less protected systems to practice and hone their skills before attacking organizations that present harder challenges but a much better payoff, in terms of information gained, if the attack succeeds.

Before continuing with the discussion, the following keywords need to be defined: vulnerability, threat, attack, and impact. Vulnerability is any weakness found in a system, whether through design, configuration, or implementation, that could be exploited. Every system, whether a standalone computer or one networked with thousands of other PCs, has some sort of vulnerability (Canavan, 2001; Hare, 2001; Summers, 1997). The degree to which this
vulnerability can be used by hackers depends on the availability of exploit code and on the hackers' technical knowledge. Threats are actions that could be taken, intentionally or otherwise, against systems to disrupt their proper operation by breaching existing security measures and exploiting their vulnerabilities (Hare, 2001; Stallings, 2003; Summers, 1997). Attacks are a series of deliberate activities performed to find and understand the weaknesses of systems and circumvent their security measures (Fried, 2001; Stallings, 2003); in other words, attacks are threats acted upon. They always have a direct impact on the confidentiality, integrity, and availability of the organization's information assets (King, Dalton, & Osmanoglu, 2001). Impact is the degree of harm that an organization could suffer if a threat were to materialize (Hare, 2001).

A TRA helps organizations understand the vulnerabilities they have, the threats they face, and the possibility that each one could be exploited. Once this information is known, they can better implement solutions to remove or minimize them while still meeting organizational needs. The analysis also helps in promoting solutions that are technologically and fiscally sound. Even though organizations are the most likely targets of attacks, home users are not exempt from doing TRAs, since their PCs could be used in attacks against organizations or could be infected by viruses, worms, or spyware that send their information to third parties. Understanding the vulnerabilities of their systems and how each could be exploited is therefore important.

Finally, threat and risk analysis is an ongoing process, since organizations and systems are not static. For example, organizations change direction, mandate, or priorities; staff come and go, bringing with them new requirements and leaving with passwords and knowledge of the network configuration. Furthermore, the configuration of equipment changes over time due to patches or new requirements; older equipment is replaced or upgraded, and new systems are added. All of these changes can create new vulnerabilities or open new security holes, but by performing TRAs regularly, many of these problems and other security risks that were not previously seen or thought about may be caught and fixed.

The risk of each threat being exploited should be evaluated against the destruction, modification, and disclosure of information and the impact this will have on the confidentiality, integrity, and availability of data. In terms of privacy, risk must be evaluated against the loss of trust and confidence of customers and how a breach will affect their use of e-services. Finally, it is important to note that without vulnerabilities, threats, and attacks, exploits do not exist.

There are four categories of threats that need to be taken into consideration when doing TRAs: natural, environmental, human, and malicious software, which are summarized in Table 3. Both natural and environmental threats are easy to identify and protect against. Natural threats include earthquakes, storms, fires, severe weather, and flooding, to name but a few. Environmental threats depend on the location where systems are housed; they could include the breakdown of air conditioning, water damage from a leaky pipe, hard drive failure, a power surge or brownout, and so forth.
Steps to protect systems against such threats include installing smoke detectors near equipment, doing backups and storing all backup media offsite, grounding equipment, and installing uninterruptible power supplies (UPS) (Steinke, 2002).
Table 3. Summary of threats and their protective measures

Natural
  Potential threats: earthquakes; storms; fires; severe weather; flooding
  Preventive measures: UPS; disaster recovery; backups; offsite storage of tapes; grounding equipment

Environmental
  Potential threats: air conditioning breakdown; water damage due to leaky pipe; hard drive failure; power surge; brownouts
  Preventive measures: UPS; disaster recovery; backups; offsite storage of tapes; grounding equipment

Human
  Potential threats: intentional (theft; fraud; espionage; sabotage); unintentional (user error; misconfiguration; carelessness; incompetence)
  Preventive measures: UPS; backups; user training; locked doors to equipment rooms; security badges; escort of contractors; user authentication; implementation of firewalls and anti-viruses; auditing; lowest right privileges and need to know; strong password policies; change control; documentation

Malicious Software
  Potential threats: needing a host (trap doors; logic bombs; Trojan horses; viruses); self-contained (worms; zombies); other (spyware; robots; phishing; software vulnerability; blended threats)
  Preventive measures: user training; firewalls; anti-virus; anti-spyware; lowest right privileges; mail filtering; attachment blocking; backups
uninterruptible power supplies (UPS) (Steinke, 2002). Although these threats cannot be considered malicious, they need to be taken into consideration since any of them could leave systems vulnerable to other, more dangerous and malicious attacks. Human threats are much harder to protect against since security depends more on user attitude than on the latest technology. Human threats are classified either as intentional
or unintentional. Intentional threats include theft, fraud, espionage, and sabotage, which can be carried out either by trusted employees or by hackers. It is important to have security measures in place to protect against these threats since they will affect the privacy of customers. Unintentional threats, on the other hand, are caused by careless, lazy, incompetent, indifferent, negligent, or angry employees, or by simple user error. Most of the damage involves the modification and deletion of files and can usually be recovered from with good backup strategies. Social engineering is also a human threat that should be taken seriously, since the purpose of its information gathering is always to breach systems and gain unlawful access to the information they contain. The best prevention against social engineering is awareness and training.

Malicious programs are now becoming the number one threat to the privacy and security of information. According to Symantec (2004), the time from when a vulnerability is discovered to when malicious code exploiting it is written and made available on the Internet is 5.8 days. These malicious programs are used to install backdoors into systems so hackers can gain control. They collect any and all information, including keystrokes, which is then sent to hackers and used to spy on users. Often these programs are installed without user knowledge or consent, either through drive-by downloads, through legitimate software that has been infected, or through software that passes itself off as a useful utility yet never seems to work properly.

Malicious programs are classified into two categories: those that need a host to launch their attacks, such as trap doors, logic bombs, Trojan horses, and viruses, and those that are self-contained, such as worms and zombies. Self-contained software will automatically install itself and search, either on the Internet or on the local network, for vulnerable systems to replicate to (Stallings, 2003). It is therefore very important to have a security plan that can quickly deal with newly discovered vulnerabilities. The malicious software mentioned above, together with spyware, robots, phishing, software vulnerabilities, and blended threats, makes up some of the most prevalent software threats affecting corporate and home networks; these are defined in Table 4.

Organizations that are serious about the privacy of their customers and the welfare of their organization will seek to understand the vulnerabilities they face and implement steps to minimize the risk of an exploit. They will also try to educate their customers on safe browsing habits and on the steps they are taking to protect customer privacy.
Table 4. Different types of malicious programs defined

Malicious programs needing a host to launch their attack

Trap doors are any undocumented ways of obtaining access to systems that were built in by designers. These trap doors were used by programmers to test applications but can now be used by hackers who know the proper key sequence to gain access to information. Trap doors are usually hard to see during code inspections and very hard to protect against (Canavan, 2001; Stallings, 2003).

Logic bombs are small programs that reside on PCs but consume no CPU resources until a specific set of logical conditions is met, such as a particular date, which triggers the bomb. Once activated, bombs can delete or modify data or crash systems (Canavan, 2001; Pipkin, 2003; Stallings, 2003).

Trojan horses are programs that hide in apparently useful applications yet, when installed, have completely different functions that are unknown to users. Trojan horses are often used to provide remote access to hackers. They can also collect usernames and passwords, which are sent back to the authors, or delete files. Trojan horses need user involvement to be installed.

Viruses are small programs that attach themselves to executable files, which, when run, execute their predetermined actions. Viruses are not self-replicating and can only propagate by having infected files come into contact with uninfected ones. This can happen through file sharing, downloading software from an untrusted Web site, or opening programs remotely.

Self-contained malicious software

Worms are very similar to viruses except that they are self-contained and try to replicate themselves by actively searching network connections for other systems to infect. Once infected, systems may start sharing information such as passwords, personal data, system data, IP addresses, and host lists with unknown third parties.

Zombies are small programs that install silently and take over systems. They are used to launch specified attacks on other computers. Many hackers now have networks of a thousand zombie PCs, mostly created from unprotected home PCs, that are used to launch denial-of-service attacks on organizations.

Other malicious software

Spyware, also known as adware, malware, or grayware, is by far becoming the biggest threat to home user privacy. According to de Argaez (2004), 90% of all computers are now infected with some type of spyware. Spyware, which is most often installed through drive-by downloads on the Internet, collects personal information found on users' PCs, monitors computing habits, modifies browser settings, slows down PC performance, makes pop-up ads appear, and reports user activity to third parties (Symantec, 2004; Vision Technology Management, 2004). Many users are unaware that they have spyware and that their personal information is being collected and shared (Germain, 2004). Furthermore, Symantec (2004) reports an increase in spyware from less than 2 million incidences in August 2003 to over 14 million in March 2004.

Robots or "bots" are programs that get installed on PCs without user knowledge, through communication channels such as Internet Relay Chat (IRC), shared network drives, peer-to-peer networks, or by exploiting remote vulnerabilities (Symantec, 2004). Their purpose is to steal application serial numbers, user passwords, or other valuable information found on systems. They also have the capability to establish zombie networks.

Phishing, which can be described as attempted online fraud, is becoming the most popular tool for identity theft. According to Symantec (2004), over 1.78 million people have fallen victim to online fraud as a result of phishing. Delio (2005) suggests that 5% of recipients of phishing scams respond. The e-mail, which is usually sent with a spoofed address, urges users to update personal information, often including credit card numbers, bank accounts, and usernames and passwords, under the threat that their accounts will be suspended or closed if they do not act rapidly. A link to the spoofed Web site is usually included. The best prevention is user education.

Software vulnerability is another big threat to users and organizations. According to Symantec (2004), software vulnerabilities have increased by 300% in the last five years, with buffer overflows being the most prevalent. If these vulnerabilities are not removed, then hackers with malicious software can exploit them.

Blended threats, such as Bugbear, Sasser, and Blaster, combine any of the above and work together to exploit a known vulnerability. They have the capability of infecting large numbers of systems in a very short time with no human intervention (Symantec, 2004). Blended threats have a direct impact on the confidentiality, integrity, and availability of an organization's information and resources.

The Role of Policies in Privacy

Policies are high-level documents that should be simple to understand, to the point, and short, 10 pages or less. They should be written in terms of the responsibilities, expectations, and acceptable behaviors of employees and of the organization as a whole. Their purpose is to provide strategies for implementing effective, efficient, affordable security that meets all the business needs and goals of organizations (Schneier, 2000). They should state the goals and objectives to be achieved, not the hardware or software to be used.
Policies should provide a balance between ease of access to information, resources, and assets and adequate security measures that ensure data confidentiality, integrity, and availability. Policies also ensure business continuity by providing a plan to follow if a breach does occur. Finally, they give departments the authority to say "no" to requests that violate established policies until a win-win solution can be found.

To be valuable, policies must be kept current and must be updated regularly to match the changing business goals of the organization and to reflect changes in legislation. Their relevancy is only as good as the information they contain. They should be general enough to stand for a long time, yet specific enough for managers to base their purchases, implementation, and configuration on them. Policies should also be able to answer most questions that users or customers have. To be effective, policies must be published, distributed, read, and understood by all individuals constrained by them. Everyone must understand their responsibilities in securing information, systems, and resources. Clear consequences should also be spelled out to warn users about the disciplinary actions that could take place if they wilfully break them. The structure of policies is summarized in Table 5.

Table 5. Policy structure

Policies are:
• High-level documents
• Short and to the point
• Simple to understand

Policies must provide:
• Business continuity plans
• Support for departments to say "no" to certain practices
• A balance between ease of use and security (CIA)
• A guide for managers to base their purchases, implementation, and configuration on

Policies must state:
• Responsibilities
• Expectations
• Acceptable behavior
• Objectives to be achieved

Policies must:
• Be current
• Be regularly updated
• Match business goals
• Reflect changes in legislation
• Be general enough to stand for a long time
• Be published
• Be understood

Policies are not concerned with:
• Hardware or software implementation

Every online organization offering e-services should also have a clear privacy policy that is easily accessible to customers. This privacy policy should state in very simple terms what information may be collected, why it is collected, how long it will be kept, whether it will be shared with third parties, and what security measures are in place to protect it.
The policy should also explain how organizations will contact customers and what information they might request by e-mail, thereby minimizing phishing threats through user awareness. Finally, privacy policies should provide enough information for customers to make an educated choice about what information, if any, they should share with organizations.
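One way to keep such a policy answerable to these questions is to record its commitments in a structured, machine-checkable form. The sketch below is purely hypothetical; the field names, the sample values, and the idea of a completeness check are assumptions for illustration, not an established policy vocabulary.

```python
# Hypothetical machine-readable privacy-policy record; the field names are
# illustrative only and do not follow any particular standard.
REQUIRED_COMMITMENTS = {
    "data_collected", "purpose", "retention_days",
    "shared_with_third_parties", "security_measures", "email_contact_practice",
}

policy = {
    "data_collected": ["name", "email", "shipping address"],
    "purpose": "order fulfilment and delivery",
    "retention_days": 365,
    "shared_with_third_parties": False,
    "security_measures": ["TLS for transmission", "encrypted storage"],
    "email_contact_practice": "will never ask for passwords or card numbers",
}

missing = REQUIRED_COMMITMENTS - policy.keys()
if missing:
    print("Incomplete policy; missing:", ", ".join(sorted(missing)))
else:
    print("Policy states every required commitment.")
```

A record like this can be rendered into the plain-language statement that customers actually read, while remaining easy to audit for completeness.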
Securing the Infrastructure

There are three security myths that should be dispelled before security measures are discussed: organizations and users often believe that technology will solve all security problems; that once policies are written up, published, and implemented, everyone will comply; and that the vendor's approach to security is always the best solution (Shaurette, 2001).

Security technology can only defend networks from attacks, yet many more threats and risks exist. For example, technology does not defend against someone walking off with equipment or backup tapes, configuration errors, changes in the environment, the evolution of organizations, or human factors such as carelessness, shoulder surfing, or social engineering. It also does not take new vulnerabilities, threats, and risks into account. As for the second myth, it takes more than writing and publishing a few policies for security to work. Policies need to be followed, enforced, and reviewed on a regular basis to reflect the changing needs of organizations; otherwise, they become obsolete very quickly. Furthermore, policies need to be widely circulated and understood by users, who must realize their importance and the consequences of breaching them. As for privacy policies, they should be clear and easily available to customers. Finally, vendors' approaches to security are rarely the best solution, since vendors offer generic, cookie-cutter solutions that may not meet the specific security goals of different organizations. Also, vulnerabilities found in their suites of products may build on each other.

Good security strategies start with truly understanding what needs to be protected, its value, the risks organizations face, their goals and objectives, and the direction they are going. As organizations recognize how all these components interact with each other, they will be in a better position to implement solutions that truly meet their needs and that can evolve with their changing requirements. It is important to reiterate that security is not a totally achievable goal, since threats and risks keep changing. Security measures to protect systems must be flexible enough to protect against these new threats. Security is a process, a direction one travels in, an evolution that must continually be measured. Organizations must stop implementing the first solution that comes to mind and instead truly research what is best for them. With some creativity and imagination, organizations can find solutions that provide very good security with less of a privacy and convenience trade-off (Schneier, 2003).

The second step to good security is to think of it as layers, placed one on top of the other. Each layer is formed of technical and non-technical safeguards that are based on organizational policies. Layers should implement measures that are independent of one another yet provide some overlap. This overlap is essential since it helps protect the weakest link, often found where two systems interact.
Table 6. Basic steps for good security

1. Know and understand what needs to be protected.
2. Think about security as a process, which is always evolving, always changing.
3. Think about security in terms of technical and non-technical layers complementing each other.
This layered approach to security provides redundancy in security tools and expanded depth of protection, and presents a different set of challenges for hackers (Matthews, 2002; Schneier, 2003). The basic steps for good security are summarized in Table 6. The most common layers of security are physical, logical, and procedural; they are summarized in Table 7.

The physical layer is simple to visualize and achieve. It is all about physically securing the environment where equipment and information are kept. Organizations can have the most up-to-date technological measures in place yet still be breached if hackers can simply walk up to systems and access files or data, or walk out with backup tapes or servers. Organizations should also be aware that hackers can find information in their dumpsters, so a shredding policy should be in place to prevent confidential information from finding its way into the wrong hands. Good security starts with physical security. Some measures that can easily be implemented include installing locks on doors where equipment is kept, installing tags on systems to deter theft, using guards, badge systems, and swipe cards to protect work areas, installing barriers to protect the perimeter of buildings, and using lights to deter trespassers and loitering. Other measures that help prevent hackers from gaining physical access to data include using BIOS passwords, disabling booting from floppy or CD-ROM drives, and encrypting all confidential information on servers and PCs. The shortcomings of this layer are that it cannot prevent social engineering, people writing down combination codes or passwords (thereby enabling intruder access), tailgating into buildings, jacking doors open, or the complacency of users who eventually lose faith in systems if too many false alarms are generated. The best way to have users abide by physical security practices is to educate them on the importance of this security layer and enlist their help in finding weaknesses in systems or better ways of doing things. If people feel part of the solution, they are more likely to respect the measures in place.

The logical layer implements processes and structure that help reduce the technological shortcomings of implemented software (Schneier, 2000). For example, this layer deals with the isolation of services, separation of duties, least privileges, change control, the weakest link, choke points, failing securely, and enlisting user help. Each of these is discussed in Table 8.

The last layer is the procedural layer. It is there to address people issues by encouraging them to always think in terms of security and by providing training when necessary. As Day (2003) suggests, users must first of all think about security, then do their work while still thinking about security. This is especially important for home users to ingrain in their minds; without thinking about security, they can compromise their systems in seconds. The procedural layer is also about vigilance.
Table 7. Common security layers

Physical layer. Purpose: securing equipment, information and data, and employees. Security measures: locks, badge systems, barriers, lights, BIOS passwords, disabling floppy or CD-ROM booting, encryption. Shortcomings: cannot prevent social engineering, tailgating, jacking doors open, writing down passwords or combinations, or user complacency.

Logical layer. Purpose: provide processes and structure to reduce technological shortcomings. Security measures: isolation of services, separation of duties, least privileges, change control, weakest link, choke points, failing securely, enlisting user help. Shortcomings: users may try to circumvent security measures if security is too tight; may restrict users in how they do their work.

Procedural layer. Purpose: addresses people issues and encourages users to think in terms of security. Security measures: applying software patches, updating virus and spyware signature files, deleting unused accounts, user training. Shortcomings: security is only as good as the last signature file or patch applied; must enroll user participation.
This includes applying software patches as soon as they are available and have been tested, downloading virus signatures on a regular basis, and frequently scanning PCs for spyware, adware, viruses, and worms. Finally, this layer deals with account management: unused accounts should be deleted soon after the departure of their users, and new accounts should be created just in time for the arrival of new employees.
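The account-management task just described is easy to support with a small script. The sketch below is a hypothetical example; the account records, the 90-day threshold, and the decision to only report (rather than delete) accounts are assumptions made for illustration.

```python
from datetime import date, timedelta

# Hypothetical account records: (username, last login, still employed?).
accounts = [
    ("jsmith",      date(2005, 11, 2),  True),
    ("contractor7", date(2005, 3, 15),  False),
    ("mlee",        date(2005, 12, 20), True),
]

STALE_AFTER = timedelta(days=90)   # illustrative threshold
today = date(2006, 1, 10)          # fixed date so the example is reproducible

for user, last_login, employed in accounts:
    if not employed or today - last_login > STALE_AFTER:
        # Report only; an administrator confirms before the account is removed.
        print(f"Flag for deletion: {user} (last login {last_login})")
```

Running such a report on a schedule keeps departed users' accounts from lingering as an easy way back into the systems.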
Table 8. Logical layer security measures

Isolation of services reduces an organization's vulnerabilities and the threats it faces by safely removing and shutting down unneeded services and ports. This reduces the opportunity for hackers to gain access or for a vulnerability to be present. Furthermore, all major applications should be installed on separate servers, since security degrades in systems that are complex (Schneier, 2000). It is much harder to secure a server with many applications running on it, since administrators need to understand how each application interacts with the others. Also, new vulnerabilities that neither application has on its own may surface through their interaction.

Separation of duties deals with dividing administrative tasks among users. This can greatly reduce the temptation to commit fraud, since more people need to be enlisted to commit crimes. It is harder to implement in smaller organizations, where there is often only one person assigned to all administration tasks, yet when this option is possible it should be implemented. For example, two people could be assigned to a backup-and-restore process: one only has rights to back up servers, while the other can only restore them. Also, two administrators could each hold part of the password to log on to servers, requiring both to be present whenever a configuration change, update, or other work needing administrative privileges is done. This not only reduces the temptation to commit fraud but also minimizes the chance of errors, since two people are there to carry out the work.

Least privilege means that administrators, users, and home users should have accounts with just enough rights for them to do their work. Often, more rights are given for convenience, and home users often work from the administrator's account. This is a big security risk, since many programs require administrative rights to be installed. Also, if a Trojan horse does get installed, hackers gain the rights of the account that is logged on.

Change control is also a must for any organization, since it helps eliminate mistakes and improves the chance that malicious changes will be caught (Pipkin, 2003). Good change control includes a procedure to test and question every assumption, design change, and decision.

Weakest links can only be found when security is seen as a whole and not piecemeal, measure by measure. Finding this link is important since this is where security is most likely to break. The weakest link could appear in any of the layers, in the security measures implemented, in policies or processes, or be human related. Once it is identified, it is important to address it and make it as strong as possible to prevent breaches from occurring at that junction. Once this is done, organizations and home users should focus on finding the next weakest link and fixing it. Finding weak links is an ongoing process, since there will always be a weak link somewhere in the system. The goal is to know where they are and fix them before hackers find and exploit them.

Choke points are created by having all information flow through one appliance or piece of software. This limits the exposure of data and assets to the outside world by having only one or two points of access. These choke points help monitor and control the flow in and out of networks and can help in identifying unusual or suspect activity. Firewalls, sometimes installed for load balancing, offer good choke points.

Failing securely deals with how systems should fail. This is important in the case of a power failure, software bug, or breach. When applications or systems crash, they should prevent all access to data, and the information should be left in a usable state. Also, if the failing device is security related, it should prevent all traffic from reaching the systems it is protecting.

Enlisting user help matters because the more ears and eyes an organization has, the better it can protect its customers, users, systems, and data. Furthermore, if people are involved in the process, they will be less inclined to circumvent the security measures in place and may even suggest ways to improve them. They are at the front lines and often see security holes that security officers and network administrators do not, since they use data and systems in different ways.
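Two of the measures in Table 8, least privilege and failing securely, can be sketched in a few lines of code. The roles, rights, and error handling below are illustrative assumptions, not part of any particular product.

```python
# Least privilege: each role holds only the rights it needs (illustrative roles).
ROLE_RIGHTS = {
    "backup_operator":  {"backup"},
    "restore_operator": {"restore"},
    "clerk":            {"read_order"},
}

def is_allowed(role: str, action: str) -> bool:
    """Fail securely: an unknown role or any error results in denial, never access."""
    try:
        return action in ROLE_RIGHTS[role]
    except Exception:
        return False  # deny by default

print(is_allowed("backup_operator", "backup"))   # True
print(is_allowed("backup_operator", "restore"))  # False: duties are separated
print(is_allowed("unknown_role", "read_order"))  # False: fails securely
```

The backup/restore split also mirrors the separation-of-duties example in the table: neither operator alone can perform the whole backup-and-restore process.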
All of these layers are important to privacy. Physical security will prevent hackers and strangers from simply walking up to servers and workstations and leaving with the equipment or the information found on it. The logical layer, with all of its strategies, also helps prevent privacy breaches. For example, with separation of duties, no one person can see, access, or change all the information, and choke points prevent the free flow of information. Organizations may also find weak links in how they collect, store, or access information and choose to secure them. Finally, the procedural layer will help by ingraining in users the importance of security and by encouraging them to be vigilant.

The infrastructure cannot be secured and privacy protected without implementing some technological controls, such as firewalls, anti-virus, anti-spyware, and anti-spam software, encryption, operating system hardening, intrusion detection systems, and vulnerability scanning. These are discussed below and summarized in Table 9.
Table 9. Technical tools to protect networks

Firewall. Purpose: protects the perimeter; protects individual PCs; provides a choke point through which all traffic must pass; controls what information can come in and out of a PC or network. Shortcomings: does not authenticate users; does not protect against viruses, worms, or spyware; cannot protect against attacks on open ports; does not provide data integrity or confidentiality.

Anti-virus, anti-spyware. Purpose: protects against viruses and spyware. Shortcomings: only as good as the latest signature file.

Anti-spam. Purpose: protects against spam, some phishing attempts, and some viruses. Shortcomings: only as good as the latest signature file.

Operating system hardening. Purpose: minimizes security holes by removing or disabling services, ports, and unutilized software. Shortcomings: in complex systems, it may be hard to disable services or ports; only as good as the latest patch.

Cryptography. Purpose: provides integrity, authentication, and confidentiality. Shortcomings: sharing private keys securely; keeping keys private; key length.

Intrusion detection system. Purpose: monitors real-time traffic for suspicious activity; can identify internal attacks. Shortcomings: only as good as the latest signature file; hard to implement and manage.

Vulnerability scanning. Purpose: used to improve awareness and determine the effectiveness of existing security measures. Shortcomings: tests are only as good as the person implementing them.
Firewalls are an important tool for protecting against intrusion. They can be hardware or software based and are used to protect the perimeter of networks by erecting a wall between the internal or private network and the Internet. This wall provides a choke point for all traffic to go through and limits the ability of outsiders to look in. If traffic arriving at the PC was not requested or uses a port that is not open, it will be dropped. Furthermore, firewalls have the ability to hide the IP addresses of users and the operating system that is running, thereby limiting the information hackers will see when doing a scan. Using both a software and a hardware firewall increases security by adding an extra layer of protection: if hackers penetrate the hardware firewall, they will still have to defeat the software firewall before gaining access to data on PCs. Implementing software firewalls also protects systems from internal attacks. Firewalls also provide good control of all connections into and out of the network, can deny certain applications from sending information and accessing the Internet, can filter data according to specific filters or rules, and provide a central location for monitoring security events. However, firewalls do not authenticate users, do not protect against viruses, worms, or spyware, and do not provide data integrity or confidentiality. This means that if spyware is installed on a PC and sends confidential information through an open port, the firewall will let this information through, no questions asked. Finally, even though firewalls are very good at blocking traffic that is not allowed, they can do nothing about attacks that use commonly open ports, such as port 80.

The next most common tools are anti-virus and anti-spyware software. These programs are installed on local machines and servers and inspect all files being saved or modified. For anti-virus and anti-spyware software to be most effective, their signatures need to be downloaded regularly. This process should be automated and should not rely on user intervention. Both programs should also be loaded at PC start-up and never disabled. Regular scans should also be done to make sure that newer signature files find existing problems that older signature files did not detect.

Anti-spam software can be installed on mail servers or individual machines. It usually works with signature files, which must be updated regularly. Anti-spam software can protect users from phishing and from viruses or malware that arrive through e-mail. Another advantage of using anti-spam software is that it frees system resources by removing the need to process or store spam messages.

Another good security strategy is to harden the operating system. During this process, all services and ports that are not needed are disabled and closed, unutilized software is uninstalled, and the most secure configuration is applied (King et al., 2001). By doing this, information that could reveal the system's identity, services, open ports, and vulnerabilities is minimized. This means that exploits are less likely and that fewer patches will need to be installed. A check for new patches should still be done regularly. Administrators and home users should also subscribe to security bulletins from operating system manufacturers, anti-virus companies, and organizations such as CERT, NIST, and SANS, and read them regularly. These newsletters explain new vulnerabilities and threats and provide solutions to minimize or remove them. They also supply removal tools in case a system gets infected with malicious software.
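The firewall behaviour described above, where traffic is dropped unless it was requested or a rule explicitly allows it, boils down to a default-deny rule match. The sketch below is a toy illustration; real firewalls operate on full packet headers and stateful connection tables, and the rules shown are assumptions for the example.

```python
# Toy default-deny packet filter: a packet passes only if some rule allows it.
ALLOW_RULES = [
    # (protocol, destination port) pairs that are permitted; illustrative only.
    ("tcp", 80),
    ("tcp", 443),
]

def filter_packet(protocol: str, dst_port: int) -> str:
    for proto, port in ALLOW_RULES:
        if protocol == proto and dst_port == port:
            return "accept"
    return "drop"  # default deny: anything not explicitly allowed is dropped

print(filter_packet("tcp", 443))   # accept
print(filter_packet("tcp", 3389))  # drop
print(filter_packet("udp", 53))    # drop
```

As the chapter points out, such a filter will happily pass spyware traffic that uses an allowed port such as 80, which is why it must be combined with the other tools in this section.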
Other helpful strategies for securing operating systems are to implement group policies, auditing, and authentication. Group policies allow administrators to apply certain security measures globally from one central location. This prevents PCs from being forgotten during initial configuration or when security changes are required, as well as
protects them from users who would like to disable the security measures. Group policies can also enforce minimum password length, password expiry, lockout duration, and logoff time after inactivity. Auditing creates logs of certain events; these should be carefully examined to discern inappropriate behavior. Finally, authentication at login should always be implemented, since it provides an extra layer of security and helps ensure that users are truly who they say they are.

Cryptography plays a big part in the search for privacy. It provides integrity checking, authentication, and confidentiality (Canavan, 2001; Kaufman, Perlman, & Speciner, 2002). Integrity checking provides assurance that information has not been modified in transit since it was generated. Authentication provides the ability to verify someone's identity, and confidentiality provides assurance that even if messages are intercepted, they cannot be read. All confidential information on PCs should be encrypted, preventing hackers or thieves from easily reading it. The major issue with encryption is keeping keys secret: if hackers get hold of the right keys, they can read all the information. Keys should be backed up and placed in a secure location so that if a key is ever lost, the data can still be decrypted and is not lost. Key length is another big issue with encryption, since shorter keys are now easily broken. Finally, cryptography should only be seen as a deterrent; hackers who desperately want your information will find other weaknesses to exploit in the surrounding system.

With e-services, all organizations should use Secure Sockets Layer (SSL) when exchanging sensitive information. This provides encryption between customers and the organizations they are dealing with, so that intercepted information cannot be read. Customers need to learn how to verify the digital certificates of the organizations hosting the e-services they use. Users should also be aware that some phishing scams can fake the lock icon in browsers, and many hackers can obtain digital certificates or URLs with names similar to those of any organization. A common trick is to replace "I" (capital i) with "l" (small L). Customers need to be aware of these schemes and learn how to verify that certificates are indeed valid, issued by a trusted certifying authority, and issued to the appropriate organization.

All the above-mentioned tools are passive; they do not provide real-time information when attacks are occurring. Intrusion detection systems (IDS), on the other hand, are reactive tools. They are installed on the perimeter of networks and have the ability to monitor traffic in real time. If suspicious activity is encountered, they can alert administrators. An IDS can be either rule-based, using a database of signatures that needs to be downloaded on a regular basis, or statistical, comparing the legitimate behavior of users over a period of time to real-time traffic to detect suspicious patterns. However, IDSs are hard to implement and manage. They often create a lot of data that needs to be analyzed and can generate many false alarms. They are also resource intensive and usually only as good as their latest signature file and the skill of the operator using them. On the positive side, IDSs can detect attacks that are launched internally, can be used to collect evidence of an attack or breach, and can identify where attacks are coming from. Although IDSs can identify intrusions and attacks, they are not equipped to stop them.
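A statistical IDS of the kind just described compares current activity to a baseline of past behaviour. The sketch below is a deliberately simplified illustration; the single metric, the sample counts, and the mean-plus-three-standard-deviations threshold are all assumptions, not how a production IDS is built.

```python
import statistics

# Baseline: failed logins observed per hour over previous days (illustrative data).
baseline = [2, 3, 1, 4, 2, 3, 2, 5, 3, 2, 4, 3]
current_hour_failures = 19

mean = statistics.mean(baseline)
stdev = statistics.pstdev(baseline)
threshold = mean + 3 * stdev   # flag anything far outside normal behaviour

if current_hour_failures > threshold:
    print(f"ALERT: {current_hour_failures} failed logins this hour "
          f"(baseline {mean:.1f} +/- {stdev:.1f})")
else:
    print("Within normal range")
```

Real systems watch many such metrics at once, which is exactly why, as noted above, they produce large volumes of data and false alarms that someone has to analyze.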
Vulnerability scanning and penetration testing make up another layer of security. They improve awareness and determine the effectiveness of the security measures in place (Flynn, 2000; Fried, 2001). They can also find holes and weak links in the implemented
security. These tests should be designed to investigate and probe the limits of networks and should test all layers: it is of no use to have strong technological security measures in place if one can just walk up to servers and gain access to information. During this type of testing, network administrators use hacker tools to identify weaknesses in the implemented security policies and in the underlying information systems and networks. They must also think outside the box and find new ways of testing systems. Tests should concentrate on the interaction of systems with their surroundings rather than directly on the measures themselves. For example, the locks on doors can be very good, yet rooms could still be compromised if hackers can enter by lifting a few ceiling tiles or by coming in through a window. With the information they gather, administrators are better equipped to close vulnerabilities or to find solutions that address the particular holes found. This testing should be done on a regular basis, since configuration or network changes can often open new holes. All PCs, whether they have direct access to the Internet or are protected by firewalls or proxy servers, should be scanned, since attacks can come from the inside as easily as from the outside. Finally, top management should be aware when this testing is going on so that administrators are not accused of looking for vulnerabilities that they could exploit for themselves.

Many of the tools discussed above can be found free of charge on the Internet, so cost should not be an issue for home users. However, a word of warning: if you download any security tools, make sure they come from a reliable and trusted site and that the tools themselves are reputable. Many users have been infected with Trojan horses, worms, and viruses by downloading software from untrusted sources, or have been infected because the tool itself was malicious.
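A very small taste of what a vulnerability scan does, checking which services answer on a host, can be had with the standard library alone. The port list below is an illustrative assumption, and, echoing the warning above, such checks should only be run against machines you are authorized to test.

```python
import socket

HOST = "127.0.0.1"                    # scan only a machine you are authorized to test
PORTS = [21, 22, 25, 80, 443, 3389]   # illustrative selection of well-known ports

for port in PORTS:
    try:
        with socket.create_connection((HOST, port), timeout=0.5):
            print(f"Port {port}: open (a service is listening)")
    except OSError:
        print(f"Port {port}: closed or filtered")
```

Anything reported as open that has no business reason to be listening is a candidate for the hardening steps described earlier.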
Future Trends

Is there any hope for customer privacy when using e-services? First and foremost, customers will have to wake up and believe that their personal information is valuable, not only to them but also to organizations. They must realize that they ultimately have control over what information they share or exchange. Furthermore, they need to realize that protecting their privacy is their responsibility and take steps to secure it. Once they do, they can limit the leakage of information. They also need to start putting pressure on organizations by asking why information is collected or needed, what it will be used for, how long it will be kept, whether it will be shared, and whether they can have access to review and change it. As customers start to ask questions and limit the amount of information they share, organizations will begin to take privacy issues more seriously and adapt their policies and security measures to protect personal information. They will also have to create ways to minimize the impact of breaches. Customers should not tolerate privacy breaches and should take their business elsewhere if one ever happens.

Organizations will also need to be honest about what they do with the information they gather. There needs to be some accountability and transparency on the part of organizations so that everyone becomes equal and plays on a level playing field. This means that the same information must be available to all, not just to those with money and mega resources.
Transparency is about giving power back to customers so they can hold accountable the organizations that could violate their privacy (Meeks, 2005). People should also have a choice about what information they give, be able to review that information and make changes if desired, and have a way of monitoring an organization's compliance with its privacy policies to make sure that the personal information supplied is not being used in ways it was not meant to be (Ghosh, 2001).

On the security side, networks need to become much more self-defending (Venetis, 2005). Implementing passive tools such as firewalls, anti-virus, and anti-spyware software will soon not be enough, since vulnerabilities and exploits are created rapidly and now have the ability to mutate. Good security will need to be based on intelligent tools that can recognize malicious activity and stop it dead in its tracks. Also, users will need to understand their systems and stop being so trustful of strangers. Finally, security will continue to play a part in privacy until organizations find ways to optimize the use of information without infringing on the privacy rights of customers. Scalet (2005) suggests that privacy issues come down to information management, and that as organizations learn to better manage confidential information and remove the profiling and surveillance aspects, privacy issues will disappear.
Conclusions

Privacy issues began when our personal information started to be used for intrusive purposes, such as surveillance, profiling, and marketing schemes that target us specifically. The problem increased when people started using the Internet and various e-services, since the nature of the medium prevents us from browsing or shopping anonymously.

As a first step in regaining our privacy, we must take responsibility for our information by understanding what information we should and should not share, how long it will be kept, who will be able to access it, how it will be used, and what security measures organizations are providing. We then need to realize that our information is valuable and take control of what we want to share. We should not let ourselves be intimidated or forced into sharing our personal information. In taking action to protect our information, however, we should be ready to sacrifice some savings and convenience.

Once we understand our role, implementing different security measures can help reduce our risk of exposure by minimizing the vulnerabilities, threats, and risks that both home users and organizations face. It is not always easy to implement good security, since it is a complex issue that must always support changing conditions. As a result, security measures must be flexible, meet the objectives and goals of organizations, and be re-evaluated often, since threats evolve, configurations are modified, and organizational needs change as organizations grow and mature. Security must therefore be seen as a process and not a goal that can be achieved and then forgotten.

Doing a threat and risk analysis is important since it will identify the vulnerabilities, threats, and risks that organizations face and will help them understand what needs to be protected and from what or whom. From this assessment, policies can then be created.
These documents are important since they assign accountability and define responsibilities, expectations, acceptable behaviors, and consequences. They also state how security measures should be implemented and how each solution should support the goals and objectives of organizations and home users. Each organization offering e-services should also have a clearly written privacy policy that is made available to customers.

Once organizations and home users understand what it is they are trying to protect, its value, the risks they face, their goals and objectives, and the direction they are going, they are ready to implement security. These measures are best implemented in layers, since layers provide many barriers with various challenges for hackers to cross before they gain access to systems or data. Each measure should be independent of the others but provide some overlap in its security goals; therefore, if one measure fails, others are behind it to protect the information. Layering may also work as a deterrent, since hackers may prefer to attack simpler, less protected systems. Finally, it is important that when organizations implement these layers, the overall security of systems is taken into consideration and the best solutions are implemented, not just the first ones that come to mind. With a little more engineering, organizations and home users can provide the same amount of security while minimizing the negative effects of security, which are pervasiveness and a reduction in freedom, privacy, convenience, and ease of use.

Finally, by implementing simple and effective security measures, understanding our role in protecting our information, controlling what we share, and putting value on the information we exchange, we will be able to regain some of our privacy.
References

Baase, S. (2003). A gift of fire: Social, legal, and ethical issues for computers and the Internet (2nd ed.). Upper Saddle River, NJ: Pearson Education.
Bellotti, V. (1998). Design for privacy in multimedia computing and communications environments. In P. E. Agre & M. Rotenberg (Eds.), Technology and privacy: The new landscape (pp. 63-98). Cambridge, MA: The MIT Press.
Cady, G. H., & McGregor, P. (2002). Protect your digital privacy: Survival skills for the Information Age. Que.
Canavan, J. E. (2001). Fundamentals of network security. Norwood, MA: Artech House.
Cavoukian, A. (2005, March). The privacy imperative: Go beyond compliance to competitive advantage. Presentation at the IT Security and Privacy Symposium, Toronto, Ontario, Canada.
Day, K. (2003). Inside the security mind: Making the tough decisions. Upper Saddle River, NJ: Prentice Hall PTR.
De Argaez, E. (2004). How to prevent the online invasion of spyware and adware. Retrieved September 27, 2004, from http://www.internetworldstats.com/articles/art053.htm
Delio, M. (2005). IT tackles phishing. InfoWorld, 27(4), 30-35.
Flynn, J. (2000). How to trap the network intruder. In H. F. Tipton & M. Krause (Eds.), Information security management handbook: Vol. 1 (pp. 543-550). Boca Raton, FL: CRC Press LLC.
Fried, S. (2001). Penetration testing. In H. F. Tipton & M. Krause (Eds.), Information security management handbook: Vol. 2 (pp. 201-220). Boca Raton, FL: CRC Press LLC.
Germain, J. M. (2004). Spyware: The next spam? Retrieved September 21, 2004, from http://www.technewsworld.com/story/34775.html
Ghosh, A. K. (2001). Security and privacy for e-business. New York: John Wiley & Sons.
Ghosh, S. (2002). Principles of secure network systems design. New York: Springer.
Gollmann, D. (1999). Computer security. Chichester, UK: John Wiley & Sons.
Hare, C. (2002). Policy development. In H. F. Tipton & M. Krause (Eds.), Information security management handbook: Vol. 3 (pp. 353-383). Boca Raton, FL: CRC Press LLC.
Hyatt, M. S. (2001). Invasion of privacy: How to protect yourself in the Digital Age. Washington, DC: Regnery Publishing.
Kaufman, C., Perlman, R., & Speciner, M. (2002). Network security: Private communication in a public world. Upper Saddle River, NJ: Prentice Hall PTR.
King, C. M., Dalton, C. E., & Osmanoglu, T. E. (2001). Security architecture: Design, deployment & operations. Berkeley, CA: Osborne/McGraw-Hill.
Matthews, B. R. (2002). Physical security: Controlled access and layered defense. In H. F. Tipton & M. Krause (Eds.), Information security management handbook: Vol. 3 (pp. 775-792). Boca Raton, FL: CRC Press LLC.
Meeks, B. N. (2000). Is privacy possible in the Digital Age? If it isn't dead, then it's hanging on by a thread. Retrieved February 16, 2005, from http://msnbc.msn.com/id/3078854/print/displaymode/1098/
Parker, D. B. (1998). Fighting computer crime: A new framework for protecting information. New York: John Wiley & Sons.
Pipkin, D. L. (2003). Halting the hacker: A practical guide to computer security. Upper Saddle River, NJ: Prentice Hall.
Scalet, S. D. (2005). Five things every CSO needs to know about the chief privacy officer. CSO: The Resource for Security Executives, 4(2), 26-32.
Schneier, B. (2000). Secrets and lies: Digital security in a networked world. New York: John Wiley & Sons.
Schneier, B. (2003). Beyond fear: Thinking sensibly about security in an uncertain world. New York: Copernicus Books.
Shaurette, K. M. (2001). The building blocks of information security. In H. F. Tipton & M. Krause (Eds.), Information security management handbook: Vol. 2 (pp. 221-240). Boca Raton, FL: CRC Press LLC.
Sprenger, P. (1999). Sun on privacy: 'Get over it'. Retrieved February 16, 2005, from http://www.wired.com/news/politics/0,1283,17538,00.html
Stallings, W. (2003). Network security essentials: Applications and standards. Upper Saddle River, NJ: Pearson Education, Inc.
Steinke, C. (2002). Physical security: A foundation for information security. In H. F. Tipton & M. Krause (Eds.), Information security management handbook: Vol. 3 (pp. 761-774). Boca Raton, FL: CRC Press LLC.
Summers, R. C. (1997). Secure computing: Threats and safeguards. New York: McGraw-Hill.
Symantec. (2004, September). Trends for January 1, 2004-June 30, 2004: Vol. VI (Symantec Internet Security Threat Report). Retrieved September 27, 2004, from http://enterprisesecurity.symantec.com/content.cfm?articleid=1539
The Oxford dictionary of current English (2nd ed.). (1992). Oxford University Press.
Venetis, T. (2004). Network security must be proactive, not reactive says Cisco CEO. IT World Canada. Retrieved February 21, 2005, from http://www.itworldcanada.com/Pages/Docbase/ViewArticle.aspx?ID=idgml-6fd8c0ab-7cab-4227-8900-81f09c12c9b9
Vision Technology Management. (2004). Spyware, adware and worms oh my! Retrieved September 21, 2004, from http://www.visiontm.com/Spy/
Chapter VI
Pseudonym Technology for E-Services1

Ronggong Song, National Research Council Canada, Canada
Larry Korba, National Research Council Canada, Canada
George Yee, National Research Council Canada, Canada
Abstract

Pseudonym technology is attracting more and more attention as privacy violations become a major issue in various e-services. Current e-service systems make personal data collection very easy and efficient through integration, interconnection, and data mining technologies, since they use the user's real identity. Pseudonym technology, with its unlinkability, anonymity, and accountability, can give users the ability to control the collection, retention, and distribution of their personal information. This chapter explores the challenges, issues, and solutions associated with pseudonym technology for privacy protection in e-services. To give a better understanding of how pseudonym technology provides privacy protection in e-services, we describe a general pseudonym system architecture, discuss its relationships with other privacy technologies, and summarize its requirements. Based on these requirements, we review, analyze, and compare a number of existing pseudonym technologies. We then give an example of pseudonym technology in practice, the e-wallet for e-services, and discuss current issues.
Introduction

Background and Context

E-services such as e-commerce, e-government, e-health, and e-learning are becoming part of everyday life and, together with the Internet, have come to be seen as an information infrastructure for every subject and many application domains. The tremendous growth of these varied e-services has catapulted them from their original realm of academic research toward mainstream acceptance and increasing social relevance. However, this dramatic increase has created the potential for eroding personal privacy. The fact is that cyberspace has invaded private space. Currently, almost all online e-services can be monitored by unseen parties on the Internet. Controversies about cookies, click streams, traffic analysis, packet sniffing, and spam are merely the tip of the iceberg. It is small wonder that privacy is such a critical issue for e-services. Users feel that one of the most important barriers to using e-services is the fear of having their privacy violated. Governments around the world have introduced legislation placing requirements upon the way in which personal information is handled.

According to the definition given by Goldberg in 1997, privacy refers to the ability of individuals to control the collection, retention, and distribution of information about themselves (Goldberg, Wagner, & Brewer, 1997). This does not mean that their personal information never gets revealed to others; however, a system that respects their privacy should allow them to select what information about them is revealed, and to whom. This personal information may be any of a large number of items, including shopping habits, nationality, work history, living habits, personal communications, e-mail address, IP address, physical address, identity, and others.

Recently, many new techniques have been developed for providing privacy protection. Privacy protection is a process of finding an appropriate balance between privacy and multiple competing interests. Generally, these techniques fall into several categories. One is the use of pseudonym technology for providing both anonymity and accountability. Another is the use of an anonymous communication network for providing anonymity and unobservability. A third is the use of personal privacy policies along with secure mechanisms that guarantee that e-service providers conform to these policies (Yee & Korba, 2005).

A pseudonym is a fake name or alias, for instance, a user's digital account in a bank or an access account for a Web service. However, such pseudonyms are not protected with any special technologies, so they can easily be linked to the real identity of the user. We use the term pseudonym technologies for the technologies that prevent service providers from linking a pseudonym to the real identity of the user. With pseudonym technology, users can access e-services through their pseudonyms instead of their real identities while still allowing the system to authenticate them as valid users. Furthermore, the system not only cannot link the pseudonyms with the real identities but also cannot link the pseudonyms used for different applications. This gives users a degree of privacy protection and gives the service provider essential security protection. For instance, users can protect their personal information and shopping habits if they use pseudonym credentials (e.g., e-cash) to access e-services or order products.
At the same time, the service providers or retail sellers can authenticate the credentials and users anonymously to reduce a variety of risks (e.g., fraud, repudiation) and protect their services. Pseudonym technology thus provides a good solution for privacy and security protection in most e-services. In this chapter, we discuss only pseudonym technologies.
Pseudonym Technology

The real danger is the gradual erosion of individual liberties through the automation, integration, and interconnection of many small, separate recordkeeping systems, each of which alone may seem innocuous, even benevolent, and wholly justifiable. (U.S. Privacy Protection Study Commission, 1977)

With its characteristics of unlinkability, anonymity, and accountability, pseudonym technology has become available after lengthy research. The technology for pseudonym systems took a major step forward with the introduction of digital pseudonyms. According to a high-tech dictionary definition, a digital pseudonym is a pseudonym an individual can use to set up an online account with an organization without revealing personal information. For instance, a public key owned by an anonymous holder can serve as a digital pseudonym: the holder can prove that he or she is the owner of the public key through signatures made with the corresponding private key.

Digital pseudonyms were first introduced by David Chaum in 1981 for untraceable electronic mail services (Chaum, 1981). In this system, an authority creates a roster of all pseudonyms and decides which applications for pseudonyms to accept, but the authority is unable to trace the pseudonyms in the roster. The technology aimed to provide limited anonymity through MIX networks, which take a list of values as input and output a permuted list of function evaluations of the input items without revealing the relationship between input and output elements.

The concept of pseudonym systems was introduced by Chaum in 1985 in order to protect the privacy and maintain the security of both individuals and organizations in large-scale automated transaction systems (Chaum, 1985). Pseudonym systems have several features. First, they allow users to interact with multiple organizations anonymously using pseudonyms, so that personal information is not required or used to identify them. For example, a purchase with e-cash is made under a one-time-use pseudonym credential. Second, with pseudonym technology an individual is able to authenticate ownership of the pseudonyms and ensure that the pseudonyms are not improperly used by others. Furthermore, the individual can obtain a credential from one organization using one of his or her pseudonyms and demonstrate possession of the credential to another organization without revealing the first pseudonym to the second organization. For instance, a consumer may get e-cash from his or her bank and make a purchase with it in any retail store. In pseudonym systems, an individual uses a different digital pseudonym with each organization. These pseudonyms are unlinkable to the person's identity, but the organizations are able to ensure that the pseudonyms are not used improperly.
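The idea that a public key can serve as a digital pseudonym, with ownership proven by a signature from the matching private key, can be illustrated with a short sketch. It uses the third-party Python cryptography package and Ed25519 signatures as one possible instantiation; the challenge text and the choice of algorithm are assumptions for the example, not part of Chaum's original constructions.

```python
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

# The key pair acts as the pseudonym: the public key is the "name" the
# organization sees; the private key never leaves the user.
private_key = Ed25519PrivateKey.generate()
pseudonym = private_key.public_key()

# The organization challenges the holder to prove ownership of the pseudonym.
challenge = b"login challenge #4711 for pseudonymous account"
signature = private_key.sign(challenge)

try:
    pseudonym.verify(signature, challenge)
    print("Signature valid: the holder controls this pseudonym")
except InvalidSignature:
    print("Signature invalid: ownership not proven")
```

Unlinkability across organizations would require a fresh key pair, or a full pseudonym system of the kind discussed below, for each organization; this sketch shows only the ownership-proof step.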
In order to give a practical implementation for pseudonym systems, Chaum and Evertse developed a model and constructed a scheme in 1986 (Chaum & Evertse, 1986) based on the RSA public key cryptosystem (Rivest, Shamir, & Adleman, 1978). In this scheme, the credentials are RSA signatures on pseudonyms. However, the disadvantage is that the scheme relies on a trusted central authority who must sign all credentials. Damgard presented another pseudonym system scheme in 1988 based on a multiparty computing protocol with secret inputs and outputs (Damgard, 1988). The scheme establishes the existence of credential mechanisms, protects organizations from credential forgery, and secures the secrecy of users' identities at an information-theoretic level, that is, unconditionally (Menezes, Oorschot, & Vanstone, 1996). In addition, the role of the central authority in this scheme is limited to ensuring that each pseudonym belongs to a valid user.

In order to simplify the process of validating pseudonyms, Chen shifted the credential system from an RSA setting to a discrete logarithm setting (Chen, 1995). In this scheme, the central authority is no longer required after the pseudonyms are validated, since each organization has its own secret key for issuing a credential without the central authority's help. In addition, users can validate their own secret keys in the system when signatures are required under the pseudonyms. Another feature of this system is that each version of the credential can be shown only once to an organization, which makes it suitable for one-time credential environments such as an electronic cheque.

The newest and most sophisticated pseudonym technology is Pseudonym Systems, which is based on discrete logarithms in order to prevent a user from sharing his or her pseudonyms or credentials with other users (Lysyanskaya, Rivest, Sahai, & Wolf, 2000). In this model, each user can open an account with a different organization using a different unlinkable pseudonym after registering with a Certification Authority (CA). The organization then issues a credential to the user under the pseudonym that he or she used to open the account. The credential could be single-use like e-cash or multiple-use like a health card, depending on the application. Another new pseudonym technology is Private Credentials, recently proposed by Zero-Knowledge Systems (Glenn, Goldberg, Legare, & Stiglic, 2001; Brands, 2000). Private Credentials minimizes the risk of identity fraud and overcomes the efficiency and security shortcomings of identity certificates, which is especially beneficial in authentication-based environments.

Finally, anonymous e-cash (Chaum, 1982, 1988), e-wallet (Chaum & Pedersen, 1992), e-ticket (Song & Korba, 2003), and e-voting (Liaw, 2003) are other state-of-the-art pseudonym technologies for privacy protection in e-services. As a variety of e-commerce and e-government services are becoming huge driving forces for the future of the Internet, these solutions offering privacy and anonymity protection are very valuable.
Challenges and Issues

The past few years have shown a significant increase in public privacy awareness along with the widespread use of the varied e-services. Some of the challenges and issues associated with privacy protection for e-services are highlighted here.
• Consumer attitudes. More and more consumers have realized the value of their personal information and the danger in leaving it unprotected. According to a multi-national privacy survey by IBM, 80% of U.S. consumers strongly agree that they have lost all control over how their personal information is collected and used by companies, and 54% of consumers have decided not to purchase anything from a company when they are not sure how the company will collect and use their personal information (IBM, 1999). Furthermore, privacy breaches almost always result in a decrease in customer loyalty and cause damage to the reputation of the e-services.
• Legislation. Governments around the world are beginning to introduce more and more privacy regulations and legislation for personal information protection. Some of them have become law, for instance, the European Union Directive on Data Protection (European Union), Canada's Personal Information Protection and Electronic Documents Act (PIPEDA) (Government of Canada), and the U.S. Health Insurance Portability and Accountability Act (HIPAA) (U.S. Government). There are many challenges when enacting privacy programs; for instance, organizations must not only be aware of current regulations, but also strategically plan for future regulations. In addition, companies must monitor the regulatory environment, create privacy standards and documentation, establish office procedures, and train their employees. In order to spell out the requirements for the collection, use, disclosure, retention, and disposal of personal data, Canada has incorporated 10 Privacy Principles (Dept. of Justice) in PIPEDA. However, the implementation of the principles may vary in different systems due to different underlying applications.
• Public safety. Public safety is another challenge for privacy protection. Citizens have been forced to question how much they value their personal information compared to their safety after the terrorist attacks on the U.S. on September 11, 2001. Consequently, public tolerance of surveillance has increased. New legislation has been passed to make citizens' personal information more accessible to those who require it (e.g., police) in order to fight terrorism. However, this also allows personal information to be more accessible to those who should not have access if precautions are not taken or technologies are applied inappropriately. It is a significant challenge to satisfy the requirements of both privacy protection and public safety.
• Technology. Advancements in information technologies such as the Internet, high speed transfer, packet sniffing technologies, and efficient data mining have made personal data collecting, transmitting, storing, and analyzing much easier than before. It is becoming harder and harder for consumers to protect their personal data. As a good privacy protection technology, pseudonym technology has many advantages such as anonymity, authenticity, and accountability. However, there
are many challenges as to how to make the pseudonym technologies satisfy the privacy requirements and principles within the varied e-services and how to make them comply with privacy legislation and standards. Other issues may arise from the trustability, reliability, and practice of a privacy protection system. In addition, a good privacy protection system may require many privacy protection mechanisms and technologies to be used together since each of them has limitations. For instance, pseudonym technology usually has limitations when defending against traffic analysis (Raymond, 2000) and may need other technologies such as onion routing (Goldschlag, Reed, & Syverson, 1999) and MIX networks (Berthold, Federrath, & Kopsell, 2000).
Pseudonym Systems

In order to have a good understanding of how a pseudonym system can protect a user's privacy in e-services, we first summarize the pseudonym requirements for privacy protection in e-services. We then introduce a general pseudonym system architecture and discuss its relationship with other privacy technologies.
Pseudonym Requirements for E-Services

Privacy protection requires that each individual have the power to control his or her personal data, for instance, deciding how his or her personal data is collected and used. In order to do this, some privacy requirements have been researched by Brands and Lysyanskaya (Brands, 2000; Lysyanskaya et al., 2000). However, to comply with as many of the privacy principles and legislation as possible and to improve e-services, a good pseudonym technology should satisfy the following characteristics.

Basic requirements. These are very important requirements for a pseudonym technology in order to satisfy the privacy principles and be applicable to e-services. We say they are basic because any one of them, if broken, can destroy the whole system. Furthermore, we categorize them as privacy-related requirements and security-related requirements.
• Privacy-related requirements.

○ Pseudonymity: Pseudonymity lets the user maintain one or more persistent personae, but these personae are unlinkable to the user's physical identity. This allows a pseudonym to have a certain level of anonymity in order to serve as a basic requirement for privacy. As many researchers have already noted, full anonymity is not beneficial to anyone in many situations, especially authentication-based e-services. With pseudonymity, users can control their personal data more effectively. In addition, it is of great benefit to organizations, too. They can minimize the risk of identity fraud, increase the authenticity and accountability of their e-services, and cultivate goodwill among users.

○ Unlinkability: Unlinkability means that the organizations cannot learn more than what the pseudonym reveals, that is, making the pseudonyms linkable is not much better than random guessing. This requirement lets users control how much personal data they actually disclose under an e-service. Otherwise, the aggregate linked information would be much more than the users were willing to disclose.

○ Property sharing resistance: This protects organizations from a user who improperly shares his or her pseudonyms or credentials with other users so that those users gain privileges which they otherwise would not have. It is very difficult for a protocol to reach this goal, especially for multiple-use credentials. There are two solutions to date. One is to make credential sharing like e-cash sharing (the e-cash system checks for double-spending); this at least causes the organization no big loss. The other solution is to make pseudonym or credential sharing result in sharing the user's master secret key, such as in the Pseudonym System (Lysyanskaya et al., 2000) and Private Credentials (Brands, 2000) (detailed information is in the section entitled Pseudonym System Architecture).

• Security-related requirements.

○ Authentication: With authentication, the organizations can authenticate the users effectively, that is, reject invalid users or hackers and accept only valid users. This is a basic security requirement for most e-services.

○ Unforgeability: Unforgeability requires that a credential cannot be generated solely by the user. It must be issued with the organization's cooperation. Without unforgeability, the system becomes useless.

○ Security of the user's secret key: The system must make sure that the user's secret key is not revealed during any system processing. In addition, the key generation technology itself should make sure that the secret key is secure under complexity-theoretic security or computational security.

○ Security of the protocols: All security protocols in the system must be strong enough under existing cryptanalysis technologies and secure against varied attacks.
Advanced requirements. These requirements are considered secondary for a pseudonym technology since they are only adding more features for a pseudonym technology, and some pseudonym applications may not require them. But they add very good properties to some special application systems and make the technology work more effectively for certain special e-services.
• Selective disclosure. This means that the user can show different attributes of a credential to different organizations without revealing the other attributes in the credential. One example is Private Credentials proposed by Brands (Brands, 2000). This is a very good property for most multiple-use credentials.
• Reissuance. This requirement was also proposed by Brands. With this property, an organization can refresh an issued credential without knowing the attributes it contains. The technology prevents the organization from learning attributes of the credentials. In addition, different organizations can certify different attributes for the same credential that has this reissuance property.
• Dossier resistance. This is another requirement presented by Brands, intended to let a multiple-use credential leave no more evidence than is necessary to validate the user during transactions. One solution is to use a self-authenticating technology so that the credential carries self-authenticating evidence for user validation. This requirement protects users against a central authority learning more personal information than the users have disclosed.
• Non-repudiation. With this requirement, the system can protect the organizations against a user denying his or her previous actions. Most pseudonym technologies use signature technologies for non-repudiation services, but the signature does not reveal any personal information about the user (Song & Korba, 2003).
• Confidentiality. This protects the content of information, such as communication messages or a credential's attributes, from all but the users or organizations authorized to have them. With this requirement, the pseudonym technology has to use encryption and decryption algorithms.
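To make the confidentiality requirement concrete, the following minimal Python sketch shows symmetric encryption and decryption of a message or credential attribute. It assumes the third-party cryptography package; the message text is hypothetical, and key distribution between the authorized parties is outside the scope of the sketch.

# Hedged sketch of the confidentiality requirement: protecting message or
# attribute content with symmetric encryption (assumes the third-party
# `cryptography` package; key distribution is out of scope here).
from cryptography.fernet import Fernet

key = Fernet.generate_key()        # shared between the authorized parties only
box = Fernet(key)

token = box.encrypt(b"credential attribute: age_over_18=yes")
print(box.decrypt(token))          # only key holders can read the content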
Pseudonym System Architecture

A pseudonym system is an identity and certificate management system with pseudonym and credential management and privacy protection functionalities. It consists of three parts: certification authority, organizations, and users, based on the pseudonym architecture models developed by Chaum and Evertse (1986), Chen (1995), Lysyanskaya et al. (2000), and Brands (2000). The certification authority (CA) is a special organization that registers users and organizations with their public keys and issues public key credentials to them as valid users and organizations in the system. Users can then prove to an organization that their pseudonyms correspond to the public keys of valid users. The organizations (e.g., banks, governments) set up pseudonyms and accounts for the users to access their e-services. Some organizations may issue private credentials to the users so that the users can demonstrate possession of the credentials to other organizations without revealing their personal information. Each user uses different pseudonym accounts with different organizations, and the pseudonym accounts are unlinkable to each other. Figure 1 depicts the components of a general pseudonym system and the process flows among the different components.
Figure 1. General pseudonym system components: the CA issues credentials to users and to organizations; organizations issue pseudonyms or credentials to users, and users access e-services from the organizations.
A fundamental technology used in most pseudonym systems is blind signature technology. In a pseudonym system, to register a pseudonym with an organization, the user must show his or her real name or certificate to the organization for verification. In order to prevent the organization from tracing the user's pseudonym, the user usually creates a blinded pseudonym message (using a specially designed function with the pseudonym and a random number as input) and sends it for registration. After verification, the organization signs the blinded pseudonym message and sends it to the user. The user then employs the organization's signature together with his or her pseudonym to access the organization's services. The organization cannot trace the pseudonym since the signature on the pseudonym is different from the organization's signature on the blinded pseudonym message (a minimal code sketch of this blinding step appears after Table 1).

The pseudonym system described above involves several different public keys, for instance, the master public key, pseudonym, and credential. These keys are issued by different organizations and have different purposes. Table 1 gives a simple comparison of them.
Table 1. Comparison of the different keys in a pseudonym system

Key | Key generating party | Key issuer | Key generating protocol | Protocols where key is applied
Master public key | By user | CA | Registration protocol | All protocols in the system
Pseudonym | By user and organization | Organization | Pseudonym registration protocol | Pseudonym authentication protocol
Credential | By user and organization | Organization | Credential issue protocol | Credential transfer protocol
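As a rough illustration of the blinding step described before Table 1, the following Python sketch uses a textbook RSA blind signature with toy parameters. It is not the exact protocol of any system discussed in this chapter, the parameter sizes and lack of padding make it far too weak for real use, and the pseudonym string is hypothetical.

# Toy RSA blind signature for registering a pseudonym (illustrative only --
# tiny parameters, no padding; NOT the exact protocol of any cited system).
import hashlib
import secrets

# Organization's RSA key pair (toy primes chosen for readability).
p, q = 10007, 10009
n, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))         # private signing exponent

def h(msg: bytes) -> int:
    """Map an arbitrary pseudonym string to an integer mod n."""
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % n

# --- User side: blind the pseudonym before sending it for signing ---
pseudonym = b"pseudonym-for-bank-account-42"
m = h(pseudonym)
r = secrets.randbelow(n - 2) + 2          # blinding factor, assumed coprime to n
blinded = (m * pow(r, e, n)) % n          # m * r^e mod n

# --- Organization side: signs the blinded value, learning nothing about m ---
blind_sig = pow(blinded, d, n)            # (m * r^e)^d = m^d * r mod n

# --- User side: unblind to obtain an ordinary signature on the pseudonym ---
sig = (blind_sig * pow(r, -1, n)) % n     # m^d mod n

# Anyone can verify the signature against the pseudonym, but the organization
# cannot link this (pseudonym, sig) pair back to the blinded value it signed.
print("signature on pseudonym verifies:", pow(sig, e, n) == m)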
• Master public key: The concept of the master public key was proposed by Lysyanskaya (Lysyanskaya et al., 2000) in order to protect the pseudonym system against property (pseudonym and credential) sharing. This means that the user must share his or her master secret key with others if he or she wants to share pseudonyms or credentials with them in the pseudonym system. To support this, a set of special protocols must be designed carefully. The user generates pseudonyms or credentials together with the organization, using the user's master public key and secret key, by running these protocols. In addition, the user must first register his or her master public key with the certification authority in order to make the master public key valid in the system. In some pseudonym technologies like e-cash, the user does not have a master public key. In that case, the user usually uses his or her public key certificate to get the pseudonym or credential (e.g., e-coin) by running a special protocol (e.g., a blind signature protocol). Sharing the credential (e-coin) in this case means sharing one's money.
• Pseudonym: A pseudonym is a kind of public key that has the same secret key as the master public key if the system uses master public key technology. Pseudonyms are issued by the organizations for their clients to access their e-services anonymously and authentically. Another purpose of the pseudonym is to generate a credential for the user through interaction with the organization. In order to get a pseudonym, the user must interact with the organization by running a pseudonym generation protocol with his or her master public key and secret key.
• Credential: A credential is a kind of certificate. Some schemes use the same secret key as the master public key and pseudonym for the credentials during authentication, for instance, Lysyanskaya's scheme (Lysyanskaya et al., 2000). Other schemes use different secret keys for each credential, such as e-cash and e-ticket (Song & Korba, 2003). The credential is usually issued by one organization to a client so that the client can demonstrate to another organization that he or she has gone through credential issuance. A good example is when a bank issues e-cash to a client and the client gives the e-cash to a retail store to purchase some products. In order to get a credential, the user must interact with the bank by running a credential issue protocol with his or her pseudonym and secret key.
The following protocols should be designed and developed in a pseudonym system in order to reach the above goal and complete the privacy protection functionalities. Figure 2 depicts the work flow of the protocols in a pseudonym system, where the longer dotted line separates the different applications, the shorter dotted line separates the different protocols under the same application based on the process sequence, and the ellipses represent activities made by a subject if the ellipses are under the subject directly, or interactive activities made by two subjects if the ellipses are between the two subjects.
Figure 2. Work flow of the protocols in a pseudonym system, showing (1) the user master public key registration protocol, (2) the organization credential registration protocol, (3) the pseudonym registration protocol, (4) the pseudonym authentication protocol, (5) the credential issue protocol, and (6) the credential transfer protocol, run among the user, the CA, an organization, and other organizations.

• The user master public key registration protocol: The goal of this protocol (shown in part 1 of Figure 2) is to issue a credential to a user based on his or her master public key so that he or she can prove to an organization that he or she is a valid user who owns the master public key in the system. In this protocol, the user is required to reveal his or her true identity and master public key to the certification authority. The certification authority verifies if the user really owns the corresponding secret key of the master public key by an interactive security protocol, for instance, a challenge-response authentication protocol. If the verification is successful, the CA will issue the corresponding credential of the master public key to the user.
• The organization credential registration protocol: This protocol (shown in part 2 of Figure 2) is to issue a credential to an organization. The procedure of the protocol is similar to the previous protocol. With the public key credential, an organization can prove to users or other organizations that it is a valid organization in the system.
• The pseudonym registration protocol: In this protocol (shown in part 3 of Figure 2), the user first sends his or her master public key and corresponding credential to the organization. The organization then generates a pseudonym together with the user through a pseudonym generation protocol and opens a pseudonym account for the user to access the e-services.
• The pseudonym authentication protocol: The pseudonym authentication protocol (shown in part 4 of Figure 2) establishes secure communication between the user and the organization. It can be a normal authentication protocol with pseudonym characteristics. The user must prove that he or she is the owner of the pseudonym during authentication. After the protocol, a secure communication channel is established between the user and the organization.
• The credential issue and transfer protocols: The goal of these protocols is to let a user obtain a credential from one organization using his or her pseudonym and prove possession of the credential to another organization without revealing any other personal information about himself or herself. In these protocols, the user first needs to prove that he or she is the owner of the pseudonym by running the pseudonym authentication protocol. The organization then interacts with the user to generate a credential for him or her through the credential issue protocol (shown in part 5 of Figure 2). After that, the user can demonstrate to another organization that he or she is the owner of the credential through the credential transfer protocol (shown in part 6 of Figure 2).
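The pseudonym authentication step described above can be pictured with a simple challenge-response exchange. In the following Python sketch the pseudonym is modeled as a toy RSA public key whose owner signs a fresh challenge; real systems would use standardized signature schemes and padding, so this is only a minimal illustration.

# Minimal sketch of challenge-response pseudonym authentication
# (illustrative toy RSA; parameter sizes far too small for real use).
import hashlib
import secrets

# The pseudonym is a public key (n, e); only its owner holds d.
p, q = 10007, 10009
n, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))

def h(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

# Organization: issue a fresh random challenge for this session.
challenge = secrets.token_bytes(16)

# User: prove ownership of the pseudonym by signing the challenge.
signature = pow(h(challenge), d, n)

# Organization: accept only if the signature verifies under the pseudonym key.
authenticated = pow(signature, e, n) == h(challenge)
print("pseudonym owner authenticated:", authenticated)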
Practice and Relationship With Other Privacy Technologies

As we mentioned, there are several kinds of privacy protection techniques for e-services. They have different functionalities. The main purpose of pseudonym technology is to provide anonymous authentication for users to access e-services. With pseudonym technology, a service provider can verify the users through access control, but the provider cannot link the pseudonyms with the real identities of the users. The technology can limit the provider's ability to collect personal information. Pseudonym technology should be implemented as part of the access control and identity management in an e-service system. Obviously, it cannot protect the users from traffic analysis attacks during communications. For instance, a service provider or an attacker can easily trace a computer using meta data such as the IP address in the communication network layer and link the pseudonym used in the communication with the user of the computer.

Anonymous communication networks such as onion routing or MIX networks can provide anonymous communication for users to protect their privacy from traffic analysis attacks in the communication layer, such as tracing a message to identify the sender and receiver. With this technology, it is very difficult for a service provider to trace the real
source or destination of a message, but it cannot prevent the service provider from collecting personal information through the communication content if the system does not use other privacy mechanisms such as pseudonym technology. Privacy policy technology refers to efforts to protect the user’s privacy by controlling the personal data with certain rules that are compatible with privacy legislation. For a service provider, the rules may describe which part of personal information it will collect and for what purpose. In a policy enforcement system, some rules may trigger security mechanisms such as encryption and integrity in order to protect the personal data. Usually, pseudonym technology is provided by an e-service provider and implemented in the application layer along with the e-service system. The anonymous communication network is provided by an anonymous network service provider and implemented in the lower communication network layer. Obviously, anonymous communication and pseudonym technology are implemented separately. However, they can be easily combined and used together in order to provide solid privacy protection for an e-service system. For instance, in a Web-based service system, a user can set up http communications through an anonymous network proxy and let the anonymous network forward the user’s communication messages. The user can then use his or her pseudonym, which is provided by a service provider, to access the Web service. This is not difficult for a person who has the required knowledge. Less knowledgeable users, however, may feel incapable of managing these technologies. A privacy policy enforcement system can provide efficient methods and give the user a better understanding of privacy protection. Such a system can combine these technologies together so that the user only has to manage his or her privacy policy. The policy will automatically call the privacy and security mechanisms to protect privacy.
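As a hedged illustration of combining the two layers just described, the following Python sketch routes Web requests through an assumed local Tor SOCKS proxy (port 9050, with the requests[socks] extra installed) and then logs in at the application layer under a pseudonym. The URL, field names, and token value are hypothetical.

# Rough sketch: anonymous communication layer plus pseudonym at the
# application layer (assumes a local Tor SOCKS proxy and requests[socks]).
import requests

ANON_PROXY = {
    "http": "socks5h://127.0.0.1:9050",   # socks5h: DNS is resolved by the proxy too
    "https": "socks5h://127.0.0.1:9050",
}

session = requests.Session()
session.proxies.update(ANON_PROXY)

# Application-layer login uses a pseudonym credential, never the real identity.
resp = session.post(
    "https://example-eservice.test/login",
    json={"pseudonym": "user-7f3a", "credential": "<blind-signed token>"},
    timeout=30,
)
print(resp.status_code)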
Pseudonym Technologies for E-Services

We review several pseudonym technologies for privacy protection in e-services in this section and compare them based on the pseudonym requirements listed previously. These pseudonym technologies are e-cash, e-ticket, e-voting, pseudonym systems, and private credentials.
E-Cash System for E-Commerce

Electronic cash (e-cash) is a kind of digital money that can be transferred by means of a computer network and traded as a token exchangeable for real money (Telecom Glossary 2000). In pseudonym systems, e-cash is a kind of single-use credential. The e-cash system was first proposed by Chaum in 1982 (Chaum, 1982; Chaum, Fiat, & Naor, 1988) in order to protect personal information from payment tracing by using blind signature technology (explained later). Since then, many new e-cash schemes have been proposed and developed with improved properties, for instance, Abe's scheme (Abe & Fujisaki, 1996), Miyazaki's scheme (Miyazaki & Sakurai, 1998), and Kim's scheme (Kim & Oh,
2002), but blind signature technology is always the key technology used to achieve the privacy protection goals for e-cash systems. Figure 3 depicts an e-cash system.

Figure 3. E-cash system components: the CA issues certificates to the users, Web banks, and Web stores; a user requests and receives e-cash from a Web bank, purchases merchandise from a Web store using the e-cash, and the store deposits the e-cash at its Web bank and receives confirmation.

An e-cash system consists of four elements: a certification authority, Web banks, Web stores, and users. The certification authority issues the public key certificates to the users, Web banks, and Web stores. In the system, the principal technology used is the blind signature, which involves four phases. The first phase is the initiating phase. In this phase, the bank sets up its public key parameters for the e-cash system. The second phase is the withdrawing phase. In this phase, the user requests to withdraw e-cash from his or her bank. To do this, the user blinds a message (M) that contains a random number as a pseudonym using a specially designed one-way function (e.g., f1()); that is, the blinded message is M' = f1(M, r, ...), where r is a secret random number owned by the user. The user then sends the blinded message (M') with his or her identity and bank account to the bank. The bank signs the blinded message (M'), takes the money for the e-cash from the user's account, and sends the signature (S') to the user. The third phase is the unblinding phase, in which the user recovers the bank's signature (S) on the original message (M) using another one-way function (e.g., f2()), that is, S = f2(S', r, ...), and obtains his or her e-cash (M, S). We call this mechanism a blind signature; it means the bank cannot trace back to the user when the user spends (M, S) later. The last phase is the depositing phase. In this phase, the user can buy any merchandise in a Web store with the e-cash. The Web store will verify the e-cash and send it to its Web bank for an online or off-line double-spending check. If the double-spending check is successful, the bank will add the same amount of money to the store's account. The store then delivers the purchased merchandise to the user. Figure 4 (CA's function not shown) depicts the system processing work flow.

Figure 4. Work flow of an e-cash system: in the initiating phase the bank sets up its public key parameters; in the withdrawing phase the user blinds a message M with the one-way function f1() and sends the blinded message M', identity, and account, and the bank signs it and debits the account; in the unblinding phase the user unblinds the signature and obtains the e-cash (M, S); in the depositing phase the store verifies the e-cash, the bank performs the double-spending check and confirms, and the purchased merchandise is delivered.

E-cash is a single-use credential, but its pseudonym uses a random serial number, not a public key, so the user usually does not have a secret key for user authentication. This forces the system to use other technologies such as SSL for user authentication. Furthermore, the system uses a double-spending check technology in order to resist property sharing. A good e-cash system satisfies most of the pseudonym requirements, such as pseudonymity, unlinkability, unforgeability, property sharing resistance, dossier resistance, and security of the protocols, but it does not have characteristics like authentication, selective disclosure, reissuance, non-repudiation, and confidentiality, where authentication and non-repudiation are very important requirements for an e-commerce application. Most real applications use SSL technology for user authentication. This would expose the user's identity and destroy the unlinkability. One solution we suggested is to embed a public key into the e-cash as the pseudonym instead of the random serial number (Song & Korba, 2004). The main idea is similar to the following e-ticket system.
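The four e-cash phases and the double-spending check can be sketched compactly as follows. The Python sketch reuses the toy RSA blind signature from the earlier example; it is illustrative only, not a faithful Chaum e-cash implementation, and the serial number simply stands in for the random pseudonym inside the coin.

# Compact sketch of the four e-cash phases plus an online double-spending
# check (toy RSA blind signature; NOT a faithful Chaum e-cash implementation).
import hashlib
import secrets

# Initiating phase: the bank publishes its RSA parameters.
p, q = 10007, 10009
n, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))
spent_serials = set()                     # bank-side double-spending register

def h(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

# Withdrawing phase: the user blinds a coin containing a random serial number.
serial = secrets.token_hex(16)
m = h(serial.encode())
r = secrets.randbelow(n - 2) + 2          # blinding factor, assumed coprime to n
blinded = (m * pow(r, e, n)) % n
blind_sig = pow(blinded, d, n)            # bank signs and debits the account

# Unblinding phase: the user recovers the bank's signature on the coin.
coin = (serial, (blind_sig * pow(r, -1, n)) % n)

# Depositing phase: the store verifies the coin and asks the bank to check reuse.
serial, sig = coin
valid = pow(sig, e, n) == h(serial.encode())
fresh = serial not in spent_serials
if valid and fresh:
    spent_serials.add(serial)
print("coin accepted:", valid and fresh)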
E-Ticket for Pay-TV System

Security and privacy in Pay-TV systems have been researched for some time (Lee, Change, Lin, & Hwang, 2000; Lee, 2000; Song & Korba, 2003; Song & Lyu, 2001). The latest electronic ticket (e-ticket) system for Pay-TV applications was proposed by Song and Korba (Song & Korba, 2003). In this system, an e-ticket could be a single-use or
multiple-use credential depending on the application requirements. The e-ticket technology is based on existing e-cash technology (Abe & Fujisaki, 1996), but it enhances the security and privacy characteristics with user authentication and non-repudiation protection for the system. The system consists of three elements: a certification authority, TV service providers, and users. The certification authority issues the public key certificates to the TV service providers and users. The TV service providers issue e-tickets to the users. The users then use the e-tickets to subscribe to TV channels. The protocol for the system includes four phases as follows.
1. E-ticket issue phase: In this phase, the user inserts a random public key as a pseudonym into the blinded message so that the user holds a secret key for the e-ticket. This enables the system to support user authentication and non-repudiation protection when the user spends the e-ticket later.
2. TV channel subscription phase: In this phase, the user sends a statement of the TV channels and programs along with his or her e-ticket to the TV service provider. The whole message sent to the provider is signed with the corresponding secret key of the pseudonym by the owner of the e-ticket so that the provider can authenticate the message by the signature and time stamp. At the same time, the provider deducts the money from the e-ticket and sends the balance to the user through an anonymous network or e-mail if the e-ticket is a multiple-use credential. Otherwise, the provider will destroy the e-ticket.
3. TV channel adaptation and suspension phase: In this phase, the user can change and stop his or her selected TV channels. To do this, the user sends the changed information together with the e-ticket to the provider. The provider will authenticate the user by the signature.
4. E-ticket renew phase: The user can renew his or her e-ticket before it expires. To do this, the user sends his or her old e-ticket to the provider. The provider then reissues the e-ticket to the user with a new expiration date.
The e-ticket system satisfies all the pseudonym requirements that we mentioned previously except selective disclosure. Selective disclosure is not required for a Pay-TV system. In order to satisfy sharing resistance, the system uses the double-spending check so that the pseudonym or credential (e-ticket) sharing means money sharing for the consumers.
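A minimal sketch of the subscription step is given below, assuming a toy RSA key pair embedded in the e-ticket as the pseudonym and hypothetical message fields; the real scheme uses the blind-signature-issued e-ticket and proper signature standards, so this is only an illustration of the authentication and non-repudiation idea.

# Sketch of the TV channel subscription step: the request is signed with the
# secret key matching the public key embedded in the e-ticket (toy RSA,
# hypothetical message fields; not the actual Song-Korba protocol).
import hashlib
import json
import time

p, q = 10007, 10009                       # e-ticket pseudonym key pair (toy)
n, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))

def h(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

# User: build and sign the subscription request.
request = {"e_ticket_id": "t-301", "channels": ["news", "movies"],
           "timestamp": int(time.time())}
payload = json.dumps(request, sort_keys=True).encode()
signature = pow(h(payload), d, n)

# Provider: verify the signature against the pseudonym key in the e-ticket,
# and reject stale timestamps to prevent replay.
ok = pow(signature, e, n) == h(payload)
fresh = abs(time.time() - request["timestamp"]) < 300
print("subscription accepted:", ok and fresh)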
E-Voting

Electronic voting (e-voting) is an election system that allows voters to record their secure and secret ballots electronically. In the last 20 years, many kinds of e-voting technologies have been proposed and developed, for instance, Internet-based, telephone-based, anonymous network-based, and pseudonym-based (Chaum, 1988; Cramer, Gennaro, & Schoenmakers, 1997; Hoffman, 2000; Jorba, Ruiz, & Brown, 2003; Juang, Lei, & Liaw, 2002). Here, we only discuss pseudonym-based e-voting technologies. The latest
pseudonym-based e-voting scheme was proposed by Liaw in 2003 (Liaw, 2003) to solve problems such as uncoercibility, non-cheating, uniqueness, fairness, anonymity, mobility, and efficiency. This system consists of five elements: a certification authority, a publisher, a decryptor, a signer, and voters. The certification authority issues the public key certificates for the users, signer, and publisher. The signer signs and checks the voter's election anonymously using blind signature technology. The voter then sends the signed voting ballot to the untraceable decryptor (which renders the voting ballot untraceable) in the voting center. The voting center records the voting ticket and forwards it to the publisher. The publisher finally decrypts the voting ticket and reveals the voting result. Figure 5 depicts the e-voting system and process flow.

Figure 5. E-voting system components: the CA issues certificates to the voters, signer, and publisher; a voter sends a blinded ballot to the signer and receives a blind signature, then sends the signed voting ballot to the voting center, which forwards it to the publisher.

In this system, an electronic ballot (e-ballot) is a single-use credential. The primary technology used in the system is a blind signature scheme. The protocol includes four phases: the initiating phase, voting phase, scrutiny phase, and publishing phase.
1. Initiating phase: In this phase, the signer, publisher, and voters request their public key certificates from the certification authority. In addition, each voter needs to request a smart card from the CA with a unique identifying number authorized by the CA. The smart card contains the signer's public key and the publisher's public key for computing the blinded voting message. For each vote, the voting center chooses a random number (RD) to check the validity of the votes. The random number is used to create the blinded voting message.
2. Voting phase: In this phase, each voter fills out his or her voting ballot and inputs it into his or her smart card. The smart card encrypts and blinds the ballot with the signer's public key, the publisher's public key, and the random number RD together, and sends it to the signer. The signer verifies the message and signs it for the voter. With this signed message, the voter obtains the signature for the blinded ballot. Finally, the voter sends the blinded ballot with the signature to the voting center.
3. Scrutiny phase: The voting center verifies the signature and the blinded ballot. It then records the voting ballot and forwards it to the publisher if the verification is successful.
4. Publishing phase: Upon reception of the voting ballot, the publisher decrypts the ballot and publishes the vote, which consists of a hash number and a voting choice. The hash number was created by the voter with a random number in the voting phase, so the voter can use the published vote (both hash number and voting choice) to check whether his or her vote has been counted.
This e-voting system satisfies most of the pseudonym properties listed above except for non-repudiation, selective disclosure, and reissuance. This is to be expected since e-voting has different requirements from other e-services such as e-cash and e-ticket. In addition, this e-voting system satisfies other properties that are not listed in the pseudonym requirements but are very important for an e-voting system, for instance, completeness, incoercibility, and non-cheating.
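The published-vote check described in the publishing phase can be illustrated with a simple hash binding. The Python sketch below is illustrative only: the real scheme also involves blind signatures, smart cards, and encryption, and the second published entry is a made-up example.

# Minimal sketch of the published-vote check: the voter binds a secret random
# number to the ballot with a hash and later looks for that hash in the
# published results (illustrative only).
import hashlib
import secrets

def vote_tag(nonce: bytes, choice: str) -> str:
    return hashlib.sha256(nonce + choice.encode()).hexdigest()

# Voting phase: the voter keeps the nonce secret and submits the ballot.
nonce = secrets.token_bytes(16)
my_choice = "candidate-B"
my_tag = vote_tag(nonce, my_choice)

# Publishing phase: the publisher posts (hash, choice) pairs for all ballots.
published = [(my_tag, my_choice), ("9f2c...", "candidate-A")]

# Verification: the voter checks that his or her own pair was counted.
print("my vote was counted:", (my_tag, my_choice) in published)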
Pseudonym Systems

The general pseudonym system architecture has been described above. We now introduce a recent pseudonym system called Pseudonym Systems, constructed by Lysyanskaya (Lysyanskaya et al., 2000). The system is based on a blind transcription technology, the discrete logarithm problem (Menezes et al., 1996), and the ElGamal public key cryptosystem (ElGamal, 1985). In this system, the users first set up their master public key parameters, publish their public keys through a public key infrastructure (PKI) or other means, and keep the secret keys for themselves. Each organization creates its credential keys for issuing pseudonyms. After that, a user can apply for different pseudonyms from different organizations for their e-services, where all pseudonyms are related to the user's master secret key in order to dissuade the user from sharing his or her pseudonym with others (which would result in the user sharing his or her master secret key). Using a pseudonym, the user can communicate with an organization securely and anonymously, for instance, through authentication, encryption, and signatures. In addition, the user can obtain credentials from organizations and use them with other organizations through the credential issue and transfer protocols, where the credentials also use the same secret key as the user's master public key for user authentication. The system satisfies most of the pseudonym requirements listed above except for selective disclosure and reissuance.
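The following sketch is not Lysyanskaya et al.'s actual construction; it only illustrates, in a toy discrete-logarithm group, how every pseudonym can be bound to one master secret so that sharing a pseudonym's opening amounts to sharing the master secret key.

# Illustration (NOT Lysyanskaya et al.'s protocol) of pseudonyms bound to a
# single master secret in a discrete-log setting.
# Toy group: the order-11 subgroup of Z_23*; real systems use large groups.
import secrets

p, q = 23, 11          # group modulus and subgroup order (toy)
g, h = 4, 9            # two subgroup generators, relative discrete log assumed unknown

x = secrets.randbelow(q)               # user's master secret key

def new_pseudonym(master_secret: int):
    """Create an unlinkable pseudonym committed to the master secret."""
    r = secrets.randbelow(q)                               # per-pseudonym randomness
    P = (pow(g, master_secret, p) * pow(h, r, p)) % p      # g^x * h^r mod p
    return P, r                                            # r is the opening

# Pseudonyms registered with two different organizations look unrelated ...
pseud_bank, r_bank = new_pseudonym(x)
pseud_shop, r_shop = new_pseudonym(x)
print("bank pseudonym:", pseud_bank, " shop pseudonym:", pseud_shop)

# ... but opening either one (revealing x and r) is exactly what a user would
# have to hand over in order to "share" it, which deters pseudonym sharing.
assert pseud_bank == (pow(g, x, p) * pow(h, r_bank, p)) % p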
Private Credentials

The Private Credentials system is an application proposed by Brands in 2000 (Brands, 2000) for the Freedom Network managed by Zero-Knowledge Systems. The primary technology in the Private Credentials system is similar to that of blind signatures, first proposed by Chaum (Chaum, 1982, 1985). However, this technology has very different properties. For instance, Private Credentials has selective disclosure characteristics. In addition, the system has similar components to the pseudonym system shown in Figure
1. However, the process flows are very different. In the Private Credentials system, the certification authority directly issues the Private Credentials to the users. The users then show their private credentials to other organizations to get the e-services. Figure 6 depicts the process flow of the system.

Figure 6. Process flow of the Private Credentials system: the CA issues Private Credentials to the users, who show the private credentials to organizations to get e-services.

Two protocols — the private credential issue protocol and the authentication protocol — are designed and developed in order to promote practicality and simplicity (see Glenn et al. [2001]). The protocols are patented by Zero-Knowledge Systems Inc.
• Private credential issue protocol: This protocol includes four phases. The first phase is the Initiating Phase. In this phase, the user and CA set up their parameters for issuing private credentials. The second phase is the Private Attributes Validation Phase, in which the user must send all credential attributes to the CA. The CA verifies whether these attributes are correct. The third phase is the Blinding and Signing Phase. In this phase, the user blinds his or her pseudonym and credential with the parameters sent from the CA and sends the blinded message to the CA. The CA then signs the blinded message and sends the signature to the user. The last phase is the Unblinding Phase, in which the user unblinds the signature and gets his or her private credential. The private credential has two parts: a public part, like a pseudonym, and a secret part for later authentication. In this protocol the CA cannot learn who obtains which credential since the pseudonym (i.e., the public parameters of the credential) is blinded during the protocol.
• Private credential authentication protocol: With a private credential, the user can convince other organizations that he or she possesses the credential and can use the corresponding secret key to authenticate a message.
Private Credentials satisfies almost all pseudonym requirements listed previously. It uses the same strategy as Pseudonym Systems for discouraging property sharing, that is, the property sharing will reveal the user’s master secret key. In addition, Private
Credentials has another good property — selective disclosure, which makes the system more practical and convenient. For example, the user may want to disclose only his or her medical condition to a medical office instead of all private information. Many other pseudonym technologies do not have this property.
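The selective disclosure idea can be illustrated with simple per-attribute commitments. The following Python sketch is emphatically not Brands' construction: the CA's signature over the commitment list is omitted, and the attribute names and values are hypothetical; it only shows the effect of revealing one attribute while keeping the others hidden.

# Simplified illustration of selective disclosure via per-attribute hash
# commitments (NOT Brands' Private Credentials; CA signature omitted).
import hashlib
import secrets

def commit(value: str, salt: bytes) -> str:
    return hashlib.sha256(salt + value.encode()).hexdigest()

# Credential issuance: each attribute gets its own salted commitment; the CA
# would sign the ordered list of commitments.
attributes = {"name": "Alice", "age_over_18": "yes", "condition": "diabetes"}
salts = {k: secrets.token_bytes(16) for k in attributes}
commitments = {k: commit(v, salts[k]) for k, v in attributes.items()}

# Showing the credential to a medical office: reveal only one attribute.
disclosed = {"condition": (attributes["condition"], salts["condition"])}

# Verifier: recompute the commitment for the disclosed attribute and check it
# against the (CA-signed) commitment list; the other attributes stay hidden.
value, salt = disclosed["condition"]
print("disclosed attribute verified:", commit(value, salt) == commitments["condition"])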
Comparison of Pseudonym Technologies

Based on the review of the different pseudonym technologies and pseudonym characteristics, a comparison is presented in Table 2. The comparison only gives a general idea. In Table 2, an application that has more properties does not necessarily have better privacy protection, since each application may have different requirements. For example, the requirements for the e-voting system are very different from those of other applications like e-cash. In addition, the comparison is based on the techniques currently implemented or proposed; pseudonym technologies for e-services improve over time.
Table 2. Comparison of the different pseudonym technologies

Properties | E-cash | E-ticket | E-voting | Pseudonym Systems | Private Credentials
Pseudonymity | ✓ | ✓ | ✓ | ✓ | ✓
Authentication | | ✓ | ✓ | ✓ | ✓
Unlinkability | ✓ | ✓ | ✓ | ✓ | ✓
Unforgeability | ✓ | ✓ | ✓ | ✓ | ✓
Security of the secret key | | ✓ | ✓ | ✓ | ✓
Security of the protocols | ✓ | ✓ | ✓ | ✓ | ✓
Property sharing resistance | ✓ | ✓ | ✓ | ✓ | ✓
Selective disclosure | | | | | ✓
Reissuance | | ✓ | | | ✓
Dossier resistance | ✓ | ✓ | ✓ | ✓ | ✓
Non-repudiation | | ✓ | | ✓ | ✓
Confidentiality | | ✓ | ✓ | ✓ | ✓
Case Study: E-Wallet

Current payment systems using credit cards and bank debit cards make it easy for merchants to collect the consumer's personal data. They can easily record the user's purchase habits and personal information (e.g., the size of clothes purchased) during payment since the user's identity is on the card. The electronic wallet (e-wallet) provides a way to spend money and receive marketing material or services from service providers while protecting the consumer's privacy and payment transactions. With the e-wallet protocol (Chaum & Pedersen, 1992), users can obtain pseudonyms (or credentials) that the issuer and other powerful organizations cannot link to the users' identities. Unlike online e-cash, e-wallet can provide off-line payments (e.g., the ESPRIT Project CAFE [Boly et al., 1994]). In order to examine how the e-wallet system can protect the user's privacy and secure payment transactions, we review its architecture and protocols, and introduce an application: the CAFE project. Finally, we discuss the current applications of e-wallet over the Internet.
E-Wallet System Architecture

The electronic wallet was first proposed by Chaum and Pedersen (1992) in order to ensure that an organization can only store, read, and update valid information (e.g., a pseudonym or credential) in a tamper-proof device (e.g., smart card) issued to a user. For these purposes, an e-wallet consists of an observer and a purse.
• The observer is the tamper-proof device (e.g., smart card) trusted by the issuer (e.g., a bank); it protects the issuer's interest during off-line payment transactions. It is a container for e-coins issued by banks. However, the observer cannot directly communicate with service providers (e.g., retail stores or banks) during transactions. All communications from the observer must go through the purse to connect with the outside world (e.g., a service terminal).
• The purse is a hardware device owned and trusted by the user. When a user wants to spend his or her e-coins, the user puts his or her observer into his or her purse and connects the purse to the service device using a standard interface. With the purse, the user can fully control the communications between the observer and the service provider, and this prevents the observer from performing unsolicited actions. However, the user cannot modify the data stored in the observer, nor can the user modify the transaction messages between the observer and the service provider, since the observer is a tamper-resistant device and the messages are protected by security functions such as digital signatures.
Figure 7 depicts the Chaum-Pedersen e-wallet architecture.

Figure 7. Chaum-Pedersen e-wallet architecture: the e-wallet consists of an observer and a purse, and all communication between the observer and the service terminal passes through the purse.

From the architecture, we can see that the user (purse) can freely communicate with the outside world without the knowledge of the observer, but an honest organization (service terminal) will only accept messages approved by the observer. In addition, an e-wallet system is similar to an e-cash system (see Figure 3), but the e-wallet system can provide off-line payments (i.e., both the payer and payee do not need to connect to any bank during transactions) and better protections for both the consumers and service providers.

For e-wallet applications, the European Community's ESPRIT project CAFE (Boly et al., 1994) has developed technology and a generalization of the concept of an electronic wallet based on the security architecture of the Chaum-Pedersen e-wallet. In order to make the e-wallet system work in wireless and ubiquitous computing environments, Mjolsnes and Rong (2003) extended the system with decentralized credential keepers. In the Mjolsnes-Rong e-wallet system, the observer is located in the remote home security domain or at a trusted third party named the "credential keeper," like a bank safety-deposit box that stores and protects the pseudonyms and credentials issued to the user. The e-wallet is a mobile device (e.g., cell phone, PDA) and consists of the keeper's agent and a purse. The keeper's agent is tamper-resistant hardware, such as a smart card, that contains a secret key to protect the communications and transactions between the credential keeper and the agent. The credential keeper's agent is triggered to control and communicate with the credential keeper when the e-wallet application (purse) needs to communicate with the observer, for instance, to request a credential. Since the communications between the mobile device, credential keeper, and service terminal may cross an open public network, some end-to-end security protection mechanisms (e.g., AES end-to-end encryption, SSL/TLS security for Web services, Bluetooth Security Mode 3 for short range physical services) are required. In addition, the Mjolsnes-Rong e-wallet system can be applied to various application areas such as the electronic acquisition of an e-token (e.g., e-ticket). Figure 8 depicts the Mjolsnes-Rong e-wallet architecture, where the service terminal can consist of physical services and Web-based services.
Figure 8. Mjolsnes-Rong e-wallet architecture: the observer resides with the credential keeper in a remote security domain; the mobile device holds the keeper's agent and the purse, and the purse communicates with the service terminal.
E-Coin Issuing Protocol

In order to let the user ensure that his or her privacy is not compromised during transactions and let the observer validate all messages from the user to the outside world, Chaum and Pedersen designed an e-coin issuing protocol. The protocol avoids allowing the user to get a signature on any message that he or she chooses. For this purpose, the protocol must ensure that the message has been validated and approved by the observer before it is signed by the organization, that is, the organization only signs the blinded message approved by the observer. However, the organization cannot learn any personal information from the blinded message, although the observer knows the original message, since the user (purse) only allows the observer to send the approved challenge to the outside organization if the user follows the protocol. Figure 9 concisely depicts the process flow of the protocol.

Figure 9. E-coin issuing protocol in e-wallet: the purse sends a blinded e-coin request message to the e-coin issuer, which returns a partial signature on the e-coin; after a challenge request, the observer approves the challenge on the e-coin, and the issuer returns the remaining part of the signature.

With the protocol, a user can request an e-coin from an organization and store it in a smart card (observer). Later the user can show the e-coin to other organizations, for instance, pay the e-coin to get a cup of coffee from an unsupervised and isolated coffee machine. With e-wallet, the user can use an off-line payment to pay for it, that is, the user (purse) sends a message to the observer to record the transaction (e.g., deduct the e-coin from the smart card for the purchase) and gets an approved payment transaction message from the coffee machine (e.g., a signature on the payment confirmation). Later, the owner of the coffee machine can get money from the bank using the e-coin and the payment confirmation message stored in the coffee machine. With online e-cash, however, this off-line payment cannot be done.
E-Wallet Application Practice in CAFE

CAFE ("Conditional Access for Europe") is a European Community ESPRIT project that is primarily intended for payments from an e-wallet to a POS (point-of-sale) terminal and aims at a market of small everyday payments and personal data protection. In the CAFE system, every user has his or her own e-wallet, which controls his or her privacy and security. The basic CAFE device is an electronic wallet based on the Chaum-Pedersen e-wallet security architecture. The e-wallet consists of two basic devices: a small portable computer (purse) similar to a PDA (personal digital assistant) and a tamper-resistant device (observer) called the guardian. The purse protects the user's interests, and the guardian protects the electronic money issuer's interests. In addition, the guardian is only allowed to communicate via the purse, since the user is not supposed to trust the guardian. So the purse can check and modify all messages the guardian sends and receives. However, an honest payee only accepts the messages approved by the guardian on behalf of the payee, that is, no payment is accepted without the guardian's cooperation, such as a signature.

In the CAFE system, the purse communicates with other outside devices, such as service points and tills provided by banks and merchants, using a short range infrared channel, or over a computer network such as the Internet. It can also directly make transactions with other e-wallets held by other users. In addition, the system can be combined with a PDA, mobile phone, or laptop.

In addition to the hardware devices (purse and guardian), CAFE also uses cryptographic mechanisms such as blind signatures and off-line coins to protect the system. These crypto protocols can control all inflow and outflow of communications and prevent extra personal information from being disclosed to the outside world. With the blind signature protocols (Chaum, 1985, 1992), a CAFE user can obtain an electronic coin signed by an electronic money issuer, but the issuer does not know what the electronic coin looks like except for a certain form that the electronic coin must take in order to be in compliance. This ensures that the electronic money issuer cannot recognize the coins when the payer spends them and thus cannot trace the payer. The off-line coin mechanism is designed for off-line payments (Chaum et al., 1988). With the off-line coin protocol, the payer's identity is encoded into the coin. The payer must reveal a part of the identity coded in the coin when he or she uses the coin for a payment. The protocol is constructed so that the identity can be found out if the same coin is used in two payments, in which case the electronic money issuer can detect the cheating payers. More information can be found in (Chaum et al., 1988; Franklin & Yung, 1993; Brands, 1993). The CAFE system combines the off-line coin with the guardian in such a way that one part of the coin is held by the purse and another part by the guardian. The two parts together can create a secret key for the signature on a payment with the electronic coin. The guardian can prevent an electronic coin from being spent twice because the guardian would know not to provide its part of the coin for the secret key twice.

Furthermore, CAFE employs a loss tolerance mechanism to protect the user from a lost or stolen wallet. The mechanism is based on the loss-tolerant electronic wallet proposed by Waidner and Pfitzmann (1990, 1991).
Furthermore, CAFE employs a loss tolerance mechanism to protect the user against a lost or stolen wallet, based on the loss-tolerant electronic wallet proposed by Waidner and Pfitzmann (1990, 1991). The idea is to keep a backup of the user's electronic money outside the wallet, in such a way that the backup violates neither the privacy of the payer nor the security of the electronic money issuer. With the backup, the electronic money can be reconstructed, and the part that has not been spent can be credited to the user's account.

Based on its security architecture, the basic CAFE system implements the following features:
• Security: The system uses the multi-party security mechanism (Chaum, 1985), meaning that the guaranteed security properties do not force any party to trust the others; each party need trust only itself and the legal system. This is beneficial for all parties in the system. Furthermore, in the CAFE system, fake-terminal attacks can be prevented because PINs are entered directly into the e-wallet during verification.
• Privacy: The system protects personal data through untraceability and unlinkability. The payer (user) is untraceable for any payment transaction, that is, the e-wallet issuer cannot learn the identity of the payer from the payment. Furthermore, different payments cannot be linked to one another.
• Prepayment: The user must purchase electronic money from an electronic money issuer (e.g., a bank) and store it in his or her e-wallet before making any payment transactions.
• Off-line payment: It is not necessary to contact the electronic money issuer during a payment. This suits low-value, everyday payments, for which online communication and processing with the issuer may be too expensive.
• Loss tolerance: The user can recover his or her money if the e-wallet is lost, broken, or stolen.
• Open architecture and system: The system is designed as a universal payment system and is interoperable across different electronic money issuers. In addition, it is open to new hardware platforms and can be integrated into other systems.
E-Wallet Practice in Web-Based Services

E-wallet services and applications are becoming more and more popular for payments and transactions over the Internet. Many large Internet service providers, such as Yahoo, Amazon, eBay, AT&T, and Microsoft, offer e-wallet services. However, most e-wallet applications in current Internet services have lost important functionality of the original e-wallet concept, namely pseudonym-based, owner-controlled privacy protection, in which personal data is protected with pseudonym technology and controlled by its owner using cryptographic techniques such as the blind signature. These newer e-wallet applications instead use agreement-based, provider-controlled privacy protection: the user's personal data is controlled and protected by the service provider, not by the owner of the personal information, on the basis of an agreement signed by the service provider and the owner (user). The agreement describes the conditions under which the personal data may be used by the provider, to whom it may be exposed, and for what purposes.
For instance, the AT&T Wireless e-Wallet User Agreement (2005) describes the privacy conditions of use of the AT&T wireless e-wallet as follows:

We collect, and you consent to such collection of, the information you provide or confirm at registration as well as information about your purchases and other transaction information. We disclose that information, and you consent to such disclosure, to those merchants involved in the transaction, to your credit card company and bank, the merchant bank, merchant aggregators, and other vendors, companies or service providers used to facilitate or complete the transaction ("Third Parties"). Information about you received by those Third Parties will be governed by their own privacy policies, not this User Agreement or the AT&T Wireless Privacy Policy. Whenever third parties have a role in any transaction, you should review their privacy policies and practices. You consent to Third Parties sharing information about you with AT&T Wireless to facilitate e-Wallet transactions. In addition, you authorize AT&T Wireless and Payment Processor to exchange your registration and transaction information with each other in order to provide the Bill to Phone service to you.

The personal information collected by AT&T is described in the AT&T e-Wallet Supplemental Privacy Notice (2005) as:

In connection with the e-Wallet Services, we collect the following categories of personal information: Registration Information: We may collect personal information from you during the registration process, including: (i) your name and your mailing and email addresses, (ii) your mobile phone number, (iii) your credit card or debit card numbers, and (iv) a user name and password (PIN). AT&T Wireless also uses "cookies" to keep track of each use by you of e-Wallet Services, but does not store any information about you in a cookie. Transaction Information and Information from Vendors and Merchants: We collect personal information about your use of the e-Wallet Services, including purchase and other information from your transactions conducted using the e-Wallet Services. This information may include the type of purchase, the name of the merchant, and the amount of the purchase. We also receive information from vendors or merchants that provide services either individually or jointly with us, as part of the e-Wallet Services or your other AT&T Wireless services.

In other words, the provider may collect a great deal of personal information through the e-wallet service, since the e-wallet uses the real name of the user for all payments and transactions. Furthermore, this kind of e-wallet is essentially a simple database that gathers personal information such as name, address, and credit card account, protected by security measures such as encryption. The major purpose of these e-wallet applications is to provide a simple and convenient approach to payments and transactions in Internet services; protection of the personal data relies on the privacy agreement and on legislation.
Future Trends

Pseudonym technology will be accepted and employed by more and more applications as people realize how readily their personal information is exposed to the public. However, pseudonym technology needs to be improved and standardized in order to satisfy the requirements of customers, applications, and legislation. There is still much to be done in pseudonym technology research for privacy protection in e-services. As we mentioned, most current e-services have not applied pseudonym technologies, since most of these technologies are still at the research stage. For instance, in e-commerce services, the current payment systems and business models do not involve pseudonym technologies. It is a challenge to fill this gap and propose a practical pseudonym technology that can be easily combined with the current credit card or debit card payment system; CAFE is an example of good practice toward answering this challenge. Furthermore, the computational complexity, efficiency, and scalability of existing pseudonym technologies require further research before they can be embedded in e-services. Lower cost, improved privacy protection, and better services are good drivers for the development of pseudonym technologies toward greater practicality. In addition, new e-services may require new pseudonym technologies to implement new privacy protection requirements imposed by organizations or by law.

Moreover, along with the development of advanced technologies such as ubiquitous computing systems, wireless systems, high-performance processors, and large memory storage, a new kind of comprehensive e-wallet, able to contain thousands of different e-certificates, would become very attractive for e-services. More and more people complain that a small traditional wallet cannot hold many cards (e.g., credit cards, debit cards, driver's license, and membership cards), and losing a traditional wallet is dangerous and inconvenient, since anyone who finds it can open it. The thousands of e-certificates representing the different physical cards could instead be stored in a small e-wallet. Compared with the traditional wallet, the e-wallet offers benefits such as greater security, larger storage, and greater efficiency. Of course, new standards for the variety of e-certificates, e-wallets, and interfaces need to be researched and developed.
Conclusions

In this chapter, we have introduced the reader to current research, challenges, and issues of pseudonym technologies for privacy protection in e-services. We summarized the general pseudonym system architecture and the processing of its protocols, and gave a comparison of the different keys involved in the system, such as the master public key, pseudonym, and credential, including their functionalities and roles. We analyzed the pseudonym requirements for e-services and grouped them into two kinds, basic requirements and advanced requirements, in order to evaluate pseudonym technologies and applications. Furthermore, we reviewed several important pseudonym technologies, such as e-cash, e-ticket, e-voting, Pseudonym Systems, and Private Credentials, and compared them according to their pseudonym properties. These technologies can be used in different applications and e-services to provide better privacy protection.
References

Abe, M., & Fujisaki, E. (1996). How to date blind signatures. In Advances in Cryptology — ASIACRYPT'96, LNCS 1163 (pp. 244-251).
AT&T e-Wallet Supplemental Privacy Notice. (2005). Retrieved April 2005, from http://www.mymmode.com/e-wallet/privacy.html
AT&T Wireless e-Wallet User Agreement. (2005). Retrieved April 2005, from http://www.mobile.att.net/e-wallet/agreement.html
Berthold, O., Federrath, H., & Kopsell, S. (2000). Web MIXes: A system for anonymous and unobservable Internet access. In H. Federrath (Ed.), Anonymity 2000, LNCS 2009 (pp. 115-129).
Boly, J. P., Bosselaers, A., Cramer, R., Michelsen, R., Mjølsnes, S., Muller, F., et al. (1994). The ESPRIT Project CAFE — High security digital payment systems. In Proceedings of the Third European Symposium on Research in Computer Security (ESORICS 94), LNCS 875 (pp. 217-230).
Brands, S. (1993). An efficient off-line electronic cash system based on the representation problem (Report CS-R9323). Centrum voor Wiskunde en Informatica, Computer Science/Department of Algorithmics and Architecture.
Brands, S. (2000). Private credentials (White paper by Zero-Knowledge Systems, Inc.). Retrieved from http://www.zks.net/media/credsnew.pdf
Chaum, D. (1981). Untraceable electronic mail, return addresses, and digital pseudonyms. Communications of the ACM, 24(2), 84-88.
Chaum, D. (1982). Blind signatures for untraceable payments. In D. Chaum, R. L. Rivest, & A. T. Sherman (Eds.), Advances in Cryptology — CRYPTO'82 (pp. 199-203).
Chaum, D. (1985). Security without identification: Transaction systems to make big brother obsolete. Communications of the ACM, 28(10), 1030-1044.
Chaum, D. (1988). Elections with unconditionally-secret ballots and disruption equivalent to breaking RSA. In Advances in Cryptology — EUROCRYPT'88 (pp. 177-182).
Chaum, D., & Evertse, J. (1986). A secure and privacy-protecting protocol for transmitting personal information between organizations. In Advances in Cryptology — CRYPTO'86, LNCS 263 (pp. 118-167).
Chaum, D., Fiat, A., & Naor, M. (1988). Untraceable electronic cash. In Advances in Cryptology — CRYPTO'88 (pp. 319-327).
Chaum, D., & Pedersen, T. P. (1992). Wallet databases with observers. In Advances in Cryptology — CRYPTO'92, LNCS 740 (pp. 89-105).
Chen, L. (1995). Access with pseudonyms. In E. Dawson & J. Galic (Eds.), Cryptography: Policies and algorithms, LNCS 1029 (pp. 232-243).
Cramer, R., Gennaro, R., & Schoenmakers, B. (1997). A secure and optimally efficient multi-authority election scheme. In Advances in Cryptology — EUROCRYPT'97, LNCS 1233 (pp. 103-118).
Damgard, I. B. (1988). Payment systems and credential mechanisms with provable security against abuse by individuals. In Advances in Cryptology — CRYPTO'88, LNCS 403 (pp. 328-335).
Department of Justice. (n.d.). Privacy provisions highlights. Retrieved February 28, 2005, from http://canada.justice.gc.ca/en/news/nr/1998/attback2.html
Diffie, W., & Hellman, M. (1976). New directions in cryptography. IEEE Transactions on Information Theory, 22(6), 644-654.
ElGamal, T. (1985). A public-key cryptosystem and a signature scheme based on discrete logarithms. IEEE Transactions on Information Theory, 31(4), 469-472.
European Union. (n.d.). Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the protection of individuals with regard to the processing of personal data and on the free movement of such data. Unofficial text retrieved September 5, 2003, from http://aspe.hhs.gov/datacncl/eudirect.htm
Franklin, M., & Yung, M. (1993). Secure and efficient off-line digital money. In Proceedings of the 20th International Colloquium on Automata, Languages and Programming (ICALP), LNCS 700 (pp. 265-276).
Glenn, A., Goldberg, I., Legare, F., & Stiglic, A. (2001). A description of protocols for private credentials (White paper, Zero-Knowledge Systems, Inc.). Retrieved from http://crypto.cs.mcgill.ca/~stiglic/Papers/brands.pdf
Goldberg, I., Wagner, D., & Brewer, E. (1997). Privacy-enhancing technologies for the Internet. In IEEE COMPCON'97 (pp. 103-109).
Goldschlag, D., Reed, M., & Syverson, P. (1999). Onion routing for anonymous and private Internet connections. Communications of the ACM, 42(2), 39-41.
Government of Canada. (n.d.). Personal Information Protection and Electronic Documents Act. Retrieved February 28, 2005, from http://www.privcom.gc.ca/legislation/02_06_01_01_e.asp
Hoffman, L. J. (2000). Internet voting: Will it spur or corrupt democracy? In Proceedings of the 10th Conference on Computers, Freedom and Privacy: Challenging the Assumptions (pp. 219-223).
IBM. (1999). Multi-national consumer privacy survey. Retrieved from http://www.mischiefmarketing.com/privacy_survey_oct991.pdf
Jorba, A. R., Ruiz, J. A. O., & Brown, P. (2003). Advanced security to enable trustworthy electronic voting. In Proceedings of the 3rd European Conference on E-Government (pp. 377-384).
Juang, W. S., Lei, C. L., & Liaw, H. T. (2002). A verifiable multi-authority secret election allowing abstention from voting. The Computer Journal, 45(6), 672-682.
Kim, S., & Oh, H. (2002). A new electronic check system with reusable refunds. International Journal of Information Security, 1(3), 175-188.
Lee, N. Y. (2000). Fairness and privacy on pay-per-view system for Web-based video service. IEEE Transactions on Consumer Electronics, 46(4), 980-984.
Lee, N. Y., Chang, C. C., Lin, C. L., & Hwang, T. (2000). Privacy and non-repudiation on pay-TV systems. IEEE Transactions on Consumer Electronics, 46(1), 20-26.
Liaw, H. T. (2003). A secure electronic voting protocol for general elections. Computers & Security, 23, 107-119.
Lysyanskaya, A., Rivest, R. L., Sahai, A., & Wolf, S. (2000). Pseudonym systems. In H. Heys & C. Adams (Eds.), SAC'99, LNCS 1758 (pp. 184-199).
Menezes, A., van Oorschot, P., & Vanstone, S. (1996). Handbook of applied cryptography (pp. 103-113). CRC Press.
Miyazaki, S., & Sakurai, K. (1998). A more efficient untraceable e-cash system with partially blind signatures based on the discrete logarithm problem. In Financial Cryptography (FC'98), LNCS 1465 (pp. 296-308).
Mjolsnes, S. F., & Rong, C. (2003). Online e-wallet system with decentralized credential keepers. Mobile Networks and Applications, 8(1), 87-99.
Raymond, J. (2000). Traffic analysis: Protocols, attacks, design issues, and open problems. In H. Federrath (Ed.), Anonymity 2000, LNCS 2009 (pp. 10-29).
Rivest, R. L., Shamir, A., & Adleman, L. (1978). A method for obtaining digital signatures and public-key cryptosystems. Communications of the ACM, 21(2), 120-126.
Song, R., & Korba, L. (2003). Pay-TV system with strong privacy and non-repudiation protection. IEEE Transactions on Consumer Electronics, 49(2), 408-413.
Song, R., & Korba, L. (2004, April 5-7). How to make e-cash with non-repudiation and anonymity (NRC 46549). In Proceedings of the International Conference on Information Technology (ITCC 2004), Las Vegas, Nevada, USA.
Song, R., & Lyu, M. R. (2001). Analysis of privacy and non-repudiation on pay-TV systems. IEEE Transactions on Consumer Electronics, 47(4), 729-733.
Telecom Glossary 2000. (2000). Development site for proposed revisions to American National Standard T1.523-2001. Retrieved from http://www.its.bldrdoc.gov/projects/devglossary/t1g2k.html
U.S. Government. (n.d.). Office for Civil Rights — HIPAA: Medical privacy — National standards to protect the privacy of personal health information. Retrieved February 28, 2005, from http://www.hhs.gov/ocr/hipaa/
Waidner, M., & Pfitzmann, B. (1990). Loss-tolerance for electronic wallets. In Proceedings of the 20th International Symposium on Fault-Tolerant Computing (FTCS 20), Newcastle upon Tyne, UK (pp. 140-147).
Waidner, M., & Pfitzmann, B. (1991). Loss-tolerant electronic wallet. In D. Chaum (Ed.), Smart Card 2000: Selected papers from the Second International Smart Card 2000 Conference (pp. 127-150).
Yee, G., & Korba, L. (2005, January). Semiautomatic derivation and use of personal privacy policies in e-business. International Journal of E-Business Research, 1(1), 54-69.
Yee, G., & Korba, L. (2005, March). An agent architecture for e-services privacy policy compliance. In Proceedings of the 19th IEEE International Conference on Advanced Information Networking and Applications (AINA 2005), Taiwan.
Endnote

1. NRC Paper Number: NRC 48269
Chapter VII
Privacy Enforcement in E-Services Environments

Carlisle Adams, University of Ottawa, Canada
Katerine Barbieri, University of Ottawa, Canada
Abstract

This chapter presents technological measures for privacy enforcement (techniques that can be used to ensure that an organization's privacy promises will be kept). It gives an introduction to the current state of technological privacy enforcement measures for e-services environments, proposes a comprehensive privacy enforcement architecture, and discusses some remaining issues and challenges related to privacy enforcement solutions. The goal of the proposed architecture, aside from integrating many of the current isolated technologies, is to ensure consistency between advertised privacy promises and actual privacy practices at the e-service provider Web site so that users can have greater confidence that their personal data will be safeguarded.
Introduction

Privacy has become a growing concern for users on the Internet. It is widely believed that privacy concerns have affected the success of e-commerce initiatives, since users are reluctant to use online services for fear that their private data will be violated in some way. Privacy violations can be categorized according to many different schemes, but perhaps the simplest categorization divides violations into two major types. The first is the dissemination of private data to unintended recipients; for example, an entity collects a user's private data (such as address, phone number, and credit card number) and sells it to another entity without user consent. The second is the illegitimate use of a user's private data by an entity that legitimately received that data during a transaction; for example, private data provided for one purpose (a shipping address submitted for delivery of a book purchased online) is used by the legitimate receiver for another purpose (the address is used for sending unwanted advertising pamphlets). Privacy enhancing technologies (PETs) can be helpful in protecting users against both types of privacy violation. The ultimate goal of PETs is to allow users to control how and with whom their private data is shared.

A number of organizations have begun efforts, including the implementation and use of PETs, to alleviate users' concerns regarding privacy. For example, many organizations now publish textual privacy policies on their Web sites. Some instead (or in addition) use the Platform for Privacy Preferences (P3P 1.0, 2002) policy definition language to formulate their privacy policies in a standard, machine-readable format. Other initiatives include performing privacy impact assessments (Hope-Tindall, 2002) and audits to ensure that defined privacy policies are adhered to.

However, publishing privacy promises and performing assessments are only first steps toward what may be referred to as privacy enforcement. Privacy promises and audits do not guarantee a user that the organization is always protecting his or her private data and using it only for its intended purpose. Privacy enforcement takes privacy promises one step further: enforcement is the collection of techniques used to ensure (and, correspondingly, to give users the assurance) that an organization's privacy promises will be kept. That is, enforcement has to do with guaranteeing that organizations do what they say they will do with personal data ("practice what they preach"). This pertains to the entire lifespan of the data (collection, storage, use, dissemination, destruction) and should ideally take into account a user's own personal preferences regarding his or her personal data.

The techniques used in privacy enforcement today are typically procedural, organizational, or legal in nature. However, there is growing interest in the use of technological measures for privacy enforcement, primarily for the greater assurance that such measures can offer with respect to accuracy and effectiveness of the enforcement. Research initiatives in privacy enforcement technology focus on such things as privacy policy definition languages that are machine-readable and machine-processable, storage technologies that are privacy-aware, data transformation or encapsulation methods that explicitly or implicitly incorporate privacy policies, and tools that assist users in creating appropriate privacy preferences. However, a critical element that is still lacking is an overall
architecture that ties these various initiatives together and provides a consistent privacy enforcement strategy across the enterprise. Such an architecture should readily accommodate industry standards wherever possible and would also be beneficial in helping to clarify missing components or research gaps in the privacy enforcement picture. The primary objectives of this chapter are to introduce the reader to the state of technological privacy enforcement measures for e-services, to propose a comprehensive privacy enforcement architecture, and to discuss some remaining issues and challenges related to privacy enforcement solutions.
Background

This section provides definitions and detailed descriptions of technological measures for privacy enforcement, as well as systems and major architectural components that have been researched and implemented.
Privacy Policy Definition and Exchange Languages

A significant amount of work has been done in the area of privacy policy definition languages. Some examples follow.
Platform for Privacy Preferences (P3P)

Platform for Privacy Preferences (P3P) is an XML-based W3C standard (P3P 1.0, 2002) for Web privacy that is gaining popularity. As stated by Cranor (2002, p. 10), "by April 2002, about a third of the top 100 Web sites had adopted P3P." P3P is a privacy policy definition language that allows organizations to publish their privacy policies in a computer-readable format. Traditionally, organizations published their privacy promises as free text in a natural language such as English on their Web sites. This free text can easily be misinterpreted, may be incomplete, and can be difficult to compare from site to site. With privacy policies written using P3P, organizations can display their privacy promises in a consistent language that is easily retrieved and processed by P3P-enabled user agents such as Internet Explorer 6, Netscape 7, and the AT&T Privacy Bird (AT&T Privacy Bird).

Several P3P implementations have been developed and are listed on the Web site http://www.w3.org/P3P/implementations (P3P Implementations). This site includes tools such as several P3P editors (e.g., P3PEdit, P3Pwriter.com, P3P Policy Editor, and P3P Editor), P3P proxy software, enterprise server-side solutions (e.g., IBM Tivoli Privacy Manager for e-business), and P3P validators. Different P3P tools are also discussed in references such as "Web Site Privacy with P3P" (Lindskog & Lindskog, 2003).
Organizations use the P3P vocabulary to express their privacy policies. The vocabulary includes terms to describe what type of private data the site collects, for what purposes the data is collected, and how the site uses this private data. A P3P policy also includes the contact information of the entity (individual) responsible for dealing with privacy violations, and information regarding how users can access the private data collected about them. P3P uses a list of specified elements to define privacy policies. These elements include pre-defined values or categories that are used to further define particular policy settings. P3P includes syntax for a policy reference file that lists the different policies for a Web site and describes the portions of the Web site to which each applies. The P3P vocabulary also defines policy-level and statement-level elements. The policy-level elements describe policy-specific details such as procedures and contact information relevant to the site's general privacy practices. The statement-level elements describe the site's data handling practices for the different data types collected. Note that the vocabulary terms described here are those included in version 1.0 of P3P. Additional terms have been added in version 1.1 (Working Draft dated 27 April 2004) (P3P 1.1, 2004) but will not be covered here, as this new version of the specification is in a draft state at the time of this writing. The P3P vocabulary comprises the following policy-level elements:
• POLICIES: Groups a set of POLICY elements.
• POLICY: Contains a P3P policy with elements such as ENTITY, ACCESS, DISPUTES, one or more STATEMENT elements, and so forth.
• ENTITY: Identifies the entity responsible for the privacy practices included in the policy.
• ACCESS: States whether the site gives the user access to his or her own private data that has been collected.
• DISPUTES: Describes the site's procedures for dispute resolution with regard to its privacy practices. Should include a REMEDIES element.
• REMEDIES: States the possible remedies that may be applied when a privacy policy breach occurs.
The P3P vocabulary also includes the following statement-level elements:
• STATEMENT: Describes data practices with elements such as PURPOSE, RECIPIENT, RETENTION, and CONSEQUENCE.
• CONSEQUENCE: Provides a human-readable text summary of the data practices specified in the STATEMENT element.
• NON-IDENTIFIABLE: If included in the STATEMENT element, states that no personal data is collected or that it will be anonymized.
• PURPOSE: Specifies the purposes for which the site collects or uses the personal data.
• RECIPIENT: Lists the different entities to which the private data may be disclosed.
• RETENTION: Describes the data retention policy applied to private data.
• DATA-GROUP: Specifies which private data is being used. Includes one or more DATA elements.

A small illustrative policy fragment using these elements is sketched below.
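As a rough illustration, the following single-statement fragment shows how some of these elements might be combined. It is a simplified sketch rather than a verified, schema-valid P3P 1.0 policy; the entity name, URI, and data references are hypothetical examples.

<POLICY name="sample" discuri="http://www.example.com/privacy.html">
  <ENTITY>
    <DATA-GROUP>
      <DATA ref="#business.name">Example Bookstore</DATA>
    </DATA-GROUP>
  </ENTITY>
  <ACCESS><nonident/></ACCESS>
  <STATEMENT>
    <CONSEQUENCE>We use your postal address only to ship your order.</CONSEQUENCE>
    <PURPOSE><current/></PURPOSE>            <!-- completion of the current activity -->
    <RECIPIENT><ours/></RECIPIENT>           <!-- the site and its agents only -->
    <RETENTION><stated-purpose/></RETENTION> <!-- kept only as long as needed -->
    <DATA-GROUP>
      <DATA ref="#user.home-info.postal"/>   <!-- the user's postal address -->
    </DATA-GROUP>
  </STATEMENT>
</POLICY>

A P3P-enabled user agent such as the AT&T Privacy Bird would retrieve a policy of this kind and compare it with the user's preferences before personal data is submitted.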
A P3P Preference Exchange Language (APPEL)

A P3P Preference Exchange Language (APPEL) is a W3C Working Draft of a preference interchange language that is used to compare the P3P privacy policy of a Web site with the user preferences specified in the user agent's privacy configuration (APPEL, 2000). One goal of APPEL is to allow users to easily import privacy preferences (possibly recommended by a trusted organization); in APPEL, these privacy preferences are represented as a ruleset. Another goal of APPEL is to facilitate the transportation of a user's privacy preferences between P3P/APPEL-enabled applications. When the APPEL engine evaluates the Web site policy against the user's ruleset, it can be configured to perform resulting actions such as "request" (allow the request), "limited" (allow the request but send minimal information), or "block" (block the request).
Enterprise Privacy Authorization Language (EPAL)

Enterprise Privacy Authorization Language (EPAL) is an XML-based language specification for the definition of enterprise-wide privacy policies for privacy enforcement (Ashley, Hada, Karjoth, Powers, & Schunter, 2003). This research is being conducted at IBM and has been submitted for consideration as a W3C Recommendation. The Platform for Enterprise Privacy Practices (E-P3P) is the predecessor to EPAL. EPAL focuses on privacy authorization but is also based on many access control concepts. The EPAL language includes the following elements for writing privacy access rules:
• User category: Describes individuals who are accessing or receiving data. The user types are typically defined as a hierarchy of categories rather than as specific users.
• Action: Describes the activity performed on the data.
• Data category: Describes the type of data implicated in the rule. The data types are typically defined as a hierarchy of categories rather than as specific data.
• Purpose: Describes why the data is used, collected, or disclosed by the user.
• Condition: Describes requirements that must be met in order for a rule to apply.
• Obligation: Describes steps that must be taken when access to the data is allowed.
EPAL rules are expressed in a semi-structured format as follows:

ALLOW/DENY [User] TO PERFORM [Action] ON [Data] FOR [Purpose] IF [Condition] AND CARRY OUT [Obligation]
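For example, an individual rule in this form might read as follows (a purely illustrative instantiation of the template above; the category, purpose, and obligation names are hypothetical and do not come from the EPAL specification):

ALLOW [marketing-staff] TO PERFORM [send-email] ON [customer-contact-data]
FOR [direct-marketing] IF [the customer has opted in]
AND CARRY OUT [log the access and honour any subsequent opt-out]

Each bracketed value would be drawn from the corresponding vocabulary element (user category, action, data category, purpose, condition, obligation) described above.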
eXtensible Access Control Markup Language (XACML)

The eXtensible Access Control Markup Language (XACML) is an OASIS specification for an extensible access control policy language (XACML 1.1, 2003). XACML can be used to specify an access control policy using its pre-defined, but extensible, taxonomy in XML. It can also be used to express access control decision requests and responses in XML. Upon receiving a request to perform an action against a particular resource, the XACML system matches the request with the relevant rules in the access control policy and determines the resulting decision (Permit, Deny, Indeterminate, or Not Applicable). XACML defines combining algorithms (deny-overrides, permit-overrides, first-applicable, and only-one-applicable) to process several relevant rules or policies and output a final authorization decision.

XACML has many other features that are important to the problem of privacy control. These include policies based on resource contents (e.g., a user can only read a particular resource based on its contents) and obligations (e.g., an access request by a particular user for a particular resource can trigger an action that must be performed). XACML is a technology for authorization systems that is well suited to distributed environments (Lorch, Proctor, Lepro, Kafura, & Shah, 2003). For this reason, it is also useful for privacy control, where privacy policies are enforced at different locations in the enterprise infrastructure. Application programming interfaces (APIs) are available (SunAPI, 2003) that allow application and system builders to incorporate XACML's access control functionality into their infrastructure devices or applications.

XACML consists of the following policy language elements:
• Policy Set: Defines a Target, a Policy Combining Algorithm identifier, one or more Policies, and Obligations.
• Policy Combining Algorithm: Defines the procedure used to evaluate the different Policies in a Policy Set and produce a decision.
• Policy: Defines a Target, a Rule Combining Algorithm identifier, one or more Rules, and Obligations.
• Obligation: Defines actions that must be performed when a Rule is successfully evaluated.
• Rule Combining Algorithm: Defines the procedure used to evaluate the different Rules in a Policy and produce a decision.
• Rule: Defines the basic component of a policy and specifies the applicable Target, Effect, and Condition.
• Target: Defines the Resource, Subject, and Action to which a Rule or Policy should apply.
• Effect: Defines the consequence (permit or deny) of the successful evaluation of a Rule.
• Condition: Defines a refinement of the applicability of the Rule.

A schematic example showing how these elements fit together in a policy is given below.
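The following sketch suggests the overall shape of such a policy. It is schematic only: namespace declarations, match functions, and full attribute identifiers are omitted or abbreviated, so it is not schema-valid XACML, and the resource, role, and purpose names are hypothetical.

<Policy PolicyId="customer-pii-policy"
        RuleCombiningAlgId="...deny-overrides">
  <Target>
    <!-- applies to resources whose data-category attribute is "customer-pii" -->
    <Resources> ... </Resources>
  </Target>
  <Rule RuleId="allow-order-fulfilment" Effect="Permit">
    <Target>
      <Subjects> <!-- subjects whose role attribute is "shipping-clerk" --> ... </Subjects>
      <Actions>  <!-- actions whose purpose attribute is "order-fulfilment" --> ... </Actions>
    </Target>
    <Condition> <!-- true only if the data owner has consented --> ... </Condition>
  </Rule>
  <!-- no other rule matches, so deny-overrides leaves all other access denied -->
  <Obligations>
    <Obligation ObligationId="log-access" FulfillOn="Permit"/>
  </Obligations>
</Policy>

The example also hints at the mapping discussed in the next subsection: the EPAL notions of user category, data category, and purpose appear here as attributes of the Subject, Resource, and Action, respectively.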
EPAL vs. XACML Policy Syntax

Table 1 compares XACML's and EPAL's policy languages by mapping their corresponding high-level elements. Note that this table is appropriate for both XACML 1.1 and XACML 2.0 (XACML 2.0, 2005), since the high-level language elements are identical in these two versions of the specification. In EPAL and XACML, hierarchies of values can be defined (denoted by * in Table 1); in XACML, hierarchies are defined using the "scope" attribute with a value of "Immediate," "Children," or "Descendants."

Our research related to EPAL and XACML confirms that all features required for a privacy control and enforcement system can be found in a flexible access control and enforcement language such as XACML. Some of the elements found in EPAL but not explicitly found in XACML can readily be implemented within the language. In particular, in XACML, user and data categories can be implemented using attributes of Subject and Resource, respectively, and purpose can be implemented using an attribute of Action. On the other hand, there are important elements found in XACML that are not present in EPAL. Target is an element in XACML that is used at many levels (Policy Set, Policy, Rule) and increases the efficiency of the system, since not all rules or policies need to be processed in order to determine whether there are applicable rules to evaluate.
Table 1. EPAL vs. XACML policy syntax

XACML                              | EPAL
-----------------------------------|------------------------------------------------
Policy Set                         | (no corresponding element)
Policy Combining Algorithm         | (no corresponding element)
Policy                             | Epal-policy (Default-ruling, Global-condition), Policy-info
Rule Combining Algorithm           | (no corresponding element)
Rule                               | Rule
Effect                             | Ruling
Condition                          | Condition (xacml:condition)
Obligations                        | Obligation
Target                             | (no corresponding element)
Subject + attribute of Subject     | *User-category
*Resource + attribute of Resource  | *Data-category
Action + attribute of Action       | Action, *Purpose
(no corresponding element)         | Epal-vocabulary (location), Vocabulary-info, Container
Rule Combining Algorithms allow for more sophisticated and flexible rule writing. Policy Combining Algorithms are useful for comparing policies between or within organizations and for comparing organizational policies with user preferences.
Privacy-Aware Storage Technologies

Research is being conducted in the area of storage technologies that are inherently privacy-aware. Some promising directions are as follows.
Hippocratic Databases

In a paper presented at the 28th VLDB Conference, Agrawal and other researchers at the IBM Almaden Research Center introduce the idea of a Hippocratic database (Agrawal, Kiernan, Srikant, & Xu, 2002). A Hippocratic database is inspired by the Hippocratic Oath, which has guided physicians for centuries. Hippocratic databases are intended to be database systems that include responsibility for privacy as a core concern. It is recognized that a full solution to privacy includes not only technological solutions such as Hippocratic databases, but also laws, regulations, and societal norms.

Database systems provide the ability to manage persistent data and to access large amounts of data efficiently. Hippocratic databases, which take privacy into account, shift the focus somewhat from efficiency toward consented sharing. Including "purpose" as a basis for data retrieval will require changes in data definition and query languages, query processing, access control mechanisms, and so on. In particular, any data that is collected must be collected for a specific purpose; the purpose must be stored with the data; and the purpose must limit how the data can subsequently be used.

Hippocratic databases draw from research performed on database systems. For example, techniques developed for preserving privacy in statistical databases can be applied to Hippocratic databases. Also, research related to secure databases that employ access control, encryption, and multilevel security (e.g., top secret, secret, unclassified) can be applied to Hippocratic databases. The idea of the Hippocratic database was based on privacy regulations and guidelines such as the United States Privacy Act of 1974, the OECD privacy guidelines, the Health Insurance Portability and Accountability Act (HIPAA), the Gramm-Leach-Bliley Financial Services Modernization Act, and other privacy regulations in Canada, Japan, and Australia. The paper lists 10 founding principles of Hippocratic database systems: purpose specification, consent, limited collection, limited use, limited disclosure, limited retention, accuracy, safety, openness, and compliance. See Agrawal et al. (2002) for further descriptions of the main features of Hippocratic database operation.

The Hippocratic database architecture can easily be integrated with existing privacy standards such as P3P, since the P3P PURPOSE and RETENTION concepts can be mapped directly to those found in this architecture. With this, a P3P policy can be processed using components of the Hippocratic database to generate the required privacy
statements that will function within the system. Another interesting feature of the Hippocratic database design is that, implemented as part of a privacy enforcement system, it could provide some protection against data mining inference attacks.
Implementing P3P Using Database Technology

Through lessons learned in the Hippocratic databases research, the same team of researchers at the IBM Almaden Research Center also published a paper that describes a method of implementing P3P with database technology (Agrawal, Kiernan, Srikant, & Xu, 2003). This research suggests a method for storing P3P policies in a database and translating APPEL rules for user preferences into queries against the database.
Privacy-Enhanced Data Protection

Research has been conducted in the area of data transformation or encapsulation technologies that explicitly or implicitly bind privacy policies to personal data for its entire lifespan. Some results in this direction follow.
Privacy-Aware Role-Based Access Control (PARBAC)

A study at North Carolina State University (He, 2003) proposes combining role-based access control (RBAC) and domain-type enforcement (DTE) to provide privacy policy enforcement. The proposed approach is named privacy-aware role-based access control (PARBAC). In this paper, the author argues that traditional security models and access control models do not provide for privacy requirements such as purpose binding (data collected for one purpose should not be used for another purpose without user consent) and the principle of necessity (data should only be collected and processed if it is needed to complete the task at hand). The author proposes PARBAC as a solution that accounts for purpose binding. This approach is based on the Dynamic Authorization Framework for Multiple Authorization Types (DAFMAT), which combines RBAC and DTE for a healthcare application, and on the E-P3P work at IBM. PARBAC is a metadata-based approach that tags private data with privacy policy information. An assumption made is that privacy enforcement is done in a trusted environment.
Sticky Policies

The Trusted Systems Laboratory (TSL) at Hewlett-Packard (HP) is conducting ongoing research in the area of accountability and enterprise privacy policy enforcement (Beres, Bramhall, Casassa Mont, Gittler, & Pearson, 2003). Their research focuses on expanding privacy policy definition languages so that specifications can include hardware, platform, or operating system-related restrictions, and on ensuring that the privacy policy for the data stays coupled with the private data throughout its lifespan.
Their research highlights the point that privacy should be enforced at several levels: not only at the application level, but also at the hardware, platform, operating system, and network levels. Another important observation made by the research team is that there is not likely to be a "tamper proof" solution; therefore, audit is a key component of a successful privacy architecture. Audit logs that track private data flow and can be verified by trusted third parties provide an added layer of protection. This additional component can increase user confidence and trust and provide a method of verifying compliance with policies, laws, and regulations where technological methods are limited.

In their approach, enforcement occurs on the data owner's system, or checks are performed by a trusted third party (TTP). Identity-based encryption (IBE) technology is suggested as a mechanism for ensuring that the privacy policies "stick" to the data. In their proposed architecture, private data is sent in obfuscated/encrypted form. The data is encrypted using IBE encryption keys, which are constructed using the "sticky policy." To decrypt the private data, a user must provide valid credentials and configuration/policy information proving that the user satisfies the privacy policy. Trusted third parties verify adherence to the privacy policies and issue the decryption keys. Once the data has been successfully decrypted by a permitted host, the data is in clear text. At this point, other protection mechanisms are required, such as trusted platforms and operating systems. These additional mechanisms can ensure that the privacy policy is maintained while the data is stored on the trusted system. The data can be tagged with a privacy label that follows the data at all times.
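The following fragment is a purely hypothetical illustration of the sticky-policy idea; neither the element names nor the structure are taken from the HP/TSL work, and they are shown only to make the concept of a policy travelling with its data concrete.

<!-- hypothetical sticky-policy envelope; all names are illustrative -->
<protected-data>
  <sticky-policy>
    <allowed-purpose>order-fulfilment</allowed-purpose>
    <allowed-recipient>shipping-department</allowed-recipient>
    <obligation>delete after 30 days</obligation>
  </sticky-policy>
  <!-- payload encrypted under an IBE key derived from the policy above;
       a trusted third party releases the decryption key only to a requester
       whose credentials and platform configuration satisfy that policy -->
  <encrypted-payload encoding="base64"> ... </encrypted-payload>
</protected-data>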
Privacy Preference Creation Tools

Work has been conducted on the creation of appropriate personal preference files for users, either through other entities or through negotiation with Web sites. Examples of this work follow.
Semi-Automated Derivation of Personal Privacy Preferences

Researchers at the National Research Council (NRC) in Canada have published papers regarding the derivation of, and compliance with, personal privacy policies. Yee and Korba (2004a) describe the need for more easily formulated privacy policies for users (i.e., user privacy preferences), which they have named personal privacy policies. The authors suggest that existing privacy policy specification languages such as P3P are too complex for average Internet users. A personal privacy policy contains header information such as the policy's use, owner, availability of a proxy, and validity period. The paper describes a personal privacy policy's rules in terms of five main features:
• Collector: Organization that wishes to collect the information.
• What: Nature of the data collected or disclosed.
• Purposes: Purposes for which the data is being collected.
• Retention time: Amount of time that the provider is to keep the information.
• Disclosure to: Parties to whom the data will be disclosed.

An illustrative personal privacy policy assembled from these fields is sketched below.
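The following is an illustrative example only, assembled from the header fields and rule features just described rather than copied from Yee and Korba's papers; the field labels and values are hypothetical.

Policy Use: online bookstore          Owner: Alice Consumer
Proxy: no                             Valid: until January 2007

Collector:       Example Bookstore
What:            name, postal address
Purposes:        shipping of purchased items
Retention Time:  6 months
Disclosure To:   none

A policy of this kind is intended to be simple enough for an average user to read and adjust, in contrast to the machine-oriented languages discussed earlier.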
Two approaches are presented for the semi-automated derivation of personal privacy policies. The first approach is derivation through third-party surveys, in which organizations obtain user privacy sensitivity levels for different uses of private information. The organization then compiles the privacy sensitivity levels, associates them with privacy rules, and makes them available to clients. Clients can then adapt the privacy levels for different uses and service providers. This approach to policy derivation requires that the policy provider (organization) be trusted to create accurate privacy sensitivity ranges. The second approach involves deriving policies from information obtained from a community of peers. This approach assumes that the community of peers has pre-determined, desired privacy levels for different uses; these privacy level values are derived from third-party surveys (as described above) or other methods. When a user requests privacy rules for a particular use from the community, the community peers respond with matching rules that the user must gather, examine, and then select with respect to best fit. These two approaches are described in the paper as possible methods of creating user privacy preference templates for different uses of the data (i.e., different online services for which the privacy policy will be used).

Other research conducted at the NRC (Yee & Korba, 2003a, 2003b) introduces a negotiation approach to resolve conflicts between Web service users' privacy policies and providers' privacy policies.
Privacy Enforcement Agents

Enforcement agents are the components in the architecture that enforce the privacy-related decisions rendered by access control (or privacy control) decision engines. Some work in this area follows.
Mobile Enforcement Agents

In a paper titled "Preserving Privacy in Web Services" (Rezgui, Ouzzani, Bouguettaya, & Medjahed, 2002), the authors discuss the challenges of maintaining privacy throughout e-commerce transactions. Among the issues discussed is the incompatibility of privacy policies between communicating organizations (e.g., a credit card company and an e-commerce site). The privacy design for Web services proposed in the paper is based on three components: digital privacy credentials, data filters, and mobile privacy enforcement agents. The data filters define rules that use the privacy credentials to control the access of remote entities to local private data. Once the data has been delivered to the querying entity, it is the mobile privacy enforcement agents (based on mobile agent technologies) that enforce privacy at the remote site. These privacy enforcement agents, as described, provide a solution for implementing "sticky policies." The paper suggests technologies that can be used to implement the privacy enforcement agents, including proof-carrying code, the BSD packet filter, and authentication and authorization mechanisms. Many of these technologies are not yet mature enough (at the time of this writing) to handle privacy control on a large scale.
Other Tools and Procedures

Translating E-P3P (EPAL) to P3P

Researchers at IBM propose a method to translate privacy practices expressed in E-P3P into corresponding privacy promises expressed in P3P (Karjoth, Schunter, & Waidner, 2002; Karjoth et al., 2003). This allows organizations to publish privacy promises on their Web sites that correspond to their internal privacy policies. The translation is done using mapping tables that associate language elements from each privacy policy language.
Efficient Comparison of Enterprise Privacy Policies

Research at the IBM Zurich Research Laboratory on the refinement of EPAL policies is documented in Backes, Bagga, Karjoth, and Schunter (2004). The paper provides a technique for comparing privacy policies. Major benefits of this technique include the ability to check organizational privacy policies against existing legislation and the ability to check that the privacy policies of partnered organizations are compatible (i.e., maintain the privacy policy claims and users' preferences). The paper states that regular access control policy systems are not sufficient to implement this type of policy refinement for three privacy-specific reasons: (1) data hierarchies must be verified; (2) obligations and conditions must be verified; and (3) rules usually overlap and require an efficient algorithm to compare them as a whole. However, these concerns are addressed, to some degree, by existing technology (policy combining algorithms) included in access control standards such as XACML, as discussed elsewhere in this chapter.
IBM Tivoli Privacy Manager for E-Business

IBM Tivoli Privacy Manager for e-business (IBM Privacy Manager, 2002) is a product offered by IBM that allows organizations to:

• Define their privacy policies (in P3P format) using a Policy Editor component;
• Deploy the privacy policy across the organization;
• Log user consent and preferences;
• Monitor and enforce policy compliance; and
• Audit the transactions related to the private data.
Policy Compliance System

Yee and Korba stress the need for ways of ensuring that Web service providers comply with users' privacy policies or preferences (Yee & Korba, 2004b).
The main goal of the paper is to derive requirements for a privacy policy compliance system from privacy legislation (in particular, Canada's federal privacy legislation, the Personal Information Protection and Electronic Documents Act [PIPEDA]) and to propose an architecture for this system. In this privacy policy compliance system (PPCS) architecture, the data is stored in three separate databases: (1) provider/organization information, (2) consumer/individual information, and (3) logs. The architecture also includes a Database Access component that provides read and write access to the data as controlled by the Privacy Controller component. Interactions with the user are provided by the Web Interface component, and the Private Data Import/Export component sends personal data disclosures to other providers/organizations.

The paper also describes security features required by the architecture, such as firewalls, intrusion detection systems, read/write protection using encryption or operating system controls, secure channels, and authentication. Compliance by parties receiving private data disclosures is assumed to be provided by a PPCS installed and in force at those parties' sites. The paper raises the concern that organizations could tamper with the system's logs or other data and suggests the need for certified or standard PPCS software, possibly with critical components embedded in tamperproof hardware.
Proposed Unifying Architecture

The technologies presented in the previous section provide technical frameworks and policy languages that each solve a portion of the overall privacy enforcement problem. However, many of these technologies have been developed in isolation, without regard to other, complementary technologies. Furthermore, products or components are often deployed and used independently of other products or components. As in other areas of security, it may generally be stated that multiple, separately installed "point solutions" will not achieve the overall privacy goals of the organization or of the user. Unexpected interactions between different products or different technologies can lead to unintended leakage and use of personal information (which is a violation of privacy and which may, in some environments, contravene applicable laws). But the problem is even deeper than this: within the same technology (privacy policies, for example), the use of different languages to express the content and format of rules and operational procedures can cause ambiguities or incompatibilities that, again, may lead to information leakage and undesired behavior.

This section proposes and describes APEX (Architecture for Privacy Enforcement using XML), a comprehensive privacy enforcement architecture with two primary goals. The first goal is consistency of privacy policy throughout the (potentially multi-domain) environment, both within each technology (by using a single language framework — XML — to express all rules) and across technologies (by using automated tools to derive one policy from another). The second goal is simplified enforcement, achieved by using the access control model of policy decision points (PDPs) and policy enforcement points (PEPs) whenever requests to access personal data are made. The underlying philosophy of APEX is that if the privacy policy is consistent throughout the environment, if all PDPs
are driven by this policy, and if each PEP properly safeguards the decisions handed to it by a PDP, then privacy will be correctly enforced. The overall enforcement problem is thus confined to a relatively small set of components (policy transformation engines, decision engines, and enforcement engines), which can be carefully designed, implemented, analyzed, and audited by internal and external experts in each area. This is a major improvement over the current situation, in which enforcement is handled in ad hoc ways by myriad programs and components across the environment, and there are no specific places on which experts can focus their attention for analysis and audit.

As one example of where APEX can be used, consider a privacy-aware Web site. The Web site's P3P policy (which declares publicly the site's privacy promises) may be derived directly from the internal XACML access control policy (which codifies the full set of rules governing all access to personal data) in an automated fashion using XSLT (eXtensible Stylesheet Language Transformation) technology. These two policies are therefore guaranteed to be consistent, giving users greater confidence that the site will actually practice what it preaches, especially if a privacy seal agency has inspected and certified the relevant XSLT, PDP, and PEP engines. Without an architecture such as APEX, users have no reason to be confident about the Web site's behavior, since the access control policy and the P3P policy may be written at different times, in different policy languages, by entirely different groups of people. There is nothing whatsoever that automatically guarantees their consistency, and seal agencies may have a difficult time ensuring that the P3P promises accurately reflect the way access control is actually implemented at the site.
Threat Model It is important to define the threat model for which APEX was designed. Organizations that deal with the personal data of employees and (especially) external users generally fall into one of three categories: the “good,” the “bad,” and the “indifferent.” Those that are “indifferent” do not really care very much about properly safeguarding this data. To a large extent, relevant privacy laws for the jurisdiction(s) in which they operate will cause them to begin to care over time, at which point they become part of the “good” category. “Good” organizations take privacy seriously and want to protect it properly, but may have a significant number of sloppy practices or poor implementations that have not yet been discovered or exposed; alternatively, they may find that their environments are sufficiently large and complex that they don’t have a good understanding of who is able to access what personal data, when, and for what reasons. APEX was designed to help “good” organizations ensure that consistent privacy policies are enforced throughout their environments. The third category, “bad” organizations, consists of those who deliberately set out to use personal data for malicious intent (in the absence of, or in spite of, privacy laws that strive to deter such behavior). Clearly, neither APEX nor any other technical mechanism can prevent this: the organization has the personal data and can use it improperly if it wishes. (For example, malicious entities within the organization could switch off enforcement engines or put decision engines into a debug mode such that all requests trigger
a response of “Permit”). APEX cannot impose correct privacy enforcement on such “bad” organizations. However, as previously discussed, this architecture greatly simplifies the task of the trusted agency that inspects the organization’s business processes and data flows. Hence APEX, in conjunction with random external audits of the critical engines, can help to keep a “bad” organization “honest” and enforce proper privacy practices. APEX then, if correctly implemented, gives guaranteed privacy enforcement to “good” organizations that might otherwise find it too complex to get the enforcement correct. Furthermore, it increases the probability that “bad” organizations will also provide privacy enforcement by making it significantly easier for trusted agencies to discover malicious behavior.
Architecture Design

An important aspect of privacy is the control of access to personally identifiable information (PII). This aspect can be implemented using a comprehensive access control architecture, such as the one described by XACML (XACML 1.1, 2003). APEX describes how a privacy enforcement architecture can be implemented using XML technologies for the policy specifications and other technologies to enforce the privacy requirements.

Figure 1. APEX privacy policy architecture

Figure 1 provides an overview of a privacy policy architecture (PPA). This architecture
illustrates the different privacy policies required in most organizations: inter-organizational privacy policy (IOPP), organizational privacy policy (OPP), local privacy policy (LPP), and Web privacy policy (WPP). These policies describe how private data is handled within the organization, always in harmony with the user’s preferences. Note that in this diagram the user is an entity external to the organization (as a customer is external to the e-commerce Web site he or she visits) and so is not covered directly or solely by the OPP. Users within the organization (e.g., employees and their PCs) are included within the cloud representing the organization. The PPA also includes several privacy policy enforcement points (i.e., inter-organizational privacy enforcement engine [IOPEE], organizational privacy enforcement engine [OPEE], local privacy enforcement engine [LPEE], and Web privacy enforcement engine [WPEE]), which ensure that all access to private data abides by the appropriate privacy policies. These components are described in further detail next.

As stated previously, in this proposed architecture all organizational privacy policies are guaranteed to be consistent, thus making enterprise-wide policy enforcement a more readily achievable goal. The privacy policy architecture is derived from the companywide access control policy. This transformation is performed in an automated fashion using XSLT engines. The privacy policy architecture is enforced by a number of privacy enforcement engines. These engines are based in part on the XACML architecture.

In the XACML architecture, there are enforcement points that accept resource access requests. These enforcement points formulate the request in XACML syntax and forward it to a decision point. The decision point makes a decision based on the XACML policy and other information available and returns an XACML response to the enforcement point. The enforcement point then allows or denies access to the requested resource based on the decision received from the decision point.

In the privacy enforcement architecture, a similar process is employed. When an application requests access to personal data, the request is processed by a privacy enforcement engine deployed at key choke points in the enterprise (as shown in Figure 1). This privacy enforcement engine formulates the request in the XACML syntax and forwards it to the privacy transformation and decision engine (see Figure 2). The privacy transformation and decision engine, which is responsible for transforming any applicable XML-based privacy policies to or from XACML access control policies, evaluates the request against the access control policy to produce an XACML response. The XACML response, which includes the enforcement action (permit or deny) and possibly some obligations, is sent from the privacy transformation and decision engine to the privacy enforcement engine. The privacy enforcement engine then permits or denies access to the personal data based on the XACML response.

The use of privacy enforcement points and the transformation of policies to maintain consistent privacy decisions across the enterprise both contribute to increasing the assurance that the privacy policies are properly enforced. Auditing the policy transformations as well as the privacy enforcement points provides additional assurance.
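To make this exchange concrete, the following is a simplified sketch of the XACML request context a privacy enforcement engine might send to the decision engine. The subject, resource, and action values are illustrative, and status and obligation details are omitted:

  <Request xmlns="urn:oasis:names:tc:xacml:1.0:context">
    <Subject>
      <Attribute AttributeId="urn:oasis:names:tc:xacml:1.0:subject:subject-id"
                 DataType="http://www.w3.org/2001/XMLSchema#string">
        <AttributeValue>alice</AttributeValue>
      </Attribute>
    </Subject>
    <Resource>
      <Attribute AttributeId="urn:oasis:names:tc:xacml:1.0:resource:resource-id"
                 DataType="http://www.w3.org/2001/XMLSchema#anyURI">
        <AttributeValue>urn:example:records/bob/contact-info</AttributeValue>
      </Attribute>
    </Resource>
    <Action>
      <Attribute AttributeId="urn:oasis:names:tc:xacml:1.0:action:action-id"
                 DataType="http://www.w3.org/2001/XMLSchema#string">
        <AttributeValue>read</AttributeValue>
      </Attribute>
    </Action>
  </Request>

The corresponding response carries the decision that the enforcement engine must then apply:

  <Response xmlns="urn:oasis:names:tc:xacml:1.0:context">
    <Result>
      <Decision>Permit</Decision>
    </Result>
  </Response>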
Figure 2. APEX request processing
Architectural Components

XML-Based Privacy Policy

There are several types of privacy policies applicable throughout the organization. The primary examples are the local privacy policy (LPP), the Web privacy policy (WPP), the organization privacy policy (OPP), and the inter-organization privacy policy (IOPP), but other policies may also be relevant in specific environments. There has been some work on specifying a language for the definition of each of these four types of policies. Although the current initiatives may not be complete, they have made significant progress in defining the requirements and the syntax of these policy languages. Because they are specified in XML, it will be relatively easy to augment them as new requirements are identified.
• Local privacy policy (LPP): The LPP specifies the rules to be enforced with regard to transactions involving private data on the local system. We are unaware of an implementation of a local privacy policy at the current time other than traditional file system access control policies. A very useful feature for non-technical users would be pre-defined privacy control policies (templates) provided by trusted organizations. The research described by Yee and Korba (2004a) introduces two approaches to deriving personal (or local) privacy policies in a semi-automated fashion.

• Web privacy policy (WPP): The WPP specifies the rules to be enforced with regard to Web-based transactions involving private data. An existing standard for the definition of these policies is P3P (P3P 1.0, 2002); a minimal fragment is sketched after this list.

• Organizational privacy policy (OPP): The OPP specifies the rules to be enforced with regard to transactions involving private data within the organization. A proposed language for the definition of these policies is EPAL (Ashley et al., 2003).

• Inter-organizational privacy policy (IOPP): The IOPP specifies the rules to be enforced with regard to transactions involving private data between organizations. Even though EPAL (Ashley et al., 2003) is a proposed language for the definition of these policies, it may not be complete. For example, researchers have identified some requirements that are not covered by the current version of EPAL (Lakshminarayanan, Ramamoorthy, & Hung, 2003). Although researchers at IBM initially had strong counter-opinions on this topic, we note that IBM recently appears to be displaying slightly less interest in EPAL and slightly more interest in alternative policy languages (such as XACML). Thus, although it is not yet clear what the final result will be, XACML is showing increasing promise as a strong candidate in this area.
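As a concrete illustration of the kind of XML such policies contain, the following is a minimal and incomplete P3P fragment of the sort a WPP might include. The policy name, discuri, and data reference are illustrative only; a complete P3P policy also declares ENTITY and ACCESS information:

  <POLICY name="sample-wpp" discuri="http://www.example.com/privacy.html"
          xmlns="http://www.w3.org/2002/01/P3Pv1">
    <STATEMENT>
      <!-- Promise: postal address is collected only to complete the current activity -->
      <PURPOSE><current/></PURPOSE>
      <RECIPIENT><ours/></RECIPIENT>
      <RETENTION><stated-purpose/></RETENTION>
      <DATA-GROUP>
        <DATA ref="#user.home-info.postal"/>
      </DATA-GROUP>
    </STATEMENT>
  </POLICY>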
Privacy Transformation and Decision Engine (PTDE)

Figure 3. Privacy transformation engine (converts XML-based privacy policies to/from XACML access control policies)

Figure 3 illustrates how XML-based privacy policies are transformed to and from XACML access control policies. This transformation occurs in the Privacy Transformation and Decision Engine (PTDE). There are several possible functions for the PTDE. Organizational and Web privacy policies can be derived from the organization’s XACML access control policy so that their privacy promises are reflective of their implemented access control practices. XML-based policies can also be transformed into XACML. For example, when a user submits his or her privacy preferences, the PTDE will transform these user preferences into an
XACML policy and compare them with Web and organizational privacy policies to determine whether they are acceptable to the site (or perhaps whether negotiation may be possible). Another important function of the PTDE is to apply policy-combining algorithms when necessary. Policy-combining algorithms can be used when making a decision that involves the user preferences XACML policy and the organization’s XACML policy, or that involves two different organizations’ policies. The combining algorithm can be used to specify the final ruling of the combined policies. For example, with the combining algorithm “deny overrides,” if at least one of the policies denies the request then the final ruling will be to deny the request. Other combining algorithms include “permit overrides,” “first applicable,” and “only one applicable.” The XACML specification also allows the definition of custom combining algorithms.
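A sketch of how such a combination might be expressed in XACML, assuming that the user’s preferences and the organizational policy have already been transformed into XACML policies with the illustrative identifiers shown:

  <PolicySet xmlns="urn:oasis:names:tc:xacml:1.0:policy"
             PolicySetId="urn:example:policyset:combined-privacy"
             PolicyCombiningAlgId="urn:oasis:names:tc:xacml:1.0:policy-combining-algorithm:deny-overrides">
    <Target>
      <Subjects><AnySubject/></Subjects>
      <Resources><AnyResource/></Resources>
      <Actions><AnyAction/></Actions>
    </Target>
    <!-- Policy derived from the data owner's (user's) privacy preferences -->
    <PolicyIdReference>urn:example:policy:user-preferences</PolicyIdReference>
    <!-- The organization's own XACML privacy/access control policy -->
    <PolicyIdReference>urn:example:policy:organizational-privacy</PolicyIdReference>
  </PolicySet>

With the deny-overrides combining algorithm, a request is denied whenever either referenced policy denies it, which is a conservative way to honor both the user’s preferences and the organization’s own rules.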
In many environments, the Privacy Transformation and Decision Engines will evaluate the following information, in some specified order:

1. User (data owner) preferences.
2. Enterprise privacy policies (OPP, IOPP, WPP, LPP).
3. Traditional access control policies.
For example, the user’s preferences or consents may be considered first, and then applicable enterprise policies may be evaluated. If there are no user preferences and no enterprise privacy policies to dictate how the data should be handled, the traditional access control policies defined for the data will be employed. It is important to note, however, that the syntax of XACML allows these various policy components to be evaluated in any other desired order, as specified by the XACML policy writer. Note that combining algorithms do not resolve policy conflicts; rather, they reflect/ encode resolution decisions that have been made by the policy writer (hopefully in conjunction with many of the relevant interested parties) at the time of policy writing. Thus, it is the policy writer that specifies, for example, that user preferences will be evaluated before corporate privacy policies or that any policy that denies an access will overrule all others. Typically, within an organization the owner of a resource will write the XACML policy that controls access to that resource. Things can become more complicated when the resource is user personal data, however, because the XACML writer may be a corporate entity, even though it may be difficult to determine precisely who the “owner” is. If discussions regarding combining strategies have not taken place, or have not reached consensus, policy conflicts can still occur with respect to this data and will need to be resolved in some other way, either at the time of the transaction or at a later time. The Privacy Transformation and Decision Engines (PTDE) are made up of two components:
• Transformation engines; and
• Decision engines.
The following subsections describe each of these sub-components.

Transformation Engines

Research by IBM (Karjoth et al., 2003) describes the process involved in transforming EPAL policies to P3P policies using mapping tables. A similar method of using mapping tables is employed in the APEX architecture’s transformation engines, where policies are transformed by first mapping the different elements of the two policy languages. APEX transformation engines go one step further by automating the transformation using XSLT engines. The APEX transformation engines use XSLT to translate XML-based privacy policies to or from XML-based access control policies (e.g., XACML). Different transformation engines are required for each of the different XML-based privacy policies (e.g., P3P and EPAL). Researchers at the University of Ottawa (Li & Yan, 2004) have defined and implemented an XSLT transformation engine that automatically translates XACML policies to P3P policies; a small, illustrative fragment of this kind of transformation is sketched after the list of challenges below. Some of the challenges identified in this translation are as follows:
• XACML is flexible and allows policies to be written with different labels for data, purpose, and so on. However, to successfully create appropriate mapping tables, it is necessary that some labels be consistent between policies and organizations.

• Some information, such as entity information, dispute information, and URLs, is not encoded in an XACML policy. This information must therefore be input from other sources as parameters to the transformation engine.
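By way of illustration only (real engines rely on richer mapping tables and supply the missing entity and dispute information as parameters), an XSLT fragment of roughly the following shape could emit a skeletal P3P STATEMENT for each permitting XACML rule; the fixed P3P vocabulary values shown are placeholders that a mapping table would normally select:

  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:xacml="urn:oasis:names:tc:xacml:1.0:policy"
      xmlns="http://www.w3.org/2002/01/P3Pv1">

    <!-- Illustrative only: wrap the derived statements in a P3P POLICY element -->
    <xsl:template match="/xacml:Policy">
      <POLICY name="derived-from-xacml" discuri="http://www.example.com/privacy.html">
        <xsl:apply-templates select="xacml:Rule[@Effect='Permit']"/>
      </POLICY>
    </xsl:template>

    <xsl:template match="xacml:Rule">
      <STATEMENT>
        <!-- Purpose, recipient, and data references would normally be selected
             from a mapping table keyed on the rule's labels; fixed values are shown. -->
        <PURPOSE><current/></PURPOSE>
        <RECIPIENT><ours/></RECIPIENT>
        <RETENTION><stated-purpose/></RETENTION>
        <DATA-GROUP>
          <DATA ref="#dynamic.miscdata"/>
        </DATA-GROUP>
      </STATEMENT>
    </xsl:template>
  </xsl:stylesheet>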
Nevertheless, a working prototype of this transformation has been developed (see Figure 4). Engines for translating from P3P to XACML, and from APPEL to P3P, have also been successfully implemented (Mahmoudian, 2004; Davies & Tou, 2005). Other transformation engines will need to be developed to transform other privacy policies (such as EPAL) to and from XACML policies. These are the subjects of future work.

Figure 4. XSLT engine for XACML to P3P (converts an XACML access control policy, together with other information such as entity and dispute details, into a P3P privacy policy)

Decision Engines

The decision engines evaluate the rules and provide a ruling that is passed on to the enforcement engines. This process is identical to the one employed in the XACML architecture (XACML 1.1, 2003). Since all privacy policies in the APEX architecture are transformed to XACML, the XACML PDP can be used as a decision engine. Implementations of a valid XACML PDP engine are available; see, for example, SunAPI (2003).
Privacy Enforcement Engine(s)

Privacy Enforcement Engine(s) will enforce access control decisions specific to their domain (Web, Organization, Inter-Organization, or Local) as dictated by their governing policies. Privacy Enforcement Engines also perform obligation notification (that is, they
convey policy-specified obligation notices to the entity/entities that are responsible for carrying out these obligations). Note that obligation enforcement (ensuring that the specified obligations have been completed correctly) is a complex and difficult problem that is a research topic in many organizations in the areas of digital rights management (DRM), access control, and privacy enforcement. Web Privacy Enforcement Engine (WPEE) The Web privacy enforcement engine will enforce privacy policies using access control at a Web point of presence. It will be used for e-commerce type activities such as form submission of private data or requests to view private data. The Web privacy enforcement engine can be implemented in the Web user agent (e.g., browser) or in a Web proxy. (In Figure 1, the WPEE is deliberately shown in the middle between the user and the Web site to convey the notion that it might be instantiated at either end). A form of the Web privacy enforcement engine exists with APPEL engines (P3P Implementations), a technology for the evaluation of a P3P policy with respect to the user privacy preferences. APPEL engines compare the user preferences to the P3P policy of the object accessed (i.e., the Web page) and warn the user if his or her preferences cannot be satisfied. In work by IBM (Agrawal et al., 2003), P3P policies are implemented in a database, and APPEL preferences are translated into queries. The user’s preferences are matched to the privacy policy elements using a query to the database, and the query response is the result sent back to the user. Organizational Privacy Enforcement Engine (OPEE) The organizational privacy enforcement engine will enforce privacy policies using access control for all access to private data throughout the organization. The organizational privacy enforcement engine can be implemented in the file server’s file system or as a mandatory application proxy or gateway that intercepts all access (read, write) to
stored private data. EPAL is an initiative that could potentially be used as the policy input to this architecture component.

Inter-Organizational Privacy Enforcement Engine (IOPEE)

The inter-organizational privacy enforcement engine will enforce privacy policies using access control for all access to private data between organizations. The inter-organizational privacy enforcement engine can be implemented in the file server’s file system, as a mandatory application proxy, or as an extranet gateway that intercepts all access (read, write) to stored private data. Mapping terminology and (especially) mapping semantics between organizational privacy policies can be a difficult problem at this inter-organizational layer; see Section vi.vi for further discussion.

Local Privacy Enforcement Engine (LPEE)

The local privacy enforcement engine will enforce privacy policies using access control for all access to private data on a desktop or local system. The Local Privacy Enforcement Engine can be implemented in the file system, as a local proxy, or as a “sticky policy” implementation. As discussed earlier, forms of the Local Privacy Enforcement Engine have been researched at HP Trusted System Laboratory, whereby privacy labels or tags are attached to the private data and “stick” to the private data as it traverses the system (Beres et al., 2003). They have also researched the use of IBE technology to obfuscate the private data and enforce privacy control.
Personally Identifiable Information Storage

The PII storage component will be a database or other data storage device that will contain all private data collected. This data storage device will interoperate with the enforcement engines to ensure that only authorized access to the private data is allowed. This component should include traditional database security features such as encryption, passwords, and auditing. There has been limited research in the area of privacy-aware databases. The Hippocratic database work by IBM discussed earlier (Agrawal et al., 2002) is one example of a database system that could be integrated into the APEX architecture to provide further privacy enforcement at the database level. Other research related to privacy-aware database systems includes the pawS system (Langheinrich, 2002), a privacy awareness system that implements a privacy-aware database (pawDB). pawDB stores users’ data with their relevant privacy policies or preferences.
Audit Audit is implemented throughout the enterprise infrastructure. We can distinguish two types of audit in this architecture. The first is the collection and storage of audit logs
containing all requested data and response actions. This auditing should be performed on each Privacy Enforcement Engine and kept in a well-protected environment. The second type of audit is an external audit such as a privacy assessment where a privacy seal or similar testament of privacy assurance can be awarded to organizations that convincingly demonstrate to the seal agency their adherence to privacy laws, regulations, and/or best practices. Even though the implementation of APEX can increase user trust in an organization’s privacy practices, external audits are still needed as an additional level of assurance. However, because APEX is designed with engines deployed strategically throughout the organization, external auditing can be simplified by focusing the assessment on the core of the architecture — the enforcement engines and the decision and transformation engines. Possible technologies that can be used to implement audit features include simple operating system, network, application, or database-level audit logs, self-certifying software (Felty & Matwin, 2002), and external certification initiatives such as the Common Criteria (Common Criteria), which has been discussed in various forums as a way to provide privacy certifications and Privacy Seal programs (e.g., TRUSTe [n.d.]).
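As a purely illustrative sketch of the first type of audit (every element name and value below is invented for this example), each Privacy Enforcement Engine might append one record per enforced decision along the following lines:

  <AuditRecord timestamp="2005-11-01T14:32:10Z" engine="OPEE-01">
    <!-- Who asked, for what, for which purpose, and what was decided -->
    <Requester>payroll-application</Requester>
    <Subject>alice</Subject>
    <Resource>urn:example:records/bob/contact-info</Resource>
    <Action purpose="billing">read</Action>
    <Decision>Permit</Decision>
  </AuditRecord>

Correlating such records across the distributed engines also supports the inference-detection approach discussed later in this chapter.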
Remaining Issues and Challenges The research presented in this chapter integrates and extends various access control and privacy policy XML-based technologies to create a new architecture. However, there are still a number of areas in which further progress may be made. Some of the more significant issues and challenges that remain are discussed in this section.
Immaturity of Current Technology Table 2 summarizes (from the section entitled Background) the main components of a privacy enforcement architecture and the corresponding technologies or research directions currently under investigation. Although it is clear that work is being done with respect to each of the architectural components, at the present time, few (if any) of these efforts are sufficiently complete or mature for wide-scale adoption and deployment.
Privacy Laws and Regulations: Jurisdictions

Privacy can be seen as the protection of personal information through a combination of legislation, policy, and technology. APEX addresses the implementation of policy and technology to some degree but does not address legislation. A significant issue on the legal side is how different laws in different countries or states affect the privacy enforcement architecture. Should these differences be dealt with in the privacy policies or elsewhere in the architecture? Should they be considered after the fact using the auditing mechanisms?
Table 2. Privacy enforcement architecture components and current research

Privacy Policy
• Local privacy policy: user preference creation (Yee & Korba, 2004a)
• Web privacy policy: P3P policies (P3P 1.0, 2002)
• Organizational privacy policy: EPAL policies (Ashley et al., 2003)
• Inter-organizational privacy policy: EPAL policies (Ashley et al., 2003); efficient comparison of enterprise privacy policies (Backes et al., 2004)

Privacy Trans. and Decision Engine
• Privacy transformation engine: XSLT engine for transformation from XACML to P3P (Li & Yan, 2004); XSLT engine for transformation from P3P to XACML (Mahmoudian, 2004); transforming EPAL to P3P (Karjoth et al., 2003)
• Privacy Decision Engine: XACML PDP engine (SunAPI, 2003)

Privacy Enforcement Engine
• Web privacy enforcement engine: APPEL implementation (P3P Implementations); implementing P3P using database technology (Agrawal et al., 2003)
• Organizational privacy enforcement engine: file server's file system; mandatory application proxy; internal gateway that intercepts all access to stored private data
• Inter-organizational privacy enforcement engine: file server's file system; mandatory application proxy; extranet gateway that intercepts all access to stored private data
• Local privacy enforcement engine: HP sticky policies (Beres et al., 2003)

Personally Identifiable Info. Storage
• Database: Hippocratic databases (Agrawal et al., 2002); pawDB (Langheinrich, 2002)

Audit
• Log files: operating system logs; network logs; application logs; database logs
• Local certification: self-certifying software (Felty & Matwin, 2002)
• 3rd-party certification: Common Criteria certifications (Common Criteria); Privacy Seal programs
In many cases the privacy policies are defined by organizations with particular privacy legislation in mind (federal legislation, provincial/state legislation, sector-based legislation, and so on). Some privacy policy languages (e.g., P3P version 1.1) have begun to incorporate “jurisdiction” as part of their specification
language, allowing organizations to specify to which jurisdiction a particular data recipient belongs. The question about where jurisdiction compatibility enforcement should be implemented within the architecture is one important direction for further research.
Dealing With Private Data After Retrieval Once private data has been requested and successfully retrieved, the data is typically stored on the requesting computer in an unprotected state. A malicious user could transmit this data to an unauthorized user or system and violate the privacy policy for this data. A method is required for detecting this and (ideally) preventing it from occurring. The APEX architecture includes a Local Privacy Enforcement Engine (LPEE) that can be used to enforce privacy policies on the local system. However, technical implementations to provide this functionality are likely to be challenging.
Purpose It has been noted that binding a “purpose” to private data use is not included in traditional access control systems (Ashley, Hada, Karjoth, & Schunter, 2002a; Ashley, Powers, & Schunter, 2002b). A way to deal with this in an access control policy language such as XACML is to extend XACML to make use of “purpose” in access control decisions. This can be done relatively simply by adding “purpose” as a specified Action attribute in the XACML schema. In this way, APEX engines can incorporate the concept of access control decisions based upon the purpose for which the requester wishes to retrieve the data. This work has been completed within the XACML Technical Committee and was standardized on February 1, 2005, by OASIS (Organization for the Advancement of Structured Information Standards) as part of the XACML 2.0 suite of specifications (XACML 2.0, 2005).
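A sketch of what this can look like in a request context follows; the purpose attribute identifier is a placeholder rather than the identifier standardized by OASIS:

  <Action xmlns="urn:oasis:names:tc:xacml:1.0:context">
    <Attribute AttributeId="urn:oasis:names:tc:xacml:1.0:action:action-id"
               DataType="http://www.w3.org/2001/XMLSchema#string">
      <AttributeValue>read</AttributeValue>
    </Attribute>
    <!-- Placeholder identifier; XACML 2.0 standardizes purpose handling -->
    <Attribute AttributeId="urn:example:action:purpose"
               DataType="http://www.w3.org/2001/XMLSchema#string">
      <AttributeValue>research</AttributeValue>
    </Attribute>
  </Action>

A rule in the XACML policy can then match on this attribute so that, for example, a request to read personal data is permitted only when the stated purpose is one to which the data owner has consented.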
Inference of Private Data With Multiple Requests One issue often discussed with respect to a privacy enforcement architecture is that it is difficult to prevent privacy violations that occur due to deductions (e.g., information is gathered by asking multiple queries throughout the system, and private data is deduced from these independent queries). Data mining techniques are often used in this way to obtain private data dishonestly. Various techniques have been suggested to attempt to prohibit malicious users from mining private data. These efforts can be classified into two categories (Agrawal & Srikant, 2000): query restriction and data perturbation. Query restriction techniques include:
• Restricting the size of query results,
• Controlling overlap amongst successive queries,
• Keeping audit trails of all answered queries and constantly checking for possible compromise,
• Suppression of data cells of small size, and
• Clustering entities into mutually exclusive atomic populations.
Data perturbation techniques include:
• Swapping values between records,
• Replacing the original database value by a sample from the same distribution,
• Adding noise to the values in the database while maintaining one or more statistical invariants,
• Adding noise to the results of a query, and
• Sampling the result of a query.
Some of the above techniques are more effective than others, and some are useful only in particular environments. Some specific research efforts in these areas include (Sweeney, 2000, 2001, 2002; Agrawal & Srikant, 2000; Agrawal et al., 2002; Clifton, 2001; Kantarcioglu & Clifton, 2003). One major disadvantage of query restriction and especially data perturbation techniques is that they can often destroy the integrity of the data. A potential advantage of the proposed architecture APEX is that the distributed Privacy Enforcement Engines can each log requests made to them over time. Correlation of these logs can then be done in an attempt to detect privacy violations via inference without altering the private data. The effectiveness of such an approach is an important area for future research.
Standard Privacy Policy Vocabularies Another challenge that must be addressed to ensure successful implementations of privacy enforcement systems is the need for standard, or at least compatible, vocabularies for privacy policy specification. Most of the current work involves mapping different privacy and access control policy specification languages. These languages (especially XACML and EPAL) are designed to be very flexible and allow organizations to customize the policies to their organization’s practices by defining their own terminology. Languages such as P3P, which limit the terminology used to define policies, prove to be less problematic because there is a controlled vocabulary that must be mapped or translated (note: EXTENSION elements in P3P can still cause some problems). The challenge of mapping language vocabularies can be tackled by defining global and sector-specific vocabulary terms that are standardized and used by all. For example, when describing the entity that collects private data about users, languages currently allow many terms such as “organization,” “entity,” “site,” “collector,” “provider,” “service provider,” “government,” “store,” and so on. However, if all organizations used the term
“collector” (for example), it would greatly simplify the task of checking compatibility between different policies or translating from one policy to another. Although some research has been done in this area, such as Powers, Adler, and Wishart (2004) and research conducted at the University of Ottawa, much work remains to be done. Aside from the complexity of translation, it is also important to consider the complexity of use. Expressive power in a policy language can sometimes make the language difficult to use (e.g., by the people who must create the policies). Evaluating how accurately people can represent their desired policies in a specified language — and to what extent user interface tools can help — is an interesting area for future (probably interdisciplinary) work.
Quantitative Evaluation of APEX APEX has been designed to improve privacy enforcement across an environment. But evaluating — in a quantitative way — the extent to which this architecture has met this goal is a subtle and non-trivial exercise. What are the appropriate metrics? How can relevant measurements be made? One possible approach is a long-term comparative study of two organizations of similar size and function, one using APEX and the other using no comprehensive architectural solution. The evaluator can then count the number (and measure the severity) of privacy violations in each organization over a multi-year period to get an intuition about how much APEX has helped. However, such a study may be difficult to do in a rigorous way because of all the variables that need to be taken into account (not the least of which is the competence and diligence of the various system administrators and other relevant entities in the two organizations). Other possible evaluation techniques are an area for further research.
Future Trends

Interest in the enforcement of privacy policies and personal preferences is a growing trend within the broad area of privacy protection for e-services. Users are becoming more aware of privacy issues with respect to their online transactions and are beginning to demand greater assurance that their private data will be properly protected throughout its life cycle. Technologies for privacy enforcement play a major role in providing such assurance, and are therefore becoming ever more important to e-service providers. However, individual, isolated technologies are of limited effectiveness since privacy can be so easily violated through unexpected conflicts and interactions between technologies and between components. Thus, a comprehensive architecture for privacy enforcement is essential for proper protection of personally identifiable information. It is expected that privacy enforcement architectures will be an increasingly active area of research, development, and enterprise-wide implementation for at least the next several years.
Conclusions In this chapter we provide a brief introduction to a variety of technologies for privacy enforcement. We also propose an Architecture for Privacy Enforcement using XML (APEX) that builds upon these standards and research directions. We define a set of privacy enforcement engines that accept requests for private data, consult with the privacy transformation and decision engines, and provide a response to the requester. The privacy transformation and decision engines ensure that the relevant privacy policies (Web privacy policy, organizational privacy policy, inter-organizational privacy policy, and local user preferences) are consistent and transformed into the format required for a correct privacy decision to take place. Auditing, for legal reasons or for additional assurance, may be performed throughout the architecture with particular emphasis at each enforcement engine deployed in the organization. The primary benefit of APEX, aside from integrating many of the current isolated technologies, is that it provides greater assurance for the end user that his or her personal data will be protected in the manner that he or she expects. In particular, the automated translation between different privacy policies throughout the architecture ensures consistency between advertised practices and actual practices with respect to privacy at the e-service provider Web site. Furthermore, although the need for third-party auditing is not eliminated in this architecture, the auditing or certification process is greatly simplified because it is focused on a relatively small number of specific components in the enterprise. Thus, certifications or privacy seals may be achieved more quickly and lead to an even greater degree of user confidence than is possible today. Also discussed in this chapter were some remaining issues and challenges that provide directions for further research in this important area.
Acknowledgments This work was partially supported by a grant from Ontario Research Network for Electronic Commerce (ORNEC). The authors also gratefully acknowledge the anonymous reviewers, whose comments significantly improved the content and presentation of this chapter.
References

Agrawal, R., Kiernan, J., Srikant, R., & Xu, Y. (2002). Hippocratic databases. In Proceedings of the 28th VLDB Conference, Hong Kong. Agrawal, R., Kiernan, J., Srikant, R., & Xu, Y. (2003). Implementing P3P using database technology. In Proceedings of the 19th International Conference on Data Engineering, Bangalore, India.
Agrawal, R., & Srikant, R. (2000). Privacy-preserving data mining. In ACM SIGMOD Conference on Management of Data, Dallas, Texas, USA. APPEL. (2000, April). A P3P Preference Exchange Language (APPEL) (WWW Consortium, P3P Preference Interchange Language Working Group, W3C Working Draft). Retrieved from http://www.w3.org/TR/P3P-preferences Ashley, P., Hada, S., Karjoth, G., & Schunter, M. (2002a). E-P3P privacy policies and privacy authorization. In ACM Workshop on Privacy in the Electronic Society (WPES) (pp. 103-109). ACM Press. Ashley, P., Powers, C., & Schunter, M. (2002b). From privacy promises to privacy management: a new approach for enforcing privacy throughout an enterprise. In Proceedings of the 2002 New Security Paradigms Workshop, Virginia Beach, Virginia, USA. Ashley, P., Hada, S., Karjoth, G., Powers, C., & Schunter, M. (2003, November 10). Enterprise Privacy Authorization Language (EPAL 1.2) (W3C Member Submission). Retrieved from http://www.w3.org/Submission/2003/SUBM-EPAL-20031110/ AT&T Privacy Bird. (n.d.). AT&T Privacy Bird. Retrieved from http://privacybird.com Backes, M., Bagga, W., Karjoth, G., & Schunter, M. (2004, March 14-17). Efficient comparison of enterprise privacy policies. In Proceedings of the ACM Symposium on Applied Computing (SAC ’04), Nicosia, Cyprus. Beres, Y., Bramhall, P., Casassa Mont, M., Gittler, M., & Pearson, S. (2003). Accountability and enforceability of enterprise privacy policies (HPL-2003-119). Trusted Systems Laboratory (TSL), Hewlett-Packard Laboratories. Clifton, C. (2001). Privacy preserving distributed data mining. Proposal for project funded by the Purdue Research Foundation, August 2002-August 2004, and funded by the National Science Foundation Information Technology Research program, August 2003-August 2006. Common Criteria. (n.d.). Common Criteria Web site. Retrieved from http:// www.CommonCriteriaPortal.com Cranor, L. F. (2002). Web privacy with P3P. O’Reilly. Davies, R., & Tou, C. S. (2005, April). Incorporating and enforcing user preferences in Web site privacy practices (Honors Project Report). University of Ottawa, SITE. Felty, A., & Matwin, S. (2002). Privacy-oriented data mining by proof checking. In Sixth European Conference on Principles of Data Mining and Knowledge Discovery, LNCS 2431. Springer-Verlag. He, Q. (2003). Privacy enforcement with an extended role-based access control model (NCSU Computer Science Technical Report, TR-2003-09). Hope-Tindall, P. (2002). Privacy impact assessment — Obligation or opportunity: The choice is ours! Presented at CSE ITS, Ottawa, Canada. Retrieved from http:// www.dataprivacy.com/mod/fileman/files/PIA_Material.pdf IBM Privacy Manager (2002). Enable your applications for privacy with IBM Tivoli Privacy Manager for e-business: A technical discussion of privacy protection. Retrieved from http://www.ibm.com
Kantarcioglu, M., & Clifton, C. (2003). Privacy preserving data mining of association rules on horizontally partitioned data. Transactions on Knowledge and Data Engineering, 16(9), 1026-1037. Karjoth, G., Schunter, M., & Van Herreweghen, E. (2003). Translating privacy practices into privacy promises — How to promise what you can keep. In Proceedings of the 4th International Workshop on Policies for Distributed Systems and Networks. Karjoth, G., Schunter, M., & Waidner, M. (2002). Platform for enterprise privacy practices: Privacy-enabled management of customer. In 2nd Workshop on Privacy Enhancing Technologies. Lakshminarayanan, S., Ramamoorthy, R., & Hung, P. C. K. (2003, June 19-20). Conflicts in inter-prise EPAL policies. In W3C Workshop on the Future of P3P, Kiel, Schleswig-Holstein, Germany. Langheinrich, M. (2002). A privacy awareness system for ubiquitous computing environments. Ubicomp. Li, D., & Yan, H. (2004). Automated translations for architecture for privacy enforcement using XML (APEX) (Honors Project Report). University of Ottawak, SITE. Lindskog, H., & Lindskog, S. (2003). Web site privacy with P3P (pp. 147-157). Wiley. Lorch, M., Proctor, S., Lepro, R., Kafura, D., & Shah, S. (2003). First experiences using XACML for access control in distributed systems. In ACM Workshop on XML Security. Mahmoudian, M. (2004). Developing an internal access control policy for a Web site using an automated privacy policy mapping (Honors Project Report). University of Ottawa, SITE. P3P 1.0. (2002, April 16). The Platform for Privacy Preferences 1.0 (P3P1.0) specification (WWW Consortium, W3C Recommendation). Retrieved from http:// www.w3.org/TR/2002/REC-P3P-20020416/ P3P 1.1. (2004, April 27). The Platform for Privacy Preferences 1.1 (P3P1.1) specification (WWW Consortium, W3C Working Draft). Retrieved from http://www.w3.org/ TR/2004/WD-P3P11-20040427/ P3P Implementations. (n.d.) References for P3P implementations (WWW Consortium). Retrieved from http://www.w3.org/P3P/implementations Powers, C., Adler, S., & Wishart, B. (2004, March 11). EPAL translation of the Freedom of Information and Protection of Privacy Act, version 1.1 (IBM Corporation Report for the Government of Ontario). Rezgui, A., Ouzzani, M., Bouguettaya, A., & Medjahed, B. (2002). Preserving privacy in Web services. In Proceedings of the Fourth International Workshop on Web Information and Data Management, McLean, Virginia, USA. SunAPI. (2003). Sun’s XACML implementation programmer’s guide for version 1.1. Retrieved November 5, 2003, from http://sunxacml.sourceforge.net/guide.html Sweeney, L. (2000). Uniqueness of simple demographics in the U.S. population (LIDAPWP4). Carnegie Mellon University, Laboratory for International Data Privacy, Pittsburgh, PA.
Sweeney, L. (2001). Computational disclosure control: A primer on data privacy protection. Doctoral dissertation, Massachusetts Institute of Technology. Sweeney, L. (2002). k-anonymity: A model for protecting privacy. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems, 10(5), 557-570. TRUSTe. (n.d.). TRUSTe Privacy Seal Program. Retrieved from http://www.truste.com XACML 1.1. (2003, August 7). eXtensible Access Control Markup Language (XACML), version 1.1 (OASIS Committee Specification). Retrieved from http://www.oasisopen.org/committees/xacml/repository/cs-xacml-specification-1.1.pdf XACML 2.0. (2005, February 1). eXtensible Access Control Markup Language (XACML), version 2.0 (OASIS Standard). Retrieved from http://docs.oasis-open.org/xacml/ 2.0/access_control-xacml-2.0-core-spec-os.pdf Yee, G., & Korba, L. (2003a, January 27-31). Bilateral e-services negotiation under uncertainty. In Proceedings of the 2003 International Symposium on Applications and the Internet, Orlando, Florida, USA. Yee, G., & Korba, L. (2003b, May 18-21). The negotiation of privacy policies in distance education. In Proceedings of the 14th IRMA International Conference, Philadelphia, Pennsylvania, USA. Yee, G., & Korba, L. (2004a, May 23-26). Semi-automated derivation of personal privacy policies. In Proceedings of the 2004 Information Resources Management Association International Conference, New Orleans, Louisiana, USA. Yee, G., & Korba, L. (2004b, July 6-9). Privacy policy compliance for Web services. In Proceedings of the IEEE International Conference on Web Services, San Diego, California, USA.
Chapter VIII
Protecting Privacy Using XML, XACML, and SAML Ed Simon, XMLsec Inc., Canada
Abstract This chapter describes how two new XML-based technologies, XACML (eXtensible Access Control Markup Language) and SAML (Security Assertion Markup Language) can be used to help protect privacy in e-services. The chapter is primarily a tutorial, briefly introducing XML, and then detailing the privacy features of XACML and SAML including XACML’s ability to ensure the expressed purpose of an action matches a purpose allowed for the resource on which the action is to be performed and SAML’s support for pseudonymity and communicating consent. Concepts are illustrated with detailed examples. The author hopes that readers will be both informed and intrigued by the possibilities for privacy applications made possible by XML, XACML, and SAML.
Introduction

The advent of the World Wide Web over the past decade has made it suddenly feasible for even novice human users to find and retrieve information from any Web site. Moreover, human users are not just receiving information; they are actively using the Web to carry out directives on their behalf such as ordering books, banking, and so forth. Today, the latest Web technologies and techniques such as Web Services and Service-Oriented Architecture (SOA) herald breakthroughs for fully automatable cross-enterprise application-to-application communication, promising almost unlimited possibilities for e-services. It now becomes technically quite possible for enterprise applications (with minimal input from human users) to exchange data with each other no matter who owns them, where they are located, or what hardware and software they are made of. Rapid advances in Web technologies have eliminated what were once perceived to be natural technological barriers. But with the removal of these “natural barriers” comes the increased possibility of misuse, whether intentional or not. Advances in enabling e-services must be complemented by a technological framework that protects personal data, stipulates its appropriate use, and logs that use for subsequent audits.

This chapter gives a whirlwind tour of how new XML-based security standards, particularly XACML (eXtensible Access Control Markup Language), pronounced “exack-mall” in abbreviated form, and SAML (Security Assertion Markup Language), can be used to support privacy for e-services.

During the past few years since its inception, XML has become a widely used, popular format for encoding data. Many new standards for data formats are in XML for a large variety of applications ranging from document formats (such as for Microsoft Office and OpenOffice) to Web Services (enabling different computing components to work together regardless of programming language, platform, or location). What is not so widely known about XML is that it is an incredibly useful tool for expressing and enforcing privacy policies. One of the reasons XML has become so popular is the ease with which it can be used to define and describe a data set through structure and semantics. Structure enables fine-grained dividing of data into its meaningful parts. Semantics describes what a piece of data is and can also describe how it is to be used… and describing how certain data is to be used is an important part of privacy.
Background

This chapter starts with a gentle introduction to XML and then heads directly into a description of today’s important standards relevant to those involved in assuring privacy considerations in e-services. For illustration, we introduce a fictitious scenario in which the privacy of a patient’s medical record is protected using, inter alia, an XACML policy that is evaluated dynamically according to service requests from Physicians, Administrators, and Researchers (for clarity, subject roles are capitalized throughout). Figure 1 illustrates the scenario.
Figure 1. Patient medical record privacy scenario
In Figure 1, a user (the patient’s Physician, a Hospital Administrator, or a University Researcher) wishes to view information in Patient Judy’s medical record. To do so, the user signs on to the hospital’s identity and access management (I&AM) system ([1]) and launches the appropriate application for viewing patient medical records. The application creates a service request for medical record information ([2]) that is sent to the patient medical record database. The request is intercepted by a policy enforcement point (PEP), which permits or denies requests based on the result of the authorization decision it receives ([3]) from the policy decision point (PDP). The PDP, in determining the authorization decision, acquires the relevant policy ([4]) and other information from policy information points (PIPs) — in this case attributes about the user and the resource ([5]). Should the policy be in concordance with the details of the request and its attributes, the PDP indicates to the PEP that the application request is authorized ([6]), and the PEP allows the request to reach the patient medical record database. In turn, the database provides the medical information requested ([7]).

Also in Figure 1, the terms SOAP, SAML, and XACML refer to the respective XML specifications used to define data and/or message protocols. SOAP is the format defined for Web Services messages. The other specifications will be explained throughout this chapter as the scenario is examined in depth. Similarly, the terms PEP, PDP, PIP, and PAP (policy administration point) will also be introduced in more detail.

Judy’s medical record contains a variety of information about Judy including her name, address, physical characteristics, and medical conditions. To protect Judy’s privacy, the type of information each of the three types of users can view is limited to that necessary for their positions. For the scenario, we consider the following privacy rules to be in effect:

1. Judy’s Physician may view all data in Judy’s medical record.
2. Administrators may view only Judy’s identity data (her name and address).
3. Researchers may view only Judy’s physical data and medical conditions, and they may only do so for the purpose of research.

Figure 2. Privacy rules for the medical record scenario
Figure 2 illustrates these privacy rules. Toward the end of this chapter, it will also be shown in a second example how SAML can help protect the privacy of users accessing services. In that example, the Researcher will coordinate different organizations’ e-services without divulging her identity to those organizations.
XML Before going further, a brief introduction to XML is required. As stated earlier, XML brings structure and semantics to data. The structure defines a hierarchical model for the data that provides great flexibility in how the data is organized and deconstructed. Further, the semantics of the data, down to the individual nodes in the structure, can be specified in an extensible manner. Because of XML, a dataset’s structure and semantics can be shared and reused among other datasets.
The primary structure in XML is the element. An element consists of a start tag and an end tag with, optionally, content. Here is a simple XML element:

  <Age units="years">39</Age>

A start tag consists of a left angle bracket (‘<’), the element name, optionally one or more attributes, and a right angle bracket (‘>’). An end tag consists of a left angle bracket (‘<’) followed by a forward slash (‘/’), the element name, and a right angle bracket (‘>’). As shown, attributes are named values.
Namespaces

XML specifies how to define elements, but it does not define the names of elements. In the above example, there is no definition in the core XML specification as to what “Age” means in an element. (If the core XML specification did that, it would not be extensible!). As XML leaves the naming of elements to implementers, the question arises as to what prevents different implementers from using the same XML names. And the answer is nothing! Implementers are free to use whatever names they like. However, in order to distinguish their element names from other implementers’ element names, implementers can assign a namespace to their element names if their XML tags are to be used in a situation where the namespace is not automatically known. (Assigning a namespace to one or more element names is referred to as “qualifying”). A namespace is much like a glossary. A glossary defines the meanings of words as they are used in a particular context. Though element names need not be unique, namespaces necessarily are. To help assure the uniqueness of namespaces, a namespace is typically defined as a URI related to the XML implementer and is associated with a specification that defines what the names mean and how they are to be used.

To use namespaces in XML, XML defines a special xmlns attribute. It may be used in two different styles: with a prefix that is then attached to the element name,

  <med:Age xmlns:med="urn:example:medical">39</med:Age>

or as a default namespace that applies to the element and its descendants:

  <Age xmlns="urn:example:medical">39</Age>
Schemas

Elements may contain data or other elements. Consider the following example:

  <PhysicalData>
    <Gender>Female</Gender>
    <Age>39</Age>
  </PhysicalData>

The <PhysicalData> element contains two elements: <Gender> and <Age>. For XML to be useful beyond trivial examples, it must be possible to rigorously define, in machine-readable terms, what the structure is of an XML document. Structure includes not just stating how elements are to be nested, but also what elements are optional, what number of repeated elements are allowed, what attributes are required, any default attribute values, and so on. It also includes what values or value types are allowed as data content. XML Schema is the specification used to define XML structure. Part of the XML schema that relates to the <PhysicalData> element just shown could look like this:
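(The fragment below is only a sketch; the xsd prefix is assumed to be bound to http://www.w3.org/2001/XMLSchema on the enclosing xsd:schema element, and the type names are illustrative.)

  <xsd:element name="PhysicalData">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Gender" type="xsd:string"/>
        <xsd:element name="Age" type="xsd:nonNegativeInteger"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

A complete schema would also declare the remaining elements of the medical record and could further constrain the allowed values (for example, restricting <Gender> to an enumerated list).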
Comments

In addition to elements and attributes, XML markup can also contain comments. XML comments start with a ‘<!--’ string and end with a ‘-->’ string. The XACML listings shown later include detailed comments describing them. Like comments in programming code, XML comments are considered to be extraneous with regard to processing.
XML Cryptography

Though not shown in the listings, the authenticity, integrity, and confidentiality of all the XML datasets discussed in this chapter can be protected using digital signatures and encryption. Digital signatures guarantee authenticity (knowing who the signer is) and integrity (knowing whether or not a dataset has been maliciously or accidentally modified). Encryption keeps data confidential for as long as it remains encrypted. The W3C "XML Signature Syntax and Processing" and "XML Encryption Syntax and Processing" specifications describe how to implement XML-aware digital signatures and encryption.
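For orientation, skeletal XML Signature and XML Encryption structures are sketched below; the reference URI, algorithm choices, and elided values ('...') are illustrative, and neither fragment appears in the chapter's listings.

    <ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
      <ds:SignedInfo>
        <ds:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/>
        <ds:SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/>
        <!-- The Reference points at the signed data; the URI is illustrative -->
        <ds:Reference URI="#MedicalRecord">
          <ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
          <ds:DigestValue>...</ds:DigestValue>
        </ds:Reference>
      </ds:SignedInfo>
      <ds:SignatureValue>...</ds:SignatureValue>
      <ds:KeyInfo>...</ds:KeyInfo>
    </ds:Signature>

    <xenc:EncryptedData xmlns:xenc="http://www.w3.org/2001/04/xmlenc#">
      <xenc:EncryptionMethod Algorithm="http://www.w3.org/2001/04/xmlenc#aes128-cbc"/>
      <xenc:CipherData>
        <xenc:CipherValue>...</xenc:CipherValue>
      </xenc:CipherData>
    </xenc:EncryptedData>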
Example: Patient Medical Record

Listing 1 shows an example of the full XML document being used in the scenario at hand. In the scenario, various hospital staff will access information in this medical record:

•	Patient Judy's Physician can access the entire medical record — all the XML data in Listing 1.
•	Administrators can access only Judy's identity information — the content of the <Identity_Data> element.
•	University Researchers can access only Judy's physical data and medical conditions — the contents of the elements holding her physical data and her medical conditions.

Listing 1. Patient Judy's medical record. (The record identifies the patient as Judy of 29 Quinine Lane, female, age 39, and names Dr. Livingston as her physician; its medical-conditions entries record diagnoses and notes such as "Patient recovering", "No improvement", and "Condition is now stable.")
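A minimal sketch of what such a record could look like follows. It is not the chapter's Listing 1 verbatim; except for <Identity_Data>, <Medical_Conditions>, and the med prefix, which appear in the XACML listings later in the chapter, the element names and the namespace URI are illustrative.

    <med:Medical_Record xmlns:med="http://example.org/schemas/Hospital">
      <med:Patient>
        <med:Identity_Data>
          <med:Name>Judy</med:Name>
          <med:Address>29 Quinine Lane</med:Address>
        </med:Identity_Data>
        <med:Physical_Data>
          <med:Sex>Female</med:Sex>
          <med:Age>39</med:Age>
        </med:Physical_Data>
        <med:Physician>Dr. Livingston</med:Physician>
      </med:Patient>
      <med:Medical_Conditions>
        <med:Condition>
          <med:Note>Diagnosed ...</med:Note>
          <med:Note>Patient recovering</med:Note>
        </med:Condition>
        <med:Condition>
          <med:Note>Diagnosed ...</med:Note>
          <med:Note>No improvement</med:Note>
          <med:Note>Condition is now stable.</med:Note>
        </med:Condition>
      </med:Medical_Conditions>
    </med:Medical_Record>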
In the scenario, it is assumed that the XML instance shown in Listing 1 is located, along with many others, inside a patient medical record database. It is further assumed that the database is XML-aware, that is, it can use XML techniques for identifying and retrieving data from the XML instance.
Expressing and Enforcing Privacy Policies in XACML

XACML is an OASIS specification now in its second version. XACML, in conjunction with a security framework model, is an extremely robust language for expressing and enforcing distributed, flexible, and abstract access control policies. By "distributed," it is meant that different XACML policies within different organizations or organizational levels can be drawn together to determine whether a particular access request is to be allowed or not. By "flexible," it is meant that XACML enables rules and policies (sets of rules) to be specified, merged, and analyzed according to the unique needs of an organization. By "abstract," it is meant that XACML can express both data-level (e.g., "you can only modify this data node if you are an Administrator") and much more abstract
types of rules (e.g., "persons under 18 years old may only buy this product with their parent's consent"). XACML centers on the notions of:

•	Action — as defined by the XACML specification, an action is "an operation on a resource."
•	Resource — any object that can have an action performed on it.
•	Subject — a user or device performing an action. Note that a subject is NOT defined as the one whom the resource is about, though it is certainly possible for an XACML subject to be someone described in the resource if that person (or entity) is trying to perform an action on that resource.
•	Environment — any attributes that pertain to an authorization decision but are unrelated to the subjects, resources, or actions; for example, a calendar date (an action may be permitted only after that date).
There are three top-level XACML elements:

•	<PolicySet> — a set of policies and/or subsets of policies that, when evaluated against an authorization decision request, produce an authorization decision response for that policy set.
•	<Policy> — a set of related rules that, when evaluated against an authorization decision request, produce an authorization decision response for that policy.
•	<Rule> — a statement that indicates whether an action is permitted or not, given a set of subjects involved in the action and a resource. In addition, a <Rule> element may also contain conditions that go beyond the basic combination of subjects, resource, and action and that may further influence the outcome. XACML defines <Rule> as a top-level XACML element to make it easier to generate policies out of independently stored rules; XACML does not define the processing of rules in isolation.
In addition, there are two other important XACML elements:

•	<Target> — a tetrad of (a) one or more subjects, (b) a resource, (c) an action, and (d) environment attributes. When a <Target> element is directly under a <Policy> (or <PolicySet>) element, it defines to what targets the policy (or policy set) applies. When a <Target> element is directly under a <Rule> element, it defines to what targets the rule applies.
•	<Obligations> — obligations to be performed when an applicable policy results in an action being permitted or denied (each obligation specifies which policy effect requires the obligation's fulfillment). The <Obligations> construct is particularly important for systems implementing privacy policies such as, "If this citizen's record is updated, the citizen must be notified."
Policies

Figure 3 illustrates the structure of an XACML policy (<Policy> element). As illustrated, an XACML policy starts with a <Target> element, which defines what subjects, resources, and actions are covered by the policy. As a child of the <Policy> element, the <Target> element has no impact on the outcome of the policy; it only indicates whether the policy is applicable to a particular authorization decision request. Following the policy's <Target> element is a set of <Rule> elements. Each <Rule> element contains a <Target> element that indicates whether the rule is applicable to the authorization decision request being processed. If the rule is applicable, and any conditions specified in the (optional) <Condition> element are met, then the rule will result in either a "Permit" or "Deny" effect. The evaluation of the rules in a policy may result in more than one rule being applicable, and, of those applicable rules, there may be both "Permit" and "Deny" effects. These differing rule effects must be combined to form the outcome of the policy as a whole. XACML provides several in-built algorithms such as permit-overrides
Figure 3. Structure of an XACML policy (<Policy> element)
(any “Permit” effect takes precedence) and deny-overrides (any “Deny” effect takes precedence) but also lets policy designers implement their own rule combining algorithms. Finally, an XACML policy may conclude with a set of obligations contained in the element. A policy may specify both obligations that must be fulfilled upon a “Permit” effect and those that must be fulfilled on a “Deny” effect. An example of a privacy-related obligation stemming from a “Deny” effect might read (in human terms), “If an unsuccessful attempt was made to modify this citizen’s record, report the attempt to the auditor.”
Policy Sets

It is often useful to organize policies into one or more sets (perhaps using the same policy in different policy sets intended for different areas of applicability). To support this, XACML provides the <PolicySet> construct. The structure of an XACML policy set is shown in Figure 4. Like an XACML policy, an XACML policy set contains a target (to determine whether the policy set is applicable to an authorization decision request) and may contain a set of obligations. Unlike a policy, which contains rules, a policy set contains only policies and/or other policy sets. Upon evaluating an authorization decision request, some of the
Figure 4. Structure of an XACML policy set (<PolicySet> element)
applicable policies within a policy set may result in "Permit" effects and others in "Deny" effects. Similar to the rule-combining algorithm discussed previously, XACML enables a policy-combining algorithm to be specified in order that the overall effect of evaluating a set of policies, and the rules within those policies, can be determined. In this chapter, only a few aspects of XACML can be highlighted, and this brief description has necessitated some oversimplification. For a complete description of XACML, see the "eXtensible Access Control Markup Language (XACML) Version 2.0" (Moses, 2005).
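Before moving on, a skeleton of a policy set wrapping two policies may help fix the structure just described; the identifiers are illustrative, and only the shape matters here.

    <PolicySet xmlns="urn:oasis:names:tc:xacml:2.0:policy:schema:os"
               PolicySetId="urn:example:policyset:medical-records"
               PolicyCombiningAlgId="urn:oasis:names:tc:xacml:1.0:policy-combining-algorithm:permit-overrides">
      <Target>...</Target>
      <Policy PolicyId="urn:example:policy:hospital-staff"
              RuleCombiningAlgId="urn:oasis:names:tc:xacml:1.0:rule-combining-algorithm:permit-overrides">
        ...
      </Policy>
      <Policy PolicyId="urn:example:policy:non-hospital-staff"
              RuleCombiningAlgId="urn:oasis:names:tc:xacml:1.0:rule-combining-algorithm:permit-overrides">
        ...
      </Policy>
      <Obligations>...</Obligations>
    </PolicySet>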
XACML Support for Privacy

XACML is, as its name says, an "extensible access control" markup language. In the area of XML Security standards, the term "profile", when applied to a standard, means a way of using that standard to achieve a certain task. For the "eXtensible Access Control Markup Language (XACML)" (Moses, 2005) specification, one of its profiles is the "Privacy Policy Profile of XACML" (Moses, 2004), which defines standard extensions to XACML to facilitate its use in privacy applications. The extensions are two XACML attributes that enable a "purpose" to be ascribed to resources and to actions. When ascribed to a resource, the purpose attribute indicates for what the resource may be used; when ascribed to an action, it indicates why the action is being taken. For the action to be taken on a particular resource that has a purpose attribute attached to it, the action's purpose attribute must be compatible with the resource's purpose attribute. Listing 2 illustrates the general construct (adapted from the OASIS "Privacy Policy Profile of XACML" specification) for using these attributes in privacy applications. In Listing 2, the <Condition> element of the rule requires that the (XACML-defined) purpose attribute of the resource match the (XACML-defined) purpose attribute of the action in order for the rule to result in a "Permit" effect.
Listing 2. Construct for using XACML privacy-related attributes
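The construct is, in essence, a rule whose condition compares the two purpose attributes. A hedged sketch of such a rule follows; the rule identifier is illustrative, the purpose attribute identifiers follow the privacy profile's naming, and a simple string-equality comparison stands in for the more general matching the profile permits, so this is not the profile's construct verbatim.

    <Rule RuleId="urn:example:rule:purpose-match" Effect="Permit">
      <Condition>
        <Apply FunctionId="urn:oasis:names:tc:xacml:1.0:function:string-equal">
          <Apply FunctionId="urn:oasis:names:tc:xacml:1.0:function:string-one-and-only">
            <!-- The purpose ascribed to the resource -->
            <ResourceAttributeDesignator
                AttributeId="urn:oasis:names:tc:xacml:2.0:resource:purpose"
                DataType="http://www.w3.org/2001/XMLSchema#string"/>
          </Apply>
          <Apply FunctionId="urn:oasis:names:tc:xacml:1.0:function:string-one-and-only">
            <!-- The purpose declared for the action -->
            <ActionAttributeDesignator
                AttributeId="urn:oasis:names:tc:xacml:2.0:action:purpose"
                DataType="http://www.w3.org/2001/XMLSchema#string"/>
          </Apply>
        </Apply>
      </Condition>
    </Rule>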
Example: XACML Policy Set for Controlling Access to Patients' Medical Records

Listing 3 shows a simple XACML policy set for the scenario introduced earlier. It controls access by Physicians, Administrators, and Researchers to patients' medical records such as that illustrated in Listing 1. Though this particular policy set focuses only on the read action, XACML (being eXtensible) can support other types of actions associated with privacy. For example, a privacy policy may want to restrict the purposes for which data can be published. It should also be noted that in real applications, administrators of privacy policies would typically work with XACML through a human-friendly interface, not with the raw XACML shown in the listings. Not only do such tools simplify working with XACML, but they can also help to ensure that the actual policies are accurate, complete, and reliable.
Listing 3. XACML policy set for controlling access to patients' medical records

The policy set's <Target> element indicates that if the resource is a medical record and the action is to read it, then this policy set may be applied to determine the authorization decision response. The <Resources> element defines the applicable resources or resource types, and, similarly, the <Actions> element defines the applicable actions or action types. A <Target> element may also contain a <Subjects> element, which defines to which subjects the policy set or policy is applicable. Because this <Target> element does not have a <Subjects> element, it does not filter any subjects. This policy set is organized so that filtering of subjects (are they hospital staff or not hospital staff) is done at the level of the two contained policies.
Listing 3. cont.

The applicable resources are identified by the medical record schema, http://example.org/schemas/Hospital/Medical_Record.xsd, and the applicable action is read. The <Policy> element is XACML's top-level element for holding a policy. Its RuleCombiningAlgId attribute describes how to determine the effect of the rules within the policy; in this case, the value of the RuleCombiningAlgId attribute indicates that if any rule evaluates to a "Permit" decision, the effect of the policy will be to permit the request. The first policy, described as "This XACML policy controls hospital staff's read access to patients' medical records", applies to subjects identified by urn:example:Hospital:subject:Staff:true. Within that policy, the first rule ensures that the subject of the request is a Physician; the rule excludes Physicians who are not the patient's Physician from viewing the medical record. (Note: In reality, it would certainly be possible for an XACML policy to grant emergency doctors or consulting physicians access as well. This introductory example necessarily requires a limited scenario.)
Listing 3. cont.

A further rule in the "Hospital Staff" policy is described as "Hospital Administrators may only read the patient's identity data"; it applies to subjects identified by urn:example:Hospital:Staff:subject:role:Administrator and restricts the resource to /med:Medical_Record/med:Patient/med:Identity_Data. The second policy in the set is described as "This XACML policy controls non-hospital-Staff read access to patients' medical records"; its target matches subjects identified by urn:example:Hospital:subject:Staff:false. Its rule is described as "Researchers (those affiliated with the University) may, for the purpose of medical research and only for that purpose, read all of the medical record except the patient's name and address"; it applies to subjects identified by urn:example:Hospital:Non-Staff:subject:affiliation:University.
XACML in Action

So far, we have created a set of information (the patient medical record) and an XACML policy for specifying the privacy purpose requirements surrounding that information. Now, we are ready to see how an identity and access management (I&AM) system would work during actual e-service requests for information in that record. In the fictitious scenario, our forward-thinking hospital has, in addition to using XACML to protect patients' privacy, also introduced the most advanced specifications for integrating security infrastructures (SAML) and e-services (Web Services). It should be pointed out that XACML can still be used with legacy data, security, and e-services systems. For example, it would be perfectly possible to use XACML with purely relational databases, ASN.1 security, and pre-XML Web protocols. However, in order to show how the suite of modern XML-based e-service, security, and privacy technologies can work together, we focus on those in this chapter. Figure 1, which introduced the patient privacy scenario, includes four components of the XACML model:
•	PEP — The policy enforcement point, where a service request is intercepted and an authorization decision request is issued. The service request is only allowed to continue on to the enterprise application if there is a positive authorization decision response.
•	PDP — The policy decision point, where the authorization decision is made. The policy decision point will normally access one or more policies from policy administration points (PAPs) and other information, such as additional subject, resource, action, and environment attributes (from policy information points).
•	PIP — A policy information point, where information necessary to evaluate the policy is obtained. The Hospital I&AM infrastructure acts as a PIP because it provides information to the PDP regarding the purpose of the request (which it actually gets from the University I&AM infrastructure through SAML).
•	PAP — The policy administration point, where policies are managed and published.
Figure 5 focuses in on the PDP and the PEP to illustrate how XACML policies are processed in response to a service request. It introduces one more XACML component, the XACML context handler, which coordinates the PEP, the PDP, and the PIPs.

Figure 5. XACML policy being evaluated against an XACML context request

In Figure 5, these steps take place:

1.	A service request is intercepted by a policy enforcement point.
2.	The policy enforcement point analyzes the request and creates an authorization decision request (which may be in a proprietary format), which it sends to an XACML context handler.
3.	The XACML context handler creates an XACML context request out of the PEP's authorization decision request. The XACML context request can be thought of as an XACML-formatted version of the authorization decision request, enriched by using attribute names and values particular to its operating environment. The Attribute collection and Resource data collection pathways shown in Figure 5 highlight that the XACML context handler works closely with the PIPs to provide the information needed by the PDP to evaluate the request.
4.	The PDP receives the XACML context request and locates the appropriate policies by matching the subjects, resource, and action specified within the XACML context request with each policy target. As shown in Figure 3, an XACML target consists of one or more subjects (representing those associated with the action being performed on the resource), a resource, an action, and environment attributes. "Matching" need not, and often will not, be a matter of direct, one-to-one equivalence. XACML supports a varied, extensible approach for determining which policies and rules apply to a particular request. (Note: The PDP in Figure 5 could be shown containing a policy set containing multiple policies and policy subsets such as that depicted in Figure 4. For clarity, a single policy is shown.)
5.	Once an applicable policy has been found (one in which the policy target matches the request target), then a matching rule, or rules, must be found to determine the overall effect of the policy. Again, XACML supports varied, extensible approaches to how rules are matched and interpreted with respect to the overall effect of the policy. For example, an XACML policy might require that all rules result in a "Permit" effect for the policy itself to result in a "Permit" decision. Another XACML policy might result in a "Permit" decision if any particular rule with a "Permit" effect matches. Evaluating a matching rule may require information external to the PDP. As stated, the PDP works with the XACML context handler to collect any information needed to process policies.
6.	Once the applicable policies have been processed, the PDP forms an XACML context response indicating the authorization decision. The XACML context response is sent to the XACML context handler.
7.	The XACML context handler converts the XACML context response into the authorization decision response format required by the PEP.
8.	Upon receiving the authorization decision response (in this case, a "Permit" decision), the PEP allows the service request to proceed to, and be processed by, the enterprise application. Had a "Deny" decision been received, the PEP might have sent a fault to the service requestor (if the service interface supported and required faults to be sent). With reference to privacy policies, a "Deny" decision may result from the purpose of a service request not being aligned with acceptable uses of a resource (as defined by XACML purpose attributes).
Now, let us apply these steps to the medical record scenario of Figure 1, focusing on the Researcher because she represents the case with the most elaborate privacy issues. For the moment, assume the Researcher (named M. Curie) has been authenticated to the Hospital I&AM infrastructure and is using an application to query the treatments of medical conditions. Recall that the Researchers are not allowed to see patients' identifying data and must only access medical records for the purpose of research. In conducting her query, the Researcher's application sends a service request to the hospital's medical records database. The service request specifies:

•	That the requestor is a Researcher at the University, and
•	That the physical data and medical conditions section of Patient Judy's medical record is to be returned (it is understood that the Researcher does not know it is Patient Judy's medical record; to her it is simply a medical record selected from a search of female patients being treated for Tendonitis).
In order to form the XACML context request, the XACML context handler must obtain any necessary information that is not directly in the service request. Recall that Patient
Judy has consented to allow Researchers to access the medical conditions part of her patient record, but only for the purpose of research. Because, in our scenario, that information is not in the service request, when Researcher M. Curie's service request attempts to operate on Patient Judy's medical record, the XACML context handler will need to obtain that information about the purpose of the action. One way of obtaining the purpose would be through a SAML attribute request. If Researcher M. Curie has not already stipulated she is running the queries for the purpose of research, the attribute request could result in M. Curie's application notifying her that she must verify that she is running the query for research. Such verification could then be securely logged for future reference in privacy audits. With the necessary attribute information, the XACML context handler would form an XACML context request like the one in Listing 4. The XACML context request contains the following elements:

•	<Subject> specifies the subject as a Researcher from the University.
•	<Resource> specifies the <Medical_Conditions> element of Patient Judy's medical record as the resource.
•	<Action> specifies read as the action.
As the PDP evaluates the XACML context request (Listing 4) against the policy in Listing 3, the following steps occur:

1.	The XACML context request is evaluated against the policy set target, which indicates that the policy set applies to read operations on patient medical records. Because the resource and action specified in the XACML context request match the target of the policy set (which does not filter subjects), the PDP determines that the policy set is applicable to the request.
2.	The XACML context request is evaluated against the "Hospital Staff" policy target (which indicates the policy is only for hospital employees). As the subject of the XACML context request is not hospital staff, this policy is ignored by the PDP.
3.	The XACML context request is evaluated against the "Non-Hospital Staff" policy target, which indicates the policy is only for subjects outside the hospital. Because the subject of the XACML context request indicates she is one, the "Non-Hospital Staff" policy is deemed applicable, and so its rules are evaluated against the XACML context request.
4.	The target of the sole rule of the "Non-Hospital Staff" policy matches the subject and resource attributes specified in the XACML context request. As the rule contains a <Condition> element, the conditions therein must also be evaluated to determine the effect of the rule. The conditions enforce the privacy rule that the action on the resource (reading the data about Patient Judy's medical conditions) must be for the purpose of research. Once the PDP (through the XACML context handler), in conjunction with the Hospital I&AM infrastructure, has determined this to be true, the rule effect of "Permit" takes force. Because the "Non-Hospital Staff" policy's rule-combining algorithm and the policy set's policy-combining algorithm are both set as permit-overrides, the PDP creates an XACML context response, shown in Listing 5, stating a "Permit" decision.
5.	The XACML context handler transforms the XACML context response into a format pertinent to the policy enforcement point and forwards it.
6.	The PEP forwards the Researcher's service request to the Hospital Patient Medical Records database and retrieves Patient Judy's physical data and medical conditions information.

Listing 4. XACML context request for the medical record scenario. (The request identifies the subject by the values urn:example:Hospital:Non-Staff:subject:affiliation:University and urn:example:Hospital:Non-Staff:subject:role:Researcher, names the resource as the /med:Medical_Record/med:Medical_Conditions portion of Patient_Record__Judy.xml, and names read as the action.)
Listing 5. XACML context response for the medical record scenario (decision: Permit)
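A sketch of such a response in the XACML 2.0 context schema follows; the status code shown is the standard "ok" identifier, and the listing's exact markup may differ.

    <Response xmlns="urn:oasis:names:tc:xacml:2.0:context:schema:os">
      <Result>
        <Decision>Permit</Decision>
        <Status>
          <StatusCode Value="urn:oasis:names:tc:xacml:1.0:status:ok"/>
        </Status>
      </Result>
    </Response>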
SAML and Privacy

SAML, the Security Assertion Markup Language developed by OASIS, enables enterprise applications to share authentication, attribute, and authorization decision assertions. An authentication assertion testifies that the subject of the assertion has been securely identified. In the context of the current discussion, a user could identify himself on one system (an identity provider) and have an authentication assertion sent by that identity provider to another system (a service provider), allowing the user to procure services from the service provider's system just as if he had signed on to it directly. Attribute assertions provide attribute information about a subject, such as characteristics and preferences. And an authorization decision assertion indicates whether an action by a subject on a resource is authorized. A PEP could use the SAML authorization decision assertion protocol to obtain an authorization decision from another organization's PDP. Or, as described previously, a PDP could use SAML to obtain attributes about a user from another organization's I&AM infrastructure. SAML's enabling of authentication, attribute, and authorization decision sharing across trust domains makes it invaluable, both in its own right and as a companion to XACML, for facilitating the implementation of privacy policies.
Pseudonymous Identifiers

The latest version of SAML, "Assertions and Protocols for the OASIS Security Assertion Markup Language (SAML) V2.0" (Cantor, 2005; Madsen, 2005), incorporates techniques from earlier work by the Liberty Alliance Project (Simon, 2003) that allow a user to coordinate e-services of different organizations across trust domains for a particular task yet ensure that only the minimum amount of information about that user necessary to accomplish that e-service is shared among them. This aspect of SAML 2.0 is particularly valuable in applications such as consumers who may want to use a vendor's Web site to make an auxiliary purchase but who do not want the vendor to be tracking
their purchasing habits or collecting other, potentially personal data. As long as domain B trusts the assertions of domain A that entity X is qualified to do action Y or has certain attributes, then domain B need not know anything more about entity X. In the medical records scenario, the Researcher (whose identity is managed by the University) would be able to collect data from the Hospital’s patient medical records for research without divulging her identity to the Hospital. Through SAML, the Hospital is able to verify that the requestor of the patient data has been authenticated through the University and is a Researcher. Figure 6 illustrates how pseudonymous identifiers would be used in a slightly expanded version of the medical records scenario where the Researcher wants to use an e-service of Pharmaceutical Inc. to analyze the data collected from the hospital. In Figure 6, the Researcher signs on to the Hospital (a service provider), which, behind the scenes, obtains an authentication assertion from the University (acting as an identity provider). The assertion from the University assures the Hospital that the user signing on is an authentic University Researcher but does not need to divulge which one; it just specifies that the Hospital is to use the identifier “Rad_Chic” for the Researcher. In conducting her research through the Hospital’s e-services, the Researcher decides to use Pharmaceutical’s e-services, so she also signs on to the Pharmaceutical enterprise applications using a second authentication assertion from the University accomplished in the same way as with the Hospital except that the Pharmaceutical identifier assigned for the Researcher will be “RX_1867”. The Hospital and the Pharmaceutical can then perform services on behalf of the Researcher but cannot cross-link her identity because each has a different identifier for her. To use the pseudonymous identifiers within the service request, the sender-vouches technique described in the Web Services Security: SAML Token Profile specification
Figure 6. SAML — Using pseudonymous identifiers for privacy
(Hallam-Baker, 2004) could be employed. With this technique, the Web Services SOAP message is signed by an attesting entity (the sender), who also vouches for the subject’s identity through an associated (signed) reference to a SOAP security token. If that SOAP security token uses SAML 2.0’s opaque identifier functionality, the receiver would not be able to deduce the identity of the subject. The Patient Medical Record database of Figure 1 would then be accepting requests signed by a notary acting on behalf of the Researcher rather than the Researcher herself.
Communicating Consent

Besides pseudonymous identifiers, SAML also protects the privacy of its users through a Consent attribute that can be specified on SAML requests and responses. This mechanism allows federating parties, for example, to indicate that consent was obtained from a user during single sign-on. A service provider's privacy policy could then require that information may only be collected about a user if that user's consent has been so indicated. In Figure 6, the Researcher has, through her University acting as an identity provider, requested drug information from the Pharmaceutical organization. The Pharmaceutical organization may wish to gather information about the Researcher, such as the name of her University department. SAML's consent mechanism enables the Researcher to grant, or deny, that consent.
Example

Listing 6 and Listing 7 illustrate a SAML transaction whereby the Pharmaceutical queries the Researcher for the name of her department, and the Researcher responds. In Listing 6, the attribute query issued by the Pharmaceutical identifies the subject as "RX_1867", the pseudonym specified for the Researcher. Though the Pharmaceutical does not know the identity of the Researcher, it is interested in knowing her departmental affiliation and so requests the value of that attribute. The response to the attribute query contains an assertion, issued by the University, about the departmental affiliation of the Researcher — she is from the Department of Medicine. As in the attribute query of Listing 6, the Researcher is identified in Listing 7 through the pseudonym "RX_1867" used with the Pharmaceutical. It is important to note that the Consent attribute in the <Response> element indicates the response is sent with the consent of the subject (who is specifically the Researcher herself, not the University). Also note the audience restriction (<AudienceRestriction>) element by which the issuer (the University) identifies the intended recipients of the assertion information. The SAML specification details how to use XML Signature to sign SAML messages and assertions in order to assure their integrity and authenticity. For readability, the signatures are not shown in Listing 6 and Listing 7.
Listing 6. SAML attribute query by the Pharmaceutical (the querying party is identified as https://idp.example.org/Pharmaceutical)
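A hedged sketch of such an attribute query in SAML 2.0 follows; the message identifier, timestamp, and attribute name are illustrative, and the listing's exact markup may differ.

    <samlp:AttributeQuery xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"
                          xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"
                          ID="_query-1" Version="2.0" IssueInstant="2005-05-16T09:30:00Z">
      <saml:Issuer>https://idp.example.org/Pharmaceutical</saml:Issuer>
      <saml:Subject>
        <!-- The pseudonymous identifier used between the University and the Pharmaceutical -->
        <saml:NameID Format="urn:oasis:names:tc:SAML:2.0:nameid-format:persistent">RX_1867</saml:NameID>
      </saml:Subject>
      <!-- The attribute being requested; the attribute name is illustrative -->
      <saml:Attribute Name="Department"/>
    </samlp:AttributeQuery>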
Listing 7. SAML attribute assertion response by the Researcher (the assertion is issued by https://idp.example.org/University, is restricted to the audience http://sp.example.org/Pharmaceutical, and carries the attribute value "Department of Medicine")
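A corresponding hedged sketch of the response follows. The Consent value shown is SAML 2.0's standard "consent obtained" identifier; the message identifiers, timestamps, and attribute name are illustrative, and the listing's exact markup may differ.

    <samlp:Response xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"
                    xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"
                    ID="_response-1" InResponseTo="_query-1" Version="2.0"
                    IssueInstant="2005-05-16T09:30:05Z"
                    Consent="urn:oasis:names:tc:SAML:2.0:consent:obtained">
      <saml:Issuer>https://idp.example.org/University</saml:Issuer>
      <samlp:Status>
        <samlp:StatusCode Value="urn:oasis:names:tc:SAML:2.0:status:Success"/>
      </samlp:Status>
      <saml:Assertion ID="_assertion-1" Version="2.0" IssueInstant="2005-05-16T09:30:05Z">
        <saml:Issuer>https://idp.example.org/University</saml:Issuer>
        <saml:Subject>
          <saml:NameID Format="urn:oasis:names:tc:SAML:2.0:nameid-format:persistent">RX_1867</saml:NameID>
        </saml:Subject>
        <saml:Conditions>
          <saml:AudienceRestriction>
            <saml:Audience>http://sp.example.org/Pharmaceutical</saml:Audience>
          </saml:AudienceRestriction>
        </saml:Conditions>
        <saml:AttributeStatement>
          <saml:Attribute Name="Department">
            <saml:AttributeValue>Department of Medicine</saml:AttributeValue>
          </saml:Attribute>
        </saml:AttributeStatement>
      </saml:Assertion>
    </samlp:Response>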
Future Trends

At the beginning of this chapter, it was stated that new Web technologies such as Web Services and Service-Oriented Architecture (SOA) will make possible levels and types of e-services far beyond the sophistication of what is seen today. Both SAML and XACML are important technologies for realizing the next generation of e-services, not only because of their security capabilities for cross-domain authentication and authorization, but also because they can support the protection of individuals' privacy. The potential for linking various domains' e-services is powerful, yet it raises many questions regarding privacy. For example, not only might a different domain have a different privacy policy, but it may also be located in a different jurisdiction and thus be governed by different privacy laws (though it should be noted that many governments are working to ensure cross-border compatibility of their privacy legislation). In addition, technologies such as XACML and P3P (Platform for Privacy Preferences) can allow individual users to specify their privacy preferences, which could then influence the definition and/or the result of privacy policies. Determining the proper application of privacy laws and privacy policies across domains is not new in itself. What is new is how this may be done in an automatable manner now that technologies like SAML and XACML make it possible to exchange and process high-level, machine-readable, privacy-related information. As an example, a Web vendor's e-service may in turn require a type of e-service for credit checking. In order to find a particular instance of that service, it may turn to a registry to determine what instantiations of that type of credit-checking e-service are available. Besides the usual criteria such as price and features, an additional criterion can come into play, based on the content of the candidate e-service's privacy policy (e.g., how long client information will be retained) and how well it fits with the vendor's own privacy policy and the privacy preferences of its client. The challenge, then, is to explore how cross-enterprise (and perhaps cross-jurisdictional) e-services can be engineered so that the respective privacy requirements of all the privacy stakeholders can be satisfied in a fully automated manner.
Conclusions

XACML and SAML are important technologies for expressing and enforcing privacy policies in a world of e-services. XACML policies can specify what data is to be protected, who can access it, and what actions can be performed on it, and can require that actions be performed only for a limited set of purposes. SAML assertions can be used to authenticate users and provide attributes about them without revealing the full details of their identity. It is important to understand that while information security and privacy are connected, they are not the same — security tends to be the art of making an intentionally malicious act difficult to achieve, whereas privacy focuses on stating rules of behavior, assuming
(with the encouragement of audit trails and the law) that humans and applications will behave accordingly. For example, even though SAML pseudonymous identifiers prevent normal collusion among service providers, this feature would likely not thwart hackers who might use traffic analysis to cross-link network identities. XACML and SAML provide the foundations for privacy, but fully protecting privacy requires techniques beyond the scope of this chapter. The science of applying XACML and SAML security technologies to privacy issues is in its earliest stages. As products supporting XACML and SAML become ensconced in enterprise infrastructures, there will no doubt be further interest in exploring how they can be harnessed to support privacy protection. For example, with regard to XACML in particular, it will be important to ensure that enterprise-wide, and even inter-enterprise privacy policies can be efficiently managed and tested.
Acknowledgments

Thanks to Tim Moses of Entrust and Paul Madsen of NTT for their reviews and comments.
References

Cantor, S., Kemp, J., Philpott, R., & Maler, E. (2005). Assertions and protocols for the OASIS Security Assertion Markup Language (SAML) V2.0. Retrieved May 16, 2005, from http://docs.oasis-open.org/security/saml/v2.0/saml-core-2.0-os.pdf

Hallam-Baker, P., Kaler, C., Monzillo, R., & Nadalin, A. (2004). Web Services security: SAML token profile. Retrieved May 16, 2005, from http://docs.oasis-open.org/wss/oasis-wss-saml-token-profile-1.0.pdf

Madsen, P. (2005). SAML 2: The building blocks of federated identity. Retrieved May 16, 2005, from http://www.xml.com/pub/a/2005/01/12/saml2.html

Moses, T. (2004). Privacy policy profile of XACML. Retrieved May 16, 2005, from http://docs.oasis-open.org/xacml/access_control-xacml-2_0-privacy_profile-spec-cd01.pdf

Moses, T. (2005). eXtensible Access Control Markup Language (XACML) version 2.0. Retrieved May 16, 2005, from http://docs.oasis-open.org/xacml/access_control-xacml-2_0-core-spec-cd-04.pdf

Simon, E. (2003). The Liberty Alliance Project. In M. O'Neill (Ed.), Web Services security (pp. 203-226). New York: Osborne.
Section III: Privacy Protection Architectures and Other Privacy Topics
Chapter IX
Privacy Management Architectures for E-Services1 Larry Korba, National Research Council Canada, Canada Ronggong Song, National Research Council Canada, Canada George Yee, National Research Council Canada, Canada
Abstract

There have been a number of recent developments in architectures for privacy management. These architectures may be applied to the development of e-services. This chapter describes some driving forces and approaches for the development and deployment of a privacy architecture for e-services and reviews several architectures that have been proposed or developed for managing privacy. The chapter offers the reader a quick tour of ideas and building blocks for creating privacy-protection enabled e-services and describes several privacy information flow scenarios that can be applied in assessing any e-service privacy architecture. The chapter concludes with a summary of the work covered and a discussion of some outstanding issues in the application of privacy architectures to e-services.
Introduction

Before describing several different architectures for managing privacy, it is worthwhile to describe briefly the privacy and e-services landscape. This section outlines the context and general approach for privacy architecture development.
Background and Context

Over the past six years, major companies have used Web services (i.e., Internet-enabled services) and e-services (network-enabled services) interchangeably. For the purposes of this chapter we will use the term e-service to apply to either a Web service (non-standards-based Internet-enabled service) or a Web Service (XML-standards-based Internet-enabled service). E-services mean different things to technical people and business people. From the business context, e-services are described as an emerging paradigm that offers increased efficiency, enhanced services, and stronger customer relationships through Internet-enabled applications that are reusable and customizable to user needs. E-services may be applied to business-to-consumer or business-to-business situations. Moreover, the approach with e-services is to provide more value to customers. Adding value involves discerning what clients want. A service supplier may attempt to discern wants and needs through questionnaires or surveys, inferences from other data sources, or through direct requests from the consumer.

From the technical point of view, standards-based e-services refer to a set of programming standards that makes the interplay between different types of software over the Internet happen without human intervention. These standards include the eXtensible Markup Language (XML), the Simple Object Access Protocol (SOAP), the Web Services Description Language (WSDL), and a variety of other Web services definition languages. Middleware is built around these standards to support delivering technology to a customer over the Internet. For the purposes of this chapter, we can simply define an e-service as a service or resource made available on the Internet.

The value service providers offer to customers in recent times stems from increasingly personalized services. Personalized services are selected on the basis of the needs and desires of clients. These are often directly associated with the name and other personally identifiable information associated with the customer. In fact, in order to determine possible follow-on services in which a client may be interested, a service provider may resort to data mining from many different sources, collecting or inferring information about a client that may be quite personal. Considering the acceleration in technology development in support of deploying new services, the growing variety of services being developed, and the underlying approach of compiling, storing, and analyzing information about users in an attempt to increase service value, it is clear that there are significant pressures on privacy. The pressures to build service applications rapidly to meet the new revenue opportunities also lead to questions regarding the implementation of security technology in support of privacy functions.

It is important to understand that the concept of privacy from the legal perspective is in disarray (Soslove, 2002). Without a consistent definition of privacy, adjudication and law making
236 Korba, Song & Yee
do not fare well against the concrete and competing interests of other parties. Similarly, attempting to build privacy technologies based upon legal compliance is tantamount to building a product without appropriate requirements specifications. In fact, researchers and developers today often base their concepts and developments on core ideals related to privacy. Privacy principles, such as those compiled by the Canadian Standards Association (CSA), provide some general guidelines that have been used to form the basis of some technology developments. Yet there are several aspects other than legislation that lead to determining the technology and procedures to be put in place for e-service privacy. At this point it is worthwhile to examine the contributing factors to the need for privacy from the organizational perspective. Shaping users' or citizens' attitudes toward privacy are four items (Figure 1):
•	Most organizations have organizational policies for dealing with personally identifiable information. In this case, organizations may include government, private sector, and not-for-profit corporations. The policy may be based upon the organization's philosophy towards its business or clients. The policy may also reflect legal requirements, requirements based upon the organization's business model, or requirements of its partners.
•	Legislation also affects privacy attitudes. If privacy is held in high regard in a country, its laws and the emphasis on compliance with those laws will be of a high standard and will influence organizational policies and social norms as well as overall privacy attitudes.
•	Social norms add another dimension to privacy attitudes. Social norms are modulated by circumstances. For instance, when users are not identified they tend to be freer with their personal information (Cranor, 1999).
•	By technology we mean all sorts of technology in our environment. In combination with where and how it is deployed or used, technology has an effect on privacy attitudes. For instance, cameras located in public areas along with notices of camera surveillance may lead to a reduction in shoplifting. However, a camera connected to a home computer being used by someone who is at ease with the technology may lead to exhibitionist behavior!

Figure 1. Citizens' attitudes towards privacy stem from four driving forces: corporate policy, legislation, social norms, and technology

Clearly, there is interplay between all of the factors that drive privacy attitudes. Social norms, for instance, drive legislation, corporate policy, and technology development. Technology may be regulated by legislation, metered by organizational policy, and driven by social norms. Context has another role to play in the perception of what is private (Cranor, Reagle, & Ackerman, 1999) and may be reasoned about in different ways (Lessig, 1998). Within an organization, privacy technologies are developed and deployed based upon corporate policy that is developed and modulated by competitive pressures, consumer expectations, technology and legislation, and the result of any litigation regarding how other organizations have dealt with private data (Figure 2).
•	Technologies in Figure 2 refer to all technologies, but in particular those that may pose a threat to the privacy of personal data (e.g., insecure databases, insecure protocols that may be used for an e-service, scanning and sniffing software, radio frequency identification tags, etc.).
•	Competitive pressures arise mainly due to influences in a business environment wherein services are offered at increasingly competitive prices, or at the same and/or lower prices but with increased service levels.
•	Consumer expectations drive what an e-service may offer as well as how the service provider is expected to deal with personal data.
•	Service providers may be covered under certain legislation regarding privacy aspects of the services they offer and the nature of the data involved. Litigation and settlements with respect to privacy-related disputes drive corporate privacy policy. Procedures prescribing how data must be dealt with may be driven by pertinent legislation and the threat of litigation.
Figure 2. Corporate drivers for privacy technologies
An organization’s technology requirements for maintaining its privacy policy may vary greatly from service to service. While there may be potential legal requirements for implementing privacy approaches, organizations take several steps to be in a position to develop and implement privacy enhanced processes. These steps are described in Table 1.
Table 1. Privacy-sensitive approach toward service implementation

Step 1. Determine organizational roles and responsibilities
Description/Notes: For the service to be provided, determine organizational roles and responsibilities.
Involvement: Legal, executive management, government, privacy officer.

Step 2. Develop organizational privacy policy and security policies
Description/Notes: The policies will evolve over time, with changes in the organization, services provided, feedback from customers, and legislative and technology changes.
Involvement: Legal, management, IT staff, privacy officer.

Step 3. Educate/inform staff and outsiders
Description/Notes: This step would have started with the results of step 1. Within the organization, at all levels, via written, electronic, and oral communication, inform and educate the staff about the roles of those responsible for privacy and security and about the security and privacy policies themselves.
Involvement: All staff.

Step 4. Thoroughly understand the service to be deployed, especially regarding data collection, processing, and storage
Description/Notes: Data collected:
•	Determining the data to be collected: Does it relate to, or is it linked with, the identity of people using the service? In this analysis the objective is to minimize the amount of data collected, thus minimizing potential privacy exposures.
•	May the user select what is to be collected?
•	Why is the data needed? Some data must be collected to provide the service; other data may be used for logging or tracking purposes.
•	How long is the data required?
•	Will there be a pseudonymous or anonymous service?
•	How and where will personal data be stored?
•	What sort of logging will the service application use, and is it possible to discern personally identifiable information from log entries?
Involvement: Service architect, in consultation with legal, management, privacy officer.

Step 5. Develop a privacy policy for the service
Description/Notes: Based upon organizational policy, legal or regulatory requirements, and requirements for the service and business arrangements related to the service, develop a privacy policy.
Involvement: Privacy officer, service architect, legal, management.

Step 6. Technical design and implementation of the service
Description/Notes: Items to consider:
•	Privacy policy disclosure: How will it be revealed to the user?
•	User interfaces for the service need to be carefully designed and tested to build the trust levels of the user.
•	Test it well.
Involvement: Service architect, programming staff.

Step 7. Privacy impact assessment
Description/Notes: It is advisable to have a qualified external professional perform a privacy impact assessment, because of the experience and objectivity such a company brings. It is worthwhile having an assessment done at early stages of the design to lessen the chance of privacy-impaired design.
Involvement: Privacy officer, consulting professional, or internal staff with appropriate experience and latitude to perform the assessment.

Step 8. Launch service
Description/Notes: Monitor feedback from clients regarding privacy issues. Correct/improve procedures and/or the application to deal effectively with feedback.
Involvement: Service architect, programming staff, privacy officer.

Step 9. Improve service
Description/Notes: Based upon feedback from clients or technical assessments made of the service application and support networks or hardware. Steps 7, 8, and 9 may iterate, representing new levels of services or improvements coming on stream.
Involvement: Architect, programming staff, privacy officer, other IT staff, management, marketing.

Note: This list does not cover all aspects of a service, only those pertaining to privacy.
Privacy Architectures for E-Services

Having covered background information regarding privacy and e-services and an approach for deployment of e-services, we now describe several different technologies in various states of development to provide an overview of privacy-enhancing technologies for e-services. The reader may read further about broader guidelines for e-service or Web service architectures elsewhere (WSA, 2004). The privacy technologies covered include: IBM's Enterprise Privacy Architecture (EPA); CONFAB, a system targeting pervasive services; a description of privacy features in the Liberty Alliance ID-Web Services Framework (Liberty Alliance-1, -2, -3); and an approach for using digital rights management to meet the needs expressed in the privacy principles. EPA is an architecture that is used by IBM in the development of privacy services for its clients. CONFAB is a research project that uses a policy-based approach for privacy management in pervasive applications. This work is of particular interest in the development of future e-services that would respect privacy preferences in an environment where a great deal of personal information, including location, may be gathered and stored in many devices
240 Korba, Song & Yee
with different ownership. The description of the Liberty Alliance ID Web Services Framework provides a real-world example of privacy management for a Web service, whereas the digital rights management approach for privacy management describes procedures for handling several different situations dealing with personal data that can be applied for many different situations.
IBM Enterprise Privacy Architecture

The objectives for developing the IBM Enterprise Privacy Architecture were to help organizations understand how privacy impacts business processes and to maximize e-business trust. Based on privacy best practices and business requirements, EPA maps players, rules, and data to new or existing business processes by using object-oriented methods. This approach was intended to help organizations minimize the risks of inadvertent privacy disclosures by showing them where personally identifiable information (PII) is stored in their enterprise and how to effectively manage it. In order to introduce privacy-awareness and privacy services into enterprises in a systematic and complete way, IBM EPA is structured in four building blocks: the management infrastructure, the privacy agreement framework, the technical architecture, and the privacy regulation analysis. The management infrastructure enables an enterprise to define its privacy strategy (e.g., embedding business best practices or rules into privacy policy), the general controls to enforce the privacy policy (e.g., supporting and ensuring general policy compliance), and the privacy practices to translate the privacy policy into its business processes. The privacy agreement framework provides a methodology for embedding the privacy policy into business processes, mapping the privacy parties, rules, and data, and minimizing the risks of inadvertent privacy disclosures. The technical architecture defines the supporting technology for implementing the required privacy services. The privacy regulation analysis identifies the applicable regulations. Figure 3 depicts the building blocks of the IBM EPA (Karjoth, Schunter, & Waidner, 2002).
Figure 3. Building blocks of the IBM Enterprise Privacy Architecture
Management Infrastructure

The EPA management infrastructure is the tip of the EPA pyramid. It carries an enterprise privacy strategy through a comprehensive privacy management program down to the implementation of privacy practices. The management infrastructure consists of three components: strategy, controls, and practices. Figure 4 depicts these components and their relationships (Brown, 2003).
• Strategy: This defines the high-level privacy and security policies and generates the privacy and security strategies for an enterprise.
• Controls: This defines the general controls to enforce the policies. The controls include a privacy requirements process, an information classification and control program, a compliance process, a definition of the organizational roles and responsibilities, and an employee education program.
• Practices: This incorporates and translates the policies into an enterprise's business processes. The practices include an external communication program, a privacy statement, a customer preference process, an individual access process, a contact & dispute process, information access controls, and a data retention management program.
Figure 4. Components and their relationships of the EPA management infrastructure
Privacy Agreement Framework

The privacy agreement framework provides privacy management for privacy-enabling business processes at the transaction level. The processes connect the individual to the enterprise; map data collection, storage, uses, disclosures, and retention; minimize risk; and optimize the personal information (PI) handling processes by limiting collection, use, and disclosure according to a risk analysis of the threats and vulnerabilities. The framework consists of three major models: players, data, and rules. Figure 5 depicts a process model for optimizing PI handling processes for privacy (Brown, 2003).
• Players: The players are the interacting entities in the data collection processing. They are data subjects or users.
• Data: This model identifies the data required for the collection processes. Based on privacy-enhancing technologies (Goldberg, Wagner, & Brewer, 1997; Lysyanskaya, Rivest, Sahai, & Wolf, 2000; Pfitzmann & Kohntopp, 2000), the data can be categorized into three classes for privacy protection: PII, de-identified information, and anonymized information. PII is the most sensitive personal information and can be linked to a real-world identity, for example, through a social security number. De-identified information is information in which the identifying data has been replaced by a pseudonym. Anonymized information is the least sensitive class and is obtained by removing all identifying personal data. (A sketch contrasting the three classes follows this list.)
• Rules: This model identifies the rules for the data usage.
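To make the three data classes concrete, the following minimal sketch shows how a single customer record might appear in each form. The XML vocabulary and the values are invented for illustration and are not part of the EPA specification.

    <!-- PII: linkable to a real-world identity -->
    <record>
      <name>Alice Smith</name>
      <ssn>123-45-6789</ssn>
      <purchaseHistory>heart medication</purchaseHistory>
    </record>

    <!-- De-identified: identifying fields replaced by a pseudonym -->
    <record>
      <pseudonym>customer-7F3A</pseudonym>
      <purchaseHistory>heart medication</purchaseHistory>
    </record>

    <!-- Anonymized: all identifying data removed -->
    <record>
      <purchaseHistory>heart medication</purchaseHistory>
    </record>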
Technical Architecture

The technical architecture provides the necessary supporting technologies to ensure that an enterprise provides sufficient privacy protection to its customers. These technologies include a policy management system, a privacy enforcement system, an audit console, and others.
Figure 5. Process for optimizing PI handling processes for privacy

Figure 6. IBM EPA technical architecture
Figure 6 depicts the IBM EPA technical architecture (Brown, 2003). The major components are:
• Policy management system: The policy management system enables system administrators to define, change, and update policies, and it assigns the policies to the privacy enforcement system.
• Privacy enforcement system: The privacy enforcement system enforces privacy protection for all personal data based on the policies it obtains from the policy management system and supplies auditing data to the audit console. The privacy decision manager and the privacy-enabling technologies are its two major components. The privacy-enabling technologies allow the enterprise to uphold fair information practices for its customers. Furthermore, the privacy statements/policies can be formalized using P3P and enforced directly within enterprise applications using IBM's Enterprise Privacy Authorization Language (EPAL).
• Audit console: The audit console enables the privacy officers to review the policies and audit information.
Privacy Regulation Analysis

EPA privacy regulation analysis uses regulatory summary tables and regulation rule tables to address regulatory compliance challenges. The regulatory summary tables summarize the applicable regulations using a unified terminology. The regulation rule tables describe the specific enterprise-based regulation rules in a more formal style, for instance, an entry describing which party can access which type of data, with a reference to the underlying legal regulation.

With privacy-friendly business processes and privacy-enabling security technology, IBM EPA provides a methodology for enterprises to enhance privacy protection for their customers. Its advantages include providing a well-defined level of privacy to customers, protecting the customer's data from privacy violations by regular employees, systems, or others, maximizing the business use of personal data for an enterprise, and respecting privacy regulations. In addition, the IBM EPA has integrated some new (e.g., EPAL) and existing (e.g., P3P) privacy-enhancing technologies into the system for defining enterprise privacy practices.

IBM now uses EPA in its privacy technology consultation practice. Further research and development is underway to create new privacy technologies related to pseudonym-credential practice for identity protection and to privacy regulation analysis for privacy law/principles translation. The architecture assumes that customers must trust the privacy administrators and privacy enforcement systems of the enterprise. Moreover, IBM EPA also appears to be the company's business model for targeting opportunities in its consulting services division for managing privacy for organizations (IBM, 2001).
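As a rough illustration of the kind of entry a regulation rule table might formalize, the following EPAL-style sketch expresses one enterprise rule. The element layout loosely follows EPAL; the category identifiers and the regulation reference are invented for this example and are not taken from the EPA documentation.

    <rule id="billing-access" ruling="allow">
      <user-category refid="billing-clerk"/>
      <data-category refid="customer-contact-info"/>
      <purpose refid="invoicing"/>
      <action refid="read"/>
      <obligation refid="delete-after-90-days"/>
      <!-- reference to the legal regulation motivating this rule -->
      <regulation-ref>EU Directive 95/46/EC, Art. 6</regulation-ref>
    </rule>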
Privacy Architectures for Ubiquitous Applications and Privacy Policy Compliance

In this section, we examine two previous works that deal with privacy architectures for e-services. Hong and Landay (2004) provide a toolkit, Confab, for facilitating the development of privacy-sensitive ubiquitous computing applications. Yee and Korba (July 2004) propose an architecture for a privacy policy compliance system that operates within every service provider to ensure conformance to a user's privacy policy. We summarize the key results of each work and then compare the approaches under several headings.
Confab: Privacy for Ubiquitous Computing Applications

Hong and Landay address the difficulty of designing ubiquitous software applications that are privacy-sensitive or that help the user to manage his or her privacy (Hong & Landay, 2004). Their solution is a toolkit with embedded data and programming models that developers can use to build such applications. Summarized here are the privacy requirements for ubiquitous applications that they obtained through surveys of end users and application developers.
End-User Privacy Requirements
• Clear value proposition: An upfront value proposition that leaves no doubt as to what benefits are offered and what personal information is needed to offer those benefits;
• Simple and appropriate control and feedback: Simple control over and feedback about who can see what information about the end user;
• Plausible deniability: Addresses a social need to avoid potentially embarrassing situations, undesired intrusions, and unwanted social obligations, for example, a person answers with a white lie when asked on the phone what they are doing;
• Limited retention of data: Addresses concerns over long-term retention of personal data that can lead to unforeseen and unwanted use of the data;
• Decentralized control: Addresses the fear that personal data is stored on a central computer over which the end user has very little practical control;
• Special exceptions for emergencies: The idea that in emergency or crisis situations, safety far outweighs privacy needs, for example, disclosing personal health information in return for treatment in an emergency.
Application Developer Privacy Requirements
• Support for optimistic, pessimistic, and mixed-initiative applications: In pessimistic applications, end users set up preferences beforehand and place strict constraints on when personal information can flow to others; optimistic applications allow greater access to personal data but make it easier to detect abuses after the fact with logs and notifications; in mixed-initiative applications, the end user is interrupted when there is a request for personal information and he or she must decide on the spot whether to allow it;
• Tagging of personal information: Marking personal information with privacy preferences, for example, whether forwarding is allowed or the amount of time to retain the information;
• Mechanisms to control the access, flow, and retention of personal information (quantity): Controlling the quantity of information disclosed to others, for example, only people in the same building as myself can see my location, or colleagues can see my location between 9 AM and 5 PM;
• Mechanisms to control the precision of personal information disclosed (quality): Granular control over the precision of disclosures (the quality of disclosures), for example, giving one's location as "123 Main Street" or "Ottawa";
• Logging: For both clients and servers; for clients, logs facilitate understanding who is accessing what data; for servers, logs facilitate service audits to ensure that the clients' personal data is handled properly.
Hong and Landay summarize these requirements into four high-level requirements as follows:
• "A decentralized architecture, where as much personal information about an end user is captured, stored, and processed on local devices owned by that end user";
• "A range of mechanisms for control and feedback by end users over the access, flow, and retention of personal information, to support the development of pessimistic, optimistic, and mixed-initiative applications";
• "A level of plausible deniability built-in";
• "Special exceptions for emergencies".
The authors illustrate the kind of applications that they wish to support with their toolkit by using two scenarios: a Find Friend scenario and a Mobile Tour Guide scenario. In the Find Friend scenario, employees can use a server to share their location information with one another. Employees choose to upload updates to the server at different levels, for example, room level or floor level. The server is set up to notify a person whenever his or her location is queried and to accept queries only if the requestor is physically in the same building. In the Mobile Tour Guide scenario, a person visiting a city for the first time uses the Guide in conjunction with his or her location-enabled device to obtain a location-enhanced tour. The Guide can provide different levels of service depending on the level of location detail the person shares. For example, if the location information is at the city level, the Guide can provide information on calendar events or the length of lines at major venues such as museums, whereas if the location information is at the neighborhood level, the Guide can additionally include information on interesting shops and other nearby points of interest.

To answer these requirements, Hong and Landay devised the Confab toolkit with data and programming models that facilitate the design of privacy-sensitive ubiquitous applications. We next examine these models. Confab's data model makes use of "infospaces" that are assigned to people, places, things, and services in order to represent contextual information. An infospace is a network-addressable logical storage unit that stores context data about the entity to which it is assigned (see Figure 7). For example, Alice's infospace contains context information on her health, location, and activity. The following points apply to infospaces:
• Infospaces can be populated by sources of context data such as sensors.
• Individuals can specify privacy preferences for how their infospaces handle access control and flow.
• Infospaces are managed by infospace servers.
• Applications can retrieve and manipulate infospace data to accomplish context-aware tasks.
Figure 7. Ovals represent infospaces about a person, a place, or a thing; squares represent tuples of contextual information associated with infospaces

The basic unit of storage in an infospace is the "context tuple." Tuples can represent an attribute about an entity (e.g., a person's age), a relationship between two entities (e.g., a person is in a room), static pieces of contextual information (e.g.,
an email address), or dynamic contextual information (e.g., a person's location). Tuples can optionally carry a "privacy tag" that gives hints from the end user on how that tuple should be used when it flows to another computer outside the end user's control (e.g., when the tuple should be deleted).

Confab's programming model consists of methods and operators. Infospaces support two kinds of methods: "in" and "out." In methods, which include add and remove, determine the data stored in an infospace. Out methods affect the data leaving an infospace and include query, subscribe, unsubscribe, and notify. Infospaces also support operators for manipulating tuples. There are three types of operators: in, out, and on. In operators run on all tuples coming in through in methods, for example, checking the infospace's access control policies to make sure a tuple can be added. Out operators run on all tuples going out through out methods, for example, blocking tuples if the end user is in "invisible mode" (the end user does not want to give any information out). On operators run periodically, for example, garbage collection. Table 2 lists the operators of each type.

The interactive operator (Table 2) gives the end user control over disclosures by displaying a simple GUI that allows the user to choose between disclosing the information just this once, ignoring the request, or permanently denying access. The coalesce operator deletes tuples with repeated values; for example, location tuples can be duplicated if the person stays at a particular location over some period, so the coalesce operator sorts the location tuples by time and deletes the tuples with duplicate values.

Operators are loaded through a configuration file on startup and execute in the sequence in which they were added. Each operator in addition has a filter that checks whether it should be run on a specific tuple. Once an in or out method is called, a sequence of the appropriate operators is put together and run on the set of incoming or outgoing tuples.
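The following sketch suggests what a context tuple carrying a privacy tag might look like. The element and attribute names are assumptions made for illustration; they are not Confab's actual schema.

    <contextTuple entity="alice" attribute="location"
                  value="Room 525, Main Building"
                  timestamp="2005-11-14T10:32:00">
      <privacyTag>
        <retention maxAge="PT24H"/>    <!-- hint: delete after 24 hours -->
        <forwarding allowed="false"/>  <!-- hint: do not pass on to others -->
        <notify target="alice"/>       <!-- hint: tell the owner on disclosure -->
      </privacyTag>
    </contextTuple>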
Table 2. Confab operators

Operator type | Description
In            | Enforce access policies; Enforce privacy tags; Notify on incoming data
Out           | Enforce access policies; Enforce privacy tags; Notify on outgoing data; Invisible mode; Add privacy tag; Interactive
On            | Garbage collector; Periodic report; Coalesce
Confab’s programming model also supports service descriptions and active properties objects. Service descriptions are published by applications and provide basic information about the service as well as describe service options on features and what data types and data formats are needed from the user. A client application making a request on an infospace would first send its service description, for which the infospace can use a previously stored configuration if it has seen the service before, or display a default GUI for the user to choose whether to allow access, choose options, and indicate how long the settings should last. An active properties object simplifies the task of querying for and maintaining context state in applications. Queries can be placed in an active properties instance and be periodically executed for up-to-date values. We conclude our summary of Hong and Landay by describing one of the applications they built using Confab. They call this application “Lemming,” a new location-enhanced instant messenger client that provides two novel features. The first novel feature is the ability to request a user’s current location so that when the request is received, the end user can choose “never allow,” “ignore for now,” “just this once,” or “allow if…” to allow requests under certain conditions. When a location request is received, the end user’s instant messenger client issues a query to the user’s infospace for the user’s current location. The infospace checks to see if there is a context tuple representing location information, and then checks the age of the tuple to see if it is “current” (20 minutes by default). If the location tuple exists, it next flows through the out operators defined in the infospace. Three operators are of interest here: the Enforce Access Policies, the Interactive, and the MiniGIS operators. The enforcement operator checks if there is an existing policy associated with the end user and applies the policy if it exists. The Interactive operator also checks for the policy and displays a GUI to let the end user set a policy if the policy does not exist. The MiniGIS operator converts the data from latitude/ longitude to a place name. The second novel feature involves the ability to automatically display a current location as an away message. The message can automatically update itself as the end user’s location changes. The instant messenger client sets up a query to retrieve the location every 60 seconds, and then displays this location in the away message. Copyright © 2006, Idea Group Inc. Copying or distributing in print or electronic forms without written permission of Idea Group Inc. is prohibited.
The authors summarize the advantages of their work as follows: the toolkit offers an extendable suite of mechanisms for managing privacy, and personal information is captured, stored, and processed on computers owned by the user.
Privacy Policy Compliance System

Yee and Korba examine how an e-service client can be assured that the e-service provider with whom he or she is interacting complies with his or her privacy policy (Yee & Korba, July 2004). Underlying this is an e-services transaction model in which an e-service client and the corresponding provider each have a privacy policy that specifies their separate privacy preferences. The client's privacy policy specifies what private information the client is willing to give up and the conditions for access to the information (e.g., the provider can only have access during weekdays). The provider's privacy policy specifies what private information the service requires and the conditions that govern the provider's access to the information (e.g., access is needed every day of the week). The e-service can only be engaged if the client's privacy policy matches the provider's privacy policy. The authors' previous work considers policy negotiation (needed when there is no policy match) (Yee & Korba, January 2003; May 2003), how privacy policies can be semi-automatically generated (Yee & Korba, 2004), and how a match can be determined (Yee & Korba, 2005). Here, however, we are interested in their work on privacy policy compliance.

Yee and Korba's approach to the problem is to design a privacy policy compliance system (PPCS) with an embedded private data controller. Essentially, the PPCS intercepts the user's data and ensures that processing of the data complies with the client's privacy policy. Prior to presenting the design of the PPCS, the authors derive requirements for the PPCS based on Canadian privacy legislation. These requirements are summarized as follows:

Requirements for the PPCS
• Clear purpose: For each purpose for which private information is collected, the PPCS must provide clients with an explanation of what information is necessary in order to accomplish the purpose;
• Limiting use, disclosure, and retention: For each purpose for which private information is collected, the PPCS must provide clients with an explanation of how it intends to use or disclose the client's private data; in addition, the PPCS must ensure that all copies (including copies disclosed to other parties) of the client's private information are deleted at the earliest of (a) the time when the data is no longer needed for the fulfillment of the purpose, or (b) the expiration of the data's retention time;
• Accuracy: The PPCS must provide a facility with which clients can access, check the accuracy of, update, and add to their private data as necessary for the corresponding purposes;
• Openness: Upon request, the PPCS must display the provider's specific information about its policies and practices relating to the management of private information;
• Individual access: Upon a client's request, the PPCS must inform the client of the existence, use, and disclosure of his or her personal information and give him or her access to that information;
• Challenging compliance: Upon request, the PPCS must allow the client or the client's designate to review the secure log (all PPCS actions are securely logged) to verify compliance with his or her privacy policy;
• Safeguards: The PPCS must have appropriate security safeguards in place to protect the client's private information from unwanted disclosure.
To satisfy these requirements, the authors propose the architecture shown in Figure 8. We now describe each of its components.

Figure 8. PPCS architecture

The Web interface provides a UI for interactions with the client, the client's designate, or any Internet user (for checking provider information requirements for specific purposes). The Web interface also establishes a secure channel to the client or client designate and authenticates them.

The privacy controller controls the flow of provider and client information and requests in order to fulfill the client's privacy policy; its specific actions include: (a) make log entries, (b) delete private information upon completion of purpose or information expiry, (c) grant access for client update of private information (including the update of information that has been provided to third-party data processors), (d) grant access for the examination of logs and comparisons of information, and (e) upon request, inform the client of the existence, use, and disclosure of his or her private information.

The database access component provides read/write access to the databases as requested by the privacy controller and handles security protection for the databases.
The private data import/export component sends private information disclosures to other providers, receives private information disclosures from other providers, sets up secure channels to other providers for sending information disclosures, and authenticates those providers.

Three databases store information belonging to the provider, the client, and the system (logs). Provider information includes provider privacy policies, purposes, and so on. Client information includes client privacy policies and clients' personal information. Logs include entries for PPCS-client actions such as information collection, information use and disclosure, information access and update, and information deletion. Finally, the service processes represent the services offered; the arrow going out of these processes indicates private information collected by the services, and the arrow going in represents private information required by the services.

Yee and Korba point out how parties who have received private information disclosures can be expected to delete the information upon completion of purpose or information expiry: "Such parties are considered to be subcontractor providers of the first provider and provide services to the first provider that are needed to complete the purposes of the first provider. In this case, the first provider is actually a consumer. As a consumer, the first provider has negotiated a consumer privacy policy with each subcontractor provider, containing the required purposes and information retention times reflecting the wishes of the original consumer. The PPCS of each subcontractor provider then deletes the original consumer's private information upon completion of the purposes in the privacy policy agreed with the first provider or upon information expiry."

The use of PPCSs to ensure privacy policy compliance is thus a distributed approach, in which the PPCSs communicate among themselves to share information (via the private data import/export component). Each provider is expected to have one or more PPCSs, depending on how many services it offers. This situation is depicted in Figure 9 (clients not shown).
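To make the matching and enforcement discussion concrete, the following is a minimal sketch of the kind of client privacy policy a PPCS might enforce. The XML vocabulary is invented for this illustration and is not the policy format defined by Yee and Korba.

    <clientPrivacyPolicy owner="client-4711" provider="BooksOnline.example">
      <item>
        <what>shipping address</what>
        <purpose>order delivery</purpose>
        <retentionTime>P6M</retentionTime>         <!-- delete after six months -->
        <discloseTo>courier subcontractor</discloseTo>
        <condition>provider access on weekdays only</condition>
      </item>
    </clientPrivacyPolicy>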
Figure 9. Distributed nature of PPCSs: Here, Provider 3 discloses private information to Provider 1; Provider 5 discloses private data to Providers 2 and 4
Comparison of Confab and PPCS Approaches

We compare the two approaches above using the following nine headings; the comparison itself is given in Table 3.
• Application area: The type of e-services targeted by the approach.
• Effectiveness at preserving privacy — general: How effective is the approach at preserving user privacy? Is it foolproof?
• Effectiveness at preserving privacy — disclosures: How effective is the approach at preserving user privacy for user data that is disclosed to a third party? Is it foolproof?
• Method for assuring clients: How are users assured that their privacy has been preserved? What gives them the confidence?
• Scalability: Is the approach scalable?
• System security: Are any components of the implementation vulnerable to attack?
• Validation: Has the approach been validated through testing with a prototype?
• Ease of implementation: How easy is it to implement the system? Does it require further research? Are there already implementations or prototypes?
• Costs: How expensive is it to implement the system? Is the expense comparable to the expense of implementing similar software? Do the costs make business sense?

Table 3. Comparison of approaches for preserving user privacy

Application area
  Hong and Landay: Ubiquitous application software, for example, a find friend service or a mobile tour guide service; payment for service may not be the first priority.
  Yee and Korba: Internet-based e-business, for example, Amazon.com or Futureshop.ca; payment for service is the first priority; the approach may be applied to distributed e-services.

Effectiveness at preserving privacy — general
  Hong and Landay: Fulfills privacy requirements under a safe environment; can be defeated if infospaces and software are not under user control.
  Yee and Korba: Fulfills privacy requirements under a safe environment; can be made "mostly secure" (defenses against malicious copying of user data are under research).

Effectiveness at preserving privacy — disclosures
  Hong and Landay: Appears effective, making use of privacy tags, although details of security measures are not provided.
  Yee and Korba: Appears effective, making use of recursive provider-client relationships, although details of security measures are not provided.

Method for assuring clients
  Hong and Landay: Clients receive feedback, for example, notification of a location request; personally identifiable information is stored on computing equipment owned by the client.
  Yee and Korba: Clients check secure logs to verify privacy policy compliance.

Scalability
  Hong and Landay: Mostly scalable; a bottleneck may occur in high-volume multiple disclosures to the same entity, and there may be "policy chatter" caused by exchanges with entities requesting data.
  Yee and Korba: Mostly scalable; a bottleneck may occur in high-volume multiple disclosures to the same entity or in exhaustion of a PPCS due to too many clients (the fix is to add more PPCSs).

System security
  Hong and Landay: Security measures not described but can be added.
  Yee and Korba: Security measures fully described.

Validation
  Hong and Landay: Validated by existing working applications.
  Yee and Korba: Needs to be validated by building and testing a prototype.

Ease of implementation
  Hong and Landay: Appears straightforward; privacy provisions are part of the design and present from the beginning.
  Yee and Korba: Appears straightforward but needs validation; PPCSs can be added to existing services.

Costs
  Hong and Landay: Incremental costs hidden in the costs of software development; does not appear to add inordinately more to the cost of developing the same software without privacy provisions.
  Yee and Korba: Very visible up-front costs (the cost of acquiring and adding a PPCS); however, fully recoverable as a cost of doing business due to attracting more clients through privacy provisions.
Liberty ID-Web Services Framework — Privacy Features (Version 2)

The Liberty Alliance ID-Web Services Framework (ID-WSF) is an architectural platform for building secure, privacy-respecting, identity-centric Web services. ID-WSF defines a common framework for Web services covering authentication, message protection, service discovery and addressing, policy and metadata advertisement, and data interaction (e.g., query and modify). More information is available from the Liberty Alliance (Liberty Alliance-1, -2, -3). Privacy is a central tenet of the Liberty Alliance (there is an Expert Group within the Alliance dedicated to such issues) and of ID-WSF in particular. The following sections highlight certain aspects of ID-WSF designed to enable good privacy, describing the privacy considerations given for three different aspects of the framework: consent, usage directives, and the interaction service.
Consent

ID-WSF-based entities may wish to claim whether they obtained the principal's consent for carrying out any given operation. The Liberty SOAP Binding specification defines a header block that allows Web service clients to indicate to the Web service provider that they have obtained the consent of the relevant principal for the release of the location data. The sample message that follows shows such a header block in a SOAP message requesting the release of a particular principal's location data.
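The following sketch illustrates such a message. The namespace prefix and element names approximate the Liberty SOAP Binding header blocks and should not be read as exact ID-WSF syntax; the policy URI and the nature of the request follow the chapter's example.

    <S:Envelope xmlns:S="http://schemas.xmlsoap.org/soap/envelope/"
                xmlns:sb="urn:liberty:sb">
      <S:Header>
        <!-- Client claims it has obtained the Principal's consent -->
        <sb:Consent uri="urn:liberty:consent:obtained"/>
        <!-- Reference to the privacy policy governing the requested data -->
        <sb:UsageDirective>
          <PrivacyPolicyReference>
            http://circle-of-trust.com/policies/eu-compliant/location
          </PrivacyPolicyReference>
        </sb:UsageDirective>
      </S:Header>
      <S:Body>
        <!-- Request for Location Data for a particular principal -->
        <LocationQuery>
          <Subject>...</Subject>
        </LocationQuery>
      </S:Body>
    </S:Envelope>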
Usage Directives

The Web service client inserts a reference to a specific privacy policy for the location data in a PrivacyPolicyReference element (defined by some community of interest separate from Liberty). This information feeds into the Web service provider's decision whether or not to release the location data.
Interaction Service

A Web service provider will sometimes need to interact with the principal whose PII is being requested in order to clarify privacy policy. An interaction service allows Web service providers to ask the principal such policy clarification questions without bearing the burden of maintaining the relevant addresses and contact details (e.g., call me on my work phone during working hours but send me an instant message at any other time).
Privacy Rights Management Using Digital Rights Management

The architectures described above must manage flows of personal information appropriately. This section describes three typical message flows that privacy management architectures must support. Korba and Kenny describe an architecture that employs a rights management approach for the management of individual privacy rights as expressed by European Union privacy principles. Their work goes on to provide some detail as to how to extend both XML (Kenny & Korba, 2002) and ODRL (Korba & Kenny, 2002) to meet the requirements for privacy rights management (PRM). This approach is useful in the context of systems like Confab and IBM's Enterprise Privacy Architecture, as well as the PPCS.

Within PRM, there are four entities: the data subject (the person who owns the personal data), the personal data (or PII), the data controller (the person, agency, public authority, or other body which alone or jointly with others determines the purposes and means of processing personal data), and the data processor (the natural or legal person, agency, public authority, or other body which processes personal data on behalf of the controller). This arrangement logically matches the entities used to describe the obligations under privacy laws in many countries. Privacy principles are used to describe the general aspects associated with privacy laws (see Chapter XI). A privacy architecture must accommodate the privacy principles as they pertain to the service and the jurisdiction in which it is being offered.

In order to understand what is involved for data processors, controllers, and data subjects with respect to the handling of personal data, one must explore the implications of the privacy principles and the systems involved. For instance, it is often the case that data subjects do not know which controllers have what data, and whether it is accurate. Data controllers and processors may lose track of the data entrusted to them. This section explores the rights management approach by describing data flows for particular data management cases that are intended to meet privacy principle requirements.
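As a rough illustration of how a rights-management vocabulary can express privacy obligations over personal data, consider the ODRL-inspired sketch below. The element names are simplified for this example and are not the actual extensions defined by Korba and Kenny.

    <agreement>
      <asset id="pii-alice-address"/>                <!-- the personal data object -->
      <party role="dataSubject">Alice</party>
      <party role="dataController">BooksOnline.example</party>
      <permission>
        <use purpose="order-delivery"/>              <!-- the agreed purpose -->
        <constraint>
          <retention max="P6M"/>                     <!-- delete after six months -->
          <transfer to="dataProcessor" requires="equivalent-policy"/>
        </constraint>
        <requirement>
          <audit period="P3M"/>                      <!-- periodic audit obligation -->
        </requirement>
      </permission>
    </agreement>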
Privacy Rights Management in Operation

Within PRM, servers handle the functions of the data controllers and data processors. In order to perform those functions, the data controller and data processor servers must maintain and use different sets of data. Following is a description of the key controller and processor records and transaction logs required for PRM operation. These descriptions will facilitate understanding of the operational scenarios for data subject enrollment, periodic audit, and personal data update by the data subject described in the following sections.
Processor/Controller-Related Records

Processors and controllers maintain three key record types for PRM operation: processing agreements, audit information, and personally identifiable information tracking data. Each is described below.
• Processing agreements: These are electronic documents containing the details of the arrangements between the controller and the processor. They contain information regarding the types of data the processor may accept, any limits to the processing prescribed by the controller, time limits for access to data, agreements and details for audits (timing, type of data collected), as well as time stamps and approval signatures for the agreements.
• Audit information: The controller performs periodic audits of the processor's data handling. Results of the audits include a list of discrepancies between the data held by the processor and the data held by the controller. While detailed results are stored in the transaction log, the audit results kept by the processor/controller are processed, summarized versions of those raw results for use by the controller or processor.
• PII tracking data: The controller keeps track of the PII data sent to each processor, the time of the transfer, and pointers to the policy and purpose for data processing.
Data Subject-Related Records

There are several data subject-related records maintained by processors and controllers. These include the following:
• PII data: The personal data entrusted to the controller by the data subject.
• Audit information: Processed audit results pertaining to discrepancies in information regarding data subject data, stored here for review by the data subject.
• Agreed-upon policies and purposes: All privacy policies negotiated with the controller and/or all processors, stored along with a reference to the affected PII data.
• Contact information: Contact information for the PII data (e-mail address, home address, cross-referenced to the PII data, and the policy and purpose for data use).
Transaction Logs

In order to keep track of all activities of data controllers, data processors, and data subjects within PRM, the following transaction logs are maintained:
• Audit results: Detailed results from automated periodic or external audits of the processes used by the processor and controller to assure that PII is consistent and used only for the purposes and policies specified.
• Transfers of PII: Occurrences of transfers of PII (timing, sender, receiver, and a reference to the PII involved); a sketch of such a log entry follows this list.
• Processing of PII: All processors record the time and duration of PII processing, as well as the policies exercised.
• Policy negotiation/settlement: Time of occurrence of privacy policy negotiation, with a reference to the data subject, data processor, and/or data controller involved.
• Data subject interactions: Data subjects may contact controllers and processors to determine the accuracy of PII data; records are kept of all such interactions.
• Processor/controller interactions: Timing and references to details pertaining to interactions between processors and controllers.
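For illustration, a transfer-of-PII log entry might be recorded roughly as follows. The XML layout is an assumption made for this example, not a format defined by the PRM work.

    <logEntry type="pii-transfer" timestamp="2005-11-14T10:32:00Z">
      <sender>controller.example.com</sender>
      <receiver>processor1.example.net</receiver>
      <piiRef>pii-alice-address</piiRef>        <!-- reference, not the data itself -->
      <policyRef>agreement-4711</policyRef>     <!-- governing policy and purpose -->
    </logEntry>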
PRM Operational Scenarios

This subsection describes PRM in operation through several key scenarios, suggesting approaches within the PRM architecture intended to meet key requirements of the privacy principles. The scenarios described here are:

• data subject enrollment,
• periodic PII data audit, and
• request for PII data update by the data subject.
The scenarios are outlined using a description of data flows between the data subject, data controller, and data processor within the PRM system.
Data Subject Enrollment

Data subject enrollment involves a data subject coming to an agreement with a data controller on the personal information to be shared, as well as the privacy policy for dealing with the PII and the purpose for which the data may be used or processed. Figure 10 illustrates the data flow between the data subject, data controller, and two data processors. The process starts with the data subject authenticating himself or herself with the data controller. For this and all further exchanges, the data subject and controller set up a
secure communication channel between themselves. The data controller exchanges a policy and purpose statement regarding the use of any personal data submitted by the data subject to the controller. The data subject may negotiate the policy and purpose with the data controller as described in Korba (2002) and Yee and Korba (January 2003; May 2003). When the data subject comes to an agreement with the controller on the personal data to be exchanged, as well as the policy and purpose for which the data is being gathered, the data subject provides the data. The controller holds the personal data, exchanging it and the use and policy information with the processors that request the data. A number of log entries are made at various times during all of the exchanges. Figure 10 illustrates the various stages of enrollment in detail.
Figure 10. Data flow during enrollment (personal data [PD], data subject [DS], acknowledgement [Ack])
Periodic PII Data Audit

Overseeing PII distributed amongst the data controller and data processors requires considerable effort and care on the part of the data controller. The controller may have to deal with requests from data subjects or with more detailed investigations conducted by a data protection authority. Either of these concerns the quality of the data under the purview of the data controller. Operating in a reactive mode to these investigations would be less desirable than a proactive approach wherein the data controller assesses the quality of PII under its purview on a periodic, sampled audit basis.

Figure 11 illustrates the interactions between the controller and the processors for the audit. The frequency of the audit would depend upon the amount of personal data held by the controller and processors and the desired level of quality. The controller periodically selects personal data from different data subjects (shown as "Data subject X" in Figure 11) and polls all processors, requesting them to return personal data, policies, and purposes. The processors return the data (if any) they have for the selected data subject. From its records, the controller determines whether or not the processor should have the data and determines the accuracy of the personal data, policies, and purposes by comparing them with its own records.
Personal Data Update by Data Subject

The data subject has the right to assurance that the quality of the PII held by the data controller is maintained. The controller may receive a request from a subject to check the data held by it. Figure 12 illustrates the update process. The controller compares the data it distributed to the processors against the original data received from the data subject, in part to ensure there are no discrepancies in the data handled by the different parties. Differences in personal data or in policies and purpose are recorded and reported to the data subject. Any changes in PII requested by the data subject are made at the data controller and sent to the data processors that currently have agreements with the controller. The data subject may also negotiate the policy and purpose for his or her PII.
Future Trends

We started with a description of what is meant by e-services and a description of the driving factors behind privacy attitudes. We illustrated how privacy-enhancing technologies are driven by corporate policies that are shaped by legislation and litigation, the development of new technology, consumer expectation, and competitive pressures. A privacy architecture may house many privacy-enhancing technologies. Rather than prescribing a particular privacy architecture, we described several approaches that have appeared in the scientific literature and in presentations over the last few years. They included IBM's Enterprise Privacy Architecture, the research projects Confab and PPCS, and the privacy architecture of the Liberty Alliance ID-Web Services Framework. We also provided examples of typical information flows that would be expected to be supported in a privacy-compliant architecture, following on from research into a privacy rights management framework.
Figure 11. Data flow during the periodic PII data audit
All of these architectures differ considerably from each other in their implementation requirements. The Enterprise Privacy Architecture is an approach that deals with many aspects of IT operation, integrating some of the different tools IBM has developed (through its Tivoli arm) with legacy systems, together with a general approach for implementing and managing IT privacy. Confab is a research result targeting a privacy solution for the ubiquitous computing environment. It is especially relevant in the context of location-sensitive services and e-services delivered anywhere; the system was designed using the results of user and system developer surveys. PPCS is a research design that illustrates the approach of building a system from the ground up with legislated privacy requirements as the key driving forces in the design.
Figure 12. Key interactions between the data subject, the data controller, and two data processors during a user request for a change in personal data
The Liberty Alliance ID-Web Services Framework is an existing architectural platform that can be used today for building secure, privacy-respecting, identity-centric Web services. The PRM work rounds out our examination of this area with examples of privacy-compliant data flows that would be supported by a privacy-enabled information system architecture.

The common thread amongst these approaches is the use of a markup language to express rights and to track the use of data objects. Considerable effort is underway worldwide in the development of standards for different markup language variants to support processing of a wide variety of data in many different applications (e.g., the Liberty Alliance). Many systems and inference engines for XML objects have been developed or are currently under development. In the future it may well be possible to use some type of XML technology to link between regulations, laws, privacy-enhancing technologies, and privacy-compliant architectures.
Conclusions

We have described some of the driving forces and approaches for the development and deployment of privacy architectures for e-services, and we have presented several privacy information flow scenarios that can be applied in assessing privacy architectures. Privacy management in e-services is a challenging, multi-faceted task, as demonstrated by the privacy management architectures presented here. However, this challenge can be handled successfully using the architecture ideas and building blocks we have described.
Acknowledgments

The authors would like to thank the National Research Council Canada for its support of this work. In addition, we would like to express our gratitude to Dr. Paul Madsen for contributing the material on privacy features of the Liberty Alliance ID-Web Services Framework.
References

Brown, N. (2003, November 6-7). Privacy technology and the public sector. In 12th CACR Information Security Workshop & 4th Annual Privacy and Security Workshop, Toronto, Canada.
Cranor, L. F., Reagle, J., & Ackerman, M. S. (1999, April 14). Beyond concern: Understanding net users' attitudes about online privacy (AT&T Labs-Research Technical Report TR99.4.3). Retrieved from http://www.research.att.com/library/trs/TRs/99/99.4/

CSA Canadian Standards Association. (n.d.). Privacy principles. Retrieved December 1, 2005, from http://www.csa.ca/standards/privacy/code/Default.asp?language=English

Goldberg, I., Wagner, D., & Brewer, E. (1997). Privacy-enhancing technologies for the Internet. In IEEE COMPCON'97 (pp. 103-109).

Hong, J. I., & Landay, J. A. (2004, June 6-9). An architecture for privacy-sensitive ubiquitous computing. In Proceedings of the Second International Conference on Mobile Systems, Applications, and Services (MobiSys2004), Boston, Massachusetts, USA.

IBM. (2001, June 29). IBM announces an enterprise privacy architecture. Retrieved December 1, 2005, from http://www.bizwiz.com/bizwizwire/pressrelease/2005/8484ssw8x488ej7f88j.htm

IBM EPA. (n.d.). Enterprise privacy architecture. IBM Privacy Research Institute. Retrieved December 1, 2005, from http://www.zurich.ibm.com/pri/projects/epa.html

Karjoth, G., Schunter, M., & Waidner, M. (2002). Privacy-enabled services for enterprises. In Proceedings of the 13th International Workshop on Database and Expert Systems Applications (DEXA'02).

Kenny, S., & Korba, L. (2002, November). Adapting digital rights management to privacy rights management. Journal of Computers & Security, 21(7), 648-664.

Korba, L. (2002, January 7-11). Privacy in distributed electronic commerce. In Proceedings of the 35th Hawaii International Conference on System Science (HICSS), Hawaii, USA.

Korba, L., & Kenny, S. (2002, November). Towards meeting the privacy challenge: Adapting DRM. In DRM 2002, Washington, DC, USA.

Lessig, L. (1998). The architecture of privacy. In Proceedings of Taiwan NET'98, Taipei, Taiwan. Retrieved from http://www.lessig.org/content/articles/works/architecture_priv.pdf

Liberty Alliance-1. (n.d.). Liberty Alliance Project. Retrieved February 28, 2005, from http://www.projectliberty.org

Liberty Alliance-2. (n.d.). Liberty Alliance ID-Web Services Framework overview. Retrieved February 28, 2005, from https://www.projectliberty.org/resources/whitepapers/Liberty_ID-WSF_Web_Services_Framework.pdf

Liberty Alliance-3. (n.d.). Liberty Alliance & privacy. Retrieved February 28, 2005, from http://www.projectliberty.org/resources/trust_security.php

Lysyanskaya, A., Rivest, R. L., Sahai, A., & Wolf, S. (2000). Pseudonym systems. In H. Heys & C. Adams (Eds.), SAC'99, LNCS 1758 (pp. 184-199).
Pfitzmann, A., & Kohntopp, M. (2000). Anonymity, unobservability, and pseudonymity — A proposal for terminology. In H. Federrath (Ed.), LNCS Vol. 2009 (pp. 1-9). Springer-Verlag.

P3P. (2002, April 16). The Platform for Privacy Preferences 1.0 (P3P 1.0) specification (W3C Recommendation). Retrieved December 1, 2005, from http://www.w3.org/TR/P3P/

Solove, D. J. (2002). Conceptualizing privacy. California Law Review, 90(4), 1087-1155. Retrieved December 1, 2005, from http://ssrn.com/abstract=313103

WSA. (2004, February 11). Web services architecture requirements (W3C Working Group Note). Retrieved from http://www.w3c.org/TR/wsa-reqs/

Yee, G., & Korba, L. (2003, January). Bilateral e-services negotiation under uncertainty. In Proceedings of the 2003 International Symposium on Applications and the Internet (SAINT2003), Orlando, Florida, USA.

Yee, G., & Korba, L. (2003, May). The negotiation of privacy policies in distance education. In Proceedings of the 14th IRMA International Conference, Philadelphia, Pennsylvania, USA.

Yee, G., & Korba, L. (2004, July). Privacy policy compliance for Web services. In Proceedings of the IEEE International Conference on Web Services (ICWS 2004), San Diego, California, USA.

Yee, G., & Korba, L. (2004, May 23-26). Semi-automated derivation of personal privacy policies. In Proceedings of the IRMA International Conference, New Orleans, Louisiana, USA.

Yee, G., & Korba, L. (2005, May 15-18). Comparing and matching privacy policies using community consensus. In Proceedings of the IRMA International Conference, San Diego, California, USA.
Endnote 1
NRC Paper Number: NRC48271
Chapter X
Modeling Method for Assessing Privacy Technologies

Michael Weiss, Carleton University, Canada
Babak Esfandiari, Carleton University, Canada
Abstract

In this chapter we propose a modeling framework for assessing privacy technologies. The main contribution of the framework is that it allows us to model aspects of privacy and related system concerns (such as security and scalability) in a more comprehensive manner than the dataflow diagrams traditionally used for privacy analysis. The feature interaction perspective taken in the chapter allows us to reason about conflicts between a service user's model of how the service works and its actual implementation. In our modeling framework such conflicts can be modeled in terms of goal conflicts and service deployment. Goal conflicts allow us to reflect conflicting points of view on system concerns (primarily privacy and security) among the different stakeholders, which are part of the system and its context. Deployment refers to the assignment of functionality to system components, which allows us to reason about dataflows between components, as well as potential conflicts of interest. As a demonstration of the framework, we illustrate how it can be applied to the analysis of single sign-on solutions such as .Net Passport.
Introduction

Late in 2004, Rice University researchers uncovered a flaw in Google's Desktop Search tool that could release private local data to an untrusted third party. Earlier in 2004, a flaw in Apple's Safari Web browser was reported that allowed an attacker to upload and execute an arbitrary program on the user's machine. These flaws spotlight some of the security and privacy risks users are exposed to when systems are composed from independently created components or services that interact in unexpected ways.

In both examples, the application designers had made incorrect assumptions about the identity of the initiators of service requests. For example, in the case of the Safari Web browser flaw, the help viewer would execute scripts referenced in a URL without checking who made the request; it was assumed that the request would originate with the help viewer. This "feature" of the help viewer could be exploited by downloading and mounting a disk image with a malicious script and then asking the help viewer to execute it.

Such unexpected interactions are also known as feature interactions. Feature interactions occur when independently developed and separately tested components (also known as features) are combined, and the combination results in undesirable side effects. They are difficult to anticipate, in particular in an open system such as the Internet, due to the combinatorial number of ways features can interact with one another. These side effects are often non-functional in nature and in many cases are related to privacy or security. As the examples show, failing to consider the actual identity of the requestor of a service may cause serious privacy and security breaches.

The authenticity of a user or service provider on the Internet cannot be taken for granted. A variety of techniques — smart cards, personal information devices, single sign-on services, to name but a few — have been developed to address this issue. However, the benefits and convenience of these techniques must be weighed against the privacy and security issues their use may raise. The focus of this chapter is, thus, on privacy technologies and assessing them for privacy issues. For these technologies to be accepted in the user community, we must ensure that the very techniques intended to address privacy and security do not introduce new risks of their own. We introduce a modeling framework for assessing privacy technologies and their possible side effects on overall system concerns such as privacy and security, and demonstrate its use for analyzing the privacy pitfalls of single sign-on services.
Privacy Assessment

Privacy concerns the collection, storage, use, and sharing of data about individuals (Cannon, 2005). It has several dimensions, including the protection of personal data, or privacy protection. Individuals do not want data about themselves to be shared with other parties without their consent, and when data is held by another party, they want to have control over the data that is held about them and its use (IPC, 2000).
Policy makers have defined privacy protection principles, or sets of practices for implementing and enforcing privacy. There have been several attempts to develop such sets of practices, such as the Guidelines Governing the Protection of Privacy and Transborder Flows of Personal Data developed by the OECD (OECD, 1999), or the Ontario Freedom of Information and Protection of Privacy Act (IPC, 1990).

Privacy (impact) assessment is the evaluation of a system (or service) for compliance with privacy protection principles (IPC, 2000; TBC, 2002). Aspects to consider during a privacy assessment include data flow, usage, security, user control, user access, disclosure, and dependency (Cannon, 2005). Data flow captures what data is collected and shared by a service. Usage refers to the purposes for which, and the parties by whom, the data is used. Security relates to the use of encryption during data transfer and storage. User control refers to the control users have over what data is collected, whereas user access captures what data a user can view and change. By disclosure we mean how users are informed about the collection and use of their data. Dependency relates to dependencies between the system (or service) under assessment and other systems (upstream and downstream).

It should be noted that privacy and security issues are often intertwined. On one hand, for example, data encryption is considered an important component of privacy. On the other hand, security and privacy goals can often be achieved within the same technical solution. A good example is identity management technology, which can be used to achieve both authenticity and privacy. Privacy-enhanced identity management (Damiani, 2003) would, for example, allow the user to control the release of personal data.
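The aspects above lend themselves to a simple checklist. The short sketch below (Python, purely illustrative; the field names and sample values are ours, not part of any assessment standard) records the seven aspects for one collected data item and flags the ones an assessor would typically question.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class DataItemAssessment:
    """Privacy-assessment record for one collected data element."""
    name: str                      # what data is collected (data flow)
    purposes: List[str]            # usage: why the data is used
    encrypted_in_transit: bool     # security during transfer
    encrypted_at_rest: bool        # security during storage
    user_can_opt_out: bool         # user control over collection
    user_can_view_and_edit: bool   # user access to the stored data
    disclosed_in_policy: bool      # disclosure: is collection/use documented?
    dependencies: List[str] = field(default_factory=list)  # up/downstream systems

    def flags(self) -> List[str]:
        """Return the aspects a privacy assessor should question."""
        issues = []
        if not self.disclosed_in_policy:
            issues.append("collection not disclosed to the user")
        if not (self.encrypted_in_transit and self.encrypted_at_rest):
            issues.append("data not encrypted end to end")
        if not self.user_can_view_and_edit:
            issues.append("no user access to stored data")
        return issues

email = DataItemAssessment(
    name="email address",
    purposes=["account recovery", "marketing"],
    encrypted_in_transit=True, encrypted_at_rest=False,
    user_can_opt_out=False, user_can_view_and_edit=True,
    disclosed_in_policy=True,
    dependencies=["mailing-list service"],
)
print(email.flags())   # -> ['data not encrypted end to end']
```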
Privacy Analysis

One common approach to analyze a system for privacy weaknesses is to model it as a dataflow diagram (Cannon, 2005). Dataflow diagrams lend themselves to tracking the flow of data in a system. They provide a visual means for validating that we have captured all data that is collected, stored, used, and shared. Dataflow diagrams consist of processes (which manipulate data), data stores (where data is stored), entities (which consume or create data), and dataflows (which show the flow of data between the other types of elements).

During privacy analysis, a regular dataflow diagram is often annotated with attributes that apply to the data being sent and stored by the system. Common attributes are inclusion in privacy statement, retention policy, security, user control, and encryption. To represent complex systems, dataflow diagrams can be decomposed into subdiagrams. Another interesting feature is the notion of privacy boundaries that encapsulate selected processes and data stores, which can only be accessed by these processes.

However, dataflow diagrams also lack certain features, which the modeling approach documented in this chapter attempts to remedy. They make it hard to define and analyze system concerns other than privacy or security (e.g., scalability), to consider the possibly conflicting points of view of multiple stakeholders, and to compare different ways of deploying system functionality in terms of its implications. Often we need to be able to model the goals of the different parties in the system and its context. These parties
may even have conflicting notions of privacy. We also need to be able to document trade-offs between solution alternatives based on their impact on system concerns.
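To make the dataflow-diagram approach just described concrete before contrasting it with the goal- and scenario-based models used later, the following sketch (our own illustration; the element names and annotations are hypothetical, not a prescribed notation) encodes a small annotated dataflow diagram and checks two simple properties: that flows into a data store respect its declared privacy boundary, and that every flow is encrypted and covered by the privacy statement.

```python
from dataclasses import dataclass
from typing import Dict, List, Set

@dataclass
class Flow:
    source: str              # element the data comes from
    target: str              # element the data goes to
    data: str                # what is transferred
    encrypted: bool          # privacy annotation on the flow
    in_privacy_statement: bool

# Elements of the diagram, grouped by kind (entities, processes, data stores).
entities   = {"User"}
processes  = {"SignOn", "Personalize"}
datastores = {"ProfileStore"}

# A privacy boundary encapsulates a data store and the only processes
# allowed to touch it.
privacy_boundary: Dict[str, Set[str]] = {"ProfileStore": {"Personalize"}}

flows: List[Flow] = [
    Flow("User", "SignOn", "credentials", encrypted=True, in_privacy_statement=True),
    Flow("SignOn", "Personalize", "user id", encrypted=True, in_privacy_statement=True),
    Flow("Personalize", "ProfileStore", "profile", encrypted=False, in_privacy_statement=True),
    Flow("SignOn", "ProfileStore", "profile", encrypted=True, in_privacy_statement=False),
]

def audit(flows: List[Flow]) -> List[str]:
    """Flag flows that cross a privacy boundary or lack the expected annotations."""
    problems = []
    for f in flows:
        allowed = privacy_boundary.get(f.target)
        if allowed is not None and f.source not in allowed:
            problems.append(f"{f.source} -> {f.target}: crosses privacy boundary")
        if not f.encrypted:
            problems.append(f"{f.source} -> {f.target}: '{f.data}' sent unencrypted")
        if not f.in_privacy_statement:
            problems.append(f"{f.source} -> {f.target}: not covered by privacy statement")
    return problems

for p in audit(flows):
    print(p)
```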
Feature Interactions Between Services

Feature interactions, or undesirable interactions between components of a system, can occur in any software system that is subject to changes. This is certainly the case for service-oriented architectures. First, we can observe that interaction is at the very basis of the Web services concept. Web services need to interact, and useful Web services will "emerge" from the interaction of many highly specialized services. Second, as the number of Web services increases, their interactions will become more complex. Many of these interactions will be desirable, but other interactions may be unexpected and undesirable, and we need to prevent their consequences from occurring. As noted by Ryman (2003), many such interactions are related to security and privacy. Similarly, O'Sullivan, Edmond, and ter Hofstede (2002) note that service composition amounts to much more than functional composition and that consideration must be given to non-functional requirements.
Trade-Offs and Feature Interactions

In this chapter, we model privacy issues associated with privacy technologies using goal-oriented analysis and scenario modeling. But our attention is not limited to privacy. It is important to recognize that there are trade-offs between privacy and other system concerns such as security, usability, and scalability. Our perspective in this chapter is informed by recent work on feature interaction in Web services initiated by the authors (Weiss & Esfandiari, 2004). As stated earlier, feature interactions are undesirable outcomes of the composition of services that significantly hurt user satisfaction. A feature interaction perspective allows us to reason about such undesirable outcomes in terms of goal conflicts and service deployment, and presents ways of detecting and resolving them.

In the specific example of the single sign-on service that we use to illustrate our approach, the conclusion is that the liabilities of using single sign-on approaches such as Passport (in particular their impact on privacy and security) outweigh their benefits (from a service user's perspective the main benefit is convenience). With the help of our modeling approach we can justify a solution that involves separating authentication from access to user profiles, and controlling access to the user's profile within the user agent. However, we can also analyze the new solution for the side effects that it brings about in turn. In the example, users are now responsible for managing their own profiles (profiles are no longer stored by the Passport service). Thus, a solution to a privacy problem often leads to new trade-offs that need to be managed.
Modeling Feature Interactions

Our methodology for detecting and resolving feature interactions between Web services (Weiss & Esfandiari, 2004) is based on an approach for early requirements analysis known as the User Requirements Notation (URN). URN, as described by Amyot (2003), employs two complementary notations: the Goal-oriented Requirements Language (GRL) (GRL, 2005) and Use Case Maps (UCM) (UCM, 2005). GRL is used to model business goals, non-functional requirements, design alternatives, and design rationale, whereas UCMs allow the description of functional requirements in the form of causal scenarios.

GRL builds on well-established goal-oriented analysis techniques. Both functional and non-functional requirements are modeled as goals to be achieved by the design of a system. During the analysis, a set of initial goals describing the stakeholders' requirements is refined into subgoals. Together these goals and their refinement relationships form a goal graph that shows the influence of goals on each other and can be analyzed for goal conflicts. The perspectives of different stakeholders can also be described in GRL. For each stakeholder we model their goals, as well as their dependencies on one another to achieve those goals. The goals of one stakeholder can also compromise the goals of other stakeholders. The objective of the analysis is to determine the design alternative that resolves the goal conflicts in a way that best satisfies all stakeholders' initial goals.

The UCM notation provides a way of describing scenarios without the need to commit to system components. A scenario is a partially ordered set of responsibilities that a system performs. Responsibilities can be allocated to components by placing them within the boundaries of that component. This is how we will be modeling feature deployment. With UCMs, different structures suggested by alternatives that were identified in a GRL model can be expressed and evaluated by moving responsibilities from one component (the UCM equivalent of a GRL actor) to another, or by restructuring components. The subset of the URN notation used in this chapter is summarized in the Appendix.
Feature Interaction Analysis

Our methodology contains the following steps (a small illustrative sketch follows the list):

1. Start by modeling the features you wish to analyze as a GRL goal graph. Goal graphs allow us to represent features and to reason about conflicts between them.
2. Analyze the goal graph for goal conflicts. Such conflicts point to potential feature interactions, in particular if a conflict "breaks" the functionality of the system.
3. Resolve the interactions. During this step, UCMs allow us to explore the different alternatives suggested by the GRL models. They are particularly suitable for analyzing dependencies between and changes to the deployment and ownership of features, as well as for reasoning about negotiation protocols.
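The sketch below illustrates step 2 under a deliberately simplified reading of GRL contributions; it is not the evaluation algorithm of URN tools, and the goal names (which anticipate the single sign-on example analyzed later) are our own. Each design alternative contributes "help" or "hurt" to stakeholder softgoals, and an alternative that helps some goals while hurting others is flagged as a candidate feature interaction to be resolved in step 3.

```python
from typing import Dict, List, Tuple

# Contributions of each design alternative to stakeholder softgoals,
# labelled loosely in the spirit of GRL ("help" / "hurt").
Alternative = str
Goal = str
contributions: Dict[Alternative, List[Tuple[Goal, str]]] = {
    "Passport single sign-on": [
        ("Usability[Service]", "help"),       # one login for many providers
        ("Personalized[Service]", "help"),    # providers receive the profile
        ("Privacy[Profile]", "hurt"),         # profile shared transparently
    ],
    "Per-provider logins": [
        ("Usability[Service]", "hurt"),
        ("Privacy[Profile]", "help"),
    ],
}

def conflicting_alternatives(c: Dict[Alternative, List[Tuple[Goal, str]]]) -> Dict[Alternative, List[Goal]]:
    """Alternatives that help some goals while hurting others: candidate
    feature interactions that step 3 of the methodology must resolve."""
    out = {}
    for alt, links in c.items():
        helped = [g for g, label in links if label == "help"]
        hurt = [g for g, label in links if label == "hurt"]
        if helped and hurt:
            out[alt] = hurt
    return out

print(conflicting_alternatives(contributions))
# {'Passport single sign-on': ['Privacy[Profile]'],
#  'Per-provider logins': ['Usability[Service]']}
```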
Feature Interaction Analysis of Single Sign-On

To demonstrate the modeling framework we have introduced, we illustrate its use in the case of single sign-on services. Single sign-on is one of the most demanded features of identity management. Identity management has emerged as a critical privacy technology for establishing trust relationships in today's open networks. Single sign-on provides the convenience of only requiring a user to authenticate once with a service provider, and then being able to access resources from other providers without further authentication. A variety of approaches have been proposed for single sign-on, including Microsoft's .Net Passport infrastructure, Novell's DigitalMe system, and the Liberty Alliance's standard for federated identities. The first two take a centralized approach to managing multiple passwords for online services, while the last one is decentralized in nature. Common to all approaches is that they simplify the management of identities by providing this functionality as a service that can be incorporated into other systems.

As an example of using single sign-on, consider a service provider offering personalized service to its users. A personalized service can be designed as a composition of three other services (or features): profile management, content adaptation, and authentication. Profile management takes care of collecting user information and storing it in a profile. Content adaptation is used to customize the service delivery to the service user's profile. Finally, authentication uniquely identifies users to information service providers.
Outline of the Argument

During the first step of feature interaction analysis (referring to the steps of our approach outlined in the last section), the authentication service would be modeled as a goal to be achieved by a yet-to-be-selected third-party service. The services to choose from constitute design alternatives, whose impact on system concerns we wish to analyze in terms of undesirable feature interactions. As feature interactions are detected, we also want to find a remedy to resolve them. Among the many non-functional features involved when creating a personalized service are privacy, security, usability, and scalability.

As a result of the second step, the implementation of the authentication goal using a single sign-on service such as Passport is found to violate users' privacy as well as their security concerns. This is a result of the combination of authentication and profile management within the same service. In the third step we can experiment with design alternatives to find a resolution to the feature conflict. One solution to the privacy violation would be to decouple authentication from profile management.

A UCM analysis of these design alternatives in the third step leads to a deeper understanding of what exactly each solution involves. In particular it allows us to analyze deployment and ownership issues. In the Passport example we find that the problem is one of common ownership of authentication and profile management services. The UCM model helps us visualize how these services can be redeployed, for example, by moving
the responsibilities associated with profile management to the component representing the user. It also allows us to recognize that, after redeployment, the personalized service can no longer benefit from the profile storage service implicitly offered by Passport.
Modeling Single Sign-On With URN

The GRL model in Figure 1 shows the stakeholders (actors) and their dependencies involved in a single sign-on scenario. The user of the personalized service and the service provider are represented as two actors, User[Service] and Provider[Service]. The two actors depend on each other to fulfill their goals. In one direction, the user depends on the provider for receiving the service (Receive[Service]). In the other direction, the provider depends on the user's Profile in order to provide personalized service.

These dependencies motivate the use of an identity management service. The provider needs a way of determining whether the service user is legitimate. Similarly, the user interacts with many different providers and can benefit from a single sign-on service that greatly simplifies authentication. Adding another actor for the Passport service, as one example of a single sign-on service, results in the GRL model in Figure 2. In order to Authenticate[User], the Passport service needs to check the user's Credentials.
Figure 1. GRL model for a personalized service
Figure 2. GRL model for a personalized service with single sign-on
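The dependency structure of Figures 1 and 2 can also be written down textually. The sketch below is only our shorthand (GRL itself is a graphical notation with its own tool formats); it records each dependency as a depender/dependum/dependee triple so that simple questions, such as which actors a given actor relies on, can be answered mechanically.

```python
from typing import List, Tuple

# (depender, dependum, dependee): the depender relies on the dependee
# to provide the dependum, as in Figures 1 and 2.
dependencies: List[Tuple[str, str, str]] = [
    ("User[Service]",     "Receive[Service]",   "Provider[Service]"),
    ("Provider[Service]", "Profile",            "User[Service]"),
    ("Provider[Service]", "Authenticate[User]", "Passport"),
    ("Passport",          "Credentials",        "User[Service]"),
]

def relies_on(actor: str) -> List[str]:
    """Actors that `actor` depends on, with the dependum in brackets."""
    return [f"{dependee} [{dependum}]"
            for depender, dependum, dependee in dependencies
            if depender == actor]

print(relies_on("Provider[Service]"))
# ['User[Service] [Profile]', 'Passport [Authenticate[User]]']
```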
Since our objective here is not to present a full discussion of all design issues involved in identity management, we refer the reader to the work by Liu and Yu (2004) and Damiani, Vimercati, and Samarati (2003) for a more generic model of identity management and the requirements and building blocks of identity management solutions. Our initial model presented here is similar to the single sign-on example given in Liu and Yu (2004).

We next refine this model to include the internal goals of each of the actors. From this we can detect goal conflicts, and thus potential feature interactions. At the next level of refinement, we obtain the GRL model in Figure 3. This model only shows the primary goals of each actor, not side effects obtained from the way these goals are achieved. In Figure 3, each actor has been expanded to reveal its internal goals. The dependencies between the actors remain the same; however, we have expressed them in terms of internal
Figure 3. First level of refinement of the GRL model to show the actors’ primary goals
goals at either end of the dependency. For example, Passport depends on the User[Service] to Enter Credentials[User] in order to Authenticate[User] it. The user's main goal is Usability[Service], which can be achieved by replacing a potentially large number of logins to individual sites by a Single Sign-On, as well as by providing Personalized[Service]. The provider's main goal is to deliver Personalized[Service]. This goal can be decomposed into two subgoals, Content Adaptation and Authentication. The primary goal of the Passport service is to provide Identity Management, which comprises the subgoals Profile Management and Authenticate[User]. In each actor's goal graph, the leaf nodes are tasks that the actor can perform, or where it depends on other actors to perform this task. For example, Perform[Request] is a task the Provider[Service] performs itself, while with regard to the Authentication goal it depends on Passport.
Analysis for Side-Effects

Next, we analyze this model for possible side effects. The result of this analysis will be a refinement of the model in Figure 3 (shown later in Figure 7). However, we cannot usually arrive at this model without also considering the behavior of the system. This is where scenario modeling complements goal-oriented analysis. Over several iterations on both models, we will finally arrive at the understanding needed to document the side effects.

We first map the goals in the GRL model from Figure 3 to responsibilities in UCM scenarios. There are two scenarios to consider: one for accessing the personalized service, the other for editing the user's profile. Figure 4 shows how the responsibilities would be allocated to actors when Passport is used to provide a single sign-on service. Passport authenticates users to service providers, but also gives providers access to the profiles of the users. In the first UCM in Figure 4, we show that the Passport actor is responsible for authentication by putting the Authenticate stub within the boundaries of the actor.
Figure 4. Root level map of the UCM model for the single sign-on
By comparing Figures 3 and 4, it should be apparent how task goals at the leaf nodes of the GRL model are reflected in responsibilities in the UCM model. For example, the responsibilities accessProfile and performService correspond to the tasks Access Profile[User] and Perform Request. Generally, the mapping from GRL goals to UCM responsibilities will be 1:n, that is, the refinement can introduce additional detail that the GRL model does not show. A 1:1 mapping, as shown, is considered a special case. Similarly, the update functionality is represented by a stub (Update). This approach affords us the flexibility to explore different ways of allocating responsibilities at the next level of detail. In fact, we will see later that both the design with the feature interaction, as well as one possible resolution, share the same root level map. The figure also shows that after interacting with an initial provider, the user can sign on with other providers.

The next map (in Figure 5) shows the plug-in for the Authenticate stub that models the use of the Passport service. The operation of the Passport service is modeled after the descriptions in Oppliger (2003) and Wiehler (2004). If the user's credentials can be validated, the service issues a ticket, which is passed on to the provider together with the user profile (issueTicketWithProfile). If they cannot, access will be denied. Finally, the plug-in for the Update stub can be found in Figure 6. One interesting feature of this map is that it also refers to the Authenticate stub. This illustrates an additional benefit of the hierarchical description of UCMs: plug-ins can be reused multiple times.

We would like to remind the reader that these diagrams show the causal flow between responsibilities. This does not necessarily translate into messages between actors. For example, in Figure 4, the flow is shown from User to Passport to Provider. The actual messaging is more involved: the user sends a sign-on request to the provider, which then redirects the request to the Passport service, which redirects the user's ticket and profile to the provider. That is, users or providers do not send messages to Passport directly.
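The causal flow just described can also be sketched as ordinary functions. The code below is a toy model of that flow, not the actual Passport protocol: the account store, the ticket format, and the combined ticket-plus-profile response are our simplifications of issueTicketWithProfile, and the HTTP redirects are collapsed into direct function calls.

```python
from typing import Dict, Optional, Tuple

# Toy registry of accounts: credentials plus the centrally stored profile
# that is handed to providers along with the ticket.
ACCOUNTS: Dict[str, Dict] = {
    "alice": {"password": "secret",
              "profile": {"name": "Alice", "email": "alice@example.org"}},
}

def passport_authenticate(user: str, password: str) -> Optional[Tuple[str, Dict]]:
    """Validate credentials; on success issue a ticket together with the
    user's profile (cf. issueTicketWithProfile), otherwise deny access."""
    account = ACCOUNTS.get(user)
    if account is None or account["password"] != password:
        return None
    ticket = f"ticket-for-{user}"          # stand-in for a signed ticket
    return ticket, account["profile"]

def provider_sign_on(user: str, password: str) -> str:
    """Provider-side view: hand the sign-on request to the identity service
    (modeling the redirect) and personalize the service using the profile."""
    result = passport_authenticate(user, password)
    if result is None:
        return "access denied"
    ticket, profile = result
    return f"welcome {profile['name']} ({ticket}): personalized content"

print(provider_sign_on("alice", "secret"))
print(provider_sign_on("alice", "wrong"))
```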
Figure 5. Map for the Authenticate plug-in
Figure 6. Map for the Update plug-in
After parts of the UCM model have been described, the GRL model can be further refined until, after several iterations, we gain the necessary understanding of the operation of the interaction between users, providers, and the Passport service to arrive at the refined GRL model in Figure 7. This model also shows side effects, which are unanticipated contributions (also known as correlations) of goals on system concerns.

Figure 7 merits some further discussion. The main difference between Figures 3 and 7 is that Figure 7 has an additional softgoal, Privacy [Profile], for the user, together with the correlations introduced by the design decisions made in the Passport service to Give Access to Profile [User] to all participating providers and to use Central Storage to store user profiles. The Privacy [Profile] goal competes with the Usability goal, and we need to look at them in combination in order to evaluate whether, overall, the user is (or is not) satisfied. Both correlations indicate a somewhat negative (-) impact on the user's Privacy [Profile]. The reasons for this are provided in the form of beliefs.

With the help of the beliefs, we can summarize the diagram as stating that while the main motivation for single sign-on is to simplify the sign-on to participating Web services and for users to receive personalized service, its use has significant negative side effects on the user's privacy that need to be considered. Specifically, the user has No Control Over Who Has Access to his or her profile, and the Transparent Sharing of profiles means that the user may not even be aware of the implications of registering a profile with a service like Passport. Also, based on an analysis of how profiles are stored in Passport, the model states that Central Storage of profiles presents a likely Target for Attack, which further reduces privacy. Note that Central Storage is shown as one design alternative for Store Profile [User], and that other alternatives could be incorporated into the model, enabling us to reason about their impact on the user's privacy within the same diagram.
Feature Interactions in Passport

The refined GRL model shows that the current implementation of the Passport service violates privacy concerns. This is due to the fact that Passport gives service providers access to the user's information no matter who they are. All that is required is that they
Figure 7. Second level of refinement of the GRL model to include side effects
are participating services. The nature of the violation is that while the initial provider (the one to which the user signs in first) is usually a trusted provider, this is not the case for the providers that the user accesses by being linked to them by the initial provider. Due to the unexpected nature of the conflict between the Authenticate [User] and Profile Management goals, and its significant impact on the user (the balance is severely shifted toward service providers), we consider this a feature interaction. Of course, whether a conflict is experienced as a feature interaction depends on the user's goals. Here, we assume a privacy-conscious user who would also like the convenience of single sign-on.
Also, in the current version of Passport, the user can only choose to mark sections of the profile as either accessible by all service providers or not accessible at all. No finer level of access control can be specified (such as giving access to selected service providers only). We consider this a feature interaction of multiple instances of the Give Access to Profile [User] feature with itself, as profiles are transparently shared between providers. (The notion of a feature interacting with itself is similar to that of multiple mail transfer agents creating an infinite e-mail loop, a well-documented feature interaction in its own right.)

The centralized nature of the Passport service also makes it vulnerable to attacks. Ideally, multiple parties would provide the authentication scheme. This can be characterized as a feature interaction between Central Storage and Give Access to Profile [User], which arises because both the storage of profiles and their access are centralized.
Resolving the Feature Interactions

Consider, once more, the interaction between Authenticate [User] and Profile Management. The nature of the conflict is that once a user authenticates with a participating
Passport service, the user's full profile is shared with the service. A solution would be to remove this information from the profile, but this thwarts the user's objective in using single sign-on in the first place: users want to receive personalized services and custom features that service providers can only deliver if they have personal information about their users.

We can compare this design to an alternative in which access control is performed in the service user (i.e., within the user agent). Such a design would decouple access control from authentication by requiring that the identity management service can only authenticate the user. The access control service would reside in the service user instead, and could be implemented in accordance with the P3P (Platform for Privacy Preferences Project) protocol. This would give users control over what information is shared with which service providers. One drawback of this design is that the user is now responsible for managing his or her profile.

We have classified this interaction as a deployment and ownership interaction (Weiss & Esfandiari, 2004). What makes the interaction between Authenticate [User] and Profile Management undesirable is that they are both owned by the same service provider (Passport). A general pattern for breaking such ownership conflicts is a physical decoupling of the services, as described above. UCMs allow us to explore deployment alternatives because the same scenario can be bound to components in different ways.
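For contrast with the Passport-style flow sketched earlier, the hypothetical sketch below illustrates the redeployed design: the identity service now returns only an identity assertion, while the profile and a per-provider release policy (in the spirit of the P3P-based design suggested above) live in the user agent. The provider names and profile fields are invented. The drawback noted above is also visible here: the release policy is data the user must now maintain.

```python
from typing import Dict, Optional, Set

# The user agent holds the profile and a per-provider release policy;
# the identity service only vouches for the user's identity and never
# sees or forwards the profile.
PROFILE = {"name": "Alice", "email": "alice@example.org", "address": "12 Elm St"}
RELEASE_POLICY: Dict[str, Set[str]] = {
    "bookshop.example":  {"name", "email"},   # trusted initial provider
    "adnetwork.example": set(),               # linked provider gets nothing
}

def identity_service_authenticate(user: str, password: str) -> Optional[str]:
    """Authentication only: returns an identity assertion or None."""
    return f"assertion-for-{user}" if (user, password) == ("alice", "secret") else None

def user_agent_release(provider: str) -> Dict[str, str]:
    """Release only the profile fields the user allows for this provider."""
    allowed = RELEASE_POLICY.get(provider, set())
    return {k: v for k, v in PROFILE.items() if k in allowed}

assertion = identity_service_authenticate("alice", "secret")
if assertion:
    print(user_agent_release("bookshop.example"))   # name and email only
    print(user_agent_release("adnetwork.example"))  # {}
```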
Future Trends

Privacy and security are increasingly important concerns for open, Internet-based systems. Technologies have been proposed to support privacy. It is, therefore, paramount to assess these technologies themselves for new privacy or security risks they may
introduce. Recent research in requirements engineering has developed new modeling notations for eliciting non-functional requirements, including privacy (Amyot, 2003; Liu & Yu, 2004; Yu & Cysneiros, 2002), which can be leveraged for privacy assessment. Our modeling framework is complementary to existing techniques developed specifically for addressing privacy concerns (such as dataflow diagrams). It facilitates the analysis of privacy technologies in their context of use, which comprises the different stakeholders with possibly conflicting views of privacy (human users, as well as other software systems). There is also a need for considering privacy in the context of other non-functional concerns, which traditional techniques do not support.

More work will be required toward formal verification and validation of the models presented here. We feel that a significant role will be played by agent-oriented approaches, which extend the goal-oriented and scenario-based approaches discussed here to the detailed design and implementation stages. Agent-oriented approaches are also expected to be instrumental for the implementation of new privacy technologies. In particular, one approach seems of great interest: the Tropos project (Bresciani, Perini, Giorgini, Giunchiglia, & Mylopoulos, 2004). Tropos has developed a process for agent-oriented goal refinement, as well as a language for the formal specification of agent systems that is supported by model checking techniques.
Conclusions

In this chapter we proposed a modeling framework for assessing privacy technologies. The main contribution of the framework is that it allows us to model aspects of privacy and related system concerns (such as security and scalability) in a more comprehensive manner than the dataflow diagrams traditionally used for privacy analysis. The feature interaction perspective taken in the chapter allows us to reason about conflicts between a service user's model of how the service works and its actual implementation. That is, a privacy technology may achieve its primary objective (such as authentication), but result in privacy or security violations that were not anticipated by the user.

In our modeling framework, such conflicts can be modeled in terms of goal conflicts and service deployment. Goal conflicts allow us to reflect conflicting points of view on system concerns (primarily privacy and security) among the different stakeholders, which are part of the system and its context. Deployment refers to the assignment of functionality to system components, which allows us to reason about dataflows between components, as well as potential conflicts of interest. As a demonstration of the framework, we illustrated how it can be applied to the analysis of single sign-on solutions such as .Net Passport. We also provided a brief outlook on how the modeling framework can allow us to explore ways of resolving undesirable interactions.
References

Amyot, D. (2003). Introduction to the user requirements notation: Learning by example. Computer Networks, 42(3), 285-301.
Bresciani, P., Perini, A., Giorgini, P., Giunchiglia, F., & Mylopoulos, J. (2004, May). Tropos: An agent-oriented software development methodology. Journal on Autonomous Agents and Multi-Agent Systems, 8(3), 203-236.
Cannon, J. (2005). Privacy. Addison-Wesley.
Damiani, E., Vimercati, S., & Samarati, P. (2003). Managing multiple and dependable identities. IEEE Internet Computing, 7(6), 29-37.
GRL. (2005). Retrieved from http://www.cs.toronto.edu/km/GRL
Information Privacy Commissioner/Ontario. (1990). Freedom of Information and Protection of Privacy Act.
Information Privacy Commissioner/Ontario. (2000). Multi-application smart cards: How to do a privacy assessment.
Liu, L., & Yu, E. (2004). Intentional modeling to support identity management. In International Conference on Conceptual Modeling, LNCS 3288 (pp. 555-566).
OECD. (1999). Inventory of instruments and mechanisms contributing to the implementation and enforcement of the OECD Privacy Guidelines on Global Networks.
Oppliger, R. (2003, July). Microsoft .NET Passport: A security analysis. Computer, 29-35.
O'Sullivan, J., Edmond, D., & ter Hofstede, A. (2002). What's in a service? Towards accurate description of non-functional service properties. Distributed and Parallel Databases, 12, 117-133.
Ryman, A. (2003). Understanding Web services. Retrieved from http://www.software.ibm.com/wsdd/library/techarticles/0307_ryman/ryman.html
Treasury Board of Canada. (2002). Privacy impact assessment guidelines: A framework to manage privacy risks.
UCM. (2005). Retrieved February 2005, from http://www.usecasemaps.org
Weiss, M., & Esfandiari, B. (2004). On feature interactions in Web services. In Proceedings of the 2nd International Conference on Web Services, IEEE (pp. 88-95).
Wiehler, G. (2004). Mobility, security, and Web services. Siemens/Publicis.
Yu, E., & Cysneiros, L. (2002). Designing for privacy and other competing requirements. In Symposium on Requirements Engineering for Information Security.
Appendix

This appendix summarizes the URN notation used in this chapter. Note that the examples did not make use of all the elements shown in Figures 8 and 9.
Figure 8. Summary of the GRL notation (Amyot, 2003): (a) GRL elements (softgoal, goal, task, resource, belief, actor, actor boundary); (b) GRL satisfaction levels (satisficed, weakly satisficed, undecided, weakly denied, denied, conflict); (c) link composition (AND, OR); (d) GRL links (contribution, correlation, means-end, dependency, decomposition); (e) GRL contribution types (make, help, some+, equal, unknown, break, hurt, some-).

Figure 9. Summary of the UCM notation (Amyot, 2003): (a) UCM path elements (path, start point, end point, responsibility, direction arrow, timestamp point, failure point, shared responsibility); (b) UCM forks and joins (OR-fork with guarding conditions, AND-fork, OR-join, AND-join); (c) UCM (generic) component; (d) UCM stubs and plug-ins (static stub with segment IDs, dynamic stub, plug-in map); (e) UCM waiting places and timers (waiting place, trigger path, continuation path, release, timer, timeout path).
Chapter XI
Legislative Bases for Personal Privacy Policy Specification¹

George Yee, National Research Council Canada, Canada
Larry Korba, National Research Council Canada, Canada
Ronggong Song, National Research Council Canada, Canada
Abstract

The growth of the Internet has been accompanied by a proliferation of e-services, especially in the area of e-commerce (e.g., Amazon.com, eBay.com). However, consumers of these e-services are becoming more and more sensitive to the fact that they are giving up private information every time they use them. At the same time, legislative bodies in many jurisdictions have enacted legislation to protect the privacy of individuals when they need to interact with organizations. As a result, e-services can only be successful if there is adequate protection for user privacy. The use of personal privacy policies to express an individual's privacy preferences appears best-suited to manage privacy for e-commerce. We first motivate the reader with our e-service privacy policy model that explains how personal privacy policies can be used for e-services. We then derive the minimum content of a personal privacy policy by examining some key privacy legislation selected from Canada, the European Union, and the United States.
Introduction

The rapid growth of the Internet has been accompanied by a proliferation of e-services targeting consumers. E-services are available for banking, shopping, learning, healthcare, government services, and many other areas. However, each of these services requires a consumer's personal information in one form or another. This leads to consumer concerns over unwarranted leakage, storage, and/or exploitation of their private information. Indeed, consumer savvy regarding their rights to privacy is increasing. In Canada, recent federal privacy legislation known as the Personal Information Protection and Electronic Documents Act (PIPEDA) (Government of Canada) has forced businesses to seek consumer permission before collecting personal information. Similar legislation exists in the European Union (European Union, 1995) and in the United States for healthcare (U.S. Government). In this light, e-services must respect consumers' personal privacy if they are to be successful.

A promising solution for management of private information in e-services is to employ consumer personal privacy policies, that is, a consumer expresses his/her privacy preferences in a personal privacy policy. Once the e-service provider agrees with this privacy policy, it is then the provider's responsibility to comply with it. These are the basic tenets of our e-service privacy policy model (explained in the section E-Service Privacy Policy Model [EPPM] to motivate the need for personal privacy policies). However, what should go into the personal privacy policy? In this work, we answer this question by examining privacy legislation from Canada, the European Union, and the United States. (Although we could have looked at privacy legislation in other countries as well, we settled on these three because our audience is expected to be mostly from these regions and thus be subject to privacy legislation from them.) The result is the minimum personal privacy policy, that is, one that contains the necessary elements to satisfy privacy legislation, but one that can contain extra privacy provisions according to consumer wishes. We shall indicate what some of these "extra provisions" could be.

Policy-based management approaches have been used effectively to manage and control large distributed systems. As in any distributed system, e-services may also use a policy-based framework to manage the security and privacy aspects of operations. For privacy policies, there are related works such as P3P (W3C), APPEL (W3C, 2002), PSP (Carnegie Mellon University), and EPAL (IBM), which are languages for expressing privacy preferences in policies. Web sites use P3P to divulge their privacy policies to consumers. APPEL is a specification language used to describe a consumer's privacy preferences for comparison with the privacy policy of a Web site. PSP is a protocol in the research stage that provides a basis for policy negotiation. EPAL is a markup language for privacy preferences. These works are not necessary for the purposes of this chapter; they only serve as illustrations of what has been done in the related area of capturing privacy preferences in a form amenable to machine processing. Our work differs from P3P, APPEL, PSP, and EPAL in that we look at privacy legislation and other regulations in order to derive a core set of privacy attributes that are required by law in the content of a consumer personal privacy policy, rather than being concerned with expressing preferences in machine-processable form.
In fact, the example personal privacy policies we give in this chapter are expressed in English. Therefore, we are not
proposing another policy language but rather looking at what content must be in personal privacy policies to satisfy privacy legislation. Our example privacy policies may well be expressed in P3P, APPEL, PSP, or EPAL. We are not aware of any other work that derives personal privacy policy content requirements from privacy legislation. The rest of this chapter is organized as follows. The section E-Service Privacy Policy Model (EPPM) describes our model of using personal privacy policies to protect consumer privacy. This is followed by the section Privacy Legislation, which examines privacy legislation from Canada, the European Union, and the United States and derives privacy attributes for inclusion in personal privacy policies. The next section, Personal Privacy Policy Specification, integrates the privacy attribute findings from Privacy Legislation into the minimum personal privacy policy that satisfies privacy legislation. The last section gives our conclusions and ideas for future work.
E-Service Privacy Policy Model (EPPM)

Before explaining our e-service privacy policy model, it is useful to describe what we mean by an e-service. An e-service is a service that is offered by a provider to a consumer across a computer network. A stock quotation service is often used as an example of an e-service. Here a consumer would log on to the service from a computer and, after appropriate user authentication, would make use of the service to obtain stock quotes. Accessing one's bank account through online banking is another example of an e-service. Here the provider is the bank, and the service consists of allowing the consumer to check the balance, transfer funds, or make bill payments. The network is usually the Internet, but could also in principle be a private enterprise network. At any time, one provider may be serving many consumers, and many providers may be serving one consumer. For the purposes of this chapter, the business relationship between provider and consumer is always one-to-one, that is, the service is designed for one consumer and is provided by one provider (although this provider may make use of other providers to compose its e-service), and payment for the service is expected from one consumer. In addition, service providers may also be service consumers, and service consumers may also be service providers.

The e-service privacy policy model describes the origin and use of personal privacy policies to protect consumer privacy for e-services. This model consists of six phases as given in Table 1. These phases occur in ascending numerical order. In this model, a provider of a particular e-service has a privacy policy for that e-service, stating what private information it requires from the consumer and how the information will be used. The consumer has a privacy policy stating his/her privacy preferences for using a particular e-service, for example, what private information the consumer is willing to give to the provider and with whom the information may be shared. An entity that is both a provider and a consumer has separate privacy policies for these two roles.

For Phases 3 and 4, a privacy policy is attached to a software agent that acts on behalf of a consumer or a provider as the case may be. Prior to the activation of a particular service, the agent for the consumer and the agent for the provider undergo a privacy
Table 1. E-service privacy policy model phases

Phase 1: Formulate
Description: Privacy policy creation; every e-service consumer needs to have his/her own personal privacy policy that expresses his/her privacy preferences for using a particular e-service; every provider also needs one or more privacy policies expressing what private information it needs for its e-services.
Reference: Yee and Korba (January 2005) describe methods for creating personal privacy policies.

Phase 2: Search
Description: The e-service consumer knows the type of e-service he/she needs (e.g., book seller) and searches the Internet (e.g., using Google) for such an e-service.
Reference: -

Phase 3: Match
Description: Before an e-service is engaged, the service consumer and service provider exchange and compare their individual privacy policies for the e-service to see if there is a match. If there is a match, Phase 5 is entered next. Otherwise, Phase 4 is entered next.
Reference: Yee and Korba (May 2005)

Phase 4: Negotiate
Description: Consumer and provider negotiate with each other to try to arrive at a mutually agreed privacy policy. If this negotiation is successful, Phase 5 is entered next. Otherwise, Phase 2 is entered next (the consumer searches for another provider).
Reference: Yee and Korba (January 2003, May 2003)

Phase 5: Engage
Description: The consumer engages the e-service.
Reference: -

Phase 6: Comply
Description: The provider must comply with the personal privacy policy of the consumer or with their mutually negotiated privacy policy.
Reference: Yee and Korba (July 2004, March 2005)
policy exchange in which the policies are examined for compatibility. The service is only activated if the policies are compatible (Yee & Korba, May 2005), in which case we say that there is a “match” between the two policies. In addition, we assume that in general, the provider always asks for more private information from the consumer than the consumer is willing to give up. Figure 1 illustrates Phases 3 and 4. For the purposes of this work, it is not necessary to consider the details of service operation. Further detailed descriptions of how to carry out individual phases of this model are beyond the scope of this work; however, we include in Table 1 some references to Yee and Korba, who have works describing how some of the phases may be carried out.
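A match in Phase 3 can be illustrated with a deliberately simplified rule: every item/purpose pair the provider requests must be covered by the consumer's policy. The sketch below uses this rule only for illustration; the cited work of Yee and Korba defines matching and negotiation in full, and the item and purpose names are invented.

```python
from typing import Dict, List, Set

# Provider policy: the private items it asks for, each with a purpose.
provider_policy: Dict[str, str] = {
    "name": "delivery", "email": "order status", "phone": "telemarketing",
}
# Consumer policy: the items the consumer is willing to give, and for what purposes.
consumer_policy: Dict[str, Set[str]] = {
    "name": {"delivery"}, "email": {"order status", "delivery"},
}

def match(provider: Dict[str, str], consumer: Dict[str, Set[str]]) -> List[str]:
    """Return the provider requests the consumer policy does not cover.
    An empty list means the policies match and the service can proceed
    (Phase 5); otherwise negotiation (Phase 4) is needed."""
    gaps = []
    for item, purpose in provider.items():
        if purpose not in consumer.get(item, set()):
            gaps.append(f"{item} for '{purpose}'")
    return gaps

print(match(provider_policy, consumer_policy))   # ["phone for 'telemarketing'"]
```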
Figure 1. Exchange of privacy policies (PP) between consumer agent (CA) and provider agent (PA)
Privacy Legislation

Canada

In Canada, federal privacy legislation is enacted in the PIPEDA (Government of Canada) and is based on the Canadian Standards Association's Model Code for the Protection of Personal Information (Department of Justice), recognized as a national standard in 1996. This Code consists of 10 Privacy Principles that, for convenience, we label as CSAPP (Table 2). Since the privacy provisions of PIPEDA are represented by CSAPP, we examine CSAPP rather than the more complicated text of PIPEDA itself.

To identify some attributes of private information collection using the CSAPP, we interpret "organization" as "provider" and "individual" as "consumer." In the following, we use CSAPP.n to denote Principle n of CSAPP. Principle CSAPP.2 implies that there could be different providers requesting the information, thus implying a collector attribute. Principle CSAPP.4 implies that there is a what attribute, that is, what private information is being collected. Principles CSAPP.2, CSAPP.4, and CSAPP.5 state that there are purposes for which the private information is being collected. Principle CSAPP.5 implies a retention time attribute for the retention of private information. Principles CSAPP.3, CSAPP.5, and CSAPP.9 imply that the private information can be disclosed to other parties, giving a disclose-to attribute. Thus, from the CSAPP we derive five attributes of private information collection, namely collector, what, purposes, retention time, and disclose-to.

The Privacy Principles also prescribe certain operational requirements that must be satisfied between provider and consumer, such as identifying purpose and acquiring consent. Our EPPM and the exchange of privacy policies automatically satisfy some of these requirements, namely Principles CSAPP.2, CSAPP.3, and CSAPP.8. The satisfaction of the remaining operational requirements depends on compliance mechanisms (Principles CSAPP.1, CSAPP.4, CSAPP.5, CSAPP.6, CSAPP.9, and CSAPP.10) and security mechanisms (Principle CSAPP.7).
Table 2. CSAPP — The 10 Privacy Principles (Canadian Standards Association)

1. Accountability: An organization is responsible for personal information under its control and shall designate an individual or individuals accountable for the organization's compliance with the privacy principles.
2. Identifying purposes: The purposes for which personal information is collected shall be identified by the organization at or before the time the information is collected.
3. Consent: The knowledge and consent of the individual are required for the collection, use, or disclosure of personal information, except when inappropriate.
4. Limiting collection: The collection of personal information shall be limited to that which is necessary for the purposes identified by the organization. Information shall be collected by fair and lawful means.
5. Limiting use, disclosure, and retention: Personal information shall not be used or disclosed for purposes other than those for which it was collected, except with the consent of the individual or as required by the law. In addition, personal information shall be retained only as long as necessary for fulfillment of those purposes.
6. Accuracy: Personal information shall be as accurate, complete, and up-to-date as is necessary for the purposes for which it is to be used.
7. Safeguards: Security safeguards appropriate to the sensitivity of the information shall be used to protect personal information.
8. Openness: An organization shall make readily available to individuals specific information about its policies and practices relating to the management of personal information.
9. Individual access: Upon request, an individual shall be informed of the existence, use, and disclosure of his or her personal information and shall be given access to that information. An individual shall be able to challenge the accuracy and completeness of the information and have it amended as appropriate.
10. Challenging compliance: An individual shall be able to address a challenge concerning compliance with the above principles to the designated individual or individuals accountable for the organization's compliance.
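One possible rendering of the five attributes derived above is sketched below. It is illustrative only: the chapter's own example policies are expressed in English, nothing in this structure is prescribed by PIPEDA or the CSAPP, and the provider and item names are invented.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class PrivacyRule:
    """One private-information item in a personal privacy policy, using the
    five attributes derived from the CSAPP."""
    collector: str          # which provider may collect the item
    what: str               # the private information itself
    purposes: List[str]     # purposes the consumer consents to
    retention_time: str     # how long the collector may keep it
    disclose_to: List[str]  # parties the item may be shared with

# A minimal personal privacy policy is then a list of such rules.
policy = [
    PrivacyRule(collector="Acme Online Bookstore", what="email address",
                purposes=["order confirmation"], retention_time="1 year",
                disclose_to=[]),
    PrivacyRule(collector="Acme Online Bookstore", what="mailing address",
                purposes=["delivery"], retention_time="6 months",
                disclose_to=["courier"]),
]
for rule in policy:
    print(rule)
```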
European Union (EU)

Privacy in the EU is defined as a human right under Article 8 of the 1950 European Convention of Human Rights and Fundamental Freedoms (ECHR). The implementation of this Article can be traced to The Directive (European Union Directive). Legislation and enforcement structures similar to the European model exist in Canada, Australia, Norway, Switzerland, and Hong Kong.

The Directive applies to all sectors of EU public life, with some exceptions. It specifies the data protection rights afforded to "data subjects," plus the requirements and responsibilities imposed on "data controllers" and, by association, "data processors" (Deitz, 1998). This triad of entities balances data subjects' fundamental rights against the legitimate interests of data controllers (see Figure 2). The Directive places an obligation on member states to ratify national laws implementing its requirements.
Figure 2. A schematic representation of the roles of the three entities defined in the directive
The data subject is a person who can be identified by one or more pieces of data related to his or her physical, physiological, mental, economic, cultural, or social identity. Even data associated with an individual in ambiguous ways may reasonably be deemed personally identifiable information. Following Article 1 of the ECHR, the fundamental right to data protection falls not to the nationality of the data subject, but as an obligation on a party that relies on the data subject (Council of Europe Convention 108). The parties that rely on the data subject are the data controller and, by association, the data processor.

The data controller is an entity that determines the purpose and means of processing personal data and is defined as the holder of ultimate accountability for the correct processing and handling of the information from the data subject. The data processor is an entity that processes personal data on behalf of the data controller. For e-services in this work, the data subject maps to the e-service consumer, and the data controller and data processors together map to the e-service provider, as depicted in gray in Figure 2.

EU privacy principles (Table 3), abstracted from the complexities of legislation, have been developed to simplify compliance with privacy regulations. Analyzing an approach using the principles as a guide offers a fruitful means for determining the effectiveness and pitfalls of the approach. We thus use these principles to derive privacy attributes based on EU privacy legislation. For convenience, we label these principles as EUPP, with EUPP.n meaning Principle n of EUPP.

Looking at Table 3, we note that the EUPP has much in common with the CSAPP. We see that EUPP.2 has "who is processing his personal data and for what purpose," implying the privacy attributes collector (for "who"), what (for "personal data"), and purpose. Purpose is also confirmed by EUPP.3. EUPP.7 talks about exchanging personal data, implying the disclose-to attribute. While retention time is not mentioned explicitly, EUPP.3 mentions "not further processed in a way that is incompatible with those
Table 3. European Union Privacy Principles

1. Reporting the processing: All non-exempt processing must be reported in advance to the National Data Protection Authority.
2. Transparent processing: The data subject must be able to see who is processing his or her personal data and for what purpose. The data controller must keep track of all processing it performs and the data processors, and must make it available to the user.
3. Finality & purpose limitation: Personal data may only be collected for specific, explicit, legitimate purposes and not further processed in a way that is incompatible with those purposes.
4. Lawful basis for data processing: Personal data processing must be based on what is legally specified for the type of data involved, which varies depending on the type of personal data.
5. Data quality: Personal data must be as correct and as accurate as possible. The data controller must allow the citizen to examine and modify all data attributable to that person.
6. Rights: The data subject has the right to improve his or her data as well as the right to raise certain objections regarding the execution of these principles by the data controller.
7. Data traffic outside EU: Exchange of personal data to a country outside the EU is permitted only if that country offers adequate protection. The data controller assures appropriate measures are taken in that locality if possible.
8. Data processor processing: If data processing is outsourced from data controller to processor, controllability must be arranged.
9. Security: Measures are taken to assure secure processing of personal data.
Looking at Table 3, we note that the EUPP has much in common with the CSAPP. We see that EUPP.2 has "who is processing his personal data and for what purpose," implying the privacy attributes collector (for "who"), what (for "personal data"), and purposes. Purposes is also confirmed by EUPP.3. EUPP.7 talks about exchanging personal data, implying the disclose-to attribute. While retention time is not mentioned explicitly, EUPP.3 mentions "not further processed in a way that is incompatible with those purposes," which can be partly enforced using retention time. In addition, EUPP.8 talks about arranging for controllability, which is also supported by retention time. In any case, retention time as a privacy attribute can only help the cause of privacy protection. The remaining aspects of the EUPP refer to operational requirements (e.g., EUPP.1, EUPP.9). We thus conclude that the privacy attributes collector, what, purposes, retention time, and disclose-to are also supported by the EUPP and, moreover, that the EUPP does not introduce any additional attributes.
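To make this mapping explicit, the sketch below (our own illustrative encoding, not part of the chapter or of any standard) records which EUPP principles support each derived privacy attribute, following the reading just given.

```python
# Illustrative only: encodes the reading of the EUPP given in the text,
# mapping each derived privacy attribute to the principles that support it.
EUPP_SUPPORT = {
    "collector":      ["EUPP.2"],            # "who is processing his personal data"
    "what":           ["EUPP.2"],            # the personal data being processed
    "purposes":       ["EUPP.2", "EUPP.3"],  # "for what purpose"; finality & purpose limitation
    "retention time": ["EUPP.3", "EUPP.8"],  # implied only: limits on further processing, controllability
    "disclose-to":    ["EUPP.7"],            # exchange of personal data outside the EU
}
# EUPP.1 and EUPP.9 are operational requirements and introduce no new attributes.

if __name__ == "__main__":
    for attribute, principles in EUPP_SUPPORT.items():
        print(f"{attribute}: supported by {', '.join(principles)}")
```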
United States

In the United States, privacy protection is achieved through a patchwork of legislation at the federal and state levels. Privacy legislation is largely sector-based (Banisar, 1999). At the federal level there are presently more than a dozen privacy laws, among them the Privacy Act of 1974 as amended (5 USC 552a), the Electronic Communications Privacy Act of 1986, and the Right to Financial Privacy Act of 1978. Laws applicable to the private sector include the Family Educational Rights and Privacy Act of 1974, the Privacy Protection Act of 1980, and the Video Privacy Protection Act of 1988.
As can be seen, the laws typically apply to specific technologies or privacy threats, for example, bank records, government databases, or video rental history. The laws serve as operational boundaries rather than requirements, and there is no national, all-encompassing code for privacy protection. As such, United States law is less effective at protecting personal privacy than the legislation of either the European Union or Canada, and the United States is not a leader in privacy protection (Banisar, 1999; Hurley, 1999; Milberg, Burke, Smith, & Kallman, 1995). However, there is one area where privacy is treated very seriously: personal health information, as exemplified by the Health Insurance Portability and Accountability Act (HIPAA) (U.S. Government). We therefore examine the privacy provisions of HIPAA for privacy attributes. The following three paragraphs, quoted from U.S. Government-1, give a good background on HIPAA privacy provisions:

The following overview provides answers to general questions regarding the Standards for Privacy of Individually Identifiable Health Information (the Privacy Rule), promulgated by the Department of Health and Human Services (HHS).

To improve the efficiency and effectiveness of the health care system, the Health Insurance Portability and Accountability Act (HIPAA) of 1996, Public Law 104-191, included 'Administrative Simplification' provisions that required HHS to adopt national standards for electronic health care transactions. At the same time, Congress recognized that advances in electronic technology could erode the privacy of health information. Consequently, Congress incorporated into HIPAA provisions that mandated the adoption of Federal privacy protections for individually identifiable health information.

In response to the HIPAA mandate, HHS published a final regulation in the form of the Privacy Rule in December 2000, which became effective on April 14, 2001. This Rule set national standards for the protection of health information, as applied to the three types of covered entities: health plans, health care clearinghouses, and health care providers who conduct certain health care transactions electronically. By the compliance date of April 14, 2003 (April 14, 2004, for small health plans), covered entities must implement standards to protect and guard against the misuse of individually identifiable health information. Failure to timely implement these standards may, under certain circumstances, trigger the imposition of civil or criminal penalties.

The Privacy Rule is too long to use here (even the summary of the Privacy Rule takes 25 pages). We instead present a summary of HIPAA health information privacy rights from U.S. Government-2 (Table 4). For convenience, we label Table 4 as HIPR and use HIPR.n to mean the n-th right of the HIPR, which comprises eight rights.
Table 4. HIPAA consumer privacy rights

Your Health Information Privacy Rights: Providers and health insurers who are required to follow this law must comply with your rights to...

1. Ask to see and get a copy of your health records. You can ask to see and get a copy of your medical records and other health information. You may not be able to get all of your information in a few special cases. For example, if your doctor decides something in your file might endanger you or someone else, the doctor may not have to give this information to you.
• In most cases, your copies must be given to you within 30 days, but this can be extended for another 30 days if you are given a reason.
• You may have to pay for the cost of copying and mailing if you request copies and mailing.

2. Have corrections added to your health information. You can ask to change any wrong information in your file or add information to your file if it is incomplete. For example, if you and your hospital agree that your file has the wrong result for a test, the hospital must change it. Even if the hospital believes the test result is correct, you still have the right to have your disagreement noted in your file.
• In most cases the file should be changed within 60 days, but the hospital can take an extra 30 days if you are given a reason.

3. Receive a notice that tells you how your health information is used and shared. You can learn how your health information is used and shared by your provider or health insurer. They must give you a notice that tells you how they may use and share your health information and how you can exercise your rights. In most cases, you should get this notice on your first visit to a provider or in the mail from your health insurer, and you can ask for a copy at any time.

4. Decide whether to give your permission before your information can be used or shared for certain purposes. In general, your health information cannot be given to your employer, used or shared for things like sales calls or advertising, or used or shared for many other purposes unless you give your permission by signing an authorization form. This authorization form must tell you who will get your information and what your information will be used for.

5. Get a report on when and why your health information was shared. Under the law, your health information may be used and shared for particular reasons, like making sure doctors give good care, making sure nursing homes are clean and safe, reporting when the flu is in your area, or making required reports to the police, such as reporting gunshot wounds. In many cases, you can ask for and get a list of with whom your health information has been shared for these reasons.
• You can get this report for free once a year.
• In most cases you should get the report within 60 days, but it can take an extra 30 days if you are given a reason.

6. Ask to be reached somewhere other than home. You can make reasonable requests to be contacted at different places or in a different way. For example, you can have the nurse call you at your office instead of your home, or send mail to you in an envelope instead of on a postcard. If sending information to you at home might put you in danger, your health insurer must talk, call, or write to you where you ask and in the way you ask, if the request is reasonable.

7. Ask that your information not be shared. You can ask your provider or health insurer not to share your health information with certain people, groups, or companies. For example, if you go to a clinic, you could ask the doctor not to share your medical record with other doctors or nurses in the clinic. However, they do not have to agree to do what you ask.

8. File complaints. If you believe your information was used or shared in a way that is not allowed under the privacy law, or if you were not able to exercise your rights, you can file a complaint with your provider or health insurer. The privacy notice you receive from them will tell you who to talk to and how to file a complaint. You can also file a complaint with the U.S. Government.
Looking at the HIPR, we note that the HIPR has much in common with the CSAPP and the EUPP. We see that HIPR.3 states, "You can learn how your health information is used and shared by your provider or health insurer." This implies a collector ("by your provider") and a what ("your health information"). HIPR.4 contains "used or shared for many other purposes," implying a purposes attribute. HIPR.7 talks about sharing your health information with other entities, implying a disclose-to. The HIPR does not mention retention time, but the much more detailed Privacy Rule of HIPAA (see above) mentions "retention of records." The remaining aspects of the HIPR refer to operational requirements (e.g., HIPR.1, HIPR.2), some of which are similar to the CSAPP and EUPP (e.g., HIPR.1 and HIPR.2 relate well to EUPP.6, CSAPP.6, and CSAPP.9). We thus conclude that the privacy attributes collector, what, purposes, retention time, and disclose-to are also supported by the HIPR and HIPAA. Moreover, the HIPR does not introduce any additional private information attributes.
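The same exercise can be repeated for the HIPR. The sketch below (again our own illustrative encoding, with names chosen only for this example) records the reading above and checks that it covers exactly the five previously derived attributes.

```python
# Illustrative only: encodes the HIPR reading given in the text and checks that
# it yields exactly the five privacy attributes derived earlier in the chapter.
ATTRIBUTES = {"collector", "what", "purposes", "retention time", "disclose-to"}

HIPR_SUPPORT = {
    "collector":      "HIPR.3",        # "by your provider or health insurer"
    "what":           "HIPR.3",        # "your health information"
    "purposes":       "HIPR.4",        # "used or shared for many other purposes"
    "disclose-to":    "HIPR.7",        # asking that your information not be shared
    "retention time": "Privacy Rule",  # "retention of records" (not in the HIPR itself)
}

if __name__ == "__main__":
    assert set(HIPR_SUPPORT) == ATTRIBUTES, "expected exactly the five derived attributes"
    print("HIPR/HIPAA support the same five attributes; no additional attributes are introduced.")
```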
Personal Privacy Policy Specification

Based on the preceding analysis, the contents of a personal privacy policy should, for each item of private information, identify (a) collector, the party who wishes to collect the information; (b) what, the nature of the information; (c) purposes, the purposes for which the information is being collected; (d) retention time, the amount of time the provider may keep the information; and (e) disclose-to, the parties to whom the information will be disclosed. Privacy policies across different types of e-services (e.g., e-business, e-learning, e-health) are specified using these attributes (see Figure 3) and principally differ from one another in the values of the attributes what and purposes. For example, an e-commerce privacy policy might specify credit card number as what and payment as purposes, whereas an e-learning privacy policy might specify marks as what and student assessment as purposes.

Figure 3 gives three examples of consumer personal privacy policies for use with an e-learning provider, an online bookseller, and an online medical help clinic. The first item in a policy indicates the type of online service for which the policy will be used.

Figure 3. Example of consumer personal privacy policies

E-learning policy
Policy Use: E-learning; Owner: Alice Consumer; Proxy: No; Valid: unlimited
Rule 1 - Collector: Any; What: name, address, tel; Purposes: identification; Retention Time: unlimited; Disclose-To: none
Rule 2 - Collector: Any; What: Course Marks; Purposes: Records; Retention Time: 2 years; Disclose-To: none

Bookseller policy
Policy Use: Bookseller; Owner: Alice Consumer; Proxy: No; Valid: June 2005
Rule 1 - Collector: Any; What: name, address, tel; Purposes: identification; Retention Time: unlimited; Disclose-To: none

Medical help policy
Policy Use: Medical Help; Owner: Alice Consumer; Proxy: No; Valid: July 2005
Rule 1 - Collector: Any; What: name, address, tel; Purposes: contact; Retention Time: unlimited; Disclose-To: pharmacy
Rule 2 - Collector: Dr. A. Smith; What: medical condition; Purposes: treatment; Retention Time: unlimited; Disclose-To: pharmacy
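As a rough illustration of this structure (a sketch only; the class and field names are ours and are not drawn from the chapter or from any particular policy language), the medical help policy of Figure 3 could be captured in code as a header plus a list of privacy rules:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class PrivacyRule:
    # One rule per item of private information: the five derived attributes.
    collector: str
    what: str
    purposes: str
    retention_time: str
    disclose_to: str

@dataclass
class PersonalPrivacyPolicy:
    # Header information: policy use, owner, proxy, valid.
    policy_use: str
    owner: str
    proxy: str
    valid: str
    rules: List[PrivacyRule] = field(default_factory=list)

# Alice's medical help policy from Figure 3.
medical_help = PersonalPrivacyPolicy(
    policy_use="Medical Help",
    owner="Alice Consumer",
    proxy="No",
    valid="July 2005",
    rules=[
        PrivacyRule("Any", "name, address, tel", "contact", "unlimited", "pharmacy"),
        PrivacyRule("Dr. A. Smith", "medical condition", "treatment", "unlimited", "pharmacy"),
    ],
)

if __name__ == "__main__":
    print(medical_help)
```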
Since a privacy policy may change over time, we have a valid field to hold the time period during which the policy is valid. The proxy field holds the name of the proxy if a proxy is employed to provide the information; otherwise, this field has the default value "no." The address and telephone contact of the proxy may be specified as an informational item in the privacy policy itself. These policies need to be expressed in a machine-readable policy language such as APPEL (W3C, 2002), which is XML-based. A personal privacy policy thus consists of "header" information (policy use, owner, proxy, valid) together with 5-tuples, or privacy rules, where each 5-tuple or rule represents an item of private information and the conditions under which the information may be shared:

<collector, what, purposes, retention time, disclose-to>

A personal privacy policy therefore consists of a header plus one or more privacy rules. As we mentioned in the Introduction, this is a minimum personal privacy policy that satisfies the privacy legislation of Canada, the European Union, and the United States (for personal health information); it can also contain additional privacy provisions. Additional provisions could include, for example, (a) an elaborated retention time using conditions such as "6 months unless I call to have it deleted right away," (b) negative purposes such as "not for purposes A or B," (c) negative disclose-to such as "not to be disclosed to persons C or D," and (d) operational items such as requesting a report as in HIPR.5 or requesting different means of contact as in HIPR.6.
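To give a flavor of what a machine-readable rendering might look like, the following sketch serializes the bookseller policy of Figure 3 to XML. The element names are ours and chosen only for illustration; an actual deployment would use a standard policy language such as APPEL rather than this ad hoc schema.

```python
import xml.etree.ElementTree as ET

# Build an illustrative XML rendering of the bookseller policy from Figure 3.
policy = ET.Element(
    "personalPrivacyPolicy",
    use="Bookseller", owner="Alice Consumer", proxy="No", valid="June 2005",
)
rule = ET.SubElement(policy, "privacyRule")
ET.SubElement(rule, "collector").text = "Any"
ET.SubElement(rule, "what").text = "name, address, tel"
ET.SubElement(rule, "purposes").text = "identification"
ET.SubElement(rule, "retentionTime").text = "unlimited"
ET.SubElement(rule, "discloseTo").text = "none"

if __name__ == "__main__":
    print(ET.tostring(policy, encoding="unicode"))
```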
Conclusions and Future Research

We began by introducing our E-Services Privacy Policy Model (EPPM), which describes how personal privacy policies can be used to protect personal privacy in an e-services environment. We then examined privacy principles and rights representative of privacy legislation from Canada, the European Union, and the United States, and derived five attributes of private information collection for use in specifying personal privacy policies, namely collector, what, purposes, retention time, and disclose-to. We showed that these attributes are supported by the privacy legislation of all three geographic regions. Moreover, these attributes are complete in the sense that no other privacy attributes are required by the legislation, although one could add further modifiers to the attributes and include operational items in the policy. In addition, we believe that the five attributes lead to a policy that is very understandable and manageable by the average e-service consumer, which is important if the EPPM is to succeed. In this sense, the possible additional privacy provisions mentioned above would only complicate matters and may not be used by most consumers.

Ideas for future research include: (a) expressing our personal privacy policy using a policy language such as APPEL (W3C, 2002) to investigate ease and practicality of use,
and (b) having consumer volunteers express their privacy preferences using our privacy policy to gauge how well consumers adapt to it and whether or not it meets the requirements for a variety of e-services in different domains. The latter work would explore how personal privacy policies may be used with Web services via their implementation protocols (e.g., UDDI, WSDL, and SOAP).
References

Banisar, D. (1999, September 13). Privacy and data protection around the world. In 21st International Conference on Privacy and Personal Data Protection.
Carnegie Mellon University. (n.d.). Privacy Server Protocol Project. School of Computer Science, Robotics Institute and eCommerce Institute, Internet Systems Laboratory. Retrieved August 13, 2002, from http://yuan.ecom.cmu.edu/psp/
Council of Europe Convention 108. (n.d.). Convention for the Protection of Individuals with Regard to Automatic Processing of Personal Data. Retrieved from http://conventions.coe.int/treaty/EN/Treaties/Html/108.htm
Deitz, L. (1998). Privacy and security — EC's privacy directive: Protecting personal data and ensuring its free movement. Computers and Security Journal, 17(4), 25-46.
Department of Justice. (n.d.). Privacy provisions highlights. Retrieved July 3, 2002, from http://canada.justice.gc.ca/en/news/nr/1998/attback2.html
European Union Directive. (1995). Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the protection of individuals with regard to the processing of personal data and on the free movement of such data. Unofficial text retrieved September 5, 2003, from http://aspe.hhs.gov/datacncl/eudirect.htm
Government of Canada. (n.d.). Personal Information Protection and Electronic Documents Act. Retrieved February 28, 2005, from http://www.privcom.gc.ca/legislation/index_e.asp
IBM. (n.d.). Enterprise Privacy Authorization Language (EPAL). Retrieved March 9, 2005, from http://www.zurich.ibm.com/security/enterprise-privacy/epal/
Milberg, S. J., Burke, S. J., Smith, H. J., & Kallman, E. A. (1995, December). Values, personal information, privacy, and regulatory approaches. Communications of the ACM, 38(12), 65-74.
U.S. Government. (n.d.). Office for Civil Rights — HIPAA: Medical privacy — National standards to protect the privacy of personal health information. Retrieved February 28, 2005, from http://www.hhs.gov/ocr/hipaa/
U.S. Government-1. (n.d.). General overview of standards for privacy of individually identifiable health information. Retrieved February 28, 2005, from http://www.hhs.gov/ocr/hipaa/guidelines/overview.pdf
U.S. Government-2. (n.d.). Your health information privacy rights. Retrieved February 28, 2005, from http://www.hhs.gov/ocr/hipaa/consumer_rights.pdf
W3C. (2002, April 15). A P3P Preference Exchange Language 1.0 (APPEL1.0) (W3C Working Draft). Retrieved August 12, 2002, from http://www.w3.org/TR/P3P-preferences/
W3C. (n.d.). The Platform for Privacy Preferences. Retrieved August 12, 2002, from http://www.w3.org/P3P/
Yee, G., & Korba, L. (2003, January). Bilateral e-services negotiation under uncertainty. In Proceedings of the 2003 International Symposium on Applications and the Internet (SAINT2003), Orlando, Florida, USA.
Yee, G., & Korba, L. (2003, May). The negotiation of privacy policies in distance education. In Proceedings of the Information Resources Management Association International Conference 2003 (IRMA 2003), Philadelphia, Pennsylvania, USA.
Yee, G., & Korba, L. (2004, July). Privacy policy compliance for Web services. In Proceedings of the IEEE International Conference on Web Services (ICWS 2004), San Diego, California, USA.
Yee, G., & Korba, L. (2005, January). Semiautomatic derivation and use of personal privacy policies in e-business. International Journal of E-Business Research, 1(1), 54-69.
Yee, G., & Korba, L. (2005, March). An agent architecture for e-services privacy policy compliance. In Proceedings of the IEEE 19th International Conference on Advanced Information Networking and Applications (AINA 2005), Tamkang University, Taiwan.
Yee, G., & Korba, L. (2005, May). Comparing and matching privacy policies using community consensus. In Proceedings of the Information Resources Management Association International Conference 2005 (IRMA 2005), San Diego, California, USA.
Endnote

1. NRC Paper Number: NRC 48270
About the Authors
George Yee is a senior research officer in the Information Security Group, Institute for Information Technology, National Research Council Canada (NRC), Canada. Prior to joining the NRC in late 2001, he spent over 20 years at Bell-Northern Research and Nortel Networks. George received his PhD (electrical engineering), Master of Science (systems and information science), and Bachelor of Science (mathematics) from Carleton University, Ottawa, Canada, where he is now an adjunct research professor and teaches a graduate course on security and privacy for e-services. Dr. Yee is on the editorial review board of several journals including International Journal of Distance Education Technologies and International Journal of E-Business Research. He is a senior member of IEEE and member of ACM and Professional Engineers Ontario. His research interests include security and privacy for e-services, enhancing reliability, security, and privacy using software agents, and engineering software for reliability, security, and performance.

* * * * *
Carlisle Adams is an associate professor in the School of Information Technology and Engineering (SITE) at the University of Ottawa, Canada. Prior to his academic appointment in 2003, he worked for 13 years in industry (Nortel, Entrust) in the design and standardization of a variety of cryptographic and security technologies for the Internet. His research and technical contributions include the CAST family of symmetric encryption algorithms, secure protocols for authentication and management in public key infrastructure (PKI) environments, a comprehensive architecture and policy language for access control in electronic networks, the design of efficient mechanisms to assess and constrain the trust that must be placed in unknown peers in a network, and the creation of effective techniques to preserve and enhance privacy on the Internet. Dr. Adams is co-author of Understanding PKI: Concepts, Standards, and Deployment Considerations, Second Edition (Addison-Wesley, 2003). Katerine Barbieri obtained an undergraduate degree in computer science (information management systems option) and a Master’s degree in computer science from the University of Ottawa, Canada. As a graduate student, she researched Internet security and privacy topics with a focus on privacy enforcement technologies. Katerine has worked in the IT Security field for several years in both government and private sector organizations. She has worked on projects in many areas, including network security, intrusion detection, incident response, security product evaluation, and database and software design/development, among others. She has also taught courses in many of these areas. Scott Buffett is a research officer in the Internet Logic group at the National Research Council of Canada’s Institute for Information Technology — E-Business, Canada, and an honorary research associate in the Faculty of Computer Science at the University of New Brunswick. He received his PhD in computer science from the University of New Brunswick in 2004. After lecturing at UNB for five years and winning the Faculty of Computer Science Award for Excellence in Teaching in 2001, Scott joined the National Research Council in 2002 to conduct research on automated purchasing procedures in e-commerce. His interests include artificial intelligence, e-markets, auctions, decision analysis, automated negotiation, privacy, preference elicitation, intelligent agents, and multi-agent systems. Barbara Carminati has a position as an assistant professor of computer science at the University of Insubria at Como, Italy. Barbara Carminati received an MS degree in computer sciences in 2000, and a PhD in computer science from the University of Milano in 2004. Her main research interests include database and Web security, XML, secure information dissemination, and publishing. She is also a lecturer at the Computer Science School of the University of Milano and University of Insubria at Como, and she has given industrial courses on topics such as database systems and security. She has also served as program committee member of several international conferences and workshops.
Babak Esfandiari is an assistant professor at Carleton University, Canada. He obtained his PhD in computer science in 1998 (University of Montpellier, France) and then worked for two years at Mitel Corporation as a software engineer before joining Carleton in 2000. His research interests include agent technology, network computing, and object-oriented design. Elena Ferrari is a full professor of computer science at the University of Insubria, Como, Italy. She received the MS degree in computer science from the University of Milano (Italy) in 1992. In 1998, she received a PhD in computer science from the same university. Her research activities are related to various aspects of data management systems, including Web security, access control and privacy, multimedia databases, and temporal databases. On these topics she has published more than 100 scientific publications in international journals and conference proceedings. Dr. Ferrari has served as program chair of the 4th ACM Symposium on Access Control Models and Technologies (SACMAT'04), software demonstration chair of the 9th International Conference on Extending Database Technology (EDBT'04), co-chair of the first COMPSAC'02 Workshop on Web Security and Semantic Web, the first ECOOP Workshop on XML and Object Technology (XOT 2000), and the first ECOOP Workshop on Object-Oriented Databases. She has also served as program committee member of several international conferences and workshops. Professor Ferrari is on the editorial board of the VLDB Journal and the International Journal of Information Technology (IJIT). She is a member of the ACM and senior member of IEEE. Scott Flinn is a research officer in the Human Web group at the National Research Council of Canada's Institute for Information Technology — E-Business, Canada, and an honorary research associate in the Faculty of Computer Science at the University of New Brunswick. After receiving an M.Math degree from the University of Waterloo in 1990, Scott spent several years at the University of British Columbia studying human factors of information retrieval before entering industry to design and build enterprise-scale certificate authority products for Xcert International and RSA Security Inc. Since joining the National Research Council in 2002, Scott has combined his interests in HCI, information management, and information security in an effort to help individuals use complex information systems safely, securely, and confidently. Maria Yin Ling Fung completed a Bachelor of Commerce degree in management and information systems in 2000 and a Master of Commerce degree with first class honors in information systems in 2004 at the University of Auckland, New Zealand. Her research interests are in the area of e-healthcare privacy protection and e-government and e-local government strategies in New Zealand. Her current research is the evaluation of local body Web sites with a focus on local body elections. She is also the administration/finance manager of the Bioengineering Institute at the University of Auckland.
Patrick C. K. Hung is an assistant professor at the Faculty of Business and Information Technology in a newly established university in Canada, the University of Ontario Institute of Technology (UOIT), Canada. He has worked as a visiting assistant professor at the Department of Computer Science in the Hong Kong University of Science and Technology (HKUST) in Hong Kong and as a research scientist with the Commonwealth Scientific and Industrial Research Organization (CSIRO) in Australia. He also has prior industrial experience in e-business projects in the USA, Canada, and Hong Kong. He is also an editorial board member of several international journals such as the International Journal of Web Services Research (JWSR) and the International Journal of Business Process Integration Management (IJBPIM). He has served as the co-chair of the Ninth International IEEE EDOC Conference (EDOC 2005): The Enterprise Computing Conference. His research interests include Web services security and privacy, business process integration, electronic negotiation, and contracting. Larry Korba is a principal research officer with the National Research Council of Canada, Canada. He is the leader of the Information Security Group in the Institute for Information Technology (http://www.iit-iti.nrc-cnrc.gc.ca/) and is involved in the research and development of security and privacy enhancing technologies for applications ranging from gaming to ad hoc wireless systems. Martine C. Ménard is the network administrator for the Policy Research Initiative (PRI), Canada. She focuses on researching and implementing new security measures or reviewing existing ones. Prior to the PRI, she was an IT consultant with Adaptek Systems Inc. She obtained her Bachelor of Civil Engineering from Lakehead University in Thunder Bay, Ontario, Canada, and her Master of Science (information systems) from Athabasca University in Alberta, Canada. John Paynter has a Bachelor of Science in biology, a Master of Science in zoology, and a Bachelor of Commerce in operations research and econometrics. He teaches software engineering at the University of Auckland's Business School, New Zealand. His research interests are in usability and software engineering ethics. John is a keen Master's athlete when he has time to spare from his farm-forest operations. Osama Shata is currently a consultant with the Specialized Engineering Office, Egypt. Since 1986, Dr. Shata has held positions at several international academic institutions and private sector companies. He has developed technical and service operations, provided consultation and lectured internationally, and has an active publication record. His initial interests focused on artificial intelligence and databases. As the World Wide Web grew, so did Dr. Shata's interest in data-based Web systems and distance education. Currently he focuses on the application of information systems development tools and methodologies
to the implementation of Web-based learning environments and related areas such as models for electronic course delivery and online privacy and security. Dr. Shata has also managed consulting services and IT solutions for companies in the areas of security advice, privacy assessments, products, and training. Dr. Shata was educated for his postgraduate studies at UWC and UWCC, earning his Master's and PhD degrees in computer science, and he is a frequent speaker. Ed Simon has been an ardent advocate and implementer of XML since 1997 and is co-author of both the XML Signature and XML Encryption specifications. Today he provides training and consulting services in the area of XML security through his company XMLsec, Canada (see http://www.xmlsec.com/). Prior to starting XMLsec, he served as Entrust's XML security architect, explored new online information technologies at IBM, and developed biomedical research software at the University of Calgary's Faculty of Medicine. He holds a Master of Engineering degree from the University of Alberta. Ronggong Song is an associate research officer in the Information Security Group, Institute for Information Technology, National Research Council Canada (NRC), Canada. He was employed as a network planning engineer at the Telecommunication Planning Research Institute of MII, P.R. China in 1999, and as a postdoctoral fellow at the University of Ottawa in 2000. He received his Bachelor of Science degree in mathematics in 1992, Master of Engineering degree in computer science in 1996, and PhD in information security from Beijing University of Posts and Telecommunications in 1999. His research interests include information security, privacy protection, and trust management, including network security, e-commerce, and agent-based security applications. Michael Weiss is an assistant professor at Carleton University, Canada, which he joined in 2000 after spending five years in industry following his PhD in computer science in 1993 (University of Mannheim, Germany). In particular, he led the Advanced Applications group within the Strategic Technology group of Mitel Corporation. His research interests include Web services, software architecture and patterns, business model design, and open source.
Index
A
B
A P3P Preference Exchange Language (APPEL) 176 access control 80 access control mechanisms 34 accountability, pseudonym technology 142 action 211 active attacks 87 adware 131 AHIMA (American Health Information Management Association) 58 anonymity 119 anonymity, pseudonym technology 142 anti-virus software 117 APIs (application programming interfaces) 177 assertion 227 AT&T Privacy Bird 3 attribute assertion 227 audit trails 78 authentication 227 authentication assertion 227 authentication decision assertion 227 authenticity 209 authorization decision request 211, 213 authorization decision response 222, 224
backdoors 126 biometric systems 79 biometrics 84 blind signature 149 Bugnosis 16
C CA (certification authority) 144 Canadian Standards Association (CSA) 4, 236 careless disclosure safeguards 81 CIA (confidentiality, integrity, and availability) 121 Code of Fair Information Practices 4 Common Criteria 194 Computer Security Institute (CSI) 67 CONFAB 239 confidentiality 209 consent 229 control 2 cookies 9, 123 credential 150 cryptography 83, 209 cryptography, XML 209
CSA Model Code 2 cyberspace 142
D DAC (discretionary access control) 81 DAFMAT (Dynamic Authorization Framework for Multiple Authorization Types) 180 DAML-S 40 data aggregation 6 data controllers 257, 286 data processor 257, 286 data profiling 6 data subject 257, 287 database management systems (DBMS) 46 database security 81 decision process 2 decryption 148 decryption keys 35 decryptor 157 demilitarized zone (DMZ) 49 digital pseudonym 143 digital rights management (DRM) 10, 192, 239 digital signature 209 discretionary access control (DAC) 81 domain-type enforcement (DTE) 180 duty of confidentiality 2
E e-cash (electronic cash) 153 e-coin 161 e-health database application 32 E-Marketplace UDDI Registry 50 e-services 204, 235, 281, 282 e-ticket 144, 155 e-voting 144, 156 e-wallet 144 ECHR 286 electronic cash (e-cash) 153 electronic commerce 3 electronic communication services 61 encryption 76, 209 encryption algorithms 35 enterprise applications 204
Enterprise Privacy Authorization Language (EPAL) 15, 38, 176 environment 211 environmental threats 124 Europe Union Data Protection Directive 30 European Convention of Human Rights and Fundamental Freedoms (ECHR) 286 European Privacy Directive (95/46/EC) 4 exposure 2 eXtensible Access Control Markup Language (XACML) 177, 204, 213 eXtensible Markup Language (XML) 206, 229, 235 eXtensible Stylesheet Language Transformation (XSLT) 185
F feature interactions 266 Federal Trade Commission (FTC) 30 Firewalls 49, 78, 83, 117
G Goal-oriented Requirements Language (GRL) 269 GUI 248
H hackers 67, 116 hash functions 36 health information 57 Health Insurance Portability and Accountability Act (HIPAA) 30, 145, 289 Healthcare Finance Administration (HCFA) 74 healthcare industry 57 healthcare privacy 57 healthcare professionals 57 Hippocratic database 179 hird-party cookie 9 human computer interaction (HCI) 14 human errors 66 human threats 124
I
O
IBM’s Enterprise Privacy Architecture (EPA) 239 identity and access management 205, 222 identity based encryption (IBE) 181 incentives 2 information privacy 30 information technology 57 informed consent 6 integrity 209 inter-organizational privacy enforcement engine (IOPP) 187 Internal Enterprise Application UDDI Registry 49 Internet 57 Internet privacy 3 intrusion detection monitoring 78 intrusion detection systems 133 IPsecurity (IPSec) 77
obligation 211 omnivore 17 operating system 129, 133 Organization for the Advancement of Structured Information Standards (OASIS) 196 organizational policies 236 organizational privacy enforcement engine (OPEE) 187 organizational privacy policy (OPP) 189 OWL Web Ontology Language 40
J jurisdiction 231
L legislation 231 Liberty Alliance ID-Web Services Framework 239 local privacy enforcement engine (LPEE) 187 local privacy policy (LPP) 188
M malacious software 124 malware 122 management infrastructure 240 medical record 57, 204 multi-attribute utility theory 4 multi-issue automated negotiation 4
N namespaces 207 natural threats 124 negotiation strategy 4 non-functional requirements 268
P P3P (see Platform for Privacy Preferences Project) Partner Catalog UDDI Registry 50 passive attacks 87 password management 80 patient databases 62 patients 56, 57 penetration testing 135 personal digital assistants (PDAs) 87 Personal Health Information Protection Act (PHIPA) of 2004 30 personal information (PI) 242 Personal Information Protection and Electronic Documents Act (PIPEDA) 30, 145, 184, 282 personal privacy policies 181 personally identifiable information (PII) 186 personally identifiable information tracking data 256 phishing 121, 126, 135 physicians 57 Platform for Privacy Preferences Project (P3P) 3, 36, 173, 231, 277 policies 204 policy decision point (PDP) 205 policy definition languages 173 policy enforcement point (PEP) 205 policy information points (PIPs) 205 policy sets 213 Portal UDDI Registry 49 privacy 30, 115
privacy (impact) assessment 267 Privacy Act of 1974 30 privacy architecture 234 Privacy Commissioner 12 privacy enforcement 173, 182 privacy enhancing technologies (PETs) 173 privacy impact assessment (PIA) 15 Privacy Incorporated Software Agents project (PISA) 14 privacy management 2 privacy policies 3, 30, 204 privacy policy architecture (PPA) 186 privacy policy compliance system (PPCS) 249 privacy principles 239, 285 privacy protection 142 privacy protection principles 267 privacy requirements 244 privacy rights management (PRM) 10, 255 privacy risks 2 Privacy Seal programs 194 privacy-aware role-based access control (PARBAC) 180 PrivacyPact protocol 22 Private Credentials 144, 158 private information contracts 18 profiles 118 Protected Health Information (PHI) 30 pseudonym 142 pseudonym system 143 pseudonym technology 141 pseudonym technology, accountability 142 pseudonym technology, anonymity 142 pseudonym-credentials 142 pseudonymity 146, 203, 227 pseudonymous identifiers 227 public key infrastructure system (PKI) 158
R receiver 2 reputation 7 reputation network 8 reputation system 8 resource 211 right of privacy 2
risk identification 6 risk management 2 risks 2 role 204 role-based access control (RBAC) 180
S safeguards 116 schemas 208 secure socket layer (SSL) 77 security 116 Security Assertion Markup Language (SAML) 204 security layers 129 sender 2 service-oriented architecture (SOA) 204 signature, digital 209 single sign-on 268 smart cards 79 SOAP 29, 229, 235 social engineering 81 spyware 116 subject 211 subject role 204
T target 211 technology 120 telemedicine 61 third party (TTP) 181 threat risk analysis (TRA) 15 threats, environmental 124 threats, human 124 threats, natural 124 transaction logs 257 Trojan horses 116 trust mark 8 TRUSTe 8, 194
U UDDI Business Registry 50 unforgeability 147 uninterruptible power supplies (UPS) 125 Universal Description, Discovery and Integration (UDDI) 29
unlinkability 147 user satisfaction 268 utility elicitation 4
V valuation 2 Virtual private networks (VPNs) 77 virus 116 vulnerability scanning 133
W W3C (see World Wide Web Consortium) Web bugs 16 Web privacy enforcement engine (WPEE) 187 Web privacy policy (WPP) 189 Web services 204, 235 Web services architecture (WSA) 29 Web services architecture requirements 29
Web Services Description Language (WSDL) 29, 235 Web Services Discovery Agencies 42 wireless LAN (WLAN) 87 World Wide Web Consortium (W3C) 29
X XACML (see also eXtensible Access Control Markup Language) 177, 204, 213 XACML context handler 222 XACML policies 231 XML (see also eXtensible Markup Language) 206, 229, 235 XML-based security standards 204 XML cryptography 209 XSLT (see also eXtensible Stylesheet Language Transformation) 185