FEATURES
Preventing Piracy While Preserving Privacy 16 by Michael O. Rabin and Dennis E. Shasha
The security approach presented here is a privacy-preserving, flexible, antipiracy solution that does not suffer from “Break Once, Run Everywhere.”
Reestablishing Trust in the Web 28 by Amir Herzberg and Ahmad Jbara
The TrustBar browser extension provides improved security, identification, and trust indicators.
Extended Visual Cryptography Schemes 36 by Daniel Stoleru
Visual cryptography is a graphical form of information concealing.
Inside the SmartDongle USB Security Key 40 by Joel Gyllenskog
Joel lifts the hood on his USB security key.
Developing JSR-168 Portlets 44 by Ted O’Connor and Martin Snyder
The JSR-168 portlet specification defines APIs for building applications viewed inside portal frameworks.
The Eclipse Test and Performance Tools Platform 48 by Andy Kaylor
The Eclipse Test and Performance Tools Platform provides open Standards for interoperability.
The Mac’s Move to Intel 52 by Tom Thompson
Steve Jobs dropped a bombshell when he told software developers that the Macintosh will switch from PowerPC to Intel x86 processors.
Calling C Library DLLs from C# 58 by Shah Datardina
Need to utilize legacy software? Here are techniques for calling unmanaged code written in C from C#.
Removing Memory Errors from 64-Bit Platforms 63 by Rich Newman
It’s crucial to address potential memory errors before porting to 64-bit platforms.
Pointer Containers 68 by Thorsten Ottosen
Smart containers are useful and safe utilities that can lead to flawless object-oriented programming.
EMBEDDED SYSTEMS PROGRAMMING
Using Hardware Trace for Performance Analysis 71 by Michael Lindahl
Michael examines embedded-systems performance-analysis techniques, and discusses some of their inherent limitations.
COLUMNS
Programming Paradigms 75 by Michael Swaine
Ringtones are where the money is — for now anyway.
Embedded Space 78 by Ed Nisley
Large, complex embedded systems have more places for things to go wrong.
Chaos Manor 82 by Jerry Pournelle
Jerry looks back when inventing the future, and looks forward to the world of 64-bit computing.
Programmer's Bookshelf 85 by Michelle Levesque
Michelle examines Greg Wilson's Data Crunching: Solving Everyday Problems Using Java, Python, and More.
FORUM
EDITORIAL 8 by Jonathan Erickson
LETTERS 10 by you
DR. ECCO'S OMNIHEURIST CORNER 12 by Dennis E. Shasha
NEWS & VIEWS 14 by DDJ Staff
PRAGMATIC EXCEPTIONS 26 by Benjamin Booth
OF INTEREST 87 by DDJ Staff
SWAINE'S FLAMES 88 by Michael Swaine
NEXT MONTH: In November, we're all over the place when we cover distributed computing.
DR. DOBB'S ONLINE CONTENTS
ONLINE EXCLUSIVES
http://www.ddj.com/exclusives/
THE NEWS SHOW
http://thenewsshow.tv/
.NET—The Decompiler Will Get You
How can you avoid opening up your intellectual property to intruders?
Dual-Core Duel
Reverse Engineering: John Blattner on analyzing real-time embedded systems.
Windows Vista: The Developer Perspective. John Montgomery on what features Windows Vista offers developers.
WINDOWS/.NET
http://www.ddj.com/topics/windows/
The Seven Touchpoints of Secure Software: Security must be built in throughout the development lifecycle.
Windows Security: Investigating software and source-code theft.
Windows/.NET Q&A: How do you create or modify strings composed of several strings?
DOTNETJUNKIES
http://www.dotnetjunkies.com/
Selecting, Confirming, & Deleting Multiple Checkbox Items: Here's how to select and delete across pages in single batch deletes.
Web Hosting for ASP.NET 2.0 Beta 2: Microsoft-sanctioned web hosting providers that deliver beta services.
BYTE.COM
http://www.byte.com/
Media Lab: Autodesk, Adobe, and the art of the interface.
Developing for Cell Phones: Examining the Qualcomm BREW SDK, and exploring ringtone conversion.
THE PERL JOURNAL
http://www.tpj.com/
A Music Player Remote Control in Perl/Tk: This search facility hooks into WinAmp to automatically play songs.
Managing Documents Using a SOAP::Lite Daemon: This document-management system decouples interfaces and back-end logic.
RESOURCE CENTER
As a service to our readers, source code, related files, and author guidelines are available at http://www.ddj.com/. Letters to the editor, article proposals and submissions, and inquiries should be sent to [email protected]. For subscription questions, call 800-456-1215 (U.S. or Canada). For all other countries, call 902-563-4753 or fax 902-563-4807. E-mail subscription questions to [email protected], or write to Dr. Dobb's Journal, P.O. Box 56188, Boulder, CO 80322-6188. If you want to change the information you receive from CMP and others about products and services, go to http://www.cmp.com/feedback/permission.html or contact Customer Service at Dr. Dobb's Journal, P.O. Box 56188, Boulder, CO 80322-6188. Back issues may be purchased prepaid for $9.00 per copy (which includes shipping and handling). For issue availability, send e-mail to [email protected], fax to 785-838-7566, or call 800-444-4881 (U.S. and Canada) or 785-838-7500 (all other countries). Please send payment to Dr. Dobb's Journal, 4601 West 6th Street, Suite B, Lawrence, KS 66049-4189. Digital versions of back issues and individual articles can be purchased electronically at http://www.ddj.com/.
WEB SITE ACCOUNT ACTIVATION
Dr. Dobb's Journal subscriptions include full access to the CMP Developer Network web sites. To activate your account, register at http://www.ddj.com/registration/ using the web ALL ACCESS subscriber code located on your mailing label.
EDITORIAL MANAGING EDITOR Deirdre Blake SENIOR PRODUCTION EDITOR Monica E. Berg ASSOCIATE EDITOR Della Wyser ART DIRECTOR Margaret A. Anderson SENIOR CONTRIBUTING EDITOR Al Stevens CONTRIBUTING EDITORS Bruce Schneier, Ray Duncan, Jack Woehr, Jon Bentley, Tim Kientzle, Gregory V. Wilson, Mark Nelson, Ed Nisley, Jerry Pournelle, Dennis E. Shasha EDITOR-AT-LARGE Michael Swaine PRODUCTION MANAGER Stephanie Fung INTERNET OPERATIONS DIRECTOR Michael Calderon SENIOR WEB DEVELOPER Steve Goyette WEBMASTERS Sean Coady, Joe Lucca AUDIENCE DEVELOPMENT AUDIENCE DEVELOPMENT DIRECTOR Kevin Regan AUDIENCE DEVELOPMENT MANAGER Karina Medina AUDIENCE DEVELOPMENT ASSISTANT MANAGER Shomari Hines AUDIENCE DEVELOPMENT ASSISTANT Melani Benedetto-Valente MARKETING/ADVERTISING ASSOCIATE PUBLISHER Will Wise SENIOR MANAGERS, MEDIA PROGRAMS see page 86 Pauline Beall, Michael Beasley, Cassandra Clark, Ron Cordek, Mike Kelleher, Andrew Mintz MARKETING DIRECTOR Jessica Marty SENIOR ART DIRECTOR OF MARKETING Carey Perez DR. DOBB’S JOURNAL 2800 Campus Drive, San Mateo, CA 94403 650-513-4300. http://www.ddj.com/ CMP MEDIA LLC Gary Marshall President and CEO John Day Executive Vice President and CFO Steve Weitzner Executive Vice President and COO Jeff Patterson Executive Vice President, Corporate Sales & Marketing Leah Landro Executive Vice President, Human Resources Mike Mikos Chief Information Officer Bill Amstutz Senior Vice President, Operations Sandra Grayson Senior Vice President and General Counsel Alexandra Raine Senior Vice President, Communications Kate Spellman Senior Vice President, Corporate Marketing Mike Azzara Vice President, Group Director of Internet Business Robert Faletra President, Channel Group Tony Keefe President, CMP Entertainment Media Vicki Masseria President, CMP Healthcare Media Philip Chapnick Vice President, Group Publisher Applied Technologies Paul Miller Vice President, Group Publisher Electronics Fritz Nelson Vice President, Group Publisher Network Computing Enterprise Architecture Group Peter Westerman Vice President, Group Publisher Software Development Media Joseph Braue Vice President, Director of Custom Integrated Marketing Solutions Shannon Aronson Corporate Director, Audience Development Michael Zane Corporate Director, Audience Development Marie Myers Corporate Director, Publishing Services
American Business Press
EDITORIAL
Salary Surveys & Programmer Pay
Arnold Schwarzenegger, erstwhile movie star and current governor of California, has been giving me headaches. For starters, I've been diagnosed as suffering from PAS, short for "Post Arnold Syndrome," after downloading the Governator ringtone and seeing both Kindergarten Cop and Conan the Barbarian on the same day. Symptoms include nausea, nightmares, and an uncontrollable urge to pump serious iron. But the real grief Schwarzenegger has brought home is the revelation that he's been earning more than $1 million a year moonlighting as a magazine editor. When this news broke, the Dr. Dobb's Journal editorial staff descended on my office, with the equal-pay advocates jostling for head of the line. Not to be left out, Margaret Anderson tried to change her job title from "Art Director" to "Art Editor," while Senior Production Editor Monica Berg, who grew up in Germany, made her case because she sounds like Schwarzenegger. Jeez, that'll teach me to wander into the office.
However, the good news is that U.S. compensation for software developers seems to be on the rise, at least according to a recent survey conducted by Foote Partners (http://www.footepartners.com/). Granted, most programmers (or editors) won't be breathing that rarified Schwarzenegger ether anytime soon. Still, in the first six months of 2005, pay for noncertified skills inched up 2.1 percent for application developers; 4.3 percent for database developers/administrators; 5.1 percent for networking/internetworking; and 8.2 percent for operating systems experts. For categories in certified tech skills, salaries were up 3.8 percent for web/e-commerce development; 2.3 percent for application development/programming languages; and 0.7 percent for database administration.
In Foote Partners parlance, skills-related pay is typically paid as cash bonuses or embedded in base salary as an adjustment for the presence of a dominant vendor or technology skill critical to the job. For example, the salary for an Oracle database administrator, Linux systems administrator, or .NET developer can be different than what employers might provide for generic "systems administrator," "programmer," and "developer" job titles. According to Foote Partners head of research David Foote, the redefinition of IT jobs is currently so pervasive that traditional job titles are becoming increasingly meaningless. Instead of overhauling job titles, employers are finding it easier to differentiate workers with the same job titles by recognizing technical skills fundamental to their jobs, putting a market value on those skills, and adjusting base pay accordingly.
So what's going on? Are employers suddenly opting for kinder and gentler employment practices? Hardly. For one thing, Foote sees a return to hiring as the economy has strengthened. For another, many companies were stung by botched offshore outsourcing projects, particularly when they failed to keep key people who had both technical skills and an understanding of the business and industry. Consequently, says Foote, companies are trying to do a better job of hiring and retaining talent with specific technical skills and business and industry experience — and reinvesting in onshore application development. He adds that as Sarbanes-Oxley compliance-related work tapers off at many companies, the need for complex combinations of industry knowledge and technical skills is rising. "The shift is on innovation and new products," Foote says.
The survey, which queried 50,000 IT professionals, found that the hottest noncertified skills (that is, those that exhibited 25 percent or more growth in skills pay over the last 12 months) focused on SQL Server, WebSphere, Active Server Pages, SQL Windows, and .NET. Likewise, the highest paying noncertified skills involved project-level security, RAD/Extreme Programming, VoIP, Gigabit Ethernet, IBM WebSphere, Oracle database and applications, and SQL Windows.
The Foote Partners findings are more or less confirmed by Information Week's 2005 salary survey conducted earlier this year. (In the spirit of disclosure, Information Week is published by CMP Media, which also publishes Dr. Dobb's Journal.) Information Week (http://www.informationweek.com/) found that the highest salaries were commanded by web-security experts, followed by wireless infrastructure personnel. However, in the Information Week survey, which queried more than 12,000 IT professionals, salaries for application developers were more or less flat compared to 2004, although networking jobs were paying slightly more.
Within the next couple of months, Software Development magazine (another of DDJ's sister publications; http://www.sdmagazine.com/) plans on publishing the results of its annual developer salary survey. It will be interesting to see how that matches up with Foote Partners and Information Week. Until then, you can find me either lifting weights at the gym or in my neighborhood movie theater, setting a personal record of seeing Hercules in New York, Terminator 2: Judgment Day, and Conan the Destroyer — all in the same day.
LETTERS

Duff's Device
Dear DDJ,
Reading Ralf Holly's article "A Reusable Duff Device" (DDJ, August 2005) brought back memories of doing much the same thing around 1981. I was working on a proprietary system that had a limited instruction set and functionality. It's been many years, but my solution for unrolling a loop was to use a series of reentrant calls, with an instruction such as an output to an I/O device executed just before the primary return. Calls queued up like that allowed me to do 2, 4, 8, 16, and so on executions of the instruction that needed to be repeated. As I recall, we did not have an adder or increment instruction, so looping was a bit of a problem.
Gary G. Little
[email protected]

Optimal Queens
Dear DDJ,
I enjoyed the article "Optimal Queens" by Timothy Rolfe (DDJ, May 2005), but it seems like a good time for a food fight. I made a few mods to Timothy's Queens.c code to adapt it for a Macintosh CodeWarrior C console program that automates testing using each of the four optimizations listed below for 4–18 queens. Timothy must enjoy sitting around waiting for each test to finish so he can enter the next set of n-queens and optimization parameters [24 times for n-queens 12–18], then wait again. I globalized a few vars, paired allocs with frees inside nested for loops, and let the program handle the permutations. Because times start exceeding one second at 13–14 queens, my times can be directly compared with Timothy's without too much picking of nits or other amusements.
His Dell desktop computer with a 2-GHz Pentium 4, OS not specified: No Optimization, 20:59:51; Wirth's Validity Check, 08:51:50; Permutation Vector, 03:52:36; both optimizations, 01:54:54; Total Time 35:39:11 (128,351 seconds).
My Macintosh 2003 MDD Dual 1.25-GHz G4, OS X 10.3.9: No Optimization, 22:34:59; Wirth's Validity Check, 08:06:42; Permutation Vector, 03:49:08; both optimizations, 01:53:54; Total Time 36:24:45 (131,085 seconds). The Mac at ~63 percent MHz gets ~98 percent of the performance.
F.C. Kuechmann
[email protected]
Editor's Note: F.C.'s modified Queens.c code is available electronically; see "Resource Center," page 4.

Licensing Again
Dear DDJ,
Thanks to Jim Wiggins for his detailed and interesting note ("Letters," DDJ, March 2005). I do not object to licensing software professionals in one area: embedded computing. Many of the cases that are cited below are exactly that kind of development, and Ed Nisley does a fine job every month of describing exactly how different a world that is. Licensing for "embedded space" would need to include EE training as well as computer science, and some mechanical engineering and materials science couldn't hurt. I think this kind of training and education is well beyond that of the typical software engineer.
In fact, while I don't know about automotive engineering, much of the rest of the embedded industry does have special requirements levied on the way they do software. Some of the strongest requirements are imposed upon nuclear power stations, and I have some experience with those. They cannot use C++, Ada, Java, and there are restrictions on the use of C. (I can provide references but don't have the time right now.) Why? Because to assure predictability of the software, dynamic storage allocation is disallowed. There are also strictly technical benefits, such as it being easier to burn code into PROMs and stuff, and know what box it is in. Software and its algorithms must fit into a structured reliability discipline, and the reliability engineer makes the call on whether a change of means or algorithm is acceptable.
Indeed, this kind of really rigorous structure is absent even in NASA's work, and is certainly missing in most defense and FAA aerospace development. They need to get things to work, yes, but the reliability element is often hit or miss because (at the systems level at prime) contractors, quality control folks, and testers have little clout — [they are] seen as impediments to the company getting paid. FAA's government people have a strong hand, but they can get burnt if the company has good political ties. Nevertheless, NASA and FAA demand detailed, written specifications and formal
test plans and procedures, all under ruthless change control. Defense is supposed to work that way, but it depends upon the program and the character of the SPO running it. A lot of this is criticized in Congress and in the trade press, lamenting how long it takes them to get anything done in comparison to their apparently fleet-footed commercial brethren, along with much lathered-on opinion about the superiority of free-market versus government-run programs. I strongly suspect they are swift because they can afford to not be rigorous.
Could things be done smarter? Of course, they can be: Consider the Shuttle program versus the Mars Rovers. Alas, that's pitting NASA mainstream against Caltech-JPL. And don't mention SpaceShipOne: As admirable as that project is, and as supportive of that effort as I am, its scope is far more limited than what NASA must do and it builds upon a lot of work originally paid for by government. When someone needs to launch a satellite as part of a tsunami-warning network for the Indian Ocean, Scaled Composites can't do it — not soon, anyway.
There may be other niches in software development that admit comparable discipline. I should think that software running securities trading and monetary exchange arbitrage must be of necessity right on, considering the amounts of money that can be lost in a mistake. In principle, there ought to be a high-reliability version of Windows or at least Windows NT out there. There's a question of how to pay for it, however. High-reliability Linux? There ought to be that, too, and perhaps there will be. It's being used in quasi-embedded situations more and more.
As negative as I might sound, there are successes to tout. Relational database systems sell themselves to their customers primarily because of their design for data safety, reliability, and ability to recover from all kinds of misfortune, man- and nature-made.
I hope I conveyed what I think to be a collective frustration on the part of software users of all kinds with how long it takes to do anything in software. Some of that is, as I tried to express, part of the nature of the beast because it demands being very precise about things people normally aren't and don't have to be. But some of it is limitations of our own technology and smarts, stuff that, apart from important hardware assists, really hasn't changed since 1980. I simply do not see how licensing will get us to fix that.
Jan Galkowski
[email protected]
DDJ
DR. ECCO’S OMNIHEURIST CORNER
Calculation in the Narrows
Dennis E. Shasha
Ecco was invited to a submarine base, where he heard about a new naval computational technique called "parallel local computation" (PALC). He was told it was very useful for high-security applications, but they couldn't tell him which. Instead, they presented him with the following sanitized version of a problem commanders face:
A group of people have lined up in a long narrow corridor that allows only two people to be side-by-side. Each has been given a number between 1 and 10,000. After some number of rounds (described below), each person is to report a whole number that is no more than one above or one below the mean of the numbers. For example, if there are four people and they have been given the numbers 2, 2, 2, 3, then it is fine if some report 2 and some report 3 because the mean is 2.25. During the course of this calculation, no person should have to remember more than five significant digits in total, including those to the left and those to the right of the decimal point.

Figure 1: People moving in a corridor.
Start: P0 P1 P2 P3 P4
Next: P1 P2 P3 P4 P0
Next (so P1 and P2 can exchange information, as can P0 and P3): P2 P3 P4 P1 P0
Next: P3 P4 P2 P1 P0
Next: P4 P3 P2 P1 P0
End of round
Next: P4 P3 P2 P1 P0
Next (now P4 leads): P3 P2 P1 P0 P4
Next: P2 P1 P0 P3 P4
Next: P1 P0 P2 P3 P4
Next: P0 P1 P2 P3 P4
Next: P0 P1 P2 P3 P4
End of second round

Dennis, a professor of computer science at New York University, is the author of four puzzle books. He can be contacted at [email protected].
Because the corridor is so narrow, the people are going to move as shown in the example of Figure 1 for five people P0,…,P4. Only when two people are side-by-side can they exchange information. Note that in the first round, P0 encounters P1 and P3; P1 encounters P0, P2, and P4; and P2 encounters P1 and P3. In each pairwise encounter, people can exchange numbers and do any calculation they like with those numbers. However, the number that each person retains after an encounter may contain no more than five digits.
Warm-Up: Suppose that two people A and B meet and that initially A's number x is greater than B's number y. Suppose the mean of the entire collection, while unknown to A or B, is denoted M_all. Consider the initial error of A and B to be the maximum of |x – M_all| and |y – M_all|. What can A and B do that will reduce their error without preventing them from calculating the mean correctly?
Solution to Warm-Up: They calculate the mean of x and y, denoted M_xy. A substitutes M_xy for x, and B substitutes M_xy for y. If M_all is greater than M_xy, then |M_all – y| was the error before, so the error is reduced. Here is a proof: Suppose that x < M_all; then the order of elements is y < M_xy < x < M_all, so the conclusion follows. If M_all < x, then the order must be y < M_xy < M_all < x. Because M_xy is the mean of x and y, (M_xy – y) = (x – M_xy), which is greater than both x – M_all and M_all – M_xy. So the error is reduced and the sum of all numbers stays the same.
We have ignored rounding so far. If rounding is required, then allow A, because it initially has a higher value, to take a value above the mean, and B to take a value below the mean in such a way that A loses as much as B gains. This guarantees that the mean of the results equals the mean of the inputs. End of warm-up.
The warm-up suggests the following possible solution. Each exchange rounds to the nearest whole number (always keeping the sum of all numbers constant). For example, in an exchange between 53 and 42, the resulting numbers would be 48 and 47. So the initially greater number would be given the higher integer.
1. Might there be some configuration of a dozen people and their initial values that requires more than 20 rounds in this case? What is the maximum necessary? The answer to this might suggest that using a number including one decimal digit would help.
2. If one does use a decimal digit for all numbers below 10,000 and there are at least 12 people in the hallway, then will any initial configuration require more than 20 rounds? What is the maximum necessary?
3. Suppose there could be only three people in the hallway. (I use three because two would have a solution in one exchange.) Then will any initial configuration require more than 20 rounds?
4. The protocol now requires that the person having the higher initial value gets the higher value after rounding. What if the assignment to higher or lower occurs randomly (with equal probability for the initially higher and initially lower)? Would this change the answer to question 1?
Here is an open problem. Consider a protocol in which after every exchange (rather than after every round), the list reverses itself. So the protocol looks like Figure 2 in a typical round. Even though each round is more expensive, I cannot find a case involving 10 or more people in which the protocol takes more than six rounds provided one can use one digit to the right of the decimal point. Can you find a limit in terms of the number of rounds required?

Figure 2: List reversal.
Start: P0 P1 P2 P3 P4
Next: P1 P2 P3 P4 P0
Next (reverse): P4 P3 P2 P1 P0
Next: P2 P1 P0 P3 P4
Next: P0 P1 P2 P3 P4
Next: P3 P4 P2 P1 P0
Next: P4 P3 P2 P1 P0
Next: P0 P1 P2 P3 P4
End of round

For the solution to last month's puzzle, see page 80.
DDJ
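As an aside for readers who want to experiment with Dr. Ecco's questions, here is a minimal Python sketch of the warm-up rule: whenever two people meet, they replace their numbers with whole numbers that average them while keeping the total constant, the initially larger number receiving the larger integer. The odd-even pairing schedule and the sample inputs are illustrative assumptions (they are not the corridor choreography of Figure 1), and the five-significant-digit memory limit is not enforced.

from math import floor

def meet(a, b):
    """Replace a and b with integers that average them; the sum is preserved,
    and the initially larger number receives the larger integer."""
    lo = floor((a + b) / 2)
    hi = (a + b) - lo
    return (hi, lo) if a >= b else (lo, hi)

def simulate(values, max_rounds=100):
    """Apply the rule on an odd-even pairing schedule (an assumption) until
    everyone is within 1 of the true mean; return the round count and values."""
    vals = list(values)
    target = sum(vals) / len(vals)
    for rnd in range(1, max_rounds + 1):
        for start in (0, 1):
            for i in range(start, len(vals) - 1, 2):
                vals[i], vals[i + 1] = meet(vals[i], vals[i + 1])
        if all(abs(v - target) <= 1 for v in vals):
            return rnd, vals
    return None, vals

print(simulate([9731, 4, 17, 5002, 250, 8888]))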
Dr. Dobb's News & Views
Unified EFI Forum Established
The nonprofit Unified EFI Forum has been formed to manage the evolution and promotion of the Extensible Firmware Interface (EFI) specification (http://www.uefi.org/). The EFI spec defines an interface that hands off system control from the preboot environment to the operating system. In short, EFI is a modern replacement for the BIOS. (For more information on EFI, see "The Extensible Firmware Interface," by Craig Szydlowski; DDJ, September 2005.) Founding members of the EFI Forum include AMD, American Megatrends, Dell, HP, Intel, IBM, Insyde Software, Microsoft, and Phoenix Technologies. The Forum will publish the EFI 1.10 specification by the end of 2005. It will also make available test suites for the UEFI spec based on contributions from member companies.
Eiffel Standardized by ECMA
A Standard for the Eiffel programming language has been adopted by the General Assembly of ECMA International (http://www.ecma-international.org/publications/standards/ECMA-367.htm). ECMA's charter is to evaluate, develop, and ratify telecommunications and computing Standards. The Eiffel language, originally designed by Bertrand Meyer, is available through implementations by Meyer's Eiffel Software (http://www.eiffel.com/) and other providers. ECMA Standardization guarantees total, line-by-line compatibility between different implementations. The specification also has been submitted for ISO (International Standards Organization) approval as part of ECMA's "fast-track" ISO status.
Grid Security Examined
The Enterprise Grid Alliance, an open consortium focused on developing and promoting enterprise grid solutions, has released its Enterprise Grid Security Requirements document, which identifies a set of requirements for grid security (http://www.gridalliance.org/en/workgroups/GridSecurity.asp). Developed by the EGA Grid Security Working Group, the document builds on the previously released EGA Reference Model by identifying the
unique security requirements of commercial enterprise grid computing. It is intended as a guide for users, Standards organizations, and vendors.
IBM Launches Academic Licenses
IBM has launched a program that provides universities with free access to a range of emerging technologies developed in IBM's R&D labs (http://www.developer.ibm.com/university/scholars/). The goal of the "Academic License" program is to help train, educate, and accelerate development skills around open Standards-based technologies. University professors can use the technologies to build course curriculum. Professors will have access to more than 25 technologies, including games and simulations, to accelerate skills around IBM on-demand offerings, including open Standards technologies such as Java and Eclipse, and tools to enable grid computing. MIT and Harvard's Division of Engineering and Applied Sciences will be the first universities to participate in the program. The program is open to academic institutions participating in IBM's Academic Initiative.
Open Authentication Moves Forward
OATH, the initiative for Open Authentication, has released Version 1.0 of its OATH Reference Architecture, which provides a framework for open authentication (http://www.openauthentication.org/reg.asp). The document's client framework section addresses topics of authentication methods, innovation in authentication tokens for multifunction purposes or mobile devices, token interfaces for one-time password tokens, and authentication protocols. The validation framework covers interfaces for protocol and validation handlers, and protocols used by applications to authenticate user credentials. OATH will develop a framework to let vendors develop Standards-based provisioning protocols and evaluate the need to standardize on one or more provisioning protocols to target specific credential types. OATH is a collaboration of device, platform, and application companies, with the goal of fostering strong authentication across networks, devices, and applications.
Secure Voice-over-IP
Phil Zimmermann, developer of Pretty Good Privacy (PGP) e-mail encryption software, is now working towards building similar security tools for Internet-based Voice-over-IP (VoIP). Codenamed "zFone," the prototype Zimmermann demonstrated at the Black Hat Briefings security conference scrambles information until it reaches its destination. To unscramble the data, recipients must be running a program that uses the same protocols. According to Zimmermann, zFone interoperates with any standard SIP phone. Zimmermann's prototype is based on Shtoom, a VoIP client written in Python. For more information, see http://www.philzimmermann.com/EN/zfone/index.html.
UC Berkeley and Yahoo Partner for Research
In a first-of-its-kind partnership between a public university and a private Internet company, Yahoo Research Labs and the University of California at Berkeley are launching a joint lab to explore Internet search technology, social media (photos, video, music, audio, and text obtained from personal, public, or community sources, then shared, referenced, or remixed in ways that help foster social relations), and mobile media. Most intellectual property developed at the lab will be shared jointly between UC Berkeley and Yahoo. The founding director of Yahoo Research Labs-Berkeley is Marc Davis, an assistant professor at UC Berkeley's School of Information Management and Systems.
IBM Steps Up Open-Source Java Efforts
IBM has stepped up its efforts to see an open-source, compatible, and independent implementation of the Java 2 Platform Standard Edition 5 (J2SE 5) by participating in (and eventually contributing code to) the Apache open-source Harmony project. Among other goals, the Harmony project was launched to create an open-source modular runtime (virtual machine and class library) architecture to allow independent implementations to share runtime components.
Preventing Piracy While Preserving Privacy
A flexible antipiracy solution
MICHAEL O. RABIN AND DENNIS E. SHASHA
In the battle between pirates and content providers, the pirates are winning. Movies appear on bootlegged DVDs and on peer-to-peer networks even before they appear in theatres. Expensive software can be obtained at rock-bottom prices without royalties flowing to the authors. Pricey technical countermeasures are easily defeated. In 2002, a multimillion-dollar CD-based antipiracy scheme developed by Sony was defeated by writing on the outer rim of protected CDs with a magic marker. License servers are routinely cracked. Total losses, while hard to calculate exactly, may amount to tens of billions of dollars per year.
Content vendor reactions vary from hand-wringing to threats of lawsuits to hope for yet a better protected medium. Platform vendors such as Intel, Microsoft, Apple, and Panasonic are more ambivalent. If one platform prevents piracy, will consumers choose another? This proposition has not been tested, but platform vendors have been cautious so far. Some content vendors even view piracy as a kind of loss leader. A few years ago, a scientist from a leading vendor, for example, announced to an expert panel (in substance): "Piracy doesn't worry us. The best thing that can happen to us is that someone buys our software, next that someone steals it, and the worst that someone buys our competitors' software." More recently, however, a scientist from the same company said to one of us: "We can no longer afford to sell just one copy in country X and see the rest stolen."
Frustrated with platform vendor inactivity, content vendors have chosen to use law enforcement and the courts to stop piracy—261 lawsuits were filed on just one day in 2003, for example. This has met with some success, but only in some countries and in a few cases. Even then, there is something distasteful about prosecuting librarians and 12-year-old children. There must be a better way.
Look beyond computer software and beyond movies to driving behavior. When faced with speed bumps, you slow down. You don't need police to tell you to. Your butt or your passengers' discomfort will ensure you don't speed. The underlying philosophy behind our solution is to implement a software speed bump to combat piracy. Our solution requires no police and preserves the privacy of everyone, even pirates.

Michael is a professor at Harvard University and a recipient of the Turing Award. Dennis is a professor at New York University and is the puzzle columnist for DDJ. They can be contacted at [email protected] and [email protected], respectively.
As a matter of terminology, we use the term "software" (or simply "content") to indicate any digital content, such as computer programs, computer games, audio and video, and so on.

Starting Points
We start with two assumptions, the first moral and the second technical.
"The underlying philosophy behind our solution is to implement a software speed bump to combat piracy."

The moral assumption is that stealing is wrong, even if it's easy. A screenplay writer friend once said she doesn't condone stealing except of computer software. It weighs nothing. The computer copies it. Some big corporation suffers. What could be wrong? How would she feel if someone stole her screenplay? Oh, that's different. But it isn't, big corporation or not. She wouldn't feel it's okay to steal a car from an automobile factory.
On the other hand, the punishment should fit the crime. Ideally, software pirates should get no benefits from pirating, but there should be no jail sentences or onerous fines. The more successful we are at preventing piracy by technical means, the less the need for law enforcement and high penalties. (As a matter of legal principle, if the odds of getting caught are 1 in 1000, then the penalty should be 1000 times the profit to render piracy unattractive. We avoid high penalties by reducing the profit from piracy to virtually zero.) Think speed bump again.
The technical assumption is that User Devices (computers or other software-playing devices) have a secure clock and software called the "Supervising Program" that cannot be changed and that is given a periodic time-slice when it can run. "Secure" means that even the owner of the device cannot alter the progress of
the clock, alter the Supervising Program, or intervene with its actions. This assumption lies within the technical state of the art:
• In a paper written in 1992, Lampson, Abadi, Burrows, and Wobber [1] suggested a way to load an operating-system kernel reliably using a bootstrapping method based on a single cryptographic key. Continuous checking of the integrity of the kernel or of our Supervising Program can be achieved by similar means. In these days of inexpensive hardware, there are other possibilities — IBM, HP, and Dell already ship computers that include a so-called "Trusted Platform Module," a coprocessor providing a feature called "remote attestation" [2]. The Trusted Platform Module can guarantee (and even promise to other devices) that a certain operating system and a certain BIOS are running. Similar techniques can be used for the Supervising Program. Only the Supervising Program needs to be secure, not the software that is later going to be protected.
• Hardware vendors must provide a clock that advances continuously and uniformly (for example, one that keeps in step with Greenwich Mean Time so is unaffected by time zone or daylight-savings time). The Trusted Platform Module already provides a counter that is guaranteed to increase over time.
• The operating system must interact with the Supervising Program by ensuring that it runs periodically.
The Shield system does the rest, ensuring piracy prevention while preserving privacy. Before we discuss it, however, we briefly examine the main existing approaches to prevent piracy.

Current Approaches to Combat Piracy
Many companies offer piracy prevention or, more generally, digital-rights-management software. The main distinction between the two is that digital-rights-management software may also include linguistic constructs to describe usage possibilities, a prominent example being ContentGuard's XrML language (http://www.contentguard.com/xrml.asp). In this article, however, we concentrate on piracy prevention, because that is the fundamental technology upon which all else rests.
The best current approach is to encapsulate software inside hardware. Video cameras do this, but in the computer software world, such software comes on hardware attachments, such as so-called "dongles," like those from MicroWorks (http://www.mw-inc.com/) and SafeNet (http://www.safenet-inc.com/products/tokens/ikey1000.asp). This solution is feasible if the dongle can be rendered tamperproof and by running impractical-to-reconstruct parts of the software program on the dongle. The dongle approach is vulnerable to a reverse-engineering attack of that "impractical-to-reconstruct" software. Even when the dongle approach works technically, however, the hardware approach makes it difficult to use several unrelated but protected software items at once and is, in general, cumbersome.
A part-hardware approach is to ship software out on "copyproof" CDs. Again, extremely low-tech attacks (scribbling on CD rims) have defeated such solutions in the past. But even if the CD is truly copyproof, what happens if the content ends up on a web site from which it can be downloaded? This attack, dubbed "Break-Once, Run Everywhere" (BORE), can render an entire factory's work a waste of time and effort.
A software-imitates-hardware approach is to encrypt the content and ship the key to the client site, which can then execute the software only if it has the proper keys. This solution suffers from the BORE problem as well: If the content can ever be constructed in the clear through either an attack on the encryption, an attack on even one User Device where the software has been running, or an insider leak by an employee of the software author, it can be used everywhere.
License servers combat piracy by requiring licensed software to get permission to continue running from time to time. This scheme can be attacked if a would-be pirate can simulate the license server's responses, or change the software not to query the license server. If either happens, there is a BORE problem. In addition, this solution requires the software author to modify the software by introducing the (hopefully nonremovable) calls to the license server. Even if not, the notion of having to report usage to an outside license server inherently infringes on privacy.
There are approaches that don't try to prevent piracy but try to track and/or punish the pirates. The "watermarking" approach is to write some unique undetectable digital message on each instance of the software. If that digital message is found on many instances of software in the field, then the original purchaser of that watermarked copy is the source of those copies. The problems with this scheme range from the theoretical (it doesn't seem possible to create an undetectable watermark) to the practical (how does one track down copies and test them for watermarks). Further, there is the problem of legal punishment. Trials are expensive, time-consuming affairs. Finally, the technique depends fundamentally on violating privacy, because it requires identifying the "criminal."
A second form of punishment is to put "poisoned apples" in places where pirates are likely to look. The idea is to punish pirates by giving them something that looks good but isn't — conceivably a virus but more commonly a broken piece of content. Two years ago, a pirate downloading a Madonna song from a site might instead find a furious Madonna piping out expletives. Since then, poisoning peer-to-peer networks has become a thriving cottage industry.
For certain kinds of software, notably movies and music, the aforementioned solutions do not prevent a would-be pirate from digitally recording the content while watching or listening and then later redistributing the recording. Copying and redistributing content in this way is known as the "Analog Hole" attack.
All existing solutions (other than wrapping the software inside a hardware device) suffer from a BORE attack. Most of these solutions infringe on privacy, sometimes by design. A better solution should avoid BORE, avoid courts, and preserve privacy.

Figure 1: Piracy prevention flowchart. No information leaves the User Device. (Is the software freeware? If yes, it runs normally. If not, does the User Device have rights to run the software? If yes, it runs normally; if no, hinder use.)
Towards a New Approach
Our approach to protection is simple: As Figure 1 illustrates, periodically during the execution of software on the User Device, our Supervising Program checks whether the software is freeware or not. If not, the Supervising Program identifies the software and checks whether this User Device has the rights to run this software. If so, the software continues to run; if not, the software is either stopped or markedly slowed down. No information leaves the device. The punishment is to hinder use.
To realize this approach, we have to specify how rights are transported to the User Device, how rights can be transferred
between User Devices for purposes of fair use and upgrades, and how the Supervising Program can determine which software is running. At each step, we show how privacy is preserved.
The basic data flow of the Shield system is in Figure 2. Briefly, privacy-preserving purchases are shown on the left side of the User Device, content-identifying information enters from the Superfingerprint server depicted on the upper right, and privacy-preserving rights information is exchanged with the Guardian Center. One important point: The indicated interactions with the Software Vendor, the Superfingerprint Server, and the Guardian Center are infrequent (on the order of once per week) and need little bandwidth. People who like to work mostly offline can continue to do so.

Figure 2: System architecture. (The Content Author sends content-identifying information to the Superfingerprints (SPFs) Server, which sends SPFs to the User Device; the User Device exchanges Purchase Orders, signed Purchase Orders, and content with the Content Vendor and Content Author; Call-ups (TTIDs) and Continuation Messages are exchanged with the Guardian Center.)

Figure 3: User Device. (OS and Supervising Program (SP), Secure Clock, Superfingerprints, Content, and Tag Tables (TTs) TTID1, TTID2, …, TTIDk, each holding Tags.)

Privacy-Preserving Purchase
Our ability to preserve privacy while preventing piracy is based on the fact that rights, as embodied in "Tags," are stored on the User Device in data structures called "Tag Tables"; see Figure 3. The relationship between the Tag Table Identifier (TTID) and the Tag is an internal affair of the User Device. At purchase time, Tag-related information flows between the User Device and the Vendor/Author, but the Vendor/Author does not know for which TTID. At rights-management time, TTIDs flow between the User Device and the Guardian Center but the Guardian Center does not know for which Tags. So even if the Vendor, Author, and Guardian Center all collude, they cannot determine which sets of Tags belong to the same User Device, much less which particular User Device owns any particular Tag.
When the owner of a User Device wishes to purchase digital content (including digital content that has been preloaded on the User Device or installed from a CD), the Supervising Program on that device creates a structure identifying the software and its associated Tag Table Identifier:
S = (Name(C), TTID, Hash(C), UsagePolicy, NONCE)
Name(C) is the name of the content. TTID is the identifier of the Tag Table into which the Tag will eventually go. Hash(C) is the hash value of the content. UsagePolicy is some kind of policy such as perpetual use or three-month use. NONCE is a number that is randomly chosen from a large number space (for instance, from 128-bit numbers) and that is never used again. We use the NONCE to hide the value of TTID even should the Vendor collude with the Guardian Center. A Purchase Order consists of: (Hash(S), Name(C), Hash(C), UsagePolicy)
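To make these two structures concrete, here is a minimal Python sketch of how a Supervising Program might assemble S and the corresponding Purchase Order. The byte encoding, the use of SHA-1, and the helper names are illustrative assumptions; the article prescribes only the shape of the two tuples.

import hashlib
import secrets

def sha1_hex(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()

def make_purchase_order(name: str, ttid: str, content: bytes, usage_policy: str):
    nonce = secrets.token_hex(16)             # random 128-bit NONCE, never reused
    content_hash = sha1_hex(content)          # Hash(C)
    s = (name, ttid, content_hash, usage_policy, nonce)
    # Hash(S) hides the TTID; the NONCE defeats guess-and-check attacks.
    s_hash = sha1_hex("|".join(s).encode())
    purchase_order = (s_hash, name, content_hash, usage_policy)
    return s, purchase_order                  # S stays on the device; the order goes to the Vendor

s, order = make_purchase_order("GameX 1.0", secrets.token_hex(16), b"content bytes", "perpetual")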
The hash function is a one-way hash function (see the accompanying text box entitled "Crypto Technologies") such as SHA-1 (or any of its improved versions), so no outsider can compute the TTID by inverting the function and no outsider can guess-and-check the TTID because of the NONCE component of S. The Purchase Order may be sent to the Vendor/Author over an anonymizing network, to make the source unknown to the sender [3]. The purchase may be in digital cash. Thus, the Vendor/Author can be prevented from knowing the identity of the purchaser, but can verify that the purchase amount corresponds to the correct price. If so, the Author digitally signs the Purchase Order, Sign_Author(PurchaseOrder), and sends it to the User Device via the Vendor. (By signing, the Author guarantees that it is paid for every purchase. If Vendor signatures were sufficient, then a rogue Vendor could start selling content on its own. As a practical matter, the Author may devolve signing privileges to select Vendors.)
The Supervising Program then verifies that the Author's signature is correct. This is possible because the User Device has previously downloaded from the Superfingerprint Server authenticated (digitally signed) data including a list of the Authors' public signature-verification keys. If the signature is verified to be that of the author of the content C, and S is consistent with the signed Purchase Order, the Supervising Program installs the triple (Author name, S, signed Purchase Order) into the Tag Table having identifier TTID. That triple is the Tag; see Figure 4.
If a user pays with anonymous digital cash (or even a one-use credit card) and sends orders over an anonymizing network (see, for example, http://tor.eff.org/), the Vendor/Author will not know who made the purchase. Further, the Vendor/Author will not know which TTID is associated with this purchase.

Superfingerprint Information
The Superfingerprint Server (upper right corner of Figure 1) periodically sends several kinds of information updates to the User Device. All User Devices receive the same information and must be reasonably up-to-date (for instance, this information must not be more than one week old, so the User Device must receive the Superfingerprints once a week).
• Content-Identifying Information. This data associates with the name Name(C) of each content C that is protected by the system, data enabling the Supervising Program to identify C when it runs. What running or executing means depends on the type of digital content. In the case of a computer program, running means the execution of the program, and identification information can then be derived from sequences of machine instructions executed by the program at runtime and from functionalities of the program. Alternatively, the content could be music, in which case the identification information could be derived from frequency components of the melody. The Content-Identifying Information for a content C typically fits
in about 1/1000 of the number of bytes of C (significantly less for movies). Each Author wishing to protect a content C runs a program (or asks a professional organization to run a program) that generates relevant Content-Identifying Information. That information is distributed to Superfingerprint Servers. These in turn send the additional Content-Identifying Information to User Devices during the next Superfingerprint broadcast. A Vendor/Author need not change the content C in any way to enable this protection. As a consequence, the antipiracy protection can be deployed after distribution of C.
• Content-Identifying Algorithms. The Supervising Program initially includes a suite of Content-Identifying Algorithms (which employ the Content-Identifying Information) to identify protected content. The algorithms are tailored to the type of content; for example, one class of algorithms for computer programs, another for music or video, and so on. But the algorithms apply to all examples of content in each class. One attack on the combination of Content-Identifying Information and Content-Identifying Algorithms consists of obfuscating the code or music or other content so it has the same effect to end users but looks different to our detection system. Experimentation has shown that detection algorithms can be made robust against a wide range of obfuscation attacks. (Compressing the content does not hinder our detection because detection occurs primarily at runtime.) The framework counters further obfuscation attacks by requiring the User Device to obtain periodic updates (weekly, for instance) of Content-Identifying Information and algorithms from the Superfingerprint Server. As obfuscations improve, so can our detection.
• Lists of pairs: Signature-verification key, Author name. This information lets the User Device verify whether a given Author's signature corresponds to an Author. In addition, there will be pairs relating the hashes of content to Author names. Together, these ensure that the signature of an Author as found in a Tag in fact constitutes sufficient authority to allow the use of software. This combats the attack where author A creates content X but author B signs Purchase Orders for content X without having the right to do so.
All communication with the Superfingerprint Server is one-way — from Superfingerprint Server to User Device, again possibly through an anonymizing network. Consequently, no information leaves the User Device.

Figure 4: Privacy-preserving purchase. Identity of user hidden by anonymizing network and digital cash. Tag Table Identifier is embedded into Purchase Order using a one-way function. (User Device: prepares the Purchase Order and pays with digital cash or a one-time credit-card number; later verifies the returned signature and installs the Tag into a Tag Table. Vendor: verifies purchase conditions/money and passes the signed Purchase Order through. Author: signs the Purchase Order, knowing what has been purchased but not by whom. Messages travel over an anonymizing network.)

Transfers Without Promiscuity
Finally, there is the question of managing rights. Fair rights laws and tradition require the ability to make backups.
Crypto Technologies
Whereas our approach never encrypts content, it makes substantial use of three cryptographic technologies — one-way functions to hide Tag Table Identifiers and User Device Descriptive Values, digital signatures to establish the identity of sites on the network, and Secure Sockets Layer (SSL) to ensure private communication of TTIDs.
Intuitively, a function f is one-way if, given x, it is easy to compute f(x), whereas given y, it is hard to find an x such that y = f(x). The hash function SHA-1 is one example (among many) of a one-way function.
The purpose of a digital signature is the same as of a written one — to establish the identity of the signer of a message. When you sign a contract, the holder of that contract can go to court and assert your agreement to the contract. Ideal written signatures are unforgeable but recognizable: only X can produce X's signature, but anyone can recognize that signature. So, only one person can sign, but anyone can verify (at any time or place). Digital signatures work the same way: An agent (say, the Guardian Center) in our protocol uses a private key to sign a document, but that agent's signature-verification key is well known (say, is in the Supervising Program of every User Device). Therefore, if a message arrives purporting to be from that agent, then any User Device can test whether the message is in fact from that agent.
The Secure Sockets Layer (SSL) protocol is a client-server protocol offering asymmetric authentication and private communication. SSL assures the client (in our protocols, the User Device) that the server has a particular identity (in our Call-Up protocol, that the server really is the Guardian Center). SSL also enables the client and server to agree on a private key, which can be used in subsequent communication. The net effect is that the client knows the identity of the server (but not the other way around) and that the content of the exchange between client and server remains hidden from anyone else.
— M.R. and D.S.
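As an illustration of the sign/verify pattern described in this sidebar, here is a brief Python sketch using the third-party cryptography package, with Ed25519 standing in for whatever signature scheme a deployment would choose (an assumption; the article does not name one). The Author signs a Purchase Order, and any User Device holding the Author's public verification key can check it.

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

author_key = Ed25519PrivateKey.generate()      # the Author keeps this private key
verify_key = author_key.public_key()           # distributed to User Devices (e.g., via Superfingerprints)

purchase_order = b"Hash(S)|GameX 1.0|Hash(C)|perpetual"   # illustrative serialization
signature = author_key.sign(purchase_order)               # Sign_Author(PurchaseOrder)

try:
    verify_key.verify(signature, purchase_order)           # the Supervising Program's check
    print("Signature verifies: install the Tag.")
except InvalidSignature:
    print("Bad signature: reject the purchase.")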
Our technology allows any number of backups to be made of everything — the Tags, Tag Tables, and content. Further, we want to allow transfers of rights, so Tag Tables may be moved from one User Device to another, provided the Tag Table is disabled on the first device. On the other hand, we don't want the same Tag Table to appear on millions of devices. We reconcile these two goals through communication between each User Device and the Guardian Center. The basic purpose of this communication is to determine whether a Tag Table having some Tag Table Identifier is on several devices.
Let us back up for a moment. TTIDs come about by randomly generating an identifier from a large (128-bit numbers) space, perhaps based on time, typing characteristics, or a special random process. The chances of collisions in such a case are, for all practical purposes, negligible until the number of TTIDs is extremely large (for instance, a billion billion for 128-bit TTIDs). So when first created, every Tag Table has a globally unique TTID.
To ensure that only one User Device contains a particular TTID at a given time, each User Device performs a "Call-up" between some minimum and maximum time, say every five to seven days. As shown in Figure 5, a Call-up from device U consists of a message to the Guardian Center where the message contains a list of all enabled TTIDs of User Device U, a timestamp, and the hash of a "User Device Descriptive Value" of U appended to a NONCE. The User Device Descriptive Value contains some slowly changing property of the device that only a small number of devices have (for example, a processor ID, if available, or something about the number of files or structure of directories on the device). The use of the one-way hash function prevents any knowledge of this value from leaving the device. The Call-up is sent using a well-known secure protocol such as SSL (see "Crypto Technologies"), so no third party can see which TTIDs are being sent.
The Guardian Center checks each TTID x in the list of TTIDs to see whether an overly recent Call-up contained x. If so, the Guardian Center either records the fact for future reference or, if this has happened more than some threshold number of times, the Guardian Center invalidates that TTID. After this analysis, the Guardian Center responds to the Call-up with a signed "Continuation Message" listing valid Tag Table Identifiers:
Sign_GuardianCenter(timestamp, Hash(User Device Descriptive Value, NONCE), TTID1, TTID3,…)
The timestamp ensures that the device cannot simply replay an old Continuation Message. The hash, together with the NONCE, prevents the Guardian Center from learning the User Device Descriptive Value. The User Device Descriptive Value permits the Supervising Program on User Device U to ensure that the Continuation Message was meant for U. This prevents a single Continuation Message from being used by many shadow User Devices.
The User Device associates the most recent Continuation Message and its associated User Device Descriptive Value with each Tag Table. If the User Device Descriptive Value no longer matches the relevant properties of the User Device (perhaps due to a transfer of a Tag Table to this device), the Supervising Program on the User Device performs a new Call-up for just that Tag Table. On the User Device, the Supervising Program disables Tag Tables whose TTIDs have not been included in the most recent Continuation Message. There is a grace period policy, however, allowing devices to use the software associated with Tag Tables even if out-of-date, provided this doesn't happen too often.
A user transfers content by disabling its associated Tag Table x on the source device and sending it to a destination device. After doing a Call-up for Tag Table x, the destination device can now use all the software items whose Tags involve the transferred TTID.
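To make the Guardian Center's bookkeeping concrete, here is a minimal Python sketch of Call-up handling as described above. The five-day window, the strike threshold, the in-memory tables, and the omitted signature step are all illustrative assumptions.

import time

MIN_CALLUP_INTERVAL = 5 * 24 * 3600      # five days, matching the example interval in the text
STRIKE_THRESHOLD = 3                     # assumed policy for invalidating a TTID

last_seen = {}                           # TTID -> time of the last Call-up containing it
strikes = {}                             # TTID -> count of overly early Call-ups

def handle_callup(ttids, device_hash, now=None):
    """Record a Call-up and return the TTIDs that remain valid.
    A real Guardian Center would wrap the result, a timestamp, and device_hash
    in a signed Continuation Message; signing is omitted in this sketch."""
    now = time.time() if now is None else now
    valid = []
    for t in ttids:
        prev = last_seen.get(t)
        if prev is not None and now - prev < MIN_CALLUP_INTERVAL:
            strikes[t] = strikes.get(t, 0) + 1   # overly recent Call-up: record a strike
        last_seen[t] = now
        if strikes.get(t, 0) < STRIKE_THRESHOLD:
            valid.append(t)                      # still considered unique to one device
    return valid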
Failure to disable Tag Table x and its TTID on the source device will soon thereafter lead to overly frequent Call-ups for that TTID being sent to the Guardian Center. Call-ups must be done over a secure channel (such as SSL) to prevent malicious users from faking Call-ups with a given TTID y just to deny the real owner of the Tag Table having TTID y the use of that Tag Table.
Figure 5: Privacy-preserving Call-ups. User knows that it is talking to the Guardian Center but not vice versa (an option of SSL). TTIDs do not reveal the associated Tags. The one-way hash function associated with the NONCE prevents any revelation of the User Device Descriptive Value, so even processor identifiers can be used without fear of privacy breach.
Note also that the Guardian Center need not be a single device. Guardian Center data may be replicated and any one of several Guardian Center nodes can handle a given Call-up request, or data may be partitioned based on TTID. (The Guardian Center data consists of information about TTIDs: time of last Call-up and a history of any overly early Call-ups.) In any case, the Guardian Center workload scales easily.

Putting It All Together
Here is a quick overview of the whole system. Every User Device includes a Supervising Program. When software C is being used (for example, executed) on the User Device, the Supervising Program attempts to identify C by use of Content-Identifying Information and Algorithms present on the User Device. If unsuccessful, then C is deemed to be freeware and use proceeds. If identified as software named N, then the Supervising Program searches for a Tag for N in a Tag Table having a valid TTID. If found, then the Supervising Program verifies that the current usage is in accordance with the Usage Policy for that instance of C included in the Tag for C. If everything checks out, then use of C is allowed; otherwise use is stopped or hindered.

The Supervising Program is run at regular periods, checking the running queue of the User Device. It can be designed to consume less than 2–3 percent of the computing resources. In our experiments, its impact on the performance of even compute-intensive workloads, such as computer games, is unnoticeable. The Supervising Program performs the protected software installation task. The actual software purchase can be done outside of the User Device, for example, by an organization’s purchasing department.

The Supervising Program periodically downloads authenticated (that is, timestamped and digitally signed) updates of the Content-Identifying Information, Content-Identifying Algorithms, and lists of (Author name, content-hash) pairs and (Author name, signature-verification key) pairs from the Superfingerprint Server. To revalidate its Tag Table identifiers, the Supervising Program periodically calls up a Guardian Center. The Call-ups are infrequent and require little bandwidth. Transfers entail movements of Tag Tables from one User Device to another. Back-ups are unlimited. Every reasonable model of fair use is easy to implement. For example, it’s possible to lend your software to your friend (two transfers), to allow short-term use (Tags having short-term Usage Policies), and to offer family packs (a single purchase yields the privilege to obtain multiple Tags).
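The runtime check can be summarized in a short sketch. The class and function names below are hypothetical stand-ins, not the authors' code; identification by Superfingerprints is elided and represented by the name argument.

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class UsagePolicy:
    expires_at: Optional[float] = None       # None models a perpetual license; a value models short-term use

    def permits(self, now: float) -> bool:
        return self.expires_at is None or now <= self.expires_at

@dataclass
class Tag:
    content_name: str
    policy: UsagePolicy

@dataclass
class TagTable:
    ttid: str
    enabled: bool                             # False once the table has been transferred away
    ttid_valid: bool                          # True if the TTID appeared in the latest Continuation Message
    tags: List[Tag] = field(default_factory=list)

def supervise_use(name: Optional[str], tag_tables: List[TagTable], now: float) -> bool:
    """name is the content-identification result, or None if the content was not recognized."""
    if name is None:
        return True                           # unidentified content is treated as freeware
    for table in tag_tables:
        if not (table.enabled and table.ttid_valid):
            continue
        for tag in table.tags:
            if tag.content_name == name and tag.policy.permits(now):
                return True                   # identified, tagged, and within the Usage Policy
    return False                              # otherwise use is stopped or hindered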
Frequently Asked Questions
When we talk about this framework, we hear several questions:

Q: How can we claim that we preserve privacy when we have Call-ups?
A: The Call-ups send information that identifies neither the user nor the software nor the Tags on the User Device, because TTIDs are sent rather than Tags. The protocol can be verified by third parties. Alternatively, you could avoid Call-ups by linking Tags to machine IDs, but then transfers would become more complicated and purchases as well as transfers might potentially infringe on privacy.

Q: Why don’t we suffer from BORE? Superfingerprints detect use of software rather than mere possession. Can’t one subvert your detection?
A: Maybe, but it is possible to do a very good job of detecting functional equivalents of software. Also, Superfingerprints can be improved with each download to counter new attacks.

Q: What happens when you catch someone stealing?
A: The Supervising Program on the device stops or slows down the use of that software. No information leaves the User Device. This is the functional equivalent of a speed bump: Behave, because you get car-sick if you don’t.

Q: So, if this is so great, why isn’t this adopted?
A: For this architecture to take hold, the hardware and operating-system Vendors must cooperate. The enabling technology for the protection system essentially exists, so it is a question of willingness. Platform Vendor incentives aren’t so clear. If one platform Vendor provides piracy prevention and another doesn’t, consumers may prefer the one that doesn’t. If our solution is used, the only reason consumers will have to dislike the piracy-prevention system is that it prevents the ability to steal. It is possible that legislation will be necessary to ensure that no platform vendor benefits by making a platform that makes stealing easier. There is precedent for this: When catalytic converters to reduce automobile pollution emissions first came on the market, many consumers resisted their introduction because they made both acceleration and gas mileage suffer, besides raising the price of the car. Their introduction has greatly reduced air pollution, however, so it constituted a societal good. Legislation was necessary to avoid having consumers punish vendors who advanced that societal good. The same may happen here.

Further, whereas our architecture imposes negligible penalties on performance, it permits many new usage models such as paying for use only when needed (pay for tax software only at tax time), the preloading of software, and digital distribution of software. The saved costs from cheaper distribution and vastly reduced piracy run into tens of billions of dollars, enough to benefit all players — authors, consumers, and platform vendors. Again, there is precedent for the situation where taking on a burden ultimately enhances profit. When credit-card companies cap payments by
consumers due to fraudulent uses of their cards, consumers feel more confident about using their credit cards. Similarly, when platform vendors support this framework, this will allow many new creative and inexpensive uses of and distribution of content, enhancing the value of platforms everywhere and ultimately reducing the price of software to all consumers. Indeed, we foresee an alliance between (enlightened) consumers, platform vendors, and authors supporting this framework, because it is in everyone’s economic and artistic interest.

Conclusion
The Shield Approach is a flexible, privacy-preserving, antipiracy solution that does not suffer from “Break Once, Run Everywhere.” It protects privacy in a strong sense: It can be configured so that no one knows what you buy, what you use, or even whether you cheat. Because the content is obtainable separately from the Tag, preloading the content is possible. Transfers and fair use are straightforward. Finally, the solution is technology friendly. We embrace peer-to-peer networks, video-on-demand, superdistribution, and free software. Content Vendors will feel free to distribute content over the Internet, reducing distribution costs and material waste. Lawsuits will be reduced. Isn’t it time for technology to solve this problem?

Acknowledgments
Warm thanks to our principal coworkers in this effort: Yossi Beinart, Carl Bosley, Ramon Caceres, Aaron Ingram, Timir Karia, David Molnar, and Sean Rollinson.

References
[1] “Authentication in Distributed Systems: Theory and Practice.” Butler Lampson, Martín Abadi, Michael Burrows, and Edward Wobber. ACM Transactions on Computer Systems, Volume 10, Number 4 (November 1992), pp. 265–310.
[2] For information about trusted coprocessors, see https://www.trustedcomputinggroup.org/home/.
[3] For information about anonymizing networks, see http://tor.eff.org/.

DDJ
Pragmatic Exceptions
Tip #2: Refactor for Exceptional Clarity
Tip #2 is the logical follow-on from Tip #1, “If In Doubt, Throw It Out” (DDJ, September 2005). In deciding if a local exception should be caught or pitched out, well-factored and highly focused code is ideal. Refactoring is fundamental to good programming practice in so many ways, an important one being that it helps you understand the finer points of the method contract. This removes doubt and leads to better decisions regarding exceptions. The smaller your functions, the easier it is to tell whether what just happened was normal. Because each function can clearly specify exactly what should be expected, knowing whether to throw an exception becomes obvious. Large functions obscure or even obliterate the contract they’re supposed to fulfill.
— Benjamin Booth
[email protected]
Reestablishing Trust in the Web
A browser extension identifying sites
AMIR HERZBERG AND AHMAD JBARA
Electronic commerce is growing rapidly. Unfortunately, electronic fraud is growing just as fast. Among the most acute current security threats are web spoofing and “phishing.” Web spoofing is the creation of fake web sites, typically a fake e-banking login page designed to harvest user passwords. Phishing attacks, on the other hand, involve fake e-mail messages, typically directing recipients to spoofed sites on some pretext. Alas, many users fall victim to such attacks. Studies estimate millions of victim users and stolen accounts (“phish”), and damages on the order of $1 billion in 2003 alone. Furthermore, both the rate and sophistication of the attacks are accelerating, attracting more and more criminal elements. Amazingly, current browsers fail to protect users by helping them tell the difference
Amir is a professor of computer science at Bar Ilan University and Ahmad an instructor at Netanya Academic College. Ahmad is also a graduate student in the computer science department at Bar Ilan University. They can be contacted at http://AmirHerzberg.com/ and [email protected], respectively.
between known, trustworthy sites and spoofed ones. We created TrustBar (http://AmirHerzberg.com/TrustBar/) to address this problem. TrustBar is a browser extension that provides improved security, identification, and trust indicators. TrustBar is sufficiently visible to draw the attention of even naive users upon entering a spoofed site. In this article, we examine the current browser UI for identifying sites and its weaknesses, and explain the extended UI provided by TrustBar.

Browser Identification Mechanisms
Granted, browser UIs include areas that should be examined by users to authenticate web sites. For instance, the location bar contains the location (URL) of the current web page, while the status bar contains a closed padlock icon in pages protected by the SSL/TLS protocol. However, these elements are not sufficient to protect most users. For one thing, the location bar and padlock are not sufficiently visible, and many naive users are not even aware of their existence, let alone importance or meaning. In particular, the location is usually given as a URL, and most users do not know which part of it identifies the domain. Indeed, many users ignore it or completely remove it. Furthermore, spoofed sites may remove it, possibly replacing it by a fake look-alike image and/or script. Nor, for that matter, is the padlock highly visible, and most users are not aware that a padlock is meaningful only in the status bar and not inside the web page. In any event, the padlock merely indicates whether the site has invoked SSL/TLS
with a public key certificate from one of the (hundred or so) Certificate Authorities (CAs) trusted by the browser. These CAs differ extensively in their requirements,
procedures, and costs. Specifically, some certificates only involve validation of ownership of the domain by e-mail/phone to the contact, while others involve validation of corporate documents. Most users are not aware of the identity of most CAs, not to mention that these (unknown) entities are responsible for validating the identities of owners of (protected) web sites. In reality, most users rely on the content of web sites as a means to identify the site and whether it is protected. Unfortunately, it is trivial for attackers to
mimic the appearance of victim sites. This situation is made worse because some of the most important web sites request passwords and even indicate a padlock and claim to use security in unprotected web pages. Most of these sites invoke SSL/TLS to encrypt the password in transit, but users have no way to know this in advance, and therefore, are unlikely to detect a spoofed version that sends the passwords to a cracker. Amazingly, this trivial-to-fix, yet fatal, vulnerability exists in many sensitive sites, including online banks (Chase, PayPal, Wells Fargo, MidFirst, TD Waterhouse, Bank of America); merchants (Amazon); and even security services (Microsoft Passport and Equifax); see Figure 1 and http://AmirHerzberg.com/Shame.htm for an updated list. The combined vulnerability of the browser’s UI, users’ naïveté, and the irresponsibility of site designers encourages web spoofing and phishing attacks.

TrustBar is an open-source browser extension we built for the Mozilla and Mozilla Firefox browsers (http://trustbar.mozdev.org/). The purpose of TrustBar is to provide highly visible, preferably graphical, indicators for the identification of sites. Specifically, TrustBar
presents the identity of the site as a name or, preferably, a logo, rather than a URL, and lets users select their own name or logo (“My Bank,” for instance). Furthermore, TrustBar also presents the identity of the Certificate Authority (CA), which is the entity that validated the identity of the site. Some CAs have multiple certificate products, typically for different levels of identity validation; TrustBar lets the CA display a different logo for each such product or class of certificate. Because we designed TrustBar as an integral part of the browser UI, attackers have no control over its display and cannot remove or clone it. Furthermore, it contains clear graphical and/or textual indicators so that users can distinguish between original and cloned sites — identification for the site, identification for the CA, and indication whether the site is protected. Users can edit both text and graphical identifications. We decided to locate our bar at the top of the browser window, above all other toolbars. It is fixed and beyond the control of web sites. As such, it appears under all conditions. These properties were implemented via the XUL language that was used to build the Mozilla UI. We are currently investigating ways to make
TrustBar more like other bars, while still protecting it from removal or cloning by rogue web pages.

Mozilla Firefox Extensions
Using extensions, you can enhance Mozilla’s Firefox browser functionality. The TrustBar extension is a collection of files developed as an independent package and overlaid on Mozilla Firefox. This package consists of XUL, CSS, JavaScript, and image files. All files must be zipped in one XPI file using a ZIP utility. The XUL file contains a description of the extension UI. If the UI is complicated, then its description can span several XUL files. Each XUL file can be overlaid with the original Firefox XUL file to affect the browser UI. The CSS files describe the attributes of the UI elements defined in the XUL files, and the JavaScript files are the code controlling the system. To create an installable extension, all these files are zipped into a JAR file and that, together with the install.rdf file, must be packaged in an XPI file (a short packaging sketch follows the file list below).

The TrustBar Package
The TrustBar package is provided in the installation file TrustBar.xpi created by using ZIP. This file contains:

• Install.rdf, a mandatory file in any Firefox extension. This is an XML-like file that defines properties such as GUID, name, version, target application, and JAR files.
• TrustBarOverlay.xul, which defines the UI of the main bar of TrustBar.
• TrustBar.js, a JavaScript file that contains code for supporting the UI defined in TrustBarOverlay.xul.
• TrustBarDlg.xul, which defines the TrustBar dialog UI. This file also includes JavaScript code for supporting this UI.
• TrustBarGlobal.js, which defines functionality used globally.
• TrustBar.jar, a zipped file that contains the XUL and JavaScript files, except for the install.rdf file.
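For instance, the two packaging steps could be scripted roughly as follows. This is a sketch only: the file names come from the list above, but the script itself and the flat archive layout are assumptions (they are not part of TrustBar), and the listed files are expected to sit in the current directory.

import zipfile

CHROME_FILES = ["TrustBarOverlay.xul", "TrustBar.js", "TrustBarDlg.xul", "TrustBarGlobal.js"]

def build_xpi(xpi_name="TrustBar.xpi"):
    # Step 1: the XUL/JavaScript files go into a JAR (which is just a ZIP archive).
    with zipfile.ZipFile("TrustBar.jar", "w", zipfile.ZIP_DEFLATED) as jar:
        for name in CHROME_FILES:
            jar.write(name)
    # Step 2: the JAR plus install.rdf are packaged into the installable XPI (also a ZIP archive).
    with zipfile.ZipFile(xpi_name, "w", zipfile.ZIP_DEFLATED) as xpi:
        xpi.write("install.rdf")
        xpi.write("TrustBar.jar")

if __name__ == "__main__":
    build_xpi()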
Figure 1: Unprotected login.
The TrustBar identification mechanism is always displayed, regardless of the security level of the loaded site. When the site reached is unprotected, TrustBar displays a message (such as Figure 2) and two buttons:

• Suspect Fraud Button, which lets users report suspected sites.
• What Does It Mean Button, which presents users with explanations about the meaning of entering an unprotected site and suggests a potential secure alternative.

Figure 2: TrustBar within an unprotected site versus the browser within an unprotected site.
In secure sites (as shown in Figure 3), TrustBar lets users modify the identification
details by clicking the “show TrustBar dialog” (Figure 4). Within this dialog, users can edit the site/CA organization name, attach a new logo to the site/CA, and change the trust extent for the current CA. To do it easily and quickly, TrustBar lets users replace current site logos with new ones by right-clicking the mouse over any image within the site contents and making that image the active logo (Figure 3).

TrustBar UI
The TrustBar UI is merged into the Mozilla Firefox UI using Listing One. This is an XUL file that defines the TrustBar UI and is overlaid with the Firefox original XUL file, called browser.xul. Line 1 defines file elements that are embedded with the browser’s original main window and become a part of its UI. The TrustBar’s main element is the horizontal TrustBarBox defined in line 13. This element functions as a container of
any number of elements such as logos, text, and buttons. We verified that the mainCommandSet element defined in the Firefox browser.xul file is always above all other elements. We use the insertbefore attribute to place TrustBarBox above it. The context attribute defines the context menu activated from inside the TrustBarBox. The elements contained in our main box are defined in lines 14 to 21. These elements represent either the site and CA logos or their text. The whatId button lets users reach the TrustBar dialog for further information and modifications, and the fraudId button lets users report suspected sites. The TrustBar dialog pops up whenever an unrecognized secure site is reached or by clicking the whatId button within a secure site. The XUL file describing the TrustBar dialog UI is available electronically; see “Resource Center,” page 4.

TrustBar in Action
Once the TrustBar UI is merged into the Mozilla Firefox UI, the JavaScript code in it is activated (see Listing Two). The main function that initiates TrustBar is Init. This function is called when the main window of the browser is loaded. This is done using this XUL statement:
<window id="main-window" onload="Init()" >
This statement overrides the onload event of the main window of the browser by the TrustBar Init function. The Init function in line 3 initially calls the initialization function of the main window of the browser and then initializes TrustBar. The initialization tasks of TrustBar include:

• Creating local directories for saving information about sites and CAs. This is done by calling the createLogoDir function (line 7).
• Initializing a listener to the browser so TrustBar can get all types of notifications from the browser. Line 5 adds the user-defined listener TrustBarProgressListener (Listing Three) to the browser. The listener definition includes all notifications that have to be implemented.

One of the main notifications to which TrustBar responds is the onSecurityChange notification (line 10). This notification is received whenever a switch occurs between unprotected and secure sites and vice versa. We look into whether the browser switched to a secure or unprotected site in line 13. The switch statement checks the State variable against two constant values in lines 15 and 26. If the browser security level changes to a secure site, TrustBar responds in line 15; if it changes to an unprotected site, TrustBar responds in line 26. The response consists of two functions — one for destroying the current secure UI, and the other for constructing the unprotected UI to conform to the new state. These two functions are straightforward. Basically, they hide some elements and make others visible. In the secure case, we initially verify the SSL certificate by calling verifySSLCertificate. If this check fails, TrustBar automatically switches to the unprotected state. If this verification passes, we make some UI initializations. In line 21, we call the updateTrustBarDB function, which checks whether the CA and the site are known to TrustBar and updates TrustBar’s local database accordingly. Based on the database, TrustBar decides whether to present its dialog.
Figure 3: TrustBar within a secure site.
DDJ (Listings begin on page 34.)
Figure 4: TrustBar dialog.
Extended Visual Cryptography Schemes
Hiding information right out in the open
DANIEL STOLERU

Visual cryptography is a graphical method of concealing information. The information you want to hide can contain graphics, hand- or machine-written text, spreadsheet calculations, and the like. A visual cryptography scheme is based on the fact that each pixel of an image is divided into a certain number m of subpixels. The number m is called the “pixel expansion” of the scheme (because there is no technical way to divide a pixel, you go the other way and expand any pixel to a matrix of m pixels). The basic model consists of several transparency sheets. On each transparency, a ciphertext is printed, which is indistinguishable from random noise. The hidden message is reconstructed by stacking a certain set of transparencies and viewing them. The system can be used by anyone without any cryptography knowledge and without performing any cryptographic computations. For more information on visual cryptography, see “Visual Cryptography & Threshold Schemes” by Doug Stinson (DDJ, April 1998).

There are many schemes for implementing visual cryptography. Moni Naor and Adi Shamir developed the Visual Secret Sharing Scheme (VSSS) (http://citeseer.ist.psu.edu/naor95visual.html). As a further generalization of a visual
cryptography scheme, the very existence of a secret image can be concealed by displaying a different image on each transparency. Naor and Shamir solved this problem in the case of binary (black and white) images for a (2,2) threshold scheme. The problem was also considered for a general access structure. Stefan Droste offered a higher generalization: By stacking the transparency of each participant in the scheme together, a secret image is recovered and there is only this one way to recover it (http://ls2-www.cs.uni-dortmund.de/~droste/). However, the participants of any arbitrary subset of the entire set of participants share a secret, too. Hence, you actually have a multitude of more or less secret images. M. Nakajima and Y. Yamaguchi presented a two-out-of-two Extended Visual Cryptography Scheme for “natural” (continuous tone) images (http://wscg.zcu.cz/wscg2002/Papers_2002/A73.pdf). They suggested a theoretical framework for realizing the Gray-level Extended Visual Cryptography Scheme (GEVCS) and presented some results and methods aimed at improving the contrast of the processed images.

In this article, I use the framework described by Nakajima and Yamaguchi to develop a Python-based application that embodies the proposed model. I call the application “PyEvcs.” While my dithering and image-processing techniques are original, the idea of using Python to implement a visual cryptography scheme is not new. Frank Stajano’s Visual Cryptography Kit (VCK) (http://www-lce.eng.cam.ac.uk/~fms27/vck/) and Thomas M. Thomson’s Visual Cryptography Project (http://citeseer.ist.psu.edu/thompson00rit.html) are both implemented in Python.
Daniel is a software developer for an investment banking company in Germany. He can be contacted at [email protected].
The Model
Figure 1 illustrates the model I implement. PyEvcs uses three images as input data:
the secret image (information) I want to conceal (The Cameraman), and two images representing information to be shared, Dorian and Lena. This model mainly consists of two phases — halftoning and encryption.
First, you need to transform the gray-level images into simple black-and-white images in such a way that they are still meaningful; that is, the obtained black-and-white replications (called “Intermediate Images”) mimic the aspects of the continuous-tone ones. PyEvcs uses an ordered dither algorithm, similar to the dithering techniques used for newspapers. To illustrate, I have defined two different dithering masks, but any viable mask can be easily inserted into the application and can be tested. But simply dithering the three images is not enough to obtain the proposed GEVCS. You also need to further process (encrypt) the images. This process is the most interesting part of the model. Again, the simplest visual cryptography scheme is a secret image “split” into two shared images. The shared images are printed onto separate transparencies and handed to the two participants in the scheme. To decrypt, the participants simply stack
their transparencies and are able to visually recognize the recomposed secret message. All three images involved in the scheme need to have the same dimensions. Furthermore, you must have corresponding pixels — two pixels in separate images with the same coordinates in the horizontal plane. In other words, if you stack the two images, a pair of corresponding pixels will perfectly superimpose. During the dithering process at the pixel level, any pixel in the original — continuous tone — image is expanded to a matrix of black and white subpixels, with the number of black or white subpixels determined by the gray level of the original pixel. You denote the number of white subpixels in a pixel’s expanded (halftoned) version by pixel transparency. In Figure 2, three different recomposition cases can be analyzed. The encryption process applies pixel-by-pixel to the three halftoned images, controlling the transparencies of the shared pixels such that the required transparency of the target pixel is obtained. The matrix obtained by expanding a pixel is similar to a binary matrix where the “1” elements represent black subpixels and the “0” elements represent white subpixels. The operations at the pixel level are binary operations in the OR semigroup. In the Figure 2, T1, T2, and Tt are the pixel transparencies in the share1, share2, and target images, respectively. In Figures 2(a) and 2(b), reconstruction is possible. It is merely a question of arranging the black and white subpixels inside the given matrix. The problem is solved by the encryption module of the application, which builds proper matrix collections for every required gray level. Why collections? Because the security of the scheme is also important. Therefore, at any encryption step, a matrix is randomly selected from the proper collection. However, in Figure 2(c), you can no longer obtain the required transparency for the corresponding pixel in the target image, no matter how you rearrange the subpixels inside the matrices. So how can we solve such a possible situation? Assume you define a 3D space having the transparencies of the pixels in the three images involved in our problem as axes — the x-axis represents the transparencies of the pixels in share1, the y-axis represents the transparencies of the pixels in share2, and correspondingly, the z-axis represents the transparencies of the pixels in the target image. Any point in the defined space is characterized by three values representing transparencies in the three mentioned images: p(T1, T2, Tt). You first determine the volume containing the points for which the reconstruction is possible, http://www.ddj.com
as in Figures 2(a) and 2(b). Afterwards, you analyze every point outside this volume, as in Figure 2(c). For instance, say you have a point (outside the “possible zone”) p'(T'1, T'2, T't) and you still want to be able to encrypt that pixel. However, you don’t want the initial image to deteriorate. Consequently, you have to accept a compromise — the application determines the closest point to p' situated in the “possible area,” say p"(T"1, T"2, T"t), and will replace p'
with p". That is, the transparencies of the corresponding points in share1, share2, and target become T"1, T"2, and T"t, respectively. Nevertheless, you should check to ensure that the new transparencies are not disturbing your images so much that the original image will no longer be recognizable. The security of the scheme is another matter. While modifying the shared images, you are not allowed to “betray” the secret image. Thus, the problem slowly becomes more and more complex.
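The feasibility test behind the okzone can be stated in a few lines. The following sketch is an illustration of the constraint, using the white-subpixel-count convention defined earlier and an OR-of-black stacking model; the numeric examples are made up rather than taken from Figure 2.

def feasible_target_transparencies(t1, t2, m):
    """Range of target transparencies reachable by rearranging subpixels only.

    t1, t2: number of white subpixels (transparency) in the two share blocks;
    m: pixel expansion (total subpixels per block). Stacking is an OR of black,
    so a target subpixel is white only where both shares are white."""
    low = max(0, t1 + t2 - m)    # unavoidable white-over-white overlap when t1 + t2 exceeds m
    high = min(t1, t2)           # a white target subpixel needs white in both shares
    return low, high

# Example with a 5x5 expansion (m = 25), echoing the three cases of Figure 2:
for t1, t2, tt in [(15, 15, 10), (20, 18, 14), (6, 7, 7)]:
    low, high = feasible_target_transparencies(t1, t2, 25)
    print(t1, t2, tt, "feasible" if low <= tt <= high else "needs adjustment")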
Figure 1: The representation of the proposed Gray-level Visual Cryptography System.
Figure 2: Three examples of pixel arrangements.
Luckily, Python makes handling the problem much easier.

The Implementation
PyEvcs uses the core functionality of Python, the Python Imaging Library (PIL), and the “numarray” packages. For some GUI functionality, I also used Tcl/Tk. There are a variety of dithering techniques that can be implemented. I chose an ordered dither algorithm and designed two representative dithering masks declared as global variables; see Listing One (any other dithering mask can be inserted into the code). The dithering function takes a corresponding dithering mask as a parameter and returns the determined binary matrix. The dithering function also calculates the transparency (number of white subpixels) for every pixel in the gray-level (original) image, and maps the coordinates of the pixel with the corresponding transparency value; see Listing Two.

As for the encryption process, Listing Three determines the okzone containing the points for which reconstruction is possible without modifying the images; see Figures 2(a) and 2(b). Afterwards, I define a “distance” function. For the points outside the okzone, it is necessary to calculate the shortest distance between the point in question and the okzone; see Listing Four. To properly encrypt the three dithered images, you first rearrange the subpixels in the obtained pixel matrices (see Figure 2). You then obtain a 2×m binary matrix in which the proper number of [1 1], [1 0], [0 1], and [0 0] columns can be calculated. If you find a point outside the okzone, you just determine the closest candidate in the okzone and correspondingly replace the transparency values; see Listing Five. Now that the 2×m matrix is known, you need only construct the proper square
matrix used to generate the matrix collection; see Listing Six. The matrix collection is generated by randomly permuting the columns of the base matrix; see Listing Seven.
Now you are ready to apply the algorithm for obtaining the two shared images of the Gray-level Visual Cryptography Scheme. All you need to do is to put everything together in the proper order; see Listing Eight. At the end of this function, you use the psprint procedure in conjunction with the v1 and v2 views. Again, in visual cryptography, the decryption is processed directly by the human user’s vision. The secret image is revealed by properly stacking the two shared images. PyEvcs can generate all the required images (also a couple of intermediate images for testing purposes) in PostScript format. To test the results quickly, PyEvcs also simulates the superimposition of the two shared images. In such cases, the function is not very complicated; see Listing Nine.

Conclusion
Extended visual cryptography schemes let you construct visual secret sharing schemes in which the shared images are meaningful. So again, why Python? Image processing means intensive calculations, reading/writing image files in different formats, Boolean transformations at the pixel level, and many other operations. I have more than 10 years of experience in developing complex C++ and Java applications, and I’ve also developed diverse programs using Perl, Tcl, Visual Basic, and shell scripting. All programming languages and development environments have their strengths and weaknesses, but until Python, no language has provided such intuitiveness and effectiveness.

Visual cryptography represents a technique for information concealment with major advantages — it is very low-tech because all that you need to use it is a couple of transparencies and a printer. Additionally, the decryption process is immediate and does not require any cryptographic knowledge from users. To get the secret information, participants of the scheme need only properly stack the transparencies. This presents a high level of security (some schemes are absolutely secure), proven in the theoretical studies. That said, visual cryptography schemes suffer mainly because of pixel expansion (any pixel in the original image is expanded during the encryption process to an m×m matrix of pixels) and loss of contrast (you have seen that the contrast of both shared images and the target image can suffer deterioration). Combining the theoretical research in visual cryptography with more developed testing applications in Python makes a major contribution toward finding some solid practical applications for visual cryptography.
DDJ
Listing One
Listing Two

def encrypt(self, mask = _boxMask):
    """ Perform the actual dithering algorithm. For any pixel in the original
    image the function fills a square with black and white pixels in resulting
    image. If the grey level of the pixel in original image is greater than the
    one specified in the mask, the corresponding subpixel in resulting image
    will be black. Otherwise, the subpixel in the resulting image remains white.
    Precondition: the image must have a specified size."""
    length = len(mask)
    result = bitmap((length*maxX, length*maxY))
    for x in range(maxX):
        for y in range(maxY):
            level = self.__img.getpixel((x,y))
            transparency = 0
            for i in range(length):
                for j in range(length):
                    if level >= mask[i][j]:
                        #print a black pixel
                        result.set(i + x*length, j + y*length, 0)
                    else:
                        #print a white pixel
                        result.set(i + x*length, j + y*length)
                        transparency = transparency + 1
            self.__data[(x, y)] = transparency
    return result
Listing Three

def fillokzone(self, x, y):
    st = self.__secret.transparency(x, y)
    s1 = self.__share1.transparency(x, y)
    s2 = self.__share2.transparency(x, y)
    m = self.__secret.expansion()  #the total number of columns
    """st must be in the range [max(0, s1 + s2 -1), min(s1, s2)]. """
    Lbound = max(s1, s2)
    tmp = s1 + s2
    Ubound = min(25, tmp)
    if st >= Lbound and st <= Ubound:
        self.__okzone.append([s1, s2, st])
Listing Four

def findmindist(self, x, y, z):
    dist = 0
    distmap = {}
    if len(self.__okzone) <= 0:
        raise "The images cannot be encrypted. okzone empty!"
    x0, y0, z0 = self.__okzone[0]
    min = sqrt((x - x0)*(x - x0) + (y - y0)*(y - y0) + (z - z0)*(z - z0))
    for i in range(len(self.__okzone)):
        xp, yp, zp = self.__okzone[i]
        dist = sqrt((x - xp)*(x - xp) + (y - yp)*(y - yp) + (z - zp)*(z - zp))
        distmap[dist] = [xp, yp, zp]
        if dist < min:
            min = dist
    return distmap[min]
Listing Five

def numberofcolumns(self, x, y):
    """ For every pixel we need to determine an encoding matrix. The function
    will return a tuple containing the number of 11, 10, 01 and 00 columns.
    The values are determined based on the transparencies of the
    corresponding 3 pixels."""
    #set the transparency factors
    stp = self.__secret.transparency(x, y)
    s1p = self.__share1.transparency(x, y)
    s2p = self.__share2.transparency(x, y)
    m = self.__secret.expansion()  #the total number of columns
    """st must be in the range [max(0, s1 + s2 -1), min(s1, s2)]. If the
    condition doesn't hold, we'll try to adjust the transparency of
    the target - st."""
    Lbound = max(s1p, s2p)
    tmp = s1p + s2p
    Ubound = min(25, tmp)
    if stp < Lbound or stp > Ubound:
        s1, s2, st = self.findmindist(s1p, s2p, stp)
    else:
        s1, s2, st = s1p, s2p, stp
    p00 =
    p01 =
    p10 =
    p11 =

Listing Six

def buildbasematrix(self, p11, p10, p01, p00):
    exp = self.__expansion
    matrix = zeros((2, exp*exp))
    stop = p11
    for i in range(stop):
        matrix[0, i] = 1
        matrix[1, i] = 1
    stop = p11 + p10
    for j in range(p11, stop):
        matrix[0, j] = 1
        matrix[1, j] = 0
    stop = p11 + p10 + p01
    for k in range(p11 + p10, stop):
        matrix[0, k] = 0
        matrix[1, k] = 1
    stop = p11 + p10 + p01 + p00
    for l in range(p11 + p10 + p01, stop):
        matrix[0, l] = 0
        matrix[1, l] = 0
    return matrix

Listing Seven

def randompermutation(self, matrix):
    new_matrix = []
    for x in range(len(matrix)):
        new_matrix.append([])
    numcols = range(len(matrix[0]))
    for y in range(len(matrix[0])):
        choice = random.choice(numcols)
        for x in range(len(new_matrix)):
            new_matrix[x].append(matrix[x][choice])
        numcols.remove(choice)
    return new_matrix

Listing Eight

def encodeimages(root, file1, file2, file3):
    pm = pixelmatrix(file1, file2, file3)
    maxX, maxY = pm.size()
    expandorder = pm.expansion()
    share1 = dithering1.bitmap((expandorder*maxX, expandorder*maxY))
    share2 = dithering1.bitmap((expandorder*maxX, expandorder*maxY))
    for x in range(maxX):
        for y in range(maxY):
            pm.fillokzone(x, y)
    for x in range(maxX):
        for y in range(maxY):
            p11, p10, p01, p00 = pm.numberofcolumns(x, y)
            basismatrix = pm.buildbasematrix(p11, p10, p01, p00)
            permutedmatrix = pm.randompermutation(basismatrix)
            ps1 = pm.pixelonshare(permutedmatrix[0])
            length = len(ps1)
            for i in range(length):
                for j in range(length):
                    if ps1[i][j] == 0:
                        #write a white pixel on share1
                        share1.set(i + x*length, j + y*length, 0)
                    else:
                        #write a black pixel on share1
                        share1.set(i + x*length, j + y*length)
            ps2 = pm.pixelonshare(permutedmatrix[1])
            length = len(ps2)
            for i in range(length):
                for j in range(length):
                    if ps2[i][j] == 0:
                        #print a white pixel on share2
                        share2.set(i + x*length, j + y*length, 0)
                    else:
                        #print a black pixel on share2
                        share2.set(i + x*length, j + y*length)
    v1 = share1.view(root, "First Share")
    v1.psprint("FirstShare.ps")
    v2 = share2.view(root, "Second Share")
    v2.psprint("SecondShare.ps")
    decryptedImage = dithering1.decrypt(share1, share2)
    v3 = decryptedImage.view(root, "Decrypted Image")
    v3.psprint("DecryptedImage.ps")
    if showAllImages != 0:
        v4, v5, v6 = pm.showimages(root)
        v4.psprint("FirstImgDithered.ps")
        v5.psprint("SecondImgDithered.ps")
        v6.psprint("TargetImgDithered.ps")
        return v1, v2, v3, v4, v5, v6
    else:
        return v1, v2, v3

Listing Nine

def decrypt(share1, share2):
    """In visual cryptography the decryption should be done only by
    superimposing the two shares. Here we just simulate the process."""
Inside the SmartDongle USB Security Key
Hardware support for security
JOEL GYLLENSKOG
On Christmas Eve 1997, MicroWorks (my company) entered into a joint venture agreement to write a point-of-sale software package. We would do the programming, and our joint venture partner would do the sales and marketing. As often happens, we underestimated the time and effort required to complete the task, but a year and a half later, a product emerged ready for shipment. We put a lot of blood, sweat, and tears into our software and were eager to find a niche in the marketplace. Most of the sales were to small specialty stores that wanted to track inventory. Sales were primarily through dealer networks. One dealer convinced us that if we made a few modifications to the system, he could place a copy in every business on one of the islands in the Caribbean. We gleefully tailored the code to meet his requirements. After some time passed, however, we noticed that this particular dealer only licensed a single copy. To this day, we think this one copy was cloned and made its way into many businesses where intellectual property rights are not respected.

Time passed and we looked for new niches and created more software. Still, the memory of what happened on this island, and the realization that software could be “borrowed” and not purchased, lingered in our minds. Thus, being engineers, we decided to create a device that would prohibit the easy “copying” of software. We knew that dongles were commercially available, but their prices seemed prohibitively high for what we desired.

Joel is president and senior engineer at MicroWorks. He can be contacted at [email protected].
Previously, we had done some engineering work for another company that produced USB devices and had good manufacturing connections. Through this connection and our experience in the field of USB drivers and devices, we created a robust USB security key at a modest price point.

For some people, cracking security has a monetary incentive. For others, it is an intellectual challenge. With today’s debuggers and decompilers, it is a simple task to step through a running program and watch as each instruction is executed. It is easy enough to change a conditional branch to a no-op or to an unconditional branch. We decided to create a system that could not be cracked by such means. To that end, we designed a security key so that decision-making is done on the dongle and not by the application program. By putting the process in the dongle, we can control what would-be crackers can see. Another design choice was to create a dongle with a generous amount of memory. This lets application programs store information required for the successful execution of the program. This approach lets the dongle be used in many different ways. We are not limited to simple yes/no operations. Programs can be written so that they read/write data on the dongle. With this exchange of data, it is harder for crackers to compromise the system.

Design Choices
For the reasons just mentioned, we selected as the “brains” of the security key a part from Cypress Semiconductor that consisted of a processor, RAM, and ROM on the same chip. The part is designed so that the code burned onto the ROM is “execute” only. Even with hardware tools, nobody can read the program. The dongle also has 32 KB of flash memory for placing critical program information on the dongle. To make sure the information on the dongle is secure, we use AES 128 encryption. Included in the “execute-only” portion of the dongle are the keys required to decrypt the data. In a lab environment, someone might cut a dongle open and read the flash memory, but without knowing the keys,
the data would be scrambled beyond recognition. One reason for choosing a 128-bit encryption scheme is that it is the strongest key that the U.S. Government allows for export. There are a few countries where we cannot do business, but we can live with that.

As for the name, “SmartDongle USB Security Key,” one important consideration in coming up with a name was its availability in the realm of the Internet. We had no trouble acquiring the “smartdongle.com” domain name.
Establishing Communication
The process by which communication is established between the SmartDongle and the computer works like this: The SmartDongle has a free-running counter. From the time power is supplied to the chip, the firmware on the dongle starts adding one to a counter. The chip runs at 4 MHz, so the counter changes rapidly. When the application program attempts to make contact with the SmartDongle, the number is quite large. Not only is it large, but it is unpredictable. When the application program signals that it wants to communicate with the dongle, the dongle takes that large, unpredictable number, which I call “L1,” and finds the next value in its linear congruential sequence, which I call “L2.” The value of L2 now is randomly and uniformly spread over the range from 0 to 2^64 – 1. The dongle sends L2 back to the PC and waits. The PC has the ability to generate the same linear congruential sequence. It finds the next value, which I call “L3,” and sends it back to the dongle. The dongle compares the value it receives from the PC with the value it calculated. If they are identical, communication is established. If they differ, the dongle requires that the process start again.
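In outline, the challenge/response looks like the following Python sketch. The multiplier and increment shown are arbitrary placeholders for illustration; the real a and c are secrets shared only by the dongle firmware and the PC-side library.

M = 2**64                    # modulus: arithmetic is done on unsigned 64-bit numbers
A = 6364136223846793005     # placeholder multiplier (the real a is kept secret in the firmware)
C = 1442695040888963407     # placeholder increment (likewise secret)

def lcg_next(value):
    return (A * value + C) % M

# Dongle side: L1 is the free-running counter value, L2 the challenge sent to the PC.
def dongle_challenge(counter_l1):
    return lcg_next(counter_l1)

# PC side: advances the same sequence one more step and answers with L3.
def pc_response(l2):
    return lcg_next(l2)

# Dongle side: communication is established only if the PC knew a and c.
def dongle_verify(l2, l3_from_pc):
    return l3_from_pc == lcg_next(l2)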
Linear Congruential Sequences
The linear congruential sequence is a popular and useful method for generating pseudorandom numbers. Pseudorandom numbers are values generated by an algorithm that appear to be random, but can be recreated at will. The process was introduced by D.H. Lehmer (see “Mathematical Methods in Large-scale Computing Units,” Proceedings of the Second Symposium on Large-Scale Digital Calculating Machinery, 1951) and enhanced by W.E. Thomson (see “A Modified Congruence Method of Generating Pseudo-random Numbers,” Computer Journal, 1958). It is clearly taught in Donald Knuth’s The Art of Computer Programming, Volume 2: Seminumerical Algorithms, Second Edition (Addison-Wesley, 1981). The algorithm works like this:

    L_{n+1} ← (a * L_n + c) mod m
where a is the multiplier, c is the increment, and m is the modulus. The initial value (L_0) is called the seed. Choosing the “right” values for a, c, and m is crucial. If the wrong values are chosen, then the sequence repeats quickly. To make life easy for everyone, it makes sense to use a modulus that works with the arithmetic instructions of the processor at hand. For the SmartDongle, we chose a modulus of 2^64. The little processor in our Cypress chip does its arithmetic 8 bits at a time. We use simple loops in the firmware to do the arithmetic on the unsigned 64-bit numbers.

The Linear Congruential Sequence
The values chosen for a and c can yield widely different results. We would like to have as many different numbers appear as possible. With a modulus of 2^64, there are 2^64 possible numbers that can occur in the sequence. Attempting to keep track of which numbers have occurred in the sequence sounds like a daunting task. If we were to attempt to create a bit array and keep track of which numbers appear in the sequence, it would take more RAM than exists in all computers that have ever been built on Earth (about 2 million terabytes). Fortunately, we have other options. The following algorithm stops the first time a number is repeated in a sequence. The function f can be any function:
    count ← 0
    X ← Y ← seed
    do
    {
        count ← count + 1
        X ← f(X)
        Y ← f(f(Y))
    } until X = Y
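In Python, the same cycle check might read as follows. This is an illustrative sketch: the modulus is kept deliberately tiny so the demonstration terminates quickly (exhausting 2^64 values is impractical, which is exactly the point made above), and the a, c pairs are arbitrary.

def cycle_length(f, seed):
    """Floyd's tortoise-and-hare: count steps until the sequence first repeats."""
    count = 0
    x = y = seed
    while True:
        count += 1
        x = f(x)
        y = f(f(y))
        if x == y:
            return count

m = 2**16
for a, c in [(5, 3), (3, 3)]:
    step = lambda v, a=a, c=c: (a * v + c) % m
    print(a, c, cycle_length(step, seed=12345))
# With m a power of two, the full period m is reached only when c is odd and
# a % 4 == 1 (Hull-Dobell theorem); a = 3 here gives a shorter cycle than a = 5.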
This algorithm certainly satisfies my idea of a robust algorithm. It is simple and yet effective. It uses only a trivial amount of memory, performs a modest number of calculations, and yet it works.

Choosing “Good” Values for a and c
Not all of the potential values for a and c yield sequences that are even close to being random. As a trivial example, if a is chosen to be zero, then the sequence quickly deteriorates. Regardless of the seed supplied, after the first value, all subsequent values have the value of c. This is not a very useful sequence. Similarly, if a is chosen to have a value of 1, the linear sequence is way too predictable. “Good” choices for a and c are prime numbers that are relatively prime to the modulus. Integers are stored on a PC as binary numbers. The modulus for binary numbers is a power of 2. To be relatively prime to the modulus, the values of a and c must be odd. Some quick experimenting with values of a and c using the foregoing algorithm shows that whenever the numbers are prime, the period of the linear sequence is maximized. Stated another way, as long as a and c are relatively prime to the modulus, the values in the sequence L1, L2, L3, L4… do not repeat until all possible values have appeared exactly once. These sequences work well in our SmartDongle application. In other applications, the sequences may fail miserably. For instance, using the sequences to simulate a coin toss could be done by testing to see if the numbers are even or odd. The problem is that since a and c are both odd, the values in the linear congruential sequence alternate between even and odd numbers. In this example, selecting a bit other than the low-order bit works much better.
Prime Numbers
There are lots of prime numbers that can be represented in 64 bits. We need to have a method for selecting from this rich set. This can be done without requiring a large amount of computation. For starters, we know that all of the values we want to use for a and c must be odd. I use t as the value to be tested to see if it is prime and d as the divisor. The variable q will be the quotient.

    d ← 3
    do
    {
        q ← t / d
        if ((q * d) = t) then
            t is not prime
        d ← d + 2
    } until q < d

If the program drops out of the loop, it means that t is a prime number. We can get away with stopping when q is less than d because multiplication is commutative. Remembering back to the days of algebra, we learned that q times d is the same as d times q. That means that the comparison to see if q*d is equal to t need only be made for divisors d less than or equal to the square root of t. Rather than use a separate calculation to find the square root of t, we can accomplish the same objective by just comparing the divisor and the quotient.

Finding Prime Numbers
The final step in finding values for a and c is pretty simple. We select an arbitrary odd number and test to see if it is prime. If it isn’t, we add 2 and try again. It doesn’t take many iterations before a prime is found:

    t ← an odd number we pull out of the air
    d ← 1
    do
    {
        d ← d + 2
        q ← t / d
        if ((q * d) = t) then
        {
            t is not prime
            t ← t + 2
            d ← 1
        }
    } until q < d
When we exit the loop, t will be a prime number. We use this method to find a and c, we use 2^64 as our modulus, and we use the value in our free-running counter as the seed value for the sequence. The result is a query that the SmartDongle puts to the PC that is difficult to answer without knowing the values of a and c.

Conclusion
We believe the procedures outlined here provide a reasonable level of security. Part of that comes from reversing the roles in the interrogation process. Part of it comes from the randomness introduced by the amount of time that elapses between the moment that power is delivered to the SmartDongle and the instant that the application program attempts to start the communication process. Part of it comes from the pseudorandom nature of linear congruential sequences. Together, these processes make it more difficult to compromise the system. Had we used a SmartDongle in our point-of-sale software, I’m pretty sure that retailers on an island in the Caribbean wouldn’t have thought it was worth their time and effort to attempt to crack the code.
DDJ
PROGRAMMER’S TOOLCHEST
Developing JSR-168 Portlets
Aggregating web contents
TED O’CONNOR AND MARTIN SNYDER
Portlets are applications viewed inside a portal framework. With the increased adoption of Java portals, both as intranets and public sites, comes the need to separate portlet functionality from the portal to both maximize code reuse and limit your dependency on specific vendors. It used to be that if you wanted to migrate portlet functionality from one portal to another, you would have to rewrite massive amounts of code. If you were lucky and planned for this potential scenario in the beginning, you might have been able to limit your changes to the JSP and minimize the changes to the supporting classes. Now, as portal vendors begin to support the Java Specification Request #168 (http://www.jcp.org/en/jsr/detail?id=168), this is becoming less of an issue for portlet developers. Writing JSR-168 portlets lets you become portal agnostic and lets vendors support a wider spectrum of platforms with little to no code changes. In this article, we examine a portlet that Wingspan Technology (where we work) developed and contributed to the open-source community. This “Published Folder” portlet is used to expose the contents of a particular folder within EMC’s Documentum eContent Server content repository (http://www.documentum.com/). We then get into the specifics of how and why this was done.
Ted and Martin are senior technologists at Wingspan Technology. They can be reached at [email protected].
A portal is a web application whose primary purpose is to aggregate web content in portlets. While a portal may also provide user management and authentication, personalization, and other functionality across portlets, the role we focus on is the portlet container. Some of the more popular commercial Java portals are Vignette Application Portal, IBM WebSphere Portal, BEA WebLogic Portal, Oracle Portal, and Plumtree Corporate Portal. With the introduction of the JSR-168 specification, open-source portals are starting to take large bites out of the portal market. The leading open-source JSR-168 portals currently are eXo, Liferay, JetSpeed, and uPortal. One of the most common public examples of a portal site is my.yahoo.com. Table 1 presents the results of an informal poll of portal popularity that was conducted by Punit Pandey (http://portlets.blogspot.com/).

Portlets are the meat within the portal. They provide the real content that the end user is looking for. A portlet could be as simple as a design element composed of images or text, or it could have more personalized content such as a local weather forecast. There are no limits to the complexity of a portlet. For example, a single portlet could act as the entire web interface for some legacy system. In some portals, like eXo, the portlets are even responsible for the portal navigation. The JSR-168 portlet specification defines a set of APIs to address portlet aggregation, personalization, presentation, and security. Its main goals were that the portlets be simple, client agnostic, and secure. They would also support localization and internationalization, hot deployment, and remote execution.

Enterprise Content Management (ECM), sometimes referred to as “Enterprise Document Management,” is a type of application that administers the storage, organization, classification, and retrieval of company data. This is typically in the form of a document repository. Additional functionality such as workflow, search, ver
sioning, and access control are also standard. Some examples of ECM solutions are Documentum, FileNet, and SharePoint. The Requirements The requirements of this project can be broken down into two areas — characteristics and functionality. People working
“Portlets provide the real content that the end user is looking for” on projects with similar required characteristics would be wise to consider developing JSR-168 portlets as all or part of their solution. For our purposes, the requirements were: • The ability to run on a variety of portal applications. • Simple to install and configure. • Limit additional libraries that must be shipped with the application. The key functional requirements were to: • Allow any number of users to access content in Documentum without specifying Documentum credentials. • Provide Administrators with a mechanism to specify the folder to display and connection information for Documentum. The Challenges Most of the challenges we encountered were a result of JSR-168 being in its infancy. Because of this, the different portals have varying levels of support. For example, some of the portals support custom http://www.ddj.com
window modes and states while others only support a handful of their own predefined custom states. The installation procedures can also vary greatly between the portal applications. Few of the portals support direct import of the JSR-168 portlet Web Archive (WAR) file. Many of them require the WAR to be run through a preparation tool or some other multistep deployment process. In addition to the lack of support, there are also serious bugs in some of the portal and application servers. One common problem is with versions of Apache Tomcat prior to 5.5. Any portals using the older versions of Tomcat, including Pluto (the reference implementation), have problems using the application session across web contexts. This means that the portlets cannot share the session with servlets in the same WAR distribution. This is an important consideration if the client accesses servlets outside of the portlet and you need to share data between the two (http://issues.apache.org/jira/browse/PLUTO-53/).

Another challenge is the differences in portal architectures. Some portals require user authentication or at least have a built-in mechanism for logging in users. Others, however, make that optional or leave it as a function that a custom portlet must implement. This presents a problem when you want to provide administrative functionality through a special view or page. Several portals, such as eXo, Vignette, and IBM, support a custom "config" portlet mode to accomplish this very thing. But because this is not a standard mode, we avoid using it. We ultimately decided to avoid the user authorization problem entirely by just storing our configuration settings in a properties file. The downside to this solution is that every instance of the portlet within the portal shares the configuration settings.

The Architecture

There are three main sections to the DocWay Published Folder portlet in terms of architecture:

• JSR-168 Portlet Class.
• Display JSP and Class.
• Content-Retrieval Classes.

The JSR-168 portlet class is PublishedFolder.java (Listing One). This is the main entry point for the portlet and is the only class that the portal framework is really aware of for this project. The portlet class loads the Java Server Page (JSP) file that is, in turn, responsible for calling the content classes and displaying the folder contents. When users click on a link, the content-retrieval classes stream the content to the browser.
The Code

The PublishedFolder class implements the interface that the portal uses to communicate with the portlet, and controls the activation of the JSP file. In this case, we extend the javax.portlet.GenericPortlet abstract class. We could have alternatively implemented the javax.portlet.Portlet and javax.portlet.PortletConfig interfaces. However, because this portlet is so simple, it was much easier to extend the generic version and override the doView method. Typically, the portlet class is more complex, as most portlets have multiple views (view, edit, help, and so on) and also have to handle user actions. The key line in this class is where the PortletRequestDispatcher is created by a call to getRequestDispatcher on the PortletContext. This is important because this is how our view.jsp file gets loaded. For details on deployment descriptors and other information, see "Introduction to JSR 168: The Java Portlet Specification" (http://developers.sun.com/prodtech/portalserver/reference/techart/jsr168/pb_whitepaper.pdf).

The view.jsp file (Listing Two) that gets loaded by the portlet class retrieves the document listing from the content source and renders the results using the HTMLRenderer class. There are a couple of points to make about this JSP. First, there is no business logic. The JSP just gets and displays results. It also is the last line of error handling before the exception or error propagates up to the portal level. This is why we catch Throwable instead of Exception. Because this portlet could be on a page with countless other portlets, we need to ensure that any problems we encounter do not bubble up and crash the entire page or possibly the entire portal server. This is not just a good idea out of consideration for the other portlets on the page, but it can greatly ease debugging efforts. The different portals handle page-level exceptions in varying ways. Some provide the actual exception details (message, source, and stack trace); however, some just report a portlet-level exception as an internal server error with an HTTP 500 message. In the latter case, not only are you prevented from determining the root cause of the error, but if the 500 error prevents access to administrative links, you may not be able to even remove the portlet from the page without uninstalling it.

The abstract ContentSource class (Listing Three) is the start of the content-retrieval process. This simple class acts as the broker for the actual content-source implementations. By abstracting the repository-specific code, we are able to keep the display and portlet classes unchanged, regardless of the source of the documents. Classes that extend ContentSource reside in the
com.docway.publishedfolder.content.impl package and must implement the getContentList and getContentDocument methods. The static ContentSource.getContentSource method currently just returns a new instance of DocumentumSource. However, this could be enhanced to use a class loader to instantiate the implementing class based on a property file setting. This way, there might be SharePointSource, JdbcSource, and the like, and the administrator could change the source without any recompiling. The DocumentumSource class is not shown, but all it does is issue DQL queries to return a list of documents or a single document for the abstract ContentSource methods.

Because the notion of a "document" is dependent on the data source, we encapsulate this in the ContentDocument class (Listing Four). The ContentDocument class assumes that all documents, regardless of the source, have at least some standard properties; namely, ID, name, content, and size. Other custom or nonstandard attributes are handled using an ArrayList of key/value pairs. We implemented this as an ArrayList instead of a Map because we wanted to be able to use the order in which the attributes are added as the order in which they are rendered in the display. A Map would have required a separate mechanism for ordering the keys.

When the list is rendered in the HTML, the document's name will be a link to the document's content. The transmission of the document contents to the client machine is handled through a servlet, ContentServlet (Listing Five). The HTML link makes the call to the servlet, passing the selected document's ID as a query string parameter. This is then handed off to the ContentSource to retrieve the document from the configured source. The servlet then copies the InputStream from the ContentDocument to the HttpServletResponse OutputStream so the data is sent directly to the client.

Content-source configuration, as mentioned previously, is achieved with a properties file. The published_folder.properties file (Listing Six) lists the user credentials to impersonate and the location of the folder to publish. The username, password,
and domain properties make up the credentials and are self-explanatory. Every client that loads the portlet accesses the content source by impersonating the user represented by these credentials. The repository and location properties are used in conjunction to identify where the content is located. In the case of Documentum, the repository would be the name of the docbase and the location would be the r_object_id of the cabinet or folder to be published. If SharePoint were the source, these properties might be the site name and document library name, respectively. For a JDBC or filesystem source, they could identify the database and table or server and folder. Generally, the repository/location combination is enough to locate a "folder" in any content source. The document's ID could then be relative to one or both of these values depending on how identifiers are managed in the source. For example, in Documentum, the location property is not needed to retrieve the document contents because the repository (docbase) and document ID (r_object_id) are enough to uniquely identify the requested object.

Enhancements

There are several areas that are good candidates for future enhancements. Because the code was developed to make the ContentSource class easy to extend, and therefore to provide implementations for sources other than Documentum, this is the most obvious area for further development. Additional sources could be another ECM system such as SharePoint or FileNet, a JDBC data source, or even the file system. These could be done with as little as one class that extends the abstract ContentSource class. Another enhancement would be to provide a caching mechanism at the portal level. This would limit the frequency of the queries against the content source. This could be a major performance gain if the query is time intensive, because such queries currently are reexecuted with each page refresh. A third possible enhancement deals with the configuration challenge mentioned previously. A way to set the credential and repository information per portlet instance instead of per portal would make the solution more flexible and usable.

This portlet has been contributed to the open-source community under the GNU General Public License (GPL). It is now hosted on SourceForge, so feel free to download the code to try it out or contribute your own enhancements. For more information on the DocWay Published Folder, see http://sourceforge.net/projects/docway/ and http://www.wingspantech.com/.

DDJ
Listing One

package com.docway.publishedfolder.portlet;

import java.io.IOException;
import javax.portlet.*;

/** Main portlet class. Handles render requests from portal framework. */
public class PublishedFolder extends GenericPortlet {
    public void doView(RenderRequest request, RenderResponse response)
            throws PortletException, IOException {
        response.setContentType("text/html");
        final PortletContext context = getPortletContext();
        final PortletRequestDispatcher rd = context.getRequestDispatcher("/view.jsp");
        rd.include(request, response);
    }
}

Listing Two

<%@ page import="com.docway.publishedfolder.content.ContentSource,
                 com.docway.publishedfolder.content.ContentDocument,
                 com.docway.publishedfolder.HTMLRenderer" %>
<%
    String html;
    try {
        // retrieve document listing from content source
        final ContentSource source = ContentSource.getContentSource();
        final ContentDocument[] documents = source.getContentList();
        // convert the results to HTML
        html = HTMLRenderer.render(documents);
    }
    catch (Throwable t) {
        // convert the exception or error to HTML
        html = HTMLRenderer.render(t);
    }
    // display the HTML
    out.println(html);
%>

Listing Three

package com.docway.publishedfolder.content;

import com.docway.publishedfolder.content.impl.*;

/** Class used to retrieve ContentDocuments and abstract the source
 *  specific functionality and packages. */
public abstract class ContentSource {
    /** Gets subclass that implements any content source specific functions.
     *  @return ContentSource implementation */
    public static ContentSource getContentSource() {
        // For now this just returns the Documentum implementation. This method
        // could be easily modified to determine the implementation to use based
        // on the properties file or some other mechanism.
        return new DocumentumSource();
    }

    /** Retrieves an array of all documents that are located in location in
     *  the repository that was set in the application's resource bundle. This
     *  should only be used to display the list of documents and their
     *  attributes. Since document contents are not streamed at this point the
     *  ContentDocuments that are returned may not have that property populated
     *  even though content might exist.
     *  @return Array of ContentDocuments that are contained in the repository
     *          location or null if none found.
     *  @throws Exception */
    public abstract ContentDocument[] getContentList() throws Exception;

    /** Retrieves an individual document by ID that is located in location in
     *  the repository that was set in the application's resource bundle. This
     *  should only be used to retrieve document contents. Since attributes are
     *  not streamed the ContentDocument that is returned may not have its
     *  attribute map populated even though additional attributes might exist.
     *  @param contentID Unique ID of document to retrieve.
     *  @return ContentDocument with requested ID or null if not found.
     *  @throws Exception */
    public abstract ContentDocument getContentDocument(String contentID)
        throws Exception;
}

Listing Four

package com.docway.publishedfolder.content;

import java.io.InputStream;
import java.util.*;

/** Class represents a document returned from a ContentSource. There are
 *  several standard properties like id, name, content, and size. Any other
 *  custom attributes can be stored in the attributes ArrayList. */
public class ContentDocument {
    private String id;
    private String name;
    private InputStream content;
    private long size;
    private ArrayList attributes = new ArrayList();

    /** @param id
     *  @param name */
    public ContentDocument(String id, String name) {
        this(id, name, null, -1);
    }
    /** @param id
     *  @param name
     *  @param content
     *  @param size */
    public ContentDocument(String id, String name, InputStream content, long size) {
        super();
        this.id = id;
        this.name = name;
        this.content = content;
        this.size = size;
    }
    public void addAttribute(String key, String value) {
        attributes.add(new Attribute(key, value));
    }
    public Attribute[] getAttributes() {
        return (Attribute[]) attributes.toArray(new Attribute[attributes.size()]);
    }
    public String getId() { return id; }
    public void setId(String id) { this.id = id; }
    public long getSize() { return size; }
    public void setSize(long size) { this.size = size; }
    public InputStream getContent() { return content; }
    public void setContent(InputStream content) { this.content = content; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public static class Attribute {
        private String name;
        private String value;
        public Attribute(String name, String value) {
            this.name = name;
            this.value = value;
        }
        public String getName() { return name; }
        public String getValue() { return value; }
    }
}
Listing Five

package com.docway.publishedfolder.content;

import java.io.*;
import javax.servlet.ServletException;
import javax.servlet.http.*;

public class ContentServlet extends HttpServlet {
    private static final long serialVersionUID = 1L;
    private static final int CHUNK_SIZE = 2048;

    public void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        try {
            final String contentID = request.getParameter("content_id");
            final ContentSource source = ContentSource.getContentSource();
            final ContentDocument document = source.getContentDocument(contentID);
            response.setStatus(200);
            response.setHeader("Content-Length", Long.toString(document.getSize()));
            response.setHeader("Content-Disposition",
                "attachment; filename=\"" + document.getName() + "\"");
            copyStream(document.getContent(), response.getOutputStream());
        } catch (Exception e) {
            throw new ServletException(e);
        }
    }
    /** Copies an InputStream to an OutputStream.
     *  @param input
     *  @param output
     *  @throws IOException */
    private static void copyStream(final InputStream input, final OutputStream output)
            throws IOException {
        final byte[] chunk = new byte[CHUNK_SIZE];
        while (true) {
            final int i = input.read(chunk);
            if (i < 0) // eof
            {
                break;
            }
            output.write(chunk, 0, i);
        }
        output.flush();
    }
}
Listing Six

### Sample Properties file for DocWay Published Folder portlet
### The domain property is optional. All other properties are required
username=johndoe
password=jsr168rules
#domain=
repository=MyDocbase
location=0c00000180000112
The Eclipse Test and Performance Tools Platform

An open framework for tool builders

ANDY KAYLOR
Software developers suffer from a peculiar delusion — we each think that our application is the center of the universe, that the only reason people buy computers is to run our applications. Oh sure, in the abstract, we know this isn't true, but when we're working on applications, well, that's the way we think. As you might expect, solutions have been developed to accommodate this way of thinking. Operating systems let users run numerous applications side-by-side, while letting each one blissfully pretend that it is the only program running. But users know something the applications don't — several of these programs are solving related problems. Things such as integrated development environments (IDEs) simply put these related tools together in the same place. But this doesn't really solve the problem — it only moves it. Now we have tools running together, but for the most part, unaware of one another. What programmers really want is for all of these tools to work together and leverage each other's results and capabilities. So we've come to a fork in the road. Down one path, developers can lock into a single-vendor solution, relying on proprietary standards for interoperability. Down the other path, they can choose open standards for interoperability and be able to pick and choose tools designed to work with that standard. The Eclipse Test and Performance Tools Platform (TPTP) Project (http://www.eclipse.org/tptp/) is an example of the second path. The advantages that open standards and platforms provide include more options in selecting
Andy is a senior software engineer for Intel Corp. He can be reached at andrew [email protected].
the best tool for every job and more possibilities as to how tools can interoperate. What may not be obvious is that this path also holds advantages for tool builders. In this article, I examine the Eclipse Test and Performance Tools Platform Project (formerly known as "Hyades"), of which I'm a team member. I also explore ways in which tool developers can take advantage of the capabilities that TPTP provides to develop interoperable distributed test and performance tools.

What Is TPTP?

TPTP is an open platform that supplies frameworks, services, and data models to enable the development of integrated testing, tracing, profiling, and monitoring tools. Just as the Eclipse workbench provides a broad and extensible starting point for tool development in general, TPTP provides a starting point specifically for developing test and performance tools. Although TPTP includes a number of fully functional exemplary tools that have been developed on top of this platform, the primary goal of the platform is to provide a common base upon which additional commercial, open-source, or in-house test and performance tools will be built. The platform consists of a stack of components that includes data collectors, a communications framework, data models, analyzers, and viewers (see Figure 1). It is structured in such a way that proprietary tools can plug in at any point and take advantage of the capabilities offered by the rest of the stack.

The idea behind the TPTP project is to move the common components of test and performance software into a shared, open-source code base. Consequently, tool developers can focus their attention on features, thereby producing better products, which (as an added benefit) plug in to a common environment, paving the way for interoperability and symbiosis between tools. Tool vendors get the benefit of not having to develop and maintain the code infrastructure that is necessary for their product but really secondary to its actual functionality. End users get the benefit of better tools that know how to work together. Everybody wins.
Sounds great, right? But how do we get there? Naturally, a common user experience begins with a shared user interface. Just plugging in to the Eclipse workbench is a start toward a common look-and-feel, as well as a set of components on which to build UIs. Beyond that, the TPTP project provides a set of viewers and editors specifically geared toward test and performance analysis.
But viewers aren't useful unless they have something to view. The heart of the TPTP user experience is a common set of data models. TPTP defines four EMF-based data models:

• Test model.
• Statistical model.
• Logging model.
• Trace model.

By feeding data into these models, tools open up possibilities for interaction with analysis engines and viewers that are designed around these models. The data models are supported by a common framework for accessing data collection and test-execution agents. TPTP provides an extensible set of classes to discover and manage these agents and the data they provide. If an agent produces data in one of the formats TPTP recognizes, a standard data loader reads the data and populates the data model. Otherwise, the agent developer can provide a custom data loader to achieve the same end. Finally, TPTP includes a flexible framework to connect agents with the workbench.
In the rest of this article, I explore the agent framework, explaining the concepts it is based on and exploring how to develop agents to plug in to the TPTP project.

Communications

Everyone who has developed a software tool with distributed execution has had to address the problem of how to connect the remote components with the local components. Companies with multiple tools or tools with long histories have probably solved this problem several times and are quite possibly considering doing it again. No one ever seems to be happy with their remoting solution. But the absurdity is that this communications piece generally has nothing to do with the primary functionality that the tool is intended to provide. It's just something that has to be done along the way. It makes sense to adopt an open solution. General solutions such as the Adaptive Communication Environment (ACE; http://www.cs.wustl.edu/~schmidt/ACE.html) tend to be heavier than you might like. The TPTP communication framework is specifically designed to support testing and the performance analysis engine, and
provides a streamlined connection to the rest of TPTP. Figure 2 shows the basic architecture of the TPTP communications framework. TPTP provides a client library that manages communications with an Agent Controller that can be running locally or on a remote system. This Agent Controller manages access to and control of agents running on the target system. In addition, TPTP provides template code for agent development to handle boilerplate tasks such as registering with the Agent Controller and parsing incoming commands.

The actual communications between components are managed by dynamically loaded transport layers. TPTP provides transport layers to handle socket-based communications, and SSL and HTTP solutions are under development. Additional mechanisms can be supported through custom libraries written to the TPTP transport layer interface. The TPTP architecture is designed to isolate these layers from the rest of the system so that clients and agents can be written independent of the transport layer they use to communicate and transfer data. The Agent Controller uses a standard but extensible protocol for sending asynchronous commands back and forth between agents and clients. TPTP defines this XML-based protocol with a command envelope and basic set of interfaces and commands, but agent developers can extend the protocol by defining their own interfaces and commands.

Agents and Interfaces

In the context of TPTP, an "agent" is defined as a logical object that exposes services through the TPTP Agent Controller. In TPTP, these agents run in a separate process from the Agent Controller (though multiple agents can exist within a single process).
Figure 1: TPTP platform.

Figure 2: Basic architecture of the TPTP communications framework.
The Agent Controller recognizes external components through the connections they establish. When something establishes a connection with the Agent Controller through a transport layer, the Agent Controller assigns it an ID and begins seeing it as a logical object. If that object then sends the Agent Controller a registerAgent command, it is an agent. Data collectors are an important type of agent, but the agent concept is meant to be general within TPTP. Other agents can provide services such as file transfer or system information. Clients can request access to a specific agent by name, but to support the goal of general interoperability, clients can also locate agents by querying based on the command interfaces they support.

Interfaces and commands are the basic building blocks of the protocol TPTP uses for communications between components. Again, the problem of common terms comes in. TPTP understands an interface as a contract regarding a set of commands. Although there is no programmatic enforcement (such as compilation or linking errors), TPTP relies on the assumption that if a component says it supports an interface, then it handles all the commands in that interface. These interfaces are simply an agreement between components. Commands in TPTP are based on XML fragments. TPTP defines a command element with attributes such as source, destination, and context that it uses for routing these commands, but a subelement within the command fragment defines command-specific information that is only read by the component receiving the command. TPTP defines a standard set of interfaces for general concepts such as agent, collector, and event provider, but the protocol is designed to be extended using custom interfaces.

Agent Development

How can you put TPTP into your application? Listing One is an elementary piece of client code that locates an agent and sends it a command. Listing Two is code for the agent invoked in Listing One. If it seems like these two samples don't contain any meaningful code, that's by design. Remember that the goal of TPTP is to provide all of the infrastructure code, so that tool developers can focus on adding their particular functionality. Because this sample doesn't have any particular purpose other than demonstrating how to use TPTP to connect a client to an agent, all of the work is done in the libraries provided by TPTP. The client code interacts with an Agent Controller to obtain an instance of the agent it wants to use. It uses the TPTP
client library to connect to an Agent Controller that may be running on the same machine as the client or on a remote machine. From this point on, the client will not make any calls that are dependent on the location of the target system. That will be abstracted by TPTP. Next the client creates an object to act as a delegate for the agent it wants to control and passes that object to the local delegate of the Agent Controller. Behind the scenes, the TPTP client library interacts with the real Agent Controller to request that it provide an instance of this agent. The Agent Controller checks to see if there is already an instance of this agent running and available. If not, it launches the agent and manages its subsequent lifecycle. In either case, the client is put in contact with a running agent. (TPTP also provides ways to handle special cases such as when an agent must be launched in conjunction with the application it is monitoring.) Alternatively, if the client did not have a specific agent in mind, it could have queried the Agent Controller for a list of agents that supported a given interface, then obtained further metadata describing each of these agents before deciding which agent to request access to. The agent itself runs as a standalone process. The main routine for the process
simply creates an instance of the agent object, which derives from the BaseCollectorImpl class provided by TPTP, registers the agent, and waits for the agent to be terminated. The registerAgent method manages all of the common work of locating the Agent Controller, establishing communications, starting a thread to listen for incoming messages, and sending the registration command. The base class implementation further handles the work of parsing incoming commands and calling the agent's implementation of these commands. For standard interfaces, such as the collector and agent interfaces, the agent itself only needs to provide an implementation for those commands for which it wants to override the default implementation. If the agent wants to provide a custom interface, it also needs to override the processCommand method and interpret the command block for its custom commands.

In this example, I have presented both the client and the agent implemented in C++. Where does Eclipse come into the picture? For the agent itself, Eclipse really stays out of the picture. The agent always communicates through the Agent Controller and never needs to be specifically aware that it is talking to an Eclipse-based client on the other end of the line.
TPTP provides libraries to enable agent development in C, C++, and Java. On the client side, TPTP provides a C++ version of the client library that isn't integrated with the Eclipse workbench to provide a low entry point for tools that are currently implemented in C++ (though it is our hope that such tools will eventually proceed to integration with the Eclipse workbench and the rest of the TPTP client environment). TPTP also offers a Java-based client library that is fully integrated with the Eclipse workbench and the TPTP client framework. The C++ and Java client libraries are very similar, and all of the concepts I've described here apply to both.

Conclusion

If, after you've evaluated TPTP, you think it's a good solution but it doesn't quite do what you need, get in touch with us (http://www.eclipse.org/tptp/). We've designed TPTP to be flexible and extensible, but there's always room for improvement. TPTP is an open-source platform, and the TPTP project team is serious about listening to input, accepting new people into the conversation, and keeping our tools useful to the testing and performance tools community.

DDJ

Listing One

#include "tptp/client/INode.h"
#include "tptp/client/NodeFactory.h"
#include "tptp/client/Agent.h"
#include "tptp/client/Collector.h"
#include "tptp/client/IDataProcessor.h"

using namespace TPTP::Client;

class MyDataProcessor: public IDataProcessor {
public:
    MyDataProcessor() {}
    ~MyDataProcessor() {}
    // Handle the data coming
    virtual void incomingData(char buffer[], int length, DIME_HEADER_PTR_T dimeHeader)
    {
        // TODO: Do something with the data
    }
    virtual void invalidDataType(char data[], int length) {}
    virtual void waitingForData() {}
};

int main(int argc, char* argv[])
{
    // Create a Node that represents the Target Machine
    INode* SampleNode = NodeFactory::createNode("localhost");
    // Get the Agent Controller on the Node
    AgentController* agentCtrlr = SampleNode->connect(10002);
    // Request an instance of our agent
    Collector* myCollector = new Collector("com.ddj.tptp.sample.myCollector");
    agentCtrlr->getAgent(myCollector, TPTP_CONTROLLER_ACCESS);
    // Establish Data Path and Data Listener
    MyDataProcessor* dataProcessor = new MyDataProcessor();
    int dataConnectionID = myCollector->addDataListener(dataProcessor);
    // Starts the collector
    myCollector->run();
    Sleep( 5000 );
    // Stop the collector
    myCollector->stop();
    return 0;
}

Listing Two

#include "tptp/agents/BaseCollectorImpl.h"

class MyCollector : public BaseCollectorImpl {
public:
    MyCollector() {}
    ~MyCollector() {}
    virtual int run(CmdBlock* cmd)
    {
        char data[] = "<mySampleData> \
                       <mood>happy</mood> \
                       </mySampleData>";
        // TODO: Do whatever it is that starts data collection
        // Send the replay command
        int destID = cmd->getDestID();
        int sourceID = cmd->getSourceID();
        int contextID = cmd->getContextID();
        char commandFormat[] = "";
        char command[1024];
        sprintf( command, commandFormat, destID, sourceID, contextID );
        sendCommand( command );
        // Simulated activity for sample purposes only
        Sleep( 500 );
        sendData(sourceID, data, sizeof(data) );
        return 0;
    }
    virtual int stop(CmdBlock* cmd)
    {
        // TODO: Do whatever it is that stops data collection
        // Send the replay command
        int destID = cmd->getDestID();
        int sourceID = cmd->getSourceID();
        int contextID = cmd->getContextID();
        char commandFormat[] = "";
        char command[1024];
        sprintf( command, commandFormat, destID, sourceID, contextID );
        sendCommand( command );
        return 0;
    }
    // This sample doesn't expect to receive data
    virtual int receiveData(int sourceID, char buffer[], int bytesRead,
                            DIME_HEADER_PTR_T dimeHeader)
    {
        return 0;
    }
};

int main(int argc, char* argv[])
{
    MyCollector* collector = new MyCollector();
    collector->registerAgent("com.ddj.tptp.sample.myCollector", "MyCollector");
    collector->waitForTermination();
    return 0;
}
The Mac's Move to Intel

Migrating applications to the new hardware platform

TOM THOMPSON
Apple Computer's CEO Steve Jobs clearly dropped a bombshell when he told software developers at Apple's World Wide Developer's Conference (WWDC) that the Macintosh computer platform was going to switch from PowerPC to Intel x86 processors. He added, of course, that there would be time to adapt and manage the change, because Macs with Intel processors would be phased in over a two-year period. Mac old-timers will recall that Apple accomplished a similar major processor transition in 1994. Back then, the switch was from Motorola's 68K processors to Motorola/IBM's PowerPC. For software developers, the transition's difficulty depended upon whether their programs used high-level code exclusively, or accessed lower-level system services. In the latter case, it required that developers make fundamental changes in how they wrote their code; some of these changes I helped document in the book The Power Macintosh Programmer's Guide (Hayden, 1994). As a consequence, Apple has ample prior experience to guide its migration to the x86 processor.

Tom is a technical writer providing support for J2ME wireless technologies at KPI Consulting. He can be contacted at [email protected].

Pundits and others have already debated the wisdom and reasons for making the processor switch. The decision has been made; it's not worth repeating those arguments here. Seasoned Mac programmers are determining how painful and how expensive the transition is going to be. In this article, I examine the migration plan, and describe pitfalls you should be aware of.

Infrastructure Support

Migrating an application to the new hardware platform requires that there be a certain amount of infrastructure in place to support its development and execution. There are several key technical pillars that comprise this infrastructure. There must be:

• A native version of the operating system available to provide the needed system services.
• A mechanism to package an application's code for distribution and execution on disparate processors during the transition period. This scheme must be transparent to less tech-savvy users, or else you frustrate them when the application won't run because it's on the wrong platform.
• Tools that translate the application's source code into the platform's native code.

At the WWDC announcement, Jobs revealed that, since its introduction in 2001, every PowerPC release of Mac OS X has had an x86-based Doppelgänger — a separate Intel version of the OS was quietly developed and maintained (see the accompanying text box entitled "A Portable Operating System" for how Mac OS X's design allowed this). Mac OS X 10.4 (aka Tiger), which was released this April for the
PowerPC Macs, is slated to be the preliminary x86-based OS release. In short, the first infrastructure pillar is therefore already in place and has been tested for years. The core of the Mac x86 platform’s distribution mechanism is the universal binary file for Mac OS X applications. A universal binary carries two versions of the
application — a version implemented as PowerPC machine code, and a version implemented in x86 machine code — stored in a single, larger executable file. However, the application's GUI elements — TIFF images of buttons or other visual controls — can be shared between the two versions. Sharing these GUI elements, known as "resources," helps keep the universal binary application's size manageable. The universal binary scheme is an extension of Mac OS X's Mach-O executable files (see http://developer.apple.com/documentation/DeveloperTools/Conceptual/MachORuntime/FileStructure/chapter_2.1_section_7.html#//apple_ref/doc/uid/TP40000895-//apple_ref/doc/uid/20001298-154889-CJBCFJGH). Universal binary files consist of a simple archive that contains multiple Mach-O files; see Figure 1. Each Mach-O file contains contiguous
bytes of an application's code and data for one processor type. Special file headers enable the platform's runtime to quickly locate the appropriate Mach-O image in the file. Listing One shows how this is done. The "fat" header identifies the file as a universal binary and specifies the number of "fat" architecture structures that follow it. Immediately past the fat header, each fat architecture data structure references the code for a different processor type in the file. These architecture structures supply the runtime with the CPU type, an offset to the start of the embedded Mach-O file within this file, and its size. When a universal binary application is launched, the operating system uses the headers to automatically locate, load, and execute the Mach-O file of the application that matches the Mac's processor type.

Universal binaries thus form the second support pillar, the distribution mechanism. Typical users are unaware of the dual sets of code packaged in the file, and they can copy the application by simply dragging and dropping it. When they launch the application, the Mac OS X runtime performs a sleight-of-hand that loads the appropriate application code from the universal binary file, then executes it. No matter what Mac it's installed on, a universal binary application thus executes at native speeds on either PowerPC- or x86-based systems.
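The fat header and fat architecture structures are stored in big-endian byte order regardless of the host, so a tool reading them on an x86 machine must byte-swap the fields. As a rough, hypothetical sketch (not code from the article or from Apple's tools), walking the architecture entries with the structures from Listing One might look like this:

/* Hypothetical sketch: enumerate the architectures in a universal binary.
   Assumes the fat_header/fat_arch definitions from Listing One. The fields
   are big-endian on disk, so ntohl() converts them on a little-endian host. */
#include <stdio.h>
#include <stdint.h>
#include <arpa/inet.h>   /* ntohl() */

static void list_architectures(FILE *f)
{
    struct fat_header fh;
    if (fread(&fh, sizeof fh, 1, f) != 1)
        return;
    if (ntohl(fh.magic) != FAT_MAGIC)    /* not a universal binary */
        return;
    uint32_t count = ntohl(fh.nfat_arch);
    for (uint32_t i = 0; i < count; i++) {
        struct fat_arch fa;
        if (fread(&fa, sizeof fa, 1, f) != 1)
            break;
        /* each entry tells the runtime where one Mach-O image lives */
        printf("arch %u: cputype %d, offset %u, size %u\n",
               i, (int)ntohl((uint32_t)fa.cputype),
               ntohl(fa.offset), ntohl(fa.size));
    }
}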
Mac old-timers will recognize the universal binary scheme as an echo of the “fat” binary file format that handled software distribution during the Mac platform’s transition to the PowerPC. Fat binaries consisted of two versions of the application —68K and PowerPC— and the Mac OS determined which version to load and run. The fat binary distribution scheme worked very well, and based on its success, I have high expectations that the universal binary scheme will work, too. The third pillar, the development tools necessary to generate x86 code for the x86 Mac platform, is represented by Apple’s Xcode 2.1 development tool set. Xcode consists of an IDE with GUI-based tools such as a source-code editor and debugger. It uses GCC 4 compilers to generate x86 machine code from C, C++, Objective-C, and Objective-C++ source code. Source-level debugging is possible through the use of the standard GDB tool. For x86 development, you’ll need to install Xcode 2.1, along with the 10.4 Universal SDK. This SDK contains the APIs and header files that enable you to generate PowerPC code, x86 code, and universal binaries. Generating a universal binary becomes just a matter of selecting both PPC and Intel processors for code generation in Xcode’s controls, and building the program. Metrowerks, whose CodeWarrior toolset helped Apple get through the 68K/PowerPC transition, will not be participating in this transition. The company has sold its x86 compiler and linker to a third party, and thus, the CodeWarrior toolset can’t generate universal binaries. For a limited time, Apple offered a Developer Transition Kit that contained the Xcode tools and universal SDK. For actual testing on the target platform, the kit also had a preliminary x86 hardware platform with a 3.6-GHz Pentium 4 processor, running a preview release of Mac OS X 4.1 for Intel.
Figure 1: Universal binary files.
Code Casualties

Apple has laid a solid foundation for making the migration possible. However, any programmer who's done a code port regards this plan with healthy skepticism, because the infrastructure is still preliminary in some areas. More important, not all applications will be easy to port, and some applications will be left behind, due to design issues and costs. Let's see if we can't draw up a triage list of which applications are most likely to survive the transition. First and foremost, any application ported to the x86 Mac platform must be a Mac OS X application. Fortunately, Mac OS X provides a wealth of different APIs for writing and migrating applications — there's Carbon, Cocoa, Java, and BSD
UNIX. Table 1 provides a brief summary of the APIs that Mac OS X offers. To start the triage list, it should be obvious that if you're writing a kernel extension, driver, or other low-level system service that requires intimate knowledge of the kernel plumbing or processor architecture, you've got a code rewrite ahead of you, no matter what API you use. Mac OS 8/9 applications won't survive the transition unless they're ported to the Carbon API. Furthermore, existing Carbon apps that use the PowerPC-based Preferred Executable Format (PEF) will have to be rebuilt with Xcode 2.1 for conversion to the Mach-O executable format. The reason is that Mac OS X uses the dyld runtime, which is the native execution environment for both PowerPC and Intel Mac platforms. The dyld runtime uses the Mach-O format for its executable files, and as we've already learned, universal binaries rely on the Mach-O format to package PowerPC and x86 binary images. Applications that use common system services should port easily. However, caveats abound. For example, how the two processors store data in memory can cause all sorts of problems even for simple applications.

Architecture's Impact

High-level application frameworks hide the gritty hardware details from developers to improve code portability and stability. When the platform's processor changes, fundamental differences in hardware behavior can ripple up through these frameworks and hurt you. Let's take a look at two of these differences and see how they affect porting a PowerPC application to the x86 platform. One such problem is known as the "Endian issue" and occurs because of how the PowerPC and Intel processors arrange the byte order of data in memory. The PowerPC processor is big-endian in that it stores a data value's MSB in the lowest byte address, while the Intel is little-endian because it places the value's LSB in the lowest byte address. Normally, the Endian issue doesn't rear its ugly head unless you access multibyte variables through overlapping unions, use a constant as an offset into a data structure, or use bitfields larger than a byte. In these situations, the position of bytes within the variable matters. Although the program's code executes flawlessly, the retrieved data is garbled due to where the processor placed the bytes in memory, and spurious results occur. To fix this problem, reference data in structures using field names and not offsets, and be prepared to swap bytes if necessary. The Endian issue manifests itself another way when an application accesses
data piecewise as bytes and then reassembles it into larger data variables or structures. This occurs when an application reads data from a file or through a network. The bad news is that any application that performs disk I/O on binary data (such as a 16-bit audio stream), or uses network communications (such as e-mail or a web browser), can be plagued by this problem. The good news is that each Mac OS X API provides a slew of methods that can perform the byte swapping for you. Consult the Universal Binary Programming Guidelines (http://developer.apple.com/documentation/MacOSX/Conceptual/universal_binary/universal_binary.pdf) from Apple for details. Another potential trap manifested by the Endian issue is if your Mac application uses custom resources. Mac OS X understands the structure of its own resources and will automatically perform any byte-swapping if required. However, for a custom resource whose contents are unknown to the OS, you will have to write a byte-swapping routine for it. Those applications that use CodeWarrior's PowerPlant framework require a byte-swapping routine to swap the custom PPob resources that this framework uses. Appendix E in the Universal Binary Programming Guidelines document has some example code that shows how to swap PPob resources, and this code serves as a guideline on how to write other byte-swapping routines.
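To make the hazard concrete, here is a small hypothetical illustration (not from the article or from Apple's guidelines): the same four bytes read from a big-endian file produce different 32-bit values on the two processors unless the code reassembles them explicitly.

/* Hypothetical example of the Endian issue when reassembling bytes read
   from a big-endian file or network stream into a 32-bit value. */
#include <stdio.h>
#include <string.h>
#include <stdint.h>

/* Portable: build the value byte by byte, independent of host byte order. */
static uint32_t read_u32_big_endian(const uint8_t b[4])
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

int main(void)
{
    const uint8_t bytes[4] = { 0x12, 0x34, 0x56, 0x78 };  /* as stored in the file */

    uint32_t naive;
    memcpy(&naive, bytes, sizeof naive);   /* 0x12345678 on PowerPC, 0x78563412 on x86 */

    uint32_t portable = read_u32_big_endian(bytes);  /* 0x12345678 on both */

    printf("naive: 0x%08X  portable: 0x%08X\n", (unsigned)naive, (unsigned)portable);
    return 0;
}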
It's In the Vector

Another major processor architectural issue is for those applications that make heavy use of the PowerPC's AltiVec instructions for scientific computing and video editing — some of the Mac's bread-and-butter applications. (AltiVec is a floating-point and integer SIMD instruction set referred to as "AltiVec" by Motorola, "VMX" by IBM, and "Velocity Engine" by Apple; see http://developer.apple.com/hardware/ve/.) AltiVec consists of more than 160 special-purpose instructions that operate on vectors held in 32 128-bit data registers. A 128-bit vector may be composed of multiple elements, such as four 32-bit integers, eight 16-bit integers, or four 32-bit floats. The AltiVec instructions can perform a variety of Single Instruction Multiple Data (SIMD) arithmetic operations (multiply, multiply-add, and others) on these elements in parallel, yielding high-throughput data processing.
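As a rough illustration of why AltiVec code cannot simply be recompiled for x86, here is a hypothetical sketch (not from the article) of the same four-float addition written with each vendor's C intrinsics; the compiler guard macros and 16-byte alignment requirement are assumptions about a typical GCC build.

/* Hypothetical sketch: one 4-float vector add, AltiVec vs. SSE intrinsics.
   Both versions assume 16-byte-aligned input and output arrays. */

#ifdef __ALTIVEC__
#include <altivec.h>

void vadd4_altivec(const float *x, const float *y, float *out)
{
    vector float vx = vec_ld(0, x);          /* load four floats */
    vector float vy = vec_ld(0, y);
    vec_st(vec_add(vx, vy), 0, out);         /* add and store */
}
#endif

#ifdef __SSE__
#include <xmmintrin.h>

void vadd4_sse(const float *x, const float *y, float *out)
{
    __m128 vx = _mm_load_ps(x);              /* load four floats */
    __m128 vy = _mm_load_ps(y);
    _mm_store_ps(out, _mm_add_ps(vx, vy));   /* add and store */
}
#endif

Even in this trivial case, the headers, vector types, and intrinsic names all differ, which is why the Accelerate Framework, which hides the instruction set entirely, is the path of least resistance.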
A Portable Operating System

Some developers weren't surprised that Mac OS X was operating on the x86 platform for two reasons. First, Mac OS X began life as NeXTSTEP in 1989, complete with an extensive suite of object-oriented classes. These classes ultimately became Cocoa and the other intermediate-level frameworks in Mac OS X. In 1993, NeXTSTEP 3.1 was ported to several platforms (Intel, Sparc, HP), where the code achieved a large measure of hardware independence. It has also been extensively field tested, so its classes are known to be stable. Second, as Figure 2 shows, Mac OS X is a layered OS. The lowest layer, Darwin, consists of an open-source version of the Mach 3.0 kernel, device drivers, and a bevy of Berkeley System Distribution (BSD) interfaces, libraries, and tools. An Intel-based version of Darwin has been maintained over the years, so the lowest layer of Mac OS X that sits next to the hardware was Intel-ready should the need arise. The higher layers of the OS consist of frameworks written in Objective-C and other high-level languages, so porting them was just a matter of recompiling the frameworks and tuning the code.

—T.T.

Figure 2: Mac OS X is a layered OS.
Carbon
  Description: Interfaces used for Mac OS 8/9 applications.
  Preferred programming language: C, with C++ supported.
  Pros: Familiar APIs; allows reuse of Mac application code or program code written on other procedural systems.
  Cons: Not every API call is available; older Carbon apps must be converted to Mach-O format.

Cocoa
  Description: OO interfaces/classes for Mac OS X; traces its roots to the NeXTSTEP framework.
  Preferred programming language: Objective-C and Objective-C++, although C and C++ can be integrated into the code.
  Pros: The NeXTSTEP framework, field-tested for over a decade, offers a rich set of services.
  Cons: You're starting from scratch adapting code to this API.

Java
  Description: Uses J2SE 1.4.2.
  Preferred programming language: Java.
  Pros: Can reuse Java programs written on other platforms with little modification.
  Cons: Won't have full access to all of the OS features as do the other APIs; performance may be an issue.

BSD UNIX
  Description: Based on the System V IA-32 Application Binary Interface (ABI).
  Preferred programming language: C.
  Pros: Good for writing CLI tools or low-level services such as daemons or root processes.
  Cons: The Mac version of the IA-32 ABI is still in preliminary form, and there are some minor differences in it from the Intel version.
Table 1: Summary of Mac OS X APIs.

Applications relying on AltiVec instructions must be rewritten to use Intel's SIMD instructions, either its Multimedia Extensions (MMX) instructions or its Streaming SIMD Extensions (SSE) instructions. There are several flavors of SSE instructions (SSE, SSE2, and SSE3), and they work on eight 128-bit data registers. The ideal solution for this problem is to use Cocoa's Accelerate Framework. It implements vector operations independently of the underlying hardware. An application that uses the Accelerate Framework can operate without modification on both Mac platforms. This framework provides a ready-made set of optimized algorithms for image processing, along with DSP algorithms for video processing, medical imaging, and other data-manipulation functions. If you must port your AltiVec code to the x86 SSE instructions, on the plus side, Intel provides a high-level C interface that simplifies the use of these instructions. Another major plus is that you've already "vectorized" your high-level code for use with AltiVec, and these modifications apply to using SSE instructions as well. That is, you should have unrolled code loops and modified the data structures they reference to take advantage of the SIMD instructions' parallel processing capabilities. The big negative to porting to SSE is that the rest of your code will need to be heavily revised due to the differences between the AltiVec and SSE instructions. For example, there's no direct correspondence
in behavior between the AltiVec and x86 permute instructions. The magnitude of the shift performed by the AltiVec permute operation can be changed at runtime, while the x86 permute requires that the magnitude be set at compile time. This makes it difficult for the x86 permute to clean up misaligned data accesses, especially for use with the SSE instructions themselves. In general, AltiVec instructions that execute on the vector complex integer unit (such as the multiply-accumulate operations) have no direct counterparts in the SSE instruction set, and these portions of the vector code will need the most work.

Returning to the triage list, even applications written in Cocoa and Carbon aren't immune to certain processor issues. Applications that do any file or network I/O will have to be examined and modified, due to the Endian issue. Even mundane applications that make use of special data structures will need to be checked carefully. Those applications that make use of AltiVec will have to be completely rewritten, either to the Accelerate Framework or to SSE3 instructions. Whether they survive the transition depends on how much it will cost to correct these issues.

A Real-World Example

How well will this transition go? Some early developer reports pegged the initial porting process at taking anywhere from 15 minutes to 24 hours, depending upon how well Apple's guidelines were followed
when the application was written. Those developers whose applications were written in Cocoa usually experienced the least difficulty, which shouldn't come as a surprise because Cocoa was engineered from the ground up as an application framework for NeXTSTEP, which became Mac OS X. Bare Bones Software's port of BBEdit, its industrial-strength programming editor, offers an interesting glimpse of the process (http://www.barebones.com/products/bbedit/). Portions of BBEdit were written in C++ and use the Carbon API, while other portions were written in Objective-C and use the Cocoa API. It only took 24 hours to get BBEdit running on the Mac x86 platform. It helped that the files BBEdit works with — ASCII text files that consist of byte data — were Endian neutral. However, BBEdit's developers emphasize that although they got the program running quickly, getting every feature to work reliably took another several weeks of work, especially for testing to ensure that the features worked reliably in the new environment. Still, considering that the application was executing on a completely different platform in a short amount of time and without requiring a total code rewrite, this augurs well for many Mac OS X applications making the transition. In the end, time and developers will show us how well Apple managed the transition to the Intel x86 processor.

DDJ

Listing One

#define FAT_MAGIC 0xcafebabe
#define FAT_CIGAM 0xbebafeca  /* NXSwapLong(FAT_MAGIC) */

struct fat_header {
    uint32_t magic;      /* FAT_MAGIC */
    uint32_t nfat_arch;  /* number of structs that follow */
};

struct fat_arch {
    cpu_type_t    cputype;     /* cpu specifier (int) */
    cpu_subtype_t cpusubtype;  /* machine specifier (int) */
    uint32_t      offset;      /* file offset to this object file */
    uint32_t      size;        /* size of this object file */
    uint32_t      align;       /* alignment as a power of 2 */
};
WINDOWS/.NET DEVELOPER
Calling C Library DLLs from C#

Utilizing legacy software

SHAH DATARDINA
The .NET Framework was designed to be the "lingua franca" for Windows development, with the expectation that it will set a new standard for building integrated software for Windows. However, it is inevitable that there is a time lag before .NET is fully adopted and existing applications are recoded. In particular, there is a large body of legacy code that will likely never be rewritten in .NET. To address this situation, Microsoft provides attributes, assembly, and marshaling. At the Numerical Algorithms Group (where I work), our particular interest in using these techniques is to utilize numerical software developed in C from within the .NET environment. Because C# is the premier .NET language, the examples I present here are in C#. While I use an example of data types that are current in the NAG C Library, the techniques I present are general enough for calling unmanaged code written in C from C# directly.

Shah is a senior technical consultant for the Numerical Algorithms Group. He can be contacted at [email protected].

The NAG C Library uses the following data types as parameters:

• Scalars of type double, int, and Complex. These are passed either by value
or by reference (as pointers to the particular type).
• enum types.
• Arrays of type double, int, and Complex.
• A large number of structures, generally passed as pointers.
• A few instances of arrays that are allocated within NAG routines and have to be freed by users (these have type double**).
• Function parameters (also known as "callbacks"). These are pointers to functions with particular signatures.

For convenience, I include a C DLL containing definitions of all the functions being called from C#. This DLL is available electronically from DDJ (see "Resource Center," page 4) and NAG (http://www.nag.com/public/ddj.asp). For instance, take the example of a C function that takes two parameters of the type double and double*; that is, the first parameter is input (pass by value) and the second is output (pass by reference in non-C parlance). The corresponding C# signature for the C function is then double, ref double. Listing One presents the definition of the C function and its call from C#. In C#, you have to provide the DLL import attribute (line 5), specifying how the C signature maps to C#. Also, the qualifier ref has to be used twice, in the declaration of the C function and in its call. Finally, note the use of the assembly directive, System.Runtime.InteropServices (line 3), which is important because it is the classes defined within the InteropServices that provide the mapping between managed code and unmanaged code.

Arrays are the bedrock of numerical programming. By definition, arrays are passed by reference in both C and C#. A
C function having a one-dimensional array as a parameter with the prototype: void OneDArray(double AnArray[]);
has the C# signature given by: public static extern void OneDArray(double [] AnArray);
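To make the mapping concrete, here is a minimal sketch of the import declaration and a call. It follows the prototype used in Listing Two, which (unlike the simplified prototype above) also passes the array length explicitly; the DLL name cmarshaldll mirrors the listings, and the surrounding class and output statements are mine, added only for illustration:

    using System;
    using System.Runtime.InteropServices;

    class OneDArrayDemo
    {
        // C prototype (Listing Two): void OneDArray(int n, double anArray[]);
        // A managed double[] of a blittable type is passed as a pointer to its
        // first element, so the C code can both read and write the elements.
        [DllImport("cmarshaldll")]
        public static extern void OneDArray(int n, double[] anArray);

        static void Main()
        {
            double[] anArray = new double[2];
            OneDArray(anArray.Length, anArray);
            for (int i = 0; i < anArray.Length; i++)
                Console.WriteLine("anArray[{0}] = {1}", i, anArray[i]);
        }
    }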
With this proviso, the call is straightforward; see Listing Two. Two-dimensional arrays are more interesting. In C, dynamically sized two-dimensional arrays are of the type pointer to pointer. For example, a two-dimensional array of doubles would have the prototype double **. However, these are rarely used in practice because they imply noncontiguous storage, whereas numerics are best carried out using contiguous storage. Numerical C code using "notional" two-dimensional arrays frequently stores data in row-major order. For example, to read a two-dimensional array of size m*n (m rows and n columns) into contiguous storage, you might have the following:
for (i=0; i<m; i++)
    for (j=0; j<n; j++)
        scanf("%lf", &a[i*tda + j]);   /* read element (i,j) of the m*n array */
where tda is the second dimension of a; in this case, tda=n. In Fortran, the equivalent construction would be:
      DO 10 I=1,M
      DO 10 J=1,N
      READ(5,*) A(I,J)
   10 CONTINUE
Here the A(I,J) construction is equivalent to (using C notation): a[(J-1)*lda + I-1]
where lda refers to the first dimension of a. This implies column major storage starting with indices starting at 1 rather than 0. The point to note is that the A(I,J) notation in Fortran squirrels away the complexity of array element access. If we had such a notation in C, it might have been A[i,j]. This is precisely what you have in C#. Hence, the notional two-dimensional array in C, double fred[ ], is represented in C# as double [,] fred. Listing Three shows an example of the use of a C function using a two-dimensional array from C#. It is worth noting how well the “notional” two-dimensional C array dovetails with the C# double type. The C# array is a proper class, with members available to provide us with information on the dimensions of the array; hence, the C# member function callTwoDarray needs to have just the one parameter. Struct is a major type in C, but in C# (being a value type), it is but a poor cousin to the class type. However, it can be mapped to a struct type in C#. The simplest and the most ubiquitous structure in numerics is the complex type, being defined as a pair of reals. In Listing Four, the struct type is defined in C and its equivalent in C#. In particular, you have to tell the C# compiler that the structure members are laid out sequentially in memory by the use of the attribute, [StructLayout(LayoutKind.Sequential)]. Given this information, you can treat the complex type as a regular object, passing it by value, reference, or as an array. Listing Four shows how you can access information from a C function of the type Complex, which has three parameters, inputVar passed by value, OutputVar passed by reference, and an array of the Complex type. There is one further point to note here: As structures are of the type value in C#, you have to tell the compiler whether arrays are read or write. You do this by providing the attribute '[In, Out]' to the array parameter. Structures can get very complicated indeed. Structure members can be scalars, arrays, pointers to other structures, and http://www.ddj.com
the like. Pointers, being taboo (or at least highly undesirable) in C#, can be represented as the IntPtr type. Listing Five presents a C and C# example showing the use of a Marshal class method to print the elements of an array that has been allocated internally within a C function. In this case, the memory has to be freed explicitly by the unmanaged code. Function parameters, also known as "callbacks," play a central role in numerical software. They are required whenever code that carries out some problem-specific task has to be supplied to a library function. This occurs, for example, in optimization software, where the value of the objective function or its first derivatives has to be computed on a problem-specific basis. The difficulty with callbacks is mainly that
they imply a reversal of the situation I have been looking at so far. Up to now, managed code (in C#) has been calling unmanaged code; with a callback, the unmanaged C code calls back into managed code. C# provides the delegate type to cater for this situation. You declare a delegate type with the appropriate callback signature and create an instance of it using a construction such as:
NAG_E04CCC_FUN objfun = new NAG_E04CCC_FUN(funct);
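As a quick sketch of the whole pattern, here is a condensed version of Listing Six (the NAG_D01_FUN delegate, the CallBack export, and the cmarshaldll name all come from that listing; the wrapping class and Console output are mine): the delegate type describes the callback signature, a managed method is wrapped in a delegate instance, and that instance is passed to the imported C function.

    using System;
    using System.Runtime.InteropServices;

    class CallbackSketch
    {
        // Matches the C typedef: void (NAG_CALL *NAG_D01_FUN)(double *)
        public delegate void NAG_D01_FUN(ref double output);

        [DllImport("cmarshaldll")]
        public static extern void CallBack(NAG_D01_FUN f, ref double output);

        // The managed method that the unmanaged code calls back into.
        public static void f(ref double output) { output = 100.0; }

        static void Main()
        {
            double result = 0.0;
            NAG_D01_FUN F = new NAG_D01_FUN(f);
            CallBack(F, ref result);    // C invokes f through the delegate
            Console.WriteLine(result);  // prints 100
        }
    }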
Listing Six presents an example of a simple callback. This is a simple mechanism when the callback has simple types, but it gets more interesting when we have parameters of the callback that are arrays and structures and that may have to carry
information back to the unmanaged code. In this case, you have to use both marshaling techniques and attributes on the structure. This is illustrated in Listing Seven, where I show how to handle arrays and structures within callbacks. The delegate in this case has an array parameter. If you use the following signature (which appears to be quite a reasonable signature at first sight), you find that when the delegate is called from C, the array appears to be of length 1. This presumably has to do with the fact that C pointers do not carry any information about the length of the array.
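The signature referred to here is not reproduced in this copy of the article. Based on the surrounding discussion and on Listing Seven (which defines Nag_Comm and the working delegate), the "reasonable but wrong" version would presumably declare the array parameter as a managed array; treat that first declaration as a hypothetical reconstruction, while the second is the one actually used in Listing Seven:

    // Naive signature (hypothetical reconstruction). When C invokes the delegate,
    // the marshaled array appears to have length 1, because the underlying
    // double* carries no length information:
    //   public delegate void NAG_E04UCC_FUN(int array_length, double[] a, ref Nag_Comm comm);

    // Working signature, as declared in Listing Seven: take the raw pointer as
    // IntPtr and copy the elements explicitly with Marshal.Copy inside the callback.
    public delegate void NAG_E04UCC_FUN(int array_length, IntPtr a, ref Nag_Comm comm);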
The trick is to specify the array parameter as the C# type IntPtr and subsequently copy data to and from the IntPtr parameter. There are two more data types that occur in C worth mentioning. Enum types are integral types that map one to one between C and C#. Listing Eight illustrates how enum values may be passed from C# to C. The final type to consider is the C# string type, which maps to the char* type in C. When a string is defined in C# and passed to C, the interop services provide the conversion to ASCII by default. I use the StringBuilder type
because this is a reference type and can grow and shrink as required. Listing Eight illustrates a C function modifying a string. For users of the NAG C Library, we provide an assembly of structures, functions, and delegate signatures within a Nag namespace (see "Wrapping C with C++ in .NET," by George Levy, C/C++ Users Journal, January 2004). We also provide examples of using this assembly from C# for some widely used NAG routines. DDJ
Listing One
Listing Three
/*************************************************************** * C Function Scalars * ***************************************************************/ #define NAG_CALL __stdcall #define NAG_DLL_EXPIMP __declspec(dllexport)
/*************************************************************** * C Function TwoDArray * ***************************************************************/
NAG_DLL_EXPIMP void NAG_CALL Scalars(double, double*); NAG_DLL_EXPIMP void NAG_CALL Scalars(double in, double *out) { *out = in; } /*************************************************************** * C# Class * ***************************************************************/ using System; using System.Runtime.InteropServices; namespace DDJexamples { public class ExerciseScalars { [DllImport("cmarshaldll")] public static extern void Scalars(double invar, ref double outvar); public static void CallScalars(double invar, ref double outvar) { Scalars(invar, ref outvar); } public static void Main() { double invar = 5.0; double outvar = 0.0; CallScalars(invar, ref outvar); Console.WriteLine("invar = {0}, outvar = {1}", invar, outvar); } } }
Listing Two /*************************************************************** * C Function OneDArray * ***************************************************************/ #define NAG_CALL __stdcall #define NAG_DLL_EXPIMP __declspec(dllexport) NAG_DLL_EXPIMP void NAG_CALL OneDArray(int n, double []); NAG_DLL_EXPIMP void NAG_CALL OneDArray(int n, double anArray[]) { int i; for (i=0; i
#define NAG_CALL __stdcall #define NAG_DLL_EXPIMP __declspec(dllexport) NAG_DLL_EXPIMP void NAG_CALL TwoDArray(int m, int n, double [], int tda); #define A(I,J) a2dArray[I*tda+J] NAG_DLL_EXPIMP void NAG_CALL TwoDArray(int m,int n,double a2dArray[],int tda) { int i, j, k = 0; tda = n; for (i=0; i<m; i++) for (j=0; j
Listing Four /*************************************************************** * C Function TryComplex * ***************************************************************/ #define NAG_CALL __stdcall #define NAG_DLL_EXPIMP __declspec(dllexport) typedef struct { /* NAG Complex Data Type */ double re,im; } Complex; NAG_DLL_EXPIMP void NAG_CALL TryComplex(Complex inputVar, Complex *outputVar, int n, Complex array[]); NAG_DLL_EXPIMP void NAG_CALL TryComplex(Complex inputVar, Complex *outputVar, int n, Complex array[]) { outputVar->re = ++inputVar.re; outputVar->im = ++inputVar.im;
public static void Main() { int n=2; double [] anArray = new double [n]; CallOneDArray(anArray); for (int i=0; i
array[0].re = 99.0;
array[0].im = 98.0;
array[1].re = 97.0;
array[1].im = 96.0;
} /*************************************************************** * C# Class * ***************************************************************/
using System; using System.Runtime.InteropServices; namespace DDJexamples { // Nag Complex type [StructLayout(LayoutKind.Sequential)] public struct Complex { public double re; public double im; }; public class ExerciseTryComplex { [DllImport("cmarshaldll")] public static extern void TryComplex(Complex inputVar, ref Complex outputVar, int n, [In, Out] Complex [] array); public static void CallTryComplex(Complex inputVar, ref Complex outputVar, Complex [] array) { int n = 2; TryComplex(inputVar, ref outputVar, n, array); } public static void Main() { int n=2; Complex inputVar = new Complex(); Complex outputVar = new Complex(); Complex [] array = new Complex[n]; inputVar.re = 1.0; inputVar.im = 2.0; CallTryComplex(inputVar, ref outputVar, array); Console.WriteLine("outputVar = ({0},{1})", outputVar.re, outputVar.im); Console.WriteLine("Array on output"); for (int i = 0; i<array.GetLength(0); i++) Console.WriteLine("{0} {1}", array[i].re, array[i].im); } } }
Listing Five /*************************************************************** * C Function MarshalStructC * ***************************************************************/ #define NAG_CALL __stdcall #define NAG_DLL_EXPIMP __declspec(dllexport) #include <stdlib.h> typedef struct { int array_length; double *array;
} marshalStruct; NAG_DLL_EXPIMP void NAG_CALL MarshalStructC(marshalStruct *pointerinStruct); NAG_DLL_EXPIMP void NAG_CALL FreeMarshalStructptr(marshalStruct *pointerinStruct); /**/ NAG_DLL_EXPIMP void NAG_CALL MarshalStructC(marshalStruct *pointerinStruct) { int i; pointerinStruct->array = (double *)malloc((size_t)(sizeof(double)*pointerinStruct->array_length)); for (i = 0; i <pointerinStruct->array_length; i++) { pointerinStruct->array[i] = (double)(i) + 1.0; } } NAG_DLL_EXPIMP void NAG_CALL FreeMarshalStructptr(marshalStruct *pointerinStruct) { free(pointerinStruct->array); pointerinStruct->array = 0; } /*************************************************************** * C# Class * ***************************************************************/ using System; using System.Runtime.InteropServices; namespace DDJexamples { [StructLayout(LayoutKind.Sequential)] public struct marshalStruct { public int array_length; public IntPtr array; }; public class ExerciseMarshalStructC { [DllImport("cmarshaldll")] public static extern void MarshalStructC( ref marshalStruct pointerinStruct); [DllImport("cmarshaldll")] public static extern void FreeMarshalStructptr( ref marshalStruct pointerinStruct); public static void CallMarshalStructC(ref marshalStruct pointerinStruct) { MarshalStructC(ref pointerinStruct); } public static void Main() { marshalStruct pointerinStruct = new marshalStruct() ; pointerinStruct.array_length = 5; CallMarshalStructC(ref pointerinStruct);
double [] x = new double[pointerinStruct.array_length]; Marshal.Copy( pointerinStruct.array, x, 0, pointerinStruct.array_length ); for (int i = 0; i < pointerinStruct.array_length; i++) Console.WriteLine("x[{0}] = {1}", i, x[i]); FreeMarshalStructptr(ref pointerinStruct); } } }
Listing Six /*************************************************************** * C Function CallBack * ***************************************************************/ #define NAG_CALL __stdcall #define NAG_DLL_EXPIMP __declspec(dllexport) typedef void (NAG_CALL * NAG_D01_FUN)(double *); NAG_DLL_EXPIMP void NAG_CALL CallBack(NAG_D01_FUN f , double *output); NAG_DLL_EXPIMP void NAG_CALL f(double *x); /* */ NAG_DLL_EXPIMP void NAG_CALL CallBack(NAG_D01_FUN f , double *output) { (*f)(output); } NAG_DLL_EXPIMP void NAG_CALL f(double *x) { *x = 100.0; } /*************************************************************** * C# Class * ***************************************************************/ using System; using System.Runtime.InteropServices; namespace DDJexamples { // delegate public delegate void NAG_D01_FUN (ref double output); public class ExerciseSimpleCallback { [DllImport("cmarshaldll")] public static extern void CallBack(NAG_D01_FUN f , ref double output); public static void CallCallBack(NAG_D01_FUN f, ref double output) { CallBack(f, ref output); } public static void Main() { double output = 0.0; NAG_D01_FUN F = new NAG_D01_FUN (f); CallCallBack(F, ref output); Console.WriteLine("Ouput = {0}", output); } public static void f(ref double output) { output = 100.0; } } }
Listing Seven /*************************************************************** * C Function CallbackWithStruct * ***************************************************************/ #define NAG_CALL __stdcall #define NAG_DLL_EXPIMP __declspec(dllexport) typedef struct { int flag; } Nag_Comm; typedef void (NAG_CALL * NAG_E04UCC_FUN)(int, double *, Nag_Comm *); extern NAG_DLL_EXPIMP void NAG_CALL CallbackWithStruct(NAG_E04UCC_FUN funct, int array_length, double *a, Nag_Comm *user_comm); void NAG_CALL funct(int n, double *x, Nag_Comm *user_comm); /* */ NAG_DLL_EXPIMP void NAG_CALL CallbackWithStruct(NAG_E04UCC_FUN funct , int n, double *a, Nag_Comm *user_comm) { (*funct)(n, a, user_comm); if (user_comm->flag == 99) { a[0] = 99.0; } } void NAG_CALL funct(int n, double *x, Nag_Comm *user_comm) { int i; for (i=0; iflag = 99; } } /*************************************************************** * C# Class * ***************************************************************/ using System;
using System.Runtime.InteropServices; namespace DDJexamples { [StructLayout(LayoutKind.Sequential)] public struct Nag_Comm { public int flag; }; // delegate public delegate void NAG_E04UCC_FUN (int array_length, IntPtr a, ref Nag_Comm comm); public class ExerciseCallBackWithStruct { [DllImport("cmarshaldll")] public static extern void CallbackWithStruct(NAG_E04UCC_FUN f , int array_length, double [] a, ref Nag_Comm user_commt); public static void CallCallbackWithStruct(NAG_E04UCC_FUN f, int array_length, double [] a, ref Nag_Comm user_comm) { CallbackWithStruct(f, array_length, a, ref user_comm); } public static void Main() { double [] a = {1.0, 2.0, 3.0, 4.0, 5.0}; int array_length = a.GetLength(0); Nag_Comm user_comm = new Nag_Comm(); NAG_E04UCC_FUN F = new NAG_E04UCC_FUN (funct); CallCallbackWithStruct(F, array_length, a, ref user_comm); Console.WriteLine("user_comm.flag = {0}", user_comm.flag); Console.WriteLine("a[0] altered further as a result of user_comm.flag, a[0] = {0}", a[0]); } public static void funct(int n, IntPtr xptr, ref Nag_Comm user_comm) { double [] x = new double[n]; Marshal.Copy( xptr, x, 0, n ); int i; for (i=0; i
Listing Eight /*************************************************************** * C Function EnumString * ***************************************************************/ #include <stdlib.h> #include <string.h> #include <stdio.h> #define NAG_CALL __stdcall #define NAG_DLL_EXPIMP __declspec(dllexport) typedef enum { red=101, green, blue,black } colour; NAG_DLL_EXPIMP void NAG_CALL EnumString(colour rainbow, char *rainbowcolour); NAG_DLL_EXPIMP void NAG_CALL EnumString(colour rainbow, char *rainbowcolour) { if (rainbow == black ) { strcpy(rainbowcolour, "Black is not a rainbow colour"); } else { strcpy(rainbowcolour, "This is a rainbow colour"); } } /*************************************************************** * C# Class * ***************************************************************/ using System; using System.Runtime.InteropServices; using System.Text; namespace DDJexamples { public enum colour { red=101, green, blue,black }; public class ExerciseEnumString { [DllImport("cmarshaldll")] public static extern void EnumString(colour rainbow, StringBuilder rainbowcolour); public static void CallEnumString(colour rainbow, StringBuilder rainbowcolour) { EnumString(rainbow, rainbowcolour); } public static void Main() { StringBuilder colourstring= new StringBuilder("once upon a time ... "); colour somecolour = colour.black; CallEnumString(somecolour, colourstring); Console.WriteLine("{0}", colourstring); } } }
Removing Memory Errors from 64-Bit Platforms
Memory problems can multiply on new platforms
RICH NEWMAN
It is crucial that 64-bit platforms be stable and reliable as development teams create applications for them. Any memory errors or memory corruptions can cause them to fail. The great challenge with memory errors is that they are elusive problems that are extremely difficult and time consuming to find. Memory errors do not expose themselves during typical means of testing. Because of their detrimental crashing potential, it is imperative that you remove memory problems from code before it goes into production.
There are powerful memory-error-detection tools available that identify the cause of threaded memory errors in dual-core applications. Such detection tools let you find and fix elusive, crash-causing memory errors that traditional testing techniques fail to uncover. Memory-error-detection tools help you find and fix C/C++ memory errors prior to release. By fixing these problems before porting, you can improve the quality of applications on new platforms and architectures, streamline the porting process, and make the original application more robust and reliable.
Rich is the lead engineer for Insure++ at Parasoft. He can be contacted at [email protected].
Why Is Porting So Difficult?
Most developers responsible for porting C/C++ code to 64-bit processors — or to any new hardware, for that matter — find that memory problems seem to multiply when they reach the new platform or architecture. The fundamental problem with the transition to 64-bit architectures is that assumptions about the size in bits of the various integral and pointer types are no longer true. The potentially problematic coding constructs are implicit narrowing conversions from long to int by assignment, and explicit casts. The former most likely cause compilers to issue warnings; the latter, on the other hand, are accepted silently, leading to all sorts of problems that will not surface until runtime. Another issue is that integer constants that are not explicitly sized are assumed to be ints. This is of some concern when mixing signed and unsigned constants; proper use of the relevant suffixes should alleviate such problems. Other major sources of trouble are the various kinds of pointer incompatibilities. For example, on most 64-bit architectures, a pointer no longer fits inside an int, and code that stores pointer values inside int variables no longer functions properly.
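As a concrete illustration of that pointer-size problem, here is a minimal sketch (mine, not from the article) of code that behaves on a 32-bit ILP32 platform but silently truncates pointers on a typical 64-bit LP64 platform, where int stays 32 bits wide while pointers grow to 64 bits:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        double value = 3.14;
        double *p = &value;

        /* Common 32-bit idiom: stash a pointer in an int. On LP64 the upper
           32 bits of the address are silently discarded by the cast. */
        int cookie = (int)(intptr_t)p;              /* truncating conversion */
        double *back = (double *)(intptr_t)cookie;  /* may no longer equal p */

        printf("original %p, recovered %p\n", (void *)p, (void *)back);

        /* Portable alternative: use an integer type wide enough for a pointer. */
        intptr_t safe = (intptr_t)p;
        double *ok = (double *)safe;
        printf("safe round trip %p\n", (void *)ok);
        return 0;
    }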
These and other problems are typically exposed during porting because porting is essentially a type of mutation testing. When you port code, you essentially create what can be called an "equivalent mutant" — a version of the original code with minor changes that should not affect the outcomes of your test cases. You can expose many strange errors by creating and running these equivalent mutants. In C++, this process of creating and running equivalent mutants can uncover:
• Lack of copy constructors or bad copy constructors.
• Missing or incorrect constructors.
• Wrong order of initialization of code.
• Problems with operations of pointers.
• Dependence on undefined behavior, such as order of evaluation.
Porting Preparation
There are several steps involved in preparing your applications for porting.
Step 1. Before you start porting, you should rid the original code of problems such as memory corruption and memory leaks that will plague you on 64-bit processors. One of the most effective ways to expose the kinds of pointer and integer problems that cause trouble on 64-bit processors is to leverage mutation testing for runtime error detection.
Mutation testing was introduced as an attempt to solve the problem of not being able to measure the accuracy of test suites. In mutation testing, you are in some sense trying to solve this problem by inverting the scenario. The thinking goes like this: Assume that you have a perfect test suite, one that covers all possible cases. Also assume that you have a perfect program that passes this test suite. If you change the
code of the program (this process is called “mutating”) and run the mutated program (“mutant”) against the test suite, you will have two possible scenarios: 1. The results of the program were affected by the code change and the test suite detects it. It was assumed that the test suite is perfect, which means that it must detect the change. If this happens, the mutant is called a “killed mutant.” 2. The results of the program are not changed and the test suite does not detect the mutation. The mutant is called an “equivalent mutant.” If you take the ratio of killed mutants to all the mutants that were created, you get a number that is smaller than 1. This number measures how sensitive the program is to the code changes. In reality, neither the perfect program nor the perfect test suite exists, which means that one more scenario can exist. The results of the program are different, but the test suite does not detect it because it does not have the right test case. If you take the ratio of all the killed mutants to all the mutants generated, you get a number smaller than 1 that also contains information about accuracy of the test suite. In practice, there is no way to separate the effect that is related to test-suite in-
accuracy and the effect that is related to equivalent mutants. In the absence of other possibilities, you can accept the ratio of killed mutants to all the mutants as the measure of the test suite accuracy. Example 1 (test1.c) illustrates the ideas just described (this code, and all subsequent code presented here, compiles and runs under Linux). The test1.c program can be compiled with the command: cc -o test1 test1.c.
The program reads its arguments and prints messages accordingly. Now assume that you have this test suite that tests the program:
Test Case 1: input 2 4, output: Got less than 3
Test Case 2: input 4 4, output: Got more than 3
Test Case 3: input 4 6, output: Got more than 3
Test Case 4: input 2 6, output: Got less than 3
Test Case 5: input 4, output: Got more than 3
This test suite is representative of test suites in the industry: It contains only positive tests, which means it checks that the program reports correct values for correct inputs. It completely neglects illegal inputs to the program. The test1 program fully passes the test suite; however, it has serious hidden errors.

main(argc, argv)                        /* line 1 */
int argc;                               /* line 2 */
char *argv[];                           /* line 3 */
{                                       /* line 4 */
    int c=0;                            /* line 5 */
                                        /* line 6 */
    if(atoi(argv[1]) < 3){              /* line 7 */
        printf("Got less than 3\n");    /* line 8 */
        if(atoi(argv[2]) > 5)           /* line 9 */
            c = 2;                      /* line 10 */
    }                                   /* line 11 */
    else                                /* line 12 */
        printf("Got more than 3\n");    /* line 13 */
    exit(0);                            /* line 14 */
}                                       /* line 15 */

Example 1: The test1.c program.

Now, mutate the program. You can start with these simple changes:
Mutant 1: Change line 9 to the form if(atoi(argv[2]) >= 5).
Mutant 2: Change line 7 to the form if(atoi(argv[1]) >= 3){.
Mutant 3: Change line 5 so that c is initialized to a different value.
If you run this modified program against the test suite, for Mutants 1 and 3 the program completely passes the test suite. For Mutant 2, the program fails all test cases. Mutants 1 and 3 do not change the output of the program, and are thus equivalent mutants. The test suite does not detect them. Mutant 2, however, is not an equivalent mutant. Test Cases 1–4 will detect it through wrong output from the program. Test Case 5 may have different behavior on different machines. It may show up as bad output from the program, but at the same time, it may be visible as a program crash.
Switching gears for a moment, if you calculate the statistics, you see that you created three mutants and only one was killed. This tells you that the number that measures the quality of the test suite is 1/3. As you can see, the number 1/3 is low. It is low because you generated two equivalent mutants. This number should serve as a warning that you are not testing enough. In fact, the program has two serious errors that should be detected by the test suite.
Returning to Mutant 2, run it against Test Case 5. If the program crashes, then the mutation testing that you performed not only measured the quality of the test suite, but also detected a serious error in the code. This is how mutation testing can find errors.

main(argc, argv)
int argc;
char *argv[];
{
    int c=0;
    int a, b;
    a = atoi(argv[1]);
    b = atoi(argv[2]);
    if(a < 3){
        printf("Got less than 3\n");
        if(b > 5)
            c = 2;
    }
    else
        printf("Got more than 3\n");
    exit(0);
}

Example 2: Equivalent mutant.

Consider the equivalent mutant (Mutant 4) in Example 2. The difference between Mutant 4 and the previous mutants is that Mutant 4 was created in an attempt to make an equivalent mutant. This means that when it was constructed, an effort was made to build a program that should execute exactly the same as the original program. If you run Mutant 4 against the test suite, Test Case 5 will probably fail — the program will crash. As you can see, by creating an equivalent mutant, you actually increased the detection power of the test suite. The conclusion that you can draw here is that you can increase the accuracy of the test suite in two ways:
1. Increase the number of test cases in the test suite.
2. Run equivalent mutants against the test suite. These conclusions are important; the second conclusion is especially important because it reveals that mutants result in more effective tests. In the examples, you created each mutant by manually making a single change to a program. The process of generating mutants is difficult and time consuming, but it is possible to generate equivalent mutants automatically. Example 3 helps illustrate how this is done. This program has no input and only one output. In principle, it only needs one test case: Test Case 1: input none output 12
The interesting thing about this program is that it can give the answer 13 or 12 depending on the behavior of the compiler. Suppose that you were given the task of creating this program and making sure that it runs on two different platforms. If the platforms have compilers that exhibit different behavior, you will discover the difference when running the program, triggering the question, “What is wrong?” This will probably result in the program’s problem being fixed. Suppose that you create the equivalent mutant in Example 4. The result of this program does not depend on the compiler and is, in fact, exactly predictable — it is 13. If you run the mutant against the test suite, you will discover the error. The most amazing thing about mutation testing is that it can discover errors that normally are almost impossible to detect. Frequently, when these errors are uncovered, they manifest themselves as the program is crashing. Often, programmers do not understand that. The equivalent mutant is an opportunity to discover errors, not a headache. Typically, programmers expect equivalent mutants to behave the same as the original program. If this were true all the time, mutation testing would be completely useless.
Step 2. After you clean the most critical errors, use a static-analysis tool to identify code that is likely to cause trouble when it is ported to the new platform/architecture. There are two main tasks to focus on while performing static analysis: 1. Identify and fix code that is likely to result in an error on any platform or architecture. 2. Identify and fix code that might not port well. First, check industry-respected C/C++ coding standards that identify coding constructs, which are likely to lead to problems on any platform or architecture. By ensuring that code complies with these coding standards, you prevent errors. This translates to less debugging on the new platform or architecture and reduces the chance of having bugs that elude testing and make their way into the release. Some coding standards to check include: • Never return a reference to a local object or a dereferenced pointer initialized by “new” within the function. Returning a reference to a local object might cause stack corruption. Returning a dereferenced pointer initialized by “new” within the function might cause a memory leak. • Never convert a const to nonconst. This can undermine the data integrity by allowing values to change that are assumed to be constant. This practice also reduces the readability of the code because you cannot assume const variables to be constant. • If a class has any virtual functions, it shall have a virtual destructor. This standard prevents memory leaks in derived classes. A class that has any virtual functions is intended to be used as a base class, so it should have a virtual destructor to guarantee that the destructor is called when the derived object is referenced through a pointer to the base class.
• Public member functions shall return const handles to member data. When you provide nonconst handles to member data, you undermine encapsulation by allowing callers to modify member data outside of member functions.
• A pointer to a class shall not be converted to a pointer of a second class unless it inherits from the second. This "invalid" downcasting can result in wild pointers, data corruption problems, and other errors.
• Do not directly access global data from a constructor. The order of initialization of static objects defined in different compilation units is not defined in the C++ language definition. Therefore, accessing global data from a constructor might result in reading from uninitialized objects.
(For more rules, see the works of Scott Meyers, Martin Klaus, and Herb Sutter.)
After you locate and repair this error-prone code, start looking for code that works fine on your current platform/architecture, but that might not port well. Some rules that are applicable to most 64-bit porting projects include:
• Use standard types whenever applicable. Consider using size_t rather than int, for example. Use uint64_t if you want a 64-bit unsigned integer. Not only will this practice help identify and prevent current bugs in the code, it will also help with the porting effort in the future when the code is ported to 128-bit processors.
• Review all existing uses of long data types in the source code. If the values to be held in such variables, fields, and parameters fit in the range of 2Gig–1 to –2Gig, or 4Gig to 0, then it is probably best to use int32_t or uint32_t, respectively.
• Examine all instances of narrowing assignment. Avoid such assignments because the assignment of a long value to an int results in truncation of the 64-bit value.
• Find narrowing casts. Use narrowing casts on expressions, not operands.

int doublew(x)
int x;
{ return x*2; }

int triple( y)
int y;
{ return y*3; }

main()
{
    int i = 2;
    printf("Got %d \n", doublew(i++)+ triple(i++));
}

Example 3: Automatically generating mutants.

int doublew(x)
int x;
{ return x*2; }

int triple( y)
int y;
{ return y*3; }

main()
{
    int i = 2;
    int a, b;
    a = doublew(i++);
    b = triple(i++);
    printf("Got %d \n", a+b);
}

Example 4: A mutant.
• Find casts from long* to int*. In 32-bit environments, these might have been used interchangeably. Examine all instances of incompatible pointer assignments.
• Find casts from int* to long*. In 32-bit environments, these might have been used interchangeably. Examine all instances of incompatible pointer assignments.
• Find uses of multiplicative expressions not containing a long in either operand. To have integral expressions produce 64-bit results, at least one of the operands must have a data type of long or unsigned long (see the short example after this list).
• Find long values that are initialized with int literals. Avoid such initializations because integral constants might be represented as 32-bit types even when used in expressions with 64-bit types.
• Locate int literals in binary operations for which the result is assigned to a long value. 64-bit multiplication is desired if the result is a 64-bit value.
• Find int constants used in 64-bit expressions. Use 64-bit values in 64-bit expressions.
• Find all pointers cast to int values. Code involving conversions of pointers from or to integral values should be reviewed.
• Find and review any in-line assembly. This probably will not port well.
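The multiplication rule above is the one that most often bites silently, so here is a minimal sketch (mine, not the author's) of the difference on an LP64 system where int is 32 bits and long is 64 bits:

    #include <stdio.h>

    int main(void)
    {
        int rows = 70000;
        int cols = 70000;

        /* Both operands are int, so the multiplication is performed in 32 bits
           and overflows before the (already wrong) result is widened to long. */
        long bad = rows * cols;

        /* Promoting one operand first makes the whole expression 64-bit. */
        long good = (long)rows * cols;

        printf("bad  = %ld\n", bad);   /* some overflowed value */
        printf("good = %ld\n", good);  /* 4900000000 */
        return 0;
    }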
Step 3. Repeat runtime error detection to verify that the modifications you made while fixing coding standard violations did not introduce any runtime errors.
Step 4. At this point, you have the option of performing one more step to ensure that your code is as clean as possible before you port it. This additional step is unit testing. Unit testing is traditionally used to find errors as soon as each application unit is completed. It can also be beneficial later in the development process because, at the unit level, it is easier to design inputs that reach all of the functions. In turn, this helps you find errors faster and expose errors that you might not uncover with application-level testing.
Identifying Problems on the 64-Bit Processor
Of course, there may be problems with the 64-bit processor itself. If so, follow these guidelines:
Step 1. Recompile your application on the 64-bit processor. If you have trouble compiling it, work out all of the quirks related to compiler variations. You might also want to create coding standard rules that will automatically identify the code associated with these quirks so that you can prevent these compilation problems from occurring in the future.
Step 2. Once you recompile the code, perform code inspection again to check if the new code complies with all appropriate coding standards. At this point, every change that should have been made but that was not made is an error. Fix these errors immediately! You don't want to look for these errors as the application is running.
Step 3. Link your application and try to build it.
Step 4. At this point, you should try to run your code. If you have a problem getting code running on the 64-bit processor, use a unit-testing framework to run the code function by function; that way, you can flush out exactly what in the code is not portable. Test the function, and then fix the problem. Continue this process until the entire application is running.
Step 5. Repeat runtime error detection. Once the application is running, you'll want to repeat runtime error detection because the porting process is likely to cause some new problems that could not be detected before the port (for example, new memory corruption problems or different behaviors). If the runtime error detection exposes problems, fix every bug related to porting.
Conclusion
Following the guidelines proposed in this article, you will find and fix C/C++ memory errors prior to release, saving weeks of debugging time and preventing costly crashes from affecting your customers. DDJ
Pointer Containers
All the benefits of smart pointers, but none of the baggage
THORSTEN OTTOSEN
Object-oriented programming in C++ has always been a bit awkward for me because of the need to manage memory when filling containers with pointers. Not only must we manage memory, but the memory-management strategy implies lots of extra syntax and typedefs that unnecessarily clutter the code. Ignoring garbage collection, there are two approaches to memory management:
• Smart pointers, as in Listing One(a).
• Making your own implementation, such as Listing One(b).
The good thing about smart pointers is that they are safe and fast to implement (just a typedef). On the downside, they are inefficient (in space and time) and have a clumsy syntax. Consequently, in this article, I describe the design and implementation of the second approach, which I call "pointer containers," and which has all of the benefits of smart pointers but without the downside. The complete source code that implements this technique is available electronically; see "Resource Center," page 4.
Thorsten is a co-owner of Dezide (http://www.dezide.com/), which specializes in troubleshooting programs based on Bayesian-network technology. He is currently writing a second major in computer science in the area of expert systems. He can be contacted at [email protected] or [email protected].
Exception Safety
When dealing with memory, exception safety is always a concern. Basically, the
concept can be summarized as follows (from stronger to weaker): • The “nothrow” guarantee means code will not throw. • The “strong” guarantee means code has roll-back semantics in case of exceptions. • The “basic” guarantee means code will only preserve invariants in case of exceptions— in all cases, no resources leak. I start by extending the interface of the container, as in Listing Two(a). Granted, it is easy to implement push_back( ), as in Listing Two(b), but vector::push_back( ) might throw an exception if a new buffer cannot be allocated (leading to a leak of it). Consequently, you have to change the implementation, as in Listing Two(c). Now, push_back( ) has the strong exceptionsafety guarantee. Although this seems simple, even experienced programmers seem to forget it. Listing Three(a) is a naïve implementation of the iterator range insert( ). You must heap allocate copies because the container takes ownership of its pointers. This is not only a naïve implementation, but it is also horribly flawed. First, the elements are inserted in reverse order — if they ever get inserted! Second, before might be invalidated by vector::insert( ), leading to illegal code. Third, there are potential memory leaks (as with push_back( )). And fourth, it is potentially slow because you might cause several reallocations of a larger buffer. Listing Three(b) is better. That’s it, right? Well, yes and no. Yes, because I have removed all the stupid errors; and no, because you might still get several reallocations. So why not just call: vec_.reserve(distance(first,last)+ vec_.size() );
as the first line in insert( )? The answer is that reserve( ) can invalidate before. (As a caveat, the iterator type I might be an InputIterator, which means you cannot determine the distance between them, so you would need to do your own tagging. I will ignore that issue for now, although Dr. Dobb’s Journal, October 2005
the implementation must not.) Furthermore, if you decided to implement several pointer containers in one generic wrapper, you cannot call container-specific operations, such as reserve( ), without extra traits. Still, you have only achieved the basic exception-safety guarantee (which would be acceptable), you’re left with a clumsy implementation, and inserting pointers one-at-a-time means you’re not taking advantage of the fact that you know that N pointers must be inserted.
What you really would like to do is to use the iterator range version of insert( ) in vector. One idea would be to make an iterator wrapper whose operator*( ) would return a newly allocated copy. Iterators that do something extra are sometimes called “smart iterators” (see “Smart Iterators and STL,” by Thomas Becker, C/C++ Users Journal, September 1998). This would again lead to basic exception safety; however, the C++ Standard does not exactly specify how and when insertions take place and I could imagine prohibited memory errors occurring. In particular, the Standard does not guarantee that one element is inserted before the next iterator indirection happens. Therefore, I decided to use a different strategy and go for the strong exceptionsafety guarantee. The only thing it requires is a temporary exception- safe, fixed-sized buffer to hold the new pointers while they are being allocated. This http://www.ddj.com
is my scoped_deleter class and its role is similar to auto_ ptr in the implementation of push_back( ): It holds pointers and deletes them if an exception occurs. Once it is created, the implementation looks like Listing Four. There you have it — a generic, strongly exception-safe, and elegant implementation. Because copying a T* cannot throw, vec_.insert( ) is guaranteed to have the strong exception guarantee; hence, you never insert pointers that will also be deleted by the scoped_deleter if an exception occurs. The price you pay is one extra heap allocation for the internal buffer in scoped_deleter. An extra heap allocation is normally expensive, but it is acceptable here because you make N heap allocations while copying the pointers (and you could optimize this by using stack space for small buffers). You can then continue in this fashion to ensure that the insert( ), assign( ), erase( ), and clear( ) interfaces from std::vector are reimplemented in ptr_vector in an exception-safe manner. Iterator Design You might have wondered why new(T(*first )) would compile; after all, if the iterator is defined as the iterator of std::vector, you would certainly need two indirections (unless the constructor took a pointer argument). The reason is that you want normal iterators to be indirected by default, and the “old” style iterators to be available if needed. Therefore, you have Listing Five in ptr_vector, along with similar versions for const iterators. The indirect_iterator adaptor from Boost (http://www.boost.org/) will most likely be part of the new Standard. The adaptor does all the hard work for you and has several good consequences: • First of all, it promotes pointerless programming and gives code a clean look. For example, compare Listing Six(a) to Listing Six(b). If you allowed direct pointer assignment, you would at least have a memory leak and (most likely) a pending crash if the same pointer is deleted twice. • The second good consequence is that it allows for seamless integration between ptr_vector and (for example) std::vector. This can be useful if we deal with copyable, nonpolymorphic objects. In that case, you could say something like Listing Six(c). Of course, it will also work the other way around from v2 to v. • The third benefit is that it is just safer to use standard algorithms. • The fourth benefit is that you can use normal functors directly without speculating with wrappers that do the indihttp://www.ddj.com
rection. However, there are also situations where you want to use the old ptr_iterator. These situations are characterized by using mutating algorithms from the Standard Library, which need to copy objects. Copying the object is expensive (compared to copying a pointer) or simply prohibited (as in most polymorphic classes). So, if copying is cheap and possible, then you can just stick to the normal, safer iterators. In short, iterators promote safety and interoperability while not restricting access to the nonindirected iterators when needed. The Clonable Concept Even though we have a solid iterator range insert( ), we still demand that new T( *first ) is a valid expression. There are several reasons why this is not ideal. The first example I can think of involves objects that are created by object factories. A second example involves polymorphic objects that are not copyable — a polymorphic object should very rarely be copyable. In both cases, you cannot use the call new T( *first ), so you need a hook of some kind. The required indirection is given by the “Clonable” concept; let t be an object of type T, then T is clonable if new T( t ) is a valid expression or if allocate_clone( ) has been overloaded or explicitly specialized for T; see Listing Seven. What the implementation of ptr_vector now has to ensure is that the right version of allocate_clone is actually called. There are several pitfalls here and they all resemble the problems that the Standard has with calling std::swap( ). One solution is simply to call allocate_clone( ) unqualified within ptr_vector and to define the overloaded version in T ’s namespace. This way, you rely on argument-dependent lookup and replace the function template specialization with a simpler overloaded function. (Another possibility would be to add another template parameter to ptr_vector— the parameter could then be a type with allocate_clone( ) as a static function.) Domain-Specific Functions Because the pointer container manages resources, there is suddenly a whole new interface that makes sense — an interface that deals with releasing, acquiring, and transferring ownership. Consider first the possibility in Listing Eight(a) to release a single object, which makes it simple and safe to hand over objects in the container. Should the caller forget to save the return value, you still have no leaks. While all standard containers have copy semantics, providing the same for ptr_vector would be a bit strange — after all, the Dr. Dobb’s Journal, October 2005
objects you insert into the ptr_vector are only required to be clonable. The fact that copying (or cloning) a whole ptr_vector can be expensive also suggests that you need something different. Hence, you add the two functions in Listing Eight(b). The first makes a deep copy of the whole container using allocate_clone( ) and is easily made strongly exception safe. The second simply releases ownership of the whole container and it is also strongly exception safe. Notice that it cannot have the “nothrow” guarantee because you must allocate a new ptr_vector for the returned auto_ ptr. The new constructor is then used to take ownership of the result of clone( ) or release( ), and it gives the “nothrow” guarantee. What is really elegant about these functions is that they let you return whole containers as results from functions in an efficient and exception-safe manner. Recall that the iterators prohibited direct pointer assignment. This is certainly a good thing, but you lack the ability to reset a pointer within the container. Therefore, you add two functions to ptr_vector; see Listing Eight(c). The first is a rather general version that makes sense on other containers, too, whereas the last only makes sense on random-access containers such as ptr_vector. Now you can add these functions to your container; see Listing Nine. The idea behind these functions is that they let you transfer objects between different pointer containers efficiently without messing with the pointers directly. In a way, they are similar to splice( ) in std::list and can easily fulfill the strong exception-safety guarantee. Caveats There are still several issues that must be dealt with — support for custom deleters, for instance. Because you can specify how allocation is performed with allocate_clone( ), you should also be able to specify deallocation with deallocate_clone( ). The introduction will affect the implementation greatly because you need to use it whenever deletion is done; this means that you must scrap std::auto_ ptr because it cannot support a custom deleter, and change scoped_deleter similarly. There are certain cases where you still want to use the old-fashioned ptr_iterators with mutating algorithms. The good news is that some mutating algorithms can be used with an indirecting functor; that is, a functor that compares the objects instead of the pointers. The bad news is that “some” is not “all,” and algorithms such as remove( ) and unique( ) just copy elements instead of swapping them. This leads to both memory leaks and undefined behavior (see “The Standard Librarian: 69
Containers of Pointers,” by Matt Austern, http://www.cuj.com/documents/s=7990/ cujcexp1910austern/). There is a workaround for remove( ) and its cousin remove_if( ), but it is not very practical (see “A remove_if for vector,” by Harold Nowak, C/C++ Users Journal, July 2001). Though I have not yet decided how to deal with this, I am leaning toward implementing these few error-prone functions as member functions. While I’ve focused on the ptr_vector class, the same rules apply to all the standard containers. Thus, you have a wide range of pointer containers that suit almost any special situation. For sequences, the default choice should (as usual) be ptr_vector. A ptr_list should be re-
served for the rare cases where you have a large container and insertions/deletions are done at places other than the two ends. By a “large container,” I mean one that holds more than 100 elements (but this is only a rough estimate and the guideline differs from platform to platform). You have to remember that a list node often (but not always with special allocators) has to be on the heap each time an insertion takes place. Such an allocation can easily cost the same as moving 100 pointers in a ptr_vector. Conclusion Clearly, implementing your own pointer containers is not a walk in the park. How-
ever, once done, you have a useful and safe utility that enables flawless object-oriented programming. If you consider all the extra work that you have to do to make containers exception safe, it should not be surprising that garbage collection can be just as fast as manual memory management. The downside to garbage collectors, of course, is that they waste more memory (for example, a compacting garbage collector will probably double the memory used). It is ironic that with the use of smart pointers and pointer containers, we’re close to not needing garbage collection at all. DDJ
(b)
template< class T >
class ptr_vector
{
    std::vector<T*> vec_;
public:
    ~ptr_vector(); // delete objects
    // ... much more
    typedef <something> iterator;
};
(c)
void push_back( T* t )
{
    std::auto_ptr<T> p( t );
    vec_.push_back( t );
    p.release();
}
Listing Three
(a)
template< class I >
void insert( iterator before, I first, I last )
{
    while( first != last )
    {
        vec_.insert( before, new T( *first ) );
        ++first;
    }
}
(b)
template< class I >
void insert( iterator before, I first, I last )
{
    while( first != last )
    {
        auto_ptr<T> p( new T( *first ) );
        before = vec_.insert( before, p.get() );
        ++before; // to preserve order
        ++first;
        p.release();
    }
}
Listing Four
void insert( iterator before, I first, I last )
{
    size_t n = distance( first, last );
    scoped_deleter sd( n );
    for( ; first != last; ++first )
        sd.add( new T( *first ) );
Listing Seven
// primary template:
template< class T >
T* allocate_clone( const T& t )
{ return new T( t ); }

// overloaded function template
template< class T >
X* allocate_clone( const X& t )
{ return factory::clone( t ); }

// function template specialization
template<>
Polymorphic* allocate_clone( const Polymorphic& p )
{ return p.clone(); }
(c) void replace( iterator where, T* x ); void replace( size_type idx, T* x );
Listing Nine template< class I > void transfer( iterator before, I first, I last, ptr_vector& from ); void transfer( iterator before, I i, ptr_vector& from );
EMBEDDED SYSTEMS
Using Hardware Trace for Performance Analysis
Easily gathering accurate information about the execution of embedded systems
MICHAEL LINDAHL
Whether you want to create a faster printer, squeeze extra features into your cell-phone design, or minimize the cost of your embedded device by lowering your computation and processor requirements, performance analysis is a vital component of developing high-quality embedded systems. Thus, the consequences of inefficient code are very real when making embedded devices.
There are several traditional tools and techniques that you can use to diagnose and help fix performance problems in embedded systems. However, many of these tools have limitations because they require modification to the code running on the system. This means that the tool interferes with the execution of the software, which can lead to inaccurate results or other problems.
One class of tools that does not have any impact on your running system, yet makes performance analysis easy, is based on hardware trace technology. Hardware trace essentially gives you a complete history of the instructions executed by your microprocessor. This information can be collected without modifying the running code or altering its runtime characteristics — it is collected with zero intrusion on your system. Trace-analysis tools can then convert this information into meaningful and powerful performance-analysis data that lets you analyze your system (without altering its runtime behavior) and optimize your system for maximum performance.
Michael is a senior software engineer at Green Hills Software. He can be reached at [email protected].
In this article, I examine some traditional performance-analysis techniques as they apply to embedded systems and discuss some of their inherent limitations. Next, I explore hardware trace by investigating what it is and how it can be used to effectively and efficiently debug your embedded software. Finally, I walk through a real-world example to illustrate some of the unique benefits of using hardware trace for system optimization. Performance Analysis and Optimization Performance analysis is an important part of developing a complete embedded product. Whatever your reasons for needing higher performance from your software, you almost certainly could use code that runs faster. However, performance analysis is generally a difficult task because it requires careful measurement and analysis of your system. In addition, after gaining visibility into what your software is doing, you must find areas where the code is performing unnecessary computations or other actions. Oftentimes, system performance can be greatly improved with relatively simple changes. However, finding the places that can be easily optimized is often a challenging task. There are many techniques that have been traditionally used to analyze software performance. The most common techniques involve various forms of profiling, which give you information about how often each part of your program is running. The most popular forms of profiling provide statistical information about what code is running. This data can be collected by either periodically sampling the program counter or by instrumenting the object code running on your system. Regardless of the method, profiling methods generally maintain an array of the program counter locations in your program. Profiling methods then increment a counter for each program counter location as it is encountered, letting you see how much time is spent at each instruction in your program. A debugger or other tool then correlates this raw information to the time spent in each task, function, and source code line in your system. Dr. Dobb’s Journal, October 2005
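To make the sampling idea concrete, here is a minimal sketch (mine, not from the article) of the counter-array bookkeeping such a profiler might use; the code-region addresses, the bucket size, and the timer-interrupt hook are all assumptions made only for illustration:

    #include <stdint.h>

    #define CODE_START  0x08000000u   /* assumed start of the code region */
    #define CODE_SIZE   0x00100000u   /* assumed 1 MB of code */
    #define BUCKET_SIZE 4u            /* one counter per 4-byte instruction slot */

    static uint32_t samples[CODE_SIZE / BUCKET_SIZE];

    /* Called from a periodic timer interrupt with the interrupted program
       counter. Each hit bumps the counter for that address; a host tool later
       maps the counters back to tasks, functions, and source lines. */
    void profile_sample(uintptr_t pc)
    {
        if (pc >= CODE_START && pc < CODE_START + CODE_SIZE)
            samples[(pc - CODE_START) / BUCKET_SIZE]++;
    }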
Statistical profiling lets you see the percentage of time spent at every location in your program. For example, Figure 1 is output from a profiling tool for the program in Listing One. However, this information comes at a steep cost, especially for real-time systems. By either halting your system periodically to sample the program counter or modifying your executable, standard profiling methods change the behavior of your system, often in dramatic ways.
“Performance analysis is an important part of developing a complete embedded product” It is possible that these methods could change the timing behavior of your system, potentially causing your software to miss deadlines or encounter other serious problems. Also, storing the profiling information on embedded systems is often a challenge because either additional memory or a way to transmit this information as the system runs is required. Finally, instrumentation solutions may increase the size of your code, potentially making it too large to fit into the limited memory available on many embedded systems. Hardware Trace Hardware trace is a technology available on many embedded microprocessors that allows you to effectively analyze your system without any of the drawbacks previously mentioned. In addition, hardware trace offers some unique performanceanalysis benefits that are unavailable with standard profiling methods. Hardware trace provides a complete history of what your microprocessor is doing as your software runs. This lets you 71
examine the various interactions in your software in great detail. By collecting information about what instructions are executed, when interrupts and context switches occur, and sometimes even what data addresses and values are loaded and stored, hardware trace provides nearly complete visibility into the workings of your software. In addition, because hardware trace uses dedicated logic on your processor and specialized collection devices, it does not impact the execution of your software, and you can collect detailed information about your production code while it runs at full speed. These capabilities let you visualize your software in unprecedented ways, making it easier to debug and optimize.

Hardware trace data is output over a dedicated port on a trace-enabled microprocessor. Several embedded microprocessors available today can output trace data, including the
Figure 1: Standard performance-analysis tool that displays the amount of time spent in each function.

Figure 2: Block diagram of hardware trace collection: a trace-enabled microprocessor feeds its trace port into a trace-port analyzer, which uploads the captured data over USB, Ethernet, or a similar link to a host PC.
Figure 3: SuperTrace Probe ready to collect trace data from an ARM target.
ARM7, ARM9, and ARM11; the IBM PowerPC 405 and 440; and the PowerPC 5500 and MAC7100 families from Freescale. In addition, more and more microprocessors are adding support for trace as the benefits of debugging and analyzing embedded systems with hardware trace become more widely recognized.

Special hardware called a "trace probe" or "trace-port analyzer" connects directly to a microprocessor's trace port and captures the trace data. This data is then uploaded to a host system, such as a PC running Windows or Linux, to be decompressed and analyzed. Figure 2 is a block diagram of a trace-collection system using an embedded microprocessor.

Trace probes also offer features that are particularly important when optimizing the performance of an embedded system. The most important of these is the ability to collect time stamps along with each piece of trace information. This lets you accurately measure the time elapsed between two events and precisely analyze where your program is spending its time. In addition, some processors let you collect information about how many cycles each instruction takes, which allows you to investigate the effects your caches are having on your system and perform other detailed kinds of analysis for improving software performance. Once the trace data is uploaded, host tools let you analyze various parts of your system's execution, including debugging incorrect behavior and searching for performance problems. A number of trace probes and trace-analysis software packages are available, including the SuperTrace Probe and MULTI TimeMachine Suite from Green Hills Software (where I work; http://www.ghs.com/).

Hardware Trace Applied to Performance Analysis

By providing a wealth of information about the execution of your system without changing its behavior, hardware trace data enables unique, advanced performance-analysis techniques. Because it lets you collect timing and cycle information, you can very accurately measure important system metrics such as interrupt latency and context-switch times. And once you can measure these numbers accurately, you can find the parts of your system that are running too slowly and that can easily be sped up. The first and most important step of performance optimization is to find the "hot spots" where your system is spending a large amount of time and where the code can be improved to yield the greatest
gains in performance. These are often inner loops and other parts of your code that are executed many times, although they could also be inappropriate algorithms or other slow code. Locating these hot spots is the primary purpose of nearly all performance-analysis tools, including those that let you visualize your program using trace data.

One major advantage of trace data when searching for hot spots is that it lets you locate anomalous execution patterns in your software. This may be a function that takes an unusually long time with certain parameters; a loop that takes too long on one or more specific iterations; or an interrupt service routine that is fast most of the time but occasionally takes too long. For instance, if a function executes 100 times, taking 100 cycles on 99 calls and 10,100 cycles on the final call, it consumes 20,000 cycles in total, for an average of 20,000/100 = 200 cycles per call. A statistical profiling tool is unable to distinguish this case from a function that takes 200 cycles every time it is called. The performance improvements you would look for differ between the two cases: In the first, you may need to debug just one slow path through the function; in the second, you may want a more general overhaul of the function or algorithm if its performance needs to improve. Trace-analysis tools, however, can distinguish between these two cases by giving you a list of each call to the function and its duration. Once you have identified the functions that consume most of your execution time, you can then examine the paths through each function that cause it to run slowly and eliminate those inefficiencies.

Many trace tools also let you visualize your system over time so that you can graphically see which functions take most of the time in your system. By letting you visualize this data rather than pore over lists of numbers, these tools give you a solid understanding of how your software spends its time, making it easier to make your program more efficient. In addition, some trace tools even let you determine the parameters passed to a function on a slow call so that you can immediately debug it and improve system performance.

An Example

Listing Two is a simple program that sorts an array a large number of times. It uses an implementation of the quicksort algorithm, so the sorting itself should perform reasonably well. Each time through the main loop,
I read an array, sort it, and verify that it is sorted correctly. If the sort ever fails, the program bails out of the main loop and prints an error message. When running this program, you discover that it runs slower than expected.

If you were using an ordinary performance-analysis tool, such as a statistical profiler, you would be able to determine that the sort function takes too long, and even which source lines cause the delay. However, it would be difficult to distinguish between a sorting algorithm that is too slow on average and one that is too slow during a single iteration of the loop. You could insert additional code into the loop to measure the execution time of each iteration and see whether anything interesting turns up, then dig deeper once you discovered the iteration that was taking too long. In short, you would have to do a lot of work and probably recompile and rerun the program several times before solving the problem. In addition, many embedded systems have strict timing deadlines, so inserting instrumentation code may cause incorrect behavior or lead to other negative outcomes. While this additional code may not affect this simple example, it is easy to imagine a scenario where tight deadlines are missed once extra instrumentation code is inserted.

If, instead of using traditional performance-analysis techniques, you collect trace data for this example, you can easily get information about each function that executes and how long it takes to run. You can even precisely measure the amount of time, and the number of cycles, that each iteration of the loop takes. This information is invaluable in locating hot spots quickly so that you can maximize your efficiency when tuning your system. Moreover, this data lets you focus on the code that runs slowly by precisely accounting for all elements of your system, including cache hits and misses, interrupt latencies, and others.

Returning to the example, I run the program on an ARM processor and collect trace data as the program runs. Figure 3 shows a photo of this setup and Figure 4
Figure 4: Output from PathAnalyzer shows a graphical call stack of your program over time.
shows the resulting output from a TimeMachine tool called the PathAnalyzer, which displays a graphical call stack over time. You can immediately see that the quick_sort() function takes up the majority of the running time and that a single iteration takes longer than the rest. To investigate the quick_sort() function in more detail, you can easily look at all of the calls to quick_sort() and their durations. Browsing all of the calls results in Figure 5, which lists the 10 calls to quick_sort() and the duration of each call in processor clock cycles. You can quickly see that the seventh call to quick_sort() takes almost four times longer than the rest. And you are able to collect and display this information while the system is running at full speed, without modifying its behavior in any way.

At this point, you know that you should investigate this specific call to quick_sort(). By examining and debugging the code in this example, you discover that the array on the seventh iteration is already sorted. As you may recall, a naive quicksort degenerates into an O(N²) algorithm when run on an already sorted array, which is the cause of the performance problem in this example. There are several ways to fix this by modifying the quicksort implementation; this is left as an exercise for the reader, though one possible direction is sketched below. For more information on the quicksort algorithm, see a book on algorithms such as Algorithms in C by Robert Sedgewick (Addison-Wesley, 1998).
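As a sketch of one common fix (not the article's code; Listing Two's quick_sort_r() and its partition routine are not reproduced in full here), choosing the pivot as the median of the first, middle, and last elements sidesteps the worst case on already-sorted input:

/* Hypothetical illustration of one way to avoid quicksort's O(N^2) behavior
 * on already-sorted input: use a median-of-three pivot instead of always
 * pivoting on an end element. A generic sketch, not Listing Two's code. */
static void swap_int(int *a, int *b) { int t = *a; *a = *b; *b = t; }

static void quick_sort_median3(int *array, int lo, int hi)
{
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;

        /* Order array[lo], array[mid], array[hi]; the middle value becomes
         * the pivot, so a presorted run no longer picks its extreme. */
        if (array[mid] < array[lo]) swap_int(&array[mid], &array[lo]);
        if (array[hi]  < array[lo]) swap_int(&array[hi],  &array[lo]);
        if (array[hi]  < array[mid]) swap_int(&array[hi],  &array[mid]);
        int pivot = array[mid];

        int i = lo, j = hi;
        while (i <= j) {                  /* Hoare-style partition */
            while (array[i] < pivot) i++;
            while (array[j] > pivot) j--;
            if (i <= j) { swap_int(&array[i], &array[j]); i++; j--; }
        }
        /* Recurse on the smaller half and loop on the larger one to bound
         * stack depth, which matters on memory-constrained targets. */
        if (j - lo < hi - i) {
            quick_sort_median3(array, lo, j);
            lo = i;
        } else {
            quick_sort_median3(array, i, hi);
            hi = j;
        }
    }
}

Randomizing the pivot, switching to insertion sort for tiny partitions, or detecting presorted input are other standard options.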
Figure 5: Duration in processor clock cycles of each call to the quick_sort() function from Listing Two. You can see that the highlighted call takes almost four times as long as the others.
In addition, if you simply wanted to characterize this sorting algorithm, you could collect trace data from the implementation running on a variety of input arrays and collate the very accurate timing results into a useful profile of the algorithm's behavior. While this type of analysis can be done in other ways for simple sorting algorithms, if you are collecting data from sensors on your embedded system and want to analyze the cost of reading and sorting that data over long periods of time, hardware trace is an ideal solution: It does not modify the runtime behavior of your system, yet it provides detailed and accurate information about the system's execution. While this was a fairly simple example, you can see how trace tools make it painless to find hot spots and perform
advanced analysis in an embedded system so that performance optimization can be done more efficiently and effectively.

Conclusion

Hardware trace provides a mechanism through which embedded developers can easily gather very accurate information about the execution of their systems without having to modify their software or impact its runtime behavior. This lets hardware trace be used in nearly any embedded system that runs on a trace-enabled processor, regardless of whether the application has strict timing requirements. Also, because hardware trace does not modify the behavior of your system, you can analyze your software in real-world conditions while it is running
Listing One

/* This code implements a simple bubble sort algorithm. Figure 1 shows the
 * output from a traditional profiling tool for this code. You can see that
 * roughly 28% of the time running this program consists of copying memory
 * using the memset function. From looking at the code, this is a non-obvious
 * result. */
#include <stdio.h>
#include <string.h>    /* added here for memcpy() */

#define N 10

int array_to_sort[N] = {15, 6, 25, 36, 92, 0, 2, 67, 82, 91};

void bubble_sort(int *array, int size)
{
    int i, j;
    for (i = 0; i < size-1; i++) {
        for (j = 0; j < size-i-1; j++) {
            if (array[j] > array[j+1]) {
                int tmp = array[j];
                array[j] = array[j+1];
                array[j+1] = tmp;
            }
        }
    }
}

void print_array(int *array, int size)
{
    int i;
    for (i = 0; i < size; i++) {
        printf("%d\n", array[i]);
    }
}

int main(int argc, char *argv[])
{
    int tmp_array[N];
    memcpy(tmp_array, array_to_sort, sizeof(array_to_sort));
    bubble_sort(tmp_array, N);
    print_array(tmp_array, N);
    return 0;
}
Listing Two

/* This code contains an implementation of the basic quicksort algorithm.
 * The program's main loop reads an array, sorts it, and then verifies that
 * the array is properly sorted. By tracing this program, we can easily
 * identify the slow call to quick_sort() and determine the causes of this
 * slowdown. */
int array1[N];
int array2[N];

#define swap(a, b) { int tmp = a; a = b; b = tmp; }

void init()
{
    int i;
    for (i = 0; i
    /* ... Listing Two is truncated here in this reproduction: the rest of
     * init(), the read_array() helper, and the recursive quick_sort_r()
     * are missing, and the listing resumes below after the intervening
     * column text. A hedged sketch of the missing pieces follows the rest
     * of the listing. */
on actual data collected from the rest of the system. These characteristics make trace a promising solution for many embedded developers. Whether you want to increase the performance of your application to free up CPU cycles for extra features, to lower the clock frequency of your processor to save power, or to achieve higher performance for some key metric of your system, optimizing embedded systems is an important task that hardware trace can make easier. By providing zero intrusion and enabling very accurate measurements, hardware trace can help you achieve your performance-optimization goals quickly and efficiently.
}

void quick_sort(int *array, int size)
{
    quick_sort_r(array, 0, size-1);
}

int verify_array(int *array, int size)
{
    int i;
    int last_val = 0;
    for (i = 0; i < size; i++) {
        if (array[i] < last_val)
            return 0;
        last_val = array[i];
    }
    return 1;
}

int main(int argc, char *argv[])
{
    int i;
    int tmp_array[N];
    int size;

    init();
    for (i = 0; i < 10; i++) {
        // Let's read in an array and sort it
        read_array(tmp_array, &size, i);
        quick_sort(tmp_array, size);
        // Now let's verify that our array is sorted properly.
        if (verify_array(tmp_array, size) == 0) {
            printf("Array not sorted correctly\n");
            break;
        }
    }
    return 0;
}
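The reproduction of Listing Two above omits the top of the file, the body of init(), read_array(), and the recursive quick_sort_r(). The following is only a guess at what those pieces might look like, written to stay consistent with how the reproduced code and the article use them (it relies on the array1, array2, and swap() definitions shown in the listing). The value of N, the contents of the two arrays, the choice of iteration 6 for the already-sorted input, and the last-element pivot are all assumptions, not the author's code.

/* Hypothetical reconstruction of the pieces missing from the reproduction
 * of Listing Two. Everything here is an assumption made for illustration. */

/* These would precede the global declarations shown in the listing. */
#include <stdio.h>
#include <stdlib.h>
#define N 1000

/* Completion of init(): fill one unsorted and one already-sorted array. */
void init(void)
{
    int i;
    for (i = 0; i < N; i++) {
        array1[i] = rand();          /* arbitrary unsorted values         */
        array2[i] = i;               /* strictly increasing, i.e., sorted */
    }
}

/* Copy one of the source arrays into dest. Here the seventh pass
 * (iteration == 6) is assumed to receive the already-sorted array,
 * matching the slow call observed in the article. */
void read_array(int *dest, int *size, int iteration)
{
    int i;
    int *src = (iteration == 6) ? array2 : array1;
    for (i = 0; i < N; i++)
        dest[i] = src[i];
    *size = N;
}

/* A basic quicksort that always pivots on the last element (using the swap
 * macro from the listing): exactly the kind of implementation that
 * degenerates to O(N^2) on sorted input. */
void quick_sort_r(int *array, int lo, int hi)
{
    int pivot, i, j;
    if (lo >= hi)
        return;
    pivot = array[hi];
    i = lo - 1;
    for (j = lo; j < hi; j++) {
        if (array[j] <= pivot) {
            i++;
            swap(array[i], array[j]);
        }
    }
    swap(array[i + 1], array[hi]);
    quick_sort_r(array, lo, i);
    quick_sort_r(array, i + 2, hi);
}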
DDJ
PROGRAMMING PARADIGMS
Spawn of Crazy Frog
by Michael Swaine
Michael is editor-at-large for DDJ. He can be contacted at [email protected].

It's not really all Francisco Tarrega's fault. Tarrega, a 19th-century Spanish classical guitarist known as the father of the modern classical guitar, died in 1909, but he became, posthumously and for a time, the father of the world's most annoying ringtone — at least according to Britain's Daily Mirror. Tarrega earned that honor a few years back when everybody finally got fed up with a composition of his called "Gran Vals." It wasn't such an annoying tune in itself, but in 1991 the Finnish cell phone maker Nokia pulled 13 notes out of the middle of "Gran Vals" to create one of the first cell phone ringtones. Other annoying ringtones followed, but this one was pretty clearly the most annoying ringtone of them all. Of course, there was less competition back then. "Gran Vals" was not intended to be annoying; it got that way after so many people were forced to listen to those 13 notes endlessly repeated in theaters and subways and during meals and meetings. That's a recipe that can turn any tune from great to grating. But some modern ringtones were actually designed to be annoying.

The Annoying Thing

According to Wikipedia, the ultimate source of the current champion ringtone annoyance is Swede Daniel Malmedahl, a teen-aged internal-combustion-engine impressionist (I'm just reporting), who was trying to nail the sound of a two-stroke moped engine and came up with a screwy vocal sound effect that enjoyed a certain vogue on the Internet before capturing the attention of Erik Wernquist, who, in turn, came up with The Annoying Thing, an animation featuring an anatomically correct anthropomorphic frog in goggles
and a helmet and not much else, who performs hand movements simulating the twisting of the handgrips on a motorcycle while making appropriate noises appropriated from Malmedahl's moped-engine impression. And the rest is what passes for history in an ADD world: Wernquist's Annoying Thing begat the annoying ringtone, now called "Crazy Frog" and on its way to taking over the world.

In short order, Crazy Frog earned 14-million pounds for Jamba, its publisher ("ringtone publisher": Add that to the list of lucrative job categories that didn't exist when you were choosing a major), making it the most commercially successful ringtone ever. To say that Jamba promoted Crazy Frog heavily would be putting it lightly. Crazy Frog pretty much took over British ITV in May, and the resident of the United Kingdom who hadn't been exposed to Crazy Frog probably hadn't been exposed to the common cold, either. Then it became a hit single: A Crazy Frog dance single based on Malmedahl's dopey moped impression outsold the nearest competitor by four to one and rocketed to the top ten in many European markets. There's a Crazy Frog album out, there's a video game in the works, there's a computer virus that masquerades as Crazy Frog, and at this moment someone is probably optioning the movie rights with one hand while getting Jim Carrey's agent on the cell with the other.

The Ringtone Phenomenon

What the heck is up with this bizarre ringtone phenomenon? You may well ask. Here's what's up with that. In the U.S., ringtones were a $300-million business last year and will be double that this year. In Chicago, you can buy ringtones in McDonald's. But the U.S. trails many other countries in cell phone adoption, and therefore, in ringtone mania. The worldwide market is
at least ten times the size of the U.S. market, and is already about a tenth of the total global music market. In Britain last year, ringtone sales surpassed CD sales. That's in pounds, not just units. The ringtone market has attracted talents as diverse as Boy George and Andrew Lloyd Webber. Last year, Dutch R&B singer/songwriter Alain Clark released "Ringtone," a song commemorating the phenomenon.

Naturally, I Googled ringtones. The word scored 11,200,000 hits; for comparison, Harry Potter got me 26,200,000 and Karl Rove got 5,700,000. So in salience, ringtones fall geometrically halfway between a warlock and a popular fictional character.

I looked for an explanation from some of the bloggers who are following the ringtone phenomenon. "I cannot for the life of me explain the ringtone phenomenon," one said. Another: "I'm a bit flabbergasted by wireless consumers' attraction to ringtones." And another: "It's OK to say you don't understand the ringtones business." It is okay, because the phenomenon defies economics, self-interest, and arithmetic. People are paying more for snippets of music than they pay for full songs. There are sources of free ringtones just as there are sources of free songs, but the average price paid for ringtones these days is a buck, which is the going price for a full song from Apple's iTunes Music Store, and you'll pay Nokia several times that for a high-quality True Tones ringtone. Not to mention the fact that the ringtone phenomenon is built on one of the most socially obnoxious aspects of the cell phone: its invasion of others' auditory space. I won't even get into what the cacophony of ringtones is doing to the sex lives of songbirds.

It's worth pointing out that one motivation for using ringtones really is to annoy people. Crazy Frog is evidence of this,
but then there are the ringtones from RudeTones, featuring farts, burps, and sneezes. And ringtones themselves are the tip of a larger iceberg of annoyance. There are also ringbacks, which analysts expect to be bigger than ringtones. Ringback tones are heard by the person calling you, rather than by you and those unfortunate enough to be in your immediate vicinity when you get a call. Two reasons why ringbacks might indeed surpass ringtones are that they are operator owned, so there's an opportunity for the operators to make money, and that businesses can put pitches (sales, not musical) in their ringbacks, for a new advertising channel. Now that's annoying.

There are bird calls on ringtone, and the porn industry has spun the ringtone into the moantone. The (sound) quality is improving: You can Google real tones or true tones or polyphonic ringtones for the skinny on that. You can also download complete songs to your phone. Robbie Williams was the first artist to release an album on a cell phone memory card. This is a different phenomenon from ringtones, but it's clearly on some sort of convergence or collision course with the ringtone phenomenon.

My Ringtone, My Self

There have to be other reasons for the ringtone phenomenon besides the urge to annoy strangers. Many of the people who have thought about this question point to the issue of personalization. A cell phone is a generic consumer product, but it becomes your link to your friends, and therefore, a very personal device. The argument is that this strong social aspect of the device creates an equally strong need to personalize it, to make it represent you because it is, in some sense, your avatar in your social world.

There is an apparent contradiction in this, because ringtones are very much about being part of the crowd, fitting in. The more popular a ringtone is, the less useful it is in defining your particular quirky uniqueness. I think maybe the way to cut through the apparent contradiction is to think about the audience. Musical taste is an individualizing property: You can assert your individuality by flaunting your musical taste. But teenagers, who are the prime market for cell phones and ringtones, are in the process of discovering or creating their identities and their tastes, as opposed to demonstrating them. So their strategy is: "Try out what others like to see what you might like."

So I do think that the idea is correct that ringtones and ringback tones are
about making this very personal device, the cell phone, a more accurate projection of yourself. Let's assume that's right. What are the implications? What's the opportunity? What if, beyond the early stage of exploration and trying on other people's tastes, the ultimate goal is to have your own personal ringtone or ringback tone, unique to you and reflecting your taste, designed by you without requiring you to be a composer? How would you produce that? Who would produce it? How would anybody make money off it?

Enter Stephen Wolfram

Maybe Stephen Wolfram and his colleagues know. Wolfram is the genius behind the massive mathematics program and language Mathematica. For many years, he has been concentrating on the complex structures that can be generated using the kinds of simple rules found in Cellular Automata (CA). If you know all about CAs, you can skip ahead, but for those who aren't up to speed on this fascinating realm of programming, here's a quick glimpse.

A CA operates in a Cartesian digital world of n spatial and one temporal dimensions; n=1 is complicated enough for the current discussion. A particular CA requires two things: an initial state and a transformation rule. The initial state specifies the values of all cells in the CA's one-dimensional row of cells. These cells may be defined to be a single bit in depth, or they may be deeper. A CA with 1-bit cells can be represented as a row of black and white boxes (ON and OFF), and a CA whose cells are deeper can be represented as a row of colored boxes. Often, the initial state of a CA will be a single ON cell, all others OFF. The rule specifies how you get to the next state from the current state. Subsequent states are also called "generations," and the rule specifies how the CA evolves from generation to generation. CA rules often embody a principle of locality, so that cells are only influenced by adjacent or nearby cells. An example of a simple rule for a CA with binary cells might be: A cell will be ON in the next generation if, and only if, exactly one of [the cell, its left neighbor, and its right neighbor] is ON in the current generation, or if the cell and its right neighbor, but not its left neighbor, are ON in the current generation.
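In code, the rule just described is only a few lines. Here is a minimal sketch of it evolving from a single ON cell; the width, generation count, wraparound edges, and character output are illustration choices of mine, not anything from Wolfram's materials.

/* Minimal sketch of the one-dimensional, binary, nearest-neighbor CA rule
 * quoted above, run from a single ON cell. Edges wrap around, and the
 * width, generation count, and '#'/'.' output are arbitrary choices. */
#include <stdio.h>
#include <string.h>

#define WIDTH 64
#define GENERATIONS 24

int main(void)
{
    unsigned char cur[WIDTH] = {0}, next[WIDTH];
    cur[WIDTH / 2] = 1;                      /* single ON cell in the middle */

    for (int g = 0; g < GENERATIONS; g++) {
        for (int i = 0; i < WIDTH; i++)
            putchar(cur[i] ? '#' : '.');
        putchar('\n');

        for (int i = 0; i < WIDTH; i++) {
            int left   = cur[(i + WIDTH - 1) % WIDTH];
            int center = cur[i];
            int right  = cur[(i + 1) % WIDTH];
            int on_count = left + center + right;
            /* ON next generation if exactly one of the three is ON, or if
             * the cell and its right neighbor (but not its left) are ON. */
            next[i] = (on_count == 1) || (center && right && !left);
        }
        memcpy(cur, next, sizeof(cur));
    }
    return 0;
}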
This is the so-called Rule 30, one of the 256 possible rules for this variety of CA, with one spatial dimension, binary cells, and nearest-neighbor locality. When you run it out for a few generations, you begin to see something curious. The sequence of cells expands and
seems to get increasingly more complex. It doesn't just get more ornate, it seems to really increase in complexity. It is, apparently, chaotic. The central 1-bit-wide slice of the expansion of Rule 30 gives a series of bits that are, for practical purposes, random; in fact, Mathematica uses Rule 30 as a random-number generator.

Simply Deep

Wolfram believes that there is something of great profundity in the fact that some of these simple rules with even simpler input can produce output of such great complexity. So complex and unpredictable, in fact, that it may be logically impossible to characterize the output of the nth iteration of such a rule in any way shorter than executing the rule n times and observing the result. In his fat book A New Kind of Science, Wolfram explores some of the implications of this. Along the way, he has something to say about everything from nanotechnology to consciousness. More to the point, Wolfram is convinced, and I think he's convincing, that these small programs have something to say about all these areas. Some such simple program, he hints, or at least I infer, could even be the engine that runs the universe, that makes the universe of this instant transform into the universe of the next instant. Wolfram continues to explore the implications of small programs, applying them to many problems in different areas. And now he's applied them to the problem (if you accept that this is a problem requiring a solution) of producing new cell phone ringtones.

Mass Customization

In July, I got what looked like a color swatch book in the mail. Art directors and interior decorators know what I mean: 1-3/4×8-inch cards, bolted at one corner to fan out, er, like a fan, with a different color on each card. But these cards had color patterns: Each displayed a narrow horizontal rectangle broken up into maybe 1500 squares of different colors, with runs of color, repetitions, patterns showing up in the limited palette chosen for each or in other features, but still basically random looking. On the back of the swatch was a URL, a guest username, and a password. Ah, a mystery. So I went to the site, WolframTones at http://tones.wolfram.com/. Sheet music for the 21st century, the pitch went. Beginning in October, WolframTones would, for a small fee, let anyone with web access create their own music in a matter of seconds.
This music-creation service is just one aspect of a trend that Wolfram has identified and wants to ride: mass customization. That is, providing made-to-order products and services on a mass-market basis at one-size-fits-all pricing.

Creating your own custom ringtone at WolframTones requires no knowledge of chords or scales or other music theory or practice. You generate a composition by clicking checkboxes and moving sliders to make selections on variables of musical style and tempo and scale, and assigning roles to multiple instruments. You don't have to know what these variables mean, because you can experiment with them to find something you like. You also specify the parameters of the mathematical rule underlying the composition: one of several billion different 5-neighbor cellular-automaton rules, its seed (or input value), plus a couple of other parameters. WolframTones uses the rule to generate a tune by grabbing an n-cell-wide swath down the center of the visual representation of the running rule and then rotating it so that the time axis is horizontal, as you'd expect in musical notation. It then maps aspects of the generated pattern to musical properties to produce the melodic sequence. WolframTones produces compositions with
multiple instruments, and each instrument can be mapped to different aspects of the pattern. Some sequences will be repetitive, but some will noodle on in endless variations within tight musical parameters.

Wolfram Science hopes to make money on this (and keep it all — no royalties) by charging for downloading ringtones via your carrier. And they intend to charge somewhere around $2. Hmmm. Well, the market has shown that it has no money sense, so the price may not be a problem. But will WolframTones really catch on in Ringtonia?

I Just Called to Feel Your Vibe

I guess I'm skeptical. The technology behind WolframTones is fascinating, and I'd like to see it employed, for example, in online games, where every time you get in a particular situation you hear some mathematical variation on a piece of music. Because the music is different each time, the game seems deep, but because it is recognizable as the same theme, it defines your locus in the game. Something like that. But ringtones? These WolframTones ringtones meet the individualization requirement brilliantly, because the tone patterns (melodies?) are essentially unique. As for the musicality
part, I'm not so sure. Does the marginally musical nature of WolframTones make it more or less annoying than real tunes? More or less memorable? I dunno. But there is still the group cohesion aspect, and on that measure, I'd say that Wolfram's unique ringtones fail precisely because of their uniqueness. And I suspect that the group cohesion thing is important. This is, after all, a communication device, and we can only communicate about shared experience. But I could be wrong.

Here's something else I could be wrong about: I can't help but think that the whole billion-dollar ringtone edifice might fall if creative minds focused on the vibrating option rather than the sound option. Cell phones that offer a vibrating notification option instead of the noise option raise the possibility of custom vibration patterns. And there's no need to stop with replacing ringtones: How about replacing ringback tones with vibration patterns, too? So, if you got a particularly nice vibeback when you called a particular number, might you call that number more often? And hope that they did not pick up too soon? I'm just thinking.

DDJ
EMBEDDED SPACE
Strong Language
by Ed Nisley
Ed's an EE, PE, and author in Poughkeepsie, NY. Contact him at ed.nisley@ieee.org with "Dr Dobbs" in the subject to avoid spam filters.

Desktop computer operating systems must be optimized for mass-market distribution to folks who have neither the interest nor the skill for customization. As a result, any given user may take advantage of only a fraction of the available programs, but every user will find a suitable fraction. For example, Windows versions differ largely in their go-fast stripes. The top bullet item for Windows Media Center boasts, "A clean, new look for menus, taskbars, and a host of new themes and screensavers," which tells you what's really important to their marketing folks. While the exterior may look different, it has much the same assortment of stuff under the hood.

Early GNU/Linux distributions took this to an extreme by installing not only the kitchen sink, but every sink created since primordial UNIX crawled ashore, plus a broad assortment of plumbing and sculpting tools should you become desirous of building your own sink. Recent distros have reined in this tendency, if only because a program that's not present can't become a security exposure. A secondary benefit accrues to new users suffering a case of "the crazy you get from too much choice" when confronted by a dozen editors, half a dozen browsers, and four CD burners. That's less of an issue with stock Windows boxes, which are downright feature-poor by comparison.

Classic embedded systems take this to the other extreme, sometimes consisting of a single program in ROM. Perhaps a decade ago, the need for an operating system became obvious in larger projects, and five years ago, Do-It-Yourself (DIY) operating-system designs reached the far limits of typical DIY ability. By now, old-line embedded OS and RTOS vendors are tarting up their offerings to attract DIY designers who'd otherwise pick Linux, embedded Linux vendors are adding real-time features to entice escapees from per-unit RTOS royalties, and doughty DIY folks are scratch-building Linux kernels.

One thing is clear, however: Larger and more complex embedded systems have
more places for things to go wrong. Data points from the desktop world can therefore help us predict problems in the embedded realm, if only we look closely at the trouble. While the sight may not be pretty, averting our eyes won't help.

High Warble

My desktop PCs run SuSE Linux, but I keep a token Windows laptop around for those few vital programs without a GNU/Linux equivalent. Therefore, I subscribe to both the SuSE Security Announcement and the Microsoft Security Bulletin e-mail lists, which provide regular doses of problems and fixes. Each problem includes a number that uniquely identifies it in the MITRE Corporation's Common Vulnerabilities and Exposures (CVE) list. Because the same problem may occur across many different versions and types of operating systems, the CVE number (oddly, with a CAN- prefix) helps reduce the inevitable confusion about which bug you're discussing. First, a pair of broadsides.

CAN-2005-1206: Buffer overflow in the Server Message Block (SMB) functionality for Microsoft Windows 2000, XP SP1, and SP2, and Server 2003 and SP1 allows remote attackers to execute arbitrary code via unknown vectors, aka the "Server Message Block Vulnerability."

CAN-2004-1137: Multiple vulnerabilities in the IGMP functionality for Linux kernel 2.4.22 to 2.4.28, and 2.6.x to 2.6.9, allow local and remote attackers to cause a denial of service or execute arbitrary code via (1) the ip_mc_source function, which decrements a counter to -1, or (2) the igmp_marksources function, which does not properly validate IGMP message parameters and performs an out-of-bounds read.
These errors permit a remote attacker, one not sitting at your keyboard, to execute arbitrary code, the current euphemism for "do a hostile takeover." Because any embedded system complex enough to require an operating system will also be networked, these are particularly severe errors. Direct network-to-kernel attacks are fairly rare, however, because kernel code receives a fair amount of scrutiny. Generally, an attacker must compromise another networked program, which is particularly easy on the desktop: Folks really like their music.
CAN-2004-0258: Multiple buffer overflows in RealOne Player, RealOne Player 2.0, RealOne Enterprise Desktop, and RealPlayer Enterprise allow remote attackers to execute arbitrary code via malformed (1) .RP, (2) .RT, (3) .RAM, (4) .RPM or (5) .SMIL files.
If you can imagine a large-scale embedded system running a music player, it's equally easy to see why it might run unattended. If not music, then surely the system uses pictures?

CAN-2005-1211: Buffer overflow in the PNG image rendering component of Microsoft Internet Explorer allows remote attackers to execute arbitrary code via a crafted PNG file.
Now we're talking! Internet Explorer is less a separate program than a thin GUI film atop a seething mass of Windows DLLs. Because DLLs were intended for reuse, I suspect any program dealing with PNGs will smack into that error. In any event, if an attacker can force-feed a bad PNG into that DLL, your system's control goes with it.

CAN-2004-0981: Buffer overflow in the EXIF parsing routine in ImageMagick before 6.1.0 allows remote attackers to execute arbitrary code via a certain image file.
That trick requires more effort than with IE, but I can see an embedded system doing image processing for other boxes. Notice any similarities so far?

Programmed Injection

Many help-desk sites use NNTP news servers for support discussions. If you can imagine reading news from an embedded system, here's an attack path.

CAN-2005-1213: Stack-based buffer overflow in the news reader for Microsoft Outlook Express (MSOE.DLL) 5.5 SP2, 6, and 6 SP1 allows remote malicious NNTP servers to execute arbitrary code via a LIST response with a long second field.
Do you prefer instant messaging?

CAN-2005-1261: Stack-based buffer overflow in the URL parsing function in Gaim before 1.3.0 allows remote attackers to execute arbitrary code via an instant message (IM) with a large URL.
Both of these seem to require a successful server compromise to put an attacker on the far end, or at least a deliberate conversation with a Black Hat. Alas, even if an embedded system has fixed URLs for its upstream servers so that it will http://www.ddj.com
fetch and store data only in machines you control, there's another route into your system. When you use the Internet and OPC, you're dependent on everything working correctly, even stuff you didn't know existed.

CAN-2000-1218: The default configuration for the domain name resolver for Microsoft Windows 98, NT 4.0, 2000, and XP sets the QueryIpMatching parameter to 0, which causes Windows to accept DNS updates from hosts that it did not query, which allows remote attackers to poison the DNS cache.
Similar flaws in UNIX-flavored DNS server code permitted an exploit at the nameserver, allowing an attacker to redirect your packets to a hostile server without the inconvenience of first cracking your box. Given control of your data, the rest is easy. It's even easier getting in when the door isn't locked.

CVE-2002-0676: SoftwareUpdate for MacOS 10.1.x does not use authentication when downloading a software update, which could allow remote attackers to execute arbitrary code by posing as the Apple update
server via techniques such as DNS spoofing or cache poisoning, and supplying Trojan Horse updates.
Both of these sound like errors of omission, rather than coding errors, but I remain amazed that nobody's pulled off a similar stunt with Windows Update or Symantec Liveupdate. Should you plan to keep your embedded systems patched using a similar mechanism, remember that strong crypto forms a necessary, but not sufficient, foundation.

Exploitation

Common wisdom has it that UNIX systems will suffer less damage from an attack than a Windows box. Although Windows distinguishes between "Administrators" and "Users," in actual practice, you can't get much done without being an Administrator and some programs simply won't run for ordinary Users. As a result, Windows users generally have full Administrator privileges that translate any rogue code into a total system compromise. UNIX-style users have severely limited access to the system's gizzard and, because that's the way it's supposed to be, users can run all the programs they should
Dr. Ecco Solution

Solution to "Maximum Lottery," DDJ, September 2005.

Following the idea of the sultan's daughters problem, we will express our protocols by three numbers in increasing order: x, y, and z. The idea is to reject the first x, then choose the next one better than those first x. Call the position of that one p1 (so p1>x). If p1>y, then choose the next one better than any seen before (at a position we'll call p2). Otherwise, reject until position y and then pick the next one better than any seen before (at a position we'll still call p2). If p2>z, then choose the next one better than any seen before (at a position called p3). Otherwise, reject until position z and then pick the next one better than any seen before (at a position we'll call p3). For 100 balls in total and three keeps, you win roughly 68 percent of the time if you set x, y, and z to be 14, 32, and 64, respectively (a simulation sketch appears at the end of this sidebar). Here is one example win for the following sequence:

3 78 80 90 25 95 51 27 57 40 65 48 55 72 26 73 54 31 15 2 89 61 97 98 8 50 38 18 88 52 4 42 68 16 62 9 94 99 20 28 56 58 76 93 10 96 63 35 81 91 66 11 30 5 0 24
The p1 position value in this case would be 23, where the value 97 is found, because 97 is the first value larger than 95, which is the largest of the first 14 numbers. The p2 value would be 38, where 99 is found, and the third keep is then irrelevant.

Reader Improvements Concerning "Treasure Arrow"

Alan Dragoo pointed out that the delta is 15 centimeters, so the arrow and plank each weigh 75 kilograms. Carl Smotricz noted that because the arrow is not quite horizontal, the horizontal distance between the two ends of the arrow is less than 10 meters. In fact, it is √(10² – (0.6)²) = 9.98. So, the arrowhead has dipped 0.6 meters for each 9.98 meters of horizontal distance. Continuing this line (Carl used trigonometry, but let's do this in a more elementary way), there would be a further dip of some amount x for the next 10 meters of horizontal distance. So, x = 0.6×(10/9.98). This gives an extra 0.601 meters. So the arrow, in fact, points to 2.201 meters below the ceiling.

DDJ
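For readers who want to check the roughly-68-percent figure, here is a minimal Monte Carlo sketch of one reading of the protocol. Only the thresholds 14, 32, and 64 come from the solution above; the win condition (one of the three keeps is the overall maximum), the trial count, and the use of rand() are assumptions made for illustration.

/* Hypothetical Monte Carlo check of the three-keep protocol described above.
 * This encodes one reading of the rules; it is not Dr. Ecco's own program. */
#include <stdio.h>
#include <stdlib.h>

#define BALLS  100
#define TRIALS 200000

static void shuffle(int *a, int n)
{
    for (int i = n - 1; i > 0; i--) {      /* Fisher-Yates shuffle */
        int j = rand() % (i + 1);
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}

int main(void)
{
    const int X = 14, Y = 32, Z = 64;      /* reject-thresholds from the solution */
    int balls[BALLS], wins = 0;

    for (int i = 0; i < BALLS; i++)
        balls[i] = i;                      /* BALLS-1 is the maximum value */

    for (int t = 0; t < TRIALS; t++) {
        shuffle(balls, BALLS);
        int best_seen = -1, keeps = 0, kept_max = 0;
        int wait = X;                      /* positions 1..wait are rejected */

        for (int pos = 1; pos <= BALLS && keeps < 3; pos++) {
            int v = balls[pos - 1];
            if (v > best_seen) {
                if (pos > wait) {          /* keep this record */
                    keeps++;
                    if (v == BALLS - 1)
                        kept_max = 1;
                    /* The next keep waits until Y (then Z) unless we are
                     * already past that threshold. */
                    if (keeps == 1)
                        wait = (pos > Y) ? pos : Y;
                    else if (keeps == 2)
                        wait = (pos > Z) ? pos : Z;
                }
                best_seen = v;             /* records update even when rejected */
            }
        }
        wins += kept_max;
    }
    printf("estimated win rate: %.3f\n", (double)wins / TRIALS);
    return 0;
}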
and none of the rest. Even so, if a user has write access to files used by others, an attacker can cause considerable damage without killing the system. Unfortunately, even on Linux boxes, once an attacker gains any access to the system there's little standing in the path to root-dom. If all else fails, a carefully scripted crash will work.

CAN-2005-1263: The elf_core_dump function in binfmt_elf.c for Linux kernel 2.x.x to 2.2.27-rc2, 2.4.x to 2.4.31-pre1, and 2.6.x to 2.6.12-rc4 allows local users to execute arbitrary code via an ELF binary that, in certain conditions involving the create_elf_tables function, causes a negative length argument to pass a signed integer comparison, leading to a buffer overflow.
Common Causes

If your similarity sense is tingling by now, there's a good reason. With the possible exception of those DNS and Mac errors, the attacks come down to buffer overflows in C programs. It's easy to blame this on poor coding practices, but the real problem lies elsewhere.

Ideally, the language you pick for your next application would be the one that simplifies the solution so that your code becomes obviously correct. Yes, you can write a GUI app in Assembler or a business application in APL, but that's going about it the hard way. A quick check shows that something like 2500-odd programming languages have come and gone in the last half-century. Although old languages rarely die or completely fade away, face it, APL just isn't in your future. Assembler, now, maybe that's a different story. Anyhow, Hobson's Choice dictates using either C++ (C# for folks in the dot-Net gulag) or Java. Deep embedded stuff still happens in C, with C++ gaining traction for big projects. You'll find a zillion other languages complete with zealous partisans, some of whom I'll hear from within the next week, but to a good first approximation, it's a C-style language or nothing.

And there's the problem. As Kernighan and Ritchie put it in The C Programming Language, "C is a relatively 'low-level' language." Before you can tackle large-scale problems, you must first acquire the scaffolding required to support more abstract concepts. Unfortunately, while you're concentrating on the abstractions, the low-level details still matter. C++ slathers a (massive) layer of abstraction atop the same low-level language substrate to provide the worst of both worlds: the arrogance of high-level semantics deployed on fragile syntax. Even a trivial mistake can clobber the entire system, generally providing good war-story fodder.
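To see how little it takes, here is a deliberately generic sketch of the pattern behind most of the advisories quoted above (a fixed-size buffer filled from externally supplied data with no length check), together with one bounded alternative. It is a teaching illustration, not code from any product named in this column.

/* Generic illustration of the classic stack buffer overflow, plus a bounded
 * alternative. This is a sketch, not code from any product mentioned here. */
#include <stdio.h>
#include <string.h>

/* BROKEN: if 'field' (say, a field parsed from a network response) is longer
 * than 15 characters plus the terminator, strcpy() writes past 'name' and
 * tramples whatever follows it on the stack, including the saved return
 * address. That is the setup for "execute arbitrary code." */
void record_name_unsafe(const char *field)
{
    char name[16];
    strcpy(name, field);              /* no length check at all */
    printf("hello, %s\n", name);
}

/* Safer: refuse oversized input and always terminate the buffer. */
void record_name_bounded(const char *field)
{
    char name[16];
    if (strlen(field) >= sizeof(name)) {
        fprintf(stderr, "field too long, dropping it\n");
        return;
    }
    strncpy(name, field, sizeof(name) - 1);
    name[sizeof(name) - 1] = '\0';    /* strncpy() does not guarantee this */
    printf("hello, %s\n", name);
}

Compilers with warnings cranked up, Lint, and similar static checkers flag exactly this pattern, which is part of the point of the ground rules that follow.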
CAN-2005-1974: Unknown vulnerability in Java 2 Platform, Standard Edition (J2SE) 5.0 and 5.0 Update 1 and J2SE 1.4.2 up to 1.4.2_07 allows applications to assign permissions to themselves and gain privileges.
Sun’s description adds a few more details to that spare note. A vulnerability in the Java Runtime Environment may allow an untrusted applet to elevate its privileges. For example, an applet may grant itself permissions to read and write local files or execute local applications that are accessible to the user running the untrusted applet.
Even if you eschew C, it can still clobber you.

Ground Rules

While it's easy to blow off desktop problems as irrelevant to embedded systems, that's simply ignoring some conspicuous evidence of trouble ahead. None of the problems in the MITRE list got there because somebody thought it'd be cool to leave that error in the code, okay? If you're using C, which is highly likely for a typical embedded system, you must assume that the language is working against you. Establish a style guide, turn on all the compiler's checks, use Lint, shake your heap, do everything you can to flush out errors before they hit the field. Keep in mind that all those errors came from folks who are at least as smart as you are and who thought they knew what they were doing, too.

If an attacker can gain sufficient access to your system to run a program, even a crash can compromise the kernel. An unattended box that automatically restarts after a crash may remove any evidence of the compromise, while leaving the attacker in control.

If your embedded system controls a valuable device or provides a critical service, it will attract more highly motivated attackers than a typical desktop box. Worse, they may value the system for reasons that have nothing to do with its code or data, as demonstrated by the myriad compromised Windows boxes on the Internet. Indeed, a clever attacker may leave the normal functions running, while siphoning off CPU cycles and a little network bandwidth. Perhaps the recent decrease in zombie attacks simply means a million PCs are quietly cracking somebody's crypto instead?

If your device depends on an Internet-facing server for critical functions or updates, your security strategy must succeed despite a complete compromise of the other end of the conversation. In fact, you cannot assume the Internet infrastructure has your best interests in mind.

Even if you're using, say, Oberon (possible, but so weird that you'll certainly confuse the attackers, not to mention any bystanders), keep in mind that
any system has flaws. You must provide an in-depth defense, so that a single catastrophic failure doesn't provide a direct route to the kernel. That'll make for a good story, too.

Reentry Checklist

Windows Media Center features appear at http://www.microsoft.com/windowsxp/mediacenter/evaluation/features.mspx. That quote's from Joni Mitchell in "Barandgrill" on her For the Roses album. The Microsoft TechNet Security Center is at http://www.microsoft.com/technet/Security/default.mspx and SuSE's security announcements are at http://www.novell.com/linux/security/securitysupport.html. Mitre's Common Vulnerabilities and Exposures list is at http://cve.mitre.org/. The
CAN-prefixed numbers I use will change to CVE-prefixed numbers at about the time you read this, so adjust your searches accordingly. The first four digits of each CVE number indicate when the number was allocated, so it's entirely possible to have an old-looking CVE record holding a new error.

More than you want to know about APL starts at http://burks.bton.ac.uk/burks/language/apl/. You can find your favorite language at http://people.ku.edu/~nkinners/LangList/Extras/langlist.htm and download a genealogy poster from http://www.oreilly.com/pub/a/oreilly/news/languageposter_0504.html. The Oberon home page is at http://www.oberon.ethz.ch/.

DDJ
CHAOS MANOR
Deep Impact
by Jerry Pournelle
NASA's Project Deep Impact (http://deepimpact.jpl.nasa.gov/), which provided us a first look inside a comet, was a triumph and reminiscent of the old glory days when we expected everything and got more. NASA's Project Deep Impact wasn't quite like Deep Impact, the movie, in which Robert Duvall brilliantly played a thinly disguised Pete Conrad, but it was exciting enough, and the fountain of matter spewing from the comet should tell us a lot about what comets are made of. At the moment, there's so much dust, it's a bit early to tell, but "dirty snowball" (or Hot Fudge Sundae) still seems reasonable. Whatever the answer, we have asked the question properly, and NASA and JPL deserve plenty of credit. We can also congratulate them on Pathfinder and Sojourner, which continue to operate long past their planned useful lives.

Perhaps the awful times are over. No more sending up probes with part of the team calculating in English units while others insist on Metric, and neither bothering to tell the other. Or the probe that locked its landing gear in plenty of time for landing, while another team used the "landing gear locked" as a signal to shut off the engines, thus dropping a Mars Lander from 50,000 feet (or 15,240 meters) onto the surface. I mention these embarrassments not to detract from the latest triumphs, but to remind everyone that without some adult supervision, even the smartest people can do some awfully silly things. Just like computers.

Also, I confess some gratification when both scientists and science writers talked about Lucifer's Hammer, the novel I wrote with Larry Niven. Alas, it is also clear that NASA thinks it will take a very long time and a lot of money to return to the Moon, much less to go to Mars. As one of my readers put it, every man who has walked on the Moon may well be dead before another goes there. And yet, I remember when we had never been there at all, and I was thought mad to say I would live until we got there.
Jerry is a science-fiction writer and senior contributing editor to BYTE.com. You can contact him at [email protected].

The Future of Weapons of Mass Destruction

Deep Impact the Project wasn't the only science-fiction-like event in the past few
weeks. In the summer, I traveled to England to take part in a conference sponsored jointly by the Defense Departments of the United States and the United Kingdom on the "Future of Weapons of Mass Destruction." The speakers included not only senior police, military, and civil-service experts involved with the problem, but also, notably, my colleagues Orson Scott Card, Wil McCarthy, Allen Steele, and Vernor Vinge. The conference took place in the Wilton Park Conference Center, which was founded by Winston Churchill, who viewed it as a tool for reaching international cooperation, and conferees came from many parts of the world. House rules forbid me to quote any of the conferees. There will be a conference report written by the Center Director, Dr. Richard Latter, and I'll let you know when it's available.

The conference was as much for the education and exchange of views of the participants as anything else; while science-fiction writers don't have much responsibility for dealing with the future other than speculating about it, most of the participants do. There were senior police, diplomatic, arms control, and compliance officers responsible for being sure that the military doesn't violate international conventions, and other such officials. I think some of them were stunned by the list of potential threats my colleagues and I were able to generate, but everyone said it had been a good investment of their time. I had been home less than a week when the bombs went off in London. Had that happened during the conference, it would probably have changed the nature of the discussions. Then perhaps not. They were pretty good without that stimulus.

Inventing the Future

I can't talk about what the others said, but I can give you a summary of part of my presentation. Long ago, Dandridge Cole said we cannot predict the future, but we can invent it. It's a good point. The best way to get where we would like to be is to see where we are, decide where we want to be, and invest in the technologies that will get us there. This seems like a good thing to do. We may also get places we had not intended or even dreamed of.

In 1964, I was the General Editor of a classified USAF Systems Command study called "Project 75." This was conducted by the San
Bernardino campus of the Aerospace Corporation and was done by the Ballistic Systems Division in conjunction with the Air Systems Division's companion study, Project Forecast, directed by Col. Francis X. Kane. As to why San Bernardino, General Schriever wanted this study group as far from the Pentagon as you could get and still be in the continental U.S. To get to San Bernardino, you had to go to Los Angeles, then drive back a hundred miles into the low desert. It was also a place few wanted to go.

Project 75 was ambitious. The goal was to look at everything we knew about ballistic missiles, including what we knew about the Soviet programs, then project that to the year 1975. We would then look at what we'd need in 1975 to fulfill the USAF missions, and that would help determine what technologies we should begin developing in 1965 so we would have in 1975 the future — at least, in the realm of ballistic missiles — we wanted. The study was large and highly classified, and many of those who worked on it, including me, were not authorized to see the end product (even though, in my case, I had written just about every word in it — which makes more sense than is apparent at first sight).

There were a number of conclusions and recommendations for new technology development, but one stood out — we needed better accuracies at intercontinental range. If you want to hit the other guy's weapons and minimize damage to his cities, you want to use small accurate birds rather than monster nukes. One way to get that accuracy was to develop better inertial platforms with smaller and more accurate gyros so that the missile knew where it was at all times. We were already working on that, moving inertial gyros from basketball-sized with mechanical coupling to grapefruit-sized with laser data acquisition. The next step was to make use of that better position information, and the only way to do that was with onboard guidance computers. (A moment's thought will show that you don't dare allow an ICBM to accept midcourse corrections from ground bases.) We recommended development of onboard guidance computers. That required computers that were much smaller and lighter than any then in existence. This, in turn, required Large-Scale Integrated
Circuits. Accordingly, the Department of Defense directed investments into LSIC technologies. The result was better onboard guidance, and thus, far more accurate missiles, which was the future we were trying to invent — but there were other results. The work led to the 4004 chip, then the 8080 by way of the 8008. That was good enough to power small general-purpose computers, such as the Altair, Sphere, Imsai — all now on display in the Smithsonian, right next to Ezekiel, my old friend who happened to be a computer. The personal computer was born. So was BYTE magazine and Dr. Dobb's Journal and this column.

In 1980, I predicted that as a result of the small computer revolution, "by the Year 2000, everyone in Western Civilization will be able to get the answer to any question that has an answer." That happened pretty well on schedule. But while Project 75 led, pretty directly, to the computer revolution that produced the Internet and World Wide Web and had enormous impact on the lives and occupations of most of the inhabitants of Western Civilization, that wasn't what we set out to do. All we were trying to do was make our missiles more accurate. The conclusion is obvious: You can invent the future, but you can't predict it even as you are inventing it.

Vernor Vinge and Wil McCarthy said much the same things using different examples. McCarthy is president of a nanotechnology company, so naturally he raised the question of threats like "gray goo," in which nanobots seek to convert everything they can find into copies of themselves. That got me thinking along the lines of Fred Saberhagen's berserkers — war robots that seek to eliminate all life in the universe — and what they might do with nanobots. That led to wondering if "gray goo" could evolve once it had destroyed everything else. The resulting discussion was continued at dinner and was pretty terrifying, but was probably more interesting to the science-fiction writers than the police and military policy people.

Installing XP64: A Journey Worth Taking

We've been testing out a Tyan Tiger K8WE-based system and a Hewlett-Packard xw9300 workstation, both dual Opteron systems. Both are based on the same nVidia nForce Professional chipset, and in fact, the motherboards were developed in parallel by Tyan and HP. Until recently, these systems didn't have comprehensive 64-bit XP drivers, so all that spacious RAM (4 and 8 GB!) was going to waste. Once the 64-bit drivers were available, we decided to experiment with the Tyan system, setting it up for dual boot
The story has a happy ending, but like so much around Chaos Manor, it took some detours to get there.

The Tyan system (Atlas, by name) is built in an Antec Server Style tower case with a 550-watt TruePower ATX12V 2.0 power supply. The power supply was a bit of a challenge, and it is necessary to run dual video cards. Atlas has an nVidia PCI-E Quadro FX 4400 and a PNY Quadro FX 540 (also nVidia-based) card, both of which require a 6-pin PCI-Express power connector. This is not the familiar +5/+12V disk-drive connector, but rather a new squarish 6-pin connector that looks just like a P4 motherboard power connector. (This connector can be seen at http://images.anandtech.com/reviews/shows/computex/2004/nv45update/powerconnector.jpg.)

Both systems have been set up with 32-bit Windows XP for several months, and have proven to be very stable performers. They are not remarkable merely for their greater memory, faster CPU, bigger disks, or stratospheric video speed, but rather for the entire combination; while the lulling incrementalism of the past few years of computing improvements seems boring, occasionally we realize that, no, this batch of machines really is more useful than those of two years back, particularly for high-performance tasks. Still, the increase hasn't been particularly big; we've been waiting for the 64-bit revolution, which is only now making its presence felt.

Alex Pournelle decided to see if he could make Atlas dual-boot; for now, keeping the 32-bit Windows install is essential, because David Em uses this machine as his secondary creative station. Fortunately, we had the luxury of both Hitachi Serial ATA and Seagate SCSI-320 drives connected to the motherboard. The S-ATA drives were the boot and data drives, with the SCSI drives as secondary data drives. We thought to use one of the SCSI drives as the 64-bit boot drive.

64-bit Windows XP installs identically to 32-bit; the entire first-pass procedure runs almost indistinguishably, then the system reboots and tries to start up. In this case, "tries" is the operative word. While the 32-bit partition started fine, trying the 64-bit one just crashed the machine. A bit of sober reflection revealed the reason: boot block limitations. With the primary boot drive being Serial ATA, the boot block couldn't properly address a drive of a different technology type (SCSI), and got confused trying. So back to square one.

We cleared off the second S-ATA drive, copying its contents to the SCSI drive instead, rebooted with the install CD again, and reinstalled to the second S-ATA drive.
The first-pass install finished; reboot again. Second-pass installation proceeded for a while, but then the Windows installer suddenly couldn't find some files that were obviously on the CD. Worse, it couldn't find the CD drive at all! Dan Spisak took over and diagnosed the problem in about five minutes: The Seagate external FireWire drive was connected to this computer, and it was confusing the Windows installer because the drives had been relettered after startup. (Remember that, during installation, there isn't a complete Windows, and the installer has only limited intelligence for such problems.) Note also that this problem is related to the difficulty RAID 1 machines have booting up when there are USB or FireWire external drives connected. BIOS writers and Microsoft need to get together to fix this once and for all.

Moral of this story: Remove all removable drives and media during installation, and save yourself the hassle. Ignore the temptation to connect everything at once, and wait until after Windows is stable before adding anything. This applies to all "new" installations, and XP64 counts as a new installation, even if it's on a computer that already has XP on it. The "new installation" rule applies to applications, as well.

Winding Down

The computer book of the month is Robert and Barbara Thompson's Astronomy Hacks (O'Reilly & Associates, 2005). I'm not an amateur astronomer, although I've often thought I'd like to try it. For those in the same situation, this book will tell you whether it's worth your while, and if so, not only how to get started but how to be pretty good at it. If you're already deep into amateur astronomy, I'm not competent to judge whether you need this book because I don't know how much you know; but having known Thompson for a while, I'd be astonished if you didn't find out things you never knew in every chapter.

I asked him to pick out a representative hack, and he chose #29 as his favorite: "Plan and Prepare for a Messier Marathon: Locate, observe, and log all 110 Messier Objects in one night." I know just enough to know what Messier Objects are, but I'm certain I have never seen all 110 of them. I probably could have when I did my turn on the board of the Lowell Observatory, but I was too busy getting Shoemaker a computer. Ah, well. As with all the books Bob Thompson does with his wife, it's both technically competent and very readable.

DDJ
PROGRAMMER’S BOOKSHELF
Crunchin' That Data
by Michelle Levesque
Data Crunching: Solving Everyday Problems Using Java, Python, and More
Greg Wilson
Pragmatic Bookshelf, 2005
188 pp., $29.95
ISBN 0974514071

A few weeks ago, a customer handed me some files that I had requested. We had previously agreed upon a convention for storing date information in these files and had decided that every file would include the date as its first line in the format YY-MM-DD, such as 05-06-23. When the files were given to me, however, the first line was in the form DD-Month-YYYY, such as 23-June-2005, and sometimes the file contained a blank line or two before the date. There were hundreds of files, so fixing this problem was too big a job to do by hand, but it wasn't a task that I planned on repeating several times, so it didn't warrant a lengthy design and development cycle. (A short script for exactly this sort of fix appears at the end of this review.) This type of small data manipulation task falls right into the realm of a recent addition to the Pragmatic Bookshelf, Greg Wilson's Data Crunching.

The book promises a pragmatic look at some of the most useful data-crunching techniques, and its delivery on this promise is stellar. From cover to cover, Data Crunching provides an exceptionally practical look at how to save time and effort when it comes to doing that "other stuff" that seems to creep up on every project. (In the spirit of full disclosure, I need to mention that Greg Wilson is an Adjunct Professor in Computer Science at the University of Toronto, where I am an undergraduate student. Wilson is also a DDJ contributing editor.)

Wilson's clear, concise (and often humorous) writing makes it easy to linearly consume its 188 pages, but the book's examples and structured layout make it equally valuable as a reference text. Wilson dedicates a chapter to each of the most common aspects of data crunching: text files, regular expressions, XML, binary files, and relational databases. There's also a chapter on unit testing, dates and times, encoding, and other "horseshoe nails" that are described as "apparently trivial things that can bring the whole system crashing down when they go wrong."

Though all of the important data-crunching techniques and idioms are included in this remarkably succinct book, its real strength comes from the fact that it never leaves the real world behind. Real-world programmers have to work on multiple platforms, and so the examples are appropriately platform independent. Real-world programmers code in dozens of different languages, and though the book's examples are mostly Java and Python, the wisdom behind them transcends any one specific programming language. And, most importantly, real-world data is messy, and so Data Crunching continuously reminds you that users will add in capital letters where you didn't expect them, and edge cases can't be forgotten. Incomplete and unexpected data are realities of data crunching, and rather than avoid the issue, the author jumps right into it and begins to explain how to deal with it.

Each of the book's topics is taught through clear, practical examples. The examples and code are simple enough to be understandable upon first read but never feel trivial. Each one is powerful enough to be directly reused or altered slightly to solve some future problem. The author does a remarkable job of justifying his decisions at each point along his examples' narratives, and always favors the practical approach over one that might be found in other data-manipulation texts. For example, he covers not only unit testing, but also some simpler alternatives for when an entire testing infrastructure would be overkill.

The listed skill range for this book is beginner to intermediate, but I would argue that it's appropriate for every developer, whether as a first-time instructional tool or a reference guide for the seasoned professional. Wilson's text and examples have been woven together into 188 succinct pages of wisdom and pragmatic advice for programmers of all levels. My copy of Data Crunching lives on top of my computer monitor, where it's within arm's length at all times. Regardless of what computing field you're in, you'll find this book to be valuable. Data manipulation tasks won't ever go away, but this book provides the strategy and mindset necessary to spend less time on data crunching and more time on the rest of your programming.

Michelle is a computer-science student at the University of Toronto. She can be contacted at [email protected].

DDJ
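To make the opening anecdote concrete, here is a minimal sketch of the kind of throwaway script the book is about, written in Python (one of the book's two main languages) but not taken from it. The incoming/*.txt file pattern, and the assumption that everything after the date line should pass through unchanged, are illustrative guesses rather than details from the review.

# normalize_dates.py -- hypothetical cleanup script; the file pattern is assumed.
# Rewrites the first nonblank line of each file from DD-Month-YYYY
# (for example, 23-June-2005) to YY-MM-DD (for example, 05-06-23),
# dropping any stray blank lines that precede the date.
import glob
from datetime import datetime

for path in glob.glob("incoming/*.txt"):
    with open(path) as f:
        lines = f.readlines()

    # Find the first nonblank line; that is where the date should be.
    for i, line in enumerate(lines):
        if line.strip():
            date = datetime.strptime(line.strip(), "%d-%B-%Y")
            lines[i] = date.strftime("%y-%m-%d") + "\n"
            break
    else:
        continue  # file is entirely blank; leave it alone

    with open(path, "w") as f:
        f.writelines(lines[i:])  # lines[i:] drops the leading blank lines

A file whose first nonblank line isn't in the expected form raises a ValueError rather than being silently rewritten, which for a one-off job like this is usually what you want.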
OF INTEREST
Wibu-Systems AG has launched Wibu-Key 5.0 and CodeMeter 2.10, updated versions of its antipiracy tools. Wibu-Key offers copy protection and license management using hardware-based encryption. Wibu-Key 5.0 supports "Hot Plugging," which lets you plug Wibu-Boxes in and out at any time. The CodeMeter 2.10 Digital Rights Management system includes a daemon mode that can be used for special system services. At the heart of CodeMeter is the CM-Stick, a USB-based encryption and storage device. The CM-Stick is available without a flash disk or with memory sizes starting at 128 MB. Wibu-Key and CodeMeter are available for Windows, Mac, and Linux.
Wibu-Systems AG, Rueppurrer Strasse 52–54, 76137 Karlsruhe, Germany; +49-721-93172-0; http://www.wibu.com/
HiT Software has released Allora 4.1, its XML-to-database integration middleware. Allora lets you export data from any relational database to XML, or map and transform XML data to relational data. Among other features, Allora 4.1 introduces a multiple SELECT feature, which lets you work with multiple submaps that are then joined in real time with XSL. The multiple SQL query capability also lets Allora compete in the heterogeneous database migration market: Rather than using database-specific SQL dumps or flat files that cannot contain table relationships or constraints, the complete database structure and data can be persisted into XML for easy access or transport, and recreated on any other database platform at minimal cost. Other enhancements in the Allora 4.1 Mapper include support for namespace definitions, complex database expressions, NetBeans 4.1, and stored procedures in Oracle packages.
HiT Software Inc., 4020 Moorpark Avenue, Suite 100, San Jose, CA 95117; +1-408-345-4001; http://www.hitsw.com/

Red Gate Software has released SQL Log Rescue, which enables undo and redo of individual SQL Server database transactions. SQL Log Rescue examines both backup files and live transaction logs to ensure full data recovery.
Red Gate Software, St John's Innovation Centre, Cowley Road, Cambridge CB4 0WS, United Kingdom; +1-866-733-4283; http://www.red-gate.com/

SplineTech has released JavaScript HTML Debugger, which lets you edit and debug JavaScript and VBScript inside HTML pages without inserting additional lines of code to handle the debugging process. Client-side JavaScript, JScript, and VBScript are fully supported for simple and complex HTML and DHTML debugging scenarios.
Spline Technologies Corp., 801–110 rue de La Barre, Longueuil (G. Montreal area), Quebec, Canada J4K 1A3; +1-514-907-1677; http://www.RemoteDebugger.com/

PrismTech has announced the launch of Spectra Power Tools, a productivity suite that addresses the software development and deployment lifecycle for Software Defined Radio (SDR) developers. Spectra Modeling Tools provide a visual approach to waveform and radio platform development. They support modular development, the outsourcing of relevant functionality (if desired), and a common reference methodology for developers and systems engineers. Furthermore, Spectra Power Tools inherently facilitate reconfiguration and reuse, thus independently extending the life of both the radio platform and software applications (waveforms).
PrismTech Corp., 6 Lincoln Knoll Lane, Suite 100, Burlington, MA 01803; +1-781-270-1177; http://www.prismtech.com/

OC Systems has released Hitchhiker for Eclipse, an Eclipse plug-in and runtime engine that provides tracing, profiling, memory-leak tracking, and function coverage tools for C/C++ applications. Hitchhiker collects performance and control flow data by automatically inserting machine code instrumentation into an application. Data collection is configured using the Eclipse Workbench. The application can be launched either manually on the target machine or from the Eclipse Workbench; in either case, Hitchhiker detects and instruments the target application. Hitchhiker is nonintrusive; the application can be traced and profiled at near full execution speed. Collected data is sent back to the Eclipse Workbench and viewed in real time using a variety of data visualization schemes.
OC Systems Inc., 9990 Lee Highway, Suite 270, Fairfax, VA 22030; +1-703-359-8160; http://www.ocsystems.com/eclipse/

SearchBlackBox Software has released SearchBlackBox SDK 1.0, a library that lets you add full-text search capabilities to .NET applications in only a few lines of code. SearchBlackBox SDK is a C#-based native .NET assembly useful in a broad range of applications such as web site search fields, online documentation, document management systems, content search solutions, and more. SearchBlackBox SDK does not require any preinstalled ActiveX or COM objects and is packaged in one DLL file.
SearchBlackBox Software, Generala Antonova 4-2-166, 117279 Moscow, Russia; http://www.searchblackbox.com/

Catalyst Systems has released Version 6.4 of its Openmake tool. Openmake 6.4 supports IBM- and Eclipse-based software development and Perl environments. Openmake replaces make and Ant/XML scripts with generated Build Control Files that follow comprehensive construction rules. Openmake is designed to locally and remotely build components destined for a variety of deployment platforms, including embedded devices, handhelds, workstations, and servers.
Catalyst Systems Corp., 213 West Institute Place, #404, Chicago, IL 60610; +1-800-359-8049; http://www.openmake.com/

DDJ

Dr. Dobb's Software Tools Newsletter
What's the fastest way of keeping up with new developer products and version updates? Dr. Dobb's Software Tools e-mail newsletter, delivered once a month to your mailbox. This unique newsletter keeps you up-to-date on the latest in SDKs, libraries, components, compilers, and the like. To sign up now for this free service, go to http://www.ddj.com/maillists/.
SWAINE’S FLAMES
Advice to Merlin
I have been following with considerable interest the challenge that bloggers present to the mainstream media (MSM, to use the blogosphere acronym). On important news stories, political or technical, I now find that I need to track both the MSM and certain key bloggers to get anything like a true picture of events. I also find that my list of prime news sources changes from day to day and issue by issue. In all of this, I pretend to myself that I am a bemused observer rather than a confused participant.

I'd love to see the MSM respond to the challenge of blogs by getting better at its job and welcoming the bloggers as a force to keep it on its toes and honest. The optimist in me believes that this is possible. The cynic in me says that all the MSM seems to be picking up from the blogosphere is an independent attitude toward spelling and fact-checking. That same cynic notes that while the MSM gives us more channels of news today than in the 1950s, those channels are all controlled by a small clique of large corporations increasingly willing to shape the news to their partisan interests, and that they seem to be sliding down a steep decline in professionalism from the bright days of Edward R. Murrow to the present dark night of Bill O'Reilly. And the cynic thanks whatever cynics thank for Markos Moulitsas of the Daily Kos.

So naturally, when I heard reports that Apple was including Intel's Trusted Platform Module (TPM) security chip in its Intel-based Macintoshes, I skimmed the whole range of technology news and Apple-watcher sources for reactions. What I got was a correspondingly wide range of views, from "Let's wait and see" to "I'm having my Apple tattoo removed." The TPM chip contains a serial number that lets the OS check that it is running on particular hardware, and some of the news sources, such as the U.K.-based VNUnet, figure that this is Apple's reason for including the chip: to keep anyone from (too easily) installing Mac OS X on a non-Macintosh computer. Then, too, Microsoft is a member of the Trusted Computing Group that is behind the chip, and Open for Business suggests that Apple might be motivated by the belief that you will need a TCG-friendly chip in your computer to talk to Microsoft platforms in the future.

Neither of these scenarios is in itself particularly frightening. But paranoia about this chip and the organization behind it is entirely appropriate. This Trusted Computing business looks to me to be all about industry control over computer-user behavior, including the possibility of prohibiting perfectly legal behavior. And it opens doors best left closed, doors to government and corporate monitoring of individuals and remote censorship (Microsoft seems particularly interested in this capability). I won't try to summarize the concerns that have been raised about Trusted Computing here, nor will I prejudice your own research by telling you what sources to read; I will just suggest that you google "Trusted Computing."

Trusted Computing seems to me to be part of a larger threat, one of useful technology being subverted to the illegitimate power grabs of governments and corporations. Maybe your paranoia is of a different flavor than mine, and you'd point to the danger of white-collar criminals and international terrorists misusing technology. In any case, technology is power today, more than ever in history, and new technologies are emerging at a rapid pace.
And by and large, these new technologies are being released into the wild without thought for how they might be misused. If that was ever morally justifiable, I suggest that it isn't now. Those who build and deploy new technologies need to think about how they might be used. And a little paranoia is entirely appropriate when imagining the possible misuses of your technology.

When the news gets me down, I turn to fiction. I've just been rereading Roger Zelazny's ten-volume Amber series, and I came across this exchange between a software engineer/sorcerer and his mentor:

Mandor: "You designed a remarkable machine, and it never occurred to you it might also become a potent weapon."

Merlin: "You're right. I was more concerned with solving technical problems. I didn't think through all the consequences."

Me: You live in the world, Merlin. Think through the consequences.