Dr. Dobb's Journal
#378 NOVEMBER 2005
SOFTWARE TOOLS FOR THE PROFESSIONAL PROGRAMMER
http://www.ddj.com
$4.95US $6.95CAN

DISTRIBUTED COMPUTING
Inside the Media Grid
Parallel Processing & PVM
Debugging Distributed Apps
InfiniBand Technology

Linux Kernel Debugging
Improving Google Desktop Searches
Building Distributed Apps
Creating ASP.NET 2.0 Server Controls
Designing Enterprise Architectures
Functional Programming & Java
Java & RFID Tags
Jerry Pournelle: SIGGRAPH, the ACM's Special Interest Group on Graphics, is one of the key places to observe computer trends
Ed Nisley: Lego Mindstorms provides a stepping stone into the world of robotics
CONTENTS
NOVEMBER 2005 VOLUME 30, ISSUE 11
FEATURES

The Media Grid 16 by Aaron E. Walsh
The Media Grid is a digital media network infrastructure and software development platform based on distributed grid technology.
Parallel Processing Clusters & PVM 24 by David J. Powers
The Parallel Virtual Machine is network-clustering software that provides a scalable network for parallel processing.
Debugging Heterogeneous Distributed Applications 32 by Stephen B. Jenkins
Debugging complex, asynchronous, heterogeneous, distributed applications is hard. The techniques Stephen presents here make the process easier.
Building Internet Distributed Computing Systems 39 by Charles Peck, Joshua Hursey, Josh McCoy, & Vijay Pande
Our authors present a framework for harnessing distributed, tightly coupled cluster and SMP resources for computational science research.
InfiniBand Technology 42 by Corky Seeber
InfiniBand is a serial I/O interconnect architecture designed to connect hundreds — if not thousands — of computers.
Designing Enterprise Architectures 47 by David Houlding
Protégé is a tool that lets you efficiently map out an Enterprise Architecture to enable knowledge mining for analysis and planning.
Linux Kernel Debugging 51 by Dean A. Gereaux
Dean explains how to debug drivers with Linux Kernel Debugger, add hooks into KDB from your drivers, and create KDB modules.
Improving Search Precision Using Google Desktop Search 1.0 55 by Lawrence Reeve
The Google Desktop Search SDK lets you build plug-ins for extending Google’s Desktop Search local indexing and search service.
Functional Programming in Java 60 by Mark Zander
Generic Java lets you extend the language to gain some of the advantages of functional programming languages.
The CustomTreeView ASP.NET 2.0 Server Control 64 by Shahram Khosravi
CustomTreeView is a server control derived from the ASP.NET 2.0 TreeView server control that is used to display hierarchical data.

FORUM
EDITORIAL 8 by Jonathan Erickson
LETTERS 10 by you
DR. ECCO'S OMNIHEURIST CORNER 12 by Dennis E. Shasha
EMBEDDED SYSTEMS

Java & RFID Tags 68 by Shamshad Ansari
The Java Communication API lets you send commands to and receive responses from RFID readers such as the TI S2000 Micro Reader.
COLUMNS

Programming Paradigms 72 by Michael Swaine
Ringtones are where the money is — for now anyway.

Embedded Space 75 by Ed Nisley
Lego Mindstorms provides a stepping stone into the world of robotics.

Chaos Manor 78 by Jerry Pournelle
SIGGRAPH, the ACM's Special Interest Group on Graphics, is one of the key places to observe computer trends.

Programmer's Bookshelf 81 by Gregory V. Wilson
Greg's reading list includes books on everything from software vulnerability and regular expressions to XML and Perl.
Dr. Dobb’s Journal, November 2005
NEWS & VIEWS 14 by DDJ Staff
PRAGMATIC EXCEPTIONS 22 by Benjamin Booth
OF INTEREST 83 by DDJ Staff
SWAINE'S FLAMES 84 by Michael Swaine
NEXT MONTH: December is all about database development.
DR. DOBB'S ONLINE CONTENTS

ONLINE EXCLUSIVES
http://www.ddj.com/exclusives/
The Role of Hardware in Exposing Security Breaches
Far too often, security is considered solely a software problem.
It's the XML Configuration File's Fault
So why did Greg give up on Java and switch to Python?

THE NEWS SHOW
http://thenewsshow.tv/
Ensuring Outsourcing Expertise
IT skills tests for offshore call centers are in the offing.
DOBBSCAST AUDIO
http://www.ddj.com/podcast/
Apache Kicks Off Software-Integration Project
David Chappell talks about web services and the Apache Synapse Web Service Mediation Framework project.
Linux as a Platform for Mobile Phones
Peder Ulander explains why Linux may be the platform for next-generation mobile phones.

UNIXREVIEW.COM
http://www.unixreview.com/
Regular Expressions: Don't Fear Reliability
Don't be afraid to script.
WINDOWS/.NET
http://devnet.developerpipeline.com/windows/
Lightweight Tracing in C#
What is a good way to implement tracing in your application?
DOTNETJUNKIES
http://www.dotnetjunkies.com/
Microsoft Enterprise Library 2005
The Enterprise Library is a collection of Application Blocks released by the Patterns and Practices group within Microsoft.

BYTE.COM
http://www.byte.com/
The Future of the Mac
What can we expect from these next generation Macs?

THE C/C++ USERS JOURNAL
http://www.cuj.com/
C++'s find
The role of the C++ Standard's find is to find a specific element in a sequence.

THE PERL JOURNAL
http://www.tpj.com/
Ten Things You (Probably) Didn't Know About Perl
Here are some tricks and techniques that you probably didn't know you could do in Perl.
RESOURCE CENTER As a service to our readers, source code, related files, and author guidelines are available at http://www.ddj.com/. Letters to the editor, article proposals and submissions, and inquiries should be sent to
[email protected]. For subscription questions, call 800-456-1215 (U.S. or Canada). For all other countries, call 902-563-4753 or fax 902-563-4807. E-mail subscription questions to
[email protected], or write to Dr. Dobb's Journal, P.O. Box 56188, Boulder, CO 80322-6188. If you want to change the information you receive from CMP and others about products and services, go to http://www.cmp.com/feedback/permission.html or contact Customer Service at Dr. Dobb's Journal, P.O. Box 56188, Boulder, CO 80322-6188. Back issues may be purchased prepaid for $9.00 per copy (which includes shipping and handling). For issue availability, send e-mail to
[email protected], fax to 785-838-7566, or call 800-444-4881 (U.S. and Canada) or 785-838-7500 (all other countries). Please send payment to Dr. Dobb's Journal, 4601 West 6th Street, Suite B, Lawrence, KS 66049-4189. Digital versions of back issues and individual articles can be purchased electronically at http://www.ddj.com/.
WEB SITE A C C O U N T A C T I VA T I O N Dr. Dobb’s Journal subscriptions include full access to the CMP Developer Network web sites. To activate your account, register at http://www.ddj.com/registration/ using the web ALL ACCESS subscriber code located on your mailing label.
DR. DOBB’S JOURNAL (ISSN 1044-789X) is published monthly by CMP Media LLC., 600 Harrison Street, San Francisco, CA 94017; 415-947-6000. Periodicals Postage Paid at San Francisco and at additional mailing offices. SUBSCRIPTION: $34.95 for 1 year; $69.90 for 2 years. International orders must be prepaid. Payment may be made via Mastercard, Visa, or American Express; or via U.S. funds drawn on a U.S. bank. Canada and Mexico: $45.00 per year. All other foreign: $70.00 per year. U.K. subscribers contact Jill Sutcliffe at Parkway Gordon 01-49-1875-386. POSTMASTER: Send address changes to Dr. Dobb’s Journal, P.O. Box 56188, Boulder, CO 80328-6188. Registered for GST as CMP Media LLC, GST #13288078, Customer #2116057, Agreement #40011901. INTERNATIONAL NEWSSTAND DISTRIBUTOR: Worldwide Media Service Inc., 30 Montgomery St., Jersey City, NJ 07302; 212-332-7100. Entire contents © 2005 CMP Media LLC. Dr. Dobb’s Journal® is a registered trademark of CMP Media LLC. All rights reserved.
3D image rendered courtesy of Richard D. Eberly. Globe image by Margaret A. Anderson.
Dr. Dobb's Journal
SOFTWARE TOOLS FOR THE PROFESSIONAL PROGRAMMER
PUBLISHER Michael Goodman
EDITOR-IN-CHIEF Jonathan Erickson
EDITORIAL
MANAGING EDITOR Deirdre Blake
SENIOR PRODUCTION EDITOR Monica E. Berg
ASSOCIATE EDITOR Della Wyser
COPY EDITOR Amy Stephens
ART DIRECTOR Margaret A. Anderson
SENIOR CONTRIBUTING EDITOR Al Stevens
CONTRIBUTING EDITORS Bruce Schneier, Ray Duncan, Jack Woehr, Jon Bentley, Tim Kientzle, Gregory V. Wilson, Mark Nelson, Ed Nisley, Jerry Pournelle, Dennis E. Shasha
EDITOR-AT-LARGE Michael Swaine
PRODUCTION MANAGER Stephanie Fung

INTERNET OPERATIONS
DIRECTOR Michael Calderon
SENIOR WEB DEVELOPER Steve Goyette
WEBMASTERS Sean Coady, Joe Lucca

AUDIENCE DEVELOPMENT
AUDIENCE DEVELOPMENT DIRECTOR Kevin Regan
AUDIENCE DEVELOPMENT MANAGER Karina Medina
AUDIENCE DEVELOPMENT ASSISTANT MANAGER Shomari Hines
AUDIENCE DEVELOPMENT ASSISTANT Melani Benedetto-Valente

MARKETING/ADVERTISING
ASSOCIATE PUBLISHER Will Wise
SENIOR MANAGERS, MEDIA PROGRAMS (see page 82) Pauline Beall, Michael Beasley, Cassandra Clark, Ron Cordek, Mike Kelleher, Andrew Mintz
MARKETING DIRECTOR Jessica Marty
SENIOR ART DIRECTOR OF MARKETING Carey Perez

DR. DOBB'S JOURNAL
2800 Campus Drive, San Mateo, CA 94403
650-513-4300
http://www.ddj.com/

CMP MEDIA LLC
Gary Marshall President and CEO
John Day Executive Vice President and CFO
Steve Weitzner Executive Vice President and COO
Jeff Patterson Executive Vice President, Corporate Sales & Marketing
Leah Landro Executive Vice President, Human Resources
Mike Mikos Chief Information Officer
Bill Amstutz Senior Vice President, Operations
Sandra Grayson Senior Vice President and General Counsel
Alexandra Raine Senior Vice President, Communications
Kate Spellman Senior Vice President, Corporate Marketing
Mike Azzara Vice President, Group Director of Internet Business
Robert Faletra President, Channel Group
Tony Keefe President, CMP Entertainment Media
Vicki Masseria President, CMP Healthcare Media
Philip Chapnick Vice President, Group Publisher Applied Technologies
Paul Miller Vice President, Group Publisher Electronics
Fritz Nelson Vice President, Group Publisher Network Computing Enterprise Architecture Group
Peter Westerman Vice President, Group Publisher Software Development Media
Joseph Braue Vice President, Director of Custom Integrated Marketing Solutions
Shannon Aronson Corporate Director, Audience Development
Michael Zane Corporate Director, Audience Development
Marie Myers Corporate Director, Publishing Services
American Business Press
Printed in the USA
EDITORIAL
Happiness Is a Warm… Customer
Let's see, we've got our gigabyte, gigahertz laptop computers that weigh less than a college textbook. We have cell phones with Internet access and built-in cameras. We've got video game consoles with web browsers. We can communicate via voice mail, e-mail, instant messaging, text messaging, and XXX-rated chat rooms. We can make movies without leaving our desktops, and make music albums without backing the car out of the garage. But you know what? We're still not that happy, at least according to the recent American Customer Satisfaction Index survey (http://www.theacsi.org/). Established in 1994 and conducted by the University of Michigan, the ACSI (which refers to itself as "The Voice of the Nation's Consumer") measures consumer experiences by addressing issues such as customer expectations, perceived quality, perceived value, customer complaints, and customer loyalty. In the process, the ACSI tracks satisfaction in a range of consumer markets, including everything from major appliances and breweries, to personal computers and search engines.

All in all, the news could have been worse for a lot of the industries and companies surveyed, but then it could have been better, too. In the personal computer industry, for instance, fourth quarter overall customer satisfaction remained steady compared to last year. In large part, this is thanks to lots of happy Apple Computer customers who responded to Apple's innovative products and focus on customer service. Say what you will, there's no denying that Apple's sales increased 33 percent over the last year, with net income growing 300 percent and stock prices nearly tripling in parallel with increases in customer satisfaction. On the other hand, Dell customers aren't such happy campers, with customer satisfaction dropping nearly 6 percent. In particular, the ACSI zeroed in on Dell's customer service problems that are tied to long wait-times and difficulties with the company's call-center support. Hmmm, so much for the impact of my whining about my experiences with Dell's call-center customer support (see "Editorial," DDJ, March 2004). The ACSI hedges on the long-term effect of declining satisfaction with Dell customer service, but does point out that ACSI history indicates that "changes in customer satisfaction often signal similar changes in future financial performance." To illustrate, Apple's stock price has increased along with its customer satisfaction, while Dell's stock price has remained flat as customer satisfaction has dropped.

In the online sector, customers are generally satisfied with the usual suspects — Google, Yahoo, AOL, Ask Jeeves, MSN, and the like — with customer satisfaction for search engines and portals rising 4.7 percent over the past year. That said, Google remains the big dog in the number of searches executed (Google exceeds the combined searches of Yahoo, MSN, AOL, and Ask Jeeves), in revenue, and in customer satisfaction. Again, rises in customer satisfaction mirrored increases in revenues, with both Google and Yahoo more than doubling revenues from 2003 to 2004.

The bottom line is that satisfied customers are good for, well, the bottom line. And one of the easiest ways to create and keep satisfied customers is to reply to that e-mail and return that phone call in a timely and polite manner. Okay, I promise to be better about that — starting real soon now.

Dr. Dobb's Journal 2006 Editorial Calendar
January: Programming Languages
February: 64-Bit Development
March: Intelligent Systems
April: Algorithms
May: Testing & Debugging
June: Graphics & Game Development
July: Java Programming
August: Computer Forensics
September: Communications & Networking
October: Computer Security
November: Distributed Computing
December: Database Development

We're on the lookout for articles on these — and other — topics of interest and relevance to the art and craft of software development. This list is just a start. If you have an article idea, drop us a note at
[email protected]. Or, if you'd like to suggest a topic that we should cover, let us know. For details on submitting articles, see our author guidelines at http://www.ddj.com/ddj/authors.htm.
Jonathan Erickson
editor-in-chief
[email protected]
LETTERS
More on Licensing

Dear DDJ,
I have been reading with much interest the running discussion about licensing software developers (see "Letters," DDJ, January 2005). I work in the building construction industry, where by law, building plans for all areas of work (architectural, structural, plumbing, electrical, mechanical, fire protection, communication, etc.) must be prepared under the direction of and sealed by a registered/licensed professional architect or engineer. In theory, the registration or licensing is a measure of competence. To even take the test to become a PE (registered professional engineer), one must have a BSEE at an accredited college or university. Having seen what the actual test is for electrical engineers in my state, I am a bit skeptical that all areas of the test really apply to electrical system design. I work with PEs and also RCDDs (a different certification, this one for telecommunications system design). Some of these professionals are competent, in my opinion, and others, well…I'm not so sure.

The key area of difficulty is an obvious one — getting drawings and specifications that clearly describe what the final result is supposed to be when completed. In the building industry, electrical engineers develop drawings and specifications showing what is required to be installed. There are varying levels of detail provided. In many cases, subcontractors (electrical, low voltage, security, fire alarm, communications) have to interpret the drawings and specifications, submit details of their approach to installation, and once that submission is reviewed by the specifying engineer and not disapproved, these subcontractors install the actual system(s). In some cases, the final installed result is reviewed by the specifying engineer, but not always. Behind all of this is a whole series of codes and standards (National Electrical Code, International Building Code, International Mechanical Code,
etc.), which are adopted by law in many areas of the country and that provide the basis of the various areas of building design. Local inspectors review plans and specifications before work commences and can require changes in the design to meet code and standards requirements. In the last analysis, the actual workmen who build the building are the ones who make the whole process work and create a safe, usable finished product. Without a similar infrastructure of codes, standards, reviews against these codes and standards, and a final inspection by a knowledgeable authority who has some actual legal power of enforcement, the whole idea of licensing software developers is not meaningful. We have very few actual software design standards that I am aware of in the nonmilitary arena. It will take a long time to develop meaningful, useful commercial standards, unless we adopt military (U.S. DOD) standards, and I don’t think commercial developers are interested in that. A process for inspection against these standards by a recognized authority will also have to be created. There may be a more difficult problem to overcome in all of this. My limited experience working with software developers suggests to me that it’s the creative problem solving part of the job that attracts the most talented, those I call “genius programmers.” These are software developers who can write complex applications quickly and get excellent results. In my experience, they are not interested in bureaucracy, paperwork, design review, software standards, or code inspections by others. At best, they tolerate these things, but they really don’t like them. Now, this is not a criticism, but it will have to be dealt with. The other big problem I see is that in complex software designs, it may be difficult to actually know how the design works, and it may be impossible to adequately test or inspect it thoroughly. Ken Ciszewski
[email protected]

LabVIEW Fan

Dear DDJ,
I am surprised to see so little mention about National Instruments' LabVIEW or its graphical language "G." I know LabVIEW was initially created with data acquisition and control in mind, but it is a surprisingly versatile and powerful general-purpose language. Here's a link talking about how it is a fully featured programming language: http://zone.ni.com/devzone/conceptd.nsf/webmain/F34045D2CC5357F486256D3400648C0F#3. I am just learning LabVIEW and am very impressed with it. I have had introductory classes in a handful of languages, and used C in my last job. I definitely would have rather used G. One really big headache that LabVIEW takes care of is said well in this quote taken from the aforementioned link: "LabVIEW performs all memory management automatically and ensures that it is used efficiently and safely." Imagine not having to worry about those elusive bugs caused by memory-management mistakes. Array overruns have caused me headaches more than once. In LabVIEW, it can't happen. It's pretty amazing. Is there some kind of a real-world problem with G becoming a commonly used general-purpose language? I'm surprised it hasn't started catching on by now outside of the data acquisition and control world. Has National Instruments failed in some ways to promote it in this way? National Instruments now has a LabVIEW add-on package for developing code for embedded processors.
Brad Mosch
[email protected]

Editor's Note: Thanks Brad. As luck would have it, Stephen Jenkins touches on LabVIEW in his article "Debugging Heterogeneous Distributed Applications" in this issue.

More Duff's Device

Dear DDJ,
In Ralf Holly's most interesting article on using the famous Duff's Device (see "A Reusable Duff Device," DDJ, August 2005), I found one issue that bears mentioning. Ralf makes the comment that he "changed a couple of minor things," one of them being the use of the right-shift operator (>>) to help out the compiler, and he also points out that it "doesn't hurt." Unfortunately in this case, it does hurt, because the values being shifted right are signed ints, which is a bad idea due to the unlikely but possible case of a negative value being used. Right shift is only safe in general with unsigned data. If you really use this minor optimization in the macro, then you should consider changing the type to unsigned int.

Thanks to Ralf for presenting a practical application of Duff's Device. I have known about it for years, but never found a use for it in practice. It would have been very interesting to see some performance data on one or more reference platforms to get a better idea of its advantages.
Randy Howard
[email protected] DDJ http://www.ddj.com
DR. ECCO’S OMNIHEURIST CORNER
Feedback Strategies
by Dennis E. Shasha
"I promise that you will enjoy this one," Baskerhound said, coming in for a surprise visit. "It will resonate with your theories of human nature. Let me start with the populist pitch.

"Have you ever admired your own skill at navigating a narrow road at high speed? If not, imagine the following alternative method of travel: Pore over a detailed map of the same road, figure out how much the wheel should turn and the accelerator should be pressed at every time point, and then drive down the road blindfolded. Even without obstacles, this is beyond the memory and trigonometric capacity of most of us."

"I will grant you that," Ecco replied with a sly smile.

"Instead, we're hardly conscious of the intellectual effort of driving," Baskerhound continued. "Perhaps the reason is that the act of driving consists of very short-term plans (a few seconds at most) followed by adaptation based on eyesight. The driver has an overall goal — get to the end of the road — but the plan is incremental and adaptive. This requires less brainpower and is far more robust to changes in the environment.

"Any person on the street understands this argument, but my bosses require quantification. So to make this concrete, I have proposed the following game. Consider this standard checkerboard that has 8 rows and 8 columns (see Figure 1).

"You want to go from row 1, column 4 (the black square above the S) to row 8, column 5 (the black square below the E). Each move goes from black square to black square and proceeds up a row and either to the left or right diagonally adjacent square. If you fall off the checkerboard or reach the top row without reaching the correct square, you lose.

"At each move, you get to aim to go either right or left. You will achieve that step's aim with probability Pgood, whose values we will discuss in a minute. There are two kinds of strategies: FeedYes and FeedNo.

"A FeedYes strategy can decide where to aim on the ith move after seeing the results of the first i–1 moves. A FeedNo strategy must decide where to aim at step i from the very beginning.

"Here is an example to show you the difference. Suppose that you want to go from row 1, column 4 to row 3, column 4. Suppose that Pgood is 0.9. Then in the FeedYes strategy, you might aim right the first move. If you in fact go right (probability 0.9), then you would aim left the second move. But if you go left on the first move (probability 0.1), you will aim right the second move. The net result is that you have a probability of 0.9 to hit your destination. In the FeedNo strategy, you might do something like aim right the first move and aim left the second. There are two cases in which you would win with that strategy: You in fact move right in move 1 and left in move 2 (probability 0.9×0.9 = 0.81) or you move left in move 1 and right in move 2 (probability 0.1×0.1 = 0.01). So FeedNo has a probability of 0.82 of hitting the destination.

"Call the feedback dividend the probability of hitting the destination with the optimal FeedYes strategy divided by the probability of hitting it with the optimal FeedNo strategy. (Optimal means that you do as well as you can based on the probabilities.) In the example here, the feedback dividend is 0.9/0.82.

"Here's a warm-up: Are there any values of Pgood for which the feedback dividend is 1 regardless of source and destination?"

"My dear Benjamin, of course," said Dr. Ecco. "If Pgood were 0.5 or 1, the feedback dividend would be only 1. In the first case, it doesn't matter where you aim. In the second, you don't need feedback. For all other Pgood values, the dividend will exceed 1."

"I didn't think that would be hard," said Baskerhound. "Now here is the full problem. You start at row 1, column 4 and you want to hit row 8, column 5.

"1. If Pgood is 0.9, what is the probability of hitting in the FeedYes strategy and in the FeedNo strategy?

"2. For which value of Pgood does the feedback dividend reach its greatest value? What is the feedback dividend in that case?"

I was quite surprised by the result of this second question. Not intuitive at all. After Baskerhound left, Ecco asked me one other:

"3. If we cut off the three rightmost columns and the two leftmost columns, then which value of Pgood would give the highest feedback dividend? Assume that falling off the board gives a certain loss."

Dennis, a professor of computer science at New York University, is the author of four puzzle books: The Puzzling Adventures of Dr. Ecco (Dover, 1998); Codes, Puzzles, and Conspiracy (Freeman 1992, reprinted by Dover in 2004 as Dr. Ecco: Mathematical Detective); and recently Dr. Ecco's Cyberpuzzles (W.W. Norton, 2002); and Puzzling Adventures (W.W. Norton, 2005). With Philippe Bonnet, he has written Database Tuning: Principles, Experiments, and Troubleshooting Techniques (2002, Morgan Kaufmann). With Cathy Lazere, he wrote Out of Their Minds: The Lives and Discoveries of 15 Great Computer Scientists (1995, Copernicus/Springer). He can be contacted at
[email protected].
Figure 1: The checkerboard. (S marks the start square at row 1, column 4; E marks the destination at row 8, column 5; rows are numbered 1 through 8 from bottom to top.)
For the solution to last month's puzzle, see page 74. DDJ
DR. DOBB'S NEWS & VIEWS
SECTION A: MAIN NEWS
Digital Display Interface Spec Proposed
The Video Electronics Standards Association (http://www.vesa.org/) has proposed DisplayPort, a new digital display interface specification for computer monitors, TVs, projectors, PCs, and the like. DisplayPort enables high-quality audio to be available to display devices over the same cable as the video signal. It also enables a common interface approach across both internal connections, such as interfaces within a PC or monitor, and external display connections. The standard includes an optional digital audio capability so high-definition digital audio and video can be streamed over the interface. DisplayPort incorporates Main Link, a high-bandwidth, low-latency, unidirectional connection supporting isochronous stream transport. A single video stream with associated audio is supported in Version 1.0, but DisplayPort is seamlessly extensible, enabling support of multiple video streams. Version 1.0 also includes an Auxiliary Channel to provide consistent-bandwidth, low-latency, bidirectional connectivity with Main Link management, and device control based on VESA's E-DDC, E-EDID, DDC/CI, and MCCS standards. The Main Link bandwidth enables data transfer at up to 10.8 Gbits/second using a total of four lanes.
Electronic Passports On the Way
The U.S. State Department is developing an electronic passport that will be put in use before the end of the year (http://travel.state.gov/passport/). Embedded in the cover of the passport will be a microchip that includes facial-recognition information, along with the name, date of birth, gender, place of birth, dates of passport issuance and expiration, passport number, and photo image of the bearer. A digital signature will protect the stored data from alteration and mitigate the threat of photo substitution. To combat unauthorized reading, the passport will also incorporate anti-skimming technology in the front cover. Conventional paper passports will be replaced upon renewal.
TeraGrid Gets Funding
The National Science Foundation (NSF) has made a five-year, $150 million award to operate and enhance the TeraGrid (http://www.teragrid.org/). Built over the past four years, TeraGrid is the world's largest distributed cyberinfrastructure for open scientific research. Through high-performance network connections, TeraGrid integrates high-performance computers, data resources and tools, and high-end experimental facilities around the country. Scientists and engineers responsible for TeraGrid operations will work closely with researchers whose science requires powerful computing resources. For example, researchers using TeraGrid are exploring functions of decoded genomes, how the brain works, the constitution of the universe, disease diagnosis, and real-time weather forecasting to predict the exact locations of tornado and storm threats. The TeraGrid award includes $48 million for overall architecture, software integration, operations, and coordination of user support and $100 million for operation, management, and user support of TeraGrid resources at eight resource provider sites. Science gateway projects are aimed at supporting access to TeraGrid via web portals, desktop applications, or via other grids. An initial set of 10 gateways will address new scientific opportunities in fields from bioinformatics to nanotechnology as well as interoperation between TeraGrid and other grid infrastructures.
DR. DOBB'S JOURNAL November 1, 2005

Consortium Launched for Wireless Community
The Digital Communities consortium has been launched to encourage communities to use wireless technology. Led by Intel, Cisco, Dell, IBM, and SAP, among others, the Digital Communities initiative (http://www.intel.com/go/digitalcommunities/) will promote Wi-Fi and WiMAX technology for developing and deploying services to enhance government efficiency, promote economic growth, foster greater community satisfaction, and bridge the digital divide. Applications range from automating mobile workers (meter readers and building inspectors, for instance) to increasing the safety and enhancing resource management of first responders by remotely monitoring vehicle location to enhancing parent/teacher collaboration for improved student success. Among the pilot communities are Cleveland, Ohio; Corpus Christi, Texas; Philadelphia, Pennsylvania; and Taipei, Taiwan.
E-Voting Center Opened
ACCURATE, short for "A Center for Correct, Usable, Reliable, Auditable, and Transparent Elections," has been launched at Johns Hopkins University. Avi Rubin, a professor of computer science at Johns Hopkins and technical director of the university's Information Security Institute, will direct the center, which is dedicated to improving the reliability and trustworthiness of voting technology. Researchers from five other institutions — Rice University, Stanford University, the University of California at Berkeley, the University of Iowa, and SRI International — will participate in the project. The multidisciplinary team will include experts in computer science, public policy, and human behavior. All findings will be made public and used to help develop technical standards and proposals for electronic voting systems that are easy to use and tamper evident.
H-1B Max Out for ’06 The U.S. Citizenship and Immigration Services has announced that it received enough H-1B petitions in early August to meet the congressionally mandated cap of 65,000 H-1B visa requests for fiscal year 2006. Federal officials say this is the earliest that the limit has ever been reached.
World Community Grid at Work
With more than 130,000 cooperating computers, the World Community Grid (http://www.worldcommunitygrid.org/) has already performed computations equivalent to a single PC running continuously for more than 14,000 years. Launched in November 2004, the community's goal is to harness some of the unused computing power of the world's 650 million PCs. To join, all you need is a PC, Internet access, and a free downloadable program that runs in the background. Currently, the World Community Grid is running research for the Human Proteome Folding Project, which seeks to understand common diseases and develop possible cures by studying the way proteins function. The organization is accepting proposals for other research projects.
The Media Grid A public utility for digital media AARON E. WALSH
The Media Grid is a digital media network infrastructure and software-development platform based on new and emerging distributed computational grid technology. The Media Grid (http://www.MediaGrid.org/) is designed as an on-demand public computing utility that software programs and web sites can access for digital content delivery (graphics, video, animations, movies, music, games, and so forth), storage, and media processing services (such as data visualization and simulation, medical image sharpening and enhancement, motion picture scene rendering, special effects, media transformations and compositing, and other digital media manipulation capabilities).

As an open platform that provides digital media delivery, storage, and processing services, the Media Grid's foundation rests on Internet, web, and grid standards. By combining relevant standards from these fields with new and unique capabilities, the Media Grid provides a novel software-development platform designed specifically for networked applications that produce and consume large quantities of digital media.

As an open and extensible platform, the Media Grid enables a wide range of applications not possible with the traditional Internet alone, including: on-demand digital cinema and interactive movies; distributed film and movie rendering; truly immersive multiplayer games and virtual reality; real-time visualization of complex data (weather, medical, engineering, and so forth); telepresence and telemedicine (remote surgery, medical imaging, drug design, and the like); telecommunications (such as video conferencing, voice calls, video phones, and shared collaborative environments); vehicle and aircraft design and simulation; computational science applications (computational biology, chemistry, physics, astronomy, mathematics, and so forth); biometric security such as real-time face, voice, and body recognition; and similar high-performance media applications.
By giving software developers the ability to easily access a theoretically unlimited pool of computing resources optimized for digital media, we anticipate that the Media Grid will enable these types of applications almost immediately, while unlocking the potential for a new class of applications that we can't conceive of today. The Media Grid has been under active development for several years and is now reaching critical mass thanks to partner organizations and foundation technologies such as the Globus Toolkit. In this article, I examine how some of the system's key capabilities are supported by the Globus Toolkit 4.0 (GT4). Along the way, I explore several significant GT4 features, including its support for standard web services.

Aaron is Director of the MediaGrid.org open standards group, through which the Media Grid is designed and developed. He can be contacted at MediaGrid.org/people/aew/.

Public Utility for Digital Media
At a conceptual level, the Media Grid is modeled after an improved national power grid, with added security and stability features that eliminate downtime and blackouts. As with the U.S. national power grid, which standardizes the production and consumption of power in the United States, the Media Grid is built with the intention of establishing a new generation of technology standards that enable computer applications to "plug in" to digital media services over the public Internet. Applications that only need to consume media content, store or archive media files, or access media processing services can do so at a fair and standardized price (or for free in certain cases), which we anticipate will be greatly reduced compared to the cost of today's proprietary digital content delivery systems. Meanwhile, the owners of computers that host and deliver media or provide media processing services receive compensation for their contribution to the Media Grid.
Desktop computers, workstations, laptops, handhelds, PDAs, mobile phones, game consoles, and kiosks are just a few of the many types of computing devices that can tap into the Media Grid (see Figure 1). Devices that have enough computational power and fast enough network connections can become nodes on the Media Grid, meaning they can store, deliver, or process media for other users in exchange for credit; in this way, some users can earn enough credit to pay for all of the premium (for-fee) content and services they wish to consume. In contrast, less powerful devices may simply consume media and services provided by the Media Grid, as shown in Figure 2. Regardless, any device that runs Media Grid software can be spontaneously networked together over the traditional Internet to form ad-hoc grids, or swarms, that exchange media and media processing services. Grids can also be assembled from specific devices and administered much like a traditional managed network.

By providing a global computing fabric for digital media, the Media Grid provides a public utility infrastructure that should have great appeal to the industry. Rather than building and maintaining proprietary solutions, companies and individuals will be able to utilize the Media Grid at a fraction of the cost and with minimal effort compared to custom in-house solutions. A company such as Apple Computer, for example, stands to significantly reduce the time, effort, and cost of hosting and managing its wildly popular iTunes music service by offloading some portion of that service to the Media Grid. Similarly, television and movie production companies might enjoy significant economic
advantages by using the Media Grid as a massive on-demand rendering farm rather than continuing to invest time and money in their own in-house rendering farms. Scientists and researchers, meanwhile, may find the Media Grid a fast and convenient alternative to more esoteric or complex data visualization and simulation systems.

At the other end of the spectrum, individual software application developers and web developers will be able to use the Media Grid for their own work as well. Web developers, for instance, can increase the performance, reliability, and scalability of their web sites while simultaneously reducing hosting fees simply by hosting images, movies, and music files on the Media Grid instead of using traditional ISPs or web servers.

The Media Grid isn't intended to replace or circumvent existing grids, clusters, or rendering farms; it's designed to provide uniform and simplified access to a wide range of such systems. Like the Web before it, which shields users and developers from the complexity of the Internet, the Media Grid provides a unified view of an otherwise complex system. In the same way that the Web simplifies Internet development and provides a standard browser interface for text-oriented information and basic media content, the Media Grid aims to make it easy for developers to access computational resources provided by existing technology vendors such as Oracle, IBM, Sun Microsystems, Hewlett-Packard, Microsoft, and others, as shown in Figure 1. By making digital media content and processing power available through unified APIs, grid services, and web services, the Media Grid provides a public computing infrastructure that both developers and owners of high-performance computer systems can benefit from.
Figure 1: The Media Grid features a simplified API that shields application and web developers from the back-end complexity typically associated with high-performance computing systems such as clusters, computational grids, and rendering farms.
Globus Toolkit 4
When I first hinted about the Media Grid to DDJ readers (see "Creating Java Grid Services," DDJ, September 2003), Version 3 of the Globus Toolkit had just been released. What a difference two years makes. Both the Globus Toolkit and the organization behind it have grown by leaps and bounds. Globus Toolkit 4.0 (GT4) is now available and features a range of new capabilities, including support for established and emerging web-services standards, while the main organizational structure behind Globus has been formalized as the "Globus Alliance." In the words of the alliance, the Globus Toolkit is:

A fundamental enabling technology for the "Grid," letting people share computing power, databases, and other tools securely online across corporate, institutional, and geographic boundaries without sacrificing local autonomy. The toolkit includes software services and libraries for resource monitoring, discovery, and management, plus security and file management. The toolkit includes software for security, information infrastructure, resource management, data management, communication, fault detection, and portability. It is packaged as a set of components that can be used either independently or together to develop applications. The Globus Toolkit was conceived to remove obstacles that prevent seamless collaboration. Its core services, interfaces and protocols allow users to access remote resources as if they were located within their own machine room while simultaneously preserving local control over who can use resources and when.
In short, the Globus Toolkit is open-source software that enables computational grids and the distributed applications that run across them. In terms of the Media Grid, GT4's most significant capability is its support for standard web services. GT4 consists of a number of software components that directly support web services, plus several original components that predate the system's general focus on web services (see Table 1). These so-called "nonWS" or "preWS" components may eventually be replaced with WS versions, but they remain a vital part of the toolkit in the meantime. GT4 supports a number of important web-services standards and technologies, as Table 2 illustrates. Significantly, GT4 also supports the Web Services Resource Framework (WSRF) specification from OASIS (http://www.oasis-open.org/), which defines an open standard for implementing stateful resources that are accessible to web services. The ability to provide access to resources that maintain state across service invocations gives GT4 the ability to support distributed computing capabilities via web services. In other words, WSRF is the infrastructure that makes stateful grid services possible in GT4 (as you may recall, a grid service is merely a special-purpose web service designed to operate in a grid environment).
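The stateful-resource idea at the heart of WSRF is easiest to see in miniature. The Python sketch below is purely illustrative; none of these names are GT4 or WSRF APIs. The service operation itself is stateless, but a resource key, standing in for a WS-Addressing endpoint reference, lets successive invocations locate and update the same state.

```python
# Hypothetical sketch of the WSRF pattern: a stateless service operation
# plus named "resources" whose state survives across invocations.

import uuid

class ResourceHome:
    """Holds stateful resources, each addressed by an endpoint-reference-like key."""
    def __init__(self):
        self._resources = {}

    def create_resource(self, initial_state):
        key = str(uuid.uuid4())   # stands in for a WS-Addressing endpoint reference
        self._resources[key] = dict(initial_state)
        return key

    def get_property(self, key, name):
        return self._resources[key][name]

    def set_property(self, key, name, value):
        self._resources[key][name] = value

# A "grid service" operation: stateless in itself, but the resource key lets
# successive invocations observe and modify the same state.
def add_render_job(home, key, frames):
    pending = home.get_property(key, "pending_frames")
    home.set_property(key, "pending_frames", pending + frames)
    return home.get_property(key, "pending_frames")

home = ResourceHome()
job = home.create_resource({"pending_frames": 0})
add_render_job(home, job, 24)
print(add_render_job(home, job, 12))   # prints 36: state persisted between calls
```

The point of the factoring is that the service front end can remain stateless and interoperable while the resource, not the service, carries state between calls.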
Web-services (WS) components:
• Security: Authentication Authorization, Community Authorization, Delegation
• Data Management: Data Replication, Reliable File Transfer (RFT), Data Access and Integration
• Execution Management: Community Scheduler Framework, Grid Resource Allocation and Management, Grid Telecontrol Protocol, Workspace Management
• Information Services: Index, Trigger, Web Monitoring and Discovery System (WebMDS)
• Common Runtime: C WS Core, Java WS Core, Python WS Core

Components not based on web services:
• Security: PreWS Authentication Authorization, Credential Management
• Data Management: GridFTP, Replica Location
• Execution Management: PreWS Grid Resource Allocation and Management
• Information Services: PreWS Monitoring and Discovery (DEPRECATED: will be dropped in a future release)
• Common Runtime: C Common Libraries, eXtensible IO (XIO)

Table 1: Globus Toolkit 4 introduces a number of web-services (WS) components while continuing to support the original (nonWS) components. (In the printed table, bold indicated stable components that form the main Globus API; preview components subject to change appeared in plain text.)
Swarming Connections and Quality of Service
The Media Grid combines swarming network connections with Quality of Service (QoS) levels to deliver content over the public Internet at high speeds and without interruption. Preliminary research indicates that swarming alone can scale to deliver content under loads several orders of magnitude beyond what is possible with traditional client-server architectures, while enabling servers to gracefully cope with flash crowds (aka the "Slashdot effect"). By combining swarming with QoS, the Media Grid overcomes the inherent limitations of client-server technology that are especially detrimental to servers running on low-end devices
MediaGrid.org
The Media Grid is a digital media network infrastructure and development platform developed by the Grid Institute in partnership with Boston College, Vertex Pharmaceuticals (a public biotech company in Cambridge, Massachusetts), Japan's University of Aizu (the world's first university dedicated entirely to computer science and computer-related fields), and LUA (a Boston-based medical company), with individual contributions from employees at Harvard University, Oracle, Hewlett-Packard, Sun Microsystems, John Hancock Financial Services, and other companies and universities. Later this year, the Grid Institute and its partners will donate the current Media Grid implementation, source code, and technical specifications to the MediaGrid.org open standards group, through which development of this public computing utility will continue in cooperation with industry, academia, and governments from around the world. The Grid Institute is also working with the City of Boston to build a dedicated supercomputing facility in downtown Boston to showcase the Media Grid and large-scale projects that utilize it. —A.E.W.
and over low-capacity network connections. Specifically, limited file storage space and sparse processing resources, combined with network bottlenecks, render low-end computers useless for delivering large quantities of content over the Internet. This is especially true for consumer devices in the home because many ISPs restrict upload bandwidth to prevent home systems from acting as servers. Asymmetrical network connections like these are a technical barrier that client-server architectures appear unable to overcome. By mobilizing devices into dynamic and spontaneously configured ad-hoc grids, however, swarming overcomes these limitations, as Figure 3 illustrates. Here, we see a small swarm of four consumer devices working together to deliver pieces of a DVD video in parallel to a single client over the public Internet. Media Grid software running on the client receives and assembles the pieces of the movie as they are delivered from the swarm so that the entire video can be played in real time as it is downloaded. QoS mechanisms ensure that if a device in the swarm slows down or disconnects from the network, a new device is dynamically added to the swarm so that transmission continues at full speed without interruption, while standby modes support instant failover and context switching.

Using swarming alone, early tests of the Media Grid confirmed that the system is capable of delivering full-length DVD movies (encoded in DivX format) to home users fast enough for such content to be viewed in real time within moments of initiating the download. QoS measures currently slated for implementation are expected to further enhance performance and increase the system's scalability. GT4 provides a number of data-management tools that make it possible for the Media Grid to locate, transfer, and manage large quantities of distributed data.
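The swarming-plus-failover behavior just described can be sketched in a few lines. This is a hypothetical Python model, not Media Grid code: peers are plain dictionaries and "fetching" is a local lookup, but the round-robin chunk assignment and standby substitution mirror the QoS mechanism described above.

```python
def fetch_chunk(peer, index):
    """Stand-in for a network fetch: a live peer returns its copy of a chunk."""
    return peer["chunks"].get(index) if peer["alive"] else None

def swarm_download(num_chunks, swarm, standbys):
    """Assemble a file from chunks served in parallel by a swarm of peers."""
    assembled = {}
    for i in range(num_chunks):
        peer = swarm[i % len(swarm)]        # round-robin chunk assignment
        data = fetch_chunk(peer, i)
        while data is None and standbys:    # QoS: swap a standby node into the swarm
            replacement = standbys.pop()
            swarm[swarm.index(peer)] = replacement
            peer = replacement
            data = fetch_chunk(peer, i)
        if data is None:
            raise RuntimeError("swarm exhausted; transfer cannot continue")
        assembled[i] = data
    return b"".join(assembled[i] for i in range(num_chunks))

# Four chunks of a "movie"; one peer drops out and a standby takes over.
chunks = {0: b"AB", 1: b"CD", 2: b"EF", 3: b"GH"}
swarm = [{"alive": True, "chunks": chunks}, {"alive": False, "chunks": chunks}]
standbys = [{"alive": True, "chunks": chunks}]
print(swarm_download(4, swarm, standbys))   # b'ABCDEFGH'
```

The real system layers streaming, parallel partial transfers, and standby "hot" nodes on top of this basic shape; the sketch only shows why a swarm degrades gracefully where a single server would simply stall.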
GridFTP, for example, is a reliable, secure, and high-performance transport mechanism that is, in essence, an enhanced FTP server optimized for both memory-to-memory and disk-to-disk data transfer between grid nodes. GT4's GridFTP server supports separate front-end (client) and back-end (data) processes and a "striped" configuration in which a single client can connect to multiple back-end data sources. Striped data transfers are distributed across all data nodes and, when combined with GridFTP's support for parallel and partial data transfers, facilitate Media Grid swarming. Although GridFTP is not a web-services component itself, GT4 does provide a WS component called "Reliable File Transfer" (RFT) that can be used to manage multiple file transfers over GridFTP. In addition to exposing high-performance data-transfer
Figure 2: High-powered Media Grid nodes (indicated by the big Gs) provide storage, delivery, and processing services to other devices, including less capable devices that can only consume services (indicated by the smaller Gs) and standard web-enabled applications such as web browsers and rich clients. Red arrows denote high-speed connections between nodes, while gray and black lines depict traditional broadband connections such as DSL and cable modems.
• WS-I Basic Profile: Clarifies and amends key nonproprietary web-services specifications (e.g., SOAP, WSDL, UDDI) to promote interoperability across web-services implementations.
• WS-Security: Specifies mechanisms for protecting the integrity and confidentiality of web services (e.g., authentication, authorization, policy representation, and trust negotiation).
• WS-Addressing: Specifies transport-neutral mechanisms to address web services and messages (i.e., identifying web-service endpoints and securing end-to-end endpoint identification).
• WS-Resource Framework (WSRF): Defines a generic framework for web services to model and access stateful resources.
• WS-Notification: Specifies a pattern-based approach for distributing information between web services.

Table 2: Web-services standards and technologies supported by Globus Toolkit 4.0.
capabilities through a web-services interface, RFT also makes data transfers more reliable. RFT, unlike GridFTP, does not require the client to maintain an open socket connection for the duration of the transfer (an especially important feature for occasionally connected clients such as mobile and wireless devices). RFT also maintains the state of data transfers in storage so that client- or data-source failures are easier to recover from.

Location-Independent Content and Services
Applications that utilize the Media Grid can access content and services by formal name or metadata (keywords, descriptions, and so forth). By eliminating the need for location-based access mechanisms, such as the URL, the Media Grid is designed to eliminate "file not found" and "service unavailable" situations. Location-independent capabilities will be supported through an
Pragmatic Exceptions
Figure 3: Swarming connections, enabled by GT4’s “striped” GridFTP capabilities, distribute media transfers across multiple Media Grid nodes.
implementation of “Universal Cache” technology. Universal Cache is necessary because today’s networked software applications typically maintain a private local cache that is not shared with other applications. This produces redundant storage and network transmission for each cached resource. Universal Cache solves this problem by providing a method and apparatus for a shared local cache that enables applications to access resources regardless of where they are actually located. GT4’s Replica Location Service (RLS) provides a reliable and scalable foundation on top of which the Media Grid’s Universal Cache functionality is implemented. RLS uses a distributed registry to store the location of file and dataset copies, or replicas, available across any number of grid nodes. When a user or software application places a digital media file on the Media Grid, for example, it can be automatically registered with RLS.
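As a rough illustration of the registry idea (hypothetical Python, not the RLS API), a replica catalog maps logical names and metadata keywords to whatever locations currently hold a copy, so applications never deal in location-based identifiers:

```python
# Illustrative sketch of a replica-location registry: logical names and
# keywords resolve to current replica locations. Names here are invented.

class ReplicaRegistry:
    def __init__(self):
        self._replicas = {}   # logical name -> set of node locations
        self._keywords = {}   # keyword -> set of logical names

    def register(self, name, location, keywords=()):
        """Record that `location` now holds a replica of `name`."""
        self._replicas.setdefault(name, set()).add(location)
        for kw in keywords:
            self._keywords.setdefault(kw, set()).add(name)

    def resolve(self, name):
        """All known locations for a logical name; empty if unregistered."""
        return sorted(self._replicas.get(name, set()))

    def search(self, keyword):
        """Logical names matching a metadata keyword."""
        return sorted(self._keywords.get(keyword, set()))

registry = ReplicaRegistry()
registry.register("trailer.divx", "node-a", keywords=["trailer", "divx"])
# A download creates a new replica; the registry learns the new location.
registry.register("trailer.divx", "node-b")
print(registry.resolve("trailer.divx"))   # ['node-a', 'node-b']
```

In the actual system this mapping is distributed and updated as files are ingested, copied, and migrated; the sketch shows only why a name, rather than a URL, is the right handle for a replicated file.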
Figure 4: Grid gateways, built on GT4's support for resource sharing among virtual organizations, enable standalone Media Grid configurations to collaborate and share resources in a secure manner.
Tip #3: The Bird's-Eye View Knows What To Do
The best view of your program is the bird's-eye version. The top of the execution stack is the best context for deciding what to do with the exception hot potato. It's also no accident that the higher in the stack you go, the closer you get to the application user's perspective. Because this is the perspective you (as the exception catcher) should be most concerned about, it follows that the top of the execution stack is where the best exception handling can occur. If "you're it" as the exception catcher for the job, your options include:

• Handling it in a nonfatal way.
• Exiting the application.
• Eating it (ignoring it).

If the current exception doesn't seem like an application killer but you're not sure, play it safe and follow Tip #1: If In Doubt, Throw It Out. This assumes you're not the highest-level catcher. You might try refactoring for exceptional clarity to give yourself more certainty. Exiting the application is an option, but this is reserved for the most fatal of errors. Assuming it's not one of those, and knowing exceptions may be poisonous, you're left handling the exception somehow. The rubber finally meets the road. Here, you have three kinds of choices:

• Give users an obvious message.
• Write an output message (command line or log).
• Punt it all and quit writing software that sucks.

In picking from the first two options, Make Only the Actionable Obvious (Tip #4, available in next month's issue) gives a simple heuristic for making the right decision.
—Benjamin Booth
[email protected]
Every time the file is downloaded by a user (that is, replicated or copied), RLS can be updated with the new location information. By tracking file ingress and data migrations in this way, the Media Grid hides low-level location details from software developers and instead provides them with higher level abstractions (such as names and metadata) to describe media files, abstractions that are ultimately resolved by RLS.

Grid Gateways: Virtual Organizations
The Media Grid gateway system is similar, in concept, to the CGI mechanism introduced by the World Wide Web. Whereas the Web's CGI mechanism is a standard for interfacing external applications with information servers, such as web and HTTP servers, the Media Grid gateway mechanism defines a standard for interfacing Media Grid clients and middleware with back-end grids, clusters, render farms, scientific workstations, and similar high-performance computing systems. As Figure 4 illustrates, Media Grid gateways also enable resource sharing between grids residing across organizational boundaries. By defining a uniform gateway interface between Media Grid software and back-end systems, the Media Grid can be extended to support any form of third-party grid, cluster, render farm, or other computational system.

This open and extensible architecture is not without risks, however, and those risks are primarily security risks. Companies, universities, and even individuals are often loath to expose their computer systems to outsiders; it's hard enough to maintain the integrity of a high-performance computer system that is inside an organization, let alone one that is available to the outside world over the public Internet. To address these understandable concerns, the Media Grid is being developed with high-grade security as a nonnegotiable system requirement. Indeed, the Media Grid is designed with mission-critical applications, such as medical and military applications, in mind.
To this end, GT4 provides a secure infrastructure that is specifically designed to allow resource sharing among virtual organizations (VOs). Resource sharing among VOs goes far beyond merely transferring files between entities, however; it includes direct access to computer resources such as processors, hard drives, software applications, datasets, and other resources that are typically off limits to the outside world. GT4 provides the tools necessary to establish and enforce the rules and policies involved in multiorganizational resource sharing among VOs, which in turn provides the substrate on top of which the Media Grid's gateway mechanism is built.

Tapping into the Media Grid
Despite numerous improvements to the general installation process and related documentation compared to the previous version, GT4 is by no means simple to install, configure, and use. Distributed computational grids are, by their very nature, complex systems that are typically difficult to build, manage, and maintain. In this respect, GT4-based grid development is no different. It took well over a month for one Media Grid team to build and deploy a modest 100+ node test-bed grid network comprising both Linux and Windows nodes running GT4. Because the Media Grid is a public utility, however, developers do not have to install GT4 themselves and can instead access the network directly through http://www.MediaGrid.org/. Experts in grids, security, and digital media technology are encouraged to tap into the Media Grid today so that improvements can be made in anticipation of the formal public launch scheduled for 2006.

DDJ
Parallel Processing Clusters & PVM Network clustering for UNIX/Linux DAVID J. POWERS
Parallel Virtual Machine (PVM) is freely available network-clustering software (http://www.csm.ornl.gov/pvm/) that provides a scalable network for parallel processing. Developed at Oak Ridge National Laboratory and similar in purpose to the Beowulf cluster, PVM supports applications written in Fortran and C/C++. In this article, I explain how to set up parallel processing clusters and present C++ applications that demonstrate multiple tasks executing in parallel.

Setting up PVM-based parallel processing clusters is straightforward and can be done with existing workstations that are also used for other purposes. There is no need to dedicate computers to the cluster; the only requirements are that the workstations must be on a network and use UNIX/Linux. PVM creates a single logical host from multiple workstations and uses message passing for task communication and synchronization (Figure 1). My motivation for setting up a parallel processing cluster was to provide a system that students could use for coursework and research projects in parallel processing. My specific goals were to set up a working cluster and demonstrate with test software that multiple tasks could execute in parallel using the cluster.

Dave is a professor in the Department of Mathematics and Computer Science at Northern Michigan University. He can be reached at [email protected].
Why Use PVM?
Granted, there is other software — most notably Beowulf — for clustering workstations together for parallel processing. So why PVM? The main reasons I decided to use PVM were that it is freely available, requires no special hardware, is portable, and supports many UNIX/Linux platforms. The fact that I could use Linux workstations that were already available in our computer lab, without dedicating those machines to PVM, was a major advantage. Other important PVM features include:

• A PVM cluster can be heterogeneous, combining workstations of different architectures. For example, Intel-based computers, Sun SPARC workstations, and Cray supercomputers could all be in the same cluster. Also, workstations from different types of networks can be combined into one cluster.
• PVM is scalable. The cluster can become more robust and powerful simply by adding workstations to it.
• PVM can be configured dynamically, either by using the PVM console utility or under program control using the PVM API. For example, workstations can be added or deleted while the cluster is operational.
• PVM supports both the SPMD and MPMD parallel-processing models. SPMD is single program/multiple data: with PVM, multiple copies of the same task can be spawned to execute on different sets of data. MPMD is multiple program/multiple data: with PVM, different tasks can be spawned, each executing with its own set of data.

How PVM Works
A PVM background task is installed on each workstation in the cluster. The pvm daemon (pvmd) is used for interhost communication. Each pvmd communicates
with the other pvm daemons via User Datagram Protocol (UDP). PVM tasks written using the PVM API communicate with pvmd via Transmission Control Protocol (TCP). Parallel-executing PVM tasks can
also communicate with each other using TCP. Communication between tasks using UDP or TCP is commonly referred to as "communication using sockets" (Figure 2). The pvmd task also acts as a task scheduler for user-written PVM tasks, using available workstations (hosts) in the cluster. In addition, each pvmd manages the list of tasks that are running on its host computer. When a parent task spawns a child task, the parent can specify which host computer the child task runs on, or it can let the PVM task scheduler decide which host computer is used. A PVM console utility gives users access to the PVM cluster. Users can spawn new tasks, check the cluster configuration, and change the cluster using the PVM console utility. For example, a typical cluster change would be to add or delete a workstation. Other console commands list all the current tasks that are running on the cluster. The halt
command kills all pvm daemons running on the cluster. In short, halt essentially shuts the cluster down. The PVM console utility can be started from any workstation in the cluster. For example, if workstations in the cluster are separated by some physical distance, access to the cluster may be from different locations. However, when the cluster is shut down, the first use of the PVM console utility restarts the PVM software on the cluster. The machine on which the first use of the console utility occurs is the “master host.” The console utility starts the pvmd running on the master host, then starts pvmd running on all the other workstations in the cluster. The original pvmd (running on the master host) can stop or start the pvm daemon on the other machines in the cluster. All console output from PVM tasks is directed to the master host. Any machine in the cluster can be a master host. Once the cluster is started up, only one machine in the cluster is considered the master host.
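PVM itself is programmed from C/C++ or Fortran through calls such as pvm_spawn, pvm_send, and pvm_recv. Purely as an illustration of the master/worker message-passing shape such programs take, here is a Python sketch in which threads stand in for spawned tasks and queues stand in for pvmd's sockets; it is not PVM code.

```python
# Illustrative master/worker message passing: threads play the role of
# spawned PVM tasks, and queues play the role of pvmd's socket channels.

from queue import Queue
from threading import Thread

def worker(task_q, result_q):
    """Child task: receive work messages until a shutdown message arrives."""
    while True:
        msg = task_q.get()                  # analogous to pvm_recv
        if msg is None:
            break
        result_q.put((msg, msg * msg))      # analogous to pvm_send to the parent

def run_master(data, num_workers=2):
    """Parent task: spawn workers, distribute messages, collect results."""
    task_q, result_q = Queue(), Queue()
    tasks = [Thread(target=worker, args=(task_q, result_q))
             for _ in range(num_workers)]   # analogous to pvm_spawn
    for t in tasks:
        t.start()
    for item in data:
        task_q.put(item)
    for _ in tasks:
        task_q.put(None)                    # one shutdown message per worker
    results = dict(result_q.get() for _ in data)
    for t in tasks:
        t.join()
    return results

print(sorted(run_master([1, 2, 3, 4]).items()))   # [(1, 1), (2, 4), (3, 9), (4, 16)]
```

In real PVM the workers would typically run on different hosts in the cluster, and the scheduler (pvmd) would decide placement; the pattern of spawn, send, receive, and collect is the same.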
In setting up PVM, I wanted nonroot users to be able to use the PVM cluster even though some of the installation steps may require root privileges. PVM will not run a cluster of machines unless rsh is installed and enabled on all workstations in the cluster. The rsh setup is somewhat problematic, and the man pages for rsh are confusing. Additional information is available on the Web by searching for the phrase "Red Hat rsh." There are five steps to installing and enabling the rsh server:
Installing PVM Installing and running PVM is straightforward if you do it on a single machine. You can then use the PVM API to simulate parallel processing. However, to demonstrate true parallel processing (that is, multiple tasks executing at the same time), more workstations need to be added to the cluster. However, installing and configuring PVM in a multiworkstation cluster can initially be painful. PVM requires this hardware/software environment to function:
Parallel Virtual Machine
1. Install the rsh server. To install the rsh server on Red Hat 9, click on the red hat (lower-left of screen), select System Settings, and click on Add/Remove Applications. You must wait while the system checks to see which software packages are already installed. You are then presented with a screen from which you can add/delete applications by
MEMORY
MEMORY
MEMORY
MEMORY
CPU
CPU
CPU
CPU
Figure 1: A single logical machine.
PVM Host#1 PVMTask
• A heterogeneous cluster of workstations networked together. • A machine architecture that uses a supported version of UNIX/Linux. • rsh (remote shell), command network support for PVM. The configuration I selected consisted of five 850-MHz Pentium workstations (with network connections) running Red Hat 9.0 Linux. The reason I used this hardware and operating system was because it was already available in our campus computer lab. The installation and configuration of PVM may vary depending on the version of UNIX/Linux used. If you already have a workstation with a version of UNIX/Linux installed, there are three steps to installing PVM on your workstation: 1. Install and enable the rsh server. 2. Set environment variables for PVM. 3. Download and install the PVM software. The first two steps are the most challenging. Step 3 is relatively straightforward. http://www.ddj.com
TCP
Host#2 PVMTask
TCP
TCP
TCP
pvmd
Host#3 PVMTask TCP
pvmd
pvmd
UDP
UDP
Figure 2: Communication using sockets.
host1 host1 host2 alpha alpha
user1 user2 user1 smith jones
Figure 3: Sample rsh file: /etc/hosts.equiv. host1 host2 host3 alpha
Figure 4: Sample rsh file: $HOME/.rhosts. Dr. Dobb’s Journal, November 2005
25
2.
3.
4.
5.
checking/unchecking the appropriate box. Under the Servers category and Network Servers subcategory, check the box for rsh-server, then click the Update button. You will be asked to insert a distribution CD for Red Hat. The rsh-server is copied from the CD and installed on your system. Enable the rsh server. To install the rsh server on Red Hat 9, click on the red hat, then select System Settings, Server Settings, and Services. You are then presented with a screen from which you can add/delete applications by checking/unchecking the appropriate box. Check the rsh box, select the xinetd item, and press the restart icon. Quit the Services and save changes. Create a file of users who can execute commands on this workstation. Create a file, /ect/hosts.equiv or $HOME/.rhosts, which lets users on other workstations execute commands on the workstation where this file exists. The /etc/hosts.equiv file is used system wide but will not provide root or superuser access for rsh (Figure 3). The $HOME/.rhosts file is created for a specific user, where $HOME refers to the user’s home directory, such as /home/dsmith or ~dsmith. This file can be created for any user, including root (Figure 4). This file lets the same user from a different workstation execute commands on this workstation. Set file permissions for the file in Step 3. The file permissions for /etc/hosts.equiv or $HOME/.rhosts must be set to 600 (rw access for the owner only) by using the chmod command. The owner of the file must issue the chmod command: chmod 600 /etc/hosts.equiv or chmod 600 $HOME/.rhosts. Test rsh as a root user (if .rhosts file used) and nonroot user. To see if you can rsh to yourself, try: rsh your_host_name ' -l'. To see if you can rsh to another host, try: rsh another_host_name 's -l'. You will get the error “Permission denied” if the user account does not exist on the remote host or if the host/user is not in the remote host /etc/hosts.equiv or $HOME/.rhosts file. 
Set the environment variables for PVM in the /etc/profile or $HOME/.bash_profile file before downloading and installing PVM (Figure 5).
Restart the computer so that the environment variables take effect. Then download and install the PVM software: select the file pvm3.4.4.tgz and download it to the $HOME directory on the workstation. Uncompress the tgz file with tar xvfz pvm3.4.4.tgz; this creates a directory, pvm3, under the $HOME directory, which contains all of the PVM files. Build and install the PVM software using the command make from the $HOME/pvm3 directory.
For example, assume that a cluster of four Linux workstations has the network hostnames alpha, beta, delta, and gamma, and that PVM will be run by the nonroot user myaccount on all the workstations. When logged in as the user myaccount, $HOME is equal to /home/myaccount. On each host or workstation:
1. As a root user, create the file /etc/hosts.equiv with the contents:
alpha myaccount
beta myaccount
delta myaccount
gamma myaccount
2. Set the file permissions: chmod 600 /etc/hosts.equiv.
3. As a root user, edit the file /etc/profile and add:
# PVM environment variables
PVM_ROOT=/home/myaccount/pvm3
PVM_ARCH=LINUX
export PVM_ROOT PVM_ARCH
# location of the pvm daemon, pvmd
PVM_DPATH=/home/myaccount/pvm3/lib/pvmd
export PVM_DPATH
4. Restart the workstation so that the environment variables take effect.
5. Download and install the PVM software in the /home/myaccount folder.
6. Create and compile programs. Store the binaries in the /home/myaccount/pvm3/bin/LINUX folder.
7. On the master host, log in as user myaccount and create a hostfile called pvm_hosts in /home/myaccount listing the hostnames alpha, beta, delta, and gamma, one per line.
8. Run the pvm console utility using the command /home/myaccount/pvm3/console/LINUX/pvm pvm_hosts. This command starts the pvm console utility and the pvmd running on the master host.
#PVM environment variables
PVM_ROOT=/home/pvm_user/pvm3
PVM_ARCH=LINUX
export PVM_ROOT PVM_ARCH
#location of the pvm daemon, pvmd
PVM_DPATH=/home/pvm_user/pvm3/lib/pvmd
export PVM_DPATH

Figure 5: Updates to /etc/profile or /home/pvm_user/.bash_profile.
Also, slave pvmds are started on all the other hosts in the cluster that are listed in pvm_hosts.
9. Use the conf (configuration) command at the pvm prompt to list all the host workstations in the PVM cluster.
10. Use the console command to start the first PVM task: pvm> spawn -> p1.

Using PVM
To create and compile programs that use the PVM API, you must include the header file pvm3.h and link with libpvm3.a. To compile and link a program (say, p1.cpp) to use the PVM API, enter the command:
g++ -o $HOME/pvm3/bin/$PVM_ARCH/p1 -I$PVM_ROOT/include p1.cpp -L$PVM_ROOT/lib/$PVM_ARCH -lpvm3
The default folder for the executable program files (PVM binaries) is $HOME/pvm3/bin/$PVM_ARCH. This is where the pvmd task looks for tasks to execute (spawn). If multiple architectures are used in the cluster, programs need to be compiled and linked for each architecture because any program could execute on any available workstation in the cluster.
To execute tasks in the PVM environment, start the pvm console utility using: $HOME/pvm3/console/$PVM_ARCH/pvm hostfile. The pvm console utility starts the pvmd task(s) if the daemon(s) are not already running. Again, the workstation that starts up the pvmds is the master host. Additional hosts may be added from the list of hosts in the hostfile, by using the add hostname command at the pvm> console prompt, or from executing PVM tasks. The hostfile, stored in the $HOME directory, can have any filename and contains a list of computers (DNS names or hostnames) to be added to the cluster.
User programs can be started in one of three ways:
• From the pvm console utility by issuing: pvm> spawn -> p1. This requires a space after spawn and before the task name.
• From the system prompt, if pvmd is already running on the host: ./p1.
• By spawning the task from a currently executing task.
The PVM execution environment requires the location of the program binaries on each host and the location of the pvmd on each host. The execution environment is set by editing /etc/profile.

The PVM API
The PVM API contains functions that can be grouped into several categories, including:
• Process Management and Control, which contains functions that spawn child tasks, kill specific tasks, halt all pvm tasks and daemons, add hosts to the cluster, and delete hosts from the cluster.
• Message Sending and Receiving, which contains functions for sending and receiving messages from one task to another. There is also a multicast function that lets one task send a message to multiple tasks. Messages are routed by using a task identification (TID); each task running in the PVM cluster has a unique task ID. Communication is synchronized by using blocking receives, which means that a task's execution is suspended until the requested information is received.
• Message Buffer Management and Packing/Unpacking, which handles message buffering and data conversion. PVM handles the data conversion involved when data is sent/received across different architectures. All data is packed before sending and unpacked after receiving.

Test Programs and Results
Getting multiple programs to run at the same time on multiple hosts is more complicated than merely starting several programs and assigning them to different hosts. Normally, there is a main program, which is started from the console utility via the spawn command. The main program then spawns child tasks. Example 1 is pseudocode for a main program that loops n times, each time spawning a new child task and sending it the information it needs to continue processing. The parent task then receives the results from each child when the child is done processing. The problem with this algorithm is that the child tasks are not executing at the same time: receive operations are blocking, so the first child task must finish and send a message to the parent before the next child is spawned. Even though multiple tasks are executed on multiple hosts, the tasks are not executing in parallel; see the results in Figure 6.

Loop n times {
    spawn child task;
    send message to child;
    receive result from child;   // blocking operation
}

Example 1: Sequential algorithm for parent task.

* 10 tasks that require 30 sec. of execution time each
* for 1 CPU: elapsed time = 5 min.
* for 5 CPUs: elapsed time = 5 min.

Figure 6: Results for sequential algorithm.

* 10 tasks that require 30 sec. of execution time each
* for 1 CPU: elapsed time = 5 min.
* for 5 CPUs: elapsed time = 1 min.

Figure 7: Results for parallel algorithm.
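As a concrete sketch of the parent side of this pattern, the fragment below uses the PVM calls described above (pvm_spawn, pvm_initsend/pvm_pkstr/pvm_send, pvm_recv/pvm_upkstr). The child binary name p1child, the message tags, and the reply buffer size are hypothetical, error handling is minimal, and a working PVM installation (pvm3.h and libpvm3.a) is required to build it:

```cpp
#include <cstdio>
#include "pvm3.h"   // PVM API header; requires an installed PVM tree

int main() {
    int child_tid;
    char reply[64];   // arbitrary buffer size for this sketch

    // Spawn one copy of a (hypothetical) child binary "p1child" on any
    // available host; PVM fills in the child's task ID.
    if (pvm_spawn((char *)"p1child", (char **)0, PvmTaskDefault,
                  (char *)"", 1, &child_tid) != 1) {
        std::printf("spawn failed\n");
        pvm_exit();
        return 1;
    }

    // Pack a text message into the send buffer and send it (message tag 1).
    pvm_initsend(PvmDataDefault);
    pvm_pkstr((char *)"hello child");
    pvm_send(child_tid, 1);

    // Blocking receive of the child's reply (message tag 2), then unpack it.
    pvm_recv(child_tid, 2);
    pvm_upkstr(reply);
    std::printf("child said: %s\n", reply);

    pvm_exit();   // detach from the virtual machine
    return 0;
}
```

Moving the blocking pvm_recv out of the spawn loop, as in Example 2 below, is what turns this pattern from sequential into parallel.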
Example 2 is pseudocode for a revised main program that contains two loops. The first loop spawns all of the child tasks and sends each child the information it needs to begin processing. All the child tasks start execution at about the same time and execute in parallel because there are no blocking operations in the first loop. The second loop waits for each child task to complete processing and send its results back to the parent task; see the results in Figure 7. The actual listings for the parent and child programs are in Listing One. Data can be passed from a parent task to a child task in one of two ways: using a message (Listing Two; available electronically, see "Resource Center" page 4) or using argument values when the child is spawned (Listing Three; also available electronically).

Loop n times {   // asynchronous (nonblocking) operations
    spawn child task;
    send message to child;
}
Loop n times {   // synchronous (blocking) operations
    receive result from child;
}

Example 2: Parallel algorithm for parent task.

Conclusion
Using the techniques presented here, it is fairly easy to create a cluster of networked workstations that can be used for parallel processing. The cluster works as one virtual machine to reduce the elapsed execution time of multiple tasks. The test programs demonstrate that multiple tasks execute at the same time on multiple hosts.
Future work with PVM would include efficient ways to load software changes over the cluster. Every time a program is updated and rebuilt, the program binaries must be updated on all machines in the cluster. A primary use of PVM in the future would be to implement and test algorithms decomposed for parallel operation. Some possible algorithms would include matrix multiplication, sorting, and puzzle solutions using trees.

Listing One
// David J. Powers. This program spawns another 10 tasks, sends a text message
// to each child task, then waits to receive a message from each child task.
    do {
        w[i++] = n % 10 + '0';
    } while ( (n /= 10) > 0);
    w[i] = '\0';
    // now reverse the chars
    for (i = 0, j = strlen(w) - 1; i < j; i++, j--) {
        c = w[i];
        w[i] = w[j];
        w[j] = c;
    }
    // end of itoa
DDJ