#371 APRIL 2005
Dr. Dobbs J O U R N A L
SOFTWARE TOOLS FOR THE PROFESSIONAL PROGRAMMER
http://www.ddj.com
INTERNET & WEB DEVELOPMENT Collaborative Web Surfing XForms & Cause-and-Effect Programming CCXML & The Voice Conference Manager Digital Libraries & XML-Relational Data Binding
2005 Award Recipient Guy Steele Jr.
Exploring WS-Notification C++ & operator []= $4.95US $6.95CAN
SharePoint Web Part Development A Silent Component Update for IE
Software Synthesis & OS-Independent Coding
Building High-Performance Clusters Debugging Complex Embedded Applications Jerry Pournelle on
CPU Trends
C O N T E N T S FEATURES Dr. Dobb’s Journal 2005 Excellence in Programming Award 16 by Jonathan Erickson
Guy Steele Jr. is the recipient of this year’s Excellence in Programming Award.
A Conversation with Guy Steele Jr. 17 by Jack J. Woehr
DDJ chats with Guy Steele Jr. on topics ranging from programming languages research to programming language implementation.
Collaborative Web Surfing 24 by Gigi Sayfan
Cosurfer is a peer-to-peer GUI application that lets two users chat and surf the Web together.
XForms & Cause-and-Effect Programming 32 by John M. Boyer
XForms is a clean architecture for separating presentation, user interface, and business processing models.
RDF: The Resource Description Framework 38 by Bob Ducharme
RDF lets you store metadata about anything, anywhere.
Digital Libraries & XML-Relational Data Binding 42 by Rene Reitsma, Brandon Whitehead, and Venkata Satya Gokul Suryadevara
Conversion from XML to the relational model can be problematic. Here’s a technique that lets you avoid hard coding.
Exploring WS-Notification 48 by Marco Aiello, Manuel Zanoni, and Alessandro Zolet
WS-Notification is a web-service protocol that defines a standard approach to notification.
Call Control XML & The Voice Conference Manager 52 by Moshe Yudkowsky
Call Control XML is a W3C API for third-party call control.
Software Synthesis for OS-Independent Coding 58 by Bob Zeidman
Software synthesis lets you hide low-level implementation details from programmers.
C++ & operator []= 64 by Matthew Wilson
Matthew thinks that the language definition of the C++ subscript operator operator [ ] is too coarse grained.
Building High-Performance Clusters 70 by Christopher Jeffords and Dung Pham
Need to build a high-performance, 32- or 64-bit cluster computer? Here’s how.
SharePoint Web Part Development 74 by Seth Bates
Microsoft’s Windows SharePoint Services is a web-based team collaboration and document management platform.
A Silent Component Update for Internet Explorer 78 by Zuoliu Ding
Zuoliu presents a silent update technique for IE components.
EMBEDDED SYSTEMS Debugging Complex Embedded Applications 84 by Graham Morphew
Multithreaded real-time operating systems create unique problems when it’s time to debug applications.
Cover photography by Howard Friedenberg.
COLUMNS Programming Paradigms 92 by Michael Swaine
Embedded Space 96 by Ed Nisley
Chaos Manor 100 by Jerry Pournelle
Programmer’s Bookshelf 105 by Bjorn Karlsson
APRIL 2005 VOLUME 30, ISSUE 4
FORUM EDITORIAL 8 by Jonathan Erickson LETTERS 10 by you Dr. Ecco's Omniheurist Corner 12 by Dennis E. Shasha NEWS & VIEWS 14 by Shannon Cochran OF INTEREST 106 by Shannon Cochran SWAINE’S FLAMES 108 by Michael Swaine
RESOURCE CENTER As a service to our readers, source code, related files, and author guidelines are available at http://www.ddj.com/. Letters to the editor, article proposals and submissions, and inquiries can be sent to [email protected], faxed to 650-513-4618, or mailed to Dr. Dobb’s Journal, 2800 Campus Drive, San Mateo CA 94403. For subscription questions, call 800-456-1215 (U.S. or Canada). For all other countries, call 902-563-4753 or fax 902-563-4807. E-mail subscription questions to ddj@neodata.com or write to Dr. Dobb’s Journal, P.O. Box 56188, Boulder, CO 80322-6188. If you want to change the information you receive from CMP and others about products and services, go to http://www.cmp.com/feedback/permission.html or contact Customer Service at the address/number noted on this page. Back issues may be purchased for $9.00 per copy (which includes shipping and handling). For issue availability, send e-mail to [email protected], fax to 785-838-7566, or call 800-444-4881 (U.S. and Canada) or 785-838-7500 (all other countries). Back issue orders must be prepaid. Please send payment to Dr. Dobb’s Journal, 4601 West 6th Street, Suite B, Lawrence, KS 66049-4189. Individual back articles may be purchased electronically at http://www.ddj.com/.
NEXT MONTH: May means algorithms, including everything from Optimal Queens to Naïve Bayesian text classification.
DR. DOBB’S JOURNAL (ISSN 1044-789X) is published monthly by CMP Media LLC., 600 Harrison Street, San Francisco, CA 94017; 415-947-6000. Periodicals Postage Paid at San Francisco and at additional mailing offices. SUBSCRIPTION: $34.95 for 1 year; $69.90 for 2 years. International orders must be prepaid. Payment may be made via Mastercard, Visa, or American Express; or via U.S. funds drawn on a U.S. bank. Canada and Mexico: $45.00 per year. All other foreign: $70.00 per year. U.K. subscribers contact Jill Sutcliffe at Parkway Gordon 01-49-1875-386. POSTMASTER: Send address changes to Dr. Dobb’s Journal, P.O. Box 56188, Boulder, CO 80328-6188. Registered for GST as CMP Media LLC, GST #13288078, Customer #2116057, Agreement #40011901. INTERNATIONAL NEWSSTAND DISTRIBUTOR: Worldwide Media Service Inc., 30 Montgomery St., Jersey City, NJ 07302; 212-332-7100. Entire contents © 2005 CMP Media LLC. Dr. Dobb’s Journal is a registered trademark of CMP Media LLC. All rights reserved.
Dr. Dobb’s Journal, April 2005
Dr.Dobbs J O U R N A L
PUBLISHER Michael Goodman
SOFTWARE TOOLS FOR THE PROFESSIONAL PROGRAMMER
EDITOR-IN-CHIEF Jonathan Erickson
EDITORIAL MANAGING EDITOR Deirdre Blake MANAGING EDITOR, DIGITAL MEDIA Kevin Carlson SENIOR PRODUCTION EDITOR Monica E. Berg NEWS EDITOR Shannon Cochran ASSOCIATE EDITOR Della Wyser ART DIRECTOR Margaret A. Anderson SENIOR CONTRIBUTING EDITOR Al Stevens CONTRIBUTING EDITORS Bruce Schneier, Ray Duncan, Jack Woehr, Jon Bentley, Tim Kientzle, Gregory V. Wilson, Mark Nelson, Ed Nisley, Jerry Pournelle, Dennis E. Shasha EDITOR-AT-LARGE Michael Swaine PRODUCTION MANAGER Douglas Ausejo INTERNET OPERATIONS DIRECTOR Michael Calderon SENIOR WEB DEVELOPER Steve Goyette WEBMASTERS Sean Coady, Joe Lucca AUDIENCE DEVELOPMENT AUDIENCE DEVELOPMENT DIRECTOR Kevin Regan AUDIENCE DEVELOPMENT MANAGER Karina Medina AUDIENCE DEVELOPMENT ASSISTANT MANAGER Shomari Hines AUDIENCE DEVELOPMENT ASSISTANT Melani Benedetto-Valente MARKETING/ADVERTISING ASSOCIATE PUBLISHER Will Wise SENIOR MANAGERS, MEDIA PROGRAMS see page 107 Pauline Beall, Michael Beasley, Cassandra Clark, Ron Cordek, Mike Kelleher, Andrew Mintz MARKETING DIRECTOR Jessica Marty SENIOR ART DIRECTOR OF MARKETING Carey Perez DR. DOBB’S JOURNAL 2800 Campus Drive, San Mateo, CA 94403 650-513-4300. 
http://www.ddj.com/ CMP MEDIA LLC Gary Marshall President and CEO John Day Executive Vice President and CFO Steve Weitzner Executive Vice President and COO Jeff Patterson Executive Vice President, Corporate Sales & Marketing Mike Mikos Chief Information Officer William Amstutz Senior Vice President, Operations Leah Landro Senior Vice President, Human Resources Mike Azzara Vice President/Group Director Internet Business Sandra Grayson Vice President & General Counsel Alexandra Raine Vice President Communications Robert Faletra President, Channel Group Vicki Masseria President CMP Healthcare Media Philip Chapnick Vice President, Group Publisher Applied Technologies Michael Friedenberg Vice President, Group Publisher InformationWeek Media Network Paul Miller Vice President, Group Publisher Electronics Fritz Nelson Vice President, Group Publisher Network Computing Enterprise Architecture Group Peter Westerman Vice President, Group Publisher Software Development Media Joeseph Braue Vice President, Director of Custom Integrated Media Solutions Shannon Aronson Corporate Director, Audience Development Michael Zane Corporate Director, Audience Development Marie Myers Corporate Director, Publishing Services
American Business Press
Printed in the USA
EDITORIAL
AI By Any Other Name
Two decades ago, artificial intelligence was the cat’s pajamas. Grant and venture-capital money was flowing like spring run-off, with everything from neural networks and expert systems to fuzzy logic and natural-language processing hailed as national priorities. Lucid alone garnered $25 million to implement its Common LISP. Books were written, companies launched, magazines published, and then, in a pre-Internet-like bubble burst, it all seemed to go away. After a brief flurry of AI bashing (“AI is neither artificial or intelligent”), it quickly became apparent that the best way to kill a product was to hang an “AI” name on it.
Still, some of the stronger (and presumably smarter) companies not only hung on but prospered by focusing on development tools. Franz Inc. (http://www.franz.com/), for instance, is still going strong with its Allegro suite of LISP-based tools for everything from connectivity and the Web to GUIs and database development. Likewise, Amzi! (http://www.amzi.com/) continues to create Prolog-based tools that support Java, C++, and .NET, and builds custom expert systems. (Note that Amzi!’s Dennis Merritt is also editor of Dr. Dobb’s AI Expert Newsletter; http://www.ddj.com/maillists/.) At the same time, tools like CLIPS (short for “C Language Integrated Production System”), a development tool for building rule- and object-based expert systems, moved into the public domain and have been widely adopted in industry, government, and academia (http://www.ghg.net/clips/CLIPS.html).
After the bubble burst, AI vendors began focusing on practical, commercial implementations of AI, while avoiding the AI label. Among the current crop of emerging AI-based applications are those that detect credit-card fraud, provide network security, build industrial robots, and help create human-like computer-generated characters for video games and animated movies.
For example, a robotic prescription dispensing system from ScriptPro (http://www.scriptpro.com/) fills, labels, and collates up to 150 prescriptions per hour. Upon selection of one prescription, pharmacists are alerted about all prescriptions associated with that patient, thereby saving money and lives. Then there’s Inflow (http://www.inflow.com/), an application that maps the relationships amongst people in organizations. Specifically, the Inflow software measures and graphs the “connectedness” of a group of people, organizations, or both. Links between entities can be recorded in the package for analysis. The resulting cluster graphs provide an intuitive feel for the dynamics of the group. Travel-planning companies such as Orbitz (http://www.orbitz.com/) let you search for the best airfares. What with up to 25 million daily flight combinations and fares updated in real time throughout the day, you can bet that AI techniques like case-based reasoning are at the heart of such systems. And while some might question its practicality, Sony’s Aibo robot dog is clearly an AI application (http://www.sonystyle.com/is-bin/INTERSHOP.enfinity/eCS/Store/en/-/USD/SY_BrowseCatalogStart?CategoryName=AIBO&Dept=AIBO). When Aibo was introduced in 1999, Sony sold 3000 of these “pups” via the Internet in just 20 minutes. Since then, Aibo has gone through at least five versions, with the most recent being wireless enabled. For details on Aibo technology, see Dr. Dobb’s AI Expert Newsletter, November 2003 (http://www.ddj.com/documents/s=7730/ddj0411ai/ddj0411ai.html#aibo).
That’s not to say that AI research has dried up and drifted away. Microsoft, for one, is doing extensive research in the areas of machine learning, adaptation, and intelligence (http://research.microsoft.com/research/detail.aspx?id=9).
On the academic front, even though the famed MIT AI Lab merged last year with the Lab for Computer Science, forming the Computer Science and Artificial Intelligence Lab (http://www.csail.mit.edu/), it hasn’t slowed down a bit. Likewise, serious AI research is ongoing at schools ranging from the University of Alberta (http://www.cs.ualberta.ca/~ai/) to Iowa State University (http://www.cs.iastate.edu/~honavar/aigroup.html). Additionally, a pair of researchers at the Rensselaer Polytechnic Institute recently snared a $1.2 million DARPA grant to build a system that reads books and answers questions about the text.
In truth, it was Dr. Dobb’s AI Expert Newsletter that recently reignited my interest in AI. The original AI Expert, if you recall, was a stellar magazine that began publication in 1986, riding the AI bubble until 1994. Covering all aspects of AI languages, algorithms, tools, and techniques, AI Expert was in many ways the AI-specific equivalent of Dr. Dobb’s Journal. Not only did it cover all the right topics, but it had all the right authors: Richard Gabriel, Paul Graham, Maureen Caudill, Rodger Knaus, Kamran Parsaye, the eminent Nick Bourbaki, and many others. After poring over back issues of the original AI Expert and being reminded what a great magazine it was (Caution! Pitch Ahead), we decided to bring back some of the best articles in a series of e-zines called “The Best of AI Expert.” To date, we’ve compiled three e-zines, with more in the planning stages. You can find out more about this e-zine series at http://shop.sdmediagroup.com/.
Whether these e-zines are an intelligent idea or not remains to be seen. However, there’s little question that in the years to come, AI, by whatever name it goes by, will continue to be central to the world of computing.
Jonathan Erickson editor-in-chief
[email protected]
LETTERS
Binary Floating-Point Arithmetic
Dear DDJ,
In the past 20 years or so, the IEEE-754 Standard has brought about substantial improvements in the portability and reliability of programs that use binary floating-point arithmetic. But a lot has changed in 20 years, and we’ve realized that there are a few things in the Standard that could have been done better. One of them is the Signaling NaN (sNaN). Signaling NaNs were intended to extend the Standard in ways we could not foresee at the time. The idea was that sNaNs, together with carefully implemented versions of the optional trapping feature, would allow a user to implement new features in an economical and portable way for debugging or arithmetic extensions. Unfortunately, since traps were optional and each manufacturer chose to implement them in a slightly different way, sNaNs were not provided with the support they needed to flourish. Making traps both mandatory and portable is not feasible. The only remaining feature about sNaNs that could be counted on from one implementation to the next is that when an sNaN is encountered, it sets the Invalid flag and turns into a Quiet NaN (qNaN). Using this feature to extend the Standard turns out to be cumbersome and perhaps almost as easy to implement with qNaNs alone. For example, there have been attempts to fill uninitialized data with qNaNs containing identifying information sufficient to track down attempts to consume uninitialized variables. Beyond that, we know of very little that has been done with sNaNs, and none of it is portable. Further, recent informal inquiries have turned up no current use of sNaNs. In an attempt to simplify the Standard, we are, therefore, considering making sNaNs no longer mandatory with a view to their ultimate elimination. But we are concerned that doing so would violate the precedent established in 1985. We suspect the constituency counting on this one portable feature of sNaNs just might be empty. If so, the precedent serves no
one and the elimination of sNaNs would cause no harm and do much good. Therefore, to perform due diligence, we are asking the software community if there is any portable software out there that requires this one feature of sNaNs in order to function. We would also like to learn about nonportable software that requires sNaNs. If you are responsible for such software, please contact us at
[email protected] with a description of the purpose of the software and how it uses sNaNs to accomplish that purpose. Source code would help but is not necessary if the application is clear enough. We thank you for your help in settling this matter.
Dan Zuras
Chairman, IEEE-754R

GA in the Real World
Dear DDJ,
While reading “Genetic Algorithms & Optimal Solutions” by Michael Larson (DDJ, April 2004), I was surprised that Michael used a GA for this problem. That is because it seemed possible to directly test each combination of variables in less than the 20 seconds the author achieved; it also did not seem to be a good application of genetic algorithms. Out of curiosity, I tested the direct computation approach. My first attempt minimized processing in the innermost loop through QP2; i.e., Q+2, where Q is one of Michael’s control variables. My results were comparable to Michael’s. Wishing to get faster results, I converted my (object Pascal) code to inline assembler and replaced floating-point instructions with 64-bit integer multiplication and division instructions using the EDX:EAX register pair. While doing that, I ran into an overflow error. While resolving that error, I realized that my inner loop variable needed to be constrained to prevent overflow. That led to the more significant realization that I could directly compute an upper and lower bound of QP2 for the desired tolerance level for any combination of P, PO, and Divider. This reduced the number of iterations needed to directly test each potentially feasible combination of variables by a factor sometimes larger than 100. The result of these changes is that for all examples I tried, the computation of all feasible combinations of variables was completed in around 1 second on my 150-MHz Thinkpad 380D running Windows 98SE. Had the program stopped after finding the first feasible solution, the times would have been a lot smaller.
Even had the problem not been amenable to solution by direct computation, my sense is that this problem is still not a good candidate for solution by a GA. That’s because GAs should be used for problems where combining individual solutions, each
containing good building blocks, is likely to result in better solutions. For example, in the travelling salesman problem, it is plausible that combining the best segments of individually good solutions might result in a better solution having those segments. In contrast, in this problem, the better the initial solution, the more likely it is that changing part of it will result in an output frequency that is worse. Given the large number of acceptable solutions to the examples I tried, many of which had over a thousand solutions with a 0.01 percent tolerance level, it seems likely that just trying randomly generated 26-bit pseudorandom numbers would be better than using a GA. Several good pseudorandom number generators, such as those developed by Pierre L’Ecuyer, can be found on the Internet. Phil Troy
[email protected]

Quincy 2005
Dear DDJ,
I am an Australian lecturer in IT. I am currently “doing up” Al Stevens’ Quincy 2002 into a Quincy 2005 version with minor enhancements and fixes useful for my teaching. As the “spruced-up” version may be of use to other C/C++ teachers and learners around the world, I intend to make Quincy 2005 available on the Internet to anyone. A prototype web page is [available] at http://codecutter.net/tools/quincy/. The main changes are to the project page and the build options page. The binary release also contains MinGW and all the required libraries for development of simple graphics programs. This ensures that all students have the same compiler version at the start of semester, which simplifies the teacher’s task a lot. I still have a lot to do to fix and polish up Quincy for the start of semester, but the current version on that page can give a fair idea of the “visible” changes. The current source (built with Visual Studio .NET 2003) is also available there. Thanks for Quincy.
Jean-Loup Komarower
Swinburne University
[email protected]
Al responds: Jean-Loup, thanks for telling me about how you are using Quincy. That’s exactly how I hoped programmers and educators would use it. Let me know if you have any problems that I can help with, but realize that it has been two or three years since I looked at the code. I suggest returning the Windows version to the About dialog. It helped me isolate platform-dependent issues when users reported problems.
DDJ
DR. ECCO’S OMNIHEURIST CORNER
Jam Session
Dennis E. Shasha
Dennis is a professor of computer science at the Courant Institute, New York University. His latest books include Dr. Ecco’s Cyberpuzzles: 36 Puzzles for Hackers and Other Mathematical Detectives (W.W. Norton, 2002) and Database Tuning: Principles, Experiments, and Troubleshooting Techniques (Morgan Kaufmann, 2002). He can be contacted at [email protected].

The young man at the entrance to Ecco’s apartment turned out to be a Special Forces colonel. “Name is Carl,” he said introducing himself. He registered mild surprise to find Liane, Tyler, and me also there. “No matter,” he said. “Just don’t publish for two months.” Once I agreed (did I have a choice?), he began to speak: “Our spy in enemy territory has a weak transmitter that transmits 1 bit per millisecond simultaneously to two receivers, one east and one northwest. The enemy can’t detect our transmitter, but knows we are there, so has a jamming device that broadcasts noise in focused directions. The jamming device rotates counterclockwise every 10 milliseconds. When the jam signal intersects the spy’s signal in some direction, the jam signal may flip a bit going in that direction, but just one. Therefore, it can flip at most one in every 10 bits going east and a different bit (again at most one in every 10) going northwest.
“Our spy must be able to send reliable messages, so we plan to encode the transmission with redundant bits to recover each 10-bit signal without errors. We should be able to do something fairly efficient, given the fact that we have two receivers.”
Easy Warm-up: Suppose our only goal were to detect whether there were an error and we had just one receiver. How could we ensure detection using only one bit in every 10?
Solution to Easy Warm-up: Use the concept of parity. Allow 9 data bits and then one “parity” bit with the property that if there are no errors among the 10 bits, the number of 1s altogether will be odd. If the receiver counts the number of 1s and finds that the number is even, it has discovered an error.
Harder Warm-up: What if there were only one receiver? How many bits are necessary to correct against any single error in 10 bits sent? (Think about the information you would need to locate the error.)
Solution: We want an encoding that sends at least 6 data bits for every 10 transmitted bits. The other bits will be redundant, but will allow the correction of any single bit flip. We need 4 bits to correct any possible bit because the 4 bits can count up to 16 possibilities. This is more than sufficient to indicate which of the 10 bits has been flipped (10 possibilities) or whether none have been (11th possibility). If there were three or fewer check bits, there would be 8 or fewer configurations of check bits to use as a diagnostic.
“Here are the questions, gentlemen,” Carl paused looking at Liane, “and young lady.
“1. How would you use the four check bits in the case of a single receiver?
“Now back to the case of multiple receivers. Suppose that the jammer may flip bits going to the two receivers but these must be different bits. You are sending the same message to both receivers. Moreover, the receivers will combine their information.
“2. Suppose further the position of the different bits must differ by an odd number, e.g., 1, 3, 5, 7, 9. In this situation, can you safely send 8 data bits out of every 10 bits transmitted?
“3. What if the offset is known to be 4 bits?
“4. Can you do as well if you don’t know the offset?”
Dr. Ecco was able to solve all but the fourth one.
Reader Solutions to “The Luck of the 4”
Jeanne Boyarsky and Michael Birken both improved on my solution to The Luck of the 4. Jeanne showed that 36 could be computed with three 4s: 4×(4/.4R). Birken went further, finding a very suggestive “unfindable” phone number in the 315 area code. He also showed that 36 out of the first 40 numbers could be done with just three 4s. This included such remarkable insights as that 37 can be rendered as: ((4!)+(√[.4R]))/(√[.4R])=(74/3)×(3/2).
For the solution to last month’s puzzle, see page 95.
DDJ
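Returning to the two warm-ups in this month's puzzle, the parity check and the check-bit count can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the column: the function names (add_odd_parity, has_error, min_check_bits) are made up for the example, and the code does not attempt to answer questions 1 through 4.

```python
def add_odd_parity(data_bits):
    """Append one parity bit so the 10-bit word holds an odd number of 1s."""
    if len(data_bits) != 9:
        raise ValueError("expected 9 data bits")
    # If the data already has an odd count of 1s, the parity bit is 0.
    parity = 0 if sum(data_bits) % 2 == 1 else 1
    return data_bits + [parity]


def has_error(word):
    """An even count of 1s means a single bit was flipped in transit."""
    return sum(word) % 2 == 0


def min_check_bits(total_bits):
    """Smallest r such that 2**r covers one pattern per bit position
    plus one more pattern meaning 'no error'."""
    r = 0
    while 2 ** r < total_bits + 1:
        r += 1
    return r


# Hypothetical 9-bit message, chosen only for illustration.
word = add_odd_parity([1, 0, 1, 1, 0, 0, 1, 0, 1])
flipped = word.copy()
flipped[3] ^= 1  # simulate the jammer flipping one bit

print(has_error(word), has_error(flipped))  # False True
print(min_check_bits(10))                   # 4
```

The second function mirrors the counting argument in the harder warm-up: 10 possible error positions plus the no-error case make 11 outcomes, which 3 check bits (8 patterns) cannot distinguish but 4 check bits (16 patterns) can.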
Dr. Dobb’s News & Views
Sun Preps OpenSolaris
Sun has announced its intention to release an open-source version of the Solaris 10 operating system under its Common Development and Distribution License (CDDL). While a date has not been set for the OpenSolaris release (“Expect to see buildable Solaris code…in Q2 2005,” Sun declared), the company has opened the source to the performance profiling component DTrace as a gesture of goodwill (http://www.opensolaris.org/). While Sun claims that the OpenSolaris release provides access to 1670 Sun patents, questions have been raised by open-source developers. The nonprofit Public Patent Foundation warned that “the legal nitty-gritty behind the announcement shows that Sun has retained the right to aim its entire patent portfolio at GNU/Linux or any other free and open-source operating system, except, of course, for their soon to be released version of Solaris.”
RFID Implementations Found Insecure
A team of security researchers at RSA and Johns Hopkins University, including noted cryptographer Avi Rubin, is questioning the security of RFID systems used today in antitheft car keys and the ExxonMobil SpeedPass electronic payment program (http://www.rfidanalysis.org/DSTFAQ.htm). “Although RFID technology has venerable roots, it is one that has only recently begun to see widespread deployment,” the researchers explain. “RFID is being increasingly employed for user and device authentication, areas [that] have well-established, secure techniques in the ‘wired’ world. But a much lower level of security is being offered initially for these purposes in the wireless world. Arguably, wireless devices ought to have higher security because they are so readily available to unauthorized parties due to their ubiquitous and highly mobile nature.” Specifically, the team tested Texas Instruments’ Digital Signature Transponder, which they estimate is used in “more than 7-million cryptographically enabled keychain tags accepted at 10,000 locations worldwide.” They found that the TI DSTs use an unpublished, proprietary encryption algorithm with key lengths of only 40 bits. Using commercially available FPGAs, the researchers were able to crack the encryption in under two hours.
PHP Security Consortium Launched
The PHP Security Consortium (PHPSC), a group whose mission is to promote secure programming practices within the PHP community, has been officially launched. PHPSC (http://phpsec.org/) provides a variety of security resources for PHP developers, including the group’s flagship project, the PHP Security Guide, an online book for PHP developers that covers some of the most common security concerns.
Vanderpool Comes Early
Intel is rolling out its new virtualization technology, code named “Vanderpool,” a year earlier than planned. It will be included in new desktop processors later this year. Preliminary specifications are available at http://www.intel.com/technology/computing/vptech/. The current specs address IA-32 and Itanium architectures; next year, Intel plans to incorporate Vanderpool technology into Xeon servers and mobile chips. Intel claims the new technology “will allow a platform to run multiple operating systems and applications in independent partitions…” Vanderpool will also support other Intel initiatives, such as “LaGrande,” the code name for the trusted computing architecture that Intel is developing in partnership with Microsoft.
Tech Investments Return
Investments in technology start-up companies are on the rise. Venture capital firms pumped nearly $21 billion into new high-tech companies in 2004, according to one estimate. The MoneyTree Survey (http://www.pwcmoneytree.com/moneytree/index.jsp), a quarterly report on venture capital investment activity in the U.S. compiled by PricewaterhouseCoopers, Thomson Venture Economics, and the National Venture Capital Association, found that 2004 saw the first increase in technology start-up investments since 2000. Another survey, the U.S. Venture Capital Report from Ernst & Young and VentureOne (http://www.ey.com/global/content.nsf/US/Media_-_Release_-_01-21-05DC) corroborated those results. “After three consecutive years of decline, U.S. venture-capital investment increased in 2004, with $20.4 billion invested in 2067 deals,” Ernst & Young announced. “By region, the San Francisco Bay area continued to garner the most venture-capital activity in the nation, with 638 deals and $7.1 billion invested in 2004, or about 31 percent of all deals and 35 percent of the capital invested.”
DR. DOBB’S JOURNAL April 1, 2005
Welcome to the Age of Fighting Robots
After countless appearances in science-fiction movies, books, and TV shows, weaponized robots are now a reality. Eighteen remote-controlled, track-wheeled, three-foot-tall robots are to be deployed by the U.S. Army in Iraq. The SWORDS (Special Weapons Observation Reconnaissance Detection Systems) can be configured with M240, M249, or Barrett 50 caliber guns, and also come with alternate mounts for a 40-mm grenade launcher or rocket systems. According to manufacturer Foster-Miller (http://www.foster-miller.com/lemming.htm), the robots can operate day or night in sand, snow, mud, rocky ground, or underbrush at speeds of up to four miles per hour. Their lithium ion batteries are good for up to four hours. The robots are controlled through radio-frequency remote control or a fiber-optic link from a 30-pound, briefcase-sized operator control unit.
Grid Consortium Formed
IBM, Sun, Hewlett-Packard, and Intel have joined forces to promote open-source grid development efforts by forming the Globus Consortium (http://www.globusconsortium.com/), which will define specifications for the Globus Toolkit. A major goal of the Globus Consortium is to work with industry, developers, and standards bodies to define the Grid. The group is also committed to making the software infrastructure of the Grid available under an open-source license. As a vendor-neutral, not-for-profit organization, the Globus Consortium will act as a central clearinghouse where vendors, customers, and developers can collect, organize, and prioritize requirements to define the Globus Toolkit. For information on the Globus Toolkit, see “Creating Java Grid Services,” by Aaron E. Walsh (DDJ, September 2003).
Dr. Dobb’s Journal 2005 Excellence in Programming Award
JONATHAN ERICKSON
The Dr. Dobb’s Journal Excellence in Programming Award is annually bestowed on individuals who, in the spirit of innovation and cooperation, have made significant contributions to the advancement of software development. Past recipients include:
• Alexander Stepanov, for his work on the C++ Standard Template Library.
• Linus Torvalds, a name synonymous with Linux.
• Larry Wall, author of Perl.
• James Gosling, chief architect of Java.
• Ronald Rivest, educator, author, and cryptographer.
• Gary Kildall, for his work in operating systems, programming languages, and user interfaces.
• Erich Gamma, Richard Helm, John Vlissides, and Ralph Johnson, authors of Design Patterns: Elements of Reusable Object-Oriented Software.
• Guido van Rossum, Python creator.
• Donald Becker, Linux networking contributor and chief investigator of the Beowulf Project.
• Jon Bentley, computer science author and researcher.
• Anders Hejlsberg, developer of Turbo Pascal and architect of C# and the .NET Framework.
• Adele Goldberg and Dan Ingalls, pioneers in Smalltalk and object-oriented programming.
• Don Chamberlin, a database researcher and coauthor of SQL.
• P.J. Plauger, a longtime champion of the C/C++ programming languages.
This year’s recipient, Guy L. Steele Jr., is unique in that he’s receiving this award not necessarily for his focus on a specific language, tool, or operating system, but for the breadth of his contributions over the years. To give you a sense, Steele is the author or coauthor of books on Lisp, C, Fortran, and Java. Along with James Gosling and Bill Joy, he wrote the original specification for Java. He is credited as cocreator (with Gerald Sussman) of the Scheme language. Steele has been awarded more than 30 patents on technologies ranging from network configuration to floating-point calculations. He designed the original EMACS command set and was the first person to port TeX.
Steele has served on accredited standards committees for C, Fortran, Common Lisp, and Scheme, and is the author of dozens of technical papers, including the influential “Lambda Papers” (cowritten with Sussman) that questioned common assumptions about programming practices and language implementations, as well as papers on topics such as compilers, parallel processing, and constraint languages. That said, his legacy may well be the legendary “The Telnet Song (‘Control-Uparrow Q’),” a humorous song dedicated to late-night hacking originally published in the Communications of the ACM (http://portal.acm.org/citation.cfm?id=1035691). In addition, Steele has found time to serve on Ph.D. thesis committees for numerous students and chair a number of ACM conference programs.
His contributions over the years have not gone without recognition. The Association for Computing Machinery awarded him the 1988 Grace Murray Hopper Award and named him an ACM Fellow in 1994, and he was elected as a Fellow of the American Association for Artificial Intelligence. He led the team that received a 1990 Gordon Bell Prize honorable mention for achieving the fastest speed to that date for a production application, 14.182 Gigaflops. He was also awarded the 1996 ACM SIGPLAN Programming Languages Achievement Award. In 2001, he was elected to the National Academy of Engineering of the United States of America. In 2002, he was elected to the American Academy of Arts and Sciences.
Steele currently is a Sun Fellow, Distinguished Engineer, and Principal Investigator at Sun Microsystems Laboratories, where he is responsible for research in programming languages, parallel algorithms, implementation strategies, and architectural and software support. In general, Steele’s Programming Language Research group is working to increase programmer productivity through improvements to existing programming languages and possibly the design of “the next Java.” This includes applying the lessons learned from Java to the next generation of programming languages.
Steele earned a Bachelor’s Degree in applied mathematics at Harvard College, and a Master’s and a Ph.D. in computer science and artificial intelligence at the Massachusetts Institute of Technology. He has also been an assistant professor of computer science at Carnegie Mellon University, a member of technical staff at Tartan Labs in Pittsburgh, Pennsylvania, and a senior scientist at Thinking Machines. He joined Sun Microsystems in 1994.
At his request and in his name, Dr. Dobb’s Journal is pleased to make a grant of $1000 to an educational program of his choice. Please join us in honoring Steele’s contribution to the art and science of software development.
Jonathan is editor-in-chief of DDJ and can be contacted at [email protected].
Dr. Dobb’s Journal, April 2005
DDJ
A Conversation With Guy Steele Jr.
A Sun Fellow & Distinguished Engineer takes time out to chat
JACK J. WOEHR
Photography by Howard Friedenberg.

DDJ: What was it like getting a degree at Harvard in 1975?
GS: Interesting, because I was also involved at MIT at the same time. I actually had a very unusual opportunity. After graduating from Boston Latin School, I applied to Harvard and MIT and Princeton and accepted Harvard’s offer. Then two months later, I heard that William Martin [http://www.cdm.csail.mit.edu/people/wam.html] was looking for LISP programmers at MIT. As a teenager I had hung out at the MIT AI lab, so I had a pretty good idea who Martin was. I went and applied for the job and I got it.
Jack is a DDJ contributing editor and independent consulting team mentor residing in Fairmount, Colorado. He can be contacted at http://www.softwoehr.com/softwoehr/.

DDJ: How did you get it?
GS: I was naïve enough to go over there on the Fourth of July, put my head in Bill Martin’s door, and say, “I hear you’re looking for LISP programmers.” I wasn’t 18 yet. Bill looked at me seriously and said, “You’ll have to take my LISP quiz.” He reached in a file drawer and pulled out a three- or four-page quiz and sat me down in his office, and I spent an hour or two doing this quiz. He graded it and said, “You’re the first person who has ever gotten it 100 percent right. You’re hired.”
DDJ: How did you know that much about LISP?
GS: MIT has a program where college students teach high-school students on Saturday mornings. I’d been involved in that for three or four years. I was familiar with MIT’s facilities because the people there were sort of tolerant of young kids hanging around the computer labs. And I’d read the blue-and-white book [John McCarthy et al., LISP 1.5 Programmer’s Manual, MIT Press, 1962]. During my senior year at Latin School, I coded a complete implementation of LISP for the IBM 1130 from scratch. I had a pretty good idea what was going on.
So I was hired programming LISP at MIT and started full-time in the summer of 1972. That fall I entered Harvard. I arranged my classes so I’d spend mornings at Harvard and my afternoons working part-time at MIT. The beautiful thing is that I had access to two separate faculties with two different points of view of computer science.
DDJ: What was the difference?
GS: I’d say that at Harvard I found a bit more emphasis on theory and at MIT a bit more of an emphasis on the pragmatics of getting things done. I’d listen to what my professor said at Harvard, and in the afternoons I might talk to someone at MIT: “At Harvard, they say this.” “Oh, that’s not right. This is…” The next day, I’d go back to Harvard, “This is what the guys at MIT say…” It also worked out financially. I made just about enough money at MIT to pay for my tuition at Harvard!
DDJ: Did you actually learn Latin at Boston Latin School?
GS: Five years with an option for six. But instead of the Classics track, I took the Science track, so I took physics instead of Latin in the sixth year. I also took four years of German.
DDJ: And you play chess. And you sing. There are more than a few computer programmers who are into languages, chess, and music!
GS: Common themes programmers are interested in.
DDJ: After school, what did you do?
GS: After I graduated from MIT I took a professorship at Carnegie Mellon because I decided I wanted to learn to build compilers at the feet of Bill Wulf [http://www.cs.virginia.edu/~wulf/resumes/resume.html]. Very shortly after I arrived, he left to start a compiler company called Tartan Laboratories. I worked at Carnegie Mellon for about two-and-a-half years, and among other things, did the first draft of the Common LISP specification. But I decided I still really wanted to learn to build compilers at the feet of Bill Wulf, so I took a leave of absence and went to work at Tartan Labs for a very good two years, and learned about compilers. We developed wonderful technology so that we could retarget compilers quickly to new architectures as they came out.
As my leave of absence was ending, I was still involved at Carnegie Mellon and showed up for a seminar given by Danny Hillis [http://www.edge.org/digerati/hillis/] about the CM-1 Connection Machine [http://mission.base.com/tamiko/cm/]. He went on quite enthusiastically about artificial intelligence and neural networks but thought the CM-1 was less suited for numerical programming. Whenever you tell me that you can’t make a computer do something, it’s a red flag. I believe the Turing hypothesis. The only question is the efficiency of the calculation. I believe in crunching the numbers, doing the calculations, and finding out whether an assertion is true or not. So while Danny finished the lecture, I scribbled calculations on the back of an envelope. Okay, suppose you did the arithmetic bit serially, like the PDP-8S? For a 32-bit floating-point multiply, you have to multiply two 24-bit significands; that’s going to take 600 time steps. For the additions, a logarithmic number of shifts, each will take 24 time steps…It turned out that a floating-point add would take about half a millisecond and a multiply a millisecond. Sounds terrible, but you’ve got 64,000 processors.
DDJ: So you could do 64,000 of these calculations in a millisecond.
GS: That’s up in the range of 100 million operations a second, equivalent to a Cray 1, which was the supercomputer of the day. So after the lecture, I went up to Danny and said, “You said this is a terrible numeric machine, but I can tell it’s approximately the equivalent of a Cray 1.” And he replied, “Can we have dinner tonight?”
DDJ: And from then until you came to Sun Microsystems in 1994, you were at Thinking Machines Corporation.
GS: That was a lot of fun. I was there for just shy of 10 years. I was initially in charge of software development. I moved aside into a programming language development role when we hired on another manager. I’m really more of a technologist than a manager.
DDJ: Whatever happened to the Connection Machine?
GS: Thinking Machines went Chapter 11 in 1994. No one to my knowledge is still using the technology.
Abstractly, the ideas behind the Connection Machine have permeated the industry and are well understood at IBM, Cray, and Sun.
DDJ: How do those ideas find their representation today?
GS: I’m hesitating, because after arriving at Sun, I was not active in high-performance computing for about eight years. I’m back into it now because of the High Productivity Computing Systems Project [http://research.sun.com/sunlabsday/docs/talks/1.01_Mitchell.pdf], which emphasizes overall productivity, not just performance.
DDJ: What’s the difference?
GS: Performance means shortening the time for the computer to solve the problem. Productivity means shortening the time between posing the problem to the programmer and having the answer. It’s important to make the programmer efficient as well as the program.
DDJ: And how does one do that?
GS: We have ideas for improvements to programming languages which might make programmers more productive by relieving them of more of the mechanical and administrative burdens of getting a program to work. We’re looking at automated testing, more rigorous type systems, new languages. At Sun, we’re looking at the conjecture that making programming languages closer to traditional mathematical notation can make things easier for the scientific programmer.
DDJ: APL or Mathematica?
GS: APL looks rather strange to a working mathematician or physicist. Fortran is a little bit like math, but not a lot. Where did the asterisk for multiplication come from? Fortran was invented on machines intended for accounting. What if we tried very hard to make a programming language look like mathematics and took advantage of Unicode? We’re finally getting good support for full mathematical character sets in text editors. We’re designing a programming language called “Fortress.” Kind of a takeoff on Fortran…
DDJ:…wedded to the zeitgeist of our troubled times!
GS: Well, we did have in mind a programming language with greater security through a stronger type system.
We’re trying to take some of the security features of Java, and mathematical notation, and good ideas already in Fortran and roll them all together.
DDJ: How did you end up at Sun?
GS: A number of companies sent interviewers to Thinking Machines. Sun made job offers to several dozen employees and I was one. I was hired as a Distinguished Engineer, which is the second-highest rung on the technical ladder at Sun. The top rung is Sun Fellow, to which I was promoted about two years ago [on March 6, 2003].
DDJ: Your citation from Dr. Dobb’s describes you as a “relatively quiet yet influential contributor to the world of programming,” citing your many publications and remarking on the breadth and depth of your contributions.
GS: I’ve had the good fortune to be in the right place at the right time on a number of projects. Also, I like to write and many engineers don’t, so I’ve often been called upon to write the English specification of a programming language: Common LISP, Scheme, Java, High Performance Fortran.
DDJ: How’s Scheme doing these days, and who is using it?
GS: There’s a committee actively working on revising the specification for this millennium. I’m on the steering committee, not actively involved in the editing process. Scheme is used pretty heavily in universities, and is one of the working programming languages for research conferences, such as ACM’s Principles of Programming Languages Conference. Scheme is particularly good at modeling things that programming language designers worry about, such as the scopes of variables. For some purposes, Haskell has taken over the role Scheme used to have.
DDJ: Regarding variable scoping, Dr. Ron Rivest, with whom I spoke some years ago [“A Conversation with Ron Rivest”, DDJ, September 1997; http://www.ddj.com/articles/1997/9710/], said that Java had some difficulties with closures.
GS: If your programming style involves extensive use of closures, then Java is syntactically a lot more awkward. The design of the language does not make it particularly easy.
DDJ: Design flaw in Java? Tradeoff?
GS: There was a conscious design decision for object-oriented rather than closure-oriented programming. Interestingly, when anonymous inner classes were introduced into Java, we had a full implementation that made them act like closures. In particular, if you did up-level references to variables, you could assign as well as read. We got push-back from users, “We don’t want this, we prefer an implementation in which you can only read the up-level variables.” In order to support the full-blown closure implementation, it was necessary to do heap allocation implicitly. At that time, the users were still a little nervous about heap allocation and garbage collection. They felt more comfortable if places where heap allocation took place in Java were always explicitly flagged by the new keyword. Nowadays, heap allocation in Java is better understood and that feature could be added easily, but there’s no call for it. I have a theory that programming languages have lifetimes. Java will probably be around another 20 years. It’s time for a new programming language to come along.
DDJ: Fortran was called “automatic programming,” then we had “structured programming,” then “declarative programming,” “functional programming,” “object-oriented programming”…Is Fortress introducing some new model? Or is it just a new object-oriented language?
GS: There’s a difference between a new programming language and some advance that creates a major paradigm shift. Fortress has a new object-oriented type structure, multiple inheritance, and the traits idea [S. Ducasse, N. Schaerli, O. Nierstrasz, R. Wuyts, and A. Black, “Traits: A Mechanism for Fine-Grained Reuse,” Transactions on Programming Languages and Systems, 2004; http://www.iam.unibe.ch/~scg/Archive/Papers/Duca04wtoplastraitnotfinal.pdf]. Traits are a way of getting better code reuse in object-oriented programming. Imagine Java interfaces but you can put code in them: not field declarations, just code.
DDJ: Does Fortress look like Java and C++?
GS: We’re doing a research prototype rather than a product prototype, so we feel free to try out a bunch of crazy ideas. Syntactically, we’re trying more to be inspired by mathematics than by Java. The object organization is Java-inspired. Array handling is Fortran and APL inspired. Other influences are MATLAB and Mathematica. We’re also trying to add checking of dimensional units such as kilograms, meters, and feet. We had a paper [E. Allen, D. Chase, V. Luchangco, J.W. Maessen, G.L. Steele Jr., “Object-Oriented Units of Measurement,” OOPSLA 2004, http://www.pag.csail.mit.edu/reading-group/allen04measurement.pdf] in the last OOPSLA about that.
DDJ: How does one compose a Fortress program? Rich keyboard? Dragging sigmas in a visual editor?
GS: We’re hoping to do some user productivity studies on that open question. We’ve got an ASCII encoding of the language, and also a presentation form for an IDE using Unicode. We’re in the middle of a three-year research project and haven’t ruled anything out.
DDJ: Is Fortress going to be open source?
GS: As a research issue, we are thinking very hard about organizing a language that would support open-source growth.
DDJ: Coming up through academia, it would be easy to adopt the contemplative life. You seem to be a very active person.
GS: I’m interested in being involved. In particular, I was interested in raising a family, and have just become an empty-nester as my youngest, Matthew, enters MIT. Now I’m trying to assess my life and allocate the balance of my time. I enjoy my work at Sun very much, but maybe I’ll do more singing again!
DDJ: Did the Golden Age of Programming end with the 20th century?
GS: The set of concerns has certainly changed. I’m reluctant to identify what I did when I was young as the Golden Age. I don’t see myself as an old fogey. I’m very excited about what’s going on now. The things that excited me back then, like making the best use of every last bit in a word, are less important nowadays. There are wonderful bit-hacking algorithms in Hacker’s Delight [Henry S. Warren, Jr., Hacker’s Delight, Addison-Wesley 2004, ISBN 0201914654]. I’m glad someone put them together. Compiler writers need to know this stuff, but not everybody.
DDJ: It seems to me that the best programmers I hire do know computing machinery down to the bits and gate level, often from embedded control experience. People who start with high-level languages frequently get frustrated because they don’t understand why certain limits impinge on their perfect world of high-level language.
GS: I had the opportunity to take courses which tackle all levels of how a computer works, from high-level languages down to quantum mechanics. I’ve taken signals and systems courses, basic electrical engineering courses, architecture courses…I had the good fortune to be in on the first VLSI class that Lynn Conway [Carver Mead and Lynn Conway, Introduction to VLSI Systems, Addison-Wesley, 1980 ISBN 0201043580] taught at MIT. The very best programmers will be so fanatical that they spent their youth learning all this stuff top-to-bottom. But people have different balances and strengths. You can’t expect every programmer to have done this.
DDJ: Some people go through life tossing out visions like rice at a wedding, others seem to pick up the individual grains of rice and examine them under a microscope and actually make the vision work.
GS: And I’m more the second kind of guy. On the Connection Machine, Danny Hillis had a grand vision for a version of LISP that could operate on vectors. I sat down and cranked through the mechanics of “What does that mean for how the if statement works, what does that mean for how lambda behaves?” Gerald Sussman [http://library.readscheme.org/page1.html] was the great visionary on Scheme. I made sure the parts fit together and made it work. I’m in the best position at work when I have a visionary and a manager to work with. What I’m best at is, when an idea comes from somewhere, combinatorially working through the consequences of it and making sure the details are nailed down. Not that I truly compare myself to either, but I’m more like Edison than Einstein.
DDJ
Collaborative Web Surfing
A peer-to-peer application for surfing the Web together
GIGI SAYFAN
Gigi is a software developer specializing in object-oriented and component-oriented programming using C++. He can be contacted at gigi_sayfan@playstation.sony.com.

Cosurfer is a peer-to-peer (P2P) GUI application that lets two users chat and surf the Web together. Cosurfing means that whenever one user browses to a new URL, the other user’s browser is automatically updated so they are “always on the same page,” literally. Users typically run Cosurfer on different machines, although during development I run two Cosurfers on the same machine for testing purposes. Cosurfer has a GUI, connection manager, browser component, and Cosurfing engine that orchestrates all the other components. The complete source code and related files for Cosurfer are available electronically; see “Resource Center,” page 5.
Cosurfer involves a number of aspects of .NET programming, including:
• Windows Forms programming.
• Socket-level networking.
• Asynchronous method calls.
• Multithreaded programming.
• Events and Delegates.
• XML parsing and composition.
• COM interop.
In addition, Cosurfer explores the Internet Explorer (IE) programming model quite extensively. When you launch Cosurfer, it automatically starts listening on port 8888. If you launch a second instance, it starts listening on port 8889. If you click on the Connect button on one of the forms, a TCP connection is established. In addition, each instance launches a new IE browser with a blank page. From this point on, the two Cosurfers function like Siamese twins. Whenever users navigate to a new URL or frame inside one of the browsers, the other browser follows. Moreover, you can exchange witty chat lines with your friend (or yourself) through the chat facility. The trick is to hook into IE and monitor it closely. Every change is encoded in a proprietary XML dialect and reported to the peer Cosurfer, which updates the attached browser. It is not enough to send only URLs because frame-based web pages might have arbitrarily nested structure.
IE Programming
When I originally wrote Cosurfer, Internet Explorer was a web browser that was starting to show its age. It hadn’t been updated for years (except for security patches). Nevertheless, it exhibited (and still does) total configurability and flexibility through a variety of methods. You can control some aspects of its operation through command-line switches and other aspects through registry entries; you can customize its appearance; you can host it in your application (with or without UI); and you can hook into its events and intercept and modify almost anything it does. It provides several ways to attach your code to running
instances and also an elaborate DHTML object model. In this article, I show how to launch a new instance of IE, hook into its events, control its navigation, and drill down the HTML DOM.
“Cosurfer is made up of a couple of generic components that can be used as-is in other applications”
ShDocVw.dll, located in \System32 and commonly known as “the WebBrowser control,” contains various shell COM objects, interfaces, and enumerations. One object is InternetExplorer, which represents an instance of a standalone IE application. You can use the OLE/COM Object Viewer tool
(located in \Common7\Tools\oleview.exe) to investigate it. The type library, called “Microsoft Internet Controls Version 1.1,” contains 18 interfaces, nine objects (coclasses), and eight enumerations. Most interfaces are dual and contain around 10 methods each. Luckily, you don’t have to tackle the raw complexity head on. As you can see in Figure 1, the InternetExplorer object implements two interfaces (IWebBrowser2 and IWebBrowserApp) and has two outgoing dispinterfaces, DWebBrowserEvents and DWebBrowserEvents2. The dispinterfaces are event interfaces that the client code implements to receive events from InternetExplorer. DWebBrowserEvents exists only for compatibility with code that automates Internet Explorer 3 (can you believe it?). You can
safely concentrate on DWebBrowserEvents2. Listing One (also available electronically) contains the Puppeteer class, which controls IE. It demonstrates how to create an instance of the InternetExplorer application, hook into two events (Quit and NavigateComplete), and navigate to a URL (Google in this case). The AutoResetEvent m_quitEvent is initialized to false, and the Run( ) method waits for it to be signaled. This happens when the Quit event is received from the browser.
MSHTML.dll (located in \System32) is responsible for HTML parsing and rendering, as well as exposing the Dynamic HTML Document Object Model (DHTML DOM). It is usually hosted by ShDocVw.dll, although it can be hosted directly by any application. If you thought the ShDocVw.dll contained many
interfaces and objects, think again. MSHTML.dll has an intimidating number of objects and interfaces: 147 enums and typedefs, 121 objects, and more than 400 interfaces! The good news is that you will only need to work with a few interfaces most of the time. Listing Two (available electronically) shows how to get the Document object (which is really an IHTMLDocument2 interface) and use its all collection to iterate over all the elements in the document. Once you receive the DocumentComplete event, the DHTML object model can be accessed safely. (Before using ShDocVw.dll and MSHTML.dll, make sure they are referenced by the assemblies that use them. If they are not, you should import them using Add Reference… from the context menu of assembly references.)
HTML Documents and Frames
So far, this sounds simple. The browser finishes loading a document and sends the DocumentComplete event. In your code, you respond by getting the Document property and starting to use the object model. However, frames introduce some complications. For starters, there are two types of frames: regular frames embedded in a frameset, and iframes that are embedded anywhere inside another document. An HTML/XHTML document should have either a frameset or body element. If it has a frameset element, it means that the document is actually divided into multiple frames or more nested framesets; see Figure 2. Each frame contains a separate document with its own URL. Of course, such a framed document may have a frameset that contains more frames. If it has a body element, then it may contain (in addition to regular elements) one or more iframe elements, which are simply HTML documents embedded in the middle of the body. The frameset.html document represents such a complex document. Here is its structure:
Figure 1: Cosurfer UI.
frameset.html
    frame_1.html
    frame_2.html
    frame_3.html
        www.google.com
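The nesting above is a tree of documents, which suggests handling it recursively. The following is a minimal Python sketch of that idea, not the article's C# and not the MSHTML object model: the Doc class and the walk function are hypothetical names introduced only for illustration, and the tree is built by hand rather than read from a browser.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Doc:
    """A loaded document; `frames` holds the documents nested inside it.

    Hypothetical type for illustration only (not an MSHTML interface).
    """
    url: str
    frames: List["Doc"] = field(default_factory=list)


def walk(doc: Doc, depth: int = 0) -> Iterator[Tuple[int, str]]:
    """Yield (depth, url) for the document and every nested frame, depth-first."""
    yield depth, doc.url
    for child in doc.frames:
        yield from walk(child, depth + 1)


# The structure shown above: three frames, the last one embedding an iframe.
page = Doc("frameset.html", [
    Doc("frame_1.html"),
    Doc("frame_2.html"),
    Doc("frame_3.html", [Doc("www.google.com")]),
])

# Print each URL indented by its nesting depth.
for depth, url in walk(page):
    print("    " * depth + url)
```

Printing with indentation reproduces the outline shown above. A real implementation would obtain each nested document from the browser instead of constructing it by hand.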
The frameset.html document has no content of its own; it just “hosts” frame_1, frame_2, and frame_3. frame_3.html contains an iframe element that points to Google’s main page. If you search through Google and browse to another framed document, then the structure becomes even more complicated.
Figure 2: Frames.
IHTMLDocument2 provides the frames collection that is supposed to contain all the frame elements in a document with a frameset element, or all the iframe elements in a document with a body. Unfortunately, it doesn’t work. The debugger claims that the frames collection is indeed of type mshtml.FramesCollection, but the value is . I recall that, even in the old days, when I tried to manipulate the DHTML object model from C++, the frames collection never worked properly. Fortunately, the all collection also contains all the frame elements. Checking the documentation, you find that there are numerous frame-related interfaces (IHTMLFrameBase, IHTMLFrameBase2, IHTMLFrameBase3, IHTMLFrameElement, IHTMLFrameElement2, IHTMLFrameCollection2, IHTMLFramesetElement, and IHTMLFramesetElement2), most of which deal with various visual properties of the frame elements or containment relationships. When drilling down the DHTML object model of a nested document, you would like to recursively acquire the document object inside each frame. Fortunately, all the various flavors of frame elements implement the IWebBrowser2 interface from which you can get to the document object.
Another complication related to multiple frames embedded in a frameset is that the order of the DocumentComplete events is not deterministic, so when you get a DocumentComplete event, it is not clear from what frame it originated (especially if some frames have the same URL). A simple way to deal with it is to drill down recursively from the top-level document whenever any DocumentComplete event is received. Listing Three (available electronically) contains a different OnDocumentComplete( ) handler that employs this technique.
.NET Socket Programming
The Base Class Library (BCL) provides communication APIs spanning diverse protocols, technologies, and levels of abstraction. The Socket API (popularized by BSD sockets) is where the rubber meets the road. At this level, you send/receive raw byte buffers.
The traditional socket programming model is blocking, which means that the application is blocked until some network event happens (data is received or a connection is accepted, for instance), unless the application programmer runs the socket code in a separate thread. There is also a polling API, where the app can check if some network event has happened and handle it or carry on with its business if not. Finally, there is asynchronous I/O, where the application receives a notification when a network event happens. All these APIs and programming models make the sockets API complicated. There are numerous options you
can set; some functions should be used only in certain situations and only with other functions. Now, I’ll demonstrate how to work in asynchronous mode with the Socket class using the TCP protocol. The System.Net.Sockets namespace contains all the interesting types. System.Net contains some helper types that are useful, too. An echo server is the canonical communications “Hello, World” program where the server simply sends back to the client whatever it gets. Listing Six (available electronically) contains the
entry code to the app that simply creates TcpEchoServer and TcpEchoClient objects, tells the server to listen on port 6666, and tells the client to connect to the server. The client and server take it from there, exchanging messages. The server (Listing Seven; available electronically) binds itself to a port and starts listening for incoming connections. Immediately after calling Listen( ), the server should be ready to accept connections. The Socket class provides an asynchronous method pair BeginAccept( )/EndAccept( ) and TcpEchoServer uses it. Once a connection has been accepted, a new socket is created, and through this socket, the server sends/receives data to/from the client. While most servers handle many clients, in this code (and in Cosurfer), there is always just one client (peer). The client (Listing Eight; available electronically) model is different. It connects to the server, which resides at some well-known EndPoint (IP + port). Once connected, the client starts sending messages to the server using Send( ) and waits for responses using OnReceive( ), which is the asynchronous callback function. The server works in a similar fashion: it waits for messages using OnReceive( ) and responds by sending them back using Send( ). Both the server and client may close the connection at any time.
The Socket class works with raw byte arrays while most applications exchange some combination of text and binary data. It is necessary to translate back and forth between byte arrays and your data structures. System.Net.Sockets contains the two helper classes TcpListener and TcpClient, which take much of the drudgery away. They provide a thin veneer on top of the Socket class and are inherently thread safe. The classes wrap the buffer and expose a stream abstraction to Read( ) and Write( ), which takes care of the necessary transformations and buffer recycling. The TcpClient also provides a friendly Connect( ) method overload that accepts a string URL so you don’t have to painstakingly create an IPEndPoint that the Socket.Connect( ) method requires. There is also a UdpClient for datagram protocol communications (no connections and no server here). I chose to use the raw Socket class because I wanted to investigate the nuts and bolts and have full control and asynchronous operation. While the TcpClient class is probably appropriate for most purposes, the TcpListener is a little weak. It doesn’t have BeginAccept( )/EndAccept( ) methods, only Accept( ) and a Pending property to poll for incoming connections. Another major omission is that it lacks the Select( ) method, which is an absolute must for servers that handle a lot of traffic. I will not go into all the intricacies here, but in short, Select( ) is needed to handle multiple connections without creating a separate thread for each connection (which will bring a machine to its knees after several hundred connections).
.NET XML Programming
XML is a .NET technology pillar (borrowing a catchphrase from Longhorn nomenclature). Several other core .NET technologies (configuration files, ADO.NET, and web services) rely on XML.
I leave it to you to decide whether it is wise to tie your platform so tightly to a verbose textual format. In any event, XML is standard, ubiquitous, and hot, and it is the answer for platform- and language-neutral structured data exchange across any boundary. Of course, no one can keep track of all the XML languages, technologies, and metatechnologies that pop up everywhere. The .NET Framework seems to gather the most important XML-derived standards (XPath, XML Schema, XSLT, and so on) under its wings. The XML facilities in the BCL include parsing, composition, validation, navigation, and transformation. Here, I only explore XML parsing and composition because that’s what I use in Cosurfer.
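To make “parsing and composition” concrete, here is a minimal sketch in Python rather than the article’s C#. It assumes that `xml.dom.minidom` is an acceptable stand-in for a DOM-style parser such as XmlDocument, and that `xml.sax.saxutils.XMLGenerator` is a reasonable analogue of a streaming writer such as XmlTextWriter. The Document/Frame/Url vocabulary is borrowed from Example 3; the helper names `compose_frames` and `frame_urls` are hypothetical.

```python
import io
from xml.dom import minidom
from xml.sax.saxutils import XMLGenerator

BASE = "http://www.w3schools.com/tags/"


def compose_frames(top, children):
    """Stream out <Document><Frame Url=top><Frame Url=child/>...</Frame></Document>."""
    out = io.StringIO()
    w = XMLGenerator(out, encoding="utf-8", short_empty_elements=True)
    w.startElement("Document", {})
    w.startElement("Frame", {"Url": BASE + top})
    for page in children:
        w.startElement("Frame", {"Url": BASE + page})
        w.endElement("Frame")
    w.endElement("Frame")
    w.endElement("Document")
    return out.getvalue()


def frame_urls(node, depth=0):
    """Walk the DOM recursively and collect (depth, Url) for every child element."""
    found = []
    for child in node.childNodes:
        if child.nodeType == child.ELEMENT_NODE:  # skip text and comment nodes
            found.append((depth, child.getAttribute("Url")))
            found.extend(frame_urls(child, depth + 1))
    return found


# Compose a document, parse it back, and print the frame tree.
xml_text = compose_frames("planets.htm", ["venus.htm", "sun.htm", "mercur.htm"])
root = minidom.parseString(xml_text).documentElement
for depth, url in frame_urls(root):
    print("  " * depth + url)
```

The round trip (compose, then parse and walk) is deliberate: it checks the writer against the parser without depending on the exact whitespace either one produces.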
The two most common programming models for parsing XML are DOM (Document Object Model) and SAX (Simple API for XML). DOM parses an entire XML document and creates a tree structure in memory, which it returns for the calling code to manipulate. The XmlDocument from the System.Xml assembly provides conforming DOM level 1 and level 2 implementations. SAX streams through the document, raising events whenever it encounters an element, processing instruction, or attribute. It’s the application’s responsibility to handle all the appropriate events and there is no going back. It is also known as the “push model.” The BCL provides an interesting alternative model. The XmlReader-derived classes (XmlTextReader, XmlNodeReader, and XmlValidatingReader) implement a pull model,
which is a combination of DOM and SAX. The idea is to provide an efficient read-only, forward-only, noncaching parser. Instead of firing events automatically as SAX does, the XmlReader model lets the application pull more content whenever it is ready. It is also possible to skip the children of the current node. This way, whole branches may be pruned and a lot of processing is avoided.

In Cosurfer, I only use the XmlDocument’s DOM interface. Listing Nine (available electronically) demonstrates how to load an XML buffer from a string into a new instance of XmlDocument, get the root element (DocumentElement), and then iterate over all the children recursively and print the value of an attribute. The code reads just like plain English, which is a sign of a good interface (credits go to W3C for designing the DOM interface).

Parsing XML and traversing the DOM (or using XmlReader) is what you need most of the time. However, composing XML documents dynamically is also often necessary. You can do it by creating a new XmlDocument and then creating nodes and appending them, but there is an easier way: the XmlTextWriter is the best tool for the job. XmlTextWriter has several constructors that take one of various outputs (a stream, text writer, or filename). It has numerous WriteXXX methods to write elements, attributes, processing instructions, and whatnot. Some of them come in pairs, as in WriteStartDocument( ) and WriteEndDocument( ). It tries very hard to make you write well-formed XML. For example, if you write multiple XML processing instructions or if you don’t have exactly one root element, it raises an exception (unless you are writing an XML fragment).

I noticed one glitch when using the constructor that accepts a TextWriter together with the WriteStartDocument( ) method. XmlTextWriter automatically writes the line: <?xml version="1.0" encoding="utf-16"?>. This is unfortunate because the encoding was not UTF-16. I couldn’t find a way to control the encoding (it is possible with the constructor that accepts an IO.Stream), so eventually I dropped the WriteStartDocument( ) method and created the XML processing instruction myself (Example 3).

    StringWriter sw = new StringWriter();
    XmlTextWriter w = new XmlTextWriter(sw);
    w.IndentChar = '\t';
    w.Indentation = 1;
    w.Formatting = Formatting.Indented;
    // Write the XML declaration by hand instead of calling WriteStartDocument()
    w.WriteProcessingInstruction("xml", @"version=""1.0""");
    w.WriteStartElement("Document");
    w.WriteStartElement("Frame");
    w.WriteAttributeString("Url", "http://www.w3schools.com/tags/planets.htm");
    w.WriteStartElement("Frame");
    w.WriteAttributeString("Url", "http://www.w3schools.com/tags/venus.htm");
    w.WriteEndElement();
    w.WriteStartElement("Frame");
    w.WriteAttributeString("Url", "http://www.w3schools.com/tags/sun.htm");
    w.WriteEndElement();
    w.WriteStartElement("Frame");
    w.WriteAttributeString("Url", "http://www.w3schools.com/tags/mercur.htm");
    w.WriteEndElement();
    w.WriteEndElement();   // closes the outer Frame
    w.WriteEndElement();   // closes Document

Example 3: XML processing instruction.

Example 2: XML you get when visiting http://www.w3schools.com/tags/planets.htm.

Cosurfer Architecture & Design Principles
Cosurfer is made up of a couple of generic components that can be used as-is in other applications and a couple of components that are specific to the Cosurfer application. The components interact abstractly through interfaces. Every component defines its own events interface (for example, IConnectionManagerEvents). An interested component (the event sink) implements the events interface to receive events. The sending component needs a reference to the sink, of course, to call the event handlers the sink implements. A simple solution is to pass the sink as one of the constructor arguments. However, in some cases two components need to call each other, and it is not possible to pass both references in the constructors. In this case, one of the components implements an AttachSink( ) method that can be called later. The Factory class usually hooks up event sources to event sinks by attaching the receiving component to the sending component, either by passing it in the constructor or via the AttachSink( ) method (Example 1).

    ...
    ConnectionManager connectionManager = new ConnectionManager();
    CosurfEngine engine = new CosurfEngine(connectionManager);
    connectionManager.AttachSink(engine);
    ...

Example 1: AttachSink() method.
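The wiring problem is easier to see with both sides of the relationship spelled out. The following is a minimal sketch in Python, not the Cosurfer source: the class names echo the article, but every method shown here (`attach_sink`, `simulate_incoming`, `say`, `build`) is a hypothetical stand-in, and the socket is replaced by a plain list.

```python
from typing import List, Optional, Protocol


class ConnectionManagerEvents(Protocol):
    """Events interface: implemented by whoever wants connection events."""

    def on_receive(self, text: str) -> None: ...


class ConnectionManager:
    """Event source. The sink may arrive via the constructor or attach_sink()."""

    def __init__(self, sink: Optional[ConnectionManagerEvents] = None) -> None:
        self._sink = sink
        self.sent: List[str] = []

    def attach_sink(self, sink: ConnectionManagerEvents) -> None:
        self._sink = sink

    def send(self, text: str) -> None:
        self.sent.append(text)  # stand-in for writing to the socket

    def simulate_incoming(self, text: str) -> None:
        if self._sink is not None:  # no sink attached yet: drop the event
            self._sink.on_receive(text)


class CosurfEngine:
    """Event sink that also needs to call back into its event source."""

    def __init__(self, connection_manager: ConnectionManager) -> None:
        self._connection_manager = connection_manager
        self.received: List[str] = []

    def on_receive(self, text: str) -> None:
        self.received.append(text)

    def say(self, line: str) -> None:
        self._connection_manager.send(line)


def build():
    """Factory: break the circular dependency by attaching the sink afterwards."""
    connection_manager = ConnectionManager()
    engine = CosurfEngine(connection_manager)  # the engine needs the manager now
    connection_manager.attach_sink(engine)     # the manager gets its sink later
    return connection_manager, engine
```

The engine takes the manager in its constructor because it cannot work without one, while the manager tolerates a missing sink until the factory attaches it; which side gets the late attachment is a design choice, not something the pattern dictates.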
User Interface. The Cosurfer UI is spartan, yet you can learn a lot about Windows.Forms programming by following the code. There is only one window (or “form” in Windows.Forms lingo). This form contains a connect button and two edit boxes for chat purposes (Figure 3). The form is semitransparent (Opacity=70%) and always stays on top (TopMost=true). This combination keeps it always visible, while not completely obscuring what transpires underneath. It is convenient during development when I have two Cosurfer instances and two IE browsers open.

Figure 3: Cosurfer.

The design of Cosurfer separates the UI code from the functional code. The MainForm receives events from the various components and updates the UI accordingly. In response to user actions, such as pressing the Connect button or sending a new chat line, the MainForm simply delegates the action to another component. This kind of design allows for better maintainability and flexibility. For example, it should be simple to port Cosurfer to XML because the functional components are totally UI agnostic.

ConnectionManager. The communication layer of Cosurfer works at the TCP socket level and is mildly sophisticated (see Example 4). Every Cosurfer is both a server and a client because, as a P2P application, every Cosurfer may initiate a connection or accept an incoming connection from a peer. The .NET Framework provides the TcpClient and Socket classes that the ConnectionManager class builds upon. Asynchronous I/O (as opposed to blocking or polling I/O) is usually preferred in an event-driven programming model. While the TcpClient supports asynchronous I/O through its stream, there is no corresponding asynchronous listening capability: the TcpListener class supports only blocking or polling I/O. The ConnectionManager therefore uses the Socket class to listen for incoming connections from the peer. However, instead of using the BeginAccept( )/EndAccept( ) methods, I chose to create a thread (or rather, request a thread from the ThreadPool) and use the blocking Accept( ) method (see Listing Ten; available electronically). Once connected, it uses asynchronous method calls to read incoming data and act accordingly.

    public void Listen()
    {
        try
        {
            CommCleanup();
            // Run the blocking accept on a thread-pool thread
            ThreadPool.QueueUserWorkItem(new WaitCallback(SockThreadFunc));
        }
        catch (Exception e)
        {
            Console.WriteLine(
                "*** Exception *** ConnectionManager::ConnectionManager(), Description: "
                + e.Message);
        }
    }

    void SockThreadFunc(Object state)
    {
        Socket listener = new Socket(AddressFamily.InterNetwork,
                                     SocketType.Stream, ProtocolType.Tcp);
        listener.Blocking = true;
        IPEndPoint endPoint = new IPEndPoint(IPAddress.Any, m_listenPort);
        try
        {
            listener.Bind(endPoint);
        }
        catch (SocketException e)
        {
            // If there is already a listener on this port (error 10048,
            // address in use), listen on another port
            if (e.ErrorCode == 10048)
            {
                endPoint = new IPEndPoint(IPAddress.Any, ++m_listenPort);
                --m_connectPort;
                listener.Bind(endPoint);
            }
            else
            {
                throw;   // rethrow without resetting the stack trace
            }
        }
        catch (Exception e)
        {
            Console.WriteLine("Exception - {0}", e.Message);
            Debug.Assert(false);
            return;
        }
        m_listening = true;
        m_sink.OnStartListening(m_listenPort);
        // Note: the argument to Listen() is the pending-connection backlog, not a port
        listener.Listen(m_listenPort);
        m_sock = listener.Accept();   // blocks until the peer connects
        m_listening = false;
        // Notify the sink that a connection was accepted
        m_sink.OnConnectionAccepted();
        m_stream = new NetworkStream(m_sock);
        m_writer = new StreamWriter(m_stream);
        m_writer.AutoFlush = true;
        // From here on, reading is asynchronous
        AsyncCallback readReadyCallback = new AsyncCallback(OnReadReady);
        m_stream.BeginRead(m_inBuff, 0, BUFF_SIZE, readReadyCallback, this);
    }

Example 4: The communication layer of Cosurfer.

The ConnectionManager encapsulates the gory details and exposes a streamlined asynchronous interface to the world. Listing Five (available electronically) contains two interfaces: IConnectionManager and IConnectionManagerEvents. IConnectionManager is the active interface that ConnectionManager implements. It allows connecting and sending data to a connected peer. The IConnectionManagerEvents interface is implemented by the user of ConnectionManager, and ConnectionManager calls its methods when a corresponding event occurs. The ConnectionManager is a generic TCP P2P component. It knows nothing about the specifics of Cosurfer or even the type of the object that implements the IConnectionManagerEvents interface. The interaction is completely abstract, through an interface. The only assumption the connection manager makes is that the stream of bytes it reads is ASCII encoded, because it converts the bytes to an ASCII string before calling the OnReceive( ) method.

Browser Component. The Browser component is a wrapper around the BrowserControl COM object. I use COM interoperability to utilize it in the .NET managed environment. Making it work was the most difficult part of the project. I sort of anticipated it, because integrating and bridging across separate technologies is almost never painless. The Browser class public interface includes the Navigate( ) method and two properties, Busy and Document. It also has a constructor that accepts an event sink that implements the IBrowserEvents interface. The constructor creates a new instance of the InternetExplorer COM object (which effectively launches a new Explorer window) and stores the event sink reference. Navigate( ) navigates the associated web browser to the provided URL. The Browser listens for the two web browser events DocumentComplete and OnQuit and immediately forwards them to its sink. When the sink receives the OnDocumentComplete event, it should check if the browser is busy (via the Busy property), and if it is not, it is safe to access the entire nested object model through the Document property.

CosurfEngine. The CosurfEngine is the brain of Cosurfer. It controls Cosurfer’s behavior and the protocol used for communication. It also interacts with the generic components (ConnectionManager and Browser). The engine is wired to the ConnectionManager and the Browser components (thanks to the Factory). It reacts to browser and connection manager events.
In the case of connection- or chat-related events, it forwards them to the MainForm through the ICosurfEngineEvents interface; it handles all other events itself. The interesting events are OnReceive( ) from the ConnectionManager and OnDocumentComplete( ) from the Browser.

When OnReceive( ) is called, it means that the peer Cosurfer is sending a chat line or a surf buffer. To distinguish between the two, a chat line is surrounded by a pair of XML-like start and end tags, and a surf buffer is already an XML document. This is necessary to be able to parse the incoming data into meaningful messages (either chat lines or surf buffers). Without it, all kinds of TCP streaming issues raise their heads, such as long chat lines and surf buffers that are divided between multiple packets, or multiple chat lines and/or surf buffers that arrive in a single TCP packet (a single OnReceive( ) event). These issues should be handled in any industrial-strength application. I accumulate partial data until I have a full message (chat line or surf buffer). Chat lines are then simply forwarded to the MainForm via the OnIncomingChatLine( ) event.

Surf buffers get much more serious treatment. A surf buffer is an XML serialization of the frame structure of the HTML document in the peer’s browser. Example 2 contains the XML you get when visiting http://www.w3schools.com/tags/planets.htm, which has a frameset that contains three frames. When the CosurfEngine receives a surf buffer, it can be in one of two states: Idle or Updating. Idle means idle (big surprise); Updating means that CosurfEngine is in the process of navigating the local browser to an incoming surf buffer. This is not an atomic operation, due to the asynchronous nature of getting web pages combined with the infamous multiple nested frames. If the state is Updating and the browser is still busy, the incoming surf buffer is simply ignored. The motivation is not to interfere with an ongoing complex process that can easily get out of whack.

So how do the browsers stay in sync if some surf buffers are ignored? When the updating CosurfEngine is done, it sends the complete surf buffer to its peer. This sounds like a vicious circle, but if the incoming surf buffer corresponds to the content of the receiving browser, there is no sending back. Thus, if both users stop fiddling with their browsers for a second, both browsers will settle down. This mode of operation is intuitive (from the user’s point of view) and does not require explicit control passing as in “now, it’s my turn to navigate us somewhere.” It is very nonintuitive to figure out from the programmer’s seat. I spent a lot of time getting this micro state machine (only two states) working.

The OnDocumentComplete( ) event is sent whenever the browser finishes loading a document into one of the frames in the current page. In the simple case, there is only one frame and the page is fully loaded, which means it’s probably the right time to send it to the peer. In the not-so-simple case, it is only one frame out of several. Unfortunately, there is no good way to distinguish between these cases or to verify when the last frame has been loaded. So, when is the right time to send a surf buffer to the peer? The answer is on every OnDocumentComplete( ), as long as the browser is not busy anymore. If the browser is busy, it means that more frames are still loading. If it’s not busy, it might be done or it might not be done. This means that one Cosurfer may send a partial surf buffer to its peer. It might seem counterproductive initially, but it actually is a performance booster. The receiving peer starts loading the frames it received and soon gets more frames when the sending Cosurfer completes loading the other frames in the current page.

The nasty part is that when the receiving Cosurfer completes loading a partial surf buffer, it sends it back to the sender. The sender may be in a more advanced stage already and will treat the partial surf buffer it sent itself just a short while ago as a fresh buffer from the peer and revert to it. Well, I didn’t witness it in practice, so I don’t protect against it. If it ever becomes a real problem, I can always keep a history of sent buffers and ignore them.
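The receive path described above (accumulate, split into messages, then apply the Idle/Updating rule) can be sketched compactly. This is a sketch in Python under stated assumptions, not the Cosurfer implementation: the `<Chat>` tag name is hypothetical (the article only says the tags are XML-like), `</Document>` is taken from the root element in Example 3, `FakeBrowser` is a stand-in for the Browser component, and the sketch assumes a chat line never contains either closing tag.

```python
CHAT_OPEN, CHAT_CLOSE = "<Chat>", "</Chat>"  # hypothetical chat-line tags
SURF_CLOSE = "</Document>"                   # root element name used in Example 3


def split_messages(pending):
    """Cut every complete message off the front; return (messages, leftover)."""
    messages = []
    while True:
        hits = [(pending.find(tag), tag) for tag in (CHAT_CLOSE, SURF_CLOSE)]
        hits = [(pos, tag) for pos, tag in hits if pos != -1]
        if not hits:
            return messages, pending     # nothing complete yet: keep accumulating
        pos, tag = min(hits)             # earliest closing tag ends the next message
        end = pos + len(tag)
        messages.append(pending[:end].strip())
        pending = pending[end:]


class FakeBrowser:
    """Stand-in for the Browser component; the caller decides when loading ends."""

    def __init__(self):
        self.busy = False
        self.content = ""

    def navigate(self, surf_buffer):
        self.busy = True
        self.content = surf_buffer


class Engine:
    def __init__(self, browser, send):
        self.browser, self.send = browser, send
        self.state = "Idle"
        self.pending = ""
        self.chat_lines = []

    def on_receive(self, text):
        self.pending += text
        messages, self.pending = split_messages(self.pending)
        for message in messages:
            if message.startswith(CHAT_OPEN):
                self.chat_lines.append(message[len(CHAT_OPEN):-len(CHAT_CLOSE)])
            else:
                self.on_surf_buffer(message)

    def on_surf_buffer(self, surf_buffer):
        if self.state == "Updating" and self.browser.busy:
            return                       # do not disturb an update in progress
        if surf_buffer == self.browser.content:
            return                       # already in sync: no navigation, no echo
        self.state = "Updating"
        self.browser.navigate(surf_buffer)

    def on_document_complete(self):
        if self.browser.busy:
            return                       # more frames are still loading
        self.state = "Idle"
        self.send(self.browser.content)  # tell the peer where we ended up
```

Comparing the incoming buffer with the current content is what stops the ping-pong: a peer that is already showing the same frames neither navigates nor replies. The history of sent buffers mentioned above would be one more early return in `on_surf_buffer`.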
I admit that it is a pretty crazy algorithm to synchronize two dynamic beasts: web browsers loading multiframe pages while users liberally click hyperlinks, hit the back button, and enter new URLs on the address bar. Still, I found this algorithm to be the best way to address these high-uncertainty conditions.

The code includes the Cosurfer solution (Visual Studio .NET 2003), the Cosurfer project itself, the IE_Puppeteer project that demonstrates how to control IE, the SocketSpike project that demonstrates some low-level socket programming, and the XmlSpike project that demonstrates parsing and composing XML. I also attach the original .bat files, .rsp, and cordbg.cfg files I used to build and debug Cosurfer using the command-line tools of .NET Framework Beta-1 of several years ago.

DDJ
XForms & Cause-and-Effect Programming
A clean separation of presentation, user interface, and business processing model

JOHN M. BOYER

Forms provide a fundamental way in which users interact with web applications. However, the now familiar HTML-based forms have shortcomings ranging from poor integration with XML to device dependence. To address these problems, the W3C has turned to XForms, an application that combines XML and forms. XForms combines its own vocabulary with that of XPath and XML Schema to allow the expression of the core business processing model of a web-based application. This includes not just a structured XML-based data model, but also many constructs that simplify the application-design experience, including a declarative business rules engine for automatic calculation of data values and other properties, input validation by XML schema and dynamic constraints, event-driven action sequences, intent-based user-interface definition, and language features for specifying the properties of data submissions. In essence, the XForms processing model provides a cause-and-effect programming paradigm that backs user-initiated and event-driven modifications of data with a declarative business rules engine.

John is a senior product architect and research scientist for PureEdge Solutions. He can be contacted at [email protected].

The XForms Processing Model
With XForms, you define a business processing model that includes:
• Instances of XML data to be processed.
• XML schema for the data instances.
• Declarative business rules describing the calculated relationships among the data.
• Declarative business rules defining properties of and dynamic constraints on the data.
• Event-driven invocation of imperative action sequences.
• An intent-based user interface definition that provides the core behaviors of the subsuming host language’s presentation layer.
• Definitions for submission of XML data.

Schema data-type validation occurs in real time as data entry occurs, and full schema validation is performed before submission. XForms is at its best when XML data is submitted directly to the server, rather than boiling everything down to tag-value pairs. The response to a submission is typically replacement XML data or a whole new document.
The backbone of XForms, though, is the processing model that is defined for XML instance data. XForms provides a cause-and-effect programming paradigm by backing user input and event-based action sequences with schema validation and a declarative business rules engine. As anyone who has experienced the increased power of event-driven programming or seen the rise of the spreadsheet can attest, each of these methods alone significantly reduces complexity. XForms provides a powerful blend of these successful methodologies to eliminate the tedium of imperative scripting over ad hoc data models.

The declarative computation engine performs updates of instance data values using depth-first search and topological sorting, the same methodology as a spreadsheet. With a bind element, the form author simply declares that the content of an XML node or a property of that node is the calculated result of an XPath expression, and the XForms processor takes care of updating the XML node or property whenever any other XML node referenced by the XPath expression changes. With the bind element at the origin, XForms derives power from the declarative computation system along four axes:

• Automatic ordering. XForms models support an arbitrary number of bind elements, so the XForms processor determines the order of recalculation using the spreadsheet algorithm.
• Iteration. XPath is used to the fullest by allowing the form author to apply a single XPath expression to an arbitrary number of XML instance data nodes. For example, the calculations for all line subtotals of a purchase order can be declared with a single bind element, regardless of how many lines there are.
• Model item properties. Bind elements can declare formulae for the intrinsic value of an XML node as well as for several XForms-defined properties, such as whether the XML node is relevant, required, or read only, or whether it satisfies a dynamic constraint (which cannot be expressed with XML Schema).
• Cause-and-effect programming. The declarative computation engine is invoked to effect the business rules expressed by the model binds, regardless of the cause of the change to the XML instance data. The resultant values of all model item properties are then automatically applied throughout the XForms system.

Despite the intent of XForms to express more complex web-based applications, it is useful to begin with a straightforward example to illustrate the basics; say, a form to calculate the monthly payment and total payback of a compound interest loan. Listing One is an XForms model containing a simple data instance for the details of a loan. In many XForms applications, the data will have an associated schema to define data types and structural constraints on the XML. This could be done by placing the XML schema directly into the xforms:model or, more commonly (as in Listing One), by URI reference in the schema attribute. For data-type assignment, which is the most important to client-side processing, it is also possible to avoid using XML Schema altogether by instead making the assignments with the XForms type model item property in a bind element. This is an important technique for forms that must operate on small, resource-constrained devices.

Listing Two shows an assignment of types and a few other properties to a few of the instance nodes. In Listing Two, each bind selects one or more nodes using the nodeset attribute, then an attribute such as type assigns a model item property. Note that the second bind uses both the type and the dynamic constraint model item properties to ensure that each of the two nodes is both a double number and positive. The third bind uses the required model item property to ensure that the name and address of the loan applicant are filled out. The final bind uses the relevant model item property to indicate that the payment and total payout nodes are not relevant unless the inputs that will be used to calculate them are given. Although relevance can affect submission, it is used here to make the UI controls that will be bound to these nodes invisible until they become relevant.

The most important model item property of an instance node is its intrinsic value. In the case of the loan payment and total payout, these are derived by calculation over other instance nodes. Listing Three shows how to declare value formulae using the calculate attribute. The first formula is the easiest, since the total amount paid back is simply the product of the individual payment amount and the number of payments. However, it is also clear that this formula should be run last, even though it appears first. The XForms compute system automatically works out the correct order for the calculations. The second bind calculates the result of a temporary variable that was created to simplify the main loan calculation. Any number of additional XForms instances can be added to the XForms model, and each can contain as much data as needed by the form author. In this case, you need to translate from the human version of the interest rate (a percentage) to the mathematical value needed by the loan formula.

The third bind in Listing Three has several interesting features. First, it shows the use of conditional logic. Using an if( ) function, the calculation determines whether it is appropriate to apply the compound interest formula. The compound interest formula itself requires exponentiation, which is not available in XPath. It is also not available in XForms 1.0 (though it is currently planned to appear in XForms 1.1). However, XForms does allow extension functions to be added by implementations, and forms that use extension functions must declare the extra required functions in the functions attribute of the model element (see Listing One).

The XForms Intent-Based UI
The values and model item properties of an XForms model are propagated to XForms UI controls by binding them to instance nodes. For example, Listing Four shows the ref attribute being used with input controls to obtain data, such as the principal or interest rate for the loan application of the prior section, and with the output control to provide the results. The submit control in Listing Four lets users perform the data submission. The trigger control makes it possible to perform sequences of actions other than just submission. Although the example in the listing could more efficiently be written with a single action, it uses a few actions instead to demonstrate cause-and-effect programming. Clearly, the payment and total payout are recalculated in response to user input of a principal, duration, or interest rate, but the model effects the declared binds regardless of the cause of the change. Hence, pressing the clear button causes the actions to occur, after which the payment and total payout are recalculated (back to zeroes, in this case).

XForms has several other user-interface controls to collect input by other methods. For example, the textarea control collects multiline input, and the select control provides a multiselection control. Listing Five presents an example of a single selection control that would be used to let users choose a type of currency in the aforementioned loan application. The exact type of single selection control is not specified by XForms. It could be a popup list, list box, or set of radio buttons. Regardless, one selection within the control will be created for each node of the nodeset given in the itemset. The label indicates what to show to users for the choice, and the value indicates what to put for a given choice into the referenced node of the select1. In this example, the third choice is USD, and if the user selects the choice labeled USD, then the value “USD” is placed into the currency attribute of the Principal element.

The itemset is not the only XForms element capable of iteration. We’ve already seen the use of the nodeset attribute in the bind to iterate through a set of nodes and assign model item properties to each node. Perhaps the most powerful UI construct in XForms is the repeat, which iterates a UI template for each node of a given nodeset.
Often we think of the XForms repeat as generating a “table,” with each “row” being a copy of the template for one of the nodes in the nodeset. While the iteration capabilities of the XForms repeat are powerful, sometimes it is useful in the design process to have a simpler construct that lets multiple controls be grouped together and act as a unit. For this, XForms offers the group. It also offers a sort of multifaceted grouping construct called the switch, which allows easy switching to a particular group of controls from among many cases. For example, an address block consists of multiple controls, but its countenance depends on the type of address (for instance, the locale choice of users).

Conclusions
XForms represents a clean architecture for separating presentation, user interface, and business processing model. The separation, though, is first and foremost a logical one with the purpose of deriving classic software engineering benefits such as encapsulation and loose coupling. These are particularly important to the design experience for intelligent documents. Yet, the user experience can often be simplified and enhanced through the creation of single documents that combine these separate logical components. Users are least encumbered when they have a single file containing their data, its presentation template, and the functional rules that govern how the document is updated in response to their further input. It is particularly helpful if users can attach other supporting files to the intelligent document. This lets users exploit the intelligence of the document when offline, and efficiently e-mail or otherwise transport the entire context of a transaction.

For these and many other reasons, the XFDL language (see my article “XFDL: The Extensible Forms Description Language,” DDJ, December 1999, and http://www.PureEdge.com/xforms/) has been upgraded to be a new “skin” for XForms. Perhaps the most compelling augmentation that XFDL provides to a core XForms document is the ability to create secure, auditable web transactions via the application of digital signatures over not just the data and processing model of the form, but the XFDL presentation layer itself.

DDJ
Listing One
John Q. Public 123 Main St. Tinyville 10000 12 5 856.07 10272.84

Listing Two

Listing Three

Listing Four
Enter principal: Enter duration: Enter interest rate: Monthly payment: Total to be repaid: Clear
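The calculations the article attributes to Listing Three (a total declared first but computed last, a helper value that converts the percentage, and a conditional loan formula) can be approximated outside XForms. The sketch below is in Python and rests on several assumptions: the payment formula is the standard amortization formula, which is not confirmed by the text but does reproduce the 856.07 and 10272.84 figures in the sample loan data; payments are rounded to cents before the total is taken; the standard library’s `graphlib.TopologicalSorter` is swapped in for the depth-first search the article mentions; and all names (`BINDS`, `recalculate`, the node keys) are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def monthly_rate(v):
    # Human-friendly annual percentage -> monthly fraction (5 -> 0.05 / 12)
    return v["rate_percent"] / 1200.0


def payment(v):
    p, n, r = v["principal"], v["duration"], v["monthly_rate"]
    if r > 0:  # plays the role of the if( ) test in the declared formula
        return round(p * r / (1 - (1 + r) ** -n), 2)
    return round(p / n, 2)  # zero interest: equal instalments


def total(v):
    return round(v["payment"] * v["duration"], 2)


# target node -> (nodes its formula reads, formula).
# "total" is deliberately declared first even though it must run last.
BINDS = {
    "total": ({"payment", "duration"}, total),
    "monthly_rate": ({"rate_percent"}, monthly_rate),
    "payment": ({"principal", "duration", "monthly_rate"}, payment),
}


def recalculate(values):
    """Run every formula after the formulas it depends on; return the order used."""
    graph = {target: deps for target, (deps, _) in BINDS.items()}
    order = [node for node in TopologicalSorter(graph).static_order()
             if node in BINDS]  # plain inputs have no formula to run
    for target in order:
        values[target] = BINDS[target][1](values)
    return order


loan = {"principal": 10000, "duration": 12, "rate_percent": 5}
print(recalculate(loan), loan["payment"], loan["total"])
```

Declaration order plays no part in the result: the dependency sets alone decide that the rate conversion runs first and the total last, which is the point of the automatic-ordering axis.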