DEVELOPMENT PLATFORMS: Eclipse 3.1, NetBeans 4.1, Visual Studio 2005
Porting Compilers & Tools to 64 Bits
Performance & .NET Apps
Moving to .NET 2.0
STL-Compatible Data Structures
A Reusable Duff Device
New Threads for Linux
Introducing the VSTSEclipse Project
Reverse Engineering & Binary Code
Michael Swaine: In the Apple Spotlight
Jerry Pournelle on Longhorn
CONTENTS
AUGUST 2005 VOLUME 30, ISSUE 8
FEATURES
NetBeans 4.1 & Eclipse 3.1 14 by Eric J. Bruno
NetBeans 4.1 and Eclipse 3.1 are at the forefront when it comes to development platforms for J2SE, J2EE, and J2ME.
Visual Studio 2005 Visualizers 24 by James Avery
“Visualizers” are Windows Forms dialogs in Visual Studio 2005 that let you create graphical views into the value of an object.
The Eclipse Modeling Framework 28 by Frank Budinsky
The Eclipse Modeling Framework helps you define models, from which many common code-generation patterns are generated.
The TMS Development Platform 34 by Alexander Frey
The TMS build system offers a simple, powerful way to do multiplatform development.
The VSTSEclipse Project 40 by Joe Sango
The VSTSEclipse project is focusing on an Eclipse plug-in for utilizing Visual Studio Team System functionality outside the VSTS framework.
Performance Diagnosis & .NET Applications 44 by Ramkumar N. Chintalapati and Sachin Ashok Wagh
The authors present a tool that lets you identify .NET-related problems and resolve bottlenecks during performance analysis.
Moving To .NET 2.0 48 by Eric Bergman-Terrell
.NET 2.0, C# 2.0, and Visual Studio 2005 include a host of new features.
Finding Binary Clones with Opstrings & Function Digests: Part II 56 by Andrew Schulman
Andrew continues his examination of reverse engineering, this month focusing on binary code.
NPTL: The New Implementation of Threads for Linux 62 by L. Blunt Jackson
Introduced with Version 2.6 of the Linux kernel, the Native POSIX Thread Library brings full compliance to the POSIX Standard.
Porting Compilers & Tools to 64 Bits 67 by Steven Nakamoto and Michael Wolfe
Rehosting compilers and tools to 64-bit processors may not be as difficult as you think.
An STL-Compatible Hybrid of Linked List & Hash Map 69 by William Nagel
“Linked_hash” is an STL-compatible data structure based on the best of the linked-list and hash-map classes.
EMBEDDED SYSTEMS
A Reusable Duff Device 73 by Ralf Holly
Duff’s Device is a special kind of loop-unrolling mechanism that’s useful when performance counts.
COLUMNS
Programming Paradigms 75 by Michael Swaine
Embedded Space 78 by Ed Nisley
Chaos Manor 82 by Jerry Pournelle
Programmer's Bookshelf 85 by Jacek Sokulski
FORUM
EDITORIAL 6 by Jonathan Erickson
LETTERS 8 by you
DR. ECCO'S OMNIHEURIST CORNER 11 by Dennis E. Shasha
NEWS & VIEWS 12 by DDJ Staff
OF INTEREST 87 by DDJ Staff
SWAINE'S FLAMES 88 by Michael Swaine
RESOURCE CENTER
As a service to our readers, source code, related files, and author guidelines are available at http://www.ddj.com/. Letters to the editor, article proposals and submissions, and inquiries can be sent to [email protected], faxed to 650-513-4618, or mailed to Dr. Dobb's Journal, 2800 Campus Drive, San Mateo, CA 94403. For subscription questions, call 800-456-1215 (U.S. or Canada). For all other countries, call 902-563-4753 or fax 902-563-4807. E-mail subscription questions to [email protected] or write to Dr. Dobb's Journal, P.O. Box 56188, Boulder, CO 80322-6188. If you want to change the information you receive from CMP and others about products and services, go to http://www.cmp.com/feedback/permission.html or contact Customer Service at the address/number noted on this page. Back issues may be purchased for $9.00 per copy (which includes shipping and handling). For issue availability, send e-mail to [email protected], fax to 785-838-7566, or call 800-444-4881 (U.S. and Canada) or 785-838-7500 (all other countries). Back issue orders must be prepaid. Please send payment to Dr. Dobb's Journal, 4601 West 6th Street, Suite B, Lawrence, KS 66049-4189. Individual back articles may be purchased electronically at http://www.ddj.com/.
NEXT MONTH: In September, we turn to Communications and Networking, focusing on what’s going on when it comes to wireless.
EDITORIAL MANAGING EDITOR Deirdre Blake MANAGING EDITOR, DIGITAL MEDIA Kevin Carlson SENIOR PRODUCTION EDITOR Monica E. Berg NEWS EDITOR Shannon Cochran ASSOCIATE EDITOR Della Wyser ART DIRECTOR Margaret A. Anderson SENIOR CONTRIBUTING EDITOR Al Stevens CONTRIBUTING EDITORS Bruce Schneier, Ray Duncan, Jack Woehr, Jon Bentley, Tim Kientzle, Gregory V. Wilson, Mark Nelson, Ed Nisley, Jerry Pournelle, Dennis E. Shasha EDITOR-AT-LARGE Michael Swaine PRODUCTION MANAGER Eve Gibson INTERNET OPERATIONS DIRECTOR Michael Calderon SENIOR WEB DEVELOPER Steve Goyette WEBMASTERS Sean Coady, Joe Lucca AUDIENCE DEVELOPMENT AUDIENCE DEVELOPMENT DIRECTOR Kevin Regan AUDIENCE DEVELOPMENT MANAGER Karina Medina AUDIENCE DEVELOPMENT ASSISTANT MANAGER Shomari Hines AUDIENCE DEVELOPMENT ASSISTANT Melani Benedetto-Valente MARKETING/ADVERTISING ASSOCIATE PUBLISHER Will Wise SENIOR MANAGERS, MEDIA PROGRAMS see page 86 Pauline Beall, Michael Beasley, Cassandra Clark, Ron Cordek, Mike Kelleher, Andrew Mintz MARKETING DIRECTOR Jessica Marty SENIOR ART DIRECTOR OF MARKETING Carey Perez DR. DOBB’S JOURNAL 2800 Campus Drive, San Mateo, CA 94403 650-513-4300. http://www.ddj.com/ CMP MEDIA LLC Gary Marshall President and CEO John Day Executive Vice President and CFO Steve Weitzner Executive Vice President and COO Jeff Patterson Executive Vice President, Corporate Sales & Marketing Leah Landro Executive Vice President, Human Resources Mike Mikos Chief Information Officer Bill Amstutz Senior Vice President, Operations Sandra Grayson Senior Vice President and General Counsel Alexandra Raine Senior Vice President, Communications Kate Spellman Senior Vice President, Corporate Marketing Mike Azzara Vice President, Group Director of Internet Business Robert Faletra President, Channel Group Vicki Masseria President, CMP Healthcare Media Philip Chapnick Vice President, Group Publisher Applied Technologies Michael Friedenberg Vice President, Group Publisher InformationWeek Media Network Paul Miller Vice President, Group Publisher Electronics Fritz Nelson Vice President, Group Publisher Network Computing Enterprise Architecture Group Peter Westerman Vice President, Group Publisher Software Development Media Joseph Braue Vice President, Director of Custom Integrated Marketing Solutions Shannon Aronson Corporate Director, Audience Development Michael Zane Corporate Director, Audience Development Marie Myers Corporate Director, Publishing Services
American Business Press
EDITORIAL
TiVo This
Digital video recorder technology in general and systems such as TiVo in particular have turned the world of television topsy-turvy, not to mention turning proper nouns into verbs. Don't want to stay up to watch the late show? TiVo it. Irritated by an irritating commercial? TiVo it. Can't get enough of Paris Hilton selling hamburgers? TiVo it. Part of the magic of DVR is that it truly ushers in the concepts of "time-shifting," whereby you determine your viewing schedule, and "space shifting," in which you transfer digital content onto gadgets other than those originally intended. TiVo's Windows-based TiVoToGo program (http://www.tivo.com/4.9.19.asp), for instance, lets you transfer programs from TiVo DVRs directly to portable media players, laptops, and other devices.

Other approaches to space shifting include TV on cell phones, like that provided by MobiTV (http://www.mobitv.com/). Reportedly, more than 300,000 subscribers have signed on with MobiTV to watch everything from Congressional hearings on C-SPAN to Larry, Curly, and Moe on the ToonWorld TV Classics channel. To make this happen, MobiTV has launched its MobiEnabled Developers Program, which provides sample MobiTV applications, product testing, and service preproduction (http://209.132.240.166/developer/index.html). Assuming that your carrier network and cell phone support MobiTV services, 100-percent pure Java MIDP 2.0 client applications can be running on a cell phone near you real soon now. (For more information on Java MIDP, see http://java.sun.com/products/midp/.)

In fact, as Linden deCarmo pointed out in "The OpenCable Application Platform" (DDJ, June 2004), Java is at the core of most of the emerging digital video technology and APIs. In addition to MobiTV's platform, for instance, there's TiVo's Java-based Home Media Engine SDK (see "Building on TiVo," by Arthur van Hoff and Adam Doppelt; DDJ, March 2005). Likewise, Java is at the heart of the recently updated OpenCable Application Platform Specification for DVRs (http://www.opencable.com/specifications/), which defines a minimal profile for DVR software for digital cable receivers that have local storage. The OCAP DVR spec includes all required APIs, content and data formats, and protocols, up to the application level. Applications adhering to the OCAP DVR Profile are executed on OpenCable-compliant host devices. The OCAP DVR platform is applicable to a variety of hardware and operating systems, giving consumer electronics manufacturers flexibility in implementation.

Granted, not everyone sees Java as the development platform for space shifting, DVRs, and other TV-related initiatives. It's a safe bet that Microsoft won't be standardizing on Java APIs anytime soon for its Microsoft TV project (http://www.microsoft.com/tv/). While large corporations such as AMD, Intel, Microsoft, Apple, and others are supporting DVR standards and cranking out toolkits, homebrewers are discovering that DVR isn't that hard. In his Circuit Cellar article "Build a Digital Video Recorder," Miguel Sanchez presents a system built around Linux and a 500-MHz Pentium III computer with 64 MB of RAM and a 120-GB hard disk (http://www.circuitcellar.com/magazine/174toc.htm). Not to be outdone, Ken Sharp's online article "Free TiVo: Build a Better DVR Out of an Old PC" takes you through all the hardware and software necessary for putting together a Windows-based DVR (http://www.makezine.com/extras/4.html).
Still, DVR'ing isn't a free ride, as even industry-leading TiVo (the noun) has had its share of ups and downs. One minute, former Federal Communications Commission chairman Michael Powell calls TiVo "God's machine," and the next thing you know, TiVo's CEO is jumping ship and the company turns to serious cost cutting. Then before you can say "space-shift" three times, TiVo's stock price jumps up on buyout rumors, but plummets when earnings (or the lack thereof) are announced.

In truth, TiVo's problems have nothing to do with technology, market conditions, or competition. What it really comes down to is "content"— you know, all those TV programs we're supposed to be time and space shifting at will. For one thing, space-shifted viewing and mobile devices require a whole new kind of TV programming. Cecil B. DeMille-like productions won't stand a chance on cell phone displays — they're too grandiose and too long for constrained devices. Instead, new types of TV programs called "mobisodes" (short for "mobile episodes") that are uniquely created for the screen and time limitations of cell phones are in the works.

But mobisode producers are going to have to do better than what we're seeing with the current crop of TV shows. Case in point: In a recent rerun of "Law & Order," the editor-in-chief of a computer magazine is brutally murdered (http://www.nbc.com/Law_&_Order/episode_guide/113.html). As if that wasn't shocking enough, ungrateful coworkers of the poor fellow posthumously referred to him as a "pig," among other less than endearing terms. If the Parents Television Council or some other like-minded organization wants to object to what's on TV, forget about Paris Hilton's hamburgers. There's no better place to start than drivel that eliminates editors. TiVo that, Law & Order.
LETTERS

Vintage or Quirky?
Dear DDJ,
When I was a social scientist at Princeton in the '80s, I befriended a newly immigrated Russian who referred to his trade as "mathematics." Jonathan Erickson's June 2005 "Editorial" brought back some incredible skills possessed by this man. Recall that this was just before the fall of the Soviet Union. As part of his job at a Soviet university, he had to build microcomputers. The only microprocessors he could get were stolen or smuggled from the U.S. Here's where it gets interesting: Most of the microprocessors were from Intel; most were rejects; and most came with a list of unimplemented or incorrectly implemented instructions! You know, 2+2=7. This guy had to build computers around processors with known defects! And since each had a different list of defects, each computer had to be designed anew, cleverly working around the defects. Oh, one more thing: This guy could effortlessly multiply and divide in hexadecimal — in his head! There were few calculators in his department, and none did hex. I've since hired about a dozen programmers, but I've never, ever seen skills like those again. Keep up the good work!
David L. Ransen
[email protected]

Dear DDJ,
I recently read and enjoyed Jonathan Erickson's June 2005 "Editorial" and add this site on CPU history to his list: http://www.sasktelwebsite.net/jbayko/cpu.html. Actually, a column or series could emerge from a discussion of processor characteristics. I recall a BYTE magazine article on what certain processors did wrong. Perhaps a "What ever happened to the XXX processor?" article might prove interesting.
Wil Blake
[email protected]

Dear DDJ,
I was glad to see Jonathan Erickson's "Editorial" on vintage data catalogs in the June 2005 issue of DDJ. The day before I read it, I was cleaning out some boxes from a recent move and found a 1976 Intel Data Catalog that I was going to throw away. Now I think I'll try to sell it…
Stuart Ball
[email protected]
Jonathan responds: Thanks for your note, Stuart. I'm always happy to contribute to someone else's clutter.

Programming Language Popularity
Dear DDJ,
Rereading Jonathan Erickson's February 2005 "Editorial," in which he mentions programming language popularity and the TIOBE Programming Community Index (http://www.tiobe.com/), I think a much more useful statistic would be the rankings of how many currently commercially available programs are written in which languages. A person looking to improve his programming skills would be ill-advised to choose x-language just because it has a high ranking, if it has no commercial program sellers. Nobody is going to pay you to write a program in some language whose applications are all "free programs." I would find it hard to believe that there are more currently commercially available programs in C than in C++, especially since it is very difficult to write a Windows program in C, and very easy to write a Windows program in C++ using MFC. In fact, a really interesting statistic would be how many programs are available using MFC and how many are not.
Phil Daley
[email protected]

TIOBE's Paul Jansen responds: Thanks for your TPC Index feedback, Phil. The index is not about job opportunities; there are other sites publishing such statistics. We only say that you can use our index to see whether your programming skills are still up-to-date (both commercially and noncommercially). Questions such as how many currently commercially available programs are written in a particular language are much more interesting than our index, but in my opinion, impossible to accomplish.

Networking In the Fast Lane
Dear DDJ,
The "Google Revolution" column by Jerry Pournelle (DDJ, February 2005) discussed his recent troubleshooting and upgrade of his network, following the outage of one of his network switches. In it he laments the marginal performance enhancement of installing gigabit Ethernet on his older systems. The PCI bus comes in different widths, depending on the quality and price of the motherboard. Server boards routinely come
with 64-bit PCI bus slots at varying speeds, and these slots are the only width capable of providing enough bandwidth to allow an add-in PCI gigabit card with true gigabit speeds. But there are other limits besides just the PCI bus. Purchasing a bargain basement gigabit switch is definitely not the way to go in testing the theoretical maximum of your network, as this could be the "real" bottleneck. Some switches and cards don't handle port speed autonegotiation properly, and the gigabit interfaces will think they can only run at 10 or 100 Mbit. There is a real difference between a $75 gigE switch and a $1250 gigE switch. Jerry also mentions the idea of waiting for motherboards with gigabit Ethernet onboard. Again, there is no guarantee that just because a manufacturer put a gigE port on the board, they've provided sufficient bus bandwidth to run that port at full speed. Some other culprits can be shared IRQs, cheap chipsets, or bad driver/OS support. If you wired your LAN in the days before CAT5e/CAT6, you have no way of guaranteeing that the wire in your wall is capable of the enhanced speeds of gigE.

The only true way to benchmark is with benchmark applications like Netperf, and using nothing but a CAT5e/CAT6 crossover cable between two machines with high quality, 64-bit add-in PCI cards in a 64-bit PCI slot. Some simple math shows that a 32-bit PCI bus is overwhelmed by the amount of data generated at true gigabit speeds, and your maximum will depend on how the PCI bus reacts to being flooded with more data than it can pass reliably. This explains the wild variation Jerry describes. I won't go into details about why desktop-quality Windows is not a good platform to use for benchmarking the cards. Booting a Knoppix CD would provide a more reliable testing platform.
David Backeberg
Network Staff Assistant, MIT Math Dept.
[email protected]

Jerry responds: Thanks for your letter, David. Since most of the readership probably uses desktop Windows systems, that's the platform I tested things with. People who have professional-grade servers with 64-bit PCI bus slots will clearly get different results from those of us who tend to make do with old desktops as servers. Alas, most readers don't have the IT budget that the MIT Math Department has. I have never for a moment supposed that there is no difference between a $1250 Ethernet switch and a $75 one, nor have I supposed that people who buy $1250 Ethernet switches should seek my advice on the purchases. I think I did state that we are using Cat5 cable.
DDJ
DR. ECCO’S OMNIHEURIST CORNER
Election Fraud in Verity Dennis E. Shasha
In a certain county having voting machines without paper trails, inspectors depend on exit polls to determine whether the voting machines have worked properly. Normally, they use statistics and the assumption of random sampling, but sometimes they want to be sure.

The city of Verity is proud of its honest electorate. Two candidates, Fred and Wendy, have run against one another. There are only 100 voters. Each voter is given a unique number between 1 and 100 upon leaving the voting booth. The five pollsters record those numbers as well as the votes when they ask the voters how they voted. Each pollster manages to talk to 80 voters, and in every case, Fred beats Wendy by 42 to 38. Yet Wendy carries the city by 51 to 49. Upon hearing these results, Fred cries foul. You are brought in to investigate. Both Fred and Wendy agree about the following:
• The voters were honest with the pollsters and the pollsters reported their results honestly.
• Every pollster spoke to 80 people, 42 of whom voted for Fred against only 38 for Wendy.
• Between every pair of pollsters, all 100 people were interviewed.

1. How many pollsters could there be under these conditions for it to be possible that Wendy won, even if this were unlikely assuming random sampling?
2. How might the voters be divided among the pollsters?
3. So, was Fred right?

Here is an open problem: Suppose we made a change, so that between every pair of pollsters, at least 96 distinct voters were interviewed, instead of all 100 people. In this case, how many pollsters could there be? DDJ

Dennis is a professor of computer science at New York University. His most recent books are Dr. Ecco's Cyberpuzzles (2002) and Puzzling Adventures (2005), both published by W. W. Norton. He can be contacted at [email protected].
Dr. Ecco Solution
Solution to "Treasure Arrow," DDJ, July 2005.

Let's first solve the problem symbolically. The pole has length L, mass Mp, and the arrowhead has weight Ma. We represent the pole's mass Mp as a point mass at the center. Now, let's say we have five elastics, including ones at the ends and at every quarter point. We know that the lengths of the bands must form a straight line because the pole is stiff. This observation (due to my colleague Alan Siegel) gives us a constraint in addition to balanced torque and balanced vertical force. Here, s1 is the stretch of the leftmost band in centimeters, and ∆ is the difference from one band to the next one to the right, again in centimeters:

  s1 + ∆ = s2
  s2 + ∆ = s3
  s3 + ∆ = s4
  s4 + ∆ = s5

The balance of vertical forces gives us:

  5•s1 + (1+2+3+4)•∆ = Mp + Ma

The balance of torques gives us L•Mp/2 from the pole. So:

  (1) Mp/2 = 2.5(s1 + ∆), from torque balance;
  (2) Mp + Ma = 5(s1 + 2∆), from vertical balance.
From (2), s1 = (Mp + Ma)/5 – 2∆. Therefore, using the torque balance:

  Mp/2 = 2.5(s1 + ∆)
       = 2.5(((Mp + Ma)/5 – 2∆) + ∆)
       = 2.5((Mp + Ma)/5 – ∆)
       = 2.5(Mp + Ma)/5 – 2.5•∆
       = (Mp + Ma)/2 – 2.5•∆

So 2.5•∆ = Ma/2, which gives ∆ = Ma/5.
Therefore, s1 = Mp/5 + Ma/5 – 2Ma/5 = Mp/5 – Ma/5. This implies that if Mp = Ma, then s1 = 0. Since these two quantities are in fact equal, s1 = 0, so the left band is at its rest length of 1 meter, the right band is down 60 centimeters, and the arrow is pointing to a point that is 100 cm + 2•60 cm = 2.2 meters from the top. Further, ∆ is 20 centimeters. So, Mp = Ma = 100 kilograms.

Optimal Farming: Errata and Reader Improvements
Alan Dragoo was the first to point out the bug in my solution to problem 1. He suggested a design where one circle would cover a central square and four smaller rectangles would cover side rectangles. Denis Birnie independently arrived at the same solution, as well as a very interesting solution for open problem three. You can find those at http://cs.nyu.edu/cs/faculty/shasha/papers/birniedobbssol.doc. DDJ
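For readers who like to double-check such derivations in code, here is a small, hand-written sketch (not from the original column) that plugs the derived Treasure Arrow values back into balance equations (1) and (2) for the stated case Mp = Ma = 100. It verifies only the algebra above, not the geometry of the original July puzzle.

// Sanity-check the Treasure Arrow algebra: with delta = Ma/5 and
// s1 = Mp/5 - Ma/5, equations (1) and (2) above should balance.
public class TreasureArrowCheck {
    public static void main(String[] args) {
        double Mp = 100, Ma = 100;          // masses from the puzzle
        double delta = Ma / 5;              // derived band-to-band difference
        double s1 = Mp / 5 - Ma / 5;        // derived stretch of the leftmost band

        double torque = 2.5 * (s1 + delta);        // should equal Mp/2   (equation 1)
        double vertical = 5 * (s1 + 2 * delta);    // should equal Mp+Ma  (equation 2)

        System.out.println("torque balance:   " + torque + " vs Mp/2 = " + (Mp / 2));
        System.out.println("vertical balance: " + vertical + " vs Mp+Ma = " + (Mp + Ma));
        System.out.println("s1 = " + s1 + " cm, delta = " + delta + " cm");
    }
}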
Dr. Dobb's News & Views
Free Pascal 2.0 Released
After five years of development, Free Pascal 2.0, an open-source compiler compatible with Turbo Pascal and Delphi, has been released for Linux, FreeBSD, Mac OS X/Darwin, Mac OS classic, DOS, Win32, OS/2, Netware (libc and classic), and MorphOS (http://www.freepascal.org/). Free Pascal, which is a GPL compiler for 32- and 64-bit CPU architectures such as Intel 32/64-bit, AMD 32/64-bit, SPARC, PowerPC, and Intel ARM, comes with a cross-platform runtime library, interfaces to existing libraries, and nonvisual classes in the Free Component Library.

Maryland Opts for Computer Recycling
Under the terms of a new state law, computer manufacturers who do business in Maryland must begin paying an annual $5000 fee to offer free computer take-back programs to help recycle the state's 60,000 tons of annual electronic waste. For the time being, the law only targets computers, but televisions may be included in the future. Annually, U.S. consumers discard approximately 50 million computers and 130 million cell phones. Other states that have adopted high-tech recycling programs include California and Maine, both of which take different approaches. California requires that consumers foot the recycling bill by paying a $6 to $10 disposal fee on every computer and TV purchased. Maine requires manufacturers to pay the complete cost of recycling computers or TVs. For more information, see http://www.marylandrecyclers.org/hb575.htm.

µC++ Debuts
µC++, a C++ extension that supports concurrent and real-time programming, has been released. Pronounced "micro-see-plus-plus," µC++ 5.1.0 extends C++ with new constructs by providing advanced control-flow including lightweight concurrency on shared-memory uni- and multiprocessor computers running UNIX and Linux. µC++ (available at http://plg.uwaterloo.ca/~usystem/uC++.html) accomplishes this by providing: "coroutines," which have independent execution states; "tasks," which have their own threads; and "monitors" that allow for safe communication among tasks. These new classes can take part in inheritance, overloading, and templates, just like other classes. µC++ is implemented as a translator, µ++, that reads a program
containing the extensions and transforms each extension into one or more C++ statements, which are then compiled by an appropriate C++ compiler (currently only GNU C++) and linked with the µC++ concurrent runtime library. This release of µC++ includes support for GCC 4.0.0 and the Intel C++ 8.1 compiler, and AMD64 and Intel EM64T processors in 64-bit mode.

SOA: Behind the Buzzword
The OASIS Standards group has formed a committee to define exactly what constitutes a service-oriented architecture. Formally dubbed the "OASIS SOA Reference Model (SOA-RM) Technical Committee," the group will be responsible for drafting a reference model that defines the functional components of service-oriented architectures and their relationships; see the committee's web site at http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=soa-rm. The committee has 47 members, including representatives from Infosys, Boeing, Unisys, and the U.S. Department of Homeland Security.

Avalon, Indigo, InfoCard Previewed
Microsoft has released prebeta versions of two key pieces of its development platform for Longhorn — the Avalon presentation subsystem and the Indigo messaging framework. The "Beta1 Release Candidates" support Visual Studio 2005 Beta2 and the .NET Framework 2.0 Beta 2. Avalon, Longhorn's 3D UI layer, includes a display engine and a managed-code framework incorporating XAML markup. New in this release is support for online video and for Microsoft's Metro technology, an XML-based alternative to PDF. Indigo, Longhorn's connectivity piece, is intended to support service-oriented and distributed architectures. For more information, see http://www.microsoft.com/downloads/details.aspx?familyid=b789bc8d-4f25-4823-b6aa-c5edf432d0c1&displaylang=en.

MATLAB Contest Winners Announced
Congratulations to Tim Vaughan, grand prize winner of the 10th MATLAB Programming Contest sponsored by The MathWorks (http://www.mathworks.com/contest/ants/winners.html). The problem Tim tackled involved a sandbox in which there are ants, sugar cubes, anthills, and rocks. The challenge was to write a control program used by each ant, taking into
consideration limitations such as size and distance. Other prize contest winners include Jack Snoeyink, Hannes Naudé, Cobus Potgieter, Jan Langer, Niilo Sirola, and Timothy Alderson.

Harmony: An Open-Source Java?
The Apache Foundation has launched a new development effort, dubbed "Project Harmony," to build an open-source version of Java. Harmony has been approved as part of the Apache Incubator (http://incubator.apache.org/), a program created in 2002 as a starting point for new Apache Foundation codebases. Harmony will provide a "clean room" implementation of J2SE 5, the latest version of the Java 2 Standard Edition. The Harmony team plans to use Sun's TCK to test compatibility with the specification — assuming that the TCK can be licensed under Sun's academic and nonprofit scholarship program.

SANS Lists New Vulnerabilities
The SANS Institute has issued its most recent lists of the top 20 security holes in Windows and UNIX systems, with a supplementary advisory highlighting recently discovered vulnerabilities (http://www.sans.org/top20/). The institute identified holes in Internet Explorer, ActiveX, PNG file processing, cursor and icon handling, the Windows License Logging Service, and the Microsoft Server Message Block (SMB) as the most severe of new Windows vulnerabilities in 2005; web servers and services pose the most critical ongoing threat. Meanwhile, UNIX remains plagued by holes in BIND, SNMP, and SSL. Mac OS X also made an appearance on the list due to buffer overflows in the cross-platform RealOne Player.

Language Challenges
The Python Challenge is a set of riddles designed to provide an entertaining way to get involved with the Python programming language by exploring Python's module library. Among other things, the Python Challenge (http://www.pythonchallenge.com/) demonstrates the power of Python features such as its "batteries included" standard library. Likewise, the Ruby Quiz (http://www.rubyquiz.com/) is a weekly programming challenge for Ruby programmers with similar goals — to think, learn, and have a good time.
PROGRAMMER’S TOOLCHEST
NetBeans 4.1 & Eclipse 3.1 Development platforms for J2SE, J2EE, and J2ME ERIC J. BRUNO
When I began my career as a software developer, the modern concept of an integrated development environment (IDE) was yet to be defined. Instead, developers would simply use their favorite editor and compile code from the command line. I remember how much I looked forward to getting a cup of coffee as I started the compiler, knowing how long it would take. When the compiler was finished, I would go through any errors that were listed, open the appropriate source module, and locate the offending line of code. This was a luxury, veteran developers told me, compared to punch cards or submitting code to be compiled as overnight batch jobs.

Everything changed with Microsoft's Visual C++ 1.0. There were other IDEs available for Windows development (Borland and Symantec both offered them), but Visual C++ quickly took the lead. Over time, Microsoft set the standard for modern IDE features with Visual Studio, which included tools for building C, C++, and Visual Basic applications for Windows. With Visual J++, Microsoft even extended its support to Java. Alas, Visual J++ was abandoned, never to be extended to support Java past Version 1.1. For me, this put Java development back into the "dark ages" of using a plain old editor and command line. Quickly, vendors such as Borland and Symantec offered Java IDEs, with JBuilder and Visual Café, respectively. However, these IDEs felt clunky because they placed their own abstractions on top of important aspects of Java development, such as setting the classpath and defining packages. Because of this, not to mention a high price tag and the fact that it was still difficult (if not impossible) to debug J2EE applications running inside application servers, many developers stuck with simple editors for development. Debugging often consisted of adding System.out.println statements in the code.

Sun recognized the shortcomings of Java development tools and offered its own IDE after acquiring NetBeans Developer from Forte. Today, it's simply called NetBeans, and Sun has made it open source. I first used NetBeans for Java development about two years ago and wasn't entirely comfortable with it. Consequently, I began to use Eclipse. With it, I could develop and debug Java code within a full-featured IDE that didn't feel clunky. Soon, application server vendors such as BEA offered Eclipse plug-ins, letting us debug J2EE applications as well.

Eric is a consultant in New York and has worked extensively in Java and C++ developing real-time trading and financial applications. He can be contacted at [email protected].
NetBeans 4.1 Overview
With the release of NetBeans 4.1 (http://www.netbeans.org/), NetBeans has become a premier Java development platform, letting you easily build, package, and deploy standalone J2SE applications, J2EE applications (EJBs), Java web services, and (its most unique feature) wireless mobile MIDP applications that run on cell phones. NetBeans features include:
• Support for both J2SE 1.4.2 and 1.5 (development and debugging).
• Support for both J2EE 1.3 and 1.4 (Servlet and/or EJB development, deployment, and debugging).
• Support for J2ME Wireless Toolkit 2.2, MIDP 2.0, and CLDC 1.1 (development, debugging, and emulation).
• The ability to create, test, and consume web services.
• Project builds that are based entirely on Apache Ant scripts. Your projects can be built outside of NetBeans using the Ant scripts it composes, or you can import your existing Ant scripts into NetBeans. (A minimal illustration of such a script appears at the end of this overview.)
• Code refactoring that lets you easily rename fields, methods, classes, interfaces, and packages, as well as add get/set methods for fields.
• Code profiling that lets you check for memory leaks and CPU performance bottlenecks.
• Code editing features such as code-specific fonts and colors, code formatting, automatic indent, bookmarks, tasks, code folding (expand and collapse), auto completion, macros, and block tabbing and commenting. Configurable editor settings for Java, JSP, CSS, HTML, XML, DTD, properties, and plain text files.
• A classpath that can be set per project through a straightforward interface. You can simply add JAR files or folders directly,
or you can add libraries, which are sets of one or more JAR files that you group together in advance.
• A Navigator component (part of the editor UI) that lets you quickly browse through the methods, fields, and inheritance tree for each Java source module without scrolling through the source file itself.

Overall, the NetBeans interface is attractive and efficient (see Figure 1). The default layout is as you would expect, with a component to display your currently opened projects, a portion of the screen dedicated to code editing, the navigator component, and an area to display the output of builds and debug sessions. The project display does a nice job of breaking out the components of each project, such as the source files (by package), the libraries and JAR files used in the project, and any test packages you may want to include to test your code. When you're in debug mode, there are components to display the states of variables, show the entire call stack, and view the application's running threads. The layout is customizable, visually appealing, and wastes no space.

I found the NetBeans UI to be responsive, and in many areas slightly more responsive than Eclipse. In particular, NetBeans performed better while opening/closing projects, and while stepping through code in the debugger. The performance difference isn't great, but noticeable.

But NetBeans is technically more than just an IDE. It is an entire platform that can be used as an application runtime, eliminating the need for you to write window, menu, or other UI-related code. The NetBeans platform is a framework that can be used to build application user interfaces, with data storage access and presentation management. When you use the full NetBeans development platform, you simply concentrate on your application's business logic.
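Because NetBeans project builds are plain Ant scripts, it helps to know roughly what such a script contains. The fragment below is a minimal, hand-written example in the spirit of an IDE-generated build, not the (much larger) script NetBeans actually produces; the project name, paths, and target names are illustrative only.

<project name="MyApp" default="jar" basedir=".">
    <property name="src.dir"   value="src"/>
    <property name="build.dir" value="build/classes"/>
    <property name="dist.jar"  value="dist/MyApp.jar"/>

    <!-- Compile all Java sources into the build directory -->
    <target name="compile">
        <mkdir dir="${build.dir}"/>
        <javac srcdir="${src.dir}" destdir="${build.dir}" debug="true"/>
    </target>

    <!-- Package the compiled classes into a runnable JAR -->
    <target name="jar" depends="compile">
        <mkdir dir="dist"/>
        <jar destfile="${dist.jar}" basedir="${build.dir}">
            <manifest>
                <attribute name="Main-Class" value="com.example.Main"/>
            </manifest>
        </jar>
    </target>

    <!-- Remove all generated artifacts -->
    <target name="clean">
        <delete dir="build"/>
        <delete dir="dist"/>
    </target>
</project>

Running "ant jar" from the project directory (or letting the IDE invoke equivalent targets) produces dist/MyApp.jar with no IDE-specific machinery involved.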
Figure 1: NetBeans IDE.
Figure 2: Eclipse UI. 18
Eclipse 3.1 and the Web Tools Platform Overview
In many ways, it's difficult to compare Eclipse to NetBeans. The overview documentation for Eclipse (http://www.eclipse.org/) describes the platform as "an open, extensible IDE for anything, and nothing in particular." Eclipse is a platform that can be used for Java development, as well as C, C++, Smalltalk, COBOL, Perl, Python, and Fortran. There are tools that integrate UML and other design methodologies into the platform, too. However, in this article I focus on the Eclipse plug-ins and facilities for developing Java applications. When you install the appropriate plug-ins, such as those available for WebLogic (https://eclipse-plugin.projects.dev2dev.bea.com/) and Tomcat (http://jakarta.apache.org/tomcat/resources.html), along with the Eclipse Web Tools Platform (http://www.eclipse.org/webtools/index.html), you can develop, deploy, and debug J2EE applications with Eclipse 3.1. Among Eclipse's features are:

• Support for both J2SE 1.4.2 and 1.5 (development and debugging).
• Support for both J2EE 1.3 and 1.4 (Servlet and/or EJB development, deployment, and debugging).
• The ability to create, test (with the Eclipse web-service validation tools plug-in), and consume web services.
• Project builds that support Ant, but the Eclipse IDE is required for all software builds.
• Code refactoring.
• Code profiling with the appropriate plug-in, such as that available from SourceForge (http://sourceforge.net/projects/eclipsecolorer/).
• Code formatting using predefined, customizable code styles.
• The normal code-editing features, such as those supported by NetBeans.
• A code outline component similar to the NetBeans Navigator that lets you browse code.
• Support for programming languages and environments other than Java.

The Eclipse UI (Figure 2) is just as attractive and efficient as NetBeans. However, there are more options available to customize the look-and-feel in Eclipse, such as view-specific fonts, the tab layout, and whether to display text in tabs. The fact that Eclipse breaks up the display into perspectives (different screen component layouts) is both a plus and minus. For example, I find it useful to have separate perspectives for debugging and coding, but some people may find this confusing. However, if you're developing in different languages or want to alter the display when debugging to emphasize (say, the call stack instead of the code editor), then having separate perspectives is a necessity.

In terms of performance, the one area that Eclipse outperformed NetBeans was on startup time. In all cases (with and without a project open), it took noticeably less time for Eclipse to start. However, the difference was nominal, and its significance questionable in the bigger picture. My overall opinion is that both Eclipse and NetBeans can afford quicker startup times.

Standard Java Development (J2SE)
Both Eclipse and NetBeans support Java development using multiple JDK versions, including 1.4.2 and 5.0. With Eclipse,
you can add new Java runtime environments in the Preferences dialog box, available through the menu titled Windows. Within the Preferences dialog box, expand the Java entry within the tree list, and click on Installed JREs. Here, you can add JDK/JRE entries, and set one as the default for new projects. With NetBeans, you add versions of the JDK (and set a default for new projects) via the Java Platform Manager, available from the Tools menu. Because NetBeans contains substantial support for desktop and mobile Java development, you can add development platforms for both J2SE and J2ME developments via this dialog box. The process of starting a new J2SE development project in both Eclipse and NetBeans is similar. To start, choose the New Project menu item from the File menu, available in both IDEs. With NetBeans, select the project type by choosing an item listed in the Categories list. The General category is for standard desktop Java applications, or plain-old Java object (POJO) development. With Eclipse, select the Java Project option within the Wizards tree list. After you enter a project name and select a project directory, the project is opened within each IDE’s project display (see Figure 3). Both IDEs do a good job of displaying the component of projects, either by source packages or by folders that you create within each project. The NetBeans Projects display lists components mainly by their package type, while the Files display is a straightforward folder/file list. Eclipse does the same via its Package Explorer and Navigator views, respectively. Project components are broken out differently depending upon the project. For instance, a web-application project includes additional tree nodes for web pages, web services, and web configuration files. There’s also an extensive use of icons within the different parts of a project to help you quickly navigate through the various components at a glance. It all comes down to editing code — but before you actually start typing, you need to know where to start. Both IDEs offer visual code browsers that let you quickly locate classes, methods, and fields within a project without scrolling through numerous source files. In NetBeans, it is the Navigator display, while in Eclipse it is the Outline view, as in Figure 4. In both cases, class members are displayed with icons that indicate public or private access, static or final declaration, or in the case of Eclipse, whether a method contains errors or warnings. NetBeans includes an inheritance tree within the display, whereas Eclipse uses a separate component, the Hierarchy view, to accomplish this. Both IDEs have similar code-editor features, such as syntaxspecific fonts and colors, code folding, bracket and parentheses matching, and warning/error indicators. Code folding lets you collapse portions of your code to one summary line, such as import lists, comment blocks, and entire function definitions. Both IDEs represent folding with an indicator placed within the editor’s left margin, as well as with an ellipsis (“…”) in the folded code itself. Warnings and errors are indicated, in real time, as you write code within the editor’s margin. When you hover (with the mouse) over the indicator in the margin, a tooltip is displayed with the problem description. Both NetBeans and Eclipse offer solutions to the problem when you click on the margin, although I found Eclipse to be more thorough and accurate in its resolution options. 
To avoid errors and reduce typing, both IDEs support what is called “auto completion.” If you type a class or object name, for example, the autocompletion feature displays the appropriate method or field names that apply. As you scroll through the list of methods and fields, the associated Javadoc entry is http://www.ddj.com
displayed. You can choose a method or field from the list displayed, or you can narrow down the choices by typing the beginning letters of the method or field you’re looking for. After choosing a method, the display shows the parameter list, highlighting the next parameter with each comma you type. This functionality is almost identical for both NetBeans and Eclipse. Once you are done editing and have all syntax-related errors cleaned up, it’s time to get the bugs out by stepping through portions of your code. Both IDEs support source-level debugging for all Java projects, although Eclipse requires the appropriate plug-ins for JSP, Servlet, and EJB debugging. The plug-in you use to accomplish this is usually the one that comes with the application server you are deploying to (such as BEA WebLogic). NetBeans comes installed with built-in support for Tomcat and Sun Java System Application Server for J2EE project debugging. While this is an advantage for NetBeans users in that it eliminates set-up, Eclipse plug-ins make it possible to develop and debug with languages and platforms not supported by NetBeans (such as C++).
Figure 3: The NetBeans Projects display and the Eclipse Package Explorer categorize the components of each project in a similar way.
Figure 4: NetBeans shows the Navigator display, while Eclipse shows the Outline view.
Both IDEs show where breakpoints are set with an indicator in the editor’s left margin. You can set absolute breakpoints (where execution stops at a line of code) or conditional breakpoints (where execution stops when a condition is met). There is support for stepping into method calls as well as over them, jumping out of methods, and “run to cursor,” where the code executes until it reaches the selected line of code. NetBeans and Eclipse both support built-in source-code version control. Eclipse supports only CVS by default upon installation, and NetBeans supports CVS, Serena PVCS, and Microsoft Visual SourceSafe (VSS). However, Eclipse also supports PVCS and VSS via plug-ins, such as those available from SourceForge (http://sourceforge.net/projects/pvcsplugin/, and http://vssplugin .sourceforge.net/). Most source-code control vendors — such as Perforce, Subversion, Mavin, and Microsoft — offer Eclipse plugins for their tools as well. To connect to a CVS repository in Eclipse, for example, you switch to the CVS Repository Exploring perspective. Once there, right-mouse click in the CVS Repositories view, and choose the New->Repository Location menu options. When the Add CVS Repository dialog appears, enter the appropriate CVS information, and click Finish. NetBeans has a similar CVS setup dialog box, accessible through the Versioning Manager, which is available from the Versioning menu. Both IDEs let you easily add new projects to the version control system you choose, or to pull an existing project from version control. Enterprise Java Development (J2EE) Enterprise Java development has long been the domain of expensive IDEs, such as Symantec’s Visual Café, Borland’s JBuilder, and Oracle’s JDeveloper. However, with open-source IDEs such as Eclipse and NetBeans supporting the development, deployment, and debugging of J2EE applications, the barriers to entry have been
removed. Once again, plug-ins let Eclipse work with just about any available application server. There is built-in support for Tomcat and any J2EE-compliant application server such as BEA WebLogic, JRun, and JOnAS. Additionally, there are plug-ins available from vendors or open-source projects such as SourceForge/ObjectWeb (http://forge.objectweb.org/projects/lomboz/) and Sysdeo (http://www.sysdeo.com/). The Preferences dialog is where you configure Eclipse to work with the appropriate application server. This shows that there are various application server plug-in entries, such as Tomcat and WebLogic (which is chosen). For the WebLogic plug-in, you need to enter WebLogic-specific information, the Java classpath, and the Eclipse project(s) it applies to. Each plug-in has its own set-up requirements, but the paradigm is generally the same. To add a new application server runtime environment, such as for WebLogic, click on the Add button within the Server…Installed Runtimes preferences. Next, choose from the available server types and enter the appropriate directory and classpath information. Finally, when you create a J2EE project in Eclipse, you need to tell it which application server runtime to deploy to and run within. To do this, right-mouse click on the project name and select the New menu. Next, choose the menu option Other, then choose Server Wizard from the dialog box that appears. At this point, choose from the preconfigured J2EE application server runtimes that you created previously. NetBeans comes with built-in project configurations for Tomcat as well as Sun Java System Application Server 8.1. You can add support for just about any Java application server via the Server Manager interface, available from the Tools menu. Here, you can modify the settings of existing server definitions or add new ones via the Add Server button.
The NetBeans Runtime display component provides a global view of the available servers, running processes, configured databases, web servers, source control definitions, and web services defined in your development environment. The display also lets you add new servers, start/stop existing servers, connect to databases, and add web services. Additionally, you can view a
database's tables and stored procedures, create new tables, and execute queries. This feature is useful as it allows you to stay within the NetBeans IDE even when you need to modify a data model or test a query or stored procedure.

Web-Service Development
Web services can be created only within an Enterprise project in NetBeans. To create one, right-mouse click on the project name, and choose the New->Web Service menu option. A dialog box appears for you to enter the service name, the package, and location. You can choose to create one from scratch or start from an existing class or WSDL definition. The new service is listed within your project under the Web Services folder. To add a new operation, right-mouse click on the web-service name and choose Add Operation. Figure 5 is a project with two web services created (HotelWS and RestaurantWS), each with its available operations.

With NetBeans, making a call to a web-service operation is simple. In an Enterprise application with a Servlet, for example, you begin in the Servlet's Java class. In the code where you wish to call into a web service, right-mouse click in the editor, and choose Web Service Client Resources->Call Web Service Operation from the popup menu. The available web services for your environment are displayed, along with their respective operations. Selecting an operation inserts the code needed to call into that web service and handle the response.

Although Eclipse doesn't make it as easy to consume a web service as NetBeans, it does have a visual web-service definition tool, called the WSDL Editor. The tool (Figure 6) lets you visually define the WSDL for your web service. You can map whole web-service definitions to messages, specify the SOAP bindings and port types, add new operations, and define new inputs and outputs.
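To give a sense of what web-service client code of this era looks like, here is a rough, hand-written sketch using the standard JAX-RPC dynamic invocation API rather than the generated code the IDE inserts. The endpoint URL, namespace, port, and operation names are invented for illustration and would need to match the WSDL actually deployed.

import java.net.URL;
import javax.xml.namespace.QName;
import javax.xml.rpc.Call;
import javax.xml.rpc.Service;
import javax.xml.rpc.ServiceFactory;

public class HotelClient {
    public static void main(String[] args) throws Exception {
        // Hypothetical WSDL location and names -- adjust to the deployed service.
        URL wsdl = new URL("http://localhost:8080/HotelWS/HotelWS?WSDL");
        QName serviceName = new QName("urn:HotelWS", "HotelWS");

        // Build a Service from the WSDL and create a call for one operation.
        ServiceFactory factory = ServiceFactory.newInstance();
        Service service = factory.createService(wsdl, serviceName);
        Call call = service.createCall(new QName("urn:HotelWS", "HotelWSPort"),
                                       new QName("urn:HotelWS", "getHotelName"));

        // Invoke the operation (no parameters in this illustrative case).
        Object result = call.invoke(new Object[] {});
        System.out.println("Web service returned: " + result);
    }
}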
Mobile Java Development (J2ME)
NetBeans has built-in support for J2ME Mobile Java application development (MIDP). Likewise, Eclipse supports J2ME development with the appropriate plug-in. Typically, you can get a plugin from a phone vendor, such as Nokia. For this article, I worked with Nokia's Developer Suite for J2ME (http://www.eclipseplugins.info/eclipse/plugin_details.jsp?id=712). The plug-in lets you build J2ME projects within Eclipse, and comes with emulators for the Nokia Series 60, Series 90, and 6320 models. The Nokia Eclipse plug-in adds a new project type, a MIDP Project, and adds a toolbar to the IDE. With it, you can define new packages, new MIDP classes, phone deployment parameters, and emulator configurations. When debugging, Eclipse starts the appropriate emulator and allows you to step through code seamlessly. UI development, however, is not done in Eclipse, but is done through the standalone Nokia development suite installed with the plug-in.

NetBeans, however, has a J2ME UI development tool built-in — the Screen Designer. The tool lets you add menu commands (such as OK, Exit, Back, and so on), lists, text boxes, alerts, gauges, images, and other UI elements appropriate for mobile application development. When you add UI elements, NetBeans automatically generates code to handle the component actions. You need to add the implementation, but all the skeleton code is created for you. With the NetBeans Flow Designer (Figure 7), you design the flow of a J2ME application's screens based on user actions. You can visually specify exactly what happens as users choose certain commands, enter data into a form, or choose entries from a list, for example.

Doing this visually lets you quickly arrange and rearrange an application's UI behavior without touching any code. Of course, you can still get to the code by switching the editor to the Source view at the top of the editor window. You can toggle a single Java source file between the Source view, Screen Designer view, and Flow Designer tool as you see fit; actions taken in any view affect the same source module.
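For readers who have not written MIDP code before, the skeleton such designers produce boils down to a MIDlet with a screen, one or more commands, and a commandAction handler to fill in. The class below is a hand-written approximation of that shape, not the code NetBeans or the Nokia suite actually generates; all names are illustrative.

import javax.microedition.lcdui.Command;
import javax.microedition.lcdui.CommandListener;
import javax.microedition.lcdui.Display;
import javax.microedition.lcdui.Displayable;
import javax.microedition.lcdui.Form;
import javax.microedition.lcdui.StringItem;
import javax.microedition.midlet.MIDlet;

public class HelloMIDlet extends MIDlet implements CommandListener {
    private Form form;
    private Command exitCommand;

    protected void startApp() {
        // Build a simple screen and register this MIDlet as its command handler.
        form = new Form("Hello");
        form.append(new StringItem(null, "Hello from MIDP 2.0"));
        exitCommand = new Command("Exit", Command.EXIT, 1);
        form.addCommand(exitCommand);
        form.setCommandListener(this);
        Display.getDisplay(this).setCurrent(form);
    }

    protected void pauseApp() {}

    protected void destroyApp(boolean unconditional) {}

    public void commandAction(Command c, Displayable d) {
        if (c == exitCommand) {          // your own handling goes here
            destroyApp(true);
            notifyDestroyed();
        }
    }
}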
Figure 8: NetBeans smart phone emulator.

NetBeans supports J2ME application debugging and comes with a built-in smart phone emulator (Figure 8). You can interact with the application as you would on the real phone. Button and menu selection, and text entry, can be performed using the PC mouse and keyboard. As with Eclipse, you can download development tools from a specific cell phone vendor, such as Nokia, that integrate with NetBeans and include your target phone's emulator.

Conclusion
Comparing IDEs such as Eclipse and NetBeans is similar to comparing development languages — there is no winner. Both IDEs are competent development platforms for J2SE, J2EE, and J2ME development. Choosing between them largely comes down to taste, influence by others, and whether you develop in other languages in addition to Java. DDJ
Visual Studio 2005 Visualizers See what you are debugging in Visual Studio 2005 JAMES AVERY
Debuggers have long been a part of the development process. Ever since the first debugger, programmers have been using debuggers as a way to step through executing code, view the values of variables, and watch how the program reacts to different situations. However, most modern debuggers have failed to adapt to the increased prevalence of complex objects in modern programming languages. These debuggers simply display the object in a tree structure that can be hard to navigate and hard to understand, or simply summarize the object using its string representation in a tool tip. This becomes even more problematic when dealing with objects that store their data as XML or some type of encoded string. Trying to determine the value of these types of variables inside the debugger becomes almost impossible.

One notorious example in .NET is the DataSet object. Trying to get a look at the data inside of a DataSet is often painful as you can have multiple tables and then have a plethora of rows in each of those tables. Trying to find an individual row is an exercise in browsing through numerous tables and possibly hundreds of rows. Figure 1 shows a watch window in Visual Studio.NET 2003 displaying the data inside of a single row in a DataSet.

James is a .NET architect and author of Visual Studio Hacks (O'Reilly & Associates, 2004). He can be contacted at [email protected].

To solve this problem, Visual Studio 2005 introduces "debugger visualizers"—Windows Forms dialogs that can be used to create graphical views into the value of an
object, and in some cases even add additional troubleshooting functionality to the debugger. While debugging, you see a small magnifying glass next to certain types. When you click on this magnifying glass, you see one or more visualizers listed in the dropdown menu to choose from, as in Figure 2. After selecting the visualizer to use, Visual Studio then displays that visualizer; the dataset visualization can be seen in Figure 3. Using the DataSet visualizer, you can easily navigate through the tables and rows inside of a DataSet and edit the data of the DataSet. It will then be persisted back to your application. The new visualizer is leaps and bounds ahead of the old tree view model of digging through rows and tables, with an excellent editing ability to boot. Microsoft is planning to release HTML, XML, and string visualizers in addition to the DataSet visualizer with the final release of Visual Studio 2005. (This article and its code are based on the Beta 2 release of Visual Studio 2005. There is a possibility that some of the details will change before the final release of Visual Studio 2005.) How It Works Visualizers run in both the application you are debugging and the debugger: This means that they must send the object you want to visualize between these two different application domains. Visualizers accomplish this by using a System.IO.Stream, meaning that any data that will be passed from the application to the visualizer must be serialized and sent through this stream. Thus, writing a visualizer for an object that is not serializable requires a little bit of extra work. The visualizer infrastructure also provides an outgoing stream to pass data back from the visualizer to the executing application, letting you create visualizers that can edit the data they are visualizing. This means that visualizers can be more than just tools used to inspect the value of an object; they can actually become editors or analyzers, as you will see in a later example. Once the data has been transferred from the executing application to the visualizer, there are virtually endless possibilities of Dr. Dobb’s Journal, August 2005
what can be done because the dialog shown to users is simply a Windows form shown in dialog mode. Just about anything that can be done in a normal Windows Forms application could be done from a visualizer.
"Visualizers are Windows Forms dialogs that can be used to create graphical views into the value of an object"

Visualizer Extensibility Model
While the default visualizers are an excellent new feature, the best part is that the model is completely extensible. Without too much effort, you can quickly develop a custom visualizer, either for objects in the .NET Framework or for objects in your own projects. The visualizer extensibility model allows a visualizer to be linked to multiple classes, and multiple visualizers to be linked to a single class. To illustrate, I present a simple visualizer that can be used when debugging base64-encoded strings. Normally, when strings are encoded in base64, their value is difficult to view in the debugger; you would have to cut-and-paste the value into some other tool and convert the string, and updating the string would be equally hard. Both of these problems can be solved by writing a visualizer to view, edit, and save base64 strings. The goal of this visualizer is to let users view the decoded string, then edit the value and save it back to the executing application. The first step for creating a visualizer is to create a new Class Library project in Visual Studio 2005. The next step is to right-click on the references folder and add a new reference to the Microsoft.VisualStudio.DebuggerVisualizers assembly,
which you can find in the .NET tab of the Add References dialog. This is the assembly that contains the interface and abstract class needed to write a custom visualizer. A basic visualizer consists of a number of different parts:

• The object source that Visual Studio uses to transfer data from the application to the visualizer and back.
• The Windows form that displays the data.
• A small debugger-side class that ties these two components together.

The Object Source
The role of the object source is to facilitate the transfer of the object from the application being debugged, through the Visual Studio infrastructure, to the visualizer form. If the object you are visualizing is serializable, the default object source will actually do the serialization for you. If the object is not serializable, you will need to determine what values can be passed to the visualizer and send that data; for instance, if visualizing a SqlConnection object, you might send the connection string and connection status, because the entire object cannot be serialized and sent.
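As a purely illustrative sketch of that non-serializable case (the class name and the decision to ship the connection string plus state are assumptions for this example, not code from the article, and the VisualizerObjectSource API shown is the Beta 2 API the article targets):

    namespace ConnectionVisualizer
    {
        using System.Data.SqlClient;
        using System.IO;
        using System.Runtime.Serialization.Formatters.Binary;
        using Microsoft.VisualStudio.DebuggerVisualizers;

        // Hypothetical object source for SqlConnection: the connection itself
        // cannot be serialized, so only the values we care about are sent.
        public class ConnectionObjectSource : VisualizerObjectSource
        {
            public override void GetData(object target, Stream outgoingData)
            {
                SqlConnection connection = (SqlConnection)target;
                string summary = connection.ConnectionString + "|" + connection.State.ToString();
                BinaryFormatter formatter = new BinaryFormatter();
                formatter.Serialize(outgoingData, summary); // strings serialize fine
            }
        }
    }

Whatever you choose to send only has to be something the formatter can serialize and the form knows how to unpack on the other side of the stream.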
Figure 2: Visual Studio 2005 visualizer.
Because the string object is serializable, I actually don't need to create a custom object provider. However, I do so anyway for the sake of completeness and to illustrate what needs to be done in situations where automatic serialization is not possible. To create the object source:

1. Add a new class to your project.
2. Add a using statement for the Microsoft.VisualStudio.DebuggerVisualizers namespace to your class.
3. Set your class to inherit from VisualizerObjectSource.
4. Override the GetData method and provide the transport method for your object; in this example, it is simply a matter of serializing the object using the binary formatter.

Listing One presents the complete object source.

The Form
The form is where users view and interact with the object they are visualizing. Before you can add a Windows form to your Class Library project, you first need to right-click on the references folder and add a reference to the System.Windows.Forms assembly. For this visualizer, the form needs to:

1. Accept the object source as part of its constructor; this allows the form to read/write the value of the variable through the object source.
2. Convert the string from base64 encoding to plain text.
3. Convert the plain text back to base64 and save the object (a sketch of such a form follows below).

The saving of the object is done through the object provider using the ReplaceObject( ) method (this method can be overridden as well if needed). Listing Two is the code used for the form.
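Listing Two itself is not reproduced here, so the following is a minimal, hypothetical sketch of a form along those lines. The form name (Base64StringForm), its controls, and the base64Encode/base64Decode helpers are invented for illustration, and the IVisualizerObjectProvider calls (GetObject, IsObjectReplaceable, ReplaceObject) reflect the Beta 2 API this article targets, so details may differ in the final release:

    namespace Base64StringVisualizer
    {
        using System;
        using System.Text;
        using System.Windows.Forms;
        using Microsoft.VisualStudio.DebuggerVisualizers;

        // Hypothetical visualizer form: shows the decoded text and writes
        // edits back through the object provider when the user clicks Save.
        public class Base64StringForm : Form
        {
            private IVisualizerObjectProvider objectProvider;
            private TextBox textBox = new TextBox();
            private Button saveButton = new Button();

            public Base64StringForm(IVisualizerObjectProvider objectProvider)
            {
                this.objectProvider = objectProvider;
                textBox.Multiline = true;
                textBox.Dock = DockStyle.Fill;
                saveButton.Text = "Save";
                saveButton.Dock = DockStyle.Bottom;
                saveButton.Click += new EventHandler(OnSave);
                Controls.Add(textBox);
                Controls.Add(saveButton);

                // Read the base64 string from the debuggee and show it decoded.
                string encoded = (string)objectProvider.GetObject();
                textBox.Text = base64Decode(encoded);
            }

            private void OnSave(object sender, EventArgs e)
            {
                // Re-encode the edited text and push it back to the application.
                if (objectProvider.IsObjectReplaceable)
                    objectProvider.ReplaceObject(base64Encode(textBox.Text));
            }

            public static string base64Encode(string text)
            {
                return Convert.ToBase64String(Encoding.UTF8.GetBytes(text));
            }

            public static string base64Decode(string encoded)
            {
                return Encoding.UTF8.GetString(Convert.FromBase64String(encoded));
            }
        }
    }

Passing the object provider into the constructor keeps the form self-contained; the debugger-side class only has to construct the form and show it. The base64Encode( ) helper here also plays the role of the method that Listing Four borrows from the form listing.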
Deploying Visualizers
Before the visualizer is ready to be deployed, you must first add an assembly-level attribute such as this:

[assembly: DebuggerVisualizer(
    typeof(Base64StringVisualizer.Base64StringVisualizer),
    typeof(Base64StringVisualizer.Base64VisualizerObjectSource),
    Target = typeof(string),
    Description = "Base64 String Visualizer")]
The first parameter specifies your debugger-side class that inherits from DialogDebuggerVisualizer. The second value specifies your custom object source. The Target parameter specifies the type that this visualizer is built for, and the Description parameter specifies the name for the visualizer that should be displayed in the visualizers menu. An assembly can have multiple DebuggerVisualizer attributes, so you can link a visualizer to multiple target types or include multiple visualizers in a single assembly. Once you have added the assembly-level attribute, you install the assembly.
Figure 4: First step to running the visualizer.
Figure 3: DataSet visualization.
The Debugger-Side Class
The debugger-side class is the last class needed for the visualizer. This class is used by Visual Studio to actually display your custom form. To create the debugger-side class, you simply add another class to your project and set it to inherit from the DialogDebuggerVisualizer class. You then need to override the Show method; inside that method, create an instance of your visualizer form, passing in the object source, and lastly call the windowService.ShowDialog( ) method, passing in your form. When the user selects your visualizer, the debugger calls your Show method, creating an instance of your form and displaying it to your user. Listing Three presents the DebuggerVisualizer class.
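Listing Three is not reproduced here; a rough sketch of what such a debugger-side class looks like under the Beta 2 API, reusing the hypothetical Base64StringForm from the sketch above, would be:

    namespace Base64StringVisualizer
    {
        using Microsoft.VisualStudio.DebuggerVisualizers;

        // Debugger-side class: Visual Studio calls Show( ) when the user
        // picks this visualizer from the magnifying-glass drop-down.
        public class Base64StringVisualizer : DialogDebuggerVisualizer
        {
            protected override void Show(IDialogVisualizerService windowService,
                                         IVisualizerObjectProvider objectProvider)
            {
                // Hand the object provider to the form so it can read and
                // replace the value being visualized, then show the dialog.
                Base64StringForm form = new Base64StringForm(objectProvider);
                windowService.ShowDialog(form);
            }
        }
    }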
Adding a new visualizer does not involve registry settings or even configuration settings; you simply drop your assembly into one of two file paths:

• \Documents and Settings\%profile%\My Documents\Visual Studio\Visualizers. Visualizers added here will only be available to this user. (You may need to create this directory if it does not exist.)
• \%vs install directory%\Common7\Packages\Debugger\Visualizers. Visualizers added here will be available to every user on this machine.

Debugging Visualizers
Because the visualizer is an assembly with no entry point, it is inherently hard to debug directly. Microsoft anticipated this and made it easy to debug visualizers by including the special class VisualizerDevelopmentHost. This class can be used to directly load your visualizer without the hassle of loading and debugging
another application. To debug your debugger visualizer:

1. Add a new Console Application project to your solution and set it as the default startup project.
2. Right-click on the references folder in your console application and add references to your visualizer project as well as the Microsoft.VisualStudio.DebuggerVisualizers assembly.
3. Declare an instance of the variable you want to visualize. In this example, I create a base64-encoded string.
4. Pass your local variable to the VisualizerDevelopmentHost, as in Listing Four.

You can now place breakpoints in your visualizer and simply launch the debugger; the VisualizerDevelopmentHost launches your visualizer just as if it were being launched from a normal running instance of Visual Studio.
Listing One

using System.Runtime.Serialization.Formatters.Binary;
using Microsoft.VisualStudio.DebuggerVisualizers;

namespace Base64StringVisualizer
{
    class Base64VisualizerObjectSource : VisualizerObjectSource
    {
        public override void GetData(object target, System.IO.Stream outgoingData)
        {
            BinaryFormatter formatter = new BinaryFormatter();
            formatter.Serialize(outgoingData, target);
        }
    }
}
Running the Visualizer To run the visualizer, you simply need to create a test project, create a base64 encoded string, and then launch the debugger. You first see the visualizer in the dropdown visualizer menu, as in Figure 4. After selecting the visualizer in the menu, you will see the running visualizer. You can view the base64 string, the decoded string, and then edit and save the decoded string value. When you save the value, it is sent back to the executing application. Visualizers make debugging complex, modern day applications easier, and will truly enhance the debugging experience in Visual Studio 2005. The extensibility model makes it easy to create your own visualizers for either framework classes or your own classes.
Listing Four

string base64string = "This is a test of the Base64 system";
base64string = base64Encode(base64string); // using method from Listing 2
VisualizerDevelopmentHost visualizerHost =
    new VisualizerDevelopmentHost(base64string, typeof(Base64StringVisualizer));
visualizerHost.ShowVisualizer();
James is a .NET architect and author of Visual Studio Hacks (O'Reilly & Associates, 2004). He can be contacted at [email protected].

DDJ
The Eclipse Modeling Framework
Moving into model-driven development
FRANK BUDINSKY

Frank is an engineer in IBM's Software Group and lead author of Eclipse Modeling Framework: A Developer's Guide (Addison-Wesley, August 2003). He can be contacted at [email protected].
The idea of building applications by first modeling them, then transforming these models into implementation code, has been around for many years. Providing a higher level abstraction for defining software would seem to be a natural evolution. Twenty or so years ago, structured programming languages replaced assembly language, or machine code, as the most popular way to write software. About 10 years ago, object-oriented programming languages became entrenched as the predominant languages, again raising the abstraction level. Lately, there has been a lot of talk about model-driven development as the next higher level of abstraction. Each step in the evolution of software development has been accompanied by skepticism, and model-driven development is no different. The skepticism is usually the result of overly grandiose visions and promises, opening up the visionaries to attack from the more practical types. Many programmers think that class diagrams might be helpful to document their designs, but they know that implementing complex systems by simply "transforming a picture" is a pipe dream. They know that all the expressive power of a programming language can't be available in a model, because if it were, the model wouldn't be any simpler (higher level); it would just be another programming language. That said, most programmers do recognize that generating some of the code that
they write over and over must be possible. How many copy-and-paste operations do you need to do before you start to wonder if there couldn’t be a way of specifying parameters for patterns that you want, and just have the code generated automatically? Clearly, these patterns must represent some higher level abstraction that, if only it could be specified (modeled), could enable us to write a lot less code. The problem is that the lack of simple and seamless ways to enter the parameters for these patterns has tended to limit the use of automatic code generation in mainstream development environments. So, to foster the evolution of widespread model-driven development, a low-cost way to gradually introduce it is first needed. The Eclipse Modeling Framework (EMF; http://www.eclipse.org/emf/) has emerged as a middle-ground in the modeling versus programming worlds. Integrated with the Eclipse Java development tools, EMF provides an easy way for you to define models, from which many common codegeneration patterns are generated. It leverages the fact that just about every program we write manipulates some kind of data model. It might be defined using Java, UML, XML Schema, or some other definition language. EMF aims to extract this intrinsic “model,” thereby giving programmers an easy way to get some of the benefits of modeling, without necessarily becoming a full-fledged modeler overnight. Defining the Model Have you wondered how many of the applications you write manipulate data? The answer is pretty close to 100 percent. So where is the description of the data that your program is manipulating? Is the data simply defined by the interfaces in a Java program? Maybe you have an XML Schema that defines the data? Maybe it’s a relational database (RDB) schema? Maybe you’re already a modeling enthusiast and have a UML class diagram? Whatever the case may be, you either already have a “data model,” or are about to implicitly define it in the code you’re going to write. Dr. Dobb’s Journal, August 2005
Say that you want to write a program to manage purchase orders for some store or supplier. A purchase order includes a “bill to” and “ship to” address, and a collection of (purchase) items. An item includes a product name, quantity, and a price. If you’re a Java programmer, you’d probably start by writing some Java interfaces, something like Listing One.
“EMF is a framework and toolkit that extends Eclipse’s Java Development Tooling into modeldriven development” The abstraction provided by Java itself doesn’t explicitly define anything more than simple interfaces and methods, but through naming convention and comments, these interfaces do define the model. For example, the get/set method pairs (or Bean properties, if you want to look at it that way) represent data, or attributes, in the model. The getItems( ) method indicates that the purchase order aggregates, or is associated with (references), its corresponding items. Using this model information, one could easily imagine generating an implementation of many of these methods. Another common way to describe a data model like this one is by using an XML Schema. In this case, the same model information could be provided using XML Schema complexType and element constructs; see Listing Two. Notice that XML Schema is serving two purposes here. In addition to defining the model, it’s also specifying the persistent format of the XML http://www.ddj.com
file, into which the data will presumably be serialized. This dual-purpose role of XML Schema can lead to complexity, especially when some of the more advanced constructs are used, but it is becoming one of the most popular ways of modeling data structures these days. A third, and the simplest, way to represent the purchase order model would be with a UML class diagram, as in Figure 1. Not surprisingly, the UML diagram is the most concise representation. As you can see, it provides all the same information as the other two forms of the model, only this time in a simple picture.

Figure 1: UML class diagram of a purchase order model (PurchaseOrder with shipTo : String and billTo : String, and a 0..* items reference to Item with productName : String, quantity : int, and price : float).

Identifying an application's data model has two big advantages:

• It gives you a nice, high-level way to both communicate the design and to share data with other applications.
• You can potentially generate some, if not all, of the implementation code.

Today's applications are becoming increasingly less monolithic and more integrated with other applications. The key to supporting fine-grain data integration between applications is a common understanding of the data — that is, a model — and ideally a common implementation framework. Being able to generate some of the implementation code is also the key to increased productivity and to the high quality and reliability of these increasingly complex integrated applications.

Figure 2: Eclipse workbench.

Modeling with EMF
EMF is a framework and toolkit that extends Eclipse's Java Development Tooling into the world of model-driven development. EMF provides a metamodel (that is, a model of a model), called "Ecore," for
describing EMF models. Ecore, which is essentially the class diagram subset of UML, is based on the Object Management Group’s (OMG) Meta Object Facility (MOF) specification. The canonical form of an EMF model is an XML Metadata Interchange (XMI) serialization of an Ecore model. However, one of the main strengths of EMF is its flexibility with respect to the means by which the Ecore model can be defined: • XMI; you can create an (Ecore) XMI document directly, using an XML or text editor, or using EMF’s simple tree-based sample Ecore editor. • UML; you can define the model using a commercial UML modeling tool such as Rational Rose, or using an Eclipse plug-in such as Omondo’s free EclipseUML graphical editor (available at http:// www.omondo.com/). • Java Interfaces; basic Java interfaces with a few simple (@model ) annotations can be used to describe an Ecore model. • XML Schema; an XML Schema defining the data structures for the model, can be converted to an Ecore model. • Others; EMF is a highly extensible and customizable framework/toolkit. Support for other forms of model definition is not only possible, but expected. The first approach is the most direct, but generally appeals only to XML gurus. The second choice is the most desirable if you are already using modeling tools, while the Java approach provides the benefits of modeling in a pure Java development environment like Eclipse’s JDT. The XML Schema approach is most desirable when the tool or app is intended to manipulate XML data that is already defined using an XML Schema (as with web services). Regardless of which input form is used to define an EMF model, the benefits are the same.
Figure 3: Eclipse Java Editor.
Given an Ecore model of the classes, or data, of the application, EMF provides a surprisingly large portion of the benefits of modeling. The mapping between an EMF (Ecore) model and Java is natural and simple for Java programmers to understand. At the same time, it’s enough to support fine-grain data integration between applications and provide a significant productivity gain resulting from code generation. From an Ecore model, EMF’s generator can create a corresponding set of Java implementation classes. Every generated EMF class extends from the framework base class, EObject, which enables the objects to integrate and work in the EMF runtime environment. EObject provides an efficient reflective API for accessing the object’s properties generically. In addition, change notification is an intrinsic property of every EObject and an adapter framework can be used to support open-ended extension to the objects. The runtime framework also manages bidirectional reference handshaking, cross-document referencing including demand-load, and arbitrary persistent forms with a default generic XMI serialization that can be used for any EMF model. EMF even provides support for dynamic models; that is, Ecore models that are created (in memory), and then instantiated without generating code. All of the benefits of the EMF runtime framework apply equally to them. In addition to the core EMF framework, another framework, EMF.Edit, extends and builds on the core, adding support for generating adapter classes that enable viewing and command-based (undoable) editing of a model, and even a basic working model editor. Generating a Model It is straightforward to use EMF to convert an XML Schema (the purchase order schema example) into a model, then generate an implementation of the model including an XML model editor for instances of the schema. I start from an Eclipse Workbench with the purchase order schema (SimplePO.xsd) in its workspace; see Figure 2. To create the EMF (Ecore) model from the schema, all you need to do is use the EMF Project Wizard to import the model: 1. Invoke File->New->Project… and select EMF Project. 2. Name the project, that is, “simplepo.” 3. Select Load from an XML Schema and then locate the file SimplePO.xsd. 4. Proceed through the remaining wizard pages, following instructions and accepting any default values. Doing this creates a new Eclipse Java project initially containing two modelrelated files with the following suffixes: http://www.ddj.com
• ecore, an XMI serialization of the Ecore model corresponding to our XML schema. • genmodel, a generator model, which is a wrapper (decorator) model of the .ecore model that provides options relevant only to the code generator (for example, the directories in which to generate the code, and choices for other user-selectable implementation patterns). At this point, you have extracted the EMF model of the application and can now display the UML equivalent of your XML Schema by opening the .ecore file using a graphical UML editor, such as Omondo’s EclipseUML. Given that the XML Schema-based model doesn’t include graphical layout information, the UML model is presented using a default layout. The purchase order model looks nice, but more complicated models may need a bit of graphical editing. To generate the model’s corresponding Java interfaces, as well as implementation code, close the .ecore editor and open the .genmodel using the EMF code generator. A popup menu in the generator lets you generate model implementation code (Generate Model Code), adapters that support editing and display (Generate Edit Code), and even a working model editor (Generate Editor Code). Generate All generates all three. Both the Ecore and the generator models (.ecore and .genmodel) are EMF models themselves. That is, Ecore is its own metamodel, as well as the metamodel for the generator model. Both models have been generated with the EMF code generator, and the EMF code generator is even an EMF-generated editor, like the one just generated for the purchase order model. Now that you’ve generated the Java interfaces and classes, you can look at the third form (Java) of the model by opening the Java files using the Eclipse Java Editor; see Figure 3. Because you’re inside the Eclipse Java Development Environment, the generated Java code has already been compiled and is ready to run. Launching a runtime workbench, you can now create and edit XML instance documents to edit purchase orders and their corresponding items. The generated editor, although simple, is surprisingly powerful, even supporting drag-and-drop and unlimited undo. After editing the purchase order, you can save the file by invoking File->Save. The serialized form will, as expected, be an XML instance document conforming to the original schema (in SimplePO.xsd); see Figure 4. The generated EMF editor provides a good example of the recommended style for EMF model editors and can be used as either a test application or as a starting http://www.ddj.com
point from which to start customizing. This is practical because of another important advantage of EMF: You can regenerate a model after changing any of the generated source code, without wiping out your changes. The EMF generator produces files that are intended to be a combination of generated pieces and hand-written pieces. You are expected to edit the generated classes to add methods and instance variables. You can always regenerate from the model as needed, and your additions will be preserved during the regeneration. Code Generation Patterns Code generating tools often generate code that isn’t particularly clean, simple, or elegant — or even intended to be looked at. It’s just meant to be a behind-the-scenes function that you simply rely on to work. EMF’s code generation, on the other hand, is expected to be viewed, extended, and even modified, so a great deal of effort went into designing clean and elegant patterns — the kind of code you would write by hand. The intention is to solve the repetitive and sometimes difficult problems that you face over and over. Among the available patterns are: Feature Access and Notification. Simple EMF features (properties) are generated using the standard JavaBean pattern. For example, the get method for the shipTo attribute of the purchase order example simply returns an instance variable like Listing Three. The corresponding set method simply sets the same variable, but it also sends a notification to any observers that may be interested in the state change; see Listing Four. Notice that to make this method more efficient when the object has no observers, the relatively expensive call to eNotify( ) is avoided by the eNotificationRequired( ) guard. More complicated patterns are generated for other types of features, especially bidirectional references where referential integrity is maintained. In all cases, however, the code is generally as efficient as possible, given the intended semantic. Bidirectional Reference Handshaking. The implementation pattern generated for bidirectional references provides a great example of the value of EMF’s code generation. Look at an example of a simple doubly linked list of purchase orders. In UML, it might look something like Figure 5. Here you define a simple 1-to-1 bidirectional relationship, which could be used to create a chain of purchase orders. The “next” and “previous” references are used to access a purchase order’s following and previous order, respectively. By defining this as a bidirectional reference, you’re saying that the “next” reference of a givDr. Dobb’s Journal, August 2005
en purchase order and the “previous” reference of its following order should always be kept in sync. If you use Java interfaces to declare this same association, you use @model annotations in the Javadoc to indicate that the two references, “next” and “previous,” are opposites (that is, reverses) of each other, rather than simply a pair of independent references; see Listing Five. Because the two references are opposites, you can set the association from either end, and the other end is automatically updated. This may even involve removing references from other objects. For example, consider the case where three purchase orders (p1, p2, and p3) are initially linked as in Figure 6(a). If you now want to change the “next” reference of p1 to be p3, instead of p2, call the setNext( ) method on p1: p1.setNext(p3);
As a result of executing this single statement, the three objects in Figure 6(b) are updated. As you can see, there is significantly more happening than simply updating the "next" reference of p1. In fact, all of this happens:

• Remove p1.next pointer to p2.
• Remove p2.previous pointer to p1.
• Set p1.next pointer to p3.
• Remove p3.previous pointer to p2.
• Remove p2.next pointer to p3.
• Set p3.previous pointer to p1.
• Send notifications (changes to p1, p2, and p3).

To do this, the generated EMF implementation of the setNext( ) method looks something like Listing Six. Notice that, in addition to adding and removing all of the required references, the notifications are queued and only dispatched at the very end, when all of the objects are in their final consistent state. Although implementing something like this isn't that hard for good programmers, the bottom line is that it isn't trivial and can be hard to get right. By simply leveraging the higher level EMF concept of a bidirectional relationship, we get an efficient, proven, correct implementation pattern for free.

Figure 4: Serialized form.

Figure 5: UML diagram of a doubly linked list of purchase orders (PurchaseOrder with 0..1 next and 0..1 previous self-references).

Object Persistence (Proxy Resolution and Demand Load)
The ability to persist, and to reference other persisted model objects, is one of the most important benefits of EMF modeling; it's the foundation for fine-grain data integration between applications. The EMF framework provides simple, yet powerful, mechanisms for managing object persistence. Many EMF models use the default XMI serialization provided by EMF to persist
their models. Others implement customized serialization, allowing the model to be persisted in its natural (native) format. For example, XML Schema models can be made persistent as .xsd files, Java models as .class files, and so on. Other models, for example, mappings, are typically persisted using (the default) XMI, although specific types of mappings may be serialized differently. For example, XML Schema mappings can also be serialized in XSLT format. Regardless of the actual persistence format, the EMF framework provides generic support for resource saving and loading, including cross-resource linking with proxy resolution and demand load. An object serialized in an XMI file, for example, can reference another object serialized in another format, or possibly even stored in a relational database. EMF’s resource cross-referencing and
demand-loading mechanisms tie them together with one simple and uniform Java API. Conclusion EMF is an open- source project at Eclipse.org providing a framework and code-generation facility for building robust applications based on surprisingly simple models. Models can be defined in several different ways — Java interfaces, XML Schemas, UML — from which EMF will generate a large part of the application. The generated code is clean, efficient, and easily hand modified. You can even regenerate the model after changing the code, without wiping out your changes. EMF provides low-cost modeling for the Java mainstream by leveraging the intrinsic model in an application. With EMF, no high-level modeling tool is required. There are two fundamental benefits from using EMF. First, it results in a productivity gain by providing a nice, high-level way (UML) to communicate the design and then generate part of the implementation code. Second, EMF allows applications to integrate at a much finer granularity than is otherwise possible. The designers of EMF believe that it mixes just the right amount of modeling with programming to maximize the effectiveness of both.
Figure 6: (a) Three purchase orders that are linked; (b) Updating the three objects using setNext().
DDJ
Listing One

public interface PurchaseOrder {
    String getShipTo();
    void setShipTo(String value);
    String getBillTo();
    void setBillTo(String value);
    List getItems(); // List of Item
}

Listing Three

public String getShipTo() {
    return shipTo;
}

Listing Five

public interface PurchaseOrder {
    /** @model opposite="previous" */
    PurchaseOrder getNext();
    /** @model opposite="next" */
    PurchaseOrder getPrevious();
    ...
}

Listing Six

public void setNext(PurchaseOrder newNext) {
    if (newNext != next) {
        NotificationChain msgs = null;
        if (next != null)
            msgs = ((InternalEObject)next).eInverseRemove(this,
                POPackage.PURCHASE_ORDER__PREVIOUS, PurchaseOrder.class, msgs);
        if (newNext != null)
            msgs = ((InternalEObject)newNext).eInverseAdd(this,
                POPackage.PURCHASE_ORDER__PREVIOUS, PurchaseOrder.class, msgs);
        msgs = basicSetNext(newNext, msgs);
        if (msgs != null) msgs.dispatch();
    }
    else if (eNotificationRequired())
        eNotify(new ENotificationImpl(this, Notification.SET,
            POPackage.PURCHASE_ORDER__NEXT, newNext, newNext)); // touch notification
}
The TMS Development Platform
A multiplatform, multitarget development system
ALEXANDER FREY

Alexander is a software engineer at FAST TV SERVER AG (http://www.tv-server.de/) in Munich. He can be contacted at [email protected].
When developing platform-independent software, you quickly reach the point where you need a development environment that reaches beyond the familiar IDEs and makefiles, offering a more universal approach. In this article, I present a simple but powerful solution called the "TV Server Makefile System" (TMS) that we use at Fast TV Server AG (where I work) to deal with development and target platforms ranging from Windows and Linux to various embedded processors. TMS provides development teams with a universal way of building software components, integrates a common test approach, and addresses problems that arise when supporting different hardware and platforms. Together with coding rules and component templates, we use it as a common framework for developing object-oriented software in C — still the main language for embedded programming. I used Make as the build engine because it offers three advantages: It's free, it's available for every platform, and all developers are at least somewhat familiar with it. While the solution I present here is designed for C/C++, it can be used for other languages as well. One of the main goals was to keep the whole system as simple and readable as possible (which
means to not use every trick Make offers and limit the necessary preparations to a minimum) so that developers can understand or modify it. A Project In TMS, a “project” is the basic element for creating software. A project contains everything that you need for building an executable or library. Usually a project is a component in a larger system that offers a certain functionality through an API. To support a test-driven software development process, a project usually produces a library to contain the API implementation but also an executable to test the library. A project can stand on its own, as it contains all files, output- directories, testscripts, documentation, and the like required for building, testing, and using the component. In short, each project consists of the subdirectories in Table 1. The output directories (bin, extlib, lib, and obj) also contain subdirectories for each target platform (such as GCC or Visual C 6), which contain directories for debug and release versions. Although this increases the total number of directories, it offers the advantage of allowing two source/object files (or in a limited way even components) to use the same name, and it also lets you copy or backup a complete project by simply copying the project’s main directory, including its subdirectories. First Steps To be set up on your system, TMS only requires: • The GNU Make tool. On Windows you can to download and install it from the MinGW open-source project (http:// sourceforge.net/project/showfiles.php? group_id=2435). On UNIX platforms, it’s usually available in the development packages. • A system variable called “TMS_BASE_ DIR,” which contains the path to your Dr. Dobb’s Journal, August 2005
working directory but with UNIX slashes (for example, TMS_BASE_DIR=c:/work/). On Windows you can do this in the system control panel; on Linux, using bash, you can add an export statement in your .bash_profile file inside your home directory.
“TMS addresses problems that arise when supporting different platforms”
You can easily develop on Windows for Linux. You only need a Linux server that runs Samba. You can then mount your home directory on the server (create a network drive with a fixed drive letter, for example) and put all of your working directories and files there. To build and test projects on Linux, log in to the server using an ssh or telnet client. Experienced users may prefer an X server, like that offered by Hummingbird Exceed (http://www .hummingbird.com/products/nc/exceed/) or Cygwin (http://www.cygwin.com/). The makefiles The most important thing when writing makefiles is to keep them as simple as possible. In the TMS build system, the functionality is separated in a bunch of makefiles to simplify their use and avoid duplication by separating the common from the special: http://www.ddj.com
• The project makefiles (makefile.mak, for example; see Listing One). • Common settings for specific target platforms (gcc_makefile_settings.mak; see Listing Two). • Common actions for specific target platforms (gcc_makefile_create.mak; see Listing Three). • User settings for specific target platforms (gcc_makefile_user_settings.mak; see Listing Four). Each project has its own project makefile, which consists only of a few lines. Depending on the user’s target platform, it includes the so-called settings and create makefiles, which exist only once inside the $TMS_BASE_DIR/makefiles directory. They define the platform-specific compiler and linker settings, as well as the compiling and linking process. Consequently, a project makefile only needs to specify what is really unique for the project (see Listing One): • Special compiler and linker flags (CINCS, CFLAGS, LDFLAGS). • The source files that should be compiled for the library (LIBOBJECTS) and executable (APPOBJECTS). • Tests (TESTS) that should be performed when Make is called with the test target. • The name of the library (ARCHIVE) and executable (APPLICATION). • The projects/components that are used by this project (dependencies) using their include makefiles (include…). Keeping the project makefiles this simple has two important advantages: They are easy to read and modify, and they do not contain any platform- or targetspecific code. The system makefile (Listing Two) defines the compiler, linker calls, default flags, libraries, paths, and so on. The create makefile (Listing Three) provides actions such as compiling and linking files that are performed when a certain target is defined on the console. Usually you only need to create them once when adding support for new compilers. But not all settings are user independent — some are left to be defined in the so-called user makefile (see Listing Four), which is included by the settings makefile. Here you can specify, for example, build debug or release versions, or the target hardware platform. All of these settings can also be overwritten by using system variables or command-line options. Building a Project To build a project on the console in TMS, you only need to call Make for the projects makefile. The parameters you need to specify are the target compiler (called http://www.ddj.com
TMS_TARGET, which can be also defined by a system variable if you only work with a specific compiler) and the build targets, which are performed in the given order. Possible targets are: • lib, which compiles the objects in the LIBOBJECTS list and combines them in a static library. • dll, which compiles the objects in the LIBOBJECTS list and combines them in a shared library. • exe, which compiles the objects in the APPOBJECTS list and links them together with system and user libraries to create an executable. • clean, which removes all objects, libraries, and executable files. • test, which invokes the executable with the parameters in the TESTS list once for every list item (more details will follow). • dep, which builds all items in the DEPENDENCIES list with the same settings (dependencies are not built recursively). For example, to rebuild the library and executable using the GCC compiler: make -f makefile.mak TMS_TARGET= gcc clean lib exe
Building Larger Projects By Components Larger software projects are usually built out of many different libraries, classes, and APIs. The TMS environment makes it easy to build components that can be reused by higher level projects using so-called inc-makefiles (Listing Five) located in the directory $TMS_BASE_DIR/makefiles/inc. This type of makefile acts like a header file for a specific component and can be included by other projects. It adds the path to the source (the header, for instance) and library files, as well as the library itself to the higher level projects compiler and linker flags. For example, imagine the project MyProject, which uses a component named MyTimer. The only thing you need to do is add an include statement for the inc_MyTimer makefile in the MyProject makefile; for example: include $(MAKE_DIR)/inc_MyTimer.mak
When you build MyProject, it automatically links the MyTimer library. The inc-makefiles also add an entry in the DEPENDENCIES list, which is used when you call the makefile with the target dep. Then for each project in this list, Make is called again, as you can see at the end of the inc-makefile. For example, the command: make -f makefile.mak TMS_TARGET= gcc clean lib dep
rebuilds the project’s library and all libraries in the DEPENDENCIES list. Remember that all components are built in place — no header or library file needs to be copied to another location. You can order and visualize the components hierarchy by using meaningful subdirectories for your projects. Therefore, you can also build layers of software that can be reused in other projects. Platform-Specific Implementations When working on multiple platforms or when you need to support different hardware, you always encounter a situation where you need to do specific implementations of a certain functionality. For example, imagine the MyTimer component needed to be implemented differently on Windows and Linux. The first solution that comes to mind is to use #ifdefs in the code, but although this is written within a second, the consequences usually are much harder to come by. The first problem with #ifdefs is that the code gets harder to read. Especially when the number of platforms or hardware versions rise, you find it more and more difficult to tell which part is used by just taking a quick look at the source. When you need to support different versions based on the same platform/compiler, the second problem is that you need to recompile the code, which usually results in recompiling all projects every time you switch between different targets and platforms — just to be sure you are not linking some older objects. And you’ll really start to hate the #ifdefs when you need to add support for a new platform or hardware version later on. Then you will need to go through all of your code looking for #ifdefs and add support for the new target. And because you modified the source code, you should compile and test the older targets as well. The only solution to these problems is to abandon #ifdefs and instead write abstraction layers for operating systems, hardware boards, and so on. In the MyTimer example, you would create two separate projects —MyTimer_win and MyTimer_linux— using the same header file MyTimer.h, but having different implementations (see Figure 1). If you are using a versioning system such as SourceSafe or CVS, you can simply do this by sharing the header file so it’s always unique. Then you only need a small including makefile inc_MyTimer.mak that works as a switch to choose the right library at link time (Listing Six). A higher level project will now only need to include the MyTimer project and won’t have to care about the implementation anymore. Using this approach, you get rid of the previously mentioned 35
#ifdef problems and improve the quality and reusability of your code.
Testing Again, TMS enables a test-driven software development process. The test code is separated from the library code and is only compiled to build the executable. This executable can use command-line parameters to perform different test cases. You should also think of using a generic test framework like CUnit; we use a simple, platform-independent parser to run test scripts. Test cases can then be added to the TESTS list in the project makefile. When calling the makefile with the test target, the executable is invoked once for every item in the list. For example, if the MyTimer project had a TESTS list like the following:
TESTS = 123\ ABC\ testscript1.txt
A call to:

make -f makefile.mak TMS_TARGET=gcc test

results in two shell calls:

../bin/gcc/debug/MyTimer.out 123 ABC
../bin/gcc/debug/MyTimer.out testscript1.txt

Make quits if the executable does not return 0 as a result. (This is the desired behavior, so that the run stops when a test error occurs; you can disable it with Make's -i option.) An advantage is that you can compile, link, and test your project in one step; you can even test all dependent projects at once using dep test.

Figure 1: Project hierarchy using a switch-makefile to choose the right implementation of MyTimer.

Remote Systems
You can also benefit from this integrated test approach when developing for remote systems that offer some kind of remote execution. You only have to put the command for running the executable on the remote machine below the test target in the create makefile. For example, consider a remote Linux system. First, you have to make sure that the executable (and other test files or
shared libraries) on the host are available for the remote machine. The easiest way to do this is to mount your host working directory using NFS. Then, inside the user makefile, create two variables:

• REMOTE_IP, to contain the remote IP address (for example, REMOTE_IP = 192.168.2.1).
• NFS_HOME, for the mounted working directory (for example, NFS_HOME = /var/tmp/nfs).

For remote execution, we use the remote shell daemon (rshd), a background process that listens on a well-known port for remote command requests (security isn't an issue for our development purposes). Finally, inside the create makefile, the test target can look like this:

$(TESTS):
	@ echo "Executing on $(REMOTE_IP) $(APPLICATION) with parameter(s): $@"
	@ rsh $(REMOTE_IP) "export TMS_BASE_DIR=$(NFS_HOME)$(TMS_BASE_DIR); \
	  cd $(NFS_HOME)$(PRJDIR); $(NFS_HOME)$(BINDIR)/$(APPLICATION).out $@"
Now the test target executes the tests on the remote machine, where the stdout messages are shown on the host system. It offers a transparent and fast way to do remote and embedded development. Conclusion The TMS build system offers a simple, powerful way to do multiplatform development. New compilers can easily be introduced by adding new settings and create makefiles. Using abstraction layers, new hardware targets can be added without problems and have no or only minimal impact on the existing software. Tests are an integrated part of the system, improving the software quality at the component level. Finally, it lets you choose your favorite IDE and development system. DDJ
Listing One

# check input
ifndef TMS_BASE_DIR
$(error "Error: TMS_BASE_DIR not defined!")
endif
ifndef TMS_TARGET
$(error "Error: TMS_TARGET not defined!")
endif
# the directory containing this file
CURRENT_DIR = $(TMS_BASE_DIR)/MyProject/prj
# include common compiler flags, definitions and user settings
include $(TMS_BASE_DIR)/makefiles/$(TMS_TARGET)_makefile_settings.mak
# additional include directories, compiler and linker flags
CINCS +=
CFLAGS +=
LDFLAGS +=
# tests
TESTS = std
# targets
ARCHIVE = MyProject
APPLICATION = MyProject
# include extern projects (dependencies)
include $(MAKE_DIR)/inc_MyTimer.mak
# create targets
include $(TMS_BASE_DIR)/makefiles/$(TMS_TARGET)_makefile_create.mak

Listing Three

# create makefile for gcc
# additional include directories
CINCS += -I$(EXTINCDIR)
# additional linker flags
LDFLAGS += $(LDPATHPREFIX)$(EXTLIBDIR)

Listing Four

# user makefile for gcc
# define Debug or Release version
ifndef DEBUG
DEBUG = YES
endif
# select hardware platform
ifndef HWPLATFORM
HWPLATFORM = Linux
endif
# os version of build system
ifndef OS_TYPE
OS_TYPE = Linux
endif
# location of compiler etc.
ifndef GCC
GCC = /usr/bin
endif

Listing Five

# inc-makefile for MyTimer
# the name of the component
MYTIMER_NAME = MyTimer
The VSTSEclipse Project
Integrating Eclipse and Microsoft's Team System
JOE SANGO

Joe is a senior developer in Melbourne, Australia, and principal in the consulting firm TeamForce. He can be reached at [email protected].
The VSTSEclipse project started with a conversation about how Eclipse users could utilize Visual Studio Team System, Microsoft's software-development lifecycle solution. In this article, I introduce the VSTSEclipse project, state its goals, and explain how it can impact the development community. I then examine two major components of the project. In future articles, I will detail specific development issues and components. But first, a little background. Visual Studio Team System (http://lab.msdn.microsoft.com/teamsystem/) is an integrated suite of lifecycle tools that is part of Visual Studio 2005. As such, Visual Studio Team System (VSTS) provides facilities to support integration and communication among architects, developers, testers, project managers, and others involved in the software development lifecycle. As Figure 1 illustrates, VSTS for architects includes tools for visually constructing service-oriented solutions, while tools for developers include those for static analysis, code profiling, code coverage, and unit testing. Tools for testers include unit testing, manual testing, web testing, and load testing. In addition, support for development teams includes tools for project tracking and source-code control. For its part, Eclipse is an open-source framework for integrated development tools. However, Eclipse, which is maintained by the Eclipse Foundation (http://www.eclipse.org/), is more than just a Java IDE. The Eclipse platform is built around a plug-in architecture and provides the
runtime in which plug-ins are loaded, integrated, and executed, with the ultimate goal of enabling developers to easily build and deliver integrated tools. While the Eclipse SDK includes a Java IDE, the platform also supports other languages, primarily C and C++. With the Beta 2 release of Visual Studio 2005 and Team System, the .NET community is anticipating the opportunity to utilize VSTS functionality on projects ranging in size from small to enterprise. More often than not, software projects based on .NET utilize the Visual Studio IDE for all facets of the development process, and Team System seamlessly integrates on top of the development process, bringing improved integrated project functionality. But what about project teams that want to utilize Team System’s integrated SDLC functionality, yet aren’t .NET based? This is the niche that VSTSEclipse project addresses. The VSTSEclipse project (http:// sourceforge.net/projects/vstseclipse/) will provide an Eclipse plug-in that lets you effectively utilize Team System core functionality outside the VSTS framework. The importance of this functionality is evident when you understand how critical communication can be between the processes and components of a successful software project. One example is keeping track of application builds and releases on a software project. How do you know what has gone into the build? Were changes — bug fixes, for instance — made to the previous code base? Who made those changes? Who is assigned certain tasks that need to be in the build before it is released? What is their progress with the task? These questions are extremely important for project managers who are trying to aggregate this information for this scenario and determine the state that their project is in. Information aggregation is made easier and more relevant when the respective components are effectively communicating and linked together. Granted, there are any number of tools available that assist in executing the particular components of a software project, regardless of the base programming language — for example, source-code conDr. Dobb’s Journal, August 2005
trol, application building, application deployment, and task tracking (work item tracking). However, it is rare that any of these tools integrate these different processes effectively and add significant value to the life of a project. This is where Team System comes in. VSTSEclipse will initially concentrate on getting a few core features of Team Foundation Server (the collaboration server component of VSTS) integrated into the Eclipse development environment, ready to use on any project instance that uses VSTS. These features include work item
tracking, Team Foundation source control, and possibly Team Build integration. Wouldn't it be great to build your Java application with Microsoft's new build framework via your Eclipse IDE? (Well, maybe. It depends who you talk to!) The point is, the freedom of choice will be there for non-.NET developers to take advantage of Team System technology. And as more of the generic project functionality of the Team System product is explored and utilized within the .NET community, the likelihood that the Java/Eclipse community will want that same functionality exposed to it will be quite high.

"VSTSEclipse will initially concentrate on getting a few core features of Team Foundation Server integrated into Eclipse"

A brief technical outline of the project begins with breaking it up into two main components:

• The Eclipse IDE user interface for VSTS.
• Utilization and integration of VSTS functionality into Eclipse.
Figure 1: Visual Studio Team System facilities.

Version 3.0.2 of the Eclipse plug-in framework makes it straightforward to implement custom plug-ins, though not without complexities. We are aiming to provide seamless integration of the different Team System features within the Eclipse environment that developers are used to. This basically translates to having somewhat similar IDE functionality in Eclipse to what is provided in Visual Studio 2005. This will entail customization of most — if not all — of the Eclipse user-interface elements, such as perspectives, views, and editor layouts. The relevant Team System functionality will be called on different occasions through the Eclipse user-interface experience, including user action, menu selection, popup menu action, and even on the initial load of the Team System perspective. This leads us into the second component of the project, integration of features between Team System and the Eclipse development environment. The feature integration component promises to be interesting to implement because it will be cross-platform between Java and .NET. There are a few strategies that we're currently examining, including wrapped VSTS API calls, web-service utilization, and bridging framework implementations. Each has its advantages and disadvantages, and is subject to scrutiny by the VSTSEclipse team. Relevant features in Team System that we are looking to integrate are currently being investigated in terms of how they work "under the hood" so that we can fully understand
how they will be efficiently integrated into the Eclipse environment. One of the main considerations of creating the initial architecture will be to understand the availability of separate components that make up Team System to developers. We are looking to avoid heavy dependencies on the Eclipse environment. Having said that, there will undoubtedly be certain dependencies initially on the availability of the Team Foundation Server. The VSTSEclipse project resides on SourceForge.net (http://sourceforge.net/ projects/vstseclipse/), a web site that hosts collaborative open-source software projects. We currently have five team members, including two Microsoft employees (U.S. and Australia based) and are in the planning stages of how the plug-in will fit in with the Eclipse environment and what architecture is required to enable crossplatform communication between .NET and the Java Virtual Machine. The plug-in produced will be freely available for community download upon release, and hopefully allow organizations that are considering using Team System functionality, but do not have a .NET code base, to take advantage of the core target features provided by the Team Foundation Server. We welcome any support and collaboration from the community if anyone is interested in joining our initiative. Head on down to SourceForge.net, register for the VSTSEclipse project, and make a contribution. DDJ Dr. Dobb’s Journal, August 2005
WINDOWS/.NET DEVELOPER
Performance Diagnosis & .NET Applications
A tool for identifying problem areas and bottlenecks
RAMKUMAR N. CHINTALAPATI AND SACHIN ASHOK WAGH
P
erformance analysis for any application must be managed at every stage of the software-development lifecycle. Each of these stages use different performance management tools — profilers during coding and unit testing stages, load-testing tools during system validation and QA stages, and tools that deal with monitoring. During performance testing, however, analysis becomes a stumbling block when the system is subjected to production-like workloads and has a distributed operating environment. In this context, large amounts of information need to be monitored and analyzed to detect bottlenecks. The lack of automation in this diagnosis process motivated us to design and implement the tool for the .NET Framework that we present here. This tool lets you identify problematic areas, then helps you resolve bottlenecks in the .NET Framework during performance analysis. (The complete source code for the tool is available electronically; see “Resource Center,” page 3.) Our approach to diagnosis begins with the creation of a knowledge base comRamkumar and Sachin are software architects for Infosys. They can be contacted at [email protected] and [email protected], respectively. 44
comprising several performance patterns, which are indicators for detecting bottlenecks. We represent these performance patterns as a Bayesian network, a formalism that has been used extensively in the field of medical diagnosis. A given scenario is diagnosed against several performance patterns to report possible problem areas or to comment on the scalability of the application.

Performance Patterns

The process of diagnosis for systems under load involves collecting and understanding metrics related to system resources, the .NET managed-code layer, and the application layer that constitute the .NET stack. When examined, these metrics lead to tell-tale signs that can reveal performance bottlenecks. For example, consider the performance counter System\Processor Queue Length defined on Microsoft's .NET platform, for which a sustained queue of more than two threads indicates a processor bottleneck. This example illustrates the first category of performance patterns, which are based on individual thresholds. There are at least 30 such performance patterns in the .NET platform (see http://www.microsoft.com/downloads/details.aspx?FamilyId=8A2E454D-F30E-4E72-B531-75384A0F1C47&displaylang=en). Likewise, if you were to determine the benefit of adding an extra processor, you would need to correlate it with the Processor\% Processor Time counter. This illustrates the second category of performance patterns, where problems are diagnosed based on the relationships between a set of metrics. The sources for this second category are the operational laws of queuing theory (see The Art of Computer Systems Performance Analysis by Raj Jain, John Wiley & Sons, 1990), which define relationships between overall performance metrics, load conditions, and system utilization, and performance-tuning references that recommend specific tips. Thus, by using these performance patterns, you can detect bottlenecks and assess the scalability of applications. In this article,
we restrict our discussion to the design and implementation of a tool that handles both types of patterns and, for brevity, illustrate it with a subset of each. To come up with a comprehensive list of these performance patterns, the first step is to identify the metrics and their defined thresholds. This list can then be logically augmented with the relationships that exist between the metrics.

Metrics to be Captured

There are two types of metrics that need to be considered for analysis. The first is overall application metrics, such as throughput and response time; these can be obtained from load-testing tools such as Microsoft's ACT. For all other metrics, we rely on the Windows system monitor (Microsoft's Perfmon/Logman), which has well-defined performance objects for each layer of the
.NET stack. These performance objects are a logical collection of counters that reflect the health of a resource. The threshold for each of these counters can also depend on the type of server on which it is measured. For example, Microsoft products such as Commerce Server, Web Server (IIS), and Database Server (SQL Server) each have their own subset of these objects, and their thresholds may differ. Table 1 presents a subset of these performance patterns.

On a first pass, threshold-based performance patterns quickly identify bottlenecks at a high level. A more detailed diagnosis, however, is often driven by examining the relationships between metrics defined by the second category of performance patterns (see the accompanying text box entitled "Operational Laws"). The scope of our examination here is problem diagnosis, where a performance pattern is recognized and a recommendation is made when a bottleneck occurs. These recommendations point to possible impact areas, such as hardware, source code, or software configuration. To illustrate, consider the counter Memory\Available MBytes, which reports the amount of physical RAM available in megabytes; if it fails to satisfy the required threshold value, the recommendation would be to add more main memory or decrease the code footprint. This leads you to a point where you need to drill down further along these recommendations to resolve the problem.

Conceptual Design of the Diagnosis Tool

The diagnosis engine relies on the collection of performance patterns for the .NET Framework. The strength of the tool lies in the knowledge base that these performance patterns constitute. Hence, an important design requirement is the flexibility to maintain and update the performance patterns over time, which must be achieved by a suitable representation of the knowledge base. Popular forms of representation are Bayesian networks (http://www.niedermayer.ca/papers/bayesian/bayes.html), conventional decision trees (http://www.aaai.org/AITopics/html/trees.html), and Case-Based Reasoning (CBR) (http://www.aiai.ed.ac.uk/links/cbr.html). In our design, we use Bayesian networks because they offer the most flexibility and ease in modeling the knowledge base for our needs. The Bayesian network in this context is an acyclic graph in which a set of nodes and their relationships represent performance patterns. Each node is associated with a predefined set of values and the corresponding outcomes in its relationships. (For more details, see the accompanying text box "Bayesian Networks.")
Web Server (IIS): .NET CLR Memory\# Gen 1 Collections, .NET CLR Memory\# Gen 2 Collections, .NET CLR Memory\% Time in GC, ASP.NET Applications\Requests Timed Out, ASP.NET Applications\Requests/sec, ASP.NET\Request Wait Time, ASP.NET\Requests Rejected
Database Server: PhysicalDisk\Avg. Disk Read Queue Length, PhysicalDisk\Avg. Disk sec/Transfer, PhysicalDisk\Avg. Disk Write Queue Length
Table 1: Mapping between various servers and their respective metrics.
Figure 1: Conceptual design.
Figure 2: Bayesian network.
As indicated in Figure 1, the conceptual stages in the design are:

• Collecting metrics from performance-monitoring utilities such as Microsoft's Perfmon/Logman and Microsoft's ACT.
• Evaluating these metrics against the performance patterns represented by the Bayesian network (a minimal threshold-check sketch follows this list).
• Proposing one or more recommendations for the potential bottlenecks.
• Updating the knowledge base periodically to build an effective diagnosis engine.
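To make the first, threshold-based category of pattern concrete, the following minimal C# sketch samples two of the counters discussed above via System.Diagnostics and flags possible bottlenecks. It is not part of the authors' Java-based tool, which reads these values from Perfmon/Logman logs rather than sampling them live, and the 128-MB memory threshold is an assumed example value.

using System;
using System.Diagnostics;

class ThresholdPatterns
{
    static void Main()
    {
        // System\Processor Queue Length: a sustained queue of more than
        // two threads indicates a processor bottleneck.
        using (PerformanceCounter queue =
            new PerformanceCounter("System", "Processor Queue Length"))
        {
            float length = queue.NextValue();
            if (length > 2)
                Console.WriteLine("Possible CPU bottleneck: queue length = {0}", length);
        }

        // Memory\Available MBytes: too little free physical RAM suggests
        // adding memory or reducing the code footprint.
        using (PerformanceCounter freeMem =
            new PerformanceCounter("Memory", "Available MBytes"))
        {
            float available = freeMem.NextValue();
            if (available < 128)   // assumed example threshold
                Console.WriteLine("Possible memory bottleneck: {0} MB available", available);
        }
    }
}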
Tool Implementation

Our tool has two important components: monitoring and diagnosis. The monitoring component deals with collecting all the required metrics during load testing. Here we use the load generator's monitoring facility, along with utilities supplied with Windows. Windows XP includes other useful utilities, such as Logman (http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/nt_command_logman.mspx), which let you define all the metrics once and reuse the same script for subsequent test iterations.
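For instance, a counter log covering a few of the Table 1 metrics could be defined once and reused across test iterations with commands along these lines (the log name, counter selection, sample interval, and output path are illustrative; Logman also accepts -s computername to collect from a remote host):

logman create counter WebTierLog -c "\Memory\Available MBytes" "\System\Processor Queue Length" "\ASP.NET\Request Wait Time" -si 15 -f csv -o C:\perflogs\webtier
logman start WebTierLog
logman stop WebTierLog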
Bayesian Networks
A Bayesian network (BN) is a graphical representation based on probability theory. It is a directed acyclic graph with nodes, arcs, tables, and their associated probability values. These probabilities may be used to reason or make inferences within the system. Further, BNs have distinct advantages over other methods, such as neural networks, decision trees, and rule bases, when it comes to modeling a diagnostic system. One of the many reasons why Bayesian networks are preferred over decision trees is that a BN can be traversed in both directions. Recent developments in this area include new and more efficient inference methods, as well as universal tools for the design of BN-based applications. An extended list of software for BNs is at http://bayes.stat.washington.edu/almond/belief.html.
— http://www.niedermayer.ca/papers/bayesian/bayes.html
In this tool, we implemented the monitoring functionality via the ServerMonitor.java program (available electronically), which takes test-configuration details, such as the type of servers, hostnames, and load levels, as inputs. Based on the type of server, the program determines which metrics to capture. It then generates performance-counter logs on the local machine, using the Logman utility with the relevant metrics. When testing starts, the utility sends instructions to the relevant host machines to retrieve the specific metrics. While this remoting option offers flexibility, there are some restrictions to be aware of: The host and local machines need to be on the same LAN, and the local machine needs certain security access permissions (http://support.microsoft.com/default.aspx?scid=kb;en-us;818032). The data gathered by these counter logs is stored in comma-separated values (CSV) format on the local machine, and the filenames follow a specific naming convention. By default, one folder is created for each load level, and each server has a CSV file in each folder. The set of folders containing the collected metrics for the different load levels forms one of the feeds for the diagnosis engine. The other feed consists of overall system-level metrics, such as application throughput and response times, which need to be recorded from the load-testing tool for each load level.
Operational Laws

Let:
N = number of users
X = transaction throughput (in transactions/sec)
R = transaction response time (in sec)
S = service demand (in sec)
Z = think time (in sec)
U = utilization of the resource (% of time the resource is busy)
Then:
The Utilization Law, U = X * S, facilitates the computation of the service demand.
Little's Law, N = X * (R + Z), must be checked even before looking for bottlenecks in the application, because it validates the test bed. If Little's Law is violated, either the testing process or the captured test data is incorrect.
—The Art of Computer Systems Performance Analysis by Raj Jain
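As a quick worked example of the Little's Law check described in the sidebar, here is a standalone C# sketch; it is not part of the authors' tool, and the measured values and 10 percent tolerance are invented for illustration.

using System;

class LittlesLawCheck
{
    static void Main()
    {
        double n = 100;    // N: users driven by the load generator
        double x = 9.5;    // X: measured throughput, transactions/sec
        double r = 0.4;    // R: measured response time, sec
        double z = 10.0;   // Z: configured think time, sec

        double expected = x * (r + z);                  // Little's Law: N = X * (R + Z)
        double deviation = Math.Abs(n - expected) / n;  // here roughly 1%, so the test bed looks valid

        Console.WriteLine("N = {0}, X*(R+Z) = {1:F1}, deviation = {2:P1}", n, expected, deviation);
        if (deviation > 0.10)   // assumed 10% tolerance
            Console.WriteLine("Little's Law violated: check the test process or captured data.");
    }
}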
In our implementation of the diagnosis engine (which is based on the Bayesian network), we used the Java version of the Netica API (http://www.norsys.com/netica-j.html?popup). Specifically, we used its stub framework to add our specific performance patterns and create the network. We refer to this compiled Java source as "PerfDiag." The input to PerfDiag starts with the overall system metrics, along with an option to associate the folder containing all the CSV files for each load level. These inputs are sufficient to trigger the diagnosis process. Figure 2 depicts a portion of the graphical model of the network. After the inputs are assessed by the diagnosis engine, a complete bottleneck report is generated. The next section details a small case study and the sample report.

Experimental Results

To demonstrate how you use this diagnosis tool, we present a performance analysis of an Online Job Recruitment system based on the .NET Framework. Requests were fired from a load generator at a cluster of two load-balanced web servers. Figure 3 shows the deployment diagram. We conducted load testing for one representative business scenario, Search And Apply, in which users log in to the system, search or apply for a job, and log out. The load test was conducted at four different load levels. The ServerMonitor program creates and stores the performance logs for these servers, and the PerfDiag diagnosis engine then analyzes the log files and generates a consolidated report.

The first round of diagnosis revealed a problem with the test bed: Little's Law did not hold. The analyst determined that the root cause was insufficient access permissions when users logged in. This problem was fixed and the tests rerun. The second round of diagnosis revealed an important bottleneck in database disk performance; the recommendation pointed to improving the database disk organization. Other, minor recommendations related to garbage-collection tuning and improving file-system cache performance. The report also contained graphs that indicated application scalability and performance under varying load conditions.
Figure 3: Pilot application.

Conclusion

We have presented a new approach to automating performance diagnosis. This automation hinges on identifying performance patterns that are tell-tale signs for detecting bottlenecks. The process involves retrieving the required metrics and steering the diagnosis with the knowledge of performance patterns. To represent the knowledge base, we chose the Bayesian network framework, as we found it the most suitable for the problem at hand. While we have customized this tool for .NET applications, we believe that the same conceptual design can be extended to performance diagnosis on other platforms.

Acknowledgments

Thanks to Rajeshwari G. for the overall guidance, and to the rest of the Quality of Service group, SETLabs at Infosys Technologies Ltd., for their support.

DDJ
WINDOWS/.NET DEVELOPER
Moving to .NET 2.0
Porting your apps to the new platform
Eric Bergman-Terrell
Visual Studio 2005, .NET 2.0, and C# 2.0 include a host of new features. But since your .NET 1.1 app probably runs as-is on .NET 2.0, is there any rush to load it into Visual Studio 2005 and start exploiting the new .NET and C# functionality? I recently ported one of my .NET 1.1 applications to 2.0 to learn about the promise and perils of .NET 2.0, and I learned a lot more than I expected! During the port, I discovered bugs in my application that I never knew I had. I found that some programming techniques that worked flawlessly in .NET 1.1 were either partially or completely nonfunctional in .NET 2.0. And I learned which C# and Windows Forms enhancements were useful to me, and which were not. In this article, I show how to convert .NET 1.1 applications to 2.0 while taking advantage of the most compelling features in the new platform. I've included the C# Programmable Calculator, an application that I ported to .NET 2.0 (available electronically; see "Resource Center," page 3). This article and the sample application are based on the Beta 1 versions of Visual Studio 2005 and .NET 2.0; a few of the details may change as Visual Studio and the .NET platform evolve.

Eric has developed everything from data reduction software for particle bombardment experiments to software for travel agencies. He can be contacted at [email protected].

The Sample App

The C# Programmable Calculator (Figure 1) is a Reverse Polish Notation (RPN), or HP-style, calculator. RPN calculators have no parentheses, no operator-precedence rules, and no "=" button. You enter the operands first, then select the operation. For example, to calculate "2+2," press the
2 button, then press Enter. Press 2 again and press +. When you press +, the operands are removed from the stack and replaced with the result of the operation. The stack is displayed as a list box in the upper left.

Users can add new buttons to the calculator by writing C# methods. Press the Edit Functions button to add a new custom button. Scroll until the cursor is inside the Functions class and isn't in the middle of a method. Select Edit/Add Function and choose the button's tab, as well as the function name and return type. After you press the OK button, write the method's code. Then press OK and go to the tab you specified. You'll see a new button corresponding to the method you just added. The code you were editing was dynamically compiled into an in-memory assembly, and this assembly was then interrogated by the Reflection API. The calculator's custom buttons are generated from the methods carrying [Button] attributes. When you press a custom button, the corresponding method in the assembly is called, and the result is placed on top of the stack. (A minimal sketch of this compile-then-reflect mechanism appears at the end of this section.)

There are two versions of the sample app. The Visual Studio 2003 and 2005 versions are stored in the "Dot Net 1.1" and "Dot Net 2.0" folders, respectively. To install either version, navigate to the Setup folder and double-click cspcalc.zip. If you use WinZip, you can install automatically by clicking the Install toolbar button. Otherwise, you may need to extract the .zip file to a hard-disk folder and run setup.exe. To build either version, extract the source code to a hard-disk folder and load the .sln file into Visual Studio. If you want to install the .NET 2.0 version, you'll need the .NET 2.0 Framework. You'll also need Visual Studio 2005 or Visual C# 2.0 Express Edition to build the software and debug it.
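The compile-then-reflect mechanism can be pictured with the following minimal C# sketch. It is only an approximation of what the calculator does: the ButtonAttribute shown here is a stand-in for the app's own attribute, and the compiled Functions class, method names, and error handling are simplified.

using System;
using System.CodeDom.Compiler;
using System.Reflection;
using Microsoft.CSharp;

[AttributeUsage(AttributeTargets.Method)]
public class ButtonAttribute : Attribute
{
    // Stand-in for the calculator's own attribute.
    public ButtonAttribute(string tab) { }
    public ButtonAttribute(string tab, string text) { }
}

class DynamicButtons
{
    static void Main()
    {
        string source = @"
            public class Functions
            {
                [ButtonAttribute(""Main"", ""Double"")]
                public static double DoubleIt(double x) { return x * 2; }
            }";

        CompilerParameters options = new CompilerParameters();
        options.GenerateInMemory = true;   // compile to an in-memory assembly
        options.ReferencedAssemblies.Add(
            Assembly.GetExecutingAssembly().Location);   // so the source can see ButtonAttribute

        CompilerResults results =
            new CSharpCodeProvider().CompileAssemblyFromSource(options, source);

        // Interrogate the compiled assembly: every [Button] method becomes a button.
        foreach (Type type in results.CompiledAssembly.GetTypes())
            foreach (MethodInfo method in
                     type.GetMethods(BindingFlags.Public | BindingFlags.Static))
                if (method.GetCustomAttributes(typeof(ButtonAttribute), false).Length > 0)
                    Console.WriteLine("Button for {0} returned {1}",
                        method.Name, method.Invoke(null, new object[] { 21.0 }));
    }
}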
Testing a .NET 1.1 App on .NET 2.0

Before I started the port to 2.0, I wanted to find out how well the original application ran on .NET 2.0. I had both .NET 1.1 and 2.0 installed on my machine, but by default, applications compiled for 1.1 run on 2.0 only if 1.1 is not available. There are at least two ways to force a 1.1 application to run on 2.0: a config file or a registry setting. Store a config file such as this in the folder containing your .exe (the startup element goes inside the standard configuration root):

<configuration>
  <startup>
    <requiredRuntime version="v2.0.40607"/>
    <supportedRuntime version="v2.0.40607"/>
  </startup>
</configuration>
Be sure that the config file's name is identical to your .exe's name, with an extra ".config" on the end. For example, the sample app's .exe is CSPCALC.exe, so the config file must be named CSPCALC.exe.config. The "v2.0.40607" version specifies .NET 2.0 Beta 1. If you're using a subsequent beta or the released version of 2.0, change the version attribute accordingly. Hint: You can find the installed .NET versions on your machine by looking at the folder names in your \%windir%\Microsoft.NET\Framework folder.

If you have multiple 1.1 apps to test on 2.0, it takes effort to create config files for each application. In this case, use RegEdit to add a DWORD value named OnlyUseLatestCLR, set to 1, under the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\.NETFramework key.

I configured the 1.1 version of C# Programmable Calculator to run on .NET 2.0 and ran it. To verify that it was really running on .NET 2.0, I wrote the custom function below. The first argument of the Button attribute specifies the tab that contains the button. The second argument is
optional and specifies the button text; by default, the button text is the same as the method name.

[Button("Main", ".Net Version")]
public static Version DotNetVersion()
{
    return Environment.Version;
}
Because this function returned 2.0.40607.42, I knew the app was running on .NET 2.0 Beta 1. I continued testing the 1.1 version of the app on .NET 2.0 and didn't find any significant bugs.

After convincing myself that the program ran without problems on .NET 2.0, I updated its setup program. By default, a setup program created in Visual Studio 2003 will not install an app unless .NET 1.1 is installed on the user's system. The default value for the setup's SupportedRuntimes element is "8:1.1.4322," which is .NET 1.1's version number (except for the "8:" prefix). I added .NET 2.0 Beta 1's version number to SupportedRuntimes to allow the setup to install on systems with just 1.1, just 2.0 Beta 1, or both. You can change the SupportedRuntimes element as follows: Right-click the Setup project in the Solution Explorer, select View/Launch Conditions, and edit the .NET Framework value. Separate each version number with a semicolon (";").

Building and Testing With Visual Studio 2005

The next step was to compile and test the application with Visual Studio 2005. Visual Studio converted the .sln and .csproj files with no errors. The app built successfully, but when I ran it the first time, I immediately encountered threading errors when I called custom functions by pressing the corresponding buttons. In the Win32 environment, a GUI component must be accessed and manipulated only by the thread that created it. In most of my code I followed this rule, but I broke it in a few places. Fortunately, the Visual Studio 2005 debugger detected these issues. I wrote a small test application named "ThreadTest" (Figure 2) that illustrates the situation. Load the code into Visual Studio 2003 and run it in the debugger.
When you click the Update from the GUI Thread button, the current time is updated (see updateFromGUIThreadButton_Click in Listing One). When you click the Update from the Worker Thread button, a new worker thread is created, and this thread updates the time by calling UpdateTime to change the currentTimeLabel's text. However, this is a flagrant violation of the Win32 threading rules, and the only clue that something is wrong is that the Assert fails. .NET GUI classes, such as Form and Control, have an InvokeRequired property that is true when a GUI object is accessed by the wrong thread. To update the time correctly from a worker thread, press Update from Worker Thread (Invoke). In this case, a new worker thread is created; when it starts, it calls the UpdateTimeFromGUIThread method. UpdateTimeFromGUIThread creates a delegate to UpdateTime and calls UpdateTime indirectly by passing the delegate to the Form's Invoke method. This causes the UpdateTime method to be called on the GUI thread. I recommend testing all of your multithreaded applications in Visual Studio 2005; you may have hidden threading bugs that the new debugger can easily detect.
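For reference, here is a minimal sketch of the InvokeRequired/Invoke pattern the debugger is enforcing. It is illustrative only; the names are not those of the article's ThreadTest listing. The method marshals itself onto the GUI thread whenever a worker thread calls it.

using System;
using System.Threading;
using System.Windows.Forms;

public class ClockForm : Form
{
    private Label currentTimeLabel = new Label();
    private delegate void UpdateTimeDelegate();

    public ClockForm()
    {
        currentTimeLabel.Dock = DockStyle.Top;
        Controls.Add(currentTimeLabel);
    }

    // Safe to call from any thread.
    private void UpdateTime()
    {
        if (InvokeRequired)
        {
            // Wrong thread: hand the call back to the GUI thread.
            Invoke(new UpdateTimeDelegate(UpdateTime));
            return;
        }
        currentTimeLabel.Text = DateTime.Now.ToLongTimeString();
    }

    private void UpdateFromWorkerThread()
    {
        Thread worker = new Thread(new ThreadStart(UpdateTime));
        worker.IsBackground = true;
        worker.Start();
    }
}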
I noticed another issue as I ran the program under the Visual Studio 2005 debugger: The output window was constantly scrolling messages about first-chance exceptions. The exceptions were being caught, but I was concerned about the frequency with which they were occurring; after all, throwing and catching exceptions is computationally expensive. The exceptions were being thrown by the code that determines which buttons should be enabled, based on the number of items on the stack. For example, the + button is enabled only when there are at least two numbers on the stack, and this code was throwing an exception whenever there were fewer than two items. Even though the exception was properly caught, this reduced performance, especially on slower machines. I restructured the method, StackListBox.StackItemsAreNumeric, to check the stack depth rather than throw an exception. Bravo, again, to the new debugger for bringing an important, previously unnoticed issue to my attention!

More Bugs

The new debugger uncovered other bugs. The app was accessing a browser window using COM interop. When the browser was launched from a worker thread, I got this error message: "Cannot instantiate ActiveX object because current thread is not in a single-threaded apartment." It's not often that an error message is so accurate. I fixed this issue by calling SetApartmentState(ApartmentState.STA) on the worker thread.

Calling Process.Start's one-argument overload with a URL string to launch a web browser didn't work in dynamically compiled code. I had to use the two-argument flavor and specify the browser's .exe filename (for instance, "IExplore.exe" or "Firefox.exe") as the first argument. This issue does not affect ordinary code, just code in dynamically compiled assemblies.

I found another bug in the Help/About dialog box. This form includes a hyperlink to my e-mail address; when clicked, it launches the user's e-mail client and automatically composes an e-mail with the addressee and subject line filled in. Code like this works on .NET 1.1:

Help.ShowHelp(this,
    "mailto:[email protected]?subject=C# Programmable Calculator");
But in .NET 2.0, the e-mail address inappropriately includes the ?subject= text. Fortunately, this code works in both .NET versions:

Process.Start(
    "mailto:[email protected]?subject=C# Programmable Calculator");
After fixing these bugs, I corrected some deprecation issues that the compiler detected. Then it was time to start exploiting the new C# language, Windows Forms, and .NET platform enhancements in .NET 2.0.

Nullable Value Types

In .NET 1.1, only objects can have null values. Floating-point variables can have a value of NaN ("not a number") that signifies "undefined" or "not applicable," but other value types have no special value for this purpose. In .NET 2.0, all value types can be nullable. To declare a variable of a nullable value type, just put a "?" at the end of the type name:

int? count1;
int? count2 = null;
int? count3 = 42;
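Nullable values are usually tested with the HasValue property or given a default with the null-coalescing operator (??), which is also new in C# 2.0. A brief illustration, not taken from the calculator's code:

int? count = null;
bool known = count.HasValue;   // false
int total = count ?? 0;        // ?? substitutes 0 when count is null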
Because C# 2.0 supports nullable types, and because my C# Programmable Calculator lets users write their own C# code for custom functions, I needed to test the app's ability to support nullable value types. I wrote these functions to see if the app would compile the code and handle the nullable values correctly:

[Button("Main", "Null Test 1")]
public static int? NullTest1()
{
    return 42;
}

[Button("Main", "Null Test 2")]
public static int? NullTest2()
{
    return null;
}
When I added NullTest1() and NullTest2(), the code compiled without errors. When I ran NullTest1(), it returned "42" as expected, but NullTest2() caused a blank item to be pushed onto the stack (Figure 1). This occurred because the app was calling object.ToString() to display values on the stack, and a nullable object's ToString() method returns an empty string when the object is null. Representing a null nullable value as an empty string may be reasonable in many situations, but displaying a blank stack entry in a calculator will confuse users. I fixed this problem by checking for nullable-type null values in StackListBox.StackListBox_DrawItem:

// If item is the empty string, it could
// be a nullable type's null value.
if (NullableTypeUtils.IsNullableType(currentObject))
{
    display = NullableTypeUtils.ToString(currentObject);
}
.NET 2.0 generic containers cannot contain objects of fundamentally different classes. For example, if you uncomment the aforementioned commented-out lines, the compiler issues this syntax error: “Argument ‘1’: cannot convert from ‘string’ to ‘…Class1’.” If you want to store objects of multiple types in a generic container, make sure that all the objects are derived from the same base class and specify the base class in the container declaration. You could even declare a generic container to contain any object (List